Data Analytics (Eng) / Data Analitika (Ing) - 344 (stemlearn.sun.ac.za): test questions
Before clustering the data, we first need to evaluate its cluster tendency, which we plan to do by visual inspection of clusters. How many distance calculations are needed to compute the dissimilarity matrix optimally?
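A dissimilarity matrix is symmetric with a zero diagonal, so only the pairs above the diagonal need to be computed, which gives n(n - 1)/2 distance calculations for n instances. The number of instances is not stated in the question text here, so the sketch below uses four made-up points and Euclidean distance purely as an assumption; the helper names are hypothetical.

```python
from itertools import combinations
import math


def pairwise_distance_count(n):
    """Number of distinct unordered pairs among n instances."""
    return n * (n - 1) // 2


def dissimilarity_matrix(points):
    """Fill a symmetric matrix, computing each pair's distance only once."""
    n = len(points)
    matrix = [[0.0] * n for _ in range(n)]  # diagonal stays 0
    calls = 0
    for i, j in combinations(range(n), 2):
        d = math.dist(points[i], points[j])  # Euclidean distance (assumed)
        calls += 1
        matrix[i][j] = d
        matrix[j][i] = d  # mirror instead of recomputing
    return matrix, calls


# Made-up example data, not taken from the course data set.
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0), (0.0, 1.0)]
matrix, calls = dissimilarity_matrix(points)
print(calls, pairwise_distance_count(len(points)))
```

The counter `calls` matches the closed-form count because each unordered pair is visited exactly once.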
The dendrogram below shows the results obtained when clustering the players using hierarchical clustering. If the dendrogram is cut at y = 25, how many clusters would remain?
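Cutting a dendrogram at a given height keeps every merge that happened at or below that height and undoes the rest, so the number of remaining clusters equals the number of vertical branches the horizontal cut line crosses. The sketch below assumes a linkage whose merge heights never decrease (for example single, complete, average or Ward linkage) and uses invented merge heights, since the actual dendrogram is not reproduced here; `clusters_after_cut` is a hypothetical helper.

```python
def clusters_after_cut(merge_heights, cut_height):
    """Count clusters left after cutting a dendrogram at cut_height.

    merge_heights holds the height of every merge; n instances
    always produce n - 1 merges in agglomerative clustering.
    """
    n_instances = len(merge_heights) + 1
    # Merges at or below the cut are kept; each one reduces the
    # number of clusters by one, starting from n singletons.
    merges_kept = sum(1 for h in merge_heights if h <= cut_height)
    return n_instances - merges_kept


# Invented merge heights for seven instances (illustration only).
heights = [5.0, 8.0, 12.0, 20.0, 30.0, 42.0]
print(clusters_after_cut(heights, 25))
```

With these invented heights, four merges lie below y = 25, so seven instances collapse into three clusters; the answer for the real dendrogram depends on its own merge heights.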
The dendrogram below shows the results obtained when clustering the players using hierarchical clustering. The name of each player is displayed on the x-axis, and the colour assigned to each player name is based on the clustering results previously obtained with k-means clustering. If we had selected three clusters using the hierarchical clustering approach, would we have obtained results similar to those of the k-means clustering?
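Besides reading the colours off the dendrogram, agreement between two clusterings can be quantified. One option, swapped in here as an illustration and not stated in the question, is the Rand index: the fraction of instance pairs on which both clusterings agree (together in both, or apart in both). The label lists below are invented.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of instance pairs treated the same way by both clusterings."""
    agree = 0
    total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a == same_b:
            agree += 1
        total += 1
    return agree / total


# Invented labels for six players (illustration only).
kmeans_labels = [0, 0, 1, 1, 2, 2]
hier_same = [1, 1, 0, 0, 2, 2]   # same partition, different label names
hier_diff = [0, 0, 0, 1, 2, 2]   # one player moved to another cluster
print(rand_index(kmeans_labels, hier_same))
print(rand_index(kmeans_labels, hier_diff))
```

Because the index works on pairs, it is unaffected by how the cluster labels are numbered, which matters since k-means and hierarchical clustering number their clusters independently.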
To validate that the clusters were not found by chance, random sampling was performed and k-means clustering was reapplied with k set to three. The results of the clustering are illustrated in the parallel coordinate plot provided below. Are the clusters unstable?
The parallel coordinate plot below visualises the clusters assigned to each instance when k was set to three.
Based on the parallel coordinate plot provided, select the most correct option. Players assigned to cluster one are:
Assume that we want to cluster the instances using k-means clustering. Since we do not know the number of clusters (i.e. k) in advance, we need to try different values of k. The graph below displays the total within-cluster sum of squares (W) obtained for different values of k. W increased when k increased from eight to nine. Can W increase when k increases?
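The globally optimal total within-cluster sum of squares cannot increase as k grows, but the standard k-means (Lloyd's) iteration only converges to a local optimum that depends on the initial centroids, so the value reported for a larger k can be higher than for a smaller one. The following is a minimal one-dimensional sketch of that effect with invented data and hand-picked starting centroids; `kmeans_1d` is a hypothetical helper, not the routine used in the course.

```python
def kmeans_1d(points, initial_centroids, max_iter=100):
    """Plain Lloyd iteration on 1-D data with explicit starting centroids."""
    centroids = list(initial_centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda c: (p - centroids[c]) ** 2)
            for p in points
        ]
        # Update step: mean of each cluster (empty clusters keep their centroid).
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    w = sum((p - centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, centroids, w


# Invented data with three well-separated groups.
points = [0.0, 1.0, 2.0, 3.0, 100.0, 101.0, 200.0, 201.0]

# k = 3 with a good start: one centroid per group.
_, _, w_k3 = kmeans_1d(points, [1.5, 100.5, 200.5])
# k = 4 with a poor start: three centroids crowd the first group.
_, _, w_k4 = kmeans_1d(points, [0.0, 1.0, 2.5, 150.0])
print(w_k3, w_k4)
```

In this sketch the k = 4 run gets stuck with one centroid serving two distant groups, so its W is far larger than the k = 3 result. Running k-means with several random restarts per k and keeping the best W is the usual way to reduce this effect.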
Inspect the ordered dissimilarity image provided in the figure below for potential clusters. Is the data completely random?
The graph provided below displays the distribution of each of the 13 features after a specific numerical transformation has been applied. Based on the graph provided, was normalisation or standardisation performed?
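The two transformations leave different fingerprints on a distribution plot: min-max normalisation maps every feature onto exactly [0, 1], whereas standardisation (z-scores) centres every feature at zero with unit standard deviation, leaving the minimum and maximum to differ between features. A small sketch on one invented feature follows, assuming the population standard deviation; the function names are hypothetical.

```python
from statistics import mean, pstdev


def min_max_normalise(values):
    """Rescale values linearly so the minimum is 0 and the maximum is 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def standardise(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# Invented feature values (illustration only).
feature = [2.0, 4.0, 6.0, 8.0]
normalised = min_max_normalise(feature)
standardised = standardise(feature)
print(min(normalised), max(normalised))
print(mean(standardised), pstdev(standardised))
```

If all 13 features in the plot share the same lower and upper bound, that points to normalisation; if they are all centred on zero with negative values present, that points to standardisation.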