Looking for Data Analytics (Eng) / Data Analitika (Ing) - 344 test answers and solutions? Browse our comprehensive collection of verified answers for Data Analytics (Eng) / Data Analitika (Ing) - 344 at stemlearn.sun.ac.za.
Is the following true for X-means clustering? When a cluster is selected to be split, it is best to select the cluster with the largest intra-cluster distance.
Given the data set D = {(3, 4), (2, 4), (8, 9)}, will k-medoids or k-means clustering produce a centroid that is less sensitive to outliers?
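A minimal sketch of the comparison, assuming all three points of D fall in a single cluster and that (8, 9) plays the role of the outlier. The k-means representative is the coordinate-wise mean; the k-medoids representative is the data point with the smallest total distance to the others. The variable names are illustrative only.

```python
from math import dist

D = [(3, 4), (2, 4), (8, 9)]

# k-means representative: coordinate-wise mean of the cluster
mean = tuple(sum(coords) / len(D) for coords in zip(*D))

# k-medoids representative: the actual data point that minimises
# the sum of Euclidean distances to all points in the cluster
medoid = min(D, key=lambda p: sum(dist(p, q) for q in D))

print("mean  :", mean)
print("medoid:", medoid)
```

Under these assumptions the mean is pulled towards (8, 9), while the medoid stays at (3, 4), one of the two nearby points. This is the usual argument for k-medoids being the less outlier-sensitive of the two.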
Given the instances , the randomly selected points , and the randomly sampled points , calculate the Hopkins statistic. Round your answer to three decimal places.
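The point sets for this question did not survive in the text above, so the sketch below uses small hypothetical data purely to illustrate the calculation. It assumes the convention H = Σu / (Σu + Σw), where w is the distance from each randomly selected data point to its nearest other instance and u is the distance from each randomly sampled point to its nearest instance. Some texts swap the roles of u and w or raise the distances to a power, so check the course definition before relying on it.

```python
from math import dist

def nearest_distance(point, data, skip_self=False):
    # Distance from `point` to its nearest neighbour in `data`.
    # With skip_self=True, instances equal to `point` are ignored
    # (this also skips exact duplicates of the point).
    candidates = [q for q in data if not (skip_self and q == point)]
    return min(dist(point, q) for q in candidates)

def hopkins(data, selected, sampled):
    # w: selected data points -> nearest *other* instance
    w = sum(nearest_distance(p, data, skip_self=True) for p in selected)
    # u: uniformly sampled points -> nearest instance
    u = sum(nearest_distance(p, data) for p in sampled)
    return u / (u + w)

# Hypothetical values, NOT the ones from the question
data = [(0, 0), (0, 1), (3, 0)]
selected = [(0, 0)]   # drawn from the data set
sampled = [(3, 4)]    # drawn from the feature space

h = hopkins(data, selected, sampled)
print(round(h, 3))
```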
The table below shows a small data set in which each instance is described by three features. The k-means clustering algorithm is to be applied to this data set with k = 2 and using Euclidean distance. The initial centroids for the two clusters C1 and C2 are c1 = (0.4; 0.3; 0.2) and c2 = (0.7; 0.8; 0.7).
d1 d2 d3 dist(c1) dist(c2)
0.392 1.258 0.666 1.065356 0.552977
0.251 1.781 1.495 1.972964 1.340144
0.823 0.042 1.254 1.164650 0.946894
0.917 0.961 0.055 0.851607 0.699310
0.736 1.694 0.686 1.514044 0.894834
1.204 0.605 0.351 0.873065 0.643306
0.778 0.436 0.220 0.402219 0.607437
1.075 1.199 0.141 1.125747 0.782500
0.854 0.654 0.771 0.810847 0.223770
Assign the data points to the appropriate clusters and calculate the new cluster centroids. What is the d1 coordinate of the centroid of the first cluster?
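One way to check the assignment step is to redo it in code. The sketch below copies the nine instances from the table, assigns each to its nearest initial centroid by Euclidean distance, and averages the members of each cluster. The variable names are illustrative, and the code assumes no cluster ends up empty.

```python
from math import dist

points = [
    (0.392, 1.258, 0.666),
    (0.251, 1.781, 1.495),
    (0.823, 0.042, 1.254),
    (0.917, 0.961, 0.055),
    (0.736, 1.694, 0.686),
    (1.204, 0.605, 0.351),
    (0.778, 0.436, 0.220),
    (1.075, 1.199, 0.141),
    (0.854, 0.654, 0.771),
]
centroids = [(0.4, 0.3, 0.2), (0.7, 0.8, 0.7)]  # c1, c2

# Assignment step: index of the nearest centroid for every point
labels = [
    min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
    for p in points
]

# Update step: mean of the members of each cluster
# (would divide by zero if a cluster were empty)
new_centroids = []
for k in range(len(centroids)):
    members = [p for p, lab in zip(points, labels) if lab == k]
    new_centroids.append(tuple(sum(col) / len(members) for col in zip(*members)))

print(labels)
print(new_centroids[0])
```

With the distances in the table, only the seventh instance is closer to c1 than to c2, so the first cluster contains that single point and its new centroid is the point itself, (0.778, 0.436, 0.220).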
We want to use clustering to transmit a compressed input signal over a network. Our network consists of an encoder and a decoder. The encoder uses pre-computed cluster centroids found with k-means clustering and Euclidean distance to find a representative example of the input signal. The ID of the representative example is then transmitted over the network. Given the cluster centroids:
ID d1 d2 d3
1 0.5 0.3 0.4
2 0.7 0.8 0.3
3 0.6 0.6 0.5
4 0.2 0.1 0.15
What ID will be transmitted if the input signal is (0.4, 0.4, 0.4)?
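The encoder described above amounts to a nearest-centroid lookup. A minimal sketch, using the four centroids from the table; the names `codebook`, `encode` and `decode` are hypothetical and not taken from the course material.

```python
from math import dist

# Pre-computed cluster centroids, keyed by ID
codebook = {
    1: (0.5, 0.3, 0.4),
    2: (0.7, 0.8, 0.3),
    3: (0.6, 0.6, 0.5),
    4: (0.2, 0.1, 0.15),
}

def encode(signal):
    # Return the ID of the centroid closest to the signal (Euclidean distance)
    return min(codebook, key=lambda cid: dist(signal, codebook[cid]))

def decode(cid):
    # The decoder reconstructs the signal as the centroid itself
    return codebook[cid]

transmitted = encode((0.4, 0.4, 0.4))
print(transmitted, decode(transmitted))
```

For the input (0.4, 0.4, 0.4) the distances to IDs 1 to 4 are roughly 0.141, 0.510, 0.300 and 0.439, so ID 1 is the nearest centroid.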
Given the instances and the randomly selected points , calculate the distance from each point in to its nearest neighbour in . Then, compute the average value.
Consider two instances a and b, each described by two categorical features, Feature A and Feature B. Given that the values for a are ("Yes", "<20") and the values for b are ("Yes", "<20"), what is the Gower distance between a and b?
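For purely categorical features, the Gower distance is commonly computed as the average of per-feature mismatches (0 if the values are equal, 1 otherwise). The sketch below assumes that definition with equal feature weights; the function name is hypothetical.

```python
def gower_categorical(a, b):
    # Average simple-matching dissimilarity over categorical features:
    # each feature contributes 0 when the values match and 1 when they differ.
    if len(a) != len(b):
        raise ValueError("instances must have the same number of features")
    mismatches = [0.0 if x == y else 1.0 for x, y in zip(a, b)]
    return sum(mismatches) / len(mismatches)

a = ("Yes", "<20")
b = ("Yes", "<20")
d_ab = gower_categorical(a, b)
print(d_ab)
```

Because a and b agree on both features, every per-feature term is 0 and the distance is 0.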