Looking for test answers and solutions for Data Analytics (Eng) / Data Analitika (Ing) - 344? Browse our large collection of verified answers for Data Analytics (Eng) / Data Analitika (Ing) - 344 at stemlearn.sun.ac.za.
Assume that the K-medoid clustering algorithm has been applied to a data set described by two descriptive features. The table below shows the instances assigned to cluster 1. What is the ID of the instance that should represent the cluster? Assume that Manhattan distance is used.
ID d1 d2
4 3 4
5 6 2
9 6 4
13 7 3
15 7 4
17 7 6
19 8 5
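One way to check the K-medoid question above is to compute, for every instance in the cluster, the sum of its Manhattan distances to all other members and pick the instance with the smallest total. The following is a minimal sketch under that reading of "should represent the cluster"; the names `cluster`, `manhattan`, and `total_distance` are illustrative and not from the course material.

```python
# Cluster 1 instances copied from the table: ID -> (d1, d2)
cluster = {
    4: (3, 4),
    5: (6, 2),
    9: (6, 4),
    13: (7, 3),
    15: (7, 4),
    17: (7, 6),
    19: (8, 5),
}


def manhattan(p, q):
    """Manhattan distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


# Total distance from each candidate to every cluster member.
# The distance from a point to itself is 0, so including it is harmless.
total_distance = {
    cid: sum(manhattan(point, other) for other in cluster.values())
    for cid, point in cluster.items()
}

# The medoid is the member with the smallest total distance.
medoid_id = min(total_distance, key=total_distance.get)

print(total_distance)
print("medoid ID:", medoid_id)
```

With the values in the table, the smallest total is 13, which belongs to instance 15; instance 9 comes second with a total of 14.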
What is a significant advantage of using the X-Means clustering algorithm over the standard k-means algorithm?
Given the instances , the randomly selected points , and the randomly sampled points , calculate the Hopkins statistic. Round your answer to three decimal places.
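The point sets for this question did not survive in the text, so the sketch below uses made-up data purely to illustrate the calculation. It assumes one common convention for the Hopkins statistic, H = sum(u) / (sum(u) + sum(w)), where u holds the nearest-neighbour distances from the randomly selected points to the data set and w holds the nearest-neighbour distances from the sampled instances to the rest of the data set. The course may define it differently (for example with the roles of u and w swapped), so check the definition before relying on this. All names and values here are hypothetical.

```python
import math


def nearest_distance(point, others):
    """Euclidean distance from point to its nearest neighbour in others."""
    return min(math.dist(point, o) for o in others)


def hopkins(data, random_points, sampled_indices):
    """Hopkins statistic under the assumed convention H = sum(u) / (sum(u) + sum(w))."""
    # u: distances from each randomly selected point to the nearest instance
    u = [nearest_distance(p, data) for p in random_points]
    # w: distances from each sampled instance to its nearest other instance
    w = [
        nearest_distance(data[i], data[:i] + data[i + 1:])
        for i in sampled_indices
    ]
    return sum(u) / (sum(u) + sum(w))


# Hypothetical data, NOT the values from the original question.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
random_points = [(0.5, 0.5), (2.0, 2.0)]
sampled_indices = [0, 3]  # positions of the sampled instances in data

h = hopkins(data, random_points, sampled_indices)
print(round(h, 3))
```

Passing the sampled instances as indices rather than coordinates is a design choice of this sketch: it lets each instance be excluded from its own neighbour search even when the data set contains duplicate points.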
The table below shows a small data set in which each instance is described by three features. The k-means clustering algorithm is to be applied to this data set with k = 2 and using Euclidean distance. The initial centroids for the two clusters C1 and C2 are c1 = (0.4; 0.3; 0.2) and c2 = (0.7; 0.8; 0.7).
d1 d2 d3 dist(c1) dist(c2)
0.392 1.258 0.666 1.065356 0.552977
0.251 1.781 1.495 1.972964 1.340144
0.823 0.042 1.254 1.164650 0.946894
0.917 0.961 0.055 0.851607 0.699310
0.736 1.694 0.686 1.514044 0.894834
1.204 0.605 0.351 0.873065 0.643306
0.778 0.436 0.220 0.402219 0.607437
1.075 1.199 0.141 1.125747 0.782500
0.854 0.654 0.771 0.810847 0.223770
Assign the data points to the appropriate clusters and calculate the new cluster centroids. What is the d1 coordinate of the centroid of the first cluster?
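The assignment and update steps asked for above can be sketched as follows. This is a minimal illustration of one k-means iteration with Euclidean distance, using the points and initial centroids from the question; the variable names are illustrative, and the code recomputes the distances rather than reading them from the table.

```python
import math

# Instances from the table: (d1, d2, d3)
points = [
    (0.392, 1.258, 0.666),
    (0.251, 1.781, 1.495),
    (0.823, 0.042, 1.254),
    (0.917, 0.961, 0.055),
    (0.736, 1.694, 0.686),
    (1.204, 0.605, 0.351),
    (0.778, 0.436, 0.220),
    (1.075, 1.199, 0.141),
    (0.854, 0.654, 0.771),
]

# Initial centroids c1 and c2 from the question
centroids = [(0.4, 0.3, 0.2), (0.7, 0.8, 0.7)]

# Assignment step: each point joins the cluster of its nearest centroid.
clusters = [[] for _ in centroids]
for p in points:
    nearest = min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
    clusters[nearest].append(p)

# Update step: each new centroid is the per-feature mean of its members.
# (Neither cluster is empty for this data, so the division is safe.)
new_centroids = [
    tuple(sum(coord) / len(members) for coord in zip(*members))
    for members in clusters
]

print("cluster sizes:", [len(c) for c in clusters])
print("new c1:", new_centroids[0])
print("new c2:", new_centroids[1])
```

For this data only the instance (0.778, 0.436, 0.220) is closer to c1 than to c2, so the first cluster has a single member and its new centroid has a d1 coordinate of 0.778.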
We want to use clustering to transmit a compressed input signal over a network. Our network consists of an encoder and a decoder. The encoder uses pre-computed cluster centroids found with k-means clustering and Euclidean distance to find a representative example of the input signal. The ID of the representative example is then transmitted over the network. Given the cluster centroids:
ID d1 d2 d3
1 0.5 0.3 0.4
2 0.7 0.8 0.3
3 0.6 0.6 0.5
4 0.2 0.1 0.15
What ID will be transmitted if the input signal is (0.4, 0.4, 0.4)?
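The encoder described above amounts to a nearest-centroid lookup. The sketch below assumes the encoder simply picks the centroid with the smallest Euclidean distance to the input and that the decoder returns the stored centroid for the received ID; `codebook`, `encode`, and `decode` are illustrative names, not part of the original question.

```python
import math

# Pre-computed cluster centroids from the table: ID -> (d1, d2, d3)
codebook = {
    1: (0.5, 0.3, 0.4),
    2: (0.7, 0.8, 0.3),
    3: (0.6, 0.6, 0.5),
    4: (0.2, 0.1, 0.15),
}


def encode(signal):
    """Return the ID of the centroid nearest to the signal (Euclidean distance)."""
    return min(codebook, key=lambda cid: math.dist(signal, codebook[cid]))


def decode(cid):
    """Return the centroid that stands in for the original signal."""
    return codebook[cid]


transmitted_id = encode((0.4, 0.4, 0.4))
print("transmitted ID:", transmitted_id)
print("reconstructed signal:", decode(transmitted_id))
```

For the input (0.4, 0.4, 0.4) the squared distances to centroids 1 to 4 are 0.02, 0.26, 0.09, and 0.1925, so ID 1 is transmitted.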
Given the instances and the randomly selected points , calculate the distance from each point in to its nearest neighbour in . Then, compute the average value.
Consider two instances a and b, each described by two categorical features, Feature A and Feature B. Given that the values for a are ("Yes", "<20") and the values for b are ("Yes", "<20"), what is the Gower distance between a and b?
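For purely categorical features, the Gower distance reduces to the fraction of features on which the two instances disagree. The sketch below covers only that case, with equal feature weights assumed; the full Gower measure also handles numeric features through range normalisation, which is not shown here. The function name `gower_categorical` is illustrative.

```python
def gower_categorical(a, b):
    """Gower distance for instances with categorical features only.

    Each feature contributes 0 if the values match and 1 otherwise;
    the result is the mean contribution (equal weights assumed).
    """
    if len(a) != len(b):
        raise ValueError("instances must have the same number of features")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / len(a)


# Values from the question: (Feature A, Feature B)
a = ("Yes", "<20")
b = ("Yes", "<20")

distance = gower_categorical(a, b)
print(distance)
```

Because a and b agree on both features, the distance is 0. If they differed on one of the two features, the same function would return 0.5.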