Crowdly

Data Analytics (Eng) / Data Analitika (Ing) - 344

Looking for Data Analytics (Eng) / Data Analitika (Ing) - 344 test answers and solutions? Browse our comprehensive collection of verified answers for Data Analytics (Eng) / Data Analitika (Ing) - 344 at stemlearn.sun.ac.za.

The graph provided below displays the distribution of each of the thirteen features. What three features will influence the distance calculation the most?

0%
0%
0%
0%
0%
0%
100%
0%
0%
0%
100%
100%
0%
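
The graph is not reproduced here, so the following is only a sketch with made-up values. It illustrates the general point behind the question: in an unscaled Euclidean distance, the features with the widest value ranges contribute most of the distance. The two features and their values are hypothetical.

```python
import math

# Hypothetical instances: (narrow_range_feature, wide_range_feature)
a = (0.2, 100.0)
b = (0.9, 900.0)

def euclidean(p, q):
    """Plain Euclidean distance with no feature scaling."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

def squared_contributions(p, q):
    """Share of the squared Euclidean distance owed to each feature."""
    squared = [(x - y) ** 2 for x, y in zip(p, q)]
    total = sum(squared)
    return [s / total for s in squared]

shares = squared_contributions(a, b)
print(euclidean(a, b), shares)  # the wide-range feature dominates
```

Normalising or standardising the features before computing distances is the usual way to remove this effect.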

Hierarchical clustering is usually defined as an optimization problem

0%
100%

The result of a clustering algorithm can be used to perform classification.

100%
0%

Consider the proximity matrix provided below.

     A      B      C      D
A    0      3      2.2    1
B    3      0      1.41   3.16
C    2.2    1.41   0      2
D    1      3.16   2      0

What instances should be joined when average link is used?

0%
0%
0%
0%
100%
0%
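
A minimal sketch of how the first merge could be worked out from the matrix in this question. While every cluster is still a single instance, the average-link distance between two clusters is just the matrix entry, so the closest pair is joined. The helper names are illustrative, not part of the course material.

```python
from itertools import combinations

# Upper triangle of the proximity matrix from the question
D = {
    ("A", "B"): 3.0, ("A", "C"): 2.2, ("A", "D"): 1.0,
    ("B", "C"): 1.41, ("B", "D"): 3.16, ("C", "D"): 2.0,
}

def dist(x, y):
    """Look up a symmetric distance."""
    return D[(x, y)] if (x, y) in D else D[(y, x)]

def average_link(c1, c2):
    """Mean of all pairwise distances between members of two clusters."""
    return sum(dist(x, y) for x in c1 for y in c2) / (len(c1) * len(c2))

def closest_pair(clusters):
    """Pair of clusters with the smallest average-link distance."""
    return min(combinations(clusters, 2), key=lambda pair: average_link(*pair))

clusters = [("A",), ("B",), ("C",), ("D",)]
first_merge = closest_pair(clusters)
print(first_merge)  # (('A',), ('D',))

# After merging A and D, distances to the new cluster are averaged
print(average_link(("A", "D"), ("B",)))  # (3 + 3.16) / 2
print(average_link(("A", "D"), ("C",)))  # (2.2 + 2) / 2
```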

Consider the proximity matrix provided below.

     A      B      C      D
A    0      3      2.2    1
B    3      0      1.41   3.16
C    2.2    1.41   1      2
D    1      3.16   2      0

Does the proximity matrix contain any errors?

100%
0%
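
One way to check a proximity matrix is to test two basic properties of a distance: every instance is at distance 0 from itself, and the matrix is symmetric. The sketch below applies both checks to the matrix from this question. The function name is illustrative.

```python
labels = ["A", "B", "C", "D"]
M = [
    [0,   3,    2.2,  1],
    [3,   0,    1.41, 3.16],
    [2.2, 1.41, 1,    2],
    [1,   3.16, 2,    0],
]

def find_problems(matrix, names):
    """Report non-zero diagonal entries and asymmetric pairs."""
    problems = []
    n = len(matrix)
    for i in range(n):
        if matrix[i][i] != 0:
            problems.append(f"d({names[i]},{names[i]}) = {matrix[i][i]}, expected 0")
        for j in range(i + 1, n):
            if matrix[i][j] != matrix[j][i]:
                problems.append(f"d({names[i]},{names[j]}) != d({names[j]},{names[i]})")
    return problems

problems = find_problems(M, labels)
print(problems)  # ['d(C,C) = 1, expected 0']
```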

Consider the dendrogram illustrated below. If a horizontal line is drawn at 3.5 how many clusters will be formed?

4.0     xxxxx 
3.5     x   x 
3.0   xxxx  x 
2.5   x  x  x 
2.0 xxxx x  x 
1.5 x  x x xxx
1.0xxx x x x x
0.5x x x x x x
0.0x x x x x x
   A B C D E F
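
A short sketch of how the cut could be counted. The merge heights below are read off the diagram above and are an interpretation of the plot: A and B join at 1.0, E and F at 1.5, C joins at 2.0, D at 3.0, and the two remaining groups join at 4.0. Every merge that lies below the cut line reduces the number of clusters by one.

```python
# Merge heights as read from the diagram (an interpretation, not given data)
merges = [
    (1.0, "A", "B"),
    (1.5, "E", "F"),
    (2.0, "AB", "C"),
    (3.0, "ABC", "D"),
    (4.0, "ABCD", "EF"),
]
n_leaves = 6

def clusters_at(height, merges, n_leaves):
    """Number of clusters left when the dendrogram is cut at `height`."""
    return n_leaves - sum(1 for h, _, _ in merges if h < height)

k = clusters_at(3.5, merges, n_leaves)
print(k)  # 2
```

This agrees with the row at 3.5 in the diagram, where the line crosses two vertical stems.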
Given two clusters, Cluster A containing the instances { (1, 2), (3, 4), (5, 6) } and Cluster B containing the instances { (7, 8), (9, 10), (11, 12) }, calculate the single link distance between these two clusters using Manhattan distance.
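
Single link takes the smallest distance over all pairs with one instance from each cluster. A minimal sketch of that calculation for the two clusters in this question, with illustrative function names:

```python
cluster_a = [(1, 2), (3, 4), (5, 6)]
cluster_b = [(7, 8), (9, 10), (11, 12)]

def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(p, q))

def single_link(c1, c2):
    """Smallest pairwise distance between members of the two clusters."""
    return min(manhattan(p, q) for p in c1 for q in c2)

d = single_link(cluster_a, cluster_b)
print(d)  # 4, from the pair (5, 6) and (7, 8)
```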

Assume that the K-medoid clustering algorithm has been applied to a data set described by two descriptive features. The table below shows the instances assigned to cluster 1. What is the ID of the instance that should represent the cluster? Assume that Manhattan distance is used.

ID   d1   d2
4    3    4
5    6    2
9    6    4
13   7    3
15   7    4
17   7    6
19   8    5

0%
0%
0%
0%
0%
100%
0%
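
A sketch of the medoid calculation, assuming the usual definition: the medoid is the cluster member with the smallest total Manhattan distance to all other members. The variable and function names are illustrative.

```python
# Instances assigned to cluster 1: ID -> (d1, d2)
cluster = {
    4: (3, 4), 5: (6, 2), 9: (6, 4), 13: (7, 3),
    15: (7, 4), 17: (7, 6), 19: (8, 5),
}

def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(p, q))

def total_distance(candidate_id, members):
    """Total distance from one candidate to every member of the cluster."""
    return sum(manhattan(members[candidate_id], p) for p in members.values())

costs = {i: total_distance(i, cluster) for i in cluster}
medoid = min(costs, key=costs.get)
print(costs)   # {4: 29, 5: 22, 9: 14, 13: 16, 15: 13, 17: 21, 19: 21}
print(medoid)  # 15
```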

Which of the following best describes the steps involved in the k-means++ initialization process?

0%
0%
0%
100%
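
The answer options are not shown here, so as a reference point, this is a minimal sketch of the k-means++ initialisation as it is commonly described: the first centre is chosen uniformly at random, and each further centre is sampled with probability proportional to the squared distance from a point to its nearest already-chosen centre. The data points and the function name are hypothetical.

```python
import random

def squared_distance(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))

def kmeans_pp_init(points, k, rng):
    """Choose k initial centres from `points` (assumes k <= distinct points)."""
    # Step 1: pick the first centre uniformly at random.
    centres = [rng.choice(points)]
    while len(centres) < k:
        # Step 2: squared distance from each point to its nearest chosen centre.
        weights = [min(squared_distance(p, c) for c in centres) for p in points]
        # Step 3: sample the next centre with probability proportional to that weight.
        centres.append(rng.choices(points, weights=weights, k=1)[0])
    return centres

# Hypothetical two-dimensional data
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centres = kmeans_pp_init(points, 3, random.Random(0))
print(centres)
```

Points that are already centres have weight 0, so they cannot be drawn again. The chosen centres are then passed to the standard k-means iterations.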

What is a significant advantage of using the X-Means clustering algorithm over the standard k-means algorithm?

0%
0%
100%
0%
