Crowdly

Data Analytics (Eng) / Data Analitika (Ing) - 344

Looking for test answers and solutions for Data Analytics (Eng) / Data Analitika (Ing) - 344? Browse our collection of verified answers for this course at stemlearn.sun.ac.za.

Get instant access to accurate answers and detailed explanations for your course questions. Our community-driven platform helps students succeed!

When a numerical feature is converted into a categorical feature with 5 categories for use in a decision tree root-node split, the sizes (numbers of instances) of the root's child nodes after splitting on the categorical feature are the same regardless of whether equal-width or equal-frequency binning is used.
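To see why the claim above is generally false, the two binning schemes can be compared on a small sample. The values below are hypothetical, chosen only to illustrate the behavior; they are not from the course dataset.

```python
# Illustrative comparison of equal-width vs equal-frequency binning.
# The sample values are hypothetical (skewed on purpose) to show that
# the two schemes generally yield different bin (child-node) sizes.
values = [1, 2, 2, 3, 4, 20, 40, 60, 80, 100]
k = 5  # number of categories, i.e. child nodes of the root

# Equal-width binning: split the range [min, max] into k equal intervals.
lo, hi = min(values), max(values)
width = (hi - lo) / k
ew_counts = [0] * k
for v in values:
    idx = min(int((v - lo) // width), k - 1)  # clamp the max value into the last bin
    ew_counts[idx] += 1

# Equal-frequency binning: sort, then cut into k chunks of (near-)equal size.
sv = sorted(values)
n = len(sv)
ef_counts = [len(sv[i * n // k:(i + 1) * n // k]) for i in range(k)]

print(ew_counts)  # sizes differ when the values are skewed
print(ef_counts)  # sizes are (near-)equal by construction
```

With skewed data, equal-width bins concentrate most instances in one bin, while equal-frequency bins hold the counts (near-)equal, so the child-node sizes coincide only in special cases.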

Consider a scenario where a dataset (D) has 10 instances. Each instance is represented by 4 descriptive features (F1, F2, F3 and F4) and has 1 target feature that takes binary values. Because the target feature is binary, only 1 bit is required to store the target value of each instance. Features F1, F2, F3 and F4 are all categorical, with 5, 4, 3, and 2 possible discrete values, respectively. The decision trees trained on D will be deployed to classify previously unseen instances on a high-performance computer, so algorithm efficiency is not a concern. Given this context, which measure is the most appropriate for the ID3 algorithm to use when inducing a decision tree from D?

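The question above turns on how a splitting measure behaves when feature arities differ. A minimal sketch (toy data, not the course dataset) of information gain versus gain ratio shows the issue: raw information gain cannot distinguish a genuinely predictive binary feature from a many-valued feature that "memorizes" instances, while gain ratio penalizes the high-arity split.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(pairs):
    """pairs: list of (feature_value, label). Returns H(target) - H(target | feature)."""
    labels = [y for _, y in pairs]
    n = len(labels)
    by_value = {}
    for v, y in pairs:
        by_value.setdefault(v, []).append(y)
    remainder = sum(len(part) / n * entropy(part) for part in by_value.values())
    return entropy(labels) - remainder

def gain_ratio(pairs):
    """Information gain divided by the split information (entropy of the feature itself)."""
    return information_gain(pairs) / entropy([v for v, _ in pairs])

# Hypothetical 10-instance sample with a binary target:
target = ['yes'] * 5 + ['no'] * 5
f_two = ['p'] * 5 + ['q'] * 5          # binary feature, perfectly predictive
f_id = [str(i) for i in range(10)]     # 10 unique values, "memorizes" each instance

print(information_gain(list(zip(f_two, target))))  # both gains are maximal...
print(information_gain(list(zip(f_id, target))))
print(gain_ratio(list(zip(f_two, target))))        # ...but gain ratio prefers f_two
print(gain_ratio(list(zip(f_id, target))))
```

Both features achieve the maximal information gain of 1 bit here, so information gain alone cannot choose between them; gain ratio divides by the split information and ranks the binary feature higher.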

What is the cost of the product associated with the transaction with an ID of 6000 in the given dataset?


What is the entropy of the last 10 instances of the given dataset?

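The entropy computation this question asks for can be sketched as follows. The target labels below are hypothetical stand-ins; the actual last 10 instances of the dataset are not reproduced here.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

# Hypothetical target values for 10 instances (a 6/4 class split):
labels = ['yes'] * 6 + ['no'] * 4
print(round(entropy(labels), 4))
```

For the real question, substitute the target values of the dataset's last 10 instances into `labels`.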

Use rpart to build a decision tree based on the dataframe created from the given dataset. Train the model using only the descriptive features, excluding ID. Set the parameters as follows: the complexity parameter should be 0.001, the minimum split size 10, the minimum leaf size 5, and the maximum tree depth 3. How many instances are represented in the rightmost leaf node at the lowest level of the resulting decision tree?


Use rpart to build a decision tree based on the dataframe created from the given dataset. Train the model using only the descriptive features, excluding ID. Set the parameters as follows: the complexity parameter should be 0.001, the minimum split size 10, the minimum leaf size 5, and the maximum tree depth 2. How many instances are represented in the leftmost leaf node at the lowest level of the resulting decision tree?


Use rpart to build a decision tree based on the last 1000 instances of the given dataset. Train the model using only the descriptive features, excluding ID. Set the parameters as follows: the complexity parameter should be 0.01, the minimum split size 2, the minimum leaf size 1, and the maximum tree depth 5. Use this tree to make a prediction for the instance represented by the following vector (only the target feature is excluded from this vector): (10001,B,Flight,3,4,2412,3,high,F,6,5.68).

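The rpart questions above all hinge on the same control parameters (cp, minsplit, minbucket, maxdepth). As a rough sketch of how the non-cp stopping rules constrain tree growth, not rpart itself and ignoring cp-based complexity pruning, the split decision at a node can be modeled like this (the `n_instances >= 2 * minbucket` check is a simplification: rpart actually verifies minbucket per child after choosing a split):

```python
def can_split(n_instances, depth, minsplit=10, minbucket=5, maxdepth=3):
    """Mirror rpart's basic stopping rules (cp-based pruning not modeled):
    a node is split only if it holds at least `minsplit` instances,
    both prospective children could reach `minbucket` instances,
    and the node is above the maximum depth (root counted as depth 0,
    as in rpart.control)."""
    return (
        n_instances >= minsplit           # minsplit: enough data to attempt a split
        and n_instances >= 2 * minbucket  # simplification: both children can reach minbucket
        and depth < maxdepth              # maxdepth: no splits at the bottom level
    )

print(can_split(12, 1))   # a node with 12 instances at depth 1 may split
print(can_split(9, 1))    # 9 instances is below minsplit = 10, so no split
print(can_split(100, 3))  # nothing splits at depth 3, the maximum
```

Under these settings a leaf can therefore never hold fewer than 5 instances, and no leaf can sit deeper than the stated maximum depth, which bounds where the leftmost and rightmost lowest-level leaves can appear.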

What is the Gini index information gain of the instances with IDs in the range [6000, 6020] when they are split on Product_importance? Note that the range is inclusive of its boundary values.

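The Gini gain computation can be sketched as below. The (Product_importance, target) pairs are hypothetical; the real 21 instances with IDs 6000 to 6020 are not reproduced here.

```python
from collections import Counter

def gini(labels):
    """Gini index: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def gini_gain(pairs):
    """Reduction in Gini index from splitting (feature_value, label) pairs
    on the feature: Gini(parent) minus the size-weighted Gini of the partitions."""
    labels = [y for _, y in pairs]
    n = len(labels)
    parts = {}
    for v, y in pairs:
        parts.setdefault(v, []).append(y)
    weighted = sum(len(p) / n * gini(p) for p in parts.values())
    return gini(labels) - weighted

# Hypothetical (Product_importance, target) pairs:
pairs = [('low', 'yes'), ('low', 'yes'), ('low', 'no'),
         ('medium', 'yes'), ('medium', 'no'), ('high', 'no')]
print(round(gini_gain(pairs), 4))
```

For the real question, replace `pairs` with the Product_importance values and target labels of the 21 instances in the ID range.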

What is the Gini index of the instances with IDs in the range [6000, 6020]? Note that the range is inclusive of its boundary values.


What is the entropy information gain of the instances with IDs in the range [6000, 6020] when they are split on Gender? Note that the range is inclusive of its boundary values.

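The entropy-based information gain for a split on a binary feature such as Gender follows the same parent-minus-remainder pattern. The (Gender, target) pairs below are hypothetical stand-ins for the instances with IDs 6000 to 6020.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

# Hypothetical (Gender, target) pairs:
pairs = [('F', 'yes'), ('F', 'yes'), ('F', 'no'),
         ('M', 'yes'), ('M', 'no'), ('M', 'no')]
labels = [y for _, y in pairs]
n = len(labels)

# Partition the labels by Gender, then compute the weighted remainder.
groups = {}
for g, y in pairs:
    groups.setdefault(g, []).append(y)
remainder = sum(len(p) / n * entropy(p) for p in groups.values())

info_gain = entropy(labels) - remainder
print(round(info_gain, 4))
```

For the real question, substitute the Gender values and target labels of the instances in the ID range into `pairs`.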
