Introduction to Data Science (LTAT.02.002)

Test questions from the course Introduction to Data Science (LTAT.02.002) at moodle.ut.ee.


Your task is to train a regression model on a dataset that contains 2000 numeric features. You have tried and found out that using all of these features results in massive overfitting. Can the following methods help?

You are given 10 people’s shopping lists:

ID   Items
1    Shirt, Lantern, Meat, Milk
2    Bread, Shampoo, Ice Cream, Meat
3    Eggs, Frozen Pizza, Ice Cream
4    Ice Cream, Lantern, Meat, Milk
5    Milk, Bread, Sugar
6    Salt, Milk, Muesli
7    Chewing gum, Milk, Sour Cream
8    Sweets, Nuts, Cookies
9    Ice Cream, Bread
10   Meat, Milk, Butter, Bread

Using these data, what is the confidence of the association rule Milk -> Meat?
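
The confidence of a rule A -> B is the number of transactions containing both A and B divided by the number containing A. The sketch below applies that definition to the ten shopping lists above; the helper name `confidence` and the dictionary layout are illustrative choices, not part of the original question.

```python
# The ten shopping lists from the question, keyed by ID.
transactions = {
    1: {"Shirt", "Lantern", "Meat", "Milk"},
    2: {"Bread", "Shampoo", "Ice Cream", "Meat"},
    3: {"Eggs", "Frozen Pizza", "Ice Cream"},
    4: {"Ice Cream", "Lantern", "Meat", "Milk"},
    5: {"Milk", "Bread", "Sugar"},
    6: {"Salt", "Milk", "Muesli"},
    7: {"Chewing gum", "Milk", "Sour Cream"},
    8: {"Sweets", "Nuts", "Cookies"},
    9: {"Ice Cream", "Bread"},
    10: {"Meat", "Milk", "Butter", "Bread"},
}


def confidence(transactions, antecedent, consequent):
    """Return count(antecedent and consequent) / count(antecedent)."""
    baskets = list(transactions.values())
    # Baskets that contain every item of the antecedent.
    antecedent_count = sum(1 for items in baskets if antecedent <= items)
    # Baskets that contain every item of both sides of the rule.
    both_count = sum(1 for items in baskets if (antecedent | consequent) <= items)
    if antecedent_count == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    return both_count / antecedent_count


milk_to_meat = confidence(transactions, {"Milk"}, {"Meat"})
print(milk_to_meat)  # 0.5
```

Milk appears in six lists (1, 4, 5, 6, 7, 10) and Meat appears alongside it in three of them (1, 4, 10), so the confidence is 3/6 = 0.5.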

Suppose you have trained a model that can predict any student’s exam score based on homework scores. After the exam, you want to evaluate the quality of predictions. Which measures can be used to evaluate how far the predicted scores are from the actual scores?
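
Two measures commonly used for this kind of comparison are the mean absolute error (MAE) and the root mean squared error (RMSE). The answer options of the original question are not shown here, so the sketch below only illustrates how these two are computed; the scores are made-up example values, and the function names are illustrative.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average of |predicted - actual| over all students."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference."""
    mse = sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


# Hypothetical exam scores, invented purely for illustration.
actual_scores = [70.0, 85.0, 90.0, 60.0]
predicted_scores = [75.0, 80.0, 88.0, 66.0]

mae = mean_absolute_error(actual_scores, predicted_scores)
rmse = root_mean_squared_error(actual_scores, predicted_scores)
print(mae)   # 4.5
print(rmse)
```

Both are expressed in the same units as the exam score; RMSE penalizes large individual errors more heavily than MAE.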

You decided to use grid search to tune the hyperparameters of your RandomForest model on your test set. Then you estimate the accuracy of your model on that test set. Your estimate of the accuracy will be:
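
Selecting hyperparameters on the same data used for the final estimate tends to make that estimate optimistic. The simulation below is a minimal sketch of why, using only the standard library rather than an actual RandomForest: each hypothetical "hyperparameter setting" is a random guesser whose true accuracy is 0.5, yet the best score picked on the test set looks better than that. The sizes (50 test points, 200 settings) and the seed are arbitrary assumptions.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

n_test = 50
test_labels = [rng.randint(0, 1) for _ in range(n_test)]


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    hits = sum(1 for p, y in zip(predictions, labels) if p == y)
    return hits / len(labels)


# 200 hypothetical hyperparameter settings; each "model" guesses at random,
# so its true accuracy on fresh data is 0.5.
candidate_predictions = [
    [rng.randint(0, 1) for _ in range(n_test)] for _ in range(200)
]
test_scores = [accuracy(p, test_labels) for p in candidate_predictions]

best_score = max(test_scores)                     # what tuning on the test set reports
mean_score = sum(test_scores) / len(test_scores)  # close to the true accuracy

print(best_score, mean_score)
```

The gap between `best_score` and `mean_score` comes purely from picking the maximum of many noisy estimates, which is the reason a separate validation set or cross-validation on the training data is normally used for tuning.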

You have a dataset that contains information about land prices in different areas of the world. Your task is to create a model that would be capable of predicting the price of any piece of land. Which of the following methods could be used for this task?

Which tasks can be achieved with principal component analysis (PCA) without having to combine with other methods?

The practice exam is only for practicing, and the points gathered here do not contribute to the total points in this course!

The practice exam consists of two parts:

Part 1 has a time limit of 15 minutes and consists of 7 short questions (2 questions each worth 0.5 points and 5 questions each worth 0.8 points).

Part 2 has a time limit of 15 minutes and consists of 1 long question (worth 5 points).

The tasks of the practice exam will be made available for all of you and there is no need to save them for yourself.


On the topic of statistical hypothesis testing, please do the following:

1. Come up with and describe a data mining scenario where statistical hypothesis testing could be used.
2. Define a null hypothesis and the corresponding alternative hypothesis in this scenario.
3. Explain what confidence level and p-value would mean in this scenario.
4. What are the two different possible results that the statistical test can have?
5. What can the data miner conclude from one or the other result?

Note that we expect about 1-3 sentences of text for each of the above points. Please start the answer to each point from a new line and with the respective number (1., 2., 3., 4., 5.).
