Looking for test answers and solutions for Data Mining and Decision Support-Lecture,Section-1-Fall 2025? Browse our large collection of verified answers for Data Mining and Decision Support-Lecture,Section-1-Fall 2025 at moodle.nu.edu.kz.
During cross-validation of your MLP, you observe a clear overfitting pattern: a large gap between the training and test errors. Which of the following is the right direction (or directions) to resolve it?
Compared to SVC in sklearn, SVR has one additional important parameter. It sets a margin of tolerance: the training loss function assigns no penalty to points whose prediction falls within this distance of the actual value. What is the parameter?
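The tolerance margin described above corresponds to an epsilon-insensitive loss; in scikit-learn's SVR it is exposed as the `epsilon` argument. The following is only a minimal pure-Python sketch of that loss, not scikit-learn's implementation. The function name `epsilon_insensitive_loss` and the sample values are hypothetical and chosen purely for illustration.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Mean epsilon-insensitive loss (hypothetical helper for illustration).

    Errors with |y_true - y_pred| <= epsilon contribute zero penalty;
    larger errors are penalized only by the amount exceeding epsilon.
    """
    losses = [max(0.0, abs(t - p) - epsilon) for t, p in zip(y_true, y_pred)]
    return sum(losses) / len(losses)


# Made-up targets and predictions with absolute errors of 0.05, 0.5 and 1.0.
y_true = [1.0, 2.0, 3.0]
y_pred = [1.05, 2.5, 2.0]

# A narrow tube penalizes the two larger errors; a wide tube ignores all three.
loss_narrow = epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1)
loss_wide = epsilon_insensitive_loss(y_true, y_pred, epsilon=1.0)
print(loss_narrow, loss_wide)
```

Widening the tube makes the model more tolerant of small deviations, which is the trade-off this parameter controls.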
How do we prevent overfitting?
Choose all the right evaluation metrics for classification.
(Note: TP = True Positive, FN = False Negative, FP = False Positive, TN = True Negative.)
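As a reminder of how the four confusion-matrix counts combine into common classification metrics, here is a minimal sketch. The function name `classification_metrics` and the example counts are assumptions made for illustration; they do not come from the quiz.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute standard classification metrics from confusion-matrix counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative counts only (not taken from any real dataset).
m = classification_metrics(tp=40, fn=10, fp=20, tn=30)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Recall is driven by FN and precision by FP, so the choice between them depends on which error type is more costly in the application.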
For the binary classification examples below, which is more important: a false negative or a false positive? (NOTE: Points are deducted for wrong answers. 0 points for not answering. If you make specific assumptions, please write them down on your blank paper.)
Example 1: What if a jury or judge decides to let a criminal go free? (Assume 'positive' is the prediction 'criminal'.)
Example 2: Assume an airport 'A' has received high-security threats and, based on certain characteristics, identifies whether a particular passenger may be a threat. Due to a staff shortage, the airport decides to scan only the passengers its predictive model flags as risk positives. What happens if a passenger who is a true threat is predicted as a non-threat by the airport's model? (Assume 'positive' is the prediction 'threat/risk'.)
While high bias and low variance could result in over-fitting, low bias and high variance could result in under-fitting.
(NOTE: Points are deducted for wrong answers. 0 points for not answering.)
Based on unlabeled training data, supervised learning (SL) algorithm(s) can be used to build an optimized model.
(NOTE: Points are deducted for wrong answers. 0 points for not answering.)
Assume a data scientist splits a dataset into training and test sets using a test size of 0.3, and then applies K-fold (k = 5) cross-validation. In that case, which dataset(s) can be used as a validation set? (NOTE: Points are deducted for wrong answers. 0 points for not answering.)
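The setup in this question can be sketched with plain index bookkeeping. This is a minimal sketch under one common convention, where the test set is held out first and the folds are then drawn from the remaining training portion. The helper names `train_test_split_indices` and `k_fold`, the sample size of 100, and the seed are all hypothetical.

```python
import random


def train_test_split_indices(n, test_size=0.3, seed=0):
    """Shuffle indices 0..n-1 and hold out a test_size fraction as the test set."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(round(n * test_size))
    return idx[n_test:], idx[:n_test]  # (train indices, test indices)


def k_fold(indices, k=5):
    """Yield (train, validation) index lists; each fold is validation once."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, val


train_idx, test_idx = train_test_split_indices(100, test_size=0.3)
splits = list(k_fold(train_idx, k=5))

for fold_no, (tr, val) in enumerate(splits, start=1):
    print(f"fold {fold_no}: {len(tr)} training rows, {len(val)} validation rows")
```

Under this convention every validation fold is a slice of the training portion, and the held-out test indices never appear in any fold.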
In a general ML/DM process, feature selection/reduction should be performed after model selection.
For the learning rate in gradient descent, if it is too big, it will require a lot of training time.
(NOTE: Points are deducted for wrong answers. 0 points for not answering.)
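To make the effect of the learning rate concrete, the sketch below runs plain gradient descent on the toy objective f(x) = x^2, whose gradient is 2x. The objective, the function name `gradient_descent`, the step count, and the three learning rates are assumptions chosen for illustration; behavior on a real model may differ.

```python
def gradient_descent(lr, x0=1.0, steps=50):
    """Minimize f(x) = x**2 with a fixed learning rate; return the final x."""
    x = x0
    for _ in range(steps):
        grad = 2 * x          # derivative of x**2
        x = x - lr * grad     # standard gradient descent update
    return x


x_small = gradient_descent(lr=0.001)  # tiny steps: barely moves in 50 updates
x_good = gradient_descent(lr=0.1)     # moderate steps: ends close to the minimum at 0
x_big = gradient_descent(lr=1.1)      # oversized steps: overshoots and grows in magnitude

print(x_small, x_good, x_big)
```

On this toy problem the very small rate is the one that needs many more updates, while the overly large rate overshoots the minimum on every step and moves away from it.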