Looking for test answers and solutions for Course 40420? Browse our large collection of verified answers for Course 40420 at learning.monash.edu.
The following output is the summary of a model-based clustering. Which model would be considered the third best?
Bayesian Information Criterion (BIC):
    EII   VII   EEI   VEI   EVI   VVI   EEE   VEE   EVE   VVE
1 -3907 -3907 -3925 -3925 -3925 -3925 -3114 -3114 -3114 -3114
2 -3174 -3143 -3058 -3048 -3040 -3025 -2663 -2616 -2667 -2609
3 -2910 -2921 -2901 -2892 -2884 -2876 -2512 -2510 -2517 -2515
4 -2777 -2708 -2740 -2667 -2755 -2682 -2530 -2497 -2562 -2532
5 -2645 -2620 -2645 -2625 -2685 -2679 -2515 -2504 -2563 -2549
6 -2584 -2642 -2662 -2650 -2655 -2683 -2525 -2532 -2590 -2579
7 -2598 -2609 -2609 -2621 -2669 -2690 -2537 -2552 -2605 -2608
8 -2587 -2589 -2596 -2596 -2661 -2679 -2550 -2577 -2642 -2651
9 -2599 -2601 -2612 -2610 -2690 -2696 -2586 -2590 -2668 -2683
    EEV   VEV   EVV   VVV
1 -3114 -3114 -3114 -3114
2 -2648 -2587 -2650 -2583
3 -2534 -2533 -2551 -2550
4 -2581 -2549 -2619 -2597
5 -2600 -2592 -2654 -2657
6 -2634 -2635 -2714 -2719
7 -2659 -2682 -2737 -2760
8 -2713 -2735 -2807 -2827
9 -2760 -2784 -2872 -2888
Top models based on the BIC criterion:
VEE,4 VEE,5 VEE,3
-2497 -2504 -2510
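The ranking in the "Top models" line can be reproduced directly from the table. The following is a minimal sketch, assuming the sign convention implied by the output itself, in which a larger (less negative) BIC is better. The variable names are illustrative and not part of the original output.

```python
# BIC values copied from the table above: one list per covariance model,
# with entries for 1 to 9 clusters.
bic = {
    "EII": [-3907, -3174, -2910, -2777, -2645, -2584, -2598, -2587, -2599],
    "VII": [-3907, -3143, -2921, -2708, -2620, -2642, -2609, -2589, -2601],
    "EEI": [-3925, -3058, -2901, -2740, -2645, -2662, -2609, -2596, -2612],
    "VEI": [-3925, -3048, -2892, -2667, -2625, -2650, -2621, -2596, -2610],
    "EVI": [-3925, -3040, -2884, -2755, -2685, -2655, -2669, -2661, -2690],
    "VVI": [-3925, -3025, -2876, -2682, -2679, -2683, -2690, -2679, -2696],
    "EEE": [-3114, -2663, -2512, -2530, -2515, -2525, -2537, -2550, -2586],
    "VEE": [-3114, -2616, -2510, -2497, -2504, -2532, -2552, -2577, -2590],
    "EVE": [-3114, -2667, -2517, -2562, -2563, -2590, -2605, -2642, -2668],
    "VVE": [-3114, -2609, -2515, -2532, -2549, -2579, -2608, -2651, -2683],
    "EEV": [-3114, -2648, -2534, -2581, -2600, -2634, -2659, -2713, -2760],
    "VEV": [-3114, -2587, -2533, -2549, -2592, -2635, -2682, -2735, -2784],
    "EVV": [-3114, -2650, -2551, -2619, -2654, -2714, -2737, -2807, -2872],
    "VVV": [-3114, -2583, -2550, -2597, -2657, -2719, -2760, -2827, -2888],
}

# Flatten to (model, number of clusters, BIC) and sort from best to worst.
# Assumption: larger BIC is better, as the "Top models" line suggests.
candidates = [
    (model, g, value)
    for model, values in bic.items()
    for g, value in enumerate(values, start=1)
]
ranked = sorted(candidates, key=lambda c: c[2], reverse=True)

top3 = ranked[:3]
third_best = top3[2]
print(top3)
print("third best:", third_best)
```

Under that convention the third entry of the sorted list is the VEE model with 3 clusters, which agrees with the third column of the "Top models" line.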
Consider the following distance matrix representing the dissimilarities between four objects (A, B, C, and D) in a hierarchical clustering analysis:
    A   B   C   D
A 0.0 5.0 9.0 6.0
B 5.0 0.0 7.0 8.0
C 9.0 7.0 0.0 3.0
D 6.0 8.0 3.0 0.0
If you perform single-linkage hierarchical clustering (also known as the nearest neighbor method) on this data, what is the height at which the clusters {A, B} and {C, D} will be merged?
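The merge heights can be traced by hand or with a few lines of code. The sketch below is one straightforward way to run single-linkage agglomeration on the matrix above; the helper names (`dist`, `single_linkage`) are illustrative, and it makes no attempt at efficiency.

```python
from itertools import combinations

# Upper triangle of the distance matrix given above.
d = {
    ("A", "B"): 5.0, ("A", "C"): 9.0, ("A", "D"): 6.0,
    ("B", "C"): 7.0, ("B", "D"): 8.0, ("C", "D"): 3.0,
}

def dist(a, b):
    """Look up the symmetric distance between two objects."""
    return d[(a, b)] if (a, b) in d else d[(b, a)]

def single_linkage(c1, c2):
    """Single linkage: the smallest pairwise distance between two clusters."""
    return min(dist(a, b) for a in c1 for b in c2)

clusters = [frozenset([x]) for x in "ABCD"]
merges = []  # (cluster 1, cluster 2, merge height)
while len(clusters) > 1:
    # Merge the closest pair of clusters under single linkage.
    c1, c2 = min(combinations(clusters, 2), key=lambda p: single_linkage(*p))
    merges.append((sorted(c1), sorted(c2), single_linkage(c1, c2)))
    clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]

for c1, c2, h in merges:
    print(c1, "+", c2, "at height", h)
```

C and D merge first at 3.0, then A and B at 5.0. The final merge of {A, B} with {C, D} happens at the smallest cross-cluster distance, min(9, 6, 7, 8) = 6.0, which is the A to D distance.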
This data set has 5 observations.
     id    x1    x2
  <int> <dbl> <dbl>
1     1 -1.29  0.56
2     2  0.92 -0.40
3     3  0.61 -0.59
4     4 -0.77 -0.29
5     5  2.12 -0.14
The following is the interpoint distance matrix between the 5 observations.
    1   2   3   4
2 2.6
3 3.0 1.1
4 3.2 2.6 1.7
5 5.3 3.2 2.5 3.1
Observations 2 and 3 would be joined at the first step in hierarchical clustering, at a distance of 1.1. If observation 4 is then joined at the next step, using complete linkage, what would be the distance reported between cluster (2,3) and observation 4?
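Complete linkage reports the largest pairwise distance between members of the two clusters. The sketch below takes the printed interpoint distance matrix as given (it does not recompute distances from x1 and x2), and the function names are illustrative.

```python
# Lower triangle of the interpoint distance matrix, keyed as (smaller id, larger id).
d = {
    (1, 2): 2.6,
    (1, 3): 3.0, (2, 3): 1.1,
    (1, 4): 3.2, (2, 4): 2.6, (3, 4): 1.7,
    (1, 5): 5.3, (2, 5): 3.2, (3, 5): 2.5, (4, 5): 3.1,
}

def dist(i, j):
    """Symmetric lookup into the printed distance matrix."""
    return d[(min(i, j), max(i, j))]

def complete_linkage(c1, c2):
    """Complete linkage: the largest pairwise distance between two clusters."""
    return max(dist(i, j) for i in c1 for j in c2)

def single_linkage(c1, c2):
    """Single linkage, shown only for contrast: the smallest pairwise distance."""
    return min(dist(i, j) for i in c1 for j in c2)

print(complete_linkage({2, 3}, {4}))  # max(2.6, 1.7)
print(single_linkage({2, 3}, {4}))    # min(2.6, 1.7)
```

With complete linkage the distance between cluster (2,3) and observation 4 is max(2.6, 1.7) = 2.6. Single linkage would have reported 1.7 instead.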
This summarises a linear support vector machine fit to the last 25 years of Australian tourism data, modeling the difference in patterns between Cairns and Melbourne. Only holiday travel is examined, and the four variables used are Q1, Q2, Q3 and Q4, the quarters of the year. Each series was standardised on itself, so values represent the proportion of travel for holidays relative to other types of travel to the city in each quarter of each year. We are curious to determine whether holiday travel tends to be in different seasons in the two locations.
> melb_cairns_svm_h$fit@b
[1] -2.4
> melb_cairns_svm_h$fit@SVindex
 [1]  3  5  9 10 11 12 14 17 19 20 21 23 24 30 35 36 37 38 39 40
[21] 41 42 44 46 47 50
> melb_cairns_svm_h$fit@coef
[[1]]
 [1] -10.0 -10.0 -10.0 -10.0 -10.0 -10.0 -10.0 -10.0 -10.0  -1.8
[11] -10.0 -10.0 -10.0   1.8  10.0  10.0  10.0  10.0  10.0  10.0
[21]  10.0  10.0  10.0  10.0  10.0  10.0
The top few rows of the data are:
> melb_cairns[,c(1,3,5,7,9)] |> slice_head(n=5)
# A tibble: 5 × 5
  Region    Q1    Q2    Q3    Q4
1 Cairns 0.259 0.220 0.629 0.300
2 Cairns 0.205 0.345 0.586 0.348
3 Cairns 0.272 0.475 0.500 0.275
4 Cairns 0.173 0.533 0.523 0.380
5 Cairns 0.360 0.498 0.565 0.374
The coefficients for the separating hyperplane are calculated to be:
> melb_cairns_betas_h
   Q1    Q2    Q3    Q4
 4.76  0.80 -8.95 -0.57
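A few quantities can be read off this output without refitting the model. The sketch below is a rough reading aid under two assumptions that the output does not state: that the `coef` values are the dual coefficients multiplied by the class labels, and that the cap of 10 on their magnitude is the cost parameter. It deliberately does not evaluate the decision function, because the sign convention of `b` and any internal scaling of the predictors are not shown.

```python
# Values copied from the output above.
sv_index = [3, 5, 9, 10, 11, 12, 14, 17, 19, 20, 21, 23, 24,
            30, 35, 36, 37, 38, 39, 40, 41, 42, 44, 46, 47, 50]
coef = [-10.0, -10.0, -10.0, -10.0, -10.0, -10.0, -10.0, -10.0, -10.0, -1.8,
        -10.0, -10.0, -10.0, 1.8, 10.0, 10.0, 10.0, 10.0, 10.0, 10.0,
        10.0, 10.0, 10.0, 10.0, 10.0, 10.0]
betas = {"Q1": 4.76, "Q2": 0.80, "Q3": -8.95, "Q4": -0.57}

# Number of support vectors (one coefficient per support vector).
n_sv = len(sv_index)

# Assumption: 10 is the cost bound, so coefficients at +/-10 are at the bound.
n_at_bound = sum(1 for c in coef if abs(c) == 10.0)
n_inside = n_sv - n_at_bound

# If coef holds label-signed dual coefficients, they should sum to zero.
coef_sum = sum(coef)

# Rank variables by the magnitude of their hyperplane coefficient.
ranked = sorted(betas, key=lambda k: abs(betas[k]), reverse=True)

print(n_sv, n_at_bound, n_inside, round(coef_sum, 6), ranked)
```

On this reading, 26 of the 50 observations are support vectors, and 24 of them sit at the bound. Q3 has by far the largest coefficient in magnitude, followed by Q1 with the opposite sign, so the boundary is mostly a contrast between those two quarters.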
This summarises a tree fit to the last 25 years of Australian tourism data, modeling the difference in patterns between Cairns and Melbourne. Only holiday travel is examined, and the four variables used are Q1, Q2, Q3 and Q4, the quarters of the year. Each series was standardised on itself, so values represent the proportion of travel for holidays relative to other types of travel to the city in each quarter of each year. We are curious to determine whether holiday travel tends to be in different seasons in the two locations.
n= 50
node), split, n, loss, yval, (yprob)
      * denotes terminal node
 1) root 50 25 Cairns (0.500 0.500)
   2) Q3>=0.43 24 1 Cairns (0.958 0.042) *
   3) Q3< 0.43 26 2 Melbourne (0.077 0.923)
     6) Q3>=0.39 7 2 Melbourne (0.286 0.714)
      12) Q2< 0.32 2 0 Cairns (1.000 0.000) *
      13) Q2>=0.32 5 0 Melbourne (0.000 1.000) *
     7) Q3< 0.39 19 0 Melbourne (0.000 1.000) *
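The printed tree is equivalent to a short set of nested rules. The sketch below writes those rules out as a function and adds up the `loss` column over the terminal nodes to get the training error. The function name and the input values in the usage lines are hypothetical, chosen only to exercise each branch.

```python
def predict_region(q2, q3):
    """Follow the splits of the printed tree for one observation."""
    if q3 >= 0.43:
        return "Cairns"          # node 2
    if q3 >= 0.39:               # node 6
        return "Cairns" if q2 < 0.32 else "Melbourne"  # nodes 12 and 13
    return "Melbourne"           # node 7

# (n, loss) for the terminal nodes 2, 12, 13 and 7, read from the output.
leaves = [(24, 1), (2, 0), (5, 0), (19, 0)]
n_total = sum(n for n, _ in leaves)
n_wrong = sum(loss for _, loss in leaves)
training_error = n_wrong / n_total

# Hypothetical inputs, one per terminal node.
print(predict_region(q2=0.25, q3=0.60))
print(predict_region(q2=0.30, q3=0.40))
print(predict_region(q2=0.35, q3=0.40))
print(predict_region(q2=0.50, q3=0.20))
print(training_error)
```

Only Q3 and Q2 appear in the splits, and a single one of the 50 training observations is misclassified, in node 2.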
From the following summaries:
\begin{align*}\bar{x}_A=5,\quad \bar{x}_B=8,\quad \sigma^2_A=2,\quad \sigma^2_B=4\end{align*}
answer the following questions:
From the following plot of data, what would likely be the pooled variance-covariance matrix?
VC1            | VC2            | VC3            | VC4
      x1    x2 |       x1    x2 |       x1    x2 |       x1    x2
x1   5.6  -3.0 | x1  1.03  0.98 | x1   5.4   2.9 | x1  1.14 -0.98
x2  -3.0   5.6 | x2  0.98  1.14 | x2   2.9   4.9 | x2 -0.98  1.03
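The plot is not reproduced here, so the candidates above cannot be checked against it. As a reminder of what is being estimated, the sketch below computes a pooled variance-covariance matrix from within-group deviations, dividing by n minus the number of groups. The two small groups are made-up data for illustration only and have no connection to the plotted data.

```python
def pooled_covariance(groups):
    """Pooled 2x2 variance-covariance matrix.

    groups: list of groups, each a list of (x1, x2) points.
    Deviations are taken from each group's own mean, and the summed
    cross-products are divided by (total n - number of groups).
    """
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    s = [[0.0, 0.0], [0.0, 0.0]]
    for g in groups:
        m1 = sum(p[0] for p in g) / len(g)
        m2 = sum(p[1] for p in g) / len(g)
        for x1, x2 in g:
            dev = (x1 - m1, x2 - m2)
            for i in range(2):
                for j in range(2):
                    s[i][j] += dev[i] * dev[j]
    return [[v / (n_total - k) for v in row] for row in s]

# Hypothetical data: two groups with different means but similar spread.
group_a = [(1, 2), (2, 4), (3, 3)]
group_b = [(6, 1), (7, 2), (8, 3)]
pooled = pooled_covariance([group_a, group_b])
print(pooled)
```

Because deviations are taken from each group's own mean, the sign of the off-diagonal entry follows the orientation of the points within the groups, and the diagonal entries follow the within-group spread, not the distance between group means.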
Which of the following categorical response variables matches the binary matrix coding below:
\begin{align*}\begin{array}{ccc} A & B & C\\ 1 & 0 & 0\\ 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\\ 0 & 0 & 1\\ 1 & 0 & 0\\\end{array}\end{align*}
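Each row of the binary matrix has a single 1 in the column of the class that observation belongs to. A minimal sketch of decoding it back to a categorical variable, assuming the column order A, B, C shown in the header:

```python
levels = ["A", "B", "C"]

# Rows copied from the binary matrix above.
coding = [
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
    [1, 0, 0],
]

# The position of the 1 in each row picks out the class label.
decoded = [levels[row.index(1)] for row in coding]
print(decoded)
```

The matching response variable is therefore A, A, B, C, C, A.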
Which of these is the primary difference between linear discriminant analysis and quadratic discriminant analysis?
Why is it recommended to standardise the variables before doing linear discriminant analysis?