Course 40420

Looking for Course 40420 test answers and solutions? Browse our comprehensive collection of verified answers for Course 40420 at learning.monash.edu.

The following output is the summary of a model-based clustering. Which model would be considered the third best?

Bayesian Information Criterion (BIC): 
    EII   VII   EEI   VEI   EVI   VVI   EEE   VEE   EVE   VVE
1 -3907 -3907 -3925 -3925 -3925 -3925 -3114 -3114 -3114 -3114
2 -3174 -3143 -3058 -3048 -3040 -3025 -2663 -2616 -2667 -2609
3 -2910 -2921 -2901 -2892 -2884 -2876 -2512 -2510 -2517 -2515
4 -2777 -2708 -2740 -2667 -2755 -2682 -2530 -2497 -2562 -2532
5 -2645 -2620 -2645 -2625 -2685 -2679 -2515 -2504 -2563 -2549
6 -2584 -2642 -2662 -2650 -2655 -2683 -2525 -2532 -2590 -2579
7 -2598 -2609 -2609 -2621 -2669 -2690 -2537 -2552 -2605 -2608
8 -2587 -2589 -2596 -2596 -2661 -2679 -2550 -2577 -2642 -2651
9 -2599 -2601 -2612 -2610 -2690 -2696 -2586 -2590 -2668 -2683
    EEV   VEV   EVV   VVV
1 -3114 -3114 -3114 -3114
2 -2648 -2587 -2650 -2583
3 -2534 -2533 -2551 -2550
4 -2581 -2549 -2619 -2597
5 -2600 -2592 -2654 -2657
6 -2634 -2635 -2714 -2719
7 -2659 -2682 -2737 -2760
8 -2713 -2735 -2807 -2827
9 -2760 -2784 -2872 -2888
Top models based on the BIC criterion: 
VEE,4 VEE,5 VEE,3 
-2497 -2504 -2510
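
The ranking printed under "Top models" can be reproduced by sorting every (model, number of clusters) pair by its BIC value. The sketch below is written in Python rather than R and uses only the EEE and VEE columns copied from the table, since none of the remaining columns has a value higher than -2515. It assumes the sign convention implied by the printed ranking, where the largest BIC is best.

```python
# BIC values copied from the table above, for two of the fourteen covariance models.
# Keys are model names; list position i holds the value for i + 1 clusters.
bic = {
    "EEE": [-3114, -2663, -2512, -2530, -2515, -2525, -2537, -2550, -2586],
    "VEE": [-3114, -2616, -2510, -2497, -2504, -2532, -2552, -2577, -2590],
}

# Flatten to (model, clusters, BIC) triples.
candidates = [
    (model, k, value)
    for model, values in bic.items()
    for k, value in enumerate(values, start=1)
]

# Assumption: larger BIC is better, as the printed "Top models" ordering implies.
ranked = sorted(candidates, key=lambda t: t[2], reverse=True)

for model, k, value in ranked[:3]:
    print(f"{model},{k}: {value}")
```

The third entry of `ranked` corresponds to the third model listed under "Top models".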

Consider the following distance matrix representing the dissimilarities between four objects (A, B, C, and D) in a hierarchical clustering analysis:

      A   B   C   D
A   0.0 5.0 9.0 6.0
B   5.0 0.0 7.0 8.0
C   9.0 7.0 0.0 3.0
D   6.0 8.0 3.0 0.0

If you perform single-linkage hierarchical clustering (also known as the nearest neighbor method) on this data, what is the height at which the clusters {A, B} and {C, D} will be merged?
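
Single linkage measures the distance between two clusters as the smallest distance between any pair of objects drawn one from each cluster. A minimal Python sketch of the agglomeration, using the matrix above, is shown here; the helper names (`d`, `single_link`, `agglomerate`) are illustrative and not taken from any library.

```python
from itertools import combinations

labels = ["A", "B", "C", "D"]

# Upper triangle of the distance matrix above.
dist = {
    ("A", "B"): 5.0, ("A", "C"): 9.0, ("A", "D"): 6.0,
    ("B", "C"): 7.0, ("B", "D"): 8.0, ("C", "D"): 3.0,
}

def d(x, y):
    """Look up the symmetric distance between two objects."""
    return dist[(x, y)] if (x, y) in dist else dist[(y, x)]

def single_link(c1, c2):
    """Single linkage: smallest distance between any cross-cluster pair."""
    return min(d(x, y) for x in c1 for y in c2)

def agglomerate(items, linkage):
    """Merge the two closest clusters repeatedly and record each merge height."""
    clusters = [frozenset([i]) for i in items]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        height = linkage(c1, c2)
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        merges.append((sorted(c1), sorted(c2), height))
    return merges

merges = agglomerate(labels, single_link)
for left, right, height in merges:
    print(left, right, height)
```

The last recorded merge joins {C, D} with {A, B}; its height is the minimum of the four cross-cluster distances 9.0, 6.0, 7.0 and 8.0.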

This data set has 5 observations.

     id    x1    x2
  <int> <dbl> <dbl>
1     1 -1.29  0.56
2     2  0.92 -0.40 
3     3  0.61 -0.59
4     4 -0.77 -0.29
5     5  2.12 -0.14

The following is the interpoint distance matrix between the 5 observations.

    1   2   3   4
2 2.6            
3 3.0 1.1        
4 3.2 2.6 1.7    
5 5.3 3.2 2.5 3.1

Observations 2 and 3 would be joined at the first step in hierarchical clustering, at a distance of 1.1. If observation 4 is then joined at the next step, using complete linkage, what would be the distance reported between cluster (2,3) and observation 4?
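
Complete linkage reports the largest distance between any member of one cluster and any member of the other. As a small Python sketch using the distances from the matrix above (the helper names are illustrative):

```python
# Lower-triangle distances copied from the matrix above, keyed by observation pair.
dist = {
    (1, 2): 2.6, (1, 3): 3.0, (1, 4): 3.2, (1, 5): 5.3,
    (2, 3): 1.1, (2, 4): 2.6, (2, 5): 3.2,
    (3, 4): 1.7, (3, 5): 2.5,
    (4, 5): 3.1,
}

def d(i, j):
    """Symmetric lookup: the key always stores the smaller index first."""
    return dist[(min(i, j), max(i, j))]

def complete_link(c1, c2):
    """Complete linkage: largest distance between any cross-cluster pair."""
    return max(d(i, j) for i in c1 for j in c2)

height = complete_link({2, 3}, {4})
print(height)
```

Here the two candidate distances are d(2,4) = 2.6 and d(3,4) = 1.7, and complete linkage takes the larger of the two.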

This summarises a linear support vector machine fit to the last 25 years of Australian tourism data modeling the difference in patterns between Cairns and Melbourne. Only holiday travel is examined, and the four variables used are Q1, Q2, Q3, Q4 which are quarters in the year. Each series was standardised on itself, so values represent proportion of travel for holidays relative to other types of travel to the city in each quarter of each year. We are curious to determine whether holiday travel tends to be in different seasons in the two locations.

> melb_cairns_svm_h$fit@b
[1] -2.4
> melb_cairns_svm_h$fit@SVindex
 [1]  3  5  9 10 11 12 14 17 19 20 21 23 24 30 35 36 37 38 39 40
[21] 41 42 44 46 47 50
> melb_cairns_svm_h$fit@coef
[[1]]
 [1] -10.0 -10.0 -10.0 -10.0 -10.0 -10.0 -10.0 -10.0 -10.0  -1.8
[11] -10.0 -10.0 -10.0   1.8  10.0  10.0  10.0  10.0  10.0  10.0
[21]  10.0  10.0  10.0  10.0  10.0  10.0

The top few rows of the data are:

> melb_cairns[,c(1,3,5,7,9)] |> slice_head(n=5)
# A tibble: 5 × 5
  Region         Q1         Q2         Q3         Q4
                           
1 Cairns      0.259      0.220      0.629      0.300
2 Cairns      0.205      0.345      0.586      0.348
3 Cairns      0.272      0.475      0.500      0.275
4 Cairns      0.173      0.533      0.523      0.380
5 Cairns      0.360      0.498      0.565      0.374

The coefficients for the separating hyperplane are calculated to be:

> melb_cairns_betas_h
        Q1         Q2         Q3         Q4 
      4.76       0.80      -8.95      -0.57

(1pt) How many support vectors are used to compute the coefficients for the separating hyperplane?

(1pt) Write down the equation of the separating hyperplane.

(1pt) Which variable(s) would be considered to be the most important to distinguish the difference between holiday trips to Cairns and Melbourne?

(2pts) Explain how you would use the quantities from the fitted model object to compute the coefficients.

(2pts) Was Melbourne or Cairns coded as -1? Why do you think so?
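
For a linear support vector machine, the hyperplane coefficients are a weighted sum of the support vectors: each coefficient in `coef` multiplies the corresponding row of the data selected by `SVindex`. The Python sketch below illustrates this with the printed quantities. Two things are assumptions rather than facts from the output: that the object follows kernlab's `ksvm` conventions (the slot names suggest it), and that the intercept is therefore `-b`. The function names are illustrative.

```python
# Values copied from the printed model object above.
b = -2.4
sv_index = [3, 5, 9, 10, 11, 12, 14, 17, 19, 20, 21, 23, 24,
            30, 35, 36, 37, 38, 39, 40, 41, 42, 44, 46, 47, 50]
coef = [-10.0] * 9 + [-1.8] + [-10.0] * 3 + [1.8] + [10.0] * 12
betas = {"Q1": 4.76, "Q2": 0.80, "Q3": -8.95, "Q4": -0.57}

def hyperplane_weights(coef, sv_rows):
    """Weighted sum of support vectors: beta_j = sum_i coef_i * x_ij."""
    n_vars = len(sv_rows[0])
    return [sum(c * row[j] for c, row in zip(coef, sv_rows)) for j in range(n_vars)]

def decision_value(x, betas, intercept):
    """Signed score; its sign gives the predicted side of the hyperplane."""
    return intercept + sum(betas[name] * x[name] for name in betas)

# Assumption: the intercept is -b (check this against the fitted object).
intercept = -b

# First row of the printed data, a Cairns observation.
first_row = {"Q1": 0.259, "Q2": 0.220, "Q3": 0.629, "Q4": 0.300}

print(len(sv_index), len(coef))
print(decision_value(first_row, betas, intercept) < 0)
```

Only five rows of the data are printed, so `hyperplane_weights` cannot be run on the real support vectors here; it would be called with `coef` and the rows of the Q1 to Q4 columns picked out by `sv_index`. The sign of the decision value for a known Cairns row, together with the signs of the entries in `coef`, indicates which class was coded as -1.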

This summarises a tree fit to the last 25 years of Australian tourism data modeling the difference in patterns between Cairns and Melbourne. Only holiday travel is examined, and the four variables used are Q1, Q2, Q3, Q4 which are quarters in the year. Each series was standardised on itself, so values represent proportion of travel for holidays relative to other types of travel to the city in each quarter of each year. We are curious to determine whether holiday travel tends to be in different seasons in the two locations.

n= 50 
node), split, n, loss, yval, (yprob)
      * denotes terminal node
 1) root 50 25 Cairns (0.500 0.500)  
   2) Q3>=0.43 24  1 Cairns (0.958 0.042) *
   3) Q3< 0.43 26  2 Melbourne (0.077 0.923)  
     6) Q3>=0.39 7  2 Melbourne (0.286 0.714)  
      12) Q2< 0.32 2  0 Cairns (1.000 0.000) *
      13) Q2>=0.32 5  0 Melbourne (0.000 1.000) *
     7) Q3< 0.39 19  0 Melbourne (0.000 1.000) *

(1pt) How many terminal nodes are in the tree?

(1pt) How many of the four variables are used in the model?

(1pt) Which variable would be considered to be the most important?

(1pt) Which terminal nodes are pure nodes (having only one class)?

(1pt) How many observations are there at node 7?

(2pts) Based on this model, how would you describe the differences in holiday travel between Melbourne and Cairns?
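
The printed tree can be read as a short sequence of nested rules. The Python sketch below transcribes the terminal nodes and splits from the output above; the function name `predict` is illustrative, and the example row is taken from the data printed in the support vector machine question, on the assumption that both questions use the same data.

```python
# Terminal nodes transcribed from the printed tree: (node id, n, loss, predicted class).
terminal_nodes = [
    (2, 24, 1, "Cairns"),
    (12, 2, 0, "Cairns"),
    (13, 5, 0, "Melbourne"),
    (7, 19, 0, "Melbourne"),
]

def predict(q2, q3):
    """Follow the printed splits from the root to a terminal node id."""
    if q3 >= 0.43:
        return 2
    if q3 >= 0.39:
        return 12 if q2 < 0.32 else 13
    return 7

# A terminal node is pure when its loss (misclassified count) is zero.
pure = [node for node, n, loss, label in terminal_nodes if loss == 0]
print(len(terminal_nodes), pure)

# Example row: Cairns, Q2 = 0.220, Q3 = 0.629.
print(predict(q2=0.220, q3=0.629))
```

Only Q3 and Q2 appear in `predict`, and Q3 is used for the first two splits, which separate most of the 50 observations.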

From the following summaries:

$\bar{x}_A=5, \bar{x}_B=8, \sigma^2_A=2, \sigma^2_B=4$

answer the following questions:

(1pt) What is the data dimension, p?

(1pt) What is the pooled variance-covariance, S?

(3pts) Compute and report the LDA rule to classify group A from group B, assuming equal prior probabilities.
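
With a single variable and equal priors, the LDA rule compares an observation with the midpoint of the two group means, scaled by the pooled variance. The Python sketch below works through the summaries given above. The group sizes are not stated, so it assumes they are equal, which makes the pooled variance the plain average of the two variances; the function names are illustrative.

```python
mean_a, mean_b = 5.0, 8.0
var_a, var_b = 2.0, 4.0

# Assumption: equal group sizes, so the pooled variance is the plain average.
pooled_var = (var_a + var_b) / 2

def lda_score(x):
    """One-dimensional LDA with equal priors: positive favours A, negative favours B."""
    midpoint = (mean_a + mean_b) / 2
    return (mean_a - mean_b) / pooled_var * (x - midpoint)

def classify(x):
    """Assign to A when the score is positive, otherwise to B."""
    return "A" if lda_score(x) > 0 else "B"

print(pooled_var, (mean_a + mean_b) / 2)
print(classify(6.0), classify(7.0))
```

In one dimension the boundary sits at the midpoint of the means whatever the pooled variance is, so the equal-sizes assumption affects the reported value of S but not the classification rule.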

From the following plot of data, what would likely be the pooled variance-covariance matrix?

scatterplot

 
VC1          |  VC2          |  VC3        |  VC4
     x1   x2 |       x1   x2 |      x1  x2 |        x1    x2
x1  5.6 -3.0 |  x1 1.03 0.98 |  x1 5.4 2.9 |  x1  1.14 -0.98
x2 -3.0  5.6 |  x2 0.98 1.14 |  x2 2.9 4.9 |  x2 -0.98  1.03

Which of the following categorical response variables matches the binary matrix coding below:

$\begin{array}{ccc} A & B & C\\ 1 & 0 & 0\\ 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1\\ 0 & 0 & 1\\ 1 & 0 & 0 \end{array}$
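
In a binary matrix coding, each row has a single 1 in the column of the class that observation belongs to. A minimal Python sketch that decodes the six rows of the matrix above back into class labels:

```python
levels = ["A", "B", "C"]

# Rows of the binary matrix, in the order shown.
coding = [
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 1],
    [1, 0, 0],
]

# The position of the 1 in each row selects the class label.
decoded = [levels[row.index(1)] for row in coding]
print(decoded)
```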

Which of these is the primary difference between linear discriminant analysis and quadratic discriminant analysis?

Data is assumed to be sampled from different statistical distributions.

Class means are assumed to be different.

The sample sizes are assumed to be different.

Class variance-covariances are assumed to be different.

The order of the polynomial fit.

Why is it recommended to standardise the variables before doing linear discriminant analysis?

To examine the relative magnitude of the model coefficients, for understanding the importance of each variable.

To reduce the effect of extreme values on the model fit.

So that the different class covariance matrices can be compared.

The variables might have been measured in different units which will affect the model fit.

100%

To avoid over-fitting the model.
