A classification tree is induced on four input features: A, B, C, and D. Features A and D are both continuous, and thresholding (as opposed to binning) is used to determine the split criterion for each of them. Feature A has values ranging from 0 to 10 and feature D has values ranging from 1 to 100. Features B and C are categorical, with feature B having 5 outcomes and feature C having 3 outcomes. We have the following entropy-based information gains for the four features:
gain(A) = 0.7
gain(B) = 0.75
gain(C) = 0.7
gain(D) = 0.75
On which of the features should the dataset be split?
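Features B and D tie on raw information gain, and information gain is known to favor splits with many branches. One common tie-breaker is the gain ratio used in C4.5, which divides the gain by the split information (the entropy of the partition sizes). The sketch below illustrates that idea only. The question gives no partition sizes, so the code assumes, purely for illustration, that every split is perfectly balanced, that each thresholded feature yields 2 branches, and that each categorical feature yields one branch per outcome. The names `split_information`, `gain_ratio`, and `branches` are hypothetical helpers, not part of the question.

```python
import math

def split_information(fractions):
    """Entropy (in bits) of the partition sizes produced by a split."""
    return -sum(p * math.log2(p) for p in fractions if p > 0)

def gain_ratio(gain, fractions):
    """C4.5-style gain ratio: information gain divided by split information."""
    return gain / split_information(fractions)

# Information gains as stated in the question.
gains = {"A": 0.7, "B": 0.75, "C": 0.7, "D": 0.75}

# ASSUMPTION (not given in the question): number of branches per split.
# Thresholded continuous features split two ways; categorical features
# split once per outcome.
branches = {"A": 2, "B": 5, "C": 3, "D": 2}

# ASSUMPTION (not given in the question): every branch receives an equal
# share of the examples, so each fraction is 1 / number_of_branches.
ratios = {
    name: gain_ratio(gains[name], [1 / branches[name]] * branches[name])
    for name in gains
}

best = max(ratios, key=ratios.get)

for name in sorted(ratios):
    print(f"gain_ratio({name}) = {ratios[name]:.3f}")
print("highest gain ratio under these assumptions:", best)
```

Under these assumptions the 5-way split on B is penalized by a split information of log2(5) bits, while the binary split on D is penalized by only 1 bit, so D ends up with the highest gain ratio. With unbalanced partitions the exact ratios would differ.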