Hello,
I am currently building a flow to train a binary classification model. After training on my training data, I want to compare the results of the top three models on my test dataset (accuracy, precision, recall, etc.). I know how to do this on the train dataset, but I am unsure how to do it on the test data and compare the results via model comparisons.
Also, after training on the entire set of features, is there a way to select only the top 5 features and train a new model on them?
Thank you!
Hi,
To evaluate on the test dataset, you would first split your data with the Split recipe and then use explicit extracts for your train and test sets.
You can set this in the Visual Analysis: select the model > Design > Train/Test Set, and choose "Explicit extracts from two datasets".
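If it helps to see what those test metrics mean, here is a rough sketch in plain Python (not the DSS API) that computes accuracy, precision and recall for several models against the same held-out test labels. The function names, the model names and the sample labels are all made up for illustration; inside DSS you would read these values from the model results rather than compute them by hand.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision and recall for one model on a test set."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def compare_models(y_true, predictions_by_model):
    """Score every model on the same test labels so the results are comparable."""
    return {
        name: binary_metrics(y_true, preds)
        for name, preds in predictions_by_model.items()
    }


# Hypothetical test labels and predictions, for illustration only.
y_test = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = {
    "model_a": [1, 0, 1, 0, 0, 1, 1, 0],
    "model_b": [1, 1, 1, 1, 0, 0, 0, 0],
}

for name, scores in compare_models(y_test, predictions).items():
    print(name, scores)
```

The important point is that every model is scored on the same test labels, which is what the explicit train/test extracts give you.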
To reduce the number of features, you can have a look at the feature reduction settings: https://doc.dataiku.com/dss/latest/machine-learning/supervised/settings.html#settings-feature-reduct...
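As a rough illustration of the "top 5 features" idea outside of DSS, the sketch below ranks features by an importance score and keeps only the five highest. The importance values, feature names and helper functions are hypothetical; in practice you would take the importances from your trained model and then keep only those columns when training the new model.

```python
def top_k_features(importances, k=5):
    """Return the names of the k features with the highest importance."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


def keep_columns(rows, columns):
    """Keep only the selected columns in each row (rows are dicts)."""
    return [{col: row[col] for col in columns} for row in rows]


# Hypothetical feature importances, for illustration only.
importances = {
    "age": 0.30,
    "income": 0.20,
    "tenure": 0.15,
    "region": 0.12,
    "visits": 0.10,
    "channel": 0.08,
    "device": 0.05,
}

selected = top_k_features(importances, k=5)
print(selected)
```

The reduced dataset can then be used as the input for a new training run.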
Let me know if that helps.
Hi,
But where do we see the results on the test dataset after splitting? How do we get the recall, precision, accuracy, etc.?