Interpretation of model performance in Dataiku built-in models
I created a model using Dataiku's built-in models, but the results are quite suspicious, so I would like to ask some questions.
In the attached screenshot you can see that the model is a decision tree trained with 10-fold CV. The model was created only for testing purposes, so I intentionally set the tree max depth to 100. This makes the tree very deep (I could see that in the Interpretation section), so the model should be overfitting heavily: it should perform very well on the train set and very badly on the test set. With cross-validation the performance should also be bad, because each metric is computed on the held-out, untrained fold. However, we see an AUC of 0.892 here. Can you explain why we get this kind of performance, which is obviously not right for this model? And on which data exactly is the ROC AUC in the centre calculated?
Forgot to add screenshot
The resulting metric is the average of the metric over each of the 10 folds (each time computed on the held-out fold).
What kind of performance are you getting:
* Without K-fold?
* On a reasonably sized decision tree?
* On a reasonably sized random forest?
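The averaging described above can be sketched in plain scikit-learn. This is a minimal illustration, not DSS's actual internal code, and it uses synthetic data in place of the user's dataset (both are assumptions):

```python
# Sketch: how a K-fold metric like the displayed AUC is typically computed.
# Synthetic data stands in for the user's dataset (an assumption).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

fold_aucs = []
for train_idx, test_idx in StratifiedKFold(
    n_splits=10, shuffle=True, random_state=0
).split(X, y):
    # Train on 9 folds, score on the held-out (untrained) fold.
    clf = DecisionTreeClassifier(max_depth=100, random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    probs = clf.predict_proba(X[test_idx])[:, 1]
    fold_aucs.append(roc_auc_score(y[test_idx], probs))

# The reported metric is the mean of the 10 out-of-fold scores.
mean_auc = float(np.mean(fold_aucs))
print(round(mean_auc, 3))
```

Because every score is computed on data the tree never saw, the average is an out-of-fold (test-like) estimate, not a train-set score.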
* Without K-fold (a simple train/test split) I get very similar performance. Again, it is a very deep tree, so I suspect this is the performance on the train set rather than the test set.
* A decision tree with max depth = 5 gives 0.64 AUC.
* A typical random forest gives 0.95 (!)
I tried the same dataset with my own code for XGBoost and GBM; it gives no more than 0.75 AUC using 10-fold CV.
We confirm that all performance metrics shown in DSS are computed on the test set - we currently never show performance on the train set. In the case of K-fold, it's the mean of the out-of-fold metrics (so test data too).
So you do see a downward trend when going from a "reasonable" to a "very deep" random forest (from 0.95 to 0.892), which is indeed probably indicative of overfitting, although not as severe as you expected. Possible reasons: (a) your train and test sets are very similar; (b) the random selection of features adds enough diversity to counteract part of the overfitting effect; (c) you may not have that much data, which means your trees are not "full".
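The complexity comparison suggested in this thread can be reproduced outside DSS. Below is a hedged sketch using `cross_val_score` on synthetic data (the dataset and hyperparameters are assumptions; the user's data will behave differently):

```python
# Sketch: comparing 10-fold out-of-fold AUC across model complexities,
# mirroring the shallow tree / deep tree / random forest comparison above.
# Synthetic data is an assumption, not the user's dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

models = {
    "tree_depth_5": DecisionTreeClassifier(max_depth=5, random_state=0),
    "tree_depth_100": DecisionTreeClassifier(max_depth=100, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

cv_auc = {}
for name, model in models.items():
    # Each score is the mean over 10 held-out folds, as DSS reports it.
    scores = cross_val_score(model, X, y, cv=10, scoring="roc_auc")
    cv_auc[name] = float(scores.mean())

for name, auc in cv_auc.items():
    print(f"{name}: {auc:.3f}")
```

If the deep tree overfits, its out-of-fold AUC will typically sit below the shallow tree's or the forest's, but by how much depends on the data, which is why the gap observed here was smaller than expected.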
©Dataiku 2012-2018