Retrained model does not update the number of classes in the target variable
I trained and deployed a multiclass GBM model whose target variable had 5 classes. I then changed the training set, and the target went from 5 to 4 classes. However, after I retrained the deployed model, the new confusion matrix still shows the class that was dropped (with all NA values in the matrix).
I am sure the dataset has 4 classes because when I click "Analyse" I see the correct ones; however, the model seems to keep the old class in memory too.
I had to delete my model and retrain/deploy a new one.
Could you please advise whether there is an easier way?
Dec 22, 2017
In case of a permanent class change, you need to deploy a new model to the Flow. Retraining keeps the original class mapping on purpose, so that your model keeps working properly if a class is only temporarily missing from the training set.
In your case, is the class mapping change permanent or temporary?
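One way to check whether the mapping has really changed is to compare the classes the model was deployed with against the values now present in the target column. This is a minimal sketch in plain Python; it does not use any Dataiku API, and the helper name `class_drift` and the class labels are hypothetical.

```python
def class_drift(deployed_classes, new_targets):
    """Compare the classes a model was deployed with against the new target column."""
    deployed = set(deployed_classes)
    current = set(new_targets)
    return {
        "dropped": sorted(deployed - current),  # classes the model knows but the data lacks
        "added": sorted(current - deployed),    # classes in the data the model has never seen
    }


# Hypothetical example: the model was deployed with 5 classes,
# the new training set only contains 4 of them.
deployed_classes = ["A", "B", "C", "D", "E"]
new_targets = ["A", "C", "B", "D", "A", "B", "D"]

drift = class_drift(deployed_classes, new_targets)
print(drift)
```

If "dropped" stays non-empty across successive training sets, the change is probably permanent rather than a temporary shift in the class distribution.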
Dec 26, 2017
It is a permanent change.
The only way I could make it work was to start a New Analysis on the new training set and train a new model. Then I deployed that one in my workflow. However, this was not an easy or quick solution, because I then had to add a new scoring step and change the input of the following recipe. Why does "Retrain" not pick up the classes from the new training set, and instead keep the original ones in memory?
If it is a permanent change, then you can retrain a model in "Analysis" mode, and redeploy it to the Flow.
At the moment, the way we designed the "Retrain" recipe in the Flow assumes your class mapping is fixed. This is indeed different from the "Train" feature of an Analysis, which updates the classes dynamically.
We will see in the future how we can allow for dynamic class remapping in the Flow, as is already possible in Analysis mode. Thanks for your feedback!
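To illustrate the difference between a fixed and a dynamic class mapping, here is a minimal sketch in plain Python. It is not how DSS is implemented; the function names, labels and predictions are all hypothetical. It only shows why a class that is kept in a frozen mapping but absent from the new data ends up with an empty row and undefined per-class metrics.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, classes):
    """Rows are actual classes, columns are predicted classes, for a given class list."""
    counts = Counter(zip(y_true, y_pred))
    return {
        actual: {pred: counts.get((actual, pred), 0) for pred in classes}
        for actual in classes
    }


def recall(matrix, cls):
    """Recall for one class; None when the class has no rows (0/0 is undefined)."""
    support = sum(matrix[cls].values())
    return matrix[cls][cls] / support if support else None


# Hypothetical data: class "E" was present at deployment time but is gone now.
original_classes = ["A", "B", "C", "D", "E"]
y_true = ["A", "B", "C", "D", "A", "B"]
y_pred = ["A", "B", "C", "D", "B", "B"]

# Fixed mapping (analogous to retraining with a frozen class list): "E" is kept.
fixed = confusion_matrix(y_true, y_pred, original_classes)

# Dynamic mapping (analogous to rebuilding the class list from the data): "E" disappears.
dynamic = confusion_matrix(y_true, y_pred, sorted(set(y_true)))

print(recall(fixed, "E"))   # undefined for the dropped class
print(sorted(dynamic))      # only the classes present in the new data
```

With the fixed mapping, the dropped class still gets a row and a column but no observations, which matches the all-NA entries described in the question.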
Thanks Alexandre, it is really useful to know that the "Retrain" recipe in the Flow is different from the "Train" feature in the Analysis section.
It would be great to have dynamic class remapping in the Flow too!