I will try to give an overview of the steps I took. However, the data is sensitive, so I cannot post any screenshots or code.
- I created a model (RF) in the analyses-menu
- I deployed the model to the flow (I selected a train set and gave the model a name)
- Then I went to the dataset I want to predict on, clicked on it and selected the 'predict' recipe
- I chose the input dataset (the dataset I want to predict on) and then selected the model by name. The recipe was created
- Then I clicked on the scored dataset and used the 'evaluate' recipe. I selected the model and used the scored dataset as the input. It makes no difference whether I select the scored dataset or the original dataset here.
- The 'evaluate' recipe produces two datasets, one of which contains the metrics (recall, accuracy, etc.).
- Contrary to our expectations, these metrics were quite high, so I investigated the other dataset that the 'evaluate' recipe produces. I loaded this dataset in a Jupyter notebook and used recall_score(), precision_score(), etc. from sklearn. The resulting scores differ from the values in the metrics dataset. The same happens if I export the file to Excel and calculate the confusion matrix there.
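
To make the last step concrete without sharing the real data, here is a minimal sketch of the kind of check I ran in the notebook. The data is made up, and the column names `target` and `prediction` are placeholders, not the real names in my dataset. It assumes a binary problem where 1 is the positive class.

```python
import pandas as pd
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_score,
    recall_score,
)

# Made-up stand-in for the second output dataset of the evaluate recipe.
# "target" and "prediction" are hypothetical column names.
scored = pd.DataFrame({
    "target":     [1, 0, 1, 1, 0, 0, 1, 0],
    "prediction": [1, 0, 0, 1, 0, 1, 1, 0],
})

y_true = scored["target"]
y_pred = scored["prediction"]

# Fix the label order so the confusion matrix layout is unambiguous:
# rows are true classes, columns are predicted classes, both ordered [0, 1].
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()

# State the positive class explicitly (assumed to be 1 here).
metrics = {
    "accuracy": accuracy_score(y_true, y_pred),
    "precision": precision_score(y_true, y_pred, pos_label=1),
    "recall": recall_score(y_true, y_pred, pos_label=1),
}

# The same recall computed by hand from the confusion matrix,
# which is what I did in Excel.
recall_by_hand = tp / (tp + fn)

print(cm)
print(metrics)
print(recall_by_hand)
```

On my real data, the values computed this way do not match the values in the metrics dataset.
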
I hope you can help me :)