0 votes
I would like to score my own prediction method against the same test set that DSS generated to test a model. Because the test set is produced by randomized sampling, I can't easily reproduce it myself.

Is there a simple way to do that? Maybe by extracting the test set to a dataset that I could then analyze?

Thanks!

1 Answer

+2 votes
Best answer
Hi,

There are two solutions:

* Recommended: Split the dataset yourself, then have the Analysis Models use your predefined train and test sets instead of letting DSS do a random split. At the moment, doing a random split with the split recipe is a bit tricky: you first have to create a random column with a Python processor in a preparation recipe, then split on that column.

* Hackish / not officially supported: When using memory-based models in DSS, the train and test sets are dumped as CSV files in the DSS datadir > analysis-data > project > analysis_id > model_id > splits.
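
For the recommended option, the random column could look roughly like the sketch below. It assumes the Python processor calls a function named `process` once per row and stores the return value in the new column; check the DSS documentation for the exact signature in your version. The helper `assign_split`, the 0.2 test ratio, and the seed are illustrative choices of mine, not something DSS defines, and the seed only makes the split repeatable if rows are processed in the same order by a single process.

```python
import random

# Assumption: a fixed seed, so re-running over the same rows in the same
# order gives the same split. DSS may process rows differently.
_rng = random.Random(42)


def process(row):
    """Return a uniform random number in [0, 1) for the new column.

    `row` is assumed to be a dict of column name -> value; it is unused here.
    """
    return _rng.random()


def assign_split(value, test_ratio=0.2):
    """Hypothetical helper: map the random value to a 'train'/'test' label."""
    return "test" if value < test_ratio else "train"


# Local simulation on 1000 dummy rows, outside DSS.
rows = [{"id": i} for i in range(1000)]
values = [process(r) for r in rows]
labels = [assign_split(v) for v in values]
test_fraction = labels.count("test") / len(labels)
print("test fraction:", test_fraction)
```

In DSS itself, only the random column is needed: the split recipe can then send rows below the threshold to the test dataset and the rest to the train dataset, and both the model and your own prediction method can be scored on the same test dataset.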

©Dataiku 2012-2018 - Privacy Policy