I have a flow with 4 models (XGBoost) configured from the visual ML interface. If I train them there, each training run takes less than a minute.

However, when I start the training job for the 4 models in the flow, it takes 60 minutes or more. I lost patience and aborted the jobs early.
Have you checked that the sampling settings on the Train recipe in the flow are the same as those in your analysis?
Hi, did you find the source of the discrepancy?
No, I did not. My dataset is well below the default sampling setting of the first 10,000 rows.

It is extremely annoying! Instead of 20 seconds, training again takes 20 minutes as soon as I start 2-3 training jobs in parallel.
Could you please attach a diagnostic of the affected job? From the job page, click Actions > Download job diagnosis.
If the resulting file is too large for email (> 15 MB), you can use a file transfer service such as WeTransfer to send it to us.

©Dataiku 2012-2018 - Privacy Policy