Q & A
Why does starting a training job from the flow take much longer than from an analysis?
I have a flow with 4 XGBoost models configured in the visual ML interface. When I train them there, each training takes less than a minute.
However, when I start the training job for the 4 models from the flow, it takes 60 minutes or more. I did not have the patience to wait and aborted them early.
Have you checked that the sampling settings on the Train recipe in the flow are the same as in your analysis?
Hi, did you find the source of the discrepancy?
No, I did not. My dataset is well below the default setting of the first 10,000 rows.
It is extremely annoying! Instead of 20 seconds, the training again takes 20 minutes as soon as I start 2-3 training jobs in parallel.
Could you please attach a diagnosis of the affected job? From the job page, click on Actions > Download job diagnosis.
If the resulting file is too large for email (> 15 MB), you can use a file transfer service like WeTransfer to send it to us.
Difference Between Flow and Analysis Steps
How to schedule a job to incrementally append data to an hdfs dataset from an oracle table ?
Accessing flow variables from within a FS provider?
When clicking "+ more" on a Long Description in the Flow, it takes me to the recipe editor
In the flow, what is the white curvy arrow on the top right corner of a dataset?
©Dataiku 2012-2018 -