Difference Between Flow and Analysis Steps

Solved!
dwrench07
Level 2

What is the difference between the Flow and Analysis steps? Are the Analysis steps applied to the 'Web Service' before the transformations? Like having a preprocessing function before a sklearn.pipeline()?




def preprocessing(data):
    # do some preprocessing... but not transforms...
    return data


 

1 Solution
Alex_Combessie
Dataiker Alumni
Hi,

The flow is the main interface for the current project. The analysis and notebooks are for experimentation and prototyping. Once one of these prototypes is ready, you 'deploy' the model to the flow or 'convert' the notebook to a recipe. Working this way lets you keep a clean and concise flow while running many experiments at the same time. In my experience, it makes collaboration on the same project much easier.

Regarding the analysis steps before a visual machine learning model, you are correct. They are pipelined along with the model when you deploy it for real-time scoring in an API service.

Note that if you use visual machine learning models, we try to optimize the whole scoring pipeline using Java to make it faster than Python. More details: https://doc.dataiku.com/dss/latest/machine-learning/scoring-engines.html

Hope it helps,
Alexandre
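To make the "pipelined along with the model" idea concrete, here is a minimal plain-Python sketch of the analogy from the question. It is not Dataiku's API or its actual scoring engine. Every name in it (`fill_missing_age`, `add_age_squared`, `ToyModel`, `DeployedPipeline`) is hypothetical and only illustrates that preparation steps and the model are bundled, so a raw record sent to the service passes through the steps before the model sees it.

```python
from typing import Callable, Dict, List, Optional

# A raw record as it might arrive at a scoring service (hypothetical shape).
Record = Dict[str, Optional[float]]


def fill_missing_age(record: Record) -> Record:
    """Hypothetical preparation step: replace a missing age with a default."""
    cleaned = dict(record)  # copy so the caller's record is not mutated
    if cleaned.get("age") is None:
        cleaned["age"] = 30.0
    return cleaned


def add_age_squared(record: Record) -> Record:
    """Hypothetical preparation step: derive a new feature from an existing one."""
    enriched = dict(record)
    enriched["age_squared"] = enriched["age"] ** 2
    return enriched


class ToyModel:
    """Stand-in for a trained model; the threshold is arbitrary."""

    def predict(self, record: Record) -> int:
        return 1 if record["age_squared"] > 1000 else 0


class DeployedPipeline:
    """Bundles preparation steps with a model, applied in order at scoring time."""

    def __init__(self, steps: List[Callable[[Record], Record]], model: ToyModel):
        self.steps = steps
        self.model = model

    def score(self, raw_record: Record) -> int:
        record = raw_record
        for step in self.steps:
            record = step(record)  # each step runs before the model
        return self.model.predict(record)


pipeline = DeployedPipeline([fill_missing_age, add_age_squared], ToyModel())
print(pipeline.score({"age": None}))  # missing age is filled first, then scored
print(pipeline.score({"age": 40.0}))
```

The point of the sketch is that the caller only ever sends raw records: the preparation steps travel with the model instead of living in a separate function that the client has to remember to call.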
