+1 vote

1 Answer

0 votes
Best answer
Currently the visual preparation recipes are working line by line. This allows to work on very large datasets by streaming, potentially in parallel on Hadoop. The downside is that each line is processed independently, so, an exponential moving average cannot reliably be done in this type of recipe.

If the dataset fits in memory, I would go with a Python recipe and use the functions from Pandas.
1,325 questions
1,345 answers
11,895 users

©Dataiku 2012-2018 - Privacy Policy