I have a large dataset (~200GB) that I would like other users to be able to filter by entering configuration parameters (e.g. column contains "term").

Plugin recipes are written in Python (or R), but the dataset is too large to load into a pandas DataFrame.

How can I write the plugin recipe so that the data can be easily filtered by a user?

1 Answer

Hi Jonathan,

In Python, you can use our advanced API to read a large dataset in chunks, so that only one chunk is held in memory at a time.
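As a rough illustration of the chunked approach, here is a minimal sketch, not an official snippet. It assumes the plugin's recipe.json declares an input role named "input", an output role named "output", and two parameters named "column" and "term"; those names are hypothetical, as are the helpers `filter_chunk` and `run_recipe`. It also assumes the output keeps the same columns as the input.

```python
import pandas as pd


def filter_chunk(chunk: pd.DataFrame, column: str, term: str) -> pd.DataFrame:
    """Keep rows whose `column` contains `term` (plain substring, not a regex)."""
    values = chunk[column]
    # astype(str) lets .str work on non-text columns; notna() stops missing
    # values from matching through their string form ("nan", "None").
    mask = values.notna() & values.astype(str).str.contains(term, regex=False)
    return chunk[mask]


def run_recipe(chunksize: int = 100_000) -> None:
    """Stream the input dataset through filter_chunk, one chunk at a time."""
    # Imported here because these modules only exist inside Dataiku DSS.
    import dataiku
    from dataiku.customrecipe import (
        get_input_names_for_role,
        get_output_names_for_role,
        get_recipe_config,
    )

    config = get_recipe_config()
    # Role and parameter names below are assumptions about recipe.json.
    source = dataiku.Dataset(get_input_names_for_role("input")[0])
    target = dataiku.Dataset(get_output_names_for_role("output")[0])

    # Assumption: the filtered output has exactly the input's columns.
    target.write_schema(source.read_schema())
    with target.get_writer() as writer:
        for chunk in source.iter_dataframes(chunksize=chunksize):
            matches = filter_chunk(chunk, config["column"], config["term"])
            if not matches.empty:
                writer.write_dataframe(matches)
```

In the recipe script you would end with a call to `run_recipe()`. Memory use is bounded by `chunksize`, but the whole 200GB still streams through the Python process, so expect this to be slow compared with filtering in the database.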

Otherwise, if the dataset is stored in a SQL database, you can design the plugin to generate a SQL query and let the database do the filtering.
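For the SQL route, something along these lines might work. It is only a sketch under several assumptions: the input is a SQL table dataset, the database accepts double-quoted identifiers, and recipe.json uses the same hypothetical "input", "output", "column" and "term" names. The helpers `quote_identifier`, `quote_literal`, `build_filter_query` and `run_recipe` are made up for the example. As far as I know, `SQLExecutor2.exec_recipe_fragment` runs the query in the database and writes the result to the output dataset, so the rows never pass through Python.

```python
def quote_identifier(name: str) -> str:
    """Double-quote a SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value: str) -> str:
    """Single-quote a SQL string literal, escaping embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_filter_query(table: str, column: str, term: str) -> str:
    """Build a SELECT that keeps rows whose `column` contains `term`.

    Note: '%' and '_' inside `term` still act as LIKE wildcards here.
    """
    pattern = "%" + term + "%"
    return "SELECT * FROM {} WHERE {} LIKE {}".format(
        quote_identifier(table), quote_identifier(column), quote_literal(pattern)
    )


def run_recipe() -> None:
    """Push the filter down to the database behind the input dataset."""
    # Imported here because these modules only exist inside Dataiku DSS.
    import dataiku
    from dataiku import SQLExecutor2
    from dataiku.customrecipe import (
        get_input_names_for_role,
        get_output_names_for_role,
        get_recipe_config,
    )

    config = get_recipe_config()
    # Role and parameter names below are assumptions about recipe.json.
    source = dataiku.Dataset(get_input_names_for_role("input")[0])
    target = dataiku.Dataset(get_output_names_for_role("output")[0])

    # Assumption: a SQL table dataset, whose location info exposes the
    # table name (a schema prefix may be needed on some connections).
    table = source.get_location_info()["info"]["table"]
    query = build_filter_query(table, config["column"], config["term"])
    SQLExecutor2.exec_recipe_fragment(target, query)
```

Quoting the user-supplied column and term matters because they come straight from the plugin's configuration form; without it, a stray quote in the term would break the query.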

Best regards,

Thanks Henri