
Situation:
The dataset is partitioned by year, month, and day on HDFS.
Existing data: year=2016/month=05/
day=01
day=02
day=03
...
day=12

Questions:

- If I rebuild the dataset for 2016-05-12, is only the data under the path year=2016/month=05/day=12 overwritten? Or will everything under the folder year=2016/... be overwritten?

- If I build the dataset for 2016-05-13, is data written only to the path year=2016/month=05/day=13, with all existing data left unchanged (not overwritten)? Or will everything under the folder year=2016/... be recalculated?

 


1 Answer

Best answer
Hi,

The answer depends on the type of recipe you're using.

If it's an SQL query recipe:

- In both cases, only the selected partition is written or overwritten.
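To make "only the selected partition" concrete, here is a toy model on the local filesystem. It is not DSS or HDFS code: the helper names and the file name part-00000.csv are made up for illustration, and it only assumes the year=/month=/day= layout from the question.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Sequence


def partition_dir(root: Path, year: int, month: int, day: int) -> Path:
    """Map a date to its partition directory (layout taken from the question)."""
    return root / f"year={year:04d}" / f"month={month:02d}" / f"day={day:02d}"


def overwrite_partition(root: Path, year: int, month: int, day: int,
                        rows: Sequence[str]) -> Path:
    """Replace the contents of one partition directory; siblings are not touched."""
    target = partition_dir(root, year, month, day)
    if target.exists():
        shutil.rmtree(target)  # drop only this partition's old files
    target.mkdir(parents=True)
    # Hypothetical file name, chosen only for this sketch.
    (target / "part-00000.csv").write_text("\n".join(rows) + "\n")
    return target


root = Path(tempfile.mkdtemp())
for d in range(1, 13):  # existing data: day=01 .. day=12
    overwrite_partition(root, 2016, 5, d, [f"old-{d}"])

overwrite_partition(root, 2016, 5, 12, ["rebuilt"])  # rebuild 2016-05-12
overwrite_partition(root, 2016, 5, 13, ["new"])      # build 2016-05-13

# Map each partition directory name to the content of its file.
contents = {p.parent.name: p.read_text().strip()
            for p in sorted(root.rglob("*.csv"))}
```

In this model, rebuilding day=12 replaces only that directory, and building day=13 adds a new directory next to the existing ones, which all keep their old content.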

If it's an SQL script recipe:

- It depends entirely on what your script does. Anything is possible, so you are responsible for deleting and writing the right partition.
See http://doc.dataiku.com/dss/latest/partitions/sql_recipes.html?highlight=sql%20script
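One common way to take on that responsibility is a "delete, then insert" pattern scoped to a single partition. The sketch below only illustrates the idea with Python's built-in sqlite3 module. The tables src and dst, their columns, and the function rebuild_partition are hypothetical, and in a real DSS SQL script the partition values would come from the recipe's partition variables rather than from Python arguments (see the linked documentation for their exact names).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical source and target tables, both carrying the partition columns.
conn.execute("CREATE TABLE src (year TEXT, month TEXT, day TEXT, value INTEGER)")
conn.execute("CREATE TABLE dst (year TEXT, month TEXT, day TEXT, total INTEGER)")
conn.executemany(
    "INSERT INTO src VALUES (?, ?, ?, ?)",
    [("2016", "05", "11", 1), ("2016", "05", "12", 2), ("2016", "05", "12", 3)],
)
conn.execute("INSERT INTO dst VALUES ('2016', '05', '11', 1)")
conn.execute("INSERT INTO dst VALUES ('2016', '05', '12', 999)")  # stale result


def rebuild_partition(conn, year, month, day):
    """Delete the target partition's rows, then recompute only that partition."""
    key = (year, month, day)
    with conn:  # commit on success, roll back if either statement fails
        conn.execute(
            "DELETE FROM dst WHERE year = ? AND month = ? AND day = ?", key
        )
        conn.execute(
            "INSERT INTO dst "
            "SELECT year, month, day, SUM(value) FROM src "
            "WHERE year = ? AND month = ? AND day = ? "
            "GROUP BY year, month, day",
            key,
        )


rebuild_partition(conn, "2016", "05", "12")
rebuild_partition(conn, "2016", "05", "12")  # running it twice adds no duplicates
rows = conn.execute("SELECT day, total FROM dst ORDER BY day").fetchall()
```

Because both statements filter on the same partition key, rows for day=11 are left alone. A script that omits the DELETE would append duplicates on every rebuild, and one that omits the WHERE filter would touch every partition.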

© Dataiku 2012-2018 - Privacy Policy