How to sync a partitioned dataset only for partitions not in the output

Alex_Combessie
Dataiker Alumni
I have a sync recipe with one partitioned dataset in input and one partitioned dataset in output. Partitioning is by hour.

The input dataset receives new data continuously. Currently, I build the output dataset manually by selecting the new dates and using the append option instead of overwrite.

This is obviously not optimal, as it involves manual intervention.

What would be a solution to sync only the partitions from the input that are not yet in the output (other than job scheduling, which could be too costly)?
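To make the requirement concrete, here is a minimal sketch of the selection logic I have in mind: a plain set difference between the two lists of partition identifiers. The function name `missing_partitions` and the sample identifiers are hypothetical. I am assuming hourly partition identifiers look like `YYYY-MM-DD-HH`, and that the two lists could be obtained inside DSS with something like `dataiku.Dataset(name).list_partitions()`, which I have not verified for this setup.

```python
def missing_partitions(input_partitions, output_partitions):
    """Return the input partition ids that are absent from the output, sorted.

    Both arguments are iterables of partition identifier strings.
    Assumption: ids are zero-padded (e.g. "2020-01-01-07"), so a plain
    string sort is also chronological order.
    """
    already_built = set(output_partitions)
    return sorted(p for p in set(input_partitions) if p not in already_built)


if __name__ == "__main__":
    # Hypothetical identifiers; in DSS these lists would come from the datasets.
    input_ids = ["2020-01-01-00", "2020-01-01-01", "2020-01-01-02"]
    output_ids = ["2020-01-01-00"]

    todo = missing_partitions(input_ids, output_ids)
    # A comma-separated string is a convenient form to pass on to a build step.
    print(",".join(todo))
```

The open question is what should compute this list and trigger the build for only those partitions without manual intervention.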