Is there a way to guarantee a dataset is sorted in order to pivot it?

In the past I added an extra Python or SQL recipe to sort the data, but I have realized that even this no longer works reliably. Sometimes, after an SQL ORDER BY or a pandas sort_values, the resulting dataset contains a few seemingly random rows that are out of order, even though 99% of the rows are sorted correctly. For example, this is what I have right now:

ID     Tag
1      automation
...
100    automation
101    biotech        <--- should not be there!
102    automation
...
152    automation
153    biotech
...

Since it doesn't matter whether I sort using Python or SQL, I guess it has something to do with how Dataiku works internally. Is there anything I can do?
asked by Simon

1 Answer

Unfortunately, the fact that neither Python nor SQL manages to sort your dataset has nothing to do with how Dataiku works internally: we only orchestrate the execution of recipes and queries.

The issue is somewhere else...

In any case, investigating more into this issue would require a dataset extract so that we can try to reproduce the issue.
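
To narrow down such an extract, it may help to first locate the rows that break the order. The following is only a minimal sketch, assuming the data fits in a pandas DataFrame with the ID and Tag columns from the question; the helper name out_of_order_rows and the sample values are hypothetical, made up for illustration.

```python
import pandas as pd


def out_of_order_rows(df, key):
    """Return the rows whose `key` value is smaller than in the previous row.

    Hypothetical helper for illustration; it is not part of pandas or Dataiku.
    """
    values = df[key].tolist()
    # Positions where the value drops compared with the row just before it.
    breaks = [i for i in range(1, len(values)) if values[i] < values[i - 1]]
    return df.iloc[breaks]


# Made-up sample that mimics the pattern shown in the question.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "Tag": ["automation", "automation", "biotech", "automation", "biotech"],
})

# Rows that violate the expected ordering by Tag.
bad = out_of_order_rows(df, "Tag")

# Sort explicitly with a stable algorithm, then verify the result
# before handing the data to a pivot step.
sorted_df = df.sort_values("Tag", kind="mergesort").reset_index(drop=True)
assert out_of_order_rows(sorted_df, "Tag").empty
```

If the check passes right after sorting but the rows read back from the output dataset fail it, the few rows around each reported break would make a compact extract to share.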