What is the difference between uploaded datasets and managed datasets?

Solved!
FlorianD
Dataiker
What is the difference between uploaded datasets and managed datasets?
 
1 Solution
FlorianD
Dataiker
Author

There are three kinds of datasets in Dataiku:

  • Managed datasets: datasets that are created by a recipe. Dataiku assumes it “owns” the data and schema of those datasets.

  • Uploaded datasets: raw files that are uploaded through the user interface. They are stored locally in raw form, in a specific folder (that you can find in DATA_DIR) with one subfolder per dataset, named PROJECTNAME.DATASETNAME.

    You can “modify” an uploaded dataset by re-uploading it. An uploaded dataset is actually a folder, meaning that it can contain several files.

  • Non-managed, non-uploaded datasets: usually a dataset that points to existing data (a table in a SQL database, for instance) that you can use as an input for a recipe.
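
To make the "an uploaded dataset is actually a folder" point concrete, here is a minimal Python sketch that lists the raw files behind one uploaded dataset. It only relies on the PROJECTNAME.DATASETNAME subfolder naming described above. The `uploads_root` argument and both function names are hypothetical: the exact location of the uploads folder inside DATA_DIR is not given here, so you would have to pass it in yourself.

```python
from pathlib import Path


def uploaded_dataset_dir(uploads_root, project_name, dataset_name):
    """Return the folder assumed to hold the raw files of one uploaded dataset.

    uploads_root is a hypothetical parameter: the folder inside DATA_DIR that
    contains one PROJECTNAME.DATASETNAME subfolder per uploaded dataset.
    """
    return Path(uploads_root) / f"{project_name}.{dataset_name}"


def list_uploaded_files(uploads_root, project_name, dataset_name):
    """List the file names stored for an uploaded dataset, sorted by name."""
    folder = uploaded_dataset_dir(uploads_root, project_name, dataset_name)
    if not folder.is_dir():
        # No such folder: the dataset is not an uploaded one, or the root is wrong.
        return []
    return sorted(p.name for p in folder.iterdir() if p.is_file())
```

Calling `list_uploaded_files(root, "MYPROJECT", "sales")` with a valid root would return every file uploaded for that dataset, which is why a single uploaded dataset can be backed by several files.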



 


2 Replies
Alex
Level 1
Thank you!