Dataset still in the server even when deleted in the web interface

Binh
Level 1

Hello,



When we upload datasets directly from the browser in DSS, the files can be stored in one of two locations:




  • The default location, which stores files in dss/uploads

  • filesystem_managed, which stores files in dss/managed_datasets/uploads



When we delete the file in the browser, the dataset is deleted (and no longer appears in the flow), but when we look at the server from the terminal, we can still see the uploaded file in the folders mentioned above.



So the files are never deleted and take up too much space on the server. We have to delete them by hand in the terminal.



Do you know how we can automate this process, so that when a file is deleted in the browser it is also deleted on the server?



 



Thank you.

0 Kudos
4 Replies
will_nowak
Level 1

Hello! Be sure to select the `Drop Data` radio button when trying to delete data.



It should look as follows:



 





-----



How are you trying to delete the data? The button mentioned above appears if you right-click on a dataset in the flow and then select `Delete`.

0 Kudos
Binh
Level 1
Author
I just replied with a screenshot (I can't add a screenshot to a reply, so I added a new answer instead).
0 Kudos
Binh
Level 1
Author

Hello,



When I want to delete an uploaded file, there is no "drop data" box for me to tick.



I can tick it when I put files on the server myself and then want to delete them, but not for uploaded files.

(To delete the data, I click on the dataset, then click on Delete in the right-hand panel.)



Binh

0 Kudos
will_nowak
Level 1
Hello!

Indeed, this is a feature and not a bug. For files uploaded to the server, DSS doesn't allow the deletion of this data from the flow so as to protect any downstream recipes / datasets (since the flow rebuild potential is lost when initial inputs are deleted). In addition, DSS is not meant to serve as a tool to manage the data that is uploaded to the server, but rather as a tool for interacting with it once there.

But I can understand the desire. Perhaps one solution is to write a macro that uses the public API to remove datasets from your project. https://doc.dataiku.com/dss/latest/publicapi/client-python/datasets.html#basic-operations
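
As a rough starting point, here is a minimal sketch of what the body of such a macro or script might look like with the `dataikuapi` Python client. The host URL, API key, project key and dataset names are placeholders, and the helper names `delete_datasets` and `connect` are made up for illustration. I have not verified that `drop_data=True` actually removes the files for uploaded datasets, so please try it on a throwaway dataset first.

```python
def delete_datasets(client, project_key, dataset_names, drop_data=True):
    """Delete the named datasets from a project through the public API.

    `client` is expected to behave like dataikuapi.DSSClient.
    `drop_data=True` asks DSS to drop the underlying data as well; whether
    this clears files for uploaded datasets is an assumption to verify.
    """
    project = client.get_project(project_key)
    deleted = []
    for name in dataset_names:
        project.get_dataset(name).delete(drop_data=drop_data)
        deleted.append(name)
    return deleted


def connect(host, api_key):
    """Build an API client (hypothetical helper; needs the dataikuapi package)."""
    # Imported here so delete_datasets can be tried out without the package.
    import dataikuapi
    return dataikuapi.DSSClient(host, api_key)


# Example usage (all values below are placeholders):
# client = connect("http://localhost:11200", "YOUR_API_KEY")
# delete_datasets(client, "MYPROJECT", ["uploaded_dataset_1"])
```

Taking the client as a parameter keeps the deletion logic separate from the connection details, so the same function could be called from a macro, a scenario step or a standalone script.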

Please let me know if this is of any assistance.
0 Kudos
