I am trying to save a pandas DataFrame to a managed folder in Dataiku.
My code:
import dataiku
import pandas as pd
temp_folder = "reports_TEMP"
path_upload_file = "testfile.csv"
df = pd.DataFrame(range(0,10), columns=["test"])
handle = dataiku.Folder(temp_folder)
with handle.get_writer(path_upload_file) as w:
    df.to_csv(w)
and this is the error that I get:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-135-e90a7150097c> in <module>
8 handle = dataiku.Folder(temp_folder)
9 with handle.get_writer(path_upload_file) as w:
---> 10 df.to_csv(w)
/data/dataiku/dataiku-dss-6.0.1/dss_data/code-envs/python/Py_36_flight_risk/lib/python3.6/site-packages/pandas/core/generic.py in to_csv(self, path_or_buf, sep, na_rep, float_format, columns, header, index, index_label, mode, encoding, compression, quoting, quotechar, line_terminator, chunksize, date_format, doublequote, escapechar, decimal)
3200 doublequote=doublequote,
3201 escapechar=escapechar,
-> 3202 decimal=decimal,
3203 )
3204 formatter.save()
/data/dataiku/dataiku-dss-6.0.1/dss_data/code-envs/python/Py_36_flight_risk/lib/python3.6/site-packages/pandas/io/formats/csvs.py in __init__(self, obj, path_or_buf, sep, na_rep, float_format, cols, header, index, index_label, mode, encoding, compression, quoting, line_terminator, chunksize, quotechar, date_format, doublequote, escapechar, decimal)
64
65 self.path_or_buf, _, _, self.should_close = get_filepath_or_buffer(
---> 66 path_or_buf, encoding=encoding, compression=compression, mode=mode
67 )
68 self.sep = sep
/data/dataiku/dataiku-dss-6.0.1/dss_data/code-envs/python/Py_36_flight_risk/lib/python3.6/site-packages/pandas/io/common.py in get_filepath_or_buffer(filepath_or_buffer, encoding, compression, mode)
198 if not is_file_like(filepath_or_buffer):
199 msg = f"Invalid file path or buffer object type: {type(filepath_or_buffer)}"
--> 200 raise ValueError(msg)
201
202 return filepath_or_buffer, None, compression, False
ValueError: Invalid file path or buffer object type: <class 'dataiku.core.managed_folder.ManagedFolderWriter'>
Hi,
You can use the Export to Folder recipe to export a DSS dataset to a managed folder.
If you want to do this in code instead, you can try dataiku.Folder.upload_data:
https://doc.dataiku.com/dss/latest/python-api/managed_folders.html#dataiku.Folder.upload_data
The following sample worked fine for me:
# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
# Read recipe inputs
base_64 = dataiku.Dataset("base_64")
df = base_64.get_dataframe()
managed_folder_id = "output"
output_folder = dataiku.Folder(managed_folder_id)
filename = "my_file.csv"
output_folder.upload_data(filename, df.to_csv(index=False).encode("utf-8"))
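The sample above relies on two pandas behaviors: to_csv() called without a path or buffer returns the CSV as a string, and .encode("utf-8") turns that string into the bytes passed to upload_data. A minimal sketch of just that serialization step, using the toy frame from the question and no Dataiku calls, so it can be checked outside DSS:

```python
import pandas as pd

# Same toy frame as in the question.
df = pd.DataFrame(range(0, 10), columns=["test"])

# With no path or buffer argument, to_csv() returns the CSV as a str.
csv_text = df.to_csv(index=False)

# Encode to bytes before handing the data to a byte-oriented API.
csv_bytes = csv_text.encode("utf-8")

print(type(csv_text).__name__, type(csv_bytes).__name__)
print(csv_text.splitlines()[:3])
```

Passing index=False drops the row index column; leave it out if you want the index written to the file.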
Alternatively, if you do want to use get_writer(), write the encoded CSV bytes to the writer yourself:
import dataiku
import pandas as pd
temp_folder = "output"
path_upload_file = "chunck_written.csv"
input_dataset = dataiku.Dataset("dataset_name")
handle = dataiku.Folder(temp_folder)
df = input_dataset.get_dataframe()
with handle.get_writer(path_upload_file) as w:
    w.write(df.to_csv().encode('utf-8'))
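As for why the original df.to_csv(w) call failed: the traceback shows pandas rejecting the writer in an is_file_like check, which requires a read or write method and also __iter__. Presumably ManagedFolderWriter offers write() but is not iterable; that is an inference from the traceback, not something confirmed against the Dataiku source. The sketch below uses a hypothetical BytesOnlyWriter stand-in (not a Dataiku class) to show the check and a chunked variant of the w.write(...) pattern; write_csv_in_chunks and rows_per_chunk are illustrative names as well.

```python
import io

import pandas as pd
from pandas.api.types import is_file_like


class BytesOnlyWriter:
    """Hypothetical stand-in for a writer that only exposes write(bytes)."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        if not isinstance(data, bytes):
            raise TypeError("expected bytes")
        self.chunks.append(data)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False


def write_csv_in_chunks(df, writer, rows_per_chunk=4):
    """Serialize df a few rows at a time and write each piece as bytes."""
    for start in range(0, len(df), rows_per_chunk):
        piece = df.iloc[start:start + rows_per_chunk]
        # Emit the header only with the first piece.
        text = piece.to_csv(index=False, header=(start == 0))
        writer.write(text.encode("utf-8"))


df = pd.DataFrame(range(0, 10), columns=["test"])

with BytesOnlyWriter() as w:
    # write() without __iter__ is not "file-like" by pandas' definition,
    # whereas a real binary buffer such as io.BytesIO is.
    print(is_file_like(w), is_file_like(io.BytesIO()))
    write_csv_in_chunks(df, w)

result = b"".join(w.chunks)
# The chunked output matches a single-shot serialization.
print(result == df.to_csv(index=False).encode("utf-8"))
```

In a recipe, the with block would use handle.get_writer(path_upload_file) as in the snippet above in place of the stand-in. Writing in pieces only helps when the full CSV string would be too large to build in memory at once.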
Thank you, Alex!
I need the second solution - just to test writing into a managed folder for another task.
It works!