Use case: you want to sync a local file with a Dataiku dataset.

1 Answer


Hi Frank,

You can do this with a custom Python recipe in a plugin, or with a Python step in a scenario (I usually do it in a scenario), using something similar to:

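import dataiku
import pandas as pd
from dataiku.scenario import Scenario
from dataikuapi import SyncRecipeCreator

# Handles used by the steps below
client = dataiku.api_client()
project = client.get_project(dataiku.default_project_key())
scenario = Scenario()

dataset_name = "input_dataset"  # placeholder: name of the dataset to create
folder_path = "path/to/file/"  # path on the DSS server; don't forget the trailing /
file = "file.csv"

# Read the file only to get the column names; adapt the parameters to your file
df = pd.read_csv(folder_path + file, sep=';')  # keep the separator in sync with formatParams

# Create a Filesystem dataset pointing at the file; adapt the parameters here too
dataset = project.create_dataset(dataset_name, 'Filesystem', params={'connection': 'filesystem_root', 'path': folder_path + file}, formatType='csv', formatParams={'separator': ';', 'style': 'no_escape_no_quote', 'parseHeaderRow': True})

# I usually set every column to string first and change the types afterwards
dataset.set_schema({'columns': [{'name': column, 'type': 'string'} for column in df.columns]})

# Create a sync recipe from the new dataset to "output_dataset" (expected to exist already)
builder = SyncRecipeCreator("sync_output_dataset", project)
builder = builder.with_input(dataset_name)
builder = builder.with_output("output_dataset", append=False)
recipe = builder.build()

# Build the output without rebuilding its upstream datasets
scenario.build_dataset("output_dataset", build_mode='NON_RECURSIVE_FORCED_BUILD')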


Hi Alan,
Thanks for the reply, I appreciate it!
When I run "df = pd.read_csv(folder_path + file)"
I get this error: file does not exist.
I think it is trying to read from the DSS server's drive, not from my local machine.
Any thoughts?

Yes, you have to upload the file to the DSS server first (you have to do this even if you are using the REST API).
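
For the upload step itself, here is a minimal sketch that runs on your own machine rather than on the DSS server. It assumes the dataikuapi package is installed locally, that you have an API key, and that your DSS version provides create_upload_dataset and uploaded_add_file in the public API client. The host URL, API key, project key and dataset name are placeholders, and the helper names upload_local_csv and connect are made up for this example.

```python
import os


def upload_local_csv(project, local_path, dataset_name):
    """Create an 'uploaded files' dataset and push one local file into it.

    `project` is expected to be a dataikuapi project handle (DSSProject).
    """
    if not os.path.isfile(local_path):
        raise FileNotFoundError(local_path)
    # Assumption: create_upload_dataset is available in your dataikuapi version
    dataset = project.create_upload_dataset(dataset_name)
    with open(local_path, "rb") as fp:
        # The file is stored on the server under its base name
        dataset.uploaded_add_file(fp, os.path.basename(local_path))
    return dataset


def connect(host, api_key, project_key):
    """Return a project handle from outside DSS (placeholder arguments)."""
    import dataikuapi  # imported here so the helper above has no hard dependency
    client = dataikuapi.DSSClient(host, api_key)
    return client.get_project(project_key)


# Example call (all values are placeholders):
# project = connect("https://dss.example.com:11200", "YOUR_API_KEY", "MYPROJECT")
# upload_local_csv(project, "path/to/file/file.csv", "uploaded_file")
```

Once the file is on the server, the new dataset will probably still need its format settings (separator, header row) and schema set, in the same way as in the answer above.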
