Hi
I would like to create a dataset in a notebook from financial data that I am pulling from a web service, Quandl.
They provide an API that lets me download the data as a pandas dataframe:
import Quandl as qd
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
df = qd.get("GOOG/NASDAQ_GOOGL")
df.index
Out[20]:
DatetimeIndex(['2004-08-19', '2004-08-20', '2004-08-23', '2004-08-24', '2004-08-25', '2004-08-26', '2004-08-27', '2004-08-30', '2004-08-31', '2004-09-01',
...
'2015-09-24', '2015-09-25', '2015-09-28', '2015-09-29', '2015-09-30', '2015-10-01', '2015-10-02', '2015-10-12', '2015-10-13', '2015-10-14'], dtype='datetime64[ns]', name=u'Date', length=2803, freq=None, tz=None)
In [21]:
df.columns
Out[21]:
Index([u'Open', u'High', u'Low', u'Close', u'Volume'], dtype='object')
df is a dataframe.
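For anyone who wants to reproduce the shape of df without a Quandl account, here is a minimal sketch. The values are made up, and `flatten_for_dataset` is a hypothetical helper name, not part of pandas or dataiku. It assumes the Date index should end up as an ordinary column; I have not established whether that is related to the error I describe next.

```python
import pandas as pd

# Made-up sample values standing in for the Quandl download (assumption).
dates = pd.to_datetime(["2004-08-19", "2004-08-20", "2004-08-23"])
sample = pd.DataFrame(
    {
        "Open": [50.0, 50.5, 55.4],
        "High": [52.0, 54.5, 56.7],
        "Low": [48.0, 50.3, 54.5],
        "Close": [50.2, 54.2, 54.7],
        "Volume": [44659000, 22834300, 18256100],
    },
    index=pd.DatetimeIndex(dates, name="Date"),
)


def flatten_for_dataset(frame):
    """Hypothetical helper: return a copy whose index is a regular column.

    reset_index() moves the named DatetimeIndex into a "Date" column and
    leaves a default integer index; column names are forced to plain str.
    """
    flat = frame.reset_index()
    flat.columns = [str(name) for name in flat.columns]
    return flat


flat = flatten_for_dataset(sample)
print(flat.dtypes)
```

The original frame is left untouched, so the flattened copy can be inspected side by side with df before trying to write anything.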
My guess is that the write step below fails because the dataset's schema isn't initialized properly.
I try to run:
fdc = dataiku.Dataset("qdl")
fdc.write_schema_from_dataframe(df)
fdc.write_with_schema(df)
and it fails with:
Unable to fetch schema for %s : %s'%(self.name,err_msg)
Hence my question:
What is the recommended way to create a dataset in Python from a pandas dataframe?
Thanks!