Hi,
The Python call that loads the dataframe (dataiku.Dataset(name, project).get_dataframe()) can be executed at query time instead of at webapp init. That way, the data is reloaded on each request.
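To illustrate the difference, here is a rough sketch that runs without Dataiku. The `load_rows` function and the `fake_storage` dict are hypothetical stand-ins for `dataiku.Dataset(name, project).get_dataframe()` and the real dataset, and the handler names are made up for the example:

```python
# Hypothetical in-memory "dataset" standing in for the real Dataiku dataset.
fake_storage = {"dataset_name": [1, 2, 3]}


def load_rows(name):
    # Stand-in for dataiku.Dataset(name, project).get_dataframe().
    return list(fake_storage[name])


# Loaded once at webapp init: later changes to the dataset are not seen.
rows_at_init = load_rows("dataset_name")


def handle_request_stale():
    # Returns the snapshot taken at init time.
    return rows_at_init


def handle_request_fresh():
    # Loads at query time, so every request sees the current data.
    return load_rows("dataset_name")


# Simulate the dataset being rebuilt after the webapp has started.
fake_storage["dataset_name"] = [1, 2, 3, 4]
```

After the simulated rebuild, only the handler that loads at query time picks up the new row.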
However, that might slow down your queries, so you should consider a caching system. For example:
import datetime
import dataiku

cached_dfs = {}  # (name, project) -> (load time, dataframe)

def get_cached_dataframe(name, project):
    # Reload the dataset if it is not cached yet or the cached copy is older than 30 minutes.
    key = (name, project)
    now = datetime.datetime.now()
    if key not in cached_dfs or now - cached_dfs[key][0] > datetime.timedelta(minutes=30):
        cached_dfs[key] = (now, dataiku.Dataset(name, project).get_dataframe())
    return cached_dfs[key][1]
# then in your code, when you need to get a dataframe object:
my_df = get_cached_dataframe("dataset_name", "project_key")
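If the backend serves several requests at once, or you want to test the expiry logic without waiting 30 minutes, the same idea can be wrapped in a small class. This is only a sketch under a few assumptions: `DataFrameCache`, its `loader` and `clock` parameters, and the `invalidate` method are names I made up, not part of the Dataiku API. The loader is passed in so the class itself does not depend on dataiku:

```python
import datetime
import threading


class DataFrameCache:
    """Time-based cache keyed by (name, project). Hypothetical helper, not a Dataiku class."""

    def __init__(self, loader, max_age=datetime.timedelta(minutes=30),
                 clock=datetime.datetime.now):
        self._loader = loader      # callable(name, project) -> dataframe
        self._max_age = max_age    # how long a cached copy stays valid
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # (name, project) -> (load time, dataframe)
        self._lock = threading.Lock()

    def get(self, name, project):
        key = (name, project)
        # The lock is held while loading, so concurrent requests wait
        # instead of triggering the same reload several times.
        with self._lock:
            now = self._clock()
            entry = self._entries.get(key)
            if entry is None or now - entry[0] > self._max_age:
                entry = (now, self._loader(name, project))
                self._entries[key] = entry
            return entry[1]

    def invalidate(self, name, project):
        # Force the next get() to reload, e.g. right after the dataset is rebuilt.
        with self._lock:
            self._entries.pop((name, project), None)


# In a webapp backend, the loader would be the call from the answer above:
# cache = DataFrameCache(lambda n, p: dataiku.Dataset(n, p).get_dataframe())
# my_df = cache.get("dataset_name", "project_key")
```

Holding the lock during the load keeps things simple but blocks other requests while a large dataset is read, so a per-key lock may be a better fit if you cache many datasets.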