+1 vote

In our project, we load input files into Hive tables using DSS and build summary tables on top of them, which are then used by Tableau.

We use a Python recipe that creates a Hive table based on the input file. The input file has a header row containing the column names, so the Python recipe creates all of these columns with the STRING data type. Later, we manually change the data types of these columns in DSS and use "Sync to Hive metastore" to synchronize the Hive table.

This works well for every other data type. For DATE columns, however, when we first load the data into the Hive table as STRING and then change the type to DATE, the column is synced to Hive as TIMESTAMP and the data comes through as NULL. How can we keep the DATE data type in Hive using DSS even though the column is initially created as STRING?

We thought of creating the tables manually and disabling the sync to the Hive metastore, but we use DSS flows to load the summary tables. The data types in the summary tables are also changed to string or timestamp, and DATE is not supported.

How can we handle this using DSS? Thanks in advance!

2 Answers

+1 vote

The "DATE" type in DSS is actually a Java-style date, in other words a SQL-style timestamp. There is no "date-only" type in DSS.
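
To illustrate what "a SQL-style timestamp" implies for date-only input, here is a minimal Python sketch. It assumes the input values are `yyyy-MM-dd` strings and that an ISO 8601 UTC timestamp string is an acceptable target representation; the function name `to_timestamp_string` and both of these assumptions are illustrative and not taken from DSS itself.

```python
from datetime import datetime, timezone


def to_timestamp_string(date_str, fmt="%Y-%m-%d"):
    """Turn a date-only string into a full UTC timestamp string.

    Hypothetical helper: returns None when the value does not match
    the expected format, instead of raising.
    """
    try:
        parsed = datetime.strptime(date_str.strip(), fmt)
    except (ValueError, AttributeError):
        # Unparseable or non-string input: signal "no value".
        return None
    # A date-only value carries no time of day, so it becomes midnight UTC.
    parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.strftime("%Y-%m-%dT%H:%M:%S.000Z")
```

A date-only value such as `2018-03-05` always picks up a time component (midnight here) once it is stored as a timestamp, which is why the output shows a datetime rather than a plain date.
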
Hey Clément,
I'm dealing with the same issue over and over again.
Even when I use a date format (https://monosnap.com/file/EHTLmdSnENvq60Jwp7ffskgoNOvS8A),
it's simply impossible to get a Date instead of a Datetime in the output.

Any chance to improve support of date formats in DSS? That would be so helpful!
0 votes

My solution was to write my own post-write statements on the final table when saving into a MySQL database.
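
The answer does not show the statements themselves, so the following is only a sketch of what such post-write statements might look like for MySQL. The helper name `build_date_fix_statements` and the table and column names in the usage line are hypothetical; the generated SQL uses MySQL's standard `ALTER TABLE ... MODIFY COLUMN` syntax.

```python
def build_date_fix_statements(table, date_columns):
    """Build one MySQL ALTER TABLE statement per column that should be DATE.

    Hypothetical helper: the returned strings are meant to be pasted into
    (or passed to) whatever runs SQL after the table has been written.
    """
    def quote(identifier):
        # Backtick-quote identifiers, doubling any embedded backticks.
        return "`" + identifier.replace("`", "``") + "`"

    return [
        "ALTER TABLE {} MODIFY COLUMN {} DATE".format(quote(table), quote(col))
        for col in date_columns
    ]


# Example with made-up names:
statements = build_date_fix_statements("summary", ["order_date", "ship_date"])
```

Each statement changes the column type in place, and MySQL converts the values already stored in that column to dates.
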

