Hi,

I have a recipe that worked properly in DSS 2.3 but no longer works in DSS 3.0:

# Read df from a dataset. "date" is a column of type "date" in DSS
# df.date is a date column

# do stuff with df

df = df.fillna("")

# do stuff with df

dataset.write_with_schema(df)

In DSS 3.0, the output column is now a string instead of a date.

asked by anonymous

1 Answer

Hi,

In DSS 3.0, DSS was upgraded to Pandas 0.17, which indeed introduces a behavior change regarding fillna on date columns.

* In DSS 2.3 / Pandas 0.16, filling a date column with "" filled the column with the "NaT" value ("Not a Time") and kept the dtype. Filling with "anyotherstring" failed.

* In DSS 3.0 / Pandas 0.17, filling a date column with any string, whether empty or not, now converts the column to the object dtype, which DSS then interprets as a string column.

Pandas 0.16:

Pandas 0.17:

Filling a whole dataframe that contains mixed value types with a single value is inherently dangerous. Both Pandas behaviors are questionable, but in the end you probably want to call fillna only on the columns where it makes sense, with a properly typed value for each.
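
For illustration, here is a minimal sketch of the per-column approach. The column names "date", "name" and "amount" and the sample values are made up for the example; in a recipe, df would come from your input dataset instead. Passing a dict to fillna fills only the listed columns, so the date column is left untouched and keeps its datetime dtype.

```python
import pandas as pd

# Hypothetical sample data standing in for the recipe's input dataset
df = pd.DataFrame({
    "date": pd.to_datetime(["2016-01-01", None, "2016-01-03"]),
    "name": ["a", None, "c"],
    "amount": [1.0, None, 3.0],
})

# Fill each column with a value of its own type.
# "date" is deliberately not listed, so its missing values stay NaT
# and the column is not converted to object.
df = df.fillna({"name": "", "amount": 0.0})

print(df.dtypes)
```

If the date column still shows up as a datetime dtype in the printed dtypes, it should be written back as a date rather than a string.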
