0 votes

Pretty sure this is a basic question, however I was not able to find a solution so far.

After removing some columns, I ended up with rows that have an empty value in each column (tooltip shows "null"), which I want to remove now.

I tried the following processor ("filter rows/cells on value"):

However, the empty rows are not removed. A counter-check with "Action: Only keep matching rows" removes all rows.

The "Remove rows where cell is empty-processor" in combination with selecting "all" columns removes each row with at least one empty column (which is not what I want) as described in https://doc.dataiku.com/dss/latest/preparation/processors/remove-empty.html.

Am I missing something? Thanks!

by
reopened by

2 Answers

0 votes
Best answer

Hello,

In the meantime, you can use the following trick:

1. Add a Python function step in "cell" mode to output an "all_empty" column, with the following code

def process(row):
    all_empty = all(
        [str(v).strip() == '' or v is None 
        for k,v in row.items() if k != 'all_empty']
    )
    return(all_empty)

2. Add a filter on value step to remove the rows where the "all_empty" indicator column is True

Hope it helps,

Alex

by
selected by
0 votes
Hi,

There are currently some known issues with this processor in "ALL" mode. You'll need to add the columns explicitly for the moment.
by
Thanks for the quick reply.
Unfortunately, nothing happens when adding all the columns explicitly (i.e. no rows are removed). Even when selecting only one column, no rows are removed (agian, counter-check with "Action: only keep matching rows" removes all rows).
1,278 questions
1,305 answers
1,486 comments
11,834 users

©Dataiku 2012-2018 - Privacy Policy