+1 vote
Hi,

We have a dataset with around 100K records and a column named Brand. When we run Analyze on the Brand column using the sample data, it shows the distinct values. But when we compute on "Whole data", it shows only the top 10 distinct values, even though the column has around 20 distinct values. To apply a Mass Action to the data, we need the list of all products in Analyze. Is there any way to increase the number of distinct values shown so that we can apply a Mass Action to all of them?

Regards,

Ravi Agrawal

1 Answer

0 votes
Best answer
Hi Ravi,

Because categorical data in a column can have very high cardinality, Dataiku restricts you to the top 10 values when analyzing the entire dataset. However, if you would like a count for every distinct value across all records in a dataset, I suggest using the Group By recipe on your dataset. If your data is already stored in a SQL database, you can also use the Charts section: switch your sampling (on the left-hand side) to In-Database, then build a bar chart or a pivot chart with the column you care about. This will generate a SQL query that returns the result you expect.
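
If it helps to see what the Group By approach computes, here is a minimal sketch in plain Python. It is not Dataiku's API: the function name `distinct_value_counts`, the `Price` column, and the sample rows are hypothetical, and a small in-memory CSV stands in for the full dataset. How the rows are actually read from your dataset is left out.

```python
import csv
import io
from collections import Counter


def distinct_value_counts(rows, column):
    """Count how often each value of `column` occurs across all rows."""
    counts = Counter()
    for row in rows:
        counts[row[column]] += 1
    return counts


# Hypothetical sample standing in for the whole dataset.
sample_csv = io.StringIO(
    "Brand,Price\n"
    "Acme,10\n"
    "Globex,12\n"
    "Acme,11\n"
    "Initech,9\n"
)

# DictReader yields one dict per row, keyed by the header names.
brand_counts = distinct_value_counts(csv.DictReader(sample_csv), "Brand")

# List every distinct value, most frequent first, with no top-10 cutoff.
for brand, n in brand_counts.most_common():
    print(brand, n)
```

Because the rows are consumed one at a time, the same function should work on a streamed export of all 100K records without holding the whole dataset in memory.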

©Dataiku 2012-2018 - Privacy Policy