Sign up to take part
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Hi Team,
I have a use case where I have to replace a value based on the row value on another condition. for an instance I have 3 columns (ABC) and have to generate a new column (D)
A. B. C. Output ->. A. B. C. D
2 red. red 2. red. red. 2
3. red. blue 3 red blue 2
4 green. pink 4. green pink. 2
In the above example I have created a new column which is D based on the condition that if ((B==C), A, 000) which gives the value 0 for the rest of the rows which aren't a match but I want the same value throughout the column without explicitly defining the value (eg: using the fill column processor and using the 2 to the value). this might serve now but not in future.
Thanks in advance !
Hi Tsurapaneni,
I am not sure to understand your use case. From what I gathered in your post, you mean to:
1. Find the value in column A such that B == C (assuming either one and only one row matches this condition, or that A always takes the same value for rows matching the condition)
2. Create a new column D filled with the value computed in step 1
If this is indeed your intent to do, you can do so with a short Python recipe, using these lines:
dataset = dataiku.Dataset(YOUR_DATASET_NAME)
df = dataset.get_dataframe()
df["D"] = df.A[df.B == df.C][0]
Hope this helps!
Agathe
expecting the solution in the prep recipe through formulas but not in the python code recipes. I am good with python but I am looking only through formulas in the prep visual recipe.
Hi - were you able to find an answer to this problem? I am interested in a solution that only uses a visual recipe as well.
@mluu I'm not sure if I understand the issue here as well, but why, if you dont want to use python code above can't just use prepare recipe -> formula and use the syntax mentioned in first post so: if ((Col_B==Col_C), Col_A, 0)?
*** Ok, nevermind, I understand it now...:)
If this is exactly the same case as in post nr 1, and you have only one row where this condition B=C is met (or multiple rows, but the rule is always the same, so in this case, its red= red - because otherewise I dont know how that would work), my first thought is to try create a new prepare recipe, remove all rows where this condition is not met and remove all duplicates if necessary so you will end up with one row:
A B C D
2. red. red. 2
and then join column D from this output again with orginal dataset with "cross join" ? - I didn't test it, but that's my idea for a work around.
Thanks,
Mateusz