Sign up to take part
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Hi, im kinda new to DSS.
I have a huge Dataset (with 20M+ rows), this dataset has a timestamp column, and 20 other columns.
I have another Dataset with like 15 rows, this one has 3 columns start, stop and flag.
Foreach row in my first Dataset i want to check if the timestamp is between start and stop of a row in the second dataset, if it's i want to enrich the first dataset with the flag from the second. And if a row in my first Dataset isn't between any of the start and stop, i don't want to keep it.
I've done this with a Python recipe using Pandas, but im trying to do it only with DSS recipes.
I tried to do it with a Join recipe with this SQL code :
$left.timestamp>=$right.start AND $left.timestamp<=$right.stop;
but it's not working..
I found one way but it takes too much disk space and time: Do a cross join between my two Datasets and then do a filter with the value of start and stop
Can you guys suggest me a good way to do this ?
Hi Fragan,
Please use the following syntax on your join clause
"Test_dataset"."timestamp">="testa"."start" AND "Test_dataset"."timestamp"<="testa"."stop"
mind the quotes around the field names.
Hi Fragan,
Please use the following syntax on your join clause
"Test_dataset"."timestamp">="testa"."start" AND "Test_dataset"."timestamp"<="testa"."stop"
mind the quotes around the field names.