Possible to make Datasets join with "contains" condition ?

Thanh_Thanh
Level 1
Hi all,

I'm pretty new to Dataiku, and I'm currently trying to join 2 datasets. I found the Join recipe.

However, my join condition is not "equals" but "contains". When I select this "contains" condition in the join, I get this error: "DSS can only join with equality conditions".

Any idea how I can do this?

Thanks in advance,

Thanh Thanh
1 Reply
Alex_Combessie
Dataiker Alumni
Hi,

The error means the built-in DSS engine only handles equality conditions. To join on a "contains" condition, your input datasets need to be in an SQL connection such as PostgreSQL, so that the join runs in the database (see all supported SQL connections at https://doc.dataiku.com/dss/latest/connecting/sql.html).
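To give an idea of what a "contains" join looks like once it runs in SQL, here is a minimal sketch. It is not the SQL that DSS generates: it uses Python's built-in sqlite3 module purely so the example is runnable anywhere, and the table names (orders, products), the column names and the sample rows are hypothetical.

```python
import sqlite3


def contains_join(conn):
    """Return rows where orders.comment contains products.product_name."""
    query = """
        SELECT o.order_id, o.comment, p.product_name
        FROM orders AS o
        JOIN products AS p
          -- "contains" expressed as a LIKE pattern built from the other column
          ON o.comment LIKE '%' || p.product_name || '%'
        ORDER BY o.order_id, p.product_name
    """
    return conn.execute(query).fetchall()


# Hypothetical sample data, held in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, comment TEXT)")
conn.execute("CREATE TABLE products (product_name TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "two red apples"), (2, "one banana"), (3, "nothing relevant")],
)
conn.executemany(
    "INSERT INTO products VALUES (?)",
    [("apple",), ("banana",)],
)

rows = contains_join(conn)
for row in rows:
    print(row)
```

Order 3 matches no product, so an inner join drops it. Note that with LIKE, any % or _ inside product_name is treated as a wildcard, and case sensitivity depends on the database, so the exact behaviour on PostgreSQL or another engine may differ from this SQLite sketch.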

Alternatively, you can use the Spark engine (requires configuration, see https://doc.dataiku.com/dss/latest/spark/index.html). This engine is meant for HDFS connections, but it can also run locally, although you then lose the benefits of parallelization on big data.

Hope it helps,

Alex