Sign up to take part
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Hi,
What you describe is often described as the "ANTI JOIN" concept in SQL. I would recommend indeed to accomplish this using a SQL script recipe in Dataiku.
Unfortunately, there is not a common syntax for ANTI JOINs in SQL, as for regular LEFT/RIGHT/INNER/... JOINS. Depending on the flavour of SQL database you are using, you can look up the specific syntax you need. For instance, in PostgreSQL you can use a syntax like this:
SELECT *
FROM actor a
WHERE NOT EXISTS (
SELECT * FROM film_actor fa
WHERE a.actor_id = fa.actor_id
)
This is documented online in this article: https://blog.jooq.org/2017/01/12/a-probably-incomplete-comprehensive-guide-to-the-many-different-ways-to-join-tables-in-sql/. For other SQL flavours, similar articles certainly exist to help you write the right script.
Hope it helps,
Alex
Hi,
What you describe is often described as the "ANTI JOIN" concept in SQL. I would recommend indeed to accomplish this using a SQL script recipe in Dataiku.
Unfortunately, there is not a common syntax for ANTI JOINs in SQL, as for regular LEFT/RIGHT/INNER/... JOINS. Depending on the flavour of SQL database you are using, you can look up the specific syntax you need. For instance, in PostgreSQL you can use a syntax like this:
SELECT *
FROM actor a
WHERE NOT EXISTS (
SELECT * FROM film_actor fa
WHERE a.actor_id = fa.actor_id
)
This is documented online in this article: https://blog.jooq.org/2017/01/12/a-probably-incomplete-comprehensive-guide-to-the-many-different-ways-to-join-tables-in-sql/. For other SQL flavours, similar articles certainly exist to help you write the right script.
Hope it helps,
Alex
Hi @acrow6 ,
We've recently released a new version of Dataiku, 11.3, and you can now add additional output datasets to a Join recipe that contain all the unmatched rows from either a left, right, or inner join. This will let you easily create datasets with rows that don't match the join!
Anti-joins/exclusion joins are on our roadmap.
Cheers,
Ashley