0 votes

We have 2 different Oracle Databases/Connections:

  • ALPHA 
  • BETA

We want to import a table from each database, join them together and append the result in an already existing table in the ALPHA connection.

The table from the connection ALPHA has as schema (in the oracle db)

REFERENCE_PERIOD (DATE) | IDENTIFIER (VARCHAR2(10)) | X_FEATURE (VARCHAR2(20)) 

The table from the connection BETA has as schema (in the oracle db)

REFERENCE_PERIOD (DATE) | IDENTIFIER (VARCHAR2(10)) | Y_FEATURE (VARCHAR2(20)) 

The output table ALPHA_OUTPUT that's stored in the connection ALPHA has as schema

REFERENCE_PERIOD (DATE) | IDENTIFIER (VARCHAR2(10)) | X_FEATURE (VARCHAR2(20))  | Y_FEATURE (VARCHAR2(20)) 

To achieve that we create a visual recipe* as follows:

 

Running this we are getting the following error:

"Table already exists but with an incompatible schema: in act.compute_ALPHA_OUTPUT: Type mismatch for column 1 (REFERENCE_PERIOD) : 'TIMESTAMPLTZ' in dataset, 'TIMESTAMP'(93:DATE) in table"

Things I have already tried:

  • Change the Dataiku Dataset type from date to string (and vice versa)
  • Enable/Disable "Read SQL "timestamp without timezone" as DSS dates"
  • Enable/Disable "Read SQL "date" as DSS dates"

Any suggestions?

* Visual recipe: It's the only option that "works" with multiple connections. "SQL Query recipe" and "SQL Script recipe" both throw an error that multiple connections are not supported.

by
edited by
Hi, Could you please attach a diagnostic of the affected job? From the job page, click on Actions > Download job diagnosis.
If the resulting file is too large for mail (> 15 MB), you can use a file transfer service like WeTransfer to get it to us, and add the link to this thread.
Hi Alex, Thanks for your quick reply and help.
It does not seem possible to attach files in this forum and I can't use any file transfer service (blocked). As an alternative I can send the logs to you via email. If that's ok with you, please pm me your email.
Sure, you can send them to Alexandre dot Combessie at Dataiku dot com.

1 Answer

0 votes
Hi,

What you are trying to do isn't possible. When a dataset is used as output of a DSS recipe, it is "managed" by DSS, which will always try to put its own schema. DSS will never create "DATE" fields by itself because DSS does not have a type that corresponds to the Oracle "DATE", i.e. without time of day.

It is thus not possible to have DSS write with a visual recipe into a dataset with a DATE field. Either the DATE field will be dropped or you'll have the schema incompatibility issue you're noting.

Multiple connections are supported in SQL script recipe by going to Advanced and enabling the option to allow multiple connections. It's important to note that multiple connections are actually NOT supported in the visual join recipe: the join will happen in-DSS and the data will get out of Oracle, and back into it.
by
Hi Clément.

Thanks for your response and for pointing me to the right direction. I tried again the join using the SQL script recipe and enabling the multiple connections as suggested.  

The Error that I'm getting now it's "ORA-00942: table or view does not exist" . Basically it can't find the table that's not part of the "Main SQL connection" (as selected in the advanced options).
Have you verified that the input datasets of that recipe exist as tables? You can do so by double clicking on the datasets > Settings > Connection.
Hi Clément,

Oracle DATE type includes date and time portion up to milliseconds. We would expect that Dataiku would convert timestamp into Oracle DATE data type by dropping the milliseconds portion and save the data. Any reason why this is not possible to do?

Thanks and regards,
Denis
Hi Alex,

Yeah I'm absolutely sure that the input datasets exist as tables.
I assume you are doing the join using an SQL script as Clement advised, with the option to enable multiple connections. Could you post the script you are using please?
Hi Clément,

Have you had a chance to review my feedback? We certainly see it as a defect since the data is taken from an Oracle database and is written back into the same database. Dataiku does the data type conversion resulting in the error and it can be avoided as described in my previous comment.

What do you think?

Best regards,
Denis
Hello Denis,
Could you please post the SQL script you are using? Have you tried with the TIMESTAMPTZ type instead of DATE?
Cheers,
Alex
Yeah sure, that's the query:

INSERT INTO "ALPHA_OUTPUT"
(X_FEATURE, IDENTIFIER, REFERENCE_PERIOD)
SELECT   A.X_FEATURE,
         B.IDENTIFIER,
         B.REFERENCE_PERIOD
FROM "BETA" B
INNER JOIN "ALPHA" A
ON B.IDENTIFIER = A.IDENTIFIER
AND B.REFERENCE_PERIOD = A.REFERENCE_PERIOD
1,050 questions
1,088 answers
1,193 comments
9,107 users

©Dataiku 2012-2018 - Privacy Policy