0 votes

We have 2 different Oracle Databases/Connections:

  • ALPHA 
  • BETA

We want to import a table from each database, join them together and append the result in an already existing table in the ALPHA connection.

The table from the connection ALPHA has as schema (in the oracle db)

REFERENCE_PERIOD (DATE) | IDENTIFIER (VARCHAR2(10)) | X_FEATURE (VARCHAR2(20)) 

The table from the connection BETA has as schema (in the oracle db)

REFERENCE_PERIOD (DATE) | IDENTIFIER (VARCHAR2(10)) | Y_FEATURE (VARCHAR2(20)) 

The output table ALPHA_OUTPUT that's stored in the connection ALPHA has as schema

REFERENCE_PERIOD (DATE) | IDENTIFIER (VARCHAR2(10)) | X_FEATURE (VARCHAR2(20))  | Y_FEATURE (VARCHAR2(20)) 

To achieve that we create a visual recipe* as follows:

 

Running this we are getting the following error:

"Table already exists but with an incompatible schema: in act.compute_ALPHA_OUTPUT: Type mismatch for column 1 (REFERENCE_PERIOD) : 'TIMESTAMPLTZ' in dataset, 'TIMESTAMP'(93:DATE) in table"

Things I have already tried:

  • Change the Dataiku Dataset type from date to string (and vice versa)
  • Enable/Disable "Read SQL "timestamp without timezone" as DSS dates"
  • Enable/Disable "Read SQL "date" as DSS dates"

Any suggestions?

* Visual recipe: It's the only option that "works" with multiple connections. "SQL Query recipe" and "SQL Script recipe" both throw an error that multiple connections are not supported.

by
edited by
Hi, Could you please attach a diagnostic of the affected job? From the job page, click on Actions > Download job diagnosis.
If the resulting file is too large for mail (> 15 MB), you can use a file transfer service like WeTransfer to get it to us, and add the link to this thread.
Hi Alex, Thanks for your quick reply and help.
It does not seem possible to attach files in this forum and I can't use any file transfer service (blocked). As an alternative I can send the logs to you via email. If that's ok with you, please pm me your email.
Sure, you can send them to Alexandre dot Combessie at Dataiku dot com.

1 Answer

0 votes
Hi,

What you are trying to do isn't possible. When a dataset is used as output of a DSS recipe, it is "managed" by DSS, which will always try to put its own schema. DSS will never create "DATE" fields by itself because DSS does not have a type that corresponds to the Oracle "DATE", i.e. without time of day.

It is thus not possible to have DSS write with a visual recipe into a dataset with a DATE field. Either the DATE field will be dropped or you'll have the schema incompatibility issue you're noting.

Multiple connections are supported in SQL script recipe by going to Advanced and enabling the option to allow multiple connections. It's important to note that multiple connections are actually NOT supported in the visual join recipe: the join will happen in-DSS and the data will get out of Oracle, and back into it.
by
Hi Clément.

Thanks for your response and for pointing me to the right direction. I tried again the join using the SQL script recipe and enabling the multiple connections as suggested.  

The Error that I'm getting now it's "ORA-00942: table or view does not exist" . Basically it can't find the table that's not part of the "Main SQL connection" (as selected in the advanced options).
Have you verified that the input datasets of that recipe exist as tables? You can do so by double clicking on the datasets > Settings > Connection.
Hi Clément,

Oracle DATE type includes date and time portion up to milliseconds. We would expect that Dataiku would convert timestamp into Oracle DATE data type by dropping the milliseconds portion and save the data. Any reason why this is not possible to do?

Thanks and regards,
Denis
Hi Alex,

Yeah I'm absolutely sure that the input datasets exist as tables.
I assume you are doing the join using an SQL script as Clement advised, with the option to enable multiple connections. Could you post the script you are using please?
Hi Clément,

Have you had a chance to review my feedback? We certainly see it as a defect since the data is taken from an Oracle database and is written back into the same database. Dataiku does the data type conversion resulting in the error and it can be avoided as described in my previous comment.

What do you think?

Best regards,
Denis
Hello Denis,
Could you please post the SQL script you are using? Have you tried with the TIMESTAMPTZ type instead of DATE?
Cheers,
Alex
Yeah sure, that's the query:

INSERT INTO "ALPHA_OUTPUT"
(X_FEATURE, IDENTIFIER, REFERENCE_PERIOD)
SELECT   A.X_FEATURE,
         B.IDENTIFIER,
         B.REFERENCE_PERIOD
FROM "BETA" B
INNER JOIN "ALPHA" A
ON B.IDENTIFIER = A.IDENTIFIER
AND B.REFERENCE_PERIOD = A.REFERENCE_PERIOD
Hi Alex and Clément,

Could you provide an update on this issue please? I.e. when is it going to be fixed by Dataiku?

Kind regards,
Denis
Hi,
Your request has been logged. For now, as advised by Clément, we advise casting the column to the Oracle TIMESTAMPLTZ type. This type will be recognised correctly by both Oracle and DSS.
Hope it helps,
Alex
Hi Alex,

Thank you for the update. When should we expect this defect to be fixed?

The suggestion to use TIMESTAMPLTZ data type does not help unfortunately. We have many Oracle databases and datawarehouses with DATE fields and cannot change them all just to workaround the defect in DSS.

Best regards,
Denis
Hi Denis,

I'm sorry we didn't really notice your request earlier. Please note that this is a community answers forum, and not really the suitable location to raise formal requests. Followups and tracking is on a best-effort basis here.

For raising requests, we strongly advise you to open support tickets through our support portal. Could you please do so for this request so that we can reply appropriately ?
1,080 questions
1,123 answers
1,250 comments
10,870 users

©Dataiku 2012-2018 - Privacy Policy