I installed Dataiku successfully on my VM and created the two HDFS connections required by the tutorial: "hdfs_root" and "hdfs_managed". However, I can't connect to the Hive Metastore when I synchronize haiku_shirt_sales.csv, which I imported under the hdfs_managed directory, to create a corresponding metastore table in my Hive warehouse (a directory for the Hive Metastore isn't even created). I verified the permissions on the different directories, tried several times with different CSV files (respecting the formats supported by Hive), created Hive databases, and pointed the HDFS connections at the "/user/hive/warehouse" directory set up by Cloudera, but I still cannot identify the root of the problem. Could you help me out, please?

Kind regards,

I looked deeper into the Dataiku installation and found these logs, which explain the failed synchronization to the Hive Metastore:

"java.lang.RuntimeException: Metastore synchronization failed : org/apache/hadoop/hive/ql/parse/VariableSubstitution
    at com.dataiku.dip.hive.HiveMetastoreSynchronizer.executeScript(
    at com.dataiku.dip.hive.HiveMetastoreSynchronizer.synchronizeOneDataset(
    at com.dataiku.dip.hive.HiveMetastoreSynchronizer.synchronizeOneDatasetPartition(
    at com.dataiku.dip.dataflow.jobrunner.ActivityRunner.waitForEnd(
    at com.dataiku.dip.dataflow.jobrunner.ActivityRunner.runActivity(
    at com.dataiku.dip.dataflow.jobrunner.JobRunner.runActivity(
    at com.dataiku.dip.dataflow.jobrunner.JobRunner.access$700(
    at com.dataiku.dip.dataflow.jobrunner.JobRunner$"

1 Answer

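It looks like you are running CDH 5.7. An error that names only a class, such as org/apache/hadoop/hive/ql/parse/VariableSubstitution, typically means that class could not be loaded from the Hive libraries on your cluster.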

DSS does not yet support this version (it supports CDH 5.3 to 5.6). We plan to release an update with CDH 5.7 support this week.
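
If you want to check a cluster against the supported range yourself, here is a minimal Python sketch. It assumes the version string looks like the first line printed by `hadoop version` on a CDH node (for example "Hadoop 2.6.0-cdh5.7.0"); that format and the helper names `cdh_version` and `is_supported` are assumptions for illustration, not part of DSS.

```python
import re

# Supported range taken from the answer above: CDH 5.3 through 5.6.
SUPPORTED_MIN = (5, 3)
SUPPORTED_MAX = (5, 6)


def cdh_version(version_line):
    """Extract the CDH (major, minor) pair from a version string.

    Assumes a format such as 'Hadoop 2.6.0-cdh5.7.0'; returns None
    when no 'cdhX.Y' marker is found.
    """
    match = re.search(r"cdh(\d+)\.(\d+)", version_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_supported(version_line):
    """Return True when the CDH version falls inside the supported range."""
    version = cdh_version(version_line)
    return version is not None and SUPPORTED_MIN <= version <= SUPPORTED_MAX
```

With this sketch, a string reporting cdh5.7.0 falls outside the range, which matches the situation described above.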

Best regards,