Thanks for the diagnosis. After investigation, it seems the issue was caused by a case mismatch between your original Parquet file and the Hive table. Your input dataset was generated manually as a Parquet file with the column name "MANDT" (uppercase), then imported from Hive into DSS. However, Hive always converts column names to lowercase, so DSS saw the column as "mandt", which is inconsistent with the name stored in the original Parquet file. As of today, we cannot detect this type of case automatically.
The preferred solution would be to only generate Parquet files with lowercase column names, so that they are compatible with Hive (and Impala as well).
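As a sketch of that approach (assuming a pandas/pyarrow-style workflow, which isn't specified in this thread), you can normalize column names to lowercase before writing the Parquet file, so they match what Hive's metastore will report:

```python
def hive_safe_columns(columns):
    """Lowercase column names so the Parquet schema matches
    Hive's all-lowercase column-name convention."""
    return [c.lower() for c in columns]

# Illustrative pandas usage (hypothetical, not part of the original post):
# df.columns = hive_safe_columns(df.columns)
# df.to_parquet("data.parquet")

print(hive_safe_columns(["MANDT", "BUKRS"]))  # ['mandt', 'bukrs']
```

This keeps the file compatible with Hive and Impala without relying on any case-matching in DSS.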
If that option is not possible, you may try changing the recipe engine from DSS to Hive. In fact, for large datasets it is recommended to switch the recipe engine to a Hadoop-related one (Spark, Hive, or Impala). You should gain performance by pushing the computation down to your Hadoop cluster instead of having the data streamed through DSS.
Here are the screenshots: