+1 vote

I got an error when running a query with SQLExecutor2 in Dataiku IPython.

This is the code that I used:

import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
from dataiku.core.sql import SQLExecutor2
from dataiku.core.sql import SQLExecutor

dtk = dataiku.Dataset('flight_issued_coupon_0428_0603')
SQLexe = SQLExecutor2(dataset=dtk)
df = SQLexe.query_to_df("""
select 'profile_id', 'user_id', 
  count(*) as cnt 
from dtk group by 'profile_id','user_id' 
order by cnt desc 
limit 10
""")

and this is the error log that I got:

ERROR: An unexpected error occurred while tokenizing input
The following traceback may be corrupted or invalid
The error message is: ('EOF in multi-line string', (1, 0))

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-127-cb7599f5d1d2> in <module>()
      5 order by cnt desc
      6 limit 10
----> 7 """)

/home/ubuntu/dataiku-dss-3.0.1/python/dataiku/core/sql.pyc in query_to_df(self, query, pre_queries, post_queries, extra_conf)
    223 
    224     def query_to_df(self, query, pre_queries=None, post_queries=None, extra_conf={}):
--> 225         return _streamed_query_to_df(self._iconn, query, pre_queries, post_queries, self._find_connection_from_dataset, "sql", base.get_shared_secret(), extra_conf)
    226 
    227     def query_to_iter(self, query, pre_queries=None, post_queries=None, extra_conf={}):

/home/ubuntu/dataiku-dss-3.0.1/python/dataiku/core/sql.pyc in _streamed_query_to_df(connection, query, pre_queries, post_queries, find_connection_from_dataset, db_type, secret, extra_conf)
     39     logging.info("Got initial SQL query response")
     40 
---> 41     streamingSession = _handle_intercom_json_resp(resp)
     42     queryId = streamingSession['queryId']
     43 

/home/ubuntu/dataiku-dss-3.0.1/python/dataiku/core/sql.pyc in _handle_intercom_json_resp(resp, err_msg)
     15         err_data = resp.text
     16         if err_data:
---> 17             raise Exception("%s: %s" % (err_msg, json.loads(err_data).get("message","No details").encode("utf8")))
     18         else:
     19             raise Exception("%s: %s" % (err_msg, "No details"))

Exception: Call failed: Unrecognized virtual connection type: sql

I also tried a one-line version of the same query:

df = SQLexe.query_to_df("select 'profile_id', 'user_id', count(*) as cnt from dtk group by 'profile_id','user_id' order by cnt desc limit 10")

and I still got an error, though the output was different:

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-126-93f9770f53a0> in <module>()
----> 1 df = SQLexe.query_to_df("select 'profile_id', 'user_id', count(*) as cnt from dtk group by 'profile_id','user_id' order by cnt desc limit 10")

/home/ubuntu/dataiku-dss-3.0.1/python/dataiku/core/sql.pyc in query_to_df(self, query, pre_queries, post_queries, extra_conf)
    223 
    224     def query_to_df(self, query, pre_queries=None, post_queries=None, extra_conf={}):
--> 225         return _streamed_query_to_df(self._iconn, query, pre_queries, post_queries, self._find_connection_from_dataset, "sql", base.get_shared_secret(), extra_conf)
    226 
    227     def query_to_iter(self, query, pre_queries=None, post_queries=None, extra_conf={}):

/home/ubuntu/dataiku-dss-3.0.1/python/dataiku/core/sql.pyc in _streamed_query_to_df(connection, query, pre_queries, post_queries, find_connection_from_dataset, db_type, secret, extra_conf)
     39     logging.info("Got initial SQL query response")
     40 
---> 41     streamingSession = _handle_intercom_json_resp(resp)
     42     queryId = streamingSession['queryId']
     43 

/home/ubuntu/dataiku-dss-3.0.1/python/dataiku/core/sql.pyc in _handle_intercom_json_resp(resp, err_msg)
     15         err_data = resp.text
     16         if err_data:
---> 17             raise Exception("%s: %s" % (err_msg, json.loads(err_data).get("message","No details").encode("utf8")))
     18         else:
     19             raise Exception("%s: %s" % (err_msg, "No details"))

Exception: Call failed: Unrecognized virtual connection type: sql

1 Answer

0 votes

A little late, but here is my thought: the dataset you're trying to query might be a Hive dataset, in which case you should use HiveExecutor instead of the SQL executor.
To check this, go to your flow and see which recipes DSS lets you create from your dataset.

Hope this might help someone :-) 
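
To make the suggestion above concrete, here is a minimal sketch of the same query run through HiveExecutor. It rests on several assumptions: that HiveExecutor is importable from dataiku.core.sql in your DSS version and accepts a dataset the same way SQLExecutor2 does in the question, and that the Hive table has the same name as the dataset (it may not). The helper names build_count_query and run_on_hive are made up for illustration. The sketch also quotes column names with backticks, because single quotes as in the original query denote string literals in SQL, not columns.

```python
def build_count_query(table, group_cols, limit=10):
    """Build a GROUP BY / COUNT(*) query with backtick-quoted identifiers.

    Hypothetical helper: `table` must be the real Hive table name, not the
    name of a Python variable such as `dtk`.
    """
    cols = ", ".join("`%s`" % c for c in group_cols)
    return (
        "SELECT %s, COUNT(*) AS cnt "
        "FROM `%s` "
        "GROUP BY %s "
        "ORDER BY cnt DESC "
        "LIMIT %d" % (cols, table, cols, int(limit))
    )


def run_on_hive(dataset_name, query):
    """Run `query` through Hive and return a pandas DataFrame.

    Assumption: HiveExecutor exists in dataiku.core.sql and takes a
    `dataset` argument like SQLExecutor2. Imports are local so the query
    builder above can be used outside DSS.
    """
    import dataiku
    from dataiku.core.sql import HiveExecutor

    executor = HiveExecutor(dataset=dataiku.Dataset(dataset_name))
    return executor.query_to_df(query)
```

Inside DSS, usage would look like df = run_on_hive('flight_issued_coupon_0428_0603', build_count_query('flight_issued_coupon_0428_0603', ['profile_id', 'user_id'])), with the table name adjusted to whatever the dataset's Hive settings actually show.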
