I'm training a set of models as shown below. I want to train on only one variable, 'feature1', but it appears that all the columns in the data are used for training. How do I include only this feature when training?
if trained_model_MAPE > ERROR_THRESHOLD:
    # Wait for the ML task to be ready
    mltask.wait_guess_complete()
    # Obtain settings, enable GBT, and save settings
    settings = mltask.get_settings()
    settings.set_algorithm_enabled("GBT_REGRESSION", True)
    settings.use_feature('feature1')
    settings.save()
    # Start training and wait for it to be complete
    mltask.start_train()
    mltask.wait_train_complete()
    # Get the identifiers of the trained models
    # There will be 3 of them because Logistic regression and Random forest were default enabled, plus GBT enabled above
    ids = mltask.get_trained_models_ids()
    mape_list = []
    for id in ids:
        details = mltask.get_trained_model_details(id)
        algorithm = details.get_modeling_settings()["algorithm"]
        mape = details.get_performance_metrics()["mape"]
        print(f"Algorithm={algorithm} MAPE={mape}")
        mape_list.append(mape)
Operating system used: Windows
As with algorithms, the other features are enabled by default, and calling use_feature('feature1') does not disable them. You can use reject_feature to disable them.
For instance, using foreach_feature to iterate over all features:
features_to_use = ['feature1']
features_to_reject = []

def handle_feature(feature_name, feature_params):
    # Collect every input feature that is not in the list of features to keep
    if feature_name not in features_to_use and feature_params["role"] == 'INPUT':
        features_to_reject.append(feature_name)
    # foreach_feature expects the feature parameters to be returned
    return feature_params

settings.foreach_feature(handle_feature)

# Enable the wanted features and reject all the others
for feature_name in features_to_use:
    settings.use_feature(feature_name)
for feature_name in features_to_reject:
    settings.reject_feature(feature_name)
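A more compact variant is to flip the role inside the callback itself. The sketch below is only an illustration: make_feature_selector is a hypothetical helper (not part of the Dataiku API), and it assumes that "REJECT" is the role value that reject_feature sets. It is exercised here on plain dictionaries standing in for the per-feature settings, so it runs without a DSS instance.

```python
def make_feature_selector(features_to_use):
    """Build a foreach_feature-style callback that keeps only the listed input features.

    Hypothetical helper for illustration, not part of the Dataiku API.
    """
    keep = set(features_to_use)

    def handle_feature(feature_name, feature_params):
        # Only touch features currently used as inputs; leave other roles untouched
        if feature_params.get("role") == "INPUT" and feature_name not in keep:
            # Assumption: "REJECT" is the role value that reject_feature sets
            feature_params["role"] = "REJECT"
        # Always return the parameters, whether they were modified or not
        return feature_params

    return handle_feature


# Plain dictionaries standing in for the per-feature settings (made-up example data)
per_feature = {
    "feature1": {"role": "INPUT"},
    "feature2": {"role": "INPUT"},
    "price": {"role": "TARGET"},
}

selector = make_feature_selector(["feature1"])
result = {name: selector(name, dict(params)) for name, params in per_feature.items()}

for name, params in result.items():
    print(name, params["role"])
```

Against a real ML task this would presumably be used as settings.foreach_feature(make_feature_selector(['feature1'])) followed by settings.save(). A feature that is already rejected stays rejected with this callback, so use_feature is still needed to enable it.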
@AdrienL I'm facing the following error with the above solution
DataikuException: com.dataiku.dip.exceptions.DSSInternalErrorException: Internal error, caused by: NullPointerException: null
---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
/opt/dataiku-dss-12.5.1/python/dataikuapi/dssclient.py in _perform_http(self, method, path, params, body, stream, files, raw_body, headers)
   1450                 headers=headers)
-> 1451             http_res.raise_for_status()
   1452             return http_res

/opt/dataiku-dss-12.5.1/python39.packages/requests/models.py in raise_for_status(self)
   1020         if http_error_msg:
-> 1021             raise HTTPError(http_error_msg, response=self)
   1022

HTTPError: 500 Server Error: Server Error for url: http://127.0.0.1:10001/dip/publicapi/projects/PRICINGPOWERMODELS/models/lab/P6zS29Fv/PlwkqGVX/settin...

During handling of the above exception, another exception occurred:

DataikuException                          Traceback (most recent call last)
<ipython-input-367-3e542e8df9de> in <module>
     17 settings.foreach_feature(handle_feature)
     18
---> 19 settings.save()
     20
     21 # Start training and wait for it to be complete

/opt/dataiku-dss-12.5.1/python/dataikuapi/dss/ml.py in save(self)
    600         """
    601
--> 602         self.client._perform_empty(
    603             "POST", "/projects/%s/models/lab/%s/%s/settings" % (self.project_key, self.analysis_id, self.mltask_id),
    604             body = self.mltask_settings)

/opt/dataiku-dss-12.5.1/python/dataikuapi/dssclient.py in _perform_empty(self, method, path, params, body, files, raw_body)
   1459
   1460     def _perform_empty(self, method, path, params=None, body=None, files = None, raw_body=None):
-> 1461         self._perform_http(method, path, params=params, body=body, files=files, stream=False, raw_body=raw_body)
   1462
   1463     def _perform_text(self, method, path, params=None, body=None,files=None, raw_body=None):

/opt/dataiku-dss-12.5.1/python/dataikuapi/dssclient.py in _perform_http(self, method, path, params, body, stream, files, raw_body, headers)
   1456             except ValueError:
   1457                 ex = {"message": http_res.text}
-> 1458             raise DataikuException("%s: %s" % (ex.get("errorType", "Unknown error"), ex.get("detailedMessage", ex.get("message", "No message"))))
   1459
   1460     def _perform_empty(self, method, path, params=None, body=None, files = None, raw_body=None):

DataikuException: com.dataiku.dip.exceptions.DSSInternalErrorException: Internal error, caused by: NullPointerException: null
Yes, I read the doc too fast: it states that the handle_feature function is supposed to return the feature parameters. Also, one should only reject input features, otherwise we risk rejecting the target (not a good idea). I rewrote the code above and rearranged it for clarity.
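To illustrate why the missing return matters, here is a minimal sketch. It assumes that foreach_feature stores whatever the callback returns as the new parameters for each feature, which matches the documented contract but is not a copy of the library's code. apply_to_features is a hypothetical stand-in working on plain dictionaries. A callback without a return statement yields None for every feature, and saving settings in that state is a plausible cause of the NullPointerException reported above.

```python
import copy


def apply_to_features(per_feature, fn):
    """Hypothetical stand-in for foreach_feature: keep whatever fn returns for each feature."""
    return {name: fn(name, copy.deepcopy(params)) for name, params in per_feature.items()}


def broken_handler(feature_name, feature_params):
    # Modifies the parameters but forgets to return them, so the result is None
    if feature_name != "feature1":
        feature_params["role"] = "REJECT"


def fixed_handler(feature_name, feature_params):
    if feature_name != "feature1":
        feature_params["role"] = "REJECT"
    # Returning the parameters keeps the settings intact
    return feature_params


# Made-up example data standing in for the per-feature settings
per_feature = {"feature1": {"role": "INPUT"}, "feature2": {"role": "INPUT"}}

broken = apply_to_features(per_feature, broken_handler)
fixed = apply_to_features(per_feature, fixed_handler)

print("without return:", broken)
print("with return:", fixed)
```

With the broken handler every entry is replaced by None, while the fixed handler leaves feature1 as an input and marks feature2 as rejected.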