
I created a Decision Tree model and I want to export it so I can use it outside of Dataiku.

I took the pickle file and loaded it into Python to continue using the model there.

import pickle

# encoding='latin1' helps when the pickle was written by Python 2
with open('clf.pkl', 'rb') as f:
    loaded_model = pickle.load(f, encoding='latin1')

In the model settings, I used standard rescaling, which uses the average and standard deviation (avgstd). Dataiku also exports this JSON file with the details of the rescaling:

{
    "shifts": [
        4.2708957215287455, 
        5.582300530732055, 
        4.721780769116731, 
        6.309030531691733, 
        4.534705132386515, 
        50183.866161634876, 
        4.628957297141036, 
        5.931597829632046, 
        1.834355009673187, 
        21814.135528393213, 
        0.9999925875959688, 
        0.23165941222883746, 
        -0.11146363232269413
    ], 
    "columns": [
        "col1", 
        "col2", 
        "col3", 
        "col4", 
        "col5", 
        "col6", 
        "col7", 
        "col8", 
        "col9", 
        "col10", 
        "col11", 
        "col12", 
        "col13"
    ], 
    "inv_scales": [
        0.29041217789420476, 
        0.32026605114154144, 
        0.3398879256267485, 
        0.2539738260220278, 
        0.27817344479641604, 
        1.1217850173179438e-05, 
        0.3181917203503525, 
        0.2886476076886483, 
        0.37842451508835384, 
        2.329233011164756e-05, 
        0.21830904186227362, 
        2.003574563132119, 
        1.386943696546877
    ]
}

Let's say I have a new input with the original values (before rescaling). How can I use the information above to rescale all the features of this new input so that I can predict with the loaded model?

asked by Tasos

1 Answer

Best answer
Hi,

For each column, take the shift and inv_scale found at the same position as the column name in the "columns" list:

rescaled_feature = (input_feature - shift) * inv_scale
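
As a minimal sketch, the formula above could be applied in Python like this. The helper name rescale_row and the sample row are hypothetical, and the inline JSON is a shortened stand-in (first two columns only) for the file Dataiku exports. It also assumes the model expects the features in the same order as the "columns" list.

```python
import json

def rescale_row(row, rescaling):
    """Apply (value - shift) * inv_scale to every column listed in the rescaling spec.

    `row` is a dict mapping column name -> raw (unscaled) value.
    The result is a list ordered like rescaling["columns"].
    """
    return [
        (row[col] - shift) * inv_scale
        for col, shift, inv_scale in zip(
            rescaling["columns"], rescaling["shifts"], rescaling["inv_scales"]
        )
    ]

# Shortened stand-in for the exported JSON (assumption: the real file has all 13 columns)
rescaling = json.loads("""
{
    "shifts": [4.2708957215287455, 5.582300530732055],
    "columns": ["col1", "col2"],
    "inv_scales": [0.29041217789420476, 0.32026605114154144]
}
""")

# Hypothetical new input holding the original, unscaled values
new_row = {"col1": 4.2708957215287455, "col2": 6.582300530732055}

features = rescale_row(new_row, rescaling)
```

The resulting list could then be passed to the unpickled model, for example loaded_model.predict([features]), assuming it is a scikit-learn style estimator trained on the columns in that order.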