python 3.x - Different score every time I run sklearn model with random_state set


I'm trying to determine why I obtain a different score every time I rerun a model. I've defined:

    import numpy as np
    from sklearn.model_selection import KFold

    # NumPy seed (I don't know if it's needed, but figured it couldn't hurt)
    np.random.seed(42)
    # I tried re-seeding every time I ran the `cross_val_predict()` block, but that didn't work either

    # Cross-validator with random_state set
    cv5 = KFold(n_splits=5, random_state=42, shuffle=True)

    # Scoring: RMSE of natural logs (to match the Kaggle competition I'm trying)
    def custom_scorer(actual, predicted):
        actual = np.log1p(actual)
        predicted = np.log1p(predicted)
        return np.sqrt(np.sum(np.square(actual - predicted)) / len(actual))

Then I ran the grid search once with cv=cv5:

    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import GridSearchCV

    # Running GridSearchCV
    rf_test = RandomForestRegressor(n_jobs=-1)

    params = {'max_depth': [20, 30, 40],
              'n_estimators': [500],
              'max_features': [100, 140, 160]}

    gscv = GridSearchCV(estimator=rf_test, param_grid=params, cv=cv5, n_jobs=-1, verbose=1)

    gscv.fit(xtrain, ytrain)
    print(gscv.best_estimator_)

After obtaining gscv.best_estimator_, I rerun the following several times and get different scores each time:

    rf_test = gscv.best_estimator_
    rf_test.random_state = 42
    ypred = cross_val_predict(rf_test, xtrain, ytrain, cv=cv2)
    custom_scorer(np.expm1(ytrain), np.expm1(ypred))

Example of the (extremely small) score differences:

    0.13200993923446158
    0.13200993923446164
    0.13200993923446153
    0.13200993923446161

I'm trying to set seeds so that I get the same score every time for the same model, in order to be able to compare different models. In Kaggle competitions, small differences in scores seem to matter (although admittedly not this small), but I'd like to understand why. Does it have to do with rounding in my machine when performing calculations? Any help is greatly appreciated!

Edit: I forgot the line rf_test.random_state=42, which made a larger difference in score disparity, but even with this line included I still have minuscule differences.
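On the rounding question: differences in the 16th significant digit are only a few units in the last place of a double-precision number. Floating-point addition is not associative, so adding the same numbers in a different order can change the last bits of the result. The sketch below shows this in plain Python. Whether this is what happens here is an assumption, not something confirmed in this thread; one possible source is that n_jobs=-1 lets parallel workers combine results in a different order on each run. The `values` list and the `naive_sum` and `shuffled_copy` helpers are made up for illustration.

```python
import math
import random

# Grouping alone changes the result of a floating-point sum
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)  # False

# Hypothetical data standing in for many partial results being added up
rng = random.Random(0)
values = [rng.uniform(0.0, 1.0) for _ in range(10_000)]

def naive_sum(xs):
    """Left-to-right summation, so rounding depends on the order of xs."""
    total = 0.0
    for x in xs:
        total += x
    return total

def shuffled_copy(xs, seed):
    """Return the same numbers in a different (seeded) order."""
    ys = list(xs)
    random.Random(seed).shuffle(ys)
    return ys

# Same numbers, five different orders: the sums may differ in the last digits
naive_results = {naive_sum(shuffled_copy(values, s)) for s in range(5)}
print(sorted(naive_results))

# math.fsum tracks exact partial sums, so its result should not depend on order
fsum_results = {math.fsum(shuffled_copy(values, s)) for s in range(5)}
print(fsum_results)
```

If that is the cause, the scores are the same for any practical purpose, and comparing them after rounding (or with a tolerance such as math.isclose) is enough to compare models.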

You are using cv2 while testing out the RandomForestRegressor. Have you set its random seed? Otherwise the splits used while testing the regressor will be different on each run.
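A minimal sketch of what this answer suggests, assuming the fix is to give every splitter and every model a fixed random_state. The synthetic data from make_regression, the small n_estimators, and the `run_once` helper are placeholders, not the asker's setup; n_jobs=1 is used so that parallel execution order cannot affect the result.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_predict

# Synthetic stand-in data (hypothetical; the thread's xtrain/ytrain are not shown)
X, y = make_regression(n_samples=200, n_features=10, noise=0.1, random_state=0)

# Two shuffled splitters with the same seed produce the same folds
splits_a = list(KFold(n_splits=5, shuffle=True, random_state=42).split(X))
splits_b = list(KFold(n_splits=5, shuffle=True, random_state=42).split(X))
same_folds = all(
    np.array_equal(test_a, test_b)
    for (_, test_a), (_, test_b) in zip(splits_a, splits_b)
)
print("same folds:", same_folds)

def run_once():
    """Cross-validated predictions with a seeded splitter and a seeded model."""
    cv = KFold(n_splits=5, shuffle=True, random_state=42)
    model = RandomForestRegressor(n_estimators=20, random_state=42, n_jobs=1)
    return cross_val_predict(model, X, y, cv=cv)

first = run_once()
second = run_once()

# Largest difference between two runs that share every seed
print("max abs difference:", np.max(np.abs(first - second)))
```

The same idea applies to the code in the question: whatever object is passed as cv= needs shuffle=True together with a fixed random_state, or no shuffling at all, for the folds to repeat.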

