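I am using GridSearch from sklearn to optimize the parameters of a classifier. There is a lot of data, so the whole optimization takes a while: more than a day. I would like to watch the performance of the parameter combinations that have already been tried while the search is still running. Is that possible?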
Set the verbose parameter of GridSearchCV to a positive number (the larger the number, the more detail you get). For example:
GridSearchCV(clf, param_grid, cv=cv, scoring='accuracy', verbose=10)
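For anyone who wants to try this end to end, here is a minimal sketch. The iris dataset, the SVC estimator, and the small parameter grid are my own assumptions for illustration; they are not from the question.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Toy data and estimator, chosen only for illustration (assumptions).
X, y = load_iris(return_X_y=True)
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}

# verbose=10 asks for a progress message for every fit as it completes.
search = GridSearchCV(SVC(), param_grid, cv=3, scoring="accuracy", verbose=10)
search.fit(X, y)

print(search.best_params_)
```

Note that search.cv_results_ only becomes available once fit has finished, which is why the verbose output is the thing to watch during a long run.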
Have a look at GridSearchCVProgressBar.
I just found it and am already using it. I like it a lot:
In [1]: GridSearchCVProgressBar
Out[1]: pactools.grid_search.GridSearchCVProgressBar

In [2]: ??GridSearchCVProgressBar
Init signature: GridSearchCVProgressBar(estimator, param_grid, scoring=None, fit_params=None, n_jobs=1, iid=True, refit=True, cv=None, verbose=0, pre_dispatch='2*n_jobs', error_score='raise', return_train_score='warn')
Source:
class GridSearchCVProgressBar(model_selection.GridSearchCV):
    """Monkey patch Parallel to have a progress bar during grid search"""

    def _get_param_iterator(self):
        """Return ParameterGrid instance for the given param_grid"""

        iterator = super(GridSearchCVProgressBar, self)._get_param_iterator()
        iterator = list(iterator)
        n_candidates = len(iterator)

        cv = model_selection._split.check_cv(self.cv, None)
        n_splits = getattr(cv, 'n_splits', 3)
        max_value = n_candidates * n_splits

        class ParallelProgressBar(Parallel):
            def __call__(self, iterable):
                bar = ProgressBar(max_value=max_value, title='GridSearchCV')
                iterable = bar(iterable)
                return super(ParallelProgressBar, self).__call__(iterable)

        # Monkey patch
        model_selection._search.Parallel = ParallelProgressBar

        return iterator
File: ~/anaconda/envs/python3/lib/python3.6/site-packages/pactools/grid_search.py
Type: ABCMeta

In [3]: ?GridSearchCVProgressBar
Init signature: GridSearchCVProgressBar(estimator, param_grid, scoring=None, fit_params=None, n_jobs=1, iid=True, refit=True, cv=None, verbose=0, pre_dispatch='2*n_jobs', error_score='raise', return_train_score='warn')
Docstring: Monkey patch Parallel to have a progress bar during grid search
File: ~/anaconda/envs/python3/lib/python3.6/site-packages/pactools/grid_search.py
Type: ABCMeta
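The max_value in that source is just the number of parameter candidates multiplied by the number of CV folds. As a rough sketch, you can compute the same total yourself with sklearn's public ParameterGrid and check_cv; the grid and the fold count below are made-up values for illustration.

```python
from sklearn.model_selection import ParameterGrid, check_cv

# Hypothetical grid and fold count, for illustration only.
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}
n_folds = 5

# Every combination of the grid values is one candidate.
n_candidates = len(ParameterGrid(param_grid))

# With an integer and no y, check_cv returns a plain KFold splitter.
cv = check_cv(n_folds)
n_splits = cv.get_n_splits()

total_fits = n_candidates * n_splits
print(f"{n_candidates} candidates x {n_splits} folds = {total_fits} fits")
# prints: 6 candidates x 5 folds = 30 fits
```

This is the same number GridSearchCV reports in its "totalling ... fits" message, so it gives you something to compare the progress output against.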
I just want to add to David's answer.
To give you an idea, for a very simple case, this is what the output looks like with verbose=1:
Fitting 10 folds for each of 1 candidates, totalling 10 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 10 out of 10 | elapsed: 1.2min finished
And this is what it looks like with verbose=10:
Fitting 10 folds for each of 1 candidates, totalling 10 fits
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1, score=0.637, total= 7.1s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 7.0s remaining: 0.0s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1, score=0.630, total= 6.5s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 13.5s remaining: 0.0s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1, score=0.637, total= 6.5s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 20.0s remaining: 0.0s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1, score=0.637, total= 6.7s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1
[Parallel(n_jobs=1)]: Done 4 out of 4 | elapsed: 26.7s remaining: 0.0s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1, score=0.632, total= 7.9s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1
[Parallel(n_jobs=1)]: Done 5 out of 5 | elapsed: 34.7s remaining: 0.0s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1, score=0.622, total= 6.9s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1
[Parallel(n_jobs=1)]: Done 6 out of 6 | elapsed: 41.6s remaining: 0.0s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1, score=0.627, total= 7.1s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1
[Parallel(n_jobs=1)]: Done 7 out of 7 | elapsed: 48.7s remaining: 0.0s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1, score=0.628, total= 7.2s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1
[Parallel(n_jobs=1)]: Done 8 out of 8 | elapsed: 55.9s remaining: 0.0s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1, score=0.640, total= 6.6s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1
[Parallel(n_jobs=1)]: Done 9 out of 9 | elapsed: 1.0min remaining: 0.0s
[CV] booster=gblinear, learning_rate=0.0001, max_depth=3, n_estimator=100, subsample=0.1, score=0.629, total= 6.6s
[Parallel(n_jobs=1)]: Done 10 out of 10 | elapsed: 1.2min finished
In my case, verbose=1 did the trick.