mltune package
Submodules
mltune.base module
- class mltune.base.BaseModelWrapper(hyperparameters=None, features=None)[source]
Bases:
object
Base wrapper for ML models.
Stores hyperparameters and feature list, and provides JSON serialization and basic fit/predict interface.
Subclasses must set self.model to the actual model instance.
- hyperparameters
Hyperparameters for the model.
- Type:
dict
- features
List of feature names to use.
- Type:
list of str
- model
Underlying ML model instance (set by subclass).
- Type:
Any
- get_model_factory()[source]
Returns a factory function that creates new model instances from the wrapper's fixed hyperparameters combined with dynamic hyperparameters.
The returned factory takes a dictionary of dynamic hyperparameters (those to be tuned, e.g., via grid search) and returns a new model instance ready to fit.
- Returns:
A factory function: dynamic_params → model instance.
- Return type:
Callable[[dict[str, Any]], Any]
- Raises:
NotImplementedError – If the method is not overridden by a subclass.
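A minimal sketch of how a subclass might implement get_model_factory; the RandomForestClassifier choice, the class name MyForestWrapper, and the parameter-merging behaviour shown are illustrative assumptions, not part of the base class:

```python
from typing import Any, Callable

from sklearn.ensemble import RandomForestClassifier

from mltune.base import BaseModelWrapper


class MyForestWrapper(BaseModelWrapper):
    """Hypothetical subclass used only to illustrate get_model_factory."""

    def get_model_factory(self) -> Callable[[dict[str, Any]], Any]:
        fixed = dict(self.hyperparameters or {})  # hyperparameters stored on the wrapper

        def factory(dynamic_params: dict[str, Any]) -> Any:
            # Dynamic (tuned) parameters are assumed to override the fixed ones.
            return RandomForestClassifier(**{**fixed, **dynamic_params})

        return factory
```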
- to_json()[source]
Serialize the wrapper’s configuration to a JSON string.
- Returns:
JSON string representing model class, hyperparameters, and features.
- Return type:
str
- classmethod from_json(json_string)[source]
Deserialize from JSON string to create a new wrapper instance.
- Parameters:
json_string (str) – JSON string created by to_json.
- Returns:
A new instance of the wrapper with loaded hyperparameters and features.
- Return type:
BaseModelWrapper
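For example, a wrapper's configuration can be round-tripped through JSON; the concrete subclass and the hyperparameter values below are illustrative:

```python
from mltune.sklearn import RandomForestModelWrapper

wrapper = RandomForestModelWrapper(
    hyperparameters={"n_estimators": 200},  # illustrative values
    features=["age", "income"],
)
payload = wrapper.to_json()  # JSON string with model class, hyperparameters, features
restored = RandomForestModelWrapper.from_json(payload)
```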
- fit(X, y)[source]
Fit the underlying model to training data.
- Parameters:
X (pd.DataFrame) – Training feature data.
y (pd.Series) – Training target labels.
- Returns:
Result of the model’s fit method.
- Return type:
Any
- predict(X)[source]
Predict target values using the trained model.
- Parameters:
X (pd.DataFrame) – Input feature data.
- Returns:
Predicted target values.
- Return type:
Any
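A typical fit/predict flow with a concrete subclass might look like this (the toy data and wrapper choice are illustrative):

```python
import pandas as pd

from mltune.sklearn import RandomForestModelWrapper

X = pd.DataFrame({"age": [25, 32, 47, 51], "income": [40, 60, 80, 75]})
y = pd.Series([0, 0, 1, 1])

wrapper = RandomForestModelWrapper(
    hyperparameters={"n_estimators": 100},
    features=["age", "income"],
)
wrapper.fit(X, y)           # delegates to the underlying model's fit
preds = wrapper.predict(X)  # delegates to the underlying model's predict
```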
- autotune(X, y, hyperparam_initial_info, splits=5, feature_selection_strategy='none', hyperparam_tuning_strategy='grid_search', verbose=False, plot=False)[source]
Auto-tune model hyperparameters and feature set.
- Parameters:
X (pd.DataFrame) – Full feature dataset.
y (pd.Series) – Target labels.
hyperparam_initial_info (Any) – Initial info for hyperparameter tuning (e.g., a parameter grid for the “grid_search” strategy).
splits (int) – Number of CV folds.
feature_selection_strategy (str) – Strategy for feature elimination (“greedy_backward” or “none”).
hyperparam_tuning_strategy (str) – Strategy for hyperparameter tuning (currently only “grid_search”).
verbose (bool) – Print logs during tuning.
plot (bool, default=False) – If True, show a plot of CV/train accuracy.
- Return type:
None
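Continuing the fit/predict example above, autotune might be called with a parameter grid for the default grid-search strategy; the grid values are hypothetical:

```python
param_grid = {"n_estimators": [100, 200], "max_depth": [3, 5, None]}  # hypothetical grid

wrapper.autotune(
    X, y,
    hyperparam_initial_info=param_grid,
    splits=5,
    feature_selection_strategy="greedy_backward",
    hyperparam_tuning_strategy="grid_search",
    verbose=True,
)
# autotune returns None; the tuned configuration is presumably reflected on the wrapper.
```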
mltune.lightgbm module
- class mltune.lightgbm.LightGBMModelWrapper(hyperparameters=None, features=None)[source]
Bases:
BaseModelWrapper
Wrapper for lightgbm.LGBMClassifier.
Initializes the underlying LGBMClassifier with given hyperparameters.
- Parameters:
hyperparameters (dict of str to Any) – Model hyperparameters to configure LGBMClassifier.
features (list of str) – List of feature names to use during training and prediction.
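A minimal construction sketch; the hyperparameter values and feature names are illustrative:

```python
from mltune.lightgbm import LightGBMModelWrapper

lgbm_wrapper = LightGBMModelWrapper(
    hyperparameters={"n_estimators": 300, "learning_rate": 0.05},  # illustrative
    features=["age", "income"],
)
```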
mltune.plotting module
- mltune.plotting.plot_feature_importances(importances, title='Feature Importances')[source]
Plot a horizontal bar chart of feature importances.
- Parameters:
importances (pd.Series) – Feature importances indexed by feature names.
title (str, default="Feature Importances") – Plot title.
- Return type:
None
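For example, a fitted tree-based model's feature_importances_ can be wrapped in a pandas Series indexed by feature names and plotted (model here is assumed to be an already-fitted estimator):

```python
import pandas as pd

from mltune.plotting import plot_feature_importances

importances = pd.Series(model.feature_importances_, index=["age", "income"])
plot_feature_importances(importances, title="Feature Importances")
```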
- mltune.plotting.plot_feature_elimination_progression(score_log)[source]
Plot CV and train accuracy progression during feature elimination.
- Parameters:
score_log (list of tuple of (int, float, float)) – Each tuple contains the number of features remaining, the cross-validation accuracy score, and the training accuracy score.
- Return type:
None
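A sketch of the expected score_log structure; the numbers are made up:

```python
from mltune.plotting import plot_feature_elimination_progression

score_log = [
    (10, 0.81, 0.95),  # (features remaining, CV accuracy, train accuracy)
    (9, 0.82, 0.94),
    (8, 0.82, 0.93),
]
plot_feature_elimination_progression(score_log)
```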
mltune.sklearn module
- class mltune.sklearn.RandomForestModelWrapper(hyperparameters=None, features=None)[source]
Bases:
BaseModelWrapper
Wrapper for sklearn.ensemble.RandomForestClassifier.
Initializes the underlying RandomForestClassifier with given hyperparameters.
- Parameters:
hyperparameters (dict of str to Any) – Model hyperparameters to configure RandomForestClassifier.
features (list of str) – List of feature names to use during training and prediction.
mltune.tuning module
- mltune.tuning.get_feature_importance_ranking(model, features, ascending=True, plot=False)[source]
Returns a list of features sorted by importance.
- Parameters:
model (fitted model) – Must have feature_importances_ attribute.
features (list of str) – Feature names to check.
ascending (bool, default=True) – If True, sort in ascending order (the least important first).
plot (bool, default=False) – If True, show matplotlib bar chart of feature importances.
- Returns:
Feature names sorted by importance.
- Return type:
list of str
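A usage sketch, assuming model is an already-fitted estimator exposing feature_importances_:

```python
from mltune.tuning import get_feature_importance_ranking

ranking = get_feature_importance_ranking(
    model, ["age", "income"], ascending=True, plot=False
)
# With ascending=True, ranking[0] is the least important feature.
```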
- mltune.tuning.tune_model_parameters(X, y, estimator, hyperparam_initial_info, features, splits=5, verbose=False, search_strategy='grid_search')[source]
Perform hyperparameter tuning using GridSearchCV (with option to extend to other strategies).
- Parameters:
X (Any) – Feature dataset (e.g., a pandas DataFrame).
y (Any) – Target labels (e.g., a pandas Series).
estimator (estimator instance) – The ML model (e.g., RandomForestClassifier) to tune.
hyperparam_initial_info (Any) – Search space description; for “grid_search”, a mapping of parameter names to lists of values to try.
features (list of str) – Feature names to use during tuning.
splits (int, default=5) – Number of cross-validation folds.
verbose (bool, default=False) – If True, print best score and params.
search_strategy (str, default="grid_search") – Search strategy to use (currently only “grid_search” is supported).
- Returns:
Dictionary with:
- ‘best_params’: best hyperparameters found.
- ‘best_score’: best CV accuracy score (rounded).
- ‘cv_results’: full cv_results_ from GridSearchCV.
- Return type:
dict
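A grid-search call might look like this; the estimator, grid, and data are illustrative:

```python
from sklearn.ensemble import RandomForestClassifier

from mltune.tuning import tune_model_parameters

result = tune_model_parameters(
    X, y,
    estimator=RandomForestClassifier(),
    hyperparam_initial_info={"n_estimators": [100, 200], "max_depth": [3, 5]},
    features=["age", "income"],
    splits=5,
    verbose=True,
)
best_params = result["best_params"]  # see the returned dictionary keys above
```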
- mltune.tuning.tune_without_feature_elimination(X, y, model_factory, features, hyperparam_initial_info, splits, hyperparam_tuning_strategy, verbose=False, plot=False)[source]
Tune model hyperparameters without any feature elimination.
- Parameters:
X (pd.DataFrame) – Feature dataset.
y (pd.Series) – Target labels.
model_factory (Callable[..., Any]) – Factory function that returns a new model instance when called with hyperparameters.
features (List[str]) – Feature list (will not be changed).
hyperparam_initial_info (Any) – Initial hyperparameter search space info (e.g., parameter grid).
splits (int) – Number of CV folds.
hyperparam_tuning_strategy (str) – Hyperparameter tuning strategy, e.g., ‘grid_search’.
verbose (bool, default=False) – Print progress logs.
plot (bool, default=False) – Plot tuning results.
- Return type:
tuple[dict[str, Any], list[str]]
- Returns:
best_params (dict) – Best found hyperparameters.
features (List[str]) – The original feature list, unchanged.
- mltune.tuning.tune_with_feature_elimination(X, y, model_factory, features, hyperparam_initial_info, splits=5, hyperparam_tuning_strategy='grid_search', verbose=False, plot=False)[source]
Tune model hyperparameters and select features using greedy backward elimination.
- Parameters:
X (pd.DataFrame) – Feature dataset.
y (pd.Series) – Target labels.
model_factory (Callable[..., Any]) – Factory function that returns a new model instance when called with hyperparameters.
features (List[str]) – Initial feature list to consider.
hyperparam_initial_info (Any) – Initial hyperparameter search space info (e.g., parameter grid).
splits (int) – Number of CV folds.
hyperparam_tuning_strategy (str) – Hyperparameter tuning strategy, e.g., ‘grid_search’.
verbose (bool, default=False) – Print progress logs.
plot (bool, default=False) – Plot tuning progression.
- Return type:
tuple[dict[str, Any], list[str]]
- Returns:
best_params (dict) – Best found hyperparameters.
best_features (List[str]) – Selected subset of features after elimination.
- mltune.tuning.tune_model_parameters_and_features(X, y, model_factory, features, hyperparam_initial_info, splits=5, feature_selection_strategy='none', hyperparam_tuning_strategy='grid_search', verbose=False, plot=False)[source]
Auto-tune model hyperparameters and optionally perform feature selection.
- Parameters:
X (pd.DataFrame) – Feature dataset.
y (pd.Series) – Target labels.
model_factory (Callable[..., Any]) – Factory function returning new model instance given hyperparameters.
features (List[str]) – Initial feature list.
hyperparam_initial_info (Any) – Initial hyperparameter search space info.
splits (int, default=5) – Number of CV folds.
feature_selection_strategy (str, default="none") – Feature selection strategy. Supported values:
- “none”: no feature elimination (default)
- “greedy_backward”: greedy backward feature elimination
hyperparam_tuning_strategy (str, default="grid_search") – Hyperparameter tuning strategy.
verbose (bool, default=False) – Print tuning progress.
plot (bool, default=False) – Plot tuning/feature elimination results.
- Return type:
tuple[dict[str, Any], list[str]]
- Returns:
best_params (dict) – Best hyperparameters found.
best_features (List[str]) – Selected feature subset (may be same as input features if no elimination).
- Raises:
NotImplementedError – If the specified feature selection strategy is not supported.
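A sketch of the top-level tuning entry point with backward feature elimination; model_factory is assumed to come from a wrapper's get_model_factory(), and the grid is hypothetical:

```python
from mltune.tuning import tune_model_parameters_and_features

best_params, best_features = tune_model_parameters_and_features(
    X, y,
    model_factory=wrapper.get_model_factory(),
    features=["age", "income"],
    hyperparam_initial_info={"n_estimators": [100, 200]},
    splits=5,
    feature_selection_strategy="greedy_backward",
    hyperparam_tuning_strategy="grid_search",
    verbose=True,
)
```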
mltune.xgboost module
- class mltune.xgboost.XGBoostModelWrapper(hyperparameters=None, features=None)[source]
Bases:
BaseModelWrapper
Wrapper for xgboost.XGBClassifier.
Initializes the underlying XGBClassifier with given hyperparameters.
- Parameters:
hyperparameters (dict of str to Any) – Model hyperparameters to configure XGBClassifier.
features (list of str) – List of feature names to use during training and prediction.