Impute Module

This module provides baseline imputers to support quick experimentation or evaluation of imputation strategies. The main tool is SimpleSmartImputer, which detects column types and fills missing values accordingly.

Function Overview

SimpleSmartImputer

A simple imputer for both numerical and categorical variables.

Module Reference

SimpleSmartImputer

This simple yet adaptive imputer chooses a strategy based on column type:

  • Numerical columns are imputed using the mean.

  • Categorical columns are imputed using the mode.

It supports optional control over column type detection and verbosity.
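The mean/mode rule can be illustrated without the library. The sketch below is not the missmecha implementation: it uses plain Python lists, and the helper names is_missing and fill_value are made up for illustration. The source does not say how ties in the mode are broken, so the example avoids them.

```python
import math
from statistics import mean, mode


def is_missing(value):
    """Treat None and float NaN as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def fill_value(values, categorical):
    """Return the mode for a categorical column, the mean for a numerical one."""
    observed = [v for v in values if not is_missing(v)]
    return mode(observed) if categorical else mean(observed)


ages = [25, float("nan"), 30]
genders = ["M", "F", "F", None]

age_fill = fill_value(ages, categorical=False)       # mean of the observed ages
gender_fill = fill_value(genders, categorical=True)  # most frequent label

# Replace only the missing entries; observed values are kept as they are.
ages_filled = [age_fill if is_missing(v) else v for v in ages]
genders_filled = [gender_fill if is_missing(v) else v for v in genders]
```

Here the missing age becomes 27.5 and the missing gender becomes 'F'.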

class missmecha.impute.SimpleSmartImputer(cat_cols=None, verbose=True)

Bases: object

A simple imputer for both numerical and categorical variables.

This class automatically detects or accepts user-specified column types and fills missing values using mean (for numerical) or mode (for categorical) strategies.

The interface supports scikit-learn-style methods: fit, transform, and fit_transform.

Parameters:
  • cat_cols (list of str, optional) – A list of column names to be treated as categorical. If None, types are inferred automatically.

  • verbose (bool, default=True) – Whether to print out the imputation summary during fit.
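An explicit cat_cols list matters most for columns whose values look numeric but are really labels, such as integer codes. The following sketch shows that idea on a plain dictionary of columns. The helper categorical_columns and its inference rule (a column is categorical if any observed value is not a number) are assumptions made for illustration; the source does not describe how the library infers types.

```python
from numbers import Number


def categorical_columns(data, cat_cols=None):
    """Return the set of column names to treat as categorical.

    An explicit cat_cols list is used as given. Otherwise a column counts as
    categorical when any observed value is not a number (assumed rule).
    """
    if cat_cols is not None:
        return set(cat_cols)
    return {
        name
        for name, values in data.items()
        if any(v is not None and not isinstance(v, Number) for v in values)
    }


data = {
    "age": [25, None, 30],
    "gender": ["M", "F", None],
    "region_code": [1, 2, None],  # integer codes that are really labels
}

inferred = categorical_columns(data)
explicit = categorical_columns(data, cat_cols=["gender", "region_code"])
```

Under the assumed rule, automatic inference would treat region_code as numerical and fill it with a mean, while the explicit list forces it to be filled with the mode.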

Examples

>>> import numpy as np
>>> import pandas as pd
>>> from missmecha.impute import SimpleSmartImputer
>>> df = pd.DataFrame({'age': [25, np.nan, 30], 'gender': ['M', 'F', np.nan]})
>>> imputer = SimpleSmartImputer()
>>> df_imputed = imputer.fit_transform(df)

fit(df)

Fit the imputer on the provided DataFrame.

This method determines the fill value for each column based on its type:

  • Numerical columns: mean

  • Categorical columns: mode

Parameters:

df (pandas.DataFrame) – Input data to analyze and compute fill values from.

Returns:

self – The fitted instance with fill_values set.

Return type:

SimpleSmartImputer

fit_transform(df)

Fit the imputer and transform the dataset in one step.

Equivalent to calling fit() followed by transform().

Parameters:

df (pandas.DataFrame) – Dataset to be imputed.

Returns:

A copy of the input DataFrame with missing values filled.

Return type:

pandas.DataFrame

transform(df)

Apply the learned fill values to transform the dataset.

Parameters:

df (pandas.DataFrame) – Dataset to be imputed using values from fit().

Returns:

A copy of the input DataFrame with missing values filled.

Return type:

pandas.DataFrame
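The three methods follow the usual fit/transform contract: fit learns the fill values and returns the instance, transform applies them to any dataset and returns a new object, and fit_transform chains the two. The class below is a hypothetical stand-in named DictImputer that works on dictionaries of lists rather than DataFrames. It is a minimal sketch of the contract described above, not the library's code.

```python
import math
from statistics import mean, mode


def _missing(value):
    """Treat None and float NaN as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


class DictImputer:
    """Hypothetical stand-in operating on {column_name: list} dictionaries."""

    def __init__(self, cat_cols=()):
        self.cat_cols = set(cat_cols)
        self.fill_values = {}

    def fit(self, data):
        # Learn one fill value per column from the observed entries.
        for name, values in data.items():
            observed = [v for v in values if not _missing(v)]
            stat = mode if name in self.cat_cols else mean
            self.fill_values[name] = stat(observed)
        return self  # returning self allows fit(...).transform(...)

    def transform(self, data):
        # Build new lists so the caller's data is left untouched.
        return {
            name: [self.fill_values[name] if _missing(v) else v for v in values]
            for name, values in data.items()
        }

    def fit_transform(self, data):
        return self.fit(data).transform(data)


train = {"age": [20, 40, None], "gender": ["F", "F", "M"]}
test = {"age": [None, 35], "gender": [None, "M"]}

# Fill values come from the training data and are reused on the test data.
imputer = DictImputer(cat_cols=["gender"]).fit(train)
test_filled = imputer.transform(test)
train_filled = DictImputer(cat_cols=["gender"]).fit_transform(train)
```

Fitting on one dataset and transforming another keeps statistics from the second dataset out of the fill values, which is the usual reason for keeping fit and transform as separate steps.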