Visual Module

This module provides visualization tools for inspecting missing data patterns. It includes matrix-style plots, nullity correlation heatmaps, and column-level summaries. These tools are especially helpful during exploratory data analysis (EDA) and when diagnosing the missingness structure.

Function Overview

plot_missing_matrix

Visualize missing data patterns in a matrix-style heatmap.

plot_missing_heatmap

Plot a heatmap of pairwise nullity correlations.

Module Reference

plot_missing_matrix

missmecha.visual.plot_missing_matrix(df, figsize=None, cmap='Blues', sort_by=None, color=True, fontsize=14, label_rotation=45, show_colorbar=False, ts=False)

Visualize missing data patterns in a matrix-style heatmap.

This function renders a binary mask of missingness in the input DataFrame as a heatmap. It optionally colors the observed (non-missing) values using a colormap, and supports both standard tabular and time series formats.
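The binary mask described above can be reproduced with plain pandas. The snippet below is a minimal sketch, not the library's internal code; the DataFrame and its column names are invented for illustration.

```python
import numpy as np
import pandas as pd

# Small illustrative frame; the column names are made up for this sketch.
df = pd.DataFrame({
    "age": [25, np.nan, 41, 33],
    "city": ["Oslo", "Lima", None, "Pune"],
    "income": [np.nan, np.nan, 5200.0, 4100.0],
})

# True where a value is missing, False where it is observed.
mask = df.isna()

# Printed as 0/1, this is the grid that a matrix-style plot draws.
print(mask.astype(int))
```

Because `isna()` treats both `NaN` and `None` as missing, the same mask works for numerical and categorical columns.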

Parameters:
  • df (pandas.DataFrame) – Input DataFrame to visualize. Missing values (NaN) are rendered as blank cells.

  • figsize (tuple of int, optional) – Custom figure size (width, height). Defaults to auto-scaling based on shape.

  • cmap (str, optional) – Colormap to apply to observed values when color=True. Default is 'Blues'.

  • sort_by (str or None, optional) – If set, sorts rows by the specified column before plotting. Useful for revealing missingness patterns that depend on that column's values. Default is None (rows keep their original order).

  • color (bool, optional) – If True, applies a colormap to observed values. If False, uses a binary (gray-scale) mask.

  • fontsize (int, optional) – Font size for column labels and axis ticks. Default is 14.

  • label_rotation (int, optional) – Rotation angle for x-axis labels (column names and missing rates). Default is 45°.

  • show_colorbar (bool, optional) – Whether to display the colorbar. Has an effect only when color=True. Default is False.

  • ts (bool, optional) – If True, displays the y-axis using the actual DataFrame index (e.g., for time series). If False, uses sequential row numbers.

Returns:

ax – Axes object of the generated plot.

Return type:

matplotlib.axes.Axes

Notes

  • Top axis: column names; bottom axis: column-wise missing rates.

  • Works with both numerical and categorical columns.

  • Fully observed or fully missing columns are retained (not filtered).

  • For large datasets, consider subsampling before plotting for performance.
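The subsampling advice and the column-wise missing rates mentioned in these notes can be sketched with plain pandas. The row cap of 2,000 and the synthetic data are assumptions chosen for illustration, and the rate calculation is an equivalent computation rather than the function's own code.

```python
import numpy as np
import pandas as pd

# Synthetic data for illustration: 50,000 rows with roughly 10% of cells removed.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50_000, 4)), columns=list("abcd"))
df = df.mask(rng.random(df.shape) < 0.1)

# Keep at most max_rows rows (an arbitrary cap for this sketch).
# sort_index() restores the original row order, which matters for time series.
max_rows = 2_000
sample = df.sample(n=min(max_rows, len(df)), random_state=0).sort_index()

# Fraction of missing values per column, each between 0 and 1.
missing_rates = sample.isna().mean()
print(missing_rates.round(3))
```

The reduced frame `sample` can then be passed to plot_missing_matrix in place of the full DataFrame. Fixing `random_state` keeps the plot reproducible between runs.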

Examples

>>> from missmecha.visual import plot_missing_matrix
>>> import pandas as pd
>>> df = pd.read_csv("data.csv")
>>> plot_missing_matrix(df, color=False)

plot_missing_heatmap

missmecha.visual.plot_missing_heatmap(df, figsize=(20, 12), fontsize=14, label_rotation=45, cmap='RdBu', method='pearson')

Plot a heatmap of pairwise nullity correlations.

This function visualizes the pairwise correlation between missing value patterns across columns in the input DataFrame. The heatmap helps identify dependencies between missingness in different variables and can guide further missing data analysis.

Parameters:
  • df (pandas.DataFrame) – Input dataset to visualize. Each column should represent a feature.

  • figsize (tuple of int, optional) – Figure size in inches (width, height). Default is (20, 12).

  • fontsize (int, optional) – Font size for axis labels and annotations. Default is 14.

  • label_rotation (int, optional) – Rotation angle (in degrees) for x-axis tick labels. Default is 45.

  • cmap (str, optional) – Colormap for the heatmap (e.g., 'RdBu', 'viridis'). Default is 'RdBu'.

  • method ({'pearson', 'kendall', 'spearman'}, optional) – Correlation method used to compute pairwise nullity correlations. Default is 'pearson'.

Returns:

ax – Axes object containing the plotted heatmap.

Return type:

matplotlib.axes.Axes

Raises:

ValueError – If the input DataFrame does not contain any missing values.
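Since a complete DataFrame triggers this error, it can be convenient to check for missing values before calling the function. The helper below is a hypothetical convenience written for this sketch; it is not part of missmecha.

```python
import numpy as np
import pandas as pd


def has_missing(df: pd.DataFrame) -> bool:
    """Return True if at least one cell in df is missing (hypothetical helper)."""
    return bool(df.isna().any().any())


complete = pd.DataFrame({"x": [1, 2], "y": [3, 4]})
partial = pd.DataFrame({"x": [1, np.nan], "y": [3, 4]})

print(has_missing(complete))
print(has_missing(partial))
```

Calling plot_missing_heatmap only when `has_missing(df)` is true avoids the exception. Wrapping the call in `try`/`except ValueError` is an alternative.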

Notes

  • Fully observed or fully missing columns are excluded from the plot.

  • If the dataset has more than 1000 rows, a random sample of 1000 rows is used.

  • The heatmap represents correlation between binary indicators of missingness (True/False).
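The quantity shown in the heatmap can be approximated with plain pandas, following the notes above. This is a sketch under assumptions: the order of the filtering and sampling steps and the fixed random seed are guesses rather than details taken from the library, and the example data is invented.

```python
import numpy as np
import pandas as pd

# Illustrative data: "a" and "c" are missing in the same rows, "b" is missing
# in different rows, and "d" is fully observed.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, np.nan, 5.0, 6.0],
    "b": [np.nan, 2.0, np.nan, 4.0, 5.0, 6.0],
    "c": [1.0, np.nan, 3.0, np.nan, 5.0, 6.0],
    "d": [1, 2, 3, 4, 5, 6],
})

# Binary indicators of missingness (True where a value is missing).
nullity = df.isna()

# Drop fully observed and fully missing columns: a constant indicator
# has zero variance, so its correlation is undefined.
n_missing = nullity.sum()
nullity = nullity.loc[:, (n_missing > 0) & (n_missing < len(df))]

# Cap the number of rows at 1000, as described in the notes.
if len(nullity) > 1000:
    nullity = nullity.sample(n=1000, random_state=0)

# Pairwise correlation between the missingness indicators.
corr = nullity.astype(int).corr(method="pearson")
print(corr)
```

In this example the a/c entry is 1.0 because the two columns are always missing together, while the a/b entry is negative because they are never missing in the same row. Column "d" does not appear in the result.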

Examples

>>> from missmecha.visual import plot_missing_heatmap
>>> import pandas as pd
>>> df = pd.read_csv("my_data.csv")
>>> plot_missing_heatmap(df)