helpers

frequenz.lib.notebooks.reporting.utils.helpers ¤

Energy flow and configuration utilities for microgrid analysis and reporting.

This module provides helper functions for:

- Safe numeric extraction and aggregation from pandas DataFrames.
- Loading and validating YAML-based configuration files.
- Labeling microgrid component columns using configuration metadata.
- Computing derived energy flow metrics such as:
    - Production excess
    - Battery charge utilization
    - Grid feed-in
    - Self-consumption and self-share
- Formatting and timezone conversion utilities for reporting.

These utilities are primarily used in energy analytics pipelines and microgrid reporting notebooks to ensure consistent data preprocessing, metric calculation, and standardized output structures.

FUNCTION	DESCRIPTION
`_get_numeric_series`	Safely extract a numeric Series or return zeros if missing.
`_sum_cols`	Safely sum multiple numeric columns.
`load_config`	Load and validate a YAML configuration file.
`fmt_to_de_system`	Format numbers using German-style decimal conventions.
`convert_timezone`	Convert a datetime Series to a target timezone.
`label_component_columns`	Rename numeric component columns using MicrogridConfig.
`get_energy_report_columns`	Determine relevant columns for energy reporting.
`add_energy_flows`	Compute derived production, battery, and grid metrics.

Notes

These helpers are designed for internal use and assume well-structured DataFrames with datetime indices or timestamp columns.
All numeric outputs are returned as float64 Series to ensure consistency.

Classes¤

Functions¤

frequenz.lib.notebooks.reporting.utils.helpers.add_energy_flows ¤

add_energy_flows(
    df: DataFrame,
    production_cols: list[str] | None = None,
    consumption_cols: list[str] | None = None,
    grid_cols: list[str] | None = None,
    battery_cols: list[str] | None = None,
    production_is_positive: bool = False,
) -> DataFrame

Compute and add derived energy flow metrics to the DataFrame.

This function aggregates production and consumption data, derives energy flow relationships such as grid feed-in, battery charging, and self-consumption, and appends these computed columns to a copy of the given DataFrame. Columns that are specified but are missing or contain only null/zero values are ignored.

PARAMETER	DESCRIPTION
`df`	Input DataFrame containing production, consumption, and optionally battery power data. TYPE: `DataFrame`
`production_cols`	list of column names representing production sources. TYPE: `list[str] \| None` DEFAULT: `None`
`consumption_cols`	list of column names representing consumption sources. TYPE: `list[str] \| None` DEFAULT: `None`
`grid_cols`	list of column names representing grid import/export. TYPE: `list[str] \| None` DEFAULT: `None`
`battery_cols`	optional list of column names for battery charging power. If None, battery-related flows are set to zero. TYPE: `list[str] \| None` DEFAULT: `None`
`production_is_positive`	Whether production values are already positive. If False, `production` is inverted before clipping. TYPE: `bool` DEFAULT: `False`

RETURNS	DESCRIPTION
`DataFrame`	A copy of the input DataFrame with additional columns:
- "production_excess": Production exceeding consumption.
- "production_excess_in_bat": Portion of excess stored in the battery.
- "grid_feed_in": Portion of excess fed into the grid.
- "production_self_use": Self-consumed portion of production.
- "production_self_share": Share of consumption covered by self-production.

The source below additionally adds "grid_consumption" and one "<col>_asset_production" column per resolved production column.
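
The relationships between the derived columns can be illustrated with plain pandas. The sketch below is not the library's implementation: the formulas are assumptions inferred from the column descriptions above, and the sample data and variable names are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; production is assumed to be already positive.
production = pd.Series([5.0, 2.0, 0.0])
consumption = pd.Series([3.0, 4.0, 1.0])
battery_charge = pd.Series([1.5, 0.0, 0.0])  # charging power only (>= 0)

# Assumed definitions, inferred from the column descriptions.
excess = (production - consumption).clip(lower=0.0)
# The battery can absorb at most the available excess.
excess_in_bat = pd.concat([excess, battery_charge], axis=1).min(axis=1)
# Whatever excess is not stored is assumed to be fed into the grid.
grid_feed_in = excess - excess_in_bat
# Self-use is the part of production that meets consumption directly.
self_use = pd.concat([production, consumption], axis=1).min(axis=1)
# Guard against division by zero where there is no consumption.
self_share = (self_use / consumption.where(consumption > 0)).fillna(0.0)

flows = pd.DataFrame(
    {
        "production_excess": excess,
        "production_excess_in_bat": excess_in_bat,
        "grid_feed_in": grid_feed_in,
        "production_self_use": self_use,
        "production_self_share": self_share,
    }
)
print(flows)
```

In the first row, 2.0 units of excess are split into 1.5 for the battery and 0.5 for the grid; in the second row, production covers half of the consumption.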

Source code in frequenz/lib/notebooks/reporting/utils/helpers.py

def add_energy_flows(
    df: pd.DataFrame,
    production_cols: list[str] | None = None,
    consumption_cols: list[str] | None = None,
    grid_cols: list[str] | None = None,
    battery_cols: list[str] | None = None,
    production_is_positive: bool = False,
) -> pd.DataFrame:
    """Compute and add derived energy flow metrics to the DataFrame.

    This function aggregates production and consumption data, derives energy flow
    relationships such as grid feed-in, battery charging, and self-consumption,
    and appends these computed columns to a copy of the given DataFrame. Columns that
    are specified but are missing or contain only null/zero values are ignored.

    Args:
        df: Input DataFrame containing production, consumption, and optionally
            battery power data.
        production_cols: list of column names representing production sources.
        consumption_cols: list of column names representing consumption sources.
        grid_cols: list of column names representing grid import/export.
        battery_cols: optional list of column names for battery charging power. If None,
            battery-related flows are set to zero.
        production_is_positive: Whether production values are already positive.
            If False, `production` is inverted before clipping.

    Returns:
        A DataFrame including additional columns:
            - "production_excess": Production exceeding consumption.
            - "production_excess_in_bat": Portion of excess stored in the battery.
            - "grid_feed_in": Portion of excess fed into the grid.
            - "production_self_use": Self-consumed portion of production.
            - "production_self_share": Share of consumption covered by self-production.
    """
    df_flows = df.copy()

    # Keep only the configured columns that are present in the DataFrame and contain data
    resolved_production_cols = [
        col for col in (production_cols or []) if _column_has_data(df_flows, col)
    ]
    resolved_consumption_cols = [
        col for col in (consumption_cols or []) if _column_has_data(df_flows, col)
    ]
    resolved_grid_cols = [
        col for col in (grid_cols or []) if _column_has_data(df_flows, col)
    ]
    resolved_battery_cols = [
        col for col in (battery_cols or []) if _column_has_data(df_flows, col)
    ]

    battery_power_series = _sum_cols(df_flows, resolved_battery_cols)
    battery_charge_series = (
        battery_power_series.reindex(df_flows.index).fillna(0.0).clip(lower=0.0)
    )
    grid_power_series = _sum_cols(df_flows, resolved_grid_cols)

    # Compute total asset production
    asset_production_cols: list[str] = []
    for col in resolved_production_cols:
        series = _get_numeric_series(
            df_flows,
            col,
        )
        asset_series = asset_production(
            series,
            production_is_positive=production_is_positive,
        )
        asset_col_name = f"{col}_asset_production"
        df_flows[asset_col_name] = asset_series
        asset_production_cols.append(asset_col_name)

    df_flows["production_total"] = _sum_cols(df_flows, asset_production_cols)

    # Compute total consumption
    consumption_series_cols: list[str] = []
    for col in resolved_consumption_cols:
        df_flows[col] = _get_numeric_series(df_flows, col)
        consumption_series_cols.append(col)

    df_flows["consumption_total"] = _sum_cols(df_flows, consumption_series_cols)

    # Surplus vs. consumption (production is already positive because of the above cleaning)
    df_flows["production_excess"] = production_excess(
        df_flows["production_total"],
        df_flows["consumption_total"],
        production_is_positive=True,
    )

    # Battery charging power (optional)
    df_flows["production_excess_in_bat"] = production_excess_in_bat(
        df_flows["production_total"],
        df_flows["consumption_total"],
        battery=battery_charge_series,
        production_is_positive=True,
    )

    # Split excess into battery vs. grid
    df_flows["grid_feed_in"] = grid_feed_in(
        df_flows["production_total"],
        df_flows["consumption_total"],
        battery=battery_charge_series,
        production_is_positive=True,
    )

    # If no production columns exist, set self-consumption metrics to zero
    if asset_production_cols:
        # Use total production for self-consumption instead of asset_production
        # (which may not exist)
        df_flows["production_self_use"] = production_self_consumption(
            df_flows["production_total"],
            df_flows["consumption_total"],
            production_is_positive=True,
        )
        df_flows["production_self_share"] = production_self_share(
            df_flows["production_total"],
            df_flows["consumption_total"],
            production_is_positive=True,
        )
    else:
        df_flows["production_self_use"] = 0.0
        df_flows["production_self_share"] = 0.0

    # Add grid consumption column
    df_flows["grid_consumption"] = grid_consumption(
        grid_power_series,
        # To convert positive production back to PSC format (where production is negative)
        df_flows["production_total"] * -1,
        df_flows["consumption_total"],
        battery_power_series,
    )

    df_flows = df_flows.drop(
        columns=["production_total", "consumption_total"], errors="ignore"
    )
    return df_flows

frequenz.lib.notebooks.reporting.utils.helpers.build_color_map ¤

build_color_map(
    cols: list[str],
    color_dict: dict[str, str] | None = None,
    palette: list[str] | None = None,
) -> dict[str, str]

Generate a color mapping for columns or categories.

Creates a mapping from column names (or categorical labels) to color values. If user-specified colors are provided via color_dict, those are applied first. Remaining columns are assigned distinct colors from a chosen palette, ensuring no duplicates.

PARAMETER	DESCRIPTION
`cols`	List of column names or category labels to assign colors to. TYPE: `list[str]`
`color_dict`	Optional dictionary of pre-defined color mappings. Columns found here are assigned these colors directly. TYPE: `dict[str, str] \| None` DEFAULT: `None`
`palette`	Optional list of color codes to use as defaults. If None, a combined Plotly qualitative palette is used. TYPE: `list[str] \| None` DEFAULT: `None`

RETURNS	DESCRIPTION
`dict[str, str]`	A dictionary mapping each column or category name to a unique color.
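
The assignment order described above (user colors first, then unused palette colors) can be sketched without any plotting dependency. This is a simplified illustration, not the library code: `assign_colors` and the three-color default palette are hypothetical, and the color normalization step of the real function is omitted.

```python
def assign_colors(cols, color_dict=None, palette=None):
    """Assign user-provided colors first, then unused palette colors."""
    palette = palette or ["red", "green", "blue"]  # hypothetical default palette
    # Step 1: keep only the user colors whose key is a requested column.
    final = {c: v for c, v in (color_dict or {}).items() if c in cols}
    used = set(final.values())
    # Step 2: lazily walk the palette, skipping colors that are already taken.
    free = (p for p in palette if p not in used)
    for c in cols:
        if c in final:
            continue
        color = next(free, None)
        if color is None:
            break  # palette exhausted; remaining columns stay unassigned
        final[c] = color
        used.add(color)
    return final


colors = assign_colors(["pv", "battery", "grid"], {"pv": "green"})
print(colors)
```

Here "pv" keeps its user-defined green, so the palette's green is skipped and the remaining columns receive red and blue.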

Source code in frequenz/lib/notebooks/reporting/utils/helpers.py

def build_color_map(
    cols: list[str],
    color_dict: dict[str, str] | None = None,
    palette: list[str] | None = None,
) -> dict[str, str]:
    """Generate a color mapping for columns or categories.

    Creates a mapping from column names (or categorical labels) to color
    values. If user-specified colors are provided via `color_dict`, those
    are applied first. Remaining columns are assigned distinct colors from
    a chosen palette, ensuring no duplicates.

    Args:
        cols: List of column names or category labels to assign colors to.
        color_dict: Optional dictionary of pre-defined color mappings.
            Columns found here are assigned these colors directly.
        palette: Optional list of color codes to use as defaults.
            If None, a combined Plotly qualitative palette is used.

    Returns:
        A dictionary mapping each column or category name to a unique color.
    """
    # Default palette
    if palette is None:
        palette = px.colors.qualitative.Plotly + px.colors.qualitative.Dark2

    def to_rgba_str(color: str) -> str:
        """Convert any color format (hex, rgb, named) to normalized rgba(R,G,B,1) string."""
        try:
            rgba = mcolors.to_rgba(color)  # returns (r,g,b,a) in 0–1 range
            rgba_255 = tuple(int(round(x * 255)) for x in rgba[:3])
            return f"rgba({rgba_255[0]},{rgba_255[1]},{rgba_255[2]},{rgba[3]:.3f})"
        except ValueError:
            # fallback if string isn't recognized (e.g. malformed rgba)
            return color.lower().strip()

    final = {}
    used = set()

    # First assign user-provided colors
    if color_dict:
        for c, v in color_dict.items():
            if c in cols:
                rgba = to_rgba_str(v)
                final[c] = rgba
                used.add(rgba)

    # Then assign defaults, skipping already-used colors
    palette_iter = iter(palette * (len(cols) // len(palette) + 1))
    for c in cols:
        if c in final:
            continue
        for p in palette_iter:
            rgba = to_rgba_str(p)
            if rgba not in used:
                final[c] = rgba
                used.add(rgba)
                break

    return final

frequenz.lib.notebooks.reporting.utils.helpers.convert_timezone ¤

convert_timezone(
    ts: Series,
    target_tz: str = "Europe/Berlin",
    assume_tz: str = "UTC",
) -> Series

Convert a datetime Series to a target timezone.

If the Series contains timezone-naive datetimes, they are first localized to assume_tz before converting to target_tz.

PARAMETER	DESCRIPTION
`ts`	Input Series containing the datetime values. TYPE: `Series`
`target_tz`	Timezone name to convert the Series to. Defaults to `"Europe/Berlin"`. TYPE: `str` DEFAULT: `'Europe/Berlin'`
`assume_tz`	Timezone to assume for naive datetimes. Defaults to `"UTC"`. TYPE: `str` DEFAULT: `'UTC'`

RETURNS	DESCRIPTION
`Series`	A Series with the datetime values converted to `target_tz`.

RAISES	DESCRIPTION
`ValueError`	If `ts` is not a pandas Series.
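
The localize-then-convert behaviour for naive timestamps can be reproduced directly with the pandas `.dt` accessor. A minimal sketch with hypothetical sample timestamps:

```python
import pandas as pd

# Two naive timestamps, one in winter and one in summer.
naive = pd.Series(pd.to_datetime(["2024-01-01 12:00", "2024-07-01 12:00"]))

# Naive values are first interpreted as UTC (the default for assume_tz) ...
localized = naive.dt.tz_localize("UTC")
# ... and then converted to the target timezone.
berlin = localized.dt.tz_convert("Europe/Berlin")
print(berlin)
```

Noon UTC becomes 13:00 in January (UTC+1) and 14:00 in July (UTC+2), which shows that daylight saving time is handled per value.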

Source code in frequenz/lib/notebooks/reporting/utils/helpers.py

def convert_timezone(
    ts: pd.Series,
    target_tz: str = "Europe/Berlin",
    assume_tz: str = "UTC",
) -> pd.Series:
    """Convert a datetime Series to a target timezone.

    If the Series contains timezone-naive datetimes, they are first localized to
    ``assume_tz`` before converting to ``target_tz``.

    Args:
        ts: Input Series containing the datetime values.
        target_tz: Timezone name to convert the Series to.
            Defaults to ``"Europe/Berlin"``.
        assume_tz: Timezone to assume for naive datetimes.
            Defaults to ``"UTC"``.

    Returns:
        A Series with the datetime values converted to ``target_tz``.

    Raises:
        ValueError: If ``ts`` is not a pandas Series.
    """
    if not isinstance(ts, pd.Series):
        raise ValueError("Input must be a pandas Series")

    # Localize naive timestamps
    if ts.dt.tz is None:
        ts = ts.dt.tz_localize(assume_tz)

    return ts.dt.tz_convert(target_tz)

frequenz.lib.notebooks.reporting.utils.helpers.fmt_to_de_system ¤

fmt_to_de_system(x: float) -> str

Format a number using German-style decimal and thousands separators.

The function formats the number with two decimal places, using a comma as the decimal separator and a dot as the thousands separator.

PARAMETER	DESCRIPTION
`x`	The number to format. TYPE: `float`

RETURNS	DESCRIPTION
`str`	The formatted string with German number formatting applied.

Example

>>> fmt_to_de_system(12345.6789)
'12.345,68'
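
The separator swap relies on a temporary placeholder so that the two replacements do not interfere with each other. The steps can be traced as follows (the input value is just an illustration):

```python
x = 1234567.891

# Step 1: standard formatting with "," as thousands and "." as decimal separator.
us_style = f"{x:,.2f}"

# Step 2: park the thousands separators in a placeholder, turn the decimal
# point into a comma, then turn the placeholder into dots.
de_style = us_style.replace(",", "X").replace(".", ",").replace("X", ".")

print(us_style)
print(de_style)
```

Replacing "," with "." directly would make the original decimal point indistinguishable from the new thousands separators, which is why the intermediate "X" is needed.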

Source code in frequenz/lib/notebooks/reporting/utils/helpers.py

def fmt_to_de_system(x: float) -> str:
    """Format a number using German-style decimal and thousands separators.

    The function formats the number with two decimal places, using a comma
    as the decimal separator and a dot as the thousands separator.

    Args:
        x: The number to format.

    Returns:
        The formatted string with German number formatting applied.

    Example:
        >>> fmt_to_de_system(12345.6789)
        '12.345,68'
    """
    return f"{x:,.2f}".replace(",", "X").replace(".", ",").replace("X", ".")

frequenz.lib.notebooks.reporting.utils.helpers.get_energy_report_columns ¤

get_energy_report_columns(
    component_types: list[str], single_components: list[str]
) -> list[str]

Build the list of dataframe columns for the energy report.

The selected columns depend on the available component types.

PARAMETER	DESCRIPTION
`component_types`	List of component types (e.g. ["pv", "battery"]). TYPE: `list[str]`
`single_components`	Extra component columns to always include. TYPE: `list[str]`

RETURNS	DESCRIPTION
`list[str]`	The full list of dataframe columns.
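
The selection is essentially a lookup table keyed by component type. The following reduced sketch shows the pattern; `COLUMN_MAP` is a trimmed copy of the map in the source below, while `select_columns` and the base column list are hypothetical.

```python
# Trimmed lookup table: component type -> columns it enables.
COLUMN_MAP = {
    "battery": ["battery_throughput"],
    "pv": ["pv_asset_production", "production_self_use", "grid_feed_in"],
}


def select_columns(component_types, base):
    """Return the base columns plus those enabled by the given component types."""
    cols = list(base)
    for component, extra in COLUMN_MAP.items():
        if component in component_types:
            cols.extend(extra)
    return cols


selected = select_columns(["pv"], ["timestamp", "grid_consumption"])
print(selected)
```

Because the map is iterated in its own order, the resulting column order does not depend on the order of `component_types`.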

Source code in frequenz/lib/notebooks/reporting/utils/helpers.py

def get_energy_report_columns(
    component_types: list[str], single_components: list[str]
) -> list[str]:
    """Build the list of dataframe columns for the energy report.

    The selected columns depend on the available component types.

    Args:
        component_types: List of component types (e.g. ["pv", "battery"]).
        single_components: Extra component columns to always include.

    Returns:
        The full list of dataframe columns.
    """
    # Base columns
    energy_report_df_cols = [
        "timestamp",
        "grid_load",
        "grid_consumption",
        "mid_consumption",
    ] + single_components

    # Map component types to the columns they enable
    component_column_map = {
        "battery": ["battery_throughput"],
        "pv": [
            "pv_asset_production",
            "production_self_use",
            "grid_feed_in",
        ],
        "chp": ["chp_asset_production"],
    }

    # Define columns that require a battery plus at least one of PV, CHP, wind or EV
    pv_battery_cols = [
        "production_excess_in_bat",
        "production_self_share",
    ]

    # Add component-specific columns
    for component, columns in component_column_map.items():
        if component in component_types:
            energy_report_df_cols.extend(columns)

    # Add these columns if a battery and any of PV, CHP, wind or EV are present
    if (
        any(c in component_types for c in ["pv", "chp", "wind", "ev"])
        and "battery" in component_types
    ):
        energy_report_df_cols.extend(pv_battery_cols)

    return energy_report_df_cols

frequenz.lib.notebooks.reporting.utils.helpers.label_component_columns ¤

label_component_columns(
    df: DataFrame,
    mcfg: MicrogridConfig,
    column_battery: str = "battery",
    column_pv: str = "pv",
    column_chp: str = "chp",
    column_ev: str = "ev",
) -> tuple[DataFrame, list[str]]

Rename numeric single-component columns to labeled names.

Numeric string column names like "14" are converted to "Battery #14", "PV #14", "CHP #14" or "EV #14" based on the component IDs provided by mcfg.component_type_ids(...).

PARAMETER	DESCRIPTION
`df`	Input DataFrame with numeric string column names. TYPE: `DataFrame`
`mcfg`	Configuration with `_component_types_cfg` mapping component types to a `meter` iterable of numeric IDs. TYPE: `MicrogridConfig`
`column_battery`	Key name for battery component type. TYPE: `str` DEFAULT: `'battery'`
`column_pv`	Key name for PV component type. TYPE: `str` DEFAULT: `'pv'`
`column_chp`	Key name for CHP component type. TYPE: `str` DEFAULT: `'chp'`
`column_ev`	Key name for EV component type. TYPE: `str` DEFAULT: `'ev'`

RETURNS	DESCRIPTION
`tuple[DataFrame, list[str]]`	Tuple containing the renamed DataFrame and the list of applied labels.
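
The renaming rule can be sketched with plain Python containers. In this illustration the `ids_by_type` dictionary stands in for the IDs that the configuration object would provide; it, the column list, and `label_for` are hypothetical.

```python
# Column names as they might appear in a DataFrame; only digit-only names qualify.
columns = ["timestamp", "14", "21", "30"]

# Hypothetical stand-in for the component IDs supplied by the configuration.
ids_by_type = {"battery": {"14"}, "pv": {"21"}}


def label_for(component_type: str, column: str) -> str:
    """Build a label such as 'Battery #14' or 'PV #21'."""
    # Battery labels are capitalized; the other types are written in upper case.
    if component_type == "battery":
        prefix = component_type.capitalize()
    else:
        prefix = component_type.upper()
    return f"{prefix} #{column}"


rename = {
    col: label_for(ctype, col)
    for ctype, ids in ids_by_type.items()
    for col in columns
    if col.isdigit() and col in ids
}
print(rename)
```

Column "30" is numeric but not listed for any component type, so it keeps its original name.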

Source code in frequenz/lib/notebooks/reporting/utils/helpers.py

def label_component_columns(
    df: pd.DataFrame,
    mcfg: MicrogridConfig,
    column_battery: str = "battery",
    column_pv: str = "pv",
    column_chp: str = "chp",
    column_ev: str = "ev",
) -> tuple[pd.DataFrame, list[str]]:
    """Rename numeric single-component columns to labeled names.

    Numeric string column names like ``"14"`` are converted to
    ``"Battery #14"``, ``"PV #14"``, ``"CHP #14"`` or ``"EV #14"`` based on
    the component IDs provided by ``mcfg.component_type_ids(...)``

    Args:
        df: Input DataFrame with numeric string column names.
        mcfg: Configuration with ``_component_types_cfg`` mapping component types to a
            ``meter`` iterable of numeric IDs.
        column_battery: Key name for battery component type.
        column_pv: Key name for PV component type.
        column_chp: Key name for CHP component type.
        column_ev: Key name for EV component type.

    Returns:
        Tuple containing the renamed DataFrame and the list of applied labels.
    """
    # Numeric component columns present in df
    single_components = [str(c) for c in df.columns if str(c).isdigit()]
    available_types = set(mcfg.component_types())

    # From config (empty set if missing)
    def ids_if_available(t: str) -> set[str]:
        return (
            {str(x) for x in mcfg.component_type_ids(t)}
            if t in available_types
            else set()
        )

    battery_ids = ids_if_available(column_battery)
    pv_ids = ids_if_available(column_pv)
    chp_ids = ids_if_available(column_chp)
    ev_ids = ids_if_available(column_ev)

    rename: dict[str, str] = {}
    rename.update(
        {
            c: f"{column_battery.capitalize()} #{c}"
            for c in single_components
            if c in battery_ids
        }
    )
    rename.update(
        {c: f"{column_pv.upper()} #{c}" for c in single_components if c in pv_ids}
    )
    rename.update(
        {c: f"{column_ev.upper()} #{c}" for c in single_components if c in ev_ids}
    )
    rename.update(
        {c: f"{column_chp.upper()} #{c}" for c in single_components if c in chp_ids}
    )

    return df.rename(columns=rename), list(rename.values())

frequenz.lib.notebooks.reporting.utils.helpers.load_config ¤

load_config(path: str) -> dict[str, Any]

Load a YAML config file and return it as a dictionary.

PARAMETER	DESCRIPTION
`path`	Path to the YAML file. TYPE: `str`

RETURNS	DESCRIPTION
`dict[str, Any]`	Configuration values as a dictionary.

RAISES	DESCRIPTION
`TypeError`	If the YAML root element is not a mapping (dict).
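
The root-type check matters because `yaml.safe_load` returns whatever type the document's root element has. A short sketch, assuming PyYAML is installed; the keys in the sample documents are hypothetical.

```python
import yaml  # PyYAML

mapping_doc = "site: demo\ntimezone: Europe/Berlin\n"
list_doc = "- a\n- b\n"

mapping_root = yaml.safe_load(mapping_doc)
list_root = yaml.safe_load(list_doc)

# A mapping at the root yields a dict, which load_config accepts.
print(type(mapping_root).__name__)
# A sequence at the root yields a list, which load_config rejects with TypeError.
print(type(list_root).__name__)
```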

Source code in frequenz/lib/notebooks/reporting/utils/helpers.py

def load_config(path: str) -> dict[str, Any]:
    """
    Load a YAML config file and return it as a dictionary.

    Args:
        path: Path to the YAML file.

    Returns:
        Configuration values as a dictionary.

    Raises:
        TypeError: If the YAML root element is not a mapping (dict).
    """
    with open(path, "r", encoding="utf-8") as f:
        data = yaml.safe_load(f)

    if not isinstance(data, dict):
        raise TypeError(
            f"YAML root must be a mapping (dict), got {type(data).__name__}"
        )

    return data

frequenz.lib.notebooks.reporting.utils.helpers.long_to_wide ¤

long_to_wide(
    df: DataFrame,
    *,
    time_col: str | Index = "Timestamp",
    category_col: str | None = "Battery",
    value_col: str | None = "Battery Throughput",
    sum_col_name: str | None = None,
    aggfunc: str = "sum"
) -> DataFrame

Convert a long-format DataFrame into wide format with optional aggregation.

Transforms a long-format dataset (one row per timestamp-category pair) into a wide-format table, where each category becomes a separate column. A total (sum) column across all categories is appended; its name is controlled by `sum_col_name`.

PARAMETER	DESCRIPTION
`df`	Input DataFrame in long format. TYPE: `DataFrame`
`time_col`	Column name representing timestamps used as the index in the resulting wide table. Defaults to `"Timestamp"`. TYPE: `str \| Index` DEFAULT: `'Timestamp'`
`category_col`	Column name representing category labels that become column headers in the wide table. Defaults to `"Battery"`. TYPE: `str \| None` DEFAULT: `'Battery'`
`value_col`	Column name representing numeric values to aggregate and pivot into columns. Defaults to `"Battery Throughput"`. TYPE: `str \| None` DEFAULT: `'Battery Throughput'`
`sum_col_name`	Optional name for a new column containing the row-wise sum of all category columns. If None, defaults to `"<value_col> Sum"`. TYPE: `str \| None` DEFAULT: `None`
`aggfunc`	Aggregation function applied when multiple entries exist per timestamp-category pair (e.g., `"sum"`, `"mean"`). Defaults to `"sum"`. TYPE: `str` DEFAULT: `'sum'`

RETURNS	DESCRIPTION
`DataFrame`	A wide-format DataFrame with one row per timestamp, one column per category, and a total column holding the row-wise sum across all categories.
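
The pivot step can be reproduced with `DataFrame.pivot_table` on a small long-format table. The sample data below is hypothetical; the column names match the function's defaults.

```python
import pandas as pd

long_df = pd.DataFrame(
    {
        "Timestamp": pd.to_datetime(
            ["2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02"]
        ),
        "Battery": ["Battery #1", "Battery #2", "Battery #1", "Battery #2"],
        "Battery Throughput": [1.0, 2.0, 3.0, 4.0],
    }
)

# One row per timestamp, one column per battery.
wide = long_df.pivot_table(
    index="Timestamp",
    columns="Battery",
    values="Battery Throughput",
    aggfunc="sum",
).sort_index()
wide.columns.name = None  # drop the "Battery" axis label

# Row-wise total across all category columns.
wide["Battery Throughput Sum"] = wide.sum(axis=1, numeric_only=True)
print(wide)
```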

Source code in frequenz/lib/notebooks/reporting/utils/helpers.py

def long_to_wide(
    df: pd.DataFrame,
    *,
    time_col: str | pd.Index = "Timestamp",
    category_col: str | None = "Battery",
    value_col: str | None = "Battery Throughput",
    sum_col_name: str | None = None,
    aggfunc: str = "sum",
) -> pd.DataFrame:
    """Convert a long-format DataFrame into wide format with optional aggregation.

    Transforms a long-format dataset (one row per timestamp-category pair)
    into a wide-format table, where each category becomes a separate column.
    A total (sum) column across all categories is appended.

    Args:
        df: Input DataFrame in long format.
        time_col: Column name representing timestamps used as the index in
            the resulting wide table. Defaults to `"Timestamp"`.
        category_col: Column name representing category labels that become
            column headers in the wide table. Defaults to `"Battery"`.
        value_col: Column name representing numeric values to aggregate and
            pivot into columns. Defaults to `"Battery Throughput"`.
        sum_col_name: Optional name for a new column containing the row-wise sum
            of all category columns. If None, defaults to `"<value_col> Sum"`.
        aggfunc: Aggregation function applied when multiple entries exist per
            timestamp-category pair (e.g., `"sum"`, `"mean"`). Defaults to `"sum"`.

    Returns:
        A wide-format DataFrame with one row per timestamp, one column per category,
        and a total column holding the row-wise sum across all categories.
    """
    tmp = df.copy()

    wide = tmp.pivot_table(
        index=time_col,  # type: ignore [arg-type]
        columns=category_col,
        values=value_col,
        aggfunc=aggfunc,
    ).sort_index()

    wide.columns.name = None

    if sum_col_name is None:
        sum_col_name = f"{value_col} Sum"
    wide[sum_col_name] = wide.sum(axis=1, numeric_only=True)
    return wide

frequenz.lib.notebooks.reporting.utils.helpers.set_date_to_midnight ¤

set_date_to_midnight(
    input_date: date | datetime, timezone_name: str = "UTC"
) -> datetime

Return a timezone-aware datetime set to midnight of the given date.

Converts a date or datetime into a midnight timestamp localized to the specified timezone. If the input is already a datetime, only the date portion is used.

PARAMETER	DESCRIPTION
`input_date`	Date or datetime object to normalize to midnight. TYPE: `date \| datetime`
`timezone_name`	Name of the target timezone (e.g., "Europe/Berlin"). Defaults to "UTC". Falls back to UTC if the timezone name is invalid. TYPE: `str` DEFAULT: `'UTC'`

RETURNS	DESCRIPTION
`datetime`	A timezone-aware datetime object representing midnight of the given date in the specified timezone.
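
The same normalize-to-midnight idea, including the UTC fallback, can be sketched with only the standard library. Note that this swaps in `zoneinfo` for the `pytz` calls used in the source below; `midnight_in` is a hypothetical name and this is not the library's implementation.

```python
import warnings
from datetime import date, datetime, time, timezone, tzinfo
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def midnight_in(input_date: date, timezone_name: str = "UTC") -> datetime:
    """Return midnight of the given date as a timezone-aware datetime."""
    # datetime is a subclass of date, so reduce it to its date part first.
    if isinstance(input_date, datetime):
        input_date = input_date.date()

    tz: tzinfo
    try:
        tz = ZoneInfo(timezone_name)
    except ZoneInfoNotFoundError:
        warnings.warn(
            f"Unknown timezone '{timezone_name}', falling back to UTC.",
            RuntimeWarning,
        )
        tz = timezone.utc

    return datetime.combine(input_date, time.min, tzinfo=tz)


# An unknown timezone name triggers the warning and the UTC fallback.
fallback = midnight_in(datetime(2024, 3, 10, 15, 30), "Mars/Olympus")
print(fallback.isoformat())
```

The time-of-day of the input (15:30) is discarded, and the result is midnight in UTC because "Mars/Olympus" is not a known timezone.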

Source code in frequenz/lib/notebooks/reporting/utils/helpers.py

def set_date_to_midnight(
    input_date: date | datetime, timezone_name: str = "UTC"
) -> datetime:
    """Return a timezone-aware datetime set to midnight of the given date.

    Converts a date or datetime into a midnight timestamp localized to
    the specified timezone. If the input is already a datetime, only the
    date portion is used.

    Args:
        input_date: Date or datetime object to normalize to midnight.
        timezone_name: Name of the target timezone (e.g., "Europe/Berlin").
            Defaults to "UTC". Falls back to UTC if the timezone name
            is invalid.

    Returns:
        A timezone-aware datetime object representing midnight of the
        given date in the specified timezone.
    """
    if isinstance(input_date, datetime):
        input_date = input_date.date()

    try:
        tz = pytz.timezone(timezone_name)
    except pytz.UnknownTimeZoneError:
        warnings.warn(
            f"Unknown timezone '{timezone_name}', falling back to UTC.",
            RuntimeWarning,
        )
        tz = pytz.UTC

    return tz.localize(datetime.combine(input_date, time.min))