fiber_feats

Extract collagen fiber features from an H&E image.

Note

This function extracts collagen fibers from the image and computes various metrics on the extracted fibers. Allowed metrics are:

- tortuosity
- average_turning_angle
- length
- major_axis_len
- minor_axis_len
- major_axis_angle
- minor_axis_angle
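
The page does not define the metrics listed above, so the following is only a minimal sketch of two of them on a single polyline. It assumes tortuosity is path length divided by the straight-line distance between the endpoints, and that the average turning angle is the mean absolute angle (in degrees) between consecutive segments. The library's own definitions and units may differ, and the helper names here (`polyline_length`, `tortuosity`, `average_turning_angle`) are hypothetical, not part of histolytics.

```python
import math


def polyline_length(points):
    """Sum of Euclidean segment lengths along the polyline."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def tortuosity(points):
    """Assumed definition: path length / end-to-end (chord) distance."""
    chord = math.dist(points[0], points[-1])
    if chord == 0:
        return float("inf")  # closed loop: the ratio is undefined, report inf
    return polyline_length(points) / chord


def average_turning_angle(points):
    """Assumed definition: mean absolute angle between consecutive segments."""
    angles = []
    for a, b, c in zip(points, points[1:], points[2:]):
        v1 = (b[0] - a[0], b[1] - a[1])
        v2 = (c[0] - b[0], c[1] - b[1])
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        angles.append(abs(math.degrees(math.atan2(cross, dot))))
    return sum(angles) / len(angles) if angles else 0.0


straight = [(0, 0), (1, 0), (2, 0)]  # no bends
bent = [(0, 0), (1, 0), (1, 1)]      # one right-angle bend

print(tortuosity(straight), average_turning_angle(straight))
print(tortuosity(bent), average_turning_angle(bent))
```

Under these assumed definitions a perfectly straight fiber has tortuosity 1 and turning angle 0, and both values grow as the fiber becomes more winding.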

Parameters:

Name	Type	Description	Default
`img`	`ndarray`	The input H&E image. Shape (H, W, 3).	required
`metrics`	`Tuple[str]`	The metrics to compute. Options are: "tortuosity", "average_turning_angle", "length", "major_axis_len", "minor_axis_len", "major_axis_angle", "minor_axis_angle".	required
`label`	`ndarray`	The nuclei binary or label mask. Shape (H, W). This is used to mask out the nuclei when extracting collagen fibers. If None, the entire image is used.	`None`
`mask`	`ndarray`	Binary mask to restrict the region of interest. Shape (H, W). For example, it can be used to mask out tissues that are not of interest.	`None`
`normalize`	`bool`	Whether to apply column-wise (quantile) normalization to the computed metrics.	`False`
`rm_bg`	`bool`	Whether to remove the background component from the edges.	`False`
`rm_fg`	`bool`	Whether to remove the foreground component from the edges.	`False`
`device`	`str`	Device to use for collagen extraction. Options are 'cpu' or 'cuda'. If set to 'cuda', CuPy and cucim will be used for GPU acceleration. This affects only the collagen extraction step, not the metric computation.	`'cpu'`
`num_processes`	`int`	The number of processes when converting to GeoDataFrame. If -1, all available processes will be used. Default is 1. Ignored if return_edges is False.	`1`
`reset_uid`	`bool`	Listed in the docstring, but not a parameter of the function signature shown in the source listing; passing it raises a TypeError.	not accepted
`return_edges`	`bool`	Whether to return the extracted edges as a GeoDataFrame. Default is False.	`False`
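
The `normalize` flag is described only as column (quantile) normalization, and the internals of `col_norm` are not shown on this page. As a rough illustration of the idea, the sketch below replaces each value in one metric column with its empirical quantile, so the result lies in (0, 1]. This is an assumption about what quantile normalization means here, and `quantile_normalize` is a hypothetical helper, not the library's implementation.

```python
from bisect import bisect_right


def quantile_normalize(values):
    """Map each value to the fraction of values in the column that are <= it."""
    ordered = sorted(values)
    n = len(ordered)
    return [bisect_right(ordered, v) / n for v in values]


# One metric column, e.g. raw fiber lengths in pixels (made-up numbers).
lengths = [12.0, 3.5, 40.2, 7.1]
print(quantile_normalize(lengths))  # [0.75, 0.25, 1.0, 0.5]
```

A rank-based mapping like this makes metrics with very different scales comparable and is insensitive to outliers, which is one plausible reason to normalize each column independently.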

Returns:

Type	Description
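`GeoDataFrame`	If return_edges is True, a GeoDataFrame containing the extracted collagen fibers as LineString (or MultiLineString) geometries, with the computed metrics as columns. If return_edges is False, a DataFrame holding only the computed metrics, indexed by fiber UID. If no fibers are found, an empty GeoDataFrame with the expected columns.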

Examples:

>>> from histolytics.data import hgsc_stroma_he, hgsc_stroma_nuclei
>>> from histolytics.utils.raster import gdf2inst
>>> from histolytics.stroma_feats.collagen import fiber_feats
>>>
>>> # Load example image and nuclei annotation
>>> img = hgsc_stroma_he()
>>> label = gdf2inst(hgsc_stroma_nuclei(), width=1500, height=1500)
>>>
>>> # Extract fiber features
>>> edge_gdf = fiber_feats(
...    img,
...    label=label,
...    metrics=("length", "tortuosity", "average_turning_angle"),
...    device="cpu",
...    num_processes=4,
...    normalize=True,
...    return_edges=True,
... )
>>> print(edge_gdf.head(3))
        uid  class_name                                           geometry
    0    1           1  LINESTRING (29.06525 26.95506, 29.03764 26.844...
    1    2           1  MULTILINESTRING ((69.19964 89.83999, 69.01369 ...
    2    3           1  MULTILINESTRING ((51.54728 1.36606, 51.67797 1...
        length  tortuosity  average_turning_angle
    0  0.450252    0.372026               0.294881
    1  0.977289    0.643115               0.605263
    2  0.700793    0.661500               0.560562

Source code in src/histolytics/stroma_feats/collagen.py

def fiber_feats(
    img: np.ndarray,
    metrics: Tuple[str],
    label: np.ndarray = None,
    mask: np.ndarray = None,
    normalize: bool = False,
    rm_bg: bool = False,
    rm_fg: bool = False,
    device: str = "cpu",
    num_processes: int = 1,
    return_edges: bool = False,
) -> gpd.GeoDataFrame:
    """Extract collagen fiber features from an H&E image.

    Note:
        This function extracts collagen fibers from the image and computes various metrics
        on the extracted fibers. Allowed metrics are:

            - tortuosity
            - average_turning_angle
            - length
            - major_axis_len
            - minor_axis_len
            - major_axis_angle
            - minor_axis_angle

    Parameters:
        img (np.ndarray):
            The input H&E image. Shape (H, W, 3).
        metrics (Tuple[str]):
            The metrics to compute. Options are:
                - "tortuosity"
                - "average_turning_angle"
                - "length"
                - "major_axis_len"
                - "minor_axis_len"
                - "major_axis_angle"
                - "minor_axis_angle"
        label (np.ndarray):
            The nuclei binary or label mask. Shape (H, W). This is used to mask out the
            nuclei when extracting collagen fibers. If None, the entire image is used.
        mask (np.ndarray):
            Binary mask to restrict the region of interest. Shape (H, W). For example,
            it can be used to mask out tissues that are not of interest.
        normalize (bool):
            Whether to apply column-wise (quantile) normalization to the computed metrics.
        rm_bg (bool):
            Whether to remove the background component from the edges.
        rm_fg (bool):
            Whether to remove the foreground component from the edges.
        device (str):
            Device to use for collagen extraction. Options are 'cpu' or 'cuda'. If set to
            'cuda', CuPy and cucim will be used for GPU acceleration. This affects only
            the collagen extraction step, not the metric computation.
        num_processes (int):
            The number of processes when converting to GeoDataFrame. If -1, all
            available processes will be used. Default is 1. Ignored if return_edges is False.
        reset_uid (bool):
            Not a parameter of the current signature; passing it raises a TypeError.
        return_edges (bool):
            Whether to return the extracted edges as a GeoDataFrame. Default is False.

    Returns:
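        gpd.GeoDataFrame:
            If return_edges is True, a GeoDataFrame containing the extracted collagen
            fibers as LineString (or MultiLineString) geometries, with the computed
            metrics as columns. If return_edges is False, a DataFrame holding only the
            computed metrics, indexed by fiber UID. If no fibers are found, an empty
            GeoDataFrame with the expected columns.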

    Examples:
        >>> from histolytics.data import hgsc_stroma_he, hgsc_stroma_nuclei
        >>> from histolytics.utils.raster import gdf2inst
        >>> from histolytics.stroma_feats.collagen import fiber_feats
        >>>
        >>> # Load example image and nuclei annotation
        >>> img = hgsc_stroma_he()
        >>> label = gdf2inst(hgsc_stroma_nuclei(), width=1500, height=1500)
        >>>
        >>> # Extract fiber features
        >>> edge_gdf = fiber_feats(
        ...    img,
        ...    label=label,
        ...    metrics=("length", "tortuosity", "average_turning_angle"),
        ...    device="cpu",
        ...    num_processes=4,
        ...    normalize=True,
        ...    return_edges=True,
        ... )
        >>> print(edge_gdf.head(3))
                uid  class_name                                           geometry
            0    1           1  LINESTRING (29.06525 26.95506, 29.03764 26.844...
            1    2           1  MULTILINESTRING ((69.19964 89.83999, 69.01369 ...
            2    3           1  MULTILINESTRING ((51.54728 1.36606, 51.67797 1...
                length  tortuosity  average_turning_angle
            0  0.450252    0.372026               0.294881
            1  0.977289    0.643115               0.605263
            2  0.700793    0.661500               0.560562
    """
    edges = extract_collagen_fibers(
        img, label=label, mask=mask, device=device, rm_bg=rm_bg, rm_fg=rm_fg
    )
    labeled_edges = sklabel(edges)

    if len(np.unique(labeled_edges)) <= 1:
        return gpd.GeoDataFrame(columns=["uid", "class_name", "geometry", *metrics])

    feat_df = _compute_fiber_feats(labeled_edges, metrics)

    if normalize:
        feat_df = feat_df.apply(col_norm)

    # Convert labeled edges to GeoDataFrame
    if return_edges:
        edge_gdf = inst2gdf(dilation(labeled_edges))
        edge_gdf = edge_gdf.merge(feat_df, left_on="uid", right_index=True)
        edge_gdf["geometry"] = gdf_apply(
            edge_gdf,
            _get_medial_smooth,
            columns=["geometry"],
            parallel=num_processes > 1,
            num_processes=num_processes,
        )
        edge_gdf = edge_gdf.assign(class_name="collagen")
        return (
            edge_gdf.sort_values(by="uid")
            .set_index("uid", verify_integrity=True, drop=True)
            .reset_index(drop=True)
        )

    return feat_df
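
The listing above labels the binary edge map with `sklabel` and returns early when `np.unique(labeled_edges)` holds at most one value, i.e. when no separate fibers were found. The sketch below illustrates that labeling step and the early-exit check on a tiny edge map in plain Python. It assumes 8-connectivity and is not the routine the library calls; `label_components` and `has_fibers` are hypothetical names introduced for illustration.

```python
from collections import deque


def label_components(binary):
    """Label 8-connected foreground pixels 1..N; background stays 0."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    current = 0
    for r in range(h):
        for c in range(w):
            if not binary[r][c] or labels[r][c]:
                continue  # background, or already assigned to a component
            current += 1
            labels[r][c] = current
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of this component
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (
                            0 <= ny < h
                            and 0 <= nx < w
                            and binary[ny][nx]
                            and not labels[ny][nx]
                        ):
                            labels[ny][nx] = current
                            queue.append((ny, nx))
    return labels


def has_fibers(labels):
    """Mirror of the `len(np.unique(...)) <= 1` guard: more than one distinct value."""
    return len({v for row in labels for v in row}) > 1


edges = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
labels = label_components(edges)
print(labels)              # [[1, 1, 0, 0], [0, 0, 0, 2], [0, 0, 0, 2]]
print(has_fibers(labels))  # True
```

Each positive label corresponds to one candidate fiber, which is why the metric table can later be joined to the geometries on `uid`.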