
fiber_feats

Extract collagen fiber features from an H&E image.

Note

This function extracts collagen fibers from the image and computes various metrics on the extracted fibers. Allowed metrics are:

- tortuosity
- average_turning_angle
- length
- major_axis_len
- minor_axis_len
- major_axis_angle
- minor_axis_angle
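
To make two of these metric names concrete, here is a minimal sketch of how tortuosity and average turning angle are commonly defined for a polyline. The definitions used here (path length divided by endpoint distance, and the mean absolute angle between consecutive segments) are assumptions for illustration; the helper names are hypothetical and histolytics may compute these metrics differently.

```python
import math

# Hypothetical helpers illustrating common textbook definitions.
# They are NOT taken from histolytics.

def polyline_length(points):
    """Sum of the Euclidean lengths of consecutive segments."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def tortuosity(points):
    """Path length divided by the straight-line distance between endpoints."""
    chord = math.dist(points[0], points[-1])
    if chord == 0:
        return float("inf")  # closed curve: the ratio is undefined
    return polyline_length(points) / chord

def average_turning_angle(points):
    """Mean absolute angle (degrees) between consecutive segments."""
    angles = []
    for a, b, c in zip(points, points[1:], points[2:]):
        v1 = (b[0] - a[0], b[1] - a[1])
        v2 = (c[0] - b[0], c[1] - b[1])
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        angles.append(abs(math.degrees(math.atan2(cross, dot))))
    return sum(angles) / len(angles) if angles else 0.0

straight = [(0, 0), (1, 0), (2, 0)]
zigzag = [(0, 0), (1, 1), (2, 0)]

print(tortuosity(straight), average_turning_angle(straight))
print(tortuosity(zigzag), average_turning_angle(zigzag))
```

Under these assumed definitions, a perfectly straight fiber has a tortuosity of 1 and a turning angle of 0, and both values grow as the fiber bends.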

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| img | ndarray | The input H&E image. Shape (H, W, 3). | required |
| metrics | Tuple[str] | The metrics to compute. Options are: "tortuosity", "average_turning_angle", "length", "major_axis_len", "minor_axis_len", "major_axis_angle", "minor_axis_angle". | required |
| label | ndarray | The nuclei binary or label mask. Shape (H, W). Used to mask out the nuclei when extracting collagen fibers. If None, the entire image is used. | None |
| mask | ndarray | Binary mask to restrict the region of interest. Shape (H, W). For example, it can be used to mask out tissues that are not of interest. | None |
| normalize | bool | Whether to apply column-wise (quantile) normalization to the computed metrics. | False |
| rm_bg | bool | Whether to remove the background component from the edges. | False |
| rm_fg | bool | Whether to remove the foreground component from the edges. | False |
| device | str | Device to use for collagen extraction. Options are 'cpu' or 'cuda'. If set to 'cuda', CuPy and cucim are used for GPU acceleration. This affects only the collagen extraction step, not the metric computation. | 'cpu' |
| num_processes | int | The number of processes to use when converting the fibers to a GeoDataFrame. If -1, all available processes are used. Ignored if return_edges is False. | 1 |
| reset_uid | bool | Whether to reset the UID of the extracted fibers. If False, the original UIDs are preserved. The docstring gives the default as True, but the signature in the source listing does not include this parameter. | True |
| return_edges | bool | Whether to return the extracted edges as a GeoDataFrame. | False |
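
As an illustration of what the `normalize` option might do to a single metric column, the sketch below maps each value to its empirical quantile. This is an assumption about what "column (quantile) normalize" means; the helper name `quantile_normalize` is hypothetical and the library's own `col_norm` may be implemented differently.

```python
import bisect

def quantile_normalize(values):
    """Map each value to the fraction of values that are <= it.

    Hypothetical helper, not the histolytics implementation.
    The result lies in (0, 1] and tied values share the same quantile.
    """
    ordered = sorted(values)
    n = len(ordered)
    return [bisect.bisect_right(ordered, v) / n for v in values]

# One metric column, e.g. raw fiber lengths
lengths = [10.0, 30.0, 20.0, 30.0]
normalized = quantile_normalize(lengths)
print(normalized)  # [0.25, 1.0, 0.5, 1.0]
```

A rank-based mapping like this puts metrics measured in different units (pixels, degrees, ratios) on a comparable scale, which would be consistent with the example output below, where all values fall between 0 and 1.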

Returns:

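| Type | Description |
| --- | --- |
| GeoDataFrame | If return_edges is True, a GeoDataFrame containing the extracted collagen fibers as LineString geometries and the computed metrics as columns. Otherwise, only the computed metrics are returned, without geometries. |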

Examples:

>>> from histolytics.data import hgsc_stroma_he, hgsc_stroma_nuclei
>>> from histolytics.stroma_feats.collagen import fiber_feats
>>> from histolytics.utils.raster import gdf2inst
>>>
>>> # Load example image and nuclei annotation
>>> img = hgsc_stroma_he()
>>> label = gdf2inst(hgsc_stroma_nuclei(), width=1500, height=1500)
>>>
>>> # Extract fiber features
>>> edge_gdf = fiber_feats(
...    img,
...    label=label,
...    metrics=("length", "tortuosity", "average_turning_angle"),
...    device="cpu",
...    num_processes=4,
...    normalize=True,
...    return_edges=True,
... )
>>> print(edge_gdf.head(3))
        uid  class_name                                           geometry  \
    0    1           1  LINESTRING (29.06525 26.95506, 29.03764 26.844...
    1    2           1  MULTILINESTRING ((69.19964 89.83999, 69.01369 ...
    2    3           1  MULTILINESTRING ((51.54728 1.36606, 51.67797 1...
        length  tortuosity  average_turning_angle
    0  0.450252    0.372026               0.294881
    1  0.977289    0.643115               0.605263
    2  0.700793    0.661500               0.560562
Source code in src/histolytics/stroma_feats/collagen.py
def fiber_feats(
    img: np.ndarray,
    metrics: Tuple[str],
    label: np.ndarray = None,
    mask: np.ndarray = None,
    normalize: bool = False,
    rm_bg: bool = False,
    rm_fg: bool = False,
    device: str = "cpu",
    num_processes: int = 1,
    return_edges: bool = False,
) -> gpd.GeoDataFrame:
    """Extract collagen fiber features from an H&E image.

    Note:
        This function extracts collagen fibers from the image and computes various metrics
        on the extracted fibers. Allowed metrics are:

            - tortuosity
            - average_turning_angle
            - length
            - major_axis_len
            - minor_axis_len
            - major_axis_angle
            - minor_axis_angle

    Parameters:
        img (np.ndarray):
            The input H&E image. Shape (H, W, 3).
        metrics (Tuple[str]):
            The metrics to compute. Options are:
                - "tortuosity"
                - "average_turning_angle"
                - "length"
                - "major_axis_len"
                - "minor_axis_len"
                - "major_axis_angle"
                - "minor_axis_angle"
        label (np.ndarray):
            The nuclei binary or label mask. Shape (H, W). This is used to mask out the
            nuclei when extracting collagen fibers. If None, the entire image is used.
        mask (np.ndarray):
            Binary mask to restrict the region of interest. Shape (H, W). For example,
            it can be used to mask out tissues that are not of interest.
        normalize (bool):
            Whether to apply column-wise (quantile) normalization to the computed metrics.
        rm_bg (bool):
            Whether to remove the background component from the edges.
        rm_fg (bool):
            Whether to remove the foreground component from the edges.
        device (str):
            Device to use for collagen extraction. Options are 'cpu' or 'cuda'. If set to
            'cuda', CuPy and cucim will be used for GPU acceleration. This affects only
            the collagen extraction step, not the metric computation.
        num_processes (int):
            The number of processes when converting to GeoDataFrame. If -1, all
            available processes will be used. Default is 1. Ignored if return_edges is False.
        reset_uid (bool):
            Whether to reset the UID of the extracted fibers. Default is True. If False,
            the original UIDs will be preserved. Note: this parameter is not part of
            the signature above.
        return_edges (bool):
            Whether to return the extracted edges as a GeoDataFrame. Default is False.

    Returns:
        gpd.GeoDataFrame:
            If `return_edges` is True, a GeoDataFrame containing the extracted collagen
            fibers as LineString geometries and the computed metrics as columns.
            Otherwise, only the computed metrics are returned, without geometries.

    Examples:
        >>> from histolytics.data import hgsc_stroma_he, hgsc_stroma_nuclei
        >>> from histolytics.stroma_feats.collagen import fiber_feats
        >>> from histolytics.utils.raster import gdf2inst
        >>>
        >>> # Load example image and nuclei annotation
        >>> img = hgsc_stroma_he()
        >>> label = gdf2inst(hgsc_stroma_nuclei(), width=1500, height=1500)
        >>>
        >>> # Extract fiber features
        >>> edge_gdf = fiber_feats(
        ...    img,
        ...    label=label,
        ...    metrics=("length", "tortuosity", "average_turning_angle"),
        ...    device="cpu",
        ...    num_processes=4,
        ...    normalize=True,
        ...    return_edges=True,
        ... )
        >>> print(edge_gdf.head(3))
                uid  class_name                                           geometry  \
            0    1           1  LINESTRING (29.06525 26.95506, 29.03764 26.844...
            1    2           1  MULTILINESTRING ((69.19964 89.83999, 69.01369 ...
            2    3           1  MULTILINESTRING ((51.54728 1.36606, 51.67797 1...
                length  tortuosity  average_turning_angle
            0  0.450252    0.372026               0.294881
            1  0.977289    0.643115               0.605263
            2  0.700793    0.661500               0.560562
    """
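    # Extract the collagen fiber edges from the image
    edges = extract_collagen_fibers(
        img, label=label, mask=mask, device=device, rm_bg=rm_bg, rm_fg=rm_fg
    )
    # Label the edges so that each fiber gets its own integer id
    labeled_edges = sklabel(edges)

    # Only one unique value means no fibers were found: return an empty frame
    if len(np.unique(labeled_edges)) <= 1:
        return gpd.GeoDataFrame(columns=["uid", "class_name", "geometry", *metrics])

    # Compute the requested metrics for each labeled fiber
    feat_df = _compute_fiber_feats(labeled_edges, metrics)

    # Optionally normalize each metric column
    if normalize:
        feat_df = feat_df.apply(col_norm)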
    # Convert labeled edges to GeoDataFrame
    if return_edges:
        edge_gdf = inst2gdf(dilation(labeled_edges))
        edge_gdf = edge_gdf.merge(feat_df, left_on="uid", right_index=True)
        edge_gdf["geometry"] = gdf_apply(
            edge_gdf,
            _get_medial_smooth,
            columns=["geometry"],
            parallel=num_processes > 1,
            num_processes=num_processes,
        )
        edge_gdf = edge_gdf.assign(class_name="collagen")
        return (
            edge_gdf.sort_values(by="uid")
            .set_index("uid", verify_integrity=True, drop=True)
            .reset_index(drop=True)
        )

    return feat_df
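
The labeling step and the early-return guard in the listing above can be illustrated with a small pure-Python sketch. It assumes that `sklabel` behaves like a connected-component labeling routine that leaves the background at 0 and gives each connected group of edge pixels its own positive integer; `label_components` and `has_fibers` are hypothetical names used only for this illustration.

```python
from collections import deque

def label_components(binary):
    """Label 8-connected foreground components; background stays 0.

    Hypothetical stand-in for a connected-component labeling routine.
    """
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for r in range(h):
        for c in range(w):
            if not binary[r][c] or labels[r][c]:
                continue
            # Start a new component and flood-fill it breadth-first
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (
                            0 <= ny < h
                            and 0 <= nx < w
                            and binary[ny][nx]
                            and not labels[ny][nx]
                        ):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels

def has_fibers(labels):
    """Mirror the guard in the listing: more than one unique value is required."""
    return len({v for row in labels for v in row}) > 1

edges = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
labeled = label_components(edges)
print(labeled)              # [[1, 1, 0, 0], [0, 0, 0, 2], [0, 0, 0, 2]]
print(has_fibers(labeled))  # True

empty = label_components([[0, 0], [0, 0]])
print(has_fibers(empty))    # False
```

With labels of this kind, each positive integer can serve as the fiber identifier that the metrics and the geometries are later joined on.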