ripley_test

Run a Ripley alphabet test on a GeoDataFrame.

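Simulates `n_sim` random Poisson point patterns, each with the same number of points as the observed pattern and drawn inside the selected hull, and computes the Ripley alphabet function values for both the observed pattern and the simulated patterns.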

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| gdf | GeoDataFrame | A GeoDataFrame containing the segmented objects. | required |
| distances | ndarray | An array of distances at which to compute the Ripley alphabet function. | required |
| ripley_alphabet | str | The Ripley alphabet statistic to compute. Must be one of "k", "g", or "l". | 'g' |
| n_sim | int | The number of simulations to perform for the random point process. | 100 |
| hull_type | str | The type of hull to use for the Ripley test. Options are "convex_hull", "alpha_shape", "ellipse", or "bbox". | 'bbox' |

Returns:

Type: tuple[ndarray, ndarray, ndarray]

A tuple containing:

- ripley_stat: The observed Ripley alphabet function values.
- sims: An array of simulated Ripley alphabet function values.
- pvalues: An array of p-values for the observed Ripley alphabet function values.
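
One common way to use the returned `sims` array is to build a pointwise simulation envelope and check where the observed statistic leaves it. The sketch below is not part of histolytics. It uses synthetic stand-in arrays (all values are made up) and assumes `sims` has one row per simulation, which is how the source listing fills it (`sims[i_repl] = ripley_sim_i`). The 95% level is likewise an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the outputs of ripley_test (hypothetical values).
n_sim, n_dist = 100, 10
distances = np.linspace(0, 100, n_dist)
sims = np.sort(rng.random((n_sim, n_dist)), axis=1)  # one row per simulation
ripley_stat = np.linspace(0, 1, n_dist)

# Pointwise 95% envelope: percentiles taken across simulations (axis 0).
lower, upper = np.percentile(sims, [2.5, 97.5], axis=0)

# Boolean mask of distances where the observed value leaves the envelope.
outside = (ripley_stat < lower) | (ripley_stat > upper)
print(distances[outside])
```

With real data, `ripley_stat` and `sims` would come from the call to `ripley_test` instead of the random generator.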

Examples:

>>> from histolytics.data import hgsc_cancer_nuclei
>>> from histolytics.spatial_clust.ripley import ripley_test
>>> import numpy as np
>>>
>>> # Load the HGSC cancer nuclei dataset
>>> nuc = hgsc_cancer_nuclei()
>>> neo = nuc[nuc["class_name"] == "neoplastic"]
>>>
>>> distances = np.linspace(0, 100, 10)
>>>
>>> # Run the Ripley G test for the neoplastic nuclei
>>> ripley_stat, sims, pvalues = ripley_test(
...     neo,
...     distances=distances,
...     ripley_alphabet="g",
...     n_sim=100,
...     hull_type="bbox",
... )
>>>
>>> print(pvalues)
    [0.         0.         0.         0.00990099 0.00990099 0.01980198
    0.04950495 0.0990099  0.17821782 0.        ]
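
As a follow-up to the example, the p-values can be matched back to the distances at which they were computed. This minimal sketch copies the printed p-values from the output above and uses a threshold of 0.05, which is an assumption made here and not something the library prescribes.

```python
import numpy as np

# Same support as in the example above.
distances = np.linspace(0, 100, 10)

# p-values copied from the printed example output.
pvalues = np.array([
    0.0, 0.0, 0.0, 0.00990099, 0.00990099,
    0.01980198, 0.04950495, 0.0990099, 0.17821782, 0.0,
])

alpha = 0.05  # hypothetical significance threshold

# Distances whose p-value falls below the threshold.
significant = distances[pvalues < alpha]
print(significant)
```

The entries at the two ends of the support are worth checking separately, since a statistic that takes the same value for every pattern at those distances carries little information.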
Source code in src/histolytics/spatial_clust/ripley.py
def ripley_test(
    gdf: gpd.GeoDataFrame,
    distances: np.ndarray,
    ripley_alphabet: str = "g",
    n_sim: int = 100,
    hull_type: str = "bbox",
) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
    """Run a Ripley alphabet test on a GeoDataFrame.

    Simulates a random Poisson point process and computes the Ripley alphabet function
    values for both the observed pattern and the simulated patterns.

    Parameters:
        gdf (gpd.GeoDataFrame):
            A GeoDataFrame containing the segmented objects.
        distances (np.ndarray):
            An array of distances at which to compute the Ripley alphabet function.
        ripley_alphabet (str):
            The Ripley alphabet statistic to compute. Must be one of "k", "g", or "l".
        n_sim (int):
            The number of simulations to perform for the random point process.
        hull_type (str):
            The type of hull to use for the Ripley test. Options are "convex_hull",
            "alpha_shape", "ellipse", or "bbox".

    Returns:
        tuple[np.ndarray, np.ndarray, np.ndarray]:
            A tuple containing:
            - ripley_stat: The observed Ripley alphabet function values.
            - sims: An array of simulated Ripley alphabet function values.
            - pvalues: An array of p-values for the observed Ripley alphabet function values.

    Examples:
        >>> from histolytics.data import hgsc_cancer_nuclei
        >>> from histolytics.spatial_clust.ripley import ripley_test
        >>> import numpy as np
        >>>
        >>> # Load the HGSC cancer nuclei dataset
        >>> nuc = hgsc_cancer_nuclei()
        >>> neo = nuc[nuc["class_name"] == "neoplastic"]
        >>>
        >>> distances = np.linspace(0, 100, 10)
        >>>
        >>> # Run the Ripley G test for the neoplastic nuclei
        >>> ripley_stat, sims, pvalues = ripley_test(
        ...     neo,
        ...     distances=distances,
        ...     ripley_alphabet="g",
        ...     n_sim=100,
        ...     hull_type="bbox",
        ... )
        >>>
        >>> print(pvalues)
            [0.         0.         0.         0.00990099 0.00990099 0.01980198
            0.04950495 0.0990099  0.17821782 0.        ]

    """
    coords = get_centroid_numpy(gdf)
    n_obs = len(coords)
    _hull = hull(coords, hull_type)

    # compute the observed Ripley alphabet function values
    ripley_stat = RIPLEY_ALPHABET[ripley_alphabet](
        coords, support=distances, hull_poly=_hull, dist_metric="euclidean"
    )

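    # simulate the Ripley alphabet function values for the random point process
    # (one row per simulation, one column per entry of ripley_stat)
    sims = np.empty((n_sim, len(ripley_stat)))
    # start the count at 1 so the observed pattern is part of the reference set
    pvalues = np.ones_like(ripley_stat, dtype=float)
    for i_repl in range(n_sim):
        random_i = poisson(coords, n_obs=n_obs, hull_poly=_hull)
        ripley_sim_i = RIPLEY_ALPHABET[ripley_alphabet](
            random_i, support=distances, hull_poly=_hull, dist_metric="euclidean"
        )
        sims[i_repl] = ripley_sim_i
        # count the simulations whose value is at least the observed value
        pvalues += ripley_sim_i >= ripley_stat

    # convert the counts to upper-tail p-values, then fold to the smaller tail
    pvalues /= n_sim + 1
    pvalues = np.minimum(pvalues, 1 - pvalues)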

    return ripley_stat, sims, pvalues
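
The p-value arithmetic at the end of the function can be reproduced on small arrays. The sketch below is a standalone re-implementation of those few lines for illustration only. The helper name `folded_mc_pvalues` and the toy numbers are hypothetical and not part of histolytics.

```python
import numpy as np


def folded_mc_pvalues(observed, sims):
    """Monte Carlo p-values folded to the smaller tail (hypothetical helper)."""
    observed = np.asarray(observed, dtype=float)
    sims = np.asarray(sims, dtype=float)  # shape: (n_sim, n_distances)
    n_sim = sims.shape[0]

    # Start at 1 so the observed pattern counts as one draw.
    n_ge = 1 + (sims >= observed).sum(axis=0)
    p_upper = n_ge / (n_sim + 1)

    # Fold: report the smaller of the two tail probabilities.
    return np.minimum(p_upper, 1 - p_upper)


observed = np.array([0.0, 0.5, 2.0])
sims = np.array([
    [0.0, 0.1, 1.0],
    [0.0, 0.9, 1.5],
    [0.0, 0.4, 3.0],
])
print(folded_mc_pvalues(observed, sims))  # [0.  0.5 0.5]
```

In the first column every simulated value ties with the observed one, so the upper-tail p-value is 1 and the folded value is 0. This may be what produces the zeros at the first and last distances in the example output above, where the G function is likely identical for the observed and simulated patterns, so those entries should be read with care.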