global_autocorr

Run global spatial autocorrelation for a GeoDataFrame.

Note

This is a wrapper around `esda.Moran` from the `esda` package that returns only two of its results: the global Moran's I statistic and the permutation-based pseudo p-value.

Parameters:

Name	Type	Description	Default
`gdf`	`GeoDataFrame`	The GeoDataFrame containing the spatial data.	required
`w`	`W`	The spatial weights object.	required
`feat`	`str`	The feature column to analyze.	required
`permutations`	`int`	Number of random permutations used to compute the pseudo p-value.	`999`
`num_processes`	`int`	Accepted by the signature but not used by the implementation shown in the source below.	`1`

Returns:

Type	Description
`Tuple[float, float]`	A tuple `(I, p_sim)`, where `I` is the global Moran's I statistic and `p_sim` is the pseudo p-value obtained from the random permutations.
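
To make the two returned values concrete, the following is a minimal NumPy sketch of how a global Moran's I and a permutation-based pseudo p-value can be computed. It is not the `esda` implementation: the function names `morans_i` and `permutation_p_value` are hypothetical, the weights are a small dense matrix rather than a `libpysal.weights.W` object, no row-standardization is applied, and folding the count onto the smaller tail is an assumption about how such a pseudo p-value is usually formed.

```python
import numpy as np


def morans_i(values: np.ndarray, weights: np.ndarray) -> float:
    """Global Moran's I for a dense (n, n) weight matrix with a zero diagonal."""
    z = values - values.mean()          # deviations from the mean
    n = len(values)
    s0 = weights.sum()                  # sum of all weights
    return float((n / s0) * (z @ weights @ z) / (z @ z))


def permutation_p_value(values, weights, permutations=999, seed=0):
    """Return (I, pseudo p-value) by shuffling values over the fixed weights."""
    rng = np.random.default_rng(seed)
    observed = morans_i(values, weights)
    sims = np.array(
        [morans_i(rng.permutation(values), weights) for _ in range(permutations)]
    )
    # Count simulated statistics at least as large as the observed one,
    # then fold onto the smaller tail (assumption, see the text above).
    larger = int((sims >= observed).sum())
    larger = min(larger, permutations - larger)
    return observed, (larger + 1) / (permutations + 1)


# Toy data: 20 locations on a line, each connected to its direct neighbours.
n = 20
chain = np.zeros((n, n))
idx = np.arange(n - 1)
chain[idx, idx + 1] = 1.0
chain[idx + 1, idx] = 1.0

# Values that increase smoothly along the line are positively autocorrelated.
smooth = np.arange(n, dtype=float)
i_stat, p_sim = permutation_p_value(smooth, chain, permutations=999)
print(i_stat, p_sim)
```

In this sketch the smallest attainable pseudo p-value is `1 / (permutations + 1)`, so raising `permutations` refines the resolution of the p-value at the cost of more computation.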

Examples:

>>> from histolytics.data import hgsc_cancer_nuclei
>>> from histolytics.spatial_clust.autocorr import global_autocorr
>>> from histolytics.spatial_graph.graph import fit_graph
>>> from histolytics.utils.gdf import set_uid
>>> from histolytics.spatial_geom.shape_metrics import shape_metric
>>> # Load the HGSC cancer nuclei dataset
>>> nuc = hgsc_cancer_nuclei()
>>> neo = nuc[nuc["class_name"] == "neoplastic"]
>>> neo = set_uid(neo)
>>> neo = shape_metric(neo, ["area"])
>>> # Fit a spatial graph to the neoplastic nuclei
>>> w, _ = fit_graph(neo, "distband", threshold=100)
>>> # Calculate global Moran's I for the area feature
>>> moran_i, pval = global_autocorr(neo, w, feat="area")
>>> print(moran_i, pval)
0.00834165971467421 0.318

Source code in src/histolytics/spatial_clust/autocorr.py

def global_autocorr(
    gdf: gpd.GeoDataFrame,
    w: libpysal.weights.W,
    feat: str,
    permutations: int = 999,
    num_processes: int = 1,
) -> Tuple[float, float]:
    """Run global spatial autocorrelation for a GeoDataFrame.

    Note:
        This is a wrapper around `esda.Moran` from the `esda` package that
        returns only two of its results: the global Moran's I statistic and
        the permutation-based pseudo p-value.

    Parameters:
        gdf (gpd.GeoDataFrame):
            The GeoDataFrame containing the spatial data.
        w (libpysal.weights.W):
            The spatial weights object.
        feat (str):
            The feature column to analyze.
        permutations (int):
            Number of random permutations used to compute the pseudo p-value.
        num_processes (int):
            Accepted by the signature but not used by this implementation.

    Returns:
        Tuple[float, float]:
            A tuple containing:
            - I: Global Moran's I statistic.
            - p_sim: Pseudo p-value obtained from the random permutations.

    Examples:
        >>> from histolytics.data import hgsc_cancer_nuclei
        >>> from histolytics.spatial_clust.autocorr import global_autocorr
        >>> from histolytics.spatial_graph.graph import fit_graph
        >>> from histolytics.utils.gdf import set_uid
        >>> from histolytics.spatial_geom.shape_metrics import shape_metric
        >>> # Load the HGSC cancer nuclei dataset
        >>> nuc = hgsc_cancer_nuclei()
        >>> neo = nuc[nuc["class_name"] == "neoplastic"]
        >>> neo = set_uid(neo)
        >>> neo = shape_metric(neo, ["area"])
        >>> # Fit a spatial graph to the neoplastic nuclei
        >>> w, _ = fit_graph(neo, "distband", threshold=100)
        >>> # Calculate global Moran's I for the area feature
        >>> moran_i, pval = global_autocorr(neo, w, feat="area")
        >>> print(moran_i, pval)
        0.00834165971467421 0.318
    """
    moran = esda.Moran(
        gdf[feat],
        w,
        permutations=permutations,
    )

    return moran.I, moran.p_sim