ESDA in PySAL (4): shape-measures (shape measurements)

1. Measures of shape

The esda.shape module provides statistics used in the literature to measure the structure and regularity of polygons. These measures range from the very simple, such as the length-width difference, to the very complex, such as the normalized moment of inertia. Here we walk step by step through computing several of them for each county in Mississippi.

Why Mississippi? The counties on the west side of the state border the Mississippi River, whose course meanders heavily. In general, we would expect the counties on the western (left) edge of the state to be more “irregular” than the larger counties on the eastern (right) side. You can see this in the map below:

import geopandas, libpysal
from esda import shape as shapestats
import matplotlib.pyplot as plt
import pygeos

import warnings
warnings.filterwarnings("ignore")
counties = geopandas.read_file(libpysal.examples.get_path("south.shp"))
ms_counties = counties.query("STATE_NAME == 'Mississippi'")
ms_counties.plot()
plt.title("Mississippi Counties")

The first, very simple measure is the difference between the length and the width of a shape, which captures elongation. You can see the effect below, where relatively square counties are drawn in dark blue and elongated, rectangular counties in light yellow. Because the measure does not “see” the meanders of the river, the river counties are judged to be relatively square rather than elongated.

ms_counties.plot(shapestats.length_width_diff(ms_counties.geometry))
plt.title("length-width difference")
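
To see why meanders go unnoticed, here is a minimal pure-Python sketch. It assumes the measure is the absolute difference between the width and height of the axis-aligned bounding box; the helper `length_width_diff` and the toy polygons are hypothetical stand-ins written for this illustration, not the library's code.

```python
def length_width_diff(coords):
    """Absolute difference between bounding-box width and height.

    `coords` is a list of (x, y) vertices. Hypothetical helper that only
    mirrors the idea described in the text (assumed bounding-box definition).
    """
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    return abs(width - height)


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
strip = [(0, 0), (6, 0), (6, 1), (0, 1)]
# A square outline with a zig-zag right edge, standing in for a river county.
wiggly = [(0, 0), (2, 0), (1.5, 0.5), (2, 1), (1.5, 1.5), (2, 2), (0, 2)]

print(length_width_diff(square))  # 0
print(length_width_diff(strip))   # 5
print(length_width_diff(wiggly))  # 0
```

The zig-zag polygon scores exactly like the plain square, because only the extreme coordinates enter the calculation.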

2. Ideal Shape Measures

The next category of shape measures is often described as “ideal shape” compactness measures. Each one takes one or more properties of a polygon (such as its perimeter or area) and compares them to the same properties of an “ideal” shape.

The ideal shape comes in two flavors.
– A “relative ideal shape” is a shape whose properties are fixed relative to the original shape. For example, “isoperimetric_quotient” compares the area of a polygon to the area of a circle with the same perimeter as the original polygon. These measures are usually constructed so that they vary between zero and one, and equal one when the shape is identical to its relative ideal shape. Measures in this family include “isoperimetric_quotient” and “isoareal_quotient”, as well as our implementation of “fractal_dimension”, discussed later.
– An “absolute ideal shape” is a shape that has some fixed, known relationship to the original shape and acts in some way as a “bound” on that shape. For example, the convex hull ratio compares the area of a polygon to the area of its convex hull. Since the convex hull is guaranteed to be at least as large as the original shape, this measure also lies between zero and one, where one means the polygon is its own convex hull. Measures in this family include “boundary_amplitude”, “convex_hull_ratio”, “radii_ratio”, “diameter_ratio” and “minimum_bounding_circle_ratio”.

2.1 Absolute Ideal Shape Measures

“boundary_amplitude” and “convex_hull_ratio” are two simple and closely related measures of shape regularity. The boundary amplitude is the perimeter of the convex hull divided by the perimeter of the original shape. It varies between zero and one, where one represents the case where the polygon is its own convex hull. This is because the perimeter of the convex hull is never longer than the perimeter of the original shape; when a shape has many indentations cutting into it, the hull's perimeter is shorter than the shape's.

In the map below, you can see that counties along the Mississippi River have very poor “boundary_amplitude” scores because their boundaries are so sinuous:

ms_counties.plot(shapestats.boundary_amplitude(ms_counties.geometry))
plt.title("boundary amplitude")

Relatedly, the convex hull ratio is the area of the original shape divided by the area of its convex hull. This too varies between zero and one: since the convex hull always “contains” the original shape, its area is always at least as large. The measure is therefore related to “boundary_amplitude”, but the two will differ from polygon to polygon because one is based on area and the other on perimeter. In general, perimeter-based measures are more sensitive to non-convexity than area-based measures.

ms_counties.plot(shapestats.convex_hull_ratio(ms_counties.geometry))
plt.title("convex hull areal ratio")
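
The difference in sensitivity can be checked on a toy shape. The sketch below is pure Python and does not use esda: `area`, `perimeter` and `convex_hull` are hypothetical helpers (shoelace formula and Andrew's monotone chain), and the polygon is a made-up 4 x 4 square with a narrow slit cut into its top edge.

```python
import math


def area(coords):
    """Shoelace formula for a simple polygon given as (x, y) vertices."""
    n = len(coords)
    total = 0.0
    for i in range(n):
        x1, y1 = coords[i]
        x2, y2 = coords[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def perimeter(coords):
    """Sum of edge lengths, closing the ring back to the first vertex."""
    n = len(coords)
    return sum(math.dist(coords[i], coords[(i + 1) % n]) for i in range(n))


def convex_hull(coords):
    """Andrew's monotone chain; returns hull vertices counter-clockwise."""
    pts = sorted(set(coords))

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(points):
        chain = []
        for p in points:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return half(pts) + half(reversed(pts))


# 4 x 4 square with a slit 0.2 wide and 3 deep cut into the top edge.
slit_square = [(0, 0), (4, 0), (4, 4), (2.1, 4), (2.1, 1),
               (1.9, 1), (1.9, 4), (0, 4)]
hull = convex_hull(slit_square)

boundary_amplitude = perimeter(hull) / perimeter(slit_square)
hull_ratio = area(slit_square) / area(hull)

print(round(boundary_amplitude, 2))  # 0.73
print(round(hull_ratio, 2))          # 0.96
```

The slit removes under 4% of the area but adds 6 units of boundary, so the perimeter-based score drops far more than the area-based one.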

Another useful measure is the “minimum bounding circle ratio,” sometimes called the Reock measure after the author of the first journal article that used it to analyze congressional districts. This ratio compares the area of the original shape to the area of the smallest circle that completely encloses the shape. The measure penalizes elongation heavily, since the minimum bounding circle must grow larger and larger to contain an elongated shape. It also varies between zero and one, with one reflecting the case where the polygon is its own bounding circle.

ms_counties.plot(shapestats.minimum_bounding_circle_ratio(ms_counties))
plt.title("minimum bounding circle ratio")
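
For rectangles the ratio has a closed form, which makes the elongation penalty easy to see: the smallest circle enclosing a rectangle has the rectangle's diagonal as its diameter. The helper `reock_rectangle` below is a hypothetical name used only for this sketch.

```python
import math


def reock_rectangle(width, height):
    """Minimum bounding circle ratio of a width-by-height rectangle.

    The smallest enclosing circle of a rectangle has the diagonal as its
    diameter, so its radius is half the diagonal length.
    """
    radius = math.hypot(width, height) / 2
    return (width * height) / (math.pi * radius ** 2)


for w, h in [(1, 1), (2, 1), (4, 1), (8, 1)]:
    print(f"{w} x {h}: {reock_rectangle(w, h):.3f}")
# 1 x 1: 0.637
# 2 x 1: 0.509
# 4 x 1: 0.300
# 8 x 1: 0.157
```

Even a perfect square only reaches 2/pi, and the score falls quickly as the rectangle is stretched.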

A related metric is “radii_ratio”. Rather than comparing the areas of two shapes, “radii_ratio” mixes the relative and absolute ideal shape concepts: it relates the radius of the minimum bounding circle to the radius of the equal-area circle (that is, a circle with the same area as the original shape).

ms_counties.plot(shapestats.radii_ratio(ms_counties.geometry))
plt.title("radii ratio")

This measure generally behaves much like the minimum bounding circle ratio, but is more sensitive to concavities in the shape.

plt.scatter(shapestats.radii_ratio(ms_counties.geometry),
            shapestats.minimum_bounding_circle_ratio(ms_counties.geometry))
plt.plot((0,1),(0,1), color='k', linestyle=':')
plt.xlim(.2, .9)
plt.ylim(.2, .9)
plt.xlabel("Radii Ratio")
plt.ylabel("Minimum Bounding Circle Ratio")
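
The two measures can also be linked algebraically. Assuming the radii ratio is the equal-area radius divided by the bounding-circle radius (the direction of the ratio is an assumption here, chosen so the score stays between zero and one), its square equals the minimum bounding circle ratio. The helper `circle_based_scores` is hypothetical and takes a shape's area and bounding-circle radius directly.

```python
import math


def circle_based_scores(area, mbc_radius):
    """Return (radii ratio, minimum bounding circle ratio) for a shape.

    Assumes radii ratio = equal-area-circle radius / bounding-circle radius.
    """
    equal_area_radius = math.sqrt(area / math.pi)
    radii_ratio = equal_area_radius / mbc_radius
    mbc_ratio = area / (math.pi * mbc_radius ** 2)
    return radii_ratio, mbc_ratio


# 4 x 1 rectangle: area 4, bounding-circle radius is half the diagonal.
rr, mbcr = circle_based_scores(4.0, math.hypot(4, 1) / 2)
print(round(rr, 3), round(mbcr, 3))
print(math.isclose(rr ** 2, mbcr))  # True
```

Under that assumption, a scatter of the two scores traces a curve rather than the dotted diagonal, with the radii ratio never smaller than the bounding circle ratio.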

A measure similar to the minimum bounding circle ratio is the diameter ratio. It measures the ratio between the “shortest” and “longest” diameters of the shape. These can be taken as the shortest and longest axes of the shape's minimum rotated rectangle. Alternatively, you can use the shape's ordinary bounding box, but this favors shapes aligned east-west or north-south. This is again a fairly strong measure of elongation: shapes with a large difference between their longest and shortest axes score lower.

ms_counties.plot(shapestats.diameter_ratio(ms_counties.geometry))
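
The orientation bias of the plain bounding box can be reproduced with a tilted rectangle. This is a pure-Python sketch, not the library's code: the ratio is assumed to be shortest axis over longest axis, and `min_rotated_rectangle_axes` is a hypothetical brute-force helper that is only valid for convex polygons.

```python
import math


def bbox_axes(coords):
    """Width and height of the axis-aligned bounding box."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return max(xs) - min(xs), max(ys) - min(ys)


def rotate(coords, degrees):
    """Rotate points counter-clockwise about the origin."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(x * c - y * s, x * s + y * c) for x, y in coords]


def min_rotated_rectangle_axes(coords):
    """Side lengths of the smallest-area enclosing rectangle (convex input).

    A minimum-area rectangle shares a direction with some polygon edge,
    so it is enough to try the direction of every edge.
    """
    best = None
    n = len(coords)
    for i in range(n):
        (x1, y1), (x2, y2) = coords[i], coords[(i + 1) % n]
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
        w, h = bbox_axes(rotate(coords, -angle))
        if best is None or w * h < best[0] * best[1]:
            best = (w, h)
    return best


def diameter_ratio(axes):
    """Shortest axis divided by longest axis (assumed definition)."""
    return min(axes) / max(axes)


rect = [(0, 0), (4, 0), (4, 1), (0, 1)]
tilted = rotate(rect, 45)

print(round(diameter_ratio(bbox_axes(rect)), 3))                     # 0.25
print(round(diameter_ratio(bbox_axes(tilted)), 3))                   # 1.0
print(round(diameter_ratio(min_rotated_rectangle_axes(tilted)), 3))  # 0.25
```

The same 4 x 1 rectangle looks perfectly compact to the axis-aligned box once it is tilted by 45 degrees, while the rotated rectangle recovers the true ratio.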

2.2 Relative Ideal Shape Measures

These measures relate the observed shape to a different shape that has some known relationship to it. As with the “radii_ratio” metric discussed earlier, that reference shape is usually a circle with the same perimeter or area as the source shape.

The “isoareal_quotient” relates the perimeter of the shape to the circumference of a circle with the same area as the source shape:

ms_counties.plot(shapestats.isoareal_quotient(ms_counties))

The related metric “isoperimetric_quotient” relates the area of a shape to the area of a circle with the same perimeter as the original shape.

ms_counties.plot(shapestats.isoperimetric_quotient(ms_counties))

These two measures are directly related to each other, albeit non-linearly:

plt.scatter(shapestats.isoareal_quotient(ms_counties),
            shapestats.isoperimetric_quotient(ms_counties))
plt.plot((0,1),(0,1), color='k', linestyle=':')
plt.xlim(.2, .9)
plt.ylim(.2, .9)
plt.xlabel("Isoareal Quotient")
plt.ylabel("Isoperimetric Quotient")
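
The non-linear link follows from the definitions. In the sketch below, the isoareal quotient is assumed to be the circumference of the equal-area circle divided by the shape's perimeter (the direction is an assumption, chosen so the score is at most one); with that choice, the isoperimetric quotient is exactly its square. Both functions are hypothetical stand-ins that take an area and a perimeter directly.

```python
import math


def isoareal_quotient(area, perimeter):
    """Circumference of the equal-area circle divided by the perimeter."""
    return 2 * math.sqrt(math.pi * area) / perimeter


def isoperimetric_quotient(area, perimeter):
    """Area divided by the area of the circle with the same perimeter."""
    return 4 * math.pi * area / perimeter ** 2


shapes = {"unit square": (1.0, 4.0), "4 x 1 rectangle": (4.0, 10.0)}
for name, (a, p) in shapes.items():
    iaq = isoareal_quotient(a, p)
    ipq = isoperimetric_quotient(a, p)
    print(name, round(iaq, 3), round(ipq, 3), math.isclose(iaq ** 2, ipq))
```

Since both scores lie between zero and one, squaring pushes the isoperimetric quotient below the isoareal quotient, which would place the points under the dotted diagonal in the scatter plot.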

The last relevant measure is the fractal dimension of the shape. It measures the effective dimension of a shape's boundary and usually varies between zero and two, where two represents a very complex boundary and zero a very simple one. However, our particular implementation approximates the true fractal dimension by assuming that the shape's boundary moves along a square or hexagonal lattice. So the measure actually expresses the relationship between a square (or hexagon) and the existing shape.

ms_counties.plot(shapestats.fractal_dimension(ms_counties, support='hex'))

ms_counties.plot(shapestats.fractal_dimension(ms_counties, support='square'))

The two variants are also very strongly related:

plt.scatter(shapestats.fractal_dimension(ms_counties, support='hex'),
            shapestats.fractal_dimension(ms_counties, support='square'))
plt.plot((0,2),(0,2), color='k', linestyle=':')
plt.xlim(.4, 1.1)
plt.ylim(.4, 1.1)
plt.xlabel("Fractal Dimension (hex)")
plt.ylabel("Fractal Dimension (square)")
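
To make the idea of relating a shape to a square concrete, here is a sketch of one common lattice-based approximation, D = 2 ln(P/4) / ln(A), where P is the perimeter and A the area. This formula is an assumption made for illustration and is not taken from the library source; `fractal_dimension_square` is a hypothetical helper. Note that the result depends on the units of P and A.

```python
import math


def fractal_dimension_square(area, perimeter):
    """Square-lattice approximation D = 2 * ln(P / 4) / ln(A) (assumed form)."""
    return 2 * math.log(perimeter / 4) / math.log(area)


# Two shapes with the same area (100) but very different boundaries.
print(round(fractal_dimension_square(100.0, 40.0), 3))   # 10 x 10 square
print(round(fractal_dimension_square(100.0, 104.0), 3))  # 50 x 2 strip
```

Under this form a perfect square scores one, and a shape with more boundary for the same area scores higher.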

3. Conclusion

There are many more shape measures in the “esda.shape” module that can be used in a variety of applications. The ones detailed here are those most commonly found in the literature on redistricting, which is just one of the areas where shape measures are useful. For more information on shape measurement, a good introductory conceptual paper is by Shlomo Angel et al. (2010) on how to measure shape in geography.