Posted on :: Tags: , , ,

This post introduces a new feature for binary blobs, such as connected components in text.

The feature is called patch histogram, and it represents the histogram of 3x3 patches of black and white pixels. We collect all 3x3 patches and count their frequencies.

A 3x3 patch for a binary image contains $2^9 = 512$ different combinations. For each of these combinations, we assign a unique ID. I wrote the implementation in Python; here is a lookup table that converts all possible 3x3 patches to their IDs.

patch_histogram_dict = {}
for f in range(0, 512):
    i22 = f & 1
    i21 = (f >> 1) & 1
    i20 = (f >> 2) & 1
    i12 = (f >> 3) & 1
    i11 = (f >> 4) & 1
    i10 = (f >> 5) & 1
    i02 = (f >> 6) & 1
    i01 = (f >> 7) & 1
    i00 = (f >> 8) & 1
    patch_histogram_dict[ma(i00, i01, i02, i10, i11, i12, i20, i21, i22)] = f

I was looking for a skeleton feature that can be applied to components after a medial axis transform. However, in this version, we will use binarized connected component images directly and explore the skeletal version later.

Above, we used a function ma to convert a set of numbers to a tuple of tuples. It’s as simple as:

def ma(i00, i01, i02, i10, i11, i12, i20, i21, i22):
    return ((i00, i01, i02),
            (i10, i11, i12),
            (i20, i21, i22))

The first step in feature generation is marking each pixel of the image with its ID. Then, a simple np.histogram call generates the histogram.

for i in range(1, rows - 1):
    for j in range(1, cols - 1):
        s = framed[(i-1):(i+2), (j-1):(j+2)]
        marks[i-1, j-1] = patch_dict[ma(s[0, 0], s[0, 1], s[0, 2],
                                        s[1, 0], s[1, 1], s[1, 2],
                                        s[2, 0], s[2, 1], s[2, 2])]

framed is a copy of the original image with a 1-pixel frame, so that boundary pixels are also counted without much boundary checking. patch_dict is a parameter that the function receives. Its default value is the dictionary that we created above, but any kind of patch dictionary can be used.

The feature is generated using a call to numpy.histogram as follows:

histogram = numpy.histogram(marked_points,
                            hist_range,
                            density=True,
                            bins=bins)

marked_points are the results of the pixel counting loop above. hist_range is the minimum and maximum value for the patch dictionary. The density parameter is set to True so that the histogram is normalized to 1, making the feature size-invariant. bins is a parameter we can set freely, but by default, it’s equal to the range of elements.

Comparisons can be performed using standard histogram comparison methods, such as EMD (Earth Mover’s Distance) or Chi-Square distance metrics.