Image segmentation with heatdiff
A new method to segment images with the heat semigroup
Deep learning methods drive most recent advances in image segmentation because they consistently outperform traditional methods. These advances, however, come with heavy computational and labour costs: vast datasets of pre-labelled images must be created, and models must then be trained on them. These costs remain a barrier to adopting image segmentation in computer vision workflows whose real-world applications need that extra precision.
Today, we present a new feature of our open-source image processing package, heatdiff, that uses the heat semigroup for image segmentation with no costly training required.
What is image segmentation and what is it used for?
Image segmentation partitions images into ‘segments’, groups of pixels that represent objects or regions within the image. Think cars, people, teeth, tumours, but also sky, roads, rows of townhouses, fields and plains. Depending on the type of segmentation (semantic, instance, or panoptic), the method identifies entire classes, like a row of cars, individual instances of a class, like individual trees in a cluster, or a combination of the two. The output of a segmentation is a set of segmentation masks, which show, to the pixel, the boundary of each object, region, or feature identified in the image.
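To make the idea of a mask concrete, here is a small illustrative sketch in NumPy. The arrays and the "car" class are made up for the example; real masks have the same height and width as the image they describe.

```python
import numpy as np

# Semantic mask: each pixel holds a class id (0 = background, 1 = car).
# Both cars share the same id, so they are not told apart.
semantic = np.array([
    [0, 1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
])

# Instance mask: each pixel holds an object id, so the two cars differ.
instance = np.array([
    [0, 1, 1, 0, 2, 2],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
])

# Number of pixels labelled as "car" in the semantic mask
car_pixels = int((semantic == 1).sum())

# Number of distinct objects in the instance mask (background id 0 excluded)
n_instances = len(np.unique(instance)) - 1

print(car_pixels, n_instances)
```

A binary mask, the kind heatdiff produces, is the simplest case: a single foreground class against the background.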
Image segmentation is used on its own or combined with image classification and object detection in multi-stage computer vision applications. It allows autonomous vehicles to identify the roads they can drive on and the pedestrians they need to avoid. In manufacturing, image segmentation is used to identify defects in products on the production line. In medical imaging, it is used on MRI and CT scans to identify the boundaries of tumours and diagnose diseases. In satellite imaging, it can be used to identify weather events, the different types of terrain in an area, and other geographic features. These are just some of the myriad examples.
What are existing methods for image segmentation?
Deep learning methods lead most current advances in computer vision because they are more precise than traditional image segmentation methods and better at segmenting complex images. These models are based either on convolutional neural networks and their variants (U-Net, Mask R-CNN, DeepLab) or on transformer architectures (SegFormer, Swin Transformer).
Whereas traditional methods use pixel colour information such as brightness, contrast, and intensity to identify the boundaries of objects or regions, deep learning models are trained on large pre-labelled datasets to learn patterns in the visual data that generalise to new, unseen images. Notable open-source datasets that are widely used for training include COCO (Microsoft), ADE20K, and Cityscapes.
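As a rough illustration of the traditional, intensity-based approach, the sketch below segments a synthetic grayscale image with a single global threshold. The function name threshold_segment and the choice of the mean intensity as the default threshold are assumptions made for this example; practical pipelines often use more careful threshold selection.

```python
import numpy as np

def threshold_segment(gray, threshold=None):
    """Mark pixels brighter than `threshold` as foreground (1), the rest as 0.

    If no threshold is given, the mean intensity of the image is used.
    """
    gray = np.asarray(gray, dtype=float)
    if threshold is None:
        threshold = gray.mean()
    return (gray > threshold).astype(np.uint8)

# Synthetic 6x6 image: dark background with a bright 2x2 square
image = np.zeros((6, 6))
image[2:4, 2:4] = 1.0

mask = threshold_segment(image)
print(mask)
```

No training is involved, but the result depends entirely on the object being brighter than its surroundings, which is why such methods struggle on complex images.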
Creating these pre-labelled datasets requires many hours of human labour. Subsequently, training the deep learning models on them requires a large amount of computational resources.
Image segmentation with the heat semigroup in heatdiff
heatdiff employs the heat semigroup to achieve training-free image segmentation that can handle complex images without a decrease in accuracy. The segmentation is based on 1) the heat semigroup approximation of the curve shortening flow and 2) a reaction ordinary differential equation. The heat semigroup approximation minimises the perimeter of the mask and smooths it, while the reaction term helps the mask to explore new regions of the image that have not been analysed.
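For readers who want a feel for how these two ingredients interact, here is a minimal NumPy sketch of a classical scheme of this kind: a threshold-dynamics iteration in the style of Merriman, Bence, and Osher, in which the mask is diffused with the heat semigroup and then thresholded at 1/2. This is not heatdiff's implementation. The reaction term (a two-region intensity contrast), the periodic boundary conditions, the grayscale input, and the names heat_step, segment_sketch, and lam are all assumptions made for illustration.

```python
import numpy as np

def heat_step(u, dt):
    """Apply the heat semigroup exp(dt * Laplacian) to u via the FFT.

    Assumes periodic boundary conditions and unit pixel spacing.
    """
    fy = np.fft.fftfreq(u.shape[0])[:, None]  # row frequencies
    fx = np.fft.fftfreq(u.shape[1])[None, :]  # column frequencies
    multiplier = np.exp(-dt * (2 * np.pi) ** 2 * (fx ** 2 + fy ** 2))
    return np.real(np.fft.ifft2(np.fft.fft2(u) * multiplier))

def segment_sketch(image, mask, dt=1.0, lam=1.0, n_iter=20):
    """Alternate diffusion, a reaction push, and thresholding at 1/2."""
    mask = mask.astype(float)
    for _ in range(n_iter):
        inside = mask > 0.5
        # Mean intensity inside and outside the current mask
        c_in = image[inside].mean() if inside.any() else 0.0
        c_out = image[~inside].mean() if (~inside).any() else 0.0
        # Assumed reaction term: positive where a pixel resembles the
        # inside region more than the outside region
        reaction = (image - c_out) ** 2 - (image - c_in) ** 2
        # Diffusion smooths the mask and shortens its boundary
        diffused = heat_step(mask, dt)
        mask = (diffused + lam * dt * reaction > 0.5).astype(float)
    return mask

# Synthetic grayscale image: bright 12x12 square on a dark background
image = np.zeros((32, 32))
image[10:22, 10:22] = 1.0

# Small initial mask placed inside the bright square
init = np.zeros((32, 32))
init[14:18, 14:18] = 1.0

result = segment_sketch(image, init)
print(int(result.sum()))
```

In this sketch the diffusion step alone would shrink the mask, while the reaction term lets it grow into pixels that look like the region it already covers. The parameter lam sets the balance between the two.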
heatdiff currently supports only binary segmentation (foreground versus background), but we plan to extend these capabilities in the future.
Code example
Here's how to do binary image segmentation on an image with heatdiff. We've chosen to place a square initial mask in the centre of the image, but feel free to experiment with different polygons and initial placements.
from heatdiff import load_image, show_image, JacobiThetaSegmenter
import numpy as np
from numpy.typing import NDArray

# 1. Load the original image
image = load_image("images/flowers.jpg", grayscale=False)

# 2. Create the initial segmentation mask
def create_initial_mask(size: tuple, side_length: int) -> NDArray:
    """Create initial square mask in center of image."""
    cx, cy = size[1] / 2, size[0] / 2  # Center coordinates (x = column, y = row)
    y, x = np.indices(size)  # np.indices returns row indices first, then columns
    return (
        (np.abs(x - cx) <= side_length / 2) & (np.abs(y - cy) <= side_length / 2)
    ).astype(float)

side_length = image.shape[0] // 3
initial_mask = create_initial_mask(image.shape[:2], side_length)
show_image(initial_mask, title="Initial Mask")

# 3. Instantiate and apply the image segmenter
segmenter = JacobiThetaSegmenter(lambda_param=0.00025, dt=0.5)
result, energies = segmenter.segment(image, initial_mask, max_iter=20)

# 4. Show results
show_image(result, title="Segmentation Result")
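If you happen to have a hand-labelled reference mask for your image, one common way to check a binary segmentation is intersection over union (IoU). The helper below is a hypothetical sketch and not part of heatdiff; it assumes both masks are NumPy arrays of the same shape with foreground values above 0.5.

```python
import numpy as np

def iou(pred, truth, threshold=0.5):
    """Intersection over union of two binary masks (1.0 = perfect overlap)."""
    pred_b = np.asarray(pred) > threshold
    truth_b = np.asarray(truth) > threshold
    union = np.logical_or(pred_b, truth_b).sum()
    if union == 0:
        return 1.0  # both masks are empty, so they agree
    return float(np.logical_and(pred_b, truth_b).sum() / union)

# Toy example: the prediction covers the 4 true pixels plus 2 extra ones
truth = np.zeros((4, 4))
truth[1:3, 1:3] = 1
pred = np.zeros((4, 4))
pred[1:3, 1:4] = 1

score = iou(pred, truth)
print(round(score, 3))
```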
With further development, image segmentation with the heat semigroup can be incorporated into the more advanced computer vision workflows used in many fields today, providing an alternative to the currently dominant deep learning methods. Try it out and let us know what you think!
Install heatdiff from PyPI with
pip install heatdiff
See the heatdiff GitHub for documentation. You can also check out our previous article on image corruption and denoising with heatdiff, which uses the heat semigroup as well.
For questions and feedback, feel free to reach out to us!