1. What is Image Segmentation? (Simple Definition)

Segmentation = Dividing an image into meaningful parts

Think of it like this:

  • Face detection app – separates face from background
  • Cancer detection – separates tumor from healthy tissue
  • Self-driving car – separates road, car, pedestrian, sky

Real example: Take a photo of a cat. Segmentation divides it into regions such as:

  • Region 1 = Cat (foreground)
  • Region 2 = Floor (background)
  • Region 3 = Wall
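
In code, a segmentation result is usually stored as a label map: a grid the same size as the image in which each entry holds a region number. The sketch below uses a small hand-made label map (the values and the `names` dictionary are illustrative assumptions, not the output of a real algorithm) to show how region sizes and a per-region mask can be read from it.

```python
from collections import Counter

# Hypothetical 5x6 label map: 1 = cat, 2 = floor, 3 = wall
label_map = [
    [3, 3, 3, 3, 3, 3],
    [3, 3, 1, 1, 3, 3],
    [3, 1, 1, 1, 1, 3],
    [2, 1, 1, 1, 1, 2],
    [2, 2, 2, 2, 2, 2],
]
names = {1: "cat", 2: "floor", 3: "wall"}

# Region size = number of pixels carrying that label
sizes = Counter(label for row in label_map for label in row)
for label, name in names.items():
    print(f"Region {label} ({name}): {sizes[label]} pixels")

# A binary mask for a single region (True where the pixel belongs to the cat)
cat_mask = [[label == 1 for label in row] for row in label_map]
```

Every method in the following sections is, in the end, a different way of producing such a label map or mask.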

2. Types of Segmentation (Simple Explanation)


3. Method 1: Thresholding (Easiest Method)

Concept: If pixel value > T, it's white (foreground), else black (background).

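Real example: Detecting a dark stain on a white shirt. With cv2.THRESH_BINARY the bright shirt comes out white and the dark stain black; use cv2.THRESH_BINARY_INV if you want the stain to be the white foreground.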

Python Code:

import cv2
import matplotlib.pyplot as plt

# Load image in grayscale
img = cv2.imread('stain.jpg', 0)

# Simple thresholding
threshold_value = 127
_, simple_thresh = cv2.threshold(img, threshold_value, 255, cv2.THRESH_BINARY)

# Otsu's automatic thresholding (no need to guess T)
_, otsu_thresh = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Display results
plt.figure(figsize=(12,4))
plt.subplot(131), plt.imshow(img, 'gray'), plt.title('Original')
plt.subplot(132), plt.imshow(simple_thresh, 'gray'), plt.title('Simple Threshold (T=127)')
plt.subplot(133), plt.imshow(otsu_thresh, 'gray'), plt.title("Otsu's Auto Threshold")
plt.show()

MATLAB Code:

% Read image
img = imread('stain.jpg');
gray_img = rgb2gray(img);

% Simple threshold
threshold_value = 127;
simple_thresh = gray_img > threshold_value;

% Otsu's threshold
otsu_thresh = imbinarize(gray_img);

% Display
subplot(1,3,1), imshow(gray_img), title('Original');
subplot(1,3,2), imshow(simple_thresh), title('Simple Threshold (T=127)');
subplot(1,3,3), imshow(otsu_thresh), title("Otsu's Auto Threshold");

Student Task: Try different threshold values (50, 100, 200) and see what happens.
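
To see what Otsu's method is doing when it picks T automatically, here is a minimal pure-Python sketch. It tries every possible threshold and keeps the one that maximizes the between-class variance of the two resulting groups. The function name `otsu_threshold` and the sample pixel values are illustrative assumptions; this is not OpenCV's implementation, only the same idea on a tiny list.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold T that best separates pixels into two groups.

    Pixels with value <= T form one class and pixels > T the other.
    """
    # Histogram of gray levels
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]            # number of pixels with value <= t
        sum_bg += t * hist[t]      # sum of their intensities
        if w_bg == 0:
            continue               # no pixels in the dark class yet
        w_fg = total - w_bg
        if w_fg == 0:
            break                  # every pixel is already in the dark class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor)
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Six dark "stain" pixels and four bright "shirt" pixels (made-up values)
pixels = [10, 12, 11, 13, 12, 10, 200, 205, 210, 198]
t = otsu_threshold(pixels)
mask = [255 if p > t else 0 for p in pixels]
print("Chosen threshold:", t)
print("Binary result:", mask)
```

The chosen threshold falls in the gap between the dark and bright groups, which is why Otsu works well when the histogram has two clear peaks and poorly when it does not.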


4. Method 2: Edge Detection (Finding Boundaries)

Concept: Find places where brightness suddenly changes (these are edges).

Real example: Finding the outline of a person in a photo.

Python Code:

import cv2
import matplotlib.pyplot as plt

img = cv2.imread('person.jpg', 0)

# Different edge detectors
edges_canny = cv2.Canny(img, 50, 150)      # Good general-purpose default (low/high thresholds 50/150)
edges_sobelx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=5)  # d/dx: responds to vertical edges
edges_sobely = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=5)  # d/dy: responds to horizontal edges
edges_laplacian = cv2.Laplacian(img, cv2.CV_64F)          # Second derivative: edges in all directions

# Display (absolute value, because the derivatives can be negative)
plt.figure(figsize=(12,8))
plt.subplot(231), plt.imshow(img, 'gray'), plt.title('Original')
plt.subplot(232), plt.imshow(edges_canny, 'gray'), plt.title('Canny Edges')
plt.subplot(233), plt.imshow(abs(edges_sobelx), 'gray'), plt.title('Sobel X (vertical edges)')
plt.subplot(234), plt.imshow(abs(edges_sobely), 'gray'), plt.title('Sobel Y (horizontal edges)')
plt.subplot(235), plt.imshow(abs(edges_laplacian), 'gray'), plt.title('Laplacian')
plt.show()

MATLAB Code:

img = imread('person.jpg');
gray_img = rgb2gray(img);

% Different edge detectors
edges_canny = edge(gray_img, 'canny');      % Usually the cleanest result
edges_sobel = edge(gray_img, 'sobel');
edges_prewitt = edge(gray_img, 'prewitt');
edges_log = edge(gray_img, 'log');           % Laplacian of Gaussian

% Display
figure;
subplot(2,3,1), imshow(gray_img), title('Original');
subplot(2,3,2), imshow(edges_canny), title('Canny');
subplot(2,3,3), imshow(edges_sobel), title('Sobel');
subplot(2,3,4), imshow(edges_prewitt), title('Prewitt');
subplot(2,3,5), imshow(edges_log), title('LoG');

Practical tip: If the image is noisy, blur it first with cv2.GaussianBlur(img, (5,5), 0), then apply Canny.
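
To make the idea of "brightness suddenly changes" concrete, the following sketch applies the standard 3x3 Sobel kernels by hand to a tiny synthetic image. The helper name `filter3x3` and the test image are illustrative assumptions. The image is dark on the left and bright on the right, so it contains one vertical edge: the x-derivative responds strongly there, while the y-derivative stays at zero.

```python
# Standard 3x3 Sobel kernels
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]   # derivative along x (columns)
SOBEL_Y = [[-1, -2, -1],
           [ 0,  0,  0],
           [ 1,  2,  1]]  # derivative along y (rows)


def filter3x3(img, kernel):
    """Slide a 3x3 kernel over the image; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            out[r][c] = sum(
                kernel[i][j] * img[r + i - 1][c + j - 1]
                for i in range(3)
                for j in range(3)
            )
    return out


# 5x6 image: left half black, right half white (one vertical edge)
img = [[0, 0, 0, 255, 255, 255] for _ in range(5)]

gx = filter3x3(img, SOBEL_X)
gy = filter3x3(img, SOBEL_Y)

print("x-derivative, middle row:", gx[2])
print("y-derivative, middle row:", gy[2])
```

The large values in `gx` sit exactly on the two columns next to the brightness jump, and `gy` is zero everywhere because nothing changes from row to row.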

5. Method 3: K-Means Clustering (Color-Based)

Concept: Group pixels into K clusters based on color similarity.

Real example: Separating red, green, and yellow apples in a single image.

Python Code:

import cv2
import numpy as np
import matplotlib.pyplot as plt

# Load image
img = cv2.imread('fruits.jpg')
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Reshape to 1D array of pixels
pixels = img_rgb.reshape((-1, 3))
pixels = np.float32(pixels)

# Apply K-means
k = 3  # Number of segments (change as needed)
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 0.2)
_, labels, centers = cv2.kmeans(pixels, k, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)

# Convert back to image
centers = np.uint8(centers)
segmented_img = centers[labels.flatten()]
segmented_img = segmented_img.reshape(img_rgb.shape)

# Display
plt.figure(figsize=(10,5))
plt.subplot(121), plt.imshow(img_rgb), plt.title('Original')
plt.subplot(122), plt.imshow(segmented_img), plt.title(f'K-Means (K={k})')
plt.show()

MATLAB Code:

img = imread('fruits.jpg');
[rows, cols, channels] = size(img);

% Reshape for k-means
pixel_data = double(reshape(img, rows*cols, 3));

% Apply k-means
k = 3;
[labels, centers] = kmeans(pixel_data, k, 'Distance', 'sqEuclidean', 'Replicates', 3);

% Reconstruct image
segmented_img = reshape(centers(labels, :), rows, cols, 3);
segmented_img = uint8(segmented_img);

% Display
figure;
subplot(1,2,1), imshow(img), title('Original');
subplot(1,2,2), imshow(segmented_img), title(['K-Means (K=' num2str(k) ')']);

Try this: Set k to 2, 4, 5, or 8 and see how the segmentation becomes more detailed as k grows.
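
The two steps that K-means repeats (assign each pixel to its nearest center, then move each center to the mean of its pixels) can be sketched in plain Python. The function names, the six sample colors, and the hand-picked starting centers are all illustrative assumptions; cv2.kmeans and MATLAB's kmeans choose their starting centers differently.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
        for p in points
    ]


def update(points, labels, centers):
    """Move each center to the mean of the points assigned to it."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if members:
            new_centers.append(tuple(sum(ch) / len(members) for ch in zip(*members)))
        else:
            new_centers.append(old)  # keep an empty cluster where it was
    return new_centers


def kmeans(points, centers, max_iter=20):
    centers = list(centers)
    for _ in range(max_iter):
        labels = assign(points, centers)
        updated = update(points, labels, centers)
        if updated == centers:   # nothing moved: converged
            break
        centers = updated
    return assign(points, centers), centers


# Six made-up RGB pixels: two reddish, two greenish, two yellowish
pixels = [(250, 10, 10), (240, 20, 15),
          (10, 200, 20), (20, 210, 30),
          (245, 240, 10), (250, 235, 20)]
start = [pixels[0], pixels[2], pixels[4]]  # fixed start so the run is repeatable

labels, centers = kmeans(pixels, start)
print("Labels:", labels)
print("Centers:", centers)
```

Replacing every pixel by the center of its cluster, as the OpenCV and MATLAB listings above do, is what turns these labels into the flat-colored segmented image.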

6. Method 4: Region Growing (Seeded Segmentation)

Concept: You pick a seed point on an object, and the algorithm expands outward to include all connected pixels with a similar intensity.

Real example: Medical imaging – a doctor clicks on a tumor, and the algorithm finds the complete tumor boundary.

Python Code:

import cv2
import numpy as np
from collections import deque

def region_growing(img, seed_point, threshold=20):
    """
    img: grayscale image (2D array)
    seed_point: (row, col) coordinate to start from, i.e. (y, x)
    threshold: maximum allowed intensity difference from the seed pixel
    """
    h, w = img.shape
    segmented = np.zeros_like(img, dtype=np.uint8)
    visited = np.zeros_like(img, dtype=bool)

    # Queue of pixels whose neighbors still need checking (deque gives O(1) pops)
    queue = deque([seed_point])
    seed_value = int(img[seed_point])
    segmented[seed_point] = 255
    visited[seed_point] = True

    while queue:
        r, c = queue.popleft()

        # Check the 4 neighbors (up, down, left, right)
        for dr, dc in [(-1,0), (1,0), (0,-1), (0,1)]:
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and not visited[nr, nc]:
                visited[nr, nc] = True  # mark now so each pixel is tested only once
                diff = abs(int(img[nr, nc]) - seed_value)
                if diff <= threshold:
                    segmented[nr, nc] = 255
                    queue.append((nr, nc))
    return segmented

# Usage
img = cv2.imread('tumor.jpg', 0)
seed = (150, 150)  # (row, col) of a pixel inside the object; choose it for your image
result = region_growing(img, seed, threshold=30)

# Display
cv2.imshow('Original', img)
cv2.imshow('Segmented', result)
cv2.waitKey(0)

MATLAB Code:

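% grayconnected grows a region of similar-intensity pixels from a seed
img = imread('tumor.jpg');
gray_img = rgb2gray(img);

% Seed point (row, column); choose it for your image
seed_row = 150;
seed_col = 150;
mask = false(size(gray_img));
mask(seed_row, seed_col) = true;  % Only used to display the seed

% Region growing: keep connected pixels within 30 gray levels of the seed
segmented = grayconnected(gray_img, seed_row, seed_col, 30);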

% Display
figure;
subplot(1,3,1), imshow(gray_img), title('Original');
subplot(1,3,2), imshow(mask), title('Seed Point');
subplot(1,3,3), imshow(segmented), title('Region Growing Result');

7. Method 5: Watershed Algorithm (Separating Touching Objects)

Concept: Treat the image as a topographic map: flood it from the local minima and build dams where water from different basins meets.

Real example: Counting overlapping coins or cells in a microscope image.

Python Code:

import cv2
import numpy as np
import matplotlib.pyplot as plt

# Load image
img = cv2.imread('coins.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Threshold
_, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Remove noise
kernel = np.ones((3,3), np.uint8)
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=2)

# Sure background area
sure_bg = cv2.dilate(opening, kernel, iterations=3)

# Finding sure foreground area
dist_transform = cv2.distanceTransform(opening, cv2.DIST_L2, 5)
_, sure_fg = cv2.threshold(dist_transform, 0.7*dist_transform.max(), 255, 0)

# Finding unknown region
sure_fg = np.uint8(sure_fg)
unknown = cv2.subtract(sure_bg, sure_fg)

# Marker labelling
_, markers = cv2.connectedComponents(sure_fg)
markers = markers + 1
markers[unknown == 255] = 0

# Apply watershed
markers = cv2.watershed(img, markers)
img[markers == -1] = [0,0,255]  # Mark boundaries in red (img is BGR here)

# Display
plt.figure(figsize=(12,4))
plt.subplot(131), plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB)), plt.title('Watershed Result')
plt.subplot(132), plt.imshow(thresh, 'gray'), plt.title('Threshold')
plt.subplot(133), plt.imshow(sure_fg, 'gray'), plt.title('Foreground')
plt.show()

MATLAB Code:

img = imread('coins.jpg');
gray_img = rgb2gray(img);

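% Binarize; invert so the coins are foreground
% (assumes coins darker than the background, as in the Python version)
bw = ~imbinarize(gray_img);

% Watershed on raw intensities tends to over-segment,
% so flood the negated distance transform instead
D = -bwdist(~bw);
D(~bw) = Inf;
L = watershed(D);
L(~bw) = 0;
rgb_img = label2rgb(L);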

% Display
figure;
subplot(1,2,1), imshow(img), title('Original');
subplot(1,2,2), imshow(rgb_img), title('Watershed Segmentation');
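
The key trick in the Python listing is the "sure foreground" step: the distance transform is large in the middle of each object and small where two objects touch, so thresholding it splits one merged blob into one marker per object. The sketch below demonstrates only that step, on a tiny hand-drawn mask, in plain Python. The helper names and the test shape are illustrative assumptions, and the brute-force distance transform is far slower than cv2.distanceTransform.

```python
from collections import deque
from math import hypot


def count_components(mask):
    """Count 4-connected groups of True cells using a breadth-first flood fill."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not seen[r][c]:
                count += 1
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:
                    cr, cc = queue.popleft()
                    for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                        if 0 <= nr < h and 0 <= nc < w and mask[nr][nc] and not seen[nr][nc]:
                            seen[nr][nc] = True
                            queue.append((nr, nc))
    return count


def distance_transform(mask):
    """Distance from each foreground cell to the nearest background cell (brute force)."""
    h, w = len(mask), len(mask[0])
    background = [(r, c) for r in range(h) for c in range(w) if not mask[r][c]]
    return [
        [min(hypot(r - br, c - bc) for br, bc in background) if mask[r][c] else 0.0
         for c in range(w)]
        for r in range(h)
    ]


# Two 7x7 squares joined by a thin bridge, standing in for two touching coins
rows = [
    ".................",
    ".#######.#######.",
    ".#######.#######.",
    ".###############.",
    ".###############.",
    ".###############.",
    ".#######.#######.",
    ".#######.#######.",
    ".................",
]
mask = [[ch == "#" for ch in row] for row in rows]

dist = distance_transform(mask)
peak = max(max(row) for row in dist)
sure_fg = [[d > 0.7 * peak for d in row] for row in dist]  # same 0.7 factor as above

print("Objects in the thresholded mask:", count_components(mask))
print("Markers after the distance transform:", count_components(sure_fg))
```

The raw mask counts as a single object because the bridge connects the squares, while the thresholded distance transform yields two separate markers. Watershed then grows those markers back out to recover the full objects.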

8. Complete Real-World Example: Face Detection Using Segmentation

Let's combine everything into a practical example.

Problem: Separate a person's face from background.

Python Complete Code:

import cv2
import numpy as np
import matplotlib.pyplot as plt

def segment_face(image_path):
    # Step 1: Load image
    img = cv2.imread(image_path)
    img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    
    # Step 2: Use face detector to get region of interest
    face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
    faces = face_cascade.detectMultiScale(gray, 1.1, 4)
    
    if len(faces) == 0:
        print("No face detected")
        return None  # no segmentation mask to return
    
    # Step 3: Get the first face
    (x, y, w, h) = faces[0]
    face_region = gray[y:y+h, x:x+w]
    
    # Step 4: Apply thresholding on face region
    _, face_seg = cv2.threshold(face_region, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    
    # Step 5: Draw the detected face box on a copy of the original
    result = img_rgb.copy()
    cv2.rectangle(result, (x, y), (x+w, y+h), (255, 0, 0), 2)  # red box (image is RGB)
    
    # Display
    plt.figure(figsize=(15,5))
    plt.subplot(131), plt.imshow(result), plt.title('Detected Face')
    plt.subplot(132), plt.imshow(face_region, 'gray'), plt.title('Face Region')
    plt.subplot(133), plt.imshow(face_seg, 'gray'), plt.title('Segmented Face')
    plt.show()
    
    return face_seg

# Run it
segment_face('group_photo.jpg')

9. Quick Reference: Which Method to Use?

10. Common Problems and Solutions

Problem 1: My edges are broken/disconnected

  • Solution: Close the small gaps with a morphological closing
kernel = np.ones((3,3), np.uint8)
edges = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel)

Problem 2: Too many small regions in segmentation

  • Solution: Increase minimum region size or apply blur first
img = cv2.GaussianBlur(img, (5,5), 0)

Problem 3: K-means gives different results each time

  • Solution: Fix OpenCV's random seed before calling kmeans
cv2.setRNGSeed(42)
cv2.kmeans(pixels, k, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)

Problem 4: Slow processing for large images

  • Solution: Shrink the image first (scaling both sides equally keeps the aspect ratio)
img = cv2.resize(img, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_AREA)
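
For Problem 2, "increase minimum region size" means discarding connected regions that contain fewer pixels than some limit. Here is a minimal plain-Python sketch of that filter. The function name `remove_small_regions`, the sample mask, and the limit of 3 pixels are illustrative assumptions. With OpenCV, the region areas reported by cv2.connectedComponentsWithStats can be used for the same purpose.

```python
from collections import deque


def remove_small_regions(mask, min_size):
    """Keep only 4-connected regions of 1s that contain at least min_size pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    cleaned = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not seen[r][c]:
                # Collect every pixel of this region with a flood fill
                region = [(r, c)]
                seen[r][c] = True
                queue = deque(region)
                while queue:
                    cr, cc = queue.popleft()
                    for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                        if 0 <= nr < h and 0 <= nc < w and mask[nr][nc] and not seen[nr][nc]:
                            seen[nr][nc] = True
                            region.append((nr, nc))
                            queue.append((nr, nc))
                # Copy the region to the output only if it is large enough
                if len(region) >= min_size:
                    for pr, pc in region:
                        cleaned[pr][pc] = 1
    return cleaned


# Made-up mask: a 2x2 block, a 3-pixel bar, and two isolated noise pixels
mask = [
    [1, 1, 0, 0, 0, 1],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
]
cleaned = remove_small_regions(mask, min_size=3)
for row in cleaned:
    print(row)
```

The two single-pixel specks disappear while the block and the bar survive. Choose the limit relative to the smallest real object you expect in the image.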

11. Practice Exercise for Students

Task: Given a photo of fruits on a table, separate each fruit and count them.

Steps to follow:

  1. Load image
  2. Convert to RGB
  3. Apply K-means with K = number of fruit types
  4. Count connected components in each segment
  5. Display results with numbers

Expected output: "Found 3 red apples, 2 bananas, 4 oranges"

Hint solution structure:

# Basic skeleton
import cv2
import numpy as np

img = cv2.imread('fruits.jpg')
# Your code here
# ... use K-means, then build `mask`: a binary uint8 image (255 = pixels of one cluster)
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
print(f"Count: {len(contours)}")

12. Conclusion

Advice for Students:

  • Start with thresholding for simple problems
  • Use Canny edges when you need boundaries
  • Use K-means for color-based segmentation
  • Use Watershed only when objects are touching
  • Always blur noisy images before segmentation