UNet

Last Updated: 12/09/2025

UNet Overview

This guide demonstrates efficient image segmentation using the lightweight UNet semantic segmentation model on the SpacemiT AI processor.

The model accepts static images or video streams as input and outputs pixel-level semantic labels (grayscale) and pseudo-color renderings, which are published via ROS 2 for downstream perception or control modules.

The model is trained on the Cityscapes dataset and deployed in ONNX format, supporting accurate segmentation of typical urban scene elements such as people, vehicles, roads, and traffic signs.
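
Because the model ships as a plain ONNX file, it can also be exercised outside ROS 2 for quick checks. The following is a minimal sketch using the ONNX Runtime Python API (which python3-spacemit-ort provides); the model path, input size, and normalization are illustrative assumptions, so verify them against config/segmentation/unet.yaml.

# Minimal sketch: direct ONNX inference outside ROS 2.
# "unet.onnx", the 640x480 input size, and the /255 normalization are
# assumptions for illustration; the real values live in unet.yaml.
import cv2
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("unet.onnx")
input_name = session.get_inputs()[0].name

img = cv2.imread("test_unet.jpg")
blob = cv2.resize(img, (640, 480)).astype(np.float32) / 255.0
blob = blob.transpose(2, 0, 1)[np.newaxis, ...]    # HWC -> NCHW batch

logits = session.run(None, {input_name: blob})[0]  # (1, C, H, W) class scores
labels = np.argmax(logits, axis=1).squeeze().astype(np.uint8)
cv2.imwrite("labels.png", labels)                  # mono8 class-index map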

Typical applications for UNet include:

  • Autonomous driving perception (road/target segmentation)
  • Urban street scene analysis and digital map construction
  • Remote sensing land-use and object classification
  • Medical image segmentation (e.g., organs or lesions)
  • Industrial defect detection and preprocessing

Environment Setup

Install Dependencies

sudo apt install python3-opencv ros-humble-cv-bridge ros-humble-camera-info-manager \
ros-humble-image-transport python3-spacemit-ort

Load ROS 2 Environment

source /opt/bros/humble/setup.bash

Check Supported Segmentation Models

Run the following command to view available segmentation model configurations:

ros2 launch br_perception infer_info.launch.py | grep 'segmentation'

Example output:

- config/segmentation/unet.yaml

Image Inference

Prepare Input Image

cp /opt/bros/humble/share/jobot_infer_py/data/segmentation/test_unet.jpg .

Run Inference and Save Results

ros2 launch br_perception infer_img.launch.py \
config_path:='config/segmentation/unet.yaml' \
img_path:='./test_unet.jpg'

Example terminal output:

[INFO] [launch]: All log files can be found below /home/zq-pi/.ros/log/2025-05-26-09-45-18-005008-spacemit-k1-x-MUSE-Pi-board-5995
[INFO] [launch]: Default logging verbosity is set to INFO
[INFO] [infer_img_node-1]: process started with pid [5996]
[infer_img_node-1] Inference time: 4665.87 ms
[infer_img_node-1] The semantic segmentation results are saved in: seg_result.jpg
[infer_img_node-1] The semantic segmentation pseudo-color image is saved to seg_pseudo_color.png
[INFO] [infer_img_node-1]: process has finished cleanly [pid 5996]

Output files:

  • seg_result.jpg: Overlay of the input image with the predicted segmentation
  • seg_pseudo_color.png: Pseudo-color map rendering each semantic class in a distinct color (see the sketch below)
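
The pseudo-color map is a per-class color lookup over the grayscale label map. A minimal sketch follows; the 19-entry Cityscapes train-ID palette is an assumption, as the exact color table used by the node is not documented here.

# Sketch: pseudo-color rendering via palette lookup. The Cityscapes
# train-ID palette is an assumption; the node may use its own table.
import cv2
import numpy as np

PALETTE = np.zeros((256, 3), dtype=np.uint8)
PALETTE[:19] = [
    (128, 64, 128), (244, 35, 232), (70, 70, 70), (102, 102, 156),
    (190, 153, 153), (153, 153, 153), (250, 170, 30), (220, 220, 0),
    (107, 142, 35), (152, 251, 152), (70, 130, 180), (220, 20, 60),
    (255, 0, 0), (0, 0, 142), (0, 0, 70), (0, 60, 100),
    (0, 80, 100), (0, 0, 230), (119, 11, 32),
]                                                  # RGB per class index

labels = cv2.imread("labels.png", cv2.IMREAD_GRAYSCALE)  # mono8 class map
color = PALETTE[labels]                                  # (H, W, 3) lookup
cv2.imwrite("pseudo_color.png", cv2.cvtColor(color, cv2.COLOR_RGB2BGR))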

Example images:

Original image | Segmentation overlay | Pseudo-color map

Web Visualization of Inference

Terminal 1 — Launch inference with image publishing:

ros2 launch br_perception infer_img.launch.py \
config_path:='config/segmentation/unet.yaml' \
img_path:='./test_unet.jpg' \
publish_result_img:=true \
result_img_topic:='result_img' \
result_topic:='/inference_result'

Example output:

[INFO] [launch]: All log files can be found below /home/zq-pi/.ros/log/2025-05-26-10-05-41-203368-spacemit-k1-x-MUSE-Pi-board-7077
[INFO] [launch]: Default logging verbosity is set to INFO
[INFO] [infer_img_node-1]: process started with pid [7084]
[infer_img_node-1] Inference time: 4712.77 ms
[infer_img_node-1] The image inference results are published cyclically
[infer_img_node-1] The image inference results are published cyclically
[infer_img_node-1] The image inference results are published cyclically

Terminal 2 — Start WebSocket visualization:

ros2 launch br_visualization websocket_cpp.launch.py image_topic:='/result_img'

Open the printed URL (http://<IP>:8080) in your browser to view the segmentation results.

Message Subscription and Monitoring

The inference results are published on the topic set by result_topic (here, /inference_result). You can view the data using:

ros2 topic echo /inference_result

Example output:

header:
  stamp:
    sec: 0
    nanosec: 0
  frame_id: ''
height: 480
width: 640
encoding: mono8
is_bigendian: 0
step: 640
data:
- 2
- 2
- 2
- 2
...

The message uses the standard sensor_msgs/msg/Image format. The mono8 encoding denotes a single-channel grayscale image in which each pixel value is the class index of the segmented object at that pixel.
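
Downstream nodes can consume this label image directly. The sketch below is a minimal rclpy subscriber that converts the message with cv_bridge and logs the pixel count per class index; the topic name follows this guide, while the node name is arbitrary.

# Sketch: subscribing to the label image and counting pixels per class.
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from cv_bridge import CvBridge
import numpy as np

class SegListener(Node):
    def __init__(self):
        super().__init__("seg_listener")
        self.bridge = CvBridge()
        self.create_subscription(Image, "/inference_result", self.on_msg, 10)

    def on_msg(self, msg):
        labels = self.bridge.imgmsg_to_cv2(msg, desired_encoding="mono8")
        ids, counts = np.unique(labels, return_counts=True)
        self.get_logger().info(f"class pixel counts: {dict(zip(ids.tolist(), counts.tolist()))}")

rclpy.init()
rclpy.spin(SegListener())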

Press Ctrl + C to stop the running inference command:

ros2 launch br_perception infer_img.launch.py \
config_path:='config/segmentation/unet.yaml' \
img_path:='./test_unet.jpg' \
publish_result_img:=true \
result_img_topic:='result_img' \
result_topic:='/inference_result'

You can replace img_path with a different image to perform inference on other inputs. The results on the web interface will update automatically.

infer_img.launch.py Parameters

Parameter           Description                                                                    Default
------------------  -----------------------------------------------------------------------------  ----------------------------
config_path         Path to the model configuration file used for inference                       config/detection/yolov6.yaml
img_path            Path to the image file to run inference on                                    data/detection/test.jpg
publish_result_img  Whether to publish the detection result as an image message                   false
result_img_topic    Topic name for rendered image output (only valid if publish_result_img=true)  /result_img
result_topic        Topic name for the inference result message                                   /inference_result

Video Stream Inference

Start Camera Device

ros2 launch br_sensors usb_cam.launch.py video_device:="/dev/video20"

Start Video Segmentation

Terminal 1 — Launch video stream segmentation:

ros2 launch br_perception infer_video.launch.py \
config_path:='config/segmentation/unet.yaml' \
sub_image_topic:='/image_raw' \
publish_result_img:=true \
result_topic:='/inference_result'

Web Visualization

Terminal 2 — Start WebSocket visualization:

ros2 launch br_visualization websocket_cpp.launch.py image_topic:='/result_img'

Visit http://<IP>:8080 in your browser to view real-time segmentation.

Raw Inference Results (No Visualization)

To obtain the raw grayscale semantic labels without web visualization, disable the rendered-image output:

ros2 launch br_perception infer_video.launch.py \
config_path:='config/segmentation/unet.yaml' \
sub_image_topic:='/image_raw' \
publish_result_img:=false \
result_topic:='/inference_result'

You can view the data using:

ros2 topic echo /inference_result

Example output:

header:
  stamp:
    sec: 0
    nanosec: 0
  frame_id: ''
height: 480
width: 640
encoding: mono8
is_bigendian: 0
step: 640
data:
- 2
- 2
- 2
- 2
...

The message uses the standard sensor_msgs/msg/Image format. The mono8 encoding denotes a single-channel grayscale image in which each pixel value is the class index of the segmented object at that pixel.
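
For a live stream, the same label image can drive a downstream module. The sketch below republishes a binary drivable-area mask; treating class 0 as "road" is an assumption based on the Cityscapes train-ID convention and should be checked against the model's label map.

# Sketch: deriving a binary road mask from the live label stream.
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from cv_bridge import CvBridge

ROAD_ID = 0  # assumed Cityscapes train ID for "road"; verify for this model

class RoadMask(Node):
    def __init__(self):
        super().__init__("road_mask")
        self.bridge = CvBridge()
        self.pub = self.create_publisher(Image, "/road_mask", 10)
        self.create_subscription(Image, "/inference_result", self.on_msg, 10)

    def on_msg(self, msg):
        labels = self.bridge.imgmsg_to_cv2(msg, desired_encoding="mono8")
        mask = ((labels == ROAD_ID) * 255).astype("uint8")  # 255 where road
        out = self.bridge.cv2_to_imgmsg(mask, encoding="mono8")
        out.header = msg.header  # keep the source timestamp and frame
        self.pub.publish(out)

rclpy.init()
rclpy.spin(RoadMask())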

infer_video.launch.py Parameters

Parameter           Description                                                        Default
------------------  -----------------------------------------------------------------  ----------------------------
config_path         Path to the model configuration file for inference                config/detection/yolov6.yaml
sub_image_topic     Image topic to subscribe to                                       /image_raw
publish_result_img  Whether to publish the rendered inference image                   false
result_img_topic    Image topic to publish (only when publish_result_img=true)        /result_img
result_topic        Inference-result topic to publish                                 /inference_result