patchwork-plusplus

Patchwork++


C++ API Python API

<a href=https://www.youtube.com/watch?v=fogCM159GRk>Video</a>   •   Install   •   ROS2   •   <a href=https://www.youtube.com/watch?v=fogCM159GRk>Paper</a>   •   <a href=https://github.com/url-kaist/patchwork-plusplus/issues>Contact Us</a>

<img src=pictures/patchwork++.gif alt="animated" />

(May 19, 2026) pip installation is now live:
```bash
pip install pypatchworkpp
```

[Patchwork++][arxivlink], an extension of [Patchwork][patchworklink], is **a fast, robust, and self-adaptive ground segmentation algorithm** for 3D point clouds.

:books: Usage Guide

This guide covers three things that are easy to get wrong on first contact:

  1. Choosing a SemanticKITTI evaluation protocol — picks the right ground-truth definition so numbers match the paper.
  2. Tuning the algorithm parameters for your sensor — what each knob does and which ones to touch first when results look bad.
  3. Reproducing the paper’s Table I — a one-command sweep.

For a quick start, jump to §3.


:scroll: 1. Evaluation protocols

The Patchwork and Patchwork++ papers use different ground-truth definitions on SemanticKITTI. The eval driver python/examples/evaluate_semantickitti.py supports both via --eval_protocol {patchwork, patchworkpp}.

Why the two papers disagree

The disagreement is concentrated on one class: vegetation (label 70). SemanticKITTI’s vegetation label conflates two visually similar but physically very different things — low ground cover (grass, terrain weeds, leaves on flat ground) and overhead foliage / branches / hedge tops. The first is essentially ground; the second is not.

Either choice is defensible; they just yield different numbers on the same predictions. Always use the protocol that matches the paper you’re comparing against.
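A minimal sketch of how the two protocols could map SemanticKITTI labels to ground truth. The label IDs are the standard SemanticKITTI ones, but the helper `ground_truth`, the `ALWAYS_GROUND` set, the `LOW_VEGETATION_Z` cutoff, and the reading that `patchworkpp` leaves vegetation out of scoring are assumptions for illustration. The authoritative definitions are in python/examples/evaluate_semantickitti.py.

```python
# Standard SemanticKITTI label IDs.
ROAD, PARKING, SIDEWALK, OTHER_GROUND = 40, 44, 48, 49
LANE_MARKING, VEGETATION, TERRAIN = 60, 70, 72

# Assumption: classes treated as ground under both protocols.
ALWAYS_GROUND = {ROAD, PARKING, SIDEWALK, OTHER_GROUND, LANE_MARKING, TERRAIN}

# Assumption: sensor-frame height (m) below which vegetation is ground cover.
LOW_VEGETATION_Z = -1.3


def ground_truth(label, z, protocol):
    """Return True (ground), False (non-ground) or None (not scored)."""
    if protocol not in ("patchwork", "patchworkpp"):
        raise ValueError(f"unknown protocol: {protocol}")
    if label in ALWAYS_GROUND:
        return True
    if label == VEGETATION:
        if protocol == "patchwork":
            return z < LOW_VEGETATION_Z  # low vegetation counts as ground
        return None                      # patchworkpp: vegetation is not scored
    return False


grass = (VEGETATION, -1.6)   # ground cover near the road surface
branch = (VEGETATION, 0.8)   # foliage above the sensor
for protocol in ("patchwork", "patchworkpp"):
    print(protocol, ground_truth(*grass, protocol), ground_truth(*branch, protocol))
```

The same prediction for the grass point is scored as a hit or miss under one protocol and ignored under the other, which is why the two protocols give different numbers for identical inference output.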

A. --eval_protocol patchwork (original Patchwork repo protocol)

Use this when comparing against numbers from the original Patchwork paper / url-kaist/patchwork.

B. --eval_protocol patchworkpp (Patchwork++ paper Table I protocol — DEFAULT for reproducing the Patchwork++ paper)

Use this when comparing against numbers from the Patchwork++ paper.

Why it matters

Same Patchwork++ inference, KITTI 00–10 macro average, two protocols:

| Protocol | Precision | Recall | F1 |
|---|---|---|---|
| `--eval_protocol patchwork` | 93.72 | 92.33 | 92.87 |
| `--eval_protocol patchworkpp` | 95.55 | 97.16 | 96.29 |
| Patchwork++ paper Table I | 94.92 | 98.18 | 96.51 |

That is a 3.4 F1 difference on identical predictions, caused entirely by the protocol switch. If your reproduction is about 3 F1 low, the protocol is almost certainly the cause.


:wrench: 2. Parameter tuning

If results look wrong on a new sensor (Velodyne 16/32, Ouster 64/128, Livox, etc.), tune in roughly this order. Defaults are in cpp/patchworkpp/include/patchwork/patchworkpp.h (Patchwork++) and cpp/patchwork/include/patchwork/patchwork.h (classic Patchwork).

Step 1 — Get sensor_height right (the most important parameter)

sensor_height is the height of the LiDAR origin above the ground when the vehicle is stationary on flat pavement.

How to tell it is wrong: precision is fine on far-range patches but ground points near the sensor are split between ground and non-ground in a striped pattern. The elevation threshold and adaptive seed selection both reference sensor_height directly.

If you cannot measure it, leave ATAT_ON = true and the All-Terrain Automatic heighT estimator will recover it from the first scan.
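For a rough starting value without a tape measure, the height can be read off a single scan taken on flat ground. The sketch below is not the ATAT estimator. It only takes the median height of the lowest points close to the sensor, and the function name `rough_sensor_height`, the 5 m radius, and the 20 % fraction are illustrative assumptions.

```python
import math
import random
import statistics


def rough_sensor_height(points, max_radius=5.0, low_fraction=0.2):
    """Estimate LiDAR height above flat ground from (x, y, z) sensor-frame points."""
    near_z = sorted(z for x, y, z in points if math.hypot(x, y) <= max_radius)
    if not near_z:
        raise ValueError("no points within max_radius of the sensor")
    lowest = near_z[: max(1, int(len(near_z) * low_fraction))]
    # Ground lies below the sensor, so its z is negative; flip the sign.
    return -statistics.median(lowest)


# Synthetic scan: flat ground 1.723 m below the sensor plus a nearby wall.
rng = random.Random(0)
scan = [(rng.uniform(-3, 3), rng.uniform(-3, 3), -1.723 + rng.gauss(0, 0.02))
        for _ in range(500)]
scan += [(3.0, rng.uniform(-3, 3), rng.uniform(-1.5, 1.0)) for _ in range(100)]

estimate = rough_sensor_height(scan)
print(f"estimated sensor_height: {estimate:.2f} m")
```

Taking only the lowest points biases the estimate slightly high under sensor noise, so treat the result as a seed value rather than a calibration.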

Step 2 — Tune uprightness_thr for the surface roughness you expect

uprightness_thr is the cosine of the maximum tilt angle accepted for a patch’s normal vs. world-up. Higher = stricter.

| Setting | Max tilt | When to use |
|---|---|---|
| 0.5 | ~60° | very rough terrain, off-road; library default for Patchwork++ |
| 0.707 | ~45° | Patchwork paper / on-road / structured driving; recommended for KITTI |
| 0.866 | ~30° | flat indoor floors, parking lots |

If precision is low and you see ramps, low walls, or curbs being labelled as ground: increase to 0.707 or 0.866. If recall is low on hills, ramps, or rough pavement: lower to 0.5 or 0.4.
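The settings above are cosines of the maximum tilt, so a threshold for any other angle can be derived the same way. The sketch below is illustrative only: `uprightness_from_tilt` and `is_upright` are hypothetical helpers, and using `abs(nz)` with `>=` is an assumption about how the sign of the normal and the boundary case are handled.

```python
import math


def uprightness_from_tilt(max_tilt_deg):
    """Threshold corresponding to a maximum accepted tilt from world-up."""
    return math.cos(math.radians(max_tilt_deg))


def is_upright(normal, uprightness_thr):
    """True if the patch normal is within the tilt limit of the z axis."""
    nx, ny, nz = normal
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return abs(nz) / length >= uprightness_thr


for tilt in (60, 45, 30):
    print(f"{tilt} deg -> uprightness_thr = {uprightness_from_tilt(tilt):.3f}")

# A surface tilted 50 degrees passes the loose setting but not the KITTI one.
ramp_normal = (math.sin(math.radians(50)), 0.0, math.cos(math.radians(50)))
print(is_upright(ramp_normal, 0.5), is_upright(ramp_normal, 0.707))
```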

Step 3 — Set range bounds min_range / max_range

Step 4 — Tune the plane-fit thresholds

Step 5 — elevation_thr and flatness_thr (only if you’ve changed the sensor mount or scene scale)

elevation_thr = {0.523, 0.746, 0.879, 1.125} are the ground-frame height cutoffs for the four closest CZM rings. A patch whose mean height is more than the cutoff above the ground is rejected unless its planarity (flatness_thr) saves it. The library converts these values to the sensor frame internally by subtracting sensor_height.

Rule of thumb: if your sensor sits lower or higher than KITTI's 1.723 m, scale the cutoffs in proportion to expected_terrain_undulation / 1.723 m. Most users do not need to touch these.
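As a worked example of the frame conversion, the sketch below subtracts a sensor height of 1.723 m (the KITTI value used in the rule of thumb) from the ground-frame cutoffs. The helper `exceeds_elevation` is hypothetical and only illustrates how a sensor-frame patch height would compare against the converted cutoff.

```python
SENSOR_HEIGHT = 1.723  # metres, KITTI value
ELEVATION_THR_GROUND = [0.523, 0.746, 0.879, 1.125]  # one per ring, ground frame

# Sensor-frame cutoffs: ground-frame value minus the sensor height.
elevation_thr_sensor = [round(t - SENSOR_HEIGHT, 3) for t in ELEVATION_THR_GROUND]
print(elevation_thr_sensor)


def exceeds_elevation(mean_z_sensor, ring):
    """True if a patch's mean height (sensor frame) is above the ring's cutoff."""
    return mean_z_sensor > elevation_thr_sensor[ring]


# Ring 0: a patch at z = -1.6 m is low enough; one at z = -1.0 m is not.
print(exceeds_elevation(-1.6, 0), exceeds_elevation(-1.0, 0))
```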

Step 6 — Patchwork++ extras (pypatchworkpp.patchworkpp only)


:rocket: 3. Reproducing paper Table I

```bash
# 1. Install once
pip install -v ./python/

# 2. Reproduce Patchwork++ Table I row on KITTI 00–10
python python/examples/evaluate_semantickitti.py \
    --method patchworkpp \
    --eval_protocol patchworkpp \
    --dataset_path /path/to/SemanticKITTI/sequences \
    --output_csv summary_patchworkpp.csv
```

Expected output (full sweep, 23,201 frames):

seq frames P R F1
Avg 23201 95.55 97.16 96.29

Paper Table I: P=94.92, R=98.18, F1=96.51. The reproduction is within 0.22 F1 of the paper.

Quick smoke test (3 frames per seq, ~5 s total)

```bash
python python/examples/evaluate_semantickitti.py \
    --method patchworkpp \
    --eval_protocol patchworkpp \
    --dataset_path /path/to/SemanticKITTI/sequences \
    --max_frames 3 --verbose
```

Apples-to-apples vs. the original Patchwork repo

```bash
# Compare the in-repo classic Patchwork against the original ROS 2 patchwork
python python/examples/evaluate_semantickitti.py \
    --method patchwork \
    --eval_protocol patchworkpp \
    --dataset_path /path/to/SemanticKITTI/sequences \
    --output_csv summary_patchwork.csv
```

--method patchwork is paper-faithful since v1.3.0 (see #89 / #90 for the fixes).


:bar_chart: 4. Official benchmarks

KITTI 00-10 full sweep, 23,201 frames, macro-average across the eleven sequences. All numbers are produced by python/examples/evaluate_semantickitti.py on current master (v1.3.1) with paper-matched parameters (the script already sets uprightness_thr=0.707 and using_global_thr=false for --method patchwork; --method patchworkpp uses library defaults).

--eval_protocol patchworkpp (Patchwork++ paper Sec. IV.A — VEGETATION excluded)

| Method | Precision | Recall | F1 |
|---|---|---|---|
| `--method patchwork` (this repo, classic Patchwork) | 94.64 | 97.58 | 96.02 |
| `--method patchworkpp` (this repo, Patchwork++) | 95.55 | 97.16 | 96.29 |
| Patchwork [1], as reported in Patchwork++ paper Table I | 94.23 | 97.62 | 95.88 |
| Patchwork++, as reported in Patchwork++ paper Table I | 94.92 | 98.18 | 96.51 |
| url-kaist/patchwork (original ROS 2), independent reference number | 94.38 | 97.90 | 96.05 |

This is the protocol you want for reproducing the Patchwork++ paper.

--eval_protocol patchwork (original Patchwork repo — VEGETATION-low-z counts as ground)

| Method | Precision | Recall | F1 |
|---|---|---|---|
| `--method patchwork` (this repo, classic Patchwork) | 92.77 | 93.66 | 93.08 |
| `--method patchworkpp` (this repo, Patchwork++) | 93.72 | 92.33 | 92.87 |
| Patchwork [1], as reported in original Patchwork paper Table I | 92.47 | 93.43 | 93.00 |
| url-kaist/patchwork (original ROS 2), independent reference number | 91.94 | 94.22 | 92.94 |

This is the protocol you want for apples-to-apples comparisons against the original Patchwork paper / url-kaist/patchwork repo.

Reading the table

Reproducing any row

```bash
# Patchwork++, paper protocol: top-line headline number
python python/examples/evaluate_semantickitti.py \
    --method patchworkpp --eval_protocol patchworkpp \
    --dataset_path /path/to/SemanticKITTI/sequences

# Classic Patchwork, paper protocol: apples-to-apples vs. Patchwork++
python python/examples/evaluate_semantickitti.py \
    --method patchwork --eval_protocol patchworkpp \
    --dataset_path /path/to/SemanticKITTI/sequences

# Either method under the original Patchwork-paper protocol: swap in `--eval_protocol patchwork`
```

:chart_with_upwards_trend: 5. Per-sequence performance

All numbers below are produced by python/examples/evaluate_semantickitti.py on v1.3.1 (current master), KITTI 00-10, paper-matched parameters. Use them to debug per-sequence regressions: if seq 05 looks fine but seq 10 is 3 F1 below the table, you have a parameter problem, not a code problem.

--method patchworkpp --eval_protocol patchworkpp (headline configuration, matches Patchwork++ paper)

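| seq | frames | Precision | Recall | F1 |
|---|---|---|---|---|
| 00 | 4541 | 94.88 | 98.47 | 96.62 |
| 01 | 1101 | 98.43 | 96.36 | 97.34 |
| 02 | 4661 | 95.63 | 97.18 | 96.35 |
| 03 | 801 | 96.72 | 97.73 | 97.21 |
| 04 | 271 | 98.20 | 96.40 | 97.25 |
| 05 | 2761 | 92.06 | 97.87 | 94.84 |
| 06 | 1101 | 98.01 | 97.24 | 97.61 |
| 07 | 1101 | 92.89 | 98.45 | 95.56 |
| 08 | 4071 | 96.29 | 97.26 | 96.74 |
| 09 | 1591 | 96.01 | 96.25 | 96.06 |
| 10 | 1201 | 91.93 | 95.63 | 93.63 |
| Avg | 23201 | 95.55 | 97.17 | 96.29 |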
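The Avg row is a macro average: each metric is averaged over the eleven sequences, so the F1 entry is the mean of the per-sequence F1 values, not the F1 of the averaged precision and recall. A short sketch using the numbers from the table above shows the difference. Because the inputs are already rounded to two decimals, a recomputed average can differ from the reported one in the last digit.

```python
from statistics import mean

# (precision, recall, F1) per sequence, copied from the table above.
per_seq = {
    "00": (94.88, 98.47, 96.62), "01": (98.43, 96.36, 97.34),
    "02": (95.63, 97.18, 96.35), "03": (96.72, 97.73, 97.21),
    "04": (98.20, 96.40, 97.25), "05": (92.06, 97.87, 94.84),
    "06": (98.01, 97.24, 97.61), "07": (92.89, 98.45, 95.56),
    "08": (96.29, 97.26, 96.74), "09": (96.01, 96.25, 96.06),
    "10": (91.93, 95.63, 93.63),
}

macro_p = mean(p for p, _, _ in per_seq.values())
macro_r = mean(r for _, r, _ in per_seq.values())
macro_f1 = mean(f for _, _, f in per_seq.values())

# Harmonic mean of the averaged P and R: a different (larger) number.
f1_of_means = 2 * macro_p * macro_r / (macro_p + macro_r)

print(f"macro P  = {macro_p:.2f}")
print(f"macro F1 = {macro_f1:.2f}")
print(f"F1 of averaged P and R = {f1_of_means:.2f}")
```

When comparing against the tables, average per-sequence F1 values the same way; recomputing F1 from the Avg precision and recall will not reproduce the reported figure.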

--method patchwork --eval_protocol patchworkpp (classic Patchwork, paper protocol)

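| seq | frames | Precision | Recall | F1 |
|---|---|---|---|---|
| 00 | 4541 | 93.61 | 98.97 | 96.19 |
| 01 | 1101 | 97.47 | 96.80 | 97.09 |
| 02 | 4661 | 95.26 | 97.11 | 96.11 |
| 03 | 801 | 96.31 | 98.24 | 97.23 |
| 04 | 271 | 98.15 | 97.96 | 98.04 |
| 05 | 2761 | 90.32 | 98.53 | 94.19 |
| 06 | 1101 | 97.32 | 98.45 | 97.88 |
| 07 | 1101 | 91.19 | 98.71 | 94.76 |
| 08 | 4071 | 95.52 | 98.16 | 96.79 |
| 09 | 1591 | 95.29 | 96.63 | 95.87 |
| 10 | 1201 | 90.65 | 93.86 | 92.04 |
| Avg | 23201 | 94.64 | 97.58 | 96.02 |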

--method patchworkpp --eval_protocol patchwork (Patchwork++, original-Patchwork protocol)

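| seq | frames | Precision | Recall | F1 |
|---|---|---|---|---|
| 00 | 4541 | 93.93 | 93.29 | 93.53 |
| 01 | 1101 | 97.03 | 87.33 | 91.80 |
| 02 | 4661 | 93.40 | 93.36 | 93.29 |
| 03 | 801 | 90.74 | 93.21 | 91.83 |
| 04 | 271 | 97.77 | 88.93 | 93.10 |
| 05 | 2761 | 91.38 | 94.24 | 92.76 |
| 06 | 1101 | 97.59 | 95.73 | 96.64 |
| 07 | 1101 | 92.12 | 96.03 | 93.99 |
| 08 | 4071 | 94.81 | 92.21 | 93.43 |
| 09 | 1591 | 93.56 | 91.00 | 92.13 |
| 10 | 1201 | 88.53 | 90.36 | 89.14 |
| Avg | 23201 | 93.72 | 92.34 | 92.88 |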

--method patchwork --eval_protocol patchwork (classic Patchwork, original-Patchwork protocol)

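| seq | frames | Precision | Recall | F1 |
|---|---|---|---|---|
| 00 | 4541 | 92.34 | 94.64 | 93.41 |
| 01 | 1101 | 95.84 | 89.16 | 92.27 |
| 02 | 4661 | 93.13 | 93.87 | 93.42 |
| 03 | 801 | 90.26 | 95.74 | 92.77 |
| 04 | 271 | 97.44 | 91.40 | 94.29 |
| 05 | 2761 | 89.18 | 95.54 | 92.20 |
| 06 | 1101 | 96.72 | 97.06 | 96.88 |
| 07 | 1101 | 90.02 | 96.80 | 93.24 |
| 08 | 4071 | 93.71 | 93.79 | 93.69 |
| 09 | 1591 | 92.69 | 92.46 | 92.46 |
| 10 | 1201 | 89.10 | 89.80 | 89.25 |
| Avg | 23201 | 92.77 | 93.66 | 93.08 |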

Per-sequence tips


:vs: 6. RANSAC baseline (Open3D segment_plane)

A common first instinct on a new dataset is to fit a single plane with RANSAC and call the inliers “ground”. python/examples/evaluate_ransac_in_semantickitti.py does exactly that, on top of Open3D’s segment_plane, with the same metric definitions and --eval_protocol flag as evaluate_semantickitti.py, so the numbers drop directly into the same comparison frame as §5.

```bash
# Single (thr, iter) point; when the flags are omitted they default to thr=0.15, iter=500
python python/examples/evaluate_ransac_in_semantickitti.py \
    --distance_threshold 0.15 --num_iterations 1000 \
    --eval_protocol patchworkpp

# Full sweep across a (thr × iter) grid
python python/examples/evaluate_ransac_in_semantickitti.py \
    --seqs 00 \
    --sweep_thresholds 0.10,0.15,0.25,0.30,0.40,0.50 \
    --sweep_iterations 100,500,1000,5000,10000 \
    --eval_protocol patchworkpp \
    --output_csv summary_ransac_seq00_grid.csv
```

Grid sweep on KITTI seq 00 (4541 frames, --eval_protocol patchworkpp)

distance_threshold (rows) is the maximum point-to-plane distance, in metres, for a point to count as an inlier. num_iterations (columns) is the RANSAC hypothesis cap; Open3D’s segment_plane terminates early when a hypothesis crosses an internal confidence bound, so this is a maximum, not an exact iteration count. ransac_n=3 throughout (three points define a plane). Each cell gives F1 (%), followed in parentheses by the median wall-clock time of segment_plane per frame in ms.

| thr \ iter | 100 | 500 | 1000 | 5000 | 10000 |
|---|---|---|---|---|---|
| 0.10 | 82.67 (16.5 ms) | 88.69 (34.6 ms) | 89.31 (37.5 ms) | 89.31 (56.6 ms) | 89.33 (56.7 ms) |
| 0.15 | 89.34 (17.1 ms) | 93.12 (29.3 ms) | 93.28 (29.3 ms) | 93.30 (40.7 ms) | 93.35 (40.8 ms) |
| 0.25 | 90.94 (17.4 ms) | 92.34 (24.0 ms) | 92.72 (24.2 ms) | 92.52 (30.4 ms) | 92.52 (30.5 ms) |
| 0.30 | 89.54 (17.5 ms) | 90.16 (22.6 ms) | 90.20 (22.4 ms) | 90.35 (27.5 ms) | 90.21 (27.7 ms) |
| 0.40 | 84.38 (15.8 ms) | 84.72 (18.6 ms) | 84.78 (20.4 ms) | 84.75 (22.8 ms) | 84.71 (23.0 ms) |
| 0.50 | 79.43 (18.3 ms) | 80.25 (17.8 ms) | 80.16 (18.1 ms) | 80.24 (18.4 ms) | 80.02 (18.6 ms) |

Wall-clock numbers are median per-frame ms of segment_plane on an i7-12700; the 24-thread parallel default of Open3D is used for iter ≤ 1000, and 8 threads (OMP_NUM_THREADS=8) for iter ≥ 5000 (the 24-thread iter=10000 run exhausted system memory). Compare F1 numbers across columns freely; absolute ms across iter≤1000 and iter≥5000 columns are not directly comparable.
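To make the two swept parameters concrete, here is a minimal pure-Python sketch of single-plane RANSAC. It is not Open3D's implementation (it has no early termination and no parallelism), and the helper names and the synthetic scene are assumptions for illustration. `distance_threshold` decides which points count as inliers and `num_iterations` caps how many three-point hypotheses are tried.

```python
import random


def fit_plane(p0, p1, p2):
    """Plane (a, b, c, d) with unit normal through three points, or None."""
    ux, uy, uz = (p1[i] - p0[i] for i in range(3))
    vx, vy, vz = (p2[i] - p0[i] for i in range(3))
    a, b, c = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = (a * a + b * b + c * c) ** 0.5
    if norm < 1e-9:  # degenerate (collinear) sample
        return None
    a, b, c = a / norm, b / norm, c / norm
    return a, b, c, -(a * p0[0] + b * p0[1] + c * p0[2])


def ransac_plane(points, distance_threshold=0.15, num_iterations=1000, seed=0):
    """Return indices of the inliers of the best plane found."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(num_iterations):
        plane = fit_plane(*rng.sample(points, 3))
        if plane is None:
            continue
        a, b, c, d = plane
        inliers = [i for i, (x, y, z) in enumerate(points)
                   if abs(a * x + b * y + c * z + d) <= distance_threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    return best_inliers


# Synthetic ground: a flat road plus a raised terrace 0.8 m higher.
rng = random.Random(1)
n_road, n_terrace = 300, 100
road = [(rng.uniform(-10, 10), rng.uniform(-4, 4), -1.7 + rng.uniform(-0.02, 0.02))
        for _ in range(n_road)]
terrace = [(rng.uniform(-10, 10), rng.uniform(4, 6), -0.9 + rng.uniform(-0.02, 0.02))
           for _ in range(n_terrace)]
points = road + terrace

inliers = ransac_plane(points, distance_threshold=0.15, num_iterations=200)
print(f"ground points recovered: {len(inliers)} / {len(points)}")
```

Every point in this scene is ground, yet the best single plane locks onto the road and leaves the terrace out, no matter how many more hypotheses are sampled.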

Reading the grid

Best config on the full KITTI 00–10 sweep

Picking thr=0.15, iter=1000 (ties the highest-iter F1 at this threshold, runs faster) and evaluating on all 23,201 frames under the Patchwork++ paper protocol:

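| seq | frames | Precision | Recall | F1 |
|---|---|---|---|---|
| 00 | 4541 | 95.37 | 91.63 | 93.31 |
| 01 | 1101 | 98.33 | 87.74 | 92.52 |
| 02 | 4661 | 94.34 | 80.44 | 86.27 |
| 03 | 801 | 97.92 | 77.49 | 85.79 |
| 04 | 271 | 97.70 | 87.90 | 92.42 |
| 05 | 2761 | 93.01 | 88.09 | 90.26 |
| 06 | 1101 | 97.29 | 79.67 | 87.52 |
| 07 | 1101 | 92.68 | 89.33 | 90.81 |
| 08 | 4071 | 93.33 | 78.20 | 83.88 |
| 09 | 1591 | 96.75 | 80.68 | 87.65 |
| 10 | 1201 | 79.23 | 61.17 | 67.75 |
| Avg | 23201 | 94.18 | 82.03 | 87.11 |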

Median wall-clock 19.5 ms / frame (51.2 Hz) with Open3D’s default 24-thread parallelism on an i7-12700.

Macro comparison — RANSAC vs. Patchwork / Patchwork++ on KITTI 00–10

Side-by-side with the §5 numbers, under --eval_protocol patchworkpp on the same 23,201 frames:

| Method | Precision | Recall | F1 | Median ms |
|---|---|---|---|---|
| Open3D RANSAC (best: thr=0.15, iter=1000) | 94.18 | 82.03 | 87.11 | ~19.5 |
| Classic Patchwork (this repo, v1.4.0) | 94.64 | 97.58 | 96.02 | ~9 |
| Patchwork++ (this repo, v1.4.0) | 95.55 | 97.16 | 96.29 | ~18 |

Patchwork++ wins by +9.18 F1 on the macro average and roughly matches RANSAC on wall-clock time per frame (~18 ms vs. ~19.5 ms), even though Patchwork++ is currently single-threaded on v1.4.0 (TBB intentionally disabled; see #96) while Open3D’s segment_plane runs with 24 threads. The gap is concentrated in the recall column (RANSAC’s 82.03 vs. Patchwork++’s 97.16): a single global plane cannot cover the multiple ground patches that the concentric-zone partition handles natively.

Per-sequence gap to Patchwork++

The macro gap is not uniform; it is dragged down by the hard sequences:

| seq | scene | RANSAC F1 | Patchwork++ F1 | Δ |
|---|---|---|---|---|
| 00 | residential, mild slope | 93.31 | 96.62 | -3.31 |
| 01 | highway | 92.52 | 97.34 | -4.82 |
| 02 | residential, parked cars | 86.27 | 96.35 | -10.08 |
| 03 | short urban | 85.79 | 97.21 | -11.42 |
| 04 | short highway | 92.42 | 97.25 | -4.83 |
| 05 | undulating road | 90.26 | 94.84 | -4.58 |
| 06 | open road | 87.52 | 97.61 | -10.09 |
| 07 | inner-city | 90.81 | 95.56 | -4.75 |
| 08 | dense urban | 83.88 | 96.74 | -12.86 |
| 09 | rural | 87.65 | 96.06 | -8.41 |
| 10 | rough rural / rolling roads | 67.75 | 93.63 | -25.88 |

Sequences with a gap below 5 F1 (00, 01, 04, 05, 07) are essentially flat with a single dominant ground plane — exactly where the single-plane assumption holds. Sequences with a gap above 10 F1 (02, 03, 06, 08, 10) all have rolling shoulders, multi-tier sidewalks, or rough off-road terrain — multiple ground patches that one plane cannot represent. Seq 10 is the extreme case: rolling rural terrain where one global plane is so wrong RANSAC drops below 70 F1 while Patchwork++ stays above 93 F1.
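The grouping above can be recomputed from the two F1 columns. The sketch below uses the per-sequence values from this section; the 5 and 10 F1 cutoffs are the ones quoted in the paragraph, and the variable names are only for illustration.

```python
from statistics import mean

# Per-sequence F1 under --eval_protocol patchworkpp, from the tables in this guide.
ransac_f1 = {"00": 93.31, "01": 92.52, "02": 86.27, "03": 85.79, "04": 92.42,
             "05": 90.26, "06": 87.52, "07": 90.81, "08": 83.88, "09": 87.65,
             "10": 67.75}
patchworkpp_f1 = {"00": 96.62, "01": 97.34, "02": 96.35, "03": 97.21, "04": 97.25,
                  "05": 94.84, "06": 97.61, "07": 95.56, "08": 96.74, "09": 96.06,
                  "10": 93.63}

# Negative gap means RANSAC is behind Patchwork++ on that sequence.
gap = {seq: round(ransac_f1[seq] - patchworkpp_f1[seq], 2) for seq in ransac_f1}

small_gap = [seq for seq, g in gap.items() if abs(g) < 5]
large_gap = [seq for seq, g in gap.items() if abs(g) > 10]
worst = min(gap, key=gap.get)

print("gap below 5 F1: ", small_gap)
print("gap above 10 F1:", large_gap)
print(f"worst sequence: {worst} ({gap[worst]:+.2f} F1)")
print(f"mean gap: {mean(gap.values()):+.2f} F1")
```

Running the same comparison on your own per-sequence CSVs is a quick way to see whether a regression is uniform or confined to the multi-plane sequences.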

Takeaway

RANSAC is the obvious sanity-check baseline for ground segmentation. On KITTI it is 9 F1 behind the macro Patchwork++ row, 26 F1 behind on the worst sequence, and no improvement at higher iteration counts can close that gap — the bottleneck is the model, not the optimiser. The concentric-zone partition that Patchwork and Patchwork++ both use turns this from a hard problem (one plane for the whole scan) into many easy ones (one plane per patch, with per-patch flatness and elevation gates), which is what closes the gap.

Caveats