To provide a unified evaluation across diverse metrics, RoboBPP adopts a normalized scoring system. All metrics are converted to a common scale using min–max normalization, ensuring that higher scores consistently represent better performance. After normalization, the final score of an algorithm is computed as a weighted sum of all normalized metrics, where the weights reflect the relative importance of each metric. Detailed definitions of all metrics are given in the Metric section of the Documentation.
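The normalization step can be sketched as follows. This is an illustrative implementation, not RoboBPP's actual code; the function name and the cost-metric inversion are assumptions.

```python
# Hypothetical sketch of min-max normalization as described above.
def min_max_normalize(values, higher_is_better=True):
    """Map raw metric values across algorithms to [0, 1], higher = better."""
    lo, hi = min(values), max(values)
    if hi == lo:  # all algorithms tie on this metric
        return [1.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    # For cost-like metrics (e.g. planning time), invert so higher is better.
    return scaled if higher_is_better else [1.0 - s for s in scaled]

# Example: planning times in seconds (lower is better).
print(min_max_normalize([0.8, 1.2, 2.0], higher_is_better=False))
# roughly [1.0, 0.667, 0.0]
```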
The Execution Pack includes eight metrics, with the weight vector:
w_exec = (0.35, 0.15, 0.08, 0.07, 0.15, 0.08, 0.07, 0.05).
This design emphasizes space utilization while accounting for stability, safety, and computational efficiency.
The Physics Pack evaluates five metrics, excluding trajectory and safety-related measures.
The corresponding normalized weight vector is:
w_phys = (0.43, 0.19, 0.10, 0.09, 0.19).
The Math Pack focuses on three geometry-based metrics, with the weight vector:
w_math = (0.60, 0.26, 0.14).
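Putting the weight vectors together, the final score is simply a dot product of an algorithm's normalized metrics with the weights of the chosen pack. The sketch below uses the weight vectors defined above; the normalized metric values in the example are illustrative, not taken from the benchmark.

```python
# Weight vectors from the text; each sums to 1, so a perfect algorithm scores 1.0.
W_EXEC = (0.35, 0.15, 0.08, 0.07, 0.15, 0.08, 0.07, 0.05)
W_PHYS = (0.43, 0.19, 0.10, 0.09, 0.19)
W_MATH = (0.60, 0.26, 0.14)

def pack_score(normalized_metrics, weights):
    """Weighted sum of normalized metrics (all assumed in [0, 1])."""
    assert len(normalized_metrics) == len(weights)
    return sum(m * w for m, w in zip(normalized_metrics, weights))

# Sanity check: all three weight vectors are normalized.
for w in (W_EXEC, W_PHYS, W_MATH):
    assert abs(sum(w) - 1.0) < 1e-9

# Example with made-up normalized Math Pack metrics.
print(round(pack_score([0.9, 0.8, 0.7], W_MATH), 3))  # 0.846
```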
We aggregated the experimental results using this scoring system to compute an overall score for each algorithm, which allows us to rank all methods across test settings and datasets. The top four results in each column are ranked first through fourth.
| Algorithm | Math Pack (Repetitive) | Math Pack (Diverse) | Math Pack (Long Board) | Physics Pack (Repetitive) | Physics Pack (Diverse) | Physics Pack (Long Board) | Execution Pack (Repetitive) | Execution Pack (Diverse) | Execution Pack (Long Board) |
|---|---|---|---|---|---|---|---|---|---|
| PCT | 0.908 | 0.939 | 0.971 | 0.810 | 0.714 | 0.839 | 0.765 | 0.672 | 0.785 |
| TAP-Net++ | 0.797 | 0.387 | 0.815 | 0.891 | 0.502 | 0.725 | 0.781 | 0.603 | 0.756 |
| AR2L | 0.914 | 0.878 | 0.824 | 0.740 | 0.611 | 0.694 | 0.708 | 0.737 | 0.688 |
| PackE | 0.397 | 0.440 | 0.207 | 0.865 | 0.623 | 0.209 | 0.617 | 0.469 | 0.300 |
| CDRL | 0.468 | 0.638 | 0.655 | 0.491 | 0.453 | 0.612 | 0.422 | 0.728 | 0.637 |
| DBL | 0.854 | 0.763 | 0.879 | 0.850 | 0.476 | 0.816 | 0.807 | 0.713 | 0.777 |
| LSAH | 0.863 | 0.794 | 0.679 | 0.861 | 0.477 | 0.510 | 0.701 | 0.523 | 0.487 |
| HM | 0.823 | 0.627 | 0.609 | 0.709 | 0.554 | 0.683 | 0.414 | 0.623 | 0.654 |
| SDFPack | 0.659 | 0.579 | 0.087 | 0.389 | 0.345 | 0.193 | 0.232 | 0.228 | 0.229 |
| OnlineBPH | 0.527 | 0.534 | 0.583 | 0.517 | 0.478 | 0.560 | 0.464 | 0.618 | 0.574 |
| MACS | 0.221 | 0.135 | 0.009 | 0.298 | 0.183 | 0.236 | 0.238 | 0.364 | 0.293 |
| BR | 0.781 | 0.677 | 0.396 | 0.542 | 0.330 | 0.291 | 0.410 | 0.360 | 0.295 |
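The top-4 selection used in the table can be reproduced per column. The sketch below applies it to the Math Pack / Repetitive column, with scores copied directly from the table.

```python
# Math Pack (Repetitive) scores, copied from the table above.
scores = {
    "PCT": 0.908, "TAP-Net++": 0.797, "AR2L": 0.914, "PackE": 0.397,
    "CDRL": 0.468, "DBL": 0.854, "LSAH": 0.863, "HM": 0.823,
    "SDFPack": 0.659, "OnlineBPH": 0.527, "MACS": 0.221, "BR": 0.781,
}

# Sort descending by score and keep the four best entries.
top4 = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:4]
print(top4)
# [('AR2L', 0.914), ('PCT', 0.908), ('LSAH', 0.863), ('DBL', 0.854)]
```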
The Math Pack setting performs purely geometric placement without any physics simulation or robot execution. By removing gravity, friction, and motion uncertainty, it isolates the algorithm's spatial reasoning ability and reflects an idealized, noise-free upper bound of packing performance.
In the Physics Pack setting, gravity, collisions, and other physical effects are enabled, but robot motion is not involved. It evaluates whether the algorithm's placement strategy remains stable under realistic physical constraints such as stacking balance and collision dynamics, highlighting robustness without introducing motion-planning complexity.
The Execution Pack, the highest-fidelity setting, integrates both physical simulation and robotic execution, including motion planning and trajectory control. Algorithm performance depends on kinematic reachability, collision-free path planning, and execution stability. This end-to-end evaluation reflects how well the algorithm's placements can be realistically carried out by a robot in industrial environments.