
Where Cerulion pulls ahead
ROS 2’s architecture carries decades of accumulated complexity: a DDS networking layer, performance features that are opt-in rather than automatic, and execution order that depends on the executor and thread pool. Cerulion is engineered to beat that head-on, turning the frustrations teams feel most on a single machine into deliberate design advantages.
Zero-copy is the default, not an opt-in
In ROS 2, zero-copy means choosing a compatible RMW, using loaned messages, and tuning configuration, so most teams run the default copy-on-receive path and never get it. In Cerulion, zero-copy shared memory is the hot path. A message is written once into a shared-memory slot and read in place, so latency stays flat as payloads grow, with no tuning.
Deterministic, replayable execution
In ROS 2, execution order depends on the executor, thread pool, and DDS scheduling, which makes reproducing a timing bug notoriously hard. In Cerulion, execution order is derived from the graph file and driven by a simulated clock, so a recorded run replays the same way it ran live. Debug once, reproduce exactly.
Predictable, low-jitter tails
ROS 2 tail latency varies with executor and DDS behavior. In Cerulion’s internal benchmarks, 99th-percentile latency stays within roughly 10% of the median across payload sizes: the tail tracks the median rather than spiking.
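To check a claim like this against your own measurements, the tail-to-median ratio can be computed from raw latency samples. The following is a minimal sketch using only the Rust standard library; the function names `percentile` and `tail_ratio` and the sample values are illustrative assumptions, not part of Cerulion or its benchmark harness.

```rust
/// Nearest-rank percentile of an already sorted slice; `p` is in (0, 100].
fn percentile(sorted: &[f64], p: f64) -> f64 {
    assert!(!sorted.is_empty(), "need at least one sample");
    let rank = (p * sorted.len() as f64 / 100.0).ceil() as usize;
    sorted[rank.clamp(1, sorted.len()) - 1]
}

/// Ratio of p99 to median latency; 1.10 means the tail sits 10% above the median.
fn tail_ratio(samples_us: &[f64]) -> f64 {
    let mut sorted = samples_us.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    percentile(&sorted, 99.0) / percentile(&sorted, 50.0)
}

fn main() {
    // Made-up latencies in microseconds, for illustration only (not benchmark data).
    let mut samples = vec![2.5_f64; 98];
    samples.extend([2.6, 2.7]);
    println!("p99/median = {:.3}", tail_ratio(&samples));
}
```

A ratio near 1.0 means the tail tracks the median; a ratio of 1.10 corresponds to the "within roughly 10%" figure quoted above.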
Timing violations are loud, not silent
ROS 2’s DDS DEADLINE QoS exists but is easy to misconfigure or have silently
ignored by the RMW. Cerulion tracks input, output, and tick-execution deadlines
with miss counters and emits a structured warning the moment a deadline slips,
which is critical for real-time control loops.
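The general pattern of a miss counter plus a structured warning can be sketched in a few lines. This is an illustration under stated assumptions, not Cerulion's actual API: the `DeadlineMonitor` type, its `observe` method, and the key=value log format are all hypothetical names chosen for the example.

```rust
use std::time::Duration;

/// Hypothetical deadline tracker; not Cerulion's actual API.
struct DeadlineMonitor {
    name: &'static str,
    budget: Duration,
    misses: u64,
}

impl DeadlineMonitor {
    fn new(name: &'static str, budget: Duration) -> Self {
        Self { name, budget, misses: 0 }
    }

    /// Records one observation; returns a structured warning if it ran over budget.
    fn observe(&mut self, elapsed: Duration) -> Option<String> {
        if elapsed <= self.budget {
            return None;
        }
        self.misses += 1;
        Some(format!(
            "level=warn deadline={} budget_us={} elapsed_us={} misses={}",
            self.name,
            self.budget.as_micros(),
            elapsed.as_micros(),
            self.misses
        ))
    }
}

fn main() {
    let mut tick = DeadlineMonitor::new("tick_execution", Duration::from_micros(1000));
    // Simulated tick durations; a real loop would measure these with std::time::Instant.
    for elapsed_us in [800, 1200, 950, 1500] {
        if let Some(warning) = tick.observe(Duration::from_micros(elapsed_us)) {
            eprintln!("{warning}");
        }
    }
    println!("total misses: {}", tick.misses);
}
```

The point of the design is that a miss is reported at the moment it is observed and also accumulated in a counter, so it shows up both in logs and in later inspection.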
One source of truth, less config sprawl
In ROS 2, behavior is spread across code, launch files, parameter YAMLs, and QoS profiles that can silently conflict. In Cerulion, node behavior lives in the code macro and the graph file defines only wiring. No silent overrides, no config drift.
A shallow learning curve
ROS 2 has a steep ramp: DDS QoS matrices, launch files, parameter plumbing. Cerulion asks you to write a Rust struct plus one macro; the CLI scaffolds the workspace, nodes, and graph for you.
A memory-safe Rust foundation
ROS 2’s core client library (rclcpp) is C++, exposed to whole classes of memory
bugs. Cerulion is built in Rust: memory safety without a garbage collector, and
predictable performance.
The performance story
Cerulion’s headline advantage is flat latency. Under a saturation (back-to-back) round-trip workload, Cerulion stays in the low-microsecond range (about 2.4 to 2.8 µs) from a 64-byte message all the way to a 16-megabyte message. Standard ROS 2 (CycloneDDS over shared memory), measured at realistic sensor rates, climbs into the milliseconds as payloads grow, because it copies each message on receive. The decisive difference is the shape: Cerulion holds flat exactly where standard ROS 2 falls behind.
Round-trip latency, single machine, single publisher/subscriber pair, from internal benchmarks. The Cerulion column is a saturation test; the standard ROS 2 column is measured at sensor rates. “vs. standard” is the ratio of the ROS 2 figure to the Cerulion figure.

| Payload | Cerulion (saturation test) | ROS 2 standard (sensor-rate test) | vs. standard |
|---|---|---|---|
| 64 B | 2.4 µs | 15.3 µs | 6.3× |
| 256 B | 2.7 µs | 25.6 µs | 9.6× |
| 1 KB | 2.8 µs | 21.3 µs | 7.7× |
| 4 KB | 2.5 µs | 29.6 µs | 12.0× |
| 16 KB | 2.5 µs | 23.0 µs | 9.4× |
| 64 KB | 2.4 µs | 19.2 µs | 8.0× |
| 256 KB | 2.5 µs | 228 µs | 91× |
| 1 MB | 2.4 µs | 1.28 ms | 537× |
| 4 MB | 2.6 µs | 6.00 ms | 2,343× |
| 16 MB | 2.4 µs | 26.2 ms | 10,807× |
All figures are internal benchmarks, single-machine, single
publisher/subscriber pair. The Cerulion column is a saturation (back-to-back)
round-trip test; the “standard ROS 2” column is ROS 2 Humble, CycloneDDS over
shared memory on the default receive path, measured at sensor rates (Jazzy and
Kilted measured within a few percent). ROS 2 also has an optional zero-copy
(“loaned-message”) receive path; against that path Cerulion’s lead is steadier.
These are not guarantees and carry no error bars.
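The “vs. standard” column is a plain ratio, which can be recomputed from the table as a sanity check. The sketch below does this for three rows; the function name `vs_standard` is an illustrative choice. Recomputing from the rounded table values lands within a few percent of the published ratios rather than matching them exactly, presumably because the published column was derived from unrounded measurements (that explanation is an assumption, not something the benchmarks state).

```rust
/// Ratio of the ROS 2 latency to the Cerulion latency, both in microseconds.
fn vs_standard(ros2_us: f64, cerulion_us: f64) -> f64 {
    ros2_us / cerulion_us
}

fn main() {
    // (payload, Cerulion µs, ROS 2 µs, published ratio), copied from the table.
    let rows = [
        ("64 B", 2.4, 15.3, 6.3),
        ("1 MB", 2.4, 1280.0, 537.0),
        ("16 MB", 2.4, 26_200.0, 10_807.0),
    ];
    for (payload, cerulion_us, ros2_us, published) in rows {
        let recomputed = vs_standard(ros2_us, cerulion_us);
        println!("{payload}: recomputed {recomputed:.1}x, published {published}x");
    }
}
```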
- At control-loop message sizes, Cerulion is roughly 6 to 12× faster than standard ROS 2.
- At image and lidar scale, the gap widens to hundreds and then thousands of times (about 537× at a 1 MB depth frame and about 10,807× at a 16 MB dense scan), because standard ROS 2 copies each message on receive and Cerulion does not.
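The difference between copying on receive and reading in place can be illustrated inside a single process. This is a minimal sketch, assuming an `Arc<[u8]>` as a stand-in for a shared-memory slot; it is not how Cerulion or any RMW is implemented, and the function names `receive_by_copy` and `receive_in_place` are hypothetical.

```rust
use std::sync::Arc;

/// Copy-on-receive: the subscriber gets its own duplicate, so cost grows with payload size.
fn receive_by_copy(published: &[u8]) -> Vec<u8> {
    published.to_vec()
}

/// Read-in-place: the subscriber gets another handle to the same bytes;
/// only a reference count changes, whatever the payload size.
fn receive_in_place(slot: &Arc<[u8]>) -> Arc<[u8]> {
    Arc::clone(slot)
}

fn main() {
    // A 16 MB payload written once by the "publisher".
    let slot: Arc<[u8]> = vec![0u8; 16 * 1024 * 1024].into();

    let copied = receive_by_copy(&slot);
    let shared = receive_in_place(&slot);

    // The copy lives at a different address; the shared handle does not.
    println!("copy aliases slot:   {}", copied.as_ptr() == slot.as_ptr());
    println!("shared aliases slot: {}", Arc::ptr_eq(&shared, &slot));
}
```

The copying path touches every byte of the payload on each receive, while the in-place path does a fixed amount of work regardless of size, which is the shape the table shows.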
The path to a fully distributed stack. Cerulion’s vision is a complete,
better-in-every-way robotics stack, and cross-machine communication with
network-wide topic discovery is the next milestone on that road, landing in
the next release. Today Cerulion is purpose-built for the single-machine,
multi-process graph, and that same zero-copy, deterministic core is what
scales out to the distributed system that comes next.
Next steps
Core concepts
Workspaces, nodes, graphs, topics, and schemas: the mental model.
Quickstart
Go from zero to a running graph you can observe with topic echo.
Define a node
Create a node type, choose a trigger policy, and write tick().
CLI reference
Every command and flag, in one place.