Performance

Cerulion Core is designed for high-performance real-time applications. This page covers performance characteristics, optimization tips, and benchmarking guidance.

Performance Characteristics

Latency Measurements

| Operation | Transport | Latency | Notes |
| --- | --- | --- | --- |
| Local send | iceoryx2 | < 1 μs | Zero-copy shared memory |
| Local receive | iceoryx2 | < 1 μs | Direct memory read |
| Network send (queue) | Background thread | ~30 ns | Added to fast path |
| Network send (serialization) | Background thread | 1-10 μs | Depends on message size |
| Network send (zenoh) | Background thread | 1-10 ms | Network dependent |
| Network receive | Zenoh | 1-10 ms | Network dependent |

Local communication latency is sub-microsecond for small messages. Network communication adds minimal overhead to the fast path (~30 ns) because serialization and network I/O happen on a background thread.

Throughput

Local transport:
  • Limited by memory bandwidth
  • Typical: 10+ GB/s for small messages
  • Scales with message size (larger messages = higher throughput)
Network transport:
  • Limited by network bandwidth
  • Typical: 1-10 Gbps (depends on network)
  • Serialization overhead: ~1-10 μs per message
For high-throughput applications, local transport is ideal. Network transport is suitable for lower-frequency data or distributed systems.
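
As a rough sanity check before choosing a transport, you can estimate the bandwidth a topic needs from its message size and publish rate. The sketch below is illustrative arithmetic only; the 125 MB/s budget is simply 1 Gbps expressed in bytes:
// Back-of-the-envelope bandwidth estimate for a topic.
fn required_bandwidth_bytes_per_sec(message_size: usize, rate_hz: f64) -> f64 {
    message_size as f64 * rate_hz
}

fn main() {
    let size = 1_000;      // 1 KB message
    let rate = 10_000.0;   // 10 kHz publish rate
    let needed = required_bandwidth_bytes_per_sec(size, rate);

    println!("Required bandwidth: {:.1} MB/s", needed / 1e6);
    // Typical budgets from above: ~10 GB/s local, ~1 Gbps (125 MB/s) network.
    println!("Fits on a 1 Gbps network: {}", needed <= 125e6);
}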

Optimization Tips

1. Use Local Transport When Possible

// ✅ Good: Local transport for same-machine communication
let subscriber = Subscriber::<T>::create("topic", None)?;  // Auto-detects local

// ⚠️ Only if needed: Force network for remote communication
let subscriber = Subscriber::<T>::create("topic", Some(true))?;
Local transport provides < 1 μs latency vs 1-10 ms for network. Use local transport whenever possible.

2. Keep Message Types Small

// ✅ Good: Small, focused message type
#[derive(Copy, Clone, Debug)]
#[repr(C)]
struct SensorData {
    temperature: f32,  // 4 bytes
    timestamp: u64,    // 8 bytes
}  // Total: 16 bytes (4 + 4 padding + 8; u64 forces 8-byte alignment)

// ⚠️ Consider: Large messages increase serialization time
#[derive(Copy, Clone, Debug)]
#[repr(C)]
struct LargeData {
    buffer: [u8; 1_000_000],  // 1MB per message!
}
Small messages (< 1 KB) have minimal serialization overhead. Large messages (> 100 KB) may benefit from compression or chunking.
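
One way to chunk a large payload is to split it into fixed-size Copy messages that carry enough metadata to reassemble the original buffer on the receiving side. The Chunk layout and 4 KB chunk size below are illustrative assumptions, not part of Cerulion Core:
// Illustrative chunk message: fixed-size, Copy, with reassembly metadata.
#[derive(Copy, Clone, Debug)]
#[repr(C)]
struct Chunk {
    sequence: u64,     // identifies which large payload this chunk belongs to
    index: u32,        // position of this chunk within the payload
    total: u32,        // number of chunks in the payload
    len: u32,          // valid bytes in `data`
    data: [u8; 4096],  // 4 KB per chunk keeps per-message serialization cheap
}

// Split a large buffer into Copy-able chunks for sending one at a time.
fn chunk_payload(sequence: u64, payload: &[u8]) -> Vec<Chunk> {
    let total = payload.chunks(4096).count() as u32;
    payload
        .chunks(4096)
        .enumerate()
        .map(|(index, part)| {
            let mut data = [0u8; 4096];
            data[..part.len()].copy_from_slice(part);
            Chunk {
                sequence,
                index: index as u32,
                total,
                len: part.len() as u32,
                data,
            }
        })
        .collect()
}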

3. Use Copy Types

// ✅ Good: Copy type (zero-copy local transport)
#[derive(Copy, Clone, Debug)]
#[repr(C)]
struct Data {
    value: f32,
}

// ❌ Bad: Non-Copy type (requires serialization even for local)
struct Data {
    value: String,  // Not Copy!
}
Copy types enable zero-copy local transport. Non-Copy types require serialization even for local communication, adding overhead.
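
If a message naturally wants a String, one workaround is to copy the text into a fixed-size byte array so the overall type stays Copy. FixedName and LabeledData below are illustrative sketches, not Cerulion Core APIs:
// Fixed-size, Copy-friendly stand-in for a short String field.
#[derive(Copy, Clone, Debug)]
#[repr(C)]
struct FixedName {
    len: u8,
    bytes: [u8; 63],  // anything longer than 63 bytes is truncated
}

impl FixedName {
    fn from_str(s: &str) -> Self {
        let mut bytes = [0u8; 63];
        let n = s.len().min(63);
        bytes[..n].copy_from_slice(&s.as_bytes()[..n]);
        FixedName { len: n as u8, bytes }
    }

    fn as_str(&self) -> &str {
        // Fine for ASCII; real code should truncate on a UTF-8 boundary.
        std::str::from_utf8(&self.bytes[..self.len as usize]).unwrap_or("")
    }
}

// The embedding message type remains Copy and keeps zero-copy local transport.
#[derive(Copy, Clone, Debug)]
#[repr(C)]
struct LabeledData {
    value: f32,
    name: FixedName,
}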

4. Batch Messages When Possible

// ✅ Good: Batch multiple values into one message
#[derive(Copy, Clone, Debug)]
#[repr(C)]
struct BatchData {
    values: [f32; 100],  // 100 values in one message
}

// ⚠️ Less efficient: Send 100 separate messages
for value in values {
    publisher.send(SingleValue { value })?;  // 100 sends
}
Batching reduces the number of send operations and can improve throughput, especially for network transport.
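
A minimal batching helper can accumulate values and hand back a full BatchData (the type defined above) only once 100 values have been collected. This is a sketch only; it does not flush partial batches:
// Accumulates values and produces one BatchData per 100 values.
struct Batcher {
    buffer: [f32; 100],
    count: usize,
}

impl Batcher {
    fn new() -> Self {
        Batcher { buffer: [0.0; 100], count: 0 }
    }

    // Returns a full batch to send, or None while still accumulating.
    fn push(&mut self, value: f32) -> Option<BatchData> {
        self.buffer[self.count] = value;
        self.count += 1;
        if self.count == self.buffer.len() {
            self.count = 0;
            Some(BatchData { values: self.buffer })
        } else {
            None
        }
    }
}

// Usage: one send per 100 values instead of 100 separate sends.
// if let Some(batch) = batcher.push(value) {
//     publisher.send(batch)?;
// }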

5. Use TopicManager for Multiple Topics

// ✅ Good: Shared session reduces overhead
let manager = TopicManager::create()?;
let pub1 = manager.register_publisher::<T1>("topic1", false)?;
let pub2 = manager.register_publisher::<T2>("topic2", false)?;

// ⚠️ Less efficient: Each publisher creates its own session
let pub1 = Publisher::<T1>::create("topic1")?;
let pub2 = Publisher::<T2>::create("topic2")?;
TopicManager shares a single Zenoh session across all publishers/subscribers, reducing memory and connection overhead.

Benchmarking

Local Transport Benchmark

Here’s a simple benchmark for local transport:
use cerulion_core::prelude::*;
use std::time::Instant;

#[derive(Copy, Clone, Debug)]
#[repr(C)]
struct TestData {
    value: f32,
    timestamp: u64,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let publisher = Publisher::<TestData>::create("test")?;
    let subscriber = Subscriber::<TestData>::create("test", None)?;
    
    // Warmup
    for _ in 0..100 {
        publisher.send(TestData { value: 1.0, timestamp: 0 })?;
        let _ = subscriber.receive();
    }
    
    // Benchmark
    let iterations = 1_000_000;
    let start = Instant::now();
    
    for i in 0..iterations {
        publisher.send(TestData {
            value: i as f32,
            timestamp: i as u64,
        })?;
        
        if let Ok(Some(_)) = subscriber.receive() {
            // Message received
        }
    }
    
    let elapsed = start.elapsed();
    let latency = elapsed.as_nanos() / iterations as u128;
    
    println!("Average latency: {} ns", latency);
    println!("Throughput: {:.2} M messages/s", 
             iterations as f64 / elapsed.as_secs_f64() / 1_000_000.0);
    
    Ok(())
}
Expected results for local transport (release build): < 1 μs average latency and 10+ M messages/s throughput for small messages.

Network Transport Benchmark

For network transport, measure end-to-end latency:
use cerulion_core::prelude::*;
use std::time::SystemTime;

#[derive(Copy, Clone, Debug)]
#[repr(C)]
struct TimestampedData {
    value: f32,
    send_time: u64,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let publisher = Publisher::<TimestampedData>::create("test")?;
    let subscriber = Subscriber::<TimestampedData>::create("test", Some(true))?;  // Force network
    
    // Wait for connection
    std::thread::sleep(std::time::Duration::from_secs(1));
    
    let iterations = 1000;
    let mut latencies = Vec::new();
    
    for i in 0..iterations {
        let send_time = SystemTime::now()
            .duration_since(SystemTime::UNIX_EPOCH)?
            .as_nanos() as u64;
        
        publisher.send(TimestampedData {
            value: i as f32,
            send_time,
        })?;
        
        if let Ok(Some(data)) = subscriber.receive() {
            let receive_time = SystemTime::now()
                .duration_since(SystemTime::UNIX_EPOCH)?
                .as_nanos() as u64;
            
            let latency = receive_time - data.send_time;
            latencies.push(latency);
        }
        
        std::thread::sleep(std::time::Duration::from_millis(10));
    }
    
    let avg_latency = latencies.iter().sum::<u64>() / latencies.len() as u64;
    println!("Average network latency: {} ns ({:.2} ms)", 
             avg_latency, avg_latency as f64 / 1_000_000.0);
    
    Ok(())
}
Network latency depends on network conditions. Typical values: 1-10 ms for local network, 10-100 ms for WAN.
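
Averages can hide tail latency, so it is often worth reporting percentiles over the same samples. This sketch assumes the latencies vector collected in the benchmark above and at least one recorded sample:
// Nearest-rank percentile over the collected nanosecond samples.
fn percentile(samples: &mut [u64], q: f64) -> u64 {
    samples.sort_unstable();
    samples[((samples.len() - 1) as f64 * q) as usize]
}

// e.g., after computing the average in main():
// println!("p50: {:.2} ms", percentile(&mut latencies, 0.50) as f64 / 1e6);
// println!("p99: {:.2} ms", percentile(&mut latencies, 0.99) as f64 / 1e6);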

Performance Considerations

Message Size Impact

| Message Size | Local Latency | Network Serialization | Network Latency |
| --- | --- | --- | --- |
| 16 bytes | < 1 μs | ~1 μs | 1-10 ms |
| 1 KB | < 1 μs | ~5 μs | 1-10 ms |
| 100 KB | < 10 μs | ~50 μs | 1-10 ms |
| 1 MB | < 100 μs | ~500 μs | 1-10 ms |

Local transport latency is relatively insensitive to message size (memory bandwidth is high). Network serialization time increases with message size, but network latency dominates for larger messages.

CPU Usage

Local transport:
  • Minimal CPU usage (< 1% at a 1 M messages/s rate)
  • CPU usage scales with message rate
Network transport:
  • Background thread handles serialization and network I/O
  • Main thread CPU usage: ~30 ns per send (just queueing)
  • Background thread CPU usage: Depends on message size and rate
Network transport is designed to minimize impact on the main thread. Serialization and network I/O happen on a background thread, so they don’t block your application.
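
To see how much of that cost actually lands on your own thread, you can time nothing but the publisher.send() calls. The sketch below reuses the TestData type from the local benchmark; the topic name is arbitrary:
use cerulion_core::prelude::*;
use std::time::Instant;

#[derive(Copy, Clone, Debug)]
#[repr(C)]
struct TestData {
    value: f32,
    timestamp: u64,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let publisher = Publisher::<TestData>::create("cpu_test")?;

    // Time only the send calls: the work done on the calling thread.
    // Serialization and network I/O run on the background thread and are
    // deliberately excluded from this measurement.
    let iterations: u64 = 100_000;
    let start = Instant::now();
    for i in 0..iterations {
        publisher.send(TestData { value: i as f32, timestamp: i })?;
    }
    let per_send = start.elapsed().as_nanos() / iterations as u128;
    println!("Main-thread cost per send: {} ns", per_send);
    Ok(())
}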

Best Practices Summary

  1. Use local transport when possible (< 1 μs vs 1-10 ms)
  2. Keep messages small (< 1 KB for best performance)
  3. Use Copy types for zero-copy local transport
  4. Batch messages when sending multiple values
  5. Use TopicManager for multiple topics (shared session)
  6. Profile your application to identify bottlenecks
  7. Consider message frequency: high-frequency topics benefit most from local transport

Troubleshooting Performance Issues

"High latency on local transport"

Possible causes:
  • Message type is not Copy
  • Message size is very large
  • System is under heavy load
Solutions:
  • Ensure the message type implements Copy (a compile-time check is sketched below)
  • Reduce message size or batch messages
  • Check system load and CPU usage
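
The Copy requirement can be verified at compile time with a trivial generic function; assert_copy below is an illustrative helper, not a Cerulion Core API:
// Compile-time check: the build fails if the message type stops being Copy.
fn assert_copy<T: Copy>() {}

#[derive(Copy, Clone, Debug)]
#[repr(C)]
struct SensorData {
    temperature: f32,
    timestamp: u64,
}

fn main() {
    assert_copy::<SensorData>();  // compiles
    // assert_copy::<String>();   // would fail to compile: String is not Copy
}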

"Network messages not arriving"

Possible causes:
  • Network transport not enabled
  • Network connectivity issues
  • Subscriber not connected
Solutions:
  • Check that network transport is enabled
  • Verify network connectivity
  • Ensure subscriber is created before publisher sends

"High CPU usage"

Possible causes:
  • Very high message rate
  • Large message sizes
  • Multiple background threads
Solutions:
  • Reduce message rate if possible
  • Reduce message size
  • Use TopicManager to share sessions

Next Steps