

Overview

Walrus supports two storage backend implementations that can be selected at runtime:
  1. FD Backend (File Descriptor) - Default, uses pread/pwrite syscalls with io_uring on Linux
  2. Mmap Backend (Memory-Mapped Files) - Cross-platform, uses memory-mapped I/O
The backend choice affects performance, platform compatibility, and batch operation behavior.

FD Backend (Default)

The FD backend uses file descriptors with Unix pread and pwrite syscalls for I/O operations. On Linux, batch operations automatically use io_uring for high-performance parallel I/O.

Characteristics

| Aspect | Value | Details |
|---|---|---|
| Platform Support | Unix | Linux, macOS, BSD (requires Unix-specific APIs) |
| Batch Operations | io_uring on Linux | Automatic parallel I/O for batch_append_for_topic and batch_read_for_topic |
| Single Operations | pread/pwrite | Standard syscalls for append_for_topic and read_next |
| O_SYNC Support | Yes | Files opened with the O_SYNC flag when FsyncSchedule::SyncEach is used |
| Best For | Linux batch workloads | High-throughput batch operations on Linux systems |

Implementation

From src/wal/storage.rs:13-59:
use std::fs::OpenOptions;
#[cfg(unix)]
use std::os::unix::fs::OpenOptionsExt; // provides custom_flags

struct FdBackend {
    file: std::fs::File,
    len: usize,
}

impl FdBackend {
    fn new(path: &str, use_o_sync: bool) -> std::io::Result<Self> {
        let mut opts = OpenOptions::new();
        opts.read(true).write(true);
        
        #[cfg(unix)]
        if use_o_sync {
            opts.custom_flags(libc::O_SYNC);  // Synchronous writes
        }
        
        let file = opts.open(path)?;
        let metadata = file.metadata()?;
        let len = metadata.len() as usize;
        
        Ok(Self { file, len })
    }
    
    fn write(&self, offset: usize, data: &[u8]) {
        use std::os::unix::fs::FileExt;
        // pwrite doesn't move the file cursor
        let _ = self.file.write_at(data, offset as u64);
    }
    
    fn read(&self, offset: usize, dest: &mut [u8]) {
        use std::os::unix::fs::FileExt;
        // pread doesn't move the file cursor
        let _ = self.file.read_at(dest, offset as u64);
    }
    
    fn flush(&self) -> std::io::Result<()> {
        self.file.sync_all()
    }
}
Key benefits:
  • Thread-safe: pread/pwrite use absolute offsets without moving file cursor
  • No mapping overhead: Direct kernel I/O without page mapping
  • O_SYNC support: Synchronous writes when configured

io_uring Support (Linux Only)

When running on Linux with the FD backend, batch operations automatically use io_uring for parallel I/O submission.

Batch Write with io_uring

From src/wal/runtime/writer.rs:268-293:
#[cfg(target_os = "linux")]
{
    if USE_FD_BACKEND.load(Ordering::Relaxed) {
        match self.submit_batch_via_io_uring(
            &write_plan,
            batch,
            &mut revert_info,
            &mut *cur_offset,
            planning_offset,
            total_bytes_usize,
        ) {
            Ok(()) => return Ok(()),
            Err(e) => {
                // Fall back to sequential writes if io_uring fails
                if e.to_string().contains("io_uring init failed") {
                    debug_print!("[batch] io_uring unavailable; falling back: {}", e);
                } else {
                    return Err(e);
                }
            }
        }
    }
}
The io_uring implementation submits all writes in parallel and waits for completion:
// Simplified from implementation
let ring = io_uring::IoUring::new(plan.len() as u32)?;

// Submit all writes
for (plan_idx, (block, offset, data_idx)) in write_plan.iter().enumerate() {
    // `data` and `file_offset` are derived from the plan entry (elided here)
    let write_op = io_uring::opcode::Write::new(
        fd,
        data.as_ptr(),
        data.len() as u32
    )
    .offset(file_offset)
    .build()
    .user_data(plan_idx as u64);
    
    unsafe { ring.submission().push(&write_op)?; }
}

// Submit and wait for all
ring.submit_and_wait(plan.len())?;

// Process completions
for _ in 0..plan.len() {
    let cqe = ring.completion().next().unwrap();
    // Verify write succeeded...
}

Batch Read with io_uring

From src/wal/runtime/walrus_read.rs:872-959:
#[cfg(target_os = "linux")]
let buffers = if USE_FD_BACKEND.load(Ordering::Relaxed) {
    let ring_size = (plan.len() + 64).min(4096) as u32;
    let ring = match io_uring::IoUring::new(ring_size) {
        Ok(r) => Some(r),
        Err(_) => None,  // Fall back to mmap
    };
    
    if let Some(mut ring) = ring {
        // io_uring is available, use it
        let mut temp_buffers: Vec<Vec<u8>> = vec![Vec::new(); plan.len()];
        
        // Submit all reads to io_uring
        for (plan_idx, read_plan) in plan.iter().enumerate() {
            let size = (read_plan.end - read_plan.start) as usize;
            let mut buffer = vec![0u8; size];
            let file_offset = (read_plan.blk.offset + read_plan.start) as usize;
            
            let fd = io_uring::types::Fd(
                fd_backend.file().as_raw_fd()
            );
            
            let read_op = io_uring::opcode::Read::new(
                fd,
                buffer.as_mut_ptr(),
                size as u32
            )
            .offset(file_offset as u64)
            .build()
            .user_data(plan_idx as u64);
            
            temp_buffers[plan_idx] = buffer;
            unsafe { ring.submission().push(&read_op)?; }
        }
        
        // Submit and wait for all reads
        ring.submit_and_wait(plan.len())?;
        
        // Process completions
        for _ in 0..plan.len() {
            if let Some(cqe) = ring.completion().next() {
                let plan_idx = cqe.user_data() as usize;
                let got = cqe.result();
                if got < 0 {
                    return Err(io::Error::new(
                        io::ErrorKind::Other,
                        format!("io_uring read failed: {}", got),
                    ));
                }
            }
        }
        
        temp_buffers
    } else {
        // io_uring not available, fall back to mmap reads
        // ...
    }
}

io_uring Requirements

  • Kernel: Linux kernel 5.1+ (io_uring support)
  • Config: Kernel must have CONFIG_IO_URING=y
  • Runtime: May fail if kernel doesn’t support io_uring
If io_uring initialization fails, operations automatically fall back to sequential I/O with no error.

Enabling FD Backend

The FD backend is enabled by default. You can explicitly enable it:
use walrus_rust::enable_fd_backend;

// Use FD backend (default)
enable_fd_backend();

let wal = Walrus::new()?;

Mmap Backend

The mmap backend uses memory-mapped files for I/O. This is cross-platform but doesn’t support io_uring acceleration for batch operations.

Characteristics

| Aspect | Value | Details |
|---|---|---|
| Platform Support | All platforms | Windows, Linux, macOS, BSD (cross-platform) |
| Batch Operations | Sequential | No io_uring; batch operations use sequential reads/writes |
| Single Operations | Memory-mapped | Direct memory access via mmap |
| O_SYNC Support | No | Uses mmap.flush() instead of O_SYNC |
| Best For | Windows, cross-platform | Non-Linux platforms, or when the FD backend is incompatible |

Implementation

From src/wal/storage.rs:62-115:
enum StorageImpl {
    Mmap(MmapMut),
    Fd(FdBackend),
}

impl StorageImpl {
    fn write(&self, offset: usize, data: &[u8]) {
        match self {
            StorageImpl::Mmap(mmap) => {
                debug_assert!(offset <= mmap.len());
                debug_assert!(mmap.len() - offset >= data.len());
                unsafe {
                    let ptr = mmap.as_ptr() as *mut u8;
                    std::ptr::copy_nonoverlapping(
                        data.as_ptr(),
                        ptr.add(offset),
                        data.len()
                    );
                }
            }
            StorageImpl::Fd(fd) => fd.write(offset, data),
        }
    }
    
    fn read(&self, offset: usize, dest: &mut [u8]) {
        match self {
            StorageImpl::Mmap(mmap) => {
                debug_assert!(offset + dest.len() <= mmap.len());
                let src = &mmap[offset..offset + dest.len()];
                dest.copy_from_slice(src);
            }
            StorageImpl::Fd(fd) => fd.read(offset, dest),
        }
    }
    
    fn flush(&self) -> std::io::Result<()> {
        match self {
            StorageImpl::Mmap(mmap) => mmap.flush(),
            StorageImpl::Fd(fd) => fd.flush(),
        }
    }
}
Key characteristics:
  • Direct memory access: No syscall overhead for small operations
  • Page faults: Large operations may trigger page faults
  • Sequential batches: No parallel I/O optimization

Enabling Mmap Backend

use walrus_rust::disable_fd_backend;

// Use mmap backend (disables FD backend)
disable_fd_backend();

let wal = Walrus::new()?;
Important: Backend selection must happen before any Walrus instance is created. Changing the backend after instances exist is undefined behavior.

Backend Selection

From src/wal/config.rs:5-16:
// Global flag to choose backend
pub(crate) static USE_FD_BACKEND: AtomicBool = AtomicBool::new(true);

// Public function to enable FD backend
pub fn enable_fd_backend() {
    USE_FD_BACKEND.store(true, Ordering::Relaxed);
}

// Public function to disable FD backend (use mmap instead)
pub fn disable_fd_backend() {
    USE_FD_BACKEND.store(false, Ordering::Relaxed);
}
Storage creation checks this flag (src/wal/storage.rs:126-137):
fn create_storage_impl(path: &str) -> std::io::Result<StorageImpl> {
    if USE_FD_BACKEND.load(Ordering::Relaxed) {
        let use_o_sync = should_use_o_sync();
        Ok(StorageImpl::Fd(FdBackend::new(path, use_o_sync)?))
    } else {
        let file = OpenOptions::new().read(true).write(true).open(path)?;
        let mmap = unsafe { MmapMut::map_mut(&file)? };
        Ok(StorageImpl::Mmap(mmap))
    }
}

Performance Comparison

Batch Operations

Performance on Linux with 1000 entries (each ~1KB):
| Backend | Batch Write | Batch Read | Notes |
|---|---|---|---|
| FD + io_uring | 200k/sec | 500k/sec | Parallel I/O |
| FD Sequential | 50k/sec | 100k/sec | Fallback mode |
| Mmap | 50k/sec | 100k/sec | No io_uring |
The FD backend with io_uring provides a 4-5x improvement for batch operations on Linux.

Single Operations

Performance for individual append_for_topic and read_next calls:
| Backend | Single Write | Single Read | Notes |
|---|---|---|---|
| FD | 100k/sec | 150k/sec | pread/pwrite |
| Mmap | 95k/sec | 140k/sec | Memory-mapped |
Single operations have similar performance across backends since neither uses io_uring.

Platform-Specific

Linux (FD + io_uring)
  • Batch operations: 4-5x faster than other backends
  • Single operations: Equivalent to mmap
  • Recommended for high-throughput batch workloads
Linux (mmap)
  • Batch operations: Sequential, no io_uring benefit
  • Single operations: Slightly slower than FD
  • Use if FD backend has compatibility issues
macOS (FD only)
  • No io_uring support
  • Batch operations: Sequential only
  • Single operations: Good performance
Windows (mmap only)
  • FD backend unavailable (Unix-specific APIs)
  • Must use mmap backend
  • Good cross-platform compatibility

Decision Matrix

| Platform | Workload Type | Recommended Backend | Reasoning |
|---|---|---|---|
| Linux | Batch-heavy | FD (default) | io_uring acceleration |
| Linux | Single operations | FD or Mmap | Similar performance |
| macOS | Any | FD (default) | Better compatibility |
| Windows | Any | Mmap | FD unavailable |
| BSD | Any | FD (default) | Unix APIs available |

Usage Examples

Linux High-Throughput

use walrus_rust::{enable_fd_backend, Walrus};

// Ensure FD backend with io_uring (default)
enable_fd_backend();

let wal = Walrus::new()?;

// Batch operations automatically use io_uring
let batch: Vec<Vec<u8>> = (0..1000)
    .map(|i| format!("event-{}", i).into_bytes())
    .collect();
let batch_refs: Vec<&[u8]> = batch.iter().map(|v| v.as_slice()).collect();

wal.batch_append_for_topic("events", &batch_refs)?;
// ~200k entries/sec with io_uring

let entries = wal.batch_read_for_topic("events", 1024 * 1024, true, None)?;
// ~500k entries/sec with io_uring

Cross-Platform Compatibility

use walrus_rust::{disable_fd_backend, Walrus};

// Use mmap backend for Windows compatibility
disable_fd_backend();

let wal = Walrus::new()?;

// Batch operations use sequential I/O
let batch = vec![b"entry 1".as_slice(), b"entry 2".as_slice()];
wal.batch_append_for_topic("events", &batch)?;
// ~50k entries/sec (sequential)

Platform-Specific Selection

use walrus_rust::{enable_fd_backend, disable_fd_backend, Walrus};

#[cfg(unix)]
{
    // Use FD backend on Unix platforms
    enable_fd_backend();
}

#[cfg(not(unix))]
{
    // Use mmap backend on Windows
    disable_fd_backend();
}

let wal = Walrus::new()?;

Testing Both Backends

use walrus_rust::{enable_fd_backend, disable_fd_backend, Walrus};

// Test with FD backend
{
    enable_fd_backend();
    let wal = Walrus::new()?;
    test_operations(&wal)?;
}

// Test with mmap backend
{
    disable_fd_backend();
    let wal = Walrus::new()?;
    test_operations(&wal)?;
}

O_SYNC Mode

When FsyncSchedule::SyncEach is configured with the FD backend, files are opened with the O_SYNC flag:
use walrus_rust::{Walrus, ReadConsistency, FsyncSchedule};

let wal = Walrus::with_consistency_and_schedule(
    ReadConsistency::StrictlyAtOnce,
    FsyncSchedule::SyncEach  // Triggers O_SYNC on FD backend
)?;

// Every write is synchronous at the kernel level
wal.append_for_topic("critical", b"data")?;
From src/wal/storage.rs:119-124:
fn should_use_o_sync() -> bool {
    GLOBAL_FSYNC_SCHEDULE
        .get()
        .map(|s| matches!(s, FsyncSchedule::SyncEach))
        .unwrap_or(false)
}
This provides maximum durability with minimal application-level fsync calls.

Thread Safety

Both backends are thread-safe and support concurrent reads and writes. From src/wal/storage.rs:139-152:
struct SharedMmap {
    storage: StorageImpl,
    last_touched_at: AtomicU64,
}

// SAFETY: SharedMmap provides interior mutability only via methods that
// enforce bounds and perform atomic timestamp updates
unsafe impl Sync for SharedMmap {}
unsafe impl Send for SharedMmap {}
Multiple threads can safely:
  • Read from different topics
  • Write to different topics
  • Read and write to the same topic (with internal locking)

Troubleshooting

io_uring Unavailable

If io_uring initialization fails, batch operations automatically fall back to sequential I/O:
// From logs
[batch] io_uring unavailable; falling back: io_uring init failed
This is not an error; operations continue with sequential I/O. Common causes:
  • Kernel < 5.1
  • CONFIG_IO_URING not enabled
  • Resource limits (check ulimit -l)

FD Backend on Windows

Attempting to use the FD backend on Windows will fail:
use walrus_rust::Walrus;

// On Windows, this may fail due to missing Unix APIs
let wal = Walrus::new()?;  // Error: Unix APIs unavailable
Solution: Use mmap backend:
use walrus_rust::{disable_fd_backend, Walrus};

disable_fd_backend();
let wal = Walrus::new()?;  // Works on Windows

Best Practices

  1. Default to FD Backend: Use the default FD backend on Unix platforms for best performance
  2. Select Before Instance Creation: Always set backend before creating any Walrus instances
  3. Test Both Backends: If targeting multiple platforms, test with both backends
  4. Monitor io_uring: Check logs for io_uring fallback messages on Linux
  5. Use Batch Operations on Linux: Leverage io_uring by using batch_append_for_topic and batch_read_for_topic

Next Steps