Overview
Walrus supports two storage backend implementations that can be selected at runtime:
- FD Backend (File Descriptor) - Default; uses `pread`/`pwrite` syscalls, with io_uring on Linux
- Mmap Backend (Memory-Mapped Files) - Cross-platform; uses memory-mapped I/O
The backend choice affects performance, platform compatibility, and batch operation behavior.
FD Backend (Default)
The FD backend uses file descriptors with Unix pread and pwrite syscalls for I/O operations. On Linux, batch operations automatically use io_uring for high-performance parallel I/O.
Characteristics
- Platforms: Linux, macOS, BSD (requires Unix-specific APIs)
- Batch operations: automatic parallel I/O for `batch_append_for_topic` and `batch_read_for_topic`
- Single operations: standard syscalls for `append_for_topic` and `read_next`
- Durability: files opened with the `O_SYNC` flag when `FsyncSchedule::SyncEach` is used
- Best for: high-throughput batch operations on Linux systems
Implementation
From src/wal/storage.rs:13-59:
```rust
struct FdBackend {
    file: std::fs::File,
    len: usize,
}

impl FdBackend {
    fn new(path: &str, use_o_sync: bool) -> std::io::Result<Self> {
        let mut opts = OpenOptions::new();
        opts.read(true).write(true);
        #[cfg(unix)]
        if use_o_sync {
            opts.custom_flags(libc::O_SYNC); // Synchronous writes
        }
        let file = opts.open(path)?;
        let metadata = file.metadata()?;
        let len = metadata.len() as usize;
        Ok(Self { file, len })
    }

    fn write(&self, offset: usize, data: &[u8]) {
        use std::os::unix::fs::FileExt;
        // pwrite doesn't move the file cursor
        let _ = self.file.write_at(data, offset as u64);
    }

    fn read(&self, offset: usize, dest: &mut [u8]) {
        use std::os::unix::fs::FileExt;
        // pread doesn't move the file cursor
        let _ = self.file.read_at(dest, offset as u64);
    }

    fn flush(&self) -> std::io::Result<()> {
        self.file.sync_all()
    }
}
```
Key benefits:
- Thread-safe: `pread`/`pwrite` use absolute offsets without moving the file cursor
- No mapping overhead: direct kernel I/O without page mapping
- `O_SYNC` support: synchronous writes when configured
io_uring Support (Linux Only)
When running on Linux with the FD backend, batch operations automatically use io_uring for parallel I/O submission.
Batch Write with io_uring
From src/wal/runtime/writer.rs:268-293:
```rust
#[cfg(target_os = "linux")]
{
    if USE_FD_BACKEND.load(Ordering::Relaxed) {
        match self.submit_batch_via_io_uring(
            &write_plan,
            batch,
            &mut revert_info,
            &mut *cur_offset,
            planning_offset,
            total_bytes_usize,
        ) {
            Ok(()) => return Ok(()),
            Err(e) => {
                // Fall back to sequential writes if io_uring fails
                if e.to_string().contains("io_uring init failed") {
                    debug_print!("[batch] io_uring unavailable; falling back: {}", e);
                } else {
                    return Err(e);
                }
            }
        }
    }
}
```
The io_uring implementation submits all writes in parallel and waits for completion:
```rust
// Simplified from implementation
let ring = io_uring::IoUring::new(plan.len() as u32)?;

// Submit all writes
for (plan_idx, (block, offset, data_idx)) in write_plan.iter().enumerate() {
    let write_op = io_uring::opcode::Write::new(
        fd,
        data.as_ptr(),
        data.len() as u32,
    )
    .offset(file_offset)
    .build()
    .user_data(plan_idx as u64);
    unsafe { ring.submission().push(&write_op)?; }
}

// Submit and wait for all
ring.submit_and_wait(plan.len())?;

// Process completions
for _ in 0..plan.len() {
    let cqe = ring.completion().next().unwrap();
    // Verify write succeeded...
}
```
Batch Read with io_uring
From src/wal/runtime/walrus_read.rs:872-959:
```rust
#[cfg(target_os = "linux")]
let buffers = if USE_FD_BACKEND.load(Ordering::Relaxed) {
    let ring_size = (plan.len() + 64).min(4096) as u32;
    let ring = match io_uring::IoUring::new(ring_size) {
        Ok(r) => Some(r),
        Err(_) => None, // Fall back to mmap
    };
    if let Some(mut ring) = ring {
        // io_uring is available, use it
        let mut temp_buffers: Vec<Vec<u8>> = vec![Vec::new(); plan.len()];

        // Submit all reads to io_uring
        for (plan_idx, read_plan) in plan.iter().enumerate() {
            let size = (read_plan.end - read_plan.start) as usize;
            let mut buffer = vec![0u8; size];
            let file_offset = (read_plan.blk.offset + read_plan.start) as usize;
            let fd = io_uring::types::Fd(fd_backend.file().as_raw_fd());
            let read_op = io_uring::opcode::Read::new(
                fd,
                buffer.as_mut_ptr(),
                size as u32,
            )
            .offset(file_offset as u64)
            .build()
            .user_data(plan_idx as u64);
            temp_buffers[plan_idx] = buffer;
            unsafe { ring.submission().push(&read_op)?; }
        }

        // Submit and wait for all reads
        ring.submit_and_wait(plan.len())?;

        // Process completions
        for _ in 0..plan.len() {
            if let Some(cqe) = ring.completion().next() {
                let plan_idx = cqe.user_data() as usize;
                let got = cqe.result();
                if got < 0 {
                    return Err(io::Error::new(
                        io::ErrorKind::Other,
                        format!("io_uring read failed: {}", got),
                    ));
                }
            }
        }
        temp_buffers
    } else {
        // io_uring not available, fall back to mmap reads
        // ...
    }
};
```
io_uring Requirements
- Kernel: Linux kernel 5.1+ (io_uring support)
- Config: kernel must have `CONFIG_IO_URING=y`
- Runtime: may still fail at runtime if the kernel doesn't support io_uring
If io_uring initialization fails, operations automatically fall back to sequential I/O with no error.
Enabling FD Backend
The FD backend is enabled by default. You can explicitly enable it:
```rust
use walrus_rust::enable_fd_backend;

// Use FD backend (default)
enable_fd_backend();
let wal = Walrus::new()?;
```
Mmap Backend
The mmap backend uses memory-mapped files for I/O. This is cross-platform but doesn’t support io_uring acceleration for batch operations.
Characteristics
- Platforms: Windows, Linux, macOS, BSD (cross-platform)
- Batch operations: sequential reads/writes (no io_uring)
- Single operations: direct memory access via mmap
- Durability: uses `mmap.flush()` instead of `O_SYNC`
- Best for: non-Linux platforms, or when the FD backend is incompatible
Implementation
From src/wal/storage.rs:62-115:
```rust
enum StorageImpl {
    Mmap(MmapMut),
    Fd(FdBackend),
}

impl StorageImpl {
    fn write(&self, offset: usize, data: &[u8]) {
        match self {
            StorageImpl::Mmap(mmap) => {
                debug_assert!(offset <= mmap.len());
                debug_assert!(mmap.len() - offset >= data.len());
                unsafe {
                    let ptr = mmap.as_ptr() as *mut u8;
                    std::ptr::copy_nonoverlapping(
                        data.as_ptr(),
                        ptr.add(offset),
                        data.len(),
                    );
                }
            }
            StorageImpl::Fd(fd) => fd.write(offset, data),
        }
    }

    fn read(&self, offset: usize, dest: &mut [u8]) {
        match self {
            StorageImpl::Mmap(mmap) => {
                debug_assert!(offset + dest.len() <= mmap.len());
                let src = &mmap[offset..offset + dest.len()];
                dest.copy_from_slice(src);
            }
            StorageImpl::Fd(fd) => fd.read(offset, dest),
        }
    }

    fn flush(&self) -> std::io::Result<()> {
        match self {
            StorageImpl::Mmap(mmap) => mmap.flush(),
            StorageImpl::Fd(fd) => fd.flush(),
        }
    }
}
```
Key characteristics:
- Direct memory access: No syscall overhead for small operations
- Page faults: Large operations may trigger page faults
- Sequential batches: No parallel I/O optimization
Enabling Mmap Backend
```rust
use walrus_rust::disable_fd_backend;

// Use mmap backend (disables FD backend)
disable_fd_backend();
let wal = Walrus::new()?;
```
Important: Backend selection must happen before any Walrus instance is created. Changing the backend after instances exist results in undefined behavior.
Backend Selection
From src/wal/config.rs:5-16:
```rust
// Global flag to choose backend
pub(crate) static USE_FD_BACKEND: AtomicBool = AtomicBool::new(true);

// Public function to enable FD backend
pub fn enable_fd_backend() {
    USE_FD_BACKEND.store(true, Ordering::Relaxed);
}

// Public function to disable FD backend (use mmap instead)
pub fn disable_fd_backend() {
    USE_FD_BACKEND.store(false, Ordering::Relaxed);
}
```
Storage creation checks this flag (src/wal/storage.rs:126-137):
```rust
fn create_storage_impl(path: &str) -> std::io::Result<StorageImpl> {
    if USE_FD_BACKEND.load(Ordering::Relaxed) {
        let use_o_sync = should_use_o_sync();
        Ok(StorageImpl::Fd(FdBackend::new(path, use_o_sync)?))
    } else {
        let file = OpenOptions::new().read(true).write(true).open(path)?;
        let mmap = unsafe { MmapMut::map_mut(&file)? };
        Ok(StorageImpl::Mmap(mmap))
    }
}
```
Batch Operations
Performance on Linux with 1000 entries (each ~1KB):
| Backend | Batch Write | Batch Read | Notes |
|---|---|---|---|
| FD + io_uring | 200k/sec | 500k/sec | Parallel I/O |
| FD Sequential | 50k/sec | 100k/sec | Fallback mode |
| Mmap | 50k/sec | 100k/sec | No io_uring |
The FD backend with io_uring provides a 4-5x improvement for batch operations on Linux.
Single Operations
Performance for individual append_for_topic and read_next calls:
| Backend | Single Write | Single Read | Notes |
|---|---|---|---|
| FD | 100k/sec | 150k/sec | pread/pwrite |
| Mmap | 95k/sec | 140k/sec | Memory-mapped |
Single operations have similar performance across backends since neither uses io_uring.
Linux (FD + io_uring)
- Batch operations: 4-5x faster than other backends
- Single operations: Equivalent to mmap
- Recommended for high-throughput batch workloads
Linux (mmap)
- Batch operations: Sequential, no io_uring benefit
- Single operations: Slightly slower than FD
- Use if FD backend has compatibility issues
macOS (FD only)
- No io_uring support
- Batch operations: Sequential only
- Single operations: Good performance
Windows (mmap only)
- FD backend unavailable (Unix-specific APIs)
- Must use mmap backend
- Good cross-platform compatibility
Decision Matrix
| Platform | Workload Type | Recommended Backend | Reasoning |
|---|---|---|---|
| Linux | Batch-heavy | FD (default) | io_uring acceleration |
| Linux | Single operations | FD or Mmap | Similar performance |
| macOS | Any | FD (default) | Better compatibility |
| Windows | Any | Mmap | FD unavailable |
| BSD | Any | FD (default) | Unix APIs available |
Usage Examples
Linux High-Throughput
```rust
use walrus_rust::{enable_fd_backend, Walrus};

// Ensure FD backend with io_uring (default)
enable_fd_backend();
let wal = Walrus::new()?;

// Batch operations automatically use io_uring
let batch: Vec<Vec<u8>> = (0..1000)
    .map(|i| format!("event-{}", i).into_bytes())
    .collect();
let batch_refs: Vec<&[u8]> = batch.iter().map(|v| v.as_slice()).collect();
wal.batch_append_for_topic("events", &batch_refs)?;
// ~200k entries/sec with io_uring

let entries = wal.batch_read_for_topic("events", 1024 * 1024, true, None)?;
// ~500k entries/sec with io_uring
```
Windows Compatibility
```rust
use walrus_rust::{disable_fd_backend, Walrus};

// Use mmap backend for Windows compatibility
disable_fd_backend();
let wal = Walrus::new()?;

// Batch operations use sequential I/O
let batch = vec![b"entry 1".as_slice(), b"entry 2".as_slice()];
wal.batch_append_for_topic("events", &batch)?;
// ~50k entries/sec (sequential)
```
Cross-Platform Selection
```rust
use walrus_rust::{enable_fd_backend, disable_fd_backend, Walrus};

#[cfg(unix)]
{
    // Use FD backend on Unix platforms
    enable_fd_backend();
}
#[cfg(not(unix))]
{
    // Use mmap backend on Windows
    disable_fd_backend();
}
let wal = Walrus::new()?;
```
Testing Both Backends
```rust
use walrus_rust::{enable_fd_backend, disable_fd_backend, Walrus};

// Test with FD backend
{
    enable_fd_backend();
    let wal = Walrus::new()?;
    test_operations(&wal)?;
}

// Test with mmap backend
{
    disable_fd_backend();
    let wal = Walrus::new()?;
    test_operations(&wal)?;
}
```
O_SYNC Mode
When FsyncSchedule::SyncEach is configured with the FD backend, files are opened with the O_SYNC flag:
```rust
use walrus_rust::{Walrus, ReadConsistency, FsyncSchedule};

let wal = Walrus::with_consistency_and_schedule(
    ReadConsistency::StrictlyAtOnce,
    FsyncSchedule::SyncEach, // Triggers O_SYNC on FD backend
)?;

// Every write is synchronous at the kernel level
wal.append_for_topic("critical", b"data")?;
```
From src/wal/storage.rs:119-124:
```rust
fn should_use_o_sync() -> bool {
    GLOBAL_FSYNC_SCHEDULE
        .get()
        .map(|s| matches!(s, FsyncSchedule::SyncEach))
        .unwrap_or(false)
}
```
This provides maximum durability with minimal application-level fsync calls.
Thread Safety
Both backends are thread-safe and support concurrent reads and writes:
From src/wal/storage.rs:139-152:
```rust
struct SharedMmap {
    storage: StorageImpl,
    last_touched_at: AtomicU64,
}

// SAFETY: SharedMmap provides interior mutability only via methods that
// enforce bounds and perform atomic timestamp updates
unsafe impl Sync for SharedMmap {}
unsafe impl Send for SharedMmap {}
```
Multiple threads can safely:
- Read from different topics
- Write to different topics
- Read and write to the same topic (with internal locking)
Troubleshooting
io_uring Unavailable
If io_uring initialization fails, batch operations automatically fall back to sequential I/O:
```
[batch] io_uring unavailable; falling back: io_uring init failed
```
This is not an error - operations continue with sequential I/O. Common causes:
- Kernel older than 5.1
- `CONFIG_IO_URING` not enabled
- Resource limits (check `ulimit -l`)
FD Backend on Windows
Attempting to use FD backend on Windows will fail:
```rust
use walrus_rust::Walrus;

// On Windows, this may fail due to missing Unix APIs
let wal = Walrus::new()?; // Error: Unix APIs unavailable
```
Solution: Use mmap backend:
```rust
use walrus_rust::{disable_fd_backend, Walrus};

disable_fd_backend();
let wal = Walrus::new()?; // Works on Windows
```
Best Practices
- Default to FD Backend: use the default FD backend on Unix platforms for best performance
- Select Before Instance Creation: always set the backend before creating any `Walrus` instances
- Test Both Backends: if targeting multiple platforms, test with both backends
- Monitor io_uring: check logs for io_uring fallback messages on Linux
- Use Batch Operations on Linux: leverage io_uring via `batch_append_for_topic` and `batch_read_for_topic`
Next Steps