Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Zero-Cost Alternatives for I/O Operations

This document analyzes alternatives to dynamic dispatch (Box<dyn Trait>) for streaming I/O and directory iteration.

Decision: See ADR-025: Strategic Boxing for the formal decision.

TL;DR: We follow Tower/Axum’s approach - zero-cost on hot path (read(), write()), box at cold path boundaries (open_read(), read_dir()). We avoid heap allocations and dynamic dispatch unless they buy flexibility with negligible performance impact.


Current Design (Dynamic Dispatch)

#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
}

pub trait FsDir: Send + Sync {
    fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError>;
}

// Where ReadDirIter is:
pub struct ReadDirIter(Box<dyn Iterator<Item = Result<DirEntry, FsError>> + Send>);
}

Cost: One heap allocation per open_read(), open_write(), or read_dir() call.


Option 1: Associated Types (Classic Approach)

#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
    type Reader: Read + Send;

    fn open_read(&self, path: &Path) -> Result<Self::Reader, FsError>;
}

pub trait FsDir: Send + Sync {
    type DirIter: Iterator<Item = Result<DirEntry, FsError>> + Send;

    fn read_dir(&self, path: &Path) -> Result<Self::DirIter, FsError>;
}
}

Implementation

#![allow(unused)]
fn main() {
impl FsRead for MemoryBackend {
    type Reader = std::io::Cursor<Vec<u8>>;

    fn open_read(&self, path: &Path) -> Result<Self::Reader, FsError> {
        let data = self.read(path)?;
        Ok(std::io::Cursor::new(data))
    }
}

impl FsDir for MemoryBackend {
    type DirIter = std::vec::IntoIter<Result<DirEntry, FsError>>;

    fn read_dir(&self, path: &Path) -> Result<Self::DirIter, FsError> {
        let entries = self.collect_entries(path)?;
        Ok(entries.into_iter())
    }
}
}

Middleware Propagation Problem

#![allow(unused)]
fn main() {
impl<B: FsRead> FsRead for Quota<B> {
    // Must define our own Reader type that wraps B::Reader
    type Reader = QuotaReader<B::Reader>;

    fn open_read(&self, path: &Path) -> Result<Self::Reader, FsError> {
        let inner = self.inner.open_read(path)?;
        Ok(QuotaReader::new(inner, self.usage.clone()))
    }
}

// Every middleware needs a custom wrapper type
struct QuotaReader<R> {
    inner: R,
    usage: Arc<RwLock<QuotaUsage>>,
}

impl<R: Read> Read for QuotaReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        // Track bytes read if needed
        self.inner.read(buf)
    }
}
}

The Type Explosion

With a middleware stack like Quota<PathFilter<Tracing<MemoryBackend>>>:

#![allow(unused)]
fn main() {
type FinalReader = QuotaReader<PathFilterReader<TracingReader<Cursor<Vec<u8>>>>>;
type FinalDirIter = QuotaIter<PathFilterIter<TracingIter<IntoIter<Result<DirEntry, FsError>>>>>;
}

Verdict

AspectAssessment
Heap allocations✅ None
Type complexity❌ Exponential growth
Middleware authoring❌ Every middleware needs wrapper types
User ergonomics⚠️ Type annotations become unwieldy
Compile times❌ Longer due to monomorphization

Not recommended as the primary API due to complexity explosion.


Option 2: RPITIT (Rust 1.75+)

Return Position Impl Trait in Traits allows:

#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
    fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError>;
}

pub trait FsDir: Send + Sync {
    fn read_dir(&self, path: &Path)
        -> Result<impl Iterator<Item = Result<DirEntry, FsError>> + Send, FsError>;
}
}

How It Works

The compiler infers a unique anonymous type for each implementor:

#![allow(unused)]
fn main() {
impl FsRead for MemoryBackend {
    fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError> {
        let data = self.read(path)?;
        Ok(std::io::Cursor::new(data))  // Returns Cursor<Vec<u8>>, but caller sees impl Read
    }
}

impl FsRead for SqliteBackend {
    fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError> {
        Ok(SqliteReader::new(self.conn.clone(), path))  // Different type, same interface
    }
}
}

Middleware Still Works

#![allow(unused)]
fn main() {
impl<B: FsRead> FsRead for Tracing<B> {
    fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError> {
        let span = tracing::span!(Level::DEBUG, "open_read");
        let _guard = span.enter();
        self.inner.open_read(path)  // Just forward - return type is inferred
    }
}
}

The Catch: Object Safety

RPITIT makes traits non-object-safe. You cannot do:

#![allow(unused)]
fn main() {
// This won't compile with RPITIT
let backends: Vec<Box<dyn FsRead>> = vec![...];
}

Verdict

AspectAssessment
Heap allocations✅ None
Type complexity✅ Hidden behind impl Trait
Middleware authoring✅ Simple forwarding
User ergonomics✅ Clean API
Object safety❌ Lost - can’t use dyn FsRead
Rust version⚠️ Requires 1.75+

Good for performance-critical paths but sacrifices dyn usage.


Option 3: Generic Associated Types (GATs)

For readers that borrow from the backend:

#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
    type Reader<'a>: Read + Send where Self: 'a;

    fn open_read(&self, path: &Path) -> Result<Self::Reader<'_>, FsError>;
}
}

Use Case: Zero-Copy Reads

#![allow(unused)]
fn main() {
impl FsRead for MemoryBackend {
    type Reader<'a> = &'a [u8];  // Borrow directly from internal storage!

    fn open_read(&self, path: &Path) -> Result<Self::Reader<'_>, FsError> {
        let data = self.storage.read().unwrap();
        let bytes = data.get(path.as_ref())
            .ok_or(FsError::NotFound { path: path.as_ref().to_path_buf() })?;
        Ok(bytes.as_slice())
    }
}
}

Complexity

GATs are powerful but add significant complexity:

  • Lifetime parameters propagate through middleware
  • Not all backends can provide borrowed data (SQLite must copy)
  • Makes trait definitions harder to understand

Verdict

AspectAssessment
Heap allocations✅ Can be zero-copy
Type complexity❌ High (lifetimes everywhere)
Middleware authoring❌ Complex lifetime handling
Use case fit⚠️ Only benefits backends with owned data

Overkill for most use cases. Consider only for specialized zero-copy scenarios.


Provide both dynamic and static APIs:

#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
    /// Dynamic dispatch version (simple, flexible)
    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
}

/// Extension trait for zero-cost static dispatch
pub trait FsReadTyped: FsRead {
    type Reader: Read + Send;

    /// Static dispatch version (zero-cost, less flexible)
    fn open_read_typed(&self, path: &Path) -> Result<Self::Reader, FsError>;
}

// Blanket impl for convenience when types align
impl<T: FsReadTyped> FsRead for T {
    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError> {
        Ok(Box::new(self.open_read_typed(path)?))
    }
}
}

Usage

#![allow(unused)]
fn main() {
// Default: dynamic dispatch (works everywhere)
let reader = fs.open_read("/file.txt")?;

// Performance-critical: static dispatch
let reader: MemoryReader = fs.open_read_typed("/file.txt")?;
}

Verdict

AspectAssessment
Heap allocations✅ Optional (use _typed to avoid)
Type complexity✅ Hidden unless you opt-in
Middleware authoring✅ Only implement base trait
User ergonomics✅ Simple default, power when needed
Object safety✅ Base trait remains object-safe

Best of both worlds - simple default, zero-cost opt-in.


Option 5: Callback-Based Iteration

Avoid returning iterators entirely:

#![allow(unused)]
fn main() {
pub trait FsDir: Send + Sync {
    fn for_each_entry<F>(&self, path: &Path, f: F) -> Result<(), FsError>
    where
        F: FnMut(DirEntry) -> ControlFlow<(), ()>;
}
}

Usage

#![allow(unused)]
fn main() {
fs.for_each_entry("/dir", |entry| {
    println!("{}", entry.name);
    ControlFlow::Continue(())
})?;
}

Verdict

AspectAssessment
Heap allocations✅ None
Ergonomics❌ Callbacks are awkward
Early exit✅ Via ControlFlow::Break
Composability❌ Can’t chain iterator methods

Not recommended as primary API. Could be added as optimization option.


Option 6: Stack-Allocated Small Buffer

For directory iteration, most directories are small:

#![allow(unused)]
fn main() {
use smallvec::SmallVec;

pub struct ReadDirIter {
    // Stack-allocate up to 32 entries, heap only if larger
    entries: SmallVec<[Result<DirEntry, FsError>; 32]>,
    index: usize,
}
}

Verdict

AspectAssessment
Heap allocations⚠️ Avoided for small directories
Memory overhead⚠️ Larger stack frames
Dependencies⚠️ Adds smallvec crate

Reasonable optimization for directory iteration specifically.


Recommendation

Primary API: Keep Dynamic Dispatch

#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
}

pub trait FsDir: Send + Sync {
    fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError>;
}
}

Why:

  1. Simplicity - One type to learn, one API
  2. Object safety - Can use Box<dyn Fs> for runtime polymorphism
  3. Middleware simplicity - No wrapper types needed
  4. Actual cost is low - One allocation per stream open, not per read

Optional: Static Dispatch Extension (Fast Path)

For performance-critical code, offer typed variants. This is the first-class fast path for hot loops when the backend type is known:

#![allow(unused)]
fn main() {
pub trait FsReadTyped: FsRead {
    type Reader: Read + Send;
    fn open_read_typed(&self, path: &Path) -> Result<Self::Reader, FsError>;
}
}

Future: RPITIT When Object Safety Not Needed

If a user doesn’t need dyn Fs, they can define their own trait:

#![allow(unused)]
fn main() {
pub trait FsReadStatic: Send + Sync {
    fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError>;
}
}

Cost Analysis: Is It Actually a Problem?

Heap Allocation Cost

OperationAllocationsTypical SizeCost
open_read()1~24-48 bytes (vtable + pointer)~20-50ns
read() (data)0-1File sizeDominates
read_dir()1~24-48 bytes~20-50ns
Iteration0--

The allocation is dwarfed by actual I/O time. For a 4KB file read from SQLite or disk, the Box allocation is <0.1% of total time.

When It Matters

ScenarioMatters?
Reading large filesNo - I/O dominates
Reading many small filesMaybe - consider batching
Hot loop micro-benchmarksYes
Real-world applicationsRarely

Conclusion

Dynamic dispatch is the right default. The cost is negligible for real workloads, and the ergonomic benefits are substantial. Offer static dispatch as an opt-in escape hatch for the rare cases where it matters.


Summary Decision Matrix

ApproachAlloc-FreeSimpleObject-SafeRecommended
Current (Box<dyn>)✅ Default
Associated Types❌ Too complex
RPITIT⚠️ When no dyn needed
GATs❌ Overkill
Hybrid✅ opt-in✅ Best of both
Callbacks❌ Awkward API
SmallVec⚠️⚠️ For ReadDirIter