Zero-Cost Alternatives for I/O Operations

This document analyzes alternatives to dynamic dispatch (Box<dyn Trait>) for streaming I/O and directory iteration.

Decision: See ADR-025: Strategic Boxing for the formal decision.

TL;DR: We follow the approach used by Tower and Axum: stay zero-cost on the hot path (read(), write()) and box at cold-path boundaries (open_read(), read_dir()). We avoid heap allocations and dynamic dispatch unless they buy flexibility at a negligible performance cost.

Current Design (Dynamic Dispatch)

use std::io::Read;
use std::path::Path;

pub trait FsRead: Send + Sync {
    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
}

pub trait FsDir: Send + Sync {
    fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError>;
}

// Where ReadDirIter is:
pub struct ReadDirIter(Box<dyn Iterator<Item = Result<DirEntry, FsError>> + Send>);

Cost: One heap allocation per open_read(), open_write(), or read_dir() call.
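To make that cost concrete, here is a minimal, self-contained sketch of the boxed design. The `MemoryBackend` and `EmptyBackend` types, their fields, the `sizes` helper, and the single-variant `FsError` are assumptions made for illustration; they are not the crate's actual definitions. The sketch shows that the Box is created once per open_read() and that differently typed backends can share one collection.

```rust
use std::collections::HashMap;
use std::io::{Cursor, Read};
use std::path::{Path, PathBuf};

/// Hypothetical error type; the real FsError is not shown in this document.
#[derive(Debug)]
pub enum FsError {
    NotFound { path: PathBuf },
}

pub trait FsRead: Send + Sync {
    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
}

/// Hypothetical in-memory backend: path -> file contents.
pub struct MemoryBackend {
    pub files: HashMap<PathBuf, Vec<u8>>,
}

impl FsRead for MemoryBackend {
    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError> {
        let data = self
            .files
            .get(path)
            .cloned()
            .ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })?;
        // The Box is created here: once per open, not once per read().
        Ok(Box::new(Cursor::new(data)))
    }
}

/// Hypothetical backend in which every file is empty.
pub struct EmptyBackend;

impl FsRead for EmptyBackend {
    fn open_read(&self, _path: &Path) -> Result<Box<dyn Read + Send>, FsError> {
        Ok(Box::new(std::io::empty()))
    }
}

/// Opens `path` on every backend and returns how many bytes each one yielded.
pub fn sizes(backends: &[Box<dyn FsRead>], path: &Path) -> Result<Vec<usize>, FsError> {
    let mut out = Vec::new();
    for fs in backends {
        let mut reader = fs.open_read(path)?;
        let mut buf = Vec::new();
        // Reading from an in-memory source does not fail in this sketch.
        reader.read_to_end(&mut buf).expect("in-memory read");
        out.push(buf.len());
    }
    Ok(out)
}
```

Every read() after the open goes through the vtable, but no further allocation is caused by the boxing itself.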

Option 1: Associated Types (Classic Approach)

pub trait FsRead: Send + Sync {
    type Reader: Read + Send;

    fn open_read(&self, path: &Path) -> Result<Self::Reader, FsError>;
}

pub trait FsDir: Send + Sync {
    type DirIter: Iterator<Item = Result<DirEntry, FsError>> + Send;

    fn read_dir(&self, path: &Path) -> Result<Self::DirIter, FsError>;
}

Implementation

impl FsRead for MemoryBackend {
    type Reader = std::io::Cursor<Vec<u8>>;

    fn open_read(&self, path: &Path) -> Result<Self::Reader, FsError> {
        let data = self.read(path)?;
        Ok(std::io::Cursor::new(data))
    }
}

impl FsDir for MemoryBackend {
    type DirIter = std::vec::IntoIter<Result<DirEntry, FsError>>;

    fn read_dir(&self, path: &Path) -> Result<Self::DirIter, FsError> {
        let entries = self.collect_entries(path)?;
        Ok(entries.into_iter())
    }
}

Middleware Propagation Problem

impl<B: FsRead> FsRead for Quota<B> {
    // Must define our own Reader type that wraps B::Reader
    type Reader = QuotaReader<B::Reader>;

    fn open_read(&self, path: &Path) -> Result<Self::Reader, FsError> {
        let inner = self.inner.open_read(path)?;
        Ok(QuotaReader::new(inner, self.usage.clone()))
    }
}

// Every middleware needs a custom wrapper type
struct QuotaReader<R> {
    inner: R,
    usage: Arc<RwLock<QuotaUsage>>,
}

impl<R: Read> Read for QuotaReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        // Track bytes read if needed
        self.inner.read(buf)
    }
}
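The wrapper above only forwards. As a rough illustration of what "track bytes read" could look like, the sketch below counts bytes in a shared atomic counter. `CountingReader` and `count_bytes` are hypothetical names, and the atomic counter is a stand-in for the `QuotaUsage` state, whose real shape is not shown here.

```rust
use std::io::{self, Cursor, Read};
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical wrapper: forwards reads and records how many bytes passed through.
pub struct CountingReader<R> {
    inner: R,
    bytes_read: Arc<AtomicU64>,
}

impl<R> CountingReader<R> {
    pub fn new(inner: R, bytes_read: Arc<AtomicU64>) -> Self {
        CountingReader { inner, bytes_read }
    }
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Static dispatch: the call to the inner reader can be inlined.
        let n = self.inner.read(buf)?;
        self.bytes_read.fetch_add(n as u64, Ordering::Relaxed);
        Ok(n)
    }
}

/// Drains an in-memory buffer through the wrapper and returns the recorded count.
pub fn count_bytes(data: Vec<u8>) -> io::Result<u64> {
    let counter = Arc::new(AtomicU64::new(0));
    let mut reader = CountingReader::new(Cursor::new(data), Arc::clone(&counter));
    let mut sink = Vec::new();
    reader.read_to_end(&mut sink)?;
    Ok(counter.load(Ordering::Relaxed))
}
```

Each middleware that touches the stream needs a type like this, which is where the nesting in the next section comes from.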

The Type Explosion

With a middleware stack like Quota<PathFilter<Tracing<MemoryBackend>>>:

type FinalReader = QuotaReader<PathFilterReader<TracingReader<Cursor<Vec<u8>>>>>;
type FinalDirIter = QuotaIter<PathFilterIter<TracingIter<IntoIter<Result<DirEntry, FsError>>>>>;

Verdict

Aspect	Assessment
Heap allocations	✅ None
Type complexity	❌ Grows with every middleware layer
Middleware authoring	❌ Every middleware needs wrapper types
User ergonomics	⚠️ Type annotations become unwieldy
Compile times	❌ Longer due to monomorphization

Not recommended as the primary API: wrapper types and nested type names pile up with every middleware layer.

Option 2: RPITIT (Rust 1.75+)

Return Position Impl Trait in Traits allows:

pub trait FsRead: Send + Sync {
    fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError>;
}

pub trait FsDir: Send + Sync {
    fn read_dir(&self, path: &Path)
        -> Result<impl Iterator<Item = Result<DirEntry, FsError>> + Send, FsError>;
}

How It Works

The compiler infers a unique anonymous type for each implementor:

impl FsRead for MemoryBackend {
    fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError> {
        let data = self.read(path)?;
        Ok(std::io::Cursor::new(data))  // Returns Cursor<Vec<u8>>, but caller sees impl Read
    }
}

impl FsRead for SqliteBackend {
    fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError> {
        Ok(SqliteReader::new(self.conn.clone(), path))  // Different type, same interface
    }
}

Middleware Still Works

impl<B: FsRead> FsRead for Tracing<B> {
    fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError> {
        let span = tracing::span!(Level::DEBUG, "open_read");
        let _guard = span.enter();
        self.inner.open_read(path)  // Just forward - return type is inferred
    }
}

The Catch: Object Safety

RPITIT makes traits non-object-safe. You cannot do:

// This won't compile with RPITIT
let backends: Vec<Box<dyn FsRead>> = vec![...];
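When a heterogeneous collection is not required, callers can stay generic over the backend instead of reaching for dyn. The sketch below assumes a hypothetical `MemoryBackend`, a `CountOpens` middleware standing in for `Tracing` (so the example does not depend on the tracing crate), a `read_all` helper, and a simplified `FsError`. None of these are taken from the actual crate, and the code needs Rust 1.75 or later.

```rust
use std::collections::HashMap;
use std::io::{Cursor, Read};
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical error type for this sketch.
#[derive(Debug)]
pub enum FsError {
    NotFound { path: PathBuf },
    Io(std::io::Error),
}

pub trait FsRead: Send + Sync {
    fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError>;
}

/// Hypothetical in-memory backend.
pub struct MemoryBackend {
    pub files: HashMap<PathBuf, Vec<u8>>,
}

impl FsRead for MemoryBackend {
    fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError> {
        let data = self
            .files
            .get(path)
            .cloned()
            .ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })?;
        Ok(Cursor::new(data))
    }
}

/// Hypothetical middleware: counts opens, then forwards to the inner backend.
pub struct CountOpens<B> {
    pub inner: B,
    pub opens: AtomicUsize,
}

impl<B: FsRead> FsRead for CountOpens<B> {
    fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError> {
        self.opens.fetch_add(1, Ordering::Relaxed);
        self.inner.open_read(path) // no wrapper type, no Box
    }
}

/// Generic caller: monomorphized per backend, so no dyn is involved.
pub fn read_all<F: FsRead>(fs: &F, path: &Path) -> Result<Vec<u8>, FsError> {
    let mut reader = fs.open_read(path)?;
    let mut buf = Vec::new();
    reader.read_to_end(&mut buf).map_err(FsError::Io)?;
    Ok(buf)
}
```

The trade-off is that `read_all` is compiled once per concrete stack, and a `Vec` of mixed backends still needs an enum or a separate boxed trait.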

Verdict

Aspect	Assessment
Heap allocations	✅ None
Type complexity	✅ Hidden behind `impl Trait`
Middleware authoring	✅ Simple forwarding
User ergonomics	✅ Clean API
Object safety	❌ Lost - can’t use `dyn FsRead`
Rust version	⚠️ Requires 1.75+

Good for performance-critical paths but sacrifices dyn usage.

Option 3: Generic Associated Types (GATs)

For readers that borrow from the backend:

pub trait FsRead: Send + Sync {
    type Reader<'a>: Read + Send where Self: 'a;

    fn open_read(&self, path: &Path) -> Result<Self::Reader<'_>, FsError>;
}

Use Case: Zero-Copy Reads

impl FsRead for MemoryBackend {
    type Reader<'a> = &'a [u8];  // Borrow directly from internal storage!

    fn open_read(&self, path: &Path) -> Result<Self::Reader<'_>, FsError> {
        // Assumes `storage` is a plain map owned by the backend (for example
        // HashMap<PathBuf, Vec<u8>>). A slice borrowed through an RwLock guard
        // cannot be returned, because the guard is dropped when this function exits.
        let bytes = self.storage.get(path)
            .ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })?;
        Ok(bytes.as_slice())
    }
}

Complexity

GATs are powerful but add significant complexity:

Lifetime parameters propagate through middleware
Not all backends can provide borrowed data (SQLite must copy)
Makes trait definitions harder to understand
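To illustrate the first point, the sketch below shows how even a do-nothing middleware has to restate the lifetime-parameterized reader type. `MemoryBackend` with a plain `HashMap`, the `Passthrough` middleware, and the simplified `FsError` are hypothetical and exist only for this example.

```rust
use std::collections::HashMap;
use std::io::Read;
use std::path::{Path, PathBuf};

/// Hypothetical error type for this sketch.
#[derive(Debug)]
pub enum FsError {
    NotFound { path: PathBuf },
}

pub trait FsRead: Send + Sync {
    type Reader<'a>: Read + Send
    where
        Self: 'a;

    fn open_read(&self, path: &Path) -> Result<Self::Reader<'_>, FsError>;
}

/// Hypothetical backend that owns its data without a lock, so it can lend slices.
pub struct MemoryBackend {
    pub files: HashMap<PathBuf, Vec<u8>>,
}

impl FsRead for MemoryBackend {
    type Reader<'a> = &'a [u8];

    fn open_read(&self, path: &Path) -> Result<Self::Reader<'_>, FsError> {
        self.files
            .get(path)
            .map(|bytes| bytes.as_slice())
            .ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })
    }
}

/// Hypothetical middleware that adds nothing, yet must still thread the
/// lifetime through its own associated type.
pub struct Passthrough<B> {
    pub inner: B,
}

impl<B: FsRead> FsRead for Passthrough<B> {
    type Reader<'a> = B::Reader<'a>
    where
        Self: 'a;

    fn open_read(&self, path: &Path) -> Result<Self::Reader<'_>, FsError> {
        self.inner.open_read(path)
    }
}
```

A middleware that wraps the reader would additionally need its own wrapper struct carrying the same lifetime, combining the costs of Option 1 with the lifetime bookkeeping shown here.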

Verdict

Aspect	Assessment
Heap allocations	✅ Can be zero-copy
Type complexity	❌ High (lifetimes everywhere)
Middleware authoring	❌ Complex lifetime handling
Use case fit	⚠️ Only benefits backends that can lend data they already hold in memory

Overkill for most use cases. Consider only for specialized zero-copy scenarios.

Option 4: Hybrid Approach (Recommended)

Provide both dynamic and static APIs:

pub trait FsRead: Send + Sync {
    /// Dynamic dispatch version (simple, flexible)
    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
}

/// Extension trait for zero-cost static dispatch
pub trait FsReadTyped: FsRead {
    // 'static is required so the blanket impl below can box the reader
    // as Box<dyn Read + Send>.
    type Reader: Read + Send + 'static;

    /// Static dispatch version (zero-cost, less flexible)
    fn open_read_typed(&self, path: &Path) -> Result<Self::Reader, FsError>;
}

// Blanket impl for convenience when types align
impl<T: FsReadTyped> FsRead for T {
    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError> {
        Ok(Box::new(self.open_read_typed(path)?))
    }
}

Usage

// Default: dynamic dispatch (works everywhere)
let reader = fs.open_read(Path::new("/file.txt"))?;

// Performance-critical: static dispatch
let reader: MemoryReader = fs.open_read_typed(Path::new("/file.txt"))?;
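The following is a self-contained sketch of the hybrid idea under a few assumptions: `MemoryBackend`, the `read_dyn` and `read_typed` helpers, and the two-variant `FsError` are hypothetical, and `FsReadTyped` is declared with `Send + Sync` directly rather than as a subtrait of `FsRead`, which keeps the blanket impl easy to reason about. The reader type carries a `'static` bound so it can be boxed.

```rust
use std::collections::HashMap;
use std::io::{Cursor, Read};
use std::path::{Path, PathBuf};

/// Hypothetical error type for this sketch.
#[derive(Debug)]
pub enum FsError {
    NotFound { path: PathBuf },
    Io(std::io::Error),
}

pub trait FsRead: Send + Sync {
    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
}

pub trait FsReadTyped: Send + Sync {
    type Reader: Read + Send + 'static;
    fn open_read_typed(&self, path: &Path) -> Result<Self::Reader, FsError>;
}

// Any typed backend gets the boxed API for free.
impl<T: FsReadTyped> FsRead for T {
    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError> {
        Ok(Box::new(self.open_read_typed(path)?))
    }
}

/// Hypothetical in-memory backend.
pub struct MemoryBackend {
    pub files: HashMap<PathBuf, Vec<u8>>,
}

impl FsReadTyped for MemoryBackend {
    type Reader = Cursor<Vec<u8>>;

    fn open_read_typed(&self, path: &Path) -> Result<Self::Reader, FsError> {
        let data = self
            .files
            .get(path)
            .cloned()
            .ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })?;
        Ok(Cursor::new(data))
    }
}

/// Default path: works with any backend behind a trait object.
pub fn read_dyn(fs: &dyn FsRead, path: &Path) -> Result<Vec<u8>, FsError> {
    let mut buf = Vec::new();
    fs.open_read(path)?.read_to_end(&mut buf).map_err(FsError::Io)?;
    Ok(buf)
}

/// Fast path: the reader type is known at compile time, so nothing is boxed.
pub fn read_typed<F: FsReadTyped>(fs: &F, path: &Path) -> Result<Vec<u8>, FsError> {
    let mut buf = Vec::new();
    fs.open_read_typed(path)?.read_to_end(&mut buf).map_err(FsError::Io)?;
    Ok(buf)
}
```

Both helpers return the same bytes; only the dispatch strategy differs. A middleware that implements just `FsRead` keeps working with `read_dyn` but is not usable with `read_typed`.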

Verdict

Aspect	Assessment
Heap allocations	✅ Optional (use `_typed` to avoid)
Type complexity	✅ Hidden unless you opt-in
Middleware authoring	✅ Only implement base trait
User ergonomics	✅ Simple default, power when needed
Object safety	✅ Base trait remains object-safe

Best of both worlds - simple default, zero-cost opt-in.

Option 5: Callback-Based Iteration

Avoid returning iterators entirely:

pub trait FsDir: Send + Sync {
    fn for_each_entry<F>(&self, path: &Path, f: F) -> Result<(), FsError>
    where
        F: FnMut(DirEntry) -> ControlFlow<(), ()>;
}

Usage

fs.for_each_entry(Path::new("/dir"), |entry| {
    println!("{}", entry.name);
    ControlFlow::Continue(())
})?;
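One caveat: a method with a generic type parameter such as `F` cannot be called through `dyn FsDir`. A variant that stays usable as a trait object takes the callback as `&mut dyn FnMut` instead. The sketch below assumes a hypothetical `MemoryDirs` backend, a `first_names` helper, and simplified `DirEntry` and `FsError` types, and it also shows early exit through `ControlFlow::Break`.

```rust
use std::collections::HashMap;
use std::ops::ControlFlow;
use std::path::{Path, PathBuf};

/// Hypothetical error type for this sketch.
#[derive(Debug)]
pub enum FsError {
    NotFound { path: PathBuf },
}

/// Simplified stand-in for the real DirEntry.
pub struct DirEntry {
    pub name: String,
}

pub trait FsDir: Send + Sync {
    /// No generic parameter, so `dyn FsDir` remains usable.
    fn for_each_entry(
        &self,
        path: &Path,
        f: &mut dyn FnMut(DirEntry) -> ControlFlow<()>,
    ) -> Result<(), FsError>;
}

/// Hypothetical backend: directory path -> entry names.
pub struct MemoryDirs {
    pub dirs: HashMap<PathBuf, Vec<String>>,
}

impl FsDir for MemoryDirs {
    fn for_each_entry(
        &self,
        path: &Path,
        f: &mut dyn FnMut(DirEntry) -> ControlFlow<()>,
    ) -> Result<(), FsError> {
        let names = self
            .dirs
            .get(path)
            .ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })?;
        for name in names {
            // Stop as soon as the callback asks for it.
            if f(DirEntry { name: name.clone() }).is_break() {
                break;
            }
        }
        Ok(())
    }
}

/// Collects at most `limit` entry names, stopping the iteration early.
pub fn first_names(fs: &dyn FsDir, path: &Path, limit: usize) -> Result<Vec<String>, FsError> {
    let mut out = Vec::new();
    fs.for_each_entry(path, &mut |entry: DirEntry| {
        if out.len() >= limit {
            return ControlFlow::Break(());
        }
        out.push(entry.name);
        ControlFlow::Continue(())
    })?;
    Ok(out)
}
```

The callback is now invoked through a vtable, so this variant trades the iterator allocation for one indirect call per entry.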

Verdict

Aspect	Assessment
Heap allocations	✅ None
Ergonomics	❌ Callbacks are awkward
Early exit	✅ Via `ControlFlow::Break`
Composability	❌ Can’t chain iterator methods

Not recommended as the primary API. Could be added as an optimization option.

Option 6: Stack-Allocated Small Buffer

Most directories are small, so a directory iterator can store its entries inline and fall back to the heap only for large directories:

use smallvec::SmallVec;

pub struct ReadDirIter {
    // Store up to 32 entries inline (no heap), spill to the heap only if larger
    entries: SmallVec<[Result<DirEntry, FsError>; 32]>,
    index: usize,
}

Verdict

Aspect	Assessment
Heap allocations	⚠️ Avoided for small directories
Memory overhead	⚠️ Larger stack frames
Dependencies	⚠️ Adds `smallvec` crate

Reasonable optimization for directory iteration specifically.

Recommendation

Primary API: Keep Dynamic Dispatch

pub trait FsRead: Send + Sync {
    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
}

pub trait FsDir: Send + Sync {
    fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError>;
}

Why:

Simplicity - One type to learn, one API
Object safety - Can use Box<dyn Fs> for runtime polymorphism
Middleware simplicity - No wrapper types needed
Actual cost is low - One allocation per stream open, not per read
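As a rough sketch of how the boxed `ReadDirIter` stays a single, nameable type regardless of backend, the newtype can simply delegate to the iterator it wraps. The `new` constructor, the `names` helper, and the simplified `DirEntry` and `FsError` types are assumptions for this example.

```rust
use std::path::PathBuf;

/// Hypothetical error type for this sketch.
#[derive(Debug)]
pub enum FsError {
    NotFound { path: PathBuf },
}

/// Simplified stand-in for the real DirEntry.
#[derive(Debug)]
pub struct DirEntry {
    pub name: String,
}

pub struct ReadDirIter(Box<dyn Iterator<Item = Result<DirEntry, FsError>> + Send>);

impl ReadDirIter {
    /// Boxes any suitable iterator; this is the one allocation per read_dir().
    pub fn new<I>(iter: I) -> Self
    where
        I: Iterator<Item = Result<DirEntry, FsError>> + Send + 'static,
    {
        ReadDirIter(Box::new(iter))
    }
}

impl Iterator for ReadDirIter {
    type Item = Result<DirEntry, FsError>;

    fn next(&mut self) -> Option<Self::Item> {
        self.0.next()
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        self.0.size_hint()
    }
}

/// Callers keep the full iterator toolbox (map, filter, collect, ...).
pub fn names(iter: ReadDirIter) -> Result<Vec<String>, FsError> {
    iter.map(|entry| entry.map(|e| e.name)).collect()
}
```

Middleware can wrap the iterator with ordinary adapters and pass the result back through `ReadDirIter::new`, without defining a wrapper type of its own.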

Optional: Static Dispatch Extension (Fast Path)

For performance-critical code, offer typed variants. This is the first-class fast path for hot loops when the backend type is known:

pub trait FsReadTyped: FsRead {
    // 'static lets a blanket FsRead impl box the reader as Box<dyn Read + Send>
    type Reader: Read + Send + 'static;
    fn open_read_typed(&self, path: &Path) -> Result<Self::Reader, FsError>;
}

Future: RPITIT When Object Safety Not Needed

If a user doesn’t need dyn Fs, they can define their own trait:

pub trait FsReadStatic: Send + Sync {
    fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError>;
}

Cost Analysis: Is It Actually a Problem?

Heap Allocation Cost

Operation	Allocations	Typical Size	Cost
`open_read()`	1	~24-48 bytes (the boxed reader's state)	~20-50ns
`read()` (data)	0-1	File size	Dominates
`read_dir()`	1	~24-48 bytes	~20-50ns
Iteration	0	-	-

The allocation is dwarfed by actual I/O time. For a 4 KB file read from SQLite or disk, the Box allocation is <0.1% of total time: a 50 ns allocation stays under 0.1% whenever the read itself takes more than about 50 µs.
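The size and share figures can be sanity-checked with a few lines of code. This is only a sketch: the three helper functions are hypothetical, the sizes depend on the target, and the nanosecond inputs are the estimates from the table above rather than measurements.

```rust
use std::io::{Cursor, Read};
use std::mem::size_of;

/// Size of the handle returned by open_read(): a fat pointer (data pointer + vtable pointer).
/// The vtable itself is static data and is not part of the heap allocation.
pub fn boxed_handle_size() -> usize {
    size_of::<Box<dyn Read + Send>>()
}

/// Size of the heap allocation when the boxed reader is a Cursor<Vec<u8>>.
/// This is the reader's own state (Vec header plus position).
pub fn cursor_state_size() -> usize {
    size_of::<Cursor<Vec<u8>>>()
}

/// Fraction of total time spent on the allocation, given both costs in nanoseconds.
pub fn alloc_share(alloc_ns: f64, io_ns: f64) -> f64 {
    alloc_ns / (alloc_ns + io_ns)
}
```

With the table's upper estimate of 50 ns per allocation, `alloc_share(50.0, 50_000.0)` comes out just under 0.001, which is where the 0.1% figure sits for a 50 µs read.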

When It Matters

Scenario	Matters?
Reading large files	No - I/O dominates
Reading many small files	Maybe - consider batching
Hot loop micro-benchmarks	Yes
Real-world applications	Rarely

Conclusion

Dynamic dispatch is the right default. The cost is negligible for real workloads, and the ergonomic benefits are substantial. Offer static dispatch as an opt-in escape hatch for the rare cases where it matters.

Summary Decision Matrix

Approach	Alloc-Free	Simple	Object-Safe	Recommended
Current (`Box<dyn>`)	❌	✅	✅	✅ Default
Associated Types	✅	❌	✅	❌ Too complex
RPITIT	✅	✅	❌	⚠️ When no dyn needed
GATs	✅	❌	❌	❌ Overkill
Hybrid	✅ opt-in	✅	✅	✅ Best of both
Callbacks	✅	❌	⚠️ Only with `&mut dyn FnMut`	❌ Awkward API
SmallVec	⚠️	✅	✅	⚠️ For ReadDirIter

AnyFS Ecosystem Manual
