
AnyFS Ecosystem

An open standard for pluggable virtual filesystem backends in Rust.


Overview

AnyFS is an open standard for virtual filesystem backends using a Tower-style middleware pattern for composable functionality.

You get:

  • A familiar std::fs-aligned API
  • Composable middleware (limits, logging, security)
  • Choice of storage: memory, SQLite, host filesystem, or custom
  • A developer-first goal: make storage composition easy, safe, and enjoyable

Architecture

┌─────────────────────────────────────────┐
│  FileStorage<B>                         │  ← Ergonomic std::fs-aligned API
├─────────────────────────────────────────┤
│  Middleware (composable):               │
│    Quota<B>                             │  ← Quotas
│    Restrictions<B>                      │  ← Security
│    Tracing<B>                           │  ← Audit
├─────────────────────────────────────────┤
│  Fs                                     │  ← Storage
│  (Memory, SQLite, VRootFs, custom...)   │
└─────────────────────────────────────────┘

Each layer has one job. Compose only what you need.


Two-Crate Structure

Crate           Purpose
anyfs-backend   Minimal contract: Fs trait + types
anyfs           Backends + middleware + mounting + ergonomic FileStorage<B>

Note: Mounting (FsFuse + MountHandle) is part of the anyfs crate behind feature flags (fuse, winfsp), not a separate crate.


Quick Example

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, RestrictionsLayer, FileStorage};

// Layer-based composition
let backend = MemoryBackend::new()
    .layer(QuotaLayer::builder()
        .max_total_size(100 * 1024 * 1024)
        .build())
    .layer(RestrictionsLayer::builder()
        .deny_permissions()
        .build());

let fs = FileStorage::new(backend);

fs.create_dir_all("/data")?;
fs.write("/data/file.txt", b"hello")?;
}

How to Use This Manual

Section                 Audience         Purpose
Overview                Stakeholders     One-page understanding
Getting Started         Developers       Practical examples
Design & Architecture   Contributors     Detailed design
Traits & APIs           Backend authors  Contract and types
Implementation          Implementers     Plan + backend guide

Future Considerations

These are optional extensions that fit the design but are out of scope for initial release:

  • URL-based backend registry and bulk helpers (FsExt/utilities)
  • Async adapter for remote backends
  • Companion shell for interactive exploration
  • Copy-on-write overlay and archive backends (zip/tar)

See Design Overview for the full list and rationale.


Status

Component        Status
Design           Complete
Implementation   Not started (mounting roadmap defined)

Authoritative Documents

  1. AGENTS.md (for AI assistants)
  2. src/architecture/design-overview.md
  3. src/architecture/adrs.md

AnyFS - Executive Summary

One-page overview for stakeholders and decision-makers.


What is it?

AnyFS is an open standard for pluggable virtual filesystem backends in Rust, using a Tower-style middleware pattern for composable functionality.

You get:

  • A familiar std::fs-aligned API
  • Composable middleware for limits, logging, and security
  • Choice of storage: memory, SQLite, host filesystem, or custom

Architecture

┌─────────────────────────────────────────┐
│  FileStorage<B>                         │  ← Ergonomics (std::fs API)
├─────────────────────────────────────────┤
│  Middleware (composable):               │
│    Quota<B>                             │  ← Quotas
│    Restrictions<B>                      │  ← Security
│    Tracing<B>                           │  ← Audit
├─────────────────────────────────────────┤
│  Fs                                     │  ← Storage
└─────────────────────────────────────────┘

Why does it matter?

Problem                 How AnyFS helps
Multi-tenant isolation  Separate backend instances per tenant
Portability             SQLite backend: tenant data = single .db file
Security                Restrictions blocks risky operations when composed
Resource control        Quota enforces size and count limits
Audit compliance        Tracing records all operations
Custom storage          Implement Fs for any medium

Key design points

  • Two-crate structure

    • anyfs-backend: trait contract
    • anyfs: backends + middleware + ergonomic wrapper
  • Middleware pattern (like Axum/Tower)

    • Each middleware has one job
    • Compose only what you need
    • Complete separation of concerns
  • std::fs alignment

    • Familiar method names
    • Core traits use &Path; FileStorage/FsExt accept impl AsRef<Path> for ergonomics
  • Developer experience first

    • Make storage composition easy, safe, and enjoyable to use

Quick example

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, RestrictionsLayer, FileStorage};

// Layer-based composition
let backend = MemoryBackend::new()
    .layer(QuotaLayer::builder()
        .max_total_size(100 * 1024 * 1024)
        .build())
    .layer(RestrictionsLayer::builder()
        .deny_permissions()
        .build());

let fs = FileStorage::new(backend);

fs.create_dir_all("/documents")?;
fs.write("/documents/hello.txt", b"Hello!")?;
}

Status

Phase            Status
Design           Complete
Implementation   Not started (mounting roadmap defined)

For details, see Design Overview.

AnyFS - Project Structure

Status: Target layout (design spec)
Last updated: 2025-12-24


This manual describes the intended code repository layout; this repository contains documentation only.

Repository Layout

anyfs-backend/              # Crate 1: traits + types (minimal dependencies)
  Cargo.toml
  src/
    lib.rs
    traits/
      fs_read.rs            # FsRead trait
      fs_write.rs           # FsWrite trait
      fs_dir.rs             # FsDir trait
      fs_link.rs            # FsLink trait
      fs_permissions.rs     # FsPermissions trait
      fs_sync.rs            # FsSync trait
      fs_stats.rs           # FsStats trait
      fs_path.rs            # FsPath trait (canonicalization, blanket impl)
      fs_inode.rs           # FsInode trait
      fs_handles.rs         # FsHandles trait
      fs_lock.rs            # FsLock trait
      fs_xattr.rs           # FsXattr trait
    layer.rs                # Layer trait (Tower-style)
    ext.rs                  # FsExt (extension methods)
    markers.rs              # SelfResolving marker trait
    path_resolver.rs        # PathResolver trait (pluggable resolution)
    types.rs                # Metadata, DirEntry, Permissions, StatFs
    error.rs                # FsError

anyfs/                      # Crate 2: framework glue (simple backends + middleware + ergonomics)
  Cargo.toml
  src/
    lib.rs
    backends/
      memory.rs             # MemoryBackend (in-memory, simple)
      stdfs.rs              # StdFsBackend (thin std::fs wrapper)
      vrootfs.rs            # VRootFsBackend (std::fs + path containment)
    middleware/
      quota.rs              # Quota<B>
      restrictions.rs       # Restrictions<B>
      path_filter.rs        # PathFilter<B>
      read_only.rs          # ReadOnly<B>
      rate_limit.rs         # RateLimit<B>
      tracing.rs            # Tracing<B>
      dry_run.rs            # DryRun<B>
      cache.rs              # Cache<B>
      overlay.rs            # Overlay<B1, B2>
    resolvers/
      iterative.rs          # IterativeResolver (default)
      noop.rs               # NoOpResolver (for SelfResolving backends)
      caching.rs            # CachingResolver (LRU cache wrapper)
    container.rs            # FileStorage<B>
    stack.rs                # BackendStack builder

Ecosystem Crates (Separate Repositories)

Complex backends with internal runtime requirements:

anyfs-sqlite/               # SQLite backend (connection pooling, WAL, sharding)
anyfs-indexed/              # Hybrid backend (SQLite index + disk blobs)
anyfs-s3/                   # Third-party: S3 backend
anyfs-redis/                # Third-party: Redis backend

Testing Crate

anyfs-test/                 # Conformance test suite for backend implementers
  src/
    lib.rs
    conformance/            # Test generators for each trait level
      fs_tests.rs           # Fs-level tests
      fs_full_tests.rs      # FsFull-level tests
      fs_fuse_tests.rs      # FsFuse-level tests
    prelude.rs              # Re-exports for test files

Dependency Model

anyfs-backend (trait + types)
     ^
     |-- anyfs (backends + middleware + ergonomics)
     |     ^-- vrootfs feature uses strict-path

Key points:

  • Custom backends depend only on anyfs-backend
  • anyfs provides built-in backends, middleware, mounting (behind feature flags), and the ergonomic FileStorage<B> wrapper

Middleware Pattern

FileStorage<B>
    wraps -> Tracing<B>
        wraps -> Restrictions<B>
            wraps -> Quota<B>
                wraps -> MemoryBackend (or any Fs)

Each layer implements Fs, enabling composition.
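The composition mechanism can be sketched with a self-contained toy (the trait and types here are stand-ins, not the real anyfs API): each middleware struct wraps an inner backend and re-implements the trait by delegating.

```rust
use std::collections::HashMap;

// Toy stand-in for the Fs trait (the real trait is much larger).
trait Fs {
    fn read(&self, path: &str) -> Result<Vec<u8>, String>;
}

// Toy backend: a fixed map of paths to contents.
struct Memory(HashMap<String, Vec<u8>>);

impl Fs for Memory {
    fn read(&self, path: &str) -> Result<Vec<u8>, String> {
        self.0
            .get(path)
            .cloned()
            .ok_or_else(|| format!("not found: {path}"))
    }
}

// Middleware: implements Fs by logging, then delegating to the inner backend.
struct Tracing<B: Fs>(B);

impl<B: Fs> Fs for Tracing<B> {
    fn read(&self, path: &str) -> Result<Vec<u8>, String> {
        println!("read {path}");
        self.0.read(path)
    }
}

fn main() {
    let mut files = HashMap::new();
    files.insert("/a.txt".to_string(), b"hi".to_vec());

    // Because Tracing<B> itself implements Fs, layers nest freely.
    let fs = Tracing(Memory(files));
    assert_eq!(fs.read("/a.txt").unwrap(), b"hi".to_vec());
    assert!(fs.read("/missing").is_err());
}
```

The same shape scales to Quota, Restrictions, and the rest: each wrapper adds one concern and forwards everything else.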


Cargo Features

Backends (anyfs crate)

  • memory — In-memory storage (default)
  • stdfs — Direct std::fs delegation (no containment)
  • vrootfs — Host filesystem backend with path containment (uses strict-path)

Ecosystem Backends (Separate Crates)

Complex backends live in their own crates:

  • anyfs-sqlite — SQLite-backed persistent storage (pooling, WAL, sharding, encryption)
  • anyfs-indexed — SQLite index + disk blobs for large file performance

Middleware (MVP Scope)

Core middleware is always available (no feature flags needed):

  • Quota, PathFilter, Restrictions, ReadOnly, RateLimit, Cache, DryRun, Overlay

Optional middleware with external dependencies:

  • tracing — Detailed audit logging (requires tracing crate)

Mounting (Platform Features)

  • fuse — Mount as filesystem on Linux/macOS (requires fuser crate)
  • winfsp — Mount as filesystem on Windows (requires winfsp crate)

Use default-features = false to cherry-pick exactly what you need.
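For example, a consumer that only needs the in-memory backend might depend on anyfs like this (version number illustrative):

```toml
[dependencies]
# Disable defaults, then enable only the features you actually use
anyfs = { version = "0.1", default-features = false, features = ["memory"] }
```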

Middleware (Future Scope)

  • metrics — Prometheus integration (requires prometheus crate)

Where To Start

AnyFS — Getting Started Guide

A practical introduction with examples


Installation

Add to your Cargo.toml:

[dependencies]
anyfs = "0.1"

For additional backends:

[dependencies]
anyfs = { version = "0.1", features = ["stdfs", "vrootfs"] }

# SQLite storage (ecosystem crate)
anyfs-sqlite = "0.1"
anyfs-sqlite = { version = "0.1", features = ["encryption"] }  # With SQLCipher

# Hybrid backend: SQLite metadata + disk blobs (ecosystem crate)
anyfs-indexed = "0.1"

Available anyfs features:

  • memory — In-memory storage (default)
  • stdfs — Direct std::fs delegation (no containment)
  • vrootfs — Host filesystem backend with path containment
  • bytes — Zero-copy Bytes support (adds read_bytes() method)
  • tracing — Detailed audit logging (requires tracing crate)
  • fuse — Mount as filesystem on Linux/macOS
  • winfsp — Mount as filesystem on Windows

Core middleware (Quota, PathFilter, Restrictions, ReadOnly, RateLimit, Cache, DryRun, Overlay) is always available.


Quick Start

Examples below use FileStorage, so you can pass paths as &str. If you call core trait methods directly, use &Path.

Hello World

use anyfs::{MemoryBackend, FileStorage};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let fs = FileStorage::new(MemoryBackend::new());

    fs.write("/hello.txt", b"Hello, AnyFS!")?;
    let content = fs.read("/hello.txt")?;
    println!("{}", String::from_utf8_lossy(&content));

    Ok(())
}

With Quotas

use anyfs::{MemoryBackend, QuotaLayer, FileStorage};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let backend = QuotaLayer::builder()
        .max_total_size(100 * 1024 * 1024)  // 100 MB
        .max_file_size(10 * 1024 * 1024)    // 10 MB per file
        .build()
        .layer(MemoryBackend::new());

    let fs = FileStorage::new(backend);

    fs.create_dir_all("/documents")?;
    fs.write("/documents/notes.txt", b"Meeting notes")?;

    Ok(())
}

With Restrictions

use anyfs::{MemoryBackend, RestrictionsLayer, FileStorage};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Block permission changes for untrusted code
    let backend = RestrictionsLayer::builder()
        .deny_permissions()   // Block set_permissions() calls
        .build()
        .layer(MemoryBackend::new());

    let fs = FileStorage::new(backend);

    // All other operations work normally
    fs.write("/file.txt", b"content")?;

    Ok(())
}

Full Stack (Layer-based)

use anyfs::{MemoryBackend, QuotaLayer, RestrictionsLayer, TracingLayer, FileStorage};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let backend = MemoryBackend::new()
        .layer(QuotaLayer::builder()
            .max_total_size(100 * 1024 * 1024)
            .max_file_size(10 * 1024 * 1024)
            .build())
        .layer(RestrictionsLayer::builder()
            .deny_permissions()
            .build())
        .layer(TracingLayer::new());

    let fs = FileStorage::new(backend);

    fs.create_dir_all("/data")?;
    fs.write("/data/file.txt", b"hello")?;

    Ok(())
}

Common Operations

Creating Directories

#![allow(unused)]
fn main() {
fs.create_dir("/documents")?;              // Single level
fs.create_dir_all("/documents/2024/q1")?;  // Recursive
}

Reading and Writing Files

#![allow(unused)]
fn main() {
fs.write("/data.txt", b"line 1\n")?;       // Create or overwrite
fs.append("/data.txt", b"line 2\n")?;      // Append

let content = fs.read("/data.txt")?;                    // Read all
let partial = fs.read_range("/data.txt", 0, 6)?;        // Read range
let text = fs.read_to_string("/data.txt")?;             // Read as String
}

Listing Directories

#![allow(unused)]
fn main() {
for entry in fs.read_dir("/documents")? {
    println!("{}: {:?}", entry.name, entry.file_type);
}
}

Checking Existence and Metadata

#![allow(unused)]
fn main() {
if fs.exists("/file.txt")? {
    let meta = fs.metadata("/file.txt")?;
    println!("Size: {} bytes", meta.size);
}
}

Copying and Moving

#![allow(unused)]
fn main() {
fs.copy("/original.txt", "/copy.txt")?;
fs.rename("/original.txt", "/renamed.txt")?;
}

Deleting

#![allow(unused)]
fn main() {
fs.remove_file("/old-file.txt")?;
fs.remove_dir("/empty-folder")?;
fs.remove_dir_all("/old-folder")?;
}

Middleware

Quota — Resource Limits

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer};

let backend = QuotaLayer::builder()
    .max_total_size(500 * 1024 * 1024)   // 500 MB total
    .max_file_size(50 * 1024 * 1024)     // 50 MB per file
    .max_node_count(100_000)             // 100K files/dirs
    .max_dir_entries(5_000)              // 5K per directory
    .max_path_depth(32)                  // Max nesting
    .build()
    .layer(MemoryBackend::new());

// Check usage
let usage = backend.usage();
println!("Using {} bytes", usage.total_size);

// Check remaining
let remaining = backend.remaining();
if !remaining.can_write {
    println!("Storage full!");
}
}

Restrictions — Block Permission Changes

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, RestrictionsLayer};

// Restrictions controls permission-related operations.
// Symlink/hard-link capability is determined by trait bounds (FsLink).
let backend = RestrictionsLayer::builder()
    .deny_permissions()   // Block set_permissions() calls
    .build()
    .layer(MemoryBackend::new());

// Blocked operations return FsError::FeatureNotEnabled
}

Tracing — Instrumentation

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, TracingLayer};

// TracingLayer uses the global tracing subscriber by default
let backend = MemoryBackend::new().layer(TracingLayer::new());

// Or configure with custom settings
let backend = MemoryBackend::new()
    .layer(TracingLayer::new()
        .with_target("myapp::fs")
        .with_level(tracing::Level::DEBUG));
}

Error Handling

#![allow(unused)]
fn main() {
use anyfs_backend::FsError;

match fs.write("/file.txt", &large_data) {
    Ok(()) => println!("Written"),

    Err(FsError::NotFound { path, .. }) => println!("Not found: {}", path.display()),
    Err(FsError::AlreadyExists { path, .. }) => println!("Exists: {}", path.display()),
    Err(FsError::QuotaExceeded { .. }) => println!("Quota exceeded"),
    Err(FsError::FeatureNotEnabled { feature }) => println!("Feature disabled: {}", feature),

    Err(e) => println!("Error: {}", e),
}
}

Testing

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, FileStorage};

#[test]
fn test_write_and_read() {
    let fs = FileStorage::new(MemoryBackend::new());

    fs.write("/test.txt", b"test data").unwrap();
    let content = fs.read("/test.txt").unwrap();

    assert_eq!(content, b"test data");
}
}

With limits:

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, FileStorage};

#[test]
fn test_quota_exceeded() {
    let backend = QuotaLayer::builder()
        .max_total_size(1024)  // 1 KB
        .build()
        .layer(MemoryBackend::new());
    let fs = FileStorage::new(backend);

    let big_data = vec![0u8; 2048];  // 2 KB
    let result = fs.write("/big.bin", &big_data);

    assert!(result.is_err());
}
}

Best Practices

1. Use Appropriate Backend

Use Case                            Backend         Crate
Testing                             MemoryBackend   anyfs
Production (portable)               SqliteBackend   anyfs-sqlite
Host filesystem (with containment)  VRootFsBackend  anyfs
Host filesystem (direct access)     StdFsBackend    anyfs

2. Compose Middleware for Your Needs

#![allow(unused)]
fn main() {
// Minimal: just storage
let fs = FileStorage::new(MemoryBackend::new());

// With limits (layer-based)
let fs = FileStorage::new(
    MemoryBackend::new()
        .layer(QuotaLayer::builder()
            .max_total_size(100 * 1024 * 1024)
            .build())
);

// Sandboxed (layer-based)
let temp_dir = tempfile::tempdir()?;
let fs = FileStorage::new(
    VRootFsBackend::new(temp_dir.path())?
        .layer(QuotaLayer::builder()
            .max_total_size(100 * 1024 * 1024)
            .build())
        .layer(RestrictionsLayer::builder()
            .deny_permissions()
            .build())
);
}

3. Handle Errors Gracefully

Always check for quota exceeded, feature not enabled, and other errors.


Advanced Use Cases

These use cases require the fuse or winfsp feature flags.

Database-Backed Drive with Live Monitoring

Mount a database-backed filesystem and query it directly for real-time analytics:

┌─────────────────────────────────────────────────────────────┐
│  Database (SQLite, PostgreSQL, etc.)                        │
├─────────────────────────────────────────────────────────────┤
│                         │                                   │
│    MountHandle          │         Stats Dashboard           │
│    (write + read)       │         (direct DB queries)       │
│         │               │               │                   │
│         ▼               │               ▼                   │
│  /mnt/workspace         │    SELECT SUM(size) FROM nodes    │
│  $ cp file.txt ./       │    SELECT COUNT(*) FROM nodes     │
│  $ mkdir projects/      │    SELECT * FROM audit_log        │
│                         │               │                   │
│                         │               ▼                   │
│                         │        ┌──────────────┐           │
│                         │        │ Live Graphs  │           │
│                         │        │ - Disk usage │           │
│                         │        │ - File count │           │
│                         │        │ - Recent ops │           │
│                         │        └──────────────┘           │
└─────────────────────────────────────────────────────────────┘

SQLite Example (using ecosystem crate - API sketch, planned):

#![allow(unused)]
fn main() {
use anyfs::{QuotaLayer, TracingLayer, MountHandle};
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate

// Mount the drive
let backend = SqliteBackend::open("tenant.db")?
    .layer(TracingLayer::new())  // Logs operations to tracing subscriber
    .layer(QuotaLayer::builder()
        .max_total_size(1_000_000_000)
        .build());

let mount = MountHandle::mount(backend, "/mnt/workspace")?;
}
#![allow(unused)]
fn main() {
// Meanwhile, in a monitoring dashboard...
// Note: This queries SqliteBackend's internal schema (nodes, audit_log tables).
// See anyfs-sqlite documentation for schema details.
use rusqlite::{Connection, OpenFlags};
use std::time::Duration;

let conn = Connection::open_with_flags(
    "tenant.db",
    OpenFlags::SQLITE_OPEN_READ_ONLY,  // Safe concurrent reads
)?;

loop {
    let (file_count, total_bytes): (i64, i64) = conn.query_row(
        "SELECT COUNT(*), COALESCE(SUM(size), 0) FROM nodes WHERE type = 'file'",
        [],
        |row| Ok((row.get(0)?, row.get(1)?)),
    )?;

    let recent_ops: Vec<String> = conn
        .prepare("SELECT operation, path, timestamp FROM audit_log ORDER BY timestamp DESC LIMIT 10")?
        .query_map([], |row| Ok(format!("{}: {}", row.get::<_, String>(0)?, row.get::<_, String>(1)?)))?
        .collect::<Result<_, _>>()?;

    render_dashboard(file_count, total_bytes, &recent_ops);
    std::thread::sleep(Duration::from_secs(1));
}
}

Works with any database backend:

Backend                    Direct Query Method
SqliteBackend              rusqlite with SQLITE_OPEN_READ_ONLY
Custom (user-implemented)  Direct database driver connection

Third-party crates can implement additional database backends (PostgreSQL, MySQL, etc.) following the same pattern.

What you can visualize:

  • Real-time storage usage (gauges, bar charts)
  • File count over time (line graphs)
  • Operations log (live feed)
  • Most accessed files (heatmaps)
  • Directory tree maps (size visualization)
  • Per-tenant usage (multi-tenant dashboards)

This pattern is powerful because the database is the source of truth — you get filesystem semantics via FUSE and SQL analytics via direct queries, from the same data.

RAM Drive

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, MountHandle};

// 4GB RAM drive
let mount = MountHandle::mount(
    MemoryBackend::new()
        .layer(QuotaLayer::builder()
            .max_total_size(4 * 1024 * 1024 * 1024)
            .build()),
    "/mnt/ramdisk"
)?;

// Use for fast compilation, temp files, etc.
// $ TMPDIR=/mnt/ramdisk cargo build
}

Sandboxed AI Agent Workspace

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, PathFilterLayer, RestrictionsLayer, TracingLayer, MountHandle};

let mount = MountHandle::mount(
    MemoryBackend::new()
        .layer(PathFilterLayer::builder()
            .allow("/workspace/**")
            .deny("**/..*")           // No hidden files
            .deny("**/.*")            // No dotfiles
            .build())
        .layer(QuotaLayer::builder()
            .max_total_size(100 * 1024 * 1024)
            .max_file_size(10 * 1024 * 1024)
            .build())
        .layer(TracingLayer::new()),  // Full audit trail
    "/mnt/agent"
)?;

// Agent uses standard filesystem APIs
// All operations are sandboxed, quota-limited, and logged
}

Next Steps

AnyFS — API Quick Reference

Condensed reference for developers


Installation

[dependencies]
anyfs = "0.1"

With optional features and ecosystem crates:

anyfs = { version = "0.1", features = ["vrootfs", "bytes"] }
anyfs-sqlite = "0.1"  # SQLite backend (ecosystem crate)

Creating a Backend Stack

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, RestrictionsLayer, TracingLayer, FileStorage};

// Simple
let fs = FileStorage::new(MemoryBackend::new());

// With limits
let fs = FileStorage::new(
    MemoryBackend::new()
        .layer(QuotaLayer::builder()
            .max_total_size(100 * 1024 * 1024)
            .max_file_size(10 * 1024 * 1024)
            .build())
);

// Full stack (fluent composition)
let backend = MemoryBackend::new()
    .layer(QuotaLayer::builder()
        .max_total_size(100 * 1024 * 1024)
        .build())
    .layer(RestrictionsLayer::builder()
        .deny_permissions()
        .build())
    .layer(TracingLayer::new());

let fs = FileStorage::new(backend);

// BackendStack builder (fluent API)
use anyfs::BackendStack;

let fs = BackendStack::new(MemoryBackend::new())
    .limited(|l| l.max_total_size(100 * 1024 * 1024))
    .restricted(|g| g.deny_permissions())
    .traced()
    .into_container();
}

BackendStack Methods

BackendStack provides a fluent API for building middleware stacks:

#![allow(unused)]
fn main() {
use anyfs::BackendStack;

BackendStack::new(backend)           // Start with any backend
    .limited(|l| l                   // -> Quota<B>
        .max_total_size(bytes)
        .max_file_size(bytes)
        .max_node_count(count))
    .restricted(|g| g                // -> Restrictions<B>
        .deny_permissions())
    .traced()                        // -> Tracing<B>
    .cached(|c| c                    // -> Cache<B>
        .max_size(bytes)
        .ttl(duration))
    .read_only()                     // -> ReadOnly<B>
    .into_container()                // -> FileStorage<...>
}

Method             Creates          Description
.limited()         Quota<B>         Add quota limits
.restricted()      Restrictions<B>  Add operation restrictions
.traced()          Tracing<B>       Add tracing instrumentation
.cached()          Cache<B>         Add LRU caching
.read_only()       ReadOnly<B>      Make backend read-only
.into_container()  FileStorage<B>   Finish and wrap in FileStorage

Quota Methods

#![allow(unused)]
fn main() {
// Builder pattern (required - at least one limit must be set)
QuotaLayer::builder()
    .max_total_size(bytes)      // Total storage limit
    .max_file_size(bytes)       // Per-file limit
    .max_node_count(count)      // Max files/dirs
    .max_dir_entries(count)     // Max entries per dir
    .max_path_depth(depth)      // Max nesting
    .build()
    .layer(backend)

// Query
backend.usage()        // -> Usage { total_size, file_count, ... }
backend.limits()       // -> Limits { max_total_size, ... }
backend.remaining()    // -> Remaining { bytes, can_write, ... }
}

Restrictions Methods

#![allow(unused)]
fn main() {
// Builder pattern
// Restrictions only controls permission-related operations.
// Symlink/hard-link capability is via trait bounds (FsLink), not middleware.
RestrictionsLayer::builder()
    .deny_permissions()    // Block set_permissions() calls
    .build()
    .layer(backend)
}

TracingLayer Methods

#![allow(unused)]
fn main() {
// TracingLayer configuration (applied via .layer())
TracingLayer::new()
    .with_target("anyfs")              // tracing target
    .with_level(tracing::Level::DEBUG)

// Usage
let backend = inner.layer(TracingLayer::new().with_target("anyfs"));
}

PathFilter Methods

#![allow(unused)]
fn main() {
// Builder pattern (required - at least one rule must be set)
PathFilterLayer::builder()
    .allow("/workspace/**")    // Allow glob pattern
    .deny("**/.env")           // Deny glob pattern
    .deny("**/secrets/**")
    .build()
    .layer(backend)

// Rules evaluated in order; first match wins
// No match = denied (deny by default)
}
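The first-match-wins evaluation can be sketched with a self-contained toy that uses plain path prefixes in place of globs (hypothetical Rule type, not the real PathFilter internals):

```rust
// Sketch of first-match-wins rule evaluation. Prefix matching stands in
// for the real glob patterns; the evaluation order is the point here.
enum Rule {
    Allow(&'static str), // allow paths under this prefix
    Deny(&'static str),  // deny paths under this prefix
}

fn is_allowed(rules: &[Rule], path: &str) -> bool {
    for rule in rules {
        match rule {
            Rule::Allow(prefix) if path.starts_with(prefix) => return true,
            Rule::Deny(prefix) if path.starts_with(prefix) => return false,
            _ => {}
        }
    }
    false // no match = denied (deny by default)
}

fn main() {
    let rules = [
        Rule::Deny("/workspace/secrets"), // more specific deny listed first
        Rule::Allow("/workspace"),
    ];
    assert!(is_allowed(&rules, "/workspace/notes.txt"));
    assert!(!is_allowed(&rules, "/workspace/secrets/key")); // deny matched first
    assert!(!is_allowed(&rules, "/etc/passwd"));            // no rule matches
}
```

Because evaluation stops at the first match, order your specific deny rules before broader allow rules.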

ReadOnly Methods

#![allow(unused)]
fn main() {
ReadOnly::new(backend)

// All read operations pass through
// All write operations return FsError::ReadOnly
}

RateLimit Methods

#![allow(unused)]
fn main() {
// Builder pattern (required - must set ops and window)
RateLimitLayer::builder()
    .max_ops(1000)           // Operation limit
    .per_second()            // Window: 1 second
    // or
    .per_minute()            // Window: 60 seconds
    // or
    .per(Duration::from_millis(500))  // Custom window
    .build()
    .layer(backend)
}

DryRun Methods

#![allow(unused)]
fn main() {
let dry_run = DryRun::new(backend);
let fs = FileStorage::new(dry_run);

// Read operations execute normally
// Write operations are logged but not executed
fs.write("/file.txt", b"data")?;  // Logged, returns Ok

// Inspect logged operations (returns Vec<String>)
let ops: Vec<String> = dry_run.operations();
// e.g., ["write /file.txt (4 bytes)", "remove_file /old.txt"]

dry_run.clear();  // Clear the log
}

Cache Methods

#![allow(unused)]
fn main() {
// Builder pattern (required - at least max_entries must be set)
CacheLayer::builder()
    .max_entries(1000)                          // LRU cache size
    .max_entry_size(1024 * 1024)               // 1MB max per entry
    .ttl(Duration::from_secs(300))             // Optional: entry lifetime (default: no expiry)
    .build()
    .layer(backend)
}

IndexLayer Methods (Future)

#![allow(unused)]
fn main() {
// Builder pattern (required - set index path)
IndexLayer::builder()
    .index_file("index.db")             // Sidecar index file (SQLite default)
    .consistency(IndexConsistency::Strict)
    .track_reads(false)                 // Optional
    .build()
    .layer(backend)
}

Overlay Methods

#![allow(unused)]
fn main() {
use anyfs::{VRootFsBackend, MemoryBackend, Overlay};

let base = VRootFsBackend::new("/var/templates")?;  // Read-only base
let upper = MemoryBackend::new();                    // Writable upper

let overlay = Overlay::new(base, upper);

// Read: check upper first, fall back to base
// Write: always to upper layer
// Delete: whiteout marker in upper
}
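The read/write/whiteout semantics noted above can be sketched with toy types (hypothetical, not the real Overlay implementation):

```rust
use std::collections::{HashMap, HashSet};

// Toy overlay: reads check the upper layer first, writes always go to the
// upper layer, and deletes leave a whiteout marker that hides the base copy.
struct Overlay {
    base: HashMap<String, Vec<u8>>,  // read-only lower layer
    upper: HashMap<String, Vec<u8>>, // writable upper layer
    whiteouts: HashSet<String>,      // paths deleted in the upper layer
}

impl Overlay {
    fn read(&self, path: &str) -> Option<&Vec<u8>> {
        if self.whiteouts.contains(path) {
            return None; // deleted: base copy stays hidden
        }
        self.upper.get(path).or_else(|| self.base.get(path))
    }

    fn write(&mut self, path: &str, data: Vec<u8>) {
        self.whiteouts.remove(path); // a new write un-deletes the path
        self.upper.insert(path.to_string(), data);
    }

    fn remove(&mut self, path: &str) {
        self.upper.remove(path);
        self.whiteouts.insert(path.to_string()); // whiteout marker
    }
}

fn main() {
    let mut base = HashMap::new();
    base.insert("/template.txt".to_string(), b"base".to_vec());
    let mut fs = Overlay { base, upper: HashMap::new(), whiteouts: HashSet::new() };

    assert_eq!(fs.read("/template.txt"), Some(&b"base".to_vec())); // falls back to base
    fs.write("/template.txt", b"edited".to_vec());
    assert_eq!(fs.read("/template.txt"), Some(&b"edited".to_vec())); // upper shadows base
    fs.remove("/template.txt");
    assert_eq!(fs.read("/template.txt"), None); // whiteout hides base copy
}
```

The whiteout set is what distinguishes "deleted in the overlay" from "never existed": without it, removing a file would expose the base copy again on the next read.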

FsExt Methods

Extension methods available on all backends:

#![allow(unused)]
fn main() {
use anyfs_backend::FsExt;

// JSON support (requires `serde` feature on anyfs-backend)
let config: Config = fs.read_json("/config.json")?;
fs.write_json("/config.json", &config)?;

// Type checks
if fs.is_file("/path")? { ... }
if fs.is_dir("/path")? { ... }
}

Note: JSON methods require anyfs-backend = { version = "0.1", features = ["serde"] }


File Operations

Examples below assume FileStorage (std::fs-style paths). If you call core trait methods directly, pass &Path.

#![allow(unused)]
fn main() {
// Check existence
fs.exists("/path")?                     // -> bool

// Metadata
let meta = fs.metadata("/path")?;
meta.inode                               // unique identifier
meta.nlink                               // hard link count
meta.file_type                           // File | Directory | Symlink
meta.size                                // file size in bytes
meta.permissions                         // Permissions (default if unsupported)
meta.created                             // SystemTime (UNIX_EPOCH if unsupported)
meta.modified                            // SystemTime (UNIX_EPOCH if unsupported)
meta.accessed                            // SystemTime (UNIX_EPOCH if unsupported)

// Read
let bytes = fs.read("/path")?;           // -> Vec<u8>
let text = fs.read_to_string("/path")?;  // -> String
let chunk = fs.read_range("/path", 0, 1024)?;

// List directory
for entry in fs.read_dir("/path")? {
    let entry = entry?;
    entry.name                           // String (file/dir name only)
    entry.path                           // PathBuf (full path)
    entry.file_type                      // File | Directory | Symlink
    entry.size                           // u64 (0 for directories)
    entry.inode                          // u64 (0 if unsupported)
}

// Write
fs.write("/path", b"content")?;          // Create or overwrite
fs.append("/path", b"more")?;            // Append

// Directories
fs.create_dir("/path")?;
fs.create_dir_all("/path")?;

// Delete
fs.remove_file("/path")?;
fs.remove_dir("/path")?;                 // Empty only
fs.remove_dir_all("/path")?;             // Recursive

// Move/Copy
fs.rename("/from", "/to")?;
fs.copy("/from", "/to")?;

// Links
fs.symlink("/target", "/link")?;
fs.hard_link("/original", "/link")?;
fs.read_link("/link")?;                  // -> PathBuf
fs.symlink_metadata("/link")?;           // Metadata of link itself, not target
// Symlink capability is determined by FsLink trait bounds, not middleware.

// Permissions (requires FsPermissions)
fs.set_permissions("/path", perms)?;

// File size
fs.truncate("/path", 1024)?;             // Resize to 1024 bytes

// Durability
fs.sync()?;                              // Flush all writes
fs.fsync("/path")?;                      // Flush writes for one file
}

Path Canonicalization

FileStorage provides path canonicalization that works on the virtual filesystem.

Note: Canonicalization requires FsLink because symlink resolution needs read_link() and symlink_metadata(). Backends that only implement Fs (without FsLink) cannot use these methods.

#![allow(unused)]
fn main() {
// Strict canonicalization - path must exist
let canonical = fs.canonicalize("/some/../path/./file.txt")?;
// Returns fully resolved absolute path, follows symlinks

// Soft canonicalization - handles non-existent paths
let resolved = fs.soft_canonicalize("/existing/dir/new_file.txt")?;
// Resolves existing components, appends non-existent remainder lexically

// Anchored canonicalization - sandboxed resolution
let safe = fs.anchored_canonicalize("/workspace/../etc/passwd", "/workspace")?;
// Clamps result within anchor directory (returns error if escape attempted)
}

Standalone utility (no backend needed):

#![allow(unused)]
fn main() {
use anyfs::normalize;

// Lexical path cleanup only
let clean = normalize("//foo///bar//");  // -> "/foo/bar"
// Does NOT resolve . or .. (those require filesystem context)
// Does NOT follow symlinks
}

Comparison:

| Function | Path Must Exist? | Follows Symlinks? | Resolves ..? |
|---|---|---|---|
| canonicalize | Yes (all components) | Yes | Yes (symlink-aware) |
| soft_canonicalize | No (appends non-existent) | Yes (for existing) | Yes (symlink-aware) |
| anchored_canonicalize | No | Yes (for existing) | Yes (clamped to anchor) |
| normalize | N/A (lexical only) | No | No |
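To make the lexical-only contract concrete, here is a hedged sketch of what normalize guarantees. This is an illustrative reimplementation, not the anyfs source: it collapses repeated separators and drops a trailing one, while leaving "." and ".." untouched.

```rust
// Illustrative sketch of the normalize contract (not the anyfs implementation).
fn normalize_sketch(path: &str) -> String {
    let mut out = String::new();
    let mut prev_slash = false;
    for ch in path.chars() {
        if ch == '/' {
            // Collapse runs of '/' into a single separator.
            if !prev_slash {
                out.push('/');
            }
            prev_slash = true;
        } else {
            out.push(ch);
            prev_slash = false;
        }
    }
    // Drop a trailing separator, but keep the root "/" itself.
    if out.len() > 1 && out.ends_with('/') {
        out.pop();
    }
    out
}

fn main() {
    assert_eq!(normalize_sketch("//foo///bar//"), "/foo/bar");
    // "." and ".." are NOT resolved - that requires filesystem context.
    assert_eq!(normalize_sketch("/a/./b/.."), "/a/./b/..");
}
```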

Inode Operations (FsInode trait)

Backends implementing FsInode track inodes internally for hardlink support and FUSE mounting:

#![allow(unused)]
fn main() {
use anyfs::FileStorage;

// Convert between paths and inodes
let fs = FileStorage::new(backend);
let inode: u64 = fs.path_to_inode("/some/path")?;
let path: PathBuf = fs.inode_to_path(inode)?;

// Lookup child by name within a directory (FUSE-style)
let root_inode = fs.path_to_inode("/")?;
let child_inode = fs.lookup(root_inode, "filename.txt")?;

// Get metadata by inode (avoids path resolution)
let meta = fs.metadata_by_inode(inode)?;

// Hardlinks share the same inode
fs.hard_link("/original", "/link")?;
let ino1 = fs.path_to_inode("/original")?;
let ino2 = fs.path_to_inode("/link")?;
assert_eq!(ino1, ino2);  // Same inode!
}

Inode sources by backend:

| Backend | Inode Source |
|---|---|
| MemoryBackend | Internal node IDs (incrementing counter) |
| SqliteBackend (anyfs-sqlite) | SQLite row IDs (INTEGER PRIMARY KEY) |
| VRootFsBackend | OS inode numbers (metadata.inode) |

Error Handling

#![allow(unused)]
fn main() {
use anyfs_backend::FsError;

match result {
    // Path errors
    Err(FsError::NotFound { path }) => {
        // e.g., path="/file.txt"
    }
    Err(FsError::AlreadyExists { path, operation }) => ...
    Err(FsError::NotADirectory { path }) => ...
    Err(FsError::NotAFile { path }) => ...
    Err(FsError::DirectoryNotEmpty { path }) => ...
    Err(FsError::SymlinkLoop { path }) => ...  // Circular symlink detected

    // Quota middleware errors
    Err(FsError::QuotaExceeded { path, limit, requested, usage }) => ...
    Err(FsError::FileSizeExceeded { path, size, limit }) => ...

    // Restrictions middleware errors
    Err(FsError::FeatureNotEnabled { path, feature, operation }) => ...

    // PathFilter middleware errors
    Err(FsError::AccessDenied { path, reason }) => ...

    // ReadOnly middleware errors
    Err(FsError::ReadOnly { path, operation }) => ...

    // RateLimit middleware errors
    Err(FsError::RateLimitExceeded { path, limit, window_secs }) => ...

    // FsExt errors
    Err(FsError::Serialization(msg)) => ...
    Err(FsError::Deserialization(msg)) => ...

    // Optional feature not supported
    Err(FsError::NotSupported { operation }) => ...

    Err(e) => ...
}
}

Built-in Backends (anyfs crate)

| Type | Description |
|---|---|
| MemoryBackend | In-memory storage (default) |
| StdFsBackend | Direct std::fs (no containment) |
| VRootFsBackend | Host filesystem (with path containment) |

Ecosystem Backends (Separate Crates)

| Crate | Type | Description |
|---|---|---|
| anyfs-sqlite | SqliteBackend | Persistent single-file database (optional encryption via SQLCipher) |
| anyfs-indexed | IndexedBackend | Virtual paths + disk blobs (large files) |

SqliteBackend Encryption (Ecosystem Crate)

Crate: anyfs-sqlite with encryption feature

#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend;

// Standard (no encryption)
let backend = SqliteBackend::open("data.db")?;

// With encryption (requires `encryption` feature)
let backend = SqliteBackend::open_encrypted("encrypted.db", "password")?;

// With raw 256-bit key
let backend = SqliteBackend::open_with_key("encrypted.db", &key)?;

// Change the password on an already-open encrypted database
backend.change_password("new-password")?;
}

Without the correct password, the .db file appears as random bytes.

Feature: anyfs-sqlite = { version = "0.1", features = ["encryption"] }


Middleware

| Type | Purpose |
|---|---|
| Quota<B> | Quota enforcement |
| Restrictions<B> | Least privilege |
| PathFilter<B> | Path-based access control |
| ReadOnly<B> | Prevent write operations |
| RateLimit<B> | Operation throttling |
| Tracing<B> | Instrumentation (tracing ecosystem) |
| DryRun<B> | Log without executing |
| Cache<B> | LRU read caching |
| Overlay<B1,B2> | Union filesystem |

Layers

| Layer | Creates |
|---|---|
| QuotaLayer | Quota<B> |
| RestrictionsLayer | Restrictions<B> |
| PathFilterLayer | PathFilter<B> |
| ReadOnlyLayer | ReadOnly<B> |
| RateLimitLayer | RateLimit<B> |
| TracingLayer | Tracing<B> |
| DryRunLayer | DryRun<B> |
| CacheLayer | Cache<B> |

Note: Overlay<B1, B2> is constructed directly via Overlay::new(base, upper) rather than using a Layer, because it takes two backends.
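As an illustration of those union semantics, here is a minimal self-contained sketch: reads consult the upper layer first and fall back to the base, which is what Overlay::new(base, upper) suggests. Plain maps stand in for backends here; the real middleware forwards full Fs trait calls.

```rust
use std::collections::HashMap;

// Hedged sketch of union-read semantics; not the anyfs Overlay implementation.
struct OverlaySketch {
    base: HashMap<String, Vec<u8>>,
    upper: HashMap<String, Vec<u8>>,
}

impl OverlaySketch {
    // The upper layer shadows the base for the same path.
    fn read(&self, path: &str) -> Option<&[u8]> {
        self.upper
            .get(path)
            .or_else(|| self.base.get(path))
            .map(Vec::as_slice)
    }
}

fn main() {
    let mut base = HashMap::new();
    base.insert("/etc/conf".to_string(), b"defaults".to_vec());
    let mut upper = HashMap::new();
    upper.insert("/etc/conf".to_string(), b"overrides".to_vec());

    let fs = OverlaySketch { base, upper };
    assert_eq!(fs.read("/etc/conf"), Some(&b"overrides"[..]));
    assert_eq!(fs.read("/missing"), None);
}
```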


Type Reference

From anyfs-backend

Core Traits (Layer 1):

| Trait | Description |
|---|---|
| FsRead | Read operations: read, read_to_string, read_range, exists, metadata, open_read |
| FsWrite | Write operations: write, append, remove_file, rename, copy, truncate, open_write |
| FsDir | Directory operations: read_dir, create_dir*, remove_dir* |

Extended Traits (Layer 2):

| Trait | Description |
|---|---|
| FsLink | Link operations: symlink, hard_link, read_link, symlink_metadata |
| FsPermissions | Permission operations: set_permissions |
| FsSync | Sync operations: sync, fsync |
| FsStats | Stats operations: statfs |

Inode Trait (Layer 3):

| Trait | Description |
|---|---|
| FsInode | Inode operations: path_to_inode, inode_to_path, lookup, metadata_by_inode |

POSIX Traits (Layer 4):

| Trait | Description |
|---|---|
| FsHandles | Handle operations: open, read_at, write_at, close |
| FsLock | Lock operations: lock, try_lock, unlock |
| FsXattr | Extended attribute operations: get_xattr, set_xattr, remove_xattr, list_xattr |

Convenience Supertraits:

| Trait | Combines |
|---|---|
| Fs | FsRead + FsWrite + FsDir (90% of use cases) |
| FsFull | Fs + FsLink + FsPermissions + FsSync + FsStats |
| FsFuse | FsFull + FsInode (FUSE-mountable) |
| FsPosix | FsFuse + FsHandles + FsLock + FsXattr (full POSIX) |

Other Types:

| Type | Description |
|---|---|
| Layer | Middleware composition trait |
| FsExt | Extension methods trait (JSON, type checks) |
| FsError | Error type (with context) |
| ROOT_INODE | Constant: root directory inode (= 1) |
| FileType | File, Directory, Symlink |
| Metadata | File/dir metadata (inode, nlink, size, times, permissions) |
| DirEntry | Directory entry (name, inode, file_type) |
| Permissions | File permissions (mode: u32) |
| StatFs | Filesystem stats (bytes, inodes, block_size) |

From anyfs

| Type | Description |
|---|---|
| MemoryBackend | In-memory backend |
| StdFsBackend | Direct std::fs backend (no containment) |
| VRootFsBackend | Host FS backend (with containment) |
| Quota<B> | Quota middleware |
| Restrictions<B> | Feature gate middleware |
| PathFilter<B> | Path access control middleware |
| ReadOnly<B> | Read-only middleware |
| RateLimit<B> | Rate limiting middleware |
| Tracing<B> | Tracing middleware |
| DryRun<B> | Dry-run middleware |
| Cache<B> | Caching middleware |
| Overlay<B1,B2> | Union filesystem middleware (constructed directly, no Layer) |
| QuotaLayer | Layer for Quota |
| RestrictionsLayer | Layer for Restrictions |
| PathFilterLayer | Layer for PathFilter |
| ReadOnlyLayer | Layer for ReadOnly |
| RateLimitLayer | Layer for RateLimit |
| TracingLayer | Layer for Tracing |
| DryRunLayer | Layer for DryRun |
| CacheLayer | Layer for Cache |

From Ecosystem Crates

| Crate | Type | Description |
|---|---|---|
| anyfs-sqlite | SqliteBackend | SQLite backend (optional encryption with feature) |
| anyfs-indexed | IndexedBackend | Virtual paths + disk blobs |

Ergonomic Wrappers (in anyfs):

| Type | Description |
|---|---|
| FileStorage<B> | Thin ergonomic wrapper (generic backend, boxed resolver) |
| BackendStack | Fluent builder for middleware stacks |
| .boxed() | Opt-in type erasure for FileStorage |
| IterativeResolver | Default path resolver (symlink-aware for backends with FsLink) |
| NoOpResolver | No-op resolver for SelfResolving backends |
| CachingResolver<R> | LRU cache wrapper around another resolver |
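The resolver seam can be illustrated with a self-contained sketch of what a caching wrapper like CachingResolver<R> might do: memoize results from an inner resolver. The Resolver trait shape below is an assumption for illustration (not the actual anyfs API), and the cache is an unbounded map rather than a real LRU.

```rust
use std::collections::HashMap;
use std::path::PathBuf;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

// Simplified stand-in for the resolver interface; names are assumptions.
trait Resolver: Send + Sync {
    fn resolve(&self, path: &str) -> PathBuf;
}

// Memoizing wrapper around another resolver (unbounded map, not true LRU).
struct CachingResolverSketch<R> {
    inner: R,
    cache: Mutex<HashMap<String, PathBuf>>,
}

impl<R: Resolver> Resolver for CachingResolverSketch<R> {
    fn resolve(&self, path: &str) -> PathBuf {
        let mut cache = self.cache.lock().unwrap();
        cache
            .entry(path.to_string())
            .or_insert_with(|| self.inner.resolve(path))
            .clone()
    }
}

// Counts how many times the inner resolver is actually consulted.
struct CountingResolver(AtomicUsize);

impl Resolver for CountingResolver {
    fn resolve(&self, path: &str) -> PathBuf {
        self.0.fetch_add(1, Ordering::SeqCst);
        PathBuf::from(path)
    }
}

fn main() {
    let r = CachingResolverSketch {
        inner: CountingResolver(AtomicUsize::new(0)),
        cache: Mutex::new(HashMap::new()),
    };
    assert_eq!(r.resolve("/a"), PathBuf::from("/a"));
    r.resolve("/a"); // served from the cache, inner not called again
    assert_eq!(r.inner.0.load(Ordering::SeqCst), 1);
}
```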

AnyFS - Design Overview

Status: Current. Last updated: 2025-12-24.


What This Project Is

AnyFS is an open standard for pluggable virtual filesystem backends in Rust. It uses a middleware/decorator pattern (like Axum/Tower) for composable functionality with complete separation of concerns.

Philosophy: Focused App, Smart Storage

It decouples application logic from storage policy, enabling a Data Mesh at the filesystem level.

  • The App focuses on business value (“save the document”).
  • The Storage Layer enforces non-functional requirements (“encrypt, audit, limit, index”).

Anyone can:

  • Control how a drive acts, looks, and protects itself.
  • Implement a custom backend for their specific storage needs (Cloud, DB, RAM).
  • Compose middleware to add limits, logging, and security.
  • Use the ergonomic FileStorage<B> wrapper for a standard std::fs-like API.

Architecture (Tower-style Middleware)

┌─────────────────────────────────────────┐
│  FileStorage<B>                         │  ← Ergonomic std::fs-aligned API
├─────────────────────────────────────────┤
│  Middleware (optional, composable):     │
│                                         │
│  Policy:                                │
│    Quota<B>         - Resource limits   │
│    Restrictions<B>  - Least privilege   │
│    PathFilter<B>    - Sandbox paths     │
│    ReadOnly<B>      - Prevent writes    │
│    RateLimit<B>     - Ops/sec limit     │
│                                         │
│  Observability:                         │
│    Tracing<B>       - Instrumentation   │
│    DryRun<B>        - Test mode         │
│                                         │
│  Performance:                           │
│    Cache<B>         - LRU caching       │
│                                         │
│  Composition:                           │
│    Overlay<B1,B2>   - Layered FS        │
│                                         │
├─────────────────────────────────────────┤
│  Backend (implements Fs, FsFull,        │  ← Pure storage + fs semantics
│           FsFuse, or FsPosix)           │
│  (Memory, SQLite, VRootFs, custom...)   │
└─────────────────────────────────────────┘

Each layer has exactly one responsibility:

| Layer | Responsibility |
|---|---|
| Backend (Fs+) | Storage + filesystem semantics |
| Quota<B> | Resource limits (size, count, depth) |
| Restrictions<B> | Opt-in operation restrictions |
| PathFilter<B> | Path-based access control |
| ReadOnly<B> | Prevent all write operations |
| RateLimit<B> | Limit operations per second |
| Tracing<B> | Instrumentation / audit trail |

Design Principle: Predictable Defaults, Opt-in Security

The Fs traits mimic std::fs with predictable, permissive defaults.

See ADR-027 for the decision rationale.

The traits are low-level interfaces that any backend can implement - memory, SQLite, real filesystem, network storage, etc. To maintain consistent behavior across all backends:

  • All operations work by default (symlink(), hard_link(), set_permissions())
  • No security restrictions at the trait level
  • Behavior matches what you’d expect from a real filesystem

Why not secure-by-default at this layer?

  1. Predictability: A backend should behave like a filesystem. Surprising restrictions break expectations.
  2. Backend-agnostic: The traits don’t know if they’re wrapping a sandboxed memory store or a real filesystem. Restrictions that make sense for one may not for another.
  3. Composition: Security is achieved by layering middleware, not by baking it into the storage layer.

Security is the responsibility of higher-level APIs:

| Layer | Security Responsibility |
|---|---|
| Backend (Fs+) | None - pure filesystem semantics |
| Middleware (Restrictions, PathFilter, etc.) | Opt-in restrictions |
| FileStorage or application code | Configure appropriate middleware |

Example: Secure AI Agent Sandbox

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, PathFilterLayer, FileStorage};

// Create wrapper type for type-safe sandbox
struct AiSandbox(FileStorage<MemoryBackend>);

impl AiSandbox {
    fn new() -> Self {
        AiSandbox(FileStorage::new(
            MemoryBackend::new()
                .layer(QuotaLayer::builder()
                    .max_total_size(50 * 1024 * 1024)
                    .build())
                .layer(PathFilterLayer::builder()
                    .allow("/workspace/**")
                    .deny("**/.env")
                    .build())
        ))
    }
}
}

The backend is permissive. The application adds restrictions appropriate for its use case.


Crates

| Crate | Purpose | Contains |
|---|---|---|
| anyfs-backend | Minimal contract | Layered traits (Fs, FsFull, FsFuse, FsPosix), Layer trait, types, FsExt |
| anyfs | Backends + middleware + ergonomics | Built-in backends, all middleware layers, FileStorage<B>, BackendStack builder |

Dependency Graph

anyfs-backend (trait + types)
     ^
     |-- anyfs (backends + middleware + ergonomics)
           ^-- vrootfs feature may use strict-path

Future Considerations

These are optional extensions to explore after the core is stable.

Keep (add-ons that fit the current design):

  • URL-based backend registry (sqlite://, mem://, stdfs://) as a helper crate, not in core APIs.
  • Bulk operation helpers (read_many, write_many, copy_many, glob, walk) as FsExt or a utilities crate.
  • Early async adapter crate (anyfs-async) to support remote backends without changing sync traits.
  • Bash-style shell (example app or anyfs-shell crate) that routes ls/cd/cat/cp/mv/rm/mkdir/stat through FileStorage to demonstrate middleware and backend neutrality (navigation and file management only, not full bash scripting).
  • Copy-on-write overlay middleware (Afero-style CopyOnWriteFs) as a specialized Overlay variant.
  • Archive backends (zip/tar) as separate crates implementing Fs (inspired by PyFilesystem/fsspec).
  • Indexing middleware (Indexing<B> + IndexLayer) with pluggable index engines (SQLite default). See Indexing Middleware.

Defer (valuable, but needs data or wider review):

  • Range/block caching middleware for read_range heavy workloads (fsspec-style block cache).
  • Runtime capability discovery (Capabilities struct) for feature detection (symlink control, case sensitivity, max path length).
  • Lint/analyzer to discourage direct std::fs usage in app code (System.IO.Abstractions-style).
  • Retry/timeout middleware for remote backends (when network backends are real).

Drop for now (adds noise or cross-platform complexity):

  • Change notification support (optional FsWatch trait or polling middleware).

Detailed rationale lives in src/comparisons/prior-art-analysis.md.


Language Bindings (Python, C, etc.)

The AnyFS design is FFI-friendly and can be exposed to other languages with minimal friction.

Why the design works well for FFI:

| Design Choice | FFI Benefit |
|---|---|
| &self methods (ADR-023) | Interior mutability allows holding a single Arc<FileStorage<...>> across FFI |
| Box<dyn Fs> type erasure | FileStorage::boxed() provides a concrete type suitable for FFI |
| Owned return types | Vec<u8>, String, bool - no lifetime issues across FFI boundary |
| Simple structs | Metadata, DirEntry, Permissions map directly to Python/C structs |

Recommended approach for Python (PyO3):

#![allow(unused)]
fn main() {
// anyfs-python/src/lib.rs
use pyo3::prelude::*;
use anyfs::{FileStorage, MemoryBackend, Fs};
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate

#[pyclass]
struct PyFileStorage {
    // Type-erased for FFI
    inner: FileStorage<Box<dyn Fs>>,
}

#[pymethods]
impl PyFileStorage {
    #[staticmethod]
    fn memory() -> Self {
        Self { inner: FileStorage::new(MemoryBackend::new()).boxed() }
    }

    #[staticmethod]
    fn sqlite(path: &str) -> PyResult<Self> {
        let backend = SqliteBackend::open(path)
            .map_err(|e| PyErr::new::<pyo3::exceptions::PyIOError, _>(e.to_string()))?;
        Ok(Self { inner: FileStorage::new(backend).boxed() })
    }

    fn read(&self, path: &str) -> PyResult<Vec<u8>> {
        self.inner.read(path)
            .map_err(|e| PyErr::new::<pyo3::exceptions::PyIOError, _>(e.to_string()))
    }

    fn write(&self, path: &str, data: &[u8]) -> PyResult<()> {
        self.inner.write(path, data)
            .map_err(|e| PyErr::new::<pyo3::exceptions::PyIOError, _>(e.to_string()))
    }
}

#[pymodule]
fn anyfs_python(_py: Python, m: &PyModule) -> PyResult<()> {
    m.add_class::<PyFileStorage>()?;
    Ok(())
}
}

Python usage:

from anyfs_python import PyFileStorage

fs = PyFileStorage.memory()
fs.write("/hello.txt", b"Hello from Python!")
data = fs.read("/hello.txt")
print(data)  # b"Hello from Python!"

Key considerations for FFI:

| Concern | Solution |
|---|---|
| Generics (FileStorage<B>) | Use FileStorage<Box<dyn Fs>> (boxed) for FFI layer |
| Streaming (Box<dyn Read>) | Wrap in language-native class with read(n) method |
| Middleware composition | Pre-build common stacks, expose as factory functions |
| Error handling | Convert FsError to language-native exceptions |
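The streaming row deserves a concrete shape. Below is a hedged sketch of wrapping a Box<dyn Read> in a handle exposing a read(n)-style method, the form a Python or C binding would present. It is std-only; the struct and method names are illustrative, not part of anyfs.

```rust
use std::io::{Cursor, Read};

// Illustrative binding-layer handle around a backend's open_read() stream.
struct StreamHandle {
    inner: Box<dyn Read + Send>,
}

impl StreamHandle {
    /// Read up to `n` bytes; an empty Vec signals end-of-stream.
    fn read_n(&mut self, n: usize) -> std::io::Result<Vec<u8>> {
        let mut buf = vec![0u8; n];
        let got = self.inner.read(&mut buf)?;
        buf.truncate(got);
        Ok(buf)
    }
}

fn main() {
    // Cursor stands in for a real backend stream.
    let mut h = StreamHandle {
        inner: Box::new(Cursor::new(b"hello".to_vec())),
    };
    assert_eq!(h.read_n(4).unwrap(), b"hell".to_vec());
    assert_eq!(h.read_n(4).unwrap(), b"o".to_vec());
    assert_eq!(h.read_n(4).unwrap(), Vec::<u8>::new()); // EOF
}
```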

Future crate: anyfs-python

Dynamic Middleware

The current design uses compile-time generics for zero-cost middleware composition:

#![allow(unused)]
fn main() {
// Static: type known at compile time
let fs: Tracing<Quota<MemoryBackend>> = MemoryBackend::new()
    .layer(QuotaLayer::builder().max_total_size(100).build())
    .layer(TracingLayer::new());
}

For runtime-configured middleware (e.g., based on config files), use Box<dyn Fs>:

#![allow(unused)]
fn main() {
fn build_from_config(config: &Config) -> FileStorage<Box<dyn Fs>> {
    let mut backend: Box<dyn Fs> = Box::new(MemoryBackend::new());

    if config.enable_quota {
        let quota_config = QuotaConfig {
            max_total_size: Some(config.quota_limit),
            ..Default::default()
        };
        backend = Box::new(Quota::with_config(backend, quota_config)
            .expect("quota initialization failed"));
    }

    if config.enable_antivirus {
        backend = Box::new(AntivirusMiddleware::new(backend, config.av_scanner_path));
    }

    if config.enable_tracing {
        backend = Box::new(Tracing::new(backend));
    }

    FileStorage::new(backend)
}
}

Trade-off: One Box allocation per layer + vtable dispatch. For I/O-bound workloads, this overhead is negligible (<1% of operation time).

Example: Antivirus Middleware

#![allow(unused)]
fn main() {
pub struct Antivirus<B> {
    inner: B,
    scanner: Arc<dyn VirusScanner + Send + Sync>,
}

pub trait VirusScanner: Send + Sync {
    fn scan(&self, data: &[u8]) -> Option<String>;  // Returns threat name if detected
}

impl<B: FsWrite> FsWrite for Antivirus<B> {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        if let Some(threat) = self.scanner.scan(data) {
            return Err(FsError::ThreatDetected { 
                path: path.to_path_buf(), 
                reason: threat,
            });
        }
        self.inner.write(path, data)
    }

    fn open_write(&self, path: &Path) -> Result<Box<dyn Write + Send>, FsError> {
        let inner = self.inner.open_write(path)?;
        Ok(Box::new(ScanningWriter::new(inner, self.scanner.clone())))
    }
}
}
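To make the scanner seam concrete, here is a self-contained stub implementation of the VirusScanner trait. The trait is restated so the sketch compiles on its own; the signature list and names are purely illustrative.

```rust
use std::sync::Arc;

// Restated from the middleware sketch above so this compiles standalone.
pub trait VirusScanner: Send + Sync {
    /// Returns the threat name if the payload matches a signature.
    fn scan(&self, data: &[u8]) -> Option<String>;
}

// Naive byte-signature scanner; a real scanner would shell out to an engine.
struct SignatureScanner {
    signatures: Vec<(&'static str, &'static [u8])>,
}

impl VirusScanner for SignatureScanner {
    fn scan(&self, data: &[u8]) -> Option<String> {
        self.signatures
            .iter()
            .find(|(_, sig)| data.windows(sig.len()).any(|w| w == *sig))
            .map(|(name, _)| name.to_string())
    }
}

fn main() {
    let scanner: Arc<dyn VirusScanner> = Arc::new(SignatureScanner {
        signatures: vec![("Test.Signature", b"EVIL" as &[u8])],
    });
    assert_eq!(
        scanner.scan(b"contains EVIL bytes").as_deref(),
        Some("Test.Signature")
    );
    assert_eq!(scanner.scan(b"clean"), None);
}
```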

Future: Plugin System

For true runtime-loaded plugins (.so/.dll), a future MiddlewarePlugin trait could enable:

#![allow(unused)]
fn main() {
pub trait MiddlewarePlugin: Send + Sync {
    fn name(&self) -> &str;
    fn wrap(&self, backend: Box<dyn Fs>) -> Box<dyn Fs>;
}

// Load at runtime
let plugin = libloading::Library::new("antivirus_plugin.so")?;
let create_plugin: fn() -> Box<dyn MiddlewarePlugin> = plugin.get(b"create_plugin")?;
let av_plugin = create_plugin();

let backend = av_plugin.wrap(backend);
}

When to use each approach:

| Scenario | Approach | Overhead |
|---|---|---|
| Fixed middleware stack | Generics (compile-time) | Zero-cost |
| Config-driven middleware | Box<dyn Fs> chaining | ~50ns per layer |
| Runtime-loaded plugins | MiddlewarePlugin trait | ~50ns + plugin load |

Verdict: The current design supports dynamic middleware via Box<dyn Fs>. A formal MiddlewarePlugin trait for hot-loading is a future enhancement.

Middleware with Configurable Backends

Some middleware benefit from pluggable backends for their own storage or output. The pattern is to inject a trait object or configuration at construction time.

Metrics Middleware with Prometheus Exporter: (Requires features = ["metrics"])

#![allow(unused)]
fn main() {
use prometheus::{Counter, Histogram, Registry};

pub struct Metrics<B> {
    inner: B,
    reads: Counter,
    writes: Counter,
    read_bytes: Counter,
    write_bytes: Counter,
    latency: Histogram,
}

impl<B> Metrics<B> {
    /// Creates a new Metrics middleware.
    ///
    /// # Panics
    /// Panics if metric registration fails (indicates duplicate metric names - programmer error).
    /// This is acceptable at initialization time per the No Panic Policy, which applies to
    /// runtime operations. Initialization failures are configuration errors that should fail fast.
    pub fn new(inner: B, registry: &Registry) -> Self {
        let reads = Counter::new("anyfs_reads_total", "Total read operations")
            .expect("metric creation failed");
        let writes = Counter::new("anyfs_writes_total", "Total write operations")
            .expect("metric creation failed");
        registry.register(Box::new(reads.clone()))
            .expect("metric registration failed");
        registry.register(Box::new(writes.clone()))
            .expect("metric registration failed");
        // ... register all metrics
        Self { inner, reads, writes, /* ... */ }
    }
}

impl<B: FsRead> FsRead for Metrics<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        self.reads.inc();
        let start = Instant::now();
        let result = self.inner.read(path);
        self.latency.observe(start.elapsed().as_secs_f64());
        if let Ok(ref data) = result {
            self.read_bytes.inc_by(data.len() as u64);
        }
        result
    }
}

// Expose via HTTP endpoint
async fn metrics_handler(registry: web::Data<Registry>) -> impl Responder {
    let encoder = TextEncoder::new();
    let metrics = registry.gather();
    encoder.encode_to_string(&metrics)
        .unwrap_or_else(|e| format!("# Encoding error: {}", e))
}
}

Indexing Middleware with Remote Database:

#![allow(unused)]
fn main() {
pub trait IndexBackend: Send + Sync {
    fn record_write(&self, path: &Path, size: u64, hash: &str) -> Result<(), IndexError>;
    fn record_delete(&self, path: &Path) -> Result<(), IndexError>;
    fn query(&self, pattern: &str) -> Result<Vec<IndexEntry>, IndexError>;
}

// SQLite implementation
pub struct SqliteIndex { conn: Connection }

// PostgreSQL implementation  
pub struct PostgresIndex { pool: PgPool }

// MariaDB implementation
pub struct MariaDbIndex { pool: MySqlPool }

pub struct Indexing<B, I: IndexBackend> {
    inner: B,
    index: I,
}

impl<B: FsWrite, I: IndexBackend> FsWrite for Indexing<B, I> {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        self.inner.write(path, data)?;
        let hash = sha256(data);
        self.index.record_write(path, data.len() as u64, &hash)
            .map_err(|e| FsError::Backend(e.to_string()))?;
        Ok(())
    }
}

// Usage with PostgreSQL
let index = PostgresIndex::connect("postgres://user:pass@db.example.com/files").await?;
let backend = MemoryBackend::new()
    .layer(IndexLayer::builder()
        .index(index)
        .build());
}

Configurable Tracing with Multiple Sinks:

#![allow(unused)]
fn main() {
pub trait TraceSink: Send + Sync {
    fn log_operation(&self, op: &Operation);
}

// Structured JSON logs
pub struct JsonSink { writer: Box<dyn Write + Send> }

// CEF (Common Event Format) for SIEM integration
pub struct CefSink { 
    host: String,
    port: u16,
    device_vendor: String 
}

impl TraceSink for CefSink {
    fn log_operation(&self, op: &Operation) {
        let cef = format!(
            "CEF:0|AnyFS|FileStorage|1.0|{}|{}|{}|src={} dst={}",
            op.event_id, op.name, op.severity, op.source_path, op.dest_path
        );
        self.send_syslog(&cef);
    }
}

// Remote sink (e.g., Loki, Elasticsearch)
pub struct RemoteSink { endpoint: String, client: reqwest::Client }

pub struct Tracing<B, S: TraceSink> {
    inner: B,
    sink: S,
}
}
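A minimal sink implementation makes the seam concrete. This self-contained sketch collects JSON-ish lines in memory; Operation is simplified to two fields for illustration, and a real JsonSink would hold a Box<dyn Write + Send> instead of a Vec.

```rust
use std::sync::Mutex;

// Simplified stand-in for the operation record the middleware would pass.
struct Operation {
    name: &'static str,
    path: String,
}

trait TraceSink: Send + Sync {
    fn log_operation(&self, op: &Operation);
}

// Collects formatted lines in memory for inspection/testing.
struct BufferSink {
    lines: Mutex<Vec<String>>,
}

impl TraceSink for BufferSink {
    fn log_operation(&self, op: &Operation) {
        let line = format!("{{\"op\":\"{}\",\"path\":\"{}\"}}", op.name, op.path);
        self.lines.lock().unwrap().push(line);
    }
}

fn main() {
    let sink = BufferSink { lines: Mutex::new(Vec::new()) };
    sink.log_operation(&Operation {
        name: "write",
        path: "/data/file.txt".into(),
    });
    assert_eq!(
        sink.lines.lock().unwrap()[0],
        r#"{"op":"write","path":"/data/file.txt"}"#
    );
}
```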

Performance: Strategic Boxing (ADR-025)

AnyFS follows Tower/Axum’s approach to dynamic dispatch: zero-cost on the hot path, box at boundaries where flexibility is needed. We avoid heap allocations and dynamic dispatch unless they add flexibility without meaningful performance impact.

| Path | Operations | Cost |
|---|---|---|
| Hot path (zero-cost) | read(), write(), metadata(), exists() | Concrete types, no boxing |
| Hot path (zero-cost) | Middleware composition: Quota<Tracing<B>> | Generics, monomorphized |
| Cold path (boxed) | open_read(), open_write(), read_dir() | One Box allocation per call |
| Opt-in | FileStorage::boxed() | Explicit type erasure |

Hot-loop guidance: If you open many small files and care about micro-overhead (especially on virtual backends), prefer read()/write() or the typed streaming extension (FsReadTyped/FsWriteTyped) when the backend type is known. These are the zero-allocation fast paths.

Why box streams and iterators?

  1. Middleware needs to wrap them (QuotaWriter counts bytes, PathFilter filters entries)
  2. Box allocation (~50ns) is <1% of actual I/O time
  3. Avoids type explosion: QuotaReader<PathFilterReader<TracingReader<Cursor<...>>>>

Why NOT box bulk operations?

  1. read() and write() are the most common operations
  2. They return concrete types (Vec<u8>, ())
  3. Zero overhead for the typical use case

See ADR-025 and Zero-Cost Alternatives for full analysis.


Trait Architecture (in anyfs-backend)

AnyFS uses layered traits for maximum flexibility with minimal complexity.

See ADR-030 for the rationale behind the layered hierarchy.

                        FsPosix
                           │
            ┌──────────────┼──────────────┐
            │              │              │
       FsHandles      FsLock       FsXattr
            │              │              │
            └──────────────┴──────────────┘
                           │
                        FsFuse ← FsFull + FsInode
                           │
            ┌──────────────┴──────────────┐
            │                             │
         FsFull                       FsInode
            │
            │
            ├──────┬───────┬───────┬──────┐
            │      │       │       │      │
         FsLink FsPerm  FsSync  FsStats   │
            │      │       │       │      │
            └──────┴───────┴───────┴──────┘
                           │
                           Fs  ← Most users only need this
                           │
               ┌───────────┼───────────┐
               │           │           │
            FsRead    FsWrite     FsDir

Simple rule: Import Fs for basic use. Add traits as needed for advanced features.


Core Traits (Layer 1)

FsRead - Read Operations

#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
    fn read_to_string(&self, path: &Path) -> Result<String, FsError>;
    fn read_range(&self, path: &Path, offset: u64, len: usize) -> Result<Vec<u8>, FsError>;
    fn exists(&self, path: &Path) -> Result<bool, FsError>;
    fn metadata(&self, path: &Path) -> Result<Metadata, FsError>;
    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
}
}
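For implementers, here is a self-contained sketch of the read side of the contract. The error and method shapes are simplified stand-ins for the anyfs-backend types (no open_read or metadata); the point is the contract itself: &self methods, interior mutability, and owned return values.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::RwLock;

// Simplified stand-in for anyfs_backend::FsError.
#[derive(Debug, PartialEq)]
enum FsError {
    NotFound { path: PathBuf },
}

// Minimal in-memory backend: &self methods via interior mutability.
struct TinyBackend {
    files: RwLock<HashMap<PathBuf, Vec<u8>>>,
}

impl TinyBackend {
    // Mirrors FsRead::read - owned bytes out, error carries the path.
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        self.files
            .read()
            .unwrap()
            .get(path)
            .cloned()
            .ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })
    }

    // Mirrors FsRead::exists.
    fn exists(&self, path: &Path) -> Result<bool, FsError> {
        Ok(self.files.read().unwrap().contains_key(path))
    }
}

fn main() {
    let backend = TinyBackend { files: RwLock::new(HashMap::new()) };
    backend
        .files
        .write()
        .unwrap()
        .insert(PathBuf::from("/a.txt"), b"hi".to_vec());

    assert_eq!(backend.read(Path::new("/a.txt")).unwrap(), b"hi".to_vec());
    assert_eq!(backend.exists(Path::new("/missing")).unwrap(), false);
}
```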

FsWrite - Write Operations

#![allow(unused)]
fn main() {
pub trait FsWrite: Send + Sync {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
    fn append(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
    fn remove_file(&self, path: &Path) -> Result<(), FsError>;
    fn rename(&self, from: &Path, to: &Path) -> Result<(), FsError>;
    fn copy(&self, from: &Path, to: &Path) -> Result<(), FsError>;
    fn truncate(&self, path: &Path, size: u64) -> Result<(), FsError>;
    fn open_write(&self, path: &Path) -> Result<Box<dyn Write + Send>, FsError>;
}
}

Note: All methods use &self (interior mutability). Backends manage their own synchronization. See ADR-023.

FsDir - Directory Operations

#![allow(unused)]
fn main() {
pub trait FsDir: Send + Sync {
    fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError>;
    fn create_dir(&self, path: &Path) -> Result<(), FsError>;
    fn create_dir_all(&self, path: &Path) -> Result<(), FsError>;
    fn remove_dir(&self, path: &Path) -> Result<(), FsError>;
    fn remove_dir_all(&self, path: &Path) -> Result<(), FsError>;
}
}

Extended Traits (Layer 2 - Optional)

#![allow(unused)]
fn main() {
pub trait FsLink: Send + Sync {
    fn symlink(&self, original: &Path, link: &Path) -> Result<(), FsError>;
    fn hard_link(&self, original: &Path, link: &Path) -> Result<(), FsError>;
    fn read_link(&self, path: &Path) -> Result<PathBuf, FsError>;
    fn symlink_metadata(&self, path: &Path) -> Result<Metadata, FsError>;
}

pub trait FsPermissions: Send + Sync {
    fn set_permissions(&self, path: &Path, perm: Permissions) -> Result<(), FsError>;
}

pub trait FsSync: Send + Sync {
    fn sync(&self) -> Result<(), FsError>;
    fn fsync(&self, path: &Path) -> Result<(), FsError>;
}

pub trait FsStats: Send + Sync {
    fn statfs(&self) -> Result<StatFs, FsError>;
}
}

Inode Traits (Layer 3 - For FUSE)

#![allow(unused)]
fn main() {
pub trait FsInode: Send + Sync {
    fn path_to_inode(&self, path: &Path) -> Result<u64, FsError>;
    fn inode_to_path(&self, inode: u64) -> Result<PathBuf, FsError>;
    fn lookup(&self, parent_inode: u64, name: &OsStr) -> Result<u64, FsError>;
    fn metadata_by_inode(&self, inode: u64) -> Result<Metadata, FsError>;
}
}

POSIX Traits (Layer 4 - Full POSIX)

#![allow(unused)]
fn main() {
pub trait FsHandles: Send + Sync {
    fn open(&self, path: &Path, flags: OpenFlags) -> Result<Handle, FsError>;
    fn read_at(&self, handle: Handle, buf: &mut [u8], offset: u64) -> Result<usize, FsError>;
    fn write_at(&self, handle: Handle, data: &[u8], offset: u64) -> Result<usize, FsError>;
    fn close(&self, handle: Handle) -> Result<(), FsError>;
}

pub trait FsLock: Send + Sync {
    fn lock(&self, handle: Handle, lock: LockType) -> Result<(), FsError>;
    fn try_lock(&self, handle: Handle, lock: LockType) -> Result<bool, FsError>;
    fn unlock(&self, handle: Handle) -> Result<(), FsError>;
}

pub trait FsXattr: Send + Sync {
    fn get_xattr(&self, path: &Path, name: &str) -> Result<Vec<u8>, FsError>;
    fn set_xattr(&self, path: &Path, name: &str, value: &[u8]) -> Result<(), FsError>;
    fn remove_xattr(&self, path: &Path, name: &str) -> Result<(), FsError>;
    fn list_xattr(&self, path: &Path) -> Result<Vec<String>, FsError>;
}
}

Convenience Supertraits (Simple API)

#![allow(unused)]
fn main() {
/// Basic filesystem - covers 90% of use cases
pub trait Fs: FsRead + FsWrite + FsDir {}
impl<T: FsRead + FsWrite + FsDir> Fs for T {}

/// Full filesystem with all std::fs features
pub trait FsFull: Fs + FsLink + FsPermissions + FsSync + FsStats {}
impl<T: Fs + FsLink + FsPermissions + FsSync + FsStats> FsFull for T {}

/// FUSE-mountable filesystem
pub trait FsFuse: FsFull + FsInode {}
impl<T: FsFull + FsInode> FsFuse for T {}

/// Full POSIX filesystem
pub trait FsPosix: FsFuse + FsHandles + FsLock + FsXattr {}
impl<T: FsFuse + FsHandles + FsLock + FsXattr> FsPosix for T {}
}

Usage Examples

Application code should use FileStorage for the std::fs-style DX (string paths). Core trait examples are shown separately for implementers and generic code.

Most Users: FileStorage

#![allow(unused)]
fn main() {
use anyfs::{FileStorage, MemoryBackend};

fn process_files() -> Result<(), Box<dyn std::error::Error>> {
    let fs = FileStorage::new(MemoryBackend::new());
    let data = fs.read("/input.txt")?;
    fs.write("/output.txt", &processed(data))?;
    Ok(())
}
}

Generic Code over Core Traits

#![allow(unused)]
fn main() {
use anyfs::{FileStorage, Fs, FsError};

fn process_files<B: Fs>(fs: &FileStorage<B>) -> Result<(), FsError> {
    let data = fs.read("/input.txt")?;
    fs.write("/output.txt", &processed(data))?;
    Ok(())
}
}
#![allow(unused)]
fn main() {
use anyfs::{FileStorage, Fs, FsLink, FsError};

fn with_symlinks<B: Fs + FsLink>(fs: &FileStorage<B>) -> Result<(), FsError> {
    fs.write("/target.txt", b"content")?;
    fs.symlink("/target.txt", "/link.txt")?;
    Ok(())
}
}

FUSE Mount

Mounting is part of the anyfs crate, behind the fuse and winfsp feature flags; see src/guides/mounting.md.

#![allow(unused)]
fn main() {
use anyfs::{FsFuse, MountHandle, MountError};

fn mount_filesystem(fs: impl FsFuse) -> Result<(), MountError> {
    let _mount = MountHandle::mount(fs, "/mnt/myfs")?;  // hold the handle for the mount's lifetime
    Ok(())
}
}

Full POSIX Application

#![allow(unused)]
fn main() {
use anyfs::{FileStorage, FsPosix, FsError, OpenFlags, LockType, Handle};

fn database_app<B: FsPosix>(fs: &FileStorage<B>, data: &[u8], offset: u64) -> Result<(), FsError> {
    let handle: Handle = fs.open("/data.db", OpenFlags::READ_WRITE)?;
    fs.lock(handle, LockType::Exclusive)?;
    fs.write_at(handle, data, offset)?;
    fs.unlock(handle)?;
    fs.close(handle)?;
    Ok(())
}
}

Core Types (in anyfs-backend)

Constants

#![allow(unused)]
fn main() {
/// Root directory inode. FUSE convention.
pub const ROOT_INODE: u64 = 1;
}

Metadata

#![allow(unused)]
fn main() {
/// File or directory metadata.
#[derive(Debug, Clone)]
pub struct Metadata {
    /// Type: File, Directory, or Symlink.
    pub file_type: FileType,

    /// Size in bytes (0 for directories).
    pub size: u64,

    /// Permission mode bits. Defaults to 0o755 (dirs) / 0o644 (files) if unsupported.
    pub permissions: Permissions,

    /// Creation time (UNIX_EPOCH if unsupported).
    pub created: SystemTime,

    /// Last modification time.
    pub modified: SystemTime,

    /// Last access time.
    pub accessed: SystemTime,

    /// Inode number (0 if unsupported).
    pub inode: u64,

    /// Number of hard links (1 if unsupported).
    pub nlink: u64,
}

impl Metadata {
    /// Check if this is a file.
    pub fn is_file(&self) -> bool { self.file_type == FileType::File }

    /// Check if this is a directory.
    pub fn is_dir(&self) -> bool { self.file_type == FileType::Directory }

    /// Check if this is a symlink.
    pub fn is_symlink(&self) -> bool { self.file_type == FileType::Symlink }
}
}

FileType

#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FileType {
    File,
    Directory,
    Symlink,
}
}

DirEntry

#![allow(unused)]
fn main() {
/// Entry in a directory listing.
#[derive(Debug, Clone)]
pub struct DirEntry {
    /// File or directory name (not full path).
    pub name: String,

    /// Full path to the entry.
    pub path: PathBuf,

    /// Type: File, Directory, or Symlink.
    pub file_type: FileType,

    /// Size in bytes (0 for directories, can be lazy).
    pub size: u64,

    /// Inode number (0 if unsupported).
    pub inode: u64,
}
}

Permissions

#![allow(unused)]
fn main() {
/// Unix-style permission bits.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Permissions(u32);

impl Permissions {
    /// Create permissions from a mode (e.g., 0o755).
    pub fn from_mode(mode: u32) -> Self { Permissions(mode) }

    /// Get the mode bits.
    pub fn mode(&self) -> u32 { self.0 }

    /// Read-only permissions (0o444).
    pub fn readonly() -> Self { Permissions(0o444) }

    /// Default file permissions (0o644).
    pub fn default_file() -> Self { Permissions(0o644) }

    /// Default directory permissions (0o755).
    pub fn default_dir() -> Self { Permissions(0o755) }
}
}
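Mode bits compose with standard bitwise operations. A small self-contained sketch of working with them (plain functions operating on raw `u32` modes, not the actual Permissions API):

```rust
// Owner-write is the 0o200 bit; write bits for owner/group/other are 0o222.
fn is_owner_writable(mode: u32) -> bool { mode & 0o200 != 0 }

// Clearing all write bits turns e.g. 0o755 into 0o555 (read-only-ish).
fn strip_write(mode: u32) -> u32 { mode & !0o222 }

fn main() {
    assert!(is_owner_writable(0o644));
    assert!(!is_owner_writable(0o444));
    assert_eq!(strip_write(0o755), 0o555);
    assert_eq!(strip_write(0o644), 0o444);
}
```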

StatFs

#![allow(unused)]
fn main() {
/// Filesystem statistics.
#[derive(Debug, Clone)]
pub struct StatFs {
    /// Total size in bytes (0 = unlimited).
    pub total_bytes: u64,

    /// Used bytes.
    pub used_bytes: u64,

    /// Available bytes.
    pub available_bytes: u64,

    /// Total number of inodes (0 = unlimited).
    pub total_inodes: u64,

    /// Used inodes.
    pub used_inodes: u64,

    /// Available inodes.
    pub available_inodes: u64,

    /// Filesystem block size.
    pub block_size: u64,

    /// Maximum filename length.
    pub max_name_len: u64,
}
}

Middleware (in anyfs)

Each middleware implements the same traits as its inner backend. This enables composition while preserving capabilities.

Quota

Enforces quota limits. Tracks usage and rejects operations that would exceed limits.

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer};

let backend = QuotaLayer::builder()
    .max_total_size(100 * 1024 * 1024)   // 100 MB
    .max_file_size(10 * 1024 * 1024)     // 10 MB per file
    .max_node_count(10_000)              // 10K files/dirs
    .max_dir_entries(1_000)              // 1K entries per dir
    .max_path_depth(64)
    .build()
    .layer(MemoryBackend::new());

// Check usage
let usage = backend.usage();
let remaining = backend.remaining();
}
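Conceptually, the quota middleware keeps a running byte count and rejects operations that would cross the cap. A self-contained sketch of that accounting (toy QuotaState type; reject-before-apply semantics are an assumption):

```rust
// Toy quota accounting: a failed write must not consume any quota.
struct QuotaState { used: u64, max_total: u64 }

impl QuotaState {
    fn try_write(&mut self, len: u64) -> Result<(), String> {
        if self.used + len > self.max_total {
            return Err(format!("quota exceeded: {} + {} > {}", self.used, len, self.max_total));
        }
        self.used += len;
        Ok(())
    }
    fn remaining(&self) -> u64 { self.max_total - self.used }
}

fn main() {
    let mut q = QuotaState { used: 0, max_total: 100 };
    assert!(q.try_write(60).is_ok());
    assert_eq!(q.remaining(), 40);
    assert!(q.try_write(50).is_err()); // would exceed the 100-byte cap
    assert_eq!(q.remaining(), 40);     // failed write did not consume quota
}
```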

Restrictions

Blocks permission-related operations when needed.

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, RestrictionsLayer};

// Symlink/hard-link capability is determined by trait bounds (FsLink).
// Restrictions only controls permission changes.
let backend = RestrictionsLayer::builder()
    .deny_permissions()    // Block set_permissions() calls
    .build()
    .layer(MemoryBackend::new());
}

When blocked, operations return FsError::FeatureNotEnabled.

Tracing

Integrates with the tracing ecosystem for structured logging and instrumentation.

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, TracingLayer};

let backend = MemoryBackend::new()
    .layer(TracingLayer::new()
        .with_target("anyfs")
        .with_level(tracing::Level::DEBUG));

// Users configure tracing subscribers as they prefer
tracing_subscriber::fmt::init();
}

Why tracing instead of custom logging?

  • Works with existing tracing infrastructure
  • Structured logging with spans
  • Compatible with OpenTelemetry, Jaeger, etc.
  • Users choose their subscriber (console, file, distributed tracing)

PathFilter

Restricts access to specific paths. Essential for sandboxing.

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, PathFilterLayer};

let backend = PathFilterLayer::builder()
    .allow("/workspace/**")           // Allow all under /workspace
    .allow("/tmp/**")                  // Allow temp files
    .deny("/workspace/.env")           // But deny .env files
    .deny("**/.git/**")               // Deny all .git directories
    .build()
    .layer(MemoryBackend::new());
}

When a path is denied, operations return FsError::AccessDenied.
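One plausible evaluation order (an assumption, not a documented guarantee) is deny-rules-first, then require at least one allow match. A self-contained toy sketch, with naive `/**` prefix matching standing in for real glob support:

```rust
// Toy glob: only supports exact paths and a trailing "/**" prefix pattern.
fn matches(pattern: &str, path: &str) -> bool {
    if let Some(prefix) = pattern.strip_suffix("/**") {
        path == prefix || path.starts_with(&format!("{}/", prefix))
    } else {
        path == pattern
    }
}

// Deny rules win; otherwise the path must match some allow rule.
fn allowed(path: &str, allow: &[&str], deny: &[&str]) -> bool {
    if deny.iter().any(|p| matches(p, path)) { return false; }
    allow.iter().any(|p| matches(p, path))
}

fn main() {
    let allow = ["/workspace/**", "/tmp/**"];
    let deny = ["/workspace/.env"];
    assert!(allowed("/workspace/src/main.rs", &allow, &deny));
    assert!(!allowed("/workspace/.env", &allow, &deny)); // denied explicitly
    assert!(!allowed("/etc/passwd", &allow, &deny));     // never allowed
}
```

A real implementation would use a proper glob matcher so that patterns like `**/.git/**` work anywhere in the path.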

ReadOnly

Prevents all write operations. Useful for publishing immutable data.

#![allow(unused)]
fn main() {
use anyfs::{VRootFsBackend, ReadOnly, FileStorage};

// Wrap any backend to make it read-only
let backend = ReadOnly::new(VRootFsBackend::new("/var/published")?);
let fs = FileStorage::new(backend);

fs.read("/doc.txt")?;     // OK
fs.write("/doc.txt", b"x"); // Error: FsError::ReadOnly
}

RateLimit

Limits operations per second. Prevents runaway agents.

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, RateLimitLayer};

let backend = RateLimitLayer::builder()
    .max_ops(100)       // 100 ops per window
    .per_second()       // 1 second window
    .build()
    .layer(MemoryBackend::new());

// When rate exceeded: FsError::RateLimitExceeded
}
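A fixed-window counter is the simplest way to implement this behavior. The sketch below uses a manually supplied clock for determinism (toy RateLimiter type; fixed-window semantics are an assumption, a real implementation would use std::time::Instant):

```rust
// Toy fixed-window rate limiter driven by an explicit clock (seconds).
struct RateLimiter { max_ops: u32, window_secs: u64, window_start: u64, count: u32 }

impl RateLimiter {
    fn check(&mut self, now_secs: u64) -> bool {
        if now_secs >= self.window_start + self.window_secs {
            self.window_start = now_secs; // new window: reset the counter
            self.count = 0;
        }
        if self.count < self.max_ops { self.count += 1; true } else { false }
    }
}

fn main() {
    let mut rl = RateLimiter { max_ops: 2, window_secs: 1, window_start: 0, count: 0 };
    assert!(rl.check(0));
    assert!(rl.check(0));
    assert!(!rl.check(0)); // third op in the same window: rejected
    assert!(rl.check(1));  // next second: window resets
}
```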

DryRun

Logs operations without executing writes. Great for testing and debugging.

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, DryRun, FileStorage};

let backend = DryRun::new(MemoryBackend::new());
let fs = FileStorage::new(backend);

fs.write("/test.txt", b"hello")?;  // Logged but not written
let _ = fs.read("/test.txt");       // Error: file doesn't exist

// To inspect recorded operations, keep the DryRun handle before wrapping it.
}

Cache

LRU cache for read operations. Essential for slow backends (S3, network).

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, CacheLayer, FileStorage};

let backend = MemoryBackend::new()
    .layer(CacheLayer::builder()
        .max_entries(10_000)              // Max 10K entries in cache
        .max_entry_size(10 * 1024 * 1024) // 10 MB max per entry
        .build());
let fs = FileStorage::new(backend);

// First read: hits backend, caches result
let data = fs.read("/file.txt")?;

// Second read: served from cache (fast!)
let data = fs.read("/file.txt")?;
}
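The core of any such cache is LRU bookkeeping: hits refresh recency, and inserts past capacity evict the stalest entry. A deliberately naive self-contained sketch (toy Lru type with an O(n) recency list; a real cache would use an ordered map):

```rust
use std::collections::HashMap;

struct Lru { cap: usize, map: HashMap<String, Vec<u8>>, order: Vec<String> }

impl Lru {
    fn new(cap: usize) -> Self { Lru { cap, map: HashMap::new(), order: Vec::new() } }
    fn touch(&mut self, key: &str) {
        self.order.retain(|k| k != key);
        self.order.push(key.to_string()); // back of the list = most recent
    }
    fn get(&mut self, key: &str) -> Option<Vec<u8>> {
        let v = self.map.get(key).cloned()?;
        self.touch(key);
        Some(v)
    }
    fn put(&mut self, key: &str, val: Vec<u8>) {
        if !self.map.contains_key(key) && self.map.len() == self.cap {
            let lru = self.order.remove(0); // front = least recently used
            self.map.remove(&lru);
        }
        self.map.insert(key.to_string(), val);
        self.touch(key);
    }
}

fn main() {
    let mut c = Lru::new(2);
    c.put("/a", vec![1]);
    c.put("/b", vec![2]);
    c.get("/a");          // /a becomes most recent
    c.put("/c", vec![3]); // evicts /b, the least recently used
    assert!(c.get("/b").is_none());
    assert!(c.get("/a").is_some());
    assert!(c.get("/c").is_some());
}
```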

Overlay<Base, Upper>

Union filesystem with a read-only base and a writable upper layer, similar to Docker's overlayfs image layers.

#![allow(unused)]
fn main() {
use anyfs::{VRootFsBackend, MemoryBackend, Overlay};

// Base: read-only template
let base = VRootFsBackend::new("/var/templates")?;

// Upper: writable layer for changes
let upper = MemoryBackend::new();

let backend = Overlay::new(base, upper);

// Reads check upper first, then base
// Writes always go to upper
// Deletes in upper "shadow" base files
}

Use cases:

  • Container images (base image + writable layer)
  • Template filesystems with per-user modifications
  • Testing with rollback capability
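The upper-then-base lookup and whiteout mechanics can be sketched with two maps and a shadow set (toy Overlay type; the semantics follow the comments in the example above):

```rust
use std::collections::{HashMap, HashSet};

// Reads check upper then base; deleting a base-only file records a
// "whiteout" entry that shadows the base copy.
struct Overlay {
    base: HashMap<String, Vec<u8>>,   // read-only
    upper: HashMap<String, Vec<u8>>,  // writable
    whiteouts: HashSet<String>,
}

impl Overlay {
    fn read(&self, path: &str) -> Option<&Vec<u8>> {
        if self.whiteouts.contains(path) { return None; }
        self.upper.get(path).or_else(|| self.base.get(path))
    }
    fn write(&mut self, path: &str, data: Vec<u8>) {
        self.whiteouts.remove(path);
        self.upper.insert(path.to_string(), data); // writes always go to upper
    }
    fn remove(&mut self, path: &str) {
        self.upper.remove(path);
        if self.base.contains_key(path) {
            self.whiteouts.insert(path.to_string()); // shadow the base copy
        }
    }
}

fn main() {
    let mut fs = Overlay {
        base: HashMap::from([("/etc/motd".to_string(), b"hello".to_vec())]),
        upper: HashMap::new(),
        whiteouts: HashSet::new(),
    };
    assert_eq!(fs.read("/etc/motd"), Some(&b"hello".to_vec()));
    fs.write("/etc/motd", b"bye".to_vec());
    assert_eq!(fs.read("/etc/motd"), Some(&b"bye".to_vec())); // upper wins
    fs.remove("/etc/motd");
    assert!(fs.read("/etc/motd").is_none()); // whiteout hides the base copy
}
```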

FileStorage (in anyfs)

FileStorage<B> is an ergonomic wrapper with a single generic parameter:

  • B - Backend type (the only generic)
  • Resolver is boxed internally (cold path, per ADR-025)

Axum-style design: Simple by default, type erasure opt-in via .boxed().

Basic Usage

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, FileStorage};

// Type is inferred - no need to write it out
let fs = FileStorage::new(MemoryBackend::new());

fs.create_dir_all("/documents")?;
fs.write("/documents/hello.txt", b"Hello!")?;
let content = fs.read("/documents/hello.txt")?;
}

Type-Safe Wrappers (User-Defined)

If you need compile-time safety to prevent mixing filesystems, create wrapper types:

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, FileStorage};
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate

// Define wrapper types for your domains
struct SandboxFs(FileStorage<MemoryBackend>);
struct UserDataFs(FileStorage<SqliteBackend>);

// Type-safe function signatures prevent mixing
fn process_sandbox(fs: &SandboxFs) {
    // Can only accept SandboxFs
}

fn save_user_file(fs: &UserDataFs, name: &str, data: &[u8]) {
    // Can only accept UserDataFs
}

// Compile-time safety:
let sandbox = SandboxFs(FileStorage::new(MemoryBackend::new()));
process_sandbox(&sandbox);   // OK
// process_sandbox(&userdata);  // Compile error! Wrong type
}

Type Aliases for Clean Code

#![allow(unused)]
fn main() {
use anyfs::{FileStorage, MemoryBackend, Quota, Restrictions, Tracing};
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate

// Define your standard secure stack
type SecureBackend = Tracing<Restrictions<Quota<SqliteBackend>>>;

// Type aliases for common combinations
type SandboxFs = FileStorage<MemoryBackend>;
type UserDataFs = FileStorage<SecureBackend>;

// Clean function signatures
fn run_agent(fs: &SandboxFs) { /* ... */ }
}

FileStorage Implementation

#![allow(unused)]
fn main() {
use anyfs_backend::PathResolver;

/// Ergonomic wrapper with single generic.
pub struct FileStorage<B> {
    backend: B,
    resolver: Box<dyn PathResolver>,  // Boxed: cold path
}

impl<B: Fs> FileStorage<B> {
    /// Create with default resolver (IterativeResolver).
    pub fn new(backend: B) -> Self { /* ... */ }

    /// Create with custom path resolver.
    pub fn with_resolver(backend: B, resolver: impl PathResolver + 'static) -> Self { /* ... */ }

    /// Type-erase the backend (opt-in boxing).
    pub fn boxed(self) -> FileStorage<Box<dyn Fs>> { /* ... */ }
}
}

Type Erasure (Opt-in)

When you need uniform types (e.g., collections), use .boxed():

#![allow(unused)]
fn main() {
use anyfs::{FileStorage, Fs, MemoryBackend};
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate

// Type-erased for uniform storage
let filesystems: Vec<FileStorage<Box<dyn Fs>>> = vec![
    FileStorage::new(MemoryBackend::new()).boxed(),
    FileStorage::new(SqliteBackend::open("a.db")?).boxed(),
];
}

Layer Trait (in anyfs-backend)

The Layer trait (inspired by Tower) standardizes middleware composition:

#![allow(unused)]
fn main() {
/// A layer that wraps a backend to add functionality.
pub trait Layer<B: Fs> {
    type Backend: Fs;
    fn layer(self, backend: B) -> Self::Backend;
}

/// Extension trait enabling fluent `.layer()` method on any Fs.
/// This is how `backend.layer(QuotaLayer::builder()...build())` works.
pub trait LayerExt: Fs + Sized {
    fn layer<L: Layer<Self>>(self, layer: L) -> L::Backend {
        layer.layer(self)
    }
}

// Blanket impl: any Fs gets .layer() for free
impl<B: Fs> LayerExt for B {}
}

Each middleware provides a corresponding Layer implementation:

#![allow(unused)]
fn main() {
// QuotaLayer wraps QuotaConfig (not a separate QuotaLimits type)
pub struct QuotaLayer {
    config: QuotaConfig,
}

impl<B: Fs> Layer<B> for QuotaLayer {
    type Backend = Quota<B>;
    fn layer(self, backend: B) -> Self::Backend {
        Quota::with_config(backend, self.config)
            .expect("quota initialization failed")
    }
}
}

Note: Middleware that implements additional traits (like FsInode) can use more specific bounds to preserve capabilities through the layer.


Composing Middleware

Middleware composes by wrapping. Order matters - innermost applies first.

Fluent Composition

Use the .layer() extension method for Axum-style composition:

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, RestrictionsLayer, TracingLayer};

let backend = MemoryBackend::new()
    .layer(QuotaLayer::builder()
        .max_total_size(100 * 1024 * 1024)
        .build())
    .layer(RestrictionsLayer::builder()
        .deny_permissions()  // Block set_permissions()
        .build())
    .layer(TracingLayer::new());
}
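The wrapping order can be demonstrated with toy types: each `.layer()` call wraps the result so far, so the first layer applied ends up innermost. A self-contained sketch (toy Svc/TagLayer names, not the anyfs types):

```rust
// A toy service trait standing in for Fs.
trait Svc { fn name(&self) -> String; }

struct Backend;
impl Svc for Backend { fn name(&self) -> String { "backend".into() } }

// A wrapper that tags its inner service, like a middleware would.
struct Wrap { tag: &'static str, inner: Box<dyn Svc> }
impl Svc for Wrap {
    fn name(&self) -> String { format!("{}({})", self.tag, self.inner.name()) }
}

trait Layer { fn layer(self, inner: Box<dyn Svc>) -> Box<dyn Svc>; }
struct TagLayer(&'static str);
impl Layer for TagLayer {
    fn layer(self, inner: Box<dyn Svc>) -> Box<dyn Svc> {
        Box::new(Wrap { tag: self.0, inner })
    }
}

fn main() {
    let svc = TagLayer("tracing").layer(
        TagLayer("quota").layer(Box::new(Backend)));
    // quota was applied first, so it sits innermost, closest to the backend:
    assert_eq!(svc.name(), "tracing(quota(backend))");
}
```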

BackendStack Builder

For complex stacks, use BackendStack for a fluent API:

#![allow(unused)]
fn main() {
use anyfs::{BackendStack, MemoryBackend};

let fs = BackendStack::new(MemoryBackend::new())
    .limited(|l| l
        .max_total_size(100 * 1024 * 1024)
        .max_file_size(10 * 1024 * 1024))
    .restricted(|g| g
        .deny_permissions())    // Block set_permissions() calls
    .traced()
    .into_container();
}

Built-in Backends (anyfs crate)

| Backend | Description |
|---|---|
| MemoryBackend | In-memory storage, implements Clone for snapshots |
| StdFsBackend | Direct std::fs delegation (no containment) |
| VRootFsBackend | Host filesystem with path containment (via strict-path) |

Ecosystem Backends (Separate Crates)

Complex backends with internal runtime requirements live in their own crates:

| Crate | Backend | Description |
|---|---|---|
| anyfs-sqlite | SqliteBackend | Single-file database with pooling, WAL, sharding; optional encryption |
| anyfs-indexed | IndexedBackend | Virtual paths + disk blobs (large file support) |

Why separate crates? Complex backends need internal runtimes (connection pools, sharding, chunking). Keeps anyfs lightweight and focused on framework glue.


Path Handling

Core traits take &Path so they are object-safe (dyn Fs works). The ergonomic layer (FileStorage and FsExt) accepts impl AsRef<Path>:

#![allow(unused)]
fn main() {
use std::path::PathBuf;

// These all work via FileStorage/FsExt, which accept impl AsRef<Path>
fs.write("/file.txt", data)?;
fs.write(String::from("/file.txt"), data)?;
fs.write(PathBuf::from("/file.txt"), data)?;
}

Path Resolution

Path resolution (walking directory structure, following symlinks) operates on the Fs abstraction, not reimplemented per-backend.

See ADR-029 for the path-resolution decision.

Why Abstract Path Resolution?

We simulate inodes - that’s the whole point of virtualizing a filesystem. Path resolution must work on that abstraction:

  • /foo/../bar cannot be resolved lexically - foo might be a symlink to /other/place, making .. resolve to /other
  • Resolution requires following the actual directory structure (inodes)
  • The Fs traits have the needed methods: metadata(), read_link(), read_dir()

Path Resolution via PathResolver Trait

FileStorage delegates path resolution to a pluggable PathResolver (see ADR-033). The default IterativeResolver walks paths component by component:

#![allow(unused)]
fn main() {
/// Default resolver algorithm (simplified):
/// - Walk path component by component
/// - Use backend.metadata() to check node types
/// - If backend implements FsLink, use read_link() to follow symlinks
/// - Detect circular symlinks (max depth: 40)
/// - Return fully resolved canonical path
pub struct IterativeResolver {
    max_symlink_depth: usize,  // Default: 40
}
}

Resolution behavior depends on the resolver used. The default IterativeResolver follows symlinks when the backend implements FsLink. For backends without FsLink, it traverses directories but treats symlinks as regular files. Users can provide custom resolvers for case-insensitive matching, caching, or other behaviors.

Note: Built-in virtual backends (MemoryBackend) and ecosystem backends (SqliteBackend) implement FsLink, so symlink-aware resolution works out of the box.
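The follow-symlinks-with-bounded-depth loop at the heart of such a resolver can be shown on a toy map-based filesystem (toy Node/resolve names; the depth limit of 40 matches the default stated above):

```rust
use std::collections::HashMap;

// Toy "filesystem": paths map to files or symlink targets.
enum Node { File, Symlink(String) }

// Follow symlinks up to max_depth hops; beyond that, report a loop.
fn resolve(fs: &HashMap<String, Node>, path: &str, max_depth: usize) -> Result<String, String> {
    let mut current = path.to_string();
    for _ in 0..max_depth {
        match fs.get(&current) {
            Some(Node::Symlink(target)) => current = target.clone(),
            Some(Node::File) => return Ok(current),
            None => return Err(format!("not found: {}", current)),
        }
    }
    Err(format!("symlink loop detected: {}", path))
}

fn main() {
    let mut fs = HashMap::new();
    fs.insert("/file".to_string(), Node::File);
    fs.insert("/link".to_string(), Node::Symlink("/file".to_string()));
    fs.insert("/a".to_string(), Node::Symlink("/b".to_string()));
    fs.insert("/b".to_string(), Node::Symlink("/a".to_string())); // cycle

    assert_eq!(resolve(&fs, "/link", 40).unwrap(), "/file");
    assert!(resolve(&fs, "/a", 40).is_err()); // loop caught by the depth bound
}
```

The real resolver additionally walks component by component and re-resolves relative symlink targets; only the cycle-protection idea is shown here.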

When Resolution Is Needed

| Backend | Needs Our Resolution? | Why |
|---|---|---|
| MemoryBackend | Yes | Storage (HashMap) has no FS semantics |
| SqliteBackend | Yes | Storage (SQL tables) has no FS semantics |
| VRootFsBackend | No | OS handles resolution; strict-path prevents escapes |

Opt-out Mechanism

Virtual backends need resolution by default. Real filesystem backends opt out via a marker trait:

#![allow(unused)]
fn main() {
/// Marker trait for backends that handle their own path resolution.
/// VRootFsBackend implements this because the OS handles resolution.
pub trait SelfResolving {}

impl SelfResolving for VRootFsBackend {}
}

Important: FileStorage does NOT auto-detect SelfResolving. You must explicitly use NoOpResolver:

#![allow(unused)]
fn main() {
// For SelfResolving backends, use NoOpResolver explicitly
let fs = FileStorage::with_resolver(VRootFsBackend::new("/data")?, NoOpResolver);
}

The default IterativeResolver follows symlinks when FsLink is available. Custom resolvers can implement different behaviors (e.g., no symlink following, caching, case-insensitivity).

#![allow(unused)]
fn main() {
impl<B: Fs> FileStorage<B> {
    pub fn new(backend: B) -> Self { /* uses IterativeResolver */ }
    pub fn with_resolver(backend: B, resolver: impl PathResolver + 'static) -> Self { /* custom resolver */ }
}
}

Path Canonicalization Utilities

FileStorage provides path canonicalization methods modeled after the soft-canonicalize crate, adapted to work on the virtual filesystem abstraction.

Why We Need Our Own Canonicalization

std::fs::canonicalize operates on the real filesystem. For virtual backends (MemoryBackend, SqliteBackend), there is no real filesystem - we need canonicalization that queries the virtual structure via metadata() and read_link().

Core Methods

#![allow(unused)]
fn main() {
impl<B: Fs> FileStorage<B> {
    /// Strict canonicalization - entire path must exist.
    ///
    /// Delegates to the PathResolver to resolve symlinks and normalize the path.
    /// Returns error if any component doesn't exist.
    pub fn canonicalize(&self, path: impl AsRef<Path>) -> Result<PathBuf, FsError> {
        self.resolver.canonicalize(path.as_ref(), &self.backend as &dyn Fs)
    }

    /// Soft canonicalization - resolves existing components,
    /// appends non-existent remainder lexically.
    ///
    /// Delegates to the PathResolver.
    pub fn soft_canonicalize(&self, path: impl AsRef<Path>) -> Result<PathBuf, FsError> {
        self.resolver.soft_canonicalize(path.as_ref(), &self.backend as &dyn Fs)
    }

    /// Anchored soft canonicalization - like soft_canonicalize but
    /// clamps result within a boundary directory.
    ///
    /// Useful for sandboxing: ensures the resolved path never escapes
    /// the anchor directory, even via symlinks or `..` traversal.
    pub fn anchored_canonicalize(
        &self,
        path: impl AsRef<Path>,
        anchor: impl AsRef<Path>
    ) -> Result<PathBuf, FsError>;
}

/// Standalone lexical normalization (no backend needed).
///
/// Pure string manipulation:
/// - Collapses `//` to `/`
/// - Removes trailing slashes
/// - Does NOT resolve `.` or `..` (those require filesystem context)
/// - Does NOT follow symlinks
pub fn normalize(path: impl AsRef<Path>) -> PathBuf;
}
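The documented `normalize` behavior is pure string manipulation. A self-contained sketch matching the doc comment above (collapse `//`, drop trailing slashes, leave `.` and `..` alone; operates on `&str` rather than `Path` for brevity):

```rust
// Lexical normalization only: no filesystem access, no symlink following.
fn normalize(path: &str) -> String {
    let mut out = String::new();
    let mut prev_slash = false;
    for c in path.chars() {
        if c == '/' {
            if !prev_slash { out.push('/'); } // collapse runs of '/'
            prev_slash = true;
        } else {
            out.push(c);
            prev_slash = false;
        }
    }
    if out.len() > 1 && out.ends_with('/') { out.pop(); } // drop trailing slash
    out
}

fn main() {
    assert_eq!(normalize("/a//b///c/"), "/a/b/c");
    assert_eq!(normalize("/a/../b"), "/a/../b"); // `..` needs filesystem context
    assert_eq!(normalize("/"), "/");            // root is preserved
}
```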

Algorithm: Component-by-Component Resolution

The canonicalization algorithm walks the path one component at a time:

Input: /a/b/c/d/e

1. Start at root (/)
2. Check /a exists?
   - Yes, and it's a symlink → follow to target
   - Yes, and it's a directory → continue
3. Check /a/b exists?
   - Yes → continue
4. Check /a/b/c exists?
   - No → stop resolution, append "c/d/e" lexically
5. Result: /resolved/path/to/b/c/d/e

Key behaviors:

  • Symlink following: Existing symlinks are resolved to their targets
  • Non-existent handling: When a component doesn’t exist, the remainder is appended as-is
  • Cycle detection: Bounded depth tracking prevents infinite loops from circular symlinks
  • Root boundary: Never ascends past the filesystem root

Comparison with std::fs

| Function | std::fs | FileStorage |
|---|---|---|
| canonicalize | Requires all components to exist | Same: returns error if the path doesn't exist |
| soft_canonicalize | N/A | Handles non-existent paths |
| anchored_canonicalize | N/A | Sandboxed resolution |

Security Considerations

For virtual backends: Canonicalization happens entirely within the virtual structure. There is no host filesystem to escape to.

For VRootFsBackend: Delegates to OS canonicalization + strict-path containment. The anchored_canonicalize provides additional safety by clamping paths within a boundary.

Platform Notes (VRootFsBackend only)

When delegating to OS canonicalization:

  • Windows: Returns extended-length UNC paths (\\?\C:\path) by default
  • Linux/macOS: Standard canonical paths

Windows UNC Path Simplification

The dunce crate provides simplified() - a lexical function that converts UNC paths to regular paths without filesystem access:

#![allow(unused)]
fn main() {
use dunce::simplified;
use std::path::Path;

// \\?\C:\Users\foo\bar.txt → C:\Users\foo\bar.txt
let path = simplified(Path::new(r"\\?\C:\Users\foo\bar.txt"));
}

Why this matters for soft_canonicalize:

  • soft_canonicalize works with non-existent paths
  • We can’t use dunce::canonicalize (requires path to exist)
  • dunce::simplified is pure string manipulation - works on any path

When UNC can be simplified:

  • Path is on a local drive (C:, D:, etc.)
  • Path doesn’t exceed MAX_PATH (260 chars)
  • No reserved names (CON, PRN, etc.)

When UNC must be kept:

  • Network paths (\\?\UNC\server\share)
  • Paths exceeding MAX_PATH
  • Paths with reserved device names

Virtual backends have no platform differences - paths are just strings.
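The rules above can be captured in a toy predicate (illustrative only: it checks a subset of the reserved device names, and the real logic lives in the dunce crate):

```rust
// Toy check for whether a \\?\ prefix could be dropped from a Windows path.
fn can_simplify(unc: &str) -> bool {
    let Some(rest) = unc.strip_prefix(r"\\?\") else { return false };
    if rest.starts_with("UNC\\") { return false; }          // network path: keep UNC
    let bytes = rest.as_bytes();
    let local_drive = bytes.len() >= 2 && bytes[0].is_ascii_alphabetic() && bytes[1] == b':';
    if !local_drive { return false; }
    if rest.len() >= 260 { return false; }                  // exceeds MAX_PATH
    // Subset of reserved device names (real list also includes COM1-9, LPT1-9).
    const RESERVED: [&str; 4] = ["CON", "PRN", "AUX", "NUL"];
    !rest.split('\\').any(|component| {
        let stem = component.split('.').next().unwrap_or(component);
        RESERVED.contains(&stem.to_ascii_uppercase().as_str())
    })
}

fn main() {
    assert!(can_simplify(r"\\?\C:\Users\foo\bar.txt"));
    assert!(!can_simplify(r"\\?\UNC\server\share\f.txt")); // network path
    assert!(!can_simplify(r"\\?\C:\temp\CON"));            // reserved name
}
```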


Filesystem Semantics: Linux-like by Default

Design principle: Simple, secure defaults. Don’t close doors for alternative semantics.

See ADR-028 for the decision rationale.

Default Behavior (Virtual Backends)

Virtual backends (MemoryBackend, SqliteBackend) use Linux-like semantics:

| Aspect | Behavior | Rationale |
|---|---|---|
| Case sensitivity | Case-sensitive | Simpler, more secure, Unix standard |
| Path separator | / internally | Cross-platform consistency |
| Reserved names | None | No artificial restrictions |
| Max path length | No limit | Virtual, no OS constraints |
| ADS (:stream) | Not supported | Security risk, complexity |

Real filesystem backends (StdFsBackend, VRootFsBackend) follow OS semantics—case-insensitive on Windows/macOS, case-sensitive on Linux.

Trait is Agnostic

The Fs trait doesn’t enforce filesystem semantics - backends decide their behavior:

#![allow(unused)]
fn main() {
use anyfs::{FileStorage, Fs, FsError, MemoryBackend};
use anyfs_backend::PathResolver;
use std::path::{Path, PathBuf};

// Virtual backends: Linux-like (case-sensitive)
let linux_fs = FileStorage::new(MemoryBackend::new());
linux_fs.write("/Foo.txt", b"x")?;
assert!(linux_fs.exists("/Foo.txt")?);
assert!(!linux_fs.exists("/foo.txt")?);  // different path under case-sensitive semantics

// For case-insensitive behavior, implement a custom PathResolver:
// (Not built-in because real-world demand is minimal - VRootFsBackend on 
// Windows/macOS already gets case-insensitivity from the OS)
struct CaseFoldingResolver;
impl PathResolver for CaseFoldingResolver {
    fn canonicalize(&self, path: &Path, fs: &dyn Fs) -> Result<PathBuf, FsError> {
        // Normalize path components to lowercase during lookup
        todo!()
    }
    
    fn soft_canonicalize(&self, path: &Path, fs: &dyn Fs) -> Result<PathBuf, FsError> {
        // Same but allows non-existent final component
        todo!()
    }
}

let ntfs_like = FileStorage::with_resolver(
    MemoryBackend::new(),
    CaseFoldingResolver  // User-implemented
);
}

FUSE Mount: Report What You Support

When mounting, the FUSE layer reports backend capabilities to the OS:

#![allow(unused)]
fn main() {
impl<B: FsFuse> FuseOps for AnyFsFuse<B> {
    fn get_volume_params(&self) -> VolumeParams {
        VolumeParams {
            case_sensitive: self.backend.is_case_sensitive(),
            supports_hard_links: /* check if B: FsLink */,
            supports_symlinks: /* check if B: FsLink */,
            // ...
        }
    }
}
}

Windows respects these flags - a case-sensitive mounted filesystem works correctly (modern Windows/WSL handle this).

Illustrative: Custom Middleware for Windows Compatibility

For users who need Windows-safe paths in virtual backends, here are example middleware patterns (not built-in - implement as needed):

#![allow(unused)]
fn main() {
/// Example: Middleware that validates paths are Windows-compatible.
/// Rejects: CON, PRN, NUL, COM1-9, LPT1-9, trailing dots/spaces, ADS.
pub struct NtfsValidation<B> { /* user-implemented */ }

/// Example: Middleware that makes a backend case-insensitive.
/// Stores canonical (lowercase) keys, preserves original case in metadata.
pub struct CaseInsensitive<B> { /* user-implemented */ }
}

Not built-in - these are illustrative patterns for users who need NTFS-like behavior.


Security Model

Security is achieved through composition:

| Concern | Solution |
|---|---|
| Path containment | PathFilter + VRootFsBackend |
| Resource exhaustion | Quota enforces size and count limits |
| Rate limiting | RateLimit prevents abuse |
| Feature restriction | Restrictions blocks permission changes |
| Read-only access | ReadOnly prevents writes |
| Audit trail | Tracing instruments operations |
| Tenant isolation | Separate backend instances |
| Testing | DryRun logs without executing |

Defense in depth: Compose multiple middleware layers for comprehensive security.

AI Agent Sandbox Example

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, PathFilterLayer, RateLimitLayer, TracingLayer};

// Build a secure sandbox for an AI agent
let sandbox = MemoryBackend::new()
    .layer(QuotaLayer::builder()
        .max_total_size(50 * 1024 * 1024)  // 50 MB
        .max_file_size(5 * 1024 * 1024)    // 5 MB per file
        .build())
    .layer(PathFilterLayer::builder()
        .allow("/workspace/**")
        .deny("**/.env")
        .deny("**/secrets/**")
        .build())
    .layer(RateLimitLayer::builder()
        .max_ops(1000)
        .per_second()
        .build())
    .layer(TracingLayer::new());
}

Extension Traits (in anyfs-backend)

The FsExt trait provides convenience methods for any Fs backend:

#![allow(unused)]
fn main() {
/// Extension methods for Fs (auto-implemented for all backends).
pub trait FsExt: Fs {
    /// Check if path is a file.
    fn is_file(&self, path: impl AsRef<Path>) -> Result<bool, FsError> {
        self.metadata(path.as_ref()).map(|m| m.file_type == FileType::File)
    }

    /// Check if path is a directory.
    fn is_dir(&self, path: impl AsRef<Path>) -> Result<bool, FsError> {
        self.metadata(path.as_ref()).map(|m| m.file_type == FileType::Directory)
    }

    // JSON methods require `serde` feature (see below)
    #[cfg(feature = "serde")]
    fn read_json<T: DeserializeOwned>(&self, path: impl AsRef<Path>) -> Result<T, FsError>;
    #[cfg(feature = "serde")]
    fn write_json<T: Serialize>(&self, path: impl AsRef<Path>, value: &T) -> Result<(), FsError>;
}

// Blanket implementation for all Fs backends
impl<B: Fs> FsExt for B {}
}

JSON Methods (feature: serde)

The read_json and write_json methods require the serde feature:

anyfs-backend = { version = "0.1", features = ["serde"] }
#![allow(unused)]
fn main() {
use serde::{Serialize, de::DeserializeOwned};

// Sketch of the default bodies for the feature-gated FsExt methods above.
#[cfg(feature = "serde")]
impl<B: Fs> FsExt for B {
    fn read_json<T: DeserializeOwned>(&self, path: impl AsRef<Path>) -> Result<T, FsError> {
        let bytes = self.read(path.as_ref())?;
        serde_json::from_slice(&bytes).map_err(|e| FsError::Deserialization(e.to_string()))
    }

    fn write_json<T: Serialize>(&self, path: impl AsRef<Path>, value: &T) -> Result<(), FsError> {
        let bytes = serde_json::to_vec(value).map_err(|e| FsError::Serialization(e.to_string()))?;
        self.write(path.as_ref(), &bytes)
    }
}
}

Users can define their own extension traits for domain-specific operations.


Optional Features

Bytes Support (feature: bytes)

For zero-copy efficiency, enable the bytes feature to get Bytes-returning convenience methods on FileStorage:

anyfs = { version = "0.1", features = ["bytes"] }
#![allow(unused)]
fn main() {
use anyfs::{FileStorage, MemoryBackend};
use bytes::Bytes;

let fs = FileStorage::new(MemoryBackend::new());

// With bytes feature, FileStorage provides read_bytes() convenience method
let data: Bytes = fs.read_bytes("/large-file.bin")?;
let slice = data.slice(1000..2000);  // Zero-copy!

// Core trait still uses Vec<u8> for object safety
// read_bytes() wraps the Vec<u8> in Bytes::from()
}

Note: Core traits (FsRead, etc.) always use Vec<u8> for object safety (dyn Fs). The bytes feature adds convenience methods to FileStorage that wrap results in Bytes.

When to use:

  • Large file handling with frequent slicing
  • Network-backed storage
  • Streaming scenarios

Default: Vec<u8> (no extra dependency)


Error Types

FsError includes context for better debugging. It implements std::error::Error via thiserror and uses #[non_exhaustive] for forward compatibility.

#![allow(unused)]
fn main() {
/// Filesystem error with context.
///
/// All variants include enough information for meaningful error messages.
/// Use `#[non_exhaustive]` to allow adding variants in minor versions.
#[non_exhaustive]
#[derive(Debug, thiserror::Error)]
pub enum FsError {
    // ========================================================================
    // Path/File Errors
    // ========================================================================

    /// Path not found.
    #[error("not found: {}", .path.display())]
    NotFound {
        path: PathBuf,
    },

    /// Circular symlink detected during path resolution.
    #[error("symlink loop detected: {}", .path.display())]
    SymlinkLoop {
        path: PathBuf,
    },

    /// Security threat detected (e.g., virus).
    /// Note: This variant supports the Antivirus middleware example.
    /// Custom middleware can use this or define domain-specific error types.
    #[error("threat detected: {reason} in {}", .path.display())]
    ThreatDetected {
        path: PathBuf,
        reason: String,
    },

    /// Path already exists.
    #[error("{operation}: already exists: {}", .path.display())]
    AlreadyExists {
        path: PathBuf,
        operation: &'static str,
    },

    /// Expected a file, found directory.
    #[error("not a file: {}", .path.display())]
    NotAFile { path: PathBuf },

    /// Expected a directory, found file.
    #[error("not a directory: {}", .path.display())]
    NotADirectory { path: PathBuf },

    /// Directory not empty (for remove_dir).
    #[error("directory not empty: {}", .path.display())]
    DirectoryNotEmpty { path: PathBuf },

    // ========================================================================
    // Permission/Access Errors
    // ========================================================================

    /// Permission denied (general filesystem permission error).
    #[error("{operation}: permission denied: {}", .path.display())]
    PermissionDenied {
        path: PathBuf,
        operation: &'static str,
    },

    /// Access denied (from PathFilter or RBAC).
    #[error("access denied: {} ({reason})", .path.display())]
    AccessDenied {
        path: PathBuf,
        reason: String,  // Dynamic reason string
    },

    /// Read-only filesystem (from ReadOnly middleware).
    #[error("{operation}: read-only filesystem: {}", .path.display())]
    ReadOnly {
        path: PathBuf,
        operation: &'static str,
    },

    /// Feature not enabled (from Restrictions middleware).
    /// Note: Symlink/hard-link capability is determined by trait bounds (FsLink),
    /// not middleware. Restrictions only controls "permissions".
    #[error("{operation}: feature '{feature}' not enabled: {}", .path.display())]
    FeatureNotEnabled {
        path: PathBuf,
        feature: &'static str,  // "permissions"
        operation: &'static str,
    },

    // ========================================================================
    // Resource Limit Errors
    // ========================================================================

    /// Quota exceeded (total storage).
    QuotaExceeded {
        path: PathBuf,
        limit: u64,
        requested: u64,
        usage: u64,
    },

    /// File size limit exceeded.
    FileSizeExceeded {
        path: PathBuf,
        size: u64,
        limit: u64,
    },

    /// Rate limit exceeded (from RateLimit middleware).
    RateLimitExceeded {
        path: PathBuf,
        limit: u32,
        window_secs: u64,
    },

    // ========================================================================
    // Data Errors
    // ========================================================================

    /// Invalid data (e.g., not valid UTF-8 when string expected).
    InvalidData {
        path: PathBuf,
        details: String,
    },

    /// Corrupted data (e.g., failed checksum, parse error).
    CorruptedData {
        path: PathBuf,
        details: String,
    },

    /// Data integrity verification failed (AEAD tag mismatch, HMAC failure).
    IntegrityError {
        path: PathBuf,
    },

    /// Serialization error (from FsExt JSON methods).
    Serialization(String),

    /// Deserialization error (from FsExt JSON methods).
    Deserialization(String),

    // ========================================================================
    // Backend/Operation Errors
    // ========================================================================

    /// Operation not supported by this backend.
    NotSupported {
        operation: &'static str,
    },

    /// Invalid password or encryption key (from SqliteBackend with encryption).
    InvalidPassword,

    /// Conflict during sync (from offline mode).
    Conflict {
        path: PathBuf,
    },

    /// Backend-specific error (catch-all for custom backends).
    Backend {
        message: String,
    },

    /// I/O error wrapper.
    Io {
        operation: &'static str,
        path: PathBuf,
        source: std::io::Error,
    },
}

// Required implementations
impl From<std::io::Error> for FsError {
    fn from(err: std::io::Error) -> Self {
        FsError::Io {
            operation: "io",
            path: PathBuf::new(),
            source: err,
        }
    }
}
}

Implementation notes:

  • All variants have #[error("...")] attributes (shown for the first few variants, omitted on the rest for brevity)
  • #[non_exhaustive] allows adding variants in minor versions without breaking changes
  • From<std::io::Error> enables ? operator with std::io functions
  • Consider #[must_use] on functions returning Result<_, FsError>

Cross-Platform Compatibility

AnyFS is designed for cross-platform use. Virtual backends work everywhere; real filesystem backends have platform considerations.

Backend Compatibility

| Backend | Windows | Linux | macOS | WASM |
|---|---|---|---|---|
| MemoryBackend | ✅ | ✅ | ✅ | ✅ |
| SqliteBackend | ✅ | ✅ | ✅ | ✅* |
| IndexedBackend | ✅ | ✅ | ✅ | ❌ |
| StdFsBackend | ✅ | ✅ | ✅ | ❌ |
| VRootFsBackend | ✅ | ✅ | ✅ | ❌ |

*SqliteBackend on WASM requires a wasm32 build of rusqlite with bundled SQLite. The encryption feature is not available on WASM.

Feature Compatibility

| Feature | Virtual Backends | VRootFsBackend |
|---|---|---|
| Basic I/O (Fs) | ✅ All platforms | ✅ All platforms |
| Symlinks | ✅ All platforms | Platform-dependent (see below) |
| Hard links | ✅ All platforms | Platform-dependent |
| Permissions | ✅ Stored as metadata | Platform-dependent |
| Extended attributes | ✅ Stored as metadata | Platform-dependent |
| FUSE mounting | N/A | Platform-dependent |

Platform-Specific Notes

Virtual Backends (MemoryBackend, SqliteBackend)

Fully cross-platform. All features work identically everywhere because:

  • Paths are just strings/keys - no OS path resolution
  • Symlinks are stored data, not OS constructs
  • Permissions are metadata, not enforced by OS
  • No filesystem syscalls involved
#![allow(unused)]
fn main() {
// This works identically on Windows, Linux, macOS, and WASM
let fs = FileStorage::new(MemoryBackend::new());
fs.symlink("/target", "/link")?;                          // Just stores the link
fs.set_permissions("/file", Permissions::from_mode(0o755))?; // Just stores metadata
}

VRootFsBackend (Real Filesystem)

Wraps the host filesystem. Platform differences apply:

| Feature | Linux | macOS | Windows |
|---|---|---|---|
| Symlinks | ✅ | ✅ | ⚠️ Requires privileges* |
| Hard links | ✅ | ✅ | ✅ (NTFS only) |
| Permissions (mode bits) | ✅ | ✅ | ⚠️ Limited mapping |
| Extended attributes | ✅ xattr | ✅ xattr | ⚠️ ADS (different API) |
| Case sensitivity | ✅ Sensitive | ⚠️ Default insensitive | ⚠️ Insensitive |

*Windows requires SeCreateSymbolicLinkPrivilege or Developer Mode for symlinks.

FUSE Mounting

| Platform | Support | Library |
|---|---|---|
| Linux | ✅ Native | libfuse |
| macOS | ⚠️ Third-party | macFUSE |
| Windows | ⚠️ Third-party | WinFsp or Dokan |
| WASM | ❌ | N/A |

Path Handling

Virtual backends use / as separator internally, regardless of platform:

#![allow(unused)]
fn main() {
// Always use forward slashes with virtual backends
fs.write("/project/src/main.rs", code)?;  // Works everywhere
}

VRootFsBackend translates to native paths internally:

  • Linux/macOS: / stays /
  • Windows: /project/file.txt → C:\root\project\file.txt
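The translation step amounts to joining the virtual path (minus its leading `/`) under the configured root. A minimal sketch, with escape prevention deliberately left out because that is the job of strict-path::VirtualRoot, not this helper:

```rust
use std::path::{Path, PathBuf};

/// Sketch of VRootFsBackend's path translation: strip the virtual root's
/// leading `/` and re-anchor under the chosen native root directory.
/// (Containment checks are omitted; strict-path handles escapes.)
pub fn to_native(root: &Path, virtual_path: &str) -> PathBuf {
    root.join(virtual_path.trim_start_matches('/'))
}

fn main() {
    // On Windows the same logic yields a path under e.g. C:\root.
    let native = to_native(Path::new("/srv/root"), "/project/file.txt");
    assert_eq!(native, PathBuf::from("/srv/root/project/file.txt"));
}
```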

Recommendations

| Use Case | Recommended Backend | Why |
|---|---|---|
| Cross-platform app | MemoryBackend or SqliteBackend | No platform differences |
| Portable storage | SqliteBackend | Single file, works everywhere |
| WASM/browser | MemoryBackend or SqliteBackend | No filesystem access needed |
| Host filesystem access | VRootFsBackend | With awareness of platform limits |
| Testing | MemoryBackend | Fast, no cleanup, deterministic |

Feature Detection

Check platform capabilities at runtime if needed:

#![allow(unused)]
fn main() {
/// Check if symlinks are supported on the current platform.
pub fn symlinks_available() -> bool {
    #[cfg(unix)]
    { true }

    #[cfg(windows)]
    {
        // Check for Developer Mode or symlink privilege
        // ...
        false // conservative placeholder until the privilege check is implemented
    }

    #[cfg(not(any(unix, windows)))]
    { false }
}
}

On platforms without symlink support, use a backend that doesn’t implement FsLink, or check symlinks_available() before calling symlink operations.

Layered Design: Backends + Middleware + Ergonomics

AnyFS uses a layered architecture that separates concerns:

  1. Backends: Pure storage + filesystem semantics
  2. Middleware: Composable policy layers
  3. FileStorage: Ergonomic wrapper

Architecture

┌─────────────────────────────────────────┐
│  FileStorage<B>                         │  ← Ergonomics only
├─────────────────────────────────────────┤
│  Middleware Stack (composable):         │  ← Policy enforcement
│    Tracing → PathFilter → Restrictions  │
│    → Quota → Backend                    │
├─────────────────────────────────────────┤
│  Fs                                     │  ← Pure storage
│  (Memory, SQLite, VRootFs, custom)      │
└─────────────────────────────────────────┘

Layer Responsibilities

| Layer | Responsibility | Path Handling |
|---|---|---|
| FileStorage | Ergonomic API + path resolution | Accepts impl AsRef<Path>; resolves paths via pluggable PathResolver |
| Middleware | Policy enforcement | &Path (object-safe core traits) |
| Backend | Storage + FS semantics | &Path (object-safe core traits) |

Core traits use &Path for object safety; FileStorage/FsExt provide impl AsRef<Path> ergonomics. Path resolution is pluggable via PathResolver trait (see ADR-033). Backends that wrap a real filesystem implement SelfResolving so FileStorage can skip resolution.
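The exact shape of PathResolver lives in ADR-033; as an assumption-labeled sketch, it reduces to "user path in, canonical path out". The resolver below is hypothetical and only normalizes `.` and `..` lexically, whereas the real IterativeResolver also follows symlinks when FsLink is available:

```rust
use std::path::{Component, Path, PathBuf};

// Hypothetical error stand-in; the real FsError lives in anyfs-backend.
#[derive(Debug)]
pub struct FsError;

/// Assumed shape of the pluggable resolver (ADR-033): turn a user-supplied
/// path into the canonical path the backend should operate on.
pub trait PathResolver {
    fn resolve(&self, path: &Path) -> Result<PathBuf, FsError>;
}

/// Toy resolver: lexical normalization only, no symlink following.
pub struct LexicalResolver;

impl PathResolver for LexicalResolver {
    fn resolve(&self, path: &Path) -> Result<PathBuf, FsError> {
        let mut out = PathBuf::from("/");
        for comp in path.components() {
            match comp {
                Component::RootDir | Component::CurDir | Component::Prefix(_) => {}
                Component::ParentDir => {
                    out.pop(); // `..` at the root is a no-op
                }
                Component::Normal(seg) => out.push(seg),
            }
        }
        Ok(out)
    }
}

fn main() {
    let r = LexicalResolver;
    assert_eq!(r.resolve(Path::new("/a/./b/../c")).unwrap(), PathBuf::from("/a/c"));
}
```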


Policy via Middleware

Old design (rejected): FileStorage contained quota/feature logic.

Current design: Policy is handled by composable middleware:

#![allow(unused)]
fn main() {
// Middleware enforces policy
let backend = MemoryBackend::new()
    .layer(QuotaLayer::builder()
        .max_total_size(100 * 1024 * 1024)
        .build())
    .layer(PathFilterLayer::builder()
        .allow("/workspace/**")
        .build())
    .layer(TracingLayer::new());

// FileStorage is ergonomics + path resolution (no policy)
let fs = FileStorage::new(backend);
}

Path Containment

For VRootFsBackend (real filesystem), path containment uses strict-path::VirtualRoot internally:

#![allow(unused)]
fn main() {
// VRootFsBackend implements FsRead, FsWrite, FsDir (and thus Fs via blanket impl)
impl FsRead for VRootFsBackend {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        // VirtualRoot ensures paths can't escape
        let safe_path = self.root.join(path)?;
        std::fs::read(safe_path).map_err(Into::into)
    }
}
}

For virtual backends (Memory, SQLite), paths are just keys - no OS path traversal possible. FileStorage performs symlink-aware resolution for these backends so normalization is consistent across virtual implementations.

For sandboxing across all backends, use PathFilter middleware:

#![allow(unused)]
fn main() {
PathFilterLayer::builder()
    .allow("/workspace/**")
    .deny("**/.env")
    .build()
    .layer(backend)
}

Why This Matters

  • Separation of concerns: Backends focus on storage, middleware handles policy
  • Composability: Add/remove policies without touching storage code
  • Flexibility: Same middleware works with any backend
  • Simplicity: Each layer has one job

AnyFS - Architecture Decision Records

This file captures the decisions for the current AnyFS design.


Decision Map

Primary docs are where each decision is explained in narrative form. ADRs remain the source of truth for the decision itself.


ADR Index

| ADR | Title | Status |
|---|---|---|
| ADR-001 | Path-based Fs trait | Accepted |
| ADR-002 | Two-crate structure | Accepted |
| ADR-003 | Object-safe path parameters | Accepted |
| ADR-004 | Tower-style middleware pattern | Accepted |
| ADR-005 | std::fs-aligned method names | Accepted |
| ADR-006 | Quota for quota enforcement | Accepted |
| ADR-007 | Restrictions for least-privilege | Accepted |
| ADR-008 | FileStorage as thin ergonomic wrapper | Accepted |
| ADR-009 | Built-in backends are feature-gated | Accepted |
| ADR-010 | Sync-first, async-ready design | Accepted |
| ADR-011 | Layer trait for standardized composition | Accepted |
| ADR-012 | Tracing for instrumentation | Accepted |
| ADR-013 | FsExt for extension methods | Accepted |
| ADR-014 | Optional Bytes support | Accepted |
| ADR-015 | Contextual FsError | Accepted |
| ADR-016 | PathFilter for path-based access control | Accepted |
| ADR-017 | ReadOnly for preventing writes | Accepted |
| ADR-018 | RateLimit for operation throttling | Accepted |
| ADR-019 | DryRun for testing and debugging | Accepted |
| ADR-020 | Cache for read performance | Accepted |
| ADR-021 | Overlay for union filesystem | Accepted |
| ADR-022 | Builder pattern for configurable middleware | Accepted |
| ADR-023 | Interior mutability for all trait methods | Accepted |
| ADR-024 | Async Strategy | Accepted |
| ADR-025 | Strategic Boxing (Tower-style) | Accepted |
| ADR-026 | Companion shell (anyfs-shell) | Accepted (Future) |
| ADR-027 | Permissive core; security via middleware | Accepted |
| ADR-028 | Linux-like semantics for virtual backends | Accepted |
| ADR-029 | Path resolution in FileStorage | Accepted |
| ADR-030 | Layered trait hierarchy | Accepted |
| ADR-031 | Indexing as middleware | Accepted (Future) |
| ADR-032 | Path Canonicalization via FsPath Trait | Accepted |
| ADR-033 | PathResolver Trait for Pluggable Resolution | Accepted |
| ADR-034 | LLM-Oriented Architecture (LOA) | Accepted |

ADR-001: Path-based Fs trait

Decision: Backends implement a path-based trait aligned with std::fs method naming.

Why: Filesystem operations are naturally path-oriented; a single, familiar trait surface is easier to implement and adopt than graph-store or inode models.


ADR-002: Two-crate structure

Decision:

| Crate | Purpose |
|---|---|
| anyfs-backend | Minimal contract: Fs trait, Layer trait, FsExt, types |
| anyfs | Backends + middleware + ergonomics (FileStorage<B>, BackendStack) |

Why:

  • Backend authors only need anyfs-backend (no heavy dependencies).
  • Middleware is composable and lives with backends in anyfs.
  • FileStorage provides ergonomics plus centralized path resolution for virtual backends - no policy logic - included in anyfs for convenience.

ADR-003: Object-safe path parameters

Decision: Core Fs traits take &Path so they remain object-safe (dyn Fs works). For ergonomics, FileStorage and FsExt accept impl AsRef<Path> and forward to the core traits.

Why:

  • Object safety enables opt-in type erasure (FileStorage::boxed()).
  • Keeps hot-path calls zero-cost; dynamic dispatch is explicit and optional.
  • Ergonomics preserved via FileStorage/FsExt (&str, String, PathBuf).
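The trade-off can be shown in miniature. The trait and helper below are stand-ins, not the real anyfs types: `&Path` parameters keep the trait object-safe (so `Box<dyn Fs>` works), while a generic helper layers `impl AsRef<Path>` ergonomics back on top:

```rust
use std::path::Path;

/// Miniature stand-in for the core trait: &Path keeps it object-safe.
pub trait Fs {
    fn read(&self, path: &Path) -> Result<Vec<u8>, String>;
}

/// Trivial backend that returns empty contents for any path.
pub struct Null;

impl Fs for Null {
    fn read(&self, _path: &Path) -> Result<Vec<u8>, String> {
        Ok(Vec::new())
    }
}

/// Ergonomic wrapper in the style of FileStorage/FsExt: accepts &str,
/// String, or PathBuf and forwards to the object-safe core method.
pub fn read_any(fs: &dyn Fs, path: impl AsRef<Path>) -> Result<Vec<u8>, String> {
    fs.read(path.as_ref())
}

fn main() {
    // Type erasure works because no method takes `impl AsRef<Path>` directly.
    let boxed: Box<dyn Fs> = Box::new(Null);
    assert_eq!(read_any(boxed.as_ref(), "/any").unwrap(), Vec::<u8>::new());
}
```

Had `read` itself taken `impl AsRef<Path>`, the trait would not be object-safe and `Box<dyn Fs>` would not compile.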

ADR-004: Tower-style middleware pattern

Decision: Use composable middleware (decorator pattern) for cross-cutting concerns like limits, logging, and feature gates. Each middleware implements Fs by wrapping another Fs.

Why:

  • Complete separation of concerns - each layer has one job.
  • Composable - use only what you need.
  • Familiar pattern (Axum/Tower use the same approach).
  • No code duplication - middleware written once, works with any backend.
  • Testable - each layer can be tested in isolation.

Example:

#![allow(unused)]
fn main() {
let backend = SqliteBackend::open("data.db")?
    .layer(QuotaLayer::builder()
        .max_total_size(100 * 1024 * 1024)
        .build())
    .layer(PathFilterLayer::builder()
        .allow("/workspace/**")
        .build())
    .layer(TracingLayer::new());
}

ADR-005: std::fs-aligned method names

Decision: Prefer read_dir, create_dir_all, remove_file, etc.

Why: Familiarity and reduced cognitive overhead.
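The payoff is that code written against std::fs translates almost mechanically. The sequence below uses std::fs itself; with AnyFS the receiver would be a FileStorage value and the method names stay the same (per the examples elsewhere in this document):

```rust
use std::fs;
use std::path::PathBuf;

fn main() {
    // Same call sequence as the AnyFS quick example, but against std::fs.
    let root: PathBuf = std::env::temp_dir().join("anyfs_naming_demo");
    fs::create_dir_all(root.join("data")).unwrap();
    fs::write(root.join("data/file.txt"), b"hello").unwrap();

    let contents = fs::read(root.join("data/file.txt")).unwrap();
    assert_eq!(contents, b"hello");

    fs::remove_file(root.join("data/file.txt")).unwrap();
    fs::remove_dir_all(&root).unwrap();
}
```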


ADR-006: Quota for quota enforcement

Decision: Quota/limit enforcement is handled by Quota<B> middleware, not by backends or FileStorage.

Configuration:

  • with_max_total_size(bytes) - total storage limit
  • with_max_file_size(bytes) - per-file limit
  • with_max_node_count(count) - max files/directories
  • with_max_dir_entries(count) - max entries per directory
  • with_max_path_depth(depth) - max directory nesting

Why:

  • Limits are policy, not storage semantics.
  • Written once, works with any backend.
  • Optional - users who don’t need limits skip this middleware.

Implementation notes:

  • On construction, scan existing backend to initialize usage counters.
  • Wrap open_write streams with CountingWriter to track streamed bytes.
  • Check limits before operations, update usage after successful operations.

ADR-007: Capability via Trait Bounds

Decision: Symlink and hard-link capability is determined by whether the backend implements FsLink. The default PathResolver (IterativeResolver) follows symlinks when FsLink is available.

The Rule

| Backend implements FsLink? | Symlinks work? |
|---|---|
| Yes (B: FsLink) | Yes |
| No | No (won’t compile) |

Examples

#![allow(unused)]
fn main() {
// MemoryBackend implements FsLink
let fs = FileStorage::new(MemoryBackend::new());
fs.symlink("/target", "/link")?;  // ✅ Works

// Custom backend that doesn't implement FsLink
let fs = FileStorage::new(MySimpleBackend::new());
fs.symlink("/target", "/link")?;  // ❌ Won't compile - no FsLink impl
}

Why Not Runtime Blocking?

A hypothetical deny_symlinks() middleware would create type/behavior mismatch:

  • Type says “I implement FsLink”
  • Runtime says “but symlink() errors”

This is confusing and defeats the purpose of Rust’s type system. Instead, symlink capability is determined at compile time by trait bounds.

Restrictions Middleware

Restrictions<B> is limited to operations where runtime policy makes sense:

#![allow(unused)]
fn main() {
let backend = RestrictionsLayer::builder()
    .deny_permissions()  // Prevent metadata changes
    .build()
    .layer(backend);
}

Symlink following: Controlled by the PathResolver. The default IterativeResolver follows symlinks when FsLink is available. Custom resolvers can implement different behaviors. OS-backed backends delegate to the OS (strict-path prevents escapes).


ADR-008: FileStorage as thin ergonomic wrapper

Decision: FileStorage<B> is a thin wrapper that provides std::fs-aligned ergonomics and path resolution for virtual backends. It contains NO policy logic.

Context: Earlier designs used FileStorage<B, R, M> with three type parameters:

  • B - Backend type
  • R - PathResolver type (default: IterativeResolver)
  • M - Marker type for compile-time container differentiation

This was over-engineered. We simplified to a single generic.

Why only one generic parameter?

| Removed | Rationale |
|---|---|
| R (Resolver) | Path resolution is a cold path (once per operation, I/O dominates). Boxing is acceptable per ADR-025. Runtime swapping via with_resolver() is sufficient. |
| M (Marker) | Speculative feature with unclear demand. Prior art (vfs, cap-std, tempfile) don’t have marker parameters. Users who need type safety can create wrapper newtypes: struct SandboxFs(FileStorage<MemoryBackend>). |

What it does:

  • Provides familiar method names
  • Accepts impl AsRef<Path> for convenience and forwards to the core &Path traits
  • Delegates path resolution to a boxed PathResolver (cold path, boxing OK per ADR-025)
  • Delegates all operations to the wrapped backend

What it does NOT do:

  • Quota enforcement (use Quota)
  • Feature gating (use Restrictions)
  • Instrumentation (use Tracing)
  • Marker types (users create wrapper newtypes if needed)
  • Any other policy

Why this design:

  • Single responsibility - ergonomics + path resolution (no policy).
  • One generic parameter keeps the API simple for 90% of users.
  • Resolver is boxed because path resolution is a cold path.
  • Users who need type-safe markers can create their own wrapper types.
  • Policy is composable via middleware, not hardcoded.

User-defined type safety pattern:

#![allow(unused)]
fn main() {
// Instead of FileStorage<_, _, Sandbox>, users create:
struct SandboxFs(FileStorage<MemoryBackend>);
struct UserDataFs(FileStorage<SqliteBackend>);

fn process_sandbox(fs: &SandboxFs) { /* only accepts SandboxFs */ }
}

ADR-009: Simple backends in anyfs, complex backends as ecosystem crates

Decision: Simple backends (MemoryBackend, StdFsBackend, VRootFsBackend) are built into anyfs with feature flags. Complex backends (SqliteBackend, IndexedBackend) live in separate ecosystem crates.

Built-in backends (anyfs features):

  • memory (default)
  • stdfs (optional)
  • vrootfs (optional)

Ecosystem crates:

  • anyfs-sqliteSqliteBackend with optional encryption feature
  • anyfs-indexedIndexedBackend (SQLite metadata + disk blobs)

Why:

  • Simple backends have minimal dependencies (just std)
  • Complex backends need internal runtimes (connection pools, sharding, chunking)
  • Follows Tower/Axum pattern: framework is minimal, complex implementations in their own crates
  • Reduces compile time and binary size for users who don’t need complex backends

ADR-010: Sync-first, async-ready design

Decision: Fs traits are synchronous. The API is designed to allow adding AsyncFs later without breaking changes.

Rationale:

  • Built-in backends are naturally synchronous:
    • MemoryBackend - in-memory, instant
    • StdFsBackend / VRootFsBackend - std::fs is sync
  • Ecosystem backends are also sync (e.g., SqliteBackend uses rusqlite which is sync)
  • Sync is simpler - no runtime dependency (tokio/async-std)
  • Users can wrap sync backends in spawn_blocking if needed

Async-ready design principles:

  • Traits require Send - compatible with async executors
  • Return types are Result<T, FsError> - works with async
  • No internal blocking assumptions
  • Methods are stateless per-call - no hidden blocking state

Future async path (Option 2): When async is needed (e.g., network-backed storage), add a parallel trait:

#![allow(unused)]
fn main() {
// In anyfs-backend
pub trait AsyncFs: Send + Sync {
    async fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
    async fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
    // ... mirrors Fs with async

    // Streaming uses AsyncRead/AsyncWrite
    async fn open_read(&self, path: &Path)
        -> Result<Box<dyn AsyncRead + Send + Unpin>, FsError>;
}
}

Migration notes:

  • AsyncFs would be a separate trait, not replacing Fs
  • Blanket impl possible: impl<T: Fs> AsyncFs for T using spawn_blocking
  • Middleware would need async variants: AsyncQuota<B>, etc.
  • No breaking changes to existing sync API

Why not async now:

  • Complexity without benefit - all current backends are sync
  • Rust 1.75 makes async traits easy, so adding later is low-cost
  • Better to wait for real async backend requirements

ADR-011: Layer trait for standardized composition

Decision: Provide a Layer trait (inspired by Tower) that standardizes middleware composition.

#![allow(unused)]
fn main() {
pub trait Layer<B: Fs> {
    type Backend: Fs;
    fn layer(self, backend: B) -> Self::Backend;
}
}

Note: For async compatibility, the trait is unbounded: Layer<B> without B: Fs. This allows the same layer types to implement both sync (impl<B: Fs> Layer<B>) and async (impl<B: AsyncFs> Layer<B>). See ADR-024 (Async Strategy).

Why:

  • Standardized composition pattern familiar to Tower/Axum users.
  • IDE autocomplete for available layers.
  • Enables BackendStack fluent builder in anyfs.
  • Each middleware provides a corresponding *Layer type.

Example:

#![allow(unused)]
fn main() {
// SqliteBackend from anyfs-sqlite crate
let backend = SqliteBackend::open("data.db")?
    .layer(QuotaLayer::builder()
        .max_total_size(100_000)
        .build())
    .layer(TracingLayer::new());
}

ADR-012: Tracing for instrumentation

Decision: Use Tracing<B> integrated with the tracing ecosystem instead of a custom logging solution.

Why:

  • Works with existing tracing infrastructure (tracing-subscriber, OpenTelemetry, Jaeger).
  • Structured logging with spans for each operation.
  • Users choose their subscriber - no logging framework lock-in.
  • Consistent with modern Rust ecosystem practices.

Configuration:

#![allow(unused)]
fn main() {
backend.layer(TracingLayer::new()
    .with_target("anyfs")
    .with_level(tracing::Level::DEBUG))
}

ADR-013: FsExt for extension methods

Decision: Provide FsExt trait with convenience methods, auto-implemented for all backends.

#![allow(unused)]
fn main() {
pub trait FsExt: Fs {
    fn is_file(&self, path: impl AsRef<Path>) -> Result<bool, FsError>;
    fn is_dir(&self, path: impl AsRef<Path>) -> Result<bool, FsError>;

    // JSON methods require `serde` feature
    #[cfg(feature = "serde")]
    fn read_json<T: DeserializeOwned>(&self, path: impl AsRef<Path>) -> Result<T, FsError>;
    #[cfg(feature = "serde")]
    fn write_json<T: Serialize>(&self, path: impl AsRef<Path>, value: &T) -> Result<(), FsError>;
}

impl<B: Fs> FsExt for B {}  // valid because each method above has a default body (elided here)
}

Feature gating:

  • is_file() and is_dir() are always available.
  • read_json() and write_json() require anyfs-backend = { features = ["serde"] }.

Why:

  • Adds convenience without bloating Fs trait.
  • Blanket impl means all backends get these methods for free.
  • Users can define their own extension traits for domain-specific operations.
  • Follows Rust convention (e.g., IteratorExt, StreamExt).
  • Serde is optional - users who don’t need JSON avoid the dependency.

ADR-014: Optional Bytes support

Decision: Support the bytes crate via an optional feature for zero-copy efficiency.

anyfs = { version = "0.1", features = ["bytes"] }

Why:

  • Bytes provides O(1) slicing via reference counting.
  • Beneficial for large file handling, network backends, streaming.
  • Optional - users who don’t need it avoid the dependency.
  • Core traits remain Vec<u8> for simplicity and Send + Sync compliance.

Implementation: The bytes feature adds a convenience method to FileStorage, not a core trait change:

#![allow(unused)]
fn main() {
// In anyfs/src/container.rs (behind `bytes` feature)
impl<B: Fs> FileStorage<B> {
    #[cfg(feature = "bytes")]
    pub fn read_bytes(&self, path: impl AsRef<Path>) -> Result<bytes::Bytes, FsError> {
        Ok(bytes::Bytes::from(self.read(path)?))
    }
}
}

Core traits unchanged: FsRead::read() returns Vec<u8>. The bytes feature only adds ergonomic wrappers.


ADR-015: Contextual FsError

Decision: FsError variants include context for better debugging.

#![allow(unused)]
fn main() {
FsError::NotFound {
    path: PathBuf,
}

FsError::QuotaExceeded {
    limit: u64,
    requested: u64,
    usage: u64,
}
}

Why:

  • Error messages include enough context to understand what failed.
  • No need for separate error context crate (like anyhow) for basic usage.
  • Path is sufficient for NotFound - the call site knows the operation.
  • Quota errors include all relevant numbers for debugging.

ADR-016: PathFilter for path-based access control

Decision: Provide PathFilter<B> middleware for glob-based path access control.

Configuration:

#![allow(unused)]
fn main() {
PathFilterLayer::builder()
    .allow("/workspace/**")    // Allow workspace access
    .deny("**/.env")           // Deny .env files anywhere
    .deny("**/secrets/**")     // Deny secrets directories
    .build()
    .layer(backend)
}

Semantics:

  • Deny rules are evaluated first and take precedence over allow rules.
  • If path matches any deny rule, access is denied.
  • If path matches an allow rule (and no deny), access is granted.
  • If no rules match, access is denied (deny by default).
  • Uses glob patterns (e.g., ** for recursive, * for single segment).
  • Returns FsError::AccessDenied for denied paths.

Why:

  • Essential for AI agent sandboxing - restrict to specific directories.
  • Prevents access to sensitive files (.env, secrets, credentials).
  • Separate from backend - works with any backend.
  • Inspired by AgentFS and similar AI sandbox patterns.

Implementation notes:

  • Use globset crate for efficient glob pattern matching.
  • read_dir filters out denied entries from results (don’t expose existence of denied files).
  • Check path at operation start, then delegate to inner backend.
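The deny-first decision order can be sketched without glob machinery. The real middleware uses globset patterns; plain prefix/suffix rules stand in here, and the struct name is illustrative:

```rust
/// Minimal model of PathFilter's decision order: deny rules first,
/// then allow rules, then deny-by-default.
pub struct Filter {
    allow_prefixes: Vec<String>,
    deny_suffixes: Vec<String>,
}

impl Filter {
    pub fn is_allowed(&self, path: &str) -> bool {
        // 1. Deny rules are evaluated first and take precedence.
        if self.deny_suffixes.iter().any(|s| path.ends_with(s)) {
            return false;
        }
        // 2. Must match an allow rule; no match means deny by default.
        self.allow_prefixes.iter().any(|p| path.starts_with(p))
    }
}

fn main() {
    let filter = Filter {
        allow_prefixes: vec!["/workspace/".to_string()],
        deny_suffixes: vec!["/.env".to_string()],
    };
    assert!(filter.is_allowed("/workspace/src/main.rs"));
    assert!(!filter.is_allowed("/workspace/.env")); // deny wins over allow
    assert!(!filter.is_allowed("/etc/passwd"));     // no rule matched: denied
}
```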

ADR-017: ReadOnly for preventing writes

Decision: Provide ReadOnly<B> middleware that blocks all write operations.

Usage:

#![allow(unused)]
fn main() {
let readonly_fs = ReadOnly::new(backend);
}

Semantics:

  • All read operations pass through to inner backend.
  • All write operations return FsError::ReadOnly.
  • Simple, no configuration needed.

Why:

  • Safe browsing of container contents without modification risk.
  • Useful for debugging, inspection, auditing.
  • Simpler than configuring Restrictions for read-only use case.

ADR-018: RateLimit for operation throttling

Decision: Provide RateLimit<B> middleware to limit operations per time window.

Configuration:

#![allow(unused)]
fn main() {
RateLimitLayer::builder()
    .max_ops(1000)
    .per_second()
    .build()
    .layer(backend)
}

Semantics:

  • Tracks operation count in fixed time window (simpler than sliding window, sufficient for most use cases).
  • Returns FsError::RateLimitExceeded when limit exceeded.
  • Counter resets when window expires.

Why:

  • Protects against runaway processes consuming resources.
  • Essential for multi-tenant environments.
  • Prevents denial-of-service from misbehaving code.

Implementation notes:

  • Use std::time::Instant for timing.
  • Store window start time and counter; reset when window expires.
  • Count operation calls (including open_read/open_write), not bytes transferred.
  • Return error immediately when limit exceeded (no blocking/waiting).
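The fixed-window semantics above reduce to a counter plus a window start time. Names here (FixedWindow, try_acquire) are illustrative, not the middleware's API:

```rust
use std::time::{Duration, Instant};

/// Fixed-window counter: reset when the window expires, reject at the limit.
pub struct FixedWindow {
    limit: u32,
    window: Duration,
    window_start: Instant,
    count: u32,
}

impl FixedWindow {
    pub fn new(limit: u32, window: Duration) -> Self {
        Self { limit, window, window_start: Instant::now(), count: 0 }
    }

    /// Returns true if the operation is admitted, false if rate-limited
    /// (the middleware would map false to FsError::RateLimitExceeded).
    pub fn try_acquire(&mut self) -> bool {
        if self.window_start.elapsed() >= self.window {
            self.window_start = Instant::now(); // window expired: reset counter
            self.count = 0;
        }
        if self.count >= self.limit {
            return false; // no blocking or waiting, fail immediately
        }
        self.count += 1;
        true
    }
}

fn main() {
    let mut rl = FixedWindow::new(2, Duration::from_secs(60));
    assert!(rl.try_acquire());
    assert!(rl.try_acquire());
    assert!(!rl.try_acquire()); // third call in the same window is rejected
}
```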

ADR-019: DryRun for testing and debugging

Decision: Provide DryRun<B> middleware that logs write operations without executing them.

Usage:

#![allow(unused)]
fn main() {
let dry_run = DryRun::new(backend);
let fs = FileStorage::new(dry_run);

fs.write("/test.txt", b"hello")?;  // Logged but not written
// To inspect recorded operations, keep the DryRun handle before wrapping it.
}

Semantics:

  • Read operations execute normally against inner backend.
  • Write operations are logged but return Ok(()) without executing.
  • Operations log can be inspected for verification.

Why:

  • Test code paths without side effects.
  • Debug complex operation sequences.
  • Audit what would happen before committing.

Implementation notes:

  • Read operations delegate to inner backend (test against real state).
  • Write operations log and return Ok(()) without executing.
  • open_write returns std::io::sink() - writes are discarded.
  • Useful for: “What would this code do?” not “Run this in isolation.”
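The operations log at the heart of DryRun can be sketched as a small interior-mutability cell (consistent with ADR-023's &self methods). The shape is an assumption; the real middleware wraps a full backend and records every write-path operation:

```rust
use std::cell::RefCell;

/// Records write operations instead of executing them, so tests can
/// assert on what *would* have happened.
#[derive(Default)]
pub struct DryRunLog {
    ops: RefCell<Vec<String>>,
}

impl DryRunLog {
    /// Called where the middleware intercepts a write: log and do nothing else.
    pub fn record_write(&self, path: &str, len: usize) {
        self.ops.borrow_mut().push(format!("write {path} ({len} bytes)"));
    }

    /// Inspect the recorded operations for verification.
    pub fn operations(&self) -> Vec<String> {
        self.ops.borrow().clone()
    }
}

fn main() {
    let log = DryRunLog::default();
    log.record_write("/test.txt", 5); // logged, nothing written anywhere
    assert_eq!(log.operations(), vec!["write /test.txt (5 bytes)".to_string()]);
}
```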

ADR-020: Cache for read performance

Decision: Provide Cache<B> middleware with LRU caching for read operations.

Configuration:

#![allow(unused)]
fn main() {
CacheLayer::builder()
    .max_entries(1000)
    .max_entry_size(1024 * 1024)  // 1MB max per entry
    .build()
    .layer(backend)
}

Semantics:

  • Read operations check cache first, populate on miss.
  • Write operations invalidate relevant cache entries.
  • LRU eviction when max entries exceeded.

Why:

  • Improves performance for repeated reads.
  • Reduces load on underlying backend (especially for SQLite/network).
  • Configurable to balance memory vs performance.

Implementation notes:

  • Cache bulk reads only: read(), read_to_string(), read_range(), metadata(), exists().
  • Do NOT cache open_read() - streams are for large files that shouldn’t be cached.
  • Invalidate cache entry on any write to that path.
  • Use lru crate or similar for LRU eviction.
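The read-through and invalidate-on-write behavior can be sketched with a plain HashMap. LRU eviction is deliberately omitted (the real middleware would bound entries, e.g. via the lru crate), and the names are illustrative:

```rust
use std::collections::HashMap;

/// Read-through cache: populate on miss, invalidate on write.
#[derive(Default)]
pub struct ReadCache {
    entries: HashMap<String, Vec<u8>>,
}

impl ReadCache {
    /// Return the cached value, or compute and store it via `load`.
    pub fn read_through<F>(&mut self, path: &str, load: F) -> Vec<u8>
    where
        F: FnOnce() -> Vec<u8>,
    {
        self.entries
            .entry(path.to_string())
            .or_insert_with(load)
            .clone()
    }

    /// Called on any write to `path`: drop the now-stale entry.
    pub fn invalidate(&mut self, path: &str) {
        self.entries.remove(path);
    }
}

fn main() {
    let mut cache = ReadCache::default();
    assert_eq!(cache.read_through("/a.txt", || b"v1".to_vec()), b"v1");
    // Cache hit: the loader closure is never invoked.
    assert_eq!(cache.read_through("/a.txt", || unreachable!()), b"v1");
    cache.invalidate("/a.txt"); // a write happened
    assert_eq!(cache.read_through("/a.txt", || b"v2".to_vec()), b"v2");
}
```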

ADR-021: Overlay for union filesystem

Decision: Provide Overlay<B1, B2> middleware for copy-on-write layered filesystems.

Usage:

#![allow(unused)]
fn main() {
// SqliteBackend from anyfs-sqlite crate
let base = SqliteBackend::open("base.db")?;  // Read-only base
let upper = MemoryBackend::new();             // Writable upper layer

let overlay = Overlay::new(base, upper);
}

Semantics:

  • Read: check upper layer first, fall back to base if not found.
  • Write: always to upper layer (copy-on-write).
  • Delete: create whiteout marker in upper layer (file appears deleted but base unchanged).
  • Directory listing: merge results from both layers.

Why:

  • Docker-like layered filesystem for containers.
  • Base image with per-instance modifications.
  • Testing with isolated changes over shared baseline.
  • Inspired by OverlayFS and VFS crate patterns.

Implementation notes:

  • Whiteout convention: .wh.<filename> marks deleted files from base layer.
  • read_dir must merge results from both layers, excluding whiteouts and whited-out files.
  • exists checks upper first, then base (respecting whiteouts).
  • All writes go to upper layer; base is never modified.
  • Consider opaque directories (.wh..wh..opq) to hide entire base directories.
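The whiteout convention above is mostly string manipulation: compute the `.wh.<filename>` marker for a deleted path, and recognize markers when merging directory listings. A small sketch (helper names are illustrative):

```rust
use std::path::{Path, PathBuf};

/// Compute the whiteout marker path for a file deleted from the base layer.
pub fn whiteout_path(path: &Path) -> Option<PathBuf> {
    let name = path.file_name()?.to_str()?;
    Some(path.with_file_name(format!(".wh.{name}")))
}

/// True if a directory entry name is a whiteout marker; read_dir merging
/// must hide both the marker and the file it whites out.
pub fn is_whiteout(name: &str) -> bool {
    name.starts_with(".wh.")
}

fn main() {
    let wh = whiteout_path(Path::new("/data/old.txt")).unwrap();
    assert_eq!(wh, PathBuf::from("/data/.wh.old.txt"));
    assert!(is_whiteout(".wh.old.txt"));
    assert!(!is_whiteout("old.txt"));
}
```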

ADR-022: Builder pattern for configurable middleware

Decision: Middleware that requires configuration MUST use a builder pattern that prevents construction without meaningful values. ::new() constructors are NOT allowed for middleware where a default configuration is nonsensical.

Problem: A constructor like QuotaLayer::new() raises the question: “What quota?” An unlimited quota is pointless - you wouldn’t use QuotaLayer at all. Similarly, RestrictionsLayer::new() with no restrictions, PathFilterLayer::new() with no rules, and RateLimitLayer::new() with no rate limit are all nonsensical.

Solution: Use builders that enforce at least one meaningful configuration:

#![allow(unused)]
fn main() {
// QuotaLayer - requires at least one limit
let quota = QuotaLayer::builder()
    .max_total_size(100 * 1024 * 1024)
    .build();

// Can also set multiple limits
let quota = QuotaLayer::builder()
    .max_total_size(1_000_000)
    .max_file_size(100_000)
    .max_node_count(1000)
    .build();

// RestrictionsLayer - requires at least one restriction
let restrictions = RestrictionsLayer::builder()
    .deny_permissions()
    .build();

// PathFilterLayer - requires at least one rule
let filter = PathFilterLayer::builder()
    .allow("/workspace/**")
    .deny("**/.env")
    .build();

// RateLimitLayer - requires rate limit parameters
let rate_limit = RateLimitLayer::builder()
    .max_ops(1000)
    .per_second()
    .build();

// CacheLayer - requires cache configuration
let cache = CacheLayer::builder()
    .max_entries(1000)
    .build();
}

Middleware that MAY keep ::new():

| Middleware | Rationale |
|---|---|
| TracingLayer | Default (global tracing subscriber) is meaningful |
| ReadOnlyLayer | No configuration needed |
| DryRunLayer | No configuration needed |
| OverlayLayer | Takes two backends as required params: Overlay::new(lower, upper) |

Implementation:

#![allow(unused)]
fn main() {
// Builder with typestate pattern for compile-time enforcement
pub struct QuotaLayerBuilder<State = Unconfigured> {
    max_total_size: Option<u64>,
    max_file_size: Option<u64>,
    max_node_count: Option<u64>,
    _state: PhantomData<State>,
}

pub struct Unconfigured;
pub struct Configured;

impl QuotaLayerBuilder<Unconfigured> {
    pub fn max_total_size(mut self, bytes: u64) -> QuotaLayerBuilder<Configured> {
        self.max_total_size = Some(bytes);
        QuotaLayerBuilder {
            max_total_size: self.max_total_size,
            max_file_size: self.max_file_size,
            max_node_count: self.max_node_count,
            _state: PhantomData,
        }
    }

    pub fn max_file_size(mut self, bytes: u64) -> QuotaLayerBuilder<Configured> {
        // Similar transition to Configured state
    }

    pub fn max_node_count(mut self, count: u64) -> QuotaLayerBuilder<Configured> {
        // Similar transition to Configured state
    }

    // Note: NO build() method on Unconfigured state!
}

impl QuotaLayerBuilder<Configured> {
    // Additional configuration methods stay in Configured state
    pub fn max_total_size(mut self, bytes: u64) -> Self {
        self.max_total_size = Some(bytes);
        self
    }

    // Only Configured state has build()
    pub fn build(self) -> QuotaLayer {
        QuotaLayer { /* ... */ }
    }
}

impl QuotaLayer {
    pub fn builder() -> QuotaLayerBuilder<Unconfigured> {
        QuotaLayerBuilder {
            max_total_size: None,
            max_file_size: None,
            max_node_count: None,
            _state: PhantomData,
        }
    }
}
}

Why:

  • Compile-time safety: Invalid configurations don’t compile.
  • Self-documenting API: Users must explicitly choose configuration.
  • No meaningless defaults: Eliminates “what does this default to?” confusion.
  • IDE guidance: Autocomplete shows required methods before build().
  • Familiar pattern: Rust builders are idiomatic and widely understood.

Error prevention:

#![allow(unused)]
fn main() {
// This won't compile - no build() on Unconfigured
let quota = QuotaLayer::builder().build();  // ❌ Error!

// This compiles - at least one limit set
let quota = QuotaLayer::builder()
    .max_total_size(1_000_000)
    .build();  // ✅ OK
}

ADR-023: Interior mutability for all trait methods

Decision: All Fs trait methods use &self, not &mut self. Backends manage their own synchronization internally (interior mutability).

Previous design:

#![allow(unused)]
fn main() {
pub trait FsRead: Send {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
}

pub trait FsWrite: Send {
    fn write(&mut self, path: &Path, data: &[u8]) -> Result<(), FsError>;
}
}

New design:

#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
}

pub trait FsWrite: Send + Sync {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
}
}

Why:

  1. Filesystems are conceptually always mutable. A filesystem doesn’t become “borrowed” when you write to it - the underlying storage manages concurrency itself.

  2. Enables concurrent access patterns. With &mut self, you cannot have concurrent readers and writers even when the backend supports it (e.g., SQLite with WAL mode, real filesystems).

  3. Matches real-world filesystem semantics. std::fs::write() takes a path, not a mutable reference to some filesystem object. Files are shared resources.

  4. Simplifies middleware implementation. Middleware no longer needs to worry about propagating mutability - all operations use &self.

  5. Common pattern in Rust. Many I/O abstractions use interior mutability: std::io::Write for File (via OS handles), tokio::fs, database connection pools, etc.

Implementation:

Backends use appropriate synchronization primitives:

#![allow(unused)]
fn main() {
pub struct MemoryBackend {
    // Interior mutability via Mutex/RwLock
    data: RwLock<HashMap<PathBuf, Vec<u8>>>,
}

impl FsWrite for MemoryBackend {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        let mut guard = self.data.write().unwrap();
        guard.insert(path.to_path_buf(), data.to_vec());
        Ok(())
    }
}

pub struct SqliteBackend {
    // rusqlite::Connection is Send but not Sync, so guard it for shared
    // access; SQLite itself serializes conflicting writes internally.
    conn: Mutex<Connection>,
}
}
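
To make the benefit concrete, here is a self-contained sketch (std types only, names illustrative; the real backend methods return `Result<_, FsError>`) showing why `&self` methods let a backend be shared across threads:

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::{Arc, RwLock};
use std::thread;

// Simplified stand-in for a backend using interior mutability.
struct Backend {
    data: RwLock<HashMap<PathBuf, Vec<u8>>>,
}

impl Backend {
    // `&self`, not `&mut self`: callers only need shared access.
    fn write(&self, path: &Path, data: &[u8]) {
        self.data.write().unwrap().insert(path.to_path_buf(), data.to_vec());
    }
    fn node_count(&self) -> usize {
        self.data.read().unwrap().len()
    }
}

fn concurrent_writes(threads: usize) -> usize {
    let fs = Arc::new(Backend { data: RwLock::new(HashMap::new()) });
    // With `&mut self` methods this sharing would not compile; with `&self`,
    // each thread only needs a clone of the Arc.
    let handles: Vec<_> = (0..threads)
        .map(|i| {
            let fs = Arc::clone(&fs);
            thread::spawn(move || fs.write(Path::new(&format!("/file{i}")), b"hello"))
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    fs.node_count()
}
```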

Trade-offs:

| Aspect | &mut self | &self (interior mutability) |
|---|---|---|
| Compile-time safety | Single writer enforced | Runtime synchronization |
| Concurrent access | Not possible | Backend decides |
| API simplicity | Simple | Slightly more complex backends |
| Real-world match | Poor | Good |

Backend implementer responsibility:

Backends MUST use interior mutability (RwLock, Mutex, etc.) to ensure thread-safe concurrent access. This guarantees:

  • Memory safety (no data corruption)
  • Atomic operations (a single write() won’t produce partial results)

This does NOT guarantee:

  • Order of concurrent writes to the same path (last write wins - standard FS behavior)

Conclusion: The benefits of matching filesystem semantics and enabling concurrent access outweigh the loss of compile-time single-writer enforcement. Backends are responsible for their own thread safety via interior mutability.


ADR-024: Async Strategy

Status: Accepted

Context: Async/await is prevalent in Rust networking and I/O. While AnyFS is primarily sync-focused (matching std::fs), we may need async support in the future for:

  • Network-backed storage (S3, WebDAV, etc.)
  • High-concurrency scenarios
  • Integration with async runtimes (tokio, async-std)

Decision: Plan for a parallel async trait hierarchy that mirrors the sync traits.

Strategy:

Sync Traits          Async Traits
-----------          ------------
FsRead        →      AsyncFsRead
FsWrite       →      AsyncFsWrite
FsDir         →      AsyncFsDir
Fs            →      AsyncFs
FsFull        →      AsyncFsFull
FsFuse        →      AsyncFsFuse
FsPosix       →      AsyncFsPosix

Design principles:

  1. Separate crate: Async traits live in anyfs-async to avoid pulling async dependencies into the core.

  2. Method parity: Each async trait method corresponds 1:1 with its sync counterpart:

    #![allow(unused)]
    fn main() {
    // Sync (anyfs-backend)
    pub trait FsRead: Send + Sync {
        fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
    }
    
    // Async (anyfs-async)
    #[async_trait]
    pub trait AsyncFsRead: Send + Sync {
        async fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
    }
    }
  3. Layer trait compatibility: The Layer trait works for both sync and async:

    #![allow(unused)]
    fn main() {
    pub trait Layer<B> {
        type Backend;
        fn layer(self, backend: B) -> Self::Backend;
    }
    
    // Middleware can implement for both (note: as written, these two blanket
    // impls overlap; a real design needs distinct layer types or markers):
    impl<B: Fs> Layer<B> for QuotaLayer {
        type Backend = Quota<B>;
        fn layer(self, backend: B) -> Self::Backend { ... }
    }
    
    impl<B: AsyncFs> Layer<B> for QuotaLayer {
        type Backend = AsyncQuota<B>;
        fn layer(self, backend: B) -> Self::Backend { ... }
    }
    }
  4. Sync-to-async bridge: Provide adapters for using sync backends in async contexts:

    #![allow(unused)]
    fn main() {
    // Wraps sync backend for use in async code (uses spawn_blocking)
    pub struct SyncToAsync<B>(B);
    
    #[async_trait]
    impl<B: Fs + Clone + 'static> AsyncFsRead for SyncToAsync<B> {
        async fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
            let path = path.to_path_buf();
            let backend = self.0.clone(); // Clone bound lets the closure own a handle
            // `?` on the join result assumes `FsError: From<JoinError>`
            tokio::task::spawn_blocking(move || backend.read(&path)).await?
        }
    }
    }
  5. No async-to-sync bridge: We intentionally don’t provide async-to-sync adapters (would require blocking on async runtime, which is problematic).

Implementation phases:

| Phase | Scope | Dependency |
|---|---|---|
| 1 | Sync traits stable | Now |
| 2 | Design async traits | When needed |
| 3 | anyfs-async crate | When needed |
| 4 | Async middleware | When needed |

Why parallel traits (not feature flags):

  • No conditional compilation complexity - sync and async are separate, clean codebases
  • No trait object issues - async traits have different object safety requirements
  • Clear dependency boundaries - sync code doesn’t pull in tokio/async-std
  • Ecosystem alignment - mirrors how std::io vs tokio::io work

Trade-offs:

| Approach | Pros | Cons |
|---|---|---|
| Parallel traits | Clean separation, no async deps in core | Code duplication in middleware |
| Feature flags | Single codebase | Complex conditional compilation |
| Async-only | Modern, no duplication | Forces async runtime on sync users |
| Sync-only | Simple | Can’t support network backends efficiently |

Conclusion: Parallel async traits provide the best balance of simplicity now (sync-only core) with a clear migration path for async support later. The Layer trait design already accommodates this pattern.


ADR-025: Strategic Boxing (Tower-style)

Status: Accepted

Context: Dynamic dispatch (Box<dyn Trait>) adds heap allocation and vtable indirection. We need to decide where boxing is acceptable vs. where zero-cost abstractions are required.

Decision: Follow Tower/Axum’s battle-tested strategy: zero-cost on the hot path, box at boundaries where flexibility is needed and I/O cost dominates.

Principle: Avoid heap allocations and dynamic dispatch unless they buy real flexibility with negligible performance impact. Box only at cold boundaries (streams/iterators), and make type erasure explicit and opt-in.

DX stance: Application code uses FileStorage/FsExt (std::fs-style paths). Core traits stay object-safe for dyn Fs. For hot loops on known concrete backends, we provide a typed streaming extension as the first-class zero-alloc fast path.

Boxing Strategy

HOT PATH (many calls per operation - must be zero-cost):
┌─────────────────────────────────────────────────────┐
│  read(), write(), metadata(), exists()              │  ← Returns concrete types
│  Read::read() / Write::write() on streams           │  ← Vtable dispatch only
│  Iterator::next() on ReadDirIter                    │  ← Vtable dispatch only
│  Middleware composition                             │  ← Generics, monomorphized
└─────────────────────────────────────────────────────┘

COLD PATH (once per operation - boxing acceptable):
┌─────────────────────────────────────────────────────┐
│  open_read(), open_write()                          │  ← Box<dyn Read/Write>
│  read_dir()                                         │  ← ReadDirIter (boxed inner)
└─────────────────────────────────────────────────────┘

SETUP (once at startup - zero-cost):
┌─────────────────────────────────────────────────────┐
│  Middleware stacking: Quota<Tracing<B>>             │  ← Generics, no boxing
│  FileStorage::new(backend)                          │  ← Zero-cost wrapper
└─────────────────────────────────────────────────────┘

OPT-IN TYPE ERASURE (when explicitly needed):
┌─────────────────────────────────────────────────────────────┐
│  FileStorage::boxed() -> FileStorage<Box<dyn Fs>>          │  ← Like Tower's BoxService
│  (Resolver already boxed internally - this boxes backend)  │
└─────────────────────────────────────────────────────────────┘

What Gets Boxed and Why

| API | Returns | Boxed? | Rationale |
|---|---|---|---|
| read() | Vec<u8> | No | Hot path, most common operation |
| write(data) | () | No | Hot path, most common operation |
| metadata() | Metadata | No | Hot path, frequently called |
| exists() | bool | No | Hot path, frequently called |
| open_read() | Box<dyn Read> | Yes | Cold path (once per file), enables middleware wrappers |
| open_write() | Box<dyn Write> | Yes | Cold path (once per file), enables QuotaWriter |
| read_dir() | ReadDirIter | Yes (inner) | Enables filtering in PathFilter, merging in Overlay |
| Middleware stack | n/a | No | Generics compose at compile time |
| FileStorage::boxed() | n/a | Opt-in | Explicit type erasure when needed |

Why This Works

1. Bulk operations are the common case: Most code uses read() and write(), not streaming. These are zero-cost.

2. Streaming is for large files: open_read() / open_write() are for files too large to load into memory. For large files, I/O time (1-100ms) dwarfs box allocation (~50ns).

3. Box once, vtable many: After open_read() allocates once, subsequent Read::read() calls are just vtable dispatch - no further allocations.

4. Middleware needs flexibility:

  • Quota wraps streams with QuotaWriter to count bytes
  • PathFilter filters ReadDirIter to hide denied entries
  • Overlay merges directory listings from two backends

Boxing enables this without type explosion.
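
The "box once, vtable many" point can be seen with std types alone. In this sketch, `Cursor` stands in for a backend's underlying stream; `open_read` and `drain` are illustrative names, not AnyFS API:

```rust
use std::io::{Cursor, Read};

// Stand-in for a backend's open_read(): one Box allocation on the cold path.
fn open_read(data: Vec<u8>) -> Box<dyn Read + Send> {
    Box::new(Cursor::new(data))
}

fn drain(mut reader: Box<dyn Read + Send>) -> std::io::Result<usize> {
    let mut buf = [0u8; 4096];
    let mut total = 0;
    loop {
        // Each read() is plain vtable dispatch; no further allocation.
        let n = reader.read(&mut buf)?;
        if n == 0 {
            return Ok(total);
        }
        total += n;
    }
}
```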

Comparison to Tower/Axum

| AnyFS | Tower/Axum | Purpose |
|---|---|---|
| Quota<Tracing<B>> | Timeout<RateLimit<S>> | Zero-cost middleware composition |
| Box<dyn Read> | Pin<Box<dyn Future>> | Flexibility at boundaries |
| ReadDirIter | BoxedIntoRoute | Type erasure for storage |
| FileStorage::boxed() | BoxService / BoxCloneService | Opt-in type erasure |

Tower’s Timeout middleware uses Pin<Box<dyn Future>> in practice. Axum’s Router uses BoxedIntoRoute to store handlers. We follow the same pattern.

Cost Analysis

| Operation | Box Allocation | Actual I/O | Box % of Total |
|---|---|---|---|
| Open + read 4KB file | ~50ns | ~10,000ns | 0.5% |
| Open + read 1MB file | ~50ns | ~1,000,000ns | 0.005% |
| List directory (10 entries) | ~50ns | ~5,000ns | 1% |

The boxing cost is negligible relative to actual I/O.

Alternatives Considered

1. Associated types everywhere:

#![allow(unused)]
fn main() {
pub trait FsRead {
    type Reader: Read + Send;
    fn open_read(&self, path: &Path) -> Result<Self::Reader, FsError>;
}
}

Rejected: Causes type explosion. QuotaReader<PathFilterReader<TracingReader<Cursor<Vec<u8>>>>> is unwieldy and every middleware needs a custom wrapper type.

2. RPITIT (Rust 1.75+):

#![allow(unused)]
fn main() {
fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError>;
}

Rejected as default: Loses object safety. Can’t use dyn Fs for runtime backend selection.

3. Always box everything: Rejected: Unnecessary overhead on hot path operations like read().

Future Considerations

If profiling shows stream boxing is a bottleneck (unlikely), we can add:

#![allow(unused)]
fn main() {
/// Extension trait for zero-cost streaming when backend type is known
pub trait FsReadTyped: FsRead {
    type Reader: Read + Send;
    fn open_read_typed(&self, path: &Path) -> Result<Self::Reader, FsError>;
}
}

This follows Tower’s pattern of providing both Service (with associated types) and BoxService (with type erasure).

Conclusion

Our boxing strategy mirrors Tower/Axum’s production-proven approach:

  • Zero-cost where it matters (hot path bulk operations, middleware composition)
  • Box where flexibility is needed (streaming I/O, iterator filtering)
  • Opt-in type erasure (explicit boxed() method)

The performance cost is negligible (<1% of I/O time), while the ergonomic and flexibility benefits are substantial.


ADR-026: Companion shell (anyfs-shell)

Status: Accepted (Future)

Context: Users want a low-friction way to explore how different backends and middleware behave without writing a full application.

Decision: Provide a separate companion crate (e.g., anyfs-shell) that exposes a bash-style navigation and file management interface built on FileStorage.

Scope:

  • Commands: ls, cd, cat, cp, mv, rm, mkdir, stat.
  • Navigation and file management only; no full bash scripting, pipes, or job control.
  • All operations route through FileStorage to exercise middleware and backend composition.

Why:

  • Demonstrates backend neutrality and middleware effects in a tangible way.
  • Useful for docs, demos, and quick validation.
  • Keeps the core crates free of CLI/UI dependencies.

ADR-027: Permissive core; security via middleware

Status: Accepted

Context: We need predictable filesystem semantics across backends. Some use cases require strict sandboxing, while others expect full filesystem behavior. Baking security restrictions into core traits would make behavior surprising and backend-dependent.

Decision: Core traits are permissive: all operations supported by a backend are allowed by default. Security controls (limits, access control, read-only, rate limiting, audit) are applied via middleware such as Restrictions, PathFilter, ReadOnly, Quota, RateLimit, and Tracing.

Why:

  • Predictability: core behavior matches std::fs expectations.
  • Backend-agnostic: virtual and host backends share the same contract.
  • Separation of concerns: policy lives in middleware, not storage semantics.
  • Explicit security posture: applications opt in to the protections they need.

ADR-028: Linux-like semantics for virtual backends

Status: Accepted

Context: Cross-platform filesystems differ in case sensitivity, separators, reserved names, and path length limits. Virtual backends need a consistent model that does not inherit OS quirks.

Decision: Virtual backends use Linux-like semantics by default:

  • Case-sensitive paths
  • / as the internal separator
  • No reserved names
  • No max path length
  • No ADS (:stream) support

Why:

  • Cross-platform consistency for the same data.
  • Fewer surprises and reduced security footguns.
  • Simplifies backend implementation and testing.
  • Custom semantics remain possible via middleware or custom backends.

ADR-029: Path resolution in FileStorage

Status: Accepted

Context: Path normalization (//, ., ..) and symlink resolution must be consistent across backends. Implementing this logic in every backend is error-prone and leads to divergent behavior.

Decision: FileStorage performs canonicalization and normalization for virtual backends. Backends receive resolved paths. Real filesystem backends (e.g., VRootFsBackend) delegate to OS resolution plus strict-path containment. FileStorage exposes canonicalize, soft_canonicalize, and anchored_canonicalize for explicit use.

Why:

  • Consistent semantics across all backends.
  • Centralizes security-critical path handling.
  • Simplifies backend implementations.
  • Makes conformance testing straightforward.

ADR-030: Layered trait hierarchy

Status: Accepted

Context: Not all backends can or should implement full POSIX behavior. Forcing a single large trait would make simple backends harder to implement and would obscure capabilities.

Decision: Split the API into layered traits:

  • Core: FsRead, FsWrite, FsDir (combined as Fs)
  • Extensions: FsLink, FsPermissions, FsSync, FsStats
  • FUSE: FsInode
  • POSIX: FsHandles, FsLock, FsXattr
  • Convenience supertraits: Fs, FsFull, FsFuse, FsPosix

Why:

  • Implement the lowest level you need.
  • Clear capability boundaries and trait bounds.
  • Avoids forcing unsupported features on backends.
  • Enables middleware to target specific capabilities.
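
The shape of this hierarchy can be sketched as follows. Method lists are elided, and the blanket impl for the Fs supertrait is an assumption about the design, not confirmed API:

```rust
// Capability traits; method lists elided for brevity.
trait FsRead: Send + Sync { /* read, metadata, exists, ... */ }
trait FsWrite: Send + Sync { /* write, remove_file, ... */ }
trait FsDir: Send + Sync { /* create_dir, read_dir, ... */ }
trait FsLink: Send + Sync { /* symlink, read_link, ... */ }

// Convenience supertrait: any backend with the three core capabilities is Fs.
trait Fs: FsRead + FsWrite + FsDir {}
impl<T: FsRead + FsWrite + FsDir> Fs for T {}

// Middleware and consumers can target exactly the capability they need:
fn follow_links<B: FsLink>(_backend: &B) { /* ... */ }
fn mount<B: Fs>(_backend: &B) { /* ... */ }
```

A simple in-memory backend would implement only the core traits and automatically satisfy `Fs`, without being forced to provide links, locks, or xattrs.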

ADR-031: Indexing as middleware

Status: Accepted (Future)

Context: We want a durable, queryable index of file activity and metadata (for audit trails, drive management tools, and statistics). This indexing should be optional, configurable, and work across all backends.

Decision: Indexing is implemented as middleware (Indexing<B> with IndexLayer), not as a specialized backend. The middleware writes to a sidecar index (SQLite by default) and can evolve to support alternate index engines.

Naming: Use IndexLayer (builder) and Indexing<B> (middleware), consistent with existing layer naming.

Why:

  • Separation of concerns: Indexing is policy/analytics, not storage semantics.
  • Backend-agnostic: Works with Memory, SQLite, VRootFs, and custom backends.
  • Composability: Users opt in and configure it like other middleware (Quota, Tracing).
  • Flexibility: Allows future index engines without changing core traits.
  • DX consistency: Keeps std::fs-style usage via FileStorage with no API changes.

Trade-offs:

  • External OS changes: Not captured unless a future watcher/scan helper is added.
  • Index failures: Choose between strict mode (fail the op) and best-effort mode.

Implementation sketch:

  • IndexLayer::builder().index_file("index.db").consistency(IndexConsistency::Strict)...
  • Wraps open_write() with a counting writer to record final size on close.
  • Updates a nodes table and logs ops entries per operation.
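
The counting-writer idea in the sketch can be illustrated with std types; the name `CountingWriter` and its shape are illustrative, not confirmed API:

```rust
use std::io::{self, Write};

/// Wraps any writer and records how many bytes pass through, so indexing
/// middleware could record the final size when the stream is closed.
struct CountingWriter<W: Write> {
    inner: W,
    bytes: u64,
}

impl<W: Write> CountingWriter<W> {
    fn new(inner: W) -> Self {
        Self { inner, bytes: 0 }
    }
}

impl<W: Write> Write for CountingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Count only what the inner writer actually accepted.
        let n = self.inner.write(buf)?;
        self.bytes += n as u64;
        Ok(n)
    }
    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}
```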

ADR-032: Path Canonicalization via FsPath Trait

Status: Accepted

Context: Path canonicalization (resolving .., ., and symlinks) is needed for consistent path handling. The naive approach of baking this into FileStorage has issues:

  • It’s not testable in isolation
  • It can’t be optimized per-backend
  • N+1 queries for paths like /a/b/c/d/e (each component = separate call)

Decision: Introduce an FsPath trait with canonicalize() and soft_canonicalize() methods that have default implementations but allow backend-specific optimizations.

The Pattern:

#![allow(unused)]
fn main() {
pub trait FsPath: FsRead + FsLink {
    /// Resolve all symlinks and normalize path components.
    /// Returns error if final path doesn't exist.
    fn canonicalize(&self, path: &Path) -> Result<PathBuf, FsError> {
        default_canonicalize(self, path)
    }

    /// Like canonicalize, but allows non-existent final component.
    fn soft_canonicalize(&self, path: &Path) -> Result<PathBuf, FsError> {
        default_soft_canonicalize(self, path)
    }
}

// Auto-implement for all FsLink implementors
impl<T: FsRead + FsLink> FsPath for T {}
}

Default Implementation:

#![allow(unused)]
fn main() {
fn default_canonicalize<F: FsRead + FsLink>(fs: &F, path: &Path) -> Result<PathBuf, FsError> {
    let mut resolved = PathBuf::from("/");
    for component in path.components() {
        match component {
            Component::RootDir => resolved = PathBuf::from("/"),
            Component::ParentDir => { resolved.pop(); },
            Component::CurDir => {},
            Component::Normal(name) => {
                resolved.push(name);
                if let Ok(meta) = fs.symlink_metadata(&resolved) {
                    if meta.file_type.is_symlink() {
                        let target = fs.read_link(&resolved)?;
                        resolved.pop();
                        resolved = resolve_relative(&resolved, &target);
                    }
                }
            },
            _ => {},
        }
    }
    // Verify final path exists
    if !fs.exists(&resolved)? {
        return Err(FsError::NotFound { path: resolved, operation: "canonicalize" });
    }
    Ok(resolved)
}
}
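
The default implementation above leaves `resolve_relative` undefined. A minimal version, reflecting an assumption about its intended behavior, might look like:

```rust
use std::path::{Component, Path, PathBuf};

/// Join a symlink target onto the directory containing the link.
/// Absolute targets restart from the root; `.` and `..` are normalized.
fn resolve_relative(parent: &Path, target: &Path) -> PathBuf {
    let mut out = if target.is_absolute() {
        PathBuf::from("/")
    } else {
        parent.to_path_buf()
    };
    for component in target.components() {
        match component {
            Component::ParentDir => { out.pop(); }
            Component::Normal(name) => out.push(name),
            Component::RootDir | Component::CurDir => {}
            _ => {}
        }
    }
    out
}
```

Note that the sketch does not re-check the joined path for further symlinks, and neither it nor the default implementation above includes symlink-loop protection; a production resolver would need a depth limit (the resolvers in ADR-033 use 40).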

Backend Optimization Examples:

| Backend | Optimization |
|---|---|
| SqliteBackend | Single recursive CTE query resolves entire path |
| VRootFsBackend | Delegates to std::fs::canonicalize() + containment check |
| MemoryBackend | Uses default (in-memory is fast anyway) |

SQLite Optimized Implementation:

#![allow(unused)]
fn main() {
impl FsPath for SqliteBackend {
    fn canonicalize(&self, path: &Path) -> Result<PathBuf, FsError> {
        // Single query with recursive CTE
        self.conn.query_row(
            r#"
            WITH RECURSIVE path_resolve(segment, remaining, resolved, depth) AS (
                -- Initial: split path into segments
                SELECT ..., 0
                UNION ALL
                -- Recursive: resolve each segment, following symlinks
                SELECT ...
                FROM path_resolve
                JOIN nodes ON ...
                WHERE depth < 40  -- Loop protection
            )
            SELECT resolved FROM path_resolve 
            WHERE remaining = '' 
            ORDER BY depth DESC LIMIT 1
            "#,
            params![path.to_string_lossy()],
            |row| Ok(PathBuf::from(row.get::<_, String>(0)?))
    ).map_err(|_| FsError::NotFound { path: path.into(), operation: "canonicalize" })
    }
}
}

Why This Design:

| Benefit | Explanation |
|---|---|
| Portable default | Works with any Fs backend out of the box |
| Optimizable | Backends can override for O(1) queries vs O(n) |
| Testable | Canonicalization logic is separate, can be unit tested |
| Composable | Middleware can wrap/intercept canonicalization |

FileStorage Integration:

Note: ADR-033 introduces PathResolver as the primary resolution strategy. FsPath remains as an optional backend optimization hook. When a backend implements both FsPath and the default traits, the backend can choose to delegate to its resolver or provide fully custom logic (e.g., SQLite CTE queries).

FileStorage uses a boxed PathResolver internally for resolution (see ADR-033):

#![allow(unused)]
fn main() {
impl<B: Fs> FileStorage<B> {
    pub fn canonicalize(&self, path: impl AsRef<Path>) -> Result<PathBuf, FsError> {
        self.resolver.canonicalize(path.as_ref(), &self.backend as &dyn Fs)
    }

    pub fn soft_canonicalize(&self, path: impl AsRef<Path>) -> Result<PathBuf, FsError> {
        self.resolver.soft_canonicalize(path.as_ref(), &self.backend as &dyn Fs)
    }
}
}

Backends implementing FsPath can provide optimized implementations that the resolver MAY use:

#![allow(unused)]
fn main() {
impl FsPath for SqliteBackend {
    fn canonicalize(&self, path: &Path) -> Result<PathBuf, FsError> {
        // Optimized: single CTE query instead of iterative resolution
        self.conn.query_row(/* ... */)
    }
}
}

Trade-offs:

| Approach | Queries | Complexity | Best For |
|---|---|---|---|
| Default impl | O(n) per component | Simple | Memory, small files |
| SQLite CTE | O(1) single query | Moderate | Large trees, many symlinks |
| OS delegation | O(1) syscall | Simple | Real filesystem |

Conclusion: The FsPath trait provides a clean abstraction that works everywhere but can be optimized where it matters. This follows Rust’s “zero-cost abstractions” philosophy: you don’t pay for what you don’t use, and you can optimize hot paths when needed.


ADR-033: PathResolver Trait for Pluggable Resolution

Status: Accepted

Context: Path resolution (normalizing .., ., and following symlinks) is currently handled in three places:

  1. FsPath trait methods (canonicalize, soft_canonicalize) with backend-specific optimizations
  2. FileStorage performs pre-resolution for non-SelfResolving backends
  3. SelfResolving marker trait opts out of FileStorage resolution

This works, but the resolution algorithm is not a first-class, testable unit. The logic is spread across components, making it harder to:

  • Test path resolution in isolation
  • Benchmark/profile resolution performance
  • Provide third-party custom resolvers
  • Explore alternative resolution strategies (case-insensitive, caching, etc.)

Decision: Introduce a PathResolver trait that encapsulates the path resolution algorithm as a standalone, pluggable component.

The Pattern:

#![allow(unused)]
fn main() {
// In anyfs-backend (trait definition)
/// Strategy trait for path resolution algorithms.
///
/// Encapsulates path normalization, `..`/`.` resolution, and optionally symlink following.
///
/// **Symlink handling:** The base trait works with `&dyn Fs` (no symlink awareness).
/// For symlink-aware resolution, use `PathResolverWithLinks` which accepts `&dyn FsLink`.
/// `IterativeResolver` implements both traits - call the appropriate method based on
/// what your backend supports.
///
/// **Implementation note:** Only `canonicalize` is required. `soft_canonicalize` has a
/// default implementation that canonicalizes the parent and appends the final component.
pub trait PathResolver: Send + Sync {
    /// Resolve path to canonical form (no symlink following).
    /// Normalizes `.` and `..` components only.
    fn canonicalize(&self, path: &Path, fs: &dyn Fs) -> Result<PathBuf, FsError>;
    
    /// Like canonicalize, but allows non-existent final component.
    /// Default: canonicalize parent, append final component.
    fn soft_canonicalize(&self, path: &Path, fs: &dyn Fs) -> Result<PathBuf, FsError> {
        match path.parent() {
            Some(parent) if !parent.as_os_str().is_empty() => {
                let canonical_parent = self.canonicalize(parent, fs)?;
                match path.file_name() {
                    Some(name) => Ok(canonical_parent.join(name)),
                    None => Ok(canonical_parent),
                }
            }
            _ => self.canonicalize(path, fs),  // Root or single component
        }
    }
}

/// Extension trait for symlink-aware resolution.
/// Backends implementing FsLink can use this for full resolution.
pub trait PathResolverWithLinks: PathResolver {
    /// Resolve path following symlinks (requires FsLink backend).
    fn canonicalize_following_links(&self, path: &Path, fs: &dyn FsLink) -> Result<PathBuf, FsError>;
    
    /// Like canonicalize_following_links, but allows non-existent final component.
    /// Default: canonicalize parent following links, append final component.
    fn soft_canonicalize_following_links(&self, path: &Path, fs: &dyn FsLink) -> Result<PathBuf, FsError> {
        match path.parent() {
            Some(parent) if !parent.as_os_str().is_empty() => {
                let canonical_parent = self.canonicalize_following_links(parent, fs)?;
                match path.file_name() {
                    Some(name) => Ok(canonical_parent.join(name)),
                    None => Ok(canonical_parent),
                }
            }
            _ => self.canonicalize_following_links(path, fs),
        }
    }
}
}

Built-in Implementations (in anyfs crate):

#![allow(unused)]
fn main() {
/// Default iterative resolver - walks path component by component.
/// Implements both PathResolver and PathResolverWithLinks.
pub struct IterativeResolver {
    max_symlink_depth: usize,  // Default: 40
}

impl PathResolver for IterativeResolver {
    fn canonicalize(&self, path: &Path, fs: &dyn Fs) -> Result<PathBuf, FsError> {
        // Normalize `.` and `..` only - no symlink following
        self.normalize_path(path, fs)
    }
    // ...
}

impl PathResolverWithLinks for IterativeResolver {
    fn canonicalize_following_links(&self, path: &Path, fs: &dyn FsLink) -> Result<PathBuf, FsError> {
        // Full resolution with symlink following
        self.resolve_with_symlinks(path, fs, self.max_symlink_depth)
    }
    // ...
}

/// No-op resolver for SelfResolving backends (OS handles resolution).
pub struct NoOpResolver;

/// LRU cache wrapper around another resolver.
pub struct CachingResolver<R: PathResolver> {
    inner: R,
    cache: Cache<PathBuf, PathBuf>,  // LRU cache, bounded size
}

// Case-folding resolver is NOT built-in. Users can implement one via PathResolver
// trait if needed, but real-world demand is minimal since VRootFsBackend on
// Windows/macOS already gets case-insensitivity from the OS.
}

Integration with FileStorage:

#![allow(unused)]
fn main() {
pub struct FileStorage<B> {
    backend: B,
    resolver: Box<dyn PathResolver>,  // Boxed: resolution is cold path
}

impl<B: Fs> FileStorage<B> {
    pub fn new(backend: B) -> Self {
        Self { backend, resolver: Box::new(IterativeResolver::default()) }
    }

    pub fn with_resolver(backend: B, resolver: impl PathResolver + 'static) -> Self {
        Self { backend, resolver: Box::new(resolver) }
    }
}

// Usage
let fs = FileStorage::new(backend);  // Uses IterativeResolver
let fs = FileStorage::with_resolver(backend, CachingResolver::new(IterativeResolver::default()));
}

Relationship with FsPath Trait:

| Component | Responsibility |
|---|---|
| PathResolver | Algorithm for resolution (first-class, testable, swappable) |
| FsPath | Backend-level optimization hook (can delegate to resolver or override entirely) |
| SelfResolving | Remains as a marker OR becomes a NoOpResolver assignment |

FsPath can delegate to the resolver:

#![allow(unused)]
fn main() {
pub trait FsPath: FsRead + FsLink {
    fn resolver(&self) -> &dyn PathResolver {
        // Assumes a `const fn IterativeResolver::new()` so it can live in a static.
        static DEFAULT: IterativeResolver = IterativeResolver::new();
        &DEFAULT
    }

    fn canonicalize(&self, path: &Path) -> Result<PathBuf, FsError> {
        // Coercing `self` to `&dyn Fs` additionally requires `Self: Fs`.
        self.resolver().canonicalize(path, self)
    }
}
}

Or backends can override entirely for optimized implementations (e.g., SQLite CTE).

Why This Design:

| Benefit | Explanation |
|---|---|
| Testable in isolation | Unit test resolvers without full backend setup |
| Benchmarkable | Profile resolution algorithms independently |
| Third-party extensible | Custom resolvers without touching Fs traits |
| Maintainable | Path resolution is one focused, isolated component |
| New capabilities | Case-insensitive, caching, Windows-style resolvers become easy |
| Backwards compatible | Existing FsPath overrides still work; resolver is additive |

Crate Placement:

| Component | Crate | Rationale |
|---|---|---|
| PathResolver trait | anyfs-backend | Core contract, minimal deps |
| IterativeResolver | anyfs | Default impl, needs Fs methods |
| NoOpResolver | anyfs | For SelfResolving backends |
| CachingResolver | anyfs | Optional, needs cache impl |
| FileStorage integration | anyfs | Uses resolvers for path handling |

Note: Case-folding resolvers are NOT built-in. The PathResolver trait allows users to implement custom resolvers if needed, but we don’t ship speculative features.

Example Use Cases:

#![allow(unused)]
fn main() {
// Default: case-sensitive, symlink-aware (IterativeResolver is ZST, zero-cost)
let fs = FileStorage::new(MemoryBackend::new());

// Caching for read-heavy workloads
let fs = FileStorage::with_resolver(
    backend,
    CachingResolver::new(IterativeResolver::default())
);

// Custom resolver (user-implemented)
let fs = FileStorage::with_resolver(backend, MyCustomResolver::new());

// Testing: verify resolution behavior in isolation
#[test]
fn test_symlink_loop_detection() {
    let resolver = IterativeResolver::new();
    let mock_fs = MockFs::with_symlink_loop();
    let result = resolver.canonicalize(Path::new("/loop"), &mock_fs);
    assert!(matches!(result, Err(FsError::InvalidData { .. })));
}
}

Conclusion: The PathResolver trait provides clean separation of concerns, making path resolution testable, benchmarkable, and extensible. It complements FsPath (backend optimization hook) and can replace or work alongside SelfResolving (via NoOpResolver).


ADR-034: LLM-Oriented Architecture (LOA)

Status: Accepted

Context: AnyFS is being developed with significant LLM assistance (GitHub Copilot, Claude, etc.). Traditional software architecture prioritizes maintainability, testability, and extensibility for human developers. However, when LLMs are part of the development workflow, additional constraints become essential:

  1. LLMs work best with limited context windows - they can’t “understand” an entire codebase
  2. LLMs excel at pattern matching - consistent structure enables better assistance
  3. LLMs need clear contracts - well-documented interfaces reduce hallucination
  4. LLMs benefit from isolated components - fixing one thing shouldn’t require understanding everything

These same properties also benefit:

  • Open source contributors (quick onboarding)
  • Code review (focused changes)
  • Parallel development (independent components)
  • AI-generated tests and documentation

Decision: Structure AnyFS using LLM-Oriented Architecture (LOA) - a methodology where every component is independently understandable, testable, and fixable with only local context.

The Five Pillars:

| Pillar | Description | Implementation |
|---|---|---|
| Single Responsibility | One file = one concept | `quota.rs`, `iterative.rs`, etc. |
| Contract-First | Traits define the spec | Documented trait invariants |
| Isolated Testing | Tests use mocks only | No real backends in unit tests |
| Rich Errors | Errors explain the fix | Context in every FsError variant |
| Boundary Docs | Examples at every API | Usage example in every doc comment |

File Structure Convention:

#![allow(unused)]
fn main() {
//! # Component Name
//!
//! ## Responsibility
//! - Single bullet point
//!
//! ## Dependencies  
//! - Traits/types only
//!
//! ## Usage
//! ```rust
//! // Minimal example
//! ```

// ============================================================================
// Types
// ============================================================================

// ============================================================================
// Trait Implementations
// ============================================================================

// ============================================================================
// Public API
// ============================================================================

// ============================================================================
// Private Helpers
// ============================================================================

// ============================================================================
// Tests
// ============================================================================
}

Component Isolation Checklist:

  • Single file per component
  • Implements a trait with documented invariants
  • Dependencies are traits/types, not implementations
  • Tests use mocks, not real backends
  • Error messages explain what went wrong and how to fix
  • Doc comment shows standalone usage example
  • No global state
  • Send + Sync where required

LLM Prompting Patterns:

The architecture enables these clean prompts:

# Implement (user-provided resolver example)
Implement a case-folding resolver in your project.
Contract: Implement `PathResolver` trait from anyfs-backend.
Test: "/Foo/BAR" → "/foo/bar"

# Fix
Bug: Quota<B> doesn't account for existing file size.
File: src/middleware/quota.rs
Error: QuotaExceeded writing 50 bytes to 30-byte file with 100-byte limit.

# Review
Does this change maintain the PathResolver contract?
Are edge cases handled?
Are error messages informative?

Deliverables:

  1. AGENTS.md - Instructions for LLMs contributing to the codebase
  2. LLM Development Methodology Guide - Full methodology documentation
  3. llm-context.md - Context7-style API guide for LLMs using the library

Why This Design:

| Benefit | For LLMs | For Humans |
|---|---|---|
| Isolated components | Fits in context window | Easy to understand |
| Clear contracts | Reduces hallucination | Self-documenting |
| Consistent structure | Pattern matching works | Predictable codebase |
| Rich errors | Can suggest fixes | Quick debugging |
| Focused tests | Can verify changes | Fast CI |

Trade-offs:

| Approach | Pros | Cons |
|---|---|---|
| Deep abstraction | Maximum isolation | More files, more indirection |
| Monolithic design | Fewer files | LLMs can't reason about it |
| LOA (chosen) | LLM-friendly + maintainable | Requires discipline |

Relationship to Other ADRs:

  • ADR-030 (Layered traits): LOA extends this with per-file isolation
  • ADR-033 (PathResolver): Example of LOA - resolver is isolated, testable, replaceable
  • ADR-025 (Strategic Boxing): LOA prefers simplicity over micro-optimization

Conclusion: LLM-Oriented Architecture is not just about AI. It’s about creating a codebase where any component can be understood, tested, fixed, or replaced with only local context. This benefits LLMs, open source contributors, code reviewers, and future maintainers equally. As AI-assisted development becomes standard, LOA positions AnyFS as a reference implementation for sustainable human-AI collaboration.

See Also: LLM Development Methodology Guide

IndexedBackend Pattern

SQLite Metadata + Content-Addressed Blob Storage

This document describes the IndexedBackend architecture pattern: separating filesystem metadata (stored in SQLite) from file content (stored as blobs). This enables efficient queries, large file support, and flexible storage backends.

Ecosystem Implementation: The anyfs-indexed crate provides IndexedBackend as a production-ready implementation using local disk blobs. See the Backends Guide for usage. This document covers the underlying design pattern for those building custom implementations (e.g., with S3, cloud storage, or custom blob stores).


Overview

The IndexedBackend pattern separates:

  • Metadata (directory structure, inodes, permissions) → SQLite
  • Content (file bytes) → Content-Addressed Storage (CAS)
┌─────────────────────────────────────────────────────────┐
│            IndexedBackend (pattern)                     │
│  ┌─────────────────────┐    ┌────────────────────────┐  │
│  │   SQLite Metadata   │    │   Blob Store (CAS)     │  │
│  │                     │    │                        │  │
│  │  - inodes           │    │  - content-addressed   │  │
│  │  - dir_entries      │    │  - deduplicated        │  │
│  │  - blob references  │    │  - S3, local, etc.     │  │
│  │  - audit log        │    │                        │  │
│  └─────────────────────┘    └────────────────────────┘  │
└─────────────────────────────────────────────────────────┘

Custom backends can use S3, cloud storage, or other blob stores.
IndexedBackend implements a simpler variant with UUID-named local blobs
(optimized for streaming; see note below on storage models).

Why this pattern?

  • SQLite is great for metadata queries (directory listings, stats, audit)
  • Blob stores scale better for large file content
  • Content-addressing enables deduplication
  • Separating concerns enables independent scaling

Storage Model Variants

| Model | Blob Naming | Dedup | Best For |
|---|---|---|---|
| Content-Addressed | SHA-256 of content | ✅ Yes | Cloud/S3, archival, multi-tenant |
| UUID+Timestamp | `{uuid}-{timestamp}.bin` | ❌ No | Streaming large files, simplicity |

IndexedBackend uses UUID+Timestamp naming because:

  • Large files can be streamed without buffering the entire file to compute a hash
  • Write latency is consistent (no hash computation)
  • Simpler garbage collection (delete blob when reference removed)

Custom implementations may prefer content-addressed storage when:

  • Deduplication is valuable (many users uploading same files)
  • Using cloud blob stores with native CAS support (S3, GCS)
  • Building archival systems where write latency is acceptable
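The two naming schemes can be contrasted in a std-only sketch. Both pieces are stand-ins for illustration: `DefaultHasher` replaces SHA-256, and a counter plus timestamp replaces a real UUID:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

/// Content-addressed id: derived from the bytes themselves
/// (DefaultHasher stands in for SHA-256).
fn content_addressed_id(data: &[u8]) -> String {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    format!("{:016x}", h.finish())
}

static COUNTER: AtomicU64 = AtomicU64::new(0);

/// UUID+timestamp style name: unique per write, so no dedup is possible,
/// but the name is known before any bytes are hashed (streaming-friendly).
fn uuid_style_name() -> String {
    let ts = SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_secs();
    let n = COUNTER.fetch_add(1, Ordering::Relaxed);  // stands in for a UUID
    format!("{n:08x}-{ts}.bin")
}

fn main() {
    // Identical content → identical id (dedup-friendly)...
    assert_eq!(content_addressed_id(b"report"), content_addressed_id(b"report"));
    // ...while UUID-style names are unique per write.
    assert_ne!(uuid_style_name(), uuid_style_name());
}
```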

Framework Validation

Do Current Traits Support This?

Yes. The Fs traits define operations, not storage implementation.

| Trait Method | Hybrid Implementation |
|---|---|
| `read(path)` | SQLite lookup → blob fetch |
| `write(path, data)` | Blob upload → SQLite update |
| `metadata(path)` | SQLite query only |
| `read_dir(path)` | SQLite query only |
| `remove_file(path)` | SQLite update (refcount - 1) |
| `rename(from, to)` | SQLite update only |
| `copy(from, to)` | SQLite update (refcount + 1) |

The traits don’t care where bytes come from - that’s the backend’s business.

Thread Safety

The current design requires &self methods with interior mutability. For the hybrid pattern:

#![allow(unused)]
fn main() {
pub struct CustomIndexedBackend {
    // SQLite needs single-writer (see "Write Queue" below)
    metadata: Arc<Mutex<Connection>>,

    // Blob store is typically already thread-safe
    blobs: Arc<dyn BlobStore>,

    // Write queue for serializing SQLite writes
    write_tx: mpsc::Sender<WriteCmd>,
}
}

This aligns with ADR-023 (interior mutability).


Data Model

SQLite Schema

-- Inode table (one row per file/directory/symlink)
CREATE TABLE nodes (
    inode       INTEGER PRIMARY KEY,
    parent      INTEGER NOT NULL,
    name        TEXT NOT NULL,
    node_type   TEXT NOT NULL,  -- 'file', 'dir', 'symlink'
    size        INTEGER NOT NULL DEFAULT 0,
    mode        INTEGER NOT NULL DEFAULT 420,  -- 0o644
    nlink       INTEGER NOT NULL DEFAULT 1,
    blob_id     TEXT,           -- NULL for directories
    symlink_target TEXT,        -- NULL unless symlink
    created_at  INTEGER NOT NULL,
    modified_at INTEGER NOT NULL,
    accessed_at INTEGER NOT NULL,

    UNIQUE(parent, name)
);

-- Root directory (inode 1)
INSERT INTO nodes (inode, parent, name, node_type, size, mode, created_at, modified_at, accessed_at)
VALUES (1, 1, '', 'dir', 0, 493, strftime('%s', 'now'), strftime('%s', 'now'), strftime('%s', 'now'));  -- mode 493 = 0o755

-- Blob reference tracking (for dedup + GC)
CREATE TABLE blobs (
    blob_id     TEXT PRIMARY KEY,  -- sha256 hex
    size        INTEGER NOT NULL,
    refcount    INTEGER NOT NULL DEFAULT 0,
    created_at  INTEGER NOT NULL
);

-- Audit log (optional but recommended)
CREATE TABLE audit (
    seq         INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp   INTEGER NOT NULL,
    operation   TEXT NOT NULL,
    path        TEXT,
    actor       TEXT,
    details     TEXT  -- JSON
);

-- Indexes
CREATE INDEX idx_nodes_parent ON nodes(parent);
CREATE INDEX idx_nodes_blob ON nodes(blob_id) WHERE blob_id IS NOT NULL;
CREATE INDEX idx_blobs_refcount ON blobs(refcount) WHERE refcount = 0;

Blob Store Interface

#![allow(unused)]
fn main() {
/// Content-addressed blob storage.
pub trait BlobStore: Send + Sync {
    /// Store bytes, returns content hash (blob_id).
    fn put(&self, data: &[u8]) -> Result<String, BlobError>;

    /// Retrieve bytes by content hash.
    fn get(&self, blob_id: &str) -> Result<Vec<u8>, BlobError>;

    /// Check if blob exists.
    fn exists(&self, blob_id: &str) -> Result<bool, BlobError>;

    /// Delete blob (only call after refcount reaches 0).
    fn delete(&self, blob_id: &str) -> Result<(), BlobError>;

    /// Streaming read for large files.
    fn open_read(&self, blob_id: &str) -> Result<Box<dyn Read + Send>, BlobError>;

    /// Streaming write, returns blob_id on completion.
    fn open_write(&self) -> Result<Box<dyn BlobWriter>, BlobError>;
}

pub trait BlobWriter: Write + Send {
    /// Finalize the blob and return its content hash.
    fn finalize(self: Box<Self>) -> Result<String, BlobError>;
}
}

Implementations could be:

  • LocalCasBackend - local directory with content-addressed files
  • S3BlobStore - S3-compatible object storage
  • MemoryBlobStore - in-memory for testing

Implementation Sketch

Core Structure

#![allow(unused)]
fn main() {
use anyfs_backend::{FsRead, FsWrite, FsDir, FsError, Metadata, ReadDirIter, DirEntry, FileType};
use rusqlite::Connection;
use std::sync::{Arc, Mutex};
use std::path::{Path, PathBuf};
use tokio::sync::mpsc;

pub struct CustomIndexedBackend {
    /// SQLite connection (metadata)
    db: Arc<Mutex<Connection>>,

    /// Content-addressed blob storage
    blobs: Arc<dyn BlobStore>,

    /// Write command queue (single-writer pattern)
    write_tx: mpsc::UnboundedSender<WriteCmd>,

    /// Background writer handle
    _writer_handle: Arc<WriterHandle>,
}

enum WriteCmd {
    Write {
        path: PathBuf,
        blob_id: String,
        size: u64,
        reply: oneshot::Sender<Result<(), FsError>>,
    },
    Remove {
        path: PathBuf,
        reply: oneshot::Sender<Result<(), FsError>>,
    },
    CreateDir {
        path: PathBuf,
        reply: oneshot::Sender<Result<(), FsError>>,
    },
    Copy {
        from: PathBuf,
        to: PathBuf,
        reply: oneshot::Sender<Result<(), FsError>>,
    },
    // ... other write operations
}
}

Read Operations (Direct)

Read operations can query SQLite and blob store directly (no queue needed):

#![allow(unused)]
fn main() {
impl FsRead for CustomIndexedBackend {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {

        // 1. Query SQLite for blob_id
        let db = self.db.lock().map_err(|_| FsError::Backend("lock poisoned".into()))?;

        let (blob_id, node_type): (Option<String>, String) = db.query_row(
            "SELECT blob_id, node_type FROM nodes WHERE inode = (
                SELECT inode FROM nodes WHERE parent = ? AND name = ?
            )",
            // ... path resolution params
            |row| Ok((row.get(0)?, row.get(1)?)),
        ).map_err(|_| FsError::NotFound { path: path.to_path_buf() })?;

        if node_type != "file" {
            return Err(FsError::NotAFile { path: path.to_path_buf() });
        }

        let blob_id = blob_id.ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })?;

        drop(db);  // Release lock before blob fetch

        // 2. Fetch from blob store
        self.blobs.get(&blob_id)
            .map_err(|e| FsError::Backend(e.to_string()))
    }

    fn exists(&self, path: &Path) -> Result<bool, FsError> {
        let db = self.db.lock().map_err(|_| FsError::Backend("lock poisoned".into()))?;

        // Pure SQLite query
        let exists: bool = db.query_row(
            "SELECT EXISTS(SELECT 1 FROM nodes WHERE parent = ? AND name = ?)",
            // ... params
            |row| row.get(0),
        ).unwrap_or(false);

        Ok(exists)
    }

    fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
        let db = self.db.lock().map_err(|_| FsError::Backend("lock poisoned".into()))?;

        // Pure SQLite query - no blob store needed
        db.query_row(
            "SELECT node_type, size, mode, nlink, created_at, modified_at, accessed_at, inode
             FROM nodes WHERE parent = ? AND name = ?",
            // ... params
            |row| {
                let node_type: String = row.get(0)?;
                Ok(Metadata {
                    file_type: match node_type.as_str() {
                        "file" => FileType::File,
                        "dir" => FileType::Directory,
                        "symlink" => FileType::Symlink,
                        _ => FileType::File,
                    },
                    size: row.get(1)?,
                    permissions: Some(row.get(2)?),
                    // ... other fields
                })
            },
        ).map_err(|_| FsError::NotFound { path: path.to_path_buf() })
    }

    // ... other FsRead methods
}
}

Write Operations (Two-Phase Commit)

Writes use a two-phase pattern: upload blob first, then commit SQLite:

#![allow(unused)]
fn main() {
impl FsWrite for CustomIndexedBackend {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        let path = path.to_path_buf();

        // Phase 1: Upload blob (can fail independently)
        let blob_id = self.blobs.put(data)
            .map_err(|e| FsError::Backend(format!("blob upload failed: {}", e)))?;

        // Phase 2: Commit metadata (via write queue)
        let (tx, rx) = oneshot::channel();

        self.write_tx.send(WriteCmd::Write {
            path,
            blob_id,
            size: data.len() as u64,
            reply: tx,
        }).map_err(|_| FsError::Backend("write queue closed".into()))?;

        // Wait for SQLite commit
        rx.blocking_recv()
            .map_err(|_| FsError::Backend("write cancelled".into()))?
    }

    fn remove_file(&self, path: &Path) -> Result<(), FsError> {
        let path = path.to_path_buf();

        // Queue the removal (blob cleanup happens in background via GC)
        let (tx, rx) = oneshot::channel();

        self.write_tx.send(WriteCmd::Remove { path, reply: tx })
            .map_err(|_| FsError::Backend("write queue closed".into()))?;

        rx.blocking_recv()
            .map_err(|_| FsError::Backend("remove cancelled".into()))?
    }

    fn copy(&self, from: &Path, to: &Path) -> Result<(), FsError> {
        // Copy is just a metadata operation - increment refcount, no blob copy!
        let (tx, rx) = oneshot::channel();

        self.write_tx.send(WriteCmd::Copy {
            from: from.to_path_buf(),
            to: to.to_path_buf(),
            reply: tx,
        }).map_err(|_| FsError::Backend("write queue closed".into()))?;

        rx.blocking_recv()
            .map_err(|_| FsError::Backend("copy cancelled".into()))?
    }

    // ... other FsWrite methods
}
}

Write Queue Worker

The single-writer pattern for SQLite:

#![allow(unused)]
fn main() {
async fn write_worker(
    db: Arc<Mutex<Connection>>,
    blobs: Arc<dyn BlobStore>,
    mut rx: mpsc::UnboundedReceiver<WriteCmd>,
) {
    while let Some(cmd) = rx.recv().await {
        // Scope the lock to a single command.
        {
            let mut db = db.lock().unwrap();

            match cmd {
                WriteCmd::Write { path, blob_id, size, reply } => {
                    // Sketch only: production code should use parameterized
                    // queries instead of string interpolation (SQL injection risk).
                    let path = path.display();
                    let result = db.execute_batch(&format!(r#"
                        BEGIN;

                        -- Upsert blob record
                        INSERT INTO blobs (blob_id, size, refcount, created_at)
                        VALUES ('{blob_id}', {size}, 1, strftime('%s', 'now'))
                        ON CONFLICT(blob_id) DO UPDATE SET refcount = refcount + 1;

                        -- Update or insert node
                        -- (simplified - real impl needs path resolution)

                        -- Audit log
                        INSERT INTO audit (timestamp, operation, path)
                        VALUES (strftime('%s', 'now'), 'write', '{path}');

                        COMMIT;
                    "#));

                    let _ = reply.send(result.map_err(|e| FsError::Backend(e.to_string())));
                }

                WriteCmd::Remove { path, reply } => {
                    // Decrement refcount (GC cleans up when refcount = 0)
                    let result = db.execute_batch(&format!(r#"
                        BEGIN;

                        -- Get blob_id before delete
                        -- Decrement refcount
                        -- Remove node
                        -- Audit log

                        COMMIT;
                    "#));

                    let _ = reply.send(result.map_err(|e| FsError::Backend(e.to_string())));
                }

                // ... other commands
            }
        };
    }
}
}
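For backends without an async runtime, the same single-writer pattern can be sketched with std channels and a dedicated thread. Here a `HashMap` stands in for the SQLite connection, and a plain `mpsc` channel plays the role of the write queue:

```rust
use std::collections::HashMap;
use std::path::PathBuf;
use std::sync::mpsc;
use std::thread;

enum WriteCmd {
    Write { path: PathBuf, blob_id: String, reply: mpsc::Sender<Result<(), String>> },
    Shutdown,
}

fn main() {
    let (tx, rx) = mpsc::channel::<WriteCmd>();

    // Single writer thread: all metadata mutations are serialized here.
    let worker = thread::spawn(move || {
        let mut metadata: HashMap<PathBuf, String> = HashMap::new();  // stands in for SQLite
        while let Ok(cmd) = rx.recv() {
            match cmd {
                WriteCmd::Write { path, blob_id, reply } => {
                    metadata.insert(path, blob_id);
                    let _ = reply.send(Ok(()));
                }
                WriteCmd::Shutdown => break,
            }
        }
        metadata
    });

    // Callers enqueue a command and block on the reply, like blocking_recv().
    let (reply_tx, reply_rx) = mpsc::channel();
    tx.send(WriteCmd::Write {
        path: PathBuf::from("/data/file.txt"),
        blob_id: "abc123".into(),
        reply: reply_tx,
    }).unwrap();
    assert_eq!(reply_rx.recv().unwrap(), Ok(()));

    tx.send(WriteCmd::Shutdown).unwrap();
    let metadata = worker.join().unwrap();
    assert_eq!(metadata.get(&PathBuf::from("/data/file.txt")), Some(&"abc123".to_string()));
}
```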

Deduplication

Content-addressing gives you dedup for free:

#![allow(unused)]
fn main() {
impl BlobStore for LocalCasBackend {
    fn put(&self, data: &[u8]) -> Result<String, BlobError> {
        // Hash the content
        let hash = sha256(data);
        let blob_id = hex::encode(hash);

        // Check if already exists
        let blob_path = self.root.join(&blob_id[0..2]).join(&blob_id);

        if blob_path.exists() {
            // Already have this content - dedup!
            return Ok(blob_id);
        }

        // Store new blob
        std::fs::create_dir_all(blob_path.parent().unwrap())?;
        std::fs::write(&blob_path, data)?;

        Ok(blob_id)
    }
}
}

Dedup in action:

  • User A writes report.pdf (10 MB) → blob abc123, refcount = 1
  • User B writes identical report.pdf → same blob abc123, refcount = 2
  • Physical storage: 10 MB (not 20 MB)
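This bookkeeping can be simulated in a few lines of std-only Rust. An in-memory map stands in for the blob store and `DefaultHasher` for SHA-256; the `CasStore` type is an illustration, not part of anyfs:

```rust
use std::collections::HashMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

struct CasStore {
    blobs: HashMap<String, Vec<u8>>,   // blob_id → content
    refcounts: HashMap<String, u32>,   // blob_id → reference count
}

impl CasStore {
    fn new() -> Self {
        Self { blobs: HashMap::new(), refcounts: HashMap::new() }
    }

    /// Store content and return its id; identical content is stored once.
    fn put(&mut self, data: &[u8]) -> String {
        let mut h = DefaultHasher::new();
        data.hash(&mut h);
        let blob_id = format!("{:016x}", h.finish());
        self.blobs.entry(blob_id.clone()).or_insert_with(|| data.to_vec());
        *self.refcounts.entry(blob_id.clone()).or_insert(0) += 1;
        blob_id
    }

    fn physical_bytes(&self) -> usize {
        self.blobs.values().map(|b| b.len()).sum()
    }
}

fn main() {
    let mut store = CasStore::new();
    let report = vec![0u8; 10];  // stands in for a 10 MB PDF
    let a = store.put(&report);  // user A writes report.pdf
    let b = store.put(&report);  // user B writes the identical file
    assert_eq!(a, b);                        // same blob id
    assert_eq!(store.refcounts[&a], 2);      // two references
    assert_eq!(store.physical_bytes(), 10);  // stored once, not twice
}
```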

Refcount Management

-- On file write (new reference to blob)
UPDATE blobs SET refcount = refcount + 1 WHERE blob_id = ?;

-- On file delete
UPDATE blobs SET refcount = refcount - 1 WHERE blob_id = ?;

-- On copy (no blob copy needed!)
UPDATE blobs SET refcount = refcount + 1 WHERE blob_id = ?;

SQLite Performance

The SQLite metadata database benefits from the same tuning as SqliteBackend:

| Setting | Default | Purpose | Tradeoff |
|---|---|---|---|
| journal_mode | WAL | Concurrent reads during writes | Creates .wal/.shm files |
| synchronous | FULL | Index integrity on power loss | Safe default, opt-in to NORMAL |
| cache_size | 16 MB | Smaller cache for metadata-only | Tune based on index size |
| busy_timeout | 5000 ms | Gracefully handle lock contention | Prevents SQLITE_BUSY errors |
| auto_vacuum | INCREMENTAL | Reclaim space from deletions | Gradual space recovery |

Why FULL synchronous: Index corruption means paths no longer resolve to blobs—blobs become orphaned and unreachable. Use FULL as the safe default; opt-in to NORMAL only with battery-backed storage or when index can be rebuilt.

SQL Indexes (critical):

CREATE INDEX idx_nodes_parent ON nodes(parent);
CREATE INDEX idx_nodes_blob ON nodes(blob_id) WHERE blob_id IS NOT NULL;
CREATE INDEX idx_blobs_refcount ON blobs(refcount) WHERE refcount = 0;

Without proper indexes, path lookups become full table scans—catastrophic for large filesystems.

Connection pooling: 4-8 reader connections for concurrent metadata queries; single writer for updates. See SQLite Operations Guide for detailed patterns.
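A generic checkout/checkin pool along those lines can be sketched with std sync primitives. Real code would likely use a pooling crate (e.g. r2d2); the `Pool` type here is an illustration, with strings standing in for reader connections:

```rust
use std::sync::{Arc, Mutex};

/// Minimal resource pool: `get` checks a resource out, `put` returns it.
struct Pool<T> {
    idle: Arc<Mutex<Vec<T>>>,
}

impl<T> Pool<T> {
    fn new(resources: Vec<T>) -> Self {
        Self { idle: Arc::new(Mutex::new(resources)) }
    }

    fn get(&self) -> Option<T> {
        self.idle.lock().unwrap().pop()
    }

    fn put(&self, resource: T) {
        self.idle.lock().unwrap().push(resource);
    }
}

fn main() {
    // Four "reader connections" (strings stand in for SQLite connections).
    let pool = Pool::new(vec!["r1".to_string(), "r2".into(), "r3".into(), "r4".into()]);

    let conn = pool.get().expect("pool has idle readers");
    // ... run a read-only metadata query on `conn` ...
    pool.put(conn);  // return it for the next reader

    assert_eq!(pool.idle.lock().unwrap().len(), 4);
}
```

A production pool would block or time out when empty rather than return `None`, and would keep the single writer on its own dedicated connection.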


Garbage Collection

Blobs with refcount = 0 are orphans and can be deleted:

#![allow(unused)]
fn main() {
impl CustomIndexedBackend {
    /// Run garbage collection (call periodically or on-demand).
    pub fn gc(&self) -> Result<GcStats, FsError> {
        let db = self.db.lock().map_err(|_| FsError::Backend("lock".into()))?;

        // Find orphaned blobs
        let orphans: Vec<String> = db.prepare(
            "SELECT blob_id FROM blobs WHERE refcount = 0"
        )?.query_map([], |row| row.get(0))?
          .filter_map(|r| r.ok())
          .collect();

        drop(db);

        // Delete from blob store
        let mut deleted = 0;
        for blob_id in &orphans {
            if self.blobs.delete(blob_id).is_ok() {
                deleted += 1;
            }
        }

        // Remove from SQLite
        let db = self.db.lock().unwrap();
        db.execute(
            "DELETE FROM blobs WHERE refcount = 0",
            [],
        )?;

        Ok(GcStats { orphans_found: orphans.len(), blobs_deleted: deleted })
    }
}
}

GC Safety:

  • Never delete blobs referenced by snapshots
  • Add snapshot_refs table or use refcount that includes snapshot references
  • Run GC in background, not during writes

Snapshots and Backup

Creating a Snapshot

#![allow(unused)]
fn main() {
impl CustomIndexedBackend {
    /// Create a point-in-time snapshot.
    pub fn snapshot(&self, name: &str) -> Result<SnapshotId, FsError> {
        let db = self.db.lock().unwrap();

        db.execute_batch(&format!(r#"
            BEGIN;

            -- Record snapshot
            INSERT INTO snapshots (name, created_at, root_manifest)
            VALUES ('{name}', strftime('%s', 'now'),
                    (SELECT json_group_array(blob_id) FROM blobs WHERE refcount > 0));

            -- Pin all current blobs (prevent GC)
            UPDATE blobs SET refcount = refcount + 1
            WHERE blob_id IN (SELECT blob_id FROM nodes WHERE blob_id IS NOT NULL);

            COMMIT;
        "#))?;

        Ok(SnapshotId(name.to_string()))
    }

    /// Export as single portable artifact.
    pub fn export(&self, dest: impl AsRef<Path>) -> Result<(), FsError> {
        // 1. SQLite backup API for metadata
        let db = self.db.lock().unwrap();
        db.backup(rusqlite::DatabaseName::Main, dest.as_ref().join("metadata.db"), None)?;

        // 2. Copy referenced blobs
        let blob_ids: Vec<String> = db.prepare(
            "SELECT DISTINCT blob_id FROM nodes WHERE blob_id IS NOT NULL"
        )?.query_map([], |row| row.get(0))?
          .filter_map(|r| r.ok())
          .collect();

        drop(db);

        let blobs_dir = dest.as_ref().join("blobs");
        std::fs::create_dir_all(&blobs_dir)?;

        for blob_id in blob_ids {
            let data = self.blobs.get(&blob_id)?;
            std::fs::write(blobs_dir.join(&blob_id), data)?;
        }

        Ok(())
    }
}
}

Middleware Integration

Middleware works unchanged - it wraps the hybrid backend like any other:

#![allow(unused)]
fn main() {
use anyfs::{FileStorage, QuotaLayer, TracingLayer, PathFilterLayer};

let backend = CustomIndexedBackend::open("drive.db", LocalCasBackend::new("./blobs"))?;

// Standard middleware stack
let backend = backend
    .layer(QuotaLayer::builder()
        .max_total_size(50 * 1024 * 1024 * 1024)  // 50 GB
        .build())
    .layer(PathFilterLayer::builder()
        .deny("**/.env")
        .build())
    .layer(TracingLayer::new());

let fs = FileStorage::new(backend);

// Use like any other filesystem
fs.write("/documents/report.pdf", &pdf_bytes)?;
}

Quota tracking note: QuotaLayer tracks logical size (what users see), not physical size (with dedup). For physical tracking, the backend could expose physical_usage() separately.
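The logical/physical distinction falls directly out of the two tables: logical size sums per-node sizes, physical size sums unique blob sizes. A std-only sketch, with the table layout simplified from the schema above:

```rust
use std::collections::HashMap;

/// node path → (blob_id, logical size)
type Nodes = HashMap<String, (String, u64)>;
/// blob_id → physical size
type Blobs = HashMap<String, u64>;

fn logical_usage(nodes: &Nodes) -> u64 {
    nodes.values().map(|(_, size)| size).sum()
}

fn physical_usage(blobs: &Blobs) -> u64 {
    blobs.values().sum()
}

fn main() {
    let mut nodes = Nodes::new();
    // Two users store the identical 10-byte file: two nodes, one blob.
    nodes.insert("/a/report.pdf".into(), ("abc123".into(), 10));
    nodes.insert("/b/report.pdf".into(), ("abc123".into(), 10));

    let mut blobs = Blobs::new();
    blobs.insert("abc123".into(), 10);

    assert_eq!(logical_usage(&nodes), 20);   // what QuotaLayer tracks
    assert_eq!(physical_usage(&blobs), 10);  // what the disk actually holds
}
```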


Async Considerations

The hybrid pattern benefits significantly from async (ADR-024):

| Operation | Sync Pain | Async Benefit |
|---|---|---|
| Blob upload to S3 | Blocks thread | Concurrent uploads |
| Multiple reads | Sequential | Parallel fetches |
| Write queue | `blocking_recv()` | Native async channel |
| GC | Blocks all ops | Background task |

When AsyncFs traits exist (ADR-024), the hybrid backend can use them naturally:

#![allow(unused)]
fn main() {
#[async_trait]
impl AsyncFsRead for CustomIndexedBackend {
    async fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        let blob_id = self.lookup_blob_id(path).await?;
        self.blobs.get_async(&blob_id).await  // Non-blocking!
    }
}
}

Identified Gaps

Areas where the current framework could be enhanced:

| Gap | Current State | Recommendation |
|---|---|---|
| Two-phase commit pattern | Not documented | Add to backend guide |
| Refcount/GC patterns | Not documented | Add section |
| Streaming large files | `open_read`/`open_write` exist | Document chunked patterns |
| Physical vs logical size | Quota tracks logical only | Consider PhysicalStats trait |
| Background tasks (GC) | No pattern | Document spawn pattern |

Summary

Framework validation: PASSED

The current AnyFS trait design supports hybrid backends:

  • Traits define operations, not storage
  • Interior mutability allows single-writer patterns
  • Middleware composes unchanged
  • Async strategy (ADR-024) enhances this pattern

Key patterns for hybrid backends:

  1. Single-writer queue for SQLite
  2. Two-phase commit (blob upload → SQLite commit)
  3. Content-addressing for dedup
  4. Refcounting for GC safety
  5. Snapshot pinning for backup safety

This validates that AnyFS is flexible enough for advanced storage architectures while maintaining its simple middleware composition model.

Zero-Cost Alternatives for I/O Operations

This document analyzes alternatives to dynamic dispatch (Box<dyn Trait>) for streaming I/O and directory iteration.

Decision: See ADR-025: Strategic Boxing for the formal decision.

TL;DR: We follow Tower/Axum’s approach - zero-cost on hot path (read(), write()), box at cold path boundaries (open_read(), read_dir()). We avoid heap allocations and dynamic dispatch unless they buy flexibility with negligible performance impact.


Current Design (Dynamic Dispatch)

#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
}

pub trait FsDir: Send + Sync {
    fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError>;
}

// Where ReadDirIter is:
pub struct ReadDirIter(Box<dyn Iterator<Item = Result<DirEntry, FsError>> + Send>);
}

Cost: One heap allocation per open_read(), open_write(), or read_dir() call.
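The shape of that trade shows up in a std-only sketch: the single allocation and the vtable are paid at open time, while every subsequent `read` call on the boxed reader is the hot path and allocates nothing. `Cursor` stands in for a real backend reader:

```rust
use std::io::{Cursor, Read};

/// Cold path: one heap allocation + dynamic dispatch per open.
fn open_read(data: Vec<u8>) -> Box<dyn Read + Send> {
    Box::new(Cursor::new(data))
}

fn main() {
    let mut reader = open_read(b"hello".to_vec());

    // Hot path: reads go through the vtable but allocate nothing themselves.
    let mut buf = Vec::new();
    reader.read_to_end(&mut buf).unwrap();
    assert_eq!(buf, b"hello");
}
```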


Option 1: Associated Types (Classic Approach)

#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
    type Reader: Read + Send;

    fn open_read(&self, path: &Path) -> Result<Self::Reader, FsError>;
}

pub trait FsDir: Send + Sync {
    type DirIter: Iterator<Item = Result<DirEntry, FsError>> + Send;

    fn read_dir(&self, path: &Path) -> Result<Self::DirIter, FsError>;
}
}

Implementation

#![allow(unused)]
fn main() {
impl FsRead for MemoryBackend {
    type Reader = std::io::Cursor<Vec<u8>>;

    fn open_read(&self, path: &Path) -> Result<Self::Reader, FsError> {
        let data = self.read(path)?;
        Ok(std::io::Cursor::new(data))
    }
}

impl FsDir for MemoryBackend {
    type DirIter = std::vec::IntoIter<Result<DirEntry, FsError>>;

    fn read_dir(&self, path: &Path) -> Result<Self::DirIter, FsError> {
        let entries = self.collect_entries(path)?;
        Ok(entries.into_iter())
    }
}
}

Middleware Propagation Problem

#![allow(unused)]
fn main() {
impl<B: FsRead> FsRead for Quota<B> {
    // Must define our own Reader type that wraps B::Reader
    type Reader = QuotaReader<B::Reader>;

    fn open_read(&self, path: &Path) -> Result<Self::Reader, FsError> {
        let inner = self.inner.open_read(path)?;
        Ok(QuotaReader::new(inner, self.usage.clone()))
    }
}

// Every middleware needs a custom wrapper type
struct QuotaReader<R> {
    inner: R,
    usage: Arc<RwLock<QuotaUsage>>,
}

impl<R: Read> Read for QuotaReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        // Track bytes read if needed
        self.inner.read(buf)
    }
}
}

The Type Explosion

With a middleware stack like Quota<PathFilter<Tracing<MemoryBackend>>>:

#![allow(unused)]
fn main() {
type FinalReader = QuotaReader<PathFilterReader<TracingReader<Cursor<Vec<u8>>>>>;
type FinalDirIter = QuotaIter<PathFilterIter<TracingIter<IntoIter<Result<DirEntry, FsError>>>>>;
}

Verdict

| Aspect | Assessment |
|---|---|
| Heap allocations | ✅ None |
| Type complexity | ❌ Exponential growth |
| Middleware authoring | ❌ Every middleware needs wrapper types |
| User ergonomics | ⚠️ Type annotations become unwieldy |
| Compile times | ❌ Longer due to monomorphization |

Not recommended as the primary API due to complexity explosion.


Option 2: RPITIT (Rust 1.75+)

Return Position Impl Trait in Traits allows:

#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
    fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError>;
}

pub trait FsDir: Send + Sync {
    fn read_dir(&self, path: &Path)
        -> Result<impl Iterator<Item = Result<DirEntry, FsError>> + Send, FsError>;
}
}

How It Works

The compiler infers a unique anonymous type for each implementor:

#![allow(unused)]
fn main() {
impl FsRead for MemoryBackend {
    fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError> {
        let data = self.read(path)?;
        Ok(std::io::Cursor::new(data))  // Returns Cursor<Vec<u8>>, but caller sees impl Read
    }
}

impl FsRead for SqliteBackend {
    fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError> {
        Ok(SqliteReader::new(self.conn.clone(), path))  // Different type, same interface
    }
}
}

Middleware Still Works

#![allow(unused)]
fn main() {
impl<B: FsRead> FsRead for Tracing<B> {
    fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError> {
        let span = tracing::span!(Level::DEBUG, "open_read");
        let _guard = span.enter();
        self.inner.open_read(path)  // Just forward - return type is inferred
    }
}
}

The Catch: Object Safety

RPITIT makes traits non-object-safe. You cannot do:

#![allow(unused)]
fn main() {
// This won't compile with RPITIT
let backends: Vec<Box<dyn FsRead>> = vec![...];
}
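
Losing object safety does not mean losing abstraction entirely: generic functions still work, because the concrete reader type is resolved per implementor at compile time. A self-contained sketch with a toy trait (`ByteSource` and `InMemory` are illustrative names, not part of the anyfs API):

```rust
use std::io::{Cursor, Read};

// Toy RPITIT trait (Rust 1.75+); not the real anyfs API.
trait ByteSource {
    fn open(&self) -> impl Read + Send;
}

struct InMemory(Vec<u8>);

impl ByteSource for InMemory {
    fn open(&self) -> impl Read + Send {
        Cursor::new(self.0.clone())
    }
}

// Static dispatch still works: the concrete reader type is known per implementor.
fn read_all<S: ByteSource>(src: &S) -> Vec<u8> {
    let mut buf = Vec::new();
    src.open().read_to_end(&mut buf).unwrap();
    buf
}

// let sources: Vec<Box<dyn ByteSource>> = vec![]; // error: trait is not object-safe

fn main() {
    let src = InMemory(b"hello".to_vec());
    assert_eq!(read_all(&src), b"hello".to_vec());
}
```

Only `Box<dyn ByteSource>` collections are off the table; every generic call site monomorphizes against the implementor's hidden reader type.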

Verdict

| Aspect | Assessment |
|---|---|
| Heap allocations | ✅ None |
| Type complexity | ✅ Hidden behind impl Trait |
| Middleware authoring | ✅ Simple forwarding |
| User ergonomics | ✅ Clean API |
| Object safety | ❌ Lost - can’t use dyn FsRead |
| Rust version | ⚠️ Requires 1.75+ |

Good for performance-critical paths but sacrifices dyn usage.


Option 3: Generic Associated Types (GATs)

For readers that borrow from the backend:

#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
    type Reader<'a>: Read + Send where Self: 'a;

    fn open_read(&self, path: &Path) -> Result<Self::Reader<'_>, FsError>;
}
}

Use Case: Zero-Copy Reads

#![allow(unused)]
fn main() {
impl FsRead for MemoryBackend {
    type Reader<'a> = &'a [u8];  // Borrow directly from internal storage!

    fn open_read(&self, path: &Path) -> Result<Self::Reader<'_>, FsError> {
        let data = self.storage.read().unwrap();
        let bytes = data.get(path.as_ref())
            .ok_or(FsError::NotFound { path: path.as_ref().to_path_buf() })?;
        // NOTE: as written, the slice borrows from the local lock guard, so this
        // does not borrow-check; a real implementation needs a reader type that
        // owns the guard (e.g. a mapped read guard) or lock-free storage.
        Ok(bytes.as_slice())
    }
}
}

Complexity

GATs are powerful but add significant complexity:

  • Lifetime parameters propagate through middleware
  • Not all backends can provide borrowed data (SQLite must copy)
  • Makes trait definitions harder to understand
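
The middleware pain point shows up even in a stripped-down sketch: a forwarding wrapper must restate the GAT together with its lifetime bounds. `ByteRead`, `Memory`, and `Passthrough` below are toy names standing in for the real traits:

```rust
// Toy GAT trait; illustrative only, not the real anyfs API.
trait ByteRead {
    type Reader<'a>: AsRef<[u8]> where Self: 'a;
    fn open<'a>(&'a self) -> Self::Reader<'a>;
}

struct Memory(Vec<u8>);

impl ByteRead for Memory {
    // Zero-copy: borrow directly from owned storage.
    type Reader<'a> = &'a [u8] where Self: 'a;
    fn open<'a>(&'a self) -> Self::Reader<'a> { &self.0 }
}

// A forwarding middleware must re-state the GAT and its lifetime bounds.
struct Passthrough<B>(B);

impl<B: ByteRead> ByteRead for Passthrough<B> {
    type Reader<'a> = B::Reader<'a> where Self: 'a;
    fn open<'a>(&'a self) -> Self::Reader<'a> { self.0.open() }
}

fn main() {
    let fs = Passthrough(Memory(b"abc".to_vec()));
    assert_eq!(fs.open().as_ref(), b"abc");
}
```

Every layer in a real stack repeats this lifetime plumbing, which is the complexity cost the verdict below refers to.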

Verdict

| Aspect | Assessment |
|---|---|
| Heap allocations | ✅ Can be zero-copy |
| Type complexity | ❌ High (lifetimes everywhere) |
| Middleware authoring | ❌ Complex lifetime handling |
| Use case fit | ⚠️ Only benefits backends with owned data |

Overkill for most use cases. Consider only for specialized zero-copy scenarios.


Option 4: Hybrid (Dynamic Default + Typed Extension)

Provide both dynamic and static APIs:

#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
    /// Dynamic dispatch version (simple, flexible)
    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
}

/// Extension trait for zero-cost static dispatch
pub trait FsReadTyped: FsRead {
    type Reader: Read + Send;

    /// Static dispatch version (zero-cost, less flexible)
    fn open_read_typed(&self, path: &Path) -> Result<Self::Reader, FsError>;
}

// Blanket impl for convenience when types align
impl<T: FsReadTyped> FsRead for T {
    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError> {
        Ok(Box::new(self.open_read_typed(path)?))
    }
}
}

Usage

#![allow(unused)]
fn main() {
// Default: dynamic dispatch (works everywhere)
let reader = fs.open_read("/file.txt")?;

// Performance-critical: static dispatch
let reader: MemoryReader = fs.open_read_typed("/file.txt")?;
}

Verdict

| Aspect | Assessment |
|---|---|
| Heap allocations | ✅ Optional (use _typed to avoid) |
| Type complexity | ✅ Hidden unless you opt-in |
| Middleware authoring | ✅ Only implement base trait |
| User ergonomics | ✅ Simple default, power when needed |
| Object safety | ✅ Base trait remains object-safe |

Best of both worlds - simple default, zero-cost opt-in.


Option 5: Callback-Based Iteration

Avoid returning iterators entirely:

#![allow(unused)]
fn main() {
pub trait FsDir: Send + Sync {
    fn for_each_entry<F>(&self, path: &Path, f: F) -> Result<(), FsError>
    where
        F: FnMut(DirEntry) -> ControlFlow<(), ()>;
}
}

Usage

#![allow(unused)]
fn main() {
fs.for_each_entry("/dir", |entry| {
    println!("{}", entry.name);
    ControlFlow::Continue(())
})?;
}

Verdict

| Aspect | Assessment |
|---|---|
| Heap allocations | ✅ None |
| Ergonomics | ❌ Callbacks are awkward |
| Early exit | ✅ Via ControlFlow::Break |
| Composability | ❌ Can’t chain iterator methods |

Not recommended as primary API. Could be added as optimization option.


Option 6: Stack-Allocated Small Buffer

For directory iteration, most directories are small:

#![allow(unused)]
fn main() {
use smallvec::SmallVec;

pub struct ReadDirIter {
    // Stack-allocate up to 32 entries, heap only if larger
    entries: SmallVec<[Result<DirEntry, FsError>; 32]>,
    index: usize,
}
}

Verdict

| Aspect | Assessment |
|---|---|
| Heap allocations | ⚠️ Avoided for small directories |
| Memory overhead | ⚠️ Larger stack frames |
| Dependencies | ⚠️ Adds smallvec crate |

Reasonable optimization for directory iteration specifically.


Recommendation

Primary API: Keep Dynamic Dispatch

#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
}

pub trait FsDir: Send + Sync {
    fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError>;
}
}

Why:

  1. Simplicity - One type to learn, one API
  2. Object safety - Can use Box<dyn Fs> for runtime polymorphism
  3. Middleware simplicity - No wrapper types needed
  4. Actual cost is low - One allocation per stream open, not per read

Optional: Static Dispatch Extension (Fast Path)

For performance-critical code, offer typed variants. This is the first-class fast path for hot loops when the backend type is known:

#![allow(unused)]
fn main() {
pub trait FsReadTyped: FsRead {
    type Reader: Read + Send;
    fn open_read_typed(&self, path: &Path) -> Result<Self::Reader, FsError>;
}
}

Future: RPITIT When Object Safety Not Needed

If a user doesn’t need dyn Fs, they can define their own trait:

#![allow(unused)]
fn main() {
pub trait FsReadStatic: Send + Sync {
    fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError>;
}
}

Cost Analysis: Is It Actually a Problem?

Heap Allocation Cost

| Operation | Allocations | Typical Size | Cost |
|---|---|---|---|
| open_read() | 1 | ~24-48 bytes (vtable + pointer) | ~20-50ns |
| read() (data) | 0-1 | File size | Dominates |
| read_dir() | 1 | ~24-48 bytes | ~20-50ns |
| Iteration | 0 | - | - |

The allocation is dwarfed by actual I/O time. For a 4KB file read from SQLite or disk, the Box allocation is <0.1% of total time.
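
The "one allocation per stream, zero per read" shape is easy to see in a reduced sketch using only std::io (`open_read` here is a free function standing in for the trait method):

```rust
use std::io::{Cursor, Read};

// One Box allocation when the stream is opened...
fn open_read(data: Vec<u8>) -> Box<dyn Read + Send> {
    Box::new(Cursor::new(data))
}

fn main() {
    let mut reader = open_read(vec![b'x'; 4096]);
    let mut buf = [0u8; 512];
    let mut total = 0;
    // ...then each read() is a virtual call with no further allocation.
    while let Ok(n) = reader.read(&mut buf) {
        if n == 0 { break; }
        total += n;
    }
    assert_eq!(total, 4096);
}
```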

When It Matters

| Scenario | Matters? |
|---|---|
| Reading large files | No - I/O dominates |
| Reading many small files | Maybe - consider batching |
| Hot loop micro-benchmarks | Yes |
| Real-world applications | Rarely |

Conclusion

Dynamic dispatch is the right default. The cost is negligible for real workloads, and the ergonomic benefits are substantial. Offer static dispatch as an opt-in escape hatch for the rare cases where it matters.


Summary Decision Matrix

| Approach | Recommended |
|---|---|
| Current (Box<dyn>) | ✅ Default |
| Associated Types | ❌ Too complex |
| RPITIT | ⚠️ When no dyn needed |
| GATs | ❌ Overkill |
| Hybrid | ✅ Best of both (alloc-free via opt-in) |
| Callbacks | ❌ Awkward API |
| SmallVec | ⚠️ For ReadDirIter |

Indexing Middleware (Design Plan)

Status: Accepted (Future) — see ADR-031. Scope: Design plan only (no API break).


Summary

Provide a consistent, queryable index of file activity and metadata for real filesystems (SQLite default). The index tracks operations (create, write, rename, delete) and maintains a catalog of files for fast queries and statistics. This enables workflows like “manage a flash drive and query every change” and “mount a drive and get an implicit audit trail.”

Direction: Middleware-only. Indexing is a composable layer users opt into when they want a queryable catalog of file activity.


Goals

  • Preserve std::fs-style DX via FileStorage (no change to core traits).
  • Track file operations with timestamps in a durable index (SQLite default).
  • Provide fast queries (by path, prefix, mtime, size, hash).
  • Keep index consistent for operations executed through AnyFS.
  • Keep the design open to future index engines via a small trait (SQLite default).

Non-Goals

  • Full OS-level auditing outside AnyFS (requires kernel hooks).
  • Mandatory hashing of all files (optional and expensive).
  • Replacing Tracing (indexing is storage + query, tracing is instrumentation).

Architecture (Middleware-Only)

Indexing Middleware (Primary)

Shape: Indexing<B> where B: Fs

Layer: IndexLayer (builder-based, like QuotaLayer, TracingLayer)

Behavior: Intercepts operations and writes entries into an index (SQLite by default). Works on all backends (Memory, SQLite, VRootFs, custom). Guarantees apply only to operations that flow through AnyFS.

Pros

  • Backend-agnostic.
  • Useful for virtual backends too.
  • No special-case OS behavior.

Cons

  • External changes on real FS are not captured (unless a watcher/scan helper is added later).

Where It Fits

| Use Case | Recommended |
|---|---|
| AnyFS app wants an audit trail | Indexing<B> middleware |
| Virtual backend needs queryable catalog | Indexing<B> middleware |
| Real FS with external edits to track | Indexing middleware + future watcher/scan helper |
| Mounted drive where all access goes through AnyFS | Indexing middleware (enough) |

Consistency Model

  • Through AnyFS: Strong consistency for index updates in strict mode.
  • External OS changes: Not captured by default. A future watcher/scan helper can reconcile.

Modes:

#![allow(unused)]
fn main() {
enum IndexConsistency {
    Strict,      // If index update fails, return error from FS op
    BestEffort,  // FS op succeeds even if index update fails
}
}
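
How the two modes might treat a failed index update can be sketched with plain types (`apply_index_result` is a hypothetical helper, not part of the proposed API):

```rust
#[derive(Clone, Copy, PartialEq)]
enum IndexConsistency { Strict, BestEffort }

// Given a successful FS op, decide the overall result from the index update.
fn apply_index_result(
    mode: IndexConsistency,
    index_result: Result<(), String>,
) -> Result<(), String> {
    match (mode, index_result) {
        // Strict: a failed index update fails the whole operation.
        (IndexConsistency::Strict, Err(e)) => Err(e),
        // BestEffort: the FS op still succeeds; the error would only be logged.
        _ => Ok(()),
    }
}

fn main() {
    assert!(apply_index_result(IndexConsistency::Strict, Err("db locked".into())).is_err());
    assert!(apply_index_result(IndexConsistency::BestEffort, Err("db locked".into())).is_ok());
    assert!(apply_index_result(IndexConsistency::Strict, Ok(())).is_ok());
}
```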

Index Schema (SQLite)

Minimal schema focused on query speed and durability:

CREATE TABLE IF NOT EXISTS nodes (
  path TEXT PRIMARY KEY,
  parent_path TEXT NOT NULL,
  file_type INTEGER NOT NULL,      -- 0=file, 1=dir, 2=symlink
  size INTEGER NOT NULL DEFAULT 0,
  inode INTEGER,
  nlink INTEGER,
  permissions INTEGER,
  created_at INTEGER,
  modified_at INTEGER,
  accessed_at INTEGER,
  symlink_target TEXT,
  hash BLOB,                        -- optional
  "exists" INTEGER NOT NULL DEFAULT 1,  -- quoted: EXISTS is an SQL keyword
  last_seen_at INTEGER NOT NULL
);

CREATE INDEX IF NOT EXISTS idx_nodes_parent ON nodes(parent_path);
CREATE INDEX IF NOT EXISTS idx_nodes_mtime ON nodes(modified_at);
CREATE INDEX IF NOT EXISTS idx_nodes_hash ON nodes(hash);

CREATE TABLE IF NOT EXISTS ops (
  id INTEGER PRIMARY KEY,
  ts INTEGER NOT NULL,
  op TEXT NOT NULL,                 -- "write", "rename", "remove", ...
  path TEXT,
  path_to TEXT,
  bytes INTEGER,
  status TEXT NOT NULL,             -- "ok" | "err"
  error TEXT
);

CREATE TABLE IF NOT EXISTS config (
  key TEXT PRIMARY KEY,
  value TEXT NOT NULL
);

Path normalization: Store virtual paths (what the user sees), not host paths. For host FS, optionally store host paths in a separate table if needed.


Operation Mapping

| Operation | Index Update |
|---|---|
| write, append, truncate | Upsert node, update size/mtime, log op |
| create_dir, create_dir_all | Insert dir nodes, log op |
| remove_file | Mark exists=0, log op |
| remove_dir, remove_dir_all | Mark subtree removed (prefix query), log op |
| rename | Update path + parent for subtree, log op |
| copy | Insert new node from source metadata, log op |
| symlink, hard_link | Insert node, set link metadata, log op |
| read/read_range | Optional op log only (configurable) |

Streaming writes: Wrap open_write() with a counting writer that records final size and timestamps on close.
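
A minimal counting writer can be built from std::io alone. This sketch only accumulates a shared byte counter; the real middleware would also record timestamps and flush an index row when the stream closes:

```rust
use std::io::{self, Write};
use std::sync::{Arc, Mutex};

// Counts bytes written through it; on close, the final size would be
// recorded in the index (here it is just left in the shared counter).
struct CountingWriter<W: Write> {
    inner: W,
    written: Arc<Mutex<u64>>,
}

impl<W: Write> Write for CountingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = self.inner.write(buf)?;
        *self.written.lock().unwrap() += n as u64;
        Ok(n)
    }
    fn flush(&mut self) -> io::Result<()> { self.inner.flush() }
}

fn main() {
    let written = Arc::new(Mutex::new(0u64));
    let mut w = CountingWriter { inner: Vec::new(), written: written.clone() };
    w.write_all(b"hello world").unwrap();
    assert_eq!(*written.lock().unwrap(), 11);
}
```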


Configuration

#![allow(unused)]
fn main() {
pub struct IndexConfig {
    pub index_file: PathBuf,            // sidecar index file (SQLite default)
    pub consistency: IndexConsistency,
    pub track_reads: bool,
    pub track_errors: bool,
    pub track_metadata: bool,
    pub content_hashing: ContentHashing, // None | OnWrite | OnDemand
    pub initial_scan: InitialScan,       // None | OnDemand | FullScan
}
}

Naming

  • Middleware type: Indexing<B>
  • Layer: IndexLayer
  • Builder methods emphasize intent: index_file, consistency, track_*, content_hashing, initial_scan

Example Configuration

#![allow(unused)]
fn main() {
let backend = MemoryBackend::new()
    .layer(IndexLayer::builder()
        .index_file("index.db")
        .consistency(IndexConsistency::Strict)
        .track_reads(false)
        .build());
}

Index Engine Abstraction (Future)

To keep the middleware ergonomic while enabling alternate engines, define a small storage trait and keep SQLite as the default implementation:

#![allow(unused)]
fn main() {
pub trait IndexStore: Send + Sync {
    fn upsert_node(&self, node: IndexNode) -> Result<(), IndexError>;
    fn mark_removed(&self, path: &Path) -> Result<(), IndexError>;
    fn rename_prefix(&self, from: &Path, to: &Path) -> Result<(), IndexError>;
    fn record_op(&self, entry: OpEntry) -> Result<(), IndexError>;
}

}

The default implementation uses SQLite at index_file. If/when alternate engines are needed, the IndexLayer builder can accept a boxed IndexStore for advanced use without introducing an enum.
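
A reduced sketch of the boxed-store idea (the trait below is a toy subset of the one above; `op_count` is added purely so the example is observable):

```rust
use std::sync::Mutex;

// Toy subset of the IndexStore idea; names are illustrative.
trait IndexStore: Send + Sync {
    fn record_op(&self, op: &str) -> Result<(), String>;
    fn op_count(&self) -> usize;
}

struct MemoryIndexStore {
    ops: Mutex<Vec<String>>,
}

impl IndexStore for MemoryIndexStore {
    fn record_op(&self, op: &str) -> Result<(), String> {
        self.ops.lock().unwrap().push(op.to_string());
        Ok(())
    }
    fn op_count(&self) -> usize {
        self.ops.lock().unwrap().len()
    }
}

// The layer holds a boxed store, so alternate engines plug in
// without changing the middleware's type.
struct IndexLayerSketch {
    store: Box<dyn IndexStore>,
}

fn main() {
    let layer = IndexLayerSketch {
        store: Box::new(MemoryIndexStore { ops: Mutex::new(vec![]) }),
    };
    layer.store.record_op("write /a.txt").unwrap();
    layer.store.record_op("remove /b.txt").unwrap();
    assert_eq!(layer.store.op_count(), 2);
}
```

Because the store is behind `Box<dyn IndexStore>`, swapping SQLite for another engine is a constructor change, not a type-parameter change.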


Performance Notes

  • Use WAL mode for concurrency.
  • Batch updates for recursive operations (rename/remove_dir_all).
  • Hashing is optional and should be off by default.
  • Keep op logs bounded (optional retention policy).

For detailed SQLite tuning (pragmas, connection pooling, checkpointing), see the SQLite Operations Guide.


Security and Containment

  • Index file should live outside the root path by default.
  • For mounted drives, use a dedicated index path per mount.
  • Respect PathFilter and Restrictions when operating through middleware.

Mounting Scenario

When mounted via anyfs (with fuse or winfsp feature flags), all access goes through AnyFS. The index becomes an implicit audit trail:

  • Every file operation is logged.
  • Queries reflect all operations routed through AnyFS.

Open Questions

  • Should op logs be bounded by size/time by default?
  • Do we need a query API in anyfs or a separate anyfs-index crate?
  • Should middleware expose a read-only IndexStore handle for queries?
  • Should we add a companion watcher/scan tool for external changes on real FS?

Layered Traits (anyfs-backend)

AnyFS uses a layered trait architecture for maximum flexibility with minimal complexity.

See ADR-030 for the design rationale.


Trait Hierarchy

                    FsPosix
                       │
        ┌──────────────┼──────────────┐
        │              │              │
   FsHandles       FsLock        FsXattr
        │              │              │
        └──────────────┴──────────────┘
                       │
                    FsFuse ← FsFull + FsInode
                       │
        ┌──────────────┴──────────────┐
        │                             │
     FsFull                       FsInode
        │
        │
        ├──────┬───────┬───────┬──────┐
        │      │       │       │      │
   FsLink   FsPerm  FsSync  FsStats   │
        │      │       │       │      │
        └──────┴───────┴───────┴──────┘
                       │
                      Fs   ← Most users only need this
                       │
           ┌───────────┼───────────┐
           │           │           │
        FsRead      FsWrite     FsDir

                                              Derived Traits (auto-impl)
                                              ───────────────────────────
                                              FsPath: FsRead + FsLink
                                                (path canonicalization)

Simple rule: Import Fs for basic use. Add traits as needed for advanced features.

Note: FsPath is a derived trait with a blanket impl: any type implementing FsRead + FsLink automatically gets FsPath. SelfResolving is a marker trait; backends that implement it should be used with NoOpResolver in FileStorage (there is no automatic detection; use FileStorage::with_resolver(backend, NoOpResolver) explicitly). PathResolver is a strategy trait for pluggable path resolution (see ADR-033).


Layer 1: Core Traits (Required)

Thread Safety: All traits require Send + Sync and use &self for all methods. Backend implementers MUST use interior mutability (RwLock, Mutex, etc.) to ensure thread-safe concurrent access. See ADR-023 for rationale.
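
A minimal backend skeleton following this rule, with RwLock-guarded state behind `&self` methods (error handling is simplified to String here; the real traits use FsError):

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::RwLock;

// Interior mutability: all methods take &self, state lives behind a RwLock.
struct MemorySketch {
    files: RwLock<HashMap<PathBuf, Vec<u8>>>,
}

impl MemorySketch {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), String> {
        self.files.write().unwrap().insert(path.to_path_buf(), data.to_vec());
        Ok(())
    }

    fn read(&self, path: &Path) -> Result<Vec<u8>, String> {
        self.files.read().unwrap()
            .get(path)
            .cloned()
            .ok_or_else(|| format!("not found: {}", path.display()))
    }
}

fn main() {
    let fs = MemorySketch { files: RwLock::new(HashMap::new()) };
    fs.write(Path::new("/a.txt"), b"hi").unwrap();
    assert_eq!(fs.read(Path::new("/a.txt")).unwrap(), b"hi".to_vec());
}
```

Because every method takes `&self`, the backend can be shared across threads behind an `Arc` with no external locking.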

Path Parameters: Core traits use &Path so they are object-safe (dyn Fs works). For ergonomics, FileStorage and FsExt accept impl AsRef<Path> and forward to the core traits.

FsRead

#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
    fn read_to_string(&self, path: &Path) -> Result<String, FsError>;
    fn read_range(&self, path: &Path, offset: u64, len: usize) -> Result<Vec<u8>, FsError>;
    fn exists(&self, path: &Path) -> Result<bool, FsError>;
    fn metadata(&self, path: &Path) -> Result<Metadata, FsError>;
    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
}
}

FsWrite

#![allow(unused)]
fn main() {
pub trait FsWrite: Send + Sync {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
    fn append(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
    fn remove_file(&self, path: &Path) -> Result<(), FsError>;
    fn rename(&self, from: &Path, to: &Path) -> Result<(), FsError>;
    fn copy(&self, from: &Path, to: &Path) -> Result<(), FsError>;
    fn truncate(&self, path: &Path, size: u64) -> Result<(), FsError>;
    fn open_write(&self, path: &Path) -> Result<Box<dyn Write + Send>, FsError>;
}
}

FsDir

#![allow(unused)]
fn main() {
pub trait FsDir: Send + Sync {
    fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError>;
    fn create_dir(&self, path: &Path) -> Result<(), FsError>;
    fn create_dir_all(&self, path: &Path) -> Result<(), FsError>;
    fn remove_dir(&self, path: &Path) -> Result<(), FsError>;
    fn remove_dir_all(&self, path: &Path) -> Result<(), FsError>;
}

/// Iterator over directory entries. Wraps a boxed iterator for flexibility.
///
/// - Outer `Result` (from `read_dir()`) = "can I open this directory?"
/// - Inner `Result` (per item) = "can I read this entry?"
pub struct ReadDirIter(Box<dyn Iterator<Item = Result<DirEntry, FsError>> + Send + 'static>);

impl Iterator for ReadDirIter {
    type Item = Result<DirEntry, FsError>;
    fn next(&mut self) -> Option<Self::Item> { self.0.next() }
}

impl ReadDirIter {
    pub fn new(iter: impl Iterator<Item = Result<DirEntry, FsError>> + Send + 'static) -> Self {
        Self(Box::new(iter))
    }

    /// Create from a pre-collected vector (useful for middleware like Overlay).
    pub fn from_vec(entries: Vec<Result<DirEntry, FsError>>) -> Self {
        Self(Box::new(entries.into_iter()))
    }

    /// Collect all entries, short-circuiting on first error.
    pub fn collect_all(self) -> Result<Vec<DirEntry>, FsError> {
        self.collect()
    }
}
}
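
The two-level Result shape can be exercised in isolation. This standalone toy returns a pre-collected Vec in place of a live iterator, mirroring the outer open-error vs inner per-entry-error split:

```rust
// Standalone toy mirroring the two-level Result shape of read_dir().
#[derive(Debug)]
struct DirEntry { name: String }
#[derive(Debug)]
struct FsError(String);

fn read_dir_toy() -> Result<Vec<Result<DirEntry, FsError>>, FsError> {
    // Outer Ok: the directory opened; one entry itself failed to read.
    Ok(vec![
        Ok(DirEntry { name: "a.txt".into() }),
        Err(FsError("permission denied".into())),
        Ok(DirEntry { name: "b.txt".into() }),
    ])
}

fn main() {
    let entries = read_dir_toy().expect("directory should open");
    let (mut ok, mut err) = (0, 0);
    for entry in entries {
        match entry {
            Ok(e) => { let _ = e.name; ok += 1; }
            Err(_) => err += 1,
        }
    }
    assert_eq!((ok, err), (2, 1));
}
```

`collect_all` on the real ReadDirIter would instead short-circuit on the first inner Err.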

Layer 2: Extended Traits (Optional)

FsLink

#![allow(unused)]
fn main() {
pub trait FsLink: Send + Sync {
    fn symlink(&self, target: &Path, link: &Path) -> Result<(), FsError>;
    fn hard_link(&self, original: &Path, link: &Path) -> Result<(), FsError>;
    fn read_link(&self, path: &Path) -> Result<PathBuf, FsError>;
    fn symlink_metadata(&self, path: &Path) -> Result<Metadata, FsError>;
}
}

FsPermissions

#![allow(unused)]
fn main() {
pub trait FsPermissions: Send + Sync {
    fn set_permissions(&self, path: &Path, perm: Permissions) -> Result<(), FsError>;
}
}

FsSync

#![allow(unused)]
fn main() {
pub trait FsSync: Send + Sync {
    fn sync(&self) -> Result<(), FsError>;
    fn fsync(&self, path: &Path) -> Result<(), FsError>;
}
}

FsStats

#![allow(unused)]
fn main() {
pub trait FsStats: Send + Sync {
    fn statfs(&self) -> Result<StatFs, FsError>;
}
}

FsPath (Optimizable)

Path canonicalization with a default implementation. Backends can override for optimized resolution.

#![allow(unused)]
fn main() {
pub trait FsPath: FsRead + FsLink {
    /// Resolve all symlinks and normalize path (.., .).
    /// Default: iterative resolution via read_link() and symlink_metadata().
    fn canonicalize(&self, path: &Path) -> Result<PathBuf, FsError> {
        // ... default impl ...
    }

    /// Like canonicalize, but allows non-existent final component.
    fn soft_canonicalize(&self, path: &Path) -> Result<PathBuf, FsError> {
        // ... default impl ...
    }
}
impl<T: FsRead + FsLink> FsPath for T {}
}
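
For intuition, here is a purely lexical pass over `.` and `..` components. Note this is deliberately weaker than the symlink-aware default described above (`..` after a symlink resolves differently), so it is a sketch of the normalization half only:

```rust
use std::path::{Component, Path, PathBuf};

// Purely lexical normalization of "." and ".." components.
// Ignores symlinks, unlike the symlink-aware canonicalize default.
fn normalize_lexically(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for comp in path.components() {
        match comp {
            Component::CurDir => {}                  // drop "."
            Component::ParentDir => { out.pop(); }   // pop last segment for ".."
            other => out.push(other),
        }
    }
    out
}

fn main() {
    assert_eq!(normalize_lexically(Path::new("/a/./b/../c")), PathBuf::from("/a/c"));
}
```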

SelfResolving (Marker)

Marker trait for backends that handle their own path resolution (e.g., VRootFsBackend, StdFsBackend). When using these backends, use NoOpResolver explicitly:

#![allow(unused)]
fn main() {
pub trait SelfResolving {}

// Usage:
let fs = FileStorage::with_resolver(VRootFsBackend::new("/data")?, NoOpResolver);
}

Layer 3: Inode Trait (For FUSE)

FsInode

Required for FUSE mounting (FUSE operates on inodes, not paths). Also enables correct hardlink reporting (same inode = same file, proper nlink count).

Note: FsLink defines hardlink creation (hard_link()). FsInode enables FUSE to track hardlinks via inode identity.

#![allow(unused)]
fn main() {
pub trait FsInode: Send + Sync {
    fn path_to_inode(&self, path: &Path) -> Result<u64, FsError>;
    fn inode_to_path(&self, inode: u64) -> Result<PathBuf, FsError>;
    fn lookup(&self, parent_inode: u64, name: &OsStr) -> Result<u64, FsError>;
    fn metadata_by_inode(&self, inode: u64) -> Result<Metadata, FsError>;
}
}

Layer 4: POSIX Traits (Full POSIX)

POSIX Types

#![allow(unused)]
fn main() {
/// Opaque file handle (inode-based for efficiency)
pub struct Handle(pub u64);

/// File open flags (mirrors POSIX)
#[derive(Clone, Copy, Debug)]
pub struct OpenFlags {
    pub read: bool,
    pub write: bool,
    pub create: bool,
    pub truncate: bool,
    pub append: bool,
}

impl OpenFlags {
    pub const READ: Self = Self { read: true, write: false, create: false, truncate: false, append: false };
    pub const WRITE: Self = Self { read: false, write: true, create: true, truncate: true, append: false };
    pub const READ_WRITE: Self = Self { read: true, write: true, create: false, truncate: false, append: false };
    pub const APPEND: Self = Self { read: false, write: true, create: true, truncate: false, append: true };
}

/// File lock type (mirrors POSIX flock)
#[derive(Clone, Copy, Debug)]
pub enum LockType {
    Shared,     // Multiple readers
    Exclusive,  // Single writer
}
}

FsHandles

#![allow(unused)]
fn main() {
pub trait FsHandles: Send + Sync {
    fn open(&self, path: &Path, flags: OpenFlags) -> Result<Handle, FsError>;
    fn read_at(&self, handle: Handle, buf: &mut [u8], offset: u64) -> Result<usize, FsError>;
    fn write_at(&self, handle: Handle, data: &[u8], offset: u64) -> Result<usize, FsError>;
    fn close(&self, handle: Handle) -> Result<(), FsError>;
}
}

FsLock

#![allow(unused)]
fn main() {
pub trait FsLock: Send + Sync {
    fn lock(&self, handle: Handle, lock: LockType) -> Result<(), FsError>;
    fn try_lock(&self, handle: Handle, lock: LockType) -> Result<bool, FsError>;
    fn unlock(&self, handle: Handle) -> Result<(), FsError>;
}
}

FsXattr

#![allow(unused)]
fn main() {
pub trait FsXattr: Send + Sync {
    fn get_xattr(&self, path: &Path, name: &str) -> Result<Vec<u8>, FsError>;
    fn set_xattr(&self, path: &Path, name: &str, value: &[u8]) -> Result<(), FsError>;
    fn remove_xattr(&self, path: &Path, name: &str) -> Result<(), FsError>;
    fn list_xattr(&self, path: &Path) -> Result<Vec<String>, FsError>;
}
}

Convenience Supertraits

These are automatically implemented via blanket impls:

#![allow(unused)]
fn main() {
/// Basic filesystem - covers 90% of use cases
pub trait Fs: FsRead + FsWrite + FsDir {}
impl<T: FsRead + FsWrite + FsDir> Fs for T {}

/// Full filesystem with all std::fs features
pub trait FsFull: Fs + FsLink + FsPermissions + FsSync + FsStats {}
impl<T: Fs + FsLink + FsPermissions + FsSync + FsStats> FsFull for T {}

/// FUSE-mountable filesystem
pub trait FsFuse: FsFull + FsInode {}
impl<T: FsFull + FsInode> FsFuse for T {}

/// Full POSIX filesystem
pub trait FsPosix: FsFuse + FsHandles + FsLock + FsXattr {}
impl<T: FsFuse + FsHandles + FsLock + FsXattr> FsPosix for T {}
}
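
The blanket-impl mechanics in miniature: implementing the component traits makes the umbrella trait available with no extra code (`Read2`, `Write2`, and `Both` are toy names, not the real traits):

```rust
// Toy version of the blanket-supertrait pattern used above.
trait Read2 { fn r(&self) -> u8; }
trait Write2 { fn w(&self) -> u8; }

trait Both: Read2 + Write2 {}
impl<T: Read2 + Write2> Both for T {}  // implementing the parts yields the whole

struct Backend;
impl Read2 for Backend { fn r(&self) -> u8 { 1 } }
impl Write2 for Backend { fn w(&self) -> u8 { 2 } }

// Functions can bound on the umbrella trait alone.
fn use_both<T: Both>(t: &T) -> u8 { t.r() + t.w() }

fn main() {
    assert_eq!(use_both(&Backend), 3);
}
```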

When to Use Each Level

| Level | Trait | Use When |
|---|---|---|
| 1 | Fs | Basic file operations (read, write, dirs) |
| 2 | FsFull | Need links, permissions, sync, or stats |
| 3 | FsFuse | FUSE mounting or hardlink support |
| 4 | FsPosix | Full POSIX (file handles, locks, xattr) |

Implementing Functions

Use trait bounds to specify requirements:

#![allow(unused)]
fn main() {
use anyfs::FileStorage;

// Works with any backend, keeps std::fs-style paths
fn process_files<B: Fs>(fs: &FileStorage<B>) -> Result<(), FsError> {
    let data = fs.read("/input.txt")?;
    fs.write("/output.txt", &data)?;
    Ok(())
}

// Requires link support
fn create_backup<B: Fs + FsLink>(fs: &FileStorage<B>) -> Result<(), FsError> {
    fs.hard_link("/data.txt", "/data.txt.bak")?;
    Ok(())
}

// Requires FsFuse trait + fuse/winfsp feature
fn mount_filesystem(fs: impl FsFuse) -> Result<(), MountError> {
    anyfs::MountHandle::mount(fs, "/mnt/myfs")?;
    Ok(())
}
}

Extension Trait

FsExt provides convenience methods for any Fs backend:

#![allow(unused)]
fn main() {
pub trait FsExt: Fs {
    /// Check if path is a file.
    fn is_file(&self, path: impl AsRef<Path>) -> Result<bool, FsError>;

    /// Check if path is a directory.
    fn is_dir(&self, path: impl AsRef<Path>) -> Result<bool, FsError>;

    /// JSON methods (require optional `serde` feature in anyfs-backend)
    #[cfg(feature = "serde")]
    fn read_json<T: DeserializeOwned>(&self, path: impl AsRef<Path>) -> Result<T, FsError>;
    #[cfg(feature = "serde")]
    fn write_json<T: Serialize>(&self, path: impl AsRef<Path>, value: &T) -> Result<(), FsError>;
}

// Blanket implementation (method bodies elided for brevity)
impl<B: Fs> FsExt for B { /* ... */ }
}

FileStorage (anyfs)

Ergonomic wrapper for std::fs-aligned API


Overview

FileStorage<B> is a thin wrapper that provides a familiar std::fs-aligned API with:

  • B - Backend type (the only generic)
  • Built-in path resolution via boxed PathResolver (swappable at runtime)

It is the intended application-facing API: std::fs-style paths with object-safe core traits under the hood.

It does TWO things:

  1. Ergonomics (std::fs-aligned API with impl AsRef<Path> convenience)
  2. Path resolution for virtual backends (via boxed PathResolver - cold path, boxing is acceptable)

All policy (limits, feature gates, logging) is handled by middleware, not FileStorage.


Why Only One Generic?

Previous designs used FileStorage<B, R, M> with three type parameters. We simplified to FileStorage<B>:

| Old Param | Purpose | Why Removed |
|---|---|---|
| R (Resolver) | Swappable path resolution | Boxed internally—resolution is a cold path (ADR-025) |
| M (Marker) | Compile-time safety | Users can create wrapper newtypes if needed |

Result: Simpler API for 90% of users. Those who need type-safe markers wrap FileStorage themselves.


Creating a Container

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, FileStorage};

// Simple: ergonomics + default path resolution
let fs = FileStorage::new(MemoryBackend::new());
}

With middleware (layer-based):

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, RestrictionsLayer, FileStorage};

let fs = FileStorage::new(
    MemoryBackend::new()
        .layer(QuotaLayer::builder()
            .max_total_size(100 * 1024 * 1024)
            .build())
        .layer(RestrictionsLayer::builder()
            .deny_permissions()
            .build())
);
}

With custom resolver:

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, FileStorage};
use anyfs::resolvers::{CachingResolver, IterativeResolver};

// Custom resolver for read-heavy workloads
let fs = FileStorage::with_resolver(
    MemoryBackend::new(),
    CachingResolver::new(IterativeResolver::default())
);
}

Type-Safe Markers (User-Defined Wrappers)

If you need compile-time safety to prevent mixing filesystems, create wrapper newtypes:

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, FileStorage};
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate

// Define your own wrapper types
struct SandboxFs(FileStorage<MemoryBackend>);
struct UserDataFs(FileStorage<SqliteBackend>);

impl SandboxFs {
    fn new() -> Self {
        SandboxFs(FileStorage::new(MemoryBackend::new()))
    }
}

// Type-safe function signatures prevent mixing
fn process_sandbox(fs: &SandboxFs) {
    // Can only accept SandboxFs
}

fn save_user_file(fs: &UserDataFs, name: &str, data: &[u8]) {
    // Can only accept UserDataFs
}

// Compile-time safety:
let sandbox = SandboxFs::new();
process_sandbox(&sandbox);     // OK
// process_sandbox(&userdata); // Compile error! Wrong type
}

When to Use Wrapper Types

| Scenario | Use Wrapper? | Why |
|---|---|---|
| Single container | No | FileStorage<B> is sufficient |
| Multiple containers, same type | Yes | Prevent accidental mixing |
| Multi-tenant systems | Yes | Compile-time tenant isolation |
| Sandbox + user data | Yes | Never write user data to sandbox |

std::fs-aligned Methods

FileStorage mirrors std::fs naming:

| FileStorage | std::fs |
|---|---|
| read() | std::fs::read |
| read_to_string() | std::fs::read_to_string |
| write() | std::fs::write |
| read_dir() | std::fs::read_dir |
| create_dir() | std::fs::create_dir |
| create_dir_all() | std::fs::create_dir_all |
| remove_file() | std::fs::remove_file |
| remove_dir() | std::fs::remove_dir |
| remove_dir_all() | std::fs::remove_dir_all |
| rename() | std::fs::rename |
| copy() | std::fs::copy |
| metadata() | std::fs::metadata |
| symlink_metadata() | std::fs::symlink_metadata |
| read_link() | std::fs::read_link |
| set_permissions() | std::fs::set_permissions |

When the backend implements extended traits (e.g., FsLink, FsInode, FsHandles), FileStorage forwards those methods too and keeps the same impl AsRef<Path> ergonomics for path parameters.


What FileStorage Does NOT Do

| Concern | Use Instead |
|---|---|
| Quota enforcement | Quota<B> |
| Feature gating | Restrictions<B> |
| Audit logging | Tracing<B> |
| Path containment | PathFilter middleware or VRootFsBackend containment |

FileStorage is not a policy layer. If you need policy, compose middleware.


FileStorage Implementation

#![allow(unused)]
fn main() {
use anyfs_backend::{Fs, PathResolver};
use anyfs::resolvers::IterativeResolver;

/// Ergonomic wrapper with single generic parameter.
pub struct FileStorage<B> {
    backend: B,
    resolver: Box<dyn PathResolver>,  // Boxed: resolution is cold path
}

impl<B: Fs> FileStorage<B> {
    /// Create with default resolver (IterativeResolver).
    pub fn new(backend: B) -> Self {
        FileStorage {
            backend,
            resolver: Box::new(IterativeResolver::new()),
        }
    }

    /// Create with custom resolver.
    pub fn with_resolver(backend: B, resolver: impl PathResolver + 'static) -> Self {
        FileStorage {
            backend,
            resolver: Box::new(resolver),
        }
    }

    /// Type-erase the backend for simpler types (opt-in boxing).
    pub fn boxed(self) -> FileStorage<Box<dyn Fs>> {
        FileStorage {
            backend: Box::new(self.backend),
            resolver: self.resolver,
        }
    }
}
}

Path Resolution

FileStorage handles path resolution for virtual backends via the boxed PathResolver. The default IterativeResolver provides symlink-aware canonicalization.

Backends implementing SelfResolving (like VRootFsBackend) skip resolution since the OS handles it.


Type Erasure (Opt-in)

When you need simpler types (e.g., storing in collections), use .boxed():

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, FileStorage, Fs};
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate

// Type-erased for uniform storage
let filesystems: Vec<FileStorage<Box<dyn Fs>>> = vec![
    FileStorage::new(MemoryBackend::new()).boxed(),
    FileStorage::new(SqliteBackend::open("a.db")?).boxed(),
    FileStorage::new(SqliteBackend::open("b.db")?).boxed(),
];
}

When to use .boxed():

| Situation | Use Generic | Use .boxed() |
| --- | --- | --- |
| Local variables | Yes | No |
| Function params | Yes (impl Fs) | No |
| Return types | Yes (impl Fs) | No |
| Collections of mixed backends | No | Yes |
| Struct fields (want simple type) | Maybe | Yes |

Direct Backend Access

If you don’t need the wrapper, use backends directly:

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, FileStorage};

let backend = MemoryBackend::new()
    .layer(QuotaLayer::builder()
        .max_total_size(100 * 1024 * 1024)
        .build());

// Use FileStorage for std::fs-style paths
let fs = FileStorage::new(backend);
fs.write("/file.txt", b"data")?;
}

FileStorage<B> is part of the anyfs crate, not a separate crate.

Backends Guide

This guide explains each backend available for AnyFS—both built-in (in anyfs) and ecosystem crates (anyfs-sqlite, anyfs-indexed)—how they work internally, when to use them, and the trade-offs involved.


Quick Reference: Which Backend Should You Use?

TL;DR — Pick the first match from top to bottom:

| Your Situation | Best Choice | Why |
| --- | --- | --- |
| Writing tests | MemoryBackend | Fast, isolated, no cleanup |
| Running in WASM/browser | MemoryBackend | Simplest option |
| Need encrypted single-file storage | anyfs-sqlite: SqliteBackend | AES-256 via encryption feature (ecosystem crate) |
| Need portable single-file database | anyfs-sqlite: SqliteBackend | Cross-platform, ACID (ecosystem crate) |
| Large files (>100MB) with path isolation | anyfs-indexed: IndexedBackend | Virtual paths + native disk I/O (ecosystem crate) |
| Containing untrusted code to a directory | VRootFsBackend | Prevents path traversal attacks |
| Working with real files in trusted environment | StdFsBackend | Direct OS operations |
| Need layered filesystem (container-like) | Overlay (middleware) | Base + writable upper layer |

⚠️ Security Warning: StdFsBackend provides NO isolation. Never use with untrusted input.

Ecosystem Crates: Complex backends like SqliteBackend and IndexedBackend live in separate crates (anyfs-sqlite, anyfs-indexed) because they require internal runtime complexity (connection pooling, sharding, chunking).


Backend Categories

AnyFS backends fall into two fundamental categories based on who resolves paths:

| Category | Path Resolution | Symlink Handling | Isolation |
| --- | --- | --- | --- |
| Type 1: Virtual Filesystem | PathResolver (pluggable) | Simulated by AnyFS | Complete |
| Type 2: Real Filesystem | Operating System | Delegated to OS | Partial/None |

Type 1: Virtual Filesystem Backends

These backends store filesystem data in an abstract format (memory, database, etc.). AnyFS handles path resolution via pluggable PathResolver (see ADR-033), including:

  • Path traversal (.., .)
  • Symlink following (simulated)
  • Hard link tracking (simulated)
  • Path normalization

Key benefit: Complete isolation from the host OS. Identical behavior across all platforms.

Type 2: Real Filesystem Backends

These backends delegate operations to the actual operating system. The OS handles path resolution, which means:

  • Native symlink behavior
  • Native permission enforcement
  • Platform-specific edge cases
  • Potential security considerations (path escapes)

Key benefit: Native performance and compatibility with existing files.


Type 1: Virtual Filesystem Backends

MemoryBackend

An in-memory filesystem. All data lives in RAM and is lost when the process exits.

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, FileStorage};

let fs = FileStorage::new(MemoryBackend::new());
fs.write("/data.txt", b"Hello, World!")?;
}

How It Works

  • Files and directories stored in a tree structure (HashMap or similar)
  • Symlinks stored as data pointing to target paths
  • Hard links share the same underlying data node
  • All operations are memory-only (no disk I/O)
  • Supports snapshots via Clone and persistence via save_to()/load_from()

Performance

| Operation | Speed | Notes |
| --- | --- | --- |
| Read/Write | Very Fast | No I/O, pure memory operations |
| Path Resolution | Very Fast | In-memory tree traversal |
| Large Files | ⚠️ Memory-bound | Limited by available RAM |

Advantages

  • Fastest backend - no disk I/O overhead
  • Deterministic - perfect for testing
  • Portable - works on all platforms including WASM
  • Snapshots - Clone creates instant backups
  • No cleanup - no temp files to delete

Disadvantages

  • Volatile - data lost on process exit (unless serialized)
  • Memory-limited - large filesystems consume RAM
  • No persistence - must explicitly save/load state

When to Use

| Use Case | Recommendation |
| --- | --- |
| Unit tests | Ideal - fast, isolated, deterministic |
| Integration tests | Ideal - no filesystem pollution |
| Temporary workspaces | Good - fast scratch space |
| Build caches | Good - if fits in memory |
| WASM/Browser | Ideal - simplest option |
| Large file storage | Avoid - use anyfs-sqlite or disk |
| Persistent data | Avoid - unless you handle serialization |

✅ USE MemoryBackend when:

  • Writing unit tests (fast, isolated, deterministic)
  • Writing integration tests (no filesystem pollution)
  • Building temporary workspaces or scratch space
  • Caching data that fits in memory
  • Running in WASM/browser environments (simplest option)
  • Need instant snapshots via Clone

❌ DON’T USE MemoryBackend when:

  • Storing files larger than available RAM
  • Data must survive process restart (use anyfs-sqlite)
  • Working with existing files on disk (use VRootFsBackend)
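The "instant snapshots via Clone" point can be illustrated with a toy owned-map store. This is a sketch of the semantics only, not MemoryBackend's implementation: because all state is owned data, cloning produces an independent deep copy.

```rust
use std::collections::HashMap;

/// Toy in-memory "filesystem": all state is an owned map, so Clone is a
/// deep copy, i.e. an instant, independent snapshot.
#[derive(Clone)]
struct ToyMemFs {
    files: HashMap<String, Vec<u8>>,
}

fn main() {
    let mut fs = ToyMemFs { files: HashMap::new() };
    fs.files.insert("/data.txt".into(), b"v1".to_vec());

    let snapshot = fs.clone();                          // instant backup
    fs.files.insert("/data.txt".into(), b"v2".to_vec()); // keep working

    assert_eq!(fs.files["/data.txt"], b"v2");        // live copy moved on
    assert_eq!(snapshot.files["/data.txt"], b"v1");  // snapshot unchanged
}
```

The same pattern is why snapshots are cheap to reason about: there is no shared mutable state between the snapshot and the live filesystem.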

SqliteBackend (Ecosystem Crate)

Crate: anyfs-sqlite

Complex backends live in separate crates. See AGENTS.md “Crate Ecosystem” section.

Stores the entire filesystem in a single SQLite database file.

#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend;
use anyfs::FileStorage;

let fs = FileStorage::new(SqliteBackend::open("myfs.db")?);
fs.write("/documents/report.txt", b"Annual Report")?;
}

How It Works

  • Single .db file contains all files, directories, and metadata
  • Schema: nodes table (path, type, content, permissions, timestamps)
  • Symlinks stored as rows with target path in content
  • Hard links share the same inode (row ID)
  • Uses WAL mode for concurrent read access
  • Connection pooling: multiple readers, single writer with batching
  • Write batching: groups operations into transactions for efficiency
  • Transactions ensure atomic operations

Key insight: “Writes are expensive.” SqliteBackend batches writes internally because one transaction per batch is far more efficient than one transaction per operation.

Performance

| Operation | Speed | Notes |
| --- | --- | --- |
| Read/Write | 🐢 Slower | SQLite query overhead |
| Path Resolution | 🐢 Slower | Database lookups per component |
| Transactions | Atomic | ACID guarantees |
| Large Files | 🟡 Varies | See note below |

Large file behavior: SQLite streams BLOB content incrementally via sqlite3_blob_read/write, so files don’t need to fit entirely in RAM. However, very large BLOBs (>100MB) can cause higher memory pressure during I/O operations due to SQLite’s internal buffering and page management. For frequent large file operations, consider IndexedBackend which uses native file I/O.

Performance note: SQLite performance varies significantly based on hardware, configuration, and workload. With proper tuning (WAL mode, connection pooling, write batching), a single SQLite database on modern hardware can achieve high throughput. See sqlite-operations.md for tuning guidance.

Advantages

  • Single-file portability - entire filesystem in one .db file
  • ACID transactions - atomic operations, crash recovery
  • Cross-platform - works on all platforms including WASM
  • Complete isolation - no interaction with host filesystem
  • Queryable - can inspect with SQLite tools
  • Optional encryption - AES-256 via SQLCipher with encryption feature

Disadvantages

  • Slower than memory - database overhead on every operation
  • Single-writer - SQLite’s write lock limits concurrency
  • Large file overhead - very large BLOBs (>100MB) have higher memory pressure due to SQLite buffering

When to Use

| Use Case | Recommendation |
| --- | --- |
| Portable storage | Ideal - single file, works everywhere |
| Embedded databases | Ideal - self-contained |
| Sandboxed environments | Good - complete isolation |
| Encrypted storage | Good - use open_encrypted() with feature |
| Archive/backup | Good - atomic, portable |
| Large media files | Works - higher memory pressure during I/O |
| High-throughput I/O | ⚠️ Tradeoff - database overhead vs MemoryBackend |
| External tool access | Avoid - files not on real filesystem |

✅ USE SqliteBackend when:

  • Need portable, single-file storage (easy to copy, backup, share)
  • Building embedded/self-contained applications
  • Complete isolation from host filesystem is required
  • Want encryption (use open_encrypted() with encryption feature)
  • Need ACID transactions and crash recovery
  • Cross-platform consistency is critical

❌ DON’T USE SqliteBackend when:

  • Files must be accessible to external tools (use VRootFsBackend)
  • Minimizing memory pressure for very large files is critical (use anyfs-indexed: IndexedBackend)

💡 SqliteBackend vs IndexedBackend: Both provide complete path isolation. Choose SqliteBackend for single-file portability and portable storage. Choose IndexedBackend (anyfs-indexed) for very large files (>100MB) that need native disk streaming performance.


IndexedBackend (Ecosystem Crate)

Crate: anyfs-indexed

Complex backends live in separate crates. See AGENTS.md “Crate Ecosystem” section.

A hybrid backend: virtual paths with disk-based content storage. Paths, directories, symlinks, and metadata are stored in an index database. File content is stored on the real filesystem as opaque blobs.

Key insight: Same isolation model as SqliteBackend, but file content stored externally for native I/O performance with large files.

#![allow(unused)]
fn main() {
use anyfs_indexed::IndexedBackend;
use anyfs::FileStorage;

// Files stored in ./storage/, index in ./storage/index.db
let fs = FileStorage::new(IndexedBackend::open("./storage")?);
fs.write("/documents/report.pdf", &pdf_bytes)?;
// Actually stored as: ./storage/a1b2c3d4-5678-...-1704067200.bin
}

How It Works

Virtual Path                    Real Storage
─────────────────────────────────────────────────────
/documents/report.pdf    →    ./storage/blobs/a1b2c3d4-...-1704067200.bin
/images/photo.jpg        →    ./storage/blobs/b2c3d4e5-...-1704067201.bin
/config.json             →    ./storage/blobs/c3d4e5f6-...-1704067202.bin

index.db contains:
┌─────────────────────────┬──────────────────────────────┬──────────┐
│ virtual_path            │ blob_name                    │ metadata │
├─────────────────────────┼──────────────────────────────┼──────────┤
│ /documents/report.pdf   │ a1b2c3d4-...-1704067200.bin  │ {...}    │
│ /images/photo.jpg       │ b2c3d4e5-...-1704067201.bin  │ {...}    │
└─────────────────────────┴──────────────────────────────┴──────────┘

  • Virtual filesystem, real content: Directory structure, paths, symlinks, and metadata are virtual (stored in index.db). Only raw file content lives on disk as opaque blobs.
  • Files stored with UUID + timestamp names (flat, meaningless filenames)
  • index.db SQLite database maps virtual paths to blob names
  • Symlinks and hard links are simulated in the index (not OS symlinks)
  • Path resolution handled by AnyFS framework (Type 1 backend)
  • File content streamed directly from disk (native I/O performance)

Performance

| Operation | Speed | Notes |
| --- | --- | --- |
| Read/Write | 🟢 Fast | Native disk I/O for content |
| Path Resolution | 🟡 Moderate | Index lookup + disk access |
| Large Files | Excellent | Streamed directly from disk |
| Metadata Ops | 🟢 Fast | Index-only, no disk I/O |

Index Optimization

The SQLite index benefits from the same performance tuning as SqliteBackend:

| Setting | Default | Purpose |
| --- | --- | --- |
| journal_mode | WAL | Concurrent reads during metadata updates |
| synchronous | FULL | Index integrity on power loss (safe default) |
| cache_size | 16MB | Smaller cache for metadata-only index |
| busy_timeout | 5000ms | Gracefully handle lock contention |
| auto_vacuum | INCREMENTAL | Reclaim space from deleted entries |

Connection pooling: 4-8 reader connections for concurrent index queries; single writer for metadata updates. Blob I/O bypasses SQLite entirely, so the bottleneck is typically blob disk throughput, not index performance.

See anyfs-indexed#9 for detailed performance guidance.

Advantages

  • Native file I/O - content stored as raw files, fast streaming
  • Large file support - uses OS file I/O, avoids SQLite BLOB buffering overhead
  • Complete path isolation - virtual paths, same as SqliteBackend
  • Inspectable - can see blob files on disk (though with opaque names)
  • Cross-platform - works identically on all platforms

Disadvantages

  • Index dependency - losing index.db = losing virtual structure (blobs become orphaned)
  • Two-component backup - must copy directory + index.db together
  • Content exposure - blob files are readable on disk (paths are hidden, content is not)
  • Not single-file portable - unlike SqliteBackend

When to Use

| Use Case | Recommendation |
| --- | --- |
| Large file storage | Ideal - native I/O performance |
| Media libraries | Ideal - stream large videos/images |
| Document management | Good - virtual paths + fast I/O |
| Sandboxed + large files | Ideal - virtual paths, real I/O |
| Single-file portability | Avoid - use anyfs-sqlite: SqliteBackend |
| Content confidentiality | ⚠️ Wrap - use Encryption middleware for protection |
| WASM/Browser | Avoid - requires real filesystem |

✅ USE IndexedBackend when:

  • Storing large files (videos, images, documents >100MB)
  • Need native I/O performance for streaming content
  • Building media libraries or document management systems
  • Want virtual path isolation but with real disk performance
  • Files are large but path structure should be sandboxed

❌ DON’T USE IndexedBackend when:

  • Need single-file portability (use anyfs-sqlite: SqliteBackend)
  • Content must be hidden from host filesystem (use anyfs-sqlite: SqliteBackend with encryption)
  • Need WASM/browser support (use MemoryBackend)

🔒 Encryption Tip: If you need large file performance but content confidentiality matters, you can implement an Encryption<B> middleware wrapper to encrypt blob contents at rest. This is a user-defined middleware pattern (not built-in) - see the middleware implementation guide for how to create custom middleware. Alternatively, use SqliteBackend with SQLCipher encryption for simpler encrypted storage.
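The wrap-and-delegate shape such a middleware would take can be sketched with a toy store trait. Everything here is illustrative: ByteStore and MemStore are stand-ins invented for this example (not the Fs trait), and XOR is a placeholder so the sketch runs; a real Encryption<B> layer would use an AEAD cipher such as AES-GCM or XChaCha20-Poly1305.

```rust
use std::collections::HashMap;

/// Minimal stand-in for a backend's read/write surface (not the real Fs trait).
trait ByteStore {
    fn write(&mut self, path: &str, data: &[u8]);
    fn read(&self, path: &str) -> Option<Vec<u8>>;
}

/// Middleware shape: transform content, then delegate to the inner backend.
/// XOR is a placeholder, NOT encryption - it only shows the structure.
struct Encryption<B> {
    inner: B,
    key: u8,
}

impl<B: ByteStore> ByteStore for Encryption<B> {
    fn write(&mut self, path: &str, data: &[u8]) {
        let sealed: Vec<u8> = data.iter().map(|b| b ^ self.key).collect();
        self.inner.write(path, &sealed); // inner backend only ever sees ciphertext
    }
    fn read(&self, path: &str) -> Option<Vec<u8>> {
        self.inner
            .read(path)
            .map(|sealed| sealed.iter().map(|b| b ^ self.key).collect())
    }
}

/// Trivial in-memory inner backend for the demo.
struct MemStore(HashMap<String, Vec<u8>>);
impl ByteStore for MemStore {
    fn write(&mut self, path: &str, data: &[u8]) {
        self.0.insert(path.into(), data.to_vec());
    }
    fn read(&self, path: &str) -> Option<Vec<u8>> {
        self.0.get(path).cloned()
    }
}

fn main() {
    let mut fs = Encryption { inner: MemStore(HashMap::new()), key: 0x5A };
    fs.write("/secret.txt", b"hello");
    assert_eq!(fs.read("/secret.txt").unwrap(), b"hello"); // round-trips for callers
    // At rest, the inner store does not hold plaintext:
    assert_ne!(fs.inner.read("/secret.txt").unwrap(), b"hello".to_vec());
}
```

The point of the shape is that callers see plaintext while the wrapped backend only ever stores transformed bytes; a real implementation would follow the middleware guide's Layer pattern.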


Type 2: Real Filesystem Backends

StdFsBackend

Direct delegation to std::fs. Every call maps 1:1 to the standard library.

#![allow(unused)]
fn main() {
use anyfs::{StdFsBackend, FileStorage, NoOpResolver};

// SelfResolving backends require explicit NoOpResolver
let fs = FileStorage::with_resolver(StdFsBackend::new(), NoOpResolver);
fs.write("/tmp/data.txt", b"Hello")?; // Actually writes to /tmp/data.txt
}

How It Works

  • Every method directly calls the equivalent std::fs function
  • Paths passed through unchanged
  • OS handles all resolution, symlinks, permissions
  • Implements SelfResolving marker (use NoOpResolver to skip virtual resolution)

Performance

| Operation | Speed | Notes |
| --- | --- | --- |
| Read/Write | 🟢 Normal | Native OS speed |
| Path Resolution | Fast | OS kernel handles it |
| Symlinks | Native | OS behavior |

Advantages

  • Zero overhead - direct OS calls
  • Full compatibility - works with all existing files
  • Native features - OS permissions, ACLs, xattrs
  • Middleware-ready - add Quota, Tracing, etc. to real filesystem

Disadvantages

  • No isolation - full filesystem access
  • No containment - paths can escape anywhere
  • Platform differences - Windows vs Unix behavior
  • Security risk - must trust path inputs

When to Use

| Use Case | Recommendation |
| --- | --- |
| Adding middleware to real FS | Ideal - wrap with Quota, Tracing |
| Trusted environments | Good - when isolation not needed |
| Migration path | Good - gradually add AnyFS features |
| Full host FS features | Good - ACLs, xattrs, etc. |
| Untrusted input | Never - use VRootFsBackend |
| Sandboxing | Never - no containment whatsoever |
| Multi-tenant systems | Avoid - use virtual backends |

✅ USE StdFsBackend when:

  • Adding middleware (Quota, Tracing, etc.) to real filesystem operations
  • Operating in a fully trusted environment with controlled inputs
  • Migrating existing code to AnyFS incrementally
  • Need full access to host filesystem features (ACLs, xattrs)
  • Building tools that work with user’s actual files

❌ DON’T USE StdFsBackend when:

  • Handling untrusted path inputs (use VRootFsBackend)
  • Any form of sandboxing is required (no containment!)
  • Building multi-tenant systems (use virtual backends)
  • Security isolation matters at all

⚠️ Security Warning: StdFsBackend provides ZERO isolation. Paths like ../../etc/passwd will work. Only use with fully trusted, controlled inputs.


VRootFsBackend

Sets a directory as a virtual root. All operations are contained within it.

Feature: vrootfs

#![allow(unused)]
fn main() {
use anyfs::{VRootFsBackend, FileStorage, NoOpResolver};

// /home/user/sandbox becomes the virtual "/"
// SelfResolving backends require explicit NoOpResolver
let fs = FileStorage::with_resolver(
    VRootFsBackend::new("/home/user/sandbox")?,
    NoOpResolver
);

fs.write("/data.txt", b"Hello")?; 
// Actually writes to: /home/user/sandbox/data.txt

fs.read("/../../../etc/passwd")?;
// Resolves to: /home/user/sandbox/etc/passwd (clamped!)
}

How It Works

  • Configured with a real directory as the “virtual root”
  • All paths are validated and clamped to stay within root
  • Uses strict-path crate for escape prevention
  • Symlinks are followed but targets validated
  • Implements SelfResolving marker (OS handles resolution after validation)

Virtual Path          Validation              Real Path
───────────────────────────────────────────────────────────────
/data.txt        →   validate & join    →   /home/user/sandbox/data.txt
/../../../etc    →   clamp to root      →   /home/user/sandbox/etc
/link → /tmp     →   validate target    →   ERROR or clamped
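The clamp-then-join step in the table above can be sketched lexically with the standard library. This is only an illustration of the clamping idea: it ignores symlinks and TOCTOU concerns entirely, which is precisely what the strict-path crate exists to handle.

```rust
use std::path::{Component, Path, PathBuf};

/// Join a virtual path onto `root`, clamping ".." at the virtual root so the
/// result can never escape. Sketch only: no symlink or TOCTOU handling.
fn clamp_join(root: &Path, virtual_path: &str) -> PathBuf {
    let mut out = PathBuf::from(root);
    let depth_base = out.components().count(); // never pop past the root
    for comp in Path::new(virtual_path).components() {
        match comp {
            Component::ParentDir => {
                if out.components().count() > depth_base {
                    out.pop();
                }
            }
            Component::Normal(name) => out.push(name),
            _ => {} // RootDir and CurDir contribute nothing
        }
    }
    out
}

fn main() {
    let root = Path::new("/home/user/sandbox");
    assert_eq!(
        clamp_join(root, "/data.txt"),
        PathBuf::from("/home/user/sandbox/data.txt")
    );
    // The escape attempt from the example above is clamped:
    assert_eq!(
        clamp_join(root, "/../../../etc/passwd"),
        PathBuf::from("/home/user/sandbox/etc/passwd")
    );
}
```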

Performance

| Operation | Speed | Notes |
| --- | --- | --- |
| Read/Write | 🟡 Moderate | Validation overhead |
| Path Resolution | 🐢 Slower | Extra I/O for symlink checks |
| Symlink Following | 🐢 Slower | Must validate each hop |

Advantages

  • Path containment - cannot escape virtual root
  • Real file access - native OS performance for content
  • Symlink safety - targets validated against root
  • Drop-in sandboxing - wrap existing directories

Disadvantages

  • Performance overhead - validation on every operation
  • Extra I/O - symlink following requires lstat calls
  • Platform quirks - symlink behavior varies (especially Windows)
  • Theoretical edge cases - TOCTOU races exist but are difficult to exploit

When to Use

| Use Case | Recommendation |
| --- | --- |
| User uploads directory | Ideal - contain user content |
| Plugin sandboxing | Good - limit plugin file access |
| Chroot-like isolation | Good - without actual chroot |
| AI agent workspaces | Good - bound agent to directory |
| Real FS + path containment | Ideal - native I/O with boundaries |
| Maximum security | ⚠️ Careful - theoretical TOCTOU exists |
| Cross-platform symlinks | ⚠️ Careful - Windows behavior differs |
| Complete host isolation | Avoid - use SqliteBackend instead |

✅ USE VRootFsBackend when:

  • Containing user-uploaded content to a specific directory
  • Sandboxing plugins, extensions, or untrusted code
  • Need chroot-like isolation without actual chroot privileges
  • Building AI agent workspaces with filesystem boundaries
  • Want real filesystem performance with path containment

❌ DON’T USE VRootFsBackend when:

  • Maximum security required (theoretical TOCTOU edge cases exist - use MemoryBackend)
  • Need highest I/O performance (validation adds overhead)
  • Cross-platform symlink consistency is critical (Windows differs)
  • Want complete isolation from host (use SqliteBackend)

🔒 Encryption Tip: For sensitive data in sandboxed directories (user uploads, plugin workspaces, AI agent data), consider implementing an Encryption<B> middleware wrapper. This is a user-defined middleware pattern - you would create a custom Layer that encrypts data before delegating to the inner backend. See the middleware implementation guide for the pattern.


Composition Middleware

Overlay<Base, Upper>

Union filesystem middleware combining a read-only base with a writable upper layer.

Note: Overlay is middleware (in anyfs/middleware/overlay.rs), not a standalone backend. It composes two backends into a layered view.

#![allow(unused)]
fn main() {
use anyfs::{VRootFsBackend, MemoryBackend, Overlay, FileStorage};

// Base: read-only template
let base = VRootFsBackend::new("/var/templates")?;

// Upper: writable scratch layer  
let upper = MemoryBackend::new();

let fs = FileStorage::new(Overlay::new(base, upper));

// Read: checks upper first, falls back to base
let data = fs.read("/config.txt")?;

// Write: always goes to upper
fs.write("/config.txt", b"modified")?;

// Delete: creates "whiteout" in upper, shadows base
fs.remove_file("/unwanted.txt")?;
}

How It Works

┌─────────────────────────────────────────────────┐
│                  Overlay<B, U>                  │
├─────────────────────────────────────────────────┤
│  Read:   upper.exists(path)?                    │
│            → upper.read(path)                   │
│            : base.read(path)                    │
│                                                 │
│  Write:  upper.write(path, data)                │
│          (base unchanged)                       │
│                                                 │
│  Delete: upper.mark_whiteout(path)              │
│          (shadows base, doesn't delete it)      │
│                                                 │
│  List:   merge(base.read_dir(), upper.read_dir())│
│          - exclude whiteouts                    │
└─────────────────────────────────────────────────┘

         ┌──────────────┐
         │    Upper     │  ← Writes go here
         │ (MemoryFs)   │  ← Modifications stored here
         │              │  ← Whiteouts (deletions) here
         └──────┬───────┘
                │ if not found
                ▼
         ┌──────────────┐
         │     Base     │  ← Read-only layer
         │ (SqliteFs)   │  ← Original/template data
         │              │  ← Never modified
         └──────────────┘

  • Reads: Check upper layer first, fall back to base
  • Writes: Always go to upper layer (base is read-only)
  • Deletes: Create “whiteout” marker in upper (shadows base file)
  • Directory listing: Merge both layers, exclude whiteouts
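The four rules above can be sketched with two maps and a whiteout set. This toy works on in-memory maps rather than Fs backends, so it only demonstrates the semantics, not the real Overlay middleware:

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Toy file-level overlay: writable upper map + whiteout set over a
/// read-only base map.
struct ToyOverlay {
    base: BTreeMap<String, Vec<u8>>,  // read-only layer, never modified
    upper: BTreeMap<String, Vec<u8>>, // all writes land here
    whiteouts: BTreeSet<String>,      // deletions that shadow the base
}

impl ToyOverlay {
    fn read(&self, path: &str) -> Option<&[u8]> {
        if self.whiteouts.contains(path) {
            return None; // deleted: base copy is shadowed
        }
        self.upper
            .get(path)
            .or_else(|| self.base.get(path)) // fall back to base
            .map(|v| v.as_slice())
    }
    fn write(&mut self, path: &str, data: &[u8]) {
        self.whiteouts.remove(path); // re-creating a deleted file revives it
        self.upper.insert(path.to_string(), data.to_vec());
    }
    fn remove(&mut self, path: &str) {
        self.upper.remove(path);
        self.whiteouts.insert(path.to_string()); // whiteout marker
    }
    fn list(&self) -> Vec<&str> {
        self.base
            .keys()
            .chain(self.upper.keys())
            .map(|k| k.as_str())
            .filter(|k| !self.whiteouts.contains(*k))
            .collect::<BTreeSet<_>>() // merge both layers, dedupe
            .into_iter()
            .collect()
    }
}

fn main() {
    let mut fs = ToyOverlay {
        base: BTreeMap::from([("/config.txt".to_string(), b"original".to_vec())]),
        upper: BTreeMap::new(),
        whiteouts: BTreeSet::new(),
    };
    assert_eq!(fs.read("/config.txt"), Some(&b"original"[..])); // base fallback
    fs.write("/config.txt", b"modified"); // goes to upper; base untouched
    assert_eq!(fs.read("/config.txt"), Some(&b"modified"[..]));
    fs.remove("/config.txt"); // whiteout shadows the base copy
    assert_eq!(fs.read("/config.txt"), None);
    assert!(fs.list().is_empty());
}
```

Discarding the upper map and whiteout set restores the base view, which is the rollback property the middleware trades on.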

Performance

| Operation | Speed | Notes |
| --- | --- | --- |
| Read (upper hit) | Fast | Single layer lookup |
| Read (base fallback) | 🟡 Moderate | Two-layer lookup |
| Write | Depends on upper | Upper layer speed |
| Directory listing | 🐢 Slower | Must merge both layers |

Advantages

  • Copy-on-write semantics - modifications don’t affect base
  • Instant rollback - discard upper layer to reset
  • Space efficient - only changes stored in upper
  • Template pattern - share base across multiple instances
  • Testing isolation - test against real data without modifying it

Disadvantages

  • Complexity - whiteout handling, merge logic
  • Directory listing overhead - must combine and filter
  • Two backends to manage - lifecycle of both layers
  • Not true CoW - doesn’t deduplicate at block level

When to Use

| Use Case | Recommendation |
| --- | --- |
| Container images | Ideal - base image + writable layer |
| Template filesystems | Ideal - shared base, per-user upper |
| Testing with real data | Ideal - modify without consequences |
| Rollback capability | Good - discard upper to reset |
| Git-like branching | Good - branch = new upper layer |
| Simple use cases | Overkill - use single backend |
| Block-level CoW | Avoid - Overlay is file-level |
| Dir listing perf | Avoid - merge overhead on listings |

✅ USE Overlay when:

  • Building container-like systems (base image + writable layer)
  • Sharing a template filesystem across multiple instances
  • Testing against production data without modifying it
  • Need instant rollback capability (discard upper layer)
  • Implementing git-like branching at filesystem level

❌ DON’T USE Overlay when:

  • Simple, single-purpose filesystem (unnecessary complexity)
  • Need block-level copy-on-write (Overlay is file-level)
  • Directory listing performance is critical (merge overhead)
  • Don’t need layered semantics (use single backend)

Backend Selection Guide

Quick Decision Tree

Do you need persistence?
├─ No → MemoryBackend
└─ Yes
   ├─ Single portable file? → SqliteBackend
   ├─ Large files + path isolation? → IndexedBackend
   └─ Access existing files on disk?
      ├─ Need containment? → VRootFsBackend  
      └─ Trusted environment? → StdFsBackend

Comparison Matrix

| Backend | Speed | Isolation | Persistence | Large Files | WASM |
| --- | --- | --- | --- | --- | --- |
| MemoryBackend | ⚡ Very Fast | ✅ Complete | ❌ None | ⚠️ RAM-limited | ✅ |
| SqliteBackend | 🐢 Slower | ✅ Complete | ✅ Single file | ✅ Supported | ✅ |
| IndexedBackend | 🟢 Fast | ✅ Complete | ✅ Directory | ✅ Native I/O | ❌ |
| StdFsBackend | 🟢 Normal | ❌ None | ✅ Native | ✅ Native | ❌ |
| VRootFsBackend | 🟡 Moderate | ✅ Strong | ✅ Native | ✅ Native | ❌ |
| Overlay† | Varies | Varies | Varies | Varies | Varies |

†Overlay is middleware that composes two backends; characteristics depend on the backends used.

By Use Case

| Use Case | Recommended |
| --- | --- |
| Unit testing | MemoryBackend |
| Integration testing | MemoryBackend or SqliteBackend |
| Portable application data | SqliteBackend |
| Encrypted storage | SqliteBackend (with encryption feature) |
| Large file + isolation | IndexedBackend |
| Media libraries | IndexedBackend |
| Plugin/agent sandboxing | VRootFsBackend |
| Adding middleware to real FS | StdFsBackend |
| Container-like isolation | Overlay<SqliteBackend, MemoryBackend> |
| Template with modifications | Overlay<Base, Upper> |
| WASM/Browser | MemoryBackend or SqliteBackend |

Platform Compatibility

| Backend | Windows | Linux | macOS | WASM |
| --- | --- | --- | --- | --- |
| MemoryBackend | ✅ | ✅ | ✅ | ✅ |
| SqliteBackend | ✅ | ✅ | ✅ | ✅* |
| IndexedBackend | ✅ | ✅ | ✅ | ❌ |
| StdFsBackend | ✅ | ✅ | ✅ | ❌ |
| VRootFsBackend | ✅** | ✅ | ✅ | ❌ |
| Overlay† | Varies | Varies | Varies | Varies |

* Requires wasm32-compatible SQLite build
** Windows symlinks require elevated privileges or Developer Mode
†Overlay is middleware; platform support depends on the backends composed


Common Mistakes to Avoid

| ❌ Mistake | ✅ Instead |
| --- | --- |
| Using StdFsBackend with user-provided paths | Use VRootFsBackend - it prevents ../../etc/passwd attacks |
| Using MemoryBackend for data that must survive restart | Use SqliteBackend for persistence, or call save_to() to serialize |
| Expecting identical symlink behavior across platforms with VRootFsBackend | Use MemoryBackend or SqliteBackend for consistent cross-platform symlinks |
| Using Overlay when a simple backend would suffice | Keep it simple - use Overlay only when you need true layered semantics |

PathResolver: The Simple Explanation

What Problem Does It Solve?

Imagine you’re giving someone directions to a room in a building:

“Go to the office, then into the storage closet, then back out, then into the conference room.”

That’s a lot of steps! A smart person would simplify it:

“Just go to the conference room.”

PathResolver does exactly this for file paths.


The Problem: Messy Paths

When programs work with files, they often create messy paths like:

/home/user/../user/./documents/../documents/report.txt

This path says:

  • Go to /home/user
  • Go back up (..)
  • Go to user again
  • Stay here (.)
  • Go to documents
  • Go back up (..)
  • Go to documents again
  • Finally, report.txt

That’s exhausting! The simple answer is just:

/home/user/documents/report.txt

PathResolver’s job is to figure out the simple answer.


Why Can’t the Backend Just Do This?

Good question! Here’s why we separated it:

1. Different Backends, Same Logic

Think of backends like different types of filing cabinets:

  • MemoryBackend = Files in your brain (RAM)
  • anyfs-sqlite: SqliteBackend = Files in a database (ecosystem crate)
  • VRootFsBackend = Files on your hard drive

The path simplification logic is the same for all of them:

  • .. means “go up one level”
  • . means “stay here”
  • Symlinks mean “actually go over there instead”

Why write this logic three times? Write it once, use it everywhere.

2. We Can Test It Alone

If path resolution is buried inside each backend, testing is hard:

❌ To test path resolution, you need:
   - A real backend
   - Real files
   - Complex setup

With PathResolver separated:

✅ To test path resolution, you need:
   - Just the resolver
   - Simple inputs and outputs
   - No files required!

3. We Can Benchmark It

“Is our path resolution fast enough?”

If it’s mixed with everything else, you can’t measure it. Separated, you can:

#![allow(unused)]
fn main() {
// Easy to benchmark!
let resolver = IterativeResolver::new();
benchmark(|| resolver.canonicalize("/a/b/../c", &mock_fs));
}

4. We Can Swap It

Different situations need different approaches:

| Resolver | Best For |
| --- | --- |
| IterativeResolver | General use (walks path step by step) |
| CachingResolver | Repeated paths (remembers answers) |
| NoOpResolver | Real filesystem (OS already handles it) |

With separation, switching is one line:

#![allow(unused)]
fn main() {
use anyfs::resolvers::{CachingResolver, IterativeResolver};

// Default
let fs = FileStorage::new(backend);

// With caching (for performance)
let fs = FileStorage::with_resolver(
    backend, 
    CachingResolver::new(IterativeResolver::default())
);
}

The Analogy: GPS Navigation

Think of PathResolver like a GPS system separate from your car:

| Component | In AnyFS | In a Car |
| --- | --- | --- |
| Storage | Backend (MemoryBackend, SqliteBackend) | The roads themselves |
| Navigation | PathResolver | GPS device |
| Interface | FileStorage | Dashboard |

Why is GPS a separate device?

  • ✅ You can upgrade the GPS without changing the car
  • ✅ You can test the GPS in a simulator
  • ✅ Different GPS apps can work in the same car
  • ✅ The car maker doesn’t need to be a GPS expert

Same reasons we separated PathResolver!


What PathResolver Actually Does

Input:  /home/user/../admin/./config.txt
                  ↓
         [PathResolver]
                  ↓
Output: /home/admin/config.txt

Step by step:

  1. /home/user → go to user’s home
  2. .. → go back up to /home
  3. admin → go into admin folder
  4. . → stay here (ignore)
  5. config.txt → the file!

Result: /home/admin/config.txt
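Ignoring symlinks for a moment, the dot and dot-dot handling above amounts to a small lexical pass. The sketch below illustrates the idea only; it is not the IterativeResolver implementation, which also consults the backend to follow symlinks:

```rust
/// Lexically simplify an absolute path: drop "." segments and fold ".." into
/// the parent, clamping at the root. Symlinks are deliberately ignored.
fn normalize(path: &str) -> String {
    let mut parts: Vec<&str> = Vec::new();
    for seg in path.split('/') {
        match seg {
            "" | "." => {}           // empty (from leading '/') or "stay here"
            ".." => { parts.pop(); } // go up one level (no-op at root)
            name => parts.push(name),
        }
    }
    format!("/{}", parts.join("/"))
}

fn main() {
    // The example walked through above:
    assert_eq!(
        normalize("/home/user/../admin/./config.txt"),
        "/home/admin/config.txt"
    );
    // And the messy path from earlier in this chapter:
    assert_eq!(
        normalize("/home/user/../user/./documents/../documents/report.txt"),
        "/home/user/documents/report.txt"
    );
}
```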

Symlinks are like shortcuts. If /home/admin is actually a symlink pointing to /users/administrator, the resolver follows it:

Input:  /home/admin/config.txt
        (but /home/admin → /users/administrator)
                  ↓
         [PathResolver]
                  ↓
Output: /users/administrator/config.txt

The Two Main Methods

canonicalize() - Strict Mode

“Give me the real, final path. Everything must exist.”

#![allow(unused)]
fn main() {
resolver.canonicalize("/a/b/../c/file.txt", &fs)
// Returns: /a/c/file.txt (if it exists)
// Error: if any part doesn't exist
}

soft_canonicalize() - Relaxed Mode

“Resolve what you can, but the last part doesn’t need to exist yet.”

#![allow(unused)]
fn main() {
resolver.soft_canonicalize("/a/b/../c/new_file.txt", &fs)
// Returns: /a/c/new_file.txt (even if new_file.txt doesn't exist)
// Error: only if /a/c doesn't exist
}

This is useful for creating new files—you need to know WHERE to create them, but they don’t exist yet!


Summary: Why Separate?

| Benefit | Explanation |
| --- | --- |
| Testable | Test path logic without touching real files |
| Benchmarkable | Measure performance in isolation |
| Swappable | Different resolvers for different needs |
| Maintainable | One place to fix bugs, benefits all backends |
| Understandable | Each piece has one job |

In Code

#![allow(unused)]
fn main() {
// The trait (the "job description")
// Only canonicalize() is required - soft_canonicalize has a default implementation
pub trait PathResolver: Send + Sync {
    fn canonicalize(&self, path: &Path, fs: &dyn Fs) -> Result<PathBuf, FsError>;
    
    // Default: canonicalize parent, append final component
    // Handles edge cases (root path, empty parent)
    fn soft_canonicalize(&self, path: &Path, fs: &dyn Fs) -> Result<PathBuf, FsError> {
        match path.parent() {
            Some(parent) if !parent.as_os_str().is_empty() => {
                let canonical_parent = self.canonicalize(parent, fs)?;
                match path.file_name() {
                    Some(name) => Ok(canonical_parent.join(name)),
                    None => Ok(canonical_parent),
                }
            }
            _ => self.canonicalize(path, fs),  // Root or single component
        }
    }
}

// For symlink-aware resolution (when backend implements FsLink):
pub trait PathResolverWithLinks: PathResolver {
    fn canonicalize_following_links(&self, path: &Path, fs: &dyn FsLink) -> Result<PathBuf, FsError>;
    // soft_canonicalize_following_links also has a default that delegates
}

// FileStorage uses it (boxed for flexibility)
pub struct FileStorage<B> {
    backend: B,                        // Where files live
    resolver: Box<dyn PathResolver>,   // How to simplify paths (boxed: cold path)
}
}

That’s it! PathResolver answers one question: “What’s the real, simple path?”

The soft_canonicalize variant is just a convenience: it reuses canonicalize on the parent, so the final path component is allowed to not exist yet.

Everything else—reading files, writing files, listing directories—is the backend’s job.
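That split can be illustrated with a toy, purely lexical normalizer (illustration only: the real resolvers also consult the backend to verify existence and resolve symlinks, which lexical rewriting alone cannot do):

```rust
use std::path::{Component, Path, PathBuf};

// Toy stand-in: rewrite `..` and `.` without consulting any filesystem.
// Real resolvers must also verify each component exists (canonicalize)
// and follow symlinks, which requires backend access.
fn lexical_normalize(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for comp in path.components() {
        match comp {
            Component::ParentDir => {
                out.pop(); // drop the previous component for `..`
            }
            Component::CurDir => {} // `.` is a no-op
            other => out.push(other),
        }
    }
    out
}

fn main() {
    let p = lexical_normalize(Path::new("/a/b/../c/file.txt"));
    assert_eq!(p, PathBuf::from("/a/c/file.txt"));
}
```

This is also why symlink-aware resolution needs the backend: `/a/b/..` is not `/a` when `b` is a symlink.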

Which Crate Should I Use?


Decision Guide

| You want to…                                      | Use                          |
| ------------------------------------------------- | ---------------------------- |
| Build an application                              | anyfs                        |
| Use built-in backends (Memory, StdFs, VRootFs)    | anyfs                        |
| Use built-in middleware (Quota, PathFilter, etc.) | anyfs                        |
| Use SQLite or IndexedBackend                      | anyfs-sqlite / anyfs-indexed |
| Implement a custom backend                        | anyfs-backend only           |
| Implement custom middleware                       | anyfs-backend only           |

Quick Examples

Simple usage

#![allow(unused)]
fn main() {
use anyfs::MemoryBackend;
use anyfs::FileStorage;

let fs = FileStorage::new(MemoryBackend::new());
fs.create_dir_all("/data")?;
fs.write("/data/file.txt", b"hello")?;
}

With middleware (quotas, sandboxing, tracing)

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, RestrictionsLayer, PathFilterLayer, TracingLayer};
use anyfs::FileStorage;

let stack = MemoryBackend::new()
    .layer(QuotaLayer::builder()
        .max_total_size(100 * 1024 * 1024)
        .build())
    .layer(PathFilterLayer::builder()
        .allow("/workspace/**")
        .deny("**/.env")
        .build())
    .layer(TracingLayer::new());

let fs = FileStorage::new(stack);
}

Custom backend implementation

#![allow(unused)]
fn main() {
use anyfs_backend::{FsRead, FsWrite, FsDir, FsError, Metadata, DirEntry};
use std::path::Path;

pub struct MyBackend;

// Implement the three core traits - Fs is auto-implemented via blanket impl
impl FsRead for MyBackend {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        todo!()
    }
    // ... 5 more FsRead methods
}

impl FsWrite for MyBackend {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        todo!()
    }
    // ... 6 more FsWrite methods
}

impl FsDir for MyBackend {
    // ... 5 FsDir methods
}
// Total: 18 methods across FsRead + FsWrite + FsDir
}

Custom middleware implementation

#![allow(unused)]
fn main() {
use anyfs_backend::{Fs, Layer, FsError};
use std::path::Path;

pub struct MyMiddleware<B: Fs> {
    inner: B,
}

impl<B: Fs> Fs for MyMiddleware<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        // Intercept, transform, or delegate
        self.inner.read(path)
    }
    // ... implement all methods
}

pub struct MyMiddlewareLayer;

impl<B: Fs> Layer<B> for MyMiddlewareLayer {
    type Backend = MyMiddleware<B>;
    fn layer(self, backend: B) -> Self::Backend {
        MyMiddleware { inner: backend }
    }
}
}

Common Mistakes

  • Don’t depend on anyfs if you’re only implementing a backend or middleware. Use anyfs-backend.
  • Don’t put policy in backends. Use middleware (Quota, PathFilter, etc.).
  • Don’t put policy in FileStorage. It is an ergonomic wrapper with centralized path resolution, not a policy layer.

Consumer Documentation Planning

This document specifies what the Context7-style consumer documentation should contain when the AnyFS library is implemented. This is a planning/specification document, not actual API documentation.


Purpose

When AnyFS is implemented, we need a Context7-style reference document that LLMs can use to correctly consume the AnyFS API. This document specifies what that reference should contain.

Why Context7-style?

  • LLMs need quick decision trees to select the right components
  • Copy-paste-ready patterns reduce hallucination
  • Common mistakes section prevents known pitfalls
  • Trait hierarchy helps understand what to implement

Required Sections

The consumer documentation MUST include these sections:

1. Quick Decision Trees

Decision trees help LLMs quickly navigate to the right component. Include:

| Decision Tree     | Purpose                              |
| ----------------- | ------------------------------------ |
| Which Crate?      | anyfs-backend vs anyfs               |
| Which Backend?    | Memory, SQLite, VRootFs, etc.        |
| Which Middleware? | Quota, PathFilter, ReadOnly, etc.    |
| Which Trait Level?| Fs, FsFull, FsFuse, FsPosix          |

Format: ASCII tree diagrams with terminal answers.

Example structure (to be filled with actual API when implemented):

Is data persistence required?
├─ NO → MemoryBackend
└─ YES → Is encryption needed?
         ├─ YES → SqliteBackend with `encryption` feature
         └─ NO → [continue decision tree...]

2. Common Patterns

Provide copy-paste-ready code for these scenarios:

| Pattern                | Description                                   |
| ---------------------- | --------------------------------------------- |
| Simple File Operations | read, write, delete, check existence          |
| Directory Operations   | create, list, remove                          |
| Sandboxed AI Agent     | Full middleware stack example                 |
| Persistent Database    | SqliteBackend setup                           |
| Type-Safe Wrappers     | User-defined newtypes for compile-time safety |
| Streaming Large Files  | open_read/open_write usage                    |

Requirements for each pattern:

  • Complete, runnable code blocks
  • All imports included
  • Proper error handling (no .unwrap())
  • Minimal code that demonstrates the concept
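A sketch of what one compliant entry could look like; since the AnyFS API is not yet implemented, `std::fs` stands in for the eventual FileStorage calls, and the function name here is illustrative, not part of any spec:

```rust
use std::fs;
use std::io;

// All imports included, no unwrap(), minimal code: the shape every
// documented pattern should follow (std::fs stands in for the AnyFS API).
fn save_note(dir: &str, name: &str, body: &[u8]) -> io::Result<()> {
    fs::create_dir_all(dir)?; // ensure parent dirs exist first
    fs::write(format!("{dir}/{name}"), body)?;
    Ok(())
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join("anyfs_pattern_demo");
    let dir = dir.to_string_lossy().into_owned();
    save_note(&dir, "note.txt", b"hello")?;
    assert_eq!(fs::read(format!("{dir}/note.txt"))?, b"hello");
    Ok(())
}
```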

3. Trait Hierarchy Diagram

Visual representation of the trait hierarchy:

FsPosix  ← Full POSIX (handles, locks, xattr)
    ↑
FsFuse   ← FUSE-mountable (+ inodes)
    ↑
FsFull   ← std::fs features (+ links, permissions, sync, stats)
    ↑
   Fs    ← Basic filesystem (90% of use cases)
    ↑
FsRead + FsWrite + FsDir  ← Core traits

With clear guidance: “Implement the lowest level you need. Higher levels include all below.”
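The same guidance applies to consumers: bound generic code on the lowest trait level it needs. A self-contained toy sketch (a two-level hierarchy standing in for the real ladder; the real traits are richer):

```rust
use std::collections::HashMap;

// Toy two-level hierarchy standing in for the real Fs/FsFull ladder.
trait Fs {
    fn read(&self, path: &str) -> Option<Vec<u8>>;
}
trait FsFull: Fs {
    fn symlink(&mut self, link: &str, target: &str);
}

struct MemFs {
    files: HashMap<String, Vec<u8>>,
}

impl Fs for MemFs {
    fn read(&self, path: &str) -> Option<Vec<u8>> {
        self.files.get(path).cloned()
    }
}

// Bounding on `Fs` (not `FsFull`) keeps this usable with every backend,
// including ones that never implement the higher levels.
fn read_or_empty<B: Fs>(fs: &B, path: &str) -> Vec<u8> {
    fs.read(path).unwrap_or_default()
}

fn main() {
    let mut files = HashMap::new();
    files.insert("/a.txt".to_string(), b"hi".to_vec());
    let fs = MemFs { files };
    assert_eq!(read_or_empty(&fs, "/a.txt"), b"hi");
    assert!(read_or_empty(&fs, "/missing").is_empty());
}
```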

4. Backend Implementation Pattern

Template for implementing custom backends. The consumer docs should include:

| Level    | Traits to Implement                        | Result  |
| -------- | ------------------------------------------ | ------- |
| Minimum  | FsRead + FsWrite + FsDir                   | Fs      |
| Extended | Add FsLink, FsPermissions, FsSync, FsStats | FsFull  |
| FUSE     | Add FsInode                                | FsFuse  |
| POSIX    | Add FsHandles, FsLock, FsXattr             | FsPosix |

Each level should have a complete template showing all required method signatures.

5. Middleware Implementation Pattern

Template showing:

  • How to wrap an inner backend with a generic type parameter
  • Which methods to intercept vs delegate
  • The Layer trait for .layer() syntax
  • Common middleware patterns table:
| Pattern        | Intercept                  | Delegate        | Example    |
| -------------- | -------------------------- | --------------- | ---------- |
| Logging        | All (before/after)         | All             | Tracing    |
| Block writes   | Write methods → error      | Read methods    | ReadOnly   |
| Transform data | read/write                 | Everything else | Encryption |
| Check access   | All (before)               | All             | PathFilter |
| Enforce limits | Write methods (check size) | Read methods    | Quota      |
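The "Block writes" row can be sketched end to end with toy stand-ins (a minimal trait and in-memory backend defined here for illustration only; the real AnyFS traits use `&self` with interior mutability and many more methods):

```rust
use std::collections::HashMap;

// Toy stand-ins for illustration; the real AnyFS traits are richer.
#[derive(Debug, PartialEq)]
enum FsError { ReadOnly, NotFound }

trait Fs {
    fn read(&self, path: &str) -> Result<Vec<u8>, FsError>;
    fn write(&mut self, path: &str, data: &[u8]) -> Result<(), FsError>;
}

struct MemFs { files: HashMap<String, Vec<u8>> }

impl Fs for MemFs {
    fn read(&self, path: &str) -> Result<Vec<u8>, FsError> {
        self.files.get(path).cloned().ok_or(FsError::NotFound)
    }
    fn write(&mut self, path: &str, data: &[u8]) -> Result<(), FsError> {
        self.files.insert(path.to_string(), data.to_vec());
        Ok(())
    }
}

// The "block writes" pattern: delegate reads, intercept writes with an error.
struct ReadOnly<B: Fs> { inner: B }

impl<B: Fs> Fs for ReadOnly<B> {
    fn read(&self, path: &str) -> Result<Vec<u8>, FsError> {
        self.inner.read(path) // delegate
    }
    fn write(&mut self, _path: &str, _data: &[u8]) -> Result<(), FsError> {
        Err(FsError::ReadOnly) // intercept
    }
}

fn main() {
    let mut base = MemFs { files: HashMap::new() };
    base.write("/a.txt", b"hi").unwrap();
    let mut ro = ReadOnly { inner: base };
    assert_eq!(ro.read("/a.txt").unwrap(), b"hi");
    assert_eq!(ro.write("/a.txt", b"no"), Err(FsError::ReadOnly));
}
```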

6. Adapter Patterns

Templates for interoperability:

| Adapter Type  | Description                                           |
| ------------- | ----------------------------------------------------- |
| FROM external | Wrap external crate’s filesystem as AnyFS backend     |
| TO external   | Wrap AnyFS backend to satisfy external crate’s trait  |

7. Error Handling Reference

All FsError variants with when to use each:

| Variant           | When to Return                          |
| ----------------- | --------------------------------------- |
| NotFound          | Path doesn’t exist                      |
| AlreadyExists     | Path already exists (create conflict)   |
| NotAFile          | Expected file, got directory            |
| NotADirectory     | Expected directory, got file            |
| DirectoryNotEmpty | Can’t remove non-empty directory        |
| ReadOnly          | Write blocked by ReadOnly middleware    |
| AccessDenied      | Blocked by PathFilter or permissions    |
| QuotaExceeded     | Size/count limit exceeded               |
| NotSupported      | Backend doesn’t support this operation  |
| Backend           | Backend-specific error                  |
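A self-contained sketch of variant-driven handling on the consumer side (toy enum with a subset of the variants above; the real FsError variants carry structured fields such as the offending path):

```rust
// Toy subset of the FsError variants from the table above; the real enum
// carries structured context (path, operation, etc.).
#[derive(Debug, PartialEq)]
enum FsError {
    NotFound,
    AlreadyExists,
    QuotaExceeded,
    NotSupported,
}

// Exhaustive matching lets callers react per variant instead of treating
// every failure the same way.
fn is_retryable(e: &FsError) -> bool {
    match e {
        FsError::QuotaExceeded => true, // may succeed after cleanup
        FsError::NotFound | FsError::AlreadyExists | FsError::NotSupported => false,
    }
}

fn main() {
    assert!(is_retryable(&FsError::QuotaExceeded));
    assert!(!is_retryable(&FsError::NotFound));
}
```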

8. Common Mistakes & Fixes

| Mistake                   | Fix                              |
| ------------------------- | -------------------------------- |
| Using unwrap()            | Always use ? or handle FsError   |
| Assuming paths normalized | Use canonicalize() first         |
| Forgetting parent dirs    | Use create_dir_all               |
| Holding handles too long  | Drop promptly                    |
| Mixing backend types      | Use FileStorage::boxed()         |
| Testing with real files   | Use MemoryBackend                |

Document Structure

When creating the actual consumer documentation, follow this structure:

# AnyFS Implementation Patterns

## Quick Decision Trees
### Which Crate Do I Need?
### Which Backend Should I Use?
### Do I Need Middleware?
### Which Trait Level?

## Common Patterns
### Simple File Operations
### Directory Operations
### Sandboxed AI Agent
### Persistent Database
### Type-Safe Wrapper Types

## Trait Hierarchy (Pick Your Level)

## Pattern 1: Implement a Backend
### Minimum: Implement Fs
### Add Links/Permissions: Implement FsFull
### Add FUSE Support: Implement FsFuse

## Pattern 2: Implement Middleware
### Template
### Common Middleware Patterns

## Pattern 3: Implement an Adapter
### Adapter FROM another crate
### Adapter TO another crate

## Error Handling Reference

## Common Mistakes & Fixes

## Quick Reference: What to Implement

Creation Guidelines

When creating the actual consumer documentation after implementation:

  1. Use actual tested code - Every example must compile and run
  2. Include all imports - LLMs need complete context
  3. Show error handling - Never use .unwrap() in examples
  4. Keep examples minimal - Shortest code that demonstrates the pattern
  5. Update with API changes - This doc must stay in sync with implementation
  6. Validate against real usage - Test each pattern before including it

Quality Checklist

Before publishing the consumer documentation:

  • All code examples compile
  • All code examples run without panics
  • Decision trees lead to correct answers
  • Error variants match actual FsError enum
  • Trait hierarchy matches actual trait definitions
  • Common mistakes reflect actual issues found in testing

| Document                    | Purpose                                                     |
| --------------------------- | ----------------------------------------------------------- |
| LLM Development Methodology | For implementers: how to structure code for LLM development |
| This document               | Specification for consumer documentation                    |
| Backend Guide               | Design for backend implementation                           |
| Middleware Tutorial         | Design for middleware creation                              |

Tracking

This planning document should be replaced with actual consumer documentation when:

  1. AnyFS is implemented - The crates exist and compile
  2. API is stable - No major breaking changes expected
  3. Examples are tested - All patterns verified working

GitHub Issue: Create Context7-style consumer documentation

  • Status: Blocked by AnyFS implementation
  • Template: This planning document

LLM-Optimized Development Methodology

Purpose: This document defines the methodology for structuring AnyFS code so that each component is independently testable, reviewable, replaceable, and fixable—by both humans and LLMs—without requiring full project context.


Core Principle: Context-Independent Components

Every component in AnyFS should be understandable and modifiable with only local context. An LLM (or human contributor) should be able to:

  1. Understand a component by reading only its file + trait definition
  2. Test a component in isolation without the rest of the system
  3. Fix a bug by looking at only the failing component + error message
  4. Review changes without understanding the entire architecture
  5. Replace a component with an alternative implementation

This is achieved through strict separation of concerns, clear contracts (traits), and self-documenting structure.


The Five Pillars

1. Single Responsibility per File

Each file implements exactly one concept:

| File         | Implements           | Dependencies       |
| ------------ | -------------------- | ------------------ |
| fs_read.rs   | FsRead trait         | FsError, Metadata  |
| quota.rs     | Quota<B> middleware  | Fs trait           |
| memory.rs    | MemoryBackend        | Fs, FsLink, etc.   |
| iterative.rs | IterativeResolver    | PathResolver trait |

Why: An LLM can be given just the file + its dependencies. No need for “the big picture.”

2. Contract-First Design (Traits as Contracts)

Every component implements a well-defined trait. The trait IS the specification:

#![allow(unused)]
fn main() {
/// Read operations for a virtual filesystem.
/// 
/// # Contract
/// - All methods use `&self` (interior mutability)
/// - Thread-safe: `Send + Sync` required
/// - Errors are always `FsError`, never panic
/// 
/// # Implementor Checklist
/// - [ ] Handle non-existent paths with `FsError::NotFound`
/// - [ ] Handle non-UTF8 content in `read_to_string` with `FsError::InvalidData`
/// - [ ] `metadata()` follows symlinks; use `symlink_metadata()` for link info
pub trait FsRead: Send + Sync {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
    // ...
}
}

LLM Instruction: “Implement FsRead for MyBackend. Follow the contract in the trait doc.”

3. Isolated Testing (No Integration Dependencies)

Each component has tests that run without external dependencies:

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::*;
    
    // Mock only what's needed
    struct MockFs {
        files: HashMap<PathBuf, Vec<u8>>,
    }
    
    #[test]
    fn quota_rejects_oversized_write() {
        let mock = MockFs::new();
        let quota = mock.layer(QuotaLayer::builder()
            .max_file_size(100)
            .build());
        
        let result = quota.write(Path::new("/big.txt"), &[0u8; 200]);
        assert!(matches!(result, Err(FsError::FileSizeExceeded { .. })));
    }
}
}

Why: LLM can run tests for just the component being fixed. No database, no filesystem, no network.

4. Error Messages as Documentation

Errors must contain enough context to fix the problem without reading other code:

#![allow(unused)]
fn main() {
// ❌ Bad: Requires context to understand
Err(FsError::NotFound { path: path.to_path_buf() })

// ✅ Good: Self-explanatory
Err(FsError::NotFound { 
    path: path.to_path_buf(),
    operation: "read",
    context: "file does not exist or is a directory".into(),
})
}

LLM Instruction: “The error says ‘quota exceeded: limit 100MB, requested 150MB, usage 80MB’. Fix the code that’s writing 150MB.”

5. Documentation at Every Boundary

Every public item has documentation explaining:

  • What it does (one line)
  • When to use it (use case)
  • How to use it (example)
  • Why it exists (rationale, if non-obvious)
#![allow(unused)]
fn main() {
/// Path resolution strategy using iterative component-by-component traversal.
///
/// # When to Use
/// - Default resolver for virtual backends (MemoryBackend, SqliteBackend)
/// - When you need standard POSIX-like symlink resolution
///
/// # Example
/// ```rust
/// let resolver = IterativeResolver::new();
/// let canonical = resolver.canonicalize(Path::new("/a/b/../c"), &fs)?;
/// ```
///
/// # Performance
/// O(n) where n = number of path components. For deep paths with many symlinks,
/// consider `CachingResolver` wrapper.
pub struct IterativeResolver { /* ... */ }
}

File Structure Convention

Every implementation file follows this structure:

#![allow(unused)]
fn main() {
//! # Component Name
//!
//! Brief description of what this component does.
//!
//! ## Responsibility
//! - Single bullet point describing THE responsibility
//!
//! ## Dependencies
//! - List of traits/types this depends on
//!
//! ## Usage
//! ```rust
//! // Minimal working example
//! ```

use crate::{...}; // Minimal imports

// ============================================================================
// Types
// ============================================================================

/// Primary type for this component.
pub struct ComponentName { /* ... */ }

// ============================================================================
// Trait Implementations
// ============================================================================

impl SomeTrait for ComponentName {
    // Implementation
}

// ============================================================================
// Public API
// ============================================================================

impl ComponentName {
    /// Constructor with sensible defaults.
    pub fn new() -> Self { /* ... */ }
    
    /// Builder-style configuration.
    pub fn with_option(self, value: T) -> Self { /* ... */ }
}

// ============================================================================
// Private Helpers
// ============================================================================

impl ComponentName {
    fn internal_helper(&self) { /* ... */ }
}

// ============================================================================
// Tests
// ============================================================================

#[cfg(test)]
mod tests {
    use super::*;
    
    // Tests that verify the contract
}
}

LLM Prompting Patterns

Pattern 1: Implement a Component

Implement `CachingResolver` in `anyfs/src/resolvers/caching.rs`.

Contract: Implement `PathResolver` trait (see anyfs-backend/src/path_resolver.rs).

Requirements:
- Wrap another resolver with LRU cache
- Cache resolved canonical paths keyed by input path
- Bounded cache size (configurable max entries)

Test: Write a test verifying cache hit returns same result as cache miss.

Pattern 2: Fix a Bug

Bug: `Quota<B>` doesn't account for existing file size when checking write limits.

File: src/middleware/quota.rs
Error: QuotaExceeded when writing 50 bytes to a 30-byte file with 100-byte limit.
Expected: Should succeed (30 + 50 = 80 < 100).

Fix the `check_write_quota` method.

Pattern 3: Add a Feature

Add `max_path_depth` limit to `Quota<B>` middleware.

File: src/middleware/quota.rs
Contract: Reject operations that would create paths deeper than the limit.

Example:
```rust
let fs = backend.layer(QuotaLayer::builder()
    .max_path_depth(5)
    .build());
fs.create_dir_all("/a/b/c/d/e/f")?; // Err: depth 6 > limit 5
```

### Pattern 4: Review a Change

Review this change to IterativeResolver:

  • Does it maintain the PathResolver contract?
  • Are edge cases handled (empty path, root path, circular symlinks)?
  • Are error messages informative?
  • Are tests sufficient?

[diff]


---

## Component Isolation Checklist

Before considering a component complete, verify:

- [ ] **Single file** - Component lives in one file (or one module with mod.rs)
- [ ] **Clear contract** - Implements a trait with documented invariants
- [ ] **Minimal dependencies** - Only depends on traits/types, not other implementations
- [ ] **Self-contained tests** - Tests use mocks, not real backends
- [ ] **Informative errors** - Error messages explain what went wrong and how to fix
- [ ] **Usage example** - Doc comment shows how to use in isolation
- [ ] **No global state** - All state is in the struct instance
- [ ] **Thread-safe** - `Send + Sync` where required
- [ ] **Documented edge cases** - What happens with empty input, None, errors?

---

## Open Source Contribution Benefits

This methodology directly enables:

| Benefit                     | How                                                             |
| --------------------------- | --------------------------------------------------------------- |
| **First-time contributors** | Can understand one component without reading the whole codebase |
| **Focused PRs**             | Changes stay in one file, easy to review                        |
| **Parallel development**    | Multiple contributors work on different components              |
| **Quick onboarding**        | Read the trait, implement the trait, done                       |
| **CI efficiency**           | Test just the changed component                                 |

---

## Anti-Patterns to Avoid

### ❌ Spaghetti Dependencies

```rust
// Bad: Middleware knows about specific backends
impl<B: Fs> Quota<B> {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        if let Some(sqlite) = self.inner.downcast_ref::<SqliteBackend>() {
            // Special case for SQLite
        }
    }
}
```

❌ Hidden Context Requirements

#![allow(unused)]
fn main() {
// Bad: Requires knowing about global configuration
impl FsRead for MyBackend {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        let config = CONFIG.lock().unwrap(); // Hidden global!
        // ...
    }
}
}

❌ Tests That Require Setup

#![allow(unused)]
fn main() {
// Bad: Requires database, filesystem, network
#[test]
fn test_vrootfs_backend() {
    let db = VRootFsBackend::new("/tmp/test").unwrap(); // Creates real files!
    // ...
}
}

❌ Vague Errors

#![allow(unused)]
fn main() {
// Bad: No context
Err(FsError::Backend("operation failed".into()))
}

Integration with Context7-style Documentation

When the project is complete, we will provide a consumer-facing LLM context document that:

  1. Explains the surface API (what to import, what to call)
  2. Provides decision trees (which backend? which middleware?)
  3. Shows complete, runnable examples
  4. Lists common mistakes and how to avoid them

This is separate from AGENTS.md (for contributors) and lives in Implementation Patterns.


Summary

| Principle  | Implementation                      |
| ---------- | ----------------------------------- |
| Isolated   | One file, one concept               |
| Contracted | Traits define the spec              |
| Testable   | Mock-based unit tests               |
| Debuggable | Rich error context                  |
| Documented | Examples at every boundary          |
| LLM-Ready  | Promptable patterns for common tasks|

By following this methodology, AnyFS becomes a codebase where any component can be understood, tested, fixed, or replaced by an LLM (or human) with only local context. This is the foundation for sustainable AI-assisted development.

Cross-Platform Virtual Drive Mounting

Mounting AnyFS backends as real filesystem mount points


Overview

AnyFS backends implementing FsFuse can be mounted as real filesystem drives that any application can access. This is part of the anyfs crate (behind feature flags: fuse for Linux/macOS, winfsp for Windows) because mounting is a core promise of AnyFS, not an optional extra.


Product Promise

Mounting is a core AnyFS promise: make filesystem composition easy, safe, and genuinely enjoyable for programmers. The mount API prioritizes:

  • Easy onboarding (one handle, one builder, minimal boilerplate)
  • Safe defaults (explicit read-only modes, clear errors, no hidden behavior)
  • Delightful DX (predictable behavior, fast feedback, good docs)

Roadmap (MVP to Cross-Platform)

Phase 0: Design and API shape (complete)

  • API spec defines MountHandle, MountBuilder, MountOptions, MountError
  • Platform detection hooks (is_available) and consistent error mapping
  • Examples and docs anchored in this guide

Acceptance: Spec review complete; API signatures consistent across docs; error mapping defined.

Phase 1: Linux FUSE MVP (read-only, pending)

  • fuser adapter for lookup/getattr/readdir/read
  • Read-only mount option; write ops return PermissionDenied

Acceptance: Mount/unmount works on Linux; read-only operations pass smoke tests; unmount-on-drop is reliable.

Phase 2: Linux FUSE read/write (pending)

  • Full write path: create, write, rename, remove, link operations
  • Capability reporting and correct metadata mapping

Acceptance: Conformance tests pass for FsFuse path/inode behavior; no panics; clean shutdown.

Phase 3: macOS parity (macFUSE, pending)

  • Port Linux FUSE adapter to macFUSE requirements
  • Driver detection and install guidance

Acceptance: Mount/unmount works on macOS with core read/write flows.

Phase 4: Windows support (WinFsp, optional Dokan, pending)

  • WinFsp adapter with required mapping for Windows semantics
  • Optional Dokan path as alternative provider

Acceptance: Mount/unmount works on Windows; driver detection errors are clear and actionable.

Non-goals

  • Kernel drivers or kernel-space code
  • WASM or browser environments
  • Network filesystem protocols (NFS/SMB)

Platform Technologies

| Platform | Technology | Rust Crate | User Installation     |
| -------- | ---------- | ---------- | --------------------- |
| Linux    | FUSE       | fuser      | Usually pre-installed |
| macOS    | macFUSE    | fuser      | macFUSE               |
| Windows  | WinFsp     | winfsp     | WinFsp                |
| Windows  | Dokan      | dokan      | Dokan                 |

Key insight: Linux and macOS both use FUSE (via fuser crate), but Windows requires a completely different API (WinFsp or Dokan).


Architecture

Unified Mount Trait

#![allow(unused)]
fn main() {
/// Platform-agnostic mount handle.
/// Drop to unmount.
pub struct MountHandle {
    inner: Box<dyn MountHandleInner>,
}

impl MountHandle {
    /// Mount a backend at the specified path.
    ///
    /// Platform requirements:
    /// - Linux: FUSE (usually available)
    /// - macOS: macFUSE must be installed
    /// - Windows: WinFsp or Dokan must be installed
    pub fn mount<B: FsFuse>(backend: B, path: impl AsRef<Path>) -> Result<Self, MountError> {
        #[cfg(unix)]
        return fuse_mount(backend, path);

        #[cfg(windows)]
        return winfsp_mount(backend, path);

        #[cfg(not(any(unix, windows)))]
        return Err(MountError::PlatformNotSupported);
    }

    /// Check if mounting is available on this platform.
    pub fn is_available() -> bool {
        #[cfg(target_os = "linux")]
        return check_fuse_available();

        #[cfg(target_os = "macos")]
        return check_macfuse_available();

        #[cfg(windows)]
        return check_winfsp_available() || check_dokan_available();

        #[cfg(not(any(unix, windows)))]
        return false;
    }

    /// Unmount the filesystem.
    pub fn unmount(self) -> Result<(), MountError> {
        self.inner.unmount()
    }
}

impl Drop for MountHandle {
    fn drop(&mut self) {
        let _ = self.inner.unmount();
    }
}
}

Platform Adapters

┌─────────────────────────────────────────────────────────────┐
│                     MountHandle (unified API)               │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────┐ │
│  │ FuseAdapter │  │ FuseAdapter │  │ WinFspAdapter       │ │
│  │   (Linux)   │  │   (macOS)   │  │    (Windows)        │ │
│  └──────┬──────┘  └──────┬──────┘  └──────────┬──────────┘ │
│         │                │                     │            │
│         ▼                ▼                     ▼            │
│    ┌─────────┐      ┌─────────┐         ┌──────────┐       │
│    │  fuser  │      │  fuser  │         │  winfsp  │       │
│    │  crate  │      │  crate  │         │  crate   │       │
│    └────┬────┘      └────┬────┘         └────┬─────┘       │
│         │                │                   │              │
│         ▼                ▼                   ▼              │
│    ┌─────────┐      ┌─────────┐         ┌──────────┐       │
│    │  FUSE   │      │ macFUSE │         │  WinFsp  │       │
│    │ (kernel)│      │ (kext)  │         │ (driver) │       │
│    └─────────┘      └─────────┘         └──────────┘       │
│                                                             │
└─────────────────────────────────────────────────────────────┘

Module Structure

Mounting is part of the anyfs crate:

anyfs/
  src/
    mount/
      mod.rs                    # MountHandle, MountError, re-exports
      error.rs                  # MountError definitions
      handle.rs                 # MountHandle, MountOptions, builder

      unix/
        mod.rs                  # cfg(unix)
        fuse_adapter.rs         # FUSE implementation via fuser

      windows/
        mod.rs                  # cfg(windows)
        winfsp_adapter.rs       # WinFsp implementation

Feature Flags in anyfs Cargo.toml

[package]
name = "anyfs"
version = "0.1.0"

[dependencies]
anyfs-backend = { version = "0.1" }

[target.'cfg(unix)'.dependencies]
fuser = { version = "0.14", optional = true }

[target.'cfg(windows)'.dependencies]
winfsp = { version = "0.4", optional = true }

[features]
default = []
fuse = ["dep:fuser"]      # Enable mounting on Linux/macOS
winfsp = ["dep:winfsp"]   # Enable mounting on Windows
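Consumers would then opt in from their own manifest; a hypothetical application Cargo.toml (crate version and feature names as proposed above, not yet published):

```toml
# Hypothetical consumer manifest: enables mounting on Linux/macOS.
[dependencies]
anyfs = { version = "0.1", features = ["fuse"] }
```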

FUSE Adapter (Linux/macOS)

The FUSE adapter translates between fuser::Filesystem trait and our FsFuse trait:

#![allow(unused)]
fn main() {
use fuser::{Filesystem, Request, ReplyEntry, ReplyAttr, ReplyData, ReplyDirectory};
use anyfs_backend::{FsFuse, FsError, Metadata, FileType};
use std::ffi::OsStr;
use std::time::Duration;

/// Attribute cache duration reported to the kernel.
const TTL: Duration = Duration::from_secs(1);

pub struct FuseAdapter<B: FsFuse> {
    backend: B,
}

impl<B: FsFuse> Filesystem for FuseAdapter<B> {
    fn lookup(&mut self, _req: &Request, parent: u64, name: &OsStr, reply: ReplyEntry) {
        match self.backend.lookup(parent, name) {
            Ok(inode) => {
                match self.backend.metadata_by_inode(inode) {
                    Ok(meta) => reply.entry(&TTL, &to_fuse_attr(&meta), 0),
                    Err(e) => reply.error(to_errno(&e)),
                }
            }
            Err(e) => reply.error(to_errno(&e)),
        }
    }

    fn getattr(&mut self, _req: &Request, ino: u64, reply: ReplyAttr) {
        match self.backend.metadata_by_inode(ino) {
            Ok(meta) => reply.attr(&TTL, &to_fuse_attr(&meta)),
            Err(e) => reply.error(to_errno(&e)),
        }
    }

    fn read(&mut self, _req: &Request, ino: u64, _fh: u64, offset: i64, size: u32, _flags: i32, _lock: Option<u64>, reply: ReplyData) {
        let path = match self.backend.inode_to_path(ino) {
            Ok(p) => p,
            Err(e) => return reply.error(to_errno(&e)),
        };

        match self.backend.read_range(&path, offset as u64, size as usize) {
            Ok(data) => reply.data(&data),
            Err(e) => reply.error(to_errno(&e)),
        }
    }

    fn readdir(&mut self, _req: &Request, ino: u64, _fh: u64, offset: i64, mut reply: ReplyDirectory) {
        let path = match self.backend.inode_to_path(ino) {
            Ok(p) => p,
            Err(e) => return reply.error(to_errno(&e)),
        };

        match self.backend.read_dir(&path) {
            Ok(entries) => {
                for (i, entry) in entries.iter().enumerate().skip(offset as usize) {
                    let file_type = match entry.file_type {
                        FileType::File => fuser::FileType::RegularFile,
                        FileType::Directory => fuser::FileType::Directory,
                        FileType::Symlink => fuser::FileType::Symlink,
                    };

                    if reply.add(entry.inode, (i + 1) as i64, file_type, &entry.name) {
                        break;
                    }
                }
                reply.ok();
            }
            Err(e) => reply.error(to_errno(&e)),
        }
    }

    // ... write, create, mkdir, unlink, rmdir, rename, symlink, etc.
}

fn to_errno(e: &FsError) -> i32 {
    match e {
        FsError::NotFound { .. } => libc::ENOENT,
        FsError::AlreadyExists { .. } => libc::EEXIST,
        FsError::NotADirectory { .. } => libc::ENOTDIR,
        FsError::NotAFile { .. } => libc::EISDIR,
        FsError::DirectoryNotEmpty { .. } => libc::ENOTEMPTY,
        FsError::AccessDenied { .. } => libc::EACCES,
        FsError::ReadOnly { .. } => libc::EROFS,
        FsError::QuotaExceeded { .. } => libc::ENOSPC,
        _ => libc::EIO,
    }
}
}

WinFsp Adapter (Windows)

WinFsp has a different API but similar concepts:

#![allow(unused)]
fn main() {
use winfsp::filesystem::{FileSystem, FileSystemContext, FileInfo, DirInfo};
use anyfs_backend::{FsFuse, FsError};

pub struct WinFspAdapter<B: FsFuse> {
    backend: B,
}

impl<B: FsFuse> FileSystem for WinFspAdapter<B> {
    fn get_file_info(&self, file_context: &FileContext) -> Result<FileInfo, NTSTATUS> {
        let meta = self.backend.metadata(&file_context.path)
            .map_err(to_ntstatus)?;
        Ok(to_file_info(&meta))
    }

    fn read(&self, file_context: &FileContext, buffer: &mut [u8], offset: u64) -> Result<usize, NTSTATUS> {
        let data = self.backend.read_range(&file_context.path, offset, buffer.len())
            .map_err(to_ntstatus)?;
        buffer[..data.len()].copy_from_slice(&data);
        Ok(data.len())
    }

    fn read_directory(&self, file_context: &FileContext, marker: Option<&str>, mut callback: impl FnMut(DirInfo)) -> Result<(), NTSTATUS> {
        let entries = self.backend.read_dir(&file_context.path)
            .map_err(to_ntstatus)?;

        for entry in entries {
            // ReadDirIter yields Result<DirEntry, FsError>
            let entry = entry.map_err(to_ntstatus)?;
            callback(to_dir_info(&entry));
        }
        Ok(())
    }

    // ... write, create, delete, rename, etc.
}

fn to_ntstatus(e: FsError) -> NTSTATUS {
    match e {
        FsError::NotFound { .. } => STATUS_OBJECT_NAME_NOT_FOUND,
        FsError::AlreadyExists { .. } => STATUS_OBJECT_NAME_COLLISION,
        FsError::AccessDenied { .. } => STATUS_ACCESS_DENIED,
        FsError::ReadOnly { .. } => STATUS_MEDIA_WRITE_PROTECTED,
        _ => STATUS_INTERNAL_ERROR,
    }
}
}

Usage

Basic Mount

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, MountHandle};

// Create backend with middleware
let backend = MemoryBackend::new()
    .layer(QuotaLayer::builder()
        .max_total_size(100 * 1024 * 1024)
        .build());

// Mount as drive
let mount = MountHandle::mount(backend, "/mnt/ramdisk")?;

// Now /mnt/ramdisk is a real mount point
// Any application can read/write files there

// Unmount when done (or on drop)
mount.unmount()?;
}

Windows Drive Letter

#![allow(unused)]
fn main() {
#[cfg(windows)]
let mount = MountHandle::mount(backend, "X:")?;

// Now X: is a virtual drive
}

Check Availability

#![allow(unused)]
fn main() {
if MountHandle::is_available() {
    let mount = MountHandle::mount(backend, path)?;
} else {
    eprintln!("Mounting not available. Install:");
    #[cfg(target_os = "macos")]
    eprintln!("  - macFUSE: https://osxfuse.github.io/");
    #[cfg(windows)]
    eprintln!("  - WinFsp: https://winfsp.dev/");
}
}

Mount Options

#![allow(unused)]
fn main() {
let mount = MountHandle::builder(backend)
    .mount_point("/mnt/data")
    .read_only(true)                    // Force read-only mount
    .allow_other(true)                  // Allow other users (Linux/macOS)
    .auto_unmount(true)                 // Unmount on process exit
    .uid(1000)                          // Override UID (Linux/macOS)
    .gid(1000)                          // Override GID (Linux/macOS)
    .mount()?;
}

Error Handling

#![allow(unused)]
fn main() {
pub enum MountError {
    /// Platform doesn't support mounting (e.g., WASM)
    PlatformNotSupported,

    /// Required driver not installed (macFUSE, WinFsp)
    DriverNotInstalled {
        driver: &'static str,
        install_url: &'static str,
    },

    /// Mount point doesn't exist or isn't accessible
    InvalidMountPoint { path: PathBuf },

    /// Mount point already in use
    MountPointBusy { path: PathBuf },

    /// Permission denied (need root/admin)
    PermissionDenied,

    /// Backend error during mount
    Backend(FsError),

    /// Platform-specific error
    Platform(String),

    /// Missing mount point in options
    MissingMountPoint,
}
}

Integration with Middleware

All middleware works transparently when mounted:

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, PathFilterLayer, TracingLayer, RateLimitLayer, MountHandle};

// Build secure, audited, rate-limited mount
let backend = MemoryBackend::new()
    .layer(QuotaLayer::builder()
        .max_total_size(1024 * 1024 * 1024)  // 1 GB
        .build())
    .layer(PathFilterLayer::builder()
        .deny("**/.git/**")
        .deny("**/.env")
        .build())
    .layer(RateLimitLayer::builder()
        .max_ops(10000)
        .per_second()
        .build())
    .layer(TracingLayer::new());

let mount = MountHandle::mount(backend, "/mnt/secure")?;

// External apps see a normal filesystem
// But all operations are:
// - Quota-limited
// - Path-filtered
// - Rate-limited
// - Traced/audited

// Imagine: A mounted "USB drive" that reports real-time IOPS
// to a Prometheus dashboard!
}

Real-Time Observability

Because the mount point sits on top of your middleware stack, you get live visibility into OS operations:

  • Metrics: See live IOPS, throughput, and latency for your virtual drive in Grafana.
  • Audit Logs: Record every file your legacy app touches.
  • Virus Scanning: Scan files as the OS writes them, rejecting malware in real-time.

Use Cases

Temporary Workspace

#![allow(unused)]
fn main() {
let workspace = MemoryBackend::new();
let mount = MountHandle::mount(workspace, "/tmp/workspace")?;

// Run build tools that expect a real filesystem
std::process::Command::new("cargo")
    .current_dir("/tmp/workspace")
    .arg("build")
    .status()?;
}

Portable Database as Drive

#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate

// User's files stored in SQLite
let db = SqliteBackend::open("user_files.db")?;
let mount = MountHandle::mount(db, "U:")?;

// User can browse U: in Explorer
// Files are actually in SQLite database
}

Network Storage

#![allow(unused)]
fn main() {
// Remote backend (future anyfs-s3, anyfs-sftp, etc.)
let remote = S3Backend::new("my-bucket")?;
let cached = remote.layer(CacheLayer::builder()
    .max_size(100 * 1024 * 1024)
    .build());
let mount = MountHandle::mount(cached, "/mnt/cloud")?;

// Local apps see /mnt/cloud as regular filesystem
// Actually reads/writes to S3 with local caching
}

Platform Requirements Summary

| Platform | Driver  | Install Command / URL                            |
|----------|---------|--------------------------------------------------|
| Linux    | FUSE    | Usually pre-installed. If not: apt install fuse3 |
| macOS    | macFUSE | https://osxfuse.github.io/                       |
| Windows  | WinFsp  | https://winfsp.dev/ (recommended)                |
| Windows  | Dokan   | https://dokan-dev.github.io/ (alternative)       |

Limitations

  1. Requires external driver - Users must install macFUSE (macOS) or WinFsp (Windows)
  2. Root/admin may be required - Some mount operations need elevated privileges
  3. Not available on WASM - Browser environment has no filesystem mounting
  4. Performance overhead - Userspace filesystem has kernel boundary crossing overhead
  5. Backend must implement FsFuse - Requires FsInode trait for inode operations

Alternative: No Mount Needed

For many use cases, mounting isn’t necessary. AnyFS backends can be used directly:

| Need         | With Mounting       | Without Mounting                   |
|--------------|---------------------|------------------------------------|
| Build tools  | Mount, run tools    | Use tool’s VFS plugin if available |
| File browser | Mount as drive      | Build custom UI with AnyFS API     |
| Backup       | Mount, use rsync    | Use AnyFS API directly             |
| Database     | Mount for SQL tools | Query SQLite directly              |

Rule of thumb: Only mount when you need compatibility with external applications that expect real filesystem paths.

Tutorial: Building a TXT Backend (Yes, Really)

How to turn a humble text file into a functioning virtual filesystem


The Absurd Premise

What if your entire filesystem was just… a text file you can edit in Notepad?

path,type,mode,data
/,dir,755,
/hello.txt,file,644,SGVsbG8sIFdvcmxkIQ==
/docs,dir,755,
/docs/readme.md,file,644,IyBXZWxjb21lIQoKWWVzLCB0aGlzIGlzIGluIGEgLnR4dCBmaWxl

One line per file. Comma-separated. Base64 content. Open it in Notepad, edit a file, save, done.

Sounds ridiculous? It is. But it works. And building it teaches you everything about implementing AnyFS backends.

Let’s do this.


Why This Is Actually Useful

Beyond the memes, a TXT backend demonstrates:

  1. Backend flexibility - AnyFS doesn’t care how you store bytes
  2. Trait implementation - You’ll implement FsRead, FsWrite, FsDir
  3. Middleware composition - We’ll add Quota to prevent the file from exploding
  4. Real-world patterns - The same patterns apply to serious backends
  5. Separation of concerns - Backends just store bytes; FileStorage handles path resolution

Plus, you can literally edit your “filesystem” in Notepad. Try doing that with ext4.

Important: Backends receive already-resolved paths from FileStorage. You don’t need to handle .., symlinks, or normalization - that’s FileStorage’s job. Your backend just stores and retrieves bytes at the given paths.


The Format

One line per entry. Four comma-separated fields. Dead simple:

path,type,mode,data

| Field | Description              | Example        |
|-------|--------------------------|----------------|
| path  | Absolute path            | /docs/file.txt |
| type  | file or dir              | file           |
| mode  | Unix permissions (octal) | 644            |
| data  | Base64-encoded content   | SGVsbG8=       |

Directories have empty data field. That’s the entire format. Open in Notepad, add a line, you created a file.


Step 1: Data Structures

#![allow(unused)]
fn main() {
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use base64::{Engine as _, engine::general_purpose::STANDARD as BASE64};

/// A single entry in our TXT filesystem
#[derive(Clone, Debug)]
struct TxtEntry {
    path: PathBuf,
    is_dir: bool,
    mode: u32,
    content: Vec<u8>,
}

impl TxtEntry {
    fn new_dir(path: impl Into<PathBuf>) -> Self {
        Self {
            path: path.into(),
            is_dir: true,
            mode: 0o755,
            content: Vec::new(),
        }
    }

    fn new_file(path: impl Into<PathBuf>, content: Vec<u8>) -> Self {
        Self {
            path: path.into(),
            is_dir: false,
            mode: 0o644,
            content,
        }
    }

    /// Serialize to a line: path,type,mode,data
    fn to_line(&self) -> String {
        let file_type = if self.is_dir { "dir" } else { "file" };
        let data_b64 = if self.content.is_empty() {
            String::new()
        } else {
            BASE64.encode(&self.content)
        };

        format!("{},{},{:o},{}", self.path.display(), file_type, self.mode, data_b64)
    }

    /// Parse from line: path,type,mode,data
    fn from_line(line: &str) -> Result<Self, TxtParseError> {
        let parts: Vec<&str> = line.splitn(4, ',').collect();
        if parts.len() < 3 {
            return Err(TxtParseError::InvalidFormat);
        }

        let content = if parts.len() == 4 && !parts[3].is_empty() {
            BASE64.decode(parts[3]).map_err(|_| TxtParseError::InvalidBase64)?
        } else {
            Vec::new()
        };

        Ok(Self {
            path: PathBuf::from(parts[0]),
            is_dir: parts[1] == "dir",
            mode: u32::from_str_radix(parts[2], 8)
                .map_err(|_| TxtParseError::InvalidNumber)?,
            content,
        })
    }
}

#[derive(Debug)]
enum TxtParseError {
    InvalidFormat,
    InvalidBase64,
    InvalidNumber,
}
}

Step 2: The Backend Structure

#![allow(unused)]
fn main() {
use std::sync::{Arc, RwLock};
use std::fs::File;
use std::io::{BufRead, BufReader, Write};

/// A filesystem backend that stores everything in a .txt file.
///
/// Yes, this is cursed. Yes, it works. Yes, you can edit it in Notepad.
pub struct TxtBackend {
    /// Path to the .txt file on the host filesystem
    txt_path: PathBuf,
    /// In-memory cache of entries (path -> entry)
    entries: Arc<RwLock<HashMap<PathBuf, TxtEntry>>>,
}

impl TxtBackend {
    /// Create a new TXT backend, loading from file if it exists
    pub fn open(txt_path: impl Into<PathBuf>) -> Result<Self, FsError> {
        let txt_path = txt_path.into();
        let mut entries = HashMap::new();

        // Always ensure root directory exists
        entries.insert(PathBuf::from("/"), TxtEntry::new_dir("/"));

        // Load existing entries if file exists
        if txt_path.exists() {
            let file = File::open(&txt_path)
                .map_err(|e| FsError::Io {
                    operation: "open txt",
                    path: txt_path.clone(),
                    source: e,
                })?;

            for (line_num, line) in BufReader::new(file).lines().enumerate() {
                let line = line.map_err(|e| FsError::Io {
                    operation: "read line",
                    path: txt_path.clone(),
                    source: e,
                })?;

                // Skip header line
                if line_num == 0 && line.starts_with("path,") {
                    continue;
                }

                // Skip empty lines
                if line.trim().is_empty() {
                    continue;
                }

                let entry = TxtEntry::from_line(&line)
                    .map_err(|_| FsError::CorruptedData {
                        path: txt_path.clone(),
                        details: format!("line {}", line_num + 1),
                    })?;

                entries.insert(entry.path.clone(), entry);
            }
        }

        Ok(Self {
            txt_path,
            entries: Arc::new(RwLock::new(entries)),
        })
    }

    /// Create a new in-memory backend (won't persist to disk)
    pub fn in_memory() -> Self {
        let mut entries = HashMap::new();
        entries.insert(PathBuf::from("/"), TxtEntry::new_dir("/"));

        Self {
            txt_path: PathBuf::from(":memory:"),
            entries: Arc::new(RwLock::new(entries)),
        }
    }

    /// Flush all entries to the .txt file
    fn flush(&self) -> Result<(), FsError> {
        // Skip if in-memory mode
        if self.txt_path.as_os_str() == ":memory:" {
            return Ok(());
        }

        let entries = self.entries.read().unwrap();

        let mut file = File::create(&self.txt_path)
            .map_err(|e| FsError::Io {
                operation: "create txt",
                path: self.txt_path.clone(),
                source: e,
            })?;

        // Write header
        writeln!(file, "path,type,mode,data")
            .map_err(|e| FsError::Io {
                operation: "write header",
                path: self.txt_path.clone(),
                source: e,
            })?;

        // Write entries (sorted for consistency)
        let mut paths: Vec<_> = entries.keys().collect();
        paths.sort();

        for path in paths {
            let entry = &entries[path];
            writeln!(file, "{}", entry.to_line())
                .map_err(|e| FsError::Io {
                    operation: "write entry",
                    path: path.clone(),
                    source: e,
                })?;
        }

        Ok(())
    }

}
}

Step 3: Implement FsRead

Now the fun part - making it quack like a filesystem:

#![allow(unused)]
fn main() {
use anyfs_backend::{FsRead, FsError, Metadata, FileType, Permissions};

impl FsRead for TxtBackend {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        let entries = self.entries.read().unwrap();

        let entry = entries.get(path)
            .ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })?;

        if entry.is_dir {
            return Err(FsError::NotAFile { path: path.to_path_buf() });
        }

        Ok(entry.content.clone())
    }

    fn read_to_string(&self, path: &Path) -> Result<String, FsError> {
        let bytes = self.read(path)?;
        String::from_utf8(bytes)
            .map_err(|_| FsError::InvalidData {
                path: path.to_path_buf(),
                details: "not valid UTF-8".to_string(),
            })
    }

    fn read_range(&self, path: &Path, offset: u64, len: usize) -> Result<Vec<u8>, FsError> {
        let content = self.read(path)?;
        let start = offset as usize;

        if start >= content.len() {
            return Ok(Vec::new());
        }

        let end = (start + len).min(content.len());
        Ok(content[start..end].to_vec())
    }

    fn exists(&self, path: &Path) -> Result<bool, FsError> {
        let entries = self.entries.read().unwrap();
        Ok(entries.contains_key(path))
    }

    fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
        let entries = self.entries.read().unwrap();

        let entry = entries.get(path)
            .ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })?;

        Ok(Metadata {
            file_type: if entry.is_dir { FileType::Directory } else { FileType::File },
            size: entry.content.len() as u64,
            permissions: Permissions::from_mode(entry.mode),
            created: std::time::UNIX_EPOCH,   // TxtBackend doesn't track timestamps
            modified: std::time::UNIX_EPOCH,
            accessed: std::time::UNIX_EPOCH,
            inode: 0,   // No inode support
            nlink: 1,   // No hardlink support
        })
    }

    fn open_read(&self, path: &Path) -> Result<Box<dyn std::io::Read + Send>, FsError> {
        let content = self.read(path)?;
        Ok(Box::new(std::io::Cursor::new(content)))
    }
}
}

Step 4: Implement FsWrite

Where the magic happens - writing files to a text file:

#![allow(unused)]
fn main() {
use anyfs_backend::FsWrite;

impl FsWrite for TxtBackend {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        let path = path.to_path_buf();

        // Ensure parent directory exists
        if let Some(parent) = path.parent() {
            let parent_str = parent.to_string_lossy();
            if parent_str != "/" && !parent_str.is_empty() {
                let entries = self.entries.read().unwrap();
                if !entries.contains_key(parent) {
                    drop(entries);
                    return Err(FsError::NotFound {
                        path: parent.to_path_buf()
                    });
                }
            }
        }

        let mut entries = self.entries.write().unwrap();

        // Check if it's a directory
        if let Some(existing) = entries.get(&path) {
            if existing.is_dir {
                return Err(FsError::NotAFile { path });
            }
        }

        // Create or update the file
        let entry = TxtEntry::new_file(path.clone(), data.to_vec());
        entries.insert(path, entry);

        drop(entries);
        self.flush()?;

        Ok(())
    }

    fn append(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        let path = path.to_path_buf();
        let mut entries = self.entries.write().unwrap();

        let entry = entries.get_mut(&path)
            .ok_or_else(|| FsError::NotFound { path: path.clone() })?;

        if entry.is_dir {
            return Err(FsError::NotAFile { path });
        }

        entry.content.extend_from_slice(data);

        drop(entries);
        self.flush()?;

        Ok(())
    }

    fn remove_file(&self, path: &Path) -> Result<(), FsError> {
        let path = path.to_path_buf();
        let mut entries = self.entries.write().unwrap();

        let entry = entries.get(&path)
            .ok_or_else(|| FsError::NotFound { path: path.clone() })?;

        if entry.is_dir {
            return Err(FsError::NotAFile { path });
        }

        entries.remove(&path);

        drop(entries);
        self.flush()?;

        Ok(())
    }

    fn rename(&self, from: &Path, to: &Path) -> Result<(), FsError> {
        let from = from.to_path_buf();
        let to = to.to_path_buf();

        let mut entries = self.entries.write().unwrap();

        let mut entry = entries.remove(&from)
            .ok_or_else(|| FsError::NotFound { path: from.clone() })?;

        entry.path = to.clone();
        entries.insert(to, entry);

        drop(entries);
        self.flush()?;

        Ok(())
    }

    fn copy(&self, from: &Path, to: &Path) -> Result<(), FsError> {
        let from = from.to_path_buf();
        let to = to.to_path_buf();

        let entries = self.entries.read().unwrap();

        let source = entries.get(&from)
            .ok_or_else(|| FsError::NotFound { path: from.clone() })?;

        if source.is_dir {
            return Err(FsError::NotAFile { path: from });
        }

        let mut new_entry = source.clone();
        new_entry.path = to.clone();

        drop(entries);

        let mut entries = self.entries.write().unwrap();
        entries.insert(to, new_entry);

        drop(entries);
        self.flush()?;

        Ok(())
    }

    fn truncate(&self, path: &Path, size: u64) -> Result<(), FsError> {
        let path = path.to_path_buf();
        let mut entries = self.entries.write().unwrap();

        let entry = entries.get_mut(&path)
            .ok_or_else(|| FsError::NotFound { path: path.clone() })?;

        if entry.is_dir {
            return Err(FsError::NotAFile { path });
        }

        // Shrink or zero-extend, matching std::fs::File::set_len semantics
        entry.content.resize(size as usize, 0);

        drop(entries);
        self.flush()?;

        Ok(())
    }

    fn open_write(&self, path: &Path) -> Result<Box<dyn std::io::Write + Send>, FsError> {
        // For simplicity, we buffer writes and apply on drop
        // A real implementation would be more sophisticated
        let path = path.to_path_buf();

        // Ensure file exists (create empty if not)
        if !self.exists(&path)? {
            self.write(&path, b"")?;
        }

        Ok(Box::new(TxtFileWriter {
            backend: self.entries.clone(),
            txt_path: self.txt_path.clone(),
            path,
            buffer: Vec::new(),
        }))
    }
}

/// Writer that buffers content and writes to TXT on drop
struct TxtFileWriter {
    backend: Arc<RwLock<HashMap<PathBuf, TxtEntry>>>,
    txt_path: PathBuf,
    path: PathBuf,
    buffer: Vec<u8>,
}

impl std::io::Write for TxtFileWriter {
    fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
        self.buffer.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> std::io::Result<()> {
        Ok(())
    }
}

impl Drop for TxtFileWriter {
    fn drop(&mut self) {
        let mut entries = self.backend.write().unwrap();
        if let Some(entry) = entries.get_mut(&self.path) {
            entry.content = std::mem::take(&mut self.buffer);
        }
        // Note: flush to disk happens on next explicit flush() call
    }
}
}

Step 5: Implement FsDir

Directory operations to complete the Fs trait:

#![allow(unused)]
fn main() {
use anyfs_backend::{FsDir, DirEntry, ReadDirIter, FileType};

impl FsDir for TxtBackend {
    fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
        let path = path.to_path_buf();
        let entries = self.entries.read().unwrap();

        // Verify the path is a directory
        let entry = entries.get(&path)
            .ok_or_else(|| FsError::NotFound { path: path.clone() })?;

        if !entry.is_dir {
            return Err(FsError::NotADirectory { path });
        }

        // Find all direct children
        let mut children = Vec::new();

        for (child_path, child_entry) in entries.iter() {
            if let Some(parent) = child_path.parent() {
                if parent == path && child_path != &path {
                    children.push(DirEntry {
                        name: child_path.file_name()
                            .unwrap_or_default()
                            .to_string_lossy()
                            .into_owned(),
                        path: child_path.clone(),
                        file_type: if child_entry.is_dir {
                            FileType::Directory
                        } else {
                            FileType::File
                        },
                        size: child_entry.content.len() as u64,
                        inode: 0,  // No inode support
                    });
                }
            }
        }

        // Sort for consistent ordering
        children.sort_by(|a, b| a.name.cmp(&b.name));

        // Wrap in ReadDirIter (items are Ok since we've already validated them)
        Ok(ReadDirIter::new(children.into_iter().map(Ok)))
    }

    fn create_dir(&self, path: &Path) -> Result<(), FsError> {
        let path = path.to_path_buf();

        // Check parent exists
        if let Some(parent) = path.parent() {
            let parent_str = parent.to_string_lossy();
            if parent_str != "/" && !parent_str.is_empty() {
                let entries = self.entries.read().unwrap();
                let parent_entry = entries.get(parent)
                    .ok_or_else(|| FsError::NotFound {
                        path: parent.to_path_buf()
                    })?;

                if !parent_entry.is_dir {
                    return Err(FsError::NotADirectory {
                        path: parent.to_path_buf()
                    });
                }
            }
        }

        let mut entries = self.entries.write().unwrap();

        // Check if already exists
        if entries.contains_key(&path) {
            return Err(FsError::AlreadyExists { path, operation: "create_dir" });
        }

        entries.insert(path.clone(), TxtEntry::new_dir(path));

        drop(entries);
        self.flush()?;

        Ok(())
    }

    fn create_dir_all(&self, path: &Path) -> Result<(), FsError> {
        let path = path.to_path_buf();

        // Build list of directories to create
        let mut to_create = Vec::new();
        let mut current = path.clone();

        loop {
            {
                let entries = self.entries.read().unwrap();
                if entries.contains_key(&current) {
                    break;
                }
            }

            to_create.push(current.clone());

            match current.parent() {
                Some(parent) if !parent.as_os_str().is_empty() => {
                    current = parent.to_path_buf();
                }
                _ => break,
            }
        }

        // Create directories from root to leaf
        to_create.reverse();
        for dir_path in to_create {
            let mut entries = self.entries.write().unwrap();
            if !entries.contains_key(&dir_path) {
                entries.insert(dir_path.clone(), TxtEntry::new_dir(dir_path));
            }
        }

        self.flush()?;
        Ok(())
    }

    fn remove_dir(&self, path: &Path) -> Result<(), FsError> {
        let path = path.to_path_buf();

        // Can't remove root
        if path.to_string_lossy() == "/" {
            return Err(FsError::PermissionDenied {
                path,
                operation: "remove root directory"
            });
        }

        let entries = self.entries.read().unwrap();

        let entry = entries.get(&path)
            .ok_or_else(|| FsError::NotFound { path: path.clone() })?;

        if !entry.is_dir {
            return Err(FsError::NotADirectory { path });
        }

        // Check if empty
        let has_children = entries.keys().any(|p| {
            p != &path && p.starts_with(&path)
        });

        if has_children {
            return Err(FsError::DirectoryNotEmpty { path });
        }

        drop(entries);

        let mut entries = self.entries.write().unwrap();
        entries.remove(&path);

        drop(entries);
        self.flush()?;

        Ok(())
    }

    fn remove_dir_all(&self, path: &Path) -> Result<(), FsError> {
        let path = path.to_path_buf();

        // Can't remove root
        if path.to_string_lossy() == "/" {
            return Err(FsError::PermissionDenied {
                path,
                operation: "remove root directory"
            });
        }

        let mut entries = self.entries.write().unwrap();

        // Verify it exists and is a directory
        let entry = entries.get(&path)
            .ok_or_else(|| FsError::NotFound { path: path.clone() })?;

        if !entry.is_dir {
            return Err(FsError::NotADirectory { path: path.clone() });
        }

        // Remove all entries under this path
        let to_remove: Vec<_> = entries.keys()
            .filter(|p| p.starts_with(&path))
            .cloned()
            .collect();

        for p in to_remove {
            entries.remove(&p);
        }

        drop(entries);
        self.flush()?;

        Ok(())
    }
}
}

Step 6: Putting It All Together

Now you have a complete Fs implementation! Let’s use it:

use anyfs::{FileStorage, FileType, QuotaLayer, TracingLayer};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create our glorious TXT filesystem
    let backend = TxtBackend::open("my_filesystem.txt")?
        // Wrap it with middleware to prevent the file from exploding
        .layer(QuotaLayer::builder()
            .max_total_size(10 * 1024 * 1024)  // 10 MB max
            .max_file_size(1 * 1024 * 1024)    // 1 MB per file
            .build())
        // Add tracing because why not
        .layer(TracingLayer::new());

    // Create the filesystem wrapper
    let fs = FileStorage::new(backend);

    // Use it like any other filesystem!
    fs.create_dir_all("/projects/secret")?;
    fs.write("/projects/secret/plans.txt", b"World domination via TXT")?;
    fs.write("/projects/readme.md", b"# My TXT-backed project\n\nYes, really.")?;

    // Read it back
    let content = fs.read_to_string("/projects/secret/plans.txt")?;
    println!("Plans: {}", content);

    // List directory (the iterator yields Result<DirEntry, FsError>)
    for entry in fs.read_dir("/projects")? {
        let entry = entry?;
        println!("  {} ({})", entry.name,
            if entry.file_type == FileType::Directory { "dir" } else { "file" });
    }

    // Copy a file
    fs.copy("/projects/readme.md", "/projects/readme_backup.md")?;

    // Delete a file
    fs.remove_file("/projects/readme_backup.md")?;

    println!("\nNow open my_filesystem.txt in Notepad!");

    Ok(())
}

The Result

After running the code, your my_filesystem.txt looks like:

path,type,mode,data
/,dir,755,
/projects,dir,755,
/projects/secret,dir,755,
/projects/secret/plans.txt,file,644,V29ybGQgZG9taW5hdGlvbiB2aWEgVFhU
/projects/readme.md,file,644,IyBNeSBUWFQtYmFja2VkIHByb2plY3QKClllcywgcmVhbGx5Lg==

Open it in Notepad. Marvel at your filesystem. Edit a line. Save. You just modified a file.


Why This Actually Matters

This ridiculous example demonstrates the power of AnyFS’s design:

  1. True backend abstraction - The FileStorage API doesn’t know or care that it’s backed by a text file

  2. Middleware just works - Quota and Tracing wrap your custom backend with zero extra code

  3. Type safety preserved - Compile-time guarantees work with any backend

  4. Easy to implement - ~250 lines for a complete working backend

  5. Testable - Use TxtBackend::in_memory() for fast tests

  6. Human-editable - Open in Notepad, add a line, you created a file


Next Steps

If you’re feeling brave:

  1. Add symlink support - Implement FsLink trait
  2. Make it async - Wrap with tokio::fs for the host .txt file
  3. Add compression - Gzip the base64 content
  4. Excel integration - Add formulas that compute file sizes (why not?)

Bonus: Mount It as a Drive

With the fuse feature enabled, you can mount your text file as a real filesystem:

#![allow(unused)]
fn main() {
use anyfs::MountHandle;

let backend = TxtBackend::open("filesystem.txt")?;
let mount = MountHandle::mount(backend, "/mnt/txt")?;

// Now /mnt/txt is a real mount point backed by a .txt file
// Any application can read/write files there
// The data goes into a text file you can edit in Notepad
// This is fine
}

The Moral

AnyFS doesn’t care where bytes come from or where they go. Memory, SQLite, a text file, a REST API, carrier pigeons with USB drives - if you can implement the traits, it’s a valid backend.

The middleware layer (quotas, sandboxing, rate limiting, logging) works transparently on any backend. That’s the power of good abstractions.

Now go build something less cursed. Or don’t. I’m not your supervisor.


“I store my production data in text files” - Nobody, ever (until now)

“Can I edit my filesystem in Notepad?” - Yes. Yes you can.

Tutorial: Building Your First Middleware

From zero to intercepting filesystem operations in 15 minutes


What is Middleware?

Middleware wraps a backend and intercepts operations. That’s it.

User Request → [Your Middleware] → [Backend] → Storage
              ↑                  ↓
              └── intercept ─────┘

You can:

  • Block operations (ReadOnly, PathFilter)
  • Transform data (Encryption, Compression)
  • Count/Log operations (Counter, Tracing)
  • Enforce limits (Quota, RateLimit)

Let’s build one.


The Simplest Middleware: Operation Counter

We’ll count every operation. That’s our entire goal.

Step 1: The Struct

#![allow(unused)]
fn main() {
use std::sync::atomic::{AtomicU64, Ordering};

/// Counts every operation performed on the wrapped backend.
pub struct Counter<B> {
    inner: B,                    // The backend we're wrapping
    pub count: AtomicU64,        // Our counter
}

impl<B> Counter<B> {
    pub fn new(inner: B) -> Self {
        Self {
            inner,
            count: AtomicU64::new(0),
        }
    }

    pub fn operations(&self) -> u64 {
        self.count.load(Ordering::Relaxed)
    }
}
}

That’s the entire struct. We wrap something (inner) and add our state (count).

Step 2: Implement FsRead

Now we implement the same traits as the inner backend, intercepting each method:

#![allow(unused)]
fn main() {
use anyfs_backend::{FsRead, FsError, Metadata};
use std::path::Path;

impl<B: FsRead> FsRead for Counter<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        self.count.fetch_add(1, Ordering::Relaxed);  // COUNT IT
        self.inner.read(path)                         // DELEGATE
    }

    fn read_to_string(&self, path: &Path) -> Result<String, FsError> {
        self.count.fetch_add(1, Ordering::Relaxed);
        self.inner.read_to_string(path)
    }

    fn read_range(&self, path: &Path, offset: u64, len: usize) -> Result<Vec<u8>, FsError> {
        self.count.fetch_add(1, Ordering::Relaxed);
        self.inner.read_range(path, offset, len)
    }

    fn exists(&self, path: &Path) -> Result<bool, FsError> {
        self.count.fetch_add(1, Ordering::Relaxed);
        self.inner.exists(path)
    }

    fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
        self.count.fetch_add(1, Ordering::Relaxed);
        self.inner.metadata(path)
    }

    fn open_read(&self, path: &Path) -> Result<Box<dyn std::io::Read + Send>, FsError> {
        self.count.fetch_add(1, Ordering::Relaxed);
        self.inner.open_read(path)
    }
}
}

The pattern is always the same:

  1. Do your thing (count)
  2. Call self.inner.method(args) (delegate)

Step 3: Implement FsWrite

Same pattern:

#![allow(unused)]
fn main() {
use anyfs_backend::FsWrite;

impl<B: FsWrite> FsWrite for Counter<B> {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        self.count.fetch_add(1, Ordering::Relaxed);
        self.inner.write(path, data)
    }

    fn append(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        self.count.fetch_add(1, Ordering::Relaxed);
        self.inner.append(path, data)
    }

    fn remove_file(&self, path: &Path) -> Result<(), FsError> {
        self.count.fetch_add(1, Ordering::Relaxed);
        self.inner.remove_file(path)
    }

    fn rename(&self, from: &Path, to: &Path) -> Result<(), FsError> {
        self.count.fetch_add(1, Ordering::Relaxed);
        self.inner.rename(from, to)
    }

    fn copy(&self, from: &Path, to: &Path) -> Result<(), FsError> {
        self.count.fetch_add(1, Ordering::Relaxed);
        self.inner.copy(from, to)
    }

    fn truncate(&self, path: &Path, size: u64) -> Result<(), FsError> {
        self.count.fetch_add(1, Ordering::Relaxed);
        self.inner.truncate(path, size)
    }

    fn open_write(&self, path: &Path) -> Result<Box<dyn std::io::Write + Send>, FsError> {
        self.count.fetch_add(1, Ordering::Relaxed);
        self.inner.open_write(path)
    }
}
}

Step 4: Implement FsDir

#![allow(unused)]
fn main() {
use anyfs_backend::{FsDir, ReadDirIter};

impl<B: FsDir> FsDir for Counter<B> {
    fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
        self.count.fetch_add(1, Ordering::Relaxed);
        self.inner.read_dir(path)
    }

    fn create_dir(&self, path: &Path) -> Result<(), FsError> {
        self.count.fetch_add(1, Ordering::Relaxed);
        self.inner.create_dir(path)
    }

    fn create_dir_all(&self, path: &Path) -> Result<(), FsError> {
        self.count.fetch_add(1, Ordering::Relaxed);
        self.inner.create_dir_all(path)
    }

    fn remove_dir(&self, path: &Path) -> Result<(), FsError> {
        self.count.fetch_add(1, Ordering::Relaxed);
        self.inner.remove_dir(path)
    }

    fn remove_dir_all(&self, path: &Path) -> Result<(), FsError> {
        self.count.fetch_add(1, Ordering::Relaxed);
        self.inner.remove_dir_all(path)
    }
}

// Counter<B> now implements Fs when B: Fs (blanket impl)!
}
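
Intercepting a wide trait means a lot of identical method bodies. If that bothers you, a small `macro_rules!` helper can generate the count-then-delegate pattern. This is a self-contained sketch against a toy `Store` trait (not the AnyFS traits), just to show the shape:

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

// Toy stand-in for a backend trait, so the sketch runs on its own.
trait Store {
    fn get(&self, key: &str) -> Option<String>;
    fn put(&self, key: &str, value: &str);
}

struct Counter<B> {
    inner: B,
    count: AtomicU64,
}

// Expands to a "count, then delegate" method body.
macro_rules! counted {
    ($name:ident(&self $(, $arg:ident : $ty:ty)*) -> $ret:ty) => {
        fn $name(&self $(, $arg: $ty)*) -> $ret {
            self.count.fetch_add(1, Ordering::Relaxed);
            self.inner.$name($($arg),*)
        }
    };
}

impl<B: Store> Store for Counter<B> {
    counted!(get(&self, key: &str) -> Option<String>);
    counted!(put(&self, key: &str, value: &str) -> ());
}

// A trivial in-memory Store to exercise the wrapper.
struct MemStore(Mutex<HashMap<String, String>>);

impl Store for MemStore {
    fn get(&self, key: &str) -> Option<String> {
        self.0.lock().unwrap().get(key).cloned()
    }
    fn put(&self, key: &str, value: &str) {
        self.0.lock().unwrap().insert(key.into(), value.into());
    }
}

fn main() {
    let store = Counter {
        inner: MemStore(Mutex::new(HashMap::new())),
        count: AtomicU64::new(0),
    };
    store.put("a", "1");
    assert_eq!(store.get("a"), Some("1".to_string()));
    assert_eq!(store.count.load(Ordering::Relaxed), 2);  // one put + one get
}
```

The same macro shape works for the real `FsRead`/`FsWrite`/`FsDir` methods; each invocation saves two lines of boilerplate per method.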

Step 5: Use It

use anyfs::{FileStorage, MemoryBackend};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let fs = FileStorage::new(Counter::new(MemoryBackend::new()));

    fs.write("/hello.txt", b"Hello, World!")?;
    fs.read("/hello.txt")?;
    fs.read("/hello.txt")?;
    fs.exists("/hello.txt")?;

    println!("Total operations: {}", fs.operations());  // 4

    Ok(())
}

That’s it. You built middleware.


Adding .layer() Support

Want the fluent .layer() syntax? Add a Layer struct:

#![allow(unused)]
fn main() {
use anyfs_backend::{Layer, Fs};

/// Layer for creating Counter middleware.
pub struct CounterLayer;

impl<B: Fs> Layer<B> for CounterLayer {
    type Backend = Counter<B>;

    fn layer(self, backend: B) -> Counter<B> {
        Counter::new(backend)
    }
}
}

Now you can do:

#![allow(unused)]
fn main() {
use anyfs::FileStorage;

let fs = FileStorage::new(
    MemoryBackend::new()
        .layer(CounterLayer)
);

fs.write("/test.txt", b"data")?;
println!("Operations: {}", fs.operations());
}

A More Useful Middleware: SecretBlocker

Let’s build something practical - block access to files matching a pattern:

#![allow(unused)]
fn main() {
use anyfs_backend::{FsRead, FsWrite, FsDir, FsError, Metadata, ReadDirIter};
use std::path::Path;

/// Blocks access to files containing "secret" in the path.
pub struct SecretBlocker<B> {
    inner: B,
}

impl<B> SecretBlocker<B> {
    pub fn new(inner: B) -> Self {
        Self { inner }
    }

    /// Check if path is forbidden.
    fn is_secret(&self, path: &Path) -> bool {
        path.to_string_lossy().to_lowercase().contains("secret")
    }

    /// Return error if path is secret.
    fn check(&self, path: &Path) -> Result<(), FsError> {
        if self.is_secret(path) {
            Err(FsError::AccessDenied {
                path: path.to_path_buf(),
                reason: "secret files are blocked".to_string(),
            })
        } else {
            Ok(())
        }
    }
}
}

Implement the Traits

#![allow(unused)]
fn main() {
impl<B: FsRead> FsRead for SecretBlocker<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        self.check(path)?;           // BLOCK if secret
        self.inner.read(path)        // DELEGATE otherwise
    }

    fn read_to_string(&self, path: &Path) -> Result<String, FsError> {
        self.check(path)?;
        self.inner.read_to_string(path)
    }

    fn read_range(&self, path: &Path, offset: u64, len: usize) -> Result<Vec<u8>, FsError> {
        self.check(path)?;
        self.inner.read_range(path, offset, len)
    }

    fn exists(&self, path: &Path) -> Result<bool, FsError> {
        self.check(path)?;
        self.inner.exists(path)
    }

    fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
        self.check(path)?;
        self.inner.metadata(path)
    }

    fn open_read(&self, path: &Path) -> Result<Box<dyn std::io::Read + Send>, FsError> {
        self.check(path)?;
        self.inner.open_read(path)
    }
}

impl<B: FsWrite> FsWrite for SecretBlocker<B> {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        self.check(path)?;
        self.inner.write(path, data)
    }

    fn append(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        self.check(path)?;
        self.inner.append(path, data)
    }

    fn remove_file(&self, path: &Path) -> Result<(), FsError> {
        self.check(path)?;
        self.inner.remove_file(path)
    }

    fn rename(&self, from: &Path, to: &Path) -> Result<(), FsError> {
        self.check(from)?;
        self.check(to)?;  // Block both source and destination
        self.inner.rename(from, to)
    }

    fn copy(&self, from: &Path, to: &Path) -> Result<(), FsError> {
        self.check(from)?;
        self.check(to)?;
        self.inner.copy(from, to)
    }

    fn truncate(&self, path: &Path, size: u64) -> Result<(), FsError> {
        self.check(path)?;
        self.inner.truncate(path, size)
    }

    fn open_write(&self, path: &Path) -> Result<Box<dyn std::io::Write + Send>, FsError> {
        self.check(path)?;
        self.inner.open_write(path)
    }
}

impl<B: FsDir> FsDir for SecretBlocker<B> {
    fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
        self.check(path)?;
        self.inner.read_dir(path)
    }

    fn create_dir(&self, path: &Path) -> Result<(), FsError> {
        self.check(path)?;
        self.inner.create_dir(path)
    }

    fn create_dir_all(&self, path: &Path) -> Result<(), FsError> {
        self.check(path)?;
        self.inner.create_dir_all(path)
    }

    fn remove_dir(&self, path: &Path) -> Result<(), FsError> {
        self.check(path)?;
        self.inner.remove_dir(path)
    }

    fn remove_dir_all(&self, path: &Path) -> Result<(), FsError> {
        self.check(path)?;
        self.inner.remove_dir_all(path)
    }
}
}

Use It

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let fs = FileStorage::new(SecretBlocker::new(MemoryBackend::new()));

    // These work fine
    fs.write("/public/data.txt", b"Hello!")?;
    fs.read("/public/data.txt")?;

    // These are blocked
    assert!(fs.write("/secret/passwords.txt", b"hunter2").is_err());
    assert!(fs.read("/my-secret-diary.txt").is_err());
    assert!(fs.create_dir("/SECRET").is_err());

    println!("Secret files successfully blocked!");
    Ok(())
}

The Middleware Pattern Cheat Sheet

| What You Want | Intercept | Delegate | Return |
|---|---|---|---|
| Count operations | Before call | Always | Inner result |
| Block some paths | Before call | If allowed | Error or inner result |
| Block writes | Write methods | Read methods | Error or inner result |
| Transform data | read/write | Everything else | Modified data |
| Log operations | Before/after | Always | Inner result |

Three Types of Middleware

1. Pass-through with side effects (Counter, Logger)

#![allow(unused)]
fn main() {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
    log::info!("Reading: {:?}", path);  // Side effect
    self.inner.read(path)               // Always delegate
}
}

2. Conditional blocking (PathFilter, ReadOnly)

#![allow(unused)]
fn main() {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
    if self.is_blocked(path) {
        return Err(FsError::AccessDenied { /* ... */ });  // Block
    }
    self.inner.write(path, data)                    // Allow
}
}

3. Data transformation (Encryption, Compression)

#![allow(unused)]
fn main() {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
    let encrypted = self.inner.read(path)?;  // Get data
    Ok(self.decrypt(&encrypted))              // Transform
}

fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
    let encrypted = self.encrypt(data);       // Transform
    self.inner.write(path, &encrypted)        // Store
}
}

Example: Indexing Middleware (Future)

Use IndexLayer to keep a queryable index of file activity:

#![allow(unused)]
fn main() {
use anyfs::{IndexLayer, IndexConsistency, FileStorage, MemoryBackend};

let backend = MemoryBackend::new()
    .layer(IndexLayer::builder()
        .index_file("index.db")
        .consistency(IndexConsistency::Strict)
        .track_reads(false)
        .build());

let fs = FileStorage::new(backend);
fs.write("/docs/hello.txt", b"hello")?;
}

Complete Example: ReadOnly Middleware

The classic - block all writes:

#![allow(unused)]
fn main() {
use anyfs_backend::{FsRead, FsWrite, FsDir, FsError, Metadata, ReadDirIter, Layer, Fs};
use std::path::Path;

/// Makes any backend read-only.
pub struct ReadOnly<B> {
    inner: B,
}

impl<B> ReadOnly<B> {
    pub fn new(inner: B) -> Self {
        Self { inner }
    }
}

// FsRead: delegate everything
impl<B: FsRead> FsRead for ReadOnly<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        self.inner.read(path)
    }

    fn read_to_string(&self, path: &Path) -> Result<String, FsError> {
        self.inner.read_to_string(path)
    }

    fn read_range(&self, path: &Path, offset: u64, len: usize) -> Result<Vec<u8>, FsError> {
        self.inner.read_range(path, offset, len)
    }

    fn exists(&self, path: &Path) -> Result<bool, FsError> {
        self.inner.exists(path)
    }

    fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
        self.inner.metadata(path)
    }

    fn open_read(&self, path: &Path) -> Result<Box<dyn std::io::Read + Send>, FsError> {
        self.inner.open_read(path)
    }
}

// FsWrite: block everything
impl<B: FsWrite> FsWrite for ReadOnly<B> {
    fn write(&self, _: &Path, _: &[u8]) -> Result<(), FsError> {
        Err(FsError::ReadOnly { operation: "write" })
    }

    fn append(&self, _: &Path, _: &[u8]) -> Result<(), FsError> {
        Err(FsError::ReadOnly { operation: "append" })
    }

    fn remove_file(&self, _: &Path) -> Result<(), FsError> {
        Err(FsError::ReadOnly { operation: "remove_file" })
    }

    fn rename(&self, _: &Path, _: &Path) -> Result<(), FsError> {
        Err(FsError::ReadOnly { operation: "rename" })
    }

    fn copy(&self, _: &Path, _: &Path) -> Result<(), FsError> {
        Err(FsError::ReadOnly { operation: "copy" })
    }

    fn truncate(&self, _: &Path, _: u64) -> Result<(), FsError> {
        Err(FsError::ReadOnly { operation: "truncate" })
    }

    fn open_write(&self, _: &Path) -> Result<Box<dyn std::io::Write + Send>, FsError> {
        Err(FsError::ReadOnly { operation: "open_write" })
    }
}

// FsDir: delegate reads, block writes
impl<B: FsDir> FsDir for ReadOnly<B> {
    fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
        self.inner.read_dir(path)  // Reading is OK
    }

    fn create_dir(&self, _: &Path) -> Result<(), FsError> {
        Err(FsError::ReadOnly { operation: "create_dir" })
    }

    fn create_dir_all(&self, _: &Path) -> Result<(), FsError> {
        Err(FsError::ReadOnly { operation: "create_dir_all" })
    }

    fn remove_dir(&self, _: &Path) -> Result<(), FsError> {
        Err(FsError::ReadOnly { operation: "remove_dir" })
    }

    fn remove_dir_all(&self, _: &Path) -> Result<(), FsError> {
        Err(FsError::ReadOnly { operation: "remove_dir_all" })
    }
}

// Layer for .layer() syntax
pub struct ReadOnlyLayer;

impl<B: Fs> Layer<B> for ReadOnlyLayer {
    type Backend = ReadOnly<B>;

    fn layer(self, backend: B) -> Self::Backend {
        ReadOnly::new(backend)
    }
}
}

Usage

#![allow(unused)]
fn main() {
let fs = FileStorage::new(
    MemoryBackend::new()
        .layer(ReadOnlyLayer)
);

// Reads work
fs.exists("/anything")?;

// Writes fail
assert!(fs.write("/file.txt", b"data").is_err());
assert!(fs.create_dir("/new").is_err());
}

Stacking Middleware

Middleware composes naturally:

#![allow(unused)]
fn main() {
let fs = MemoryBackend::new()
    .layer(SecretBlockerLayer)      // Block secret files
    .layer(ReadOnlyLayer)           // Make read-only
    .layer(CounterLayer);           // Count operations

// Layers wrap from the inside out. A request flows:
// Counter (outermost) → ReadOnly → SecretBlocker → MemoryBackend (innermost)
// The outermost layer sees every request first; the innermost layer is the
// last stop before the backend itself.
}
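
The order is easy to verify with a toy stack. The sketch below uses hypothetical `Named`/`Backend` types (not the AnyFS API): each wrapper records its name before delegating inward, so the trace shows exactly who sees a request first:

```rust
use std::sync::Mutex;

trait Handle {
    fn handle(&self, trace: &Mutex<Vec<&'static str>>);
}

struct Backend;

impl Handle for Backend {
    fn handle(&self, trace: &Mutex<Vec<&'static str>>) {
        trace.lock().unwrap().push("backend");
    }
}

// A generic wrapper: record our name, then delegate to the inner layer.
struct Named<B>(&'static str, B);

impl<B: Handle> Handle for Named<B> {
    fn handle(&self, trace: &Mutex<Vec<&'static str>>) {
        trace.lock().unwrap().push(self.0);  // intercept first
        self.1.handle(trace);                // then delegate inward
    }
}

fn main() {
    // .layer()-style stacking: the last layer applied ends up outermost.
    let stack = Named("counter", Named("readonly", Named("blocker", Backend)));
    let trace = Mutex::new(Vec::new());
    stack.handle(&trace);
    assert_eq!(
        *trace.lock().unwrap(),
        ["counter", "readonly", "blocker", "backend"]
    );
}
```

One practical consequence of this order: with `Counter` outermost, even operations that `ReadOnly` rejects still get counted.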

Middleware Checklist

Before publishing your middleware:

  • Depends only on anyfs-backend
  • Implements same traits as inner backend (FsRead, FsWrite, FsDir)
  • Has a Layer implementation for .layer() syntax
  • Documents which operations are intercepted vs delegated
  • Handles errors properly (doesn’t panic)
  • Is thread-safe (&self methods, use atomics/locks for state)
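
The last point is worth a concrete check. Because interception happens through `&self` and an atomic, a counter-style middleware needs no external locking; this std-only sketch (a toy `Counter`, not the tutorial's generic one) demonstrates concurrent use:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

struct Counter {
    count: AtomicU64,
}

impl Counter {
    fn touch(&self) {
        // Relaxed ordering is enough for a statistics counter; no other
        // memory is synchronized through it.
        self.count.fetch_add(1, Ordering::Relaxed);
    }
}

fn main() {
    let counter = Arc::new(Counter { count: AtomicU64::new(0) });

    // 8 threads, 1000 operations each: no locks, no lost updates.
    let handles: Vec<_> = (0..8)
        .map(|_| {
            let c = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..1000 {
                    c.touch();
                }
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }

    assert_eq!(counter.count.load(Ordering::Relaxed), 8000);
}
```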

Summary

Middleware is just:

  1. A struct wrapping inner: B
  2. Implementing the same traits as B
  3. Intercepting some methods, delegating others

The three patterns:

  1. Side effects: Do something, then delegate
  2. Blocking: Check condition, return error or delegate
  3. Transform: Modify data on the way in/out

That’s it. Go build something useful.


“Middleware: because sometimes you need to do something between nothing and everything.”

Remote Backend Patterns

Building networked filesystem backends and clients

This guide covers patterns for exposing AnyFS backends over a network and building clients that mount remote filesystems.


Overview

A remote filesystem has three components:

┌─────────────┐      Network       ┌─────────────┐      ┌─────────────┐
│   Client    │ ←───────────────→  │   Server    │  ──→ │   Backend   │
│  (FUSE)     │     RPC/REST       │   (API)     │      │  (Storage)  │
└─────────────┘                    └─────────────┘      └─────────────┘
     User's                           Cloud               SQLite/CAS
     Machine                          Service             Hybrid/etc

AnyFS backends are local by design. To go remote, you need:

  1. Server: Exposes backend operations over network
  2. Protocol: Wire format for requests/responses
  3. Client: Implements Fs traits by calling server

Protocol Design

Operations to Expose

Map Fs trait methods to RPC operations:

| Trait Method | RPC Operation | Notes |
|---|---|---|
| read(path) | Read(path, range?) | Support partial reads |
| write(path, data) | Write(path, data) | Chunked for large files |
| exists(path) | Exists(path) | Or combine with Metadata |
| metadata(path) | Metadata(path) | Return full stat |
| read_dir(path) | ListDir(path, cursor?) | Paginated for large dirs |
| create_dir(path) | CreateDir(path) | |
| create_dir_all(path) | CreateDirAll(path) | Or client-side loop |
| remove_file(path) | Remove(path) | |
| remove_dir(path) | RemoveDir(path) | |
| remove_dir_all(path) | RemoveDirAll(path) | Recursive |
| rename(from, to) | Rename(from, to) | |
| copy(from, to) | Copy(from, to) | Server-side copy |
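
Before reaching for protobuf, the operation set above can be prototyped as a plain Rust enum; variant and field names here are illustrative, not a fixed wire format:

```rust
// One variant per RPC operation from the mapping table.
#[derive(Debug)]
enum Operation {
    Read { path: String, offset: Option<u64>, length: Option<u64> },
    Write { path: String, data: Vec<u8>, append: bool },
    Metadata { path: String },
    ListDir { path: String, cursor: Option<String>, limit: Option<u32> },
    CreateDir { path: String },
    Remove { path: String },
    Rename { from: String, to: String },
    Copy { from: String, to: String },
}

// Server-side dispatch reduces to one match over the enum.
fn op_name(op: &Operation) -> &'static str {
    match op {
        Operation::Read { .. } => "read",
        Operation::Write { .. } => "write",
        Operation::Metadata { .. } => "metadata",
        Operation::ListDir { .. } => "list_dir",
        Operation::CreateDir { .. } => "create_dir",
        Operation::Remove { .. } => "remove",
        Operation::Rename { .. } => "rename",
        Operation::Copy { .. } => "copy",
    }
}

fn main() {
    let op = Operation::Read { path: "/a.txt".into(), offset: None, length: Some(16) };
    assert_eq!(op_name(&op), "read");
}
```

Deriving serde `Serialize`/`Deserialize` on such an enum gives a JSON protocol almost for free, which is handy for debugging before switching to protobuf.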

Request/Response Format

Use a simple, efficient format. Here’s a protobuf-style schema:

// requests.proto

message Request {
  string request_id = 1;  // For idempotency
  string auth_token = 2;  // Authentication

  oneof operation {
    ReadRequest read = 10;
    WriteRequest write = 11;
    MetadataRequest metadata = 12;
    ListDirRequest list_dir = 13;
    CreateDirRequest create_dir = 14;
    RemoveRequest remove = 15;
    RenameRequest rename = 16;
    CopyRequest copy = 17;
  }
}

message ReadRequest {
  string path = 1;
  optional uint64 offset = 2;
  optional uint64 length = 3;
}

message WriteRequest {
  string path = 1;
  bytes data = 2;
  bool append = 3;
}

message MetadataRequest {
  string path = 1;
}

message ListDirRequest {
  string path = 1;
  optional string cursor = 2;  // For pagination
  optional uint32 limit = 3;
}

// ... other requests

message Response {
  string request_id = 1;
  bool success = 2;

  oneof result {
    ErrorResult error = 10;
    ReadResult read = 11;
    WriteResult write = 12;
    MetadataResult metadata = 13;
    ListDirResult list_dir = 14;
    // ... others return empty success
  }
}

message ErrorResult {
  string code = 1;    // "not_found", "permission_denied", etc.
  string message = 2;
  string path = 3;
}

message MetadataResult {
  string file_type = 1;  // "file", "dir", "symlink"
  uint64 size = 2;
  uint32 mode = 3;
  optional uint64 created_at = 4;
  optional uint64 modified_at = 5;
  optional uint64 accessed_at = 6;
  optional uint64 inode = 7;
  optional uint32 nlink = 8;
}

message ListDirResult {
  repeated DirEntry entries = 1;
  optional string next_cursor = 2;  // Null if no more
}

message DirEntry {
  string name = 1;
  string path = 2;       // Full path to entry
  string file_type = 3;
  uint64 size = 4;
  optional uint64 inode = 5;
}

Protocol Choices

| Protocol | Pros | Cons | Use When |
|---|---|---|---|
| gRPC | Fast, typed, streaming | Complex setup | High performance |
| REST/JSON | Simple, debuggable | Slower, no streaming | Compatibility |
| WebSocket | Bidirectional, real-time | More complex | Live updates |
| Custom TCP | Maximum control | Build everything | Special needs |

Recommendation: Start with gRPC (tonic in Rust). Fall back to REST for web clients.


Server Implementation

Basic Server Structure

#![allow(unused)]
fn main() {
use tonic::{transport::Server, Request, Response, Status};
use anyfs_backend::{Fs, FsError};
use anyfs::FileStorage;

pub struct FsServer<B: Fs> {
    backend: FileStorage<B>,
}

impl<B: Fs + Send + Sync + 'static> FsServer<B> {
    pub fn new(backend: B) -> Self {
        Self { backend: FileStorage::new(backend) }
    }

    pub async fn serve(self, addr: &str) -> Result<(), Box<dyn std::error::Error>> {
        let addr = addr.parse()?;

        Server::builder()
            .add_service(FsServiceServer::new(self))
            .serve(addr)
            .await?;

        Ok(())
    }
}

#[tonic::async_trait]
impl<B: Fs + Send + Sync + 'static> FsService for FsServer<B> {
    async fn read(
        &self,
        request: Request<ReadRequest>,
    ) -> Result<Response<ReadResponse>, Status> {
        let req = request.into_inner();

        let data = match req.length {
            Some(len) => self.backend.read_range(&req.path, req.offset.unwrap_or(0), len as usize),
            None => self.backend.read(&req.path),
        };

        match data {
            Ok(bytes) => Ok(Response::new(ReadResponse {
                data: bytes,
                success: true,
                error: None,
            })),
            Err(e) => Ok(Response::new(ReadResponse {
                data: vec![],
                success: false,
                error: Some(fs_error_to_proto(e)),
            })),
        }
    }

    async fn write(
        &self,
        request: Request<WriteRequest>,
    ) -> Result<Response<WriteResponse>, Status> {
        let req = request.into_inner();

        let result = if req.append {
            self.backend.append(&req.path, &req.data)
        } else {
            self.backend.write(&req.path, &req.data)
        };

        match result {
            Ok(()) => Ok(Response::new(WriteResponse {
                success: true,
                error: None,
            })),
            Err(e) => Ok(Response::new(WriteResponse {
                success: false,
                error: Some(fs_error_to_proto(e)),
            })),
        }
    }

    // ... implement other methods
}

fn fs_error_to_proto(e: FsError) -> ProtoError {
    match e {
        FsError::NotFound { path } => ProtoError {
            code: "not_found".into(),
            message: "File not found".into(),
            path: path.to_string_lossy().into(),
        },
        FsError::AlreadyExists { path, .. } => ProtoError {
            code: "already_exists".into(),
            message: "File already exists".into(),
            path: path.to_string_lossy().into(),
        },
        // ... map other errors
        _ => ProtoError {
            code: "internal".into(),
            message: e.to_string(),
            path: String::new(),
        },
    }
}
}

Authentication Middleware

Add authentication as a tower layer:

#![allow(unused)]
fn main() {
use std::collections::HashSet;
use std::sync::Arc;
use tonic::service::Interceptor;
use tonic::{Request, Status};

#[derive(Clone)]
pub struct AuthInterceptor {
    valid_tokens: Arc<HashSet<String>>,
}

impl Interceptor for AuthInterceptor {
    fn call(&mut self, mut request: Request<()>) -> Result<Request<()>, Status> {
        let token = request
            .metadata()
            .get("authorization")
            .and_then(|v| v.to_str().ok())
            .map(|s| s.trim_start_matches("Bearer "));

        match token {
            Some(t) if self.valid_tokens.contains(t) => Ok(request),
            _ => Err(Status::unauthenticated("Invalid or missing token")),
        }
    }
}

// Usage
Server::builder()
    .add_service(FsServiceServer::with_interceptor(fs_server, auth_interceptor))
    .serve(addr)
    .await?;
}

Rate Limiting

Protect against abuse:

#![allow(unused)]
fn main() {
use governor::{DefaultKeyedRateLimiter, Quota, RateLimiter};
use std::num::NonZeroU32;

pub struct RateLimitedServer<B: Fs> {
    inner: FsServer<B>,
    limiter: DefaultKeyedRateLimiter<String>,  // Per-user keyed rate limiter
}

impl<B: Fs> RateLimitedServer<B> {
    pub fn new(backend: B, requests_per_second: u32) -> Self {
        let quota = Quota::per_second(NonZeroU32::new(requests_per_second).unwrap());
        Self {
            inner: FsServer::new(backend),
            limiter: RateLimiter::keyed(quota),
        }
    }

    async fn check_rate_limit(&self, user_id: &str) -> Result<(), Status> {
        self.limiter
            .check_key(&user_id.to_string())
            .map_err(|_| Status::resource_exhausted("Rate limit exceeded"))?;
        Ok(())
    }
}
}

Idempotency

Handle retried requests safely:

#![allow(unused)]
fn main() {
use std::collections::HashMap;
use std::sync::RwLock;
use std::time::{Duration, Instant};

pub struct IdempotencyCache {
    cache: RwLock<HashMap<String, (Instant, Response)>>,
    ttl: Duration,
}

impl IdempotencyCache {
    pub fn new(ttl: Duration) -> Self {
        Self {
            cache: RwLock::new(HashMap::new()),
            ttl,
        }
    }

    /// Check if we've seen this request before.
    pub fn get(&self, request_id: &str) -> Option<Response> {
        let cache = self.cache.read().unwrap();
        cache.get(request_id)
            .filter(|(ts, _)| ts.elapsed() < self.ttl)
            .map(|(_, resp)| resp.clone())
    }

    /// Store response for future duplicate requests.
    pub fn put(&self, request_id: String, response: Response) {
        let mut cache = self.cache.write().unwrap();
        cache.insert(request_id, (Instant::now(), response));
    }

    /// Clean up expired entries (call periodically).
    pub fn cleanup(&self) {
        let mut cache = self.cache.write().unwrap();
        cache.retain(|_, (ts, _)| ts.elapsed() < self.ttl);
    }
}
}
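
Wiring the cache into a handler is a check-then-record pattern. A std-only sketch with a `String` standing in for the real `Response` type, and a hypothetical `get_or_run` helper combining the `get`/`put` calls above:

```rust
use std::collections::HashMap;
use std::sync::RwLock;
use std::time::{Duration, Instant};

struct IdempotencyCache {
    cache: RwLock<HashMap<String, (Instant, String)>>,
    ttl: Duration,
}

impl IdempotencyCache {
    fn new(ttl: Duration) -> Self {
        Self { cache: RwLock::new(HashMap::new()), ttl }
    }

    /// Replay a cached response, or run `work` once and remember its result.
    fn get_or_run(&self, request_id: &str, work: impl FnOnce() -> String) -> String {
        if let Some((ts, resp)) = self.cache.read().unwrap().get(request_id) {
            if ts.elapsed() < self.ttl {
                return resp.clone();  // duplicate request: replay
            }
        }
        let resp = work();
        self.cache
            .write()
            .unwrap()
            .insert(request_id.to_string(), (Instant::now(), resp.clone()));
        resp
    }
}

fn main() {
    let cache = IdempotencyCache::new(Duration::from_secs(60));
    let mut executions = 0;

    let r1 = cache.get_or_run("req-1", || { executions += 1; "ok".to_string() });
    let r2 = cache.get_or_run("req-1", || { executions += 1; "ok".to_string() });

    assert_eq!(r1, "ok");
    assert_eq!(r2, "ok");
    assert_eq!(executions, 1);  // the retry was replayed, not re-executed
}
```

Note the check and the work are not atomic here; two concurrent duplicates can both execute. That is usually acceptable for idempotent operations, but a per-key lock closes the window if needed.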

Client Implementation

Remote Backend (Client-Side)

The client implements Fs traits by making RPC calls:

#![allow(unused)]
fn main() {
use anyfs_backend::{FsRead, FsWrite, FsDir, FsError, Metadata, ReadDirIter, DirEntry};
use std::path::Path;

pub struct RemoteBackend {
    client: FsServiceClient<tonic::transport::Channel>,
    auth_token: String,
}

impl RemoteBackend {
    pub async fn connect(addr: &str, auth_token: String) -> Result<Self, FsError> {
        let client = FsServiceClient::connect(addr.to_string())
            .await
            .map_err(|e| FsError::Backend(format!("connect failed: {}", e)))?;

        Ok(Self { client, auth_token })
    }

    fn request<T>(&self, req: T) -> tonic::Request<T> {
        let mut request = tonic::Request::new(req);
        request.metadata_mut().insert(
            "authorization",
            format!("Bearer {}", self.auth_token).parse().unwrap(),
        );
        request
    }
}

impl FsRead for RemoteBackend {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        // Note: This is sync, but we're calling async code
        // In practice, use tokio::runtime::Handle or async traits

        let rt = tokio::runtime::Handle::current();
        rt.block_on(async {
            let req = self.request(ReadRequest {
                path: path.to_string_lossy().into(),
                offset: None,
                length: None,
            });

            let response = self.client.clone().read(req)
                .await
                .map_err(|e| FsError::Backend(format!("rpc failed: {}", e)))?
                .into_inner();

            if response.success {
                Ok(response.data)
            } else {
                Err(proto_error_to_fs(response.error.unwrap()))
            }
        })
    }

    fn exists(&self, path: &Path) -> Result<bool, FsError> {
        // Could be a dedicated RPC or use metadata
        match self.metadata(path) {
            Ok(_) => Ok(true),
            Err(FsError::NotFound { .. }) => Ok(false),
            Err(e) => Err(e),
        }
    }

    fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
        let rt = tokio::runtime::Handle::current();
        rt.block_on(async {
            let req = self.request(MetadataRequest {
                path: path.to_string_lossy().into(),
            });

            let response = self.client.clone().metadata(req)
                .await
                .map_err(|e| FsError::Backend(format!("rpc failed: {}", e)))?
                .into_inner();

            if response.success {
                Ok(proto_metadata_to_fs(response.metadata.unwrap()))
            } else {
                Err(proto_error_to_fs(response.error.unwrap()))
            }
        })
    }

    // ... other methods
}

impl FsWrite for RemoteBackend {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        let rt = tokio::runtime::Handle::current();
        rt.block_on(async {
            let req = self.request(WriteRequest {
                path: path.to_string_lossy().into(),
                data: data.to_vec(),
                append: false,
            });

            let response = self.client.clone().write(req)
                .await
                .map_err(|e| FsError::Backend(format!("rpc failed: {}", e)))?
                .into_inner();

            if response.success {
                Ok(())
            } else {
                Err(proto_error_to_fs(response.error.unwrap()))
            }
        })
    }

    // ... other methods
}

impl FsDir for RemoteBackend {
    fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
        let rt = tokio::runtime::Handle::current();
        rt.block_on(async {
            let mut all_entries = Vec::new();
            let mut cursor = None;

            // Paginate through all results
            loop {
                let req = self.request(ListDirRequest {
                    path: path.to_string_lossy().into(),
                    cursor: cursor.clone(),
                    limit: Some(1000),
                });

                let response = self.client.clone().list_dir(req)
                    .await
                    .map_err(|e| FsError::Backend(format!("rpc failed: {}", e)))?
                    .into_inner();

                if !response.success {
                    return Err(proto_error_to_fs(response.error.unwrap()));
                }

                all_entries.extend(response.entries.into_iter().map(proto_entry_to_fs));

                match response.next_cursor {
                    Some(c) => cursor = Some(c),
                    None => break,
                }
            }

            Ok(ReadDirIter::new(all_entries.into_iter().map(Ok)))
        })
    }

    // ... other methods
}
}
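
The client code above calls `proto_error_to_fs`, the inverse of the server's `fs_error_to_proto`. A sketch of that mapping, using minimal stand-in types so it runs standalone (the real `FsError` and `ProtoError` come from anyfs-backend and the generated proto code):

```rust
use std::path::PathBuf;

// Stand-in for the generated proto error message.
struct ProtoError {
    code: String,
    message: String,
    path: String,
}

// Stand-in mirroring a few anyfs-backend error variants.
#[derive(Debug)]
enum FsError {
    NotFound { path: PathBuf },
    AlreadyExists { path: PathBuf },
    Backend(String),
}

// Map wire error codes back to typed client-side errors, mirroring
// the codes emitted by fs_error_to_proto on the server.
fn proto_error_to_fs(e: ProtoError) -> FsError {
    let path = PathBuf::from(&e.path);
    match e.code.as_str() {
        "not_found" => FsError::NotFound { path },
        "already_exists" => FsError::AlreadyExists { path },
        // Unknown codes degrade to an opaque backend error.
        _ => FsError::Backend(format!("{}: {}", e.code, e.message)),
    }
}

fn main() {
    let err = proto_error_to_fs(ProtoError {
        code: "not_found".into(),
        message: "File not found".into(),
        path: "/missing.txt".into(),
    });
    assert!(matches!(err, FsError::NotFound { .. }));
}
```

Keeping this mapping exhaustive on the client matters: a code the client does not recognize should degrade gracefully rather than panic.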

Caching Layer

Network calls are slow. Add caching:

#![allow(unused)]
fn main() {
use anyfs_backend::{FsRead, FsWrite, FsError, Metadata};
use lru::LruCache;  // external `lru` crate
use std::collections::HashMap;
use std::num::NonZeroUsize;
use std::path::{Path, PathBuf};
use std::sync::RwLock;
use std::time::{Duration, Instant};

/// Client-side cache for remote filesystem.
pub struct CachingBackend<B> {
    inner: B,
    metadata_cache: RwLock<HashMap<PathBuf, (Instant, Metadata)>>,
    content_cache: RwLock<LruCache<PathBuf, Vec<u8>>>,
    metadata_ttl: Duration,
    max_cached_file_size: u64,
}

impl<B> CachingBackend<B> {
    pub fn new(inner: B) -> Self {
        Self {
            inner,
            metadata_cache: RwLock::new(HashMap::new()),
            content_cache: RwLock::new(LruCache::new(NonZeroUsize::new(100).unwrap())),  // 100 files
            metadata_ttl: Duration::from_secs(5),
            max_cached_file_size: 1024 * 1024,  // 1 MB
        }
    }

    /// Invalidate cache for a path (call after writes).
    pub fn invalidate(&self, path: &Path) {
        self.metadata_cache.write().unwrap().remove(path);
        self.content_cache.write().unwrap().pop(path);
    }

    /// Invalidate everything under a directory.
    pub fn invalidate_prefix(&self, prefix: &Path) {
        let mut meta = self.metadata_cache.write().unwrap();
        let mut content = self.content_cache.write().unwrap();

        meta.retain(|k, _| !k.starts_with(prefix));
        // LruCache doesn't have retain, so we'd need a different structure
    }
}

impl<B: FsRead> FsRead for CachingBackend<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        // Check cache first
        if let Some(data) = self.content_cache.read().unwrap().peek(path) {
            return Ok(data.clone());
        }

        // Cache miss - fetch from remote
        let data = self.inner.read(path)?;

        // Cache if small enough
        if data.len() as u64 <= self.max_cached_file_size {
            self.content_cache.write().unwrap().put(path.to_path_buf(), data.clone());
        }

        Ok(data)
    }

    fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {

        // Check cache
        {
            let cache = self.metadata_cache.read().unwrap();
            if let Some((ts, meta)) = cache.get(path) {
                if ts.elapsed() < self.metadata_ttl {
                    return Ok(meta.clone());
                }
            }
        }

        // Cache miss
        let meta = self.inner.metadata(path)?;

        // Store in cache
        self.metadata_cache.write().unwrap()
            .insert(path.to_path_buf(), (Instant::now(), meta.clone()));

        Ok(meta)
    }

    // ... other methods
}

impl<B: FsWrite> FsWrite for CachingBackend<B> {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {

        // Write through to remote
        self.inner.write(path, data)?;

        // Invalidate cache
        self.invalidate(path);

        Ok(())
    }

    // ... other methods - all invalidate cache after modifying
}
}

Cache Invalidation Strategies

| Strategy | How | When to Use |
|---|---|---|
| TTL | Expire after N seconds | Read-heavy, eventual consistency OK |
| Write-through | Invalidate on local write | Single client |
| Server push | WebSocket notifications | Real-time consistency |
| Version/ETag | Check version on read | Balance of consistency/perf |
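The Version/ETag strategy can be sketched without a network stack. Everything below (FakeServer, EtagCache) is hypothetical scaffolding standing in for a real RPC client:

```rust
use std::collections::HashMap;

/// Hypothetical server handle: path -> (etag, contents).
struct FakeServer {
    files: HashMap<String, (String, Vec<u8>)>,
}

impl FakeServer {
    /// Returns None when the client's etag is still current (a "304 Not Modified" equivalent).
    fn fetch_if_modified(&self, path: &str, etag: Option<&str>) -> Option<(String, Vec<u8>)> {
        let (server_etag, data) = self.files.get(path)?;
        if etag == Some(server_etag.as_str()) {
            return None; // not modified - client cache is valid
        }
        Some((server_etag.clone(), data.clone()))
    }
}

/// Client-side cache keyed by path, remembering the etag of each entry.
struct EtagCache {
    entries: HashMap<String, (String, Vec<u8>)>,
}

impl EtagCache {
    fn read(&mut self, server: &FakeServer, path: &str) -> Option<Vec<u8>> {
        let cached_etag = self.entries.get(path).map(|(e, _)| e.clone());
        match server.fetch_if_modified(path, cached_etag.as_deref()) {
            Some((etag, data)) => {
                // Changed (or first read): refresh the cache entry.
                self.entries.insert(path.to_string(), (etag, data.clone()));
                Some(data)
            }
            // Unchanged: serve the cached bytes without transferring data.
            None => self.entries.get(path).map(|(_, d)| d.clone()),
        }
    }
}
```

On a hit the server answers "not modified" and the cached bytes are served; only changed files pay the transfer cost.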

Offline Mode

Handle network failures gracefully:

#![allow(unused)]
fn main() {
use anyfs::FileStorage;

pub struct OfflineCapableBackend<B> {
    remote: FileStorage<B>,
    local_cache: SqliteBackend,  // Local SQLite for offline ops
    mode: RwLock<ConnectionMode>,
    pending_writes: RwLock<Vec<PendingWrite>>,
}

#[derive(Clone, Copy)]
enum ConnectionMode {
    Online,
    Offline,
    Reconnecting,
}

struct PendingWrite {
    path: PathBuf,
    operation: WriteOperation,
    timestamp: Instant,
}

enum WriteOperation {
    Write(Vec<u8>),
    Append(Vec<u8>),
    Remove,
    CreateDir,
    // ...
}

impl<B: Fs> OfflineCapableBackend<B> {
    fn is_online(&self) -> bool {
        matches!(*self.mode.read().unwrap(), ConnectionMode::Online)
    }

    fn go_offline(&self) {
        *self.mode.write().unwrap() = ConnectionMode::Offline;
    }

    fn try_reconnect(&self) -> bool {
        *self.mode.write().unwrap() = ConnectionMode::Reconnecting;

        // Try a simple operation
        if self.remote.exists("/").is_ok() {
            *self.mode.write().unwrap() = ConnectionMode::Online;
            self.sync_pending_writes();
            true
        } else {
            *self.mode.write().unwrap() = ConnectionMode::Offline;
            false
        }
    }

    fn sync_pending_writes(&self) {
        let mut pending = self.pending_writes.write().unwrap();

        // Sync from the front; on failure, keep the failed write and
        // everything after it queued for the next attempt.
        while let Some(write) = pending.first() {
            let result = match &write.operation {
                WriteOperation::Write(data) => self.remote.write(&write.path, data),
                WriteOperation::Append(data) => self.remote.append(&write.path, data),
                WriteOperation::Remove => self.remote.remove_file(&write.path),
                WriteOperation::CreateDir => self.remote.create_dir(&write.path),
            };

            if result.is_err() {
                // Stop syncing; retry later
                // (In practice, need conflict resolution)
                break;
            }

            pending.remove(0);
        }
    }
}

impl<B: FsRead> FsRead for OfflineCapableBackend<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {

        if self.is_online() {
            match self.remote.read(path) {
                Ok(data) => {
                    // Update local cache
                    let _ = self.local_cache.write(path, &data);
                    Ok(data)
                }
                Err(FsError::Backend(_)) => {
                    // Network error - go offline, try cache
                    self.go_offline();
                    self.local_cache.read(path)
                }
                Err(e) => Err(e),
            }
        } else {
            // Offline - use cache
            self.local_cache.read(path)
        }
    }
}

impl<B: FsWrite> FsWrite for OfflineCapableBackend<B> {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {

        // Always write to local cache
        self.local_cache.write(path, data)?;

        if self.is_online() {
            match self.remote.write(path, data) {
                Ok(()) => Ok(()),
                Err(FsError::Backend(_)) => {
                    // Network error - queue for later
                    self.go_offline();
                    self.pending_writes.write().unwrap().push(PendingWrite {
                        path: path.to_path_buf(),
                        operation: WriteOperation::Write(data.to_vec()),
                        timestamp: Instant::now(),
                    });
                    Ok(())  // Return success - we wrote locally
                }
                Err(e) => Err(e),
            }
        } else {
            // Offline - queue for later sync
            self.pending_writes.write().unwrap().push(PendingWrite {
                path: path.to_path_buf(),
                operation: WriteOperation::Write(data.to_vec()),
                timestamp: Instant::now(),
            });
            Ok(())
        }
    }
}
}

Conflict Resolution

When syncing offline writes, conflicts can occur:

#![allow(unused)]
fn main() {
enum ConflictResolution {
    /// Server version wins (discard local changes)
    ServerWins,
    /// Client version wins (overwrite server)
    ClientWins,
    /// Keep both (rename local to .conflict)
    KeepBoth,
    /// Ask user
    Manual,
}

fn resolve_conflict<B: Fs>(
    remote: &FileStorage<B>,
    path: &str,
    local_data: &[u8],
    _server_data: &[u8],
    strategy: ConflictResolution,
) -> Result<(), FsError> {
    match strategy {
        ConflictResolution::ServerWins => {
            // Discard local changes; the server version stands
            Ok(())
        }
        ConflictResolution::ClientWins => {
            // Overwrite server with local
            remote.write(path, local_data)
        }
        ConflictResolution::KeepBoth => {
            // Write the local copy alongside as path.conflict
            let conflict_path = format!("{}.conflict", path);
            remote.write(&conflict_path, local_data)?;
            Ok(())
        }
        ConflictResolution::Manual => {
            Err(FsError::Conflict { path: PathBuf::from(path) })
        }
    }
}
}

FUSE Client

Mount the remote filesystem locally using FUSE:

#![allow(unused)]
fn main() {
use fuser::{Filesystem, MountOption, Request, ReplyData, ReplyEntry, ReplyAttr, ReplyDirectory};
use bimap::BiMap;  // external `bimap` crate
use std::ffi::OsStr;
use std::sync::atomic::AtomicU64;
use std::time::Duration;

pub struct RemoteFuse<B: Fs> {
    backend: B,
    // Inode management for FUSE
    inodes: RwLock<BiMap<u64, PathBuf>>,
    next_inode: AtomicU64,
}

impl<B: Fs> Filesystem for RemoteFuse<B> {
    fn lookup(&mut self, _req: &Request, parent: u64, name: &OsStr, reply: ReplyEntry) {
        let parent_path = self.inode_to_path(parent);
        let path = parent_path.join(name);

        match self.backend.metadata(&path) {
            Ok(meta) => {
                let inode = self.path_to_inode(&path);
                let attr = metadata_to_fuse_attr(inode, &meta);
                reply.entry(&Duration::from_secs(1), &attr, 0);
            }
            Err(_) => reply.error(libc::ENOENT),
        }
    }

    fn read(
        &mut self,
        _req: &Request,
        ino: u64,
        _fh: u64,
        offset: i64,
        size: u32,
        _flags: i32,
        _lock_owner: Option<u64>,
        reply: ReplyData,
    ) {
        let path = self.inode_to_path(ino);

        match self.backend.read_range(&path, offset as u64, size as usize) {
            Ok(data) => reply.data(&data),
            Err(_) => reply.error(libc::EIO),
        }
    }

    fn write(
        &mut self,
        _req: &Request,
        ino: u64,
        _fh: u64,
        offset: i64,
        data: &[u8],
        _write_flags: u32,
        _flags: i32,
        _lock_owner: Option<u64>,
        reply: fuser::ReplyWrite,
    ) {
        let path = self.inode_to_path(ino);

        // For simplicity, read-modify-write
        // (Real impl would use open_write with seeking)
        match self.backend.read(&path) {
            Ok(mut content) => {
                let offset = offset as usize;
                if offset > content.len() {
                    content.resize(offset, 0);
                }
                if offset + data.len() > content.len() {
                    content.resize(offset + data.len(), 0);
                }
                content[offset..offset + data.len()].copy_from_slice(data);

                match self.backend.write(&path, &content) {
                    Ok(()) => reply.written(data.len() as u32),
                    Err(_) => reply.error(libc::EIO),
                }
            }
            Err(_) => reply.error(libc::EIO),
        }
    }

    fn readdir(
        &mut self,
        _req: &Request,
        ino: u64,
        _fh: u64,
        offset: i64,
        mut reply: ReplyDirectory,
    ) {
        let path = self.inode_to_path(ino);

        match self.backend.read_dir(&path) {
            Ok(entries) => {
                let entries: Vec<_> = entries.filter_map(|e| e.ok()).collect();

                for (i, entry) in entries.iter().enumerate().skip(offset as usize) {
                    let child_path = path.join(&entry.name);
                    let child_inode = self.path_to_inode(&child_path);
                    let file_type = match entry.file_type {
                        FileType::File => fuser::FileType::RegularFile,
                        FileType::Directory => fuser::FileType::Directory,
                        FileType::Symlink => fuser::FileType::Symlink,
                    };

                    if reply.add(child_inode, (i + 1) as i64, file_type, &entry.name) {
                        break;  // Buffer full
                    }
                }
                reply.ok();
            }
            Err(_) => reply.error(libc::EIO),
        }
    }

    // ... implement other FUSE methods
}

// Mount the remote filesystem
pub fn mount_remote(backend: impl Fs, mountpoint: &Path) -> Result<(), Box<dyn Error>> {
    let fuse = RemoteFuse::new(backend);

    fuser::mount2(
        fuse,
        mountpoint,
        &[
            MountOption::RO,  // Or RW
            MountOption::FSName("anyfs-remote".to_string()),
            MountOption::AutoUnmount,
        ],
    )?;

    Ok(())
}
}

Summary: Building a Cloud Filesystem

To build a complete cloud filesystem service:

Server Side

  1. Wrap your backend (e.g., IndexedBackend or custom) with middleware
  2. Expose via gRPC/REST server
  3. Add authentication, rate limiting, idempotency

Client Side

  1. Implement RemoteBackend that calls server RPC
  2. Wrap with CachingBackend for performance
  3. Optionally add OfflineCapableBackend
  4. Mount via FUSE for native OS integration

Architecture

┌─────────────────────────────────────────────────────────────┐
│  Client Machine                                             │
│  ┌─────────┐    ┌────────────┐    ┌───────────────────┐    │
│  │  FUSE   │ →  │  Caching   │ →  │  RemoteBackend    │    │
│  │ Mount   │    │  Backend   │    │  (RPC Client)     │    │
│  └─────────┘    └────────────┘    └─────────┬─────────┘    │
└────────────────────────────────────────────│───────────────┘
                                              │ Network
┌────────────────────────────────────────────│───────────────┐
│  Server                                     ▼               │
│  ┌─────────────────┐    ┌─────────────────────────────┐    │
│  │   RPC Server    │ →  │  Middleware Stack           │    │
│  │  (Auth, Rate)   │    │  Quota → Tracing → Backend  │    │
│  └─────────────────┘    └─────────────┬───────────────┘    │
│                                        ▼                    │
│                         ┌─────────────────────────────┐    │
│                         │     IndexedBackend          │    │
│                         │  SQLite Index + Disk Blobs  │    │
│                         └─────────────────────────────┘    │
└─────────────────────────────────────────────────────────────┘

This gives you a complete cloud filesystem with:

  • Native OS mounting (FUSE)
  • Offline support
  • Caching for performance
  • Server-side quotas and logging
  • Large file streaming performance

AnyFS: Comparison, Positioning & Honest Assessment

A comprehensive look at why AnyFS exists, how it compares, and where it falls short


Origin Story

AnyFS didn’t start as a filesystem abstraction. It started as a security problem.

The Path Security Problem

While exploring filesystem security, I created the strict-path crate to ensure that externally-sourced paths could never escape their boundaries. The approach: resolve a boundary path, resolve the provided path, and validate containment.

This proved far more challenging than expected. Attack vectors kept appearing:

  • Symlinks pointing outside the boundary
  • Windows junction points
  • NTFS Alternate Data Streams (file.txt:hidden:$DATA)
  • Windows 8.3 short names (PROGRA~1)
  • Linux /proc magic symlinks that escape namespaces
  • Unicode normalization tricks (NFC vs NFD)
  • URL-encoded traversal (%2e%2e)
  • TOCTOU race conditions

Eventually, strict-path addressed 19+ attack vectors, making it (apparently) comprehensive. But it came with costs:

  • I/O overhead - Real filesystem resolution is expensive
  • Existing paths only - std::fs::canonicalize requires paths to exist
  • Residual TOCTOU risk - A symlink created between verification and operation (extremely rare, but possible)

The SQLite Revelation

Then a new idea emerged: What if the filesystem didn’t exist on disk at all?

A SQLite-backed virtual filesystem would:

  • Eliminate path security issues - Paths are just database keys, not real files
  • Be fully portable - A tenant’s entire filesystem in one .db file
  • Have no TOCTOU - Database transactions are atomic
  • Work on non-existing paths - No canonicalization needed

The Abstraction Need

But then: What if I wanted to switch from SQLite to something else later?

I didn’t want to rewrite code just to explore different backends. I needed an abstraction.

The Framework Vision

Research revealed that existing VFS solutions were either:

  • Too simple - Just swappable backends, no policies
  • Too fixed - Specific to one use case (AI agents, archives, etc.)
  • Insecure - Basic .. traversal prevention, missing 17+ attack vectors

My niche is security: isolating filesystems, limiting actions, controlling resources.

The Tower/Axum pattern for HTTP showed how to compose middleware elegantly. Why not apply the same pattern to filesystems?

Thus AnyFS: A composable middleware framework for filesystem operations.


The Landscape: What Already Exists

Rust Ecosystem

| Library | Stars | Downloads | Purpose |
|---|---|---|---|
| vfs | 464 | 1,700+ deps | Swappable filesystem backends |
| virtual-filesystem | ~30 | ~260/mo | Backends with basic sandboxing |
| AgentFS | New | Alpha | AI agent state management |

Other Languages

| Library | Language | Strength |
|---|---|---|
| fsspec | Python | Async, caching, 20+ backends |
| PyFilesystem2 | Python | Clean URL-based API |
| Afero | Go | Composition patterns |
| Apache Commons VFS | Java | Enterprise, many backends |
| System.IO.Abstractions | .NET | Testing, mirrors System.IO |

Honest Comparison

What Others Do Well

vfs crate:

  • Mature (464 stars, 1,700+ dependent projects)
  • Multiple backends (Memory, Physical, Overlay, Embedded)
  • Async support (though being sunset)
  • Simple, focused API

virtual-filesystem:

  • ZIP/TAR archive support
  • Mountable filesystem
  • Basic sandboxing attempt

AgentFS:

  • Purpose-built for AI agents
  • SQLite backend with FUSE mounting
  • Key-value store included
  • Audit trail built-in
  • Backed by Turso (funded company)
  • TypeScript/Python SDKs

fsspec (Python):

  • Block-wise caching (not just whole-file)
  • Async-first design
  • Excellent data science integration

What Others Do Poorly

Security in existing solutions is inadequate.

I examined virtual-filesystem’s SandboxedPhysicalFS. Here’s their entire security implementation:

#![allow(unused)]
fn main() {
impl PathResolver for SandboxedPathResolver {
    fn resolve_path(root: &Path, path: &str) -> Result<PathBuf> {
        let root = root.canonicalize()?;
        let host_path = root.join(make_relative(path)).canonicalize()?;

        if !host_path.starts_with(root) {
            return Err(io::Error::new(ErrorKind::PermissionDenied, "Traversal prevented"));
        }
        Ok(host_path)
    }
}
}

That’s it. ~10 lines covering 2 out of 19+ attack vectors.

| Attack Vector | virtual-filesystem | strict-path |
|---|---|---|
| Basic .. traversal | ✅ | ✅ |
| Symlink following | ✅ | ✅ |
| NTFS Alternate Data Streams | ❌ | ✅ |
| Windows 8.3 short names | ❌ | ✅ |
| Unicode normalization | ❌ | ✅ |
| TOCTOU race conditions | ❌ | ✅ |
| Non-existing paths | ❌ FAILS | ✅ |
| URL-encoded traversal | ❌ | ✅ |
| Windows UNC paths | ❌ | ✅ |
| Linux /proc magic symlinks | ❌ | ✅ |
| Null byte injection | ❌ | ✅ |
| Unicode direction override | ❌ | ✅ |
| Windows reserved names | ❌ | ✅ |
| Junction point escapes | ❌ | ✅ |
| Coverage | 2/19 | 19/19 |
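To make one of these vectors concrete: URL-encoded traversal slips past any check that only looks for a literal ".." sequence. A toy illustration (the decoder below handles only %2e; real attacks use many more encodings):

```rust
// Toy percent-decoder handling only the %2e / %2E escape, to show why
// traversal checks must run on the decoded path, not the raw input.
fn decode_dots(path: &str) -> String {
    path.replace("%2e", ".").replace("%2E", ".")
}

// Naive check on the raw string: misses encoded forms entirely.
fn naive_check(path: &str) -> bool {
    !path.contains("..")
}

// Checking after decoding catches the encoded variant.
fn decoded_check(path: &str) -> bool {
    !decode_dots(path).contains("..")
}
```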

The vfs crate’s AltrootFS is similarly basic - just path prefix translation.

No middleware composition exists anywhere.

None of the filesystem libraries offer Tower-style middleware. You can’t do something like:

#![allow(unused)]
fn main() {
// Hypothetical - doesn't exist in other libraries
backend
    .layer(QuotaLayer)
    .layer(RateLimitLayer)
    .layer(TracingLayer)
}

If you want quotas in vfs, you’d have to build it INTO each backend. Then build it again for the next backend.


What Makes AnyFS Unique

1. Middleware Composition (Nobody Else Has This)

#![allow(unused)]
fn main() {
let fs = MemoryBackend::new()
    .layer(QuotaLayer::builder()
        .max_total_size(100 * 1024 * 1024)  // 100 MB
        .build())
    .layer(RateLimitLayer::builder()
        .max_ops(100)
        .per_second()
        .build())
    .layer(PathFilterLayer::builder()
        .allow("/workspace/**")
        .deny("/workspace/.git/**")
        .build())
    .layer(TracingLayer::new());
}

Add, remove, or reorder middleware without touching backends. Write middleware once, use with any backend.

2. Type-Safe Domain Separation (User-Defined Wrappers)

#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate

// Users who need type-safe domain separation can create wrapper types
struct SandboxFs(FileStorage<MemoryBackend>);
struct UserDataFs(FileStorage<SqliteBackend>);

let sandbox = SandboxFs(FileStorage::new(memory_backend));
let userdata = UserDataFs(FileStorage::new(sqlite_backend));

fn process_sandbox(fs: &SandboxFs) { /* ... */ }

process_sandbox(&sandbox);   // OK
process_sandbox(&userdata);  // COMPILE ERROR - different type
}

Compile-time prevention of mixing storage domains via user-defined wrapper types.

3. Backend-Agnostic Policies (Nobody Else Has This)

| Middleware | Function | Works on ANY backend |
|---|---|---|
| Quota<B> | Size/count limits | ✅ |
| RateLimit<B> | Ops per second | ✅ |
| PathFilter<B> | Path-based access control | ✅ |
| Restrictions<B> | Disable operations | ✅ |
| Tracing<B> | Audit logging | ✅ |
| ReadOnly<B> | Block all writes | ✅ |
| Cache<B> | LRU caching | ✅ |
| Overlay<B1,B2> | Union filesystem | ✅ |
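The point of the table is that a policy is written once against the backend trait, not once per backend. A minimal stand-in sketch of the pattern (the Fs trait and types below are simplified stand-ins, not the real anyfs-backend API):

```rust
use std::collections::HashMap;

// Minimal stand-in for the AnyFS backend trait (illustrative, not the real API).
trait Fs {
    fn write(&mut self, path: &str, data: &[u8]) -> Result<(), String>;
}

// One concrete backend.
struct Memory {
    files: HashMap<String, Vec<u8>>,
}

impl Fs for Memory {
    fn write(&mut self, path: &str, data: &[u8]) -> Result<(), String> {
        self.files.insert(path.to_string(), data.to_vec());
        Ok(())
    }
}

// A quota policy generic over ANY backend: written once, reusable everywhere.
struct Quota<B> {
    inner: B,
    used: usize,
    max: usize,
}

impl<B: Fs> Fs for Quota<B> {
    fn write(&mut self, path: &str, data: &[u8]) -> Result<(), String> {
        if self.used + data.len() > self.max {
            return Err("quota exceeded".into());
        }
        self.inner.write(path, data)?;
        self.used += data.len();
        Ok(())
    }
}
```

Because `Quota<B>` itself implements the trait, layers nest: `Quota<Tracing<Memory>>` composes without either layer knowing about the other.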

4. Comprehensive Security Testing

The planned conformance test suite targets 50+ security tests covering:

  • Path traversal (URL-encoded, backslash, mixed)
  • Symlink attacks (escape, loops, TOCTOU)
  • Platform-specific (NTFS ADS, 8.3 names, /proc)
  • Unicode (normalization, RTL override, homoglyphs)
  • Resource exhaustion

Derived from vulnerabilities in Apache Commons VFS, Afero, PyFilesystem2, and our own strict-path research.


Honest Downsides of AnyFS

1. We’re New, They’re Established

| Metric | vfs | AnyFS |
|---|---|---|
| Stars | 464 | 0 (new) |
| Dependent projects | 1,700+ | 0 (new) |
| Years maintained | 5+ | New |
| Contributors | 17 | 1 |

Reality: The vfs crate works fine for 90% of use cases. If you just need swappable backends for testing, vfs is battle-tested.

2. Complexity vs Simplicity

#![allow(unused)]
fn main() {
// vfs: Simple
let root = VfsPath::new(MemoryFS::new());
root.join("test.txt")?.create_file()?.write_all(b"hello")?;

// AnyFS: More setup if you use middleware
let backend = MemoryBackend::new()
    .layer(QuotaLayer::builder().max_total_size(1024 * 1024).build());  // 1 MB
let fs = FileStorage::new(backend);
fs.write("/test.txt", b"hello")?;
}

If you don’t need middleware, AnyFS adds conceptual overhead.

3. Sync-Only (For Now)

AnyFS is sync-first. In an async-dominated ecosystem (Tokio, etc.), this may limit adoption.

fsspec (Python) and OpenDAL (Rust) are async-first. We’re not.

Mitigation: ADR-024 plans async support. Our Send + Sync bounds enable spawn_blocking wrappers today.

4. AgentFS Has Momentum for AI Agents

If you’re building AI agents specifically:

| Feature | AgentFS | AnyFS |
|---|---|---|
| SQLite backend | ✅ | ✅ |
| FUSE mounting | ✅ | Planned |
| Key-value store | ✅ | ❌ (different abstraction) |
| Tool call auditing | ✅ Built-in | Via Tracing middleware |
| TypeScript SDK | ✅ | ❌ |
| Python SDK | Coming | ❌ |
| Corporate backing | Turso | None |

AgentFS is purpose-built for AI agents with corporate resources. We’re a general-purpose framework.

5. Performance Overhead

Middleware composition has costs:

  • Each layer adds a function call
  • Quota tracking requires size accounting
  • Rate limiting needs timestamp checks

For hot paths with millions of ops/second, this matters. For normal usage, it doesn’t.

6. Real Filesystem Security Has Limits

For VRootFsBackend (wrapping real filesystem):

  • Still has I/O costs for path resolution
  • Residual TOCTOU risk (extremely rare)
  • strict-path covers 19 vectors, but unknown unknowns exist

Virtual backends (Memory, SQLite) don’t have these issues - paths are just keys.


Feature Matrix

| Feature | AnyFS | vfs | virtual-fs | AgentFS | OpenDAL |
|---|---|---|---|---|---|
| Composable middleware | ✅ | ❌ | ❌ | ❌ | ❌ |
| Multiple backends | ✅ | ✅ | ✅ | ❌ | ✅ |
| SQLite backend | ✅ | ❌ | ❌ | ✅ | ❌ |
| Memory backend | ✅ | ✅ | ✅ | ❌ | ✅ |
| Quota enforcement | ✅ | ❌ | ❌ | ❌ | ❌ |
| Rate limiting | ✅ | ❌ | ❌ | ❌ | ❌ |
| Type-safe wrappers | ✅* | ❌ | ❌ | ❌ | ❌ |
| Path sandboxing | ✅ | Basic | Basic (2 vectors) | ❌ | ❌ |
| Async API | 🔜 | Partial | ❌ | ✅ | ✅ |
| std::fs-aligned API | ✅ | Custom | Custom | ❌ | ❌ |
| FUSE mounting | MVP scope | ❌ | ❌ | ✅ | ❌ |
| Conformance tests | Planned (80+) | Unknown | Unknown | Unknown | Unknown |

When to Use AnyFS

Good Fit

  • Multi-tenant SaaS - Per-tenant quotas, path isolation, rate limiting
  • Untrusted input sandboxing - Comprehensive path security
  • Policy-heavy environments - When you need composable rules
  • Backend flexibility - When you might swap storage later
  • Type-safe domain separation - When mixing containers is dangerous

Not a Good Fit

  • Simple testing - vfs is simpler if you just need mock FS
  • AI agent runtime - AgentFS has more features for that specific use case
  • Cloud storage - OpenDAL is async-first with cloud backends
  • Async-first codebases - Wait for AnyFS async support
  • Must mount filesystem - Use anyfs with fuse/winfsp feature flags

Summary

AnyFS exists because:

  1. Existing VFS libraries have basic, inadequate security (2/19 attack vectors)
  2. No filesystem library offers middleware composition
  3. No filesystem library offers type-safe domain separation
  4. Policy enforcement (quotas, rate limits, path filtering) doesn’t exist elsewhere

AnyFS is honest about:

  1. We’re new, vfs is established
  2. We add complexity if you don’t need middleware
  3. We’re sync-only for now
  4. AgentFS has more resources for AI-specific use cases

AnyFS is positioned as:

“Tower for filesystems” - Composable middleware over pluggable backends, with comprehensive security testing.



Security Considerations

Security model, threat analysis, and containment guarantees


Overview

AnyFS is designed with security as a primary concern. Security policies are enforced via composable middleware, not hardcoded in backends or the container wrapper.


Threat Model

In Scope (Mitigated by Middleware)

| Threat | Description | Middleware |
|---|---|---|
| Path traversal | Access files outside allowed paths | PathFilter |
| Symlink attacks | Use symlinks to bypass controls | Backend-dependent (see below) |
| Resource exhaustion | Fill storage or create excessive files | Quota |
| Runaway processes | Excessive operations consuming resources | RateLimit |
| Unauthorized writes | Modifications to read-only data | ReadOnly |
| Sensitive file access | Access to .env, secrets, etc. | PathFilter |

Out of Scope

| Threat | Reason |
|---|---|
| Side-channel attacks | Requires OS-level mitigations |
| Physical access | Disk encryption is application’s responsibility |
| SQLite vulnerabilities | Upstream dependency; update regularly |
| Network attacks | AnyFS is local storage, not network-facing |

Security Architecture

1. Middleware-Based Policy

Security policies are composable middleware layers:

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, PathFilterLayer, RateLimitLayer, TracingLayer};

let secure_backend = MemoryBackend::new()
    .layer(QuotaLayer::builder()              // Limit resources
        .max_total_size(100 * 1024 * 1024)
        .build())
    .layer(PathFilterLayer::builder()         // Sandbox paths
        .allow("/workspace/**")
        .deny("**/.env")
        .deny("**/secrets/**")
        .build())
    .layer(RateLimitLayer::builder()          // Throttle operations
        .max_ops(1000)
        .per_second()
        .build())
    .layer(TracingLayer::new());              // Audit trail
}

2. Path Sandboxing (PathFilter)

PathFilter middleware restricts path access using glob patterns:

#![allow(unused)]
fn main() {
PathFilterLayer::builder()
    .allow("/workspace/**")    // Allow workspace access
    .deny("**/.env")           // Block .env files
    .deny("**/secrets/**")     // Block secrets directories
    .deny("**/*.key")          // Block key files
    .build()
    .layer(backend)
}

Guarantees:

  • First matching rule wins
  • No rule = denied (deny by default)
  • read_dir filters denied entries from results
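The evaluation rule can be sketched as follows. The matcher below is a crude trailing-`**` prefix check standing in for real glob matching, and under first-match-wins the more specific deny rules must be listed before broader allow rules:

```rust
enum Rule {
    Allow(&'static str),
    Deny(&'static str),
}

// Toy matcher: a trailing "**" means prefix match. A real implementation
// would use a proper glob crate (this does not handle leading "**").
fn matches(pattern: &str, path: &str) -> bool {
    if let Some(prefix) = pattern.strip_suffix("**") {
        path.starts_with(prefix)
    } else {
        pattern == path
    }
}

// First matching rule wins; no rule means denied (deny by default).
fn is_allowed(rules: &[Rule], path: &str) -> bool {
    for rule in rules {
        match rule {
            Rule::Allow(p) if matches(p, path) => return true,
            Rule::Deny(p) if matches(p, path) => return false,
            _ => {}
        }
    }
    false
}
```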

Symlink/hard-link capability is determined by trait bounds, not middleware:

#![allow(unused)]
fn main() {
// MemoryBackend implements FsLink → symlinks work
let fs = FileStorage::new(MemoryBackend::new());
fs.symlink("/target", "/link")?;  // ✅ Works

// Custom backend without FsLink → symlinks won't compile
let fs = FileStorage::new(MySimpleBackend::new());
fs.symlink("/target", "/link")?;  // ❌ Compile error
}

If you don’t want symlinks: Use a backend that doesn’t implement FsLink.

3. Operation Restrictions (Restrictions)

The Restrictions middleware only controls permission operations:

#![allow(unused)]
fn main() {
RestrictionsLayer::builder()
    .deny_permissions()        // Block set_permissions() calls
    .build()
    .layer(backend)
}

Use cases:

  • Sandboxing untrusted code (block permission changes)
  • Read-only-ish environments (block permission mutations)

4. Resource Limits (Quota)

Quota middleware enforces capacity limits:

#![allow(unused)]
fn main() {
QuotaLayer::builder()
    .max_total_size(100 * 1024 * 1024)  // 100 MB total
    .max_file_size(10 * 1024 * 1024)    // 10 MB per file
    .max_node_count(10_000)             // Max files/dirs
    .max_dir_entries(1_000)             // Max per directory
    .max_path_depth(64)                 // Max nesting
    .build()
    .layer(backend)
}

Guarantees:

  • Writes rejected when limits exceeded
  • Streaming writes tracked via CountingWriter
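A CountingWriter of the kind mentioned above can be sketched as a std::io::Write adapter that enforces a byte budget as data streams through. This version is illustrative, not the actual middleware type:

```rust
use std::io::{self, Write};

// Sketch: wraps any writer and rejects writes once a byte budget is spent.
struct CountingWriter<W> {
    inner: W,
    written: u64,
    max: u64,
}

impl<W: Write> Write for CountingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Check the budget before touching the inner writer (fail closed).
        if self.written + buf.len() as u64 > self.max {
            return Err(io::Error::new(io::ErrorKind::Other, "quota exceeded"));
        }
        let n = self.inner.write(buf)?;
        self.written += n as u64;
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}
```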

5. Rate Limiting (RateLimit)

RateLimit middleware throttles operations:

#![allow(unused)]
fn main() {
RateLimitLayer::builder()
    .max_ops(1000)
    .per_second()
    .build()
    .layer(backend)
}

Guarantees:

  • Operations rejected when limit exceeded
  • Protects against runaway processes
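One way such a limiter can work is a fixed-window counter. This sketch is illustrative and says nothing about the real layer's algorithm (which might use a token bucket instead):

```rust
use std::time::{Duration, Instant};

// Fixed-window rate limiter: at most `max_ops` operations per `window`.
struct RateLimiter {
    max_ops: u32,
    window: Duration,
    window_start: Instant,
    count: u32,
}

impl RateLimiter {
    fn allow(&mut self) -> bool {
        let now = Instant::now();
        // Start a fresh window once the current one has elapsed.
        if now.duration_since(self.window_start) >= self.window {
            self.window_start = now;
            self.count = 0;
        }
        if self.count < self.max_ops {
            self.count += 1;
            true
        } else {
            false // reject: limit exceeded in this window
        }
    }
}
```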

6. Backend-Level Containment

Different backends achieve containment differently:

| Backend | Containment Mechanism |
|---|---|
| MemoryBackend | Isolated in process memory |
| SqliteBackend | Each container is a separate .db file |
| IndexedBackend | SQLite index + isolated blob directory (UUID-named blobs) |
| StdFsBackend | None - full filesystem access (do NOT use with untrusted input) |
| VRootFsBackend | Uses strict-path::VirtualRoot to contain paths |
⚠️ Warning: PathFilter middleware on StdFsBackend does NOT provide sandboxing. The OS still resolves paths (including symlinks) before PathFilter can check them. For path containment with real filesystems, use VRootFsBackend.

7. Why Virtual Backends Are Inherently Safe

For MemoryBackend and SqliteBackend, the underlying storage is isolated from the host filesystem. There is no OS filesystem to exploit - paths operate entirely within the virtual structure.

Path resolution is symlink-aware but contained: FileStorage resolves paths by walking the virtual directory structure (using metadata() and read_link() on the backend), not the OS filesystem:

Virtual backend symlink example:
  /foo/bar  where bar → /other/place
  /foo/bar/..  resolves to /other (following the symlink target's parent)

This is correct filesystem semantics - but it happens entirely within
the virtual structure. There is no host filesystem to escape to.

This means:

  • No host filesystem access - symlinks point to paths within the virtual structure only
  • No TOCTOU via OS state - resolution uses the backend’s own data
  • Controlled by PathResolver - the default IterativeResolver follows symlinks when FsLink is available; custom resolvers can implement different behaviors
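The contained resolution described above can be sketched against a toy virtual namespace; nothing here touches the host filesystem, so a link can only ever lead to another virtual path:

```rust
use std::collections::HashMap;

// Toy virtual namespace: each path maps to either a file or a symlink target.
enum Node {
    File,
    Link(String),
}

// Follow symlinks entirely within the virtual map, with a hop limit to
// terminate on symlink loops. There is no host filesystem to escape to.
fn resolve(ns: &HashMap<String, Node>, path: &str, max_hops: u32) -> Option<String> {
    let mut current = path.to_string();
    for _ in 0..max_hops {
        match ns.get(&current)? {
            Node::File => return Some(current),
            Node::Link(target) => current = target.clone(),
        }
    }
    None // too many hops: treat as a symlink loop
}
```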

For VRootFsBackend (real filesystem), strict-path::VirtualRoot provides equivalent guarantees by validating and containing all paths before they reach the OS.

The security concern with symlinks is following them, not creating them.

Symlinks are just data. Creating /sandbox/link -> /etc/passwd is harmless. The danger is when reading /sandbox/link follows the symlink and accesses /etc/passwd.

| Backend Type | Symlink Creation | Symlink Following |
|---|---|---|
| MemoryBackend | Supported (FsLink) | FileStorage resolves (non-SelfResolving) |
| SqliteBackend | Supported (FsLink) | FileStorage resolves (non-SelfResolving) |
| VRootFsBackend | Supported (FsLink) | OS controls - strict-path prevents escapes |

Virtual Backends (Memory, SQLite)

Virtual backends that implement FsLink follow symlinks during FileStorage resolution. Symlink capability is determined by trait bounds:

  • MemoryBackend: FsLink → supports symlinks
  • SqliteBackend: FsLink → supports symlinks
  • Custom backend without FsLink → no symlinks (compile-time enforced)

If you need symlink-free behavior, use a backend that does not implement FsLink.

This is the actual security feature - controlling whether symlinks are even possible via trait bounds.

Real Filesystem Backend (VRootFsBackend)

VRootFsBackend calls OS functions (std::fs::read(), etc.) which follow symlinks automatically. We cannot control this - the OS does the symlink resolution, not us.

strict-path::VirtualRoot prevents escapes:

User requests: /sandbox/link
link -> ../../../etc/passwd
strict-path: canonicalize(/sandbox/link) = /etc/passwd
strict-path: /etc/passwd is NOT within /sandbox → DENIED

This is “follow and verify containment” - symlinks are followed by the OS, but escapes are blocked by strict-path.

Limitation: Symlinks within the jail are followed. We cannot disable this without implementing custom path resolution (TOCTOU risk) or platform-specific hacks.

Summary

| Concern | Virtual Backend | VRootFsBackend |
|---|---|---|
| Symlink creation | Supported (FsLink) | Supported (FsLink) |
| Symlink following | FileStorage resolves (non-SelfResolving) | OS controls (strict-path prevents escapes) |
| Jail escape via symlink | No host FS to escape | Prevented by strict-path |

Secure Usage Patterns

AI Agent Sandbox

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, PathFilterLayer, RateLimitLayer, TracingLayer, FileStorage};

let sandbox = MemoryBackend::new()
    .layer(QuotaLayer::builder()
        .max_total_size(50 * 1024 * 1024)
        .max_file_size(5 * 1024 * 1024)
        .build())
    .layer(PathFilterLayer::builder()
        .allow("/workspace/**")
        .deny("**/.env")
        .deny("**/secrets/**")
        .build())
    .layer(RateLimitLayer::builder()
        .max_ops(1000)
        .per_second()
        .build())
    .layer(TracingLayer::new());

let fs = FileStorage::new(sandbox);
// Agent code can only access /workspace, limited resources, audited
// Note: MemoryBackend implements FsLink, so symlinks work if needed
}

Multi-Tenant Isolation

#![allow(unused)]
fn main() {
use anyfs::{QuotaLayer, FileStorage, Fs};
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate

fn create_tenant_storage(tenant_id: &str, quota_bytes: u64) -> FileStorage<impl Fs> {
    let db_path = format!("tenants/{}.db", tenant_id);
    let backend = SqliteBackend::open(&db_path).unwrap()
        .layer(QuotaLayer::builder()
            .max_total_size(quota_bytes)
            .build());

    FileStorage::new(backend)
}

// Complete isolation: separate database files
}

Read-Only Browsing

#![allow(unused)]
fn main() {
use anyfs::{VRootFsBackend, ReadOnly, FileStorage};

let readonly_fs = FileStorage::new(
    ReadOnly::new(VRootFsBackend::new("/var/archive")?)
);

// All write operations return FsError::ReadOnly
}

Security Checklist

For Application Developers

  • Use PathFilter to sandbox untrusted code
  • Use Quota to prevent resource exhaustion
  • Use Restrictions when you need to disable risky operations
  • Use RateLimit for untrusted/shared environments
  • Use Tracing for audit trails
  • Use separate backends for separate tenants
  • Keep dependencies updated

For Backend Implementers

  • Ensure paths cannot escape intended scope
  • For filesystem backends: use strict-path for containment
  • Handle concurrent access safely
  • Don’t leak internal paths in errors

For Middleware Implementers

  • Handle streaming I/O appropriately (wrap or block)
  • Document which operations are intercepted
  • Fail closed (deny on error)
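
To illustrate the fail-closed rule, here is a minimal sketch of a policy-checking wrapper. The names are made up for illustration, not the AnyFS middleware API: the point is that an error in the check itself denies the operation rather than letting it through.

```rust
// Illustrative fail-closed middleware; the types are stand-ins, not the AnyFS API.
struct DenyOnError<B> {
    inner: B,
    // Policy check that may itself fail (e.g. a quota database is unreachable).
    policy: fn(&str) -> Result<bool, String>,
}

impl<B> DenyOnError<B> {
    fn check(&self, path: &str) -> Result<(), String> {
        match (self.policy)(path) {
            Ok(true) => Ok(()), // explicitly allowed
            Ok(false) => Err(format!("denied: {path}")),
            // Fail closed: an error during the check denies the operation.
            Err(e) => Err(format!("denied (policy error: {e}): {path}")),
        }
    }
}
```

The anti-pattern is the opposite arm: treating a policy error as "allow" silently disables the control exactly when the system is degraded.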

Encryption and Integrity Protection

AnyFS’s design enables encryption at multiple levels. Understanding the difference between container-level and file-level protection is crucial for choosing the right approach.

Container-Level vs File-Level Protection

| Level | What’s Protected | Integrity | Implementation |
|---|---|---|---|
| Container-level | Entire storage medium (.db file, serialized state) | Full structure protected | Encrypted backend |
| File-level | Individual file contents | File contents only | Encryption middleware |

Key insight: File-level encryption alone is NOT sufficient. If an attacker can modify the container structure (directory tree, metadata, file names), they can sabotage integrity even without decrypting file contents.

Threat Analysis

| Threat | File-Level Encryption | Container-Level Encryption |
|---|---|---|
| Read file contents | Protected | Protected |
| Modify file contents | Detected (with AEAD) | Detected |
| Delete files | NOT protected | Protected |
| Rename/move files | NOT protected | Protected |
| Corrupt directory structure | NOT protected | Protected |
| Replay old file versions | NOT protected | Protected (with versioning) |
| Metadata exposure (filenames, sizes) | NOT protected | Protected |

Recommendation: For sensitive data, prefer container-level encryption. Use file-level encryption when you need selective access (some files encrypted, others not).

Container-Level Encryption

Option 1: SQLCipher Backend

SQLCipher provides transparent AES-256 encryption for SQLite. In AnyFS, encryption is a feature of SqliteBackend (from the anyfs-sqlite ecosystem crate), not a separate type:

#![allow(unused)]
fn main() {
/// SqliteBackend with encryption enabled (requires `encryption` feature).
/// Uses SQLCipher for transparent AES-256 encryption.
use anyfs_sqlite::SqliteBackend;

// Open with password (derives key via PBKDF2)
let backend = SqliteBackend::open_encrypted("secure.db", "password")?;

// Or open with raw 256-bit key
let backend = SqliteBackend::open_with_key("secure.db", &key)?;

// Change password on open database
backend.change_password("new_password")?;
}

What’s protected:

  • All file contents
  • All metadata (names, sizes, timestamps, permissions)
  • Directory structure
  • Inode mappings
  • Everything in the .db file

Usage:

#![allow(unused)]
fn main() {
let backend = SqliteBackend::open_encrypted("secure.db", "correct-horse-battery-staple")?;
let fs = FileStorage::new(backend);

// If someone gets secure.db without the password, they see random bytes
}

Option 2: Encrypted Serialization (MemoryBackend)

For in-memory backends that need persistence:

#![allow(unused)]
fn main() {
impl MemoryBackend {
    /// Serialize entire state to encrypted blob.
    pub fn serialize_encrypted(&self, key: &[u8; 32]) -> Result<Vec<u8>, FsError> {
        let plaintext = bincode::serialize(&self.state)?;
        let nonce = generate_nonce();
        let ciphertext = aes_gcm_encrypt(key, &nonce, &plaintext)?;
        Ok([nonce.as_slice(), &ciphertext].concat())
    }

    /// Deserialize from encrypted blob.
    pub fn deserialize_encrypted(data: &[u8], key: &[u8; 32]) -> Result<Self, FsError> {
        let (nonce, ciphertext) = data.split_at(12);
        let plaintext = aes_gcm_decrypt(key, nonce, ciphertext)?;
        let state = bincode::deserialize(&plaintext)?;
        Ok(Self { state })
    }
}
}

Use case: Periodically save encrypted snapshots, load on startup.

File-Level Encryption (Middleware)

When you need selective encryption or per-file keys:

#![allow(unused)]
fn main() {
/// Middleware that encrypts file contents on write, decrypts on read.
/// Does NOT protect metadata, filenames, or directory structure.
pub struct FileEncryption<B> {
    inner: B,
    key: Secret<[u8; 32]>,
}

impl<B: Fs> FsWrite for FileEncryption<B> {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        // Encrypt content with authenticated encryption (AES-GCM)
        let nonce = generate_nonce();
        let ciphertext = aes_gcm_encrypt(&self.key, &nonce, data)?;
        let encrypted = [nonce.as_slice(), &ciphertext].concat();
        self.inner.write(path, &encrypted)
    }
}

impl<B: Fs> FsRead for FileEncryption<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        let encrypted = self.inner.read(path)?;
        let (nonce, ciphertext) = encrypted.split_at(12);
        aes_gcm_decrypt(&self.key, nonce, ciphertext)
            .map_err(|_| FsError::IntegrityError { path: path.as_ref().to_path_buf() })
    }
}
}

Limitations:

  • Filenames visible
  • Directory structure visible
  • File sizes roughly visible (ciphertext is slightly larger than plaintext)
  • Metadata unprotected

When to use:

  • Some files need encryption, others don’t
  • Different files need different keys
  • Interop with systems that expect plaintext structure

Integrity Without Encryption

Sometimes you need tamper detection without hiding contents:

#![allow(unused)]
fn main() {
/// Middleware that adds HMAC to each file for integrity verification.
pub struct IntegrityVerified<B> {
    inner: B,
    key: Secret<[u8; 32]>,
}

impl<B: Fs> FsWrite for IntegrityVerified<B> {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        let mac = hmac_sha256(&self.key, data);
        let protected = [data, mac.as_slice()].concat();
        self.inner.write(path, &protected)
    }
}

impl<B: Fs> FsRead for IntegrityVerified<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        let protected = self.inner.read(path)?;
        let (data, mac) = protected.split_at(protected.len() - 32);
        if !hmac_verify(&self.key, data, mac) {
            return Err(FsError::IntegrityError { path: path.as_ref().to_path_buf() });
        }
        Ok(data.to_vec())
    }
}
}

RAM Encryption and Secure Memory

For high-security scenarios where memory dumps are a threat:

Threat Levels

| Threat | Mitigation | Library-Level? |
|---|---|---|
| Memory inspection after process exit | zeroize on drop | Yes |
| Core dumps | Disable via setrlimit | Yes (process config) |
| Swap file exposure | mlock() to pin pages | Yes (OS permitting) |
| Live memory scanning (same user) | OS process isolation | No |
| Cold boot attack | Hardware RAM encryption | No (Intel TME/AMD SME) |
| Hypervisor/DMA attack | SGX/SEV enclaves | No (hardware) |

Encrypted Memory Backend (Illustrative Pattern)

Note: EncryptedMemoryBackend is an illustrative pattern for users who need encrypted RAM storage. It is not a built-in backend. Users can implement this pattern using the guidance below.

Keep data encrypted even in RAM - decrypt only during active use:

#![allow(unused)]
fn main() {
use zeroize::{Zeroize, ZeroizeOnDrop};
use secrecy::Secret;

/// Memory backend that stores all data encrypted in RAM.
/// Plaintext exists only briefly during read operations.
pub struct EncryptedMemoryBackend {
    /// All nodes stored as encrypted blobs
    nodes: HashMap<PathBuf, EncryptedNode>,
    /// Encryption key - auto-zeroized on drop
    key: Secret<[u8; 32]>,
}

struct EncryptedNode {
    /// Encrypted file content (nonce || ciphertext)
    encrypted_data: Vec<u8>,
    /// Metadata can be encrypted too, or stored in the encrypted blob
    metadata: EncryptedMetadata,
}

impl FsRead for EncryptedMemoryBackend {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        let node = self.nodes.get(path.as_ref())
            .ok_or_else(|| FsError::NotFound { path: path.as_ref().to_path_buf() })?;

        // Decrypt - plaintext briefly in RAM
        let plaintext = self.decrypt(&node.encrypted_data)?;

        // Return owned Vec - caller responsible for zeroizing if sensitive
        Ok(plaintext)
    }
}

impl FsWrite for EncryptedMemoryBackend {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        // Encrypt immediately - plaintext never stored
        let encrypted = self.encrypt(data)?;

        // NB: with fn write(&self, ...), `nodes` needs interior mutability
        // (e.g. RwLock) in a real implementation; elided here for brevity.
        self.nodes.insert(path.as_ref().to_path_buf(), EncryptedNode {
            encrypted_data: encrypted,
            metadata: self.encrypt_metadata(...)?,
        });
        Ok(())
    }
}

impl Drop for EncryptedMemoryBackend {
    fn drop(&mut self) {
        // Zeroize all encrypted data (defense in depth)
        for node in self.nodes.values_mut() {
            node.encrypted_data.zeroize();
        }
        // Key is auto-zeroized via Secret<>
    }
}
}

Serialization of Encrypted RAM

When persisting an encrypted memory backend:

#![allow(unused)]
fn main() {
impl EncryptedMemoryBackend {
    /// Serialize to disk - data stays encrypted throughout.
    /// RAM encrypted → Serialized encrypted → Disk encrypted
    pub fn save_to_file(&self, path: &Path) -> Result<(), FsError> {
        // Data is already encrypted in self.nodes
        // Serialize the encrypted blobs directly - no decryption needed
        let serialized = bincode::serialize(&self.nodes)?;

        // Optionally add another encryption layer with different key
        // (defense in depth: compromise of runtime key doesn't expose persisted data)
        std::fs::write(path, &serialized)?;
        Ok(())
    }

    /// Load from disk - data stays encrypted throughout.
    /// Disk encrypted → Deserialized encrypted → RAM encrypted
    pub fn load_from_file(path: &Path, key: Secret<[u8; 32]>) -> Result<Self, FsError> {
        let serialized = std::fs::read(path)?;
        let nodes = bincode::deserialize(&serialized)?;

        Ok(Self { nodes, key })
    }
}
}

Key property: Plaintext NEVER exists during save/load. Data flows:

Write: plaintext → encrypt → RAM (encrypted) → serialize → disk (encrypted)
Read:  disk (encrypted) → deserialize → RAM (encrypted) → decrypt → plaintext

Secure Allocator Considerations

#![allow(unused)]
fn main() {
// In Cargo.toml - mimalloc secure mode zeros on free
mimalloc = { version = "0.1", features = ["secure"] }

// Note: This prevents USE-AFTER-FREE info leaks, but does NOT:
// - Encrypt RAM contents
// - Prevent live memory scanning
// - Protect against cold boot attacks
}

For true defense against memory scanning, combine:

  1. EncryptedMemoryBackend (data encrypted at rest in RAM)
  2. zeroize (immediate cleanup of temporary plaintext)
  3. mlock() (prevent swapping sensitive pages)
  4. Minimize plaintext lifetime (decrypt → use → zeroize immediately)
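
Point 4 can be sketched as a scope discipline: plaintext exists only inside a closure, and the buffer is wiped before it is freed. This is an illustration only; real code should use the zeroize crate, whose wipe is guaranteed not to be optimized away by the compiler, unlike the plain loop below.

```rust
// Sketch of "decrypt -> use -> wipe immediately". The closure is the only
// scope in which the plaintext is visible; the buffer is zeroed before drop.
fn with_plaintext<R>(mut plaintext: Vec<u8>, use_it: impl FnOnce(&[u8]) -> R) -> R {
    let result = use_it(&plaintext);
    // Wipe before the buffer is freed, shrinking the plaintext's lifetime
    // to exactly the duration of `use_it`. Use zeroize::Zeroize in practice.
    for byte in plaintext.iter_mut() {
        *byte = 0;
    }
    result
}
```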

Encryption Summary

| Approach | Protects Contents | Protects Structure | RAM Security | Persistence |
|---|---|---|---|---|
| SqliteBackend with encryption | Yes | Yes | No (SQLite uses plaintext RAM) | Encrypted .db file |
| FileEncryption<B> middleware | Yes | No | Depends on B | Depends on B |
| EncryptedMemoryBackend (illustrative) | Yes | Yes | Yes (encrypted in RAM) | Via save_to_file() |
| IntegrityVerified<B> middleware | No | No (files only) | No | Depends on B |

Sensitive Data Storage

#![allow(unused)]
fn main() {
// Full protection: encrypted container + secure memory practices
let backend = SqliteBackend::open_encrypted("secure.db", password)?;
let fs = FileStorage::new(backend);
}

High-Security RAM Processing (Illustrative)

#![allow(unused)]
fn main() {
// Data never plaintext at rest (RAM or disk)
// Note: EncryptedMemoryBackend is user-implemented (see pattern above)
let backend = EncryptedMemoryBackend::new(derive_key(password));
// ... use fs ...
backend.save_to_file("snapshot.enc")?;  // Persists encrypted
}

Selective File Encryption

#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate

// Some files encrypted, structure visible
let backend = FileEncryption::new(SqliteBackend::open("data.db")?)
    .with_key(key);
}

TOCTOU-Proof Tenant Isolation with Virtual Backends

Why Virtual Backends Eliminate TOCTOU

Traditional path security libraries like strict-path work against a real filesystem:

┌─────────────────────────────────────────────────────────────────┐
│                    REAL FILESYSTEM SECURITY                      │
│                                                                  │
│   Your Process          OS Filesystem         Other Processes   │
│   ┌──────────┐         ┌───────────┐         ┌──────────────┐   │
│   │ Check    │────────▶│ Canonical │◀────────│ Create       │   │
│   │ path     │         │ path      │         │ symlink      │   │
│   └──────────┘         └───────────┘         └──────────────┘   │
│        │                     │                      │           │
│        │    TOCTOU WINDOW    │                      │           │
│        ▼                     ▼                      ▼           │
│   ┌──────────┐         ┌───────────┐         ┌──────────────┐   │
│   │ Use      │────────▶│ DIFFERENT │◀────────│ Modified!    │   │
│   │ path     │         │ path now! │         │              │   │
│   └──────────┘         └───────────┘         └──────────────┘   │
│                                                                  │
│   Problem: OS state can change between check and use             │
└─────────────────────────────────────────────────────────────────┘

Virtual backends eliminate this entirely:

┌─────────────────────────────────────────────────────────────────┐
│                   VIRTUAL BACKEND SECURITY                       │
│                                                                  │
│   Your Process                                                   │
│   ┌─────────────────────────────────────────────────────────┐   │
│   │                    FileStorage                           │   │
│   │  ┌──────────┐    ┌───────────┐    ┌──────────────────┐  │   │
│   │  │ Resolve  │───▶│ SQLite    │───▶│ Return data      │  │   │
│   │  │ path     │    │ txn       │    │                  │  │   │
│   │  └──────────┘    └───────────┘    └──────────────────┘  │   │
│   │                        │                                 │   │
│   │              ATOMIC - No external modification possible  │   │
│   └─────────────────────────────────────────────────────────┘   │
│                                                                  │
│   No OS filesystem. No other processes. No TOCTOU.               │
└─────────────────────────────────────────────────────────────────┘

Security Comparison: strict-path vs Virtual Backend

| Threat | strict-path (Real FS) | Virtual Backend |
|---|---|---|
| Path traversal | Prevented (canonicalize + verify) | Impossible (no host FS to traverse to) |
| Symlink race (TOCTOU) | Mitigated (canonicalize first) | Impossible (we control all symlinks) |
| External symlink creation | Vulnerable window exists | Impossible (single-process ownership) |
| Windows 8.3 short names | Partial (only existing files) | N/A (no Windows FS) |
| Namespace escapes (/proc) | Fixed in soft-canonicalize | Impossible (no /proc exists) |
| Concurrent modification | OS handles (may race) | Atomic (SQLite transactions) |
| Tenant A accessing Tenant B | Requires careful path filtering | Impossible (separate .db files) |

Encryption: Separation of Concerns

Design principle: Backends handle storage, middleware handles policy. Container-level encryption is the exception.

| Security Level | Implementation | Why |
|---|---|---|
| Locked (container) | SqliteBackend with encryption feature | Must encrypt entire .db file at storage level |
| Privacy (file contents) | FileEncryption<SqliteBackend> middleware | Content encryption is policy |
| Normal | SqliteBackend | User applies encryption as needed |

Why encryption is a feature, not a separate type:

  • SQLCipher is a drop-in replacement for SQLite with identical API
  • The only difference is how the connection is opened (with password/key)
  • Connection must be opened with password before ANY query
  • Cannot be added as middleware - it’s a property of the connection itself
  • Everything is encrypted: file contents, filenames, directory structure, timestamps, inodes

SqliteBackend Encryption (Ecosystem Crate, feature: encryption)

Full container encryption using SQLCipher. Encryption is a feature of SqliteBackend, not a separate type:

#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate

/// Encryption methods are only available with the `encryption` feature.
/// Uses SQLCipher for transparent AES-256 encryption.
///
/// Without the password, the .db file is indistinguishable from random bytes.

// Open with password (derives key via PBKDF2)
let backend = SqliteBackend::open_encrypted("secure.db", "password")?;

// Open with raw 256-bit key (no key derivation)
let backend = SqliteBackend::open_with_key("secure.db", &key)?;

// Create new encrypted database
let backend = SqliteBackend::create_encrypted("new.db", "password")?;

// Change password on open database
backend.change_password("new_password")?;
}

What SQLCipher Encrypts

| Data | Encrypted? |
|---|---|
| File contents | Yes |
| Filenames | Yes |
| Directory structure | Yes |
| File sizes | Yes |
| Timestamps | Yes |
| Permissions | Yes |
| Inode mappings | Yes |
| SQLite metadata | Yes |
| Everything in the .db file | Yes |

Cargo Configuration

[dependencies]
# anyfs-sqlite ecosystem crate with optional encryption.
# Without encryption:
anyfs-sqlite = { version = "0.1" }
# With SQLCipher (enable instead of the line above):
# anyfs-sqlite = { version = "0.1", features = ["encryption"] }

Note: The encryption feature enables SQLCipher. When enabled, open_encrypted() and open_with_key() methods become available.

Achieving Security Modes with Composition

Users compose backends and middleware to achieve their desired security level:

Locked Mode (Full Container Encryption)

#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate with `encryption` feature

// Everything encrypted - password required to access anything
let backend = SqliteBackend::open_encrypted("tenant.db", "correct-horse-battery-staple")?;
let fs = FileStorage::new(backend);

// Without password: .db file is random bytes
// With password: full access to everything
}

Privacy Mode (Contents Encrypted, Metadata Visible)

#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate

// File contents encrypted, metadata (names, sizes, structure) visible
let backend = FileEncryption::new(
    SqliteBackend::open("tenant.db")?
)
.with_key(content_key);

let fs = FileStorage::new(backend);

// Host can: list files, see sizes, run statistics
// Host cannot: read file contents
}

Normal Mode (No Encryption)

#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate

// No encryption - user encrypts sensitive files themselves
let backend = SqliteBackend::open("tenant.db")?;
let fs = FileStorage::new(backend);

// User applies per-file encryption as needed
}

Mode Comparison

| Aspect | Locked | Privacy | Normal |
|---|---|---|---|
| Implementation | SqliteBackend with encryption | FileEncryption<SqliteBackend> | SqliteBackend |
| File contents | Encrypted (SQLCipher) | Encrypted (AES-GCM) | Plaintext |
| Filenames | Encrypted | Visible | Visible |
| Directory structure | Encrypted | Visible | Visible |
| File sizes | Encrypted | Visible | Visible |
| Timestamps | Encrypted | Visible | Visible |
| Host can analyze | Nothing | Metadata only | Everything |
| Performance | Slowest (~10-15% overhead) | Medium | Fastest |
| Feature flag | encryption | middleware | (none) |

Why This Is TOCTOU-Proof

  1. No external filesystem - Paths exist only in our SQLite tables
  2. Atomic transactions - Path resolution + data access in single transaction
  3. Single-process ownership - No other process can modify the .db during operation
  4. We control symlinks - Symlinks are just rows in nodes table, we decide when to follow
  5. No OS involvement - OS never resolves our virtual paths

#![allow(unused)]
fn main() {
// This is TOCTOU-proof (illustrative sketch of SqliteBackend internals):
impl SqliteBackend {
    fn resolve_and_read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        // Single transaction wraps everything
        let tx = self.conn.transaction()?;

        // 1. Resolve path (following symlinks in OUR table)
        let inode = self.resolve_path_internal(&tx, path)?;

        // 2. Read content
        // No TOCTOU - same transaction, same snapshot
        let data = tx.query_row(
            "SELECT data FROM content WHERE inode = ?",
            [inode],
            |row| row.get(0)
        )?;

        // Transaction ensures atomicity
        Ok(data)
    }
}
}

Multi-Tenant Isolation

#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate with `encryption` feature

/// Each tenant gets their own .db file - complete physical isolation
fn create_tenant_storage(tenant_id: &str, encrypted: bool) -> impl Fs {
    let path = format!("tenants/{}.db", tenant_id);

    if encrypted {
        let password = get_tenant_password(tenant_id);
        SqliteBackend::open_encrypted(&path, &password).unwrap()
    } else {
        SqliteBackend::open(&path).unwrap()
    }
}

// Tenant A literally cannot access Tenant B's data:
// - Different .db files
// - Different passwords (if encrypted)
// - No shared state whatsoever
// - No path filtering bugs possible - there's nothing to filter
}

Comparison with strict-path approach:

| Approach | Tenant Isolation |
|---|---|
| Shared filesystem + strict-path | Logical isolation (paths filtered) |
| Shared filesystem + PathFilter | Logical isolation (middleware enforced) |
| Separate .db file per tenant | Physical isolation (separate files) |

Physical isolation is strictly stronger - there’s no bug in path filtering that could leak data because there’s no shared data to leak.

Host Analysis with Privacy Mode

When using FileEncryption<SqliteBackend> (Privacy mode), the host can query metadata directly from SQLite:

#![allow(unused)]
fn main() {
// Host can analyze metadata without the content encryption key
use rusqlite::Connection;  // assuming rusqlite as the SQLite driver

fn get_tenant_statistics(tenant_db: &str) -> rusqlite::Result<TenantStats> {
    // Connect directly to SQLite (no content key needed)
    let conn = Connection::open(tenant_db)?;

    let (file_count, dir_count, total_size) = conn.query_row(
        "SELECT
            COUNT(*) FILTER (WHERE node_type = 0),
            COUNT(*) FILTER (WHERE node_type = 1),
            SUM(size)
         FROM nodes",
        [],
        |row| Ok((row.get(0)?, row.get(1)?, row.get(2)?))
    )?;

    Ok(TenantStats { file_count, dir_count, total_size })
}

// List all files (names visible, contents encrypted)
fn list_tenant_files(tenant_db: &str) -> rusqlite::Result<Vec<FileInfo>> {
    let conn = Connection::open(tenant_db)?;
    conn.prepare("SELECT name, size, modified_at FROM nodes WHERE node_type = 0")?
        .query_map([], |row| Ok(FileInfo { ... }))?
        .collect()
}
}

Replacing strict-path Usage

For projects currently using strict-path for tenant isolation:

Before (strict-path):

#![allow(unused)]
fn main() {
use strict_path::VirtualRoot;

fn handle_tenant_request(tenant_id: &str, requested_path: &str) -> Result<Vec<u8>> {
    // Shared filesystem, path containment via strict-path
    let root = VirtualRoot::new(format!("/data/tenants/{}", tenant_id))?;
    let safe_path = root.resolve(requested_path)?;  // TOCTOU window here
    std::fs::read(safe_path)  // Another process could have modified
}
}

After (SqliteBackend with encryption - ecosystem crate):

#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate with `encryption` feature

fn handle_tenant_request(tenant_id: &str, requested_path: &str) -> Result<Vec<u8>> {
    // Separate encrypted database per tenant - no path containment needed
    let backend = get_tenant_backend(tenant_id);  // Cached connection
    backend.read(requested_path)  // Atomic, TOCTOU-proof
}
}

| Aspect | strict-path | Virtual Backend |
|---|---|---|
| Isolation model | Logical (path filtering) | Physical (separate files) |
| TOCTOU | Mitigated | Eliminated |
| External interference | Possible | Impossible |
| Symlink attacks | Resolved at check time | We control all symlinks |
| Cross-tenant leakage | Bug in filtering could leak | No shared data exists |
| Performance | Real FS I/O + canonicalization | SQLite (often faster for small files) |
| Encryption | Separate concern | Built-in (encryption feature) or middleware |

Known Limitations

  1. No ACLs: Simple permissions only (Unix mode bits)
  2. Side channels: Timing attacks, cache attacks require OS/hardware mitigations
  3. SQLite file access: Host OS can still access the .db file (use Locked mode for encryption)

For implementation details, see Architecture Decision Records.

AnyFS - Technical Comparison with Alternatives

This document compares AnyFS with existing Rust filesystem abstractions.


Executive Summary

AnyFS is to filesystems what Axum/Tower is to HTTP: a composable middleware stack with pluggable backends.

Key differentiators:

  • Composable middleware - Stack quota, sandboxing, tracing, caching as independent layers
  • Backend agnostic - Swap Memory/SQLite/RealFS without code changes
  • Policy separation - Storage logic separate from policy enforcement
  • Third-party extensibility - Custom backends and middleware depend only on anyfs-backend

Compared Solutions

| Solution | What it is | Middleware | Multiple Backends |
|---|---|---|---|
| vfs | VFS trait + backends | No | Yes |
| AgentFS | SQLite agent runtime | No | No (SQLite only) |
| OpenDAL | Object storage layer | Yes | Yes (cloud-focused) |
| AnyFS | VFS + middleware stack | Yes | Yes |

1. Architecture Comparison

vfs Crate

Path-based trait, no middleware pattern:

#![allow(unused)]
fn main() {
pub trait FileSystem: Send + Sync {
    fn read_dir(&self, path: &str) -> VfsResult<Box<dyn Iterator<Item = String>>>;
    fn open_file(&self, path: &str) -> VfsResult<Box<dyn SeekAndRead>>;
    fn create_file(&self, path: &str) -> VfsResult<Box<dyn SeekAndWrite>>;
    // ...
}
}

Limitations:

  • No standard way to add quotas, logging, sandboxing
  • Each concern must be built into backends or wrapped externally
  • Path validation is backend-specific

AgentFS

SQLite-based agent runtime:

#![allow(unused)]
fn main() {
// Fixed to SQLite, includes KV store and tool auditing
let fs = AgentFS::open("agent.db")?;
fs.write_file("/path", data)?;
fs.kv_set("key", "value")?;  // KV store bundled
fs.toolcall_start("tool")?;  // Auditing bundled
}

Limitations:

  • Locked to SQLite (no memory backend for testing, no real FS)
  • Monolithic design (can’t use FS without KV/auditing)
  • No composable middleware

AnyFS

Tower-style middleware + pluggable backends:

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, PathFilterLayer, RestrictionsLayer, TracingLayer, FileStorage};

// Compose middleware stack
let backend = MemoryBackend::new()
    .layer(QuotaLayer::builder()
        .max_total_size(100 * 1024 * 1024)
        .build())
    .layer(PathFilterLayer::builder()
        .allow("/workspace/**")
        .deny("**/.env")
        .build())
    .layer(TracingLayer::new());

let fs = FileStorage::new(backend);
}

Advantages:

  • Add/remove middleware without touching backends
  • Swap backends without touching middleware
  • Third-party extensions via anyfs-backend trait

2. Feature Comparison

| Feature | AnyFS | vfs | AgentFS | OpenDAL |
|---|---|---|---|---|
| Middleware pattern | Yes | No | No | Yes |
| Multiple backends | Yes | Yes | No | Yes |
| SQLite backend | Yes | No | Yes | No |
| Memory backend | Yes | Yes | No | Yes |
| Real FS backend | Yes | Yes | No | No |
| Quota enforcement | Middleware | Manual | No | No |
| Path sandboxing | Middleware | Manual | No | No |
| Feature gating | Middleware | No | No | No |
| Rate limiting | Middleware | No | No | No |
| Tracing/logging | Middleware | Manual | Built-in | Middleware |
| Streaming I/O | Yes | Yes | Yes | Yes |
| Async API | Future | Partial | No | Yes |
| POSIX extension | Future | No | No | No |
| FUSE mountable | Yes | No | No | No |
| KV store | No | No | Yes | No |

3. Middleware Stack

AnyFS middleware can intercept, transform, and control operations:

| Middleware | Intercepts | Action |
|---|---|---|
| Quota | Writes | Reject if over limit |
| PathFilter | All ops | Block denied paths |
| Restrictions | Permission changes | Block via .deny_permissions() |
| RateLimit | All ops | Throttle per second |
| ReadOnly | Writes | Block all writes |
| Tracing | All ops | Log with tracing crate |
| DryRun | Writes | Log without executing |
| Cache | Reads | LRU caching |
| Overlay | All ops | Union filesystem |
| Custom | Any | Encryption, compression, … |

4. Backend Trait

#![allow(unused)]
fn main() {
pub trait Fs: Send + Sync {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
    fn open_write(&self, path: &Path) -> Result<Box<dyn Write + Send>, FsError>;
    // ... methods aligned with std::fs
}
}

Design principles:

  • &Path in core traits (object-safe); FileStorage/FsExt accept impl AsRef<Path> for ergonomics
  • Aligned with std::fs naming
  • Streaming I/O via open_read/open_write
  • Send bound for async compatibility
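
The first design principle, &Path in the core trait with impl AsRef<Path> in the ergonomic layer, can be illustrated in isolation. The types below are simplified stand-ins, not the real crate:

```rust
use std::path::{Path, PathBuf};

// The core trait takes &Path, so it stays object-safe: Box<dyn FsRead> works.
trait FsRead {
    fn read(&self, path: &Path) -> Option<Vec<u8>>;
}

// Toy backend holding a single file, for demonstration.
struct OneFile {
    path: PathBuf,
    data: Vec<u8>,
}

impl FsRead for OneFile {
    fn read(&self, path: &Path) -> Option<Vec<u8>> {
        (path == self.path).then(|| self.data.clone())
    }
}

// The ergonomic wrapper accepts impl AsRef<Path>. A generic method like this
// would make the trait itself non-object-safe, so it lives here instead.
struct FileStorage<B: FsRead> {
    backend: B,
}

impl<B: FsRead> FileStorage<B> {
    fn read(&self, path: impl AsRef<Path>) -> Option<Vec<u8>> {
        self.backend.read(path.as_ref())
    }
}
```

Callers can pass &str, String, or PathBuf to the wrapper, while trait objects over the core trait remain possible.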

5. When to Use What

| Use Case | Recommendation |
|---|---|
| Need composable middleware | AnyFS |
| Need backend flexibility | AnyFS |
| Need SQLite + Memory + RealFS | AnyFS |
| Need just VFS abstraction (no policies) | vfs |
| Need AI agent runtime with KV + auditing | AgentFS |
| Need cloud object storage | OpenDAL |
| Need async-first design | OpenDAL (or wait for AnyFS async) |

6. Deep Dive: vfs Crate Compatibility

The vfs crate is the most similar project. This section details why we don’t adopt their trait and how we’ll provide interop.

vfs::FileSystem Trait (Complete)

#![allow(unused)]
fn main() {
pub trait FileSystem: Send + Sync {
    // Required (9 methods)
    fn read_dir(&self, path: &str) -> VfsResult<Box<dyn Iterator<Item = String>>>;
    fn create_dir(&self, path: &str) -> VfsResult<()>;
    fn open_file(&self, path: &str) -> VfsResult<Box<dyn SeekAndRead>>;
    fn create_file(&self, path: &str) -> VfsResult<Box<dyn SeekAndWrite>>;
    fn append_file(&self, path: &str) -> VfsResult<Box<dyn SeekAndWrite>>;
    fn metadata(&self, path: &str) -> VfsResult<VfsMetadata>;
    fn exists(&self, path: &str) -> VfsResult<bool>;
    fn remove_file(&self, path: &str) -> VfsResult<()>;
    fn remove_dir(&self, path: &str) -> VfsResult<()>;

    // Optional - default to NotSupported (6 methods)
    fn set_creation_time(&self, path: &str, time: SystemTime) -> VfsResult<()>;
    fn set_modification_time(&self, path: &str, time: SystemTime) -> VfsResult<()>;
    fn set_access_time(&self, path: &str, time: SystemTime) -> VfsResult<()>;
    fn copy_file(&self, src: &str, dest: &str) -> VfsResult<()>;
    fn move_file(&self, src: &str, dest: &str) -> VfsResult<()>;
    fn move_dir(&self, src: &str, dest: &str) -> VfsResult<()>;
}
}

Feature Gap Analysis

| Feature | vfs | AnyFS | Gap |
|---|---|---|---|
| Basic read/write | Yes | Yes | - |
| Directory ops | Yes | Yes | - |
| Streaming I/O | Yes | Yes | - |
| rename | move_file | Yes | - |
| copy | copy_file | Yes | - |
| Symlinks | No | Yes | Critical |
| Hard links | No | Yes | Critical |
| Permissions | No | Yes | Critical |
| truncate | No | Yes | Missing |
| sync/fsync | No | Yes | Missing |
| statfs | No | Yes | Missing |
| read_range | No | Yes | Missing |
| symlink_metadata | No | Yes | Missing |
| Path type | &str | &Path (core) + impl AsRef<Path> in ergonomic layer | Different |
| Middleware | No | Yes | Architectural |

Why Not Adopt Their Trait?

  1. No symlinks/hardlinks - Can’t virtualize real filesystem semantics
  2. No permissions - Our Restrictions middleware needs set_permissions to gate
  3. No durability primitives - No sync/fsync for data integrity
  4. No middleware pattern - Their VfsPath bakes in behaviors we want composable
  5. &str paths - Core traits use &Path for object safety; ergonomics come from FileStorage/FsExt

Our trait is a strict superset. Everything vfs can do, we can do. The reverse is not true.
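To make the superset concrete, here is a toy sketch of two operations that have no vfs counterpart. All names are invented for illustration; the real Fs trait in anyfs-backend is richer and differently shaped.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

// Illustrative stand-ins for operations vfs lacks entirely.
trait SupersetOps {
    fn symlink(&mut self, target: &Path, link: &Path) -> Result<(), String>;
    fn truncate(&mut self, path: &Path, len: u64) -> Result<(), String>;
}

// Minimal in-memory model: files are byte vectors, symlinks are target paths.
#[derive(Default)]
struct ToyFs {
    files: HashMap<PathBuf, Vec<u8>>,
    links: HashMap<PathBuf, PathBuf>,
}

impl SupersetOps for ToyFs {
    fn symlink(&mut self, target: &Path, link: &Path) -> Result<(), String> {
        if self.links.contains_key(link) || self.files.contains_key(link) {
            return Err(format!("already exists: {}", link.display()));
        }
        self.links.insert(link.to_path_buf(), target.to_path_buf());
        Ok(())
    }

    fn truncate(&mut self, path: &Path, len: u64) -> Result<(), String> {
        let data = self
            .files
            .get_mut(path)
            .ok_or_else(|| format!("not found: {}", path.display()))?;
        data.truncate(len as usize);
        Ok(())
    }
}

fn main() {
    let mut fs = ToyFs::default();
    fs.files.insert(PathBuf::from("/a.txt"), b"hello world".to_vec());
    fs.truncate(Path::new("/a.txt"), 5).unwrap();
    fs.symlink(Path::new("/a.txt"), Path::new("/link")).unwrap();
    assert_eq!(fs.files[Path::new("/a.txt")], b"hello");
}
```

An adapter exposing a vfs backend through such a trait would have to return a not-supported error from both methods, which is exactly why the interop story below is one-directional for these features.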

vfs Backends

| vfs Backend | AnyFS Equivalent | Notes |
|---|---|---|
| PhysicalFS | StdFsBackend | Both use real filesystem directly |
| MemoryFS | MemoryBackend | Both in-memory |
| OverlayFS | Overlay<B1,B2> | Both union filesystems |
| AltrootFS | VRootFsBackend | Both provide path containment |
| EmbeddedFS | (none) | Read-only embedded assets |
| (none) | SqliteBackend | We have SQLite |

Interoperability Plan

A planned anyfs-vfs-compat crate will provide bidirectional adapters:

#![allow(unused)]
fn main() {
use anyfs_vfs_compat::{VfsCompat, AnyFsCompat};

// Use a vfs backend in AnyFS
// Missing features return FsError::NotSupported
let backend = VfsCompat::new(vfs::MemoryFS::new());
let fs = FileStorage::new(backend);

// Use an AnyFS backend in vfs-based code
// Only exposes what vfs supports
let anyfs_backend = MemoryBackend::new();
let vfs_fs: Box<dyn vfs::FileSystem> = Box::new(AnyFsCompat::new(anyfs_backend));
}

Use cases:

  • Migrate from vfs to AnyFS incrementally
  • Use vfs::EmbeddedFS in AnyFS (read-only embedded assets)
  • Use AnyFS backends in projects depending on vfs

7. Tradeoffs

AnyFS Advantages

  • Composable middleware pattern
  • Backend-agnostic
  • Third-party extensibility
  • Clean separation of concerns
  • Full filesystem semantics (symlinks, permissions, durability)

AnyFS Limitations

  • Sync-first (async planned)
  • Smaller ecosystem (new project)
  • Not full POSIX emulation

If this document conflicts with AGENTS.md or src/architecture/design-overview.md, treat those as authoritative.

AnyFS - Build vs. Reuse Analysis

Can your goals be achieved with existing crates, or does this project need to exist?


Core Requirements

  1. Backend flexibility - swap storage without changing application code
  2. Composable middleware - add/remove capabilities (quotas, sandboxing, logging)
  3. Tenant isolation - each tenant gets an isolated namespace
  4. Portable storage - single-file backend (SQLite) for easy move/copy/backup
  5. Filesystem semantics - std::fs-aligned operations including symlinks and hard links
  6. Path containment - prevent traversal attacks

What Already Exists

vfs crate (Rust)

What it provides:

  • Filesystem abstraction with multiple backends
  • MemoryFS, PhysicalFS, AltrootFS, OverlayFS, EmbeddedFS

What it lacks:

  • SQLite backend
  • Composable middleware pattern
  • Quota/limit enforcement
  • Policy layers (feature gating, path filtering)

AgentFS (Turso)

What it provides:

  • SQLite-based filesystem for AI agents
  • Key-value store
  • Tool call auditing
  • FUSE mounting

What it lacks:

  • Multiple backend types (SQLite only)
  • Composable middleware
  • Backend-agnostic abstraction

rusqlite

What it provides: SQLite bindings, transactions, blobs.

What it lacks: Filesystem semantics, quota enforcement.

strict-path

What it provides: Path validation and containment (VirtualRoot).

What it lacks: Storage backends, filesystem API.


Gap Analysis

| Requirement | vfs | AgentFS | rusqlite | strict-path |
|---|---|---|---|---|
| Filesystem API | Yes | Yes | No | No |
| Multiple backends | Yes | No | N/A | No |
| SQLite backend | No | Yes | Yes (raw) | No |
| Composable middleware | No | No | No | No |
| Quota enforcement | No | No | Manual | No |
| Path sandboxing | Partial | No | Manual | Yes |
| Symlink/hard link control | Backend-dep | Yes | Manual | N/A |

Conclusion: No existing crate provides:

“Backend-agnostic filesystem abstraction with composable middleware for quotas, sandboxing, and policy enforcement.”


Why AnyFS Exists

AnyFS fills the gap by separating concerns:

| Crate | Responsibility |
|---|---|
| anyfs-backend | Trait (Fs, Layer) + types |
| anyfs | Backends + middleware + ergonomic wrapper (FileStorage<B>) |

The middleware pattern (like Tower/Axum) enables composition:

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, PathFilterLayer, TracingLayer, FileStorage};

let backend = MemoryBackend::new()
    .layer(QuotaLayer::builder()
        .max_total_size(100 * 1024 * 1024)
        .build())
    .layer(PathFilterLayer::builder()
        .allow("/workspace/**")
        .build())
    .layer(TracingLayer::new());

let fs = FileStorage::new(backend);
fs.write("/workspace/doc.txt", b"hello")?;
}

Alternatives Considered

Option A: Implement SQLite backend for vfs crate

Pros: Ecosystem compatibility.

Cons:

  • No middleware pattern for quotas/policies
  • Would still need to build quota/sandboxing outside the trait
  • Doesn’t solve the composability problem

Option B: Use AgentFS

Pros: Already exists, SQLite-based, FUSE support.

Cons:

  • Locked to SQLite (can’t swap to memory/real FS)
  • No composable middleware
  • Includes KV store and auditing we may not need

Option C: Build AnyFS

Pros:

  • Backend-agnostic (swap storage without code changes)
  • Composable middleware (add/remove capabilities)
  • Clean separation of concerns
  • Third-party extensibility

Cons:

  • New project, not yet widely adopted

Recommendation

Build AnyFS with reusable primitives (rusqlite, strict-path, thiserror, tracing) but maintain the two-crate split. The middleware pattern is what makes the design both flexible and safe.

Compatibility option: Later, provide an adapter that implements vfs traits on top of Fs for projects that need vfs compatibility.

Prior Art Analysis: Filesystem Abstraction Libraries

This document analyzes filesystem abstraction libraries in other languages to learn from their successes, identify features we should adopt, and avoid known vulnerabilities.


Executive Summary

| Library | Language | Key Strength | Key Weakness | What We Can Learn |
|---|---|---|---|---|
| fsspec | Python | Async + caching + data science integration | No middleware composition | Caching strategies, async design |
| PyFilesystem2 | Python | Clean URL-based API | Symlink handling issues | Path normalization |
| Afero | Go | Composition (CopyOnWrite, Cache, BasePathFs) | Symlink escape in BasePathFs | Composition patterns |
| Apache Commons VFS | Java | Enterprise-grade, many backends | CVE: Path traversal with encoded `..` | URL encoding attacks |
| System.IO.Abstractions | .NET | Perfect for testing, mirrors System.IO | No middleware/composition | MockFileSystem patterns |
| memfs | Node.js | Browser + Node unified API | Fork exists due to “longstanding bugs” | In-memory implementation |
| soft-canonicalize | Rust | Non-existing path resolution, TOCTOU-safe | Real FS only (not virtual) | Attack patterns to defend |
| strict-path | Rust | 19+ attack types blocked, type-safe markers | Real FS only (not virtual) | Attack catalog for testing |

Detailed Analysis

1. Python: fsspec

Repository: fsspec/filesystem_spec

What they do well:

  1. Unified Interface Across 20+ Backends

    • Local, S3, GCS, Azure, HDFS, HTTP, FTP, SFTP, ZIP, TAR, Git, etc.
    • Same API regardless of backend
  2. Sophisticated Caching

    # Block-wise caching - only download accessed parts
    fs = fsspec.filesystem('blockcache', target_protocol='s3',
                           cache_storage='/tmp/cache')
    
    # Whole-file caching
    fs = fsspec.filesystem('filecache', target_protocol='s3',
                           cache_storage='/tmp/cache')
    
  3. Async Support

    • AsyncFileSystem base class for async implementations
    • Concurrent bulk operations (cat fetches many files at once)
    • Used by Dask for parallel data processing
  4. Data Science Integration

    • Native integration with Pandas, Dask, Intake
    • Parquet optimization with parallel chunk fetching

What we should adopt:

  • Block-wise caching strategy (not just whole-file LRU)
  • Async design from the start (our ADR-024 async plan)
  • Consider “parts caching” for large file access patterns

What they lack that we have:

  • No middleware composition pattern
  • No quota/rate limiting built-in
  • No path filtering/sandboxing

2. Python: PyFilesystem2

Repository: PyFilesystem/pyfilesystem2

What they do well:

  1. URL-based Filesystem Specification

    from fs import open_fs
    
    home_fs = open_fs('osfs://~/')
    zip_fs = open_fs('zip://foo.zip')
    ftp_fs = open_fs('ftp://ftp.example.com')
    mem_fs = open_fs('mem://')
    
  2. Consistent Path Handling

    • Forward slashes everywhere (even on Windows)
    • Paths normalized automatically
  3. Glob Support Built-in

    for match in fs.glob('**/*.py'):
        print(match.path)
    

Known Issues (from GitHub):

| Issue | Description | Impact |
|---|---|---|
| #171 | Symlink loops cause infinite recursion | DoS potential |
| #417 | No symlink creation support | Missing feature |
| #411 | Incorrect handling of symlinks with non-existing targets | Broken functionality |
| #61 | Symlinks not detected properly | Security concern |

Lessons for AnyFS:

  • ⚠️ Symlink handling is complex - we must handle loops, non-existent targets, and escaping
  • URL-based opening is convenient - consider for future
  • Consistent path format - virtual backends use forward slashes internally; OS-backed backends follow OS semantics
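The “forward slashes everywhere” convention for virtual backends can be sketched as a small helper. The function name and behavior here are illustrative assumptions, not an AnyFS API:

```rust
// Hedged sketch: unify incoming separators into the virtual "/" form used by
// in-memory backends, collapsing duplicates and forcing a leading slash.
fn to_virtual(path: &str) -> String {
    let unified = path.replace('\\', "/");
    let mut out = String::from("/");
    for comp in unified.split('/').filter(|c| !c.is_empty()) {
        if out.len() > 1 {
            out.push('/');
        }
        out.push_str(comp);
    }
    out
}

fn main() {
    assert_eq!(to_virtual("data\\logs\\app.txt"), "/data/logs/app.txt");
    assert_eq!(to_virtual("//tmp///x"), "/tmp/x");
    assert_eq!(to_virtual(""), "/");
}
```

OS-backed backends would skip this step and defer to OS path semantics, as noted above.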

3. Go: Afero

Repository: spf13/afero

What they do well:

  1. Composition Pattern (Similar to Ours!)

    // Sandboxing
    baseFs := afero.NewOsFs()
    restrictedFs := afero.NewBasePathFs(baseFs, "/var/data")
    
    // Caching layer
    cachedFs := afero.NewCacheOnReadFs(baseFs, afero.NewMemMapFs(), time.Hour)
    
    // Copy-on-write
    cowFs := afero.NewCopyOnWriteFs(baseFs, afero.NewMemMapFs())
    
  2. io/fs Compatibility

    • Works with Go 1.16+ standard library interfaces
    • ReadDirFS, ReadFileFS, etc.
  3. Extensive Backend Support

    • OS, Memory, SFTP, GCS
    • Community: S3, MinIO, Dropbox, Google Drive, Git

Known Issues:

| Issue | Description | Our Mitigation |
|---|---|---|
| #282 | Symlinks in BasePathFs can escape jail | Use strict-path crate for VRootFsBackend |
| #88 | Symlink handling inconsistent | Document behavior clearly |
| #344 | BasePathFs fails when basepath is `.` | Test edge cases |

BasePathFs Symlink Escape Issue:

“SymlinkIfPossible will resolve the RealPath of underlayer filesystem before make a symlink. For example, creating a link like ‘/foo/bar’ -> ‘/foo/file’ will be transform into a link point to ‘/{basepath}/foo/file.’”

This means symlinks can potentially point outside the base path!

Our Solution:

  • VRootFsBackend uses strict-path for real filesystem containment
  • Virtual backends (Memory, SQLite) are inherently safe - paths are just keys
  • PathFilter middleware provides additional sandboxing layer

What we should verify:

  • Test symlink creation pointing outside VRootFsBackend
  • Test .. in symlink targets
  • Test symlink loops with max depth

4. Java: Apache Commons VFS

Repository: Apache Commons VFS

🔴 CRITICAL VULNERABILITY: CVE in versions < 2.10.0

The Bug:

// FileObject API has resolveFile with scope parameter
FileObject file = baseFile.resolveFile("../secret.txt", NameScope.DESCENDENT);
// SHOULD throw exception - "../secret.txt" is not a descendent

// BUT with URL encoding:
FileObject file = baseFile.resolveFile("%2e%2e/secret.txt", NameScope.DESCENDENT);
// DOES NOT throw exception! Returns file outside base directory.

Root Cause: Path validation happened BEFORE URL decoding.

Lesson for AnyFS:

#![allow(unused)]
fn main() {
// WRONG - validate then decode
fn resolve(path: &str) -> Result<PathBuf, FsError> {
    validate_no_traversal(path)?;  // Checks for ".."
    let decoded = url_decode(path);  // "../" appears after decode!
    Ok(PathBuf::from(decoded))
}

// CORRECT - decode then validate
fn resolve(path: &str) -> Result<PathBuf, FsError> {
    let decoded = url_decode(path);
    let normalized = normalize_path(&decoded);  // Resolve all ".."
    validate_containment(&normalized)?;
    Ok(normalized)
}
}

Action Items:

  • Add test: URL-encoded %2e%2e path traversal attempt
  • Add test: Double-encoding %252e%252e
  • Ensure path normalization happens BEFORE validation
  • Document in security model

5. .NET: System.IO.Abstractions

Repository: TestableIO/System.IO.Abstractions

What they do well:

  1. Perfect API Compatibility

    • Mirrors System.IO exactly
    • Drop-in replacement for testing
  2. MockFileSystem for Testing

    var fileSystem = new MockFileSystem(new Dictionary<string, MockFileData>
    {
        { @"c:\myfile.txt", new MockFileData("Testing") },
        { @"c:\demo\jQuery.js", new MockFileData("jQuery content") },
    });
    
    // Use in tests
    var sut = new MyComponent(fileSystem);
    
  3. Analyzers Package

    • Roslyn analyzers warn when using System.IO directly
    • Guides developers to use abstractions

What they lack:

  • No middleware/composition
  • No caching layer
  • No sandboxing/path filtering
  • Testing-focused, not production backends

What we should adopt:

  • Consider Rust analyzer/clippy lint for std::fs usage
  • MockFileSystem pattern is similar to our MemoryBackend

6. Node.js: memfs + unionfs

Repository: streamich/memfs

What they do well:

  1. Browser + Node Unified

    • Works in browser via File System API
    • Same API as Node’s fs
  2. Union Filesystem Composition

    import { Union } from 'unionfs';
    import { fs as memfs } from 'memfs';
    import * as fs from 'fs';
    
    const ufs = new Union();
    ufs.use(fs);        // Real filesystem as base
    ufs.use(memfs);     // Memory overlay
    

Known Issues:

“There is a fork of memfs maintained by SageMath (sagemathinc/memfs-js) which was created to fix 13 security vulnerabilities revealed by npm audit. This fork exists because, as their GitHub description notes, ‘there are longstanding bugs’ in the upstream memfs.”

Lesson: Even popular libraries can have security issues. Our conformance test suite should be comprehensive.


Vulnerabilities Summary

| Library | Vulnerability | Type | Our Mitigation |
|---|---|---|---|
| Apache Commons VFS | CVE (pre-2.10.0) | URL-encoded path traversal | Decode before validate |
| Afero (Go) | Issues #282, #88 | Symlink escape from BasePathFs | Use strict-path, test thoroughly |
| PyFilesystem2 | Issue #171 | Symlink loop causes infinite recursion | Loop detection with max depth |
| memfs (Node) | 13 vulns in npm audit | Various (unspecified) | Comprehensive test suite |

Features Comparison Matrix

| Feature | fsspec | PyFS2 | Afero | Commons VFS | System.IO.Abs | AnyFS |
|---|---|---|---|---|---|---|
| Middleware composition | No | No | Yes | No | No | Yes |
| Quota enforcement | No | No | No | No | No | Yes |
| Path sandboxing | No | No | Yes | Partial | No | Yes |
| Rate limiting | No | No | No | No | No | Yes |
| Caching layer | Yes | No | Yes | No | No | Yes |
| Async support | Yes | No | No | No | No | 🔜 |
| Block-wise caching | Yes | No | No | No | No | No |
| URL-based opening | Yes | Yes | No | Yes | No | No |
| Union/overlay FS | No | Yes | Yes | No | No | Yes |
| Memory backend | Yes | Yes | Yes | Yes | Yes | Yes |
| SQLite backend | No | No | No | No | No | Yes |
| FUSE mounting | Yes | No | No | No | No | 🔜 |
| Type-safe wrappers | No | No | No | No | No | Yes* |

\* Via user-defined wrapper newtypes for compile-time isolation.

Future Ideas to Consider

These are optional extensions inspired by other ecosystems. They are intentionally not part of the core scope.

Keep (add-ons that fit the current design):

  • URL-based backend registry (sqlite://, mem://, stdfs://) as a helper crate, not in core APIs.
  • Bulk operation helpers (read_many, write_many, copy_many, glob, walk) as FsExt or a utilities crate.
  • Early async adapter crate (anyfs-async) to support remote backends without changing sync traits.
  • Bash-style shell (example app or anyfs-shell crate) that routes ls/cd/cat/cp/mv/rm/mkdir/stat through FileStorage to demonstrate middleware and backend neutrality (navigation and file management only, not full bash scripting).
  • Copy-on-write overlay middleware (Afero-style CopyOnWriteFs) as a specialized Overlay variant.
  • Archive backends (zip/tar) as separate crates implementing Fs (PyFilesystem/fsspec-style).

Defer (valuable, but needs data or wider review):

  • Range/block caching middleware for read_range heavy workloads (fsspec-style block cache).
  • Runtime capability discovery (Capabilities struct) for feature detection (symlink control, case sensitivity, max path length).
  • Lint/analyzer to discourage direct std::fs usage in app code (System.IO.Abstractions-style).
  • Retry/timeout middleware for remote backends (once remote backends exist).

Drop for now (adds noise or cross-platform complexity):

  • Change notification support (optional FsWatch trait or polling middleware).

Security Tests to Add

Based on vulnerabilities found in other libraries, add these to our conformance test suite:

Path Traversal Tests

#![allow(unused)]
fn main() {
#[test]
fn test_url_encoded_path_traversal() {
    let fs = create_sandboxed_fs("/sandbox");

    // These should all fail or be contained
    assert!(fs.read("%2e%2e/etc/passwd").is_err());      // URL-encoded ../
    assert!(fs.read("%252e%252e/secret").is_err());      // Double-encoded
    assert!(fs.read("..%2f..%2fetc/passwd").is_err());   // Mixed encoding
    assert!(fs.read("....//....//etc/passwd").is_err()); // Extra dots
}

#[test]
fn test_symlink_escape() {
    let fs = create_sandboxed_fs("/sandbox");

    // Symlink pointing outside should fail or be contained
    assert!(fs.symlink("/etc/passwd", "/sandbox/link").is_err());
    assert!(fs.symlink("../../../etc/passwd", "/sandbox/link").is_err());

    // Even if symlink created, reading should fail
    fs.symlink("../secret", "/sandbox/link").ok();
    assert!(fs.read("/sandbox/link").is_err());
}

#[test]
fn test_symlink_loop_detection() {
    let fs = MemoryBackend::new();

    // Create loop: a -> b -> a
    fs.symlink("/b", "/a").unwrap();
    fs.symlink("/a", "/b").unwrap();

    // Should detect loop, not hang
    let result = fs.read("/a");
    assert!(matches!(result, Err(FsError::TooManySymlinks { .. })));
}
}

Resource Exhaustion Tests

#![allow(unused)]
fn main() {
#[test]
fn test_deep_directory_traversal() {
    let fs = create_fs_with_depth_limit(64);

    // Creating very deep paths should fail
    let deep_path = "/".to_string() + &"a/".repeat(100);
    assert!(fs.create_dir_all(&deep_path).is_err());
}

#[test]
fn test_many_open_handles() {
    let fs = create_fs();
    let mut handles = vec![];

    // Opening many files shouldn't crash
    for i in 0..10000 {
        fs.write(format!("/file{}", i), b"x").unwrap();
        if let Ok(h) = fs.open_read(format!("/file{}", i)) {
            handles.push(h);
        }
    }
    // Should either succeed or return resource error, not crash
}
}

Action Items

High Priority

| Task | Source | Priority |
|---|---|---|
| Add URL-encoded path traversal tests | Apache Commons VFS CVE | 🔴 Critical |
| Add symlink escape tests for VRootFsBackend | Afero issues | 🔴 Critical |
| Add symlink loop detection | PyFilesystem2 #171 | 🔴 Critical |
| Verify strict-path handles all edge cases | Afero BasePathFs issues | 🔴 Critical |

Medium Priority (Future)

| Task | Source | Priority |
|---|---|---|
| Consider block-wise caching for large files | fsspec | 🟡 Enhancement |
| Add async support | fsspec async design | 🟡 Enhancement |
| URL-based filesystem specification | PyFilesystem2, Commons VFS | 🟢 Nice-to-have |

Documentation

| Task | Source |
|---|---|
| Document symlink behavior for each backend | All libraries have issues |
| Add security considerations for path handling | Apache Commons VFS CVE |
| Compare AnyFS to alternatives | This analysis |

Sibling Rust Projects: Path Security Libraries

AnyFS builds on foundational security work from two related Rust crates that specifically address path resolution vulnerabilities. These crates are planned to be used in AnyFS’s path handling implementation.

soft-canonicalize-rs

Repository: DK26/soft-canonicalize-rs

Purpose: Path canonicalization that works with non-existing paths—a critical gap in std::fs::canonicalize.

Security Features:

| Feature | Description | Attack Prevented |
|---|---|---|
| NTFS ADS validation | Blocks alternate data stream syntax | Hidden data, path escape |
| Symlink cycle detection | Bounded depth tracking | DoS via infinite loops |
| Path traversal clamping | Can’t ascend past root | Directory escape |
| Null byte rejection | Early validation | Null injection |
| TOCTOU resistance | Atomic-like resolution | Race conditions |
| Windows UNC handling | Normalizes extended paths | Path confusion |
| Linux namespace preservation | Uses proc-canonicalize | Container escape via /proc/PID/root |

Key Innovation: Anchored Canonicalization

#![allow(unused)]
fn main() {
// All paths (including symlink targets) are clamped to anchor
let result = anchored_canonicalize("/workspace", user_input)?;
// If symlink points to /etc/passwd, result becomes /workspace/etc/passwd
}

This is exactly what VRootFsBackend needs for safe path containment.

strict-path-rs

Repository: DK26/strict-path-rs

Purpose: Type-safe path handling that prevents traversal attacks at compile time.

Two Modes:

| Mode | Behavior | Use Case |
|---|---|---|
| StrictPath | Returns Err(PathEscapesBoundary) on escape | Archive extraction, file uploads |
| VirtualPath | Clamps escape attempts within sandbox | Multi-tenant, per-user storage |
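Conceptually, the two modes differ only in how an ascent past the boundary is handled: strict mode errors, virtual mode clamps. A purely lexical sketch of that distinction follows; this is not the strict-path crate’s actual API, which additionally resolves symlinks and platform quirks:

```rust
/// Strict mode: any attempt to ascend above the boundary is an error.
fn strict_join(root: &str, input: &str) -> Result<String, String> {
    let mut stack: Vec<&str> = Vec::new();
    for comp in input.split('/').filter(|c| !c.is_empty() && *c != ".") {
        if comp == ".." {
            if stack.pop().is_none() {
                return Err("PathEscapesBoundary".to_string()); // escape attempt
            }
        } else {
            stack.push(comp);
        }
    }
    Ok(format!("{}/{}", root.trim_end_matches('/'), stack.join("/")))
}

/// Virtual mode: escape attempts are silently clamped at the sandbox root.
fn virtual_join(root: &str, input: &str) -> String {
    let mut stack: Vec<&str> = Vec::new();
    for comp in input.split('/').filter(|c| !c.is_empty() && *c != ".") {
        if comp == ".." {
            stack.pop(); // popping past the root is a no-op: clamped
        } else {
            stack.push(comp);
        }
    }
    format!("{}/{}", root.trim_end_matches('/'), stack.join("/"))
}

fn main() {
    assert!(strict_join("/srv/data", "../etc/passwd").is_err());
    assert_eq!(virtual_join("/srv/data", "../../etc/passwd"), "/srv/data/etc/passwd");
    assert_eq!(strict_join("/srv/data", "a/b/../c").unwrap(), "/srv/data/a/c");
}
```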

Documented Attack Coverage (19+ vulnerabilities):

Attack TypeDescription
Symlink/junction escapesFollows and validates canonical paths
Windows 8.3 short namesDetects PROGRA~1 obfuscation
NTFS Alternate Data StreamsBlocks file.txt:hidden:$DATA
Zip Slip (CVE-2018-1000178)Validates archive entries before extraction
TOCTOU (CVE-2022-21658)Handles time-of-check-time-of-use races
Unicode/encoding bypassesNormalizes path representations
Mixed separatorsHandles / and \ on Windows
UNC path tricksPrevents \\?\C:\..\..\ attacks

Type-Safe Marker Pattern (mirrors AnyFS’s design!):

#![allow(unused)]
fn main() {
struct UserFiles;
struct SystemFiles;

fn process_user(f: &StrictPath<UserFiles>) { /* ... */ }
// Wrong marker type = compile error
}

Applicability to AnyFS

Important distinction:

| Backend Type | Storage Mechanism | Path Resolution Provider |
|---|---|---|
| VRootFsBackend | Real filesystem | OS (backend is SelfResolving) |
| MemoryBackend | HashMap keys | FileStorage (symlink-aware) |
| SqliteBackend | DB strings | FileStorage (symlink-aware) |

For virtual backends (Memory, SQLite, etc.):

  • These third-party crates perform real filesystem resolution (follow actual symlinks on disk)
  • Virtual backends treat paths as keys, so these crates can’t help
  • AnyFS implements its own path resolution in FileStorage that:
    1. Walks path components via metadata() and read_link()
    2. Resolves symlinks by reading targets from virtual storage
    3. Handles .. correctly after symlink resolution
    4. Detects loops by tracking visited virtual paths
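The four steps can be sketched as a self-contained toy resolver. A depth bound stands in for visited-path tracking here, and all names are illustrative rather than the real FileStorage internals; `links` plays the role of `read_link()` over virtual storage:

```rust
use std::collections::HashMap;

const MAX_LINK_DEPTH: usize = 40;

// Walk components left to right; whenever a prefix is a symlink, splice its
// target back into the pending components. ".." applies AFTER resolution.
fn resolve(links: &HashMap<String, String>, path: &str) -> Result<String, String> {
    let mut resolved: Vec<String> = Vec::new();
    let mut pending: Vec<String> = path
        .split('/')
        .filter(|c| !c.is_empty())
        .rev()
        .map(String::from)
        .collect();
    let mut follows = 0;

    while let Some(comp) = pending.pop() {
        if comp == "." {
            continue;
        }
        if comp == ".." {
            resolved.pop(); // step 3: ".." handled after symlink resolution
            continue;
        }
        resolved.push(comp);
        let candidate = format!("/{}", resolved.join("/"));
        if let Some(target) = links.get(&candidate) {
            follows += 1;
            if follows > MAX_LINK_DEPTH {
                return Err("TooManySymlinks".into()); // step 4: loop detection
            }
            resolved.pop(); // replace the link component with its target
            if target.starts_with('/') {
                resolved.clear(); // absolute target restarts from root
            }
            for c in target.split('/').filter(|c| !c.is_empty()).rev() {
                pending.push(c.to_string());
            }
        }
    }
    Ok(format!("/{}", resolved.join("/")))
}

fn main() {
    let mut links = HashMap::new();
    links.insert("/data/current".to_string(), "/data/v2".to_string());
    assert_eq!(resolve(&links, "/data/current/file").unwrap(), "/data/v2/file");

    // a -> b -> a must be detected, not hang
    links.insert("/a".to_string(), "/b".to_string());
    links.insert("/b".to_string(), "/a".to_string());
    assert!(resolve(&links, "/a").is_err());
}
```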

For VRootFsBackend only:

  • Since it wraps the real filesystem, strict-path provides safe containment
  • The backend implements SelfResolving, so FileStorage skips its own resolution

Security Tests Added to Conformance Suite

Based on these libraries, we’ve added tests for:

Windows-Specific:

  • NTFS Alternate Data Streams (file.txt:hidden)
  • Windows 8.3 short names (PROGRA~1)
  • UNC path traversal (\\?\C:\..\..\)
  • Reserved device names (CON, PRN, NUL)
  • Junction point escapes

Linux-Specific:

  • /proc/PID/root magic symlinks
  • /dev/fd/N file descriptor symlinks

Unicode:

  • NFC vs NFD normalization
  • Right-to-Left Override (U+202E)
  • Homoglyph confusion (Cyrillic vs Latin)

TOCTOU:

  • Check-then-use race conditions
  • Symlink target changes during resolution

Conclusion

What makes AnyFS unique:

  1. Middleware composition - Only Afero has this, and we do it better (Tower-style)
  2. Quota + rate limiting - No other library has built-in resource control
  3. Type-safe wrappers - Users can create wrapper newtypes for compile-time container isolation
  4. SQLite backend - No other abstraction library offers this

What we should learn from others:

  1. Path traversal via encoding - Apache Commons VFS vulnerability
  2. Symlink handling complexity - All libraries struggle with this
  3. Caching strategies - fsspec’s block-wise caching is sophisticated
  4. Async support - fsspec shows how to do this well

Critical security tests to add:

  1. URL-encoded path traversal (%2e%2e)
  2. Symlink escape from sandboxed directories
  3. Symlink loop detection
  4. Deep path exhaustion

Sources

External Libraries

Sibling Rust Projects

Vulnerability References

Benchmarking Plan

This document specifies the benchmarking strategy for AnyFS when the implementation exists. Functionality and security are the primary goals; performance validation is secondary but important.


Goals

  1. Validate design decisions - Confirm that the Tower-style middleware approach doesn’t introduce unacceptable overhead
  2. Identify optimization opportunities - Find hot paths that need attention
  3. Establish baselines - Know where we stand relative to alternatives
  4. Prevent regressions - Track performance across versions

Benchmark Categories

1. Backend Benchmarks

Compare AnyFS backends against equivalent solutions for their specific use cases.

MemoryBackend vs Alternatives

| Competitor | Use Case | Why Compare |
|---|---|---|
| std::collections::HashMap | Raw key-value baseline | Theoretical minimum overhead |
| tempfile + std::fs | In-memory temp files | Common testing approach |
| vfs::MemoryFS | Virtual filesystem | Direct competitor |
| virtual-fs | In-memory FS | Another VFS crate |

Metrics:

  • Sequential read/write throughput (1KB, 64KB, 1MB, 16MB files)
  • Random access latency (small reads at random offsets)
  • Directory listing performance (10, 100, 1000, 10000 entries)
  • Memory overhead per file/directory

SqliteBackend vs Alternatives

| Competitor | Use Case | Why Compare |
|---|---|---|
| rusqlite raw | Baseline SQLite performance | Measure our abstraction cost |
| sled | Embedded database | Alternative storage engine |
| redb | Embedded database | Modern alternative |
| File-per-record | Direct filesystem | Traditional approach |

Metrics:

  • Insert throughput (batch vs individual)
  • Read throughput (sequential vs random)
  • Transaction overhead
  • Database size vs raw file size
  • Startup time (opening existing database)

VRootFsBackend vs Alternatives

| Competitor | Use Case | Why Compare |
|---|---|---|
| std::fs direct | Baseline filesystem | Measure containment overhead |
| cap-std | Capability-based FS | Security-focused alternative |
| chroot simulation | Traditional sandboxing | System-level approach |

Metrics:

  • Path resolution overhead
  • Symlink traversal cost
  • Escape attempt detection cost

2. Middleware Overhead Benchmarks

Measure the cost of each middleware layer.

| Middleware | What to Measure |
|---|---|
| Quota<B> | Size tracking overhead per operation |
| PathFilter<B> | Glob matching cost per path |
| ReadOnly<B> | Should be zero (just error return) |
| RateLimit<B> | Fixed-window counter check overhead |
| Tracing<B> | Span creation/logging cost |
| Cache<B> | Cache hit/miss latency difference |

Key question: What’s the cost of a 5-layer middleware stack vs direct backend access?

Target: Middleware overhead should be <5% of I/O time for typical operations.
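One way to frame the measurement before the criterion suite exists: wrap a trivial base operation in pass-through layers and time both paths. The types below are stand-ins rather than AnyFS APIs; with monomorphization, no-op layers should compile away almost entirely, which is exactly the property the benchmark would verify:

```rust
use std::time::Instant;

trait Op {
    fn call(&self, buf: &mut Vec<u8>);
}

struct Base;
impl Op for Base {
    fn call(&self, buf: &mut Vec<u8>) {
        buf.extend_from_slice(b"payload"); // stand-in for real I/O
    }
}

// A no-op layer: measures pure wrapping cost, nothing else.
struct PassThrough<T: Op>(T);
impl<T: Op> Op for PassThrough<T> {
    fn call(&self, buf: &mut Vec<u8>) {
        self.0.call(buf);
    }
}

fn time_it<T: Op>(op: &T, iters: u32) -> u128 {
    let mut buf = Vec::new();
    let start = Instant::now();
    for _ in 0..iters {
        buf.clear();
        op.call(&mut buf);
    }
    start.elapsed().as_nanos()
}

fn main() {
    let direct = Base;
    let stacked = PassThrough(PassThrough(PassThrough(PassThrough(PassThrough(Base)))));
    let t_direct = time_it(&direct, 1_000_000);
    let t_stacked = time_it(&stacked, 1_000_000);
    println!("direct: {} ns, 5-layer: {} ns", t_direct, t_stacked);
}
```

In the real suite, criterion’s warm-up and outlier handling would replace this ad-hoc loop, and the base operation would be an actual backend call rather than a buffer append.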

3. Composition Benchmarks

Measure real-world stacks, not isolated components.

AI Agent Sandbox Stack

Quota → PathFilter → RateLimit → Tracing → MemoryBackend

Compare against:

  • Raw MemoryBackend (baseline)
  • Manual checks in application code (alternative approach)

Persistent Database Stack

Cache → Tracing → SqliteBackend

Compare against:

  • Raw SqliteBackend (baseline)
  • Application-level caching (alternative approach)

4. Trait Implementation Benchmarks

Validate that strategic boxing doesn’t hurt performance.

| Operation | Expected Cost |
|---|---|
| `read()` / `write()` | Zero-cost (monomorphized) |
| `open_read()` | `Box<dyn Read>`: ~50ns allocation, negligible vs I/O |
| `read_dir()` | `ReadDirIter`: one allocation per call |
| `FileStorage::boxed()` | One-time cost at setup |
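The boxing strategy in the table can be illustrated with a simplified stand-in: the bulk `read()` path stays fully monomorphized, while the streaming path pays one small `Box<dyn Read>` allocation per open. The toy type below is illustrative, not the real trait:

```rust
use std::io::{Cursor, Read};

struct ToyBackend {
    data: Vec<u8>,
}

impl ToyBackend {
    // Monomorphized fast path: no trait-object indirection.
    fn read(&self) -> Vec<u8> {
        self.data.clone()
    }

    // Boxed streaming path: one allocation per call, negligible vs real I/O,
    // but it keeps the trait object-safe for dyn usage.
    fn open_read(&self) -> Box<dyn Read + '_> {
        Box::new(Cursor::new(self.data.as_slice()))
    }
}

fn main() {
    let backend = ToyBackend { data: b"hello".to_vec() };
    assert_eq!(backend.read(), b"hello");

    let mut reader = backend.open_read();
    let mut out = String::new();
    reader.read_to_string(&mut out).unwrap();
    assert_eq!(out, "hello");
}
```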

Competitor Matrix

By Use Case

| Use Case | AnyFS Component | Primary Competitors |
|---|---|---|
| Testing/mocking | MemoryBackend | tempfile, vfs::MemoryFS |
| Embedded database | SqliteBackend | sled, redb, raw SQLite |
| Sandboxed host access | VRootFsBackend | cap-std, chroot |
| Policy enforcement | Middleware stack | Manual application code |
| Union filesystem | Overlay | overlayfs (kernel), fuse-overlayfs |

Crate Comparison

| Crate | Strengths | Weaknesses | Compare For |
|---|---|---|---|
| vfs | Simple API | No middleware, limited features | API ergonomics |
| virtual-fs | WASM support | Less composable | Cross-platform |
| cap-std | Security-focused | Different abstraction level | Sandboxing |
| tempfile | Battle-tested | Not a VFS | Temp file operations |
| include_dir | Compile-time embedding | Read-only | Embedded assets |

Benchmark Infrastructure

Framework

Use criterion for statistical rigor:

  • Warm-up iterations
  • Outlier detection
  • Comparison between runs

Test Data Sets

| Dataset | Contents | Purpose |
|---|---|---|
| Small files | 1000 files × 1KB | Metadata-heavy workload |
| Large files | 10 files × 100MB | Throughput workload |
| Deep hierarchy | 10 levels × 10 dirs | Path resolution stress |
| Wide directory | 1 dir × 10000 files | Listing performance |
| Mixed realistic | Project-like structure | Real-world simulation |

Reporting

Generate:

  • Throughput charts (ops/sec, MB/sec)
  • Latency histograms (p50, p95, p99)
  • Memory usage graphs
  • Comparison tables vs competitors

Performance Targets

These are aspirational targets to validate during implementation:

| Metric | Target | Rationale |
|---|---|---|
| Middleware overhead | <5% of I/O time | Composability shouldn’t cost much |
| MemoryBackend vs HashMap | <2x slower | Abstraction cost |
| SqliteBackend vs raw SQLite | <1.5x slower | Thin wrapper |
| VRootFsBackend vs std::fs | <1.2x slower | Path checking cost |
| 5-layer stack | <10% overhead | Real-world composition |

Benchmark Workflow

Development Phase

cargo bench --bench <component>

Run focused benchmarks during development to catch regressions.

Release Phase

cargo bench --all

Full benchmark suite before releases, with comparison to previous version.

CI Integration

  • Run subset of benchmarks on PR (smoke test)
  • Full benchmark suite on main branch
  • Store results for trend analysis

Non-Goals

  • Beating std::fs at raw I/O - We add abstraction; some overhead is acceptable
  • Micro-optimizing cold paths - Focus on hot paths (read, write, metadata)
  • Benchmark gaming - Optimize for real use cases, not synthetic benchmarks

Tracking

GitHub Issue: Implement benchmark suite

  • Blocked by: Core AnyFS implementation
  • Dependencies: criterion, test data generation
  • Milestone: Post-1.0 (after functionality and security are solid)

Implementation Plan

This plan describes a phased rollout of the AnyFS ecosystem:

  • anyfs-backend: Layered traits (Fs, FsFull, FsFuse, FsPosix) + Layer + types
  • anyfs: Built-in backends + middleware (feature-gated) + FileStorage<B> ergonomic wrapper

Implementation Guidelines

These guidelines apply to ALL implementation work. Derived from analysis of issues in similar projects (vfs, agentfs).

1. No Panic Policy

NEVER panic in library code. Always return Result<T, FsError>.

  • Audit all .unwrap() and .expect() calls - replace with ? or proper error handling
  • Use ok_or_else(|| FsError::...) instead of .unwrap()
  • Edge cases must return errors, not panic
  • Test in constrained environments (WASM) to catch hidden panics
#![allow(unused)]
fn main() {
// BAD
let entry = self.entries.get(&path).unwrap();

// GOOD
let entry = self.entries.get(&path)
    .ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })?;
}

2. Thread Safety Requirements

All backends must be safe for concurrent access:

  • MemoryBackend: Use Arc<RwLock<...>> for internal state
  • SqliteBackend: Use WAL mode, handle SQLITE_BUSY
  • VRootFsBackend: File operations are inherently concurrent-safe

Required: Concurrent stress tests in conformance suite.
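A minimal sketch of the Arc<RwLock<...>> guideline for MemoryBackend, using a simplified store. Names are illustrative; poisoned locks are surfaced as errors rather than panics, in keeping with the no-panic policy above:

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

// Clones share one store, so the backend can be handed to multiple threads.
#[derive(Clone, Default)]
struct MemoryStore {
    entries: Arc<RwLock<HashMap<String, Vec<u8>>>>,
}

impl MemoryStore {
    fn write(&self, path: &str, data: &[u8]) -> Result<(), String> {
        let mut map = self.entries.write().map_err(|_| "lock poisoned".to_string())?;
        map.insert(path.to_string(), data.to_vec());
        Ok(())
    }

    fn read(&self, path: &str) -> Result<Option<Vec<u8>>, String> {
        let map = self.entries.read().map_err(|_| "lock poisoned".to_string())?;
        Ok(map.get(path).cloned())
    }
}

fn main() {
    let store = MemoryStore::default();
    let handles: Vec<_> = (0..8)
        .map(|i| {
            let s = store.clone();
            thread::spawn(move || s.write(&format!("/f{}", i), b"x"))
        })
        .collect();
    for h in handles {
        h.join().unwrap().unwrap();
    }
    assert_eq!(store.read("/f3").unwrap(), Some(b"x".to_vec()));
}
```

A concurrent stress test along these lines (many writer threads, then a consistency check) is the shape the conformance suite would enforce for every backend.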

3. Consistent Path Handling

FileStorage handles path resolution via pluggable PathResolver trait (see ADR-033):

  • Always absolute paths internally
  • Always / separator (even on Windows)
  • Default IterativeResolver: symlink-aware canonicalization (not lexical)
  • Handle edge cases: //, trailing /, empty string, circular symlinks
  • Optional resolver: CachingResolver (for read-heavy workloads)

Public canonicalization API on FileStorage:

  • canonicalize(path) - strict, all components must exist
  • soft_canonicalize(path) - resolves existing, appends non-existent lexically
  • anchored_canonicalize(path, anchor) - sandboxed resolution

Standalone utility:

  • normalize(path) - lexical cleanup only (collapses //, removes trailing /). Does NOT resolve . or ...
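A minimal sketch of what such a `normalize` utility could look like (the function body is illustrative, not the shipped API): it collapses `//` and strips trailing `/`, but deliberately leaves `.` and `..` untouched, since resolving those correctly requires the symlink-aware resolver:

```rust
// Lexical cleanup only: no filesystem access, no dot resolution.
fn normalize(path: &str) -> String {
    let absolute = path.starts_with('/');
    let mut out = String::new();
    for comp in path.split('/').filter(|c| !c.is_empty()) {
        out.push('/');
        out.push_str(comp);
    }
    if out.is_empty() {
        return if absolute { "/".into() } else { String::new() };
    }
    if !absolute {
        out.remove(0); // keep relative paths relative
    }
    out
}

fn main() {
    assert_eq!(normalize("/a//b///c/"), "/a/b/c");
    assert_eq!(normalize("/a/../b/."), "/a/../b/."); // dots are NOT resolved
    assert_eq!(normalize("//"), "/");
    assert_eq!(normalize("a//b"), "a/b");
}
```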

4. Error Type Design

FsError must be:

  • Easy to pattern match
  • Include context (path, operation)
  • Derive thiserror for good messages
  • Use #[non_exhaustive] for forward compatibility
#![allow(unused)]
fn main() {
#[non_exhaustive]
#[derive(Debug, thiserror::Error)]
pub enum FsError {
    // Path/File Errors
    #[error("not found: {path}")]
    NotFound { path: PathBuf },

    #[error("{operation}: already exists: {path}")]
    AlreadyExists { path: PathBuf, operation: &'static str },

    #[error("not a file: {path}")]
    NotAFile { path: PathBuf },

    #[error("not a directory: {path}")]
    NotADirectory { path: PathBuf },

    #[error("directory not empty: {path}")]
    DirectoryNotEmpty { path: PathBuf },

    // Permission/Access Errors
    #[error("{operation}: permission denied: {path}")]
    PermissionDenied { path: PathBuf, operation: &'static str },

    #[error("access denied: {path} ({reason})")]
    AccessDenied { path: PathBuf, reason: String },

    #[error("read-only filesystem: {operation}")]
    ReadOnly { operation: &'static str },

    #[error("{operation}: feature not enabled: {feature}")]
    FeatureNotEnabled { feature: &'static str, operation: &'static str },

    // Resource Limit Errors (from Quota middleware)
    #[error("quota exceeded: limit {limit}, requested {requested}, usage {usage}")]
    QuotaExceeded { limit: u64, requested: u64, usage: u64 },

    #[error("file size exceeded: {path} ({size} > {limit})")]
    FileSizeExceeded { path: PathBuf, size: u64, limit: u64 },

    #[error("rate limit exceeded: {limit}/s (window: {window_secs}s)")]
    RateLimitExceeded { limit: u32, window_secs: u64 },

    // ... see design-overview.md for complete list
}
}

See design-overview.md for the complete FsError definition.

5. Documentation Requirements

Every backend and middleware must document:

  • Thread safety guarantees
  • Performance characteristics
  • Which operations are O(1) vs O(n)
  • Any platform-specific behavior

Phase 1: anyfs-backend (core contract)

Goal: Define the stable backend interface using layered traits.

Layered Trait Architecture

                    FsPosix
                       │
        ┌──────────────┼──────────────┐
        │              │              │
   FsHandles      FsLock       FsXattr
        │              │              │
        └──────────────┴──────────────┘
                       │
                    FsFuse ← FsFull + FsInode
                       │
        ┌──────────────┴──────────────┐
        │                             │
     FsFull                       FsInode
        │
        │
        ├─────────┬─────────┬─────────┐
        │         │         │         │
      FsLink    FsPerm    FsSync   FsStats
        │         │         │         │
        └─────────┴─────────┴─────────┘
                       │
                       Fs  ← Most users only need this
                       │
           ┌───────────┼───────────┐
           │           │           │
        FsRead    FsWrite     FsDir

Core Traits (Layer 1 - Required)

  • FsRead: read, read_to_string, read_range, exists, metadata, open_read
  • FsWrite: write, append, remove_file, rename, copy, truncate, open_write
  • FsDir: read_dir, create_dir, create_dir_all, remove_dir, remove_dir_all

Extended Traits (Layer 2 - Optional)

  • FsLink: symlink, hard_link, read_link, symlink_metadata
  • FsPermissions: set_permissions
  • FsSync: sync, fsync
  • FsStats: statfs

Inode Trait (Layer 3 - For FUSE)

  • FsInode: path_to_inode, inode_to_path, lookup, metadata_by_inode
    • No blanket/default implementation - must be explicitly implemented
    • Required for FUSE mounting (FUSE operates on inodes, not paths)
    • Enables correct hardlink reporting (same inode = same file, nlink count)
    • Note: FsLink defines hardlink creation; FsInode enables FUSE to track them
    • inode_to_path requires backend to maintain path mappings

POSIX Traits (Layer 4 - Full POSIX)

  • FsHandles: open, read_at, write_at, close
  • FsLock: lock, try_lock, unlock
  • FsXattr: get_xattr, set_xattr, remove_xattr, list_xattr

Convenience Supertraits

#![allow(unused)]
fn main() {
/// Basic filesystem - covers 90% of use cases
pub trait Fs: FsRead + FsWrite + FsDir {}
impl<T: FsRead + FsWrite + FsDir> Fs for T {}

/// Full filesystem with all std::fs features
pub trait FsFull: Fs + FsLink + FsPermissions + FsSync + FsStats {}

/// FUSE-mountable filesystem
pub trait FsFuse: FsFull + FsInode {}

/// Full POSIX filesystem
pub trait FsPosix: FsFuse + FsHandles + FsLock + FsXattr {}
}

Other Definitions

  • Define Layer trait (Tower-style middleware composition)
  • Define FsExt trait (extension methods for JSON, type checks)
  • Define FsPath trait (path canonicalization with default impl, requires FsRead + FsLink)
  • Define core types (Metadata, Permissions, FileType, DirEntry, StatFs)
  • Define FsError with contextual variants (see guidelines above)
  • Define ROOT_INODE = 1 constant
  • Define SelfResolving marker trait (opt-in for backends that handle their own path resolution, e.g., VRootFsBackend)

Exit criteria: anyfs-backend stands alone with minimal dependencies (thiserror required; serde optional for JSON in FsExt).


Phase 2: anyfs (backends + middleware)

Goal: Provide reference backends and core middleware.

Path Resolution (FileStorage’s Responsibility)

FileStorage handles path resolution using its configured PathResolver:

  • Walks path component by component using metadata() and read_link()
  • Handles .. correctly after symlink resolution (symlink-aware, not lexical)
  • Default IterativeResolver follows symlinks for backends that implement FsLink
  • Custom resolvers can implement different behaviors (e.g., no symlink following)
  • Detects circular symlinks (max depth or visited set)
  • Returns canonical resolved path to the backend

SelfResolving backends (StdFsBackend, VRootFsBackend) handle their own resolution; wrap them explicitly with FileStorage::with_resolver(backend, NoOpResolver) so FileStorage skips its own resolution pass.

Backends receive already-resolved paths - they just store/retrieve bytes.

Backends (feature-gated)

Each backend implements the traits it supports:

  • memory (default): MemoryBackend
    • Implements: Fs + FsLink + FsPermissions + FsSync + FsStats + FsInode = FsFuse
    • FileStorage handles path resolution (symlink-aware)
    • Inode source: internal node IDs (incrementing counter)
  • stdfs (optional): StdFsBackend - direct std::fs delegation
    • Implements: FsPosix (all traits including Layer 4) + SelfResolving (the OS resolves paths)
    • Inode source: OS inode numbers (std::os::unix::fs::MetadataExt::ino())
    • No path containment - full filesystem access
    • Use when you only need middleware layers without sandboxing
  • vrootfs (optional): VRootFsBackend using strict-path for containment
    • Implements: FsPosix (all traits including Layer 4) + SelfResolving (the OS resolves paths; strict-path prevents escapes)
    • Inode source: OS inode numbers (std::os::unix::fs::MetadataExt::ino())

Middleware

  • Quota<B> + QuotaLayer - Resource limits
  • Restrictions<B> + RestrictionsLayer - Runtime policy (.deny_permissions())
  • PathFilter<B> + PathFilterLayer - Path-based access control
  • ReadOnly<B> + ReadOnlyLayer - Block writes
  • RateLimit<B> + RateLimitLayer - Operation throttling
  • Tracing<B> + TracingLayer - Instrumentation
  • DryRun<B> + DryRunLayer - Log without executing
  • Cache<B> + CacheLayer - LRU read cache
  • Overlay<B1,B2> + OverlayLayer - Union filesystem

FileStorage (Ergonomic Wrapper)

  • FileStorage<B> - Thin wrapper with std::fs-aligned API
    • Generic backend B (no boxing, static dispatch)
    • Boxed PathResolver internally (cold path, boxing OK per ADR-025)
    • .boxed() method for opt-in type erasure when needed
    • Users who need type-safe domains create wrapper types: struct SandboxFs(FileStorage<B>)
  • BackendStack builder for fluent middleware composition
  • Accepts impl AsRef<Path> in FileStorage/FsExt (core traits use &Path)
  • Delegates all operations to wrapped backend

Axum-style design: Zero-cost by default, type erasure opt-in.

Note: FileStorage contains NO policy logic. Policy is handled by middleware.

Exit criteria: Each backend implements the appropriate trait level (Fs, FsFull, FsFuse) and passes conformance suite. Each middleware wraps backends implementing the same traits. Applications can use FileStorage as drop-in for std::fs patterns.


Phase 3: Conformance test suite

Goal: Prevent backend divergence and validate middleware behavior.

Backend conformance tests

Conformance tests are organized by trait layer:

Layer 1: Fs (Core) - All backends MUST pass

  • FsRead: read/read_to_string/read_range/exists/metadata/open_read
  • FsWrite: write/append/remove_file/rename/copy/truncate/open_write
  • FsDir: read_dir/create_dir*/remove_dir*

Layer 2: FsFull (Extended) - Backends that support these features

  • FsLink: symlink/hard_link/read_link/symlink_metadata
  • FsPermissions: set_permissions
  • FsSync: sync/fsync
  • FsStats: statfs

Layer 3: FsFuse (Inode) - Backends that support FUSE mounting

  • FsInode: path_to_inode/inode_to_path/lookup/metadata_by_inode

Layer 4: FsPosix (Full POSIX) - Backends that support full POSIX

  • FsHandles: open/read_at/write_at/close
  • FsLock: lock/try_lock/unlock
  • FsXattr: get_xattr/set_xattr/remove_xattr/list_xattr

Path Resolution Tests (virtual backends only)

  • /foo/../bar resolves correctly when foo is a regular directory
  • /foo/../bar resolves correctly when foo is a symlink (follows symlink, then ..)
  • Symlink chains resolve correctly (A → B → C → target)
  • Circular symlink detection (A → B → A returns error, not infinite loop)
  • Max symlink depth enforced (prevent deep chains)
  • Reading a symlink follows the target (virtual backends)

Path Edge Cases (learned from vfs issues)

  • //double//slashes// normalizes correctly
  • Note: /foo/../bar requires resolution (see above), not simple normalization
  • Trailing slashes handled consistently
  • Empty path returns error (not panic)
  • Root path / works correctly
  • Very long paths (near OS limits)
  • Unicode paths
  • Paths with spaces and special characters

Thread Safety Tests (learned from vfs #72, #47)

  • Concurrent read from multiple threads
  • Concurrent write to different files
  • Concurrent create_dir_all to same path (must not race)
  • Concurrent read_dir while modifying directory
  • Stress test: 100 threads, 1000 operations each

Error Handling Tests (learned from vfs #8, #23)

  • Missing file returns NotFound, not panic
  • Missing parent directory returns error, not panic
  • Invalid UTF-8 in path returns error, not panic
  • All error variants are matchable

Platform Tests

  • Windows path separators (\ vs /)
  • Case sensitivity differences
  • Symlink behavior differences

Middleware tests

  • Quota: Limit enforcement, usage tracking, streaming writes
  • Restrictions: Permission blocking via .deny_permissions(), error messages
  • PathFilter: Glob pattern matching, deny-by-default
  • RateLimit: Throttling behavior, burst handling
  • ReadOnly: All write operations blocked
  • Tracing: Operations logged correctly
  • Middleware composition order (inner to outer)
  • Middleware with streaming I/O (wrappers work correctly)

No-Panic Tests

#![allow(unused)]
fn main() {
#[test]
fn no_panic_on_missing_file() {
    let backend = create_backend();
    let result = backend.read(std::path::Path::new("/nonexistent"));
    assert!(matches!(result, Err(FsError::NotFound { .. })));
}

#[test]
fn no_panic_on_invalid_operation() {
    let backend = create_backend();
    backend.write(std::path::Path::new("/file.txt"), b"data").unwrap();
    // Try to read directory on a file
    let result = backend.read_dir(std::path::Path::new("/file.txt"));
    assert!(matches!(result, Err(FsError::NotADirectory { .. })));
}
}

WASM Compatibility Tests (learned from vfs #68)

#![allow(unused)]
fn main() {
#[cfg(target_arch = "wasm32")]
#[wasm_bindgen_test]
fn memory_backend_works_in_wasm() {
    let backend = MemoryBackend::new();
    backend.write(std::path::Path::new("/test.txt"), b"hello").unwrap();
    // Should not panic
}
}

Exit criteria: All backends pass same suite; middleware tests are backend-agnostic; zero panics in any test.


Phase 4: Documentation + examples

  • Keep AGENTS.md and src/architecture/design-overview.md authoritative
  • Provide example per backend
  • Provide backend implementer guide
  • Provide middleware implementer guide
  • Document performance characteristics per backend
  • Document thread safety guarantees per backend
  • Document platform-specific behavior

Phase 5: CI/CD Pipeline

Goal: Ensure quality across platforms and prevent regressions.

Cross-Platform Testing

# .github/workflows/ci.yml
strategy:
  matrix:
    os: [ubuntu-latest, windows-latest, macos-latest]
    rust: [stable, beta]

Required CI checks:

  • cargo test on all platforms
  • cargo clippy -- -D warnings
  • cargo fmt --check
  • cargo doc --no-deps
  • WASM build test: cargo build --target wasm32-unknown-unknown

Additional CI Jobs

  • Miri (undefined behavior detection): cargo +nightly miri test
  • Address Sanitizer: Detect memory issues
  • Thread Sanitizer: Detect data races
  • Coverage: Minimum 80% line coverage

Release Checklist

  • All CI checks pass
  • No new clippy warnings
  • CHANGELOG updated
  • Version bumped appropriately
  • Documentation builds without warnings

Phase 6: Mounting Support (fuse, winfsp features)

Goal: Make mounting AnyFS stacks easy, safe, and enjoyable for programmers. Mounting is part of the anyfs crate behind feature flags.

Milestones

  • Phase 0 (design complete): API shape and roadmap
    • MountHandle, MountBuilder, MountOptions, MountError
    • Platform detection hooks (is_available) and error mapping
    • Examples anchored in the mounting guide
  • Phase 1: Linux FUSE MVP (read-only)
    • Lookup/getattr/readdir/read via fuser
    • Read-only mount option; write ops return PermissionDenied
  • Phase 2: Linux FUSE read/write
    • Create/write/rename/remove/link operations
    • Capability reporting and metadata mapping
  • Phase 3: macOS parity (macFUSE)
    • Adapter compatibility + driver detection
  • Phase 4: Windows support (WinFsp, optional Dokan)
    • Windows-specific mapping + driver detection

Exit criteria: Phase 2 delivered with reliable mount/unmount, no panics, and smoke tests; macOS/Windows continue in subsequent milestones.

API sketch (subject to change):

#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, FsFuse, MountHandle};

// RAM drive with 1GB quota
let backend = MemoryBackend::new()
    .layer(QuotaLayer::builder()
        .max_total_size(1024 * 1024 * 1024)
        .build());

// Backend must implement FsFuse (includes FsInode)
let mount = MountHandle::mount(backend, "/mnt/ramdisk")?;

// Now it's a real mount point:
// $ df -h /mnt/ramdisk
// $ cp large_file.bin /mnt/ramdisk/  # fast!
// $ gcc -o /mnt/ramdisk/build ...    # compile in RAM
}

Cross-Platform Support (planned):

| Platform | Provider | Rust Crate | Feature Flag | User Must Install |
|----------|----------|------------|--------------|-------------------|
| Linux    | FUSE     | fuser      | fuse         | fuse3 package     |
| macOS    | macFUSE  | fuser      | fuse         | macFUSE           |
| Windows  | WinFsp   | winfsp     | winfsp       | WinFsp            |

The anyfs crate provides a unified API across platforms:

#![allow(unused)]
fn main() {
impl MountHandle {
    #[cfg(unix)]
    pub fn mount<B: FsFuse>(backend: B, path: impl AsRef<Path>) -> Result<Self, ...> {
        // Uses fuser crate
    }

    #[cfg(windows)]
    pub fn mount<B: FsFuse>(backend: B, path: impl AsRef<Path>) -> Result<Self, ...> {
        // Uses winfsp crate
    }
}
}

Creative Use Cases:

| Backend Stack | What You Get |
|---------------|--------------|
| MemoryBackend | RAM drive |
| MemoryBackend + Quota | RAM drive with size limit |
| SqliteBackend | Single-file portable drive |
| SqliteBackend (with SQLCipher) | Encrypted portable drive |
| Overlay<SqliteBackend, MemoryBackend> | Persistent base + RAM scratch layer |
| Cache<SqliteBackend> | SQLite with RAM read cache |
| Tracing<MemoryBackend> | RAM drive with full audit log |
| ReadOnly<SqliteBackend> | Immutable snapshot mount |

Example: AI Agent Sandbox

#![allow(unused)]
fn main() {
// Sandboxed workspace mounted as real filesystem
let sandbox = MountHandle::mount(
    MemoryBackend::new()
        .layer(PathFilterLayer::builder()
            .allow("/**")
            .deny("**/..*")             // No hidden files
            .build())
        .layer(QuotaLayer::builder()
            .max_total_size(100 * 1024 * 1024)
            .build()),
    "/mnt/agent-workspace"
)?;

// Agent's tools can now use standard filesystem APIs
// All operations are sandboxed, logged, and quota-limited
}

Architecture:

┌────────────────────────────────────────────────┐
│  /mnt/myfs (FUSE mount point)                  │
├────────────────────────────────────────────────┤
│  anyfs::mount (fuse/winfsp feature)            │
│    - Linux/macOS: fuser                        │
│    - Windows: winfsp                           │
├────────────────────────────────────────────────┤
│  Middleware stack (Quota, PathFilter, etc.)    │
├────────────────────────────────────────────────┤
│  FsFuse (Memory, SQLite, etc.)                 │
│    └─ includes FsInode for efficient lookups   │
│                                                │
│  Optional: FsPosix for locks/xattr             │
└────────────────────────────────────────────────┘

Requirements:

  • Backend must implement FsFuse (includes FsInode for efficient inode operations)
  • Backends implementing FsPosix get full lock/xattr support
  • Platform-specific FUSE provider must be installed

Future work (post-MVP)

  • Async API (AsyncFs, AsyncFsFull, etc.)
  • Import/export helpers (host path <-> container)
  • Encryption middleware
  • Compression middleware
  • no_std support (learned from vfs #38)
  • Batch operations for performance (learned from agentfs #130)
  • URL-based backend registry helper (e.g., sqlite://, mem://)
  • Copy-on-write overlay variant (Afero-style CopyOnWriteFs)
  • Archive backends (zip/tar) as separate crates
  • Indexing middleware with pluggable index backends (SQLite, PostgreSQL, MariaDB, etc.)
  • Companion shell (anyfs-shell) for interactive exploration of backends and middleware
  • Language bindings (anyfs-python via PyO3, C bindings) - see design-overview.md for approach
  • Dynamic middleware plugin system (MiddlewarePlugin trait for runtime-loaded .so/.dll plugins)
  • Metrics middleware with Prometheus exporter (GET /metrics endpoint)
  • Configurable tracing/logging backends (structured logs, CEF events, remote sinks)

anyfs-shell - Local Companion Shell

Minimal interactive shell for exploring AnyFS behavior without writing a full app. This is a companion crate, not part of the core libraries.

Goals:

  • Route all operations through FileStorage to exercise middleware and backend composition.
  • Provide a familiar, low-noise CLI for navigation and file management.
  • Keep scope intentionally small (no scripting, pipes, job control).

Command set:

  • ls [path] - list directory entries (default: current directory).
  • cd <path> - change working directory.
  • pwd - print current directory.
  • cat <path> - print file contents (UTF-8; error on invalid data).
  • cp <src> <dst> - copy files.
  • mv <src> <dst> - rename/move files.
  • rm <path> - remove file.
  • mkdir <path> - create directory.
  • stat <path> - show metadata (type, size, times, permissions if supported).
  • help, exit - basic shell control.

Flags (minimal):

  • ls -l - long listing with size/type and modified time (when available).
  • mkdir -p - create intermediate directories.
  • rm -r - remove directory tree.

Backend selection (initial sketch):

  • --backend mem (default), --backend sqlite --db path, --backend stdfs --root path, --backend vrootfs --root path.
  • --config path to load a small TOML file describing backend + middleware stack.

Example session:

anyfs:/ > ls
docs  tmp  hello.txt
anyfs:/ > cat hello.txt
Hello!
anyfs:/ > stat docs
type=dir size=0 modified=2025-02-01T12:34:56Z
anyfs:/ > exit

anyfs-vfs-compat - Interop with vfs crate

Adapter crate for bidirectional compatibility with the vfs crate ecosystem.

Why not adopt their trait? The vfs::FileSystem trait is too limited:

  • No symlinks, hard links, or permissions
  • No sync/fsync for durability
  • No truncate, statfs, or read_range
  • No middleware composition pattern

Our layered traits are a superset - Fs covers everything vfs::FileSystem does, plus our extended traits add more.

Adapters:

#![allow(unused)]
fn main() {
// Wrap a vfs::FileSystem to use as AnyFS backend
// Only implements Fs (Layer 1) - no links, permissions, etc.
pub struct VfsCompat<F: vfs::FileSystem>(F);
impl<F: vfs::FileSystem> FsRead for VfsCompat<F> { ... }
impl<F: vfs::FileSystem> FsWrite for VfsCompat<F> { ... }
impl<F: vfs::FileSystem> FsDir for VfsCompat<F> { ... }
// VfsCompat<F> implements Fs via blanket impl

// Wrap an AnyFS backend to use as vfs::FileSystem
// Any backend implementing Fs works
pub struct AnyFsCompat<B: Fs>(B);
impl<B: Fs> vfs::FileSystem for AnyFsCompat<B> { ... }
}

Use cases:

  • Migrate from vfs to AnyFS incrementally
  • Use existing vfs backends (EmbeddedFS) in AnyFS
  • Use AnyFS backends in projects that depend on vfs

Cloud Storage & Remote Access

The layered trait design enables building cloud storage services - each adapter requires only the traits it needs.

Architecture:

┌─────────────────────────────────────────────────────────────────────┐
│                          YOUR SERVER                                │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │  Quota<Tracing<SqliteBackend>>  (implements FsFuse)          │  │
│  └───────────────────────────────────────────────────────────────┘  │
│         ▲              ▲              ▲              ▲              │
│         │              │              │              │              │
│    ┌────┴────┐   ┌─────┴─────┐  ┌─────┴─────┐  ┌─────┴─────┐       │
│    │ S3 API  │   │ gRPC/REST │  │    NFS    │  │  WebDAV   │       │
│    │  (Fs)   │   │   (Fs)    │  │ (FsFuse)  │  │ (FsFull)  │       │
│    └────┬────┘   └─────┬─────┘  └─────┬─────┘  └─────┬─────┘       │
└─────────┼──────────────┼──────────────┼──────────────┼─────────────┘
          │              │              │              │
          ▼              ▼              ▼              ▼
    AWS SDK/CLI    Your SDK/app    mount /cloud   mount /webdav

Future crates for remote access:

| Crate | Required Trait | Purpose |
|-------|----------------|---------|
| anyfs-s3-server | Fs | Expose as S3-compatible API (objects = files) |
| anyfs-sftp-server | FsFull | SFTP server with permissions/links |
| anyfs-ssh-shell | FsFuse | SSH server with FUSE-mounted home directories |
| anyfs-remote | Fs | RemoteBackend client (implements Fs) |
| anyfs-grpc | Fs | gRPC protocol adapter |
| anyfs-webdav | FsFull | WebDAV server (needs permissions) |
| anyfs-nfs | FsFuse | NFS server (needs inodes) |

anyfs-s3-server - S3-Compatible Object Storage

Expose any Fs backend as an S3-compatible API. Users access your storage with standard AWS SDKs.

#![allow(unused)]
fn main() {
use anyfs::{QuotaLayer, TracingLayer};
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate
use anyfs_s3_server::S3Server;

// Your storage backend with quotas and audit logging
let backend = SqliteBackend::open("storage.db")?
    .layer(TracingLayer::new())
    .layer(QuotaLayer::builder()
        .max_total_size(100 * 1024 * 1024 * 1024)  // 100GB
        .build());

S3Server::new(backend)
    .with_auth(auth_provider)       // Your auth implementation
    .with_bucket("user-files")      // Virtual bucket name
    .bind("0.0.0.0:9000")
    .run()
    .await?;
}

Client usage (standard AWS CLI/SDK):

# Upload a file
aws s3 cp document.pdf s3://user-files/ --endpoint-url http://yourserver:9000

# List files
aws s3 ls s3://user-files/ --endpoint-url http://yourserver:9000

# Download a file
aws s3 cp s3://user-files/document.pdf ./local.pdf --endpoint-url http://yourserver:9000

anyfs-remote - Remote Backend Client

An Fs implementation that connects to a remote server. Works with FileStorage or mounting.

#![allow(unused)]
fn main() {
use anyfs_remote::RemoteBackend;
use anyfs::FileStorage;

// Connect to your cloud service
let remote = RemoteBackend::connect("https://api.yourservice.com")
    .with_auth(api_key)
    .await?;

// Use like any other backend
let fs = FileStorage::new(remote);
fs.write("/documents/report.pdf", data)?;
}

Combined with FUSE for transparent mount:

#![allow(unused)]
fn main() {
use anyfs_remote::RemoteBackend;
use anyfs::MountHandle;

// Mount remote storage as local directory
let remote = RemoteBackend::connect("https://yourserver.com")?;
MountHandle::mount(remote, "/mnt/cloud")?;

// Now use standard filesystem tools:
// $ cp file.txt /mnt/cloud/
// $ ls /mnt/cloud/
// $ cat /mnt/cloud/file.txt
}

anyfs-grpc - gRPC Protocol

Efficient binary protocol for remote Fs access.

Server side:

#![allow(unused)]
fn main() {
use anyfs_grpc::GrpcServer;

let backend = SqliteBackend::open("storage.db")?;
GrpcServer::new(backend)
    .bind("[::1]:50051")
    .serve()
    .await?;
}

Client side:

#![allow(unused)]
fn main() {
use anyfs_grpc::GrpcBackend;

let backend = GrpcBackend::connect("http://[::1]:50051").await?;
let fs = FileStorage::new(backend);
}

Multi-Tenant Cloud Storage Example

#![allow(unused)]
fn main() {
use anyfs::{QuotaLayer, PathFilterLayer, TracingLayer};
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate
use anyfs_s3_server::S3Server;

// Per-tenant backend factory
fn create_tenant_storage(tenant_id: &str, quota_bytes: u64) -> impl Fs {
    let db_path = format!("/data/tenants/{}.db", tenant_id);

    SqliteBackend::open(&db_path).unwrap()
        .layer(TracingLayer::new()
            .with_target(&format!("tenant.{}", tenant_id)))
        .layer(PathFilterLayer::builder()
            .allow("/**")
            .deny("../**")  // No path traversal
            .build())
        .layer(QuotaLayer::builder()
            .max_total_size(quota_bytes)
            .build())
}

// Tenant-aware S3 server
S3Server::new_multi_tenant(|request| {
    let tenant_id = extract_tenant(request)?;
    let quota = get_tenant_quota(tenant_id)?;
    Ok(create_tenant_storage(tenant_id, quota))
})
.bind("0.0.0.0:9000")
.run()
.await?;
}

anyfs-sftp-server - SFTP Access with Shell Commands

Expose a FsFull backend as an SFTP server. Users connect with standard SSH/SFTP clients and navigate with familiar shell commands.

Architecture:

┌─────────────────────────────────────────────────────────────────┐
│                      YOUR SERVER                                │
│                                                                 │
│  ┌───────────────┐    ┌───────────────────────────────────────┐ │
│  │ SFTP Server   │───▶│ User's isolated FileStorage           │ │
│  │ (anyfs-sftp)  │    │   └─▶ Quota<SqliteBackend>            │ │
│  └───────────────┘    │       └─▶ /data/users/alice.db        │ │
│          ▲            └───────────────────────────────────────┘ │
└──────────┼──────────────────────────────────────────────────────┘
           │
           │ sftp://
           │
     ┌─────┴─────┐
     │  Remote   │  $ cd /documents
     │  User     │  $ ls
     │  (shell)  │  $ put file.txt
     └───────────┘

Server implementation:

#![allow(unused)]
fn main() {
use anyfs::{QuotaLayer, TracingLayer};
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate
use anyfs_sftp_server::SftpServer;

// Per-user isolated backend factory
fn get_user_storage(username: &str) -> impl FsFull {
    let db_path = format!("/data/users/{}.db", username);

    SqliteBackend::open(&db_path).unwrap()
        .layer(TracingLayer::new()
            .with_target(&format!("user.{}", username)))
        .layer(QuotaLayer::builder()
            .max_total_size(10 * 1024 * 1024 * 1024)  // 10GB per user
            .build())
}

SftpServer::new(get_user_storage)
    .with_host_key("/etc/ssh/host_key")
    .bind("0.0.0.0:22")
    .run()
    .await?;
}

User experience (standard SFTP client):

$ sftp alice@yourserver.com
Connected to yourserver.com.
sftp> pwd
/
sftp> ls
documents/  photos/  backup/
sftp> cd documents
sftp> ls
report.pdf  notes.txt
sftp> put local_file.txt
Uploading local_file.txt to /documents/local_file.txt
sftp> get notes.txt
Downloading /documents/notes.txt
sftp> mkdir projects
sftp> rm old_file.txt

All operations happen on the user’s isolated SQLite database on your server.

anyfs-ssh-shell - Full Shell Access with Sandboxed Home

Give users a real SSH shell where their home directory is backed by FsFuse.

Server implementation:

#![allow(unused)]
fn main() {
use anyfs::{QuotaLayer, MountHandle};
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate
use anyfs_ssh_shell::SshShellServer;

// On user login, mount their isolated storage as $HOME
fn on_user_login(username: &str) -> Result<(), Error> {
    let db_path = format!("/data/users/{}.db", username);
    let backend = SqliteBackend::open(&db_path)?
        .layer(QuotaLayer::builder()
            .max_total_size(10 * 1024 * 1024 * 1024)
            .build());

    let mount_point = format!("/home/{}", username);
    MountHandle::mount(backend, &mount_point)?;
    Ok(())
}

SshShellServer::new()
    .on_login(on_user_login)
    .bind("0.0.0.0:22")
    .run()
    .await?;
}

User experience (full shell):

$ ssh alice@yourserver.com
Welcome to YourServer!

alice@server:~$ pwd
/home/alice
alice@server:~$ ls -la
total 3
drwxr-xr-x  4 alice alice 4096 Dec 25 10:00 .
drwxr-xr-x  2 alice alice 4096 Dec 25 10:00 documents
drwxr-xr-x  2 alice alice 4096 Dec 25 10:00 photos

alice@server:~$ cat documents/notes.txt
Hello world!

alice@server:~$ echo "new content" > documents/new_file.txt

alice@server:~$ du -sh .
150M    .

# Everything they do is actually stored in /data/users/alice.db on the server!
# They can use vim, gcc, python - all working on their isolated FsFuse backend

Isolated Shell Hosting Use Cases

| Use Case | Backend Stack | What Users Get |
|----------|---------------|----------------|
| Shared hosting | Quota<SqliteBackend> | Shell + isolated home in SQLite |
| Dev containers | Overlay<BaseImage, MemoryBackend> | Shared base + ephemeral scratch |
| Coding education | Quota<MemoryBackend> | Temporary sandboxed environment |
| CI/CD runners | Tracing<MemoryBackend> | Audited ephemeral workspace |
| Secure file drop | PathFilter<SqliteBackend> | Write-only inbox directory |

Access Pattern Summary

| Access Method | Crate | Client Requirement | Best For |
|---------------|-------|--------------------|----------|
| S3 API | anyfs-s3-server | AWS SDK (any language) | Object storage, web apps |
| SFTP | anyfs-sftp-server | Any SFTP client | Shell-like file access |
| SSH Shell | anyfs-ssh-shell + anyfs (fuse feature) | SSH client | Full shell with sandboxed home |
| gRPC | anyfs-grpc | Generated client | High-performance apps |
| REST | Custom adapter | HTTP client | Simple integrations |
| FUSE mount | anyfs (fuse feature) + anyfs-remote | FUSE installed | Transparent local access |
| WebDAV | anyfs-webdav | WebDAV client/OS | File manager access |
| NFS | anyfs-nfs | NFS client | Unix network shares |

Lessons Learned (Reference)

This plan incorporates lessons from issues in similar projects:

| Source | Issue | Lesson Applied |
|--------|-------|----------------|
| vfs #72 | RwLock panic | Thread safety tests |
| vfs #47 | create_dir_all race | Concurrent stress tests |
| vfs #8, #23 | Panics instead of errors | No-panic policy |
| vfs #24, #42 | Path inconsistencies | Path edge case tests |
| vfs #33 | Hard to match errors | Ergonomic FsError design |
| vfs #68 | WASM panics | WASM compatibility tests |
| vfs #66 | 'static confusion | Minimal trait bounds |
| agentfs #130 | Slow file deletion | Performance documentation |
| agentfs #129 | Signal handling | Proper Drop implementations |

See Lessons from Similar Projects for full analysis.

Backend Implementer’s Guide

This guide walks you through implementing a custom AnyFS backend.


Overview

AnyFS uses layered traits - you implement only what you need:

FsPosix (full POSIX)
   │
FsFuse (FUSE-mountable)
   │
FsFull (std::fs features)
   │
   Fs (basic - 90% of use cases)
   │
FsRead + FsWrite + FsDir (core)

Key properties:

  • Backends accept &Path for all path parameters
  • Backends receive already-resolved paths - FileStorage handles path resolution via pluggable PathResolver (see ADR-033). Default is IterativeResolver for symlink-aware resolution.
  • Backends handle storage only - just store/retrieve bytes at given paths
  • Policy (limits, feature gates) is handled by middleware, not backends
  • Implement only the traits your backend supports
  • Backends must be thread-safe - all trait methods use &self, so backends must use interior mutability (e.g., RwLock, Mutex) for synchronization

Dependency

Depend only on anyfs-backend:

[dependencies]
anyfs-backend = "0.1"

Choosing Which Traits to Implement

| Your Backend Supports | Implement |
|-----------------------|-----------|
| Basic file operations | Fs (= FsRead + FsWrite + FsDir) |
| Links, permissions, sync | Add FsLink, FsPermissions, FsSync, FsStats |
| Hardlinks, FUSE mounting | Add FsInode → becomes FsFuse |
| Full POSIX (handles, locks, xattr) | Add FsHandles, FsLock, FsXattr → becomes FsPosix |

Minimal Backend: Just Fs

#![allow(unused)]
fn main() {
use anyfs_backend::{FsRead, FsWrite, FsDir, FsError, Metadata, DirEntry, ReadDirIter};
use std::io::{Read, Write};
use std::path::{Path, PathBuf};

pub struct MyBackend {
    // Your storage fields
}

// Implement FsRead
impl FsRead for MyBackend {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        todo!()
    }

    fn read_to_string(&self, path: &Path) -> Result<String, FsError> {
        let data = self.read(path)?;
        String::from_utf8(data).map_err(|e| FsError::Backend(e.to_string()))
    }

    fn read_range(&self, path: &Path, offset: u64, len: usize) -> Result<Vec<u8>, FsError> {
        todo!()
    }

    fn exists(&self, path: &Path) -> Result<bool, FsError> {
        todo!()
    }

    fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
        todo!()
    }

    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError> {
        let data = self.read(path)?;
        Ok(Box::new(std::io::Cursor::new(data)))
    }
}

// Implement FsWrite
impl FsWrite for MyBackend {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        todo!()
    }

    fn append(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        todo!()
    }

    fn remove_file(&self, path: &Path) -> Result<(), FsError> {
        todo!()
    }

    fn rename(&self, from: &Path, to: &Path) -> Result<(), FsError> {
        todo!()
    }

    fn copy(&self, from: &Path, to: &Path) -> Result<(), FsError> {
        todo!()
    }

    fn truncate(&self, path: &Path, size: u64) -> Result<(), FsError> {
        todo!()
    }

    fn open_write(&self, path: &Path) -> Result<Box<dyn Write + Send>, FsError> {
        todo!()
    }
}

// Implement FsDir
impl FsDir for MyBackend {
    fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
        todo!()
    }

    fn create_dir(&self, path: &Path) -> Result<(), FsError> {
        todo!()
    }

    fn create_dir_all(&self, path: &Path) -> Result<(), FsError> {
        todo!()
    }

    fn remove_dir(&self, path: &Path) -> Result<(), FsError> {
        todo!()
    }

    fn remove_dir_all(&self, path: &Path) -> Result<(), FsError> {
        todo!()
    }
}

// MyBackend now implements Fs automatically (blanket impl)!
}

Implementation Steps

Step 1: Pick a Data Model

Your backend needs internal storage. Options:

  • HashMap-based: HashMap<PathBuf, Entry> for simple cases
  • Tree-based: Explicit directory tree structure
  • Database-backed: SQLite, key-value store, etc.

Minimum metadata per entry:

  • File type (file/directory/symlink)
  • Size (for files)
  • Content (for files)
  • Timestamps (optional)
  • Permissions (optional)
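
Under the HashMap-based option, the minimum metadata above can be sketched as a small entry type. Names like `Entry` and `Store` are illustrative, not part of the anyfs-backend API:

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

// Hypothetical entry type: one variant per file type, with
// content and size only where they apply.
enum Entry {
    File {
        content: Vec<u8>,
        modified: Option<SystemTime>, // timestamps optional
    },
    Directory,
    Symlink { target: PathBuf },
}

struct Store {
    entries: HashMap<PathBuf, Entry>,
}

impl Store {
    fn new() -> Self {
        let mut entries = HashMap::new();
        entries.insert(PathBuf::from("/"), Entry::Directory);
        Store { entries }
    }

    // Size is derived from content for files, zero otherwise.
    fn size(&self, path: &Path) -> Option<u64> {
        match self.entries.get(path)? {
            Entry::File { content, .. } => Some(content.len() as u64),
            _ => Some(0),
        }
    }
}

fn main() {
    let mut store = Store::new();
    store.entries.insert(
        PathBuf::from("/a.txt"),
        Entry::File { content: b"hello".to_vec(), modified: None },
    );
    assert_eq!(store.size(Path::new("/a.txt")), Some(5));
    assert_eq!(store.size(Path::new("/")), Some(0));
    assert_eq!(store.size(Path::new("/missing")), None);
}
```

Deriving size from content (rather than storing it separately) keeps the two from drifting apart; a database-backed model would store size as a column instead.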

Step 2: Implement FsRead (Layer 1)

Start with read operations (easiest):

#![allow(unused)]
fn main() {
impl FsRead for MyBackend {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
    fn read_to_string(&self, path: &Path) -> Result<String, FsError>;
    fn read_range(&self, path: &Path, offset: u64, len: usize) -> Result<Vec<u8>, FsError>;
    fn exists(&self, path: &Path) -> Result<bool, FsError>;
    fn metadata(&self, path: &Path) -> Result<Metadata, FsError>;
    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
}
}

Streaming implementation options:

For MemoryBackend or similar, you can use std::io::Cursor:

#![allow(unused)]
fn main() {
fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError> {
    let data = self.read(path)?;
    Ok(Box::new(std::io::Cursor::new(data)))
}
}

For VRootFsBackend, return the actual file handle:

#![allow(unused)]
fn main() {
fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError> {
    let file = std::fs::File::open(self.resolve(path)?)?;
    Ok(Box::new(file))
}
}

Step 3: Implement FsWrite (Layer 1)

#![allow(unused)]
fn main() {
impl FsWrite for MyBackend {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
    fn append(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
    fn remove_file(&self, path: &Path) -> Result<(), FsError>;
    fn rename(&self, from: &Path, to: &Path) -> Result<(), FsError>;
    fn copy(&self, from: &Path, to: &Path) -> Result<(), FsError>;
    fn truncate(&self, path: &Path, size: u64) -> Result<(), FsError>;
    fn open_write(&self, path: &Path) -> Result<Box<dyn Write + Send>, FsError>;
}
}

Note on truncate:

  • If size < current: discard trailing bytes
  • If size > current: extend with zero bytes
  • Required for FUSE support and editor save operations
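
For an in-memory backend both cases collapse into `Vec::resize`, which drops trailing bytes when shrinking and appends zero bytes when growing (sketch):

```rust
// Sketch: truncate semantics for an in-memory file.
// Vec::resize covers both directions in one call.
fn truncate_in_place(content: &mut Vec<u8>, size: u64) {
    content.resize(size as usize, 0);
}

fn main() {
    let mut data = b"hello world".to_vec();

    // size < current: trailing bytes are discarded
    truncate_in_place(&mut data, 5);
    assert_eq!(data, b"hello");

    // size > current: extended with zero bytes
    truncate_in_place(&mut data, 8);
    assert_eq!(data, b"hello\0\0\0");
}
```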

Step 4: Implement FsDir (Layer 1)

#![allow(unused)]
fn main() {
impl FsDir for MyBackend {
    fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError>;
    fn create_dir(&self, path: &Path) -> Result<(), FsError>;
    fn create_dir_all(&self, path: &Path) -> Result<(), FsError>;
    fn remove_dir(&self, path: &Path) -> Result<(), FsError>;
    fn remove_dir_all(&self, path: &Path) -> Result<(), FsError>;
}
}

Congratulations! After implementing FsRead, FsWrite, and FsDir, your backend implements Fs automatically (blanket impl). This covers 90% of use cases.


Optional: Layer 2 Traits

Add these if your backend supports the features:

FsLink - Links

#![allow(unused)]
fn main() {
impl FsLink for MyBackend {
    fn symlink(&self, original: &Path, link: &Path) -> Result<(), FsError>;
    fn hard_link(&self, original: &Path, link: &Path) -> Result<(), FsError>;
    fn read_link(&self, path: &Path) -> Result<PathBuf, FsError>;
    fn symlink_metadata(&self, path: &Path) -> Result<Metadata, FsError>;
}
}
  • Symlinks store a target path as a string
  • Hard links share content with the original (update link count)

FsPermissions

#![allow(unused)]
fn main() {
impl FsPermissions for MyBackend {
    fn set_permissions(&self, path: &Path, perm: Permissions) -> Result<(), FsError>;
}
}

FsSync - Durability

#![allow(unused)]
fn main() {
impl FsSync for MyBackend {
    fn sync(&self) -> Result<(), FsError>;
    fn fsync(&self, path: &Path) -> Result<(), FsError>;
}
}
  • sync(): Flush all pending writes to durable storage
  • fsync(path): Flush pending writes for a specific file
  • MemoryBackend can no-op these (volatile by design)
  • SqliteBackend: PRAGMA wal_checkpoint or connection flush
  • VRootFsBackend: std::fs::File::sync_all()

FsStats - Filesystem Stats

#![allow(unused)]
fn main() {
impl FsStats for MyBackend {
    fn statfs(&self) -> Result<StatFs, FsError>;
}
}

Return filesystem capacity information:

#![allow(unused)]
fn main() {
StatFs {
    total_bytes: 0,      // 0 = unlimited
    used_bytes: ...,
    available_bytes: ...,
    total_inodes: 0,
    used_inodes: ...,
    available_inodes: ...,
    block_size: 4096,
    max_name_len: 255,
}
}

Optional: Layer 3 - FsInode (For FUSE)

Implement FsInode if you need FUSE mounting or inode-based hardlink tracking:

#![allow(unused)]
fn main() {
impl FsInode for MyBackend {
    fn path_to_inode(&self, path: &Path) -> Result<u64, FsError>;
    fn inode_to_path(&self, inode: u64) -> Result<PathBuf, FsError>;
    fn lookup(&self, parent_inode: u64, name: &OsStr) -> Result<u64, FsError>;
    fn metadata_by_inode(&self, inode: u64) -> Result<Metadata, FsError>;
}
}

No blanket/default implementation - you must implement this trait explicitly if you need:

  • FUSE mounting: FUSE operates on inodes, not paths
  • Inode tracking for hardlinks: Two paths share the same inode (note: hard_link() creation is in FsLink)

Level 1: Simple backend (no FsInode)

Don’t implement FsInode. The backend won’t support FUSE mounting. Hardlink creation via FsLink::hard_link() still works, but inode sharing won’t be tracked.

Level 2: Hardlink support

Override path_to_inode so hardlinked paths return the same inode:

#![allow(unused)]
fn main() {
struct Node {
    id: u64,          // Unique node ID (the inode)
    nlink: u64,       // Hard link count
    content: Vec<u8>,
}

struct MemoryBackend {
    // Shown without interior mutability for brevity; a real
    // backend wraps this state in a lock (see Thread Safety).
    next_id: u64,
    nodes: HashMap<u64, Node>,           // inode -> Node
    paths: HashMap<PathBuf, u64>,        // path -> inode
}

impl FsInode for MemoryBackend {
    fn path_to_inode(&self, path: &Path) -> Result<u64, FsError> {
        self.paths.get(path)
            .copied()
            .ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })
    }
    // ... implement others
}

impl FsLink for MemoryBackend {
    fn hard_link(&self, original: &Path, link: &Path) -> Result<(), FsError> {
        let inode = self.path_to_inode(original)?;
        self.paths.insert(link.to_path_buf(), inode);
        let node = self.nodes.get_mut(&inode)
            .ok_or_else(|| FsError::NotFound { path: original.to_path_buf() })?;
        node.nlink += 1;
        Ok(())
    }
}
}

Level 3: Full FUSE efficiency

Override all 4 methods for O(1) inode operations:

#![allow(unused)]
fn main() {
impl FsInode for SqliteBackend {
    fn path_to_inode(&self, path: &Path) -> Result<u64, FsError> {
        self.conn.query_row(
            "SELECT id FROM nodes WHERE path = ?",
            [path.to_string_lossy()],
            |row| Ok(row.get::<_, i64>(0)? as u64),
        ).map_err(|_| FsError::NotFound { path: path.to_path_buf() })
    }

    fn inode_to_path(&self, inode: u64) -> Result<PathBuf, FsError> {
        self.conn.query_row(
            "SELECT path FROM nodes WHERE id = ?",
            [inode as i64],
            |row| Ok(PathBuf::from(row.get::<_, String>(0)?)),
        ).map_err(|_| FsError::NotFound { path: format!("inode:{}", inode).into() })
    }

    fn lookup(&self, parent_inode: u64, name: &OsStr) -> Result<u64, FsError> {
        self.conn.query_row(
            "SELECT id FROM nodes WHERE parent_id = ? AND name = ?",
            params![parent_inode as i64, name.to_string_lossy()],
            |row| Ok(row.get::<_, i64>(0)? as u64),
        ).map_err(|_| FsError::NotFound { path: name.into() })
    }

    fn metadata_by_inode(&self, inode: u64) -> Result<Metadata, FsError> {
        self.conn.query_row(
            "SELECT type, size, nlink, created, modified FROM nodes WHERE id = ?",
            [inode as i64],
            |row| Ok(Metadata {
                inode,
                nlink: row.get(2)?,
                // ...
            }),
        ).map_err(|_| FsError::NotFound { path: format!("inode:{}", inode).into() })
    }
}
}

Summary:

| Your Backend | Implement | Result |
|--------------|-----------|--------|
| Simple (no hardlinks) | Nothing | Works, but no FUSE mounting |
| With hardlinks | FsInode::path_to_inode | Hardlinks tracked correctly |
| FUSE-optimized | Full FsInode | Maximum performance |

Optional: Layer 4 - POSIX Traits

For full POSIX semantics (file handles, locking, extended attributes):

FsHandles - File Handle Operations

#![allow(unused)]
fn main() {
impl FsHandles for MyBackend {
    fn open(&self, path: &Path, flags: OpenFlags) -> Result<Handle, FsError>;
    fn read_at(&self, handle: Handle, buf: &mut [u8], offset: u64) -> Result<usize, FsError>;
    fn write_at(&self, handle: Handle, data: &[u8], offset: u64) -> Result<usize, FsError>;
    fn close(&self, handle: Handle) -> Result<(), FsError>;
}
}

FsLock - File Locking

#![allow(unused)]
fn main() {
impl FsLock for MyBackend {
    fn lock(&self, handle: Handle, lock: LockType) -> Result<(), FsError>;
    fn try_lock(&self, handle: Handle, lock: LockType) -> Result<bool, FsError>;
    fn unlock(&self, handle: Handle) -> Result<(), FsError>;
}
}

FsXattr - Extended Attributes

#![allow(unused)]
fn main() {
impl FsXattr for MyBackend {
    fn get_xattr(&self, path: &Path, name: &str) -> Result<Vec<u8>, FsError>;
    fn set_xattr(&self, path: &Path, name: &str, value: &[u8]) -> Result<(), FsError>;
    fn remove_xattr(&self, path: &Path, name: &str) -> Result<(), FsError>;
    fn list_xattr(&self, path: &Path) -> Result<Vec<String>, FsError>;
}
}

Note: Most backends don’t need Layer 4. Only implement if you’re wrapping a real filesystem (VRootFsBackend) or building a database that needs full POSIX semantics.


Error Handling

Return appropriate FsError variants:

| Situation | Error |
|-----------|-------|
| Path doesn’t exist | FsError::NotFound { path, operation } |
| Path already exists | FsError::AlreadyExists { path, operation } |
| Expected file, got dir | FsError::NotAFile { path } |
| Expected dir, got file | FsError::NotADirectory { path } |
| Remove non-empty dir | FsError::DirectoryNotEmpty { path } |
| Internal error | FsError::Backend { message } |
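
One common source of these variants is std::io::Error. A sketch of the mapping, with FsError mirrored locally in simplified form for illustration (the real type lives in anyfs-backend):

```rust
use std::io;
use std::path::{Path, PathBuf};

// Simplified local stand-in for FsError, mirroring the table
// above - illustrative only, not the anyfs-backend definition.
#[derive(Debug, PartialEq)]
enum FsError {
    NotFound { path: PathBuf },
    AlreadyExists { path: PathBuf },
    Backend { message: String },
}

// Map io::ErrorKind onto dedicated variants where one exists;
// everything else falls through to Backend with context.
fn map_io_error(err: io::Error, path: &Path) -> FsError {
    match err.kind() {
        io::ErrorKind::NotFound => FsError::NotFound { path: path.to_path_buf() },
        io::ErrorKind::AlreadyExists => FsError::AlreadyExists { path: path.to_path_buf() },
        _ => FsError::Backend { message: err.to_string() },
    }
}

fn main() {
    let e = map_io_error(io::Error::new(io::ErrorKind::NotFound, "gone"), Path::new("/x"));
    assert_eq!(e, FsError::NotFound { path: PathBuf::from("/x") });
}
```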

What Backends Do NOT Do

| Concern | Where It Lives |
|---------|----------------|
| Quota enforcement | Quota<B> middleware |
| Feature gating | Restrictions<B> middleware |
| Logging | Tracing<B> middleware |
| Ergonomic API | FileStorage<B> wrapper |

Backends focus on storage. Keep them simple.


Optional Optimizations

Some trait methods have default implementations that work universally but may be suboptimal for specific backends. You can override these for better performance.

Path Canonicalization (FsPath Trait)

The FsPath trait provides canonicalize() and soft_canonicalize() with default implementations that call read_link() and symlink_metadata() per path component.

Default behavior: O(n) calls for a path with n components

When to override:

  • Your backend can resolve paths more efficiently (e.g., SQL query)
  • Your backend delegates to OS (which has optimized syscalls)

SQLite Example - Single Query Resolution:

#![allow(unused)]
fn main() {
impl FsPath for SqliteBackend {
    fn canonicalize(&self, path: &Path) -> Result<PathBuf, FsError> {
        // Resolve entire path in one recursive CTE query
        self.conn.query_row(
            r#"
            WITH RECURSIVE resolve(current, depth) AS (
                SELECT :path, 0
                UNION ALL
                SELECT 
                    CASE WHEN n.type = 'symlink' 
                         THEN n.target 
                         ELSE resolve.current 
                    END,
                    depth + 1
                FROM resolve
                LEFT JOIN nodes n ON n.path = resolve.current
                WHERE n.type = 'symlink' AND depth < 40
            )
            SELECT current FROM resolve ORDER BY depth DESC LIMIT 1
            "#,
            params![path.to_string_lossy()],
            |row| Ok(PathBuf::from(row.get::<_, String>(0)?))
        ).map_err(|_| FsError::NotFound { 
            path: path.into(), 
            operation: "canonicalize" 
        })
    }
}
}

VRootFsBackend Example - OS Delegation:

#![allow(unused)]
fn main() {
impl FsPath for VRootFsBackend {
    fn canonicalize(&self, path: &Path) -> Result<PathBuf, FsError> {
        // Delegate to OS, which uses optimized syscalls
        let host_path = self.root.join(path.strip_prefix("/").unwrap_or(path));
        let resolved = std::fs::canonicalize(&host_path)
            .map_err(|_| FsError::NotFound {
                path: path.into(),
                operation: "canonicalize",
            })?;
        
        // Verify containment (security check)
        if !resolved.starts_with(&self.root) {
            return Err(FsError::AccessDenied {
                path: path.into(),
                reason: "path escapes root".into(),
            });
        }
        
        // Convert back to virtual path
        Ok(PathBuf::from("/").join(resolved.strip_prefix(&self.root).unwrap()))
    }
}
}

Other Optimization Opportunities

| Method | Default | Optimization Opportunity |
|--------|---------|--------------------------|
| canonicalize() | O(n) per component | SQL CTE, OS delegation |
| create_dir_all() | Recursive create_dir() | Single SQL INSERT with path hierarchy |
| remove_dir_all() | Recursive traversal | SQL DELETE with LIKE pattern |
| copy() | read + write | Database-level copy, reflink |

General Pattern:

#![allow(unused)]
fn main() {
// Override any trait method with optimized implementation
impl FsDir for SqliteBackend {
    fn create_dir_all(&self, path: &Path) -> Result<(), FsError> {
        // Instead of calling create_dir() for each level,
        // insert all parent paths in a single transaction
        self.conn.execute_batch(&format!(
            "INSERT OR IGNORE INTO nodes (path, type) VALUES {}",
            generate_ancestor_values(path)
        ))?;
        Ok(())
    }
}
}

When NOT to optimize:

  • MemoryBackend: In-memory operations are already fast; keep it simple
  • Low-volume operations: Optimize where it matters (hot paths)
  • Prototype phase: Get correctness first, optimize later

See ADR-032 for the full design rationale.

Testing Your Backend

Use the conformance test suite:

#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
    use super::MyBackend;
    use anyfs_backend::Fs;

    fn create_backend() -> MyBackend {
        MyBackend::new()
    }

    #[test]
    fn test_write_read() {
        let backend = create_backend();
        backend.write(std::path::Path::new("/test.txt"), b"hello").unwrap();
        let content = backend.read(std::path::Path::new("/test.txt")).unwrap();
        assert_eq!(content, b"hello");
    }

    #[test]
    fn test_create_dir() {
        let backend = create_backend();
        backend.create_dir(std::path::Path::new("/foo")).unwrap();
        assert!(backend.exists(std::path::Path::new("/foo")).unwrap());
    }

    // ... more tests
}
}

Note on VRootFsBackend

If you are implementing a backend that wraps a real host filesystem directory, consider using strict-path::VirtualPath and strict-path::VirtualRoot internally for path containment. This ensures paths cannot escape the designated root directory.

This is an implementation choice for filesystem-based backends, not a requirement of the Fs trait.


For Middleware Authors: Wrapping Streams

Middleware that needs to intercept streaming I/O must wrap the returned Box<dyn Read/Write>.

CountingWriter Example

#![allow(unused)]
fn main() {
use std::io::{self, Write};
use std::sync::{Arc, atomic::{AtomicU64, Ordering}};

pub struct CountingWriter<W: Write> {
    inner: W,
    bytes_written: Arc<AtomicU64>,
}

impl<W: Write> CountingWriter<W> {
    pub fn new(inner: W, counter: Arc<AtomicU64>) -> Self {
        Self { inner, bytes_written: counter }
    }
}

impl<W: Write + Send> Write for CountingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = self.inner.write(buf)?;
        self.bytes_written.fetch_add(n as u64, Ordering::Relaxed);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}
}

Using in Quota Middleware

#![allow(unused)]
fn main() {
impl<B: Fs> Fs for Quota<B> {
    fn open_write(&self, path: &Path) -> Result<Box<dyn Write + Send>, FsError> {
        // Check if we're at quota before opening
        if self.usage.total_bytes >= self.limits.max_total_size {
            return Err(FsError::QuotaExceeded { ... });
        }

        let inner = self.inner.open_write(path)?;
        Ok(Box::new(CountingWriter::new(inner, self.usage.bytes_counter.clone())))
    }
}
}

Alternatives to Wrapping

| Middleware | Alternative to wrapping |
|------------|-------------------------|
| PathFilter | Check path at open time, pass stream through |
| ReadOnly | Block open_write entirely |
| RateLimit | Count the open call, not stream bytes |
| Tracing | Log the open call, pass stream through |
| DryRun | Return std::io::sink() instead of a real writer |
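
The DryRun row, for instance, needs no wrapper type at all. A sketch of an open_write that discards everything via std::io::sink() (signature simplified to a free function for illustration):

```rust
use std::io::{self, Write};

// Sketch of a DryRun-style open_write: hand back io::sink(),
// which accepts and discards all bytes, so callers see success
// without anything being stored.
fn open_write_dry_run() -> Box<dyn Write + Send> {
    Box::new(io::sink())
}

fn main() {
    let mut w = open_write_dry_run();
    // The sink reports every byte as written.
    let n = w.write(b"ignored bytes").unwrap();
    assert_eq!(n, 13);
    w.flush().unwrap();
}
```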

Creating Custom Middleware

Custom middleware only requires anyfs-backend as a dependency - same as backends.

Dependency

[dependencies]
anyfs-backend = "0.1"

The Pattern (5 Minutes to Understand)

Middleware is just a struct that:

  1. Wraps another Fs
  2. Implements Fs itself
  3. Intercepts some methods, delegates others
#![allow(unused)]
fn main() {
//  ┌─────────────────────────────────────┐
//  │  Your Middleware                    │
//  │  ┌─────────────────────────────────┐│
//  │  │  Inner Backend (any Fs)         ││
//  │  └─────────────────────────────────┘│
//  └─────────────────────────────────────┘
//
//  Request → Middleware (intercept/modify) → Inner Backend
//  Response ← Middleware (intercept/modify) ← Inner Backend
}

Simplest Possible Middleware: Operation Counter

This middleware counts how many operations are performed:

#![allow(unused)]
fn main() {
use anyfs_backend::{FsRead, FsWrite, FsDir, FsError, Metadata, DirEntry};
use std::sync::atomic::{AtomicU64, Ordering};
use std::path::{Path, PathBuf};

/// Counts all operations performed on the backend.
pub struct Counter<B> {
    inner: B,
    pub count: AtomicU64,
}

impl<B> Counter<B> {
    pub fn new(inner: B) -> Self {
        Self { inner, count: AtomicU64::new(0) }
    }

    pub fn operations(&self) -> u64 {
        self.count.load(Ordering::Relaxed)
    }
}

// Implement each trait the inner backend supports
impl<B: FsRead> FsRead for Counter<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        self.count.fetch_add(1, Ordering::Relaxed);  // Count it
        self.inner.read(path)                         // Delegate
    }

    fn exists(&self, path: &Path) -> Result<bool, FsError> {
        self.count.fetch_add(1, Ordering::Relaxed);
        self.inner.exists(path)
    }

    // ... repeat for all FsRead methods
}

impl<B: FsWrite> FsWrite for Counter<B> {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        self.count.fetch_add(1, Ordering::Relaxed);  // Count it
        self.inner.write(path, data)                  // Delegate
    }

    // ... repeat for all FsWrite methods
}

impl<B: FsDir> FsDir for Counter<B> {
    // ... implement FsDir methods
}

// Counter<B> now implements Fs when B: Fs (blanket impl)
}

Usage:

#![allow(unused)]
fn main() {
let backend = Counter::new(MemoryBackend::new());
backend.write(std::path::Path::new("/file.txt"), b"hello")?;
backend.read(std::path::Path::new("/file.txt"))?;
backend.read(std::path::Path::new("/file.txt"))?;

println!("Operations: {}", backend.operations());  // 3
}

That’s it. That’s the entire pattern.

Adding a Layer (for .layer() syntax)

To enable the fluent .layer() syntax, add a Layer struct. The .layer() method comes from the LayerExt trait which has a blanket impl for all Fs types:

#![allow(unused)]
fn main() {
use anyfs_backend::{Fs, Layer, LayerExt};  // LayerExt provides .layer() method

pub struct CounterLayer;

impl<B: Fs> Layer<B> for CounterLayer {
    type Backend = Counter<B>;

    fn layer(self, backend: B) -> Counter<B> {
        Counter::new(backend)
    }
}
}

Usage with .layer():

#![allow(unused)]
fn main() {
// LayerExt is re-exported from anyfs crate
use anyfs::LayerExt;

let backend = MemoryBackend::new()
    .layer(CounterLayer);
}

Real Example: ReadOnly Middleware

A practical middleware that blocks all write operations:

#![allow(unused)]
fn main() {
pub struct ReadOnly<B> {
    inner: B,
}

impl<B> ReadOnly<B> {
    pub fn new(inner: B) -> Self {
        Self { inner }
    }
}

// FsRead: just delegate
impl<B: FsRead> FsRead for ReadOnly<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        self.inner.read(path)
    }

    fn exists(&self, path: &Path) -> Result<bool, FsError> {
        self.inner.exists(path)
    }

    fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
        self.inner.metadata(path)
    }

    // ... delegate all FsRead methods
}

// FsDir: delegate reads, block writes
impl<B: FsDir> FsDir for ReadOnly<B> {
    fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
        self.inner.read_dir(path)
    }

    fn create_dir(&self, _path: &Path) -> Result<(), FsError> {
        Err(FsError::ReadOnly { operation: "create_dir" })
    }

    fn create_dir_all(&self, _path: &Path) -> Result<(), FsError> {
        Err(FsError::ReadOnly { operation: "create_dir_all" })
    }

    fn remove_dir(&self, _path: &Path) -> Result<(), FsError> {
        Err(FsError::ReadOnly { operation: "remove_dir" })
    }

    fn remove_dir_all(&self, _path: &Path) -> Result<(), FsError> {
        Err(FsError::ReadOnly { operation: "remove_dir_all" })
    }
}

// FsWrite: block all operations
impl<B: FsWrite> FsWrite for ReadOnly<B> {
    fn write(&self, _path: &Path, _data: &[u8]) -> Result<(), FsError> {
        Err(FsError::ReadOnly { operation: "write" })
    }

    fn remove_file(&self, _path: &Path) -> Result<(), FsError> {
        Err(FsError::ReadOnly { operation: "remove_file" })
    }

    // ... block all FsWrite methods
}
}

Usage:

#![allow(unused)]
fn main() {
let backend = ReadOnly::new(MemoryBackend::new());

backend.read(std::path::Path::new("/file.txt"));       // OK (if file exists)
backend.write(std::path::Path::new("/file.txt"), b""); // Error: ReadOnly
}

Middleware Decision Table

| What You Want | Intercept | Delegate | Example |
|---------------|-----------|----------|---------|
| Count operations | All methods (before) | All methods | Counter |
| Block writes | Write methods | Read methods | ReadOnly |
| Transform data | read/write | Everything else | Encryption |
| Check permissions | All methods (before) | All methods | PathFilter |
| Log operations | All methods (before) | All methods | Tracing |
| Enforce limits | Write methods (check size) | Read methods | Quota |

Macro for Boilerplate (Optional)

If you don’t want to manually delegate all 29 methods, you can use a macro:

#![allow(unused)]
fn main() {
macro_rules! delegate {
    ($self:ident, $method:ident, $($arg:ident),*) => {
        $self.inner.$method($($arg),*)
    };
}

impl<B: Fs> Fs for MyMiddleware<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        // Your logic here
        delegate!(self, read, path)
    }

    fn exists(&self, path: &Path) -> Result<bool, FsError> {
        delegate!(self, exists, path)
    }

    // ... etc
}
}

Or provide a delegate_all! macro in anyfs-backend that generates all the passthrough implementations.

Complete Example: Encryption Middleware

#![allow(unused)]
fn main() {
use anyfs_backend::{Fs, FsRead, FsWrite, FsDir, Layer, FsError, Metadata, DirEntry, ReadDirIter};
use std::io::{Read, Write};
use std::path::{Path, PathBuf};

/// Middleware that encrypts/decrypts file contents transparently.
pub struct Encrypted<B> {
    inner: B,
    key: [u8; 32],
}

impl<B> Encrypted<B> {
    pub fn new(inner: B, key: [u8; 32]) -> Self {
        Self { inner, key }
    }

    fn encrypt(&self, data: &[u8]) -> Vec<u8> {
        // Your encryption logic here
        data.iter().map(|b| b ^ self.key[0]).collect()
    }

    fn decrypt(&self, data: &[u8]) -> Vec<u8> {
        // Your decryption logic here (symmetric for XOR)
        self.encrypt(data)
    }
}

// FsRead: decrypt on read
impl<B: FsRead> FsRead for Encrypted<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        let encrypted = self.inner.read(path)?;
        Ok(self.decrypt(&encrypted))
    }

    fn exists(&self, path: &Path) -> Result<bool, FsError> {
        self.inner.exists(path)
    }

    fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
        self.inner.metadata(path)
    }

    // ... delegate other FsRead methods
}

// FsWrite: encrypt on write
impl<B: FsWrite> FsWrite for Encrypted<B> {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        let encrypted = self.encrypt(data);
        self.inner.write(path, &encrypted)
    }

    // ... delegate/encrypt other FsWrite methods
}

// FsDir: just delegate (directories don't need encryption)
impl<B: FsDir> FsDir for Encrypted<B> {
    fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
        self.inner.read_dir(path)
    }

    fn create_dir(&self, path: &Path) -> Result<(), FsError> {
        self.inner.create_dir(path)
    }

    // ... delegate other FsDir methods
}

// Encrypted<B> now implements Fs when B: Fs (blanket impl)

/// Layer for creating Encrypted middleware.
pub struct EncryptedLayer {
    key: [u8; 32],
}

impl EncryptedLayer {
    pub fn new(key: [u8; 32]) -> Self {
        Self { key }
    }
}

impl<B: Fs> Layer<B> for EncryptedLayer {
    type Backend = Encrypted<B>;

    fn layer(self, backend: B) -> Self::Backend {
        Encrypted::new(backend, self.key)
    }
}
}

Usage

#![allow(unused)]
fn main() {
use anyfs::MemoryBackend;
use my_middleware::{EncryptedLayer, Encrypted};

// Direct construction
let fs = Encrypted::new(MemoryBackend::new(), key);

// Or via Layer trait
let fs = MemoryBackend::new()
    .layer(EncryptedLayer::new(key));
}

Middleware Checklist

  • Depends only on anyfs-backend
  • Implements the same traits as the inner backend (FsRead, FsWrite, FsDir, etc.)
  • Implements Layer<B> for MyMiddlewareLayer
  • Delegates unmodified operations to inner backend
  • Handles streaming I/O appropriately (wrap, pass-through, or block)
  • Documents which operations are intercepted vs delegated

Backend Checklist

  • Depends only on anyfs-backend
  • Implements core traits: FsRead, FsWrite, FsDir (= Fs)
  • Optional: Implements FsLink, FsPermissions, FsSync, FsStats (= FsFull)
  • Optional: Implements FsInode for FUSE support (= FsFuse)
  • Optional: Implements FsHandles, FsLock, FsXattr for POSIX (= FsPosix)
  • Accepts &Path for all paths
  • Returns correct FsError variants
  • Passes conformance tests for implemented traits
  • No panics (see below)
  • Thread-safe (see below)
  • Documents performance characteristics

Critical Implementation Guidelines

These guidelines are derived from issues found in similar projects (vfs, agentfs). All implementations MUST follow these.

1. No Panic Policy

NEVER use .unwrap() or .expect() in library code.

#![allow(unused)]
fn main() {
// BAD - will panic on missing file
// BAD - will panic on missing file
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
    let entry = self.entries.get(path).unwrap();  // PANIC!
    Ok(entry.content.clone())
}

// GOOD - returns error
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
    let entry = self.entries.get(path)
        .ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })?;
    Ok(entry.content.clone())
}
}

Edge cases that must NOT panic:

  • File doesn’t exist
  • Directory doesn’t exist
  • Path is empty string
  • Path is invalid UTF-8 (if using OsStr)
  • Parent directory missing
  • Trying to read a directory as a file
  • Trying to list a file as a directory
  • Concurrent access conflicts

2. Thread Safety (Required)

All trait methods use &self, not &mut self. This means backends MUST use interior mutability for thread-safe concurrent access.

Why &self?

  • Enables concurrent access patterns (multiple readers, concurrent operations)
  • Matches real filesystem semantics (concurrent access is normal)
  • More flexible API (can share references without exclusive ownership)

Backend implementer responsibility:

  • Use RwLock, Mutex, or similar for internal state
  • Ensure operations are atomic (a single write() call shouldn’t produce partial results)
  • Handle lock poisoning gracefully

What the synchronization guarantees:

  • Memory safety (no data corruption)
  • Atomic operations (writes don’t interleave)

What it does NOT guarantee:

  • Order of concurrent writes to the same path (last write wins - standard FS behavior)
#![allow(unused)]
fn main() {
use std::sync::{Arc, RwLock};
use std::collections::HashMap;
use std::path::PathBuf;

pub struct MemoryBackend {
    entries: Arc<RwLock<HashMap<PathBuf, Entry>>>,
}

impl FsRead for MemoryBackend {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        let entries = self.entries.read()
            .map_err(|_| FsError::Backend("lock poisoned".into()))?;
        // ...
    }
}

impl FsWrite for MemoryBackend {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        let mut entries = self.entries.write()
            .map_err(|_| FsError::Backend("lock poisoned".into()))?;
        // ...
    }
}
}

Common race conditions to avoid:

  • create_dir_all called concurrently for same path
  • read during write to same file
  • read_dir while directory is being modified
  • rename with concurrent access to source or destination

3. Path Resolution - NOT Your Job

Backends do NOT handle path resolution. FileStorage handles:

  • Resolving .. and . components
  • Following symlinks for non-SelfResolving backends that implement FsLink
  • Normalizing paths (///, trailing slashes, etc.)
  • Walking the virtual directory structure

Your backend receives already-resolved, clean paths. Just store and retrieve bytes at those paths.

#![allow(unused)]
fn main() {
impl FsRead for MyBackend {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        // Path is already resolved - just use it directly
        self.storage.get(path).ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })
    }
}
}

Exception: If your backend wraps a real filesystem (like VRootFsBackend), implement SelfResolving to tell FileStorage to skip resolution - the OS handles it.

#![allow(unused)]
fn main() {
impl SelfResolving for VRootFsBackend {}
}

4. Error Messages

Include context in errors for debugging:

#![allow(unused)]
fn main() {
// BAD - no context
Err(FsError::NotFound)

// GOOD - includes path
Err(FsError::NotFound { path: path.to_path_buf() })

// BETTER - includes operation context
Err(FsError::Io {
    path: path.to_path_buf(),
    operation: "read",
    source: io_error,
})
}

5. Drop Implementation

Ensure cleanup happens correctly:

#![allow(unused)]
fn main() {
impl Drop for SqliteBackend {
    fn drop(&mut self) {
        // Flush any pending writes
        if let Err(e) = self.sync() {
            eprintln!("Warning: failed to sync on drop: {}", e);
        }
    }
}
}

6. Performance Documentation

Document the complexity of operations:

#![allow(unused)]
fn main() {
/// Memory-based virtual filesystem backend.
///
/// # Performance Characteristics
///
/// | Operation | Complexity | Notes |
/// |-----------|------------|-------|
/// | `read` | O(1) | HashMap lookup |
/// | `write` | O(n) | n = data size |
/// | `read_dir` | O(k) | k = entries in directory |
/// | `create_dir_all` | O(d) | d = path depth |
/// | `remove_dir_all` | O(n) | n = total descendants |
///
/// # Thread Safety
///
/// All operations are thread-safe. Uses `RwLock` internally.
/// Multiple concurrent reads are allowed.
/// Writes are exclusive.
pub struct MemoryBackend { ... }
}

Testing Requirements

Your backend MUST pass these test categories:

Basic Functionality

#![allow(unused)]
fn main() {
#[test]
fn test_read_write_roundtrip() { ... }

#[test]
fn test_create_dir_and_list() { ... }

#[test]
fn test_remove_file() { ... }
}

Edge Cases (No Panics)

#![allow(unused)]
fn main() {
#[test]
fn test_read_nonexistent_returns_error() {
    let backend = create_backend();
    assert!(matches!(
        backend.read(std::path::Path::new("/nonexistent")),
        Err(FsError::NotFound { .. })
    ));
}

#[test]
fn test_read_dir_on_file_returns_error() {
    let backend = create_backend();
    backend.write(std::path::Path::new("/file.txt"), b"data").unwrap();
    assert!(matches!(
        backend.read_dir(std::path::Path::new("/file.txt")),
        Err(FsError::NotADirectory { .. })
    ));
}

#[test]
fn test_empty_path_returns_error() {
    let backend = create_backend();
    assert!(backend.read(std::path::Path::new("")).is_err());
}
}

Thread Safety

#![allow(unused)]
fn main() {
#[test]
fn test_concurrent_reads() {
    let backend = Arc::new(create_backend_with_data());
    let handles: Vec<_> = (0..10).map(|_| {
        let backend = backend.clone();
        std::thread::spawn(move || {
            for _ in 0..100 {
                backend.read(std::path::Path::new("/test.txt")).unwrap();
            }
        })
    }).collect();

    for handle in handles {
        handle.join().unwrap();
    }
}

#[test]
fn test_concurrent_create_dir_all() {
    let backend = Arc::new(create_backend());
    let handles: Vec<_> = (0..10).map(|_| {
        let backend = backend.clone();
        std::thread::spawn(move || {
            // Should not panic or corrupt state - trait methods take &self,
            // so no external lock is needed
            let _ = backend.create_dir_all(std::path::Path::new("/a/b/c/d"));
        })
    }).collect();
        })
    }).collect();

    for handle in handles {
        handle.join().unwrap();
    }
}
}

Path Normalization

Note: These tests apply to FileStorage integration tests, NOT direct backend tests. Backends receive already-resolved paths from FileStorage. The tests below verify that FileStorage correctly normalizes paths before passing them to backends.

#![allow(unused)]
fn main() {
#[test]
fn test_filestorage_path_normalization() {
    // Use FileStorage, not raw backend
    let fs = FileStorage::new(create_backend());
    fs.create_dir_all("/foo/bar").unwrap();
    fs.write("/foo/bar/test.txt", b"data").unwrap();

    // FileStorage resolves these before calling backend
    assert_eq!(fs.read("/foo/bar/test.txt").unwrap(), b"data");
    assert_eq!(fs.read("/foo/bar/../bar/test.txt").unwrap(), b"data");
    assert_eq!(fs.read("/foo/./bar/test.txt").unwrap(), b"data");
}

// Direct backend calls should use clean paths only
#[test]
fn test_backend_with_clean_paths() {
    let backend = create_backend();
    backend.create_dir_all(std::path::Path::new("/foo/bar")).unwrap();
    backend.write(std::path::Path::new("/foo/bar/test.txt"), b"data").unwrap();

    // Backends receive clean, resolved paths
    assert_eq!(backend.read(std::path::Path::new("/foo/bar/test.txt")).unwrap(), b"data");
}
}

MemoryBackend Snapshot & Restore

MemoryBackend supports cloning its entire state (snapshot) and serializing to bytes for persistence.

Core Concept

Snapshot = Clone the storage. That’s it.

#![allow(unused)]
fn main() {
// MemoryBackend implements Clone (custom impl, not derive)
pub struct MemoryBackend { ... }

impl Clone for MemoryBackend {
    fn clone(&self) -> Self {
        // Deep copy of Arc<RwLock<...>> contents
        // ...
    }
}

// Snapshot is just .clone()
let snapshot = fs.clone();

// Restore is just assignment
fs = snapshot;
}

API

#![allow(unused)]
fn main() {
// Snapshot = Clone::clone - a DEEP COPY, so modifications to the clone
// don't affect the original. A custom Clone impl (not #[derive(Clone)])
// deep-copies the Arc<RwLock<...>> contents.
impl Clone for MemoryBackend {
    fn clone(&self) -> Self { ... }
}

impl MemoryBackend {

    /// Serialize to bytes for persistence/transfer.
    pub fn to_bytes(&self) -> Result<Vec<u8>, FsError>;

    /// Deserialize from bytes.
    pub fn from_bytes(data: &[u8]) -> Result<Self, FsError>;

    /// Save to file.
    pub fn save_to(&self, path: impl AsRef<Path>) -> Result<(), FsError>;

    /// Load from file.
    pub fn load_from(path: impl AsRef<Path>) -> Result<Self, FsError>;
}
}

Usage

#![allow(unused)]
fn main() {
let mut fs = MemoryBackend::new();
fs.write(std::path::Path::new("/data.txt"), b"important")?;

// Snapshot = clone
let checkpoint = fs.clone();

// Do risky work...
fs.write(std::path::Path::new("/data.txt"), b"corrupted")?;

// Rollback = replace with clone
fs = checkpoint;
assert_eq!(fs.read(std::path::Path::new("/data.txt"))?, b"important");
}

Persistence

#![allow(unused)]
fn main() {
// Save to disk
fs.save_to("state.bin")?;

// Load from disk
let fs = MemoryBackend::load_from("state.bin")?;
}

SqliteBackend

SQLite already has persistence - the database file IS the snapshot. For explicit snapshots:

#![allow(unused)]
fn main() {
impl SqliteBackend {
    /// Create an in-memory copy of the database.
    pub fn clone_to_memory(&self) -> Result<Self, FsError>;

    /// Backup to another file.
    pub fn backup_to(&self, path: impl AsRef<Path>) -> Result<(), FsError>;
}
}

SQLite Operational Guide

Production-ready SQLite for filesystem backends

This guide covers everything you need to run SQLite-backed storage at scale.


Overview

SQLite is an excellent choice for filesystem backends:

  • Single-file deployment (portable, easy backup)
  • ACID transactions (data integrity)
  • Rich query capabilities (dashboards, analytics)
  • Proven at scale (handles terabytes)

But it has specific requirements for concurrent access that you must understand.

Real-World Performance Reference

A single SQLite database on modern hardware can scale remarkably well (source):

| Metric              | Typical (8 vCPU, NVMe, 32GB RAM) |
|---------------------|----------------------------------|
| Read P95            | 8-12 ms                          |
| Write P95 (batched) | 25-40 ms                         |
| Peak throughput     | ~25k requests/min                |
| Database size       | 18 GB                            |

Key insight: “Our breakthrough was not faster hardware. It was deciding that writes were expensive.”


The Golden Rule: Single Writer

SQLite supports many readers but only ONE writer at a time.

Even in WAL mode, concurrent writes will block. This isn’t a bug - it’s a design choice that enables SQLite’s reliability.

The Write Queue Pattern

Note: This pattern shows an async implementation using tokio for reference. The AnyFS API is synchronous - if you need async, wrap calls with spawn_blocking. See also the sync alternative using std::sync::mpsc below.

For filesystem backends, use a single-writer queue:

#![allow(unused)]
fn main() {
// Async variant (optional - requires tokio runtime)
use tokio::sync::{mpsc, oneshot};
use rusqlite::Connection;

pub struct SqliteBackend {
    /// Read-only connection pool (many readers OK; see Connection Pooling below)
    read_pool: r2d2::Pool<r2d2_sqlite::SqliteConnectionManager>,

    /// Write commands go through this channel
    write_tx: mpsc::UnboundedSender<WriteCmd>,
}

enum WriteCmd {
    Write { path: PathBuf, data: Vec<u8>, reply: oneshot::Sender<Result<(), FsError>> },
    Remove { path: PathBuf, reply: oneshot::Sender<Result<(), FsError>> },
    Rename { from: PathBuf, to: PathBuf, reply: oneshot::Sender<Result<(), FsError>> },
    CreateDir { path: PathBuf, reply: oneshot::Sender<Result<(), FsError>> },
    // ...
}

// Single writer task
async fn writer_loop(conn: Connection, mut rx: mpsc::UnboundedReceiver<WriteCmd>) {
    while let Some(cmd) = rx.recv().await {
        match cmd {
            WriteCmd::Write { path, data, reply } => {
                let r = execute_write(&conn, &path, &data);
                let _ = reply.send(r);
            }
            // ... handle other commands
        }
    }
}
}

Why this works:

  • No SQLITE_BUSY errors (single writer = no contention)
  • Predictable latency (queue depth = backpressure)
  • Natural batching opportunity (combine multiple ops per transaction)
  • Clean audit logging (all writes go through one place)

Sync Alternative (no tokio required):

#![allow(unused)]
fn main() {
// Sync variant using std channels
use std::sync::mpsc;

pub struct SqliteBackend {
    read_pool: r2d2::Pool<r2d2_sqlite::SqliteConnectionManager>,
    write_tx: mpsc::Sender<WriteCmd>,
}

// Writer runs in a dedicated thread
fn writer_thread(conn: Connection, rx: mpsc::Receiver<WriteCmd>) {
    while let Ok(cmd) = rx.recv() {
        match cmd {
            WriteCmd::Write { path, data, reply } => {
                let r = execute_write(&conn, &path, &data);
                let _ = reply.send(r);
            }
            // ... handle other commands
        }
    }
}
}

“One Door” Principle: Once there is one door to the database, nobody can sneak in a surprise write on the request path. This architectural discipline—not just code—is what makes SQLite reliable at scale.

Write Batching: The Key to Performance

“One transaction per event is a tax. One transaction per batch is a different economy.”

Treat writes like a budget. The breakthrough is not faster hardware—it’s deciding that writes are expensive and batching them accordingly.

Batch writes into single transactions for dramatic performance improvement:

#![allow(unused)]
fn main() {
impl SqliteBackend {
    fn flush_writes(&self) -> Result<(), FsError> {
        let ops = self.write_queue.drain();
        if ops.is_empty() { return Ok(()); }
        
        // transaction() needs exclusive access; in the single-writer pattern
        // the writer thread owns this connection
        let tx = self.conn.transaction()?;
        for op in ops {
            op.execute(&tx)?;
        }
        tx.commit()?;  // One commit for many operations
        Ok(())
    }
}
}

Flush triggers:

  • Batch size reached (e.g., 100 operations)
  • Timeout elapsed (e.g., 50ms since first queued write)
  • Explicit sync() call
  • Read-after-write on same path (for consistency)
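The first two triggers can be sketched with std only. `WriteOp` is a placeholder for the real `WriteCmd`; a real flush commits the drained batch in one transaction as shown above:

```rust
use std::time::{Duration, Instant};

/// Placeholder for a queued operation (stands in for WriteCmd).
struct WriteOp;

/// Flush when the batch is full or the oldest queued op has waited
/// longer than `max_delay`.
struct WriteBatcher {
    queue: Vec<WriteOp>,
    first_queued: Option<Instant>,
    max_batch: usize,
    max_delay: Duration,
}

impl WriteBatcher {
    fn new(max_batch: usize, max_delay: Duration) -> Self {
        Self { queue: Vec::new(), first_queued: None, max_batch, max_delay }
    }

    /// Queue an op; returns true when the caller should flush now.
    fn push(&mut self, op: WriteOp) -> bool {
        if self.first_queued.is_none() {
            self.first_queued = Some(Instant::now());
        }
        self.queue.push(op);
        self.queue.len() >= self.max_batch
            || self.first_queued.map_or(false, |t| t.elapsed() >= self.max_delay)
    }

    /// Drain the batch; the caller commits it in a single transaction.
    fn take_batch(&mut self) -> Vec<WriteOp> {
        self.first_queued = None;
        std::mem::take(&mut self.queue)
    }
}

fn main() {
    let mut b = WriteBatcher::new(3, Duration::from_secs(5));
    assert!(!b.push(WriteOp)); // 1 op queued - keep batching
    assert!(!b.push(WriteOp)); // 2 ops queued
    assert!(b.push(WriteOp));  // batch size reached - flush now
    assert_eq!(b.take_batch().len(), 3);
    assert!(b.queue.is_empty());
}
```

The timeout trigger works the same way: a background tick calls `should`-style checks and flushes once `first_queued` is older than `max_delay`.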

WAL Mode (Required)

Always enable WAL (Write-Ahead Logging) mode for concurrent access.

| Pragma         | Default  | Purpose                        | Tradeoff                      |
|----------------|----------|--------------------------------|-------------------------------|
| `journal_mode` | `WAL`    | Concurrent reads during writes | Creates `.wal`/`.shm` files   |
| `synchronous`  | `FULL`   | Data integrity on power loss   | Slower writes, safest default |
| `temp_store`   | `MEMORY` | Faster temp operations         | Uses RAM for temp tables      |
| `cache_size`   | `-32000` | 32MB page cache                | Tune based on dataset size    |
| `busy_timeout` | `5000`   | Wait 5s on lock contention     | Prevents SQLITE_BUSY errors   |
| `foreign_keys` | `ON`     | Enforce referential integrity  | Slight overhead on writes     |
#![allow(unused)]
fn main() {
fn open_connection(path: &Path) -> Result<Connection, rusqlite::Error> {
    let conn = Connection::open(path)?;

    conn.execute_batch("
        PRAGMA journal_mode = WAL;
        PRAGMA synchronous = FULL;
        PRAGMA temp_store = MEMORY;
        PRAGMA cache_size = -32000;
        PRAGMA busy_timeout = 5000;
        PRAGMA foreign_keys = ON;
    ")?;

    Ok(conn)
}
}

Synchronous Mode: Safety vs Performance

| Mode     | Behavior                        | Use When                                                         |
|----------|---------------------------------|------------------------------------------------------------------|
| `FULL`   | Sync WAL before each commit     | Default - data integrity is critical                             |
| `NORMAL` | Sync WAL before checkpoint only | High-throughput, battery-backed storage, or acceptable data loss |
| `OFF`    | No syncs                        | Testing only, high corruption risk                               |

Why FULL is the default:

  • SqliteBackend stores file content—losing data on power failure is unacceptable
  • Consumer SSDs often lack power-loss protection
  • Filesystem users expect durability guarantees

When to use NORMAL:

#![allow(unused)]
fn main() {
// Opt-in for performance when you have:
// - Enterprise storage with battery-backed write cache
// - UPS-protected systems
// - Acceptable risk of losing last few transactions
let backend = SqliteBackend::builder()
    .synchronous(Synchronous::Normal)
    .build()?;
}

Cache Size Tuning

The default 32MB cache is conservative. Tune based on your dataset:

| Dataset Size | Recommended Cache | Rationale                          |
|--------------|-------------------|------------------------------------|
| < 100MB      | 8-16MB            | Small datasets fit in cache easily |
| 100MB - 1GB  | 32-64MB           | Default is appropriate             |
| 1GB - 10GB   | 64-128MB          | Larger cache reduces disk I/O      |
| > 10GB       | 128-256MB         | Diminishing returns above this     |
#![allow(unused)]
fn main() {
// Configure via builder
let backend = SqliteBackend::builder()
    .cache_size_mb(64)
    .build()?;
}

WAL vs Rollback Journal

| Aspect                        | WAL Mode                        | Rollback Journal |
|-------------------------------|---------------------------------|------------------|
| Concurrent reads during write | ✅ Yes                          | ❌ No (blocked)  |
| Read performance              | Faster                          | Slower           |
| Write performance             | Similar                         | Similar          |
| File count                    | 3 files (`.db`, `.wal`, `.shm`) | 1-2 files        |
| Crash recovery                | Automatic                       | Automatic        |

Always use WAL for filesystem backends.

WAL Checkpointing

WAL files grow until checkpointed. SQLite auto-checkpoints at 1000 pages, but you can control this:

#![allow(unused)]
fn main() {
// Manual checkpoint (call periodically or after bulk operations)
conn.execute_batch("PRAGMA wal_checkpoint(TRUNCATE);")?;

// Or configure auto-checkpoint threshold
conn.execute_batch("PRAGMA wal_autocheckpoint = 1000;")?;  // pages
}

Checkpoint modes:

  • PASSIVE - Checkpoint without blocking writers (may not complete)
  • FULL - Wait for writers, then checkpoint completely
  • RESTART - Like FULL, but also resets WAL file
  • TRUNCATE - Like RESTART, but truncates WAL to zero bytes

For filesystem backends, run TRUNCATE checkpoint:

  • During quiet periods
  • After bulk imports
  • Before backups

Busy Handling

Even with a write queue, reads might briefly block during checkpoints. Handle this gracefully:

#![allow(unused)]
fn main() {
fn open_connection(path: &Path) -> Result<Connection, rusqlite::Error> {
    let conn = Connection::open(path)?;

    // Wait up to 30 seconds if database is busy
    conn.busy_timeout(Duration::from_secs(30))?;

    // Or use a custom busy handler
    conn.busy_handler(Some(|attempts| {
        if attempts > 100 {
            false  // Give up after 100 retries
        } else {
            std::thread::sleep(Duration::from_millis(10 * attempts as u64));
            true   // Keep trying
        }
    }))?;

    Ok(conn)
}
}

Never let SQLITE_BUSY propagate to users - it’s a transient condition.


Connection Pooling

For read operations, use a connection pool:

#![allow(unused)]
fn main() {
use r2d2::{Pool, PooledConnection};
use r2d2_sqlite::SqliteConnectionManager;

pub struct SqliteBackend {
    read_pool: Pool<SqliteConnectionManager>,
    write_tx: mpsc::UnboundedSender<WriteCmd>,
}

impl SqliteBackend {
    pub fn open(path: impl AsRef<Path>) -> Result<Self, FsError> {
        let manager = SqliteConnectionManager::file(path.as_ref())
            .with_flags(rusqlite::OpenFlags::SQLITE_OPEN_READ_ONLY);

        let read_pool = Pool::builder()
            .max_size(10)  // 10 concurrent readers
            .build(manager)
            .map_err(|e| FsError::Backend(e.to_string()))?;

        // ... set up write queue

        Ok(Self { read_pool, write_tx })
    }
}

impl FsRead for SqliteBackend {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        let conn = self.read_pool.get()
            .map_err(|e| FsError::Backend(format!("pool exhausted: {}", e)))?;

        // Use read-only connection
        query_file_content(&conn, path)
    }
}
}

Pool sizing:

  • Start with max_size = CPU cores * 2
  • Monitor pool exhaustion
  • Increase if reads queue up
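A sketch of the starting heuristic above; `initial_pool_size` is a hypothetical helper, not part of any crate's API:

```rust
use std::thread;

/// Starting point from the guidance above: CPU cores * 2,
/// clamped to a sane range. Tune from monitoring afterwards.
fn initial_pool_size() -> u32 {
    let cores = thread::available_parallelism().map(|n| n.get()).unwrap_or(4);
    (cores as u32).saturating_mul(2).clamp(4, 64)
}

fn main() {
    let size = initial_pool_size();
    assert!((4..=64).contains(&size));
    println!("read pool max_size: {}", size);
}
```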

Vacuum and Maintenance

SQLite doesn’t automatically reclaim space from deleted data. You need VACUUM.

Incremental Auto-Vacuum

Enable incremental auto-vacuum for gradual space reclamation:

-- Set once when creating the database
PRAGMA auto_vacuum = INCREMENTAL;

-- Then periodically run (e.g., daily or after large deletes)
PRAGMA incremental_vacuum(1000);  -- Free up to 1000 pages

Manual Vacuum

Full vacuum rebuilds the entire database (expensive but thorough):

#![allow(unused)]
fn main() {
impl SqliteBackend {
    /// Compact the database. Call during maintenance windows.
    pub fn vacuum(&self) -> Result<(), FsError> {
        // Vacuum needs exclusive access - pause writes
        let conn = self.get_write_connection()?;

        // This can take a long time for large databases
        conn.execute_batch("VACUUM;")?;

        Ok(())
    }
}
}

When to vacuum:

  • After deleting >25% of data
  • After schema migrations
  • During scheduled maintenance
  • Never during peak usage
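The ">25% of data deleted" rule can be approximated from `PRAGMA freelist_count` and `PRAGMA page_count`; `should_vacuum` here is a hypothetical helper:

```rust
/// Vacuum once free pages exceed ~25% of the file.
/// Both inputs come from `PRAGMA freelist_count` / `PRAGMA page_count`.
fn should_vacuum(freelist_count: u64, page_count: u64) -> bool {
    page_count > 0 && freelist_count * 4 >= page_count
}

fn main() {
    assert!(!should_vacuum(10, 100)); // 10% free - leave it alone
    assert!(should_vacuum(30, 100));  // 30% free - worth reclaiming
    assert!(!should_vacuum(0, 0));    // empty database - nothing to do
}
```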

Integrity Check

Periodically verify database integrity:

#![allow(unused)]
fn main() {
impl SqliteBackend {
    pub fn check_integrity(&self) -> Result<bool, FsError> {
        let conn = self.read_pool.get()?;

        let result: String = conn.query_row(
            "PRAGMA integrity_check;",
            [],
            |row| row.get(0),
        )?;

        Ok(result == "ok")
    }
}
}

Run integrity checks:

  • After crash recovery
  • Before backups
  • Periodically (weekly/monthly)

Schema Migrations

Filesystem schemas evolve. Handle migrations properly:

Version Tracking

-- Store schema version in the database
CREATE TABLE IF NOT EXISTS meta (
    key TEXT PRIMARY KEY,
    value TEXT
);

INSERT OR REPLACE INTO meta (key, value) VALUES ('schema_version', '1');

Migration Pattern

#![allow(unused)]
fn main() {
const CURRENT_VERSION: i32 = 3;

impl SqliteBackend {
    fn migrate(&self, conn: &Connection) -> Result<(), FsError> {
        let version: i32 = conn.query_row(
            "SELECT COALESCE((SELECT value FROM meta WHERE key = 'schema_version'), '0')",
            [],
            |row| row.get::<_, String>(0)?.parse().map_err(|_| rusqlite::Error::InvalidQuery),
        ).unwrap_or(0);

        if version < 1 {
            self.migrate_v0_to_v1(conn)?;
        }
        if version < 2 {
            self.migrate_v1_to_v2(conn)?;
        }
        if version < 3 {
            self.migrate_v2_to_v3(conn)?;
        }

        conn.execute(
            "INSERT OR REPLACE INTO meta (key, value) VALUES ('schema_version', ?)",
            [CURRENT_VERSION.to_string()],
        )?;

        Ok(())
    }

    fn migrate_v1_to_v2(&self, conn: &Connection) -> Result<(), FsError> {
        conn.execute_batch("
            -- Add new column
            ALTER TABLE nodes ADD COLUMN checksum TEXT;

            -- Backfill (expensive but necessary)
            -- UPDATE nodes SET checksum = compute_checksum(content) WHERE content IS NOT NULL;
        ")?;
        Ok(())
    }
}
}

Migration rules:

  • Always wrap in transaction
  • Test migrations on copy of production data
  • Have rollback plan (backup before migration)
  • Avoid dropping columns - ALTER TABLE DROP COLUMN requires SQLite 3.35+ and carries restrictions; prefer adding new ones
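The version-gated pattern can be exercised against a pure in-memory stand-in. The `Schema` type and column names here are hypothetical; a real backend runs each step inside a transaction, as the rules above require:

```rust
/// Hypothetical in-memory stand-in for a database schema.
#[derive(Debug, PartialEq)]
struct Schema {
    version: i32,
    columns: Vec<&'static str>,
}

/// Each migration upgrades exactly one version step.
fn migrations() -> Vec<fn(&mut Schema)> {
    vec![
        |s| s.columns.push("content"),  // v0 -> v1
        |s| s.columns.push("checksum"), // v1 -> v2
        |s| s.columns.push("nlink"),    // v2 -> v3
    ]
}

/// Apply every migration after the current version, in order.
fn migrate(schema: &mut Schema) {
    for (i, step) in migrations().iter().enumerate() {
        if schema.version <= i as i32 {
            step(schema);
            schema.version = i as i32 + 1;
        }
    }
}

fn main() {
    // A database already at v1 only runs the v1->v2 and v2->v3 steps.
    let mut schema = Schema { version: 1, columns: vec!["content"] };
    migrate(&mut schema);
    assert_eq!(schema.version, 3);
    assert_eq!(schema.columns, vec!["content", "checksum", "nlink"]);
}
```

Because each step is idempotent with respect to the version check, re-running `migrate` on an up-to-date schema is a no-op.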

Backup Strategies

SQLite’s backup API creates consistent snapshots of a live database:

#![allow(unused)]
fn main() {
use rusqlite::backup::{Backup, Progress};

impl SqliteBackend {
    /// Create a consistent backup while database is in use.
    pub fn backup(&self, dest_path: impl AsRef<Path>) -> Result<(), FsError> {
        let src = self.get_read_connection()?;
        let mut dest = Connection::open(dest_path.as_ref())?;

        let backup = Backup::new(&src, &mut dest)?;

        // Copy in chunks (allows progress reporting)
        loop {
            let more = backup.step(100)?;  // 100 pages at a time

            if !more {
                break;
            }

            // Optional: report progress
            let progress = backup.progress();
            println!("Backup: {}/{} pages", progress.pagecount - progress.remaining, progress.pagecount);
        }

        Ok(())
    }
}
}

Benefits:

  • No downtime (backup while serving requests)
  • Consistent snapshot (point-in-time)
  • Can copy to any destination (file, memory, network)

File Copy (Simple but Risky)

Only safe if database is not in use:

#![allow(unused)]
fn main() {
// DANGER: Only do this if no connections are open!
impl SqliteBackend {
    pub fn backup_cold(&self, dest: impl AsRef<Path>) -> Result<(), FsError> {
        // Ensure WAL is checkpointed first
        let conn = self.get_write_connection()?;
        conn.execute_batch("PRAGMA wal_checkpoint(TRUNCATE);")?;
        drop(conn);

        // Now safe to copy
        std::fs::copy(&self.db_path, dest.as_ref())?;
        Ok(())
    }
}
}

Backup Schedule

| Scenario                | Strategy                           |
|-------------------------|------------------------------------|
| Development             | Manual or none                     |
| Small production (<1GB) | Hourly online backup               |
| Large production (>1GB) | Daily full + WAL archiving         |
| Critical data           | Continuous WAL shipping to replica |

Performance Tuning

Essential PRAGMAs

-- Safe defaults
PRAGMA journal_mode = WAL;           -- Required for concurrent access
PRAGMA synchronous = FULL;           -- Data integrity on power loss (default)
PRAGMA cache_size = -32000;          -- 32MB cache (tune based on dataset)
PRAGMA temp_store = MEMORY;          -- Temp tables in memory

-- Performance opt-in (when you have battery-backed storage)
-- PRAGMA synchronous = NORMAL;      -- Faster, risk of data loss on power failure
-- PRAGMA cache_size = -128000;      -- Larger cache for big datasets
-- PRAGMA mmap_size = 268435456;     -- 256MB memory-mapped I/O

-- For read-heavy workloads
PRAGMA read_uncommitted = ON;        -- Allow dirty reads (faster, use carefully)

-- For write-heavy workloads
PRAGMA wal_autocheckpoint = 10000;   -- Checkpoint less frequently

Indexing Strategy

-- Essential indexes for filesystem operations
CREATE INDEX idx_nodes_parent ON nodes(parent_inode);
CREATE INDEX idx_nodes_name ON nodes(parent_inode, name);

-- For metadata queries
CREATE INDEX idx_nodes_type ON nodes(node_type);
CREATE INDEX idx_nodes_modified ON nodes(modified_at);

-- For GC queries
CREATE INDEX idx_blobs_orphan ON blobs(refcount) WHERE refcount = 0;

-- Composite indexes for common queries
CREATE INDEX idx_nodes_parent_type ON nodes(parent_inode, node_type);

Query Optimization

#![allow(unused)]
fn main() {
// BAD: Multiple queries
fn get_children_with_metadata(parent: i64) -> Vec<Node> {
    let children = query("SELECT * FROM nodes WHERE parent = ?", [parent]);
    for child in children {
        let metadata = query("SELECT * FROM metadata WHERE inode = ?", [child.inode]);
        // ...
    }
}

// GOOD: Single query with JOIN
fn get_children_with_metadata(parent: i64) -> Vec<Node> {
    query("
        SELECT n.*, m.*
        FROM nodes n
        LEFT JOIN metadata m ON n.inode = m.inode
        WHERE n.parent = ?
    ", [parent])
}
}

Prepared Statements

Always use prepared statements for repeated queries:

#![allow(unused)]
fn main() {
impl SqliteBackend {
    fn prepare_statements(conn: &Connection) -> Result<Statements, FsError> {
        Ok(Statements {
            read_file: conn.prepare_cached(
                "SELECT content FROM nodes WHERE parent_inode = ? AND name = ?"
            )?,

            list_dir: conn.prepare_cached(
                "SELECT name, node_type, size FROM nodes WHERE parent_inode = ?"
            )?,

            // ... other common queries
        })
    }
}
}

Monitoring and Diagnostics

Key Metrics to Track

#![allow(unused)]
fn main() {
impl SqliteBackend {
    pub fn stats(&self) -> Result<DbStats, FsError> {
        let conn = self.read_pool.get()?;

        Ok(DbStats {
            // Database size
            page_count: pragma_i64(&conn, "page_count"),
            page_size: pragma_i64(&conn, "page_size"),

            // WAL status
            wal_pages: pragma_i64(&conn, "wal_checkpoint"),

            // Cache efficiency
            cache_hit: pragma_i64(&conn, "cache_hit"),
            cache_miss: pragma_i64(&conn, "cache_miss"),

            // Fragmentation
            freelist_count: pragma_i64(&conn, "freelist_count"),
        })
    }
}

fn pragma_i64(conn: &Connection, name: &str) -> i64 {
    conn.query_row(&format!("PRAGMA {}", name), [], |r| r.get(0)).unwrap_or(0)
}
}

Health Checks

#![allow(unused)]
fn main() {
impl SqliteBackend {
    pub fn health_check(&self) -> HealthStatus {
        // 1. Can we connect?
        let conn = match self.read_pool.get() {
            Ok(c) => c,
            Err(e) => return HealthStatus::Unhealthy(format!("pool: {}", e)),
        };

        // 2. Is database intact?
        let integrity: String = conn.query_row("PRAGMA integrity_check", [], |r| r.get(0))
            .unwrap_or_else(|_| "error".to_string());

        if integrity != "ok" {
            return HealthStatus::Unhealthy(format!("integrity: {}", integrity));
        }

        // 3. Is WAL file reasonable size?
        let wal_size = std::fs::metadata(format!("{}-wal", self.db_path))
            .map(|m| m.len())
            .unwrap_or(0);

        if wal_size > 100 * 1024 * 1024 {  // > 100MB
            return HealthStatus::Degraded("WAL file large - checkpoint needed".into());
        }

        // 4. Is write queue backed up?
        if self.write_queue_depth() > 1000 {
            return HealthStatus::Degraded("Write queue backlog".into());
        }

        HealthStatus::Healthy
    }
}
}

Common Pitfalls

1. Opening Too Many Connections

#![allow(unused)]
fn main() {
// BAD: New connection per operation
fn read(&self, path: &Path) -> Vec<u8> {
    let conn = Connection::open(&self.db_path).unwrap();  // DON'T
    // ...
}

// GOOD: Use connection pool
fn read(&self, path: &Path) -> Vec<u8> {
    let conn = self.pool.get().unwrap();  // Reuse connections
    // ...
}
}

2. Long-Running Transactions

#![allow(unused)]
fn main() {
// BAD: Transaction open while doing slow work
let tx = conn.transaction()?;
for file in files {
    tx.execute("INSERT ...", [&file])?;
    upload_to_s3(&file)?;  // SLOW - blocks other writers!
}
tx.commit()?;

// GOOD: Minimize transaction scope
for file in files {
    upload_to_s3(&file)?;  // Do slow work outside transaction
}
let tx = conn.transaction()?;
for file in files {
    tx.execute("INSERT ...", [&file])?;  // Fast inserts only
}
tx.commit()?;
}

3. Ignoring SQLITE_BUSY

#![allow(unused)]
fn main() {
// BAD: Crash on busy
conn.execute("INSERT ...", [])?;  // May return SQLITE_BUSY

// GOOD: Retry logic (or use busy_timeout)
loop {
    match conn.execute("INSERT ...", []) {
        Ok(_) => break,
        Err(rusqlite::Error::SqliteFailure(e, _)) if e.code == ErrorCode::DatabaseBusy => {
            std::thread::sleep(Duration::from_millis(10));
            continue;
        }
        Err(e) => return Err(e.into()),
    }
}
}

4. Forgetting to Checkpoint

// BAD: WAL grows forever
// (no checkpoint calls)

// GOOD: Periodic checkpoint
impl SqliteBackend {
    pub fn maintenance(&self) -> Result<(), FsError> {
        let conn = self.get_write_connection()?;
        conn.execute_batch("PRAGMA wal_checkpoint(TRUNCATE);")?;
        Ok(())
    }
}

5. Not Using Transactions for Batch Operations

#![allow(unused)]
fn main() {
// BAD: 1000 separate transactions
for item in items {
    conn.execute("INSERT ...", [item])?;  // Each is auto-committed
}

// GOOD: Single transaction
let tx = conn.transaction()?;
for item in items {
    tx.execute("INSERT ...", [item])?;
}
tx.commit()?;  // 10-100x faster
}

SQLCipher (Encryption)

For encrypted databases, use SQLCipher:

#![allow(unused)]
fn main() {
use rusqlite::Connection;

fn open_encrypted(path: &Path, key: &str) -> Result<Connection, rusqlite::Error> {
    let conn = Connection::open(path)?;

    // Set encryption key (must be first operation).
    // NOTE: prefer conn.pragma_update(None, "key", key) - interpolating
    // the key into SQL invites injection and quoting bugs.
    conn.execute_batch(&format!("PRAGMA key = '{}';", key))?;

    // Verify encryption is working
    conn.execute_batch("SELECT count(*) FROM sqlite_master;")?;

    // Now configure as normal
    conn.execute_batch("
        PRAGMA journal_mode = WAL;
        PRAGMA synchronous = FULL;
    ")?;

    Ok(conn)
}
}

Key management:

  • Never hardcode keys
  • Rotate keys periodically (requires re-encryption)
  • Use key derivation (PBKDF2) for password-based keys
  • Store key metadata separately from data

See Security Model for key rotation patterns.


Path Resolution Performance

The N-query problem is the dominant cost for SQLite filesystems.

With a parent/name schema, resolving /documents/2024/q1/report.pdf requires:

Query 1: SELECT inode FROM nodes WHERE parent=1 AND name='documents' → 2
Query 2: SELECT inode FROM nodes WHERE parent=2 AND name='2024' → 3
Query 3: SELECT inode FROM nodes WHERE parent=3 AND name='q1' → 4
Query 4: SELECT inode FROM nodes WHERE parent=4 AND name='report.pdf' → 5

Four round-trips for one file! Deep directory structures multiply this cost.
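The cost is easy to see in a miniature model of the parent/name schema, where each map lookup stands in for one SQL round-trip (`resolve` is a hypothetical helper):

```rust
use std::collections::HashMap;

/// In-memory stand-in for the nodes table, keyed by (parent_inode, name).
type Nodes = HashMap<(i64, String), i64>;

/// One lookup per path component - the N-query pattern in miniature.
fn resolve(nodes: &Nodes, path: &str) -> Option<i64> {
    let mut inode = 1; // root inode
    for component in path.split('/').filter(|c| !c.is_empty()) {
        inode = *nodes.get(&(inode, component.to_string()))?;
    }
    Some(inode)
}

fn main() {
    let mut nodes = Nodes::new();
    nodes.insert((1, "documents".into()), 2);
    nodes.insert((2, "2024".into()), 3);
    nodes.insert((3, "q1".into()), 4);
    nodes.insert((4, "report.pdf".into()), 5);

    // Four lookups, mirroring the four queries above
    assert_eq!(resolve(&nodes, "/documents/2024/q1/report.pdf"), Some(5));
    assert_eq!(resolve(&nodes, "/documents/missing"), None);
}
```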

Solution 1: Cache Resolved Paths

Cache resolved paths at the FileStorage layer using CachingResolver:

#![allow(unused)]
fn main() {
let backend = SqliteBackend::open("data.db")?;
let fs = FileStorage::with_resolver(
    backend,
    CachingResolver::new(IterativeResolver, 10_000)  // 10K entry cache
);
}

Cache invalidation: Clear cache entries on rename/remove operations that affect path prefixes.
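A sketch of prefix invalidation with std types; `PathCache` is hypothetical, and CachingResolver's real internals may differ:

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Hypothetical path -> inode cache.
struct PathCache {
    entries: HashMap<PathBuf, i64>,
}

impl PathCache {
    /// Drop the path itself and everything underneath it.
    /// Call on rename/remove of `prefix`.
    fn invalidate_prefix(&mut self, prefix: &Path) {
        self.entries.retain(|path, _| !path.starts_with(prefix));
    }
}

fn main() {
    let mut cache = PathCache { entries: HashMap::new() };
    cache.entries.insert(PathBuf::from("/docs"), 2);
    cache.entries.insert(PathBuf::from("/docs/a.txt"), 3);
    cache.entries.insert(PathBuf::from("/other"), 4);

    // rename("/docs", ...) must invalidate /docs and its descendants
    cache.invalidate_prefix(Path::new("/docs"));
    assert_eq!(cache.entries.len(), 1);
    assert!(cache.entries.contains_key(Path::new("/other")));
}
```

`Path::starts_with` compares whole components, so `/docsx` survives an invalidation of `/docs` - a byte-prefix check would get this wrong.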

Solution 2: Recursive CTE (Single Query)

Resolve entire path in one query using SQLite’s recursive CTE:

WITH RECURSIVE path_walk(depth, inode, name, remaining) AS (
    -- Start at root
    SELECT 0, 1, '', '/documents/2024/q1/report.pdf'
    
    UNION ALL
    
    -- Walk each component
    SELECT 
        pw.depth + 1,
        n.inode,
        n.name,
        substr(pw.remaining, instr(pw.remaining, '/') + 1)
    FROM path_walk pw
    JOIN nodes n ON n.parent = pw.inode 
                AND n.name = substr(
                    pw.remaining, 
                    1, 
                    CASE WHEN instr(pw.remaining, '/') > 0 
                         THEN instr(pw.remaining, '/') - 1 
                         ELSE length(pw.remaining) 
                    END
                )
    WHERE pw.remaining != ''
)
SELECT inode FROM path_walk ORDER BY depth DESC LIMIT 1;

Tradeoff: More complex query, but single round-trip. Best for deep paths without caching.

Solution 3: Full-Path Index (Alternative Schema)

Store full paths as keys for O(1) lookups:

CREATE TABLE nodes (
    path TEXT PRIMARY KEY,  -- '/documents/2024/q1/report.pdf'
    parent_path TEXT NOT NULL,
    name TEXT NOT NULL,
    -- ... other columns
);

CREATE INDEX idx_nodes_parent_path ON nodes(parent_path);

Tradeoff: Instant lookups, but rename() must update all descendants’ paths.
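The rename cost shows up directly in a sketch over a path-keyed map (`rename_prefix` is a hypothetical helper):

```rust
use std::collections::HashMap;

/// Path-keyed table: lookups are O(1), but rename must rewrite
/// every descendant key - the tradeoff noted above.
fn rename_prefix(nodes: &mut HashMap<String, Vec<u8>>, from: &str, to: &str) {
    let keys: Vec<String> = nodes
        .keys()
        .filter(|k| *k == from || k.starts_with(&format!("{}/", from)))
        .cloned()
        .collect();
    for old in keys {
        let new = format!("{}{}", to, &old[from.len()..]);
        if let Some(v) = nodes.remove(&old) {
            nodes.insert(new, v);
        }
    }
}

fn main() {
    let mut nodes = HashMap::new();
    nodes.insert("/docs".to_string(), Vec::new());
    nodes.insert("/docs/a.txt".to_string(), b"a".to_vec());
    nodes.insert("/other".to_string(), Vec::new());

    rename_prefix(&mut nodes, "/docs", "/archive");
    assert!(nodes.contains_key("/archive"));
    assert!(nodes.contains_key("/archive/a.txt"));
    assert!(!nodes.contains_key("/docs"));
    assert_eq!(nodes.len(), 3);
}
```

A rename near the root rewrites the entire subtree's keys - the price paid for O(1) lookups.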

Recommendation

| Workload                    | Best Approach                   |
|-----------------------------|---------------------------------|
| Read-heavy, shallow paths   | Parent/name + basic index       |
| Read-heavy, deep paths      | Parent/name + `CachingResolver` |
| Write-heavy with renames    | Parent/name (rename is O(1))    |
| Read-dominated, few renames | Full-path index                 |

SqliteBackend Schema

The anyfs-sqlite crate stores everything in SQLite, including file content:

CREATE TABLE nodes (
    inode       INTEGER PRIMARY KEY,
    parent      INTEGER NOT NULL,
    name        TEXT NOT NULL,
    node_type   INTEGER NOT NULL,  -- 0=file, 1=dir, 2=symlink
    content     BLOB,              -- File content (inline)
    target      TEXT,              -- Symlink target
    size        INTEGER NOT NULL DEFAULT 0,
    permissions INTEGER NOT NULL DEFAULT 420,  -- 0o644
    nlink       INTEGER NOT NULL DEFAULT 1,
    created_at  INTEGER NOT NULL,
    modified_at INTEGER NOT NULL,
    accessed_at INTEGER NOT NULL,
    
    UNIQUE(parent, name)
);

-- Root directory
INSERT INTO nodes (inode, parent, name, node_type, created_at, modified_at, accessed_at)
VALUES (1, 1, '', 1, unixepoch(), unixepoch(), unixepoch());

-- Indexes
CREATE INDEX idx_nodes_parent ON nodes(parent);
CREATE INDEX idx_nodes_parent_name ON nodes(parent, name);

Key design choices:

  • Inline BLOBs: Simple, portable, single-file backup
  • Integer node_type: Faster comparison than TEXT
  • Parent/name unique: Enforces filesystem semantics at database level

BLOB Storage Strategies

Inline (SqliteBackend)

All content stored in nodes.content column:

| Pros | Cons |
|---|---|
| Single-file portability | Memory pressure for large files |
| Atomic operations | SQLite page overhead for small files |
| Simple backup/restore | WAL growth during large writes |

Best for: Files <10MB, portability-focused use cases.

External (IndexedBackend)

Content stored as files, SQLite holds only metadata:

| Pros | Cons |
|---|---|
| Native streaming I/O | Two-component backup |
| No memory pressure | Blob/index consistency risk |
| Efficient for large files | More complex implementation |

Best for: Large files, media libraries, streaming workloads.

Hybrid Approach (Future Consideration)

Inline small files, external for large:

#![allow(unused)]
fn main() {
const INLINE_THRESHOLD: usize = 64 * 1024;  // 64KB

fn store_content(&self, data: &[u8]) -> Result<ContentRef, FsError> {
    if data.len() <= INLINE_THRESHOLD {
        Ok(ContentRef::Inline(data.to_vec()))
    } else {
        let blob_id = self.blob_store.put(data)?;
        Ok(ContentRef::External(blob_id))
    }
}
}

Tradeoff: Best of both worlds, but adds schema complexity.


When to Outgrow SQLite

From real-world experience:

“We eventually migrated, not because SQLite failed, but because our product changed. We added features that created heavier concurrent writes. That is when a single file stops being an advantage and starts being a ceiling.”

SQLite Works Well For

  • Read-heavy workloads (feeds, search, file serving)
  • Single-process applications
  • Embedded/desktop applications
  • Development and testing
  • Workloads up to ~25k requests/minute (read-dominated)

Consider Migration When

| Signal | What It Means |
|---|---|
| Write contention dominates | Queue depth grows, latency spikes |
| Multi-process writes needed | SQLite’s single-writer limit |
| Horizontal scaling required | SQLite can’t distribute |
| Real-time sync across nodes | No built-in replication |

Migration Path

  1. Abstract early: Use AnyFS traits so backends are swappable
  2. Measure first: Profile before assuming SQLite is the bottleneck
  3. Consider IndexedBackend: External blobs reduce SQLite pressure
  4. Postgres/MySQL: When you truly need concurrent writes

Key insight: The architecture patterns (write batching, connection pooling, caching) transfer to any database. SQLite teaches discipline that scales.


Summary Checklist

Before deploying SQLite backend to production:

Architecture:

  • Single-writer queue implemented (“one door” principle)
  • Connection pool for readers (4-8 connections)
  • Write batching enabled (batch size + timeout flush)
  • Path resolution strategy chosen (caching, CTE, or full-path)

Configuration:

  • WAL mode enabled (PRAGMA journal_mode = WAL)
  • synchronous = FULL (safe default, opt-in to NORMAL)
  • Cache size tuned for dataset (default 32MB)
  • Busy timeout configured (5+ seconds)
  • Auto-vacuum configured (INCREMENTAL)
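
The configuration items above can be applied in one pragma block at connection open. The values shown match the checklist defaults (tune after measuring):

```sql
PRAGMA journal_mode = WAL;         -- write-ahead logging
PRAGMA synchronous = FULL;         -- safe default; opt in to NORMAL after measuring
PRAGMA cache_size = -32768;        -- negative means KiB, so 32 MB
PRAGMA busy_timeout = 5000;        -- milliseconds
PRAGMA auto_vacuum = INCREMENTAL;  -- must be set before tables exist (or requires VACUUM)
```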

Indexes:

  • Parent/name composite index for path lookups
  • Indexes match actual query patterns (measure first!)
  • Partial indexes for GC queries

Operations:

  • Backup strategy in place (online backup API)
  • Monitoring for WAL size, queue depth, cache hit ratio
  • Integrity checks scheduled (weekly/monthly)
  • Migration path for schema changes

“SQLite did not scale our app. Measurement, batching, and restraint did.”

Security Model

Threat modeling, encryption, and security hardening for AnyFS deployments

This guide covers security considerations for deploying AnyFS-based filesystems, from single-user local use to multi-tenant cloud services.


Threat Model

Actors

| Actor | Description | Trust Level |
|---|---|---|
| User | Legitimate filesystem user | Trusted for their data |
| Other User | Another tenant (multi-tenant) | Untrusted (isolation required) |
| Operator | System administrator | Trusted for ops, not data |
| Attacker | External malicious actor | Untrusted |
| Compromised Host | Server with attacker access | Assume worst case |

Assets to Protect

| Asset | Confidentiality | Integrity | Availability |
|---|---|---|---|
| File contents | High | High | High |
| File metadata (names, sizes) | Medium | High | High |
| Directory structure | Medium | High | Medium |
| Encryption keys | Critical | Critical | High |
| Audit logs | Medium | Critical | Medium |
| User credentials | Critical | Critical | High |

Attack Vectors

| Vector | Mitigation |
|---|---|
| Network interception | TLS for all traffic |
| Unauthorized access | Authentication + authorization |
| Data theft (at rest) | Encryption (SQLCipher) |
| Data theft (in memory) | Memory protection, key isolation |
| Tenant data leakage | Strict isolation, no cross-tenant dedup |
| Path traversal | PathFilter middleware, input validation |
| Denial of service | Rate limiting, quotas |
| Privilege escalation | Principle of least privilege |
| Audit tampering | Append-only logs, signatures |

Encryption at Rest

SQLCipher Integration

For encrypted SQLite backends, use SQLCipher:

#![allow(unused)]
fn main() {
use rusqlite::Connection;

pub struct EncryptedSqliteBackend {
    conn: Connection,
}

impl EncryptedSqliteBackend {
    /// Open an encrypted database.
    ///
    /// # Security Notes
    /// - Key should be 256 bits of cryptographically random data
    /// - Or use a strong passphrase with proper key derivation
    pub fn open(path: &Path, key: &EncryptionKey) -> Result<Self, FsError> {
        let conn = Connection::open(path)
            .map_err(|e| FsError::Backend(e.to_string()))?;

        // Apply encryption key (MUST be first operation)
        match key {
            EncryptionKey::Raw(bytes) => {
                // Raw 256-bit key (hex encoded for SQLCipher)
                let hex_key = hex::encode(bytes);
                conn.execute_batch(&format!("PRAGMA key = \"x'{}'\";", hex_key))?;
            }
            EncryptionKey::Passphrase(pass) => {
                // Passphrase (SQLCipher uses PBKDF2 internally)
                conn.execute_batch(&format!("PRAGMA key = '{}';", escape_sql(pass)))?;
            }
        }

        // Verify encryption is working
        conn.execute_batch("SELECT count(*) FROM sqlite_master;")
            .map_err(|_| FsError::InvalidPassword)?;

        // Configure after key is set
        conn.execute_batch("
            PRAGMA journal_mode = WAL;
            PRAGMA synchronous = FULL;
        ")?;

        Ok(Self { conn })
    }
}

pub enum EncryptionKey {
    /// Raw 256-bit key (32 bytes)
    Raw([u8; 32]),
    /// Passphrase (key derived via PBKDF2)
    Passphrase(String),
}
}

Key Derivation

For passphrase-based keys, SQLCipher uses PBKDF2 internally. For custom key derivation:

#![allow(unused)]
fn main() {
use argon2::Argon2;
use rand::rngs::OsRng;
use rand::RngCore;

/// Derive a 256-bit key from a passphrase.
pub fn derive_key(passphrase: &str, salt: &[u8]) -> [u8; 32] {
    let argon2 = Argon2::default();
    let mut key = [0u8; 32];

    argon2.hash_password_into(
        passphrase.as_bytes(),
        salt,
        &mut key,
    ).expect("key derivation failed");

    key
}

/// Generate a random salt for key derivation.
pub fn generate_salt() -> [u8; 16] {
    let mut salt = [0u8; 16];
    OsRng.fill_bytes(&mut salt);
    salt
}
}

Salt storage: Store salt separately from encrypted data (e.g., in a key management service or config file).


Key Management

Key Lifecycle

┌─────────┐     ┌─────────┐     ┌─────────┐     ┌─────────┐
│ Generate│ ──→ │  Store  │ ──→ │   Use   │ ──→ │ Rotate  │
└─────────┘     └─────────┘     └─────────┘     └─────────┘
                                                     │
                                                     ▼
                                               ┌─────────┐
                                               │ Destroy │
                                               └─────────┘

Key Generation

#![allow(unused)]
fn main() {
use rand::rngs::OsRng;
use rand::RngCore;

/// Generate a cryptographically secure 256-bit key.
pub fn generate_key() -> [u8; 32] {
    let mut key = [0u8; 32];
    OsRng.fill_bytes(&mut key);
    key
}

/// Generate a key ID for tracking.
pub fn generate_key_id() -> String {
    let mut id = [0u8; 16];
    OsRng.fill_bytes(&mut id);
    format!("key_{}", hex::encode(id))
}
}

Key Storage

Never store keys:

  • In source code
  • In plain text config files
  • In the same location as encrypted data
  • In environment variables (visible in process lists)

Recommended storage:

| Environment | Solution |
|---|---|
| Development | File with restricted permissions (0600) |
| Production (cloud) | KMS (AWS KMS, GCP KMS, Azure Key Vault) |
| Production (on-prem) | HSM or dedicated secrets manager |
| User devices | OS keychain (macOS Keychain, Windows Credential Manager) |

#![allow(unused)]
fn main() {
/// Key storage abstraction.
pub trait KeyStore: Send + Sync {
    /// Retrieve a key by ID.
    fn get_key(&self, key_id: &str) -> Result<[u8; 32], KeyError>;

    /// Store a new key, returns key ID.
    fn store_key(&self, key: &[u8; 32]) -> Result<String, KeyError>;

    /// Delete a key (after rotation).
    fn delete_key(&self, key_id: &str) -> Result<(), KeyError>;

    /// List all key IDs.
    fn list_keys(&self) -> Result<Vec<String>, KeyError>;
}

/// AWS KMS implementation.
pub struct AwsKmsKeyStore {
    client: aws_sdk_kms::Client,
    master_key_id: String,
}

impl KeyStore for AwsKmsKeyStore {
    fn get_key(&self, key_id: &str) -> Result<[u8; 32], KeyError> {
        // Keys are stored encrypted in DynamoDB/S3; decrypt using KMS.
        let encrypted = self.fetch_encrypted_key(key_id)?;

        // The AWS SDK is async; this sync trait bridges by blocking
        // on the current Tokio runtime.
        let decrypted = tokio::runtime::Handle::current()
            .block_on(
                self.client.decrypt()
                    .key_id(&self.master_key_id)
                    .ciphertext_blob(Blob::new(encrypted))
                    .send(),
            )
            .map_err(|e| KeyError::Backend(e.to_string()))?;  // illustrative error variant

        let plaintext = decrypted.plaintext()
            .ok_or_else(|| KeyError::Backend("KMS returned no plaintext".into()))?;
        let mut key = [0u8; 32];
        key.copy_from_slice(&plaintext.as_ref()[..32]);
        Ok(key)
    }

    // ... other methods
}
}

Key Rotation

Regular key rotation limits damage from key compromise:

#![allow(unused)]
fn main() {
impl EncryptedSqliteBackend {
    /// Rotate encryption key.
    ///
    /// This re-encrypts the entire database with a new key.
    /// Can take a long time for large databases.
    pub fn rotate_key(&self, new_key: &EncryptionKey) -> Result<(), FsError> {
        let new_key_sql = match new_key {
            EncryptionKey::Raw(bytes) => format!("\"x'{}'\"", hex::encode(bytes)),
            EncryptionKey::Passphrase(pass) => format!("'{}'", escape_sql(pass)),
        };

        // SQLCipher's PRAGMA rekey re-encrypts the database
        self.conn.execute_batch(&format!("PRAGMA rekey = {};", new_key_sql))
            .map_err(|e| FsError::Backend(format!("key rotation failed: {}", e)))?;

        Ok(())
    }
}

/// Key rotation schedule.
pub struct KeyRotationPolicy {
    /// Maximum age of a key before rotation.
    pub max_key_age: Duration,
    /// Maximum amount of data encrypted with one key.
    pub max_data_encrypted: u64,
    /// Whether to auto-rotate.
    pub auto_rotate: bool,
}

impl Default for KeyRotationPolicy {
    fn default() -> Self {
        Self {
            max_key_age: Duration::from_secs(90 * 24 * 60 * 60),  // 90 days
            max_data_encrypted: 1024 * 1024 * 1024 * 100,         // 100 GB
            auto_rotate: true,
        }
    }
}
}

Rotation workflow:

  1. Generate new key
  2. Store new key in key store
  3. Re-encrypt database with PRAGMA rekey
  4. Update key ID reference
  5. Audit log the rotation
  6. After retention period, delete old key

Multi-Tenant Isolation

Isolation Strategies

| Strategy | Isolation Level | Complexity | Use Case |
|---|---|---|---|
| Separate databases | Strongest | Low | Few large tenants |
| Separate tables | Strong | Medium | Many small tenants |
| Row-level | Moderate | High | Shared infrastructure |

Recommendation: Separate databases (one SQLite file per tenant).

Per-Tenant Keys

Each tenant should have their own encryption key:

#![allow(unused)]
fn main() {
pub struct MultiTenantBackend {
    key_store: Arc<dyn KeyStore>,
    tenant_backends: RwLock<HashMap<TenantId, Arc<EncryptedSqliteBackend>>>,
}

impl MultiTenantBackend {
    /// Get or create backend for a tenant.
    pub fn get_tenant(&self, tenant_id: &TenantId) -> Result<Arc<EncryptedSqliteBackend>, FsError> {
        // Check cache
        {
            let backends = self.tenant_backends.read().unwrap();
            if let Some(backend) = backends.get(tenant_id) {
                return Ok(backend.clone());
            }
        }

        // Create new backend
        let key = self.key_store.get_key(&tenant_id.key_id())?;
        let path = self.tenant_db_path(tenant_id);
        let backend = Arc::new(EncryptedSqliteBackend::open(&path, &EncryptionKey::Raw(key))?);

        // Cache it
        let mut backends = self.tenant_backends.write().unwrap();
        backends.insert(tenant_id.clone(), backend.clone());

        Ok(backend)
    }
}
}

Cross-Tenant Dedup Considerations

Warning: Cross-tenant deduplication can leak information.

Tenant A uploads secret.pdf (hash: abc123)
Tenant B uploads the same file → upload completes instantly (deduplicated) → B can infer another tenant already stores that file

Options:

| Approach | Dedup Savings | Privacy |
|---|---|---|
| No cross-tenant dedup | None | Full privacy |
| Convergent encryption | Partial | Leaks file existence |
| Per-tenant keys before hash | None | Full privacy |

Recommendation: Only deduplicate within a tenant, not across tenants.

#![allow(unused)]
fn main() {
// Pattern for any hybrid backend (IndexedBackend or custom implementations)
// See hybrid-backend-design.md for the full pattern
impl IndexedBackend {
    fn blob_id_for_tenant(&self, tenant_id: &TenantId, data: &[u8]) -> String {
        // Include tenant ID in hash to prevent cross-tenant dedup
        let mut hasher = Sha256::new();
        hasher.update(tenant_id.as_bytes());
        hasher.update(data);
        hex::encode(hasher.finalize())
    }
}
}

Audit Logging

What to Log

| Event | Severity | Data to Capture |
|---|---|---|
| File read | Info | path, user, timestamp, size |
| File write | Info | path, user, timestamp, size, hash |
| File delete | Warning | path, user, timestamp |
| Permission change | Warning | path, user, old/new perms |
| Login success | Info | user, IP, timestamp |
| Login failure | Warning | user, IP, timestamp, reason |
| Key rotation | Critical | key_id, user, timestamp |
| Admin action | Critical | action, user, timestamp |

Audit Log Schema

CREATE TABLE audit_log (
    seq         INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp   INTEGER NOT NULL,
    event_type  TEXT NOT NULL,
    severity    TEXT NOT NULL,  -- 'info', 'warning', 'critical'
    actor       TEXT,           -- user ID or 'system'
    actor_ip    TEXT,
    resource    TEXT,           -- path or resource ID
    action      TEXT NOT NULL,
    details     TEXT,           -- JSON
    signature   BLOB            -- HMAC for tamper detection
);

CREATE INDEX idx_audit_timestamp ON audit_log(timestamp);
CREATE INDEX idx_audit_actor ON audit_log(actor);
CREATE INDEX idx_audit_resource ON audit_log(resource);

Tamper-Evident Logging

Sign audit entries to detect tampering:

#![allow(unused)]
fn main() {
use hmac::{Hmac, Mac};
use sha2::Sha256;

type HmacSha256 = Hmac<Sha256>;

pub struct AuditLogger {
    conn: Connection,
    signing_key: [u8; 32],
    prev_signature: RwLock<Vec<u8>>,  // Chain signatures
}

impl AuditLogger {
    pub fn log(&self, event: AuditEvent) -> Result<(), FsError> {
        let timestamp = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .unwrap()
            .as_secs() as i64;

        let details = serde_json::to_string(&event.details)?;

        // Create signature (includes previous signature for chaining)
        let prev_sig = self.prev_signature.read().unwrap().clone();
        let signature = self.sign_entry(timestamp, &event, &details, &prev_sig);

        self.conn.execute(
            "INSERT INTO audit_log (timestamp, event_type, severity, actor, actor_ip, resource, action, details, signature)
             VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
            params![
                timestamp,
                event.event_type,
                event.severity,
                event.actor,
                event.actor_ip,
                event.resource,
                event.action,
                details,
                &signature[..],
            ],
        )?;

        // Update chain
        *self.prev_signature.write().unwrap() = signature;

        Ok(())
    }

    fn sign_entry(
        &self,
        timestamp: i64,
        event: &AuditEvent,
        details: &str,
        prev_sig: &[u8],
    ) -> Vec<u8> {
        let mut mac = HmacSha256::new_from_slice(&self.signing_key).unwrap();

        mac.update(&timestamp.to_le_bytes());
        mac.update(event.event_type.as_bytes());
        mac.update(event.action.as_bytes());
        mac.update(details.as_bytes());
        mac.update(prev_sig);  // Chain to previous entry

        mac.finalize().into_bytes().to_vec()
    }

    /// Verify audit log integrity.
    pub fn verify_integrity(&self) -> Result<bool, FsError> {
        let mut prev_sig = Vec::new();

        let mut stmt = self.conn.prepare(
            "SELECT timestamp, event_type, severity, actor, actor_ip, resource, action, details, signature
             FROM audit_log ORDER BY seq"
        )?;

        let rows = stmt.query_map([], |row| {
            Ok(AuditRow {
                timestamp: row.get(0)?,
                event_type: row.get(1)?,
                severity: row.get(2)?,
                actor: row.get(3)?,
                actor_ip: row.get(4)?,
                resource: row.get(5)?,
                action: row.get(6)?,
                details: row.get(7)?,
                signature: row.get(8)?,
            })
        })?;

        for row in rows {
            let row = row?;

            let expected_sig = self.sign_entry(
                row.timestamp,
                &row.to_event(),
                &row.details,
                &prev_sig,
            );

            if expected_sig != row.signature {
                return Ok(false);  // Tampered!
            }

            prev_sig = row.signature;
        }

        Ok(true)
    }
}
}

Audit Log Retention

#![allow(unused)]
fn main() {
impl AuditLogger {
    /// Rotate old audit logs to cold storage.
    pub fn rotate(&self, max_age: Duration) -> Result<RotationStats, FsError> {
        let cutoff = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .unwrap()
            .as_secs() as i64 - max_age.as_secs() as i64;

        // Export old entries to archive
        let old_entries: Vec<AuditRow> = self.conn.prepare(
            "SELECT * FROM audit_log WHERE timestamp < ?"
        )?
        .query_map([cutoff], |row| /* ... */)?
        .collect::<Result<_, _>>()?;

        // Write to archive file (compressed, signed)
        self.write_archive(&old_entries)?;

        // Delete from active log
        let deleted = self.conn.execute(
            "DELETE FROM audit_log WHERE timestamp < ?",
            [cutoff],
        )?;

        Ok(RotationStats { archived: old_entries.len(), deleted })
    }
}
}

Access Control

Path-Based Access Control

Use PathFilterLayer middleware for path-based restrictions:

#![allow(unused)]
fn main() {
use anyfs::{PathFilterLayer};
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate

let backend = SqliteBackend::open("data.db")?
    .layer(PathFilterLayer::builder()
        // Allow specific directories
        .allow("/home/{user}/**")
        .allow("/shared/**")
        // Block sensitive paths
        .deny("**/.env")
        .deny("**/.git/**")
        .deny("**/node_modules/**")
        // Block by extension
        .deny("**/*.key")
        .deny("**/*.pem")
        .build());
}

Role-Based Access Control

Implement RBAC at the application layer:

#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub enum Role {
    Admin,
    ReadWrite,
    ReadOnly,
    Custom(Vec<Permission>),
}

#[derive(Debug, Clone)]
pub enum Permission {
    Read(PathPattern),
    Write(PathPattern),
    Delete(PathPattern),
    Admin,
}

pub struct RbacMiddleware<B> {
    inner: B,
    user_roles: Arc<dyn RoleProvider>,
}

impl<B: FsRead> FsRead for RbacMiddleware<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        let user = current_user()?;
        let role = self.user_roles.get_role(&user)?;

        if !role.can_read(path) {
            return Err(FsError::AccessDenied {
                path: path.to_path_buf(),
                reason: "insufficient permissions".into(),
            });
        }

        self.inner.read(path)
    }
}

impl<B: FsWrite> FsWrite for RbacMiddleware<B> {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        let user = current_user()?;
        let role = self.user_roles.get_role(&user)?;

        if !role.can_write(path) {
            return Err(FsError::AccessDenied {
                path: path.to_path_buf(),
                reason: "write permission denied".into(),
            });
        }

        self.inner.write(path, data)
    }
}
}

Network Security

TLS Configuration

Always use TLS for network communication:

#![allow(unused)]
fn main() {
use tonic::transport::{Server, ServerTlsConfig, Identity, Certificate};

pub async fn serve_with_tls(
    backend: impl Fs,
    addr: &str,
    cert_path: &Path,
    key_path: &Path,
) -> Result<(), Box<dyn Error>> {
    let cert = std::fs::read_to_string(cert_path)?;
    let key = std::fs::read_to_string(key_path)?;

    let identity = Identity::from_pem(cert, key);

    let tls_config = ServerTlsConfig::new()
        .identity(identity);

    Server::builder()
        .tls_config(tls_config)?
        .add_service(FsServiceServer::new(FsServer::new(backend)))
        .serve(addr.parse()?)
        .await?;

    Ok(())
}
}

Client Certificate Authentication (mTLS)

For high-security deployments, require client certificates:

#![allow(unused)]
fn main() {
use tonic::transport::ClientTlsConfig;

pub async fn connect_with_mtls(
    addr: &str,
    ca_cert: &Path,
    client_cert: &Path,
    client_key: &Path,
) -> Result<FsServiceClient<Channel>, Box<dyn Error>> {
    let ca = std::fs::read_to_string(ca_cert)?;
    let cert = std::fs::read_to_string(client_cert)?;
    let key = std::fs::read_to_string(client_key)?;

    let tls_config = ClientTlsConfig::new()
        .ca_certificate(Certificate::from_pem(ca))
        .identity(Identity::from_pem(cert, key));

    let channel = Channel::from_shared(addr.to_string())?
        .tls_config(tls_config)?
        .connect()
        .await?;

    Ok(FsServiceClient::new(channel))
}
}

Security Checklist

Development

  • No secrets in source code
  • No secrets in logs
  • Input validation on all paths
  • Error messages don’t leak sensitive info

Deployment

  • TLS enabled for all network traffic
  • Encryption at rest (SQLCipher)
  • Keys stored in secure key management system
  • Key rotation policy defined and automated
  • Audit logging enabled
  • Rate limiting configured
  • Quotas configured

Operations

  • Regular security audits
  • Vulnerability scanning
  • Audit log review
  • Key rotation executed
  • Backup encryption verified
  • Access reviews (who has what permissions)

Multi-Tenant

  • Tenant isolation verified
  • Per-tenant encryption keys
  • No cross-tenant dedup (or risk accepted)
  • Tenant data segregation in backups

Summary

| Layer | Protection |
|---|---|
| Transport | TLS, mTLS |
| Authentication | Tokens, certificates |
| Authorization | RBAC, PathFilter |
| Data at rest | SQLCipher encryption |
| Key management | KMS, rotation |
| Audit | Tamper-evident logging |
| Isolation | Per-tenant DBs and keys |

Security is not optional. Build it in from the start.

Testing Guide

Comprehensive testing strategy for AnyFS


Overview

AnyFS uses a layered testing approach:

| Layer | What it tests | Run with |
|---|---|---|
| Unit tests | Individual components | cargo test |
| Conformance tests | Backend trait compliance | cargo test --features conformance |
| Integration tests | Full stack behavior | cargo test --test integration |
| Stress tests | Concurrency & limits | cargo test --release -- --ignored |
| Platform tests | Cross-platform behavior | CI matrix |

1. Backend Conformance Tests

Every backend must pass the same conformance suite. This ensures backends are interchangeable.

Running Conformance Tests

#![allow(unused)]
fn main() {
use anyfs_test::{run_conformance_suite, ConformanceLevel};

#[test]
fn memory_backend_conformance() {
    run_conformance_suite(
        MemoryBackend::new(),
        ConformanceLevel::Fs,  // or FsFull, FsFuse, FsPosix
    );
}

#[test]
fn vrootfs_backend_conformance() {
    let temp = tempfile::tempdir().unwrap();
    run_conformance_suite(
        VRootFsBackend::new(temp.path()).unwrap(),
        ConformanceLevel::FsFull,
    );
}
}

Conformance Levels

FsPosix  ──▶ FsHandles, FsLock, FsXattr tests
    │
FsFuse   ──▶ FsInode tests (path_to_inode, lookup, etc.)
    │
FsFull   ──▶ FsLink, FsPermissions, FsSync, FsStats tests
    │
Fs       ──▶ FsRead, FsWrite, FsDir tests (REQUIRED for all)

Core Tests (Fs level)

#![allow(unused)]
fn main() {
#[test]
fn test_write_and_read() {
    let backend = create_backend();

    backend.write(std::path::Path::new("/file.txt"), b"hello world").unwrap();
    let content = backend.read(std::path::Path::new("/file.txt")).unwrap();

    assert_eq!(content, b"hello world");
}

#[test]
fn test_read_nonexistent_returns_not_found() {
    let backend = create_backend();

    let result = backend.read(std::path::Path::new("/nonexistent.txt"));

    assert!(matches!(result, Err(FsError::NotFound { .. })));
}

#[test]
fn test_create_dir_and_list() {
    let backend = create_backend();

    backend.create_dir(std::path::Path::new("/mydir")).unwrap();
    backend.write(std::path::Path::new("/mydir/file.txt"), b"data").unwrap();

    let entries: Vec<_> = backend.read_dir(std::path::Path::new("/mydir")).unwrap()
        .collect::<Result<Vec<_>, _>>().unwrap();
    assert_eq!(entries.len(), 1);
    assert_eq!(entries[0].name, "file.txt");
}

#[test]
fn test_create_dir_all() {
    let backend = create_backend();

    backend.create_dir_all(std::path::Path::new("/a/b/c/d")).unwrap();

    assert!(backend.exists(std::path::Path::new("/a/b/c/d")).unwrap());
}

#[test]
fn test_remove_file() {
    let backend = create_backend();
    backend.write(std::path::Path::new("/file.txt"), b"data").unwrap();

    backend.remove_file(std::path::Path::new("/file.txt")).unwrap();

    assert!(!backend.exists(std::path::Path::new("/file.txt")).unwrap());
}

#[test]
fn test_remove_dir_all() {
    let backend = create_backend();
    backend.create_dir_all(std::path::Path::new("/a/b/c")).unwrap();
    backend.write(std::path::Path::new("/a/b/c/file.txt"), b"data").unwrap();

    backend.remove_dir_all(std::path::Path::new("/a")).unwrap();

    assert!(!backend.exists(std::path::Path::new("/a")).unwrap());
}

#[test]
fn test_rename() {
    let backend = create_backend();
    backend.write(std::path::Path::new("/old.txt"), b"data").unwrap();

    backend.rename(std::path::Path::new("/old.txt"), std::path::Path::new("/new.txt")).unwrap();

    assert!(!backend.exists(std::path::Path::new("/old.txt")).unwrap());
    assert_eq!(backend.read(std::path::Path::new("/new.txt")).unwrap(), b"data");
}

#[test]
fn test_copy() {
    let backend = create_backend();
    backend.write(std::path::Path::new("/original.txt"), b"data").unwrap();

    backend.copy(std::path::Path::new("/original.txt"), std::path::Path::new("/copy.txt")).unwrap();

    assert_eq!(backend.read(std::path::Path::new("/original.txt")).unwrap(), b"data");
    assert_eq!(backend.read(std::path::Path::new("/copy.txt")).unwrap(), b"data");
}

#[test]
fn test_metadata() {
    let backend = create_backend();
    backend.write(std::path::Path::new("/file.txt"), b"hello").unwrap();

    let meta = backend.metadata(std::path::Path::new("/file.txt")).unwrap();

    assert_eq!(meta.size, 5);
    assert!(meta.file_type.is_file());
}

#[test]
fn test_append() {
    let backend = create_backend();
    backend.write(std::path::Path::new("/file.txt"), b"hello").unwrap();

    backend.append(std::path::Path::new("/file.txt"), b" world").unwrap();

    assert_eq!(backend.read(std::path::Path::new("/file.txt")).unwrap(), b"hello world");
}

#[test]
fn test_truncate() {
    let backend = create_backend();
    backend.write(std::path::Path::new("/file.txt"), b"hello world").unwrap();

    backend.truncate(std::path::Path::new("/file.txt"), 5).unwrap();

    assert_eq!(backend.read(std::path::Path::new("/file.txt")).unwrap(), b"hello");
}

#[test]
fn test_read_range() {
    let backend = create_backend();
    backend.write(std::path::Path::new("/file.txt"), b"hello world").unwrap();

    let partial = backend.read_range(std::path::Path::new("/file.txt"), 6, 5).unwrap();

    assert_eq!(partial, b"world");
}
}

Extended Tests (FsFull level)

#![allow(unused)]
fn main() {
#[test]
fn test_symlink() {
    let backend = create_backend();
    backend.write(std::path::Path::new("/target.txt"), b"data").unwrap();

    backend.symlink(std::path::Path::new("/target.txt"), std::path::Path::new("/link.txt")).unwrap();

    // read_link returns the target
    assert_eq!(backend.read_link(std::path::Path::new("/link.txt")).unwrap(), Path::new("/target.txt"));

    // reading the symlink follows it
    assert_eq!(backend.read(std::path::Path::new("/link.txt")).unwrap(), b"data");
}

#[test]
fn test_hard_link() {
    let backend = create_backend();
    backend.write(std::path::Path::new("/original.txt"), b"data").unwrap();

    backend.hard_link(std::path::Path::new("/original.txt"), std::path::Path::new("/hardlink.txt")).unwrap();

    // Both point to same data
    assert_eq!(backend.read(std::path::Path::new("/hardlink.txt")).unwrap(), b"data");

    // Metadata shows nlink > 1
    let meta = backend.metadata(std::path::Path::new("/original.txt")).unwrap();
    assert!(meta.nlink >= 2);
}

#[test]
fn test_symlink_metadata() {
    let backend = create_backend();
    backend.write(std::path::Path::new("/target.txt"), b"data").unwrap();
    backend.symlink(std::path::Path::new("/target.txt"), std::path::Path::new("/link.txt")).unwrap();

    // symlink_metadata returns metadata of the symlink itself
    let meta = backend.symlink_metadata(std::path::Path::new("/link.txt")).unwrap();
    assert!(meta.file_type.is_symlink());
}

#[test]
fn test_set_permissions() {
    let backend = create_backend();
    backend.write(std::path::Path::new("/file.txt"), b"data").unwrap();

    backend.set_permissions(std::path::Path::new("/file.txt"), Permissions::from_mode(0o644)).unwrap();

    let meta = backend.metadata(std::path::Path::new("/file.txt")).unwrap();
    assert_eq!(meta.permissions.mode() & 0o777, 0o644);
}

#[test]
fn test_sync() {
    let backend = create_backend();
    backend.write(std::path::Path::new("/file.txt"), b"data").unwrap();

    // Should not error
    backend.sync().unwrap();
    backend.fsync(std::path::Path::new("/file.txt")).unwrap();
}

#[test]
fn test_statfs() {
    let backend = create_backend();

    let stats = backend.statfs().unwrap();

    // Memory backends may legitimately report 0 total bytes, so we only
    // verify the call succeeds and the field is readable.
    let _ = stats.total_bytes;
}
}

2. Middleware Tests

Each middleware is tested in isolation and in combination.

Quota Tests

#![allow(unused)]
fn main() {
#[test]
fn test_quota_blocks_when_exceeded() {
    let backend = MemoryBackend::new()
        .layer(QuotaLayer::builder().max_total_size(100).build());
    let fs = FileStorage::new(backend);

    let result = fs.write("/big.txt", &[0u8; 200]);

    assert!(matches!(result, Err(FsError::QuotaExceeded { .. })));
}

#[test]
fn test_quota_allows_within_limit() {
    let backend = MemoryBackend::new()
        .layer(QuotaLayer::builder().max_total_size(1000).build());
    let fs = FileStorage::new(backend);

    fs.write("/small.txt", &[0u8; 100]).unwrap();

    assert!(fs.exists("/small.txt").unwrap());
}

#[test]
fn test_quota_tracks_deletes() {
    let backend = MemoryBackend::new()
        .layer(QuotaLayer::builder().max_total_size(100).build());
    let fs = FileStorage::new(backend);

    fs.write("/file.txt", &[0u8; 50]).unwrap();
    fs.remove_file("/file.txt").unwrap();

    // Should be able to write again after delete
    fs.write("/file2.txt", &[0u8; 50]).unwrap();
}

#[test]
fn test_quota_max_file_size() {
    let backend = MemoryBackend::new()
        .layer(QuotaLayer::builder().max_file_size(50).build());
    let fs = FileStorage::new(backend);

    let result = fs.write("/big.txt", &[0u8; 100]);

    assert!(matches!(result, Err(FsError::QuotaExceeded { .. })));
}

#[test]
fn test_quota_streaming_write() {
    let backend = MemoryBackend::new()
        .layer(QuotaLayer::builder().max_total_size(100).build());
    let fs = FileStorage::new(backend);

    let mut writer = fs.open_write("/file.txt").unwrap();
    writer.write_all(&[0u8; 50]).unwrap();
    writer.write_all(&[0u8; 50]).unwrap();
    drop(writer);

    // Next write should fail
    let result = fs.write("/file2.txt", &[0u8; 10]);
    assert!(matches!(result, Err(FsError::QuotaExceeded { .. })));
}
}

Restrictions Tests

#![allow(unused)]
fn main() {
#[test]
fn test_restrictions_blocks_permissions() {
    let backend = MemoryBackend::new()
        .layer(RestrictionsLayer::builder().deny_permissions().build());
    let fs = FileStorage::new(backend);

    fs.write("/file.txt", b"data").unwrap();
    let result = fs.set_permissions("/file.txt", Permissions::from_mode(0o644));

    assert!(matches!(result, Err(FsError::FeatureNotEnabled { .. })));
}

#[test]
fn test_restrictions_allows_links() {
    // Restrictions doesn't block FsLink - capability is via trait bounds
    let backend = MemoryBackend::new()
        .layer(RestrictionsLayer::builder().deny_permissions().build());
    let fs = FileStorage::new(backend);

    fs.write("/target.txt", b"data").unwrap();
    fs.symlink("/target.txt", "/link.txt").unwrap();  // Works - MemoryBackend: FsLink
    fs.hard_link("/target.txt", "/hardlink.txt").unwrap();  // Works too
}

#[test]
fn test_restrictions_blocks_permissions_any_mode() {
    let backend = MemoryBackend::new()
        .layer(RestrictionsLayer::builder().deny_permissions().build());
    let fs = FileStorage::new(backend);

    fs.write("/file.txt", b"data").unwrap();
    let result = fs.set_permissions("/file.txt", Permissions::from_mode(0o777));

    assert!(matches!(result, Err(FsError::FeatureNotEnabled { .. })));
}
}

PathFilter Tests

#![allow(unused)]
fn main() {
#[test]
fn test_pathfilter_allows_matching() {
    let backend = MemoryBackend::new()
        .layer(PathFilterLayer::builder().allow("/workspace/**").build());
    let fs = FileStorage::new(backend);

    fs.create_dir_all("/workspace/project").unwrap();
    fs.write("/workspace/project/file.txt", b"data").unwrap();
}

#[test]
fn test_pathfilter_blocks_non_matching() {
    let backend = MemoryBackend::new()
        .layer(PathFilterLayer::builder().allow("/workspace/**").build());
    let fs = FileStorage::new(backend);

    let result = fs.write("/etc/passwd", b"data");

    assert!(matches!(result, Err(FsError::AccessDenied { .. })));
}

#[test]
fn test_pathfilter_deny_overrides_allow() {
    let backend = MemoryBackend::new()
        .layer(PathFilterLayer::builder()
            .allow("/workspace/**")
            .deny("**/.env")
            .build());
    let fs = FileStorage::new(backend);

    let result = fs.write("/workspace/.env", b"SECRET=xxx");

    assert!(matches!(result, Err(FsError::AccessDenied { .. })));
}

#[test]
fn test_pathfilter_read_dir_filters() {
    let mut inner = MemoryBackend::new();
    inner.write(std::path::Path::new("/workspace/allowed.txt"), b"data").unwrap();
    inner.write(std::path::Path::new("/workspace/.env"), b"secret").unwrap();

    let backend = inner
        .layer(PathFilterLayer::builder()
            .allow("/workspace/**")
            .deny("**/.env")
            .build());
    let fs = FileStorage::new(backend);

    let entries: Vec<_> = fs.read_dir("/workspace").unwrap()
        .collect::<Result<Vec<_>, _>>().unwrap();

    // .env should be filtered out
    assert_eq!(entries.len(), 1);
    assert_eq!(entries[0].name, "allowed.txt");
}
}
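
The deny-overrides-allow precedence these tests rely on is easy to state in isolation. A std-only sketch (the closure-based matcher is illustrative and stands in for the real glob engine):

```rust
// Illustrative precedence check: a path must match an allow rule and
// must not match any deny rule. Deny always wins.
fn permitted(path: &str, allow: impl Fn(&str) -> bool, deny: impl Fn(&str) -> bool) -> bool {
    !deny(path) && allow(path)
}

fn main() {
    // Stand-ins for the globs "/workspace/**" and "**/.env".
    let allow = |p: &str| p.starts_with("/workspace/");
    let deny = |p: &str| p.ends_with("/.env");

    assert!(permitted("/workspace/project/file.txt", allow, deny));
    assert!(!permitted("/etc/passwd", allow, deny)); // no allow rule matches
    assert!(!permitted("/workspace/.env", allow, deny)); // deny overrides allow
    println!("ok");
}
```

Evaluating deny before allow keeps the short-circuit in the safe direction: a path that matches both rule sets is always rejected.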

ReadOnly Tests

#![allow(unused)]
fn main() {
#[test]
fn test_readonly_blocks_writes() {
    let mut inner = MemoryBackend::new();
    inner.write(std::path::Path::new("/file.txt"), b"original").unwrap();

    let backend = ReadOnly::new(inner);
    let fs = FileStorage::new(backend);

    let result = fs.write("/file.txt", b"modified");
    assert!(matches!(result, Err(FsError::ReadOnly { .. })));

    let result = fs.remove_file("/file.txt");
    assert!(matches!(result, Err(FsError::ReadOnly { .. })));
}

#[test]
fn test_readonly_allows_reads() {
    let mut inner = MemoryBackend::new();
    inner.write(std::path::Path::new("/file.txt"), b"data").unwrap();

    let backend = ReadOnly::new(inner);
    let fs = FileStorage::new(backend);

    assert_eq!(fs.read("/file.txt").unwrap(), b"data");
}
}

Middleware Composition Tests

#![allow(unused)]
fn main() {
#[test]
fn test_middleware_composition_order() {
    // Quota inside, Restrictions outside
    let backend = MemoryBackend::new()
        .layer(QuotaLayer::builder().max_total_size(100).build())
        .layer(RestrictionsLayer::builder().deny_permissions().build());

    let fs = FileStorage::new(backend);

    // Write should hit quota
    let result = fs.write("/big.txt", &[0u8; 200]);
    assert!(matches!(result, Err(FsError::QuotaExceeded { .. })));
}

#[test]
fn test_layer_syntax() {
    // All configurable middleware use builder pattern (per ADR-022)
    let backend = MemoryBackend::new()
        .layer(QuotaLayer::builder().max_total_size(1000).build())
        .layer(RestrictionsLayer::builder().deny_permissions().build())
        .layer(TracingLayer::new());  // TracingLayer has sensible defaults

    let fs = FileStorage::new(backend);
    fs.write("/test.txt", b"data").unwrap();
}
}

3. FileStorage Tests

#![allow(unused)]
fn main() {
#[test]
fn test_filestorage_type_inference() {
    // Type should be inferred
    let fs = FileStorage::new(MemoryBackend::new());
    // No explicit type needed
}

#[test]
fn test_filestorage_wrapper_types() {
    // Users who need type-safe domain separation create wrapper types
    struct SandboxFs(FileStorage<MemoryBackend>);
    struct ProductionFs(FileStorage<MemoryBackend>);

    let sandbox = SandboxFs(FileStorage::new(MemoryBackend::new()));
    let prod = ProductionFs(FileStorage::new(MemoryBackend::new()));

    fn only_sandbox(_fs: &SandboxFs) {}

    only_sandbox(&sandbox);  // Compiles
    // only_sandbox(&prod);  // Would not compile - different type
}
}
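
The wrapper-type idea above does not depend on AnyFS at all. A self-contained std-only sketch of the same newtype pattern (`SandboxRoot` and `ProductionRoot` are hypothetical names, with `PathBuf` standing in for a `FileStorage`):

```rust
#![allow(dead_code)]
// The two wrappers are distinct types even though their inner
// representation is identical, so mixing them up is a compile error.
use std::path::PathBuf;

struct SandboxRoot(PathBuf);
struct ProductionRoot(PathBuf);

fn only_sandbox(root: &SandboxRoot) -> &PathBuf {
    &root.0
}

fn main() {
    let sandbox = SandboxRoot(PathBuf::from("/tmp/sandbox"));
    let _prod = ProductionRoot(PathBuf::from("/srv/data"));

    assert_eq!(only_sandbox(&sandbox).to_str(), Some("/tmp/sandbox"));
    // only_sandbox(&_prod); // would not compile: expected `&SandboxRoot`
    println!("ok");
}
```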

4. Integration Tests (Real Filesystem)

Tests that use real filesystem backends (VRootFsBackend, tempfile) are integration tests, not unit tests.

#![allow(unused)]
fn main() {
#[test]
fn test_filestorage_boxed_with_real_fs() {
    // This is an INTEGRATION test - uses real filesystem via tempfile
    let fs1 = FileStorage::new(MemoryBackend::new()).boxed();
    let temp = tempfile::tempdir().unwrap();
    let fs2 = FileStorage::with_resolver(
        VRootFsBackend::new(temp.path()).unwrap(),
        NoOpResolver
    ).boxed();

    // Both can be stored in same collection
    let _filesystems: Vec<FileStorage<Box<dyn Fs>>> = vec![fs1, fs2];
}
}

5. Error Handling Tests

#![allow(unused)]
fn main() {
#[test]
fn test_error_not_found() {
    let fs = FileStorage::new(MemoryBackend::new());

    match fs.read("/nonexistent") {
        Err(FsError::NotFound { path, operation }) => {
            assert_eq!(path, Path::new("/nonexistent"));
            assert_eq!(operation, "read");
        }
        _ => panic!("Expected NotFound error"),
    }
}

#[test]
fn test_error_already_exists() {
    let fs = FileStorage::new(MemoryBackend::new());
    fs.create_dir("/mydir").unwrap();

    match fs.create_dir("/mydir") {
        Err(FsError::AlreadyExists { path, .. }) => {
            assert_eq!(path, Path::new("/mydir"));
        }
        _ => panic!("Expected AlreadyExists error"),
    }
}

#[test]
fn test_error_not_a_directory() {
    let fs = FileStorage::new(MemoryBackend::new());
    fs.write("/file.txt", b"data").unwrap();

    match fs.read_dir("/file.txt") {
        Err(FsError::NotADirectory { path }) => {
            assert_eq!(path, Path::new("/file.txt"));
        }
        _ => panic!("Expected NotADirectory error"),
    }
}

#[test]
fn test_error_directory_not_empty() {
    let fs = FileStorage::new(MemoryBackend::new());
    fs.create_dir("/mydir").unwrap();
    fs.write("/mydir/file.txt", b"data").unwrap();

    match fs.remove_dir("/mydir") {
        Err(FsError::DirectoryNotEmpty { path }) => {
            assert_eq!(path, Path::new("/mydir"));
        }
        _ => panic!("Expected DirectoryNotEmpty error"),
    }
}
}

6. Concurrency Tests

#![allow(unused)]
fn main() {
#[test]
fn test_concurrent_reads() {
    let backend = MemoryBackend::new();
    backend.write(std::path::Path::new("/file.txt"), b"data").unwrap();
    let backend = Arc::new(RwLock::new(backend));

    let handles: Vec<_> = (0..10).map(|_| {
        let backend = Arc::clone(&backend);
        thread::spawn(move || {
            let guard = backend.read().unwrap();
            guard.read(std::path::Path::new("/file.txt")).unwrap()
        })
    }).collect();

    for handle in handles {
        assert_eq!(handle.join().unwrap(), b"data");
    }
}

#[test]
fn test_concurrent_create_dir_all() {
    let backend = Arc::new(Mutex::new(MemoryBackend::new()));

    let handles: Vec<_> = (0..10).map(|_| {
        let backend = Arc::clone(&backend);
        thread::spawn(move || {
            let mut guard = backend.lock().unwrap();
            // Multiple threads creating same path should not race
            guard.create_dir_all(std::path::Path::new("/a/b/c/d")).unwrap();
        })
    }).collect();

    for handle in handles {
        handle.join().unwrap();
    }

    assert!(backend.lock().unwrap().exists(std::path::Path::new("/a/b/c/d")).unwrap());
}

#[test]
#[ignore] // Run with: cargo test --release -- --ignored
fn stress_test_concurrent_operations() {
    let backend = Arc::new(Mutex::new(MemoryBackend::new()));

    let handles: Vec<_> = (0..100).map(|i| {
        let backend = Arc::clone(&backend);
        thread::spawn(move || {
            for j in 0..100 {
                let path = format!("/thread_{}/file_{}.txt", i, j);
                let mut guard = backend.lock().unwrap();
                guard.create_dir_all(std::path::Path::new(&format!("/thread_{}", i))).ok();
                guard.write(std::path::Path::new(&path), b"data").unwrap();
                drop(guard);

                let guard = backend.lock().unwrap();
                let _ = guard.read(std::path::Path::new(&path));
            }
        })
    }).collect();

    for handle in handles {
        handle.join().unwrap();
    }
}
}
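
The `Arc<Mutex<...>>` sharing discipline used above can be exercised without any backend at all. A std-only sketch, with a `HashMap` standing in for the store:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // HashMap<path, contents> stands in for a backend; Arc<Mutex<_>> is
    // the same sharing discipline the concurrency tests above use.
    let store: Arc<Mutex<HashMap<String, Vec<u8>>>> = Arc::new(Mutex::new(HashMap::new()));

    let handles: Vec<_> = (0..10)
        .map(|i| {
            let store = Arc::clone(&store);
            thread::spawn(move || {
                store.lock().unwrap().insert(format!("/file_{i}.txt"), b"data".to_vec());
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    // Every write is visible once all threads have joined.
    assert_eq!(store.lock().unwrap().len(), 10);
    println!("ok");
}
```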

7. Path Edge Case Tests

#![allow(unused)]
fn main() {
#[test]
fn test_path_normalization() {
    let fs = FileStorage::new(MemoryBackend::new());

    fs.write("/a/b/../c/file.txt", b"data").unwrap();

    // Should be accessible via normalized path
    assert_eq!(fs.read("/a/c/file.txt").unwrap(), b"data");
}

#[test]
fn test_double_slashes() {
    let fs = FileStorage::new(MemoryBackend::new());

    fs.write("//a//b//file.txt", b"data").unwrap();

    assert_eq!(fs.read("/a/b/file.txt").unwrap(), b"data");
}

#[test]
fn test_root_path() {
    let fs = FileStorage::new(MemoryBackend::new());

    // ReadDirIter is an iterator, use collect_all() to check contents
    let entries = fs.read_dir("/").unwrap().collect_all().unwrap();
    assert!(entries.is_empty());
}

#[test]
fn test_empty_path_returns_error() {
    let fs = FileStorage::new(MemoryBackend::new());

    let result = fs.read("");
    assert!(result.is_err());
}

#[test]
fn test_unicode_paths() {
    let fs = FileStorage::new(MemoryBackend::new());

    fs.write("/文件/データ.txt", b"data").unwrap();

    assert_eq!(fs.read("/文件/データ.txt").unwrap(), b"data");
}

#[test]
fn test_paths_with_spaces() {
    let fs = FileStorage::new(MemoryBackend::new());

    fs.write("/my folder/my file.txt", b"data").unwrap();

    assert_eq!(fs.read("/my folder/my file.txt").unwrap(), b"data");
}
}
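
These edge cases assume lexical path normalization: `.` and `..` are resolved against the virtual path, never against the host filesystem. A std-only sketch of such a normalizer (illustrative; not the backend's actual routine):

```rust
use std::path::{Component, Path, PathBuf};

// Resolve `.` and `..` purely lexically. `..` at the root is clamped
// rather than escaping, which is what a virtual filesystem wants.
fn normalize(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for comp in path.components() {
        match comp {
            Component::ParentDir => {
                out.pop();
            }
            Component::CurDir => {}
            other => out.push(other.as_os_str()),
        }
    }
    out
}

fn main() {
    assert_eq!(normalize(Path::new("/a/b/../c/file.txt")), PathBuf::from("/a/c/file.txt"));
    // `components()` already collapses duplicate separators.
    assert_eq!(normalize(Path::new("//a//b//file.txt")), PathBuf::from("/a/b/file.txt"));
    println!("ok");
}
```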

8. No-Panic Guarantee Tests

#![allow(unused)]
fn main() {
#[test]
fn no_panic_missing_file() {
    let fs = FileStorage::new(MemoryBackend::new());
    let _ = fs.read("/missing");  // Should return Err, not panic
}

#[test]
fn no_panic_missing_parent() {
    let fs = FileStorage::new(MemoryBackend::new());
    let _ = fs.write("/missing/parent/file.txt", b"data");  // Should return Err
}

#[test]
fn no_panic_read_dir_on_file() {
    let fs = FileStorage::new(MemoryBackend::new());
    fs.write("/file.txt", b"data").unwrap();
    let _ = fs.read_dir("/file.txt");  // Should return Err, not panic
}

#[test]
fn no_panic_remove_nonempty_dir() {
    let fs = FileStorage::new(MemoryBackend::new());
    fs.create_dir("/dir").unwrap();
    fs.write("/dir/file.txt", b"data").unwrap();
    let _ = fs.remove_dir("/dir");  // Should return Err, not panic
}
}

9. Symlink Resolution Tests

#![allow(unused)]
fn main() {
// Virtual backend symlink resolution (always follows for FsLink backends)
#[test]
fn test_virtual_backend_symlink_following() {
    let backend = MemoryBackend::new();
    backend.write(std::path::Path::new("/target.txt"), b"secret").unwrap();
    backend.symlink(std::path::Path::new("/target.txt"), std::path::Path::new("/link.txt")).unwrap();

    assert_eq!(backend.read(std::path::Path::new("/link.txt")).unwrap(), b"secret");
}

#[test]
fn test_symlink_chain_resolution() {
    let backend = MemoryBackend::new();
    backend.write(std::path::Path::new("/target.txt"), b"data").unwrap();
    backend.symlink(std::path::Path::new("/target.txt"), std::path::Path::new("/link1.txt")).unwrap();
    backend.symlink(std::path::Path::new("/link1.txt"), std::path::Path::new("/link2.txt")).unwrap();

    // Should follow chain
    assert_eq!(backend.read(std::path::Path::new("/link2.txt")).unwrap(), b"data");
}

#[test]
fn test_symlink_loop_detection() {
    let backend = MemoryBackend::new();
    backend.symlink(std::path::Path::new("/link2.txt"), std::path::Path::new("/link1.txt")).unwrap();
    backend.symlink(std::path::Path::new("/link1.txt"), std::path::Path::new("/link2.txt")).unwrap();

    let result = backend.read(std::path::Path::new("/link1.txt"));
    assert!(matches!(result, Err(FsError::SymlinkLoop { .. })));
}

#[test]
fn test_virtual_symlink_cannot_escape() {
    let backend = MemoryBackend::new();
    // Create a symlink pointing "outside" - but in virtual backend, paths are just keys
    backend.symlink(std::path::Path::new("../../../etc/passwd"), std::path::Path::new("/link.txt")).unwrap();

    // Reading should fail (target doesn't exist), not read real /etc/passwd
    let result = backend.read(std::path::Path::new("/link.txt"));
    assert!(matches!(result, Err(FsError::NotFound { .. })));
}
}
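
Loop detection is typically implemented with a hop limit rather than explicit cycle tracking. A std-only sketch of bounded resolution over a link table (the limit of 40 mirrors Linux's ELOOP behavior; the real middleware may differ):

```rust
use std::collections::HashMap;

// Follow symlinks until a non-link path is reached, or give up after a
// fixed number of hops. A cycle becomes an error instead of a hang.
fn resolve(links: &HashMap<&str, &str>, start: &str) -> Result<String, &'static str> {
    let mut current = start.to_string();
    for _ in 0..40 {
        match links.get(current.as_str()) {
            Some(target) => current = target.to_string(),
            None => return Ok(current),
        }
    }
    Err("symlink loop")
}

fn main() {
    // Two links pointing at each other: resolution must terminate with an error.
    let mut links = HashMap::new();
    links.insert("/link1.txt", "/link2.txt");
    links.insert("/link2.txt", "/link1.txt");
    assert_eq!(resolve(&links, "/link1.txt"), Err("symlink loop"));

    // A simple chain resolves to its final target.
    let mut chain = HashMap::new();
    chain.insert("/link.txt", "/target.txt");
    assert_eq!(resolve(&chain, "/link.txt"), Ok("/target.txt".to_string()));
    println!("ok");
}
```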

VRootFsBackend Containment Tests

#![allow(unused)]
fn main() {
#[test]
fn test_vroot_prevents_path_traversal() {
    let temp = tempfile::tempdir().unwrap();
    let backend = VRootFsBackend::new(temp.path()).unwrap();
    let fs = FileStorage::new(backend);

    // Attempt to escape via ..
    let result = fs.read("/../../../etc/passwd");
    assert!(matches!(result, Err(FsError::AccessDenied { .. })));
}

#[test]
fn test_vroot_prevents_symlink_escape() {
    let temp = tempfile::tempdir().unwrap();
    std::fs::write(temp.path().join("file.txt"), b"data").unwrap();

    // Create symlink pointing outside the jail
    #[cfg(unix)]
    std::os::unix::fs::symlink("/etc/passwd", temp.path().join("escape")).unwrap();

    let backend = VRootFsBackend::new(temp.path()).unwrap();
    let fs = FileStorage::new(backend);

    // Reading should be blocked by strict-path
    let result = fs.read("/escape");
    assert!(matches!(result, Err(FsError::AccessDenied { .. })));
}

#[test]
fn test_vroot_allows_internal_symlinks() {
    let temp = tempfile::tempdir().unwrap();
    std::fs::write(temp.path().join("target.txt"), b"data").unwrap();

    #[cfg(unix)]
    std::os::unix::fs::symlink("target.txt", temp.path().join("link.txt")).unwrap();

    let backend = VRootFsBackend::new(temp.path()).unwrap();
    let fs = FileStorage::new(backend);

    // Internal symlinks should work
    assert_eq!(fs.read("/link.txt").unwrap(), b"data");
}

#[test]
fn test_vroot_canonicalizes_paths() {
    let temp = tempfile::tempdir().unwrap();
    let backend = VRootFsBackend::new(temp.path()).unwrap();
    let fs = FileStorage::new(backend);

    fs.create_dir("/a").unwrap();
    fs.write("/a/file.txt", b"data").unwrap();

    // Access via normalized path
    assert_eq!(fs.read("/a/../a/./file.txt").unwrap(), b"data");
}
}

10. RateLimit Middleware Tests

#![allow(unused)]
fn main() {
#[test]
fn test_ratelimit_allows_within_limit() {
    let backend = MemoryBackend::new()
        .layer(RateLimitLayer::builder().max_ops(10).per_second().build());
    let fs = FileStorage::new(backend);

    // Should succeed within limit
    for i in 0..5 {
        fs.write(format!("/file{}.txt", i), b"data").unwrap();
    }
}

#[test]
fn test_ratelimit_blocks_when_exceeded() {
    let backend = MemoryBackend::new()
        .layer(RateLimitLayer::builder().max_ops(3).per_second().build());
    let fs = FileStorage::new(backend);

    fs.write("/file1.txt", b"data").unwrap();
    fs.write("/file2.txt", b"data").unwrap();
    fs.write("/file3.txt", b"data").unwrap();

    let result = fs.write("/file4.txt", b"data");
    assert!(matches!(result, Err(FsError::RateLimitExceeded { .. })));
}

#[test]
fn test_ratelimit_resets_after_window() {
    let backend = MemoryBackend::new()
        .layer(RateLimitLayer::builder().max_ops(2).per(Duration::from_millis(100)).build());
    let fs = FileStorage::new(backend);

    fs.write("/file1.txt", b"data").unwrap();
    fs.write("/file2.txt", b"data").unwrap();

    // Wait for window to reset
    std::thread::sleep(Duration::from_millis(150));

    // Should succeed again
    fs.write("/file3.txt", b"data").unwrap();
}

#[test]
fn test_ratelimit_counts_all_operations() {
    let backend = MemoryBackend::new()
        .layer(RateLimitLayer::builder().max_ops(3).per_second().build());
    let fs = FileStorage::new(backend);

    fs.write("/file.txt", b"data").unwrap();  // 1
    let _ = fs.read("/file.txt");              // 2
    let _ = fs.exists("/file.txt");            // 3

    let result = fs.metadata("/file.txt");
    assert!(matches!(result, Err(FsError::RateLimitExceeded { .. })));
}
}
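
The semantics these tests assume are those of a fixed-window counter: up to `max_ops` operations per window, with the counter resetting once the window elapses. A std-only sketch (illustrative; the actual `RateLimitLayer` internals may differ):

```rust
use std::time::{Duration, Instant};

// Fixed-window limiter: count operations since the window started,
// and reset the count when the window has fully elapsed.
struct FixedWindow {
    max_ops: u32,
    window: Duration,
    count: u32,
    window_start: Instant,
}

impl FixedWindow {
    fn new(max_ops: u32, window: Duration) -> Self {
        Self { max_ops, window, count: 0, window_start: Instant::now() }
    }

    fn try_op(&mut self) -> bool {
        if self.window_start.elapsed() >= self.window {
            self.window_start = Instant::now();
            self.count = 0;
        }
        if self.count < self.max_ops {
            self.count += 1;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut limiter = FixedWindow::new(3, Duration::from_millis(50));
    assert!(limiter.try_op() && limiter.try_op() && limiter.try_op());
    assert!(!limiter.try_op()); // fourth op in the window is rejected

    std::thread::sleep(Duration::from_millis(60));
    assert!(limiter.try_op()); // window reset
    println!("ok");
}
```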

11. Tracing Middleware Tests

#![allow(unused)]
fn main() {
use std::sync::{Arc, Mutex};

#[derive(Default)]
struct TestLogger {
    logs: Arc<Mutex<Vec<String>>>,
}

impl TestLogger {
    fn entries(&self) -> Vec<String> {
        self.logs.lock().unwrap().clone()
    }
}

#[test]
fn test_tracing_logs_operations() {
    let logger = TestLogger::default();
    let logs = Arc::clone(&logger.logs);

    let backend = MemoryBackend::new()
        .layer(TracingLayer::new()
            .with_logger(move |op| {
                logs.lock().unwrap().push(op.to_string());
            }));
    let fs = FileStorage::new(backend);

    fs.write("/file.txt", b"data").unwrap();
    fs.read("/file.txt").unwrap();

    let entries = logger.entries();
    assert!(entries.iter().any(|e| e.contains("write")));
    assert!(entries.iter().any(|e| e.contains("read")));
}

#[test]
fn test_tracing_includes_path() {
    let logger = TestLogger::default();
    let logs = Arc::clone(&logger.logs);

    let backend = MemoryBackend::new()
        .layer(TracingLayer::new()
            .with_logger(move |op| {
                logs.lock().unwrap().push(op.to_string());
            }));
    let fs = FileStorage::new(backend);

    fs.write("/important/secret.txt", b"data").unwrap();

    let entries = logger.entries();
    assert!(entries.iter().any(|e| e.contains("/important/secret.txt")));
}

#[test]
fn test_tracing_logs_errors() {
    let logger = TestLogger::default();
    let logs = Arc::clone(&logger.logs);

    let backend = MemoryBackend::new()
        .layer(TracingLayer::new()
            .with_logger(move |op| {
                logs.lock().unwrap().push(op.to_string());
            }));
    let fs = FileStorage::new(backend);

    let _ = fs.read("/nonexistent.txt");

    let entries = logger.entries();
    assert!(entries.iter().any(|e| e.contains("NotFound") || e.contains("error")));
}

#[test]
fn test_tracing_with_span_context() {
    use futures::FutureExt; // for now_or_never()
    use tracing::{info_span, Instrument};

    let backend = MemoryBackend::new().layer(TracingLayer::new());
    let fs = FileStorage::new(backend);

    async {
        fs.write("/async.txt", b"data").unwrap();
    }
    .instrument(info_span!("test_operation"))
    .now_or_never();
}
}

12. Backend Interchangeability Tests

#![allow(unused)]
fn main() {
/// Ensure all backends can be used interchangeably
fn generic_filesystem_test<B: Fs>(mut backend: B) {
    backend.create_dir(std::path::Path::new("/test")).unwrap();
    backend.write(std::path::Path::new("/test/file.txt"), b"hello").unwrap();
    assert_eq!(backend.read(std::path::Path::new("/test/file.txt")).unwrap(), b"hello");
    backend.remove_dir_all(std::path::Path::new("/test")).unwrap();
    assert!(!backend.exists(std::path::Path::new("/test")).unwrap());
}

#[test]
fn test_memory_backend_interchangeable() {
    generic_filesystem_test(MemoryBackend::new());
}

#[test]
fn test_sqlite_backend_interchangeable() {
    let (backend, _temp) = temp_sqlite_backend();
    generic_filesystem_test(backend);
}

#[test]
fn test_vroot_backend_interchangeable() {
    let temp = tempfile::tempdir().unwrap();
    let backend = VRootFsBackend::new(temp.path()).unwrap();
    generic_filesystem_test(backend);
}

#[test]
fn test_middleware_stack_interchangeable() {
    let backend = MemoryBackend::new()
        .layer(QuotaLayer::builder()
            .max_total_size(1024 * 1024)
            .build())
        .layer(TracingLayer::new());
    generic_filesystem_test(backend);
}
}

13. Property-Based Tests

#![allow(unused)]
fn main() {
use proptest::prelude::*;

proptest! {
    #[test]
    fn prop_write_read_roundtrip(data: Vec<u8>) {
        let backend = MemoryBackend::new();
        backend.write(std::path::Path::new("/file.bin"), &data).unwrap();
        let read_data = backend.read(std::path::Path::new("/file.bin")).unwrap();
        prop_assert_eq!(data, read_data);
    }

    #[test]
    fn prop_path_normalization_idempotent(path in "[a-z/]{1,50}") {
        let backend = MemoryBackend::new();
        if let Ok(()) = backend.create_dir_all(std::path::Path::new(&path)) {
            // Creating again should either succeed or return AlreadyExists
            let result = backend.create_dir_all(std::path::Path::new(&path));
            prop_assert!(result.is_ok() || matches!(result, Err(FsError::AlreadyExists { .. })));
        }
    }

    #[test]
    fn prop_quota_never_exceeds_limit(
        file_count in 1..10usize,
        file_sizes in prop::collection::vec(1..100usize, 1..10)
    ) {
        let limit = 500usize;
        let backend = MemoryBackend::new()
            .layer(QuotaLayer::builder().max_total_size(limit as u64).build());
        let fs = FileStorage::new(backend);

        let mut total_written = 0usize;
        for (i, size) in file_sizes.into_iter().take(file_count).enumerate() {
            let data = vec![0u8; size];
            match fs.write(format!("/file{}.txt", i), &data) {
                Ok(()) => total_written += size,
                Err(FsError::QuotaExceeded { .. }) => break,
                Err(e) => panic!("Unexpected error: {:?}", e),
            }
        }
        prop_assert!(total_written <= limit);
    }
}
}

14. Snapshot & Restore Tests

#![allow(unused)]
fn main() {
// MemoryBackend implements Clone - that's the snapshot mechanism
#[test]
fn test_clone_creates_independent_copy() {
    let mut original = MemoryBackend::new();
    original.write(std::path::Path::new("/file.txt"), b"original").unwrap();

    // Clone = snapshot
    let mut snapshot = original.clone();

    // Modify original
    original.write(std::path::Path::new("/file.txt"), b"modified").unwrap();
    original.write(std::path::Path::new("/new.txt"), b"new").unwrap();

    // Snapshot is unchanged
    assert_eq!(snapshot.read(std::path::Path::new("/file.txt")).unwrap(), b"original");
    assert!(!snapshot.exists(std::path::Path::new("/new.txt")).unwrap());
}

#[test]
fn test_checkpoint_and_rollback() {
    let mut fs = MemoryBackend::new();
    fs.write(std::path::Path::new("/important.txt"), b"original").unwrap();

    // Checkpoint = clone
    let checkpoint = fs.clone();

    // Do risky work
    fs.write(std::path::Path::new("/important.txt"), b"corrupted").unwrap();

    // Rollback = replace with checkpoint
    fs = checkpoint;
    assert_eq!(fs.read(std::path::Path::new("/important.txt")).unwrap(), b"original");
}

#[test]
fn test_persistence_roundtrip() {
    let temp = tempfile::tempdir().unwrap();
    let path = temp.path().join("state.bin");

    let fs = MemoryBackend::new();
    fs.write(std::path::Path::new("/data.txt"), b"persisted").unwrap();

    // Save
    fs.save_to(&path).unwrap();

    // Load
    let restored = MemoryBackend::load_from(&path).unwrap();
    assert_eq!(restored.read(std::path::Path::new("/data.txt")).unwrap(), b"persisted");
}

#[test]
fn test_to_bytes_from_bytes() {
    let fs = MemoryBackend::new();
    fs.create_dir_all(std::path::Path::new("/a/b/c")).unwrap();
    fs.write(std::path::Path::new("/a/b/c/file.txt"), b"nested").unwrap();

    let bytes = fs.to_bytes().unwrap();
    let restored = MemoryBackend::from_bytes(&bytes).unwrap();

    assert_eq!(restored.read(std::path::Path::new("/a/b/c/file.txt")).unwrap(), b"nested");
}

#[test]
fn test_from_bytes_invalid_data() {
    let result = MemoryBackend::from_bytes(b"garbage");
    assert!(result.is_err());
}
}
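
Clone-as-snapshot is not specific to `MemoryBackend`: any value whose `Clone` is a deep copy behaves this way. A std-only sketch with a `HashMap` standing in for the backend state:

```rust
use std::collections::HashMap;

fn main() {
    let mut store: HashMap<String, Vec<u8>> = HashMap::new();
    store.insert("/file.txt".into(), b"original".to_vec());

    // Checkpoint = clone the whole state.
    let snapshot = store.clone();

    // Risky work mutates the live store only.
    store.insert("/file.txt".into(), b"modified".to_vec());

    // Rollback = replace the live store with the checkpoint.
    let rolled_back = snapshot;
    assert_eq!(rolled_back["/file.txt"], b"original");
    println!("ok");
}
```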

15. Running Tests

# All tests
cargo test

# Specific backend conformance
cargo test memory_backend_conformance
cargo test sqlite_backend_conformance

# Middleware tests
cargo test quota
cargo test restrictions
cargo test pathfilter

# Stress tests (release mode)
cargo test --release -- --ignored

# With coverage
cargo tarpaulin --out Html

# Cross-platform (CI)
cargo test --target x86_64-unknown-linux-gnu
cargo test --target x86_64-pc-windows-msvc
cargo test --target x86_64-apple-darwin

# WASM
cargo test --target wasm32-unknown-unknown

16. Test Utilities

#![allow(unused)]
fn main() {
// In anyfs-test crate

/// Create a temporary backend for testing
pub fn temp_vrootfs_backend() -> (VRootFsBackend, tempfile::TempDir) {
    let temp = tempfile::tempdir().unwrap();
    let backend = VRootFsBackend::new(temp.path()).unwrap();
    (backend, temp)
}

/// Run a test against multiple backends
pub fn test_all_backends<F>(test: F)
where
    F: Fn(&mut dyn Fs),
{
    // Memory
    let mut backend = MemoryBackend::new();
    test(&mut backend);

    // VRootFs (real filesystem with containment)
    let (mut backend, _temp) = temp_vrootfs_backend();
    test(&mut backend);
}
}

Conformance Test Suite

Verify your backend or middleware works correctly with AnyFS

This document provides a complete test suite skeleton that any backend or middleware implementer can use to verify their implementation meets the AnyFS trait contracts.


Overview

The conformance test suite verifies:

  1. Correctness - Operations behave as specified
  2. Error handling - Correct errors for edge cases
  3. Thread safety - Safe concurrent access
  4. Trait contracts - All trait requirements met

Test Levels

Level   Traits Tested                                When to Use
Core    FsRead, FsWrite, FsDir (= Fs)                All backends
Full    + FsLink, FsPermissions, FsSync, FsStats     Extended backends
Fuse    + FsInode                                    FUSE-mountable backends
Posix   + FsHandles, FsLock, FsXattr                 Full POSIX backends

Quick Start

For Backend Implementers

Add this to your Cargo.toml:

[dev-dependencies]
anyfs-test = "0.1"  # Conformance test suite

Then in your test file:

#![allow(unused)]
fn main() {
use anyfs_test::prelude::*;

// Tell the test suite how to create your backend
fn create_backend() -> MyBackend {
    MyBackend::new()
}

// Run all Fs-level tests
anyfs_test::generate_fs_tests!(create_backend);

// If you implement FsFull traits:
// anyfs_test::generate_fs_full_tests!(create_backend);

// If you implement FsFuse traits:
// anyfs_test::generate_fs_fuse_tests!(create_backend);
}

For Middleware Implementers

#![allow(unused)]
fn main() {
use anyfs_test::prelude::*;
use anyfs::MemoryBackend;

// Wrap MemoryBackend with your middleware
fn create_backend() -> MyMiddleware<MemoryBackend> {
    MyMiddleware::new(MemoryBackend::new())
}

// Run all tests through your middleware
anyfs_test::generate_fs_tests!(create_backend);
}

Core Test Suite (Fs Traits)

Copy this entire module into your test file and customize create_backend().

#![allow(unused)]
fn main() {
//! Conformance tests for Fs trait implementations.
//!
//! To use: implement `create_backend()` and include this module.

use anyfs_backend::{Fs, FsRead, FsWrite, FsDir, FsError, FileType, Metadata, ReadDirIter};
use std::path::Path;
use std::sync::Arc;
use std::thread;

/// Create a fresh backend instance for testing.
/// Implement this for your backend.
fn create_backend() -> impl Fs {
    todo!("Return your backend here")
}

// ============================================================================
// FsRead Tests
// ============================================================================

mod fs_read {
    use super::*;

    #[test]
    fn read_existing_file() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/test.txt"), b"hello world").unwrap();

        let content = fs.read(std::path::Path::new("/test.txt")).unwrap();
        assert_eq!(content, b"hello world");
    }

    #[test]
    fn read_nonexistent_returns_not_found() {
        let fs = create_backend();

        let result = fs.read(std::path::Path::new("/nonexistent.txt"));
        assert!(matches!(result, Err(FsError::NotFound { .. })));
    }

    #[test]
    fn read_directory_returns_not_a_file() {
        let fs = create_backend();
        fs.create_dir(std::path::Path::new("/mydir")).unwrap();

        let result = fs.read(std::path::Path::new("/mydir"));
        assert!(matches!(result, Err(FsError::NotAFile { .. })));
    }

    #[test]
    fn read_to_string_valid_utf8() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/text.txt"), "hello unicode: 你好".as_bytes()).unwrap();

        let content = fs.read_to_string(std::path::Path::new("/text.txt")).unwrap();
        assert_eq!(content, "hello unicode: 你好");
    }

    #[test]
    fn read_to_string_invalid_utf8_returns_error() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/binary.bin"), &[0xFF, 0xFE, 0x00, 0x01]).unwrap();

        let result = fs.read_to_string(std::path::Path::new("/binary.bin"));
        assert!(result.is_err());
    }

    #[test]
    fn read_range_full_file() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/data.bin"), b"0123456789").unwrap();

        let content = fs.read_range(std::path::Path::new("/data.bin"), 0, 10).unwrap();
        assert_eq!(content, b"0123456789");
    }

    #[test]
    fn read_range_partial() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/data.bin"), b"0123456789").unwrap();

        let content = fs.read_range(std::path::Path::new("/data.bin"), 3, 4).unwrap();
        assert_eq!(content, b"3456");
    }

    #[test]
    fn read_range_past_end_returns_available() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/data.bin"), b"0123456789").unwrap();

        let content = fs.read_range(std::path::Path::new("/data.bin"), 8, 100).unwrap();
        assert_eq!(content, b"89");
    }

    #[test]
    fn read_range_offset_past_end_returns_empty() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/data.bin"), b"0123456789").unwrap();

        let content = fs.read_range(std::path::Path::new("/data.bin"), 100, 10).unwrap();
        assert!(content.is_empty());
    }

    #[test]
    fn exists_for_existing_file() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/exists.txt"), b"data").unwrap();

        assert!(fs.exists(std::path::Path::new("/exists.txt")).unwrap());
    }

    #[test]
    fn exists_for_nonexistent_file() {
        let fs = create_backend();

        assert!(!fs.exists(std::path::Path::new("/nonexistent.txt")).unwrap());
    }

    #[test]
    fn exists_for_directory() {
        let fs = create_backend();
        fs.create_dir(std::path::Path::new("/mydir")).unwrap();

        assert!(fs.exists(std::path::Path::new("/mydir")).unwrap());
    }

    #[test]
    fn metadata_for_file() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/file.txt"), b"hello").unwrap();

        let meta = fs.metadata(std::path::Path::new("/file.txt")).unwrap();
        assert_eq!(meta.file_type, FileType::File);
        assert_eq!(meta.size, 5);
    }

    #[test]
    fn metadata_for_directory() {
        let fs = create_backend();
        fs.create_dir(std::path::Path::new("/mydir")).unwrap();

        let meta = fs.metadata(std::path::Path::new("/mydir")).unwrap();
        assert_eq!(meta.file_type, FileType::Directory);
    }

    #[test]
    fn metadata_for_nonexistent_returns_not_found() {
        let fs = create_backend();

        let result = fs.metadata(std::path::Path::new("/nonexistent"));
        assert!(matches!(result, Err(FsError::NotFound { .. })));
    }

    #[test]
    fn open_read_and_consume() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/stream.txt"), b"streaming content").unwrap();

        let mut reader = fs.open_read(std::path::Path::new("/stream.txt")).unwrap();
        let mut buf = Vec::new();
        std::io::Read::read_to_end(&mut reader, &mut buf).unwrap();

        assert_eq!(buf, b"streaming content");
    }
}

// ============================================================================
// FsWrite Tests
// ============================================================================

mod fs_write {
    use super::*;

    #[test]
    fn write_creates_new_file() {
        let fs = create_backend();

        fs.write(std::path::Path::new("/new.txt"), b"new content").unwrap();

        assert!(fs.exists(std::path::Path::new("/new.txt")).unwrap());
        assert_eq!(fs.read(std::path::Path::new("/new.txt")).unwrap(), b"new content");
    }

    #[test]
    fn write_overwrites_existing_file() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/file.txt"), b"original").unwrap();

        fs.write(std::path::Path::new("/file.txt"), b"replaced").unwrap();

        assert_eq!(fs.read(std::path::Path::new("/file.txt")).unwrap(), b"replaced");
    }

    #[test]
    fn write_to_nonexistent_parent_returns_not_found() {
        let fs = create_backend();

        let result = fs.write(std::path::Path::new("/nonexistent/file.txt"), b"data");
        assert!(matches!(result, Err(FsError::NotFound { .. })));
    }

    #[test]
    fn write_empty_file() {
        let fs = create_backend();

        fs.write(std::path::Path::new("/empty.txt"), b"").unwrap();

        assert!(fs.exists(std::path::Path::new("/empty.txt")).unwrap());
        assert_eq!(fs.read(std::path::Path::new("/empty.txt")).unwrap(), b"");
    }

    #[test]
    fn write_binary_data() {
        let fs = create_backend();
        let binary: Vec<u8> = (0..=255).collect();

        fs.write(std::path::Path::new("/binary.bin"), &binary).unwrap();

        assert_eq!(fs.read(std::path::Path::new("/binary.bin")).unwrap(), binary);
    }

    #[test]
    fn append_to_existing_file() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/log.txt"), b"line1\n").unwrap();

        fs.append(std::path::Path::new("/log.txt"), b"line2\n").unwrap();

        assert_eq!(fs.read(std::path::Path::new("/log.txt")).unwrap(), b"line1\nline2\n");
    }

    #[test]
    fn append_to_nonexistent_returns_not_found() {
        let fs = create_backend();

        let result = fs.append(std::path::Path::new("/nonexistent.txt"), b"data");
        assert!(matches!(result, Err(FsError::NotFound { .. })));
    }

    #[test]
    fn remove_file_existing() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/delete-me.txt"), b"bye").unwrap();

        fs.remove_file(std::path::Path::new("/delete-me.txt")).unwrap();

        assert!(!fs.exists(std::path::Path::new("/delete-me.txt")).unwrap());
    }

    #[test]
    fn remove_file_nonexistent_returns_not_found() {
        let fs = create_backend();

        let result = fs.remove_file(std::path::Path::new("/nonexistent.txt"));
        assert!(matches!(result, Err(FsError::NotFound { .. })));
    }

    #[test]
    fn remove_file_on_directory_returns_not_a_file() {
        let fs = create_backend();
        fs.create_dir(std::path::Path::new("/mydir")).unwrap();

        let result = fs.remove_file(std::path::Path::new("/mydir"));
        assert!(matches!(result, Err(FsError::NotAFile { .. })));
    }

    #[test]
    fn rename_file() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/old.txt"), b"content").unwrap();

        fs.rename(std::path::Path::new("/old.txt"), std::path::Path::new("/new.txt")).unwrap();

        assert!(!fs.exists(std::path::Path::new("/old.txt")).unwrap());
        assert!(fs.exists(std::path::Path::new("/new.txt")).unwrap());
        assert_eq!(fs.read(std::path::Path::new("/new.txt")).unwrap(), b"content");
    }

    #[test]
    fn rename_overwrites_destination() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/src.txt"), b"source").unwrap();
        fs.write(std::path::Path::new("/dst.txt"), b"destination").unwrap();

        fs.rename(std::path::Path::new("/src.txt"), std::path::Path::new("/dst.txt")).unwrap();

        assert!(!fs.exists(std::path::Path::new("/src.txt")).unwrap());
        assert_eq!(fs.read(std::path::Path::new("/dst.txt")).unwrap(), b"source");
    }

    #[test]
    fn rename_nonexistent_returns_not_found() {
        let fs = create_backend();

        let result = fs.rename(std::path::Path::new("/nonexistent.txt"), std::path::Path::new("/new.txt"));
        assert!(matches!(result, Err(FsError::NotFound { .. })));
    }

    #[test]
    fn copy_file() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/original.txt"), b"data").unwrap();

        fs.copy(std::path::Path::new("/original.txt"), std::path::Path::new("/copy.txt")).unwrap();

        assert!(fs.exists(std::path::Path::new("/original.txt")).unwrap());
        assert!(fs.exists(std::path::Path::new("/copy.txt")).unwrap());
        assert_eq!(fs.read(std::path::Path::new("/copy.txt")).unwrap(), b"data");
    }

    #[test]
    fn copy_nonexistent_returns_not_found() {
        let fs = create_backend();

        let result = fs.copy(std::path::Path::new("/nonexistent.txt"), std::path::Path::new("/copy.txt"));
        assert!(matches!(result, Err(FsError::NotFound { .. })));
    }

    #[test]
    fn truncate_shrink() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/file.txt"), b"0123456789").unwrap();

        fs.truncate(std::path::Path::new("/file.txt"), 5).unwrap();

        assert_eq!(fs.read(std::path::Path::new("/file.txt")).unwrap(), b"01234");
    }

    #[test]
    fn truncate_expand() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/file.txt"), b"abc").unwrap();

        fs.truncate(std::path::Path::new("/file.txt"), 6).unwrap();

        let content = fs.read(std::path::Path::new("/file.txt")).unwrap();
        assert_eq!(content.len(), 6);
        assert_eq!(&content[..3], b"abc");
        // Expanded bytes should be zero
        assert!(content[3..].iter().all(|&b| b == 0));
    }

    #[test]
    fn truncate_to_zero() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/file.txt"), b"content").unwrap();

        fs.truncate(std::path::Path::new("/file.txt"), 0).unwrap();

        assert_eq!(fs.read(std::path::Path::new("/file.txt")).unwrap(), b"");
    }

    #[test]
    fn open_write_and_close() {
        let fs = create_backend();

        {
            let mut writer = fs.open_write(std::path::Path::new("/stream.txt")).unwrap();
            std::io::Write::write_all(&mut writer, b"streamed").unwrap();
        }

        // Content should be visible after writer is dropped
        assert_eq!(fs.read(std::path::Path::new("/stream.txt")).unwrap(), b"streamed");
    }
}

// ============================================================================
// FsDir Tests
// ============================================================================

mod fs_dir {
    use super::*;

    #[test]
    fn create_dir_single() {
        let fs = create_backend();

        fs.create_dir(std::path::Path::new("/newdir")).unwrap();

        assert!(fs.exists(std::path::Path::new("/newdir")).unwrap());
        let meta = fs.metadata(std::path::Path::new("/newdir")).unwrap();
        assert_eq!(meta.file_type, FileType::Directory);
    }

    #[test]
    fn create_dir_already_exists_returns_error() {
        let fs = create_backend();
        fs.create_dir(std::path::Path::new("/existing")).unwrap();

        let result = fs.create_dir(std::path::Path::new("/existing"));
        assert!(matches!(result, Err(FsError::AlreadyExists { .. })));
    }

    #[test]
    fn create_dir_parent_not_exists_returns_not_found() {
        let fs = create_backend();

        let result = fs.create_dir(std::path::Path::new("/nonexistent/child"));
        assert!(matches!(result, Err(FsError::NotFound { .. })));
    }

    #[test]
    fn create_dir_all_nested() {
        let fs = create_backend();

        fs.create_dir_all(std::path::Path::new("/a/b/c/d")).unwrap();

        assert!(fs.exists(std::path::Path::new("/a")).unwrap());
        assert!(fs.exists(std::path::Path::new("/a/b")).unwrap());
        assert!(fs.exists(std::path::Path::new("/a/b/c")).unwrap());
        assert!(fs.exists(std::path::Path::new("/a/b/c/d")).unwrap());
    }

    #[test]
    fn create_dir_all_partially_exists() {
        let fs = create_backend();
        fs.create_dir(std::path::Path::new("/exists")).unwrap();

        fs.create_dir_all(std::path::Path::new("/exists/new/nested")).unwrap();

        assert!(fs.exists(std::path::Path::new("/exists/new/nested")).unwrap());
    }

    #[test]
    fn create_dir_all_already_exists_is_ok() {
        let fs = create_backend();
        fs.create_dir_all(std::path::Path::new("/a/b/c")).unwrap();

        // Should not error
        fs.create_dir_all(std::path::Path::new("/a/b/c")).unwrap();
    }

    #[test]
    fn read_dir_empty() {
        let fs = create_backend();
        fs.create_dir(std::path::Path::new("/empty")).unwrap();

        let entries: Vec<_> = fs.read_dir(std::path::Path::new("/empty")).unwrap()
            .filter_map(|e| e.ok())
            .collect();

        assert!(entries.is_empty());
    }

    #[test]
    fn read_dir_with_files() {
        let fs = create_backend();
        fs.create_dir(std::path::Path::new("/parent")).unwrap();
        fs.write(std::path::Path::new("/parent/file1.txt"), b"1").unwrap();
        fs.write(std::path::Path::new("/parent/file2.txt"), b"2").unwrap();
        fs.create_dir(std::path::Path::new("/parent/subdir")).unwrap();

        let mut entries: Vec<_> = fs.read_dir(std::path::Path::new("/parent")).unwrap()
            .filter_map(|e| e.ok())
            .collect();
        entries.sort_by(|a, b| a.name.cmp(&b.name));

        assert_eq!(entries.len(), 3);
        assert_eq!(entries[0].name, "file1.txt");
        assert_eq!(entries[0].file_type, FileType::File);
        assert_eq!(entries[1].name, "file2.txt");
        assert_eq!(entries[2].name, "subdir");
        assert_eq!(entries[2].file_type, FileType::Directory);
    }

    #[test]
    fn read_dir_nonexistent_returns_not_found() {
        let fs = create_backend();

        let result = fs.read_dir(std::path::Path::new("/nonexistent"));
        assert!(matches!(result, Err(FsError::NotFound { .. })));
    }

    #[test]
    fn read_dir_on_file_returns_not_a_directory() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/file.txt"), b"data").unwrap();

        let result = fs.read_dir(std::path::Path::new("/file.txt"));
        assert!(matches!(result, Err(FsError::NotADirectory { .. })));
    }

    #[test]
    fn remove_dir_empty() {
        let fs = create_backend();
        fs.create_dir(std::path::Path::new("/todelete")).unwrap();

        fs.remove_dir(std::path::Path::new("/todelete")).unwrap();

        assert!(!fs.exists(std::path::Path::new("/todelete")).unwrap());
    }

    #[test]
    fn remove_dir_not_empty_returns_error() {
        let fs = create_backend();
        fs.create_dir(std::path::Path::new("/notempty")).unwrap();
        fs.write(std::path::Path::new("/notempty/file.txt"), b"data").unwrap();

        let result = fs.remove_dir(std::path::Path::new("/notempty"));
        assert!(matches!(result, Err(FsError::DirectoryNotEmpty { .. })));
    }

    #[test]
    fn remove_dir_nonexistent_returns_not_found() {
        let fs = create_backend();

        let result = fs.remove_dir(std::path::Path::new("/nonexistent"));
        assert!(matches!(result, Err(FsError::NotFound { .. })));
    }

    #[test]
    fn remove_dir_on_file_returns_not_a_directory() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/file.txt"), b"data").unwrap();

        let result = fs.remove_dir(std::path::Path::new("/file.txt"));
        assert!(matches!(result, Err(FsError::NotADirectory { .. })));
    }

    #[test]
    fn remove_dir_all_recursive() {
        let fs = create_backend();
        fs.create_dir_all(std::path::Path::new("/root/a/b")).unwrap();
        fs.write(std::path::Path::new("/root/file.txt"), b"data").unwrap();
        fs.write(std::path::Path::new("/root/a/nested.txt"), b"nested").unwrap();

        fs.remove_dir_all(std::path::Path::new("/root")).unwrap();

        assert!(!fs.exists(std::path::Path::new("/root")).unwrap());
        assert!(!fs.exists(std::path::Path::new("/root/a")).unwrap());
        assert!(!fs.exists(std::path::Path::new("/root/file.txt")).unwrap());
    }

    #[test]
    fn remove_dir_all_nonexistent_returns_not_found() {
        let fs = create_backend();

        let result = fs.remove_dir_all(std::path::Path::new("/nonexistent"));
        assert!(matches!(result, Err(FsError::NotFound { .. })));
    }
}

// ============================================================================
// Edge Case Tests
// ============================================================================

mod edge_cases {
    use super::*;

    #[test]
    fn root_directory_exists() {
        let fs = create_backend();

        assert!(fs.exists(std::path::Path::new("/")).unwrap());
        let meta = fs.metadata(std::path::Path::new("/")).unwrap();
        assert_eq!(meta.file_type, FileType::Directory);
    }

    #[test]
    fn read_dir_root() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/file.txt"), b"data").unwrap();

        let entries: Vec<_> = fs.read_dir(std::path::Path::new("/")).unwrap()
            .filter_map(|e| e.ok())
            .collect();

        assert!(!entries.is_empty());
    }

    #[test]
    fn cannot_remove_root() {
        let fs = create_backend();

        let result = fs.remove_dir(std::path::Path::new("/"));
        assert!(result.is_err());
    }

    #[test]
    fn cannot_remove_root_all() {
        let fs = create_backend();

        let result = fs.remove_dir_all(std::path::Path::new("/"));
        assert!(result.is_err());
    }

    #[test]
    fn file_at_root_level() {
        let fs = create_backend();

        fs.write(std::path::Path::new("/rootfile.txt"), b"at root").unwrap();

        assert!(fs.exists(std::path::Path::new("/rootfile.txt")).unwrap());
        assert_eq!(fs.read(std::path::Path::new("/rootfile.txt")).unwrap(), b"at root");
    }

    #[test]
    fn deeply_nested_path() {
        let fs = create_backend();
        let deep_path = "/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p";

        fs.create_dir_all(std::path::Path::new(deep_path)).unwrap();
        fs.write(std::path::Path::new(&format!("{}/file.txt", deep_path)), b"deep").unwrap();

        assert_eq!(
            fs.read(std::path::Path::new(&format!("{}/file.txt", deep_path))).unwrap(),
            b"deep"
        );
    }

    #[test]
    fn unicode_filename() {
        let fs = create_backend();

        fs.write(std::path::Path::new("/文件.txt"), b"chinese").unwrap();
        fs.write(std::path::Path::new("/файл.txt"), b"russian").unwrap();
        fs.write(std::path::Path::new("/αρχείο.txt"), b"greek").unwrap();

        assert_eq!(fs.read(std::path::Path::new("/文件.txt")).unwrap(), b"chinese");
        assert_eq!(fs.read(std::path::Path::new("/файл.txt")).unwrap(), b"russian");
        assert_eq!(fs.read(std::path::Path::new("/αρχείο.txt")).unwrap(), b"greek");
    }

    #[test]
    fn filename_with_spaces() {
        let fs = create_backend();

        fs.write(std::path::Path::new("/file with spaces.txt"), b"spaced").unwrap();

        assert!(fs.exists(std::path::Path::new("/file with spaces.txt")).unwrap());
        assert_eq!(fs.read(std::path::Path::new("/file with spaces.txt")).unwrap(), b"spaced");
    }

    #[test]
    fn filename_with_special_chars() {
        let fs = create_backend();

        fs.write(std::path::Path::new("/file-name_123.test.txt"), b"special").unwrap();

        assert!(fs.exists(std::path::Path::new("/file-name_123.test.txt")).unwrap());
    }

    #[test]
    fn large_file() {
        let fs = create_backend();
        let large_data: Vec<u8> = (0..1_000_000).map(|i| (i % 256) as u8).collect();

        fs.write(std::path::Path::new("/large.bin"), &large_data).unwrap();

        assert_eq!(fs.read(std::path::Path::new("/large.bin")).unwrap(), large_data);
    }

    #[test]
    fn many_files_in_directory() {
        let fs = create_backend();
        fs.create_dir(std::path::Path::new("/many")).unwrap();

        for i in 0..100 {
            fs.write(std::path::Path::new(&format!("/many/file_{:03}.txt", i)), format!("{}", i).as_bytes()).unwrap();
        }

        let entries: Vec<_> = fs.read_dir(std::path::Path::new("/many")).unwrap()
            .filter_map(|e| e.ok())
            .collect();

        assert_eq!(entries.len(), 100);
    }

    #[test]
    fn overwrite_larger_with_smaller() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/file.txt"), b"this is a longer content").unwrap();

        fs.write(std::path::Path::new("/file.txt"), b"short").unwrap();

        assert_eq!(fs.read(std::path::Path::new("/file.txt")).unwrap(), b"short");
    }

    #[test]
    fn overwrite_smaller_with_larger() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/file.txt"), b"short").unwrap();

        fs.write(std::path::Path::new("/file.txt"), b"this is a longer content").unwrap();

        assert_eq!(fs.read(std::path::Path::new("/file.txt")).unwrap(), b"this is a longer content");
    }
}

// ============================================================================
// Security Tests (Learned from Prior Art Vulnerabilities)
// ============================================================================

mod security {
    use super::*;

    // ------------------------------------------------------------------------
    // Path Traversal Tests (Apache Commons VFS CVE-inspired)
    // ------------------------------------------------------------------------

    #[test]
    fn reject_dotdot_traversal() {
        let fs = create_backend();
        fs.create_dir(std::path::Path::new("/sandbox")).unwrap();
        fs.write(std::path::Path::new("/secret.txt"), b"secret").unwrap();

        // Direct `..` traversal must either be blocked or normalized to
        // /secret.txt (acceptable) -- but in sandboxed backends it must
        // NOT escape the sandbox context.
        let _result = fs.read(std::path::Path::new("/sandbox/../secret.txt"));
    }

    #[test]
    fn reject_url_encoded_dotdot() {
        let fs = create_backend();
        fs.create_dir(std::path::Path::new("/sandbox")).unwrap();

        // URL-encoded path traversal: %2e = '.', %2f = '/'
        // This pattern was behind a path-traversal CVE in Apache Commons VFS
        let result = fs.read(std::path::Path::new("/sandbox/%2e%2e/etc/passwd"));
        assert!(result.is_err(), "URL-encoded path traversal must be rejected");

        // Double-encoded traversal
        let result = fs.read(std::path::Path::new("/sandbox/%252e%252e/etc/passwd"));
        assert!(result.is_err(), "Double URL-encoded traversal must be rejected");
    }

    #[test]
    fn reject_backslash_traversal() {
        let fs = create_backend();
        fs.create_dir(std::path::Path::new("/sandbox")).unwrap();

        // Windows-style path traversal
        let result = fs.read(std::path::Path::new("/sandbox\\..\\secret.txt"));
        assert!(result.is_err(), "Backslash traversal must be rejected");
    }

    #[test]
    fn reject_mixed_slash_traversal() {
        let fs = create_backend();
        fs.create_dir(std::path::Path::new("/sandbox")).unwrap();

        // Mixed forward/backward slashes
        let result = fs.read(std::path::Path::new("/sandbox/..\\..\\secret.txt"));
        assert!(result.is_err(), "Mixed slash traversal must be rejected");

        let result = fs.read(std::path::Path::new("/sandbox\\../secret.txt"));
        assert!(result.is_err(), "Mixed slash traversal must be rejected");
    }

    #[test]
    fn reject_null_byte_injection() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/safe.txt"), b"safe").unwrap();
        fs.write(std::path::Path::new("/safe.txt.bak"), b"backup").unwrap();

        // Null byte injection: /safe.txt\0.bak -> /safe.txt
        let result = fs.read(std::path::Path::new("/safe.txt\0.bak"));
        // Should either reject or not truncate at null
        if let Ok(content) = result {
            // If it succeeds, it should read the full path, not truncated
            assert_ne!(content, b"safe", "Null byte must not truncate path");
        }
    }

    // ------------------------------------------------------------------------
    // Symlink Security Tests (Afero-inspired)
    // ------------------------------------------------------------------------

    #[test]
    fn symlink_cannot_escape_sandbox() {
        // This test is for sandboxed backends (e.g., VRootFs)
        // Regular backends may allow this, which is fine
        let fs = create_backend();

        // Attempt to create a symlink pointing outside the virtual root
        let _result = fs.symlink(std::path::Path::new("/etc/passwd"), std::path::Path::new("/escape_link"));

        // Sandboxed backends MUST reject this; non-sandboxed backends may
        // allow it. The key invariant: reading through the symlink must not
        // expose content outside the sandbox.
    }

    #[test]
    fn symlink_to_absolute_path_outside() {
        let fs = create_backend();
        fs.create_dir(std::path::Path::new("/sandbox")).unwrap();
        fs.write(std::path::Path::new("/sandbox/safe.txt"), b"safe").unwrap();

        // Symlink pointing to absolute path outside sandbox
        // In sandboxed context, this must either:
        // 1. Reject symlink creation, or
        // 2. Resolve relative to sandbox root
        let _result = fs.symlink(std::path::Path::new("/../../../etc/passwd"), std::path::Path::new("/sandbox/link"));
        // Behavior depends on the backend type
    }

    #[test]
    fn relative_symlink_traversal() {
        let fs = create_backend();
        fs.create_dir(std::path::Path::new("/sandbox")).unwrap();
        fs.create_dir(std::path::Path::new("/sandbox/subdir")).unwrap();
        fs.write(std::path::Path::new("/secret.txt"), b"secret outside sandbox").unwrap();

        // Relative symlink that traverses up and out
        let _ = fs.symlink(std::path::Path::new("../../secret.txt"), std::path::Path::new("/sandbox/subdir/link"));

        // If symlink was created, reading through it in a sandboxed
        // context must not expose /secret.txt
    }

    // ------------------------------------------------------------------------
    // Symlink Loop Detection Tests (PyFilesystem2-inspired)
    // ------------------------------------------------------------------------

    #[test]
    fn detect_direct_symlink_loop() {
        let fs = create_backend();

        // Self-referential symlink
        let _ = fs.symlink(std::path::Path::new("/loop"), std::path::Path::new("/loop"));

        // Reading must detect the loop
        let result = fs.read(std::path::Path::new("/loop"));
        // Expected variant: FsError::TooManySymlinks (or NotFound)
        assert!(result.is_err(), "Direct symlink loop must be detected");
    }

    #[test]
    fn detect_indirect_symlink_loop() {
        let fs = create_backend();

        // Two symlinks pointing to each other: a -> b, b -> a
        let _ = fs.symlink(std::path::Path::new("/b"), std::path::Path::new("/a"));
        let _ = fs.symlink(std::path::Path::new("/a"), std::path::Path::new("/b"));

        // Reading either must detect the loop
        let result = fs.read(std::path::Path::new("/a"));
        // Expected variant: FsError::TooManySymlinks
        assert!(result.is_err(), "Indirect symlink loop must be detected");
    }

    #[test]
    fn detect_deep_symlink_chain() {
        let fs = create_backend();

        // Create a long chain of symlinks
        // link_0 -> link_1 -> link_2 -> ... -> link_N
        for i in 0..100 {
            let _ = fs.symlink(
                std::path::Path::new(&format!("/link_{}", i + 1)),
                std::path::Path::new(&format!("/link_{}", i))
            );
        }
        fs.write(std::path::Path::new("/link_100"), b"target").unwrap();

        // Following the chain must terminate: either it succeeds (if the
        // backend allows deep chains) or it fails with TooManySymlinks.
        // It must NOT cause a stack overflow or an infinite loop.
        let _result = fs.read(std::path::Path::new("/link_0"));
    }

    #[test]
    fn symlink_loop_with_directories() {
        let fs = create_backend();
        fs.create_dir(std::path::Path::new("/dir1")).unwrap();
        fs.create_dir(std::path::Path::new("/dir2")).unwrap();

        // Create directory symlink loop
        let _ = fs.symlink(std::path::Path::new("/dir2"), std::path::Path::new("/dir1/link_to_dir2"));
        let _ = fs.symlink(std::path::Path::new("/dir1"), std::path::Path::new("/dir2/link_to_dir1"));

        // Attempting to read a file through the loop
        let result = fs.read(std::path::Path::new("/dir1/link_to_dir2/link_to_dir1/link_to_dir2/file.txt"));
        assert!(result.is_err(), "Directory symlink loop must be detected");
    }

    // ------------------------------------------------------------------------
    // Resource Exhaustion Tests
    // ------------------------------------------------------------------------

    #[test]
    fn reject_excessive_symlink_depth() {
        let fs = create_backend();

        // FUSE typically limits to 40 symlink follows
        // We should have a reasonable limit (e.g., 40-256)
        const MAX_EXPECTED_DEPTH: u32 = 256;

        // Create chain that exceeds expected limit
        for i in 0..MAX_EXPECTED_DEPTH + 10 {
            let _ = fs.symlink(
                std::path::Path::new(&format!("/excessive_{}", i + 1)),
                std::path::Path::new(&format!("/excessive_{}", i))
            );
        }

        // Create actual target
        fs.write(std::path::Path::new(&format!("/excessive_{}", MAX_EXPECTED_DEPTH + 10)), b"data").unwrap();

        // Resolution must be bounded: either it succeeds (the backend
        // allows this depth) or it errors. It must not hang or exhaust
        // memory by following symlinks indefinitely.
        let _result = fs.read(std::path::Path::new("/excessive_0"));
    }

    // ------------------------------------------------------------------------
    // Path Normalization Tests (FileStorage Integration)
    // ------------------------------------------------------------------------
    //
    // NOTE: Path normalization (`.`, `..`, `//`) is handled by FileStorage,
    // NOT by backends. Backends receive already-resolved, clean paths.
    // These tests verify FileStorage + backend work together correctly.
    //
    // See testing-guide.md for the full FileStorage path normalization suite.
    // Backend conformance tests should only use clean paths like "/parent/file.txt".

    #[test]
    fn path_normalization_removes_dots() {
        // Test through FileStorage, not raw backend
        let fs = anyfs::FileStorage::new(create_backend());
        fs.create_dir("/parent").unwrap();
        fs.write("/parent/file.txt", b"content").unwrap();

        // FileStorage normalizes paths before passing to backend
        assert_eq!(fs.read("/parent/./file.txt").unwrap(), b"content");
        assert_eq!(fs.read("/parent/subdir/../file.txt").unwrap(), b"content");
    }

    #[test]
    fn path_normalization_removes_double_slashes() {
        // Test through FileStorage, not raw backend
        let fs = anyfs::FileStorage::new(create_backend());
        fs.write("/file.txt", b"content").unwrap();

        // FileStorage normalizes double slashes
        assert_eq!(fs.read("//file.txt").unwrap(), b"content");
        assert!(fs.read("/parent//file.txt").is_err()); // /parent doesn't exist
    }

    #[test]
    fn trailing_slash_handling() {
        // Test through FileStorage, not raw backend
        let fs = anyfs::FileStorage::new(create_backend());
        fs.create_dir("/mydir").unwrap();
        fs.write("/mydir/file.txt", b"content").unwrap();

        // Directory with trailing slash - FileStorage normalizes
        assert!(fs.exists("/mydir/").unwrap());

        // File with trailing slash - implementation-defined behavior
        // FileStorage may normalize or reject
    }
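
    // A minimal sketch (illustrative only, not the actual FileStorage code) of
    // the lexical normalization these tests rely on: empty segments ("//") and
    // "." collapse away, and ".." pops a segment without climbing above "/".
    fn normalize(path: &str) -> String {
        let mut parts: Vec<&str> = Vec::new();
        for segment in path.split('/') {
            match segment {
                "" | "." => {}
                ".." => { parts.pop(); }
                s => parts.push(s),
            }
        }
        format!("/{}", parts.join("/"))
    }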

    // ------------------------------------------------------------------------
    // Windows-Specific Security Tests (from soft-canonicalize/strict-path)
    // ------------------------------------------------------------------------

    #[test]
    #[cfg(windows)]
    fn reject_ntfs_alternate_data_streams() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/file.txt"), b"main content").unwrap();

        // NTFS ADS: file.txt:hidden_stream
        // Attacker may try to hide data or escape paths via ADS
        let result = fs.read(std::path::Path::new("/file.txt:hidden"));
        assert!(result.is_err(), "NTFS ADS must be rejected");

        let result = fs.read(std::path::Path::new("/file.txt:$DATA"));
        assert!(result.is_err(), "NTFS ADS with $DATA must be rejected");

        let result = fs.read(std::path::Path::new("/file.txt::$DATA"));
        assert!(result.is_err(), "NTFS default stream syntax must be rejected");

        // ADS in directory path (traversal attempt)
        let result = fs.read(std::path::Path::new("/dir:ads/../secret.txt"));
        assert!(result.is_err(), "ADS in directory path must be rejected");
    }

    #[test]
    #[cfg(windows)]
    fn reject_windows_8_3_short_names() {
        let fs = create_backend();
        fs.create_dir(std::path::Path::new("/Program Files")).unwrap();
        fs.write(std::path::Path::new("/Program Files/secret.txt"), b"secret").unwrap();

        // 8.3 short names can be used to obfuscate paths
        // PROGRA~1 is the typical short name for "Program Files"
        // Virtual filesystems should either:
        // 1. Not support 8.3 names at all (reject)
        // 2. Resolve them consistently to the same canonical path

        // Test that we don't accidentally create different files
        let result1 = fs.exists(std::path::Path::new("/Program Files/secret.txt"));
        let result2 = fs.exists(std::path::Path::new("/PROGRA~1/secret.txt"));

        // Either both exist (resolved) or short name doesn't exist (rejected)
        // Key: they must NOT be different files
        if result1.unwrap_or(false) && result2.unwrap_or(false) {
            // If both exist, they must have same content
            let content1 = fs.read(std::path::Path::new("/Program Files/secret.txt")).unwrap();
            let content2 = fs.read(std::path::Path::new("/PROGRA~1/secret.txt")).unwrap();
            assert_eq!(content1, content2, "8.3 names must resolve to same file");
        }
    }

    #[test]
    #[cfg(windows)]
    fn reject_windows_unc_traversal() {
        let fs = create_backend();
        fs.create_dir(std::path::Path::new("/sandbox")).unwrap();

        // Extended-length path prefix traversal
        let result = fs.read(std::path::Path::new("\\\\?\\C:\\..\\..\\etc\\passwd"));
        assert!(result.is_err(), "UNC extended path traversal must be rejected");

        // Device namespace
        let result = fs.read(std::path::Path::new("\\\\.\\C:\\secret.txt"));
        assert!(result.is_err(), "Device namespace paths must be rejected");

        // UNC server path
        let result = fs.read(std::path::Path::new("\\\\server\\share\\..\\..\\secret.txt"));
        assert!(result.is_err(), "UNC server paths must be rejected");
    }

    #[test]
    #[cfg(windows)]
    fn reject_windows_reserved_names() {
        let fs = create_backend();

        // Windows reserved device names (CON, PRN, AUX, NUL, COM1-9, LPT1-9)
        // These can cause hangs or unexpected behavior
        let reserved_names = ["CON", "PRN", "AUX", "NUL", "COM1", "LPT1"];

        for name in reserved_names {
            // Should either reject or handle safely (not hang)
            let _ = fs.write(std::path::Path::new(&format!("/{}", name)), b"data");

            // CON.txt is also problematic on Windows
            let _ = fs.write(std::path::Path::new(&format!("/{}.txt", name)), b"data");
        }
    }
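
    // One possible check a backend could apply (a sketch, not part of the
    // AnyFS API): treat a name as reserved if its stem matches a Windows
    // device name, with or without an extension (CON and CON.txt alike).
    fn is_reserved_name(name: &str) -> bool {
        let stem = name.split('.').next().unwrap_or(name).to_ascii_uppercase();
        matches!(stem.as_str(), "CON" | "PRN" | "AUX" | "NUL")
            || (stem.len() == 4
                && (stem.starts_with("COM") || stem.starts_with("LPT"))
                && matches!(stem.as_bytes()[3], b'1'..=b'9'))
    }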

    #[test]
    #[cfg(windows)]
    fn reject_windows_junction_escape() {
        // Junction points are Windows' equivalent of directory symlinks
        // They can be used for sandbox escape similar to symlinks
        let fs = create_backend();
        fs.create_dir(std::path::Path::new("/sandbox")).unwrap();

        // If backend supports junctions, they must be contained like symlinks
        // The test setup would require actual junction creation capability
        // This documents the requirement even if not all backends support it
    }

    // ------------------------------------------------------------------------
    // Linux-Specific Security Tests (from soft-canonicalize/strict-path)
    // ------------------------------------------------------------------------

    #[test]
    #[cfg(target_os = "linux")]
    fn reject_proc_magic_symlinks() {
        // /proc/PID/root and similar "magic" symlinks can escape namespaces
        // Virtual filesystems wrapping real FS must not follow these
        let fs = create_backend();

        // These paths are only relevant for backends that wrap real filesystem
        // In-memory backends naturally don't have this issue

        // /proc/self/root points to the filesystem root, even in containers
        // Following it would escape chroot/container boundaries
        let _ = fs.read(std::path::Path::new("/proc/self/root/etc/passwd"));
        // Either NotFound (good - path doesn't exist in VFS)
        // or handled safely (doesn't escape actual container)
    }

    #[test]
    #[cfg(target_os = "linux")]
    fn reject_dev_fd_symlinks() {
        let fs = create_backend();

        // /dev/fd/N symlinks to open file descriptors
        // Could be used to access files outside sandbox
        let _ = fs.read(std::path::Path::new("/dev/fd/0"));
        // Should fail or be isolated from real /dev/fd
    }

    // ------------------------------------------------------------------------
    // Unicode Security Tests (from strict-path)
    // ------------------------------------------------------------------------

    #[test]
    fn unicode_normalization_consistency() {
        let fs = create_backend();

        // NFC vs NFD normalization: é can be:
        // - U+00E9 (precomposed, NFC)
        // - U+0065 U+0301 (decomposed, NFD: e + combining acute)
        let nfc = "/caf\u{00E9}.txt";  // precomposed
        let nfd = "/cafe\u{0301}.txt"; // decomposed

        fs.write(std::path::Path::new(nfc), b"coffee").unwrap();

        // If backend normalizes, both should access same file
        // If backend doesn't normalize, second should not exist
        // Key: must NOT create two different files that look identical
        let _nfc_exists = fs.exists(std::path::Path::new(nfc));
        let _nfd_exists = fs.exists(std::path::Path::new(nfd));

        // Document the backend's behavior
        // Either both true (normalized) or only NFC true (strict)
    }

    #[test]
    fn reject_unicode_direction_override() {
        let fs = create_backend();

        // Right-to-Left Override (U+202E) can make paths appear different
        // "secret\u{202E}txt.exe" displays as "secretexe.txt" in some contexts
        let malicious_path = "/secret\u{202E}txt.exe";

        let _ = fs.write(std::path::Path::new(malicious_path), b"data");
        // Should either reject or sanitize bidirectional control characters
    }
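
    // A sketch of one possible sanitizer (not part of the AnyFS API): reject
    // any path containing Unicode bidirectional control characters, which
    // cover both the legacy overrides (U+202A..U+202E) and the newer
    // isolates (U+2066..U+2069).
    fn has_bidi_control(path: &str) -> bool {
        path.chars().any(|c| matches!(c,
            '\u{202A}'..='\u{202E}' | '\u{2066}'..='\u{2069}'))
    }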

    #[test]
    fn reject_unicode_homoglyphs() {
        let fs = create_backend();

        // Cyrillic 'а' (U+0430) looks like Latin 'a' (U+0061)
        let latin_path = "/data/file.txt";
        let cyrillic_path = "/d\u{0430}ta/file.txt"; // Cyrillic 'а'

        fs.create_dir(std::path::Path::new("/data")).unwrap();
        fs.write(std::path::Path::new(latin_path), b"real content").unwrap();

        // These must NOT silently access the same file
        // Either cyrillic path is NotFound, or it's a different file
        let result = fs.read(std::path::Path::new(cyrillic_path));
        if let Ok(_content) = result {
            // If cyrillic path exists, it must be a distinct file
            // (not accidentally matching the latin path)
        }
    }

    #[test]
    fn reject_null_in_unicode() {
        let fs = create_backend();

        // Null can be encoded in various ways
        // UTF-8 null is just 0x00, but check overlong encodings aren't decoded
        let path_with_null = "/file\u{0000}name.txt";

        let result = fs.write(std::path::Path::new(path_with_null), b"data");
        assert!(result.is_err(), "Embedded null must be rejected");
    }

    // ------------------------------------------------------------------------
    // TOCTOU Race Condition Tests (from soft-canonicalize/strict-path)
    // ------------------------------------------------------------------------

    #[test]
    fn toctou_check_then_use() {
        let fs = Arc::new(create_backend());
        fs.create_dir(std::path::Path::new("/uploads")).unwrap();

        // Simulate TOCTOU: check if path is safe, then use it
        // An attacker might change the filesystem between check and use

        let fs_checker = fs.clone();
        let fs_writer = fs.clone();

        // This test documents the requirement for atomic operations
        // or proper locking in security-critical paths

        // Thread 1: Check then write
        let checker = thread::spawn(move || {
            for i in 0..100 {
                let path = format!("/uploads/file_{}.txt", i);
                // Check
                if !fs_checker.exists(std::path::Path::new(&path)).unwrap_or(true) {
                    // Use (potential race window here)
                    let _ = fs_checker.write(std::path::Path::new(&path), b"data");
                }
            }
        });

        // Thread 2: Rapid file creation/deletion
        let writer = thread::spawn(move || {
            for i in 0..100 {
                let path = format!("/uploads/file_{}.txt", i);
                let _ = fs_writer.write(std::path::Path::new(&path), b"attacker");
                let _ = fs_writer.remove_file(std::path::Path::new(&path));
            }
        });

        checker.join().unwrap();
        writer.join().unwrap();

        // Test passes if no panic/crash occurs
        // Real protection requires atomic create-if-not-exists operations
    }

    #[test]
    fn symlink_toctou_during_resolution() {
        let fs = Arc::new(create_backend());
        fs.create_dir(std::path::Path::new("/safe")).unwrap();
        fs.write(std::path::Path::new("/safe/target.txt"), b"safe content").unwrap();
        fs.write(std::path::Path::new("/unsafe.txt"), b"unsafe content").unwrap();

        // Attacker rapidly changes symlink target during path resolution
        let fs_attacker = fs.clone();
        let fs_reader = fs.clone();

        let attacker = thread::spawn(move || {
            for _ in 0..100 {
                // Create symlink to safe target
                let _ = fs_attacker.remove_file(std::path::Path::new("/safe/link.txt"));
                let _ = fs_attacker.symlink(std::path::Path::new("/safe/target.txt"), std::path::Path::new("/safe/link.txt"));

                // Quickly change to unsafe target
                let _ = fs_attacker.remove_file(std::path::Path::new("/safe/link.txt"));
                let _ = fs_attacker.symlink(std::path::Path::new("/unsafe.txt"), std::path::Path::new("/safe/link.txt"));
            }
        });

        let reader = thread::spawn(move || {
            for _ in 0..100 {
                // Try to read through symlink
                // Must not accidentally read /unsafe.txt if sandboxed
                let _ = fs_reader.read(std::path::Path::new("/safe/link.txt"));
            }
        });

        attacker.join().unwrap();
        reader.join().unwrap();

        // For sandboxed backends: must never return content from /unsafe.txt
        // This test verifies the implementation doesn't have TOCTOU in symlink resolution
    }
}

// ============================================================================
// Thread Safety Tests
// ============================================================================

mod thread_safety {
    use super::*;

    #[test]
    fn concurrent_reads() {
        let fs = Arc::new(create_backend());
        fs.write(std::path::Path::new("/shared.txt"), b"shared content").unwrap();

        let handles: Vec<_> = (0..10)
            .map(|_| {
                let fs = fs.clone();
                thread::spawn(move || {
                    for _ in 0..100 {
                        let content = fs.read(std::path::Path::new("/shared.txt")).unwrap();
                        assert_eq!(content, b"shared content");
                    }
                })
            })
            .collect();

        for handle in handles {
            handle.join().unwrap();
        }
    }

    #[test]
    fn concurrent_writes_different_files() {
        let fs = Arc::new(create_backend());

        let handles: Vec<_> = (0..10)
            .map(|i| {
                let fs = fs.clone();
                thread::spawn(move || {
                    let path = format!("/file_{}.txt", i);
                    for j in 0..100 {
                        fs.write(std::path::Path::new(&path), format!("{}:{}", i, j).as_bytes()).unwrap();
                    }
                })
            })
            .collect();

        for handle in handles {
            handle.join().unwrap();
        }

        // Verify all files exist
        for i in 0..10 {
            assert!(fs.exists(std::path::Path::new(&format!("/file_{}.txt", i))).unwrap());
        }
    }

    #[test]
    fn concurrent_create_dir_all_same_path() {
        let fs = Arc::new(create_backend());

        let handles: Vec<_> = (0..10)
            .map(|_| {
                let fs = fs.clone();
                thread::spawn(move || {
                    // All threads try to create the same path
                    let _ = fs.create_dir_all(std::path::Path::new("/a/b/c/d"));
                })
            })
            .collect();

        for handle in handles {
            handle.join().unwrap();
        }

        // Path should exist regardless of race
        assert!(fs.exists(std::path::Path::new("/a/b/c/d")).unwrap());
    }

    #[test]
    fn read_during_write() {
        let fs = Arc::new(create_backend());
        fs.write(std::path::Path::new("/changing.txt"), b"initial").unwrap();

        let fs_writer = fs.clone();
        let writer = thread::spawn(move || {
            for i in 0..100 {
                fs_writer.write(std::path::Path::new("/changing.txt"), format!("version {}", i).as_bytes()).unwrap();
            }
        });

        let fs_reader = fs.clone();
        let reader = thread::spawn(move || {
            for _ in 0..100 {
                // Should not panic or return garbage
                let result = fs_reader.read(std::path::Path::new("/changing.txt"));
                assert!(result.is_ok());
            }
        });

        writer.join().unwrap();
        reader.join().unwrap();
    }

    #[test]
    fn metadata_consistency() {
        let fs = Arc::new(create_backend());
        fs.write(std::path::Path::new("/meta.txt"), b"content").unwrap();

        let handles: Vec<_> = (0..10)
            .map(|_| {
                let fs = fs.clone();
                thread::spawn(move || {
                    for _ in 0..100 {
                        let meta = fs.metadata(std::path::Path::new("/meta.txt")).unwrap();
                        // Size should be consistent
                        assert!(meta.size > 0);
                    }
                })
            })
            .collect();

        for handle in handles {
            handle.join().unwrap();
        }
    }
}

// ============================================================================
// No Panic Tests (Edge Cases That Must Not Crash)
// ============================================================================

mod no_panic {
    use super::*;

    #[test]
    fn empty_path_does_not_panic() {
        let fs = create_backend();

        // These should return errors, not panic
        let _ = fs.read(std::path::Path::new(""));
        let _ = fs.write(std::path::Path::new(""), b"data");
        let _ = fs.metadata(std::path::Path::new(""));
        let _ = fs.exists(std::path::Path::new(""));
        let _ = fs.read_dir(std::path::Path::new(""));
    }

    #[test]
    fn path_with_null_does_not_panic() {
        let fs = create_backend();

        // Paths with null bytes should error or be handled gracefully
        let _ = fs.read(std::path::Path::new("/file\0name.txt"));
        let _ = fs.write(std::path::Path::new("/file\0name.txt"), b"data");
    }

    #[test]
    fn very_long_path_does_not_panic() {
        let fs = create_backend();
        let long_name = "a".repeat(10000);
        let long_path = format!("/{}", long_name);

        // Should error gracefully, not panic
        let _ = fs.write(std::path::Path::new(&long_path), b"data");
        let _ = fs.read(std::path::Path::new(&long_path));
    }

    #[test]
    fn very_long_filename_does_not_panic() {
        let fs = create_backend();
        let long_name = format!("/{}.txt", "x".repeat(1000));

        let _ = fs.write(std::path::Path::new(&long_name), b"data");
    }

    #[test]
    fn read_after_remove_does_not_panic() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/temp.txt"), b"data").unwrap();
        fs.remove_file(std::path::Path::new("/temp.txt")).unwrap();

        // Should return NotFound, not panic
        let result = fs.read(std::path::Path::new("/temp.txt"));
        assert!(matches!(result, Err(FsError::NotFound { .. })));
    }

    #[test]
    fn double_remove_does_not_panic() {
        let fs = create_backend();
        fs.write(std::path::Path::new("/temp.txt"), b"data").unwrap();
        fs.remove_file(std::path::Path::new("/temp.txt")).unwrap();

        // Second remove should error, not panic
        let result = fs.remove_file(std::path::Path::new("/temp.txt"));
        assert!(matches!(result, Err(FsError::NotFound { .. })));
    }
}
}
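
The TOCTOU tests above can only document the race; closing it requires an atomic create-if-not-exists primitive instead of a separate exists-then-write check. For backends wrapping the host filesystem, std already provides this via `OpenOptions::create_new`; a minimal std-only sketch of the pattern (the `create_exclusive` helper is illustrative, not part of the AnyFS API):

```rust
use std::fs::OpenOptions;
use std::io::Write;
use std::path::Path;

// Atomically create a file, failing with AlreadyExists if it is present.
// There is no window between the existence check and the write: the
// kernel performs both as one operation.
fn create_exclusive(path: &Path, data: &[u8]) -> std::io::Result<()> {
    let mut file = OpenOptions::new()
        .write(true)
        .create_new(true) // O_CREAT | O_EXCL semantics
        .open(path)?;
    file.write_all(data)
}

fn main() -> std::io::Result<()> {
    let dir = std::env::temp_dir().join("anyfs_toctou_demo");
    std::fs::create_dir_all(&dir)?;
    let path = dir.join("once.txt");
    let _ = std::fs::remove_file(&path);

    create_exclusive(&path, b"first")?; // first writer wins
    assert!(create_exclusive(&path, b"attacker").is_err()); // atomically rejected
    assert_eq!(std::fs::read(&path)?, b"first");

    std::fs::remove_file(&path)
}
```

In-memory backends can get the same guarantee by holding their internal lock across the lookup-and-insert.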

Extended Test Suite (FsFull Traits)

For backends implementing FsFull:

#![allow(unused)]
fn main() {
mod fs_full {
    use super::*;
    use anyfs_backend::{FsLink, FsPermissions, FsSync, FsStats, Permissions};

    // Only run these if the backend implements FsFull traits
    fn create_full_backend() -> impl Fs + FsLink + FsPermissions + FsSync + FsStats {
        todo!("Return your FsFull backend")
    }

    // ========================================================================
    // FsLink Tests
    // ========================================================================

    mod fs_link {
        use super::*;

        #[test]
        fn create_symlink() {
            let fs = create_full_backend();
            fs.write(std::path::Path::new("/target.txt"), b"target content").unwrap();

            fs.symlink(std::path::Path::new("/target.txt"), std::path::Path::new("/link.txt")).unwrap();

            assert!(fs.exists(std::path::Path::new("/link.txt")).unwrap());
            let meta = fs.symlink_metadata(std::path::Path::new("/link.txt")).unwrap();
            assert_eq!(meta.file_type, FileType::Symlink);
        }

        #[test]
        fn read_symlink() {
            let fs = create_full_backend();
            fs.write(std::path::Path::new("/target.txt"), b"content").unwrap();
            fs.symlink(std::path::Path::new("/target.txt"), std::path::Path::new("/link.txt")).unwrap();

            let target = fs.read_link(std::path::Path::new("/link.txt")).unwrap();
            assert_eq!(target.to_string_lossy(), "/target.txt");
        }

        #[test]
        fn hard_link() {
            let fs = create_full_backend();
            fs.write(std::path::Path::new("/original.txt"), b"shared content").unwrap();

            fs.hard_link(std::path::Path::new("/original.txt"), std::path::Path::new("/hardlink.txt")).unwrap();

            // Both paths should have the same content
            assert_eq!(fs.read(std::path::Path::new("/original.txt")).unwrap(), b"shared content");
            assert_eq!(fs.read(std::path::Path::new("/hardlink.txt")).unwrap(), b"shared content");

            // Modifying one should affect the other
            fs.write(std::path::Path::new("/hardlink.txt"), b"modified").unwrap();
            assert_eq!(fs.read(std::path::Path::new("/original.txt")).unwrap(), b"modified");
        }

        #[test]
        fn symlink_metadata_vs_metadata() {
            let fs = create_full_backend();
            fs.write(std::path::Path::new("/target.txt"), b"content").unwrap();
            fs.symlink(std::path::Path::new("/target.txt"), std::path::Path::new("/link.txt")).unwrap();

            // symlink_metadata returns the symlink's metadata
            let sym_meta = fs.symlink_metadata(std::path::Path::new("/link.txt")).unwrap();
            assert_eq!(sym_meta.file_type, FileType::Symlink);

            // metadata (if it follows symlinks) returns target's metadata
            // Note: behavior depends on implementation
        }
    }

    // ========================================================================
    // FsPermissions Tests
    // ========================================================================

    mod fs_permissions {
        use super::*;

        #[test]
        fn set_permissions() {
            let fs = create_full_backend();
            fs.write(std::path::Path::new("/file.txt"), b"data").unwrap();

            fs.set_permissions(std::path::Path::new("/file.txt"), Permissions::from_mode(0o755)).unwrap();

            let meta = fs.metadata(std::path::Path::new("/file.txt")).unwrap();
            assert_eq!(meta.permissions, Some(0o755));
        }

        #[test]
        fn set_permissions_nonexistent_returns_not_found() {
            let fs = create_full_backend();

            let result = fs.set_permissions(std::path::Path::new("/nonexistent"), Permissions::from_mode(0o644));
            assert!(matches!(result, Err(FsError::NotFound { .. })));
        }
    }

    // ========================================================================
    // FsSync Tests
    // ========================================================================

    mod fs_sync {
        use super::*;

        #[test]
        fn sync_does_not_error() {
            let fs = create_full_backend();
            fs.write(std::path::Path::new("/file.txt"), b"data").unwrap();

            // sync() should complete without error
            fs.sync().unwrap();
        }

        #[test]
        fn fsync_specific_file() {
            let fs = create_full_backend();
            fs.write(std::path::Path::new("/file.txt"), b"data").unwrap();

            fs.fsync(std::path::Path::new("/file.txt")).unwrap();
        }

        #[test]
        fn fsync_nonexistent_returns_not_found() {
            let fs = create_full_backend();

            let result = fs.fsync(std::path::Path::new("/nonexistent.txt"));
            assert!(matches!(result, Err(FsError::NotFound { .. })));
        }
    }

    // ========================================================================
    // FsStats Tests
    // ========================================================================

    mod fs_stats {
        use super::*;

        #[test]
        fn statfs_returns_valid_stats() {
            let fs = create_full_backend();

            let stats = fs.statfs().unwrap();

            // Basic sanity checks
            assert!(stats.block_size > 0);
            // available should not exceed total (if total is reported)
            if stats.total_bytes > 0 {
                assert!(stats.available_bytes <= stats.total_bytes);
            }
        }
    }
}
}

FUSE Test Suite (FsFuse Traits)

For backends implementing FsFuse:

#![allow(unused)]
fn main() {
mod fs_fuse {
    use super::*;
    use anyfs_backend::FsInode;

    fn create_fuse_backend() -> impl Fs + FsInode {
        todo!("Return your FsFuse backend")
    }

    #[test]
    fn path_to_inode_consistency() {
        let fs = create_fuse_backend();
        fs.write(std::path::Path::new("/file.txt"), b"data").unwrap();

        let inode1 = fs.path_to_inode(std::path::Path::new("/file.txt")).unwrap();
        let inode2 = fs.path_to_inode(std::path::Path::new("/file.txt")).unwrap();

        // Same path should always return same inode
        assert_eq!(inode1, inode2);
    }

    #[test]
    fn inode_to_path_roundtrip() {
        let fs = create_fuse_backend();
        fs.write(std::path::Path::new("/file.txt"), b"data").unwrap();

        let inode = fs.path_to_inode(std::path::Path::new("/file.txt")).unwrap();
        let path = fs.inode_to_path(inode).unwrap();

        assert_eq!(path.to_string_lossy(), "/file.txt");
    }

    #[test]
    fn lookup_child() {
        let fs = create_fuse_backend();
        fs.create_dir(std::path::Path::new("/parent")).unwrap();
        fs.write(std::path::Path::new("/parent/child.txt"), b"data").unwrap();

        let parent_inode = fs.path_to_inode(std::path::Path::new("/parent")).unwrap();
        let child_inode = fs.lookup(parent_inode, std::ffi::OsStr::new("child.txt")).unwrap();

        let expected_inode = fs.path_to_inode(std::path::Path::new("/parent/child.txt")).unwrap();
        assert_eq!(child_inode, expected_inode);
    }

    #[test]
    fn metadata_by_inode() {
        let fs = create_fuse_backend();
        fs.write(std::path::Path::new("/file.txt"), b"content").unwrap();

        let inode = fs.path_to_inode(std::path::Path::new("/file.txt")).unwrap();
        let meta = fs.metadata_by_inode(inode).unwrap();

        assert_eq!(meta.file_type, FileType::File);
        assert_eq!(meta.size, 7);
    }

    #[test]
    fn root_inode_is_one() {
        let fs = create_fuse_backend();

        let root_inode = fs.path_to_inode(std::path::Path::new("/")).unwrap();

        // By FUSE convention, root inode is 1
        assert_eq!(root_inode, 1);
    }

    #[test]
    fn different_files_different_inodes() {
        let fs = create_fuse_backend();
        fs.write(std::path::Path::new("/file1.txt"), b"data1").unwrap();
        fs.write(std::path::Path::new("/file2.txt"), b"data2").unwrap();

        let inode1 = fs.path_to_inode(std::path::Path::new("/file1.txt")).unwrap();
        let inode2 = fs.path_to_inode(std::path::Path::new("/file2.txt")).unwrap();

        assert_ne!(inode1, inode2);
    }

    #[test]
    fn hard_links_same_inode() {
        let fs = create_fuse_backend();
        fs.write(std::path::Path::new("/original.txt"), b"data").unwrap();
        fs.hard_link(std::path::Path::new("/original.txt"), std::path::Path::new("/link.txt")).unwrap();

        let inode1 = fs.path_to_inode(std::path::Path::new("/original.txt")).unwrap();
        let inode2 = fs.path_to_inode(std::path::Path::new("/link.txt")).unwrap();

        // Hard links must share the same inode
        assert_eq!(inode1, inode2);
    }
}
}
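
These tests pin down the invariants an FsInode implementation must keep: root is inode 1, mappings are stable, and distinct files get distinct inodes. One way to satisfy them (a hypothetical sketch, not the anyfs implementation) is a bidirectional table:

```rust
use std::collections::HashMap;

// Hypothetical inode allocator: "/" is pinned to inode 1 per FUSE
// convention, and every other path gets a stable, unique number.
struct InodeTable {
    next: u64,
    by_path: HashMap<String, u64>,
    by_inode: HashMap<u64, String>,
}

impl InodeTable {
    fn new() -> Self {
        let mut table = InodeTable {
            next: 2, // 1 is reserved for the root
            by_path: HashMap::new(),
            by_inode: HashMap::new(),
        };
        table.by_path.insert("/".to_string(), 1);
        table.by_inode.insert(1, "/".to_string());
        table
    }

    // Same path always yields the same inode (allocates on first sight).
    fn path_to_inode(&mut self, path: &str) -> u64 {
        if let Some(&ino) = self.by_path.get(path) {
            return ino;
        }
        let ino = self.next;
        self.next += 1;
        self.by_path.insert(path.to_string(), ino);
        self.by_inode.insert(ino, path.to_string());
        ino
    }

    fn inode_to_path(&self, ino: u64) -> Option<&str> {
        self.by_inode.get(&ino).map(String::as_str)
    }
}
```

Hard-link support additionally requires mapping several paths to one inode, which this simple table does not attempt.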

Middleware Test Suite

For middleware implementers, verify the middleware doesn’t break the underlying backend:

#![allow(unused)]
fn main() {
mod middleware_tests {
    use super::*;
    use anyfs::MemoryBackend;

    /// Your middleware wrapping a known-good backend.
    fn create_middleware() -> MyMiddleware<MemoryBackend> {
        MyMiddleware::new(MemoryBackend::new())
    }

    // Run all standard Fs tests through the middleware
    // This ensures the middleware doesn't break basic functionality

    #[test]
    fn passthrough_read_write() {
        let fs = create_middleware();

        fs.write(std::path::Path::new("/test.txt"), b"data").unwrap();
        assert_eq!(fs.read(std::path::Path::new("/test.txt")).unwrap(), b"data");
    }

    #[test]
    fn passthrough_directories() {
        let fs = create_middleware();

        fs.create_dir_all(std::path::Path::new("/a/b/c")).unwrap();
        assert!(fs.exists(std::path::Path::new("/a/b/c")).unwrap());
    }

    // Add middleware-specific tests here
    // e.g., for a Quota middleware:

    #[test]
    fn quota_blocks_oversized_write() {
        let fs = QuotaMiddleware::new(MemoryBackend::new())
            .with_max_file_size(100);

        let result = fs.write(std::path::Path::new("/big.txt"), &vec![0u8; 200]);
        assert!(matches!(result, Err(FsError::QuotaExceeded { .. })));
    }

    #[test]
    fn quota_allows_within_limit() {
        let fs = QuotaMiddleware::new(MemoryBackend::new())
            .with_max_file_size(100);

        fs.write(std::path::Path::new("/small.txt"), &vec![0u8; 50]).unwrap();
        assert!(fs.exists(std::path::Path::new("/small.txt")).unwrap());
    }
}
}

Running the Tests

Basic Usage

# Run all conformance tests
cargo test --test conformance

# Run specific test module
cargo test --test conformance fs_read

# Run with output
cargo test --test conformance -- --nocapture

# Run thread safety tests serially (each test still spawns its own threads)
RUST_TEST_THREADS=1 cargo test --test conformance thread_safety

CI Integration

# .github/workflows/test.yml
name: Conformance Tests

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - name: Run conformance tests
        run: cargo test --test conformance
      - name: Run thread safety tests
        run: cargo test --test conformance thread_safety -- --test-threads=1
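
Several security tests above are platform-gated (`#[cfg(windows)]`, `#[cfg(target_os = "linux")]`), so a Linux-only runner silently skips them. A matrix covering both platforms (a sketch; adapt to your workflow) makes sure they actually run:

```yaml
jobs:
  test:
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - name: Run conformance tests
        run: cargo test --test conformance
```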

Test Checklist

Before releasing your backend or middleware:

Core Tests (Required)

  • All fs_read tests pass
  • All fs_write tests pass
  • All fs_dir tests pass
  • All edge_cases tests pass
  • All security tests pass
  • All thread_safety tests pass
  • All no_panic tests pass

Extended Tests (If Implementing FsFull)

  • All fs_link tests pass
  • All fs_permissions tests pass
  • All fs_sync tests pass
  • All fs_stats tests pass

FUSE Tests (If Implementing FsFuse)

  • All fs_fuse tests pass
  • Root inode is 1
  • Hard links share inodes

Middleware Tests

  • Basic passthrough works
  • Middleware-specific behavior tested
  • Error cases handled correctly

Summary

This conformance test suite provides:

  1. Complete coverage of all Fs trait operations
  2. Edge case testing for robustness
  3. Security tests learned from vulnerabilities in prior art (Apache Commons VFS, Afero, PyFilesystem2)
  4. Thread safety verification for concurrent access
  5. No-panic guarantees for invalid inputs
  6. Extended tests for FsFull and FsFuse traits
  7. Middleware testing patterns

Security Tests Cover:

  • Path traversal attacks: URL-encoded %2e%2e, backslash traversal, null byte injection
  • Symlink escape: Preventing sandbox escape via symlinks
  • Symlink loops: Direct loops, indirect loops, deep chains
  • Resource exhaustion: Limits on symlink depth
  • Path canonicalization: Dot removal, double slash normalization
  • Windows-specific (from soft-canonicalize/strict-path):
    • NTFS Alternate Data Streams
    • Windows 8.3 short names
    • UNC path traversal
    • Reserved device names
    • Junction point escapes
  • Linux-specific: Magic symlinks (/proc/PID/root), /dev/fd escapes
  • Unicode: NFC/NFD normalization, RTL override, homoglyphs
  • TOCTOU: Race conditions in check-then-use and symlink resolution

Copy the relevant test modules, implement create_backend(), and run the tests. If they all pass, your backend/middleware is AnyFS-compatible.

Middleware Implementation Guide

This document provides implementation sketches for all AnyFS middleware, verifying that each is implementable within our framework.

Verdict: All 9 middleware are implementable. Some have interesting challenges documented below.


Implementation Pattern

All middleware follow the same pattern:

#![allow(unused)]
fn main() {
pub struct MiddlewareName<B> {
    inner: B,
    state: MiddlewareState,  // Interior mutability if needed
}

impl<B: Fs> FsRead for MiddlewareName<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        // 1. Pre-check (validate, log, check limits)
        // 2. Delegate to inner.read(path)
        // 3. Post-process (update state, transform result)
    }
}

// Implement FsWrite, FsDir similarly...
// Blanket impl for Fs is automatic
}

1. ReadOnly

  • Complexity: Trivial
  • State: None
  • Dependencies: None

Implementation

#![allow(unused)]
fn main() {
pub struct ReadOnly<B> {
    inner: B,
}

impl<B> ReadOnly<B> {
    pub fn new(inner: B) -> Self {
        Self { inner }
    }
}

impl<B: FsRead> FsRead for ReadOnly<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        self.inner.read(path)  // Pass through
    }

    fn read_to_string(&self, path: &Path) -> Result<String, FsError> {
        self.inner.read_to_string(path)  // Pass through
    }

    fn read_range(&self, path: &Path, offset: u64, len: usize) -> Result<Vec<u8>, FsError> {
        self.inner.read_range(path, offset, len)  // Pass through
    }

    fn exists(&self, path: &Path) -> Result<bool, FsError> {
        self.inner.exists(path)  // Pass through
    }

    fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
        self.inner.metadata(path)  // Pass through
    }

    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError> {
        self.inner.open_read(path)  // Pass through
    }
}

impl<B: FsWrite> FsWrite for ReadOnly<B> {
    fn write(&self, path: &Path, _data: &[u8]) -> Result<(), FsError> {
        Err(FsError::ReadOnly { path: path.to_path_buf(), operation: "write" })
    }

    fn append(&self, path: &Path, _data: &[u8]) -> Result<(), FsError> {
        Err(FsError::ReadOnly { path: path.to_path_buf(), operation: "append" })
    }

    fn remove_file(&self, path: &Path) -> Result<(), FsError> {
        Err(FsError::ReadOnly { path: path.to_path_buf(), operation: "remove_file" })
    }

    fn rename(&self, from: &Path, _to: &Path) -> Result<(), FsError> {
        Err(FsError::ReadOnly { path: from.to_path_buf(), operation: "rename" })
    }

    fn copy(&self, from: &Path, _to: &Path) -> Result<(), FsError> {
        Err(FsError::ReadOnly { path: from.to_path_buf(), operation: "copy" })
    }

    fn truncate(&self, path: &Path, _size: u64) -> Result<(), FsError> {
        Err(FsError::ReadOnly { path: path.to_path_buf(), operation: "truncate" })
    }

    fn open_write(&self, path: &Path) -> Result<Box<dyn Write + Send>, FsError> {
        Err(FsError::ReadOnly { path: path.to_path_buf(), operation: "open_write" })
    }
}

impl<B: FsDir> FsDir for ReadOnly<B> {
    fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
        self.inner.read_dir(path)  // Pass through (reading)
    }

    fn create_dir(&self, path: &Path) -> Result<(), FsError> {
        Err(FsError::ReadOnly { path: path.to_path_buf(), operation: "create_dir" })
    }

    fn create_dir_all(&self, path: &Path) -> Result<(), FsError> {
        Err(FsError::ReadOnly { path: path.to_path_buf(), operation: "create_dir_all" })
    }

    fn remove_dir(&self, path: &Path) -> Result<(), FsError> {
        Err(FsError::ReadOnly { path: path.to_path_buf(), operation: "remove_dir" })
    }

    fn remove_dir_all(&self, path: &Path) -> Result<(), FsError> {
        Err(FsError::ReadOnly { path: path.to_path_buf(), operation: "remove_dir_all" })
    }
}
}

Verdict: ✅ Trivially Implementable

No challenges. Pure delegation for reads, error return for writes.
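The whole pattern fits in a few lines. A self-contained miniature (with simplified stand-in trait and error types, not the real AnyFS definitions) showing reads delegating and writes failing:

```rust
// Simplified stand-ins for the real FsError and Fs traits.
#[derive(Debug, PartialEq)]
enum FsError { ReadOnly }

trait FsOps {
    fn read(&self, path: &str) -> Result<Vec<u8>, FsError>;
    fn write(&self, path: &str, data: &[u8]) -> Result<(), FsError>;
}

// Trivial backend that always has the same content.
struct Fixed;
impl FsOps for Fixed {
    fn read(&self, _p: &str) -> Result<Vec<u8>, FsError> { Ok(b"data".to_vec()) }
    fn write(&self, _p: &str, _d: &[u8]) -> Result<(), FsError> { Ok(()) }
}

// The ReadOnly pattern: delegate reads, reject writes.
struct ReadOnly<B>(B);
impl<B: FsOps> FsOps for ReadOnly<B> {
    fn read(&self, p: &str) -> Result<Vec<u8>, FsError> { self.0.read(p) }
    fn write(&self, _p: &str, _d: &[u8]) -> Result<(), FsError> { Err(FsError::ReadOnly) }
}

fn main() {
    let fs = ReadOnly(Fixed);
    assert_eq!(fs.read("/a"), Ok(b"data".to_vec()));
    assert_eq!(fs.write("/a", b"x"), Err(FsError::ReadOnly));
}
```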


2. Restrictions

  • Complexity: Simple
  • State: Configuration flags only
  • Dependencies: None

Note: Symlink/hard-link capability is determined by trait bounds (B: FsLink), not middleware. Restrictions only controls permission-related operations.

Implementation

#![allow(unused)]
fn main() {
pub struct Restrictions<B> {
    inner: B,
    deny_permissions: bool,
}

pub struct RestrictionsBuilder {
    deny_permissions: bool,
}

impl RestrictionsBuilder {
    pub fn deny_permissions(mut self) -> Self {
        self.deny_permissions = true;
        self
    }

    pub fn build<B>(self, inner: B) -> Restrictions<B> {
        Restrictions {
            inner,
            deny_permissions: self.deny_permissions,
        }
    }
}

// FsRead, FsDir, FsLink: pure delegation (Restrictions doesn't block these)

impl<B: FsLink> FsLink for Restrictions<B> {
    fn symlink(&self, target: &Path, link: &Path) -> Result<(), FsError> {
        self.inner.symlink(target, link)  // Pure delegation
    }

    fn hard_link(&self, original: &Path, link: &Path) -> Result<(), FsError> {
        self.inner.hard_link(original, link)  // Pure delegation
    }

    fn read_link(&self, path: &Path) -> Result<PathBuf, FsError> {
        self.inner.read_link(path)
    }

    fn symlink_metadata(&self, path: &Path) -> Result<Metadata, FsError> {
        self.inner.symlink_metadata(path)
    }
}

impl<B: FsPermissions> FsPermissions for Restrictions<B> {
    fn set_permissions(&self, path: &Path, perm: Permissions) -> Result<(), FsError> {
        if self.deny_permissions {
            return Err(FsError::FeatureNotEnabled {
                path: path.to_path_buf(),
                feature: "permissions",
                operation: "set_permissions",
            });
        }
        self.inner.set_permissions(path, perm)
    }
}
}

Verdict: ✅ Trivially Implementable

Simple flag check on set_permissions(). Link operations delegate to inner backend.


3. Tracing

  • Complexity: Simple
  • State: Configuration only
  • Dependencies: tracing crate

Implementation

#![allow(unused)]
fn main() {
use tracing::{instrument, info, debug, Level};

pub struct Tracing<B> {
    inner: B,
    target: &'static str,
    level: Level,
}

impl<B: FsRead> FsRead for Tracing<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        let span = tracing::span!(Level::DEBUG, "fs::read", ?path);
        let _guard = span.enter();

        let result = self.inner.read(path);

        match &result {
            Ok(data) => debug!(bytes = data.len(), "read succeeded"),
            Err(e) => debug!(?e, "read failed"),
        }

        result
    }

    fn exists(&self, path: &Path) -> Result<bool, FsError> {
        let span = tracing::span!(Level::DEBUG, "fs::exists", ?path);
        let _guard = span.enter();

        let result = self.inner.exists(path);
        debug!(?result, "exists check");
        result
    }

    // ... similar for all other methods
}

// FsWrite and FsDir follow the same pattern
}

Verdict: ✅ Trivially Implementable

Pure instrumentation wrapper. No state mutation, no complex logic.


4. RateLimit

  • Complexity: Moderate
  • State: Counter + timestamp (requires interior mutability)
  • Dependencies: None (uses std::time)
  • Algorithm: Fixed-window counter (simpler than token bucket, sufficient for most use cases)

Implementation

#![allow(unused)]
fn main() {
use std::time::{Duration, Instant};
use std::sync::RwLock;

pub struct RateLimit<B> {
    inner: B,
    max_ops: u32,
    window: Duration,
    state: RwLock<RateLimitState>,
}

struct RateLimitState {
    window_start: Instant,
    count: u32,
}

impl<B> RateLimit<B> {
    fn check_rate_limit(&self, path: &Path) -> Result<(), FsError> {
        let mut state = self.state.write().unwrap();

        let now = Instant::now();
        if now.duration_since(state.window_start) >= self.window {
            // Window expired, reset
            state.window_start = now;
            state.count = 1;
            return Ok(());
        }

        if state.count >= self.max_ops {
            return Err(FsError::RateLimitExceeded {
                path: path.to_path_buf(),
                limit: self.max_ops,
                window_secs: self.window.as_secs(),
            });
        }

        state.count += 1;
        Ok(())
    }
}

impl<B: FsRead> FsRead for RateLimit<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        self.check_rate_limit(path)?;
        self.inner.read(path)
    }

    fn exists(&self, path: &Path) -> Result<bool, FsError> {
        self.check_rate_limit(path)?;
        self.inner.exists(path)
    }

    // ... all methods call check_rate_limit(path) first
}
}

Considerations

  • Fixed window vs sliding window: Fixed window is simpler and sufficient for most use cases.
  • Thread safety: Uses RwLock for state. Could optimize with atomics for lock-free path.
  • What counts as an operation? Each method call counts as 1 operation.

Verdict: ✅ Implementable

Straightforward with interior mutability.


5. DryRun

  • Complexity: Moderate
  • State: Operation log
  • Dependencies: None

Implementation

#![allow(unused)]
fn main() {
use std::sync::RwLock;

pub struct DryRun<B> {
    inner: B,
    operations: RwLock<Vec<String>>,
}

impl<B> DryRun<B> {
    pub fn operations(&self) -> Vec<String> {
        self.operations.read().unwrap().clone()
    }

    pub fn clear(&self) {
        self.operations.write().unwrap().clear();
    }

    fn log(&self, op: String) {
        self.operations.write().unwrap().push(op);
    }
}

impl<B: FsRead> FsRead for DryRun<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        // Reads execute normally - we need real state to test against
        self.inner.read(path)
    }

    // All read operations pass through unchanged
}

impl<B: FsWrite> FsWrite for DryRun<B> {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        self.log(format!("write {} ({} bytes)", path.display(), data.len()));
        Ok(())  // Don't actually write
    }

    fn remove_file(&self, path: &Path) -> Result<(), FsError> {
        self.log(format!("remove_file {}", path.display()));
        Ok(())  // Don't actually remove
    }

    fn open_write(&self, path: &Path) -> Result<Box<dyn Write + Send>, FsError> {
        self.log(format!("open_write {}", path.display()));
        // Return a sink that discards all writes
        Ok(Box::new(std::io::sink()))
    }

    // ... similar for all write operations
}

impl<B: FsDir> FsDir for DryRun<B> {
    fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
        self.inner.read_dir(path)  // Pass through
    }

    fn create_dir(&self, path: &Path) -> Result<(), FsError> {
        let path = path.as_ref();
        self.log(format!("create_dir {}", path.display()));
        Ok(())
    }

    // ... similar for all directory mutations
}
}

Semantics Clarification

DryRun is NOT an isolation layer. It’s for answering “what would this code do?”

  • Reads see the real backend state (unchanged from before DryRun was applied)
  • Writes are logged but not executed
  • After a dry write, reads won’t see the change (because it wasn’t written)

This is intentional. For isolation, use MemoryBackend::clone() for snapshots.

Verdict: ✅ Implementable

The semantics are clear once documented. Uses std::io::sink() for discarding streamed writes.
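std::io::sink() is a standard-library writer that accepts and discards any number of bytes, which is exactly the behavior a dry-run open_write() needs:

```rust
use std::io::{self, Write};

fn main() -> io::Result<()> {
    // The same boxed-writer type the middleware returns.
    let mut w: Box<dyn Write + Send> = Box::new(io::sink());
    let n = w.write(b"these bytes go nowhere")?; // reports all bytes "written"
    assert_eq!(n, 22);
    w.flush()?; // flush is a no-op on a sink
    println!("discarded {} bytes", n);
    Ok(())
}
```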


6. PathFilter

  • Complexity: Moderate
  • State: Compiled glob patterns
  • Dependencies: globset crate

Implementation

#![allow(unused)]
fn main() {
use globset::{Glob, GlobSet, GlobSetBuilder};

pub struct PathFilter<B> {
    inner: B,
    rules: Vec<PathRule>,
    compiled: GlobSet,  // For efficient matching
}

#[derive(Clone)]  // read_dir filtering clones the rules into the iterator
enum PathRule {
    Allow(String),
    Deny(String),
}

impl<B> PathFilter<B> {
    fn check_access(&self, path: &Path) -> Result<(), FsError> {
        let path_str = path.to_string_lossy();

        for rule in &self.rules {
            match rule {
                PathRule::Allow(pattern) => {
                    if glob_matches(pattern, &path_str) {
                        return Ok(());
                    }
                }
                PathRule::Deny(pattern) => {
                    if glob_matches(pattern, &path_str) {
                        return Err(FsError::AccessDenied {
                            path: path.to_path_buf(),
                            reason: format!("path matches deny pattern: {}", pattern),
                        });
                    }
                }
            }
        }

        // Default: deny if no rules matched
        Err(FsError::AccessDenied {
            path: path.to_path_buf(),
            reason: "no matching allow rule".to_string(),
        })
    }
}

impl<B: FsRead> FsRead for PathFilter<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        self.check_access(path)?;
        self.inner.read(path)
    }

    // ... all methods check access first
}

impl<B: FsDir> FsDir for PathFilter<B> {
    fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
        self.check_access(path)?;

        let inner_iter = self.inner.read_dir(path)?;

        // Filter the iterator to exclude denied entries
        Ok(ReadDirIter::new(FilteredDirIter {
            inner: inner_iter,
            rules: self.rules.clone(),  // Copy rules for filtering
        }))
    }
}

// Custom iterator that filters denied entries
struct FilteredDirIter {
    inner: ReadDirIter,
    rules: Vec<PathRule>,
}

impl FilteredDirIter {
    fn is_allowed(&self, path: &Path) -> bool {
        // Same first-match-wins evaluation as check_access
        let path_str = path.to_string_lossy();
        for rule in &self.rules {
            match rule {
                PathRule::Allow(pattern) if glob_matches(pattern, &path_str) => return true,
                PathRule::Deny(pattern) if glob_matches(pattern, &path_str) => return false,
                _ => {}
            }
        }
        false  // Default deny
    }
}

impl Iterator for FilteredDirIter {
    type Item = Result<DirEntry, FsError>;

    fn next(&mut self) -> Option<Self::Item> {
        loop {
            match self.inner.next()? {
                Ok(entry) => {
                    if self.is_allowed(&entry.path) {
                        return Some(Ok(entry));
                    }
                    // Skip denied entries (don't reveal their existence)
                }
                Err(e) => return Some(Err(e)),
            }
        }
    }
}
}

Considerations

  • Rule evaluation order: First match wins, consistent with firewall rules.
  • Default policy: Deny if no rules match (secure by default).
  • Directory listing: Filters out denied entries so their existence isn’t revealed.
  • Parent directory access: If you allow /workspace/**, accessing /workspace itself needs to be allowed.

Implementation Detail: ReadDirIter Filtering

Our ReadDirIter type needs to support wrapping. Options:

#![allow(unused)]
fn main() {
// Option 1: ReadDirIter is a trait object
pub struct ReadDirIter(Box<dyn Iterator<Item = Result<DirEntry, FsError>> + Send>);

// Option 2: ReadDirIter has a filter method
impl ReadDirIter {
    pub fn filter<F>(self, predicate: F) -> ReadDirIter
    where
        F: Fn(&DirEntry) -> bool + Send + 'static
    { ... }
}
}

Recommendation: Option 1 (trait object) is more flexible and aligns with open_read/open_write returning Box<dyn ...>.

Verdict: ✅ Implementable

Requires ReadDirIter to be a trait object wrapper (already the case) so we can filter entries.


7. Cache

  • Complexity: Moderate
  • State: LRU cache with entries
  • Dependencies: lru crate (or custom implementation)

Implementation

#![allow(unused)]
fn main() {
use lru::LruCache;
use std::sync::RwLock;
use std::time::{Duration, Instant};

pub struct Cache<B> {
    inner: B,
    cache: RwLock<LruCache<PathBuf, CacheEntry>>,
    max_entry_size: usize,
    ttl: Duration,
}

struct CacheEntry {
    data: Vec<u8>,
    metadata: Metadata,
    inserted_at: Instant,
}

impl<B: FsRead> FsRead for Cache<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {

        // Check cache
        {
            let cache = self.cache.read().unwrap();
            if let Some(entry) = cache.peek(path) {
                if entry.inserted_at.elapsed() < self.ttl {
                    return Ok(entry.data.clone());
                }
            }
        }

        // Cache miss - fetch from backend
        let data = self.inner.read(path)?;

        // Store in cache if not too large
        if data.len() <= self.max_entry_size {
            let metadata = self.inner.metadata(path)?;
            let mut cache = self.cache.write().unwrap();
            cache.put(path.to_path_buf(), CacheEntry {
                data: data.clone(),
                metadata,
                inserted_at: Instant::now(),
            });
        }

        Ok(data)
    }

    fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {

        // Check cache for metadata
        {
            let cache = self.cache.read().unwrap();
            if let Some(entry) = cache.peek(path) {
                if entry.inserted_at.elapsed() < self.ttl {
                    return Ok(entry.metadata.clone());
                }
            }
        }

        // Fetch from backend
        self.inner.metadata(path)
    }

    fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError> {
        // DO NOT CACHE - streams are for large files
        self.inner.open_read(path)
    }

    fn exists(&self, path: &Path) -> Result<bool, FsError> {
        // Could cache this too, or derive from metadata cache
        {
            let cache = self.cache.read().unwrap();
            if let Some(entry) = cache.peek(path) {
                if entry.inserted_at.elapsed() < self.ttl {
                    return Ok(true);  // If in cache, it exists
                }
            }
        }
        self.inner.exists(path)
    }
}

impl<B: FsWrite> FsWrite for Cache<B> {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        let result = self.inner.write(path, data)?;

        // Invalidate cache entry
        let mut cache = self.cache.write().unwrap();
        cache.pop(path);

        Ok(result)
    }

    fn remove_file(&self, path: &Path) -> Result<(), FsError> {
        let result = self.inner.remove_file(path)?;

        // Invalidate cache entry
        let mut cache = self.cache.write().unwrap();
        cache.pop(path);

        Ok(result)
    }

    // ... all mutations invalidate cache
}
}

What Gets Cached

  • read(): Yes (small files benefit from caching)
  • read_to_string(): Yes (same as read)
  • read_range(): Maybe (could cache the full file and serve ranges from it)
  • metadata(): Yes (frequently accessed)
  • exists(): Derived (can be answered from the metadata cache)
  • open_read(): No (streams are for large files that shouldn’t be cached)
  • read_dir(): Maybe (directory listings change frequently)

Verdict: ✅ Implementable

Standard LRU cache pattern. Key decision: don’t cache open_read() streams.
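The TTL gate is the part worth getting right. A standalone sketch using a plain HashMap instead of the lru crate, with the clock passed in so expiry is deterministic to test:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

struct Entry { data: Vec<u8>, inserted_at: Instant }

struct TtlCache {
    ttl: Duration,
    map: HashMap<String, Entry>,
}

impl TtlCache {
    // Serve an entry only while it is younger than the TTL.
    fn get(&self, key: &str, now: Instant) -> Option<&[u8]> {
        let e = self.map.get(key)?;
        if now.duration_since(e.inserted_at) < self.ttl {
            Some(&e.data)
        } else {
            None // expired: caller falls through to the backend
        }
    }

    // Writes must evict stale entries (the middleware does this in write()).
    fn invalidate(&mut self, key: &str) {
        self.map.remove(key);
    }
}

fn main() {
    let t0 = Instant::now();
    let mut cache = TtlCache { ttl: Duration::from_secs(60), map: HashMap::new() };
    cache.map.insert("/a".into(), Entry { data: b"hello".to_vec(), inserted_at: t0 });
    assert_eq!(cache.get("/a", t0), Some(&b"hello"[..]));
    assert_eq!(cache.get("/a", t0 + Duration::from_secs(61)), None); // TTL expired
    cache.invalidate("/a");
    assert_eq!(cache.get("/a", t0), None);
}
```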


8. Quota

  • Complexity: High
  • State: Usage counters (requires accurate tracking)
  • Dependencies: None

The Challenge

Quota must track:

  • Total bytes used
  • Total file count
  • Total directory count
  • Per-directory entry count (optional)
  • Maximum path depth (optional)

The tricky part: streaming writes via open_write(). We must track bytes as they’re written, not just when the operation completes.

Implementation

#![allow(unused)]
fn main() {
use std::sync::{Arc, RwLock};
use std::io::Write;

pub struct Quota<B> {
    inner: B,
    config: QuotaConfig,
    usage: Arc<RwLock<QuotaUsage>>,
}

struct QuotaConfig {
    max_total_size: Option<u64>,
    max_file_size: Option<u64>,
    max_node_count: Option<u64>,
    max_dir_entries: Option<u64>,  // Max entries per directory
    max_path_depth: Option<usize>,
}

/// Current usage statistics.
#[derive(Debug, Clone, Default)]
pub struct Usage {
    pub total_size: u64,
    pub file_count: u64,
    pub dir_count: u64,
}

/// Configured limits.
#[derive(Debug, Clone)]
pub struct Limits {
    pub max_total_size: Option<u64>,
    pub max_file_size: Option<u64>,
    pub max_node_count: Option<u64>,
    pub max_dir_entries: Option<u64>,
    pub max_path_depth: Option<usize>,
}

/// Remaining capacity.
#[derive(Debug, Clone)]
pub struct Remaining {
    pub bytes: Option<u64>,
    pub nodes: Option<u64>,
    pub can_write: bool,
}

struct QuotaUsage {
    total_size: u64,
    file_count: u64,
    dir_count: u64,
}

impl Default for QuotaUsage {
    fn default() -> Self {
        Self { total_size: 0, file_count: 0, dir_count: 0 }
    }
}

impl<B> Quota<B> {
    /// Get current usage statistics.
    pub fn usage(&self) -> Usage {
        let u = self.usage.read().unwrap();
        Usage {
            total_size: u.total_size,
            file_count: u.file_count,
            dir_count: u.dir_count,
        }
    }

    /// Get configured limits.
    pub fn limits(&self) -> Limits {
        Limits {
            max_total_size: self.config.max_total_size,
            max_file_size: self.config.max_file_size,
            max_node_count: self.config.max_node_count,
            max_dir_entries: self.config.max_dir_entries,
            max_path_depth: self.config.max_path_depth,
        }
    }

    /// Get remaining capacity.
    pub fn remaining(&self) -> Remaining {
        let u = self.usage.read().unwrap();
        let bytes = self.config.max_total_size.map(|max| max.saturating_sub(u.total_size));
        let nodes = self.config.max_node_count.map(|max| max.saturating_sub(u.file_count + u.dir_count));
        Remaining {
            bytes,
            nodes,
            can_write: bytes.map(|b| b > 0).unwrap_or(true),
        }
    }
}

impl<B: Fs> Quota<B> {
    /// Create Quota middleware with explicit config.
    /// Prefer `QuotaLayer::builder()` for the Layer pattern.
    pub fn with_config(inner: B, config: QuotaConfig) -> Result<Self, FsError> {
        // IMPORTANT: Scan backend to initialize usage counters
        let usage = Self::scan_usage(&inner)?;

        Ok(Self {
            inner,
            config,
            usage: Arc::new(RwLock::new(usage)),
        })
    }

    fn scan_usage(backend: &B) -> Result<QuotaUsage, FsError> {
        let mut usage = QuotaUsage::default();
        Self::scan_dir(backend, Path::new("/"), &mut usage)?;
        Ok(usage)
    }

    fn scan_dir(backend: &B, path: &Path, usage: &mut QuotaUsage) -> Result<(), FsError> {
        for entry in backend.read_dir(path)? {
            let entry = entry?;
            let meta = backend.metadata(&entry.path)?;

            if meta.is_file() {
                usage.file_count += 1;
                usage.total_size += meta.size;
            } else if meta.is_dir() {
                usage.dir_count += 1;
                Self::scan_dir(backend, &entry.path, usage)?;
            }
        }
        Ok(())
    }

    fn check_size_limit(&self, path: &Path, additional_bytes: u64) -> Result<(), FsError> {
        let usage = self.usage.read().unwrap();

        if let Some(max) = self.config.max_total_size {
            if usage.total_size + additional_bytes > max {
                return Err(FsError::QuotaExceeded {
                    path: path.to_path_buf(),
                    limit: max,
                    requested: additional_bytes,
                    usage: usage.total_size,
                });
            }
        }

        Ok(())
    }

    fn check_node_limit(&self, path: &Path) -> Result<(), FsError> {
        if let Some(max) = self.config.max_node_count {
            let usage = self.usage.read().unwrap();
            if usage.file_count + usage.dir_count >= max {
                return Err(FsError::QuotaExceeded {
                    path: path.to_path_buf(),
                    limit: max,
                    requested: 1,
                    usage: usage.file_count + usage.dir_count,
                });
            }
        }
        Ok(())
    }

    fn check_dir_entries(&self, parent: &Path) -> Result<(), FsError>
    where B: FsDir {
        if let Some(max) = self.config.max_dir_entries {
            // Count entries in parent directory
            let count = self.inner.read_dir(parent)?
                .filter(|e| e.is_ok())
                .count() as u64;
            if count >= max {
                return Err(FsError::QuotaExceeded {
                    path: parent.to_path_buf(),
                    limit: max,
                    requested: 1,
                    usage: count,
                });
            }
        }
        Ok(())
    }

    fn check_path_depth(&self, path: &Path) -> Result<(), FsError> {
        if let Some(max) = self.config.max_path_depth {
            let depth = path.components().count();
            if depth > max {
                return Err(FsError::QuotaExceeded {
                    path: path.to_path_buf(),
                    limit: max as u64,
                    requested: depth as u64,
                    usage: depth as u64,
                });
            }
        }
        Ok(())
    }
}

impl<B: FsWrite + FsRead + FsDir> FsWrite for Quota<B> {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
        let new_size = data.len() as u64;

        // Check path depth limit
        self.check_path_depth(path)?;

        // Check per-file limit
        if let Some(max) = self.config.max_file_size {
            if new_size > max {
                return Err(FsError::FileSizeExceeded {
                    path: path.to_path_buf(),
                    size: new_size,
                    limit: max,
                });
            }
        }

        // Get old size; only a missing file (not an existing empty one) is new
        let existing = self.inner.metadata(path).ok();
        let is_new_file = existing.is_none();
        let old_size = existing.map(|m| m.size).unwrap_or(0);

        // If creating a new file, check node count and dir entries
        if is_new_file {
            self.check_node_limit(path)?;
            if let Some(parent) = path.parent() {
                self.check_dir_entries(parent)?;
            }
        }

        let size_delta = new_size as i64 - old_size as i64;

        if size_delta > 0 {
            self.check_size_limit(path, size_delta as u64)?;
        }

        // Perform write
        self.inner.write(path, data)?;

        // Update usage
        let mut usage = self.usage.write().unwrap();
        usage.total_size = (usage.total_size as i64 + size_delta) as u64;
        if is_new_file {
            usage.file_count += 1;
        }

        Ok(())
    }

    fn open_write(&self, path: &Path) -> Result<Box<dyn Write + Send>, FsError> {
        let path = path.to_path_buf();

        // Get the underlying writer
        let inner_writer = self.inner.open_write(&path)?;

        // Wrap in a counting writer
        Ok(Box::new(QuotaWriter {
            inner: inner_writer,
            path,
            bytes_written: 0,
            usage: Arc::clone(&self.usage),
            max_file_size: self.config.max_file_size,
            max_total_size: self.config.max_total_size,
        }))
    }
}

/// Wrapper that counts bytes and enforces quota on streaming writes
struct QuotaWriter {
    inner: Box<dyn Write + Send>,
    path: PathBuf,
    bytes_written: u64,
    usage: Arc<RwLock<QuotaUsage>>,
    max_file_size: Option<u64>,
    max_total_size: Option<u64>,
}

impl Write for QuotaWriter {
    fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
        let additional = buf.len() as u64;

        // Check per-file limit
        if let Some(max) = self.max_file_size {
            if self.bytes_written + additional > max {
                return Err(std::io::Error::new(
                    std::io::ErrorKind::Other,
                    "file size limit exceeded"
                ));
            }
        }

        // Check total size limit
        if let Some(max) = self.max_total_size {
            let usage = self.usage.read().unwrap();
            if usage.total_size + additional > max {
                return Err(std::io::Error::new(
                    std::io::ErrorKind::Other,
                    "quota exceeded"
                ));
            }
        }

        // Write to inner
        let written = self.inner.write(buf)?;

        // Update counters
        self.bytes_written += written as u64;
        let mut usage = self.usage.write().unwrap();
        usage.total_size += written as u64;

        Ok(written)
    }

    fn flush(&mut self) -> std::io::Result<()> {
        self.inner.flush()
    }
}

impl Drop for QuotaWriter {
    fn drop(&mut self) {
        // If we need to track "committed" vs "in-progress" writes,
        // this is where we'd finalize the accounting
    }
}

impl<B: FsDir + FsRead> FsDir for Quota<B> {
    fn create_dir(&self, path: &Path) -> Result<(), FsError> {
        // Check path depth
        self.check_path_depth(path)?;

        // Check node count
        self.check_node_limit(path)?;

        // Check parent directory entries
        if let Some(parent) = path.parent() {
            self.check_dir_entries(parent)?;
        }

        // Create directory
        self.inner.create_dir(path)?;

        // Update usage
        let mut usage = self.usage.write().unwrap();
        usage.dir_count += 1;

        Ok(())
    }

    // create_dir_all, remove_dir, etc. delegate similarly
    // ...
}
}

Challenges and Solutions

  • Initial usage unknown: scan the backend on construction
  • Streaming writes: the QuotaWriter wrapper counts bytes
  • Concurrent writes: RwLock on usage counters
  • File replacement: calculate the delta (new_size - old_size)
  • New file detection: check for an existing file before write
  • Accurate accounting: update counters only after successful operations
  • Node count limit: check before creating files/directories
  • Dir entries limit: count parent entries before creating a child
  • Path depth limit: count path components on create

Edge Cases

  1. Partial write failure: If inner.write() fails, don’t update counters.
  2. Streaming write failure: QuotaWriter updates optimistically; on error, may need rollback.
  3. Rename: Doesn’t change total size.
  4. Copy: Adds destination size.
  5. Append: Adds appended bytes only.
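The replacement accounting above (charge new_size - old_size rather than new_size) reduces to signed arithmetic. A standalone sketch:

```rust
// Apply a write's size delta to the running total. Using a signed delta
// handles both growth and shrinkage when a file is overwritten.
fn apply_write(total: u64, old_size: u64, new_size: u64) -> u64 {
    let delta = new_size as i64 - old_size as i64;
    (total as i64 + delta) as u64
}

fn main() {
    let mut total = 0u64;
    total = apply_write(total, 0, 1000);   // create: +1000
    assert_eq!(total, 1000);
    total = apply_write(total, 1000, 400); // replace with a smaller file: -600
    assert_eq!(total, 400);
    total = apply_write(total, 400, 400);  // same-size rewrite: no change
    assert_eq!(total, 400);
}
```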

Verdict: ✅ Implementable

The most complex middleware, but well-understood patterns. The QuotaWriter wrapper is the key insight.


9. Overlay<B1, B2>

  • Complexity: High
  • State: Two backends + whiteout tracking
  • Dependencies: None

Overlay Semantics (Docker-style)

  • Lower layer (base): Read-only source
  • Upper layer: Writable overlay
  • Whiteouts: Files named .wh.<filename> mark deletions
  • Opaque directories: .wh..wh..opq hides entire lower directory
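The whiteout naming is mechanical. A standalone sketch of the path computation and the read lookup order (whiteout check, then upper, then lower), using string arrays as stand-in layers:

```rust
use std::path::{Path, PathBuf};

// Deleting /data/a.txt in the overlay creates /data/.wh.a.txt in the upper layer.
fn whiteout_path(path: &Path) -> PathBuf {
    let parent = path.parent().unwrap_or(Path::new("/"));
    let name = path.file_name().unwrap_or_default();
    parent.join(format!(".wh.{}", name.to_string_lossy()))
}

fn main() {
    assert_eq!(
        whiteout_path(Path::new("/data/a.txt")),
        PathBuf::from("/data/.wh.a.txt")
    );

    // Lookup order demo: whiteout hides a lower-layer file.
    let upper = ["/data/.wh.a.txt"];            // a.txt was deleted in the overlay
    let lower = ["/data/a.txt", "/data/b.txt"]; // base layer contents
    let exists = |p: &str| {
        let wh = whiteout_path(Path::new(p));
        !upper.contains(&wh.to_str().unwrap()) && (upper.contains(&p) || lower.contains(&p))
    };
    assert!(!exists("/data/a.txt")); // whited out: hidden even though lower has it
    assert!(exists("/data/b.txt"));  // visible from the lower layer
}
```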

Implementation

#![allow(unused)]
fn main() {
pub struct Overlay<Lower, Upper> {
    lower: Lower,
    upper: Upper,
}

impl<Lower, Upper> Overlay<Lower, Upper> {
    const WHITEOUT_PREFIX: &'static str = ".wh.";
    const OPAQUE_MARKER: &'static str = ".wh..wh..opq";

    fn whiteout_path(path: &Path) -> PathBuf {
        let parent = path.parent().unwrap_or(Path::new("/"));
        let name = path.file_name().unwrap_or_default();
        parent.join(format!("{}{}", Self::WHITEOUT_PREFIX, name.to_string_lossy()))
    }

    fn is_whiteout(name: &str) -> bool {
        name.starts_with(Self::WHITEOUT_PREFIX)
    }

    fn original_name(whiteout_name: &str) -> &str {
        &whiteout_name[Self::WHITEOUT_PREFIX.len()..]
    }
}

impl<Lower: FsRead, Upper: FsRead> FsRead for Overlay<Lower, Upper> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {

        // Check if whiteout exists in upper
        let whiteout = Self::whiteout_path(path);
        if self.upper.exists(&whiteout).unwrap_or(false) {
            return Err(FsError::NotFound { path: path.to_path_buf() });
        }

        // Try upper first
        match self.upper.read(path) {
            Ok(data) => return Ok(data),
            Err(FsError::NotFound { .. }) => {}
            Err(e) => return Err(e),
        }

        // Fall back to lower
        self.lower.read(path)
    }

    fn exists(&self, path: &Path) -> Result<bool, FsError> {

        // Check whiteout first
        let whiteout = Self::whiteout_path(path);
        if self.upper.exists(&whiteout).unwrap_or(false) {
            return Ok(false);  // Whited out = doesn't exist
        }

        // Check upper, then lower
        if self.upper.exists(path).unwrap_or(false) {
            return Ok(true);
        }

        self.lower.exists(path)
    }

    fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {

        // Check whiteout
        let whiteout = Self::whiteout_path(path);
        if self.upper.exists(&whiteout).unwrap_or(false) {
            return Err(FsError::NotFound { path: path.to_path_buf() });
        }

        // Upper first, then lower
        match self.upper.metadata(path) {
            Ok(meta) => return Ok(meta),
            Err(FsError::NotFound { .. }) => {}
            Err(e) => return Err(e),
        }

        self.lower.metadata(path)
    }
}

impl<Lower: FsRead, Upper: Fs> FsWrite for Overlay<Lower, Upper> {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {

        // Remove whiteout if it exists
        let whiteout = Self::whiteout_path(path);
        let _ = self.upper.remove_file(&whiteout);  // Ignore if doesn't exist

        // Write to upper
        self.upper.write(path, data)
    }

    fn remove_file(&self, path: &Path) -> Result<(), FsError> {

        // Try to remove from upper
        let _ = self.upper.remove_file(path);

        // If file exists in lower, create whiteout
        if self.lower.exists(path).unwrap_or(false) {
            let whiteout = Self::whiteout_path(path);
            self.upper.write(&whiteout, b"")?;  // Create whiteout marker
        }

        Ok(())
    }

    fn rename(&self, from: &Path, to: &Path) -> Result<(), FsError> {

        // Copy-on-write: read from overlay, write to upper, whiteout original
        let data = self.read(from)?;
        self.write(to, &data)?;
        self.remove_file(from)?;

        Ok(())
    }
}

impl<Lower: FsRead + FsDir, Upper: Fs> FsDir for Overlay<Lower, Upper> {
    fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {

        // Check for opaque marker
        let opaque_marker = path.join(Self::OPAQUE_MARKER);
        let is_opaque = self.upper.exists(&opaque_marker).unwrap_or(false);

        // Get entries from upper
        let mut entries: HashMap<String, DirEntry> = HashMap::new();
        let mut whiteouts: HashSet<String> = HashSet::new();

        if let Ok(upper_iter) = self.upper.read_dir(path) {
            for entry in upper_iter {
                let entry = entry?;
                let name = entry.name.clone();

                if Self::is_whiteout(&name) {
                    whiteouts.insert(Self::original_name(&name).to_string());
                } else if name != Self::OPAQUE_MARKER {
                    entries.insert(name, entry);
                }
            }
        }

        // Merge lower entries (unless opaque)
        if !is_opaque {
            if let Ok(lower_iter) = self.lower.read_dir(path) {
                for entry in lower_iter {
                    let entry = entry?;
                    let name = entry.name.clone();

                    // Skip if already in upper or whited out
                    if !entries.contains_key(&name) && !whiteouts.contains(&name) {
                        entries.insert(name, entry);
                    }
                }
            }
        }

        // Convert to iterator
        let entries_vec: Vec<_> = entries.into_values().map(Ok).collect();
        Ok(ReadDirIter::from_vec(entries_vec))
    }

    fn create_dir(&self, path: &Path) -> Result<(), FsError> {

        // Remove whiteout if exists
        let whiteout = Self::whiteout_path(path);
        let _ = self.upper.remove_file(&whiteout);

        self.upper.create_dir(path)
    }

    fn remove_dir(&self, path: &Path) -> Result<(), FsError> {

        // Try to remove from upper
        let _ = self.upper.remove_dir(path);

        // If exists in lower, create whiteout
        if self.lower.exists(path).unwrap_or(false) {
            let whiteout = Self::whiteout_path(path);
            self.upper.write(&whiteout, b"")?;
        }

        Ok(())
    }
}
}

Key Concepts

| Concept | Description |
|---|---|
| Whiteout | .wh.&lt;name&gt; file in upper marks deletion of &lt;name&gt; from lower |
| Opaque | .wh..wh..opq file in a directory hides all lower entries |
| Copy-on-write | First write copies from lower to upper, then modifies |
| Merge | read_dir() combines both layers, respecting whiteouts |

Challenges

  1. Whiteout storage: Whiteouts are regular files - backend doesn’t need special support.
  2. Directory listing merge: Must be memory-buffered to remove duplicates and whiteouts.
  3. Rename: Implemented as copy + delete (standard CoW pattern).
  4. Symlinks in lower: Need to handle carefully - symlink targets might point to lower layer.
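The merge step can be illustrated in isolation. This is a minimal sketch, using plain string names instead of DirEntry values, of how upper entries, whiteouts, and lower entries combine:

```rust
use std::collections::{BTreeSet, HashSet};

const WHITEOUT_PREFIX: &str = ".wh.";

// Merge two directory listings: upper wins, whiteouts hide lower names.
fn merge_listing(upper: &[&str], lower: &[&str]) -> Vec<String> {
    let mut whiteouts = HashSet::new();
    let mut entries = BTreeSet::new();

    for name in upper {
        if let Some(original) = name.strip_prefix(WHITEOUT_PREFIX) {
            whiteouts.insert(original.to_string()); // marker, not a real entry
        } else {
            entries.insert(name.to_string());
        }
    }
    for name in lower {
        if !entries.contains(*name) && !whiteouts.contains(*name) {
            entries.insert(name.to_string());
        }
    }
    entries.into_iter().collect() // sorted, deduplicated
}
```

Here a `.wh.a.txt` marker in the upper layer hides `a.txt` from the lower layer, while other lower entries still show through.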

ReadDirIter Consideration

For Overlay, we need to buffer the merged directory listing. This means ReadDirIter must support construction from a Vec:

#![allow(unused)]
fn main() {
impl ReadDirIter {
    pub fn from_vec(entries: Vec<Result<DirEntry, FsError>>) -> Self {
        Self(Box::new(entries.into_iter()))
    }
}
}

Verdict: ✅ Implementable

The most complex middleware, but uses well-established patterns from OverlayFS. Key insight: whiteouts are just marker files, no special backend support needed.


Summary

| Middleware | Complexity | Key Implementation Insight |
|---|---|---|
| ReadOnly | Trivial | Block all writes |
| Restrictions | Simple | Flag checks |
| Tracing | Simple | Wrap operations in spans |
| RateLimit | Moderate | Atomic counter + time window |
| DryRun | Moderate | Log writes, return Ok without executing |
| PathFilter | Moderate | Glob matching + filtered ReadDirIter |
| Cache | Moderate | LRU cache, invalidate on writes |
| Quota | High | Usage counters + QuotaWriter wrapper |
| Overlay | High | Whiteout markers + merged directory listing |

Required Framework Features

These middleware implementations assume:

  1. ReadDirIter is a trait object wrapper - allows filtering and composition
  2. All methods use &self - interior mutability for state
  3. FsError has all necessary variants - ReadOnly, RateLimitExceeded, QuotaExceeded, AccessDenied, FeatureNotEnabled

All of these are already part of our design. All middleware are implementable.


Appendix: Layer Trait Implementation

Each middleware provides a corresponding Layer type for composition:

#![allow(unused)]
fn main() {
// Example for Quota
pub struct QuotaLayer {
    config: QuotaConfig,
}

impl QuotaLayer {
    pub fn builder() -> QuotaLayerBuilder<Unconfigured> {
        QuotaLayerBuilder::new()
    }
}

impl<B: Fs> Layer<B> for QuotaLayer {
    type Backend = Quota<B>;

    fn layer(self, backend: B) -> Self::Backend {
        Quota::with_config(backend, self.config)
            .expect("quota initialization failed")
    }
}

// Usage:
let fs = MemoryBackend::new()
    .layer(QuotaLayer::builder()
        .max_total_size(100_000_000)
        .build());
}

Lessons from Similar Projects

Analysis of issues from vfs and agentfs to inform AnyFS design.

This chapter documents problems encountered by similar projects and how AnyFS addresses them. These lessons are incorporated into our Implementation Plan and Backend Guide.


Summary

| Priority | Issue | AnyFS Response |
|---|---|---|
| 1 | Panics instead of errors | No-panic policy, always return Result |
| 2 | Thread safety problems | Concurrent stress tests required |
| 3 | Inconsistent path handling | Normalize in one place, test edge cases |
| 4 | Poor error ergonomics | FsError with context fields |
| 5 | Missing documentation | Performance & thread safety docs required |
| 6 | Platform issues | Cross-platform CI pipeline |

1. Thread Safety Issues

What Happened

| Project | Issue | Problem |
|---|---|---|
| vfs | #72 | RwLock panic in production |
| vfs | #47 | create_dir_all races with itself |

Root cause: Insufficient synchronization in concurrent access patterns.

AnyFS Response

  • Test concurrent operations explicitly - stress test with multiple threads
  • Document thread safety guarantees per backend
  • Fs: Send bound is intentional
  • MemoryBackend uses Arc<RwLock<...>> for interior mutability

Required tests:

#![allow(unused)]
fn main() {
#[test]
fn test_concurrent_create_dir_all() {
    // Backends take &self (ADR-023), so threads can share the backend directly;
    // wrapping it in an external lock would serialize the calls and defeat the test.
    let backend = Arc::new(create_backend());
    let handles: Vec<_> = (0..10).map(|_| {
        let backend = Arc::clone(&backend);
        std::thread::spawn(move || {
            let _ = backend.create_dir_all(std::path::Path::new("/a/b/c/d"));
        })
    }).collect();
    for handle in handles {
        handle.join().unwrap();
    }
}
}

2. Panics Instead of Errors

What Happened

| Project | Issue | Problem |
|---|---|---|
| vfs | #8 | AltrootFS panics when file doesn’t exist |
| vfs | #23 | Unhandled edge cases cause panics |
| vfs | #68 | MemoryFS panics in WebAssembly |

Root cause: Using .unwrap() or .expect() on fallible operations.

AnyFS Response

No-panic policy: Never use .unwrap() or .expect() in library code.

#![allow(unused)]
fn main() {
// BAD - will panic
let entry = self.entries.get(&path).unwrap();

// GOOD - returns error
let entry = self.entries.get(&path)
    .ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })?;
}

Edge cases that must return errors (not panic):

  • File doesn’t exist
  • Directory doesn’t exist
  • Path is empty string
  • Invalid UTF-8 in path
  • Parent directory missing
  • Type mismatch (file vs directory)
  • Concurrent access conflicts

3. Path Handling Inconsistencies

What Happened

| Project | Issue | Problem |
|---|---|---|
| vfs | #24 | Inconsistent path definition across backends |
| vfs | #42 | Path join doesn’t behave Unix-like |
| vfs | #22 | Non-UTF-8 path support questions |

Root cause: Each backend implemented path handling differently.

AnyFS Response

  • Normalize paths in ONE place (FileStorage resolver for virtual backends; SelfResolving backends delegate to the OS)
  • Consistent semantics: always absolute, always / separator
  • Use &Path in core traits for object safety; provide impl AsRef<Path> at the ergonomic layer (FileStorage/FsExt)

Required conformance tests:

| Input | Expected Output |
|---|---|
| /foo/../bar | /bar |
| /foo/./bar | /foo/bar |
| //double//slash | /double/slash |
| / | / |
| (empty) | Error |
| /foo/bar/ | /foo/bar |
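As an illustrative sketch (not the actual FileStorage resolver), these rules can be implemented with a single component-by-component pass; clamping `..` at the root is an assumption beyond the table:

```rust
// Normalize a virtual path: always absolute, '/' separator, no '.', '..',
// or empty components. '..' pops one component and is clamped at the root.
fn normalize(input: &str) -> Result<String, String> {
    if input.is_empty() {
        return Err("empty path".to_string());
    }
    let mut components: Vec<&str> = Vec::new();
    for segment in input.split('/') {
        match segment {
            "" | "." => {}                // collapses "//" and "/./"
            ".." => { components.pop(); } // never rises above "/"
            name => components.push(name),
        }
    }
    Ok(format!("/{}", components.join("/")))
}
```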

4. Static Lifetime Requirements

What Happened

| Project | Issue | Problem |
|---|---|---|
| vfs | #66 | Why does filesystem require 'static? |

Root cause: Design decision that confused users and limited flexibility.

AnyFS Response

  • Avoid 'static bounds unless necessary
  • Our design: Fs: Send (not 'static)
  • Document why bounds exist when needed

5. Symlink Support

What Happened

| Project | Issue | Problem |
|---|---|---|
| vfs | #81 | Symlink support missing entirely |

Root cause: Symlinks are complex and were deferred indefinitely.

AnyFS Response

  • Symlinks supported via FsLink trait - backends that implement FsLink support symlinks
  • Compile-time capability - no FsLink impl = no symlinks (won’t compile)
  • Bound resolution depth (default: 40 hops)
  • strict-path prevents symlink escapes in VRootFsBackend
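The bounded-resolution rule above can be sketched with a toy link table standing in for read_link(); the 40-hop constant comes from the bullet above, everything else here is illustrative:

```rust
use std::collections::HashMap;

const MAX_SYMLINK_HOPS: usize = 40;

// Follow links until a non-link path is reached, or fail after 40 hops.
fn resolve_links(
    links: &HashMap<String, String>, // toy stand-in for the backend's link table
    path: &str,
) -> Result<String, String> {
    let mut current = path.to_string();
    for _ in 0..MAX_SYMLINK_HOPS {
        match links.get(&current) {
            Some(target) => current = target.clone(), // one hop
            None => return Ok(current),               // not a link: done
        }
    }
    Err(format!("too many levels of symbolic links: {path}"))
}
```

Bounding the loop turns a symlink cycle into an error instead of an infinite resolution.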

6. Error Type Ergonomics

What Happened

| Project | Issue | Problem |
|---|---|---|
| vfs | #33 | Error type hard to match programmatically |

Root cause: Error enum wasn’t designed for pattern matching.

AnyFS Response

FsError includes context and is easy to match:

#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum FsError {
    #[error("not found: {path}")]
    NotFound { path: PathBuf },

    #[error("{operation}: already exists: {path}")]
    AlreadyExists { path: PathBuf, operation: &'static str },

    #[error("quota exceeded: limit {limit}, requested {requested}, usage {usage}")]
    QuotaExceeded { limit: u64, requested: u64, usage: u64 },

    #[error("feature not enabled: {feature} ({operation})")]
    FeatureNotEnabled { feature: &'static str, operation: &'static str },

    #[error("permission denied: {path} ({operation})")]
    PermissionDenied { path: PathBuf, operation: &'static str },

    // ...
}
}

7. Seek + Write Operations

What Happened

| Project | Issue | Problem |
|---|---|---|
| vfs | #35 | Missing file positioning features |

Root cause: Initial API was too simple.

AnyFS Response

  • Streaming I/O: open_read/open_write return Box<dyn Read/Write + Send>
  • Seek support varies by backend - document which support it
  • Consider future: open_read_seek variant or capability query

8. Read-Only Filesystem Request

What Happened

| Project | Issue | Problem |
|---|---|---|
| vfs | #58 | Request for immutable filesystem |

Root cause: No built-in way to enforce read-only access.

AnyFS Response

Already solved: ReadOnly<B> middleware blocks all writes.

#![allow(unused)]
fn main() {
use anyfs::{ReadOnly, FileStorage};
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate

let readonly_fs = FileStorage::new(
    ReadOnly::new(SqliteBackend::open("archive.db")?)
);
// All write operations return FsError::ReadOnly
}

This validates our middleware approach.


9. Performance Issues

What Happened

| Project | Issue | Problem |
|---|---|---|
| agentfs | #130 | File deletion is slow |
| agentfs | #135 | Benchmark hangs |

Root cause: SQLite operations not optimized, FUSE overhead.

AnyFS Response

  • Batch operations where possible in SqliteBackend
  • Use transactions for multi-file operations
  • Document performance characteristics per backend
  • Keep mounting optional - core AnyFS stays a library; mount concerns are behind feature flags (fuse, winfsp)

Documentation requirement:

#![allow(unused)]
fn main() {
/// # Performance Characteristics
///
/// | Operation | Complexity | Notes |
/// |-----------|------------|-------|
/// | `read` | O(1) | Single DB query |
/// | `write` | O(n) | n = data size |
/// | `remove_dir_all` | O(n) | n = descendants |
pub struct SqliteBackend { /* ... */ }
}

10. Signal Handling / Shutdown

What Happened

| Project | Issue | Problem |
|---|---|---|
| agentfs | #129 | Doesn’t shutdown on SIGTERM |

Root cause: FUSE mount cleanup issues.

AnyFS Response

  • Core stays a library - daemon/mount shutdown concerns are behind feature flags
  • Ensure Drop implementations clean up properly
  • SqliteBackend flushes on drop

#![allow(unused)]
fn main() {
impl Drop for SqliteBackend {
    fn drop(&mut self) {
        if let Err(e) = self.sync() {
            eprintln!("Warning: failed to sync on drop: {}", e);
        }
    }
}
}

11. Platform Compatibility

What Happened

| Project | Issue | Problem |
|---|---|---|
| agentfs | #132 | FUSE-T support (macOS) |
| agentfs | #138 | virtio-fs support |

Root cause: Platform-specific FUSE variants.

AnyFS Response

  • We isolate this - core traits stay pure; FUSE lives behind feature flags (fuse, winfsp) in the anyfs crate
  • Cross-platform by design - Memory and SQLite work everywhere
  • VRootFsBackend uses strict-path which handles Windows/Unix

CI requirement:

strategy:
  matrix:
    os: [ubuntu-latest, windows-latest, macos-latest]

12. Multiple Sessions / Concurrent Access

What Happened

| Project | Issue | Problem |
|---|---|---|
| agentfs | #126 | Can’t have multiple sessions on same filesystem |

Root cause: Locking/concurrency design.

AnyFS Response

  • SqliteBackend uses WAL mode for concurrent readers
  • Document concurrency model per backend
  • MemoryBackend uses Arc<RwLock<...>> for sharing

Issues We Already Avoid

Our design decisions already prevent these problems:

| Problem in Others | AnyFS Solution |
|---|---|
| No middleware pattern | Tower-style composable middleware |
| No quota enforcement | Quota&lt;B&gt; middleware |
| No read-only mode | ReadOnly&lt;B&gt; middleware |
| Symlink complexity | FsLink trait (compile-time) |
| Path escape via symlinks | strict-path canonicalization |
| FUSE complexity | Isolated behind feature flags |
| SQLite-only | Multiple backends |
| Monolithic features | Composable middleware |

References

Open Questions & Future Considerations

Status: Resolved (future considerations tracked)
Last Updated: 2025-12-28


This document captures previously open questions and design considerations. Unless explicitly marked as future, the items below are resolved.

Note: Final decisions live in the Architecture Decision Records.


Symlink Security

Status: Resolved

Decision

  • Symlink support is a backend capability (via FsLink).
  • FileStorage resolves paths via pluggable PathResolver for non-SelfResolving backends.
  • The default IterativeResolver follows symlinks when FsLink is available. Custom resolvers can implement different behaviors.
  • SelfResolving backends delegate to the OS. strict-path prevents escapes.

Implications

  • If you need symlink-free semantics, use a backend that does not implement FsLink. With a backend that does implement FsLink, ensure the data contains no preexisting symlinks: the trait only provides the creation capability, and you can simply never call those methods.
  • Restrictions middleware controls permission-related operations only, not symlink creation (which is a trait-level capability).

Why

  • Virtual backends have no host filesystem to escape to; symlink resolution stays inside the virtual structure.
  • OS-backed backends cannot reliably disable symlink following without TOCTOU risks or platform-specific hacks.

Virtual vs Real Backends: Path Resolution

Status: Resolved (see also ADR-033 for PathResolver)

Question: Should path resolution logic be different for virtual backends (memory, SQLite) vs filesystem-based backends (StdFsBackend, VRootFsBackend)?

Resolution: FileStorage handles path resolution via pluggable PathResolver trait for non-SelfResolving backends. SelfResolving backends delegate to the OS, so FileStorage does not pre-resolve paths for them.

| Backend Type | Path Resolution | Symlink Handling |
|---|---|---|
| MemoryBackend | PathResolver (default: IterativeResolver) | Resolver follows symlinks |
| SqliteBackend | PathResolver (default: IterativeResolver) | Resolver follows symlinks |
| VRootFsBackend | OS (implements SelfResolving) | OS follows symlinks (strict-path prevents escapes) |

Key design decisions:

  1. Backends that wrap a real filesystem implement the SelfResolving marker trait to tell FileStorage to skip resolution:
#![allow(unused)]
fn main() {
impl SelfResolving for VRootFsBackend {}
}
  2. Path resolution is pluggable via PathResolver trait (ADR-033). Built-in resolvers include:
    • IterativeResolver - default symlink-aware resolution (when backend implements FsLink)
    • NoOpResolver - for SelfResolving backends
    • CachingResolver - LRU cache wrapper around another resolver
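A hypothetical sketch of that resolver interface: the names (PathResolver, NoOpResolver, CachingResolver) come from ADR-033 as described above, but the exact signature, error handling, and cache policy here are assumptions:

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::path::{Path, PathBuf};

// Assumed shape of the pluggable resolver interface.
trait PathResolver {
    fn resolve(&self, path: &Path) -> PathBuf;
}

// Pass-through used for SelfResolving backends: the OS resolves instead.
struct NoOpResolver;
impl PathResolver for NoOpResolver {
    fn resolve(&self, path: &Path) -> PathBuf {
        path.to_path_buf()
    }
}

// Memoizing wrapper around another resolver; a real CachingResolver
// would bound its size (LRU) and invalidate on writes.
struct CachingResolver<R> {
    inner: R,
    cache: RefCell<HashMap<PathBuf, PathBuf>>,
}

impl<R: PathResolver> PathResolver for CachingResolver<R> {
    fn resolve(&self, path: &Path) -> PathBuf {
        if let Some(hit) = self.cache.borrow().get(path) {
            return hit.clone(); // cache hit: skip the inner resolver
        }
        let resolved = self.inner.resolve(path);
        self.cache.borrow_mut().insert(path.to_path_buf(), resolved.clone());
        resolved
    }
}
```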

Compression and Encryption

Question: Does the current design allow backends to compress/decompress or encrypt/decrypt files transparently?

Answer: Yes. The backend receives the data and stores it however it wants. A backend could:

  • Compress data before writing to SQLite
  • Encrypt blobs with a user-provided key
  • Use a remote object store with encryption at rest

This is an implementation detail of the backend, not visible to the FileStorage API.
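A sketch of that pattern: the store applies a reversible transform on write and undoes it on read, so callers only ever see plaintext. The XOR "cipher" here is a toy stand-in for real compression or encryption (e.g. a deflate codec or an AEAD):

```rust
use std::collections::HashMap;

// Toy reversible transform; swap in real compression/encryption here.
fn transform(data: &[u8], key: u8) -> Vec<u8> {
    data.iter().map(|&b| b ^ key).collect()
}

struct TransformingStore {
    key: u8,
    blobs: HashMap<String, Vec<u8>>, // stored bytes are never plaintext
}

impl TransformingStore {
    fn write(&mut self, path: &str, data: &[u8]) {
        // Transform before storing; the API surface is unchanged.
        self.blobs.insert(path.to_string(), transform(data, self.key));
    }
    fn read(&self, path: &str) -> Option<Vec<u8>> {
        // Undo the transform so the caller sees the original bytes.
        self.blobs.get(path).map(|stored| transform(stored, self.key))
    }
}
```

Because the transform lives entirely inside the backend, middleware and FileStorage compose over it unchanged.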


Hooks and Callbacks

Question: Should AnyFS support hooks or callbacks for file operations (e.g., audit logging, validation)?

Considerations:

  • AgentFS (see comparison below) provides audit logging as a core feature
  • Hooks add complexity but enable powerful use cases
  • Could be implemented as a middleware pattern around FileStorage

Resolution: Implemented via Tracing middleware. Users can also wrap FileStorage or backends for custom hooks.


AgentFS Comparison

Note: There are two projects named “AgentFS”:

| Project | Description |
|---|---|
| tursodatabase/agentfs | Full AI agent runtime (Turso/libSQL) |
| cryptopatrick/agentfs | Related to AgentDB abstraction layer |

This section focuses on Turso’s AgentFS, which has a published spec.

What AgentFS Provides

AgentFS is an agent runtime, not just a filesystem. It provides three integrated subsystems:

  1. Virtual Filesystem - POSIX-like, inode-based, chunked storage in SQLite
  2. Key-Value Store - Agent state and context storage
  3. Tool Call Audit Trail - Records all tool invocations for debugging/compliance

AnyFS vs AgentFS: Different Abstractions

| Concern | AnyFS | AgentFS |
|---|---|---|
| Scope | Filesystem abstraction | Agent runtime |
| Filesystem | Full | Full |
| Key-Value store | Not our domain | Included |
| Tool auditing | Tracing middleware | Built-in |
| Backends | Memory, SQLite, VRootFs, custom | SQLite only (spec) |
| Middleware | Composable layers | Monolithic |

Relationship Options

AnyFS could be used BY AgentFS:

  • AgentFS could implement its filesystem portion using Fs trait
  • Our middleware (Quota, PathFilter, etc.) would work with their system

AgentFS-compatible backend for AnyFS:

  • Someone could implement Fs using AgentFS’s SQLite schema
  • Would enable interop with AgentFS tooling

What we should NOT do:

  • Add KV store to Fs (different abstraction, scope creep)
  • Add tool call auditing to core trait (that’s what Tracing middleware is for)

When to Use Which

| Use Case | Recommendation |
|---|---|
| Need just filesystem operations | AnyFS |
| Need composable middleware (quota, sandboxing) | AnyFS |
| Need full agent runtime (FS + KV + auditing) | AgentFS |
| Need multiple backend types (memory, real FS) | AnyFS |
| Need AgentFS-compatible SQLite format | AgentFS or custom AnyFS backend |

Takeaway

AnyFS and AgentFS solve different problems at different layers:

  • AnyFS = filesystem abstraction with composable middleware
  • AgentFS = complete agent runtime with integrated storage

They can complement each other rather than compete.


VFS Crate Comparison

The vfs crate provides virtual filesystem abstractions with:

  • PhysicalFS: Host filesystem access
  • MemoryFS: In-memory storage
  • AltrootFS: Rooted filesystem (similar to our VRootFsBackend)
  • OverlayFS: Layered filesystem
  • EmbeddedFS: Compile resources into binary

Similarities with AnyFS:

  • Trait-based abstraction over storage
  • Memory and physical filesystem backends

Differences:

  • VFS doesn’t have SQLite backend
  • VFS doesn’t have policy/quota layer
  • AnyFS focuses on isolation and limits

Why not use VFS? VFS is a good library, but AnyFS’s design goals differ:

  1. We want SQLite as a first-class backend
  2. We need quota/limit enforcement
  3. We want feature whitelisting (least privilege)

FUSE Mount Support

Status: Designed - Part of anyfs crate (feature flags: fuse, winfsp)

What is FUSE? FUSE (Filesystem in Userspace) allows implementing filesystems in userspace rather than kernel code. It enables:

  • Mounting any backend as a real filesystem
  • Using standard Unix tools (ls, cat, etc.) on AnyFS containers
  • Integration with existing workflows

Resolution: Part of anyfs crate with feature flags:

  • Linux: FUSE (native) - fuse feature
  • macOS: macFUSE - fuse feature
  • Windows: WinFsp - winfsp feature

See Cross-Platform Mounting for full details.


Type-System Protection for Cross-Container Operations

Status: Resolved - User-defined wrapper types

Question: Should we use the type system to prevent accidentally mixing data between containers?

Resolution: Users who need type-safe domain separation can create wrapper types:

#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend;  // Ecosystem crate

// User-defined wrapper types provide compile-time safety
struct SandboxFs(FileStorage<MemoryBackend>);
struct UserDataFs(FileStorage<SqliteBackend>);

let sandbox = SandboxFs(FileStorage::new(MemoryBackend::new()));
let userdata = UserDataFs(FileStorage::new(SqliteBackend::open("data.db")?));

fn process_sandbox(fs: &SandboxFs) { /* only accepts SandboxFs */ }

process_sandbox(&sandbox);   // OK
process_sandbox(&userdata);  // Compile error - different type!
}

This approach avoids generic parameter complexity while still enabling compile-time safety when needed. See FileStorage for details.


Naming Considerations

Based on review feedback, the following naming concerns were raised:

| Current Name | Concern | Alternatives Considered |
|---|---|---|
| anyfs-traits | “traits” is vague | anyfs-backend (adopted) |
| anyfs-container | Could imply Docker | Merged into anyfs (adopted) |
| anyfs | Sounds like Hebrew “ani efes” (I am zero) | anyfs retained for simplicity |

Decision: Renamed anyfs-traits to anyfs-backend. Merged anyfs-container into anyfs.


POSIX Behavior

Question: How POSIX-compatible should AnyFS be?

Answer: AnyFS is not a POSIX emulator. We use std::fs-like naming and semantics for familiarity, but we don’t aim for full POSIX compliance. Specific differences:

  • Symlink-aware path resolution (FileStorage walks the virtual structure using metadata() and read_link())
  • No file descriptors or open file handles in the basic API
  • Simplified permissions model
  • No device files, FIFOs, or sockets

Async Support

Question: Should Fs traits be async?

Decision: Sync-first, async-ready (see ADR-010).

Rationale:

  • Built-in backends are naturally synchronous (std::fs, memory)
  • Ecosystem backends are also sync (e.g., rusqlite is sync)
  • No runtime dependency (tokio/async-std) required
  • Rust 1.75+ has native async traits, so adding later is low-cost

Async-ready design:

  • Traits require Send - compatible with async executors
  • Return types are Result<T, FsError> - works with async
  • No hidden blocking state
  • Methods are stateless per-call

Future path: When needed (e.g., S3/network backends), add parallel AsyncFs trait:

  • Separate trait, not replacing Fs
  • Blanket impl possible via spawn_blocking
  • No breaking changes to existing sync API

Summary

| Topic | Decision |
|---|---|
| Symlink security | Backend-defined (FsLink); VRootFsBackend uses strict-path for containment |
| Path resolution | FileStorage (symlink-aware); VRootFs = OS via SelfResolving |
| Compression/encryption | Backend responsibility |
| Hooks/callbacks | Tracing middleware |
| FUSE mount | Part of anyfs crate (fuse, winfsp feature flags) |
| Type-system protection | User-defined wrapper types (e.g., struct SandboxFs(FileStorage&lt;B&gt;)) |
| POSIX compatibility | Not a goal |
| truncate | Added to FsWrite |
| sync / fsync | Added to FsSync |
| Async support | Sync-first, async-ready (ADR-010) |
| Layer trait | Tower-style composition (ADR-011) |
| Logging | Tracing with tracing ecosystem (ADR-012) |
| Extension methods | FsExt (ADR-013) |
| Zero-copy bytes | Optional bytes feature (ADR-014) |
| Error context | Contextual FsError (ADR-015) |
| BackendStack builder | Fluent API via .layer() |
| Path-based access control | PathFilter middleware (ADR-016) |
| Read-only mode | ReadOnly middleware (ADR-017) |
| Rate limiting | RateLimit middleware (ADR-018) |
| Dry-run testing | DryRun middleware (ADR-019) |
| Read caching | Cache middleware (ADR-020) |
| Union filesystem | Overlay middleware (ADR-021) |

Design Review: Rust Community Alignment

This document critically reviews AnyFS design decisions against Rust community expectations and best practices. The goal is to identify potential friction points before implementation.


Summary

| Category | Issues Found | Status |
|---|---|---|
| Critical (must fix) | 2 | ✅ Fixed |
| Should fix | 4 | 🟡 In progress |
| Document clearly | 3 | 🟢 Ongoing |
| Non-issues | 5 | ✅ Verified |

✅ Critical Issues (Fixed)

1. FsError Missing #[non_exhaustive] — FIXED

Problem: Our FsError enum doesn’t have #[non_exhaustive]. This is a semver hazard.

Status: ✅ Fixed in design-overview.md. FsError now has #[non_exhaustive], thiserror::Error derive, and From<std::io::Error> impl.

#![allow(unused)]
fn main() {
// Current (problematic)
pub enum FsError {
    NotFound { path: PathBuf },
    AlreadyExists { path: PathBuf, operation: &'static str },
    // ...
}

// If we add a variant in 1.1:
pub enum FsError {
    NotFound { path: PathBuf },
    AlreadyExists { path: PathBuf, operation: &'static str },
    TooManySymlinks { path: PathBuf },  // NEW - breaks exhaustive matches!
}
}

Impact: Users with exhaustive matches will get compile errors on minor version bumps.

Fix:

#![allow(unused)]
fn main() {
#[non_exhaustive]
#[derive(Debug, thiserror::Error)]
pub enum FsError {
    #[error("not found: {path}")]
    NotFound { path: PathBuf },
    // ...
}
}

Also needed:

  • impl std::error::Error for FsError
  • impl From<std::io::Error> for FsError
  • Consider #[non_exhaustive] on struct variants too
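A sketch of that From impl, using simplified variants: the real FsError carries more context, and since an io::Error has no path to attach, a catch-all Io variant and empty-path placeholders are assumptions here:

```rust
use std::io;
use std::path::PathBuf;

#[non_exhaustive]
#[derive(Debug)]
enum FsError {
    NotFound { path: PathBuf },
    PermissionDenied { path: PathBuf },
    Io(io::Error), // assumed catch-all; io::Error carries no path context
}

impl From<io::Error> for FsError {
    fn from(e: io::Error) -> Self {
        match e.kind() {
            // io::Error has no path, so these start empty; callers can enrich.
            io::ErrorKind::NotFound => FsError::NotFound { path: PathBuf::new() },
            io::ErrorKind::PermissionDenied => {
                FsError::PermissionDenied { path: PathBuf::new() }
            }
            _ => FsError::Io(e),
        }
    }
}
```

With this impl in place, backend code can use `?` directly on std::fs calls.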

2. Documentation Shows &mut self Despite ADR-023 — FIXED

Problem: Several code examples still show &mut self or &mut impl Fs.

Status: ✅ Fixed. All examples in design-overview.md and files-container.md now use &self.

#![allow(unused)]
fn main() {
// In design-overview.md line 346:
fn with_symlinks(fs: &mut (impl Fs + FsLink)) {  // WRONG
    fs.write("/target.txt", b"content")?;
    fs.symlink("/target.txt", "/link.txt")?;
}

// Should be:
fn with_symlinks(fs: &(impl Fs + FsLink)) {  // Correct
    fs.write("/target.txt", b"content")?;
    fs.symlink("/target.txt", "/link.txt")?;
}
}

Impact: Contradicts ADR-023 (interior mutability). Confuses implementers.

Fix: Audit all examples and ensure &self everywhere.


🟡 Should Fix

3. Sync-Only Design May Limit Adoption

Problem: No async support. Many modern Rust projects are async-first.

Current stance (ADR-024): Sync now, async later via parallel traits.

Community expectation: Projects like tokio, async-std are dominant. Users may expect:

#![allow(unused)]
fn main() {
async fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
}

Mitigation:

  1. Document clearly: “Sync-first by design, async planned”
  2. Ensure Send + Sync bounds enable spawn_blocking wrapper
  3. Consider shipping anyfs-async adapter crate early

Recommendation: Acceptable. Async support is a future consideration.


4. Interior Mutability May Surprise Users

Problem: &self for write operations is unusual in Rust.

#![allow(unused)]
fn main() {
// Our design
pub trait FsWrite: Send + Sync {
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
}

// What users might expect (std::io::Write pattern)
pub trait FsWrite {
    fn write(&mut self, path: &Path, data: &[u8]) -> Result<(), FsError>;
}
}

Why we chose this (ADR-023):

  • Filesystems are shared resources
  • Enables concurrent access
  • Matches how std::fs::write() works (takes path, not mutable handle)

Potential friction:

  • Users may try to use &mut self patterns
  • May conflict with borrowck mental models

Mitigation:

  1. Document prominently with rationale
  2. Show examples of concurrent usage
  3. Explain: “Like std::fs, not like std::io::Write”

Recommendation: Keep the design, but add prominent documentation.
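A minimal sketch of the concurrent-usage story: interior mutability inside the backend lets many threads share one instance through &self. The names here are illustrative, not the real MemoryBackend:

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

// Toy backend: &self writes via an internal RwLock, like MemoryBackend.
struct MemFs {
    files: RwLock<HashMap<String, Vec<u8>>>,
}

impl MemFs {
    fn new() -> Self {
        MemFs { files: RwLock::new(HashMap::new()) }
    }
    fn write(&self, path: &str, data: &[u8]) {
        // The lock is an implementation detail; callers only need &self.
        self.files.write().unwrap().insert(path.to_string(), data.to_vec());
    }
    fn file_count(&self) -> usize {
        self.files.read().unwrap().len()
    }
}

// Several writers share one backend; no &mut anywhere.
fn concurrent_demo() -> usize {
    let fs = Arc::new(MemFs::new());
    let handles: Vec<_> = (0..4)
        .map(|i| {
            let fs = Arc::clone(&fs);
            thread::spawn(move || fs.write(&format!("/file{i}"), b"data"))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    fs.file_count()
}
```

With &mut self this sharing would require an external lock around the whole backend, serializing every operation.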


5. Layer Trait Doesn’t Match Tower Exactly

Problem: Tower’s Layer trait has a different signature:

#![allow(unused)]
fn main() {
// Tower's Layer
pub trait Layer<S> {
    type Service;
    fn layer(&self, inner: S) -> Self::Service;
}

// Our Layer (appears to be)
pub trait Layer<B: Fs> {
    type Backend: Fs;
    fn layer(self, backend: B) -> Self::Backend;
}
}

Differences:

  1. Tower uses &self, we use self (consumes the layer)
  2. Tower calls it Service, we call it Backend
  3. Tower doesn’t require bounds on S

Impact: Users familiar with Tower may be confused.

Options:

  1. Match Tower exactly - maximum familiarity
  2. Keep our design - self consumption is arguably cleaner for our use case
  3. Document differences - explain why we diverge

Recommendation: Document the differences. Our self consumption prevents accidental reuse of configured layers, which is appropriate for our use case.
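A compile-time sketch of why consuming self matters: once a configured layer has been applied, it is moved and cannot be applied again. The trait mirrors the snippet above; the concrete types are illustrative:

```rust
// Our Layer shape: `layer` takes `self`, consuming the configured layer.
trait Layer<B> {
    type Backend;
    fn layer(self, backend: B) -> Self::Backend;
}

struct ReadOnlyLayer; // imagine this carried builder-produced configuration
struct ReadOnly<B>(B);

impl<B> Layer<B> for ReadOnlyLayer {
    type Backend = ReadOnly<B>;
    fn layer(self, backend: B) -> ReadOnly<B> {
        ReadOnly(backend)
    }
}

fn demo() -> &'static str {
    let layer = ReadOnlyLayer;
    let _stack = layer.layer("backend");
    // let _again = layer.layer("other"); // would not compile: `layer` was moved
    "applied once"
}
```

Tower's &self version instead allows one layer to stamp out many services, which suits its per-connection service construction.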


6. No #[must_use] on Results

Problem: Functions returning Result should have #[must_use] to catch ignored errors.

#![allow(unused)]
fn main() {
// Current
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;

// Better
#[must_use]
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
}

Impact: std’s Result is already #[must_use], so a directly discarded Result warns by default; an explicit attribute on each method documents intent and makes the expectation visible in the API docs.

Fix: Add #[must_use] to all Result-returning methods, or use #[must_use] on the trait itself.


🟢 Document Clearly

7. Path Semantics Are Virtual, Not OS

Consideration: Our paths are virtual filesystem paths, not OS paths.

#![allow(unused)]
fn main() {
// On Windows, this works the same as on Unix:
fs.write("/documents/file.txt", data)?;  // Forward slashes always
}

Potential confusion:

  • Windows users might expect backslashes
  • Path normalization rules may differ from OS

Mitigation: Document:

  • “Paths are virtual, always use forward slashes”
  • “Path resolution is platform-independent”
  • Show examples on Windows

8. Fs as Marker Trait Pattern

Pattern:

#![allow(unused)]
fn main() {
pub trait Fs: FsRead + FsWrite + FsDir {}
impl<T: FsRead + FsWrite + FsDir> Fs for T {}
}

This is valid Rust but may surprise some users. They might expect:

#![allow(unused)]
fn main() {
pub trait Fs {
    fn read(...);
    fn write(...);
    // etc
}
}

Why we do it:

  • Granular traits for partial implementations
  • Middleware only needs to implement what it wraps

Mitigation: Document the pattern clearly with examples.
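A worked example helps here. The sketch below (with illustrative, simplified method signatures, not the real trait surface) shows the payoff of the blanket impl: a backend implements only the granular traits, yet generic code can bound on the full `Fs` contract without the backend opting in.

```rust
// Sketch of the marker-trait pattern with a blanket impl.
use std::collections::HashMap;

trait FsRead  { fn read(&self, path: &str) -> Vec<u8>; }
trait FsWrite { fn write(&mut self, path: &str, data: &[u8]); }
trait FsDir   { fn create_dir(&mut self, path: &str); }

trait Fs: FsRead + FsWrite + FsDir {}
impl<T: FsRead + FsWrite + FsDir> Fs for T {}

struct Memory { files: HashMap<String, Vec<u8>> }

impl FsRead for Memory {
    fn read(&self, path: &str) -> Vec<u8> { self.files[path].clone() }
}
impl FsWrite for Memory {
    fn write(&mut self, path: &str, data: &[u8]) {
        self.files.insert(path.to_string(), data.to_vec());
    }
}
impl FsDir for Memory {
    fn create_dir(&mut self, _path: &str) {}
}

// Generic code bounds on the full contract; Memory gets Fs for free.
fn roundtrip<B: Fs>(fs: &mut B) -> Vec<u8> {
    fs.write("/f", b"hello");
    fs.read("/f")
}

fn main() {
    let mut m = Memory { files: HashMap::new() };
    assert_eq!(roundtrip(&mut m), b"hello");
}
```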


9. Builder Pattern Requires Configuration

Pattern:

#![allow(unused)]
fn main() {
// This won't compile - no build() on unconfigured builder
let quota = QuotaLayer::builder().build();  // Error!

// Must configure at least one limit
let quota = QuotaLayer::builder()
    .max_total_size(1_000_000)
    .build();  // OK
}

This is intentional (ADR-022) but may surprise users expecting defaults.

Mitigation: Clear error messages and documentation.
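One way such a builder can be enforced at compile time is the typestate pattern: `build()` only exists once at least one limit has been set. This is a hypothetical sketch of how the ADR-022 behavior could be implemented; the real `QuotaLayer` builder may use a different mechanism.

```rust
// Typestate sketch: build() is only available on a configured builder.
use std::marker::PhantomData;

struct Unconfigured;
struct Configured;

struct QuotaBuilder<State> {
    max_total_size: Option<u64>,
    _state: PhantomData<State>,
}

struct QuotaLayer { max_total_size: Option<u64> }

impl QuotaBuilder<Unconfigured> {
    fn new() -> Self {
        QuotaBuilder { max_total_size: None, _state: PhantomData }
    }
}

impl<S> QuotaBuilder<S> {
    // Setting any limit transitions the builder to the Configured state.
    fn max_total_size(self, bytes: u64) -> QuotaBuilder<Configured> {
        QuotaBuilder { max_total_size: Some(bytes), _state: PhantomData }
    }
}

// build() exists only for Configured builders.
impl QuotaBuilder<Configured> {
    fn build(self) -> QuotaLayer {
        QuotaLayer { max_total_size: self.max_total_size }
    }
}

fn main() {
    // QuotaBuilder::new().build(); // error: no method `build` found
    let quota = QuotaBuilder::new().max_total_size(1_000_000).build();
    assert_eq!(quota.max_total_size, Some(1_000_000));
}
```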


✅ Non-Issues (We’re Doing It Right)

10. Object-Safe Path Parameters

✅ Core traits use &Path; ergonomics come from FileStorage/FsExt.
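For readers unfamiliar with this split, a minimal sketch (illustrative names and signatures, not the real API): the core trait takes a concrete `&Path` so it stays object-safe, while the wrapper accepts anything `AsRef<Path>`.

```rust
// Sketch: object-safe core trait + ergonomic generic wrapper.
use std::path::Path;

trait FsRead {
    fn read(&self, path: &Path) -> Vec<u8>; // concrete &Path keeps this dyn-safe
}

struct FileStorage<B>(B);

impl<B: FsRead> FileStorage<B> {
    // Generic convenience lives on the wrapper, not on the trait.
    fn read(&self, path: impl AsRef<Path>) -> Vec<u8> {
        self.0.read(path.as_ref())
    }
}

struct Stub;
impl FsRead for Stub {
    fn read(&self, _path: &Path) -> Vec<u8> { b"ok".to_vec() }
}

fn main() {
    let _dyn_ok: &dyn FsRead = &Stub; // the core trait remains object-safe
    let fs = FileStorage(Stub);
    assert_eq!(fs.read("/data/file.txt"), b"ok"); // &str accepted ergonomically
}
```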

11. Send + Sync Requirements

✅ Standard for thread-safe abstractions. Enables use across async boundaries.

12. Feature-Gated Backends

✅ Standard Cargo pattern. Reduces compile time for unused backends.

13. Strategic Boxing (ADR-025)

✅ Matches Tower/Axum approach. Well-documented rationale.

14. Generic Middleware Composition

✅ Zero-cost abstractions. Idiomatic Rust.


Action Items

Before MVP

| Priority | Issue | Action |
|----------|-------|--------|
| 🔴 Critical | FsError non_exhaustive | Add #[non_exhaustive] and thiserror derive |
| 🔴 Critical | &mut in examples | Audit all examples for &self consistency |
| 🟡 Should | #[must_use] | Add to all Result-returning methods |
| 🟢 Document | Interior mutability | Add prominent section explaining why |
| 🟢 Document | Path semantics | Add section on virtual paths |
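For the first critical item, a std-only sketch of the intended shape (the variants are illustrative, not the final error set; a thiserror derive would generate the Display and Error impls written out by hand here):

```rust
// Sketch: #[non_exhaustive] FsError with hand-written impls standing in
// for what `#[derive(thiserror::Error)]` would generate.
use std::fmt;

#[derive(Debug)]
#[non_exhaustive]
pub enum FsError {
    NotFound(String),
    QuotaExceeded,
}

impl fmt::Display for FsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FsError::NotFound(p) => write!(f, "path not found: {p}"),
            FsError::QuotaExceeded => write!(f, "quota exceeded"),
        }
    }
}

impl std::error::Error for FsError {}

fn main() {
    let e = FsError::NotFound("/data/file.txt".into());
    assert_eq!(e.to_string(), "path not found: /data/file.txt");
    // #[non_exhaustive] forces downstream crates to keep a wildcard arm,
    // so adding a variant later is not a semver-breaking change:
    match e {
        FsError::NotFound(_) => {}
        _ => {}
    }
}
```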

Should Fix

| Priority | Issue | Action |
|----------|-------|--------|
| 🟡 Should | Async support | Ship anyfs-async or document workaround |
| 🟡 Should | Layer trait docs | Document differences from Tower |
| 🟢 Document | Marker trait pattern | Explain Fs = FsRead + FsWrite + FsDir |

Comparison to Axum’s Success Factors

| Factor | Axum | AnyFS | Assessment |
|--------|------|-------|------------|
| Tower integration | Native | Inspired by | 🟡 Different but similar |
| Async support | Yes | No (planned) | 🟡 Gap, but documented |
| Error handling | thiserror | Planned | 🔴 Must add |
| Documentation | Excellent | In progress | 🟡 Continue |
| Examples | Comprehensive | In progress | 🟡 Continue |
| Ecosystem fit | tokio native | std::fs native | ✅ Different target |

Conclusion

Overall assessment: The design is sound and follows Rust best practices. The main gaps are:

  1. Critical: #[non_exhaustive] on FsError (semver hazard)
  2. Critical: Inconsistent &mut in examples (contradicts ADR-023)
  3. Important: No async yet (but documented path forward)
  4. Minor: Documentation gaps (being addressed)

With these fixes, the design should be well-received by the Rust community.

License

This project is dual-licensed to allow for open collaboration on the design while ensuring the resulting code examples can be freely used in software implementations.


Documentation (Text and Media)

The text, diagrams, and other media in this design manual are licensed under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

You are free to:

  • Share — copy and redistribute the material in any medium or format
  • Adapt — remix, transform, and build upon the material for any purpose, even commercially

Under the following terms:

  • Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made
  • ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original

Full CC BY-SA 4.0 Legal Text


Code Samples

Any code snippets, examples, or software implementation details contained within this manual are dual-licensed under your choice of:

  • MIT License
  • Apache License, Version 2.0

This is the same licensing model used by the Rust ecosystem.


Summary

| Content Type | License |
|--------------|---------|
| Documentation text | CC BY-SA 4.0 |
| Diagrams and media | CC BY-SA 4.0 |
| Code snippets | MIT OR Apache-2.0 |
| Example implementations | MIT OR Apache-2.0 |

Attribution

When attributing this work, please use:

AnyFS Design Manual by David Krasnitsky, licensed under CC BY-SA 4.0 (documentation) and MIT/Apache-2.0 (code samples). https://github.com/DK26/anyfs-design-manual