Backends Guide

This guide explains each backend available for AnyFS, both built-in (in anyfs) and ecosystem crates (anyfs-sqlite, anyfs-indexed): how each works internally, when to use it, and the trade-offs involved.


Quick Reference: Which Backend Should You Use?

TL;DR — Pick the first match from top to bottom:

| Your Situation | Best Choice | Why |
|---|---|---|
| Writing tests | MemoryBackend | Fast, isolated, no cleanup |
| Running in WASM/browser | MemoryBackend | Simplest option |
| Need encrypted single-file storage | anyfs-sqlite: SqliteBackend | AES-256 via encryption feature (ecosystem crate) |
| Need portable single-file database | anyfs-sqlite: SqliteBackend | Cross-platform, ACID (ecosystem crate) |
| Large files (>100MB) with path isolation | anyfs-indexed: IndexedBackend | Virtual paths + native disk I/O (ecosystem crate) |
| Containing untrusted code to a directory | VRootFsBackend | Prevents path traversal attacks |
| Working with real files in trusted environment | StdFsBackend | Direct OS operations |
| Need layered filesystem (container-like) | Overlay (middleware) | Base + writable upper layer |

⚠️ Security Warning: StdFsBackend provides NO isolation. Never use with untrusted input.

Ecosystem Crates: Complex backends like SqliteBackend and IndexedBackend live in separate crates (anyfs-sqlite, anyfs-indexed) because they carry substantial internal runtime machinery (connection pooling, sharding, chunking).


Backend Categories

AnyFS backends fall into two fundamental categories based on who resolves paths:

| Category | Path Resolution | Symlink Handling | Isolation |
|---|---|---|---|
| Type 1: Virtual Filesystem | PathResolver (pluggable) | Simulated by AnyFS | Complete |
| Type 2: Real Filesystem | Operating System | Delegated to OS | Partial/None |

Type 1: Virtual Filesystem Backends

These backends store filesystem data in an abstract format (memory, database, etc.). AnyFS handles path resolution via pluggable PathResolver (see ADR-033), including:

  • Path traversal (.., .)
  • Symlink following (simulated)
  • Hard link tracking (simulated)
  • Path normalization

Key benefit: Complete isolation from the host OS. Identical behavior across all platforms.
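
The resolution steps listed above can be sketched in a few lines. This is a minimal illustration, not the AnyFS PathResolver API: the function name `resolve`, the symlink table passed as a `HashMap`, and the hop limit of 40 are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical sketch of virtual path resolution (not the AnyFS `PathResolver` API).
/// `symlinks` maps an absolute virtual path to its target, absolute or relative.
pub fn resolve(path: &str, symlinks: &HashMap<String, String>) -> Result<String, String> {
    const MAX_HOPS: usize = 40; // arbitrary guard against symlink loops
    let mut hops = 0;
    // Components still to process, reversed so that pop() yields the next one.
    let mut pending: Vec<String> = path.split('/').rev().map(String::from).collect();
    let mut resolved: Vec<String> = Vec::new();

    while let Some(part) = pending.pop() {
        match part.as_str() {
            "" | "." => {} // empty and "." components are no-ops
            ".." => {
                resolved.pop(); // ".." at the root stays at the root
            }
            name => {
                resolved.push(name.to_string());
                let current = format!("/{}", resolved.join("/"));
                if let Some(target) = symlinks.get(&current) {
                    hops += 1;
                    if hops > MAX_HOPS {
                        return Err(format!("too many symlink hops resolving {path}"));
                    }
                    if target.starts_with('/') {
                        resolved.clear(); // absolute target restarts from the root
                    } else {
                        resolved.pop(); // relative target replaces the link name
                    }
                    // Queue the target's components ahead of the remaining ones.
                    pending.extend(target.split('/').rev().map(String::from));
                }
            }
        }
    }
    Ok(format!("/{}", resolved.join("/")))
}
```

Because nothing here consults the host OS, the same input always yields the same output on every platform, which is the property the Type 1 category relies on.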

Type 2: Real Filesystem Backends

These backends delegate operations to the actual operating system. The OS handles path resolution, which means:

  • Native symlink behavior
  • Native permission enforcement
  • Platform-specific edge cases
  • Potential security considerations (path escapes)

Key benefit: Native performance and compatibility with existing files.


Type 1: Virtual Filesystem Backends

MemoryBackend

An in-memory filesystem. All data lives in RAM and is lost when the process exits.

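use anyfs::{MemoryBackend, FileStorage};

// Box<dyn Error> assumes the AnyFS error type implements std::error::Error.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Everything lives in RAM; nothing touches the host filesystem.
    let fs = FileStorage::new(MemoryBackend::new());
    fs.write("/data.txt", b"Hello, World!")?;
    Ok(())
}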

How It Works

  • Files and directories stored in a tree structure (HashMap or similar)
  • Symlinks stored as data pointing to target paths
  • Hard links share the same underlying data node
  • All operations are memory-only (no disk I/O)
  • Supports snapshots via Clone and persistence via save_to()/load_from()
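
To make the bullets above concrete, here is a minimal sketch of the idea, assuming an inode table keyed by integer IDs. The type `MiniMemFs` and its methods are hypothetical and do not reflect the real MemoryBackend layout, which this guide does not specify; a flat path map stands in for the tree.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
enum Entry {
    File(u64),       // inode id; hard links are several paths with the same id
    Symlink(String), // a symlink is plain data: the target path
}

/// Hypothetical in-memory filesystem sketch (not the real MemoryBackend).
#[derive(Clone, Default)]
pub struct MiniMemFs {
    entries: HashMap<String, Entry>, // path -> entry
    inodes: HashMap<u64, Vec<u8>>,   // inode id -> file content
    next_inode: u64,
}

impl MiniMemFs {
    /// Write through the path's inode, allocating a new inode if needed.
    pub fn write(&mut self, path: &str, data: &[u8]) {
        let id = match self.entries.get(path) {
            Some(Entry::File(id)) => *id,
            _ => {
                let id = self.next_inode;
                self.next_inode += 1;
                self.entries.insert(path.to_string(), Entry::File(id));
                id
            }
        };
        self.inodes.insert(id, data.to_vec());
    }

    /// Point a second path at the same inode; returns false if `existing` is not a file.
    pub fn hard_link(&mut self, existing: &str, new_path: &str) -> bool {
        match self.entries.get(existing) {
            Some(Entry::File(id)) => {
                let id = *id;
                self.entries.insert(new_path.to_string(), Entry::File(id));
                true
            }
            _ => false,
        }
    }

    pub fn symlink(&mut self, path: &str, target: &str) {
        self.entries.insert(path.to_string(), Entry::Symlink(target.to_string()));
    }

    pub fn read_link(&self, path: &str) -> Option<&str> {
        match self.entries.get(path)? {
            Entry::Symlink(target) => Some(target.as_str()),
            Entry::File(_) => None,
        }
    }

    /// Read file content; this sketch does not follow symlinks.
    pub fn read(&self, path: &str) -> Option<&[u8]> {
        match self.entries.get(path)? {
            Entry::File(id) => self.inodes.get(id).map(Vec::as_slice),
            Entry::Symlink(_) => None,
        }
    }
}
```

Since every field is an owned collection, the derived `Clone` produces an independent deep copy, which is what makes a clone usable as a snapshot: later writes to the original do not show up in the copy.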

Performance

| Operation | Speed | Notes |
|---|---|---|
| Read/Write | Very Fast | No I/O, pure memory operations |
| Path Resolution | Very Fast | In-memory tree traversal |
| Large Files | ⚠️ Memory-bound | Limited by available RAM |

Advantages

  • Fastest backend - no disk I/O overhead
  • Deterministic - perfect for testing
  • Portable - works on all platforms including WASM
  • Snapshots - Clone creates instant backups
  • No cleanup - no temp files to delete

Disadvantages

  • Volatile - data lost on process exit (unless serialized)
  • Memory-limited - large filesystems consume RAM
  • No automatic persistence - state must be saved and loaded explicitly with save_to()/load_from()

When to Use

| Use Case | Recommendation |
|---|---|
| Unit tests | Ideal - fast, isolated, deterministic |
| Integration tests | Ideal - no filesystem pollution |
| Temporary workspaces | Good - fast scratch space |
| Build caches | Good - if fits in memory |
| WASM/Browser | Ideal - simplest option |
| Large file storage | Avoid - use anyfs-sqlite or disk |
| Persistent data | Avoid - unless you handle serialization |

✅ USE MemoryBackend when:

  • Writing unit tests (fast, isolated, deterministic)
  • Writing integration tests (no filesystem pollution)
  • Building temporary workspaces or scratch space
  • Caching data that fits in memory
  • Running in WASM/browser environments (simplest option)
  • Need instant snapshots via Clone

❌ DON’T USE MemoryBackend when:

  • Storing files larger than available RAM
  • Data must survive process restart (use anyfs-sqlite)
  • Working with existing files on disk (use VRootFsBackend)

SqliteBackend (Ecosystem Crate)

Crate: anyfs-sqlite

Complex backends live in separate crates. See AGENTS.md “Crate Ecosystem” section.

Stores the entire filesystem in a single SQLite database file.

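use anyfs::FileStorage;
use anyfs_sqlite::SqliteBackend;

// Box<dyn Error> assumes the AnyFS error type implements std::error::Error.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // The whole virtual filesystem is stored inside myfs.db.
    let fs = FileStorage::new(SqliteBackend::open("myfs.db")?);
    fs.write("/documents/report.txt", b"Annual Report")?;
    Ok(())
}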

How It Works

  • Single .db file contains all files, directories, and metadata
  • Schema: nodes table (path, type, content, permissions, timestamps)
  • Symlinks stored as rows with target path in content
  • Hard links share the same inode (row ID)
  • Uses WAL mode for concurrent read access
  • Connection pooling: multiple readers, single writer with batching
  • Write batching: groups operations into transactions for efficiency
  • Transactions ensure atomic operations

Key insight: “Writes are expensive.” SqliteBackend batches writes internally because one transaction per batch is far more efficient than one transaction per operation.
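
The batching idea can be sketched independently of SQLite. In the example below, `WriteOp`, `WriteBatcher`, and the `commit` callback are hypothetical names, not anyfs-sqlite internals; the callback stands in for "run all of these operations inside one transaction".

```rust
/// Hypothetical queued operation (the real anyfs-sqlite types are not shown in this guide).
#[derive(Debug, Clone, PartialEq)]
pub enum WriteOp {
    Write { path: String, data: Vec<u8> },
    Remove { path: String },
}

/// Buffers operations and hands them to `commit` in groups, so the cost of a
/// transaction is paid once per batch instead of once per operation.
pub struct WriteBatcher<F: FnMut(&[WriteOp])> {
    pending: Vec<WriteOp>,
    max_batch: usize,
    commit: F, // stands in for "BEGIN; apply every op; COMMIT"
}

impl<F: FnMut(&[WriteOp])> WriteBatcher<F> {
    pub fn new(max_batch: usize, commit: F) -> Self {
        Self { pending: Vec::new(), max_batch: max_batch.max(1), commit }
    }

    /// Queue an operation; flush automatically once the batch is full.
    pub fn push(&mut self, op: WriteOp) {
        self.pending.push(op);
        if self.pending.len() >= self.max_batch {
            self.flush();
        }
    }

    /// Commit whatever is queued as a single batch.
    pub fn flush(&mut self) {
        if self.pending.is_empty() {
            return;
        }
        (self.commit)(&self.pending);
        self.pending.clear();
    }
}
```

With a batch size of 3, seven queued writes followed by a final `flush()` reach the callback as three groups rather than seven separate commits. A production batcher would also need a time-based flush and a way to report commit errors back to callers; both are omitted here.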

Performance

| Operation | Speed | Notes |
|---|---|---|
| Read/Write | 🐢 Slower | SQLite query overhead |
| Path Resolution | 🐢 Slower | Database lookups per component |
| Transactions | Atomic | ACID guarantees |
| Large Files | 🟡 Varies | See note below |

Large file behavior: SQLite streams BLOB content incrementally via sqlite3_blob_read/write, so files don’t need to fit entirely in RAM. However, very large BLOBs (>100MB) can cause higher memory pressure during I/O operations due to SQLite’s internal buffering and page management. For frequent large file operations, consider IndexedBackend which uses native file I/O.

Performance note: SQLite performance varies significantly based on hardware, configuration, and workload. With proper tuning (WAL mode, connection pooling, write batching), a single SQLite database on modern hardware can achieve high throughput. See sqlite-operations.md for tuning guidance.

Advantages

  • Single-file portability - entire filesystem in one .db file
  • ACID transactions - atomic operations, crash recovery
  • Cross-platform - works on all platforms including WASM
  • Complete isolation - no interaction with host filesystem
  • Queryable - can inspect with SQLite tools
  • Optional encryption - AES-256 via SQLCipher with encryption feature

Disadvantages

  • Slower than memory - database overhead on every operation
  • Single-writer - SQLite’s write lock limits concurrency
  • Large file overhead - very large BLOBs (>100MB) have higher memory pressure due to SQLite buffering

When to Use

| Use Case | Recommendation |
|---|---|
| Portable storage | Ideal - single file, works everywhere |
| Embedded databases | Ideal - self-contained |
| Sandboxed environments | Good - complete isolation |
| Encrypted storage | Good - use open_encrypted() with feature |
| Archive/backup | Good - atomic, portable |
| Large media files | Works - higher memory pressure during I/O |
| High-throughput I/O | ⚠️ Tradeoff - database overhead vs MemoryBackend |
| External tool access | Avoid - files not on real filesystem |

✅ USE SqliteBackend when:

  • Need portable, single-file storage (easy to copy, backup, share)
  • Building embedded/self-contained applications
  • Complete isolation from host filesystem is required
  • Want encryption (use open_encrypted() with encryption feature)
  • Need ACID transactions and crash recovery
  • Cross-platform consistency is critical

❌ DON’T USE SqliteBackend when:

  • Files must be accessible to external tools (use VRootFsBackend)
  • Minimizing memory pressure for very large files is critical (use anyfs-indexed: IndexedBackend)

💡 SqliteBackend vs IndexedBackend: Both provide complete path isolation. Choose SqliteBackend when you want the whole filesystem in one portable file. Choose IndexedBackend (anyfs-indexed) for very large files (>100MB) that need native disk streaming performance.


IndexedBackend (Ecosystem Crate)

Crate: anyfs-indexed

Complex backends live in separate crates. See AGENTS.md “Crate Ecosystem” section.

A hybrid backend: virtual paths with disk-based content storage. Paths, directories, symlinks, and metadata are stored in an index database. File content is stored on the real filesystem as opaque blobs.

Key insight: Same isolation model as SqliteBackend, but file content stored externally for native I/O performance with large files.

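use anyfs::FileStorage;
use anyfs_indexed::IndexedBackend;

// Box<dyn Error> assumes the AnyFS error type implements std::error::Error.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Hypothetical input file; any byte slice works as content.
    let pdf_bytes = std::fs::read("report.pdf")?;

    // Files stored in ./storage/, index in ./storage/index.db
    let fs = FileStorage::new(IndexedBackend::open("./storage")?);
    fs.write("/documents/report.pdf", &pdf_bytes)?;
    // Actually stored as: ./storage/a1b2c3d4-5678-...-1704067200.bin
    Ok(())
}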

How It Works

Virtual Path                    Real Storage
─────────────────────────────────────────────────────
/documents/report.pdf    →    ./storage/blobs/a1b2c3d4-...-1704067200.bin
/images/photo.jpg        →    ./storage/blobs/b2c3d4e5-...-1704067201.bin
/config.json             →    ./storage/blobs/c3d4e5f6-...-1704067202.bin

index.db contains:
┌─────────────────────────┬──────────────────────────────┬──────────┐
│ virtual_path            │ blob_name                    │ metadata │
├─────────────────────────┼──────────────────────────────┼──────────┤
│ /documents/report.pdf   │ a1b2c3d4-...-1704067200.bin  │ {...}    │
│ /images/photo.jpg       │ b2c3d4e5-...-1704067201.bin  │ {...}    │
└─────────────────────────┴──────────────────────────────┴──────────┘

  • Virtual filesystem, real content: Directory structure, paths, symlinks, and metadata are virtual (stored in index.db). Only raw file content lives on disk as opaque blobs.
  • Files stored with UUID + timestamp names (flat, meaningless filenames)
  • index.db SQLite database maps virtual paths to blob names
  • Symlinks and hard links are simulated in the index (not OS symlinks)
  • Path resolution handled by AnyFS framework (Type 1 backend)
  • File content streamed directly from disk (native I/O performance)
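
The mapping described above can be sketched with an in-memory map in place of index.db. `BlobIndex` and its methods are hypothetical, and a counter stands in for the UUID part of the blob name, since generating real UUIDs would need an external crate.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Hypothetical stand-in for index.db: virtual path -> opaque blob name.
pub struct BlobIndex {
    blob_dir: PathBuf,
    by_path: HashMap<String, String>,
    counter: u64,
}

impl BlobIndex {
    pub fn new(blob_dir: impl Into<PathBuf>) -> Self {
        Self { blob_dir: blob_dir.into(), by_path: HashMap::new(), counter: 0 }
    }

    /// Assign a blob to a virtual path. The virtual path never becomes part of
    /// the real file name, so "../" sequences cannot steer where content lands.
    pub fn allocate(&mut self, virtual_path: &str, unix_secs: u64) -> PathBuf {
        self.counter += 1;
        // Placeholder for "UUID + timestamp": a counter replaces the UUID.
        let name = format!("{:016x}-{}.bin", self.counter, unix_secs);
        self.by_path.insert(virtual_path.to_string(), name.clone());
        self.blob_dir.join(name)
    }

    /// Real location of the content for a virtual path, if it exists.
    pub fn locate(&self, virtual_path: &str) -> Option<PathBuf> {
        self.by_path.get(virtual_path).map(|name| self.blob_dir.join(name))
    }

    /// Renaming only rewrites the index entry; the blob on disk is untouched.
    pub fn rename(&mut self, from: &str, to: &str) -> bool {
        match self.by_path.remove(from) {
            Some(name) => {
                self.by_path.insert(to.to_string(), name);
                true
            }
            None => false,
        }
    }
}
```

This also illustrates why metadata operations stay cheap: a rename changes one index entry and moves no file content.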

Performance

| Operation | Speed | Notes |
|---|---|---|
| Read/Write | 🟢 Fast | Native disk I/O for content |
| Path Resolution | 🟡 Moderate | Index lookup + disk access |
| Large Files | Excellent | Streamed directly from disk |
| Metadata Ops | 🟢 Fast | Index-only, no disk I/O |

Index Optimization

The SQLite index benefits from the same performance tuning as SqliteBackend:

| Setting | Default | Purpose |
|---|---|---|
| journal_mode | WAL | Concurrent reads during metadata updates |
| synchronous | FULL | Index integrity on power loss (safe default) |
| cache_size | 16MB | Smaller cache for metadata-only index |
| busy_timeout | 5000ms | Gracefully handle lock contention |
| auto_vacuum | INCREMENTAL | Reclaim space from deleted entries |

Connection pooling: 4-8 reader connections for concurrent index queries; single writer for metadata updates. Blob I/O bypasses SQLite entirely, so the bottleneck is typically blob disk throughput, not index performance.
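
As a rough illustration, the defaults in the table translate into the following SQLite PRAGMA statements. The helper name `index_pragmas` is hypothetical, and how anyfs-indexed actually applies its settings is not shown in this guide; the PRAGMA names and value formats are standard SQLite.

```rust
/// Build the PRAGMA batch for the index database (sketch; the function name is hypothetical).
pub fn index_pragmas(cache_mib: u32, busy_timeout_ms: u32) -> String {
    // SQLite reads a negative cache_size as a size in KiB rather than a page count.
    let cache_kib = i64::from(cache_mib) * 1024;
    [
        // auto_vacuum only takes effect on a new database (or after VACUUM), so it goes first.
        "PRAGMA auto_vacuum = INCREMENTAL;".to_string(),
        "PRAGMA journal_mode = WAL;".to_string(),
        "PRAGMA synchronous = FULL;".to_string(),
        format!("PRAGMA cache_size = -{cache_kib};"),
        format!("PRAGMA busy_timeout = {busy_timeout_ms};"),
    ]
    .join("\n")
}
```

Calling `index_pragmas(16, 5000)` yields the table's defaults as one string, which can be handed to whatever batch-execute call the SQLite binding in use provides.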

See anyfs-indexed#9 for detailed performance guidance.

Advantages

  • Native file I/O - content stored as raw files, fast streaming
  • Large file support - uses OS file I/O, avoids SQLite BLOB buffering overhead
  • Complete path isolation - virtual paths, same as SqliteBackend
  • Inspectable - can see blob files on disk (though with opaque names)
  • Cross-platform - works identically on all platforms

Disadvantages

  • Index dependency - losing index.db = losing virtual structure (blobs become orphaned)
  • Two-component backup - must copy directory + index.db together
  • Content exposure - blob files are readable on disk (paths are hidden, content is not)
  • Not single-file portable - unlike SqliteBackend

When to Use

| Use Case | Recommendation |
|---|---|
| Large file storage | Ideal - native I/O performance |
| Media libraries | Ideal - stream large videos/images |
| Document management | Good - virtual paths + fast I/O |
| Sandboxed + large files | Ideal - virtual paths, real I/O |
| Single-file portability | Avoid - use anyfs-sqlite: SqliteBackend |
| Content confidentiality | ⚠️ Wrap - use Encryption middleware for protection |
| WASM/Browser | Avoid - requires real filesystem |

✅ USE IndexedBackend when:

  • Storing large files (videos, images, documents >100MB)
  • Need native I/O performance for streaming content
  • Building media libraries or document management systems
  • Want virtual path isolation but with real disk performance
  • Files are large but path structure should be sandboxed

❌ DON’T USE IndexedBackend when:

  • Need single-file portability (use anyfs-sqlite: SqliteBackend)
  • Content must be hidden from host filesystem (use anyfs-sqlite: SqliteBackend with encryption)
  • Need WASM/browser support (use MemoryBackend)

🔒 Encryption Tip: If you need large file performance but content confidentiality matters, you can implement an Encryption<B> middleware wrapper to encrypt blob contents at rest. This is a user-defined middleware pattern (not built-in) - see the middleware implementation guide for how to create custom middleware. Alternatively, use SqliteBackend with SQLCipher encryption for simpler encrypted storage.


Type 2: Real Filesystem Backends

StdFsBackend

Direct delegation to std::fs. Every call maps 1:1 to the standard library.

use anyfs::{StdFsBackend, FileStorage, NoOpResolver};

// Box<dyn Error> assumes the AnyFS error type implements std::error::Error.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // SelfResolving backends require explicit NoOpResolver
    let fs = FileStorage::with_resolver(StdFsBackend::new(), NoOpResolver);
    fs.write("/tmp/data.txt", b"Hello")?; // Actually writes to /tmp/data.txt
    Ok(())
}

How It Works

  • Every method directly calls the equivalent std::fs function
  • Paths passed through unchanged
  • OS handles all resolution, symlinks, permissions
  • Implements SelfResolving marker (use NoOpResolver to skip virtual resolution)

Performance

| Operation | Speed | Notes |
|---|---|---|
| Read/Write | 🟢 Normal | Native OS speed |
| Path Resolution | Fast | OS kernel handles it |
| Symlinks | Native | OS behavior |

Advantages

  • Zero overhead - direct OS calls
  • Full compatibility - works with all existing files
  • Native features - OS permissions, ACLs, xattrs
  • Middleware-ready - add Quota, Tracing, etc. to real filesystem

Disadvantages

  • No isolation - full filesystem access
  • No containment - paths can escape anywhere
  • Platform differences - Windows vs Unix behavior
  • Security risk - must trust path inputs

When to Use

| Use Case | Recommendation |
|---|---|
| Adding middleware to real FS | Ideal - wrap with Quota, Tracing |
| Trusted environments | Good - when isolation not needed |
| Migration path | Good - gradually add AnyFS features |
| Full host FS features | Good - ACLs, xattrs, etc. |
| Untrusted input | Never - use VRootFsBackend |
| Sandboxing | Never - no containment whatsoever |
| Multi-tenant systems | Avoid - use virtual backends |

✅ USE StdFsBackend when:

  • Adding middleware (Quota, Tracing, etc.) to real filesystem operations
  • Operating in a fully trusted environment with controlled inputs
  • Migrating existing code to AnyFS incrementally
  • Need full access to host filesystem features (ACLs, xattrs)
  • Building tools that work with user’s actual files

❌ DON’T USE StdFsBackend when:

  • Handling untrusted path inputs (use VRootFsBackend)
  • Any form of sandboxing is required (no containment!)
  • Building multi-tenant systems (use virtual backends)
  • Security isolation matters at all

⚠️ Security Warning: StdFsBackend provides ZERO isolation. Paths like ../../etc/passwd will work. Only use with fully trusted, controlled inputs.


VRootFsBackend

Sets a directory as a virtual root. All operations are contained within it.

Feature: vrootfs

use anyfs::{VRootFsBackend, FileStorage, NoOpResolver};

// Box<dyn Error> assumes the AnyFS error type implements std::error::Error.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // /home/user/sandbox becomes the virtual "/"
    // SelfResolving backends require explicit NoOpResolver
    let fs = FileStorage::with_resolver(
        VRootFsBackend::new("/home/user/sandbox")?,
        NoOpResolver,
    );

    fs.write("/data.txt", b"Hello")?;
    // Actually writes to: /home/user/sandbox/data.txt

    fs.read("/../../../etc/passwd")?;
    // Resolves to: /home/user/sandbox/etc/passwd (clamped!)
    Ok(())
}

How It Works

  • Configured with a real directory as the “virtual root”
  • All paths are validated and clamped to stay within root
  • Uses strict-path crate for escape prevention
  • Symlinks are followed but targets validated
  • Implements SelfResolving marker (OS handles resolution after validation)

Virtual Path          Validation              Real Path
───────────────────────────────────────────────────────────────
/data.txt        →   validate & join    →   /home/user/sandbox/data.txt
/../../../etc    →   clamp to root      →   /home/user/sandbox/etc
/link → /tmp     →   validate target    →   ERROR or clamped
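
The "clamp to root" row can be illustrated with a purely lexical sketch. The function `clamp_to_root` is hypothetical and is not the strict-path crate's API; it also ignores symlinks entirely, so it covers only the first two rows of the table above, not target validation or TOCTOU concerns.

```rust
use std::ffi::OsStr;
use std::path::{Component, Path, PathBuf};

/// Lexically clamp a virtual path under `root` (sketch only; symlinks are not inspected).
pub fn clamp_to_root(root: &Path, virtual_path: &str) -> PathBuf {
    let mut stack: Vec<&OsStr> = Vec::new();
    for component in Path::new(virtual_path).components() {
        match component {
            Component::Normal(name) => stack.push(name),
            Component::ParentDir => {
                stack.pop(); // popping an empty stack is a no-op: cannot climb above the root
            }
            // The leading "/", "." and any Windows prefix carry no information here.
            Component::RootDir | Component::CurDir | Component::Prefix(_) => {}
        }
    }
    let mut real = root.to_path_buf();
    real.extend(stack);
    real
}
```

Under this scheme `/../../../etc/passwd` maps to `<root>/etc/passwd`, matching the clamped result shown in the example earlier in this section. A real implementation additionally has to check where symlinks inside the root point before following them.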

Performance

| Operation | Speed | Notes |
|---|---|---|
| Read/Write | 🟡 Moderate | Validation overhead |
| Path Resolution | 🐢 Slower | Extra I/O for symlink checks |
| Symlink Following | 🐢 Slower | Must validate each hop |

Advantages

  • Path containment - cannot escape virtual root
  • Real file access - native OS performance for content
  • Symlink safety - targets validated against root
  • Drop-in sandboxing - wrap existing directories

Disadvantages

  • Performance overhead - validation on every operation
  • Extra I/O - symlink following requires lstat calls
  • Platform quirks - symlink behavior varies (especially Windows)
  • Theoretical edge cases - TOCTOU races exist but are difficult to exploit

When to Use

| Use Case | Recommendation |
|---|---|
| User uploads directory | Ideal - contain user content |
| Plugin sandboxing | Good - limit plugin file access |
| Chroot-like isolation | Good - without actual chroot |
| AI agent workspaces | Good - bound agent to directory |
| Real FS + path containment | Ideal - native I/O with boundaries |
| Maximum security | ⚠️ Careful - theoretical TOCTOU exists |
| Cross-platform symlinks | ⚠️ Careful - Windows behavior differs |
| Complete host isolation | Avoid - use SqliteBackend instead |

✅ USE VRootFsBackend when:

  • Containing user-uploaded content to a specific directory
  • Sandboxing plugins, extensions, or untrusted code
  • Need chroot-like isolation without actual chroot privileges
  • Building AI agent workspaces with filesystem boundaries
  • Want real filesystem performance with path containment

❌ DON’T USE VRootFsBackend when:

  • Maximum security required (theoretical TOCTOU edge cases exist - use MemoryBackend)
  • Need highest I/O performance (validation adds overhead)
  • Cross-platform symlink consistency is critical (Windows differs)
  • Want complete isolation from host (use SqliteBackend)

🔒 Encryption Tip: For sensitive data in sandboxed directories (user uploads, plugin workspaces, AI agent data), consider implementing an Encryption<B> middleware wrapper. This is a user-defined middleware pattern - you would create a custom Layer that encrypts data before delegating to the inner backend. See the middleware implementation guide for the pattern.


Composition Middleware

Overlay<Base, Upper>

Union filesystem middleware combining a read-only base with a writable upper layer.

Note: Overlay is middleware (in anyfs/middleware/overlay.rs), not a standalone backend. It composes two backends into a layered view.

use anyfs::{VRootFsBackend, MemoryBackend, Overlay, FileStorage};

// Box<dyn Error> assumes the AnyFS error type implements std::error::Error.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Base: read-only template
    let base = VRootFsBackend::new("/var/templates")?;

    // Upper: writable scratch layer
    let upper = MemoryBackend::new();

    let fs = FileStorage::new(Overlay::new(base, upper));

    // Read: checks upper first, falls back to base
    let data = fs.read("/config.txt")?;

    // Write: always goes to upper
    fs.write("/config.txt", b"modified")?;

    // Delete: creates "whiteout" in upper, shadows base
    fs.remove_file("/unwanted.txt")?;
    Ok(())
}

How It Works

┌─────────────────────────────────────────────────┐
│                  Overlay<B, U>                  │
├─────────────────────────────────────────────────┤
│  Read:   upper.exists(path)?                    │
│            → upper.read(path)                   │
│            : base.read(path)                    │
│                                                 │
│  Write:  upper.write(path, data)                │
│          (base unchanged)                       │
│                                                 │
│  Delete: upper.mark_whiteout(path)              │
│          (shadows base, doesn't delete it)      │
│                                                 │
│  List:   merge(base.read_dir(), upper.read_dir())│
│          - exclude whiteouts                    │
└─────────────────────────────────────────────────┘

         ┌─────────────────┐
         │      Upper      │  ← Writes go here
         │ (MemoryBackend) │  ← Modifications stored here
         │                 │  ← Whiteouts (deletions) here
         └────────┬────────┘
                  │ if not found
                  ▼
         ┌─────────────────┐
         │      Base       │  ← Read-only layer
         │ (SqliteBackend) │  ← Original/template data
         │                 │  ← Never modified
         └─────────────────┘

  • Reads: Check upper layer first, fall back to base
  • Writes: Always go to upper layer (base is read-only)
  • Deletes: Create “whiteout” marker in upper (shadows base file)
  • Directory listing: Merge both layers, exclude whiteouts
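
The four rules above fit in a small sketch that uses plain maps as the two layers. `MiniOverlay` is a hypothetical illustration, not the Overlay middleware itself: it handles flat file paths only and keeps whiteouts in a separate set rather than as markers inside the upper backend.

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

/// Hypothetical file-level overlay over two in-memory layers.
#[derive(Default)]
pub struct MiniOverlay {
    base: HashMap<String, Vec<u8>>,  // read-only layer, never modified
    upper: HashMap<String, Vec<u8>>, // writable layer
    whiteouts: HashSet<String>,      // base paths hidden from the merged view
}

impl MiniOverlay {
    pub fn with_base(base: HashMap<String, Vec<u8>>) -> Self {
        Self { base, ..Self::default() }
    }

    /// Reads: upper first, then base, unless the path has been whited out.
    pub fn read(&self, path: &str) -> Option<&[u8]> {
        if self.whiteouts.contains(path) {
            return None;
        }
        self.upper.get(path).or_else(|| self.base.get(path)).map(Vec::as_slice)
    }

    /// Writes: always land in the upper layer.
    pub fn write(&mut self, path: &str, data: &[u8]) {
        self.whiteouts.remove(path); // re-creating a deleted file clears its whiteout
        self.upper.insert(path.to_string(), data.to_vec());
    }

    /// Deletes: drop any upper copy and shadow the base copy with a whiteout.
    pub fn remove_file(&mut self, path: &str) -> bool {
        if self.read(path).is_none() {
            return false;
        }
        self.upper.remove(path);
        if self.base.contains_key(path) {
            self.whiteouts.insert(path.to_string());
        }
        true
    }

    /// Listing: merge both layers, de-duplicate, and exclude whiteouts.
    pub fn list(&self) -> Vec<&str> {
        let merged: BTreeSet<&str> = self
            .base
            .keys()
            .chain(self.upper.keys())
            .map(String::as_str)
            .filter(|p| !self.whiteouts.contains(*p))
            .collect();
        merged.into_iter().collect()
    }
}
```

Discarding `upper` and `whiteouts` restores the original view, which is the rollback property listed under Advantages. The listing has to visit both layers on every call, which is where the merge overhead noted under Performance comes from.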

Performance

| Operation | Speed | Notes |
|---|---|---|
| Read (upper hit) | Fast | Single layer lookup |
| Read (base fallback) | 🟡 Moderate | Two-layer lookup |
| Write | Depends on upper | Upper layer speed |
| Directory listing | 🐢 Slower | Must merge both layers |

Advantages

  • Copy-on-write semantics - modifications don’t affect base
  • Instant rollback - discard upper layer to reset
  • Space efficient - only changes stored in upper
  • Template pattern - share base across multiple instances
  • Testing isolation - test against real data without modifying it

Disadvantages

  • Complexity - whiteout handling, merge logic
  • Directory listing overhead - must combine and filter
  • Two backends to manage - lifecycle of both layers
  • Not true CoW - doesn’t deduplicate at block level

When to Use

| Use Case | Recommendation |
|---|---|
| Container images | Ideal - base image + writable layer |
| Template filesystems | Ideal - shared base, per-user upper |
| Testing with real data | Ideal - modify without consequences |
| Rollback capability | Good - discard upper to reset |
| Git-like branching | Good - branch = new upper layer |
| Simple use cases | Overkill - use single backend |
| Block-level CoW | Avoid - Overlay is file-level |
| Dir listing perf | Avoid - merge overhead on listings |

✅ USE Overlay when:

  • Building container-like systems (base image + writable layer)
  • Sharing a template filesystem across multiple instances
  • Testing against production data without modifying it
  • Need instant rollback capability (discard upper layer)
  • Implementing git-like branching at filesystem level

❌ DON’T USE Overlay when:

  • Simple, single-purpose filesystem (unnecessary complexity)
  • Need block-level copy-on-write (Overlay is file-level)
  • Directory listing performance is critical (merge overhead)
  • Don’t need layered semantics (use single backend)

Backend Selection Guide

Quick Decision Tree

Do you need persistence?
├─ No → MemoryBackend
└─ Yes
   ├─ Single portable file? → SqliteBackend
   ├─ Large files + path isolation? → IndexedBackend
   └─ Access existing files on disk?
      ├─ Need containment? → VRootFsBackend  
      └─ Trusted environment? → StdFsBackend
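
The decision tree can also be written as a small function, which makes the order of the checks explicit. The `Needs` struct, the `Backend` enum, and `choose_backend` are hypothetical names for this illustration and are not part of AnyFS.

```rust
/// Backends named in the decision tree (hypothetical enum for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Memory,
    Sqlite,
    Indexed,
    VRootFs,
    StdFs,
}

/// Answers to the questions in the tree, top to bottom.
#[derive(Debug, Clone, Copy, Default)]
pub struct Needs {
    pub persistence: bool,
    pub single_portable_file: bool,
    pub large_files_with_isolation: bool,
    pub containment: bool,
}

/// Walk the tree in order; the first matching branch wins.
pub fn choose_backend(needs: Needs) -> Backend {
    if !needs.persistence {
        return Backend::Memory;
    }
    if needs.single_portable_file {
        return Backend::Sqlite;
    }
    if needs.large_files_with_isolation {
        return Backend::Indexed;
    }
    // Remaining branch: existing files on disk.
    if needs.containment {
        Backend::VRootFs
    } else {
        Backend::StdFs // only appropriate in a trusted environment
    }
}
```

Because the checks run in order, a caller that wants both a single portable file and large-file isolation gets SqliteBackend, mirroring how the tree is read from top to bottom.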

Comparison Matrix

| Backend | Speed | Isolation | Persistence | Large Files | WASM |
|---|---|---|---|---|---|
| MemoryBackend | ⚡ Very Fast | ✅ Complete | ❌ None | ⚠️ RAM-limited | ✅ |
| SqliteBackend | 🐢 Slower | ✅ Complete | ✅ Single file | ✅ Supported | ✅ |
| IndexedBackend | 🟢 Fast | ✅ Complete | ✅ Directory | ✅ Native I/O | ❌ |
| StdFsBackend | 🟢 Normal | ❌ None | ✅ Native | ✅ Native | ❌ |
| VRootFsBackend | 🟡 Moderate | ✅ Strong | ✅ Native | ✅ Native | ❌ |
| Overlay† | Varies | Varies | Varies | Varies | Varies |

†Overlay is middleware that composes two backends; characteristics depend on the backends used.

By Use Case

| Use Case | Recommended |
|---|---|
| Unit testing | MemoryBackend |
| Integration testing | MemoryBackend or SqliteBackend |
| Portable application data | SqliteBackend |
| Encrypted storage | SqliteBackend (with encryption feature) |
| Large file + isolation | IndexedBackend |
| Media libraries | IndexedBackend |
| Plugin/agent sandboxing | VRootFsBackend |
| Adding middleware to real FS | StdFsBackend |
| Container-like isolation | Overlay<SqliteBackend, MemoryBackend> |
| Template with modifications | Overlay<Base, Upper> |
| WASM/Browser | MemoryBackend or SqliteBackend |

Platform Compatibility

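| Backend | Windows | Linux | macOS | WASM |
|---|---|---|---|---|
| MemoryBackend | ✅ | ✅ | ✅ | ✅ |
| SqliteBackend | ✅ | ✅ | ✅ | ✅* |
| IndexedBackend | ✅ | ✅ | ✅ | ❌ |
| StdFsBackend | ✅ | ✅ | ✅ | ❌ |
| VRootFsBackend | ✅** | ✅ | ✅ | ❌ |
| Overlay† | Varies | Varies | Varies | Varies |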

* Requires wasm32-compatible SQLite build
** Windows symlinks require elevated privileges or Developer Mode
†Overlay is middleware; platform support depends on the backends composed


Common Mistakes to Avoid

| ❌ Mistake | ✅ Instead |
|---|---|
| Using StdFsBackend with user-provided paths | Use VRootFsBackend - it prevents ../../etc/passwd attacks |
| Using MemoryBackend for data that must survive restart | Use SqliteBackend for persistence, or call save_to() to serialize |
| Expecting identical symlink behavior across platforms with VRootFsBackend | Use MemoryBackend or SqliteBackend for consistent cross-platform symlinks |
| Using Overlay when a simple backend would suffice | Keep it simple - use Overlay only when you need true layered semantics |