AnyFS Ecosystem
An open standard for pluggable virtual filesystem backends in Rust.
Overview
AnyFS is an open standard for virtual filesystem backends using a Tower-style middleware pattern for composable functionality.
You get:
- A familiar std::fs-aligned API
- Composable middleware (limits, logging, security)
- Choice of storage: memory, SQLite, host filesystem, or custom
- A developer-first goal: make storage composition easy, safe, and enjoyable
Architecture
┌─────────────────────────────────────────┐
│ FileStorage<B> │ ← Ergonomic std::fs-aligned API
├─────────────────────────────────────────┤
│ Middleware (composable): │
│ Quota<B> │ ← Quotas
│ Restrictions<B> │ ← Security
│ Tracing<B> │ ← Audit
├─────────────────────────────────────────┤
│ Fs │ ← Storage
│ (Memory, SQLite, VRootFs, custom...) │
└─────────────────────────────────────────┘
Each layer has one job. Compose only what you need.
Two-Crate Structure
| Crate | Purpose |
|---|---|
| anyfs-backend | Minimal contract: Fs trait + types |
| anyfs | Backends + middleware + mounting + ergonomic FileStorage<B> |
Note: Mounting (FsFuse + MountHandle) is part of the anyfs crate behind feature flags (fuse, winfsp), not a separate crate.
Quick Example
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, RestrictionsLayer, FileStorage};
// Layer-based composition
let backend = MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024)
.build())
.layer(RestrictionsLayer::builder()
.deny_permissions()
.build());
let fs = FileStorage::new(backend);
fs.create_dir_all("/data")?;
fs.write("/data/file.txt", b"hello")?;
}
How to Use This Manual
| Section | Audience | Purpose |
|---|---|---|
| Overview | Stakeholders | One-page understanding |
| Getting Started | Developers | Practical examples |
| Design & Architecture | Contributors | Detailed design |
| Traits & APIs | Backend authors | Contract and types |
| Implementation | Implementers | Plan + backend guide |
Future Considerations
These are optional extensions that fit the design but are out of scope for initial release:
- URL-based backend registry and bulk helpers (FsExt/utilities)
- Async adapter for remote backends
- Companion shell for interactive exploration
- Copy-on-write overlay and archive backends (zip/tar)
See Design Overview for the full list and rationale.
Status
| Component | Status |
|---|---|
| Design | Complete |
| Implementation | Not started (mounting roadmap defined) |
Authoritative Documents
- AGENTS.md (for AI assistants)
- src/architecture/design-overview.md
- src/architecture/adrs.md
AnyFS - Executive Summary
One-page overview for stakeholders and decision-makers.
What is it?
AnyFS is an open standard for pluggable virtual filesystem backends in Rust, using a Tower-style middleware pattern for composable functionality.
You get:
- A familiar std::fs-aligned API
- Composable middleware for limits, logging, and security
- Choice of storage: memory, SQLite, host filesystem, or custom
Architecture
┌─────────────────────────────────────────┐
│ FileStorage<B> │ ← Ergonomics (std::fs API)
├─────────────────────────────────────────┤
│ Middleware (composable): │
│ Quota<B> │ ← Quotas
│ Restrictions<B> │ ← Security
│ Tracing<B> │ ← Audit
├─────────────────────────────────────────┤
│ Fs │ ← Storage
└─────────────────────────────────────────┘
Why does it matter?
| Problem | How AnyFS helps |
|---|---|
| Multi-tenant isolation | Separate backend instances per tenant |
| Portability | SQLite backend: tenant data = single .db file |
| Security | Restrictions blocks risky operations when composed |
| Resource control | Quota enforces quotas |
| Audit compliance | Tracing records all operations |
| Custom storage | Implement Fs for any medium |
Key design points
- Two-crate structure
  - anyfs-backend: trait contract
  - anyfs: backends + middleware + ergonomic wrapper
- Middleware pattern (like Axum/Tower)
  - Each middleware has one job
  - Compose only what you need
  - Complete separation of concerns
- std::fs alignment
  - Familiar method names
  - Core traits use &Path; FileStorage/FsExt accept impl AsRef<Path> for ergonomics
- Developer experience first
  - Make storage composition easy, safe, and enjoyable to use
Quick example
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, RestrictionsLayer, FileStorage};
// Layer-based composition
let backend = MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024)
.build())
.layer(RestrictionsLayer::builder()
.deny_permissions()
.build());
let fs = FileStorage::new(backend);
fs.create_dir_all("/documents")?;
fs.write("/documents/hello.txt", b"Hello!")?;
}
Status
| Phase | Status |
|---|---|
| Design | Complete |
| Implementation | Not started (mounting roadmap defined) |
For details, see Design Overview.
AnyFS - Project Structure
Status: Target layout (design spec)
Last updated: 2025-12-24
This manual describes the intended code repository layout; this repository contains documentation only.
Repository Layout
anyfs-backend/ # Crate 1: traits + types (minimal dependencies)
Cargo.toml
src/
lib.rs
traits/
fs_read.rs # FsRead trait
fs_write.rs # FsWrite trait
fs_dir.rs # FsDir trait
fs_link.rs # FsLink trait
fs_permissions.rs # FsPermissions trait
fs_sync.rs # FsSync trait
fs_stats.rs # FsStats trait
fs_path.rs # FsPath trait (canonicalization, blanket impl)
fs_inode.rs # FsInode trait
fs_handles.rs # FsHandles trait
fs_lock.rs # FsLock trait
fs_xattr.rs # FsXattr trait
layer.rs # Layer trait (Tower-style)
ext.rs # FsExt (extension methods)
markers.rs # SelfResolving marker trait
path_resolver.rs # PathResolver trait (pluggable resolution)
types.rs # Metadata, DirEntry, Permissions, StatFs
error.rs # FsError
anyfs/ # Crate 2: framework glue (simple backends + middleware + ergonomics)
Cargo.toml
src/
lib.rs
backends/
memory.rs # MemoryBackend (in-memory, simple)
stdfs.rs # StdFsBackend (thin std::fs wrapper)
vrootfs.rs # VRootFsBackend (std::fs + path containment)
middleware/
quota.rs # Quota<B>
restrictions.rs # Restrictions<B>
path_filter.rs # PathFilter<B>
read_only.rs # ReadOnly<B>
rate_limit.rs # RateLimit<B>
tracing.rs # Tracing<B>
dry_run.rs # DryRun<B>
cache.rs # Cache<B>
overlay.rs # Overlay<B1, B2>
resolvers/
iterative.rs # IterativeResolver (default)
noop.rs # NoOpResolver (for SelfResolving backends)
caching.rs # CachingResolver (LRU cache wrapper)
container.rs # FileStorage<B>
stack.rs # BackendStack builder
Ecosystem Crates (Separate Repositories)
Complex backends with internal runtime requirements:
anyfs-sqlite/ # SQLite backend (connection pooling, WAL, sharding)
anyfs-indexed/ # Hybrid backend (SQLite index + disk blobs)
anyfs-s3/ # Third-party: S3 backend
anyfs-redis/ # Third-party: Redis backend
Testing Crate
anyfs-test/ # Conformance test suite for backend implementers
src/
lib.rs
conformance/ # Test generators for each trait level
fs_tests.rs # Fs-level tests
fs_full_tests.rs # FsFull-level tests
fs_fuse_tests.rs # FsFuse-level tests
prelude.rs # Re-exports for test files
Dependency Model
anyfs-backend (trait + types)
^
|-- anyfs (backends + middleware + ergonomics)
| ^-- vrootfs feature uses strict-path
Key points:
- Custom backends depend only on anyfs-backend
- anyfs provides built-in backends, middleware, mounting (behind feature flags), and the ergonomic FileStorage<B> wrapper
Middleware Pattern
FileStorage<B>
wraps -> Tracing<B>
wraps -> Restrictions<B>
wraps -> Quota<B>
wraps -> MemoryBackend (or any Fs)
Each layer implements Fs, enabling composition.
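The delegation behind this composition can be shown with a self-contained toy sketch. Everything here (TinyFs, TinyMemory, TinyQuota) is invented for illustration; the real Fs trait has many more methods and the real Quota middleware is more sophisticated, but the wrapping shape is the same: the middleware implements the same trait as the backend it wraps, so it can itself be wrapped again.

```rust
// Illustrative only: a toy trait standing in for Fs, plus a middleware
// that adds behavior before delegating to the inner implementor.
trait TinyFs {
    fn write(&mut self, path: &str, data: &[u8]) -> Result<(), String>;
}

// "Backend": stores nothing, just counts bytes written.
struct TinyMemory {
    bytes_written: usize,
}

impl TinyFs for TinyMemory {
    fn write(&mut self, _path: &str, data: &[u8]) -> Result<(), String> {
        self.bytes_written += data.len();
        Ok(())
    }
}

// "Middleware": enforces a quota, then delegates to the inner TinyFs.
// Because it also implements TinyFs, it composes with further layers.
struct TinyQuota<B: TinyFs> {
    inner: B,
    used: usize,
    max: usize,
}

impl<B: TinyFs> TinyFs for TinyQuota<B> {
    fn write(&mut self, path: &str, data: &[u8]) -> Result<(), String> {
        if self.used + data.len() > self.max {
            return Err("quota exceeded".into());
        }
        self.used += data.len();
        self.inner.write(path, data)
    }
}

fn main() {
    let mut fs = TinyQuota { inner: TinyMemory { bytes_written: 0 }, used: 0, max: 8 };
    assert!(fs.write("/a.txt", b"hello").is_ok()); // 5 bytes, within quota
    assert!(fs.write("/b.txt", b"world").is_err()); // would exceed 8 bytes
    println!("inner saw {} bytes", fs.inner.bytes_written);
}
```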
Cargo Features
Backends (anyfs crate)
- memory — In-memory storage (default)
- stdfs — Direct std::fs delegation (no containment)
- vrootfs — Host filesystem backend with path containment (uses strict-path)
Ecosystem Backends (Separate Crates)
Complex backends live in their own crates:
- anyfs-sqlite — SQLite-backed persistent storage (pooling, WAL, sharding, encryption)
- anyfs-indexed — SQLite index + disk blobs for large file performance
Middleware (MVP Scope)
Core middleware is always available (no feature flags needed):
- Quota, PathFilter, Restrictions, ReadOnly, RateLimit, Cache, DryRun, Overlay
Optional middleware with external dependencies:
- tracing — Detailed audit logging (requires the tracing crate)
Mounting (Platform Features)
- fuse — Mount as filesystem on Linux/macOS (requires the fuser crate)
- winfsp — Mount as filesystem on Windows (requires the winfsp crate)
Use default-features = false to cherry-pick exactly what you need.
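As a sketch of cherry-picking (feature names as listed above; adjust the set to your platform and needs):

```toml
[dependencies]
# Only the in-memory backend plus FUSE mounting; skip everything else.
anyfs = { version = "0.1", default-features = false, features = ["memory", "fuse"] }
```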
Middleware (Future Scope)
- metrics — Prometheus integration (requires the prometheus crate)
Where To Start
- Application usage: Getting Started Guide
- Trait details: Layered Traits
- Middleware: Design Overview
- Decisions: Architecture Decision Records
AnyFS — Getting Started Guide
A practical introduction with examples
Installation
Add to your Cargo.toml:
[dependencies]
anyfs = "0.1"
For additional backends:
[dependencies]
anyfs = { version = "0.1", features = ["stdfs", "vrootfs"] }
# SQLite storage (ecosystem crate)
anyfs-sqlite = "0.1"
# or, with SQLCipher encryption:
# anyfs-sqlite = { version = "0.1", features = ["encryption"] }
# Hybrid backend: SQLite metadata + disk blobs (ecosystem crate)
anyfs-indexed = "0.1"
Available anyfs features:
- memory — In-memory storage (default)
- stdfs — Direct std::fs delegation (no containment)
- vrootfs — Host filesystem backend with path containment
- bytes — Zero-copy Bytes support (adds read_bytes() method)
- tracing — Detailed audit logging (requires the tracing crate)
- fuse — Mount as filesystem on Linux/macOS
- winfsp — Mount as filesystem on Windows
Core middleware (Quota, PathFilter, Restrictions, ReadOnly, RateLimit, Cache, DryRun, Overlay) is always available.
Quick Start
Examples below use FileStorage, so you can pass paths as &str. If you call core trait methods directly, use &Path.
Hello World
use anyfs::{MemoryBackend, FileStorage};
fn main() -> Result<(), Box<dyn std::error::Error>> {
let fs = FileStorage::new(MemoryBackend::new());
fs.write("/hello.txt", b"Hello, AnyFS!")?;
let content = fs.read("/hello.txt")?;
println!("{}", String::from_utf8_lossy(&content));
Ok(())
}
With Quotas
use anyfs::{MemoryBackend, QuotaLayer, FileStorage};
fn main() -> Result<(), Box<dyn std::error::Error>> {
let backend = QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024) // 100 MB
.max_file_size(10 * 1024 * 1024) // 10 MB per file
.build()
.layer(MemoryBackend::new());
let fs = FileStorage::new(backend);
fs.create_dir_all("/documents")?;
fs.write("/documents/notes.txt", b"Meeting notes")?;
Ok(())
}
With Restrictions
use anyfs::{MemoryBackend, RestrictionsLayer, FileStorage};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Block permission changes for untrusted code
let backend = RestrictionsLayer::builder()
.deny_permissions() // Block set_permissions() calls
.build()
.layer(MemoryBackend::new());
let fs = FileStorage::new(backend);
// All other operations work normally
fs.write("/file.txt", b"content")?;
Ok(())
}
Full Stack (Layer-based)
use anyfs::{MemoryBackend, QuotaLayer, RestrictionsLayer, TracingLayer, FileStorage};
fn main() -> Result<(), Box<dyn std::error::Error>> {
let backend = MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024)
.max_file_size(10 * 1024 * 1024)
.build())
.layer(RestrictionsLayer::builder()
.deny_permissions()
.build())
.layer(TracingLayer::new());
let fs = FileStorage::new(backend);
fs.create_dir_all("/data")?;
fs.write("/data/file.txt", b"hello")?;
Ok(())
}
Common Operations
Creating Directories
#![allow(unused)]
fn main() {
fs.create_dir("/documents")?; // Single level
fs.create_dir_all("/documents/2024/q1")?; // Recursive
}
Reading and Writing Files
#![allow(unused)]
fn main() {
fs.write("/data.txt", b"line 1\n")?; // Create or overwrite
fs.append("/data.txt", b"line 2\n")?; // Append
let content = fs.read("/data.txt")?; // Read all
let partial = fs.read_range("/data.txt", 0, 6)?; // Read range
let text = fs.read_to_string("/data.txt")?; // Read as String
}
Listing Directories
#![allow(unused)]
fn main() {
for entry in fs.read_dir("/documents")? {
println!("{}: {:?}", entry.name, entry.file_type);
}
}
Checking Existence and Metadata
#![allow(unused)]
fn main() {
if fs.exists("/file.txt")? {
let meta = fs.metadata("/file.txt")?;
println!("Size: {} bytes", meta.size);
}
}
Copying and Moving
#![allow(unused)]
fn main() {
fs.copy("/original.txt", "/copy.txt")?;
fs.rename("/original.txt", "/renamed.txt")?;
}
Deleting
#![allow(unused)]
fn main() {
fs.remove_file("/old-file.txt")?;
fs.remove_dir("/empty-folder")?;
fs.remove_dir_all("/old-folder")?;
}
Middleware
Quota — Resource Limits
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer};
let backend = QuotaLayer::builder()
.max_total_size(500 * 1024 * 1024) // 500 MB total
.max_file_size(50 * 1024 * 1024) // 50 MB per file
.max_node_count(100_000) // 100K files/dirs
.max_dir_entries(5_000) // 5K per directory
.max_path_depth(32) // Max nesting
.build()
.layer(MemoryBackend::new());
// Check usage
let usage = backend.usage();
println!("Using {} bytes", usage.total_size);
// Check remaining
let remaining = backend.remaining();
if !remaining.can_write {
println!("Storage full!");
}
}
Restrictions — Block Permission Changes
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, RestrictionsLayer};
// Restrictions controls permission-related operations.
// Symlink/hard-link capability is determined by trait bounds (FsLink).
let backend = RestrictionsLayer::builder()
.deny_permissions() // Block set_permissions() calls
.build()
.layer(MemoryBackend::new());
// Blocked operations return FsError::FeatureNotEnabled
}
Tracing — Instrumentation
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, TracingLayer};
// TracingLayer uses the global tracing subscriber by default
let backend = MemoryBackend::new().layer(TracingLayer::new());
// Or configure with custom settings
let backend = MemoryBackend::new()
.layer(TracingLayer::new()
.with_target("myapp::fs")
.with_level(tracing::Level::DEBUG));
}
Error Handling
#![allow(unused)]
fn main() {
use anyfs_backend::FsError;
match fs.write("/file.txt", &large_data) {
Ok(()) => println!("Written"),
Err(FsError::NotFound { path, .. }) => println!("Not found: {}", path.display()),
Err(FsError::AlreadyExists { path, .. }) => println!("Exists: {}", path.display()),
Err(FsError::QuotaExceeded { .. }) => println!("Quota exceeded"),
Err(FsError::FeatureNotEnabled { feature }) => println!("Feature disabled: {}", feature),
Err(e) => println!("Error: {}", e),
}
}
Testing
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, FileStorage};
#[test]
fn test_write_and_read() {
let fs = FileStorage::new(MemoryBackend::new());
fs.write("/test.txt", b"test data").unwrap();
let content = fs.read("/test.txt").unwrap();
assert_eq!(content, b"test data");
}
}
With limits:
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, FileStorage};
#[test]
fn test_quota_exceeded() {
let backend = QuotaLayer::builder()
.max_total_size(1024) // 1 KB
.build()
.layer(MemoryBackend::new());
let fs = FileStorage::new(backend);
let big_data = vec![0u8; 2048]; // 2 KB
let result = fs.write("/big.bin", &big_data);
assert!(result.is_err());
}
}
Best Practices
1. Use Appropriate Backend
| Use Case | Backend | Crate |
|---|---|---|
| Testing | MemoryBackend | anyfs |
| Production (portable) | SqliteBackend | anyfs-sqlite |
| Host filesystem (with containment) | VRootFsBackend | anyfs |
| Host filesystem (direct access) | StdFsBackend | anyfs |
2. Compose Middleware for Your Needs
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, VRootFsBackend, QuotaLayer, RestrictionsLayer, FileStorage};
// Minimal: just storage
let fs = FileStorage::new(MemoryBackend::new());
// With limits (layer-based)
let fs = FileStorage::new(
MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024)
.build())
);
// Sandboxed (layer-based)
let temp_dir = tempfile::tempdir()?;
let fs = FileStorage::new(
VRootFsBackend::new(temp_dir.path())?
.layer(QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024)
.build())
.layer(RestrictionsLayer::builder()
.deny_permissions()
.build())
);
}
3. Handle Errors Gracefully
Always check for quota exceeded, feature not enabled, and other errors.
Advanced Use Cases
These use cases require the fuse or winfsp feature flags.
Database-Backed Drive with Live Monitoring
Mount a database-backed filesystem and query it directly for real-time analytics:
┌─────────────────────────────────────────────────────────────┐
│ Database (SQLite, PostgreSQL, etc.) │
├─────────────────────────────────────────────────────────────┤
│ │ │
│ MountHandle │ Stats Dashboard │
│ (write + read) │ (direct DB queries) │
│ │ │ │ │
│ ▼ │ ▼ │
│ /mnt/workspace │ SELECT SUM(size) FROM nodes │
│ $ cp file.txt ./ │ SELECT COUNT(*) FROM nodes │
│ $ mkdir projects/ │ SELECT * FROM audit_log │
│ │ │ │
│ │ ▼ │
│ │ ┌──────────────┐ │
│ │ │ Live Graphs │ │
│ │ │ - Disk usage │ │
│ │ │ - File count │ │
│ │ │ - Recent ops │ │
│ │ └──────────────┘ │
└─────────────────────────────────────────────────────────────┘
SQLite Example (using ecosystem crate - API sketch, planned):
#![allow(unused)]
fn main() {
use anyfs::{QuotaLayer, TracingLayer, MountHandle};
use anyfs_sqlite::SqliteBackend; // Ecosystem crate
// Mount the drive
let backend = SqliteBackend::open("tenant.db")?
.layer(TracingLayer::new()) // Logs operations to tracing subscriber
.layer(QuotaLayer::builder()
.max_total_size(1_000_000_000)
.build());
let mount = MountHandle::mount(backend, "/mnt/workspace")?;
}
#![allow(unused)]
fn main() {
// Meanwhile, in a monitoring dashboard...
// Note: This queries SqliteBackend's internal schema (nodes, audit_log tables).
// See anyfs-sqlite documentation for schema details.
use rusqlite::{Connection, OpenFlags};
use std::time::Duration;
let conn = Connection::open_with_flags(
"tenant.db",
OpenFlags::SQLITE_OPEN_READ_ONLY, // Safe concurrent reads
)?;
loop {
let (file_count, total_bytes): (i64, i64) = conn.query_row(
"SELECT COUNT(*), COALESCE(SUM(size), 0) FROM nodes WHERE type = 'file'",
[],
|row| Ok((row.get(0)?, row.get(1)?)),
)?;
let recent_ops: Vec<String> = conn
.prepare("SELECT operation, path, timestamp FROM audit_log ORDER BY timestamp DESC LIMIT 10")?
.query_map([], |row| Ok(format!("{}: {}", row.get::<_, String>(0)?, row.get::<_, String>(1)?)))?
.collect::<Result<_, _>>()?;
render_dashboard(file_count, total_bytes, &recent_ops);
std::thread::sleep(Duration::from_secs(1));
}
}
Works with any database backend:
| Backend | Direct Query Method |
|---|---|
| SqliteBackend | rusqlite with SQLITE_OPEN_READ_ONLY |
| Custom (user-implemented) | Direct database driver connection |
Third-party crates can implement additional database backends (PostgreSQL, MySQL, etc.) following the same pattern.
What you can visualize:
- Real-time storage usage (gauges, bar charts)
- File count over time (line graphs)
- Operations log (live feed)
- Most accessed files (heatmaps)
- Directory tree maps (size visualization)
- Per-tenant usage (multi-tenant dashboards)
This pattern is powerful because the database is the source of truth — you get filesystem semantics via FUSE and SQL analytics via direct queries, from the same data.
RAM Drive
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, MountHandle};
// 4GB RAM drive
let mount = MountHandle::mount(
MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(4 * 1024 * 1024 * 1024)
.build()),
"/mnt/ramdisk"
)?;
// Use for fast compilation, temp files, etc.
// $ TMPDIR=/mnt/ramdisk cargo build
}
Sandboxed AI Agent Workspace
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, PathFilterLayer, RestrictionsLayer, TracingLayer, MountHandle};
let mount = MountHandle::mount(
MemoryBackend::new()
.layer(PathFilterLayer::builder()
.allow("/workspace/**")
.deny("**/..*") // No hidden files
.deny("**/.*") // No dotfiles
.build())
.layer(QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024)
.max_file_size(10 * 1024 * 1024)
.build())
.layer(TracingLayer::new()), // Full audit trail
"/mnt/agent"
)?;
// Agent uses standard filesystem APIs
// All operations are sandboxed, quota-limited, and logged
}
Next Steps
AnyFS — API Quick Reference
Condensed reference for developers
Installation
[dependencies]
anyfs = "0.1"
With optional features and ecosystem crates:
anyfs = { version = "0.1", features = ["vrootfs", "bytes"] }
anyfs-sqlite = "0.1" # SQLite backend (ecosystem crate)
Creating a Backend Stack
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, RestrictionsLayer, TracingLayer, FileStorage};
// Simple
let fs = FileStorage::new(MemoryBackend::new());
// With limits
let fs = FileStorage::new(
MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024)
.max_file_size(10 * 1024 * 1024)
.build())
);
// Full stack (fluent composition)
let backend = MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024)
.build())
.layer(RestrictionsLayer::builder()
.deny_permissions()
.build())
.layer(TracingLayer::new());
let fs = FileStorage::new(backend);
// BackendStack builder (fluent API)
use anyfs::BackendStack;
let fs = BackendStack::new(MemoryBackend::new())
.limited(|l| l.max_total_size(100 * 1024 * 1024))
.restricted(|g| g.deny_permissions())
.traced()
.into_container();
}
BackendStack Methods
BackendStack provides a fluent API for building middleware stacks:
#![allow(unused)]
fn main() {
use anyfs::BackendStack;
BackendStack::new(backend) // Start with any backend
.limited(|l| l // -> Quota<B>
.max_total_size(bytes)
.max_file_size(bytes)
.max_node_count(count))
.restricted(|g| g // -> Restrictions<B>
.deny_permissions())
.traced() // -> Tracing<B>
.cached(|c| c // -> Cache<B>
.max_size(bytes)
.ttl(duration))
.read_only() // -> ReadOnly<B>
.into_container() // -> FileStorage<...>
}
| Method | Creates | Description |
|---|---|---|
| .limited() | Quota<B> | Add quota limits |
| .restricted() | Restrictions<B> | Add operation restrictions |
| .traced() | Tracing<B> | Add tracing instrumentation |
| .cached() | Cache<B> | Add LRU caching |
| .read_only() | ReadOnly<B> | Make backend read-only |
| .into_container() | FileStorage<B> | Finish and wrap in FileStorage |
Quota Methods
#![allow(unused)]
fn main() {
// Builder pattern (required - at least one limit must be set)
QuotaLayer::builder()
.max_total_size(bytes) // Total storage limit
.max_file_size(bytes) // Per-file limit
.max_node_count(count) // Max files/dirs
.max_dir_entries(count) // Max entries per dir
.max_path_depth(depth) // Max nesting
.build()
.layer(backend)
// Query
backend.usage() // -> Usage { total_size, file_count, ... }
backend.limits() // -> Limits { max_total_size, ... }
backend.remaining() // -> Remaining { bytes, can_write, ... }
}
Restrictions Methods
#![allow(unused)]
fn main() {
// Builder pattern
// Restrictions only controls permission-related operations.
// Symlink/hard-link capability is via trait bounds (FsLink), not middleware.
RestrictionsLayer::builder()
.deny_permissions() // Block set_permissions() calls
.build()
.layer(backend)
}
TracingLayer Methods
#![allow(unused)]
fn main() {
// TracingLayer configuration (applied via .layer())
TracingLayer::new()
.with_target("anyfs") // tracing target
.with_level(tracing::Level::DEBUG)
// Usage
let backend = inner.layer(TracingLayer::new().with_target("anyfs"));
}
PathFilter Methods
#![allow(unused)]
fn main() {
// Builder pattern (required - at least one rule must be set)
PathFilterLayer::builder()
.allow("/workspace/**") // Allow glob pattern
.deny("**/.env") // Deny glob pattern
.deny("**/secrets/**")
.build()
.layer(backend)
// Rules evaluated in order; first match wins
// No match = denied (deny by default)
}
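The first-match-wins, deny-by-default evaluation can be sketched in a few lines. This is illustrative only: Rule, Action, and the prefix matching below are stand-ins invented for the example; the real PathFilter matches glob patterns.

```rust
// Illustrative sketch of first-match-wins, deny-by-default evaluation.
// Real PathFilter rules are glob patterns; here a rule simply matches
// any path that starts with its prefix.
enum Action { Allow, Deny }

struct Rule {
    prefix: &'static str,
    action: Action,
}

fn is_allowed(rules: &[Rule], path: &str) -> bool {
    for rule in rules {
        if path.starts_with(rule.prefix) {
            // First matching rule decides.
            return matches!(rule.action, Action::Allow);
        }
    }
    false // No rule matched: denied by default.
}

fn main() {
    let rules = [
        Rule { prefix: "/workspace/secrets", action: Action::Deny },
        Rule { prefix: "/workspace", action: Action::Allow },
    ];
    assert!(is_allowed(&rules, "/workspace/notes.txt"));
    assert!(!is_allowed(&rules, "/workspace/secrets/key")); // deny rule matches first
    assert!(!is_allowed(&rules, "/etc/passwd"));            // no match -> denied
}
```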
ReadOnly Methods
#![allow(unused)]
fn main() {
ReadOnly::new(backend)
// All read operations pass through
// All write operations return FsError::ReadOnly
}
RateLimit Methods
#![allow(unused)]
fn main() {
// Builder pattern (required - must set ops and window)
RateLimitLayer::builder()
.max_ops(1000) // Operation limit
.per_second() // Window: 1 second
// or
.per_minute() // Window: 60 seconds
// or
.per(Duration::from_millis(500)) // Custom window
.build()
.layer(backend)
}
DryRun Methods
#![allow(unused)]
fn main() {
let dry_run = DryRun::new(backend);
let fs = FileStorage::new(dry_run);
// Read operations execute normally
// Write operations are logged but not executed
fs.write("/file.txt", b"data")?; // Logged, returns Ok
// Inspect logged operations (returns Vec<String>)
let ops: Vec<String> = dry_run.operations();
// e.g., ["write /file.txt (4 bytes)", "remove_file /old.txt"]
dry_run.clear(); // Clear the log
}
Cache Methods
#![allow(unused)]
fn main() {
// Builder pattern (required - at least max_entries must be set)
CacheLayer::builder()
.max_entries(1000) // LRU cache size
.max_entry_size(1024 * 1024) // 1MB max per entry
.ttl(Duration::from_secs(300)) // Optional: entry lifetime (default: no expiry)
.build()
.layer(backend)
}
IndexLayer Methods (Future)
#![allow(unused)]
fn main() {
// Builder pattern (required - set index path)
IndexLayer::builder()
.index_file("index.db") // Sidecar index file (SQLite default)
.consistency(IndexConsistency::Strict)
.track_reads(false) // Optional
.build()
.layer(backend)
}
Overlay Methods
#![allow(unused)]
fn main() {
use anyfs::{VRootFsBackend, MemoryBackend, Overlay};
let base = VRootFsBackend::new("/var/templates")?; // Read-only base
let upper = MemoryBackend::new(); // Writable upper
let overlay = Overlay::new(base, upper);
// Read: check upper first, fall back to base
// Write: always to upper layer
// Delete: whiteout marker in upper
}
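The read-side resolution described above can be sketched with plain maps in place of real backends. Everything here (overlay_read and the Option-as-whiteout encoding) is invented for illustration; the actual Overlay middleware operates on Fs backends, not maps.

```rust
use std::collections::HashMap;

// Illustrative sketch of overlay reads: the upper layer wins, a whiteout
// (stored as None) hides a base entry, and anything else falls back to base.
fn overlay_read<'a>(
    upper: &'a HashMap<String, Option<String>>,
    base: &'a HashMap<String, Option<String>>,
    path: &str,
) -> Option<&'a str> {
    match upper.get(path) {
        Some(Some(content)) => Some(content.as_str()), // written in upper
        Some(None) => None,                            // deleted via whiteout
        None => base.get(path).and_then(|v| v.as_deref()), // fall back to base
    }
}

fn main() {
    let mut base = HashMap::new();
    base.insert("/templates/a.txt".to_string(), Some("base".to_string()));
    base.insert("/templates/b.txt".to_string(), Some("base".to_string()));

    let mut upper = HashMap::new();
    upper.insert("/new.txt".to_string(), Some("upper".to_string()));
    upper.insert("/templates/b.txt".to_string(), None); // whiteout marker

    assert_eq!(overlay_read(&upper, &base, "/templates/a.txt"), Some("base"));
    assert_eq!(overlay_read(&upper, &base, "/new.txt"), Some("upper"));
    assert_eq!(overlay_read(&upper, &base, "/templates/b.txt"), None); // hidden
}
```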
FsExt Methods
Extension methods available on all backends:
#![allow(unused)]
fn main() {
use anyfs_backend::FsExt;
// JSON support (requires `serde` feature on anyfs-backend)
let config: Config = fs.read_json("/config.json")?;
fs.write_json("/config.json", &config)?;
// Type checks
if fs.is_file("/path")? { ... }
if fs.is_dir("/path")? { ... }
}
Note: JSON methods require
anyfs-backend = { version = "0.1", features = ["serde"] }
File Operations
Examples below assume FileStorage (std::fs-style paths). If you call core trait methods directly, pass &Path.
#![allow(unused)]
fn main() {
// Check existence
fs.exists("/path")? // -> bool
// Metadata
let meta = fs.metadata("/path")?;
meta.inode // unique identifier
meta.nlink // hard link count
meta.file_type // File | Directory | Symlink
meta.size // file size in bytes
meta.permissions // Permissions (default if unsupported)
meta.created // SystemTime (UNIX_EPOCH if unsupported)
meta.modified // SystemTime (UNIX_EPOCH if unsupported)
meta.accessed // SystemTime (UNIX_EPOCH if unsupported)
// Read
let bytes = fs.read("/path")?; // -> Vec<u8>
let text = fs.read_to_string("/path")?; // -> String
let chunk = fs.read_range("/path", 0, 1024)?;
// List directory
for entry in fs.read_dir("/path")? {
let entry = entry?;
entry.name // String (file/dir name only)
entry.path // PathBuf (full path)
entry.file_type // File | Directory | Symlink
entry.size // u64 (0 for directories)
entry.inode // u64 (0 if unsupported)
}
// Write
fs.write("/path", b"content")?; // Create or overwrite
fs.append("/path", b"more")?; // Append
// Directories
fs.create_dir("/path")?;
fs.create_dir_all("/path")?;
// Delete
fs.remove_file("/path")?;
fs.remove_dir("/path")?; // Empty only
fs.remove_dir_all("/path")?; // Recursive
// Move/Copy
fs.rename("/from", "/to")?;
fs.copy("/from", "/to")?;
// Links
fs.symlink("/target", "/link")?;
fs.hard_link("/original", "/link")?;
fs.read_link("/link")?; // -> PathBuf
fs.symlink_metadata("/link")?; // Metadata of link itself, not target
// Symlink capability is determined by FsLink trait bounds, not middleware.
// Permissions (requires FsPermissions)
fs.set_permissions("/path", perms)?;
// File size
fs.truncate("/path", 1024)?; // Resize to 1024 bytes
// Durability
fs.sync()?; // Flush all writes
fs.fsync("/path")?; // Flush writes for one file
}
Path Canonicalization
FileStorage provides path canonicalization that works on the virtual filesystem.
Note: Canonicalization requires FsLink because symlink resolution needs read_link() and symlink_metadata(). Backends that only implement Fs (without FsLink) cannot use these methods.
#![allow(unused)]
fn main() {
// Strict canonicalization - path must exist
let canonical = fs.canonicalize("/some/../path/./file.txt")?;
// Returns fully resolved absolute path, follows symlinks
// Soft canonicalization - handles non-existent paths
let resolved = fs.soft_canonicalize("/existing/dir/new_file.txt")?;
// Resolves existing components, appends non-existent remainder lexically
// Anchored canonicalization - sandboxed resolution
let safe = fs.anchored_canonicalize("/workspace/../etc/passwd", "/workspace")?;
// Clamps result within anchor directory (returns error if escape attempted)
}
Standalone utility (no backend needed):
#![allow(unused)]
fn main() {
use anyfs::normalize;
// Lexical path cleanup only
let clean = normalize("//foo///bar//"); // -> "/foo/bar"
// Does NOT resolve . or .. (those require filesystem context)
// Does NOT follow symlinks
}
Comparison:
| Function | Path Must Exist? | Follows Symlinks? | Resolves ..? |
|---|---|---|---|
| canonicalize | Yes (all components) | Yes | Yes (symlink-aware) |
| soft_canonicalize | No (appends non-existent) | Yes (for existing) | Yes (symlink-aware) |
| anchored_canonicalize | No | Yes (for existing) | Yes (clamped to anchor) |
| normalize | N/A (lexical only) | No | No |
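To make the lexical-only behavior of normalize concrete, here is a minimal self-contained sketch (normalize_sketch is illustrative, not the actual implementation): it collapses repeated slashes and drops a trailing slash, while leaving "." and ".." untouched.

```rust
// Illustrative sketch: collapse repeated slashes and drop a trailing
// slash, without resolving ".", "..", or symlinks.
fn normalize_sketch(path: &str) -> String {
    let mut out = String::new();
    let mut prev_slash = false;
    for c in path.chars() {
        if c == '/' {
            if !prev_slash {
                out.push('/');
            }
            prev_slash = true;
        } else {
            out.push(c);
            prev_slash = false;
        }
    }
    // Drop a trailing slash, but keep the root "/" itself.
    if out.len() > 1 && out.ends_with('/') {
        out.pop();
    }
    out
}

fn main() {
    assert_eq!(normalize_sketch("//foo///bar//"), "/foo/bar");
    assert_eq!(normalize_sketch("/"), "/");
    // Lexical only: ".." is left alone.
    assert_eq!(normalize_sketch("/a/../b"), "/a/../b");
}
```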
Inode Operations (FsInode trait)
Backends implementing FsInode track inodes internally for hardlink support and FUSE mounting:
#![allow(unused)]
fn main() {
use anyfs::FileStorage;
use std::path::PathBuf;
// Convert between paths and inodes
let fs = FileStorage::new(backend);
let inode: u64 = fs.path_to_inode("/some/path")?;
let path: PathBuf = fs.inode_to_path(inode)?;
// Lookup child by name within a directory (FUSE-style)
let root_inode = fs.path_to_inode("/")?;
let child_inode = fs.lookup(root_inode, "filename.txt")?;
// Get metadata by inode (avoids path resolution)
let meta = fs.metadata_by_inode(inode)?;
// Hardlinks share the same inode
fs.hard_link("/original", "/link")?;
let ino1 = fs.path_to_inode("/original")?;
let ino2 = fs.path_to_inode("/link")?;
assert_eq!(ino1, ino2); // Same inode!
}
Inode sources by backend:
| Backend | Inode Source |
|---|---|
| MemoryBackend | Internal node IDs (incrementing counter) |
| anyfs-sqlite: SqliteBackend | SQLite row IDs (INTEGER PRIMARY KEY) |
| VRootFsBackend | OS inode numbers (metadata.inode) |
Error Handling
#![allow(unused)]
fn main() {
use anyfs_backend::FsError;
match result {
// Path errors
Err(FsError::NotFound { path }) => {
// e.g., path="/file.txt"
}
Err(FsError::AlreadyExists { path, operation }) => ...
Err(FsError::NotADirectory { path }) => ...
Err(FsError::NotAFile { path }) => ...
Err(FsError::DirectoryNotEmpty { path }) => ...
Err(FsError::SymlinkLoop { path }) => ... // Circular symlink detected
// Quota middleware errors
Err(FsError::QuotaExceeded { path, limit, requested, usage }) => ...
Err(FsError::FileSizeExceeded { path, size, limit }) => ...
// Restrictions middleware errors
Err(FsError::FeatureNotEnabled { path, feature, operation }) => ...
// PathFilter middleware errors
Err(FsError::AccessDenied { path, reason }) => ...
// ReadOnly middleware errors
Err(FsError::ReadOnly { path, operation }) => ...
// RateLimit middleware errors
Err(FsError::RateLimitExceeded { path, limit, window_secs }) => ...
// FsExt errors
Err(FsError::Serialization(msg)) => ...
Err(FsError::Deserialization(msg)) => ...
// Optional feature not supported
Err(FsError::NotSupported { operation }) => ...
Err(e) => ...
}
}
Built-in Backends (anyfs crate)
| Type | Description |
|---|---|
| MemoryBackend | In-memory storage (default) |
| StdFsBackend | Direct std::fs (no containment) |
| VRootFsBackend | Host filesystem (with path containment) |
Ecosystem Backends (Separate Crates)
| Crate | Type | Description |
|---|---|---|
| anyfs-sqlite | SqliteBackend | Persistent single-file database (optional encryption via SQLCipher) |
| anyfs-indexed | IndexedBackend | Virtual paths + disk blobs (large files) |
SqliteBackend Encryption (Ecosystem Crate)
Crate: `anyfs-sqlite` with the `encryption` feature
#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend;
// Standard (no encryption)
let backend = SqliteBackend::open("data.db")?;
// With encryption (requires `encryption` feature)
let backend = SqliteBackend::open_encrypted("encrypted.db", "password")?;
// With raw 256-bit key
let backend = SqliteBackend::open_with_key("encrypted.db", &key)?;
// Change password on open encrypted database
backend.change_password("new-password")?;
}
Without the correct password, the .db file appears as random bytes.
Feature:
anyfs-sqlite = { version = "0.1", features = ["encryption"] }
Middleware
| Type | Purpose |
|---|---|
Quota<B> | Quota enforcement |
Restrictions<B> | Least privilege |
PathFilter<B> | Path-based access control |
ReadOnly<B> | Prevent write operations |
RateLimit<B> | Operation throttling |
Tracing<B> | Instrumentation (tracing ecosystem) |
DryRun<B> | Log without executing |
Cache<B> | LRU read caching |
Overlay<B1,B2> | Union filesystem |
Layers
| Layer | Creates |
|---|---|
QuotaLayer | Quota<B> |
RestrictionsLayer | Restrictions<B> |
PathFilterLayer | PathFilter<B> |
ReadOnlyLayer | ReadOnly<B> |
RateLimitLayer | RateLimit<B> |
TracingLayer | Tracing<B> |
DryRunLayer | DryRun<B> |
CacheLayer | Cache<B> |
Note: `Overlay<B1, B2>` is constructed directly via `Overlay::new(base, upper)` rather than using a Layer, because it takes two backends.
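As a sketch of the union semantics (an illustrative stand-in, not the real middleware: `MiniOverlay` and its read-only API are invented here), reads consult the upper layer first and fall back to the base:

```rust
use std::collections::HashMap;

// Illustrative sketch only: a union read across two in-memory stores,
// mirroring what Overlay::new(base, upper) is described to do
// (upper shadows base). The real middleware implements the full Fs traits.
struct MiniOverlay {
    base: HashMap<String, Vec<u8>>,
    upper: HashMap<String, Vec<u8>>,
}

impl MiniOverlay {
    fn new(base: HashMap<String, Vec<u8>>, upper: HashMap<String, Vec<u8>>) -> Self {
        Self { base, upper }
    }

    // Reads check the upper layer first, then fall back to the base.
    fn read(&self, path: &str) -> Option<&[u8]> {
        self.upper
            .get(path)
            .or_else(|| self.base.get(path))
            .map(|v| v.as_slice())
    }
}
```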
Type Reference
From anyfs-backend
Core Traits (Layer 1):
| Trait | Description |
|---|---|
FsRead | Read operations: read, read_to_string, read_range, exists, metadata, open_read |
FsWrite | Write operations: write, append, remove_file, rename, copy, truncate, open_write |
FsDir | Directory operations: read_dir, create_dir*, remove_dir* |
Extended Traits (Layer 2):
| Trait | Description |
|---|---|
FsLink | Link operations: symlink, hard_link, read_link |
FsPermissions | Permission operations: set_permissions |
FsSync | Sync operations: sync, fsync |
FsStats | Stats operations: statfs |
Inode Trait (Layer 3):
| Trait | Description |
|---|---|
FsInode | Inode operations: path_to_inode, inode_to_path, lookup, metadata_by_inode |
POSIX Traits (Layer 4):
| Trait | Description |
|---|---|
FsHandles | Handle operations: open, read_at, write_at, close |
FsLock | Lock operations: lock, try_lock, unlock |
FsXattr | Extended attribute operations: get_xattr, set_xattr, list_xattr |
Convenience Supertraits:
| Trait | Combines |
|---|---|
Fs | FsRead + FsWrite + FsDir (90% of use cases) |
FsFull | Fs + FsLink + FsPermissions + FsSync + FsStats |
FsFuse | FsFull + FsInode (FUSE-mountable) |
FsPosix | FsFuse + FsHandles + FsLock + FsXattr (full POSIX) |
Other Types:
| Type | Description |
|---|---|
Layer | Middleware composition trait |
FsExt | Extension methods trait (JSON, type checks) |
FsError | Error type (with context) |
ROOT_INODE | Constant: root directory inode (= 1) |
FileType | File, Directory, Symlink |
Metadata | File/dir metadata (inode, nlink, size, times, permissions) |
DirEntry | Directory entry (name, inode, file_type) |
Permissions | File permissions (mode: u32) |
StatFs | Filesystem stats (bytes, inodes, block_size) |
From anyfs
| Type | Description |
|---|---|
MemoryBackend | In-memory backend |
StdFsBackend | Direct std::fs backend (no containment) |
VRootFsBackend | Host FS backend (with containment) |
Quota<B> | Quota middleware |
Restrictions<B> | Feature gate middleware |
PathFilter<B> | Path access control middleware |
ReadOnly<B> | Read-only middleware |
RateLimit<B> | Rate limiting middleware |
Tracing<B> | Tracing middleware |
DryRun<B> | Dry-run middleware |
Cache<B> | Caching middleware |
Overlay<B1,B2> | Union filesystem middleware |
QuotaLayer | Layer for Quota |
RestrictionsLayer | Layer for Restrictions |
PathFilterLayer | Layer for PathFilter |
ReadOnlyLayer | Layer for ReadOnly |
RateLimitLayer | Layer for RateLimit |
TracingLayer | Layer for Tracing |
DryRunLayer | Layer for DryRun |
CacheLayer | Layer for Cache |
OverlayLayer | Layer for Overlay |
From Ecosystem Crates
| Crate | Type | Description |
|---|---|---|
anyfs-sqlite | SqliteBackend | SQLite backend (optional encryption with feature) |
anyfs-indexed | IndexedBackend | Virtual paths + disk blobs |
Ergonomic Wrappers (in anyfs):
| Type | Description |
|---|---|
FileStorage<B> | Thin ergonomic wrapper (generic backend, boxed resolver) |
BackendStack | Fluent builder for middleware stacks |
.boxed() | Opt-in type erasure for FileStorage |
IterativeResolver | Default path resolver (symlink-aware for backends with FsLink) |
NoOpResolver | No-op resolver for SelfResolving backends |
CachingResolver<R> | LRU cache wrapper around another resolver |
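The `CachingResolver<R>` pattern can be sketched as a memoizing wrapper. The `PathResolver` trait shape here is an assumption for illustration (the real resolver API lives in `anyfs`); a production version would bound the cache (LRU) and invalidate entries on writes:

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::Mutex;

// Hypothetical minimal resolver trait, for illustration only.
trait PathResolver {
    fn resolve(&self, path: &Path) -> PathBuf;
}

// Wraps any resolver and memoizes its answers.
struct CachingResolver<R> {
    inner: R,
    cache: Mutex<HashMap<PathBuf, PathBuf>>, // unbounded here; real impl would be LRU
}

impl<R: PathResolver> CachingResolver<R> {
    fn new(inner: R) -> Self {
        Self { inner, cache: Mutex::new(HashMap::new()) }
    }
}

impl<R: PathResolver> PathResolver for CachingResolver<R> {
    fn resolve(&self, path: &Path) -> PathBuf {
        let mut cache = self.cache.lock().unwrap();
        if let Some(hit) = cache.get(path) {
            return hit.clone(); // cache hit: inner resolver is not consulted
        }
        let resolved = self.inner.resolve(path);
        cache.insert(path.to_path_buf(), resolved.clone());
        resolved
    }
}
```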
AnyFS - Design Overview
Status: Current. Last updated: 2025-12-24
What This Project Is
AnyFS is an open standard for pluggable virtual filesystem backends in Rust. It uses a middleware/decorator pattern (like Axum/Tower) for composable functionality with complete separation of concerns.
Philosophy: Focused App, Smart Storage
It decouples application logic from storage policy, enabling a Data Mesh at the filesystem level.
- The App focuses on business value (“save the document”).
- The Storage Layer enforces non-functional requirements (“encrypt, audit, limit, index”).
Anyone can:
- Control how a drive acts, looks, and protects itself.
- Implement a custom backend for their specific storage needs (Cloud, DB, RAM).
- Compose middleware to add limits, logging, and security.
- Use the ergonomic `FileStorage<B>` wrapper for a standard `std::fs`-like API.
Architecture (Tower-style Middleware)
┌─────────────────────────────────────────┐
│ FileStorage<B> │ ← Ergonomic std::fs-aligned API
├─────────────────────────────────────────┤
│ Middleware (optional, composable): │
│ │
│ Policy: │
│ Quota<B> - Resource limits │
│ Restrictions<B> - Least privilege │
│ PathFilter<B> - Sandbox paths │
│ ReadOnly<B> - Prevent writes │
│ RateLimit<B> - Ops/sec limit │
│ │
│ Observability: │
│ Tracing<B> - Instrumentation │
│ DryRun<B> - Test mode │
│ │
│ Performance: │
│ Cache<B> - LRU caching │
│ │
│ Composition: │
│ Overlay<B1,B2> - Layered FS │
│ │
├─────────────────────────────────────────┤
│ Backend (implements Fs, FsFull, │ ← Pure storage + fs semantics
│ FsFuse, or FsPosix) │
│ (Memory, SQLite, VRootFs, custom...) │
└─────────────────────────────────────────┘
Each layer has exactly one responsibility:
| Layer | Responsibility |
|---|---|
Backend (Fs+) | Storage + filesystem semantics |
Quota<B> | Resource limits (size, count, depth) |
Restrictions<B> | Opt-in operation restrictions |
PathFilter<B> | Path-based access control |
ReadOnly<B> | Prevent all write operations |
RateLimit<B> | Limit operations per second |
Tracing<B> | Instrumentation / audit trail |
Design Principle: Predictable Defaults, Opt-in Security
The Fs traits mimic std::fs with predictable, permissive defaults.
See ADR-027 for the decision rationale.
The traits are low-level interfaces that any backend can implement - memory, SQLite, real filesystem, network storage, etc. To maintain consistent behavior across all backends:
- All operations work by default (`symlink()`, `hard_link()`, `set_permissions()`)
- No security restrictions at the trait level
- Behavior matches what you’d expect from a real filesystem
Why not secure-by-default at this layer?
- Predictability: A backend should behave like a filesystem. Surprising restrictions break expectations.
- Backend-agnostic: The traits don’t know if they’re wrapping a sandboxed memory store or a real filesystem. Restrictions that make sense for one may not for another.
- Composition: Security is achieved by layering middleware, not by baking it into the storage layer.
Security is the responsibility of higher-level APIs:
| Layer | Security Responsibility |
|---|---|
Backend (Fs+) | None - pure filesystem semantics |
Middleware (Restrictions, PathFilter, etc.) | Opt-in restrictions |
FileStorage or application code | Configure appropriate middleware |
Example: Secure AI Agent Sandbox
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, PathFilterLayer, FileStorage};
// Create wrapper type for type-safe sandbox
struct AiSandbox(FileStorage<MemoryBackend>);
impl AiSandbox {
fn new() -> Self {
AiSandbox(FileStorage::new(
MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(50 * 1024 * 1024)
.build())
.layer(PathFilterLayer::builder()
.allow("/workspace/**")
.deny("**/.env")
.build())
))
}
}
}
The backend is permissive. The application adds restrictions appropriate for its use case.
Crates
| Crate | Purpose | Contains |
|---|---|---|
anyfs-backend | Minimal contract | Layered traits (Fs, FsFull, FsFuse, FsPosix), Layer trait, types, FsExt |
anyfs | Backends + middleware + ergonomics | Built-in backends, all middleware layers, FileStorage<B>, BackendStack builder |
Dependency Graph
anyfs-backend (trait + types)
^
|-- anyfs (backends + middleware + ergonomics)
^-- vrootfs feature may use strict-path
Future Considerations
These are optional extensions to explore after the core is stable.
Keep (add-ons that fit the current design):
- URL-based backend registry (`sqlite://`, `mem://`, `stdfs://`) as a helper crate, not in core APIs.
- Bulk operation helpers (`read_many`, `write_many`, `copy_many`, `glob`, `walk`) as `FsExt` or a utilities crate.
- Early async adapter crate (`anyfs-async`) to support remote backends without changing sync traits.
- Bash-style shell (example app or `anyfs-shell` crate) that routes `ls`/`cd`/`cat`/`cp`/`mv`/`rm`/`mkdir`/`stat` through `FileStorage` to demonstrate middleware and backend neutrality (navigation and file management only, not full bash scripting).
- Copy-on-write overlay middleware (Afero-style `CopyOnWriteFs`) as a specialized `Overlay` variant.
- Archive backends (zip/tar) as separate crates implementing `Fs` (inspired by PyFilesystem/fsspec).
- Indexing middleware (`Indexing<B>` + `IndexLayer`) with pluggable index engines (SQLite default). See Indexing Middleware.
Defer (valuable, but needs data or wider review):
- Range/block caching middleware for `read_range`-heavy workloads (fsspec-style block cache).
- Runtime capability discovery (`Capabilities` struct) for feature detection (symlink control, case sensitivity, max path length).
- Lint/analyzer to discourage direct `std::fs` usage in app code (System.IO.Abstractions-style).
- Retry/timeout middleware for remote backends (when network backends are real).
Drop for now (adds noise or cross-platform complexity):
- Change notification support (optional `FsWatch` trait or polling middleware).
Detailed rationale lives in src/comparisons/prior-art-analysis.md.
Language Bindings (Python, C, etc.)
The AnyFS design is FFI-friendly and can be exposed to other languages with minimal friction.
Why the design works well for FFI:
| Design Choice | FFI Benefit |
|---|---|
&self methods (ADR-023) | Interior mutability allows holding a single Arc<FileStorage<...>> across FFI |
Box<dyn Fs> type erasure | FileStorage::boxed() provides a concrete type suitable for FFI |
| Owned return types | Vec<u8>, String, bool - no lifetime issues across FFI boundary |
| Simple structs | Metadata, DirEntry, Permissions map directly to Python/C structs |
Recommended approach for Python (PyO3):
#![allow(unused)]
fn main() {
// anyfs-python/src/lib.rs
use pyo3::prelude::*;
use anyfs::{FileStorage, MemoryBackend, Fs};
use anyfs_sqlite::SqliteBackend; // Ecosystem crate
#[pyclass]
struct PyFileStorage {
// Type-erased for FFI
inner: FileStorage<Box<dyn Fs>>,
}
#[pymethods]
impl PyFileStorage {
#[staticmethod]
fn memory() -> Self {
Self { inner: FileStorage::new(MemoryBackend::new()).boxed() }
}
#[staticmethod]
fn sqlite(path: &str) -> PyResult<Self> {
let backend = SqliteBackend::open(path)
.map_err(|e| PyErr::new::<pyo3::exceptions::PyIOError, _>(e.to_string()))?;
Ok(Self { inner: FileStorage::new(backend).boxed() })
}
fn read(&self, path: &str) -> PyResult<Vec<u8>> {
self.inner.read(path)
.map_err(|e| PyErr::new::<pyo3::exceptions::PyIOError, _>(e.to_string()))
}
fn write(&self, path: &str, data: &[u8]) -> PyResult<()> {
self.inner.write(path, data)
.map_err(|e| PyErr::new::<pyo3::exceptions::PyIOError, _>(e.to_string()))
}
}
#[pymodule]
fn anyfs_python(_py: Python, m: &PyModule) -> PyResult<()> {
m.add_class::<PyFileStorage>()?;
Ok(())
}
}
Python usage:
from anyfs_python import PyFileStorage
fs = PyFileStorage.memory()
fs.write("/hello.txt", b"Hello from Python!")
data = fs.read("/hello.txt")
print(data) # b"Hello from Python!"
Key considerations for FFI:
| Concern | Solution |
|---|---|
Generics (FileStorage<B>) | Use FileStorage<Box<dyn Fs>> (boxed) for FFI layer |
Streaming (Box<dyn Read>) | Wrap in language-native class with read(n) method |
| Middleware composition | Pre-build common stacks, expose as factory functions |
| Error handling | Convert FsError to language-native exceptions |
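The streaming row above can be sketched as a small Rust wrapper of the kind a binding would expose. `FfiReadStream` and its `read(n)` method are hypothetical names; the point is that the boxed reader from `open_read()` is owned by the wrapper, so no lifetime crosses the FFI boundary:

```rust
use std::io::Read;

// Hypothetical FFI-side stream wrapper: owns the boxed reader returned
// by open_read() and exposes a Python-style read(n) that returns at
// most n bytes, with an empty Vec signalling end-of-stream.
struct FfiReadStream {
    inner: Box<dyn Read + Send>,
}

impl FfiReadStream {
    fn new(inner: Box<dyn Read + Send>) -> Self {
        Self { inner }
    }

    // Reads up to n bytes and truncates the buffer to what was read.
    fn read(&mut self, n: usize) -> std::io::Result<Vec<u8>> {
        let mut buf = vec![0u8; n];
        let got = self.inner.read(&mut buf)?;
        buf.truncate(got);
        Ok(buf)
    }
}
```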
Future crate: anyfs-python
Dynamic Middleware
The current design uses compile-time generics for zero-cost middleware composition:
#![allow(unused)]
fn main() {
// Static: type known at compile time
let fs: Tracing<Quota<MemoryBackend>> = MemoryBackend::new()
.layer(QuotaLayer::builder().max_total_size(100).build())
.layer(TracingLayer::new());
}
For runtime-configured middleware (e.g., based on config files), use Box<dyn Fs>:
#![allow(unused)]
fn main() {
fn build_from_config(config: &Config) -> FileStorage<Box<dyn Fs>> {
let mut backend: Box<dyn Fs> = Box::new(MemoryBackend::new());
if config.enable_quota {
let quota_config = QuotaConfig {
max_total_size: Some(config.quota_limit),
..Default::default()
};
backend = Box::new(Quota::with_config(backend, quota_config)
.expect("quota initialization failed"));
}
if config.enable_antivirus {
backend = Box::new(AntivirusMiddleware::new(backend, config.av_scanner_path));
}
if config.enable_tracing {
backend = Box::new(Tracing::new(backend));
}
FileStorage::new(backend)
}
}
Trade-off: One Box allocation per layer + vtable dispatch. For I/O-bound workloads, this overhead is negligible (<1% of operation time).
Example: Antivirus Middleware
#![allow(unused)]
fn main() {
pub struct Antivirus<B> {
inner: B,
scanner: Arc<dyn VirusScanner + Send + Sync>,
}
pub trait VirusScanner: Send + Sync {
fn scan(&self, data: &[u8]) -> Option<String>; // Returns threat name if detected
}
impl<B: FsWrite> FsWrite for Antivirus<B> {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
if let Some(threat) = self.scanner.scan(data) {
return Err(FsError::ThreatDetected {
path: path.to_path_buf(),
reason: threat,
});
}
self.inner.write(path, data)
}
fn open_write(&self, path: &Path) -> Result<Box<dyn Write + Send>, FsError> {
let inner = self.inner.open_write(path)?;
Ok(Box::new(ScanningWriter::new(inner, self.scanner.clone())))
}
}
}
Future: Plugin System
For true runtime-loaded plugins (.so/.dll), a future MiddlewarePlugin trait could enable:
#![allow(unused)]
fn main() {
pub trait MiddlewarePlugin: Send + Sync {
fn name(&self) -> &str;
fn wrap(&self, backend: Box<dyn Fs>) -> Box<dyn Fs>;
}
// Load at runtime
let plugin = libloading::Library::new("antivirus_plugin.so")?;
let create_plugin: fn() -> Box<dyn MiddlewarePlugin> = plugin.get(b"create_plugin")?;
let av_plugin = create_plugin();
let backend = av_plugin.wrap(backend);
}
When to use each approach:
| Scenario | Approach | Overhead |
|---|---|---|
| Fixed middleware stack | Generics (compile-time) | Zero-cost |
| Config-driven middleware | Box<dyn Fs> chaining | ~50ns per layer |
| Runtime-loaded plugins | MiddlewarePlugin trait | ~50ns + plugin load |
Verdict: The current design supports dynamic middleware via Box<dyn Fs>. A formal MiddlewarePlugin trait for hot-loading is a future enhancement.
Middleware with Configurable Backends
Some middleware benefit from pluggable backends for their own storage or output. The pattern is to inject a trait object or configuration at construction time.
Metrics Middleware with Prometheus Exporter:
(Requires features = ["metrics"])
#![allow(unused)]
fn main() {
use prometheus::{Counter, Histogram, Registry};
pub struct Metrics<B> {
inner: B,
reads: Counter,
writes: Counter,
read_bytes: Counter,
write_bytes: Counter,
latency: Histogram,
}
impl<B> Metrics<B> {
/// Creates a new Metrics middleware.
///
/// # Panics
/// Panics if metric registration fails (indicates duplicate metric names - programmer error).
/// This is acceptable at initialization time per the No Panic Policy, which applies to
/// runtime operations. Initialization failures are configuration errors that should fail fast.
pub fn new(inner: B, registry: &Registry) -> Self {
let reads = Counter::new("anyfs_reads_total", "Total read operations")
.expect("metric creation failed");
let writes = Counter::new("anyfs_writes_total", "Total write operations")
.expect("metric creation failed");
registry.register(Box::new(reads.clone()))
.expect("metric registration failed");
registry.register(Box::new(writes.clone()))
.expect("metric registration failed");
// ... register all metrics
Self { inner, reads, writes, /* ... */ }
}
}
impl<B: FsRead> FsRead for Metrics<B> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
self.reads.inc();
let start = Instant::now();
let result = self.inner.read(path);
self.latency.observe(start.elapsed().as_secs_f64());
if let Ok(ref data) = result {
self.read_bytes.inc_by(data.len() as u64);
}
result
}
}
// Expose via HTTP endpoint
async fn metrics_handler(registry: web::Data<Registry>) -> impl Responder {
let encoder = TextEncoder::new();
let metrics = registry.gather();
encoder.encode_to_string(&metrics)
.unwrap_or_else(|e| format!("# Encoding error: {}", e))
}
}
Indexing Middleware with Remote Database:
#![allow(unused)]
fn main() {
pub trait IndexBackend: Send + Sync {
fn record_write(&self, path: &Path, size: u64, hash: &str) -> Result<(), IndexError>;
fn record_delete(&self, path: &Path) -> Result<(), IndexError>;
fn query(&self, pattern: &str) -> Result<Vec<IndexEntry>, IndexError>;
}
// SQLite implementation
pub struct SqliteIndex { conn: Connection }
// PostgreSQL implementation
pub struct PostgresIndex { pool: PgPool }
// MariaDB implementation
pub struct MariaDbIndex { pool: MySqlPool }
pub struct Indexing<B, I: IndexBackend> {
inner: B,
index: I,
}
impl<B: FsWrite, I: IndexBackend> FsWrite for Indexing<B, I> {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
self.inner.write(path, data)?;
let hash = sha256(data);
self.index.record_write(path, data.len() as u64, &hash)
.map_err(|e| FsError::Backend(e.to_string()))?;
Ok(())
}
}
// Usage with PostgreSQL
let index = PostgresIndex::connect("postgres://user:pass@db.example.com/files").await?;
let backend = MemoryBackend::new()
.layer(IndexLayer::builder()
.index(index)
.build());
}
Configurable Tracing with Multiple Sinks:
#![allow(unused)]
fn main() {
pub trait TraceSink: Send + Sync {
fn log_operation(&self, op: &Operation);
}
// Structured JSON logs
pub struct JsonSink { writer: Box<dyn Write + Send> }
// CEF (Common Event Format) for SIEM integration
pub struct CefSink {
host: String,
port: u16,
device_vendor: String
}
impl TraceSink for CefSink {
fn log_operation(&self, op: &Operation) {
let cef = format!(
"CEF:0|AnyFS|FileStorage|1.0|{}|{}|{}|src={} dst={}",
op.event_id, op.name, op.severity, op.source_path, op.dest_path
);
self.send_syslog(&cef);
}
}
// Remote sink (e.g., Loki, Elasticsearch)
pub struct RemoteSink { endpoint: String, client: reqwest::Client }
pub struct Tracing<B, S: TraceSink> {
inner: B,
sink: S,
}
}
Performance: Strategic Boxing (ADR-025)
AnyFS follows Tower/Axum’s approach to dynamic dispatch: zero-cost on the hot path, box at boundaries where flexibility is needed. We avoid heap allocations and dynamic dispatch unless they add flexibility without meaningful performance impact.
| Path | Operations | Cost |
|---|---|---|
| Hot path (zero-cost) | read(), write(), metadata(), exists() | Concrete types, no boxing |
| Hot path (zero-cost) | Middleware composition: Quota<Tracing<B>> | Generics, monomorphized |
| Cold path (boxed) | open_read(), open_write(), read_dir() | One Box allocation per call |
| Opt-in | FileStorage::boxed() | Explicit type erasure |
Hot-loop guidance: If you open many small files and care about micro-overhead (especially on virtual backends), prefer read()/write() or the typed streaming extension (FsReadTyped/FsWriteTyped) when the backend type is known. These are the zero-allocation fast paths.
Why box streams and iterators?
- Middleware needs to wrap them (`QuotaWriter` counts bytes, `PathFilter` filters entries)
- Box allocation (~50ns) is <1% of actual I/O time
- Avoids type explosion: `QuotaReader<PathFilterReader<TracingReader<Cursor<...>>>>`
Why NOT box bulk operations?
- `read()` and `write()` are the most common operations
- They return concrete types (`Vec<u8>`, `()`)
- Zero overhead for the typical use case
See ADR-025 and Zero-Cost Alternatives for full analysis.
Trait Architecture (in anyfs-backend)
AnyFS uses layered traits for maximum flexibility with minimal complexity.
See ADR-030 for the rationale behind the layered hierarchy.
FsPosix
│
┌──────────────┼──────────────┐
│ │ │
FsHandles FsLock FsXattr
│ │ │
└──────────────┴──────────────┘
│
FsFuse ← FsFull + FsInode
│
┌──────────────┴──────────────┐
│ │
FsFull FsInode
│
│
├──────┬───────┬───────┬──────┐
│ │ │ │ │
FsLink FsPerm FsSync FsStats │
│ │ │ │ │
└──────┴───────┴───────┴──────┘
│
Fs ← Most users only need this
│
┌───────────┼───────────┐
│ │ │
FsRead FsWrite FsDir
Simple rule: Import Fs for basic use. Add traits as needed for advanced features.
Core Traits (Layer 1)
FsRead - Read Operations
#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
fn read_to_string(&self, path: &Path) -> Result<String, FsError>;
fn read_range(&self, path: &Path, offset: u64, len: usize) -> Result<Vec<u8>, FsError>;
fn exists(&self, path: &Path) -> Result<bool, FsError>;
fn metadata(&self, path: &Path) -> Result<Metadata, FsError>;
fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
}
}
FsWrite - Write Operations
#![allow(unused)]
fn main() {
pub trait FsWrite: Send + Sync {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
fn append(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
fn remove_file(&self, path: &Path) -> Result<(), FsError>;
fn rename(&self, from: &Path, to: &Path) -> Result<(), FsError>;
fn copy(&self, from: &Path, to: &Path) -> Result<(), FsError>;
fn truncate(&self, path: &Path, size: u64) -> Result<(), FsError>;
fn open_write(&self, path: &Path) -> Result<Box<dyn Write + Send>, FsError>;
}
}
Note: All methods use `&self` (interior mutability). Backends manage their own synchronization. See ADR-023.
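A minimal sketch of what this looks like in practice, assuming nothing about the real `MemoryBackend` internals (`MiniMemoryBackend` is invented for illustration): all state sits behind an `RwLock`, so write operations need only `&self` and the backend can be shared behind an `Arc`:

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::RwLock;

// Illustrative only: how a backend can take &self on write paths.
// All mutation goes through an internal RwLock, so callers can share
// the backend (e.g. behind an Arc) without &mut.
struct MiniMemoryBackend {
    files: RwLock<HashMap<PathBuf, Vec<u8>>>,
}

impl MiniMemoryBackend {
    fn new() -> Self {
        Self { files: RwLock::new(HashMap::new()) }
    }

    // Write takes &self; the lock provides the exclusive access.
    fn write(&self, path: &Path, data: &[u8]) -> Result<(), String> {
        let mut files = self.files.write().map_err(|e| e.to_string())?;
        files.insert(path.to_path_buf(), data.to_vec());
        Ok(())
    }

    fn read(&self, path: &Path) -> Result<Vec<u8>, String> {
        let files = self.files.read().map_err(|e| e.to_string())?;
        files.get(path).cloned().ok_or_else(|| "not found".into())
    }
}
```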
FsDir - Directory Operations
#![allow(unused)]
fn main() {
pub trait FsDir: Send + Sync {
fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError>;
fn create_dir(&self, path: &Path) -> Result<(), FsError>;
fn create_dir_all(&self, path: &Path) -> Result<(), FsError>;
fn remove_dir(&self, path: &Path) -> Result<(), FsError>;
fn remove_dir_all(&self, path: &Path) -> Result<(), FsError>;
}
}
Extended Traits (Layer 2 - Optional)
#![allow(unused)]
fn main() {
pub trait FsLink: Send + Sync {
fn symlink(&self, original: &Path, link: &Path) -> Result<(), FsError>;
fn hard_link(&self, original: &Path, link: &Path) -> Result<(), FsError>;
fn read_link(&self, path: &Path) -> Result<PathBuf, FsError>;
fn symlink_metadata(&self, path: &Path) -> Result<Metadata, FsError>;
}
pub trait FsPermissions: Send + Sync {
fn set_permissions(&self, path: &Path, perm: Permissions) -> Result<(), FsError>;
}
pub trait FsSync: Send + Sync {
fn sync(&self) -> Result<(), FsError>;
fn fsync(&self, path: &Path) -> Result<(), FsError>;
}
pub trait FsStats: Send + Sync {
fn statfs(&self) -> Result<StatFs, FsError>;
}
}
Inode Traits (Layer 3 - For FUSE)
#![allow(unused)]
fn main() {
pub trait FsInode: Send + Sync {
fn path_to_inode(&self, path: &Path) -> Result<u64, FsError>;
fn inode_to_path(&self, inode: u64) -> Result<PathBuf, FsError>;
fn lookup(&self, parent_inode: u64, name: &OsStr) -> Result<u64, FsError>;
fn metadata_by_inode(&self, inode: u64) -> Result<Metadata, FsError>;
}
}
POSIX Traits (Layer 4 - Full POSIX)
#![allow(unused)]
fn main() {
pub trait FsHandles: Send + Sync {
fn open(&self, path: &Path, flags: OpenFlags) -> Result<Handle, FsError>;
fn read_at(&self, handle: Handle, buf: &mut [u8], offset: u64) -> Result<usize, FsError>;
fn write_at(&self, handle: Handle, data: &[u8], offset: u64) -> Result<usize, FsError>;
fn close(&self, handle: Handle) -> Result<(), FsError>;
}
pub trait FsLock: Send + Sync {
fn lock(&self, handle: Handle, lock: LockType) -> Result<(), FsError>;
fn try_lock(&self, handle: Handle, lock: LockType) -> Result<bool, FsError>;
fn unlock(&self, handle: Handle) -> Result<(), FsError>;
}
pub trait FsXattr: Send + Sync {
fn get_xattr(&self, path: &Path, name: &str) -> Result<Vec<u8>, FsError>;
fn set_xattr(&self, path: &Path, name: &str, value: &[u8]) -> Result<(), FsError>;
fn remove_xattr(&self, path: &Path, name: &str) -> Result<(), FsError>;
fn list_xattr(&self, path: &Path) -> Result<Vec<String>, FsError>;
}
}
Convenience Supertraits (Simple API)
#![allow(unused)]
fn main() {
/// Basic filesystem - covers 90% of use cases
pub trait Fs: FsRead + FsWrite + FsDir {}
impl<T: FsRead + FsWrite + FsDir> Fs for T {}
/// Full filesystem with all std::fs features
pub trait FsFull: Fs + FsLink + FsPermissions + FsSync + FsStats {}
impl<T: Fs + FsLink + FsPermissions + FsSync + FsStats> FsFull for T {}
/// FUSE-mountable filesystem
pub trait FsFuse: FsFull + FsInode {}
impl<T: FsFull + FsInode> FsFuse for T {}
/// Full POSIX filesystem
pub trait FsPosix: FsFuse + FsHandles + FsLock + FsXattr {}
impl<T: FsFuse + FsHandles + FsLock + FsXattr> FsPosix for T {}
}
Usage Examples
Application code should use FileStorage for the std::fs-style DX (string paths). Core trait examples are shown separately for implementers and generic code.
Most Users: FileStorage
#![allow(unused)]
fn main() {
use anyfs::{FileStorage, MemoryBackend};
fn process_files() -> Result<(), Box<dyn std::error::Error>> {
let fs = FileStorage::new(MemoryBackend::new());
let data = fs.read("/input.txt")?;
fs.write("/output.txt", &processed(data))?;
Ok(())
}
}
Generic Code over Core Traits
#![allow(unused)]
fn main() {
use anyfs::{FileStorage, Fs, FsError};
fn process_files<B: Fs>(fs: &FileStorage<B>) -> Result<(), FsError> {
let data = fs.read("/input.txt")?;
fs.write("/output.txt", &processed(data))?;
Ok(())
}
}
Need Links? Add the Trait
#![allow(unused)]
fn main() {
use anyfs::{FileStorage, Fs, FsLink, FsError};
fn with_symlinks<B: Fs + FsLink>(fs: &FileStorage<B>) -> Result<(), FsError> {
fs.write("/target.txt", b"content")?;
fs.symlink("/target.txt", "/link.txt")?;
Ok(())
}
}
FUSE Mount
Mounting is part of anyfs crate with fuse and winfsp feature flags; see src/guides/mounting.md.
#![allow(unused)]
fn main() {
use anyfs::{FsFuse, MountHandle, MountError};
fn mount_filesystem(fs: impl FsFuse) -> Result<(), MountError> {
MountHandle::mount(fs, "/mnt/myfs")?;
Ok(())
}
}
Full POSIX Application
#![allow(unused)]
fn main() {
use anyfs::{FileStorage, FsPosix, FsError, OpenFlags, LockType, Handle};
fn database_app<B: FsPosix>(fs: &FileStorage<B>, data: &[u8], offset: u64) -> Result<(), FsError> {
let handle: Handle = fs.open("/data.db", OpenFlags::READ_WRITE)?;
fs.lock(handle, LockType::Exclusive)?;
fs.write_at(handle, data, offset)?;
fs.unlock(handle)?;
fs.close(handle)?;
Ok(())
}
}
Core Types (in anyfs-backend)
Constants
#![allow(unused)]
fn main() {
/// Root directory inode. FUSE convention.
pub const ROOT_INODE: u64 = 1;
}
Metadata
#![allow(unused)]
fn main() {
/// File or directory metadata.
#[derive(Debug, Clone)]
pub struct Metadata {
/// Type: File, Directory, or Symlink.
pub file_type: FileType,
/// Size in bytes (0 for directories).
pub size: u64,
/// Permission mode bits. Default to 0o755/0o644 if unsupported.
pub permissions: Permissions,
/// Creation time (UNIX_EPOCH if unsupported).
pub created: SystemTime,
/// Last modification time.
pub modified: SystemTime,
/// Last access time.
pub accessed: SystemTime,
/// Inode number (0 if unsupported).
pub inode: u64,
/// Number of hard links (1 if unsupported).
pub nlink: u64,
}
impl Metadata {
/// Check if this is a file.
pub fn is_file(&self) -> bool { self.file_type == FileType::File }
/// Check if this is a directory.
pub fn is_dir(&self) -> bool { self.file_type == FileType::Directory }
/// Check if this is a symlink.
pub fn is_symlink(&self) -> bool { self.file_type == FileType::Symlink }
}
}
FileType
#![allow(unused)]
fn main() {
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FileType {
File,
Directory,
Symlink,
}
}
DirEntry
#![allow(unused)]
fn main() {
/// Entry in a directory listing.
#[derive(Debug, Clone)]
pub struct DirEntry {
/// File or directory name (not full path).
pub name: String,
/// Full path to the entry.
pub path: PathBuf,
/// Type: File, Directory, or Symlink.
pub file_type: FileType,
/// Size in bytes (0 for directories, can be lazy).
pub size: u64,
/// Inode number (0 if unsupported).
pub inode: u64,
}
}
Permissions
#![allow(unused)]
fn main() {
/// Unix-style permission bits.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Permissions(u32);
impl Permissions {
/// Create permissions from a mode (e.g., 0o755).
pub fn from_mode(mode: u32) -> Self { Permissions(mode) }
/// Get the mode bits.
pub fn mode(&self) -> u32 { self.0 }
/// Read-only permissions (0o444).
pub fn readonly() -> Self { Permissions(0o444) }
/// Default file permissions (0o644).
pub fn default_file() -> Self { Permissions(0o644) }
/// Default directory permissions (0o755).
pub fn default_dir() -> Self { Permissions(0o755) }
}
}
StatFs
#![allow(unused)]
fn main() {
/// Filesystem statistics.
#[derive(Debug, Clone)]
pub struct StatFs {
/// Total size in bytes (0 = unlimited).
pub total_bytes: u64,
/// Used bytes.
pub used_bytes: u64,
/// Available bytes.
pub available_bytes: u64,
/// Total number of inodes (0 = unlimited).
pub total_inodes: u64,
/// Used inodes.
pub used_inodes: u64,
/// Available inodes.
pub available_inodes: u64,
/// Filesystem block size.
pub block_size: u64,
/// Maximum filename length.
pub max_name_len: u64,
}
}
Middleware (in anyfs)
Each middleware implements the same traits as its inner backend. This enables composition while preserving capabilities.
Quota
Enforces quota limits. Tracks usage and rejects operations that would exceed limits.
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer};
let backend = QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024) // 100 MB
.max_file_size(10 * 1024 * 1024) // 10 MB per file
.max_node_count(10_000) // 10K files/dirs
.max_dir_entries(1_000) // 1K entries per dir
.max_path_depth(64)
.build()
.layer(MemoryBackend::new());
// Check usage
let usage = backend.usage();
let remaining = backend.remaining();
}
Restrictions
Blocks permission-related operations when needed.
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, RestrictionsLayer};
// Symlink/hard-link capability is determined by trait bounds (FsLink).
// Restrictions only controls permission changes.
let backend = RestrictionsLayer::builder()
.deny_permissions() // Block set_permissions() calls
.build()
.layer(MemoryBackend::new());
}
When blocked, operations return FsError::FeatureNotEnabled.
Tracing
Integrates with the tracing ecosystem for structured logging and instrumentation.
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, TracingLayer};
let backend = MemoryBackend::new()
.layer(TracingLayer::new()
.with_target("anyfs")
.with_level(tracing::Level::DEBUG));
// Users configure tracing subscribers as they prefer
tracing_subscriber::fmt::init();
}
Why tracing instead of custom logging?
- Works with existing tracing infrastructure
- Structured logging with spans
- Compatible with OpenTelemetry, Jaeger, etc.
- Users choose their subscriber (console, file, distributed tracing)
PathFilter
Restricts access to specific paths. Essential for sandboxing.
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, PathFilterLayer};
let backend = PathFilterLayer::builder()
.allow("/workspace/**") // Allow all under /workspace
.allow("/tmp/**") // Allow temp files
.deny("/workspace/.env") // But deny .env files
.deny("**/.git/**") // Deny all .git directories
.build()
.layer(MemoryBackend::new());
}
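The allow/deny semantics above can be sketched with a tiny matcher. This is an illustrative simplification (not the real PathFilter implementation) that supports only the pattern shapes used in the example - `/prefix/**`, `**/name/**`, `**/name`, and exact paths - with deny rules taking precedence:

```rust
// Tiny pattern matcher covering just the forms used in the example:
//   "/prefix/**"  -> any path under /prefix
//   "**/seg/**"   -> any path containing a /seg/ component
//   "**/name"     -> any path whose last component is name
//   exact string  -> exact match
fn matches(pattern: &str, path: &str) -> bool {
    if let Some(prefix) = pattern.strip_suffix("/**") {
        if let Some(seg) = prefix.strip_prefix("**/") {
            return path.split('/').any(|s| s == seg);
        }
        return path.starts_with(&format!("{}/", prefix)) || path == prefix;
    }
    if let Some(seg) = pattern.strip_prefix("**/") {
        return path.split('/').last() == Some(seg);
    }
    pattern == path
}

struct PathFilter {
    allow: Vec<String>,
    deny: Vec<String>,
}

impl PathFilter {
    // Deny rules win; otherwise the path must match at least one allow rule.
    fn is_allowed(&self, path: &str) -> bool {
        if self.deny.iter().any(|p| matches(p, path)) {
            return false;
        }
        self.allow.iter().any(|p| matches(p, path))
    }
}
```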
When a path is denied, operations return FsError::AccessDenied.
ReadOnly
Prevents all write operations. Useful for publishing immutable data.
#![allow(unused)]
fn main() {
use anyfs::{VRootFsBackend, ReadOnly, FileStorage};
// Wrap any backend to make it read-only
let backend = ReadOnly::new(VRootFsBackend::new("/var/published")?);
let fs = FileStorage::new(backend);
fs.read("/doc.txt")?; // OK
fs.write("/doc.txt", b"x"); // Error: FsError::ReadOnly
}
RateLimit
Limits operations per second. Prevents runaway agents.
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, RateLimitLayer};
let backend = RateLimitLayer::builder()
.max_ops(100) // 100 ops per window
.per_second() // 1 second window
.build()
.layer(MemoryBackend::new());
// When rate exceeded: FsError::RateLimitExceeded
}
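One simple way to implement "N ops per window" is a fixed-window counter. The sketch below shows that mechanic in isolation (the real RateLimit middleware wraps a backend and applies this check before each operation):

```rust
use std::time::{Duration, Instant};

// Fixed-window rate limiter: at most max_ops operations per window.
struct RateLimit {
    max_ops: u32,
    window: Duration,
    window_start: Instant,
    count: u32,
}

impl RateLimit {
    fn new(max_ops: u32, window: Duration) -> Self {
        RateLimit {
            max_ops,
            window,
            window_start: Instant::now(),
            count: 0,
        }
    }

    // Returns Err once the budget for the current window is exhausted.
    fn check(&mut self) -> Result<(), &'static str> {
        let now = Instant::now();
        if now.duration_since(self.window_start) >= self.window {
            self.window_start = now; // new window: reset the counter
            self.count = 0;
        }
        if self.count >= self.max_ops {
            return Err("rate limit exceeded");
        }
        self.count += 1;
        Ok(())
    }
}
```

A token-bucket variant would allow short bursts while keeping the same average rate; the fixed window is the simplest correct baseline.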
DryRun
Logs operations without executing writes. Great for testing and debugging.
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, DryRun, FileStorage};
let backend = DryRun::new(MemoryBackend::new());
let fs = FileStorage::new(backend);
fs.write("/test.txt", b"hello")?; // Logged but not written
let _ = fs.read("/test.txt"); // Error: file doesn't exist
// To inspect recorded operations, keep the DryRun handle before wrapping it.
}
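The "logged but not written" behavior comes down to recording each operation instead of forwarding it. A minimal standalone sketch (hypothetical names, not the actual DryRun API) might look like this:

```rust
use std::cell::RefCell;

// DryRun sketch: record write operations instead of executing them.
// RefCell lets us record through &self, mirroring a shared fs handle.
struct DryRun {
    ops: RefCell<Vec<String>>,
}

impl DryRun {
    fn new() -> Self {
        DryRun { ops: RefCell::new(Vec::new()) }
    }

    // Log the write; nothing reaches any inner backend.
    fn write(&self, path: &str, data: &[u8]) -> Result<(), String> {
        self.ops
            .borrow_mut()
            .push(format!("write {} ({} bytes)", path, data.len()));
        Ok(())
    }

    // Inspect what *would* have happened.
    fn recorded(&self) -> Vec<String> {
        self.ops.borrow().clone()
    }
}
```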
Cache
LRU cache for read operations. Essential for slow backends (S3, network).
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, CacheLayer, FileStorage};
let backend = MemoryBackend::new()
.layer(CacheLayer::builder()
.max_entries(10_000) // Max 10K entries in cache
.max_entry_size(10 * 1024 * 1024) // 10 MB max per entry
.build());
let fs = FileStorage::new(backend);
// First read: hits backend, caches result
let data = fs.read("/file.txt")?;
// Second read: served from cache (fast!)
let data = fs.read("/file.txt")?;
}
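The caching behavior described above rests on a standard LRU policy. Here is a minimal, self-contained sketch of an LRU read cache keyed by path (an illustration of the policy, not the CacheLayer implementation, which also bounds entry size and invalidates on writes):

```rust
use std::collections::HashMap;

// Minimal LRU read cache keyed by path.
struct ReadCache {
    max_entries: usize,
    map: HashMap<String, Vec<u8>>,
    order: Vec<String>, // most-recently-used last
}

impl ReadCache {
    fn new(max_entries: usize) -> Self {
        ReadCache { max_entries, map: HashMap::new(), order: Vec::new() }
    }

    fn get(&mut self, path: &str) -> Option<Vec<u8>> {
        if self.map.contains_key(path) {
            // Promote to most-recently-used position.
            self.order.retain(|p| p != path);
            self.order.push(path.to_string());
            return self.map.get(path).cloned();
        }
        None
    }

    fn put(&mut self, path: &str, data: Vec<u8>) {
        if self.map.len() >= self.max_entries && !self.map.contains_key(path) {
            // Evict the least-recently-used entry.
            let lru = self.order.remove(0);
            self.map.remove(&lru);
        }
        self.order.retain(|p| p != path);
        self.order.push(path.to_string());
        self.map.insert(path.to_string(), data);
    }
}
```

A production cache would use an O(1) structure (e.g., a linked hash map) rather than the O(n) `Vec` scan shown here.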
Overlay<Base, Upper>
Union filesystem with a read-only base and writable upper layer. Like Docker.
#![allow(unused)]
fn main() {
use anyfs::{VRootFsBackend, MemoryBackend, Overlay};
// Base: read-only template
let base = VRootFsBackend::new("/var/templates")?;
// Upper: writable layer for changes
let upper = MemoryBackend::new();
let backend = Overlay::new(base, upper);
// Reads check upper first, then base
// Writes always go to upper
// Deletes in upper "shadow" base files
}
Use cases:
- Container images (base image + writable layer)
- Template filesystems with per-user modifications
- Testing with rollback capability
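The read/write/shadow rules above can be sketched with two maps and a tombstone set. This is an illustration of the overlay semantics (not the `Overlay<Base, Upper>` implementation, which operates on full backends):

```rust
use std::collections::{HashMap, HashSet};

// Overlay read logic: upper wins, tombstones shadow deleted base files.
struct Overlay {
    base: HashMap<String, Vec<u8>>,  // read-only layer
    upper: HashMap<String, Vec<u8>>, // writable layer
    whiteouts: HashSet<String>,      // paths deleted in the upper layer
}

impl Overlay {
    fn read(&self, path: &str) -> Option<&Vec<u8>> {
        if self.whiteouts.contains(path) {
            return None; // deleted: shadows any base file
        }
        self.upper.get(path).or_else(|| self.base.get(path))
    }

    fn write(&mut self, path: &str, data: Vec<u8>) {
        self.whiteouts.remove(path);
        self.upper.insert(path.to_string(), data); // writes always go to upper
    }

    fn remove(&mut self, path: &str) {
        self.upper.remove(path);
        if self.base.contains_key(path) {
            self.whiteouts.insert(path.to_string()); // tombstone for base files
        }
    }
}
```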
FileStorage (in anyfs)
FileStorage<B> is an ergonomic wrapper with a single generic parameter:
- `B` - Backend type (the only generic)
- Resolver is boxed internally (cold path, per ADR-025)
Axum-style design: Simple by default, type erasure opt-in via .boxed().
Basic Usage
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, FileStorage};
// Type is inferred - no need to write it out
let fs = FileStorage::new(MemoryBackend::new());
fs.create_dir_all("/documents")?;
fs.write("/documents/hello.txt", b"Hello!")?;
let content = fs.read("/documents/hello.txt")?;
}
Type-Safe Wrappers (User-Defined)
If you need compile-time safety to prevent mixing filesystems, create wrapper types:
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, FileStorage};
use anyfs_sqlite::SqliteBackend; // Ecosystem crate
// Define wrapper types for your domains
struct SandboxFs(FileStorage<MemoryBackend>);
struct UserDataFs(FileStorage<SqliteBackend>);
// Type-safe function signatures prevent mixing
fn process_sandbox(fs: &SandboxFs) {
// Can only accept SandboxFs
}
fn save_user_file(fs: &UserDataFs, name: &str, data: &[u8]) {
// Can only accept UserDataFs
}
// Compile-time safety:
let sandbox = SandboxFs(FileStorage::new(MemoryBackend::new()));
process_sandbox(&sandbox); // OK
// process_sandbox(&userdata); // Compile error! Wrong type
}
Type Aliases for Clean Code
#![allow(unused)]
fn main() {
use anyfs::{FileStorage, MemoryBackend, Quota, Restrictions, Tracing};
use anyfs_sqlite::SqliteBackend; // Ecosystem crate
// Define your standard secure stack
type SecureBackend = Tracing<Restrictions<Quota<SqliteBackend>>>;
// Type aliases for common combinations
type SandboxFs = FileStorage<MemoryBackend>;
type UserDataFs = FileStorage<SecureBackend>;
// Clean function signatures
fn run_agent(fs: &SandboxFs) { ... }
}
FileStorage Implementation
#![allow(unused)]
fn main() {
use anyfs_backend::PathResolver;
/// Ergonomic wrapper with single generic.
pub struct FileStorage<B> {
backend: B,
resolver: Box<dyn PathResolver>, // Boxed: cold path
}
impl<B: Fs> FileStorage<B> {
/// Create with default resolver (IterativeResolver).
pub fn new(backend: B) -> Self { ... }
/// Create with custom path resolver.
pub fn with_resolver(backend: B, resolver: impl PathResolver + 'static) -> Self { ... }
/// Type-erase the backend (opt-in boxing).
pub fn boxed(self) -> FileStorage<Box<dyn Fs>> { ... }
}
}
Type Erasure (Opt-in)
When you need uniform types (e.g., collections), use .boxed():
#![allow(unused)]
fn main() {
use anyfs::{FileStorage, Fs, MemoryBackend};
use anyfs_sqlite::SqliteBackend; // Ecosystem crate
// Type-erased for uniform storage
let filesystems: Vec<FileStorage<Box<dyn Fs>>> = vec![
FileStorage::new(MemoryBackend::new()).boxed(),
FileStorage::new(SqliteBackend::open("a.db")?).boxed(),
];
}
Layer Trait (in anyfs-backend)
The Layer trait (inspired by Tower) standardizes middleware composition:
#![allow(unused)]
fn main() {
/// A layer that wraps a backend to add functionality.
pub trait Layer<B: Fs> {
type Backend: Fs;
fn layer(self, backend: B) -> Self::Backend;
}
/// Extension trait enabling fluent `.layer()` method on any Fs.
/// This is how `backend.layer(QuotaLayer::builder()...build())` works.
pub trait LayerExt: Fs + Sized {
fn layer<L: Layer<Self>>(self, layer: L) -> L::Backend {
layer.layer(self)
}
}
// Blanket impl: any Fs gets .layer() for free
impl<B: Fs> LayerExt for B {}
}
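The `Layer`/`LayerExt` mechanics can be demonstrated end-to-end without the `Fs` trait. The sketch below uses a toy `Service` trait to show how the blanket `.layer()` extension composes wrappers, with the innermost layer applying first:

```rust
// Toy trait standing in for Fs, to show the composition mechanics.
trait Service {
    fn call(&self) -> String;
}

struct Base;
impl Service for Base {
    fn call(&self) -> String {
        "base".to_string()
    }
}

// A layer that wraps a service to add functionality.
trait Layer<S: Service> {
    type Output: Service;
    fn layer(self, inner: S) -> Self::Output;
}

// Extension trait enabling fluent `.layer()` on any Service.
trait LayerExt: Service + Sized {
    fn layer<L: Layer<Self>>(self, l: L) -> L::Output {
        l.layer(self)
    }
}
impl<S: Service> LayerExt for S {} // blanket impl: every Service gets .layer()

// Example middleware: tags output with a label.
struct Tag(&'static str);
struct Tagged<S> {
    label: &'static str,
    inner: S,
}

impl<S: Service> Layer<S> for Tag {
    type Output = Tagged<S>;
    fn layer(self, inner: S) -> Tagged<S> {
        Tagged { label: self.0, inner }
    }
}
impl<S: Service> Service for Tagged<S> {
    fn call(&self) -> String {
        format!("{}({})", self.label, self.inner.call())
    }
}
```

Note how the fully composed type (`Tagged<Tagged<Base>>`) records the stack order, exactly as `Tracing<Restrictions<Quota<B>>>` does in AnyFS.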
Each middleware provides a corresponding Layer implementation:
#![allow(unused)]
fn main() {
// QuotaLayer wraps QuotaConfig (not a separate QuotaLimits type)
pub struct QuotaLayer {
config: QuotaConfig,
}
impl<B: Fs> Layer<B> for QuotaLayer {
type Backend = Quota<B>;
fn layer(self, backend: B) -> Self::Backend {
Quota::with_config(backend, self.config)
.expect("quota initialization failed")
}
}
}
Note: Middleware that implements additional traits (like FsInode) can use more specific bounds to preserve capabilities through the layer.
Composing Middleware
Middleware composes by wrapping. Order matters - innermost applies first.
Fluent Composition
Use the .layer() extension method for Axum-style composition:
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, RestrictionsLayer, TracingLayer};
let backend = MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024)
.build())
.layer(RestrictionsLayer::builder()
.deny_permissions() // Block set_permissions()
.build())
.layer(TracingLayer::new());
}
BackendStack Builder
For complex stacks, use BackendStack for a fluent API:
#![allow(unused)]
fn main() {
use anyfs::{BackendStack, MemoryBackend};
let fs = BackendStack::new(MemoryBackend::new())
.limited(|l| l
.max_total_size(100 * 1024 * 1024)
.max_file_size(10 * 1024 * 1024))
.restricted(|g| g
.deny_permissions()) // Block set_permissions() calls
.traced()
.into_container();
}
Built-in Backends (anyfs crate)
| Backend | Description |
|---|---|
| `MemoryBackend` | In-memory storage, implements Clone for snapshots |
| `StdFsBackend` | Direct std::fs delegation (no containment) |
| `VRootFsBackend` | Host filesystem with path containment (via strict-path) |
Ecosystem Backends (Separate Crates)
Complex backends with internal runtime requirements live in their own crates:
| Crate | Backend | Description |
|---|---|---|
| `anyfs-sqlite` | `SqliteBackend` | Single-file database with pooling, WAL, sharding; optional encryption |
| `anyfs-indexed` | `IndexedBackend` | Virtual paths + disk blobs (large file support) |
Why separate crates? Complex backends need internal runtimes (connection pools, sharding, chunking). Keeps anyfs lightweight and focused on framework glue.
Path Handling
Core traits take &Path so they are object-safe (dyn Fs works). The ergonomic layer (FileStorage and FsExt) accepts impl AsRef<Path>:
#![allow(unused)]
fn main() {
// These work via FileStorage/FsExt
fs.write("/file.txt", data)?;
fs.write(String::from("/file.txt"), data)?;
fs.write(PathBuf::from("/file.txt"), data)?;
}
Path Resolution
Path resolution (walking directory structure, following symlinks) operates on the Fs abstraction, not reimplemented per-backend.
See ADR-029 for the path-resolution decision.
Why Abstract Path Resolution?
We simulate inodes - that’s the whole point of virtualizing a filesystem. Path resolution must work on that abstraction:
- `/foo/../bar` cannot be resolved lexically - `foo` might be a symlink to `/other/place`, making `..` resolve to `/other`
- Resolution requires following the actual directory structure (inodes)
- The `Fs` traits have the needed methods: `metadata()`, `read_link()`, `read_dir()`
Path Resolution via PathResolver Trait
FileStorage delegates path resolution to a pluggable PathResolver (see ADR-033). The default IterativeResolver walks paths component by component:
#![allow(unused)]
fn main() {
/// Default resolver algorithm (simplified):
/// - Walk path component by component
/// - Use backend.metadata() to check node types
/// - If backend implements FsLink, use read_link() to follow symlinks
/// - Detect circular symlinks (max depth: 40)
/// - Return fully resolved canonical path
pub struct IterativeResolver {
max_symlink_depth: usize, // Default: 40
}
}
Resolution behavior depends on the resolver used. The default IterativeResolver follows symlinks when the backend implements FsLink. For backends without FsLink, it traverses directories but treats symlinks as regular files. Users can provide custom resolvers for case-insensitive matching, caching, or other behaviors.
Note: Built-in virtual backends (MemoryBackend) and ecosystem backends (SqliteBackend) implement FsLink, so symlink-aware resolution works out of the box.
When Resolution Is Needed
| Backend | Needs Our Resolution? | Why |
|---|---|---|
| `MemoryBackend` | Yes | Storage (HashMap) has no FS semantics |
| `SqliteBackend` | Yes | Storage (SQL tables) has no FS semantics |
| `VRootFsBackend` | No | OS handles resolution; strict-path prevents escapes |
Opt-out Mechanism
Virtual backends need resolution by default. Real filesystem backends opt out via a marker trait:
#![allow(unused)]
fn main() {
/// Marker trait for backends that handle their own path resolution.
/// VRootFsBackend implements this because the OS handles resolution.
pub trait SelfResolving {}
impl SelfResolving for VRootFsBackend {}
}
Important: FileStorage does NOT auto-detect SelfResolving. You must explicitly use NoOpResolver:
#![allow(unused)]
fn main() {
// For SelfResolving backends, use NoOpResolver explicitly
let fs = FileStorage::with_resolver(VRootFsBackend::new("/data")?, NoOpResolver);
}
The default IterativeResolver follows symlinks when FsLink is available. Custom resolvers can implement different behaviors (e.g., no symlink following, caching, case-insensitivity).
#![allow(unused)]
fn main() {
impl<B: Fs> FileStorage<B> {
pub fn new(backend: B) -> Self { /* uses IterativeResolver */ }
pub fn with_resolver(backend: B, resolver: impl PathResolver + 'static) -> Self { /* custom resolver */ }
}
}
Path Canonicalization Utilities
FileStorage provides path canonicalization methods modeled after the soft-canonicalize crate, adapted to work on the virtual filesystem abstraction.
Why We Need Our Own Canonicalization
std::fs::canonicalize operates on the real filesystem. For virtual backends (MemoryBackend, SqliteBackend), there is no real filesystem - we need canonicalization that queries the virtual structure via metadata() and read_link().
Core Methods
#![allow(unused)]
fn main() {
impl<B: Fs> FileStorage<B> {
/// Strict canonicalization - entire path must exist.
///
/// Delegates to the PathResolver to resolve symlinks and normalize the path.
/// Returns error if any component doesn't exist.
pub fn canonicalize(&self, path: impl AsRef<Path>) -> Result<PathBuf, FsError> {
self.resolver.canonicalize(path.as_ref(), &self.backend as &dyn Fs)
}
/// Soft canonicalization - resolves existing components,
/// appends non-existent remainder lexically.
///
/// Delegates to the PathResolver.
pub fn soft_canonicalize(&self, path: impl AsRef<Path>) -> Result<PathBuf, FsError> {
self.resolver.soft_canonicalize(path.as_ref(), &self.backend as &dyn Fs)
}
/// Anchored soft canonicalization - like soft_canonicalize but
/// clamps result within a boundary directory.
///
/// Useful for sandboxing: ensures the resolved path never escapes
/// the anchor directory, even via symlinks or `..` traversal.
pub fn anchored_canonicalize(
&self,
path: impl AsRef<Path>,
anchor: impl AsRef<Path>
) -> Result<PathBuf, FsError>;
}
/// Standalone lexical normalization (no backend needed).
///
/// Pure string manipulation:
/// - Collapses `//` to `/`
/// - Removes trailing slashes
/// - Does NOT resolve `.` or `..` (those require filesystem context)
/// - Does NOT follow symlinks
pub fn normalize(path: impl AsRef<Path>) -> PathBuf;
}
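The `normalize` rules listed above are purely lexical, so they can be implemented without any backend. A minimal sketch under those exact rules (collapse duplicate slashes, drop trailing slashes, leave `.` and `..` alone):

```rust
use std::path::{Path, PathBuf};

// Lexical normalization per the rules above: collapse duplicate slashes and
// drop trailing slashes, without touching `.`/`..` or following symlinks.
fn normalize(path: impl AsRef<Path>) -> PathBuf {
    let s = path.as_ref().to_string_lossy();
    let mut out = String::new();
    let mut prev_slash = false;
    for c in s.chars() {
        if c == '/' {
            if !prev_slash {
                out.push('/');
            }
            prev_slash = true;
        } else {
            out.push(c);
            prev_slash = false;
        }
    }
    // Remove a trailing slash, but keep the root "/" itself.
    if out.len() > 1 && out.ends_with('/') {
        out.pop();
    }
    PathBuf::from(out)
}
```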
Algorithm: Component-by-Component Resolution
The canonicalization algorithm walks the path one component at a time:
Input: /a/b/c/d/e
1. Start at root (/)
2. Check /a exists?
- Yes, and it's a symlink → follow to target
- Yes, and it's a directory → continue
3. Check /a/b exists?
- Yes → continue
4. Check /a/b/c exists?
- No → stop resolution, append "c/d/e" lexically
5. Result: /resolved/path/to/b/c/d/e
Key behaviors:
- Symlink following: Existing symlinks are resolved to their targets
- Non-existent handling: When a component doesn’t exist, the remainder is appended as-is
- Cycle detection: Bounded depth tracking prevents infinite loops from circular symlinks
- Root boundary: Never ascends past the filesystem root
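The walk above can be sketched against a toy virtual tree. This is a simplified illustration of the soft-canonicalization algorithm (absolute symlink targets only, no `..` handling), not the `IterativeResolver` implementation:

```rust
use std::collections::HashMap;

// Node types in a toy virtual tree keyed by absolute path.
enum Node {
    Dir,
    Symlink(String), // absolute target
}

// Simplified soft canonicalization: resolve existing components (following
// symlinks, with bounded depth), then append the non-existent remainder.
fn soft_canonicalize(tree: &HashMap<String, Node>, path: &str) -> Result<String, String> {
    let mut resolved = String::new();
    let mut parts: Vec<String> =
        path.split('/').filter(|p| !p.is_empty()).map(String::from).collect();
    let mut depth = 0;
    let mut i = 0;
    while i < parts.len() {
        let candidate = format!("{}/{}", resolved, parts[i]);
        match tree.get(&candidate) {
            Some(Node::Symlink(target)) => {
                depth += 1;
                if depth > 40 {
                    return Err("symlink loop".to_string()); // cycle detection
                }
                // Restart resolution from the symlink target plus the remainder.
                let mut new_parts: Vec<String> =
                    target.split('/').filter(|p| !p.is_empty()).map(String::from).collect();
                new_parts.extend(parts[i + 1..].iter().cloned());
                parts = new_parts;
                resolved.clear();
                i = 0;
            }
            Some(Node::Dir) => {
                resolved = candidate;
                i += 1;
            }
            None => {
                // Non-existent: append the remainder lexically and stop.
                for p in &parts[i..] {
                    resolved.push('/');
                    resolved.push_str(p);
                }
                return Ok(resolved);
            }
        }
    }
    Ok(resolved)
}
```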
Comparison with std::fs
| Function | std::fs | FileStorage |
|---|---|---|
| canonicalize | Requires all components exist | Same - returns error if path doesn't exist |
| soft_canonicalize | N/A | Resolves existing components, appends non-existent remainder |
| anchored_canonicalize | N/A | Sandboxed resolution clamped within an anchor directory |
Security Considerations
For virtual backends: Canonicalization happens entirely within the virtual structure. There is no host filesystem to escape to.
For VRootFsBackend: Delegates to OS canonicalization + strict-path containment. The anchored_canonicalize provides additional safety by clamping paths within a boundary.
Platform Notes (VRootFsBackend only)
When delegating to OS canonicalization:
- Windows: Returns extended-length UNC paths (`\\?\C:\path`) by default
- Linux/macOS: Standard canonical paths
Windows UNC Path Simplification
The dunce crate provides simplified() - a lexical function that converts UNC paths to regular paths without filesystem access:
#![allow(unused)]
fn main() {
use dunce::simplified;
// \\?\C:\Users\foo\bar.txt → C:\Users\foo\bar.txt
let path = simplified(r"\\?\C:\Users\foo\bar.txt");
}
Why this matters for soft_canonicalize:
- soft_canonicalize works with non-existent paths
- We can't use dunce::canonicalize (it requires the path to exist)
- dunce::simplified is pure string manipulation - it works on any path
When UNC can be simplified:
- Path is on a local drive (C:, D:, etc.)
- Path doesn’t exceed MAX_PATH (260 chars)
- No reserved names (CON, PRN, etc.)
When UNC must be kept:
- Network paths (`\\?\UNC\server\share`)
- Paths exceeding MAX_PATH
- Paths with reserved device names
Virtual backends have no platform differences - paths are just strings.
Filesystem Semantics: Linux-like by Default
Design principle: Simple, secure defaults. Don’t close doors for alternative semantics.
See ADR-028 for the decision rationale.
Default Behavior (Virtual Backends)
Virtual backends (MemoryBackend, SqliteBackend) use Linux-like semantics:
| Aspect | Behavior | Rationale |
|---|---|---|
| Case sensitivity | Case-sensitive | Simpler, more secure, Unix standard |
| Path separator | / internally | Cross-platform consistency |
| Reserved names | None | No artificial restrictions |
| Max path length | No limit | Virtual, no OS constraints |
| ADS (:stream) | Not supported | Security risk, complexity |
Real filesystem backends (StdFsBackend, VRootFsBackend) follow OS semantics—case-insensitive on Windows/macOS, case-sensitive on Linux.
Trait is Agnostic
The Fs trait doesn’t enforce filesystem semantics - backends decide their behavior:
#![allow(unused)]
fn main() {
use anyfs::{FileStorage, MemoryBackend};
use anyfs_backend::{Fs, FsError, PathResolver};
use std::path::{Path, PathBuf};
// Virtual backends: Linux-like (case-sensitive)
let linux_fs = FileStorage::new(MemoryBackend::new());
linux_fs.write("/Foo.txt", b"data")?;
assert!(linux_fs.exists("/Foo.txt")?);
assert!(!linux_fs.exists("/foo.txt")?); // case-sensitive: different path
// For case-insensitive behavior, implement a custom PathResolver:
// (Not built-in because real-world demand is minimal - VRootFsBackend on
// Windows/macOS already gets case-insensitivity from the OS)
struct CaseFoldingResolver;
impl PathResolver for CaseFoldingResolver {
fn canonicalize(&self, path: &Path, fs: &dyn Fs) -> Result<PathBuf, FsError> {
// Normalize path components to lowercase during lookup
todo!()
}
fn soft_canonicalize(&self, path: &Path, fs: &dyn Fs) -> Result<PathBuf, FsError> {
// Same but allows non-existent final component
todo!()
}
}
let ntfs_like = FileStorage::with_resolver(
MemoryBackend::new(),
CaseFoldingResolver // User-implemented
);
}
FUSE Mount: Report What You Support
When mounting, the FUSE layer reports backend capabilities to the OS:
#![allow(unused)]
fn main() {
impl FuseOps for AnyFsFuse<B> {
fn get_volume_params(&self) -> VolumeParams {
VolumeParams {
case_sensitive: self.backend.is_case_sensitive(),
supports_hard_links: /* check if B: FsLink */,
supports_symlinks: /* check if B: FsLink */,
// ...
}
}
}
}
Windows respects these flags - a case-sensitive mounted filesystem works correctly (modern Windows/WSL handle this).
Illustrative: Custom Middleware for Windows Compatibility
For users who need Windows-safe paths in virtual backends, here are example middleware patterns (not built-in - implement as needed):
#![allow(unused)]
fn main() {
/// Example: Middleware that validates paths are Windows-compatible.
/// Rejects: CON, PRN, NUL, COM1-9, LPT1-9, trailing dots/spaces, ADS.
pub struct NtfsValidation<B> { /* user-implemented */ }
/// Example: Middleware that makes a backend case-insensitive.
/// Stores canonical (lowercase) keys, preserves original case in metadata.
pub struct CaseInsensitive<B> { /* user-implemented */ }
}
Not built-in - these are illustrative patterns for users who need NTFS-like behavior.
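The validation half of such middleware is straightforward to sketch. The function below checks a single path component against the Windows rules named above (reserved device names, trailing dots/spaces, ADS colons); it is an illustrative sketch, not a complete NTFS rule set:

```rust
// Windows-compatibility check for a single path component:
// reserved device names, trailing dots/spaces, and ADS colons.
fn is_windows_safe(component: &str) -> bool {
    const RESERVED: [&str; 4] = ["CON", "PRN", "AUX", "NUL"];
    // Windows reserves the name even with an extension (e.g., "con.txt").
    let stem = component.split('.').next().unwrap_or(component);
    let upper = stem.to_ascii_uppercase();
    if RESERVED.contains(&upper.as_str()) {
        return false;
    }
    // COM1-COM9 and LPT1-LPT9 are also reserved.
    if upper.len() == 4
        && (upper.starts_with("COM") || upper.starts_with("LPT"))
        && upper.as_bytes()[3].is_ascii_digit()
        && upper.as_bytes()[3] != b'0'
    {
        return false;
    }
    // Trailing dots and spaces are stripped by Windows, causing aliasing.
    if component.ends_with('.') || component.ends_with(' ') {
        return false;
    }
    // ':' would denote an NTFS alternate data stream.
    !component.contains(':')
}
```

A `NtfsValidation<B>` middleware would run this check on every component of every incoming path and return an error before delegating to the inner backend.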
Security Model
Security is achieved through composition:
| Concern | Solution |
|---|---|
| Path containment | PathFilter + VRootFsBackend |
| Resource exhaustion | Quota enforces quotas |
| Rate limiting | RateLimit prevents abuse |
| Feature restriction | Restrictions disables dangerous features |
| Read-only access | ReadOnly prevents writes |
| Audit trail | Tracing instruments operations |
| Tenant isolation | Separate backend instances |
| Testing | DryRun logs without executing |
Defense in depth: Compose multiple middleware layers for comprehensive security.
AI Agent Sandbox Example
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, PathFilterLayer, RateLimitLayer, TracingLayer};
// Build a secure sandbox for an AI agent
let sandbox = MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(50 * 1024 * 1024) // 50 MB
.max_file_size(5 * 1024 * 1024) // 5 MB per file
.build())
.layer(PathFilterLayer::builder()
.allow("/workspace/**")
.deny("**/.env")
.deny("**/secrets/**")
.build())
.layer(RateLimitLayer::builder()
.max_ops(1000)
.per_second()
.build())
.layer(TracingLayer::new());
}
Extension Traits (in anyfs-backend)
The FsExt trait provides convenience methods for any Fs backend:
#![allow(unused)]
fn main() {
/// Extension methods for Fs (auto-implemented for all backends).
pub trait FsExt: Fs {
/// Check if path is a file.
fn is_file(&self, path: impl AsRef<Path>) -> Result<bool, FsError> {
self.metadata(path.as_ref()).map(|m| m.file_type == FileType::File)
}
/// Check if path is a directory.
fn is_dir(&self, path: impl AsRef<Path>) -> Result<bool, FsError> {
self.metadata(path.as_ref()).map(|m| m.file_type == FileType::Directory)
}
// JSON methods require `serde` feature (see below)
#[cfg(feature = "serde")]
fn read_json<T: DeserializeOwned>(&self, path: impl AsRef<Path>) -> Result<T, FsError>;
#[cfg(feature = "serde")]
fn write_json<T: Serialize>(&self, path: impl AsRef<Path>, value: &T) -> Result<(), FsError>;
}
// Blanket implementation for all Fs backends
impl<B: Fs> FsExt for B {}
}
JSON Methods (feature: serde)
The read_json and write_json methods require the serde feature:
anyfs-backend = { version = "0.1", features = ["serde"] }
#![allow(unused)]
fn main() {
use serde::{Serialize, de::DeserializeOwned};
// Illustrative: default bodies for the feature-gated methods
// (in practice these are provided as default methods on FsExt itself)
#[cfg(feature = "serde")]
impl<B: Fs> FsExt for B {
fn read_json<T: DeserializeOwned>(&self, path: impl AsRef<Path>) -> Result<T, FsError> {
let bytes = self.read(path.as_ref())?;
serde_json::from_slice(&bytes).map_err(|e| FsError::Deserialization(e.to_string()))
}
fn write_json<T: Serialize>(&self, path: impl AsRef<Path>, value: &T) -> Result<(), FsError> {
let bytes = serde_json::to_vec(value).map_err(|e| FsError::Serialization(e.to_string()))?;
self.write(path.as_ref(), &bytes)
}
}
}
Users can define their own extension traits for domain-specific operations.
Optional Features
Bytes Support (feature: bytes)
For zero-copy efficiency, enable the bytes feature to get Bytes-returning convenience methods on FileStorage:
anyfs = { version = "0.1", features = ["bytes"] }
#![allow(unused)]
fn main() {
use anyfs::{FileStorage, MemoryBackend};
use bytes::Bytes;
let fs = FileStorage::new(MemoryBackend::new());
// With bytes feature, FileStorage provides read_bytes() convenience method
let data: Bytes = fs.read_bytes("/large-file.bin")?;
let slice = data.slice(1000..2000); // Zero-copy!
// Core trait still uses Vec<u8> for object safety
// read_bytes() wraps the Vec<u8> in Bytes::from()
}
Note: Core traits (FsRead, etc.) always use Vec<u8> for object safety (dyn Fs). The bytes feature adds convenience methods to FileStorage that wrap results in Bytes.
When to use:
- Large file handling with frequent slicing
- Network-backed storage
- Streaming scenarios
Default: Vec<u8> (no extra dependency)
Error Types
FsError includes context for better debugging. It implements std::error::Error via thiserror and uses #[non_exhaustive] for forward compatibility.
#![allow(unused)]
fn main() {
/// Filesystem error with context.
///
/// All variants include enough information for meaningful error messages.
/// Use `#[non_exhaustive]` to allow adding variants in minor versions.
#[non_exhaustive]
#[derive(Debug, thiserror::Error)]
pub enum FsError {
// ========================================================================
// Path/File Errors
// ========================================================================
/// Path not found.
#[error("not found: {path}")]
NotFound {
path: PathBuf,
},
/// Circular symlink detected during path resolution.
#[error("symlink loop detected: {path}")]
SymlinkLoop {
path: PathBuf,
},
/// Security threat detected (e.g., virus).
/// Note: This variant supports the Antivirus middleware example.
/// Custom middleware can use this or define domain-specific error types.
#[error("threat detected: {reason} in {path}")]
ThreatDetected {
path: PathBuf,
reason: String,
},
/// Path already exists.
#[error("{operation}: already exists: {path}")]
AlreadyExists {
path: PathBuf,
operation: &'static str,
},
/// Expected a file, found directory.
NotAFile { path: PathBuf },
/// Expected a directory, found file.
NotADirectory { path: PathBuf },
/// Directory not empty (for remove_dir).
DirectoryNotEmpty { path: PathBuf },
// ========================================================================
// Permission/Access Errors
// ========================================================================
/// Permission denied (general filesystem permission error).
PermissionDenied {
path: PathBuf,
operation: &'static str,
},
/// Access denied (from PathFilter or RBAC).
AccessDenied {
path: PathBuf,
reason: String, // Dynamic reason string
},
/// Read-only filesystem (from ReadOnly middleware).
ReadOnly {
path: PathBuf,
operation: &'static str,
},
/// Feature not enabled (from Restrictions middleware).
/// Note: Symlink/hard-link capability is determined by trait bounds (FsLink),
/// not middleware. Restrictions only controls "permissions".
FeatureNotEnabled {
path: PathBuf,
feature: &'static str, // "permissions"
operation: &'static str,
},
// ========================================================================
// Resource Limit Errors
// ========================================================================
/// Quota exceeded (total storage).
QuotaExceeded {
path: PathBuf,
limit: u64,
requested: u64,
usage: u64,
},
/// File size limit exceeded.
FileSizeExceeded {
path: PathBuf,
size: u64,
limit: u64,
},
/// Rate limit exceeded (from RateLimit middleware).
RateLimitExceeded {
path: PathBuf,
limit: u32,
window_secs: u64,
},
// ========================================================================
// Data Errors
// ========================================================================
/// Invalid data (e.g., not valid UTF-8 when string expected).
InvalidData {
path: PathBuf,
details: String,
},
/// Corrupted data (e.g., failed checksum, parse error).
CorruptedData {
path: PathBuf,
details: String,
},
/// Data integrity verification failed (AEAD tag mismatch, HMAC failure).
IntegrityError {
path: PathBuf,
},
/// Serialization error (from FsExt JSON methods).
Serialization(String),
/// Deserialization error (from FsExt JSON methods).
Deserialization(String),
// ========================================================================
// Backend/Operation Errors
// ========================================================================
/// Operation not supported by this backend.
NotSupported {
operation: &'static str,
},
/// Invalid password or encryption key (from SqliteBackend with encryption).
InvalidPassword,
/// Conflict during sync (from offline mode).
Conflict {
path: PathBuf,
},
/// Backend-specific error (catch-all for custom backends).
Backend {
message: String,
},
/// I/O error wrapper.
Io {
operation: &'static str,
path: PathBuf,
source: std::io::Error,
},
}
// Required implementations
impl From<std::io::Error> for FsError {
fn from(err: std::io::Error) -> Self {
FsError::Io {
operation: "io",
path: PathBuf::new(),
source: err,
}
}
}
}
Implementation notes:
- All variants have `#[error("...")]` attributes (shown for the first variants, omitted for brevity)
- `#[non_exhaustive]` allows adding variants in minor versions without breaking changes
- `From<std::io::Error>` enables the `?` operator with std::io functions
- Consider `#[must_use]` on functions returning `Result<_, FsError>`
Cross-Platform Compatibility
AnyFS is designed for cross-platform use. Virtual backends work everywhere; real filesystem backends have platform considerations.
Backend Compatibility
| Backend | Windows | Linux | macOS | WASM |
|---|---|---|---|---|
| `MemoryBackend` | ✅ | ✅ | ✅ | ✅ |
| `SqliteBackend` | ✅ | ✅ | ✅ | ✅* |
| `IndexedBackend` | ✅ | ✅ | ✅ | ❌ |
| `StdFsBackend` | ✅ | ✅ | ✅ | ❌ |
| `VRootFsBackend` | ✅ | ✅ | ✅ | ❌ |
*SqliteBackend on WASM requires a wasm32 build of rusqlite with bundled SQLite. The encryption feature is not available on WASM.
Feature Compatibility
| Feature | Virtual Backends | VRootFsBackend |
|---|---|---|
| Basic I/O (Fs) | ✅ All platforms | ✅ All platforms |
| Symlinks | ✅ All platforms | Platform-dependent (see below) |
| Hard links | ✅ All platforms | Platform-dependent |
| Permissions | ✅ Stored as metadata | Platform-dependent |
| Extended attributes | ✅ Stored as metadata | Platform-dependent |
| FUSE mounting | N/A | Platform-dependent |
Platform-Specific Notes
Virtual Backends (MemoryBackend, SqliteBackend)
Fully cross-platform. All features work identically everywhere because:
- Paths are just strings/keys - no OS path resolution
- Symlinks are stored data, not OS constructs
- Permissions are metadata, not enforced by OS
- No filesystem syscalls involved
#![allow(unused)]
fn main() {
// This works identically on Windows, Linux, macOS, and WASM
let fs = FileStorage::new(MemoryBackend::new());
fs.symlink("/target", "/link")?; // Just stores the link
fs.set_permissions("/file", Permissions::from_mode(0o755))?; // Just stores metadata
}
VRootFsBackend (Real Filesystem)
Wraps the host filesystem. Platform differences apply:
| Feature | Linux | macOS | Windows |
|---|---|---|---|
| Symlinks | ✅ | ✅ | ⚠️ Requires privileges* |
| Hard links | ✅ | ✅ | ✅ (NTFS only) |
| Permissions (mode bits) | ✅ | ✅ | ⚠️ Limited mapping |
| Extended attributes | ✅ xattr | ✅ xattr | ⚠️ ADS (different API) |
| Case sensitivity | ✅ | ⚠️ Default insensitive | ⚠️ Insensitive |
*Windows requires SeCreateSymbolicLinkPrivilege or Developer Mode for symlinks.
FUSE Mounting
| Platform | Support | Library |
|---|---|---|
| Linux | ✅ Native | libfuse |
| macOS | ⚠️ Third-party | macFUSE |
| Windows | ⚠️ Third-party | WinFsp or Dokan |
| WASM | ❌ | N/A |
Path Handling
Virtual backends use / as separator internally, regardless of platform:
#![allow(unused)]
fn main() {
// Always use forward slashes with virtual backends
fs.write("/project/src/main.rs", code)?; // Works everywhere
}
VRootFsBackend translates to native paths internally:
- Linux/macOS:
/stays/ - Windows:
/project/file.txt→C:\root\project\file.txt
Recommendations
| Use Case | Recommended Backend | Why |
|---|---|---|
| Cross-platform app | MemoryBackend or SqliteBackend | No platform differences |
| Portable storage | SqliteBackend | Single file, works everywhere |
| WASM/browser | MemoryBackend or SqliteBackend | No filesystem access needed |
| Host filesystem access | VRootFsBackend | With awareness of platform limits |
| Testing | MemoryBackend | Fast, no cleanup, deterministic |
Feature Detection
Check platform capabilities at runtime if needed:
#![allow(unused)]
fn main() {
/// Check if symlinks are supported on the current platform.
pub fn symlinks_available() -> bool {
#[cfg(unix)]
return true;
#[cfg(windows)]
{
// Check for Developer Mode or the symlink privilege,
// e.g., by attempting a test symlink in a temp directory.
false // conservative placeholder until detection is implemented
}
}
}
On platforms without symlink support, use a backend that doesn’t implement FsLink, or check symlinks_available() before calling symlink operations.
Layered Design: Backends + Middleware + Ergonomics
AnyFS uses a layered architecture that separates concerns:
- Backends: Pure storage + filesystem semantics
- Middleware: Composable policy layers
- FileStorage: Ergonomic wrapper
Architecture
┌─────────────────────────────────────────┐
│ FileStorage │ ← Ergonomics only
├─────────────────────────────────────────┤
│ Middleware Stack (composable): │ ← Policy enforcement
│ Tracing → PathFilter → Restrictions │
│ → Quota → Backend │
├─────────────────────────────────────────┤
│ Fs │ ← Pure storage
│ (Memory, SQLite, VRootFs, custom) │
└─────────────────────────────────────────┘
Layer Responsibilities
| Layer | Responsibility | Path Handling |
|---|---|---|
| `FileStorage` | Ergonomic API + path resolution | Accepts `impl AsRef<Path>`; resolves paths via pluggable `PathResolver` |
| Middleware | Policy enforcement | &Path (object-safe core traits) |
| Backend | Storage + FS semantics | &Path (object-safe core traits) |
Core traits use &Path for object safety; FileStorage/FsExt provide impl AsRef<Path> ergonomics. Path resolution is pluggable via PathResolver trait (see ADR-033). Backends that wrap a real filesystem implement SelfResolving so FileStorage can skip resolution.
Policy via Middleware
Old design (rejected): FileStorage contained quota/feature logic.
Current design: Policy is handled by composable middleware:
#![allow(unused)]
fn main() {
// Middleware enforces policy
let backend = MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024)
.build())
.layer(PathFilterLayer::builder()
.allow("/workspace/**")
.build())
.layer(TracingLayer::new());
// FileStorage is ergonomics + path resolution (no policy)
let fs = FileStorage::new(backend);
}
Path Containment
For VRootFsBackend (real filesystem), path containment uses strict-path::VirtualRoot internally:
#![allow(unused)]
fn main() {
// VRootFsBackend implements FsRead, FsWrite, FsDir (and thus Fs via blanket impl)
impl FsRead for VRootFsBackend {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
// VirtualRoot ensures paths can't escape
let safe_path = self.root.join(path)?;
std::fs::read(safe_path).map_err(Into::into)
}
}
}
For virtual backends (Memory, SQLite), paths are just keys - no OS path traversal possible. FileStorage performs symlink-aware resolution for these backends so normalization is consistent across virtual implementations.
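The key-based model can be sketched with a purely lexical normalizer. This is illustrative only: the real FileStorage resolution is symlink-aware and pluggable via PathResolver, and `normalize` is a hypothetical helper, not part of the AnyFS API.

```rust
use std::path::{Component, Path, PathBuf};

/// Illustrative lexical normalization for virtual backends, where paths are
/// plain keys: collapse `.` and `..` without touching any real filesystem.
fn normalize(path: &Path) -> PathBuf {
    let mut out = PathBuf::from("/");
    for comp in path.components() {
        match comp {
            Component::RootDir | Component::CurDir => {}
            // `..` at the root is a no-op, matching Linux-like semantics.
            Component::ParentDir => {
                out.pop();
            }
            Component::Normal(seg) => out.push(seg),
            Component::Prefix(_) => {}
        }
    }
    out
}

fn main() {
    assert_eq!(
        normalize(Path::new("/data/../etc/./passwd")),
        PathBuf::from("/etc/passwd")
    );
    assert_eq!(normalize(Path::new("/../..")), PathBuf::from("/"));
}
```

Because the backend never hands these strings to the OS, a traversal sequence like `/data/../../..` can at worst reach the virtual root, never the host filesystem.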
For sandboxing across all backends, use PathFilter middleware:
#![allow(unused)]
fn main() {
PathFilterLayer::builder()
.allow("/workspace/**")
.deny("**/.env")
.build()
.layer(backend)
}
Why This Matters
- Separation of concerns: Backends focus on storage, middleware handles policy
- Composability: Add/remove policies without touching storage code
- Flexibility: Same middleware works with any backend
- Simplicity: Each layer has one job
AnyFS - Architecture Decision Records
This file captures the decisions for the current AnyFS design.
Decision Map
Primary docs are where each decision is explained in narrative form. ADRs remain the source of truth for the decision itself.
| ADR | Primary doc |
|---|---|
| ADR-001 | Design Overview |
| ADR-002 | Project Structure |
| ADR-003 | Layered Traits |
| ADR-004 | Design Overview |
| ADR-005 | API Quick Reference |
| ADR-006 | Middleware Implementation |
| ADR-007 | Middleware Implementation |
| ADR-008 | FileStorage |
| ADR-009 | Project Structure |
| ADR-010 | Implementation Plan |
| ADR-011 | Design Overview |
| ADR-012 | Middleware Implementation |
| ADR-013 | Layered Traits |
| ADR-014 | Design Overview |
| ADR-015 | Design Overview |
| ADR-016 | Security Considerations |
| ADR-017 | Middleware Implementation |
| ADR-018 | Middleware Implementation |
| ADR-019 | Middleware Implementation |
| ADR-020 | Middleware Implementation |
| ADR-021 | Middleware Implementation |
| ADR-022 | API Quick Reference |
| ADR-023 | Layered Traits |
| ADR-024 | Implementation Plan |
| ADR-025 | Zero-Cost Alternatives |
| ADR-026 | Implementation Plan |
| ADR-027 | Design Overview |
| ADR-028 | Design Overview |
| ADR-029 | Two-Layer Path Handling |
| ADR-030 | Layered Traits |
| ADR-031 | Indexing Middleware |
ADR Index
| ADR | Title | Status |
|---|---|---|
| ADR-001 | Path-based Fs trait | Accepted |
| ADR-002 | Two-crate structure | Accepted |
| ADR-003 | Object-safe path parameters | Accepted |
| ADR-004 | Tower-style middleware pattern | Accepted |
| ADR-005 | std::fs-aligned method names | Accepted |
| ADR-006 | Quota for quota enforcement | Accepted |
| ADR-007 | Restrictions for least-privilege | Accepted |
| ADR-008 | FileStorage as thin ergonomic wrapper | Accepted |
| ADR-009 | Built-in backends are feature-gated | Accepted |
| ADR-010 | Sync-first, async-ready design | Accepted |
| ADR-011 | Layer trait for standardized composition | Accepted |
| ADR-012 | Tracing for instrumentation | Accepted |
| ADR-013 | FsExt for extension methods | Accepted |
| ADR-014 | Optional Bytes support | Accepted |
| ADR-015 | Contextual FsError | Accepted |
| ADR-016 | PathFilter for path-based access control | Accepted |
| ADR-017 | ReadOnly for preventing writes | Accepted |
| ADR-018 | RateLimit for operation throttling | Accepted |
| ADR-019 | DryRun for testing and debugging | Accepted |
| ADR-020 | Cache for read performance | Accepted |
| ADR-021 | Overlay for union filesystem | Accepted |
| ADR-022 | Builder pattern for configurable middleware | Accepted |
| ADR-023 | Interior mutability for all trait methods | Accepted |
| ADR-024 | Async Strategy | Accepted |
| ADR-025 | Strategic Boxing (Tower-style) | Accepted |
| ADR-026 | Companion shell (anyfs-shell) | Accepted (Future) |
| ADR-027 | Permissive core; security via middleware | Accepted |
| ADR-028 | Linux-like semantics for virtual backends | Accepted |
| ADR-029 | Path resolution in FileStorage | Accepted |
| ADR-030 | Layered trait hierarchy | Accepted |
| ADR-031 | Indexing as middleware | Accepted (Future) |
| ADR-032 | Path Canonicalization via FsPath Trait | Accepted |
| ADR-033 | PathResolver Trait for Pluggable Resolution | Accepted |
| ADR-034 | LLM-Oriented Architecture (LOA) | Accepted |
ADR-001: Path-based Fs trait
Decision: Backends implement a path-based trait aligned with std::fs method naming.
Why: Filesystem operations are naturally path-oriented; a single, familiar trait surface is easier to implement and adopt than graph-store or inode models.
ADR-002: Two-crate structure
Decision:
| Crate | Purpose |
|---|---|
| anyfs-backend | Minimal contract: Fs trait, Layer trait, FsExt, types |
| anyfs | Backends + middleware + ergonomics (FileStorage<B>, BackendStack) |
Why:
- Backend authors only need anyfs-backend (no heavy dependencies).
- Middleware is composable and lives with backends in anyfs.
- FileStorage provides ergonomics plus centralized path resolution for virtual backends - no policy logic - included in anyfs for convenience.
ADR-003: Object-safe path parameters
Decision: Core Fs traits take &Path so they remain object-safe (dyn Fs works). For ergonomics, FileStorage and FsExt accept impl AsRef<Path> and forward to the core traits.
Why:
- Object safety enables opt-in type erasure (FileStorage::boxed()).
- Keeps hot-path calls zero-cost; dynamic dispatch is explicit and optional.
- Ergonomics preserved via FileStorage/FsExt (&str, String, PathBuf).
ADR-004: Tower-style middleware pattern
Decision: Use composable middleware (decorator pattern) for cross-cutting concerns like limits, logging, and feature gates. Each middleware implements Fs by wrapping another Fs.
Why:
- Complete separation of concerns - each layer has one job.
- Composable - use only what you need.
- Familiar pattern (Axum/Tower use the same approach).
- No code duplication - middleware written once, works with any backend.
- Testable - each layer can be tested in isolation.
Example:
#![allow(unused)]
fn main() {
let backend = SqliteBackend::open("data.db")?
.layer(QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024)
.build())
.layer(PathFilterLayer::builder()
.allow("/workspace/**")
.build())
.layer(TracingLayer::new());
}
ADR-005: std::fs-aligned method names
Decision: Prefer read_dir, create_dir_all, remove_file, etc.
Why: Familiarity and reduced cognitive overhead.
ADR-006: Quota for quota enforcement
Decision: Quota/limit enforcement is handled by Quota<B> middleware, not by backends or FileStorage.
Configuration:
- with_max_total_size(bytes) - total storage limit
- with_max_file_size(bytes) - per-file limit
- with_max_node_count(count) - max files/directories
- with_max_dir_entries(count) - max entries per directory
- with_max_path_depth(depth) - max directory nesting
Why:
- Limits are policy, not storage semantics.
- Written once, works with any backend.
- Optional - users who don’t need limits skip this middleware.
Implementation notes:
- On construction, scan existing backend to initialize usage counters.
- Wrap open_write streams with CountingWriter to track streamed bytes.
- Check limits before operations, update usage after successful operations.
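The check-then-update flow can be sketched without the real middleware. `QuotaSketch` is a hypothetical stand-in, not the actual Quota<B> type; it only demonstrates the total-size limit from the configuration list above.

```rust
use std::sync::Mutex;

/// Illustrative sketch of quota enforcement (not the real Quota<B> API):
/// track total bytes used and reject writes that would exceed the limit.
struct QuotaSketch {
    max_total_size: u64,
    used: Mutex<u64>,
}

impl QuotaSketch {
    /// Check before the operation; the real middleware updates usage only
    /// after the inner backend reports success.
    fn check_write(&self, data_len: u64) -> Result<(), String> {
        let mut used = self.used.lock().unwrap();
        if *used + data_len > self.max_total_size {
            return Err(format!(
                "quota exceeded: limit={} requested={} usage={}",
                self.max_total_size, data_len, *used
            ));
        }
        *used += data_len;
        Ok(())
    }
}

fn main() {
    let quota = QuotaSketch { max_total_size: 10, used: Mutex::new(0) };
    assert!(quota.check_write(8).is_ok());
    assert!(quota.check_write(5).is_err()); // 8 + 5 > 10
    assert!(quota.check_write(2).is_ok());  // exactly at the limit is fine
}
```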
ADR-007: Capability via Trait Bounds
Decision: Symlink and hard-link capability is determined by whether the backend implements FsLink. The default PathResolver (IterativeResolver) follows symlinks when FsLink is available.
The Rule
| Backend implements FsLink? | Symlinks work? |
|---|---|
| Yes (B: FsLink) | Yes |
| No | No (won't compile) |
Examples
#![allow(unused)]
fn main() {
// MemoryBackend implements FsLink
let fs = FileStorage::new(MemoryBackend::new());
fs.symlink("/target", "/link")?; // ✅ Works
// Custom backend that doesn't implement FsLink
let fs = FileStorage::new(MySimpleBackend::new());
fs.symlink("/target", "/link")?; // ❌ Won't compile - no FsLink impl
}
Why Not Runtime Blocking?
A hypothetical deny_symlinks() middleware would create type/behavior mismatch:
- Type says “I implement FsLink”
- Runtime says “but symlink() errors”
This is confusing and defeats the purpose of Rust’s type system. Instead, symlink capability is determined at compile time by trait bounds.
Restrictions Middleware
Restrictions<B> is limited to operations where runtime policy makes sense:
#![allow(unused)]
fn main() {
let backend = RestrictionsLayer::builder()
.deny_permissions() // Prevent metadata changes
.build()
.layer(backend);
}
Symlink following: Controlled by the PathResolver. The default IterativeResolver follows symlinks when FsLink is available. Custom resolvers can implement different behaviors. OS-backed backends delegate to the OS (strict-path prevents escapes).
ADR-008: FileStorage as thin ergonomic wrapper
Decision: FileStorage<B> is a thin wrapper that provides std::fs-aligned ergonomics and path resolution for virtual backends. It contains NO policy logic.
Context: Earlier designs used FileStorage<B, R, M> with three type parameters:
- B - Backend type
- R - PathResolver type (default: IterativeResolver)
- M - Marker type for compile-time container differentiation
This was over-engineered. We simplified to a single generic.
Why only one generic parameter?
| Removed | Rationale |
|---|---|
| R (Resolver) | Path resolution is a cold path (once per operation, I/O dominates). Boxing is acceptable per ADR-025. Runtime swapping via with_resolver() is sufficient. |
| M (Marker) | Speculative feature with unclear demand. Prior art (vfs, cap-std, tempfile) has no marker parameters. Users who need type safety can create wrapper newtypes: struct SandboxFs(FileStorage<MemoryBackend>). |
What it does:
- Provides familiar method names
- Accepts impl AsRef<Path> for convenience and forwards to the core &Path traits
- Delegates path resolution to a boxed PathResolver (cold path, boxing OK per ADR-025)
- Delegates all operations to the wrapped backend
What it does NOT do:
- Quota enforcement (use Quota)
- Feature gating (use Restrictions)
- Instrumentation (use Tracing)
- Marker types (users create wrapper newtypes if needed)
- Any other policy
Why this design:
- Single responsibility - ergonomics + path resolution (no policy).
- One generic parameter keeps the API simple for 90% of users.
- Resolver is boxed because path resolution is a cold path.
- Users who need type-safe markers can create their own wrapper types.
- Policy is composable via middleware, not hardcoded.
User-defined type safety pattern:
#![allow(unused)]
fn main() {
// Instead of FileStorage<_, _, Sandbox>, users create:
struct SandboxFs(FileStorage<MemoryBackend>);
struct UserDataFs(FileStorage<SqliteBackend>);
fn process_sandbox(fs: &SandboxFs) { /* only accepts SandboxFs */ }
}
ADR-009: Simple backends in anyfs, complex backends as ecosystem crates
Decision: Simple backends (MemoryBackend, StdFsBackend, VRootFsBackend) are built into anyfs with feature flags. Complex backends (SqliteBackend, IndexedBackend) live in separate ecosystem crates.
Built-in backends (anyfs features):
- memory (default)
- stdfs (optional)
- vrootfs (optional)
Ecosystem crates:
- anyfs-sqlite — SqliteBackend with optional encryption feature
- anyfs-indexed — IndexedBackend (SQLite metadata + disk blobs)
Why:
- Simple backends have minimal dependencies (just std)
- Complex backends need internal runtimes (connection pools, sharding, chunking)
- Follows Tower/Axum pattern: framework is minimal, complex implementations in their own crates
- Reduces compile time and binary size for users who don’t need complex backends
ADR-010: Sync-first, async-ready design
Decision: Fs traits are synchronous. The API is designed to allow adding AsyncFs later without breaking changes.
Rationale:
- Built-in backends are naturally synchronous:
  - MemoryBackend - in-memory, instant
  - StdFsBackend / VRootFsBackend - std::fs is sync
- Ecosystem backends are also sync (e.g., SqliteBackend uses rusqlite, which is sync)
- Sync is simpler - no runtime dependency (tokio/async-std)
- Users can wrap sync backends in spawn_blocking if needed
Async-ready design principles:
- Traits require Send - compatible with async executors
- Return types are Result<T, FsError> - works with async
- No internal blocking assumptions
- Methods are stateless per-call - no hidden blocking state
Future async path (Option 2): When async is needed (e.g., network-backed storage), add a parallel trait:
#![allow(unused)]
fn main() {
// In anyfs-backend
pub trait AsyncFs: Send + Sync {
async fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
async fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
// ... mirrors Fs with async
// Streaming uses AsyncRead/AsyncWrite
async fn open_read(&self, path: &Path)
-> Result<Box<dyn AsyncRead + Send + Unpin>, FsError>;
}
}
Migration notes:
- AsyncFs would be a separate trait, not replacing Fs
- Blanket impl possible: impl<T: Fs> AsyncFs for T using spawn_blocking
- Middleware would need async variants: AsyncQuota<B>, etc.
- No breaking changes to existing sync API
Why not async now:
- Complexity without benefit - all current backends are sync
- Rust 1.75 makes async traits easy, so adding later is low-cost
- Better to wait for real async backend requirements
ADR-011: Layer trait for standardized composition
Decision: Provide a Layer trait (inspired by Tower) that standardizes middleware composition.
#![allow(unused)]
fn main() {
pub trait Layer<B: Fs> {
type Backend: Fs;
fn layer(self, backend: B) -> Self::Backend;
}
}
Note: For async compatibility, the trait is unbounded: Layer<B> without B: Fs. This allows the same layer types to implement both sync (impl<B: Fs> Layer<B>) and async (impl<B: AsyncFs> Layer<B>). See ADR-024.
Why:
- Standardized composition pattern familiar to Tower/Axum users.
- IDE autocomplete for available layers.
- Enables BackendStack fluent builder in anyfs.
- Each middleware provides a corresponding *Layer type.
Example:
#![allow(unused)]
fn main() {
// SqliteBackend from anyfs-sqlite crate
let backend = SqliteBackend::open("data.db")?
.layer(QuotaLayer::builder()
.max_total_size(100_000)
.build())
.layer(TracingLayer::new());
}
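The composition mechanics can be demonstrated with self-contained stand-ins. The trait names follow the ADR, but the bodies here (a toy `MemoryBackend` and a hypothetical `CountingLayer`) are illustrative only, not the real anyfs implementations.

```rust
use std::cell::Cell;
use std::path::Path;

// Minimal stand-in for the core trait.
trait Fs {
    fn read(&self, path: &Path) -> Result<Vec<u8>, String>;
}

// The Layer trait from the ADR: consumes a backend, returns a wrapped one.
trait Layer<B: Fs> {
    type Backend: Fs;
    fn layer(self, backend: B) -> Self::Backend;
}

struct MemoryBackend;
impl Fs for MemoryBackend {
    fn read(&self, _path: &Path) -> Result<Vec<u8>, String> {
        Ok(b"hello".to_vec())
    }
}

// A trivial middleware that counts reads, then delegates to the inner backend.
struct Counted<B> {
    inner: B,
    reads: Cell<u32>,
}
impl<B: Fs> Fs for Counted<B> {
    fn read(&self, path: &Path) -> Result<Vec<u8>, String> {
        self.reads.set(self.reads.get() + 1);
        self.inner.read(path)
    }
}

struct CountingLayer;
impl<B: Fs> Layer<B> for CountingLayer {
    type Backend = Counted<B>;
    fn layer(self, backend: B) -> Counted<B> {
        Counted { inner: backend, reads: Cell::new(0) }
    }
}

fn main() {
    let backend = CountingLayer.layer(MemoryBackend);
    assert_eq!(backend.read(Path::new("/file")).unwrap(), b"hello");
    assert_eq!(backend.reads.get(), 1);
}
```

Because `layer` returns a concrete `Counted<MemoryBackend>`, the whole stack is statically dispatched; no boxing is involved until the user opts in.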
ADR-012: Tracing for instrumentation
Decision: Use Tracing<B> integrated with the tracing ecosystem instead of a custom logging solution.
Why:
- Works with existing tracing infrastructure (tracing-subscriber, OpenTelemetry, Jaeger).
- Structured logging with spans for each operation.
- Users choose their subscriber - no logging framework lock-in.
- Consistent with modern Rust ecosystem practices.
Configuration:
#![allow(unused)]
fn main() {
backend.layer(TracingLayer::new()
.with_target("anyfs")
.with_level(tracing::Level::DEBUG))
}
ADR-013: FsExt for extension methods
Decision: Provide FsExt trait with convenience methods, auto-implemented for all backends.
#![allow(unused)]
fn main() {
pub trait FsExt: Fs {
fn is_file(&self, path: impl AsRef<Path>) -> Result<bool, FsError>;
fn is_dir(&self, path: impl AsRef<Path>) -> Result<bool, FsError>;
// JSON methods require `serde` feature
#[cfg(feature = "serde")]
fn read_json<T: DeserializeOwned>(&self, path: impl AsRef<Path>) -> Result<T, FsError>;
#[cfg(feature = "serde")]
fn write_json<T: Serialize>(&self, path: impl AsRef<Path>, value: &T) -> Result<(), FsError>;
}
impl<B: Fs> FsExt for B {}
}
Feature gating:
- is_file() and is_dir() are always available.
- read_json() and write_json() require anyfs-backend = { features = ["serde"] }.
Why:
- Adds convenience without bloating the Fs trait.
- Blanket impl means all backends get these methods for free.
- Users can define their own extension traits for domain-specific operations.
- Follows Rust convention (e.g., IteratorExt, StreamExt).
- Serde is optional - users who don't need JSON avoid the dependency.
ADR-014: Optional Bytes support
Decision: Support the bytes crate via an optional feature for zero-copy efficiency.
anyfs = { version = "0.1", features = ["bytes"] }
Why:
- Bytes provides O(1) slicing via reference counting.
- Beneficial for large file handling, network backends, streaming.
- Optional - users who don't need it avoid the dependency.
- Core traits remain Vec<u8> for simplicity and Send + Sync compliance.
Implementation: The bytes feature adds a convenience method to FileStorage, not a core trait change:
#![allow(unused)]
fn main() {
// In anyfs/src/container.rs (behind `bytes` feature)
impl<B: Fs> FileStorage<B> {
#[cfg(feature = "bytes")]
pub fn read_bytes(&self, path: impl AsRef<Path>) -> Result<bytes::Bytes, FsError> {
Ok(bytes::Bytes::from(self.read(path)?))
}
}
}
Core traits unchanged: FsRead::read() returns Vec<u8>. The bytes feature only adds ergonomic wrappers.
ADR-015: Contextual FsError
Decision: FsError variants include context for better debugging.
#![allow(unused)]
fn main() {
pub enum FsError {
    NotFound {
        path: PathBuf,
    },
    QuotaExceeded {
        limit: u64,
        requested: u64,
        usage: u64,
    },
    // ... other variants
}
}
Why:
- Error messages include enough context to understand what failed.
- No need for separate error context crate (like anyhow) for basic usage.
- Path is sufficient for NotFound - the call site knows the operation.
- Quota errors include all relevant numbers for debugging.
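A Display implementation shows how the contextual fields surface in messages. This is a sketch: the variant shapes follow the ADR, but the exact wording (and whether the real crate uses thiserror) is an assumption.

```rust
use std::fmt;
use std::path::PathBuf;

/// Illustrative subset of the contextual error type from the ADR.
#[derive(Debug)]
enum FsError {
    NotFound { path: PathBuf },
    QuotaExceeded { limit: u64, requested: u64, usage: u64 },
}

impl fmt::Display for FsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // The path alone is enough context: the call site knows the operation.
            FsError::NotFound { path } => write!(f, "not found: {}", path.display()),
            // Quota errors carry every number needed to debug the failure.
            FsError::QuotaExceeded { limit, requested, usage } => write!(
                f,
                "quota exceeded: requested {requested} bytes with {usage}/{limit} used"
            ),
        }
    }
}

fn main() {
    let err = FsError::NotFound { path: PathBuf::from("/x") };
    assert_eq!(err.to_string(), "not found: /x");
    let err = FsError::QuotaExceeded { limit: 100, requested: 50, usage: 80 };
    assert_eq!(err.to_string(), "quota exceeded: requested 50 bytes with 80/100 used");
}
```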
ADR-016: PathFilter for path-based access control
Decision: Provide PathFilter<B> middleware for glob-based path access control.
Configuration:
#![allow(unused)]
fn main() {
PathFilterLayer::builder()
.allow("/workspace/**") // Allow workspace access
.deny("**/.env") // Deny .env files anywhere
.deny("**/secrets/**") // Deny secrets directories
.build()
.layer(backend)
}
Semantics:
- Deny rules are evaluated first and take precedence over allow rules.
- If path matches any deny rule, access is denied.
- If path matches an allow rule (and no deny), access is granted.
- If no rules match, access is denied (deny by default).
- Uses glob patterns (e.g., ** for recursive, * for single segment).
- Returns FsError::AccessDenied for denied paths.
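The deny-first evaluation order can be sketched independently of the glob engine. The matcher below is a deliberately toy substitute for globset (prefix/suffix checks only), and `is_allowed` is a hypothetical helper, not the middleware's API.

```rust
/// Deny rules are evaluated first and take precedence; unmatched paths are
/// denied by default, per the semantics above.
fn is_allowed(path: &str, allow: &[&str], deny: &[&str]) -> bool {
    if deny.iter().any(|d| path_matches(path, d)) {
        return false;
    }
    allow.iter().any(|a| path_matches(path, a))
}

/// Toy matcher standing in for real glob matching: "prefix/**" matches any
/// path under prefix, "**/name" matches any path ending in name.
fn path_matches(path: &str, pattern: &str) -> bool {
    if let Some(prefix) = pattern.strip_suffix("/**") {
        path.starts_with(prefix)
    } else if let Some(suffix) = pattern.strip_prefix("**/") {
        path.ends_with(suffix)
    } else {
        path == pattern
    }
}

fn main() {
    let allow = ["/workspace/**"];
    let deny = ["**/.env"];
    assert!(is_allowed("/workspace/src/main.rs", &allow, &deny));
    assert!(!is_allowed("/workspace/.env", &allow, &deny)); // deny wins over allow
    assert!(!is_allowed("/etc/passwd", &allow, &deny));     // denied by default
}
```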
Why:
- Essential for AI agent sandboxing - restrict to specific directories.
- Prevents access to sensitive files (.env, secrets, credentials).
- Separate from backend - works with any backend.
- Inspired by AgentFS and similar AI sandbox patterns.
Implementation notes:
- Use the globset crate for efficient glob pattern matching.
- read_dir filters out denied entries from results (don't expose the existence of denied files).
- Check the path at operation start, then delegate to the inner backend.
ADR-017: ReadOnly for preventing writes
Decision: Provide ReadOnly<B> middleware that blocks all write operations.
Usage:
#![allow(unused)]
fn main() {
let readonly_fs = ReadOnly::new(backend);
}
Semantics:
- All read operations pass through to inner backend.
- All write operations return FsError::ReadOnly.
- Simple, no configuration needed.
Why:
- Safe browsing of container contents without modification risk.
- Useful for debugging, inspection, auditing.
- Simpler than configuring Restrictions for read-only use case.
ADR-018: RateLimit for operation throttling
Decision: Provide RateLimit<B> middleware to limit operations per time window.
Configuration:
#![allow(unused)]
fn main() {
RateLimitLayer::builder()
.max_ops(1000)
.per_second()
.build()
.layer(backend)
}
Semantics:
- Tracks operation count in fixed time window (simpler than sliding window, sufficient for most use cases).
- Returns FsError::RateLimitExceeded when the limit is exceeded.
- Counter resets when the window expires.
Why:
- Protects against runaway processes consuming resources.
- Essential for multi-tenant environments.
- Prevents denial-of-service from misbehaving code.
Implementation notes:
- Use std::time::Instant for timing.
- Store the window start time and a counter; reset when the window expires.
- Count operation calls (including open_read/open_write), not bytes transferred.
- Return an error immediately when the limit is exceeded (no blocking/waiting).
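A minimal fixed-window counter following these notes might look like the sketch below. `FixedWindow` is a hypothetical name; the real middleware would use interior mutability (&self) rather than &mut self, which is simplified here.

```rust
use std::time::{Duration, Instant};

/// Fixed-window rate limiter sketch: a window start time plus a counter,
/// reset when the window expires.
struct FixedWindow {
    max_ops: u32,
    window: Duration,
    start: Instant,
    count: u32,
}

impl FixedWindow {
    fn new(max_ops: u32, window: Duration) -> Self {
        Self { max_ops, window, start: Instant::now(), count: 0 }
    }

    fn check(&mut self) -> Result<(), &'static str> {
        if self.start.elapsed() >= self.window {
            // Window expired: start a new one and reset the counter.
            self.start = Instant::now();
            self.count = 0;
        }
        if self.count >= self.max_ops {
            // Error immediately; no blocking or waiting.
            return Err("rate limit exceeded");
        }
        self.count += 1;
        Ok(())
    }
}

fn main() {
    let mut limiter = FixedWindow::new(2, Duration::from_secs(60));
    assert!(limiter.check().is_ok());
    assert!(limiter.check().is_ok());
    assert!(limiter.check().is_err()); // third op in the same window
}
```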
ADR-019: DryRun for testing and debugging
Decision: Provide DryRun<B> middleware that logs write operations without executing them.
Usage:
#![allow(unused)]
fn main() {
let dry_run = DryRun::new(backend);
let fs = FileStorage::new(dry_run);
fs.write("/test.txt", b"hello")?; // Logged but not written
// To inspect recorded operations, keep the DryRun handle before wrapping it.
}
Semantics:
- Read operations execute normally against inner backend.
- Write operations are logged but return Ok(()) without executing.
- The operations log can be inspected for verification.
Why:
- Test code paths without side effects.
- Debug complex operation sequences.
- Audit what would happen before committing.
Implementation notes:
- Read operations delegate to inner backend (test against real state).
- Write operations log and return Ok(()) without executing.
- open_write returns std::io::sink() - writes are discarded.
- Useful for "What would this code do?", not "Run this in isolation."
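These semantics fit in a few lines. `DryRunSketch` is a hypothetical stand-in that wraps a plain map instead of a real backend; the point is only that reads see real state while writes are recorded, never applied.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Illustrative DryRun sketch: reads hit the inner store, writes are
/// recorded in a log but never executed.
struct DryRunSketch {
    inner: HashMap<String, Vec<u8>>,
    log: Mutex<Vec<String>>,
}

impl DryRunSketch {
    fn read(&self, path: &str) -> Option<&Vec<u8>> {
        // Reads delegate to real state.
        self.inner.get(path)
    }
    fn write(&self, path: &str, _data: &[u8]) {
        // Log the would-be write; do not mutate the inner store.
        self.log.lock().unwrap().push(format!("write {path}"));
    }
}

fn main() {
    let mut inner = HashMap::new();
    inner.insert("/a.txt".to_string(), b"old".to_vec());
    let fs = DryRunSketch { inner, log: Mutex::new(Vec::new()) };

    fs.write("/a.txt", b"new");
    assert_eq!(fs.read("/a.txt").unwrap(), b"old"); // state unchanged
    assert_eq!(fs.log.lock().unwrap().len(), 1);    // but the write was recorded
}
```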
ADR-020: Cache for read performance
Decision: Provide Cache<B> middleware with LRU caching for read operations.
Configuration:
#![allow(unused)]
fn main() {
CacheLayer::builder()
.max_entries(1000)
.max_entry_size(1024 * 1024) // 1MB max per entry
.build()
.layer(backend)
}
Semantics:
- Read operations check cache first, populate on miss.
- Write operations invalidate relevant cache entries.
- LRU eviction when max entries exceeded.
Why:
- Improves performance for repeated reads.
- Reduces load on underlying backend (especially for SQLite/network).
- Configurable to balance memory vs performance.
Implementation notes:
- Cache bulk reads only: read(), read_to_string(), read_range(), metadata(), exists().
- Do NOT cache open_read() - streams are for large files that shouldn't be cached.
- Invalidate the cache entry on any write to that path.
- Use the lru crate or similar for LRU eviction.
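The cache-hit, eviction, and write-invalidation rules can be sketched with a hand-rolled LRU (the real middleware would use the lru crate). `CacheSketch` and its methods are hypothetical names for illustration.

```rust
use std::collections::HashMap;

/// Toy read cache: recency order kept in a Vec (front = least recently used),
/// entries invalidated on write.
struct CacheSketch {
    max_entries: usize,
    entries: HashMap<String, Vec<u8>>,
    order: Vec<String>,
}

impl CacheSketch {
    fn touch(&mut self, path: &str) {
        self.order.retain(|p| p != path);
        self.order.push(path.to_string());
    }
    fn get(&mut self, path: &str) -> Option<Vec<u8>> {
        let hit = self.entries.get(path).cloned();
        if hit.is_some() {
            self.touch(path); // a hit makes the entry most recently used
        }
        hit
    }
    fn put(&mut self, path: &str, data: Vec<u8>) {
        if self.entries.len() >= self.max_entries && !self.entries.contains_key(path) {
            let lru = self.order.remove(0); // evict least recently used
            self.entries.remove(&lru);
        }
        self.entries.insert(path.to_string(), data);
        self.touch(path);
    }
    fn invalidate(&mut self, path: &str) {
        // Called on any write to that path.
        self.entries.remove(path);
        self.order.retain(|p| p != path);
    }
}

fn main() {
    let mut cache = CacheSketch { max_entries: 2, entries: HashMap::new(), order: Vec::new() };
    cache.put("/a", b"a".to_vec());
    cache.put("/b", b"b".to_vec());
    cache.get("/a");                // /a is now most recently used
    cache.put("/c", b"c".to_vec()); // evicts /b, the LRU entry
    assert!(cache.get("/b").is_none());
    assert!(cache.get("/a").is_some());
    cache.invalidate("/a");
    assert!(cache.get("/a").is_none());
}
```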
ADR-021: Overlay for union filesystem
Decision: Provide Overlay<B1, B2> middleware for copy-on-write layered filesystems.
Usage:
#![allow(unused)]
fn main() {
// SqliteBackend from anyfs-sqlite crate
let base = SqliteBackend::open("base.db")?; // Read-only base
let upper = MemoryBackend::new(); // Writable upper layer
let overlay = Overlay::new(base, upper);
}
Semantics:
- Read: check upper layer first, fall back to base if not found.
- Write: always to upper layer (copy-on-write).
- Delete: create whiteout marker in upper layer (file appears deleted but base unchanged).
- Directory listing: merge results from both layers.
Why:
- Docker-like layered filesystem for containers.
- Base image with per-instance modifications.
- Testing with isolated changes over shared baseline.
- Inspired by OverlayFS and VFS crate patterns.
Implementation notes:
- Whiteout convention: .wh.<filename> marks deleted files from the base layer.
- read_dir must merge results from both layers, excluding whiteouts and whited-out files.
- exists checks upper first, then base (respecting whiteouts).
- All writes go to the upper layer; the base is never modified.
- Consider opaque directories (.wh..wh..opq) to hide entire base directories.
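The whiteout mechanics are easiest to see over two plain maps. `OverlaySketch` is a hypothetical illustration of the read/write/delete rules above, not the real Overlay<B1, B2> type.

```rust
use std::collections::HashMap;

/// Overlay semantics sketch: upper layer wins, deletes become `.wh.` markers,
/// and the base layer is never modified.
struct OverlaySketch {
    base: HashMap<String, Vec<u8>>,
    upper: HashMap<String, Vec<u8>>,
}

/// Whiteout convention: ".wh.<filename>" alongside the original path.
fn whiteout_key(path: &str) -> String {
    let (dir, name) = path.rsplit_once('/').unwrap_or(("", path));
    format!("{dir}/.wh.{name}")
}

impl OverlaySketch {
    fn read(&self, path: &str) -> Option<&Vec<u8>> {
        if self.upper.contains_key(&whiteout_key(path)) {
            return None; // whited-out: appears deleted even if base has it
        }
        self.upper.get(path).or_else(|| self.base.get(path))
    }
    fn write(&mut self, path: &str, data: &[u8]) {
        self.upper.remove(&whiteout_key(path)); // re-creating clears the whiteout
        self.upper.insert(path.to_string(), data.to_vec());
    }
    fn remove(&mut self, path: &str) {
        self.upper.remove(path);
        self.upper.insert(whiteout_key(path), Vec::new());
    }
}

fn main() {
    let mut overlay = OverlaySketch { base: HashMap::new(), upper: HashMap::new() };
    overlay.base.insert("/etc/conf".to_string(), b"base".to_vec());

    assert_eq!(overlay.read("/etc/conf").unwrap(), b"base"); // falls through to base
    overlay.write("/etc/conf", b"upper");
    assert_eq!(overlay.read("/etc/conf").unwrap(), b"upper"); // copy-on-write
    overlay.remove("/etc/conf");
    assert!(overlay.read("/etc/conf").is_none());             // whiteout hides base
    assert_eq!(overlay.base["/etc/conf"], b"base");           // base untouched
}
```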
ADR-022: Builder pattern for configurable middleware
Decision: Middleware that requires configuration MUST use a builder pattern that prevents construction without meaningful values. ::new() constructors are NOT allowed for middleware where a default configuration is nonsensical.
Problem: A constructor like QuotaLayer::new() raises the question: “What quota?” An unlimited quota is pointless - you wouldn’t use QuotaLayer at all. Similarly, RestrictionsLayer::new() with no restrictions, PathFilterLayer::new() with no rules, and RateLimitLayer::new() with no rate limit are all nonsensical.
Solution: Use builders that enforce at least one meaningful configuration:
#![allow(unused)]
fn main() {
// QuotaLayer - requires at least one limit
let quota = QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024)
.build();
// Can also set multiple limits
let quota = QuotaLayer::builder()
.max_total_size(1_000_000)
.max_file_size(100_000)
.max_node_count(1000)
.build();
// RestrictionsLayer - requires at least one restriction
let restrictions = RestrictionsLayer::builder()
.deny_permissions()
.build();
// PathFilterLayer - requires at least one rule
let filter = PathFilterLayer::builder()
.allow("/workspace/**")
.deny("**/.env")
.build();
// RateLimitLayer - requires rate limit parameters
let rate_limit = RateLimitLayer::builder()
.max_ops(1000)
.per_second()
.build();
// CacheLayer - requires cache configuration
let cache = CacheLayer::builder()
.max_entries(1000)
.build();
}
Middleware that MAY keep ::new():
| Middleware | Rationale |
|---|---|
| TracingLayer | Default (global tracing subscriber) is meaningful |
| ReadOnlyLayer | No configuration needed |
| DryRunLayer | No configuration needed |
| OverlayLayer | Takes two backends as required params: Overlay::new(lower, upper) |
Implementation:
#![allow(unused)]
fn main() {
// Builder with typestate pattern for compile-time enforcement
pub struct QuotaLayerBuilder<State = Unconfigured> {
max_total_size: Option<u64>,
max_file_size: Option<u64>,
max_node_count: Option<u64>,
_state: PhantomData<State>,
}
pub struct Unconfigured;
pub struct Configured;
impl QuotaLayerBuilder<Unconfigured> {
pub fn max_total_size(mut self, bytes: u64) -> QuotaLayerBuilder<Configured> {
self.max_total_size = Some(bytes);
QuotaLayerBuilder {
max_total_size: self.max_total_size,
max_file_size: self.max_file_size,
max_node_count: self.max_node_count,
_state: PhantomData,
}
}
pub fn max_file_size(mut self, bytes: u64) -> QuotaLayerBuilder<Configured> {
// Similar transition to Configured state
}
pub fn max_node_count(mut self, count: u64) -> QuotaLayerBuilder<Configured> {
// Similar transition to Configured state
}
// Note: NO build() method on Unconfigured state!
}
impl QuotaLayerBuilder<Configured> {
// Additional configuration methods stay in Configured state
pub fn max_total_size(mut self, bytes: u64) -> Self {
self.max_total_size = Some(bytes);
self
}
// Only Configured state has build()
pub fn build(self) -> QuotaLayer {
QuotaLayer { /* ... */ }
}
}
impl QuotaLayer {
pub fn builder() -> QuotaLayerBuilder<Unconfigured> {
QuotaLayerBuilder {
max_total_size: None,
max_file_size: None,
max_node_count: None,
_state: PhantomData,
}
}
}
}
Why:
- Compile-time safety: Invalid configurations don’t compile.
- Self-documenting API: Users must explicitly choose configuration.
- No meaningless defaults: Eliminates “what does this default to?” confusion.
- IDE guidance: Autocomplete shows required methods before build().
- Familiar pattern: Rust builders are idiomatic and widely understood.
Error prevention:
#![allow(unused)]
fn main() {
// This won't compile - no build() on Unconfigured
let quota = QuotaLayer::builder().build(); // ❌ Error!
// This compiles - at least one limit set
let quota = QuotaLayer::builder()
.max_total_size(1_000_000)
.build(); // ✅ OK
}
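The elided implementation above compiles end-to-end when trimmed to one limit. This compact, runnable version is illustrative only (the real builder carries all the quota fields):

```rust
use std::marker::PhantomData;

struct Unconfigured;
struct Configured;

struct QuotaLayer {
    max_total_size: u64,
}

struct Builder<State> {
    max_total_size: Option<u64>,
    _state: PhantomData<State>,
}

impl Builder<Unconfigured> {
    fn new() -> Self {
        Builder { max_total_size: None, _state: PhantomData }
    }
    // Setting a limit transitions to the Configured state.
    fn max_total_size(self, bytes: u64) -> Builder<Configured> {
        Builder { max_total_size: Some(bytes), _state: PhantomData }
    }
    // Note: no build() here - an unconfigured builder cannot produce a layer.
}

impl Builder<Configured> {
    // Only the Configured state has build().
    fn build(self) -> QuotaLayer {
        QuotaLayer { max_total_size: self.max_total_size.unwrap() }
    }
}

fn main() {
    let quota = Builder::new().max_total_size(1_000_000).build();
    assert_eq!(quota.max_total_size, 1_000_000);
    // Builder::new().build(); // would not compile: no build() on Unconfigured
}
```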
ADR-023: Interior mutability for all trait methods
Decision: All Fs trait methods use &self, not &mut self. Backends manage their own synchronization internally (interior mutability).
Previous design:
#![allow(unused)]
fn main() {
pub trait FsRead: Send {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
}
pub trait FsWrite: Send {
fn write(&mut self, path: &Path, data: &[u8]) -> Result<(), FsError>;
}
}
New design:
#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
}
pub trait FsWrite: Send + Sync {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
}
}
Why:
- Filesystems are conceptually always mutable. A filesystem doesn't become "borrowed" when you write to it - the underlying storage manages concurrency itself.
- Enables concurrent access patterns. With &mut self, you cannot have concurrent readers and writers even when the backend supports it (e.g., SQLite with WAL mode, real filesystems).
- Matches real-world filesystem semantics. std::fs::write() takes a path, not a mutable reference to some filesystem object. Files are shared resources.
- Simplifies middleware implementation. Middleware no longer needs to worry about propagating mutability - all operations use &self.
- Common pattern in Rust. Many I/O abstractions use interior mutability: std::io::Write for File (via OS handles), tokio::fs, database connection pools, etc.
Implementation:
Backends use appropriate synchronization primitives:
#![allow(unused)]
fn main() {
pub struct MemoryBackend {
// Interior mutability via Mutex/RwLock
data: RwLock<HashMap<PathBuf, Vec<u8>>>,
}
impl FsWrite for MemoryBackend {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
let mut guard = self.data.write().unwrap();
guard.insert(path.to_path_buf(), data.to_vec());
Ok(())
}
}
pub struct SqliteBackend {
    // rusqlite::Connection is Send but not Sync, so guard it with a Mutex
    // to satisfy the Send + Sync trait bounds.
    conn: Mutex<Connection>,
}
}
Trade-offs:
| Aspect | &mut self | &self (interior mutability) |
|---|---|---|
| Compile-time safety | Single writer enforced | Runtime synchronization |
| Concurrent access | Not possible | Backend decides |
| API simplicity | Simple | Slightly more complex backends |
| Real-world match | Poor | Good |
Backend implementer responsibility:
Backends MUST use interior mutability (RwLock, Mutex, etc.) to ensure thread-safe concurrent access. This guarantees:
- Memory safety (no data corruption)
- Atomic operations (a single write() won't produce partial results)
This does NOT guarantee:
- Order of concurrent writes to the same path (last write wins - standard FS behavior)
Conclusion: The benefits of matching filesystem semantics and enabling concurrent access outweigh the loss of compile-time single-writer enforcement. Backends are responsible for their own thread safety via interior mutability.
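A runnable demonstration of the &self pattern: a shared backend written to from several threads, with no &mut anywhere. The struct mirrors the MemoryBackend sketch above; the method set is trimmed for illustration.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::sync::{Arc, RwLock};
use std::thread;

/// Interior mutability: writes go through &self, synchronized by the RwLock.
struct MemoryBackend {
    data: RwLock<HashMap<PathBuf, Vec<u8>>>,
}

impl MemoryBackend {
    fn write(&self, path: &Path, data: &[u8]) {
        self.data.write().unwrap().insert(path.to_path_buf(), data.to_vec());
    }
    fn len(&self) -> usize {
        self.data.read().unwrap().len()
    }
}

fn main() {
    let backend = Arc::new(MemoryBackend { data: RwLock::new(HashMap::new()) });
    // Four threads write concurrently to distinct paths through shared &self.
    let handles: Vec<_> = (0..4)
        .map(|i| {
            let b = Arc::clone(&backend);
            thread::spawn(move || b.write(Path::new(&format!("/f{i}")), b"x"))
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(backend.len(), 4);
}
```

With &mut self this would not compile at all: Arc only hands out shared references, so the interior-mutability design is what makes shared, concurrent backends possible.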
ADR-024: Async Strategy
Status: Accepted
Context: Async/await is prevalent in Rust networking and I/O. While AnyFS is primarily sync-focused (matching std::fs), we may need async support in the future for:
- Network-backed storage (S3, WebDAV, etc.)
- High-concurrency scenarios
- Integration with async runtimes (tokio, async-std)
Decision: Plan for a parallel async trait hierarchy that mirrors the sync traits.
Strategy:
Sync Traits Async Traits
----------- ------------
FsRead → AsyncFsRead
FsWrite → AsyncFsWrite
FsDir → AsyncFsDir
Fs → AsyncFs
FsFull → AsyncFsFull
FsFuse → AsyncFsFuse
FsPosix → AsyncFsPosix
Design principles:
- Separate crate: Async traits live in anyfs-async to avoid pulling async dependencies into the core.
- Method parity: Each async trait method corresponds 1:1 with its sync counterpart:
#![allow(unused)]
fn main() {
// Sync (anyfs-backend)
pub trait FsRead: Send + Sync {
    fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
}
// Async (anyfs-async)
#[async_trait]
pub trait AsyncFsRead: Send + Sync {
    async fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
}
}
- Layer trait compatibility: The Layer trait works for both sync and async:
#![allow(unused)]
fn main() {
pub trait Layer<B> {
    type Backend;
    fn layer(self, backend: B) -> Self::Backend;
}
// Middleware can implement for both:
impl<B: Fs> Layer<B> for QuotaLayer {
    type Backend = Quota<B>;
    fn layer(self, backend: B) -> Self::Backend { ... }
}
impl<B: AsyncFs> Layer<B> for QuotaLayer {
    type Backend = AsyncQuota<B>;
    fn layer(self, backend: B) -> Self::Backend { ... }
}
}
- Sync-to-async bridge: Provide adapters for using sync backends in async contexts:
#![allow(unused)]
fn main() {
// Wraps sync backend for use in async code (uses spawn_blocking)
pub struct SyncToAsync<B>(B);
impl<B: Fs> AsyncFsRead for SyncToAsync<B> {
    async fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
        let path = path.to_path_buf();
        let backend = self.0.clone(); // requires Clone
        tokio::task::spawn_blocking(move || backend.read(&path)).await?
    }
}
}
- No async-to-sync bridge: We intentionally don't provide async-to-sync adapters (would require blocking on an async runtime, which is problematic).
Implementation phases:
| Phase | Scope | Dependency |
|---|---|---|
| 1 | Sync traits stable | Now |
| 2 | Design async traits | When needed |
| 3 | anyfs-async crate | When needed |
| 4 | Async middleware | When needed |
Why parallel traits (not feature flags):
- No conditional compilation complexity - sync and async are separate, clean codebases
- No trait object issues - async traits have different object safety requirements
- Clear dependency boundaries - sync code doesn’t pull in tokio/async-std
- Ecosystem alignment - mirrors how std::io and tokio::io coexist in the ecosystem
Trade-offs:
| Approach | Pros | Cons |
|---|---|---|
| Parallel traits | Clean separation, no async deps in core | Code duplication in middleware |
| Feature flags | Single codebase | Complex conditional compilation |
| Async-only | Modern, no duplication | Forces async runtime on sync users |
| Sync-only | Simple | Can’t support network backends efficiently |
Conclusion: Parallel async traits provide the best balance of simplicity now (sync-only core) with a clear migration path for async support later. The Layer trait design already accommodates this pattern.
ADR-025: Strategic Boxing (Tower-style)
Status: Accepted
Context: Dynamic dispatch (Box<dyn Trait>) adds heap allocation and vtable indirection. We need to decide where boxing is acceptable vs. where zero-cost abstractions are required.
Decision: Follow Tower/Axum’s battle-tested strategy: zero-cost on the hot path, box at boundaries where flexibility is needed and I/O cost dominates.
Principle: Avoid heap allocations and dynamic dispatch unless they buy real flexibility with negligible performance impact. Box only at cold boundaries (streams/iterators), and make type erasure explicit and opt-in.
DX stance: Application code uses FileStorage/FsExt (std::fs-style paths). Core traits stay object-safe for dyn Fs. For hot loops on known concrete backends, we provide a typed streaming extension as the first-class zero-alloc fast path.
Boxing Strategy
HOT PATH (many calls per operation - must be zero-cost):
┌─────────────────────────────────────────────────────┐
│ read(), write(), metadata(), exists() │ ← Returns concrete types
│ Read::read() / Write::write() on streams │ ← Vtable dispatch only
│ Iterator::next() on ReadDirIter │ ← Vtable dispatch only
│ Middleware composition │ ← Generics, monomorphized
└─────────────────────────────────────────────────────┘
COLD PATH (once per operation - boxing acceptable):
┌─────────────────────────────────────────────────────┐
│ open_read(), open_write() │ ← Box<dyn Read/Write>
│ read_dir() │ ← ReadDirIter (boxed inner)
└─────────────────────────────────────────────────────┘
SETUP (once at startup - zero-cost):
┌─────────────────────────────────────────────────────┐
│ Middleware stacking: Quota<Tracing<B>> │ ← Generics, no boxing
│ FileStorage::new(backend) │ ← Zero-cost wrapper
└─────────────────────────────────────────────────────┘
OPT-IN TYPE ERASURE (when explicitly needed):
┌─────────────────────────────────────────────────────────────┐
│ FileStorage::boxed() -> FileStorage<Box<dyn Fs>> │ ← Like Tower's BoxService
│ (Resolver already boxed internally - this boxes backend) │
└─────────────────────────────────────────────────────────────┘
What Gets Boxed and Why
| API | Boxed? | Rationale |
|---|---|---|
read() → Vec<u8> | No | Hot path, most common operation |
write(data) → () | No | Hot path, most common operation |
metadata() → Metadata | No | Hot path, frequently called |
exists() → bool | No | Hot path, frequently called |
open_read() → Box<dyn Read> | Yes | Cold path (once per file), enables middleware wrappers |
open_write() → Box<dyn Write> | Yes | Cold path (once per file), enables QuotaWriter |
read_dir() → ReadDirIter | Yes (inner) | Enables filtering in PathFilter, merging in Overlay |
| Middleware stack | No | Generics compose at compile time |
FileStorage::boxed() | Opt-in | Explicit type erasure when needed |
Why This Works
1. Bulk operations are the common case:
Most code uses read() and write(), not streaming. These are zero-cost.
2. Streaming is for large files:
open_read() / open_write() are for files too large to load into memory. For large files, I/O time (1-100ms) dwarfs box allocation (~50ns).
3. Box once, vtable many:
After open_read() allocates once, subsequent Read::read() calls are just vtable dispatch - no further allocations.
4. Middleware needs flexibility:
- Quota wraps streams with QuotaWriter to count bytes
- PathFilter filters ReadDirIter to hide denied entries
- Overlay merges directory listings from two backends

Boxing enables this without type explosion.
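A minimal sketch of the hot/cold split described above, using hypothetical simplified traits (the real Fs traits take `&Path` and return `Result`): the bulk `read()` returns a concrete `Vec<u8>` with no boxing, while `open_read()` boxes once so a middleware-style wrapper can intercept the stream without naming its concrete type.

```rust
use std::io::{Cursor, Read};

// Hypothetical minimal trait: hot-path `read` is concrete, cold-path
// `open_read` boxes so middleware can wrap the stream.
trait FsRead {
    fn read(&self, path: &str) -> Vec<u8>;
    fn open_read(&self, path: &str) -> Box<dyn Read>;
}

struct Memory;

impl FsRead for Memory {
    fn read(&self, _path: &str) -> Vec<u8> {
        b"hello".to_vec() // zero boxing on the hot path
    }
    fn open_read(&self, path: &str) -> Box<dyn Read> {
        Box::new(Cursor::new(self.read(path))) // one allocation per open
    }
}

// A middleware can wrap the boxed stream without a type explosion.
struct Counting<R: Read> {
    inner: R,
    bytes: usize,
}

impl<R: Read> Read for Counting<R> {
    fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        let n = self.inner.read(buf)?;
        self.bytes += n; // count bytes, QuotaWriter-style
        Ok(n)
    }
}

fn main() {
    let fs = Memory;
    assert_eq!(fs.read("/f"), b"hello");

    // Box once at open; every subsequent read is vtable dispatch only.
    let mut counting = Counting { inner: fs.open_read("/f"), bytes: 0 };
    let mut out = Vec::new();
    counting.read_to_end(&mut out).unwrap();
    assert_eq!(counting.bytes, 5);
}
```

Note that `Counting<R>` is generic over the reader, so it works both with concrete readers (zero-cost) and with `Box<dyn Read>` (one allocation at open).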
Comparison to Tower/Axum
| AnyFS | Tower/Axum | Purpose |
|---|---|---|
Quota<Tracing<B>> | Timeout<RateLimit<S>> | Zero-cost middleware composition |
Box<dyn Read> | Pin<Box<dyn Future>> | Flexibility at boundaries |
ReadDirIter | BoxedIntoRoute | Type erasure for storage |
FileStorage::boxed() | BoxService / BoxCloneService | Opt-in type erasure |
Tower provides BoxService / BoxCloneService for opt-in type erasure, and Axum’s Router uses BoxedIntoRoute to store handlers. We follow the same pattern.
Cost Analysis
| Operation | Box Allocation | Actual I/O | Box % of Total |
|---|---|---|---|
| Open + read 4KB file | ~50ns | ~10,000ns | 0.5% |
| Open + read 1MB file | ~50ns | ~1,000,000ns | 0.005% |
| List directory (10 entries) | ~50ns | ~5,000ns | 1% |
The boxing cost is negligible relative to actual I/O.
Alternatives Considered
1. Associated types everywhere:
#![allow(unused)]
fn main() {
pub trait FsRead {
type Reader: Read + Send;
fn open_read(&self, path: &Path) -> Result<Self::Reader, FsError>;
}
}
Rejected: Causes type explosion. QuotaReader<PathFilterReader<TracingReader<Cursor<Vec<u8>>>>> is unwieldy and every middleware needs a custom wrapper type.
2. RPITIT (Rust 1.75+):
#![allow(unused)]
fn main() {
fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError>;
}
Rejected as default: Loses object safety. Can’t use dyn Fs for runtime backend selection.
3. Always box everything:
Rejected: Unnecessary overhead on hot path operations like read().
Future Considerations
If profiling shows stream boxing is a bottleneck (unlikely), we can add:
#![allow(unused)]
fn main() {
/// Extension trait for zero-cost streaming when backend type is known
pub trait FsReadTyped: FsRead {
type Reader: Read + Send;
fn open_read_typed(&self, path: &Path) -> Result<Self::Reader, FsError>;
}
}
This follows Tower’s pattern of providing both Service (with associated types) and BoxService (with type erasure).
Conclusion
Our boxing strategy mirrors Tower/Axum’s production-proven approach:
- Zero-cost where it matters (hot path bulk operations, middleware composition)
- Box where flexibility is needed (streaming I/O, iterator filtering)
- Opt-in type erasure (explicit boxed() method)
The performance cost is negligible (<1% of I/O time), while the ergonomic and flexibility benefits are substantial.
ADR-026: Companion shell (anyfs-shell)
Status: Accepted (Future)
Context: Users want a low-friction way to explore how different backends and middleware behave without writing a full application.
Decision: Provide a separate companion crate (e.g., anyfs-shell) that exposes a bash-style navigation and file management interface built on FileStorage.
Scope:
- Commands: ls, cd, cat, cp, mv, rm, mkdir, stat.
- Navigation and file management only; no full bash scripting, pipes, or job control.
- All operations route through FileStorage to exercise middleware and backend composition.
Why:
- Demonstrates backend neutrality and middleware effects in a tangible way.
- Useful for docs, demos, and quick validation.
- Keeps the core crates free of CLI/UI dependencies.
ADR-027: Permissive core; security via middleware
Status: Accepted
Context: We need predictable filesystem semantics across backends. Some use cases require strict sandboxing, while others expect full filesystem behavior. Baking security restrictions into core traits would make behavior surprising and backend-dependent.
Decision: Core traits are permissive: all operations supported by a backend are allowed by default. Security controls (limits, access control, read-only, rate limiting, audit) are applied via middleware such as Restrictions, PathFilter, ReadOnly, Quota, RateLimit, and Tracing.
Why:
- Predictability: core behavior matches std::fs expectations.
- Backend-agnostic: virtual and host backends share the same contract.
- Separation of concerns: policy lives in middleware, not storage semantics.
- Explicit security posture: applications opt in to the protections they need.
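The "permissive core, policy in middleware" split can be sketched with hypothetical simplified traits (the real ReadOnly middleware and Fs traits have larger surfaces): the backend allows everything it supports, and a wrapper layer denies writes without the backend knowing.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

// Illustrative error and trait; the real FsError has many more variants.
#[derive(Debug, PartialEq)]
enum FsError {
    ReadOnly,
}

trait Fs {
    fn read(&self, path: &str) -> Result<Vec<u8>, FsError>;
    fn write(&self, path: &str, data: &[u8]) -> Result<(), FsError>;
}

// Permissive backend: no policy, just storage semantics.
struct Memory {
    files: RefCell<HashMap<String, Vec<u8>>>,
}

impl Fs for Memory {
    fn read(&self, path: &str) -> Result<Vec<u8>, FsError> {
        Ok(self.files.borrow().get(path).cloned().unwrap_or_default())
    }
    fn write(&self, path: &str, data: &[u8]) -> Result<(), FsError> {
        self.files.borrow_mut().insert(path.to_string(), data.to_vec());
        Ok(())
    }
}

// Policy lives here: deny writes, pass reads through unchanged.
struct ReadOnly<B>(B);

impl<B: Fs> Fs for ReadOnly<B> {
    fn read(&self, path: &str) -> Result<Vec<u8>, FsError> {
        self.0.read(path)
    }
    fn write(&self, _path: &str, _data: &[u8]) -> Result<(), FsError> {
        Err(FsError::ReadOnly)
    }
}

fn main() {
    let backend = Memory { files: RefCell::new(HashMap::new()) };
    backend.write("/a", b"data").unwrap();

    let fs = ReadOnly(backend);
    assert_eq!(fs.read("/a").unwrap(), b"data");
    assert_eq!(fs.write("/a", b"nope"), Err(FsError::ReadOnly));
}
```

Because the wrapper implements the same trait, it composes with any backend and with other middleware in the stack.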
ADR-028: Linux-like semantics for virtual backends
Status: Accepted
Context: Cross-platform filesystems differ in case sensitivity, separators, reserved names, and path length limits. Virtual backends need a consistent model that does not inherit OS quirks.
Decision: Virtual backends use Linux-like semantics by default:
- Case-sensitive paths
- / as the internal separator
- No reserved names
- No max path length
- No ADS (:stream) support
Why:
- Cross-platform consistency for the same data.
- Fewer surprises and reduced security footguns.
- Simplifies backend implementation and testing.
- Custom semantics remain possible via middleware or custom backends.
ADR-029: Path resolution in FileStorage
Status: Accepted
Context: Path normalization (//, ., ..) and symlink resolution must be consistent across backends. Implementing this logic in every backend is error-prone and leads to divergent behavior.
Decision: FileStorage performs canonicalization and normalization for virtual backends. Backends receive resolved paths. Real filesystem backends (e.g., VRootFsBackend) delegate to OS resolution plus strict-path containment. FileStorage exposes canonicalize, soft_canonicalize, and anchored_canonicalize for explicit use.
Why:
- Consistent semantics across all backends.
- Centralizes security-critical path handling.
- Simplifies backend implementations.
- Makes conformance testing straightforward.
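The lexical half of this centralized resolution (collapsing //, ., and ..) can be sketched as a small pure function; symlink following, which FileStorage layers on top, is omitted here. This is an illustrative sketch, not the actual FileStorage code.

```rust
use std::path::{Component, Path, PathBuf};

// Lexically normalize `//`, `.`, and `..` the way FileStorage might before
// handing a resolved path to a virtual backend. Symlinks are not followed.
fn normalize(path: &Path) -> PathBuf {
    let mut out = PathBuf::from("/");
    for component in path.components() {
        match component {
            Component::RootDir => out = PathBuf::from("/"),
            Component::CurDir => {}
            Component::ParentDir => {
                out.pop(); // `..` above root stays at root, POSIX-style
            }
            Component::Normal(name) => out.push(name),
            Component::Prefix(_) => {} // no Windows prefixes in virtual paths
        }
    }
    out
}

fn main() {
    assert_eq!(normalize(Path::new("/a//b/./c")), PathBuf::from("/a/b/c"));
    assert_eq!(normalize(Path::new("/a/b/../c")), PathBuf::from("/a/c"));
    assert_eq!(normalize(Path::new("/../x")), PathBuf::from("/x"));
}
```

Because every backend receives paths already normalized this way, conformance tests only need to verify this one function plus each backend's handling of clean paths.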
ADR-030: Layered trait hierarchy
Status: Accepted
Context: Not all backends can or should implement full POSIX behavior. Forcing a single large trait would make simple backends harder to implement and would obscure capabilities.
Decision: Split the API into layered traits:
- Core: FsRead, FsWrite, FsDir (combined as Fs)
- Extensions: FsLink, FsPermissions, FsSync, FsStats
- FUSE: FsInode
- POSIX: FsHandles, FsLock, FsXattr
- Convenience supertraits: Fs, FsFull, FsFuse, FsPosix
Why:
- Implement the lowest level you need.
- Clear capability boundaries and trait bounds.
- Avoids forcing unsupported features on backends.
- Enables middleware to target specific capabilities.
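The supertrait-plus-blanket-impl mechanics can be sketched as follows, with each trait trimmed to a single method for illustration (the real traits carry full method sets and Send + Sync bounds):

```rust
use std::path::Path;

// Small capability traits; implement only what your backend supports.
trait FsRead {
    fn read(&self, path: &Path) -> Vec<u8>;
}
trait FsWrite {
    fn write(&self, path: &Path, data: &[u8]);
}
trait FsDir {
    fn read_dir(&self, path: &Path) -> Vec<String>;
}

// Convenience supertrait: any type with all three gets `Fs` for free.
trait Fs: FsRead + FsWrite + FsDir {}
impl<T: FsRead + FsWrite + FsDir> Fs for T {}

// Middleware and helpers bound on exactly the capability they need...
fn dump<B: FsRead>(fs: &B, path: &Path) -> Vec<u8> {
    fs.read(path)
}

// ...while application code can use the combined trait object.
fn use_full(fs: &dyn Fs) -> Vec<u8> {
    fs.read(Path::new("/"))
}

struct Null; // trivial backend for demonstration
impl FsRead for Null {
    fn read(&self, _: &Path) -> Vec<u8> { vec![1, 2] }
}
impl FsWrite for Null {
    fn write(&self, _: &Path, _: &[u8]) {}
}
impl FsDir for Null {
    fn read_dir(&self, _: &Path) -> Vec<String> { vec![] }
}

fn main() {
    assert_eq!(dump(&Null, Path::new("/f")), vec![1, 2]);
    assert_eq!(use_full(&Null), vec![1, 2]);
}
```

The blanket impl is what makes the supertraits pure convenience: backends never implement Fs directly, so there is no extra work to opt in.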
ADR-031: Indexing as middleware
Status: Accepted (Future)
Context: We want a durable, queryable index of file activity and metadata (for audit trails, drive management tools, and statistics). This indexing should be optional, configurable, and work across all backends.
Decision: Indexing is implemented as middleware (Indexing<B> with IndexLayer), not as a specialized backend. The middleware writes to a sidecar index (SQLite by default) and can evolve to support alternate index engines.
Naming: Use IndexLayer (builder) and Indexing<B> (middleware), consistent with existing layer naming.
Why:
- Separation of concerns: Indexing is policy/analytics, not storage semantics.
- Backend-agnostic: Works with Memory, SQLite, VRootFs, and custom backends.
- Composability: Users opt in and configure it like other middleware (Quota, Tracing).
- Flexibility: Allows future index engines without changing core traits.
- DX consistency: Keeps std::fs-style usage via FileStorage with no API changes.
Trade-offs:
- External OS changes: Not captured unless a future watcher/scan helper is added.
- Index failures: Choose between strict mode (fail the op) and best-effort mode.
Implementation sketch:
- IndexLayer::builder().index_file("index.db").consistency(IndexConsistency::Strict)...
- Wraps open_write() with a counting writer to record the final size on close.
- Updates a nodes table and logs ops entries per operation.
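The counting-writer idea in the sketch above can be illustrated like this: wrap the stream returned by open_write() and publish the final byte count when the writer is dropped, which is where the index update would happen. The names and the Rc<Cell<..>> reporting channel are illustrative, not the real anyfs API.

```rust
use std::cell::Cell;
use std::io::{self, Write};
use std::rc::Rc;

// Counts bytes as they flow through, reporting the total on drop (close).
struct CountingWriter<W: Write> {
    inner: W,
    written: u64,
    on_close: Rc<Cell<u64>>, // stand-in for "record size in the index"
}

impl<W: Write> Write for CountingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = self.inner.write(buf)?;
        self.written += n as u64;
        Ok(n)
    }
    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

impl<W: Write> Drop for CountingWriter<W> {
    fn drop(&mut self) {
        // In the real middleware, this is where the nodes table would be
        // updated with the final file size.
        self.on_close.set(self.written);
    }
}

fn main() {
    let size = Rc::new(Cell::new(0));
    {
        let mut w = CountingWriter {
            inner: Vec::<u8>::new(),
            written: 0,
            on_close: Rc::clone(&size),
        };
        w.write_all(b"hello world").unwrap();
    } // drop fires here, recording the final size
    assert_eq!(size.get(), 11);
}
```

Recording on drop rather than per write keeps the hot path cheap and gives the index one consistent final size per file.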
ADR-032: Path Canonicalization via FsPath Trait
Status: Accepted
Context: Path canonicalization (resolving .., ., and symlinks) is needed for consistent path handling. The naive approach of baking this into FileStorage has issues:
- It’s not testable in isolation
- It can’t be optimized per-backend
- N+1 queries for paths like
/a/b/c/d/e(each component = separate call)
Decision: Introduce an FsPath trait with canonicalize() and soft_canonicalize() methods that have default implementations but allow backend-specific optimizations.
The Pattern:
#![allow(unused)]
fn main() {
pub trait FsPath: FsRead + FsLink {
/// Resolve all symlinks and normalize path components.
/// Returns error if final path doesn't exist.
fn canonicalize(&self, path: &Path) -> Result<PathBuf, FsError> {
default_canonicalize(self, path)
}
/// Like canonicalize, but allows non-existent final component.
fn soft_canonicalize(&self, path: &Path) -> Result<PathBuf, FsError> {
default_soft_canonicalize(self, path)
}
}
// Auto-implement for all FsLink implementors
impl<T: FsRead + FsLink> FsPath for T {}
}
Default Implementation:
#![allow(unused)]
fn main() {
fn default_canonicalize<F: FsRead + FsLink>(fs: &F, path: &Path) -> Result<PathBuf, FsError> {
let mut resolved = PathBuf::from("/");
for component in path.components() {
match component {
Component::RootDir => resolved = PathBuf::from("/"),
Component::ParentDir => { resolved.pop(); },
Component::CurDir => {},
Component::Normal(name) => {
resolved.push(name);
if let Ok(meta) = fs.symlink_metadata(&resolved) {
if meta.file_type.is_symlink() {
let target = fs.read_link(&resolved)?;
resolved.pop();
resolved = resolve_relative(&resolved, &target);
}
}
},
_ => {},
}
}
// Verify final path exists
if !fs.exists(&resolved)? {
return Err(FsError::NotFound { path: resolved, operation: "canonicalize" });
}
Ok(resolved)
}
}
Backend Optimization Examples:
| Backend | Optimization |
|---|---|
SqliteBackend | Single recursive CTE query resolves entire path |
VRootFsBackend | Delegates to std::fs::canonicalize() + containment check |
MemoryBackend | Uses default (in-memory is fast anyway) |
SQLite Optimized Implementation:
#![allow(unused)]
fn main() {
impl FsPath for SqliteBackend {
fn canonicalize(&self, path: &Path) -> Result<PathBuf, FsError> {
// Single query with recursive CTE
self.conn.query_row(
r#"
WITH RECURSIVE path_resolve(segment, remaining, resolved, depth) AS (
-- Initial: split path into segments
SELECT ..., 0
UNION ALL
-- Recursive: resolve each segment, following symlinks
SELECT ...
FROM path_resolve
JOIN nodes ON ...
WHERE depth < 40 -- Loop protection
)
SELECT resolved FROM path_resolve
WHERE remaining = ''
ORDER BY depth DESC LIMIT 1
"#,
params![path.to_string_lossy()],
|row| Ok(PathBuf::from(row.get::<_, String>(0)?))
).map_err(|_| FsError::NotFound { path: path.into(), operation: "canonicalize" })
}
}
}
Why This Design:
| Benefit | Explanation |
|---|---|
| Portable default | Works with any Fs backend out of the box |
| Optimizable | Backends can override for O(1) queries vs O(n) |
| Testable | Canonicalization logic is separate, can be unit tested |
| Composable | Middleware can wrap/intercept canonicalization |
FileStorage Integration:
Note: ADR-033 introduces PathResolver as the primary resolution strategy. FsPath remains as an optional backend optimization hook. When a backend implements both FsPath and the default traits, it can choose to delegate to its resolver or provide fully custom logic (e.g., SQLite CTE queries).
FileStorage uses a boxed PathResolver internally for resolution (see ADR-033):
#![allow(unused)]
fn main() {
impl<B: Fs> FileStorage<B> {
pub fn canonicalize(&self, path: impl AsRef<Path>) -> Result<PathBuf, FsError> {
self.resolver.canonicalize(path.as_ref(), &self.backend as &dyn Fs)
}
pub fn soft_canonicalize(&self, path: impl AsRef<Path>) -> Result<PathBuf, FsError> {
self.resolver.soft_canonicalize(path.as_ref(), &self.backend as &dyn Fs)
}
}
}
Backends implementing FsPath can provide optimized implementations that the resolver MAY use:
#![allow(unused)]
fn main() {
impl FsPath for SqliteBackend {
fn canonicalize(&self, path: &Path) -> Result<PathBuf, FsError> {
// Optimized: single CTE query instead of iterative resolution
self.conn.query_row(/* ... */)
}
}
}
Trade-offs:
| Approach | Queries | Complexity | Best For |
|---|---|---|---|
| Default impl | O(n) per component | Simple | Memory, small files |
| SQLite CTE | O(1) single query | Moderate | Large trees, many symlinks |
| OS delegation | O(1) syscall | Simple | Real filesystem |
Conclusion: The FsPath trait provides a clean abstraction that works everywhere but can be optimized where it matters. This follows Rust’s “zero-cost abstractions” philosophy: you don’t pay for what you don’t use, and you can optimize hot paths when needed.
ADR-033: PathResolver Trait for Pluggable Resolution
Status: Accepted
Context: Path resolution (normalizing .., ., and following symlinks) is currently handled in two places:
- FsPath trait methods (canonicalize, soft_canonicalize) with backend-specific optimizations
- FileStorage performs pre-resolution for non-SelfResolving backends
- The SelfResolving marker trait opts out of FileStorage resolution
This works, but the resolution algorithm is not a first-class, testable unit. The logic is spread across components, making it harder to:
- Test path resolution in isolation
- Benchmark/profile resolution performance
- Provide third-party custom resolvers
- Explore alternative resolution strategies (case-insensitive, caching, etc.)
Decision: Introduce a PathResolver trait that encapsulates the path resolution algorithm as a standalone, pluggable component.
The Pattern:
#![allow(unused)]
fn main() {
// In anyfs-backend (trait definition)
/// Strategy trait for path resolution algorithms.
///
/// Encapsulates path normalization, `..`/`.` resolution, and optionally symlink following.
///
/// **Symlink handling:** The base trait works with `&dyn Fs` (no symlink awareness).
/// For symlink-aware resolution, use `PathResolverWithLinks` which accepts `&dyn FsLink`.
/// `IterativeResolver` implements both traits - call the appropriate method based on
/// what your backend supports.
///
/// **Implementation note:** Only `canonicalize` is required. `soft_canonicalize` has a
/// default implementation that canonicalizes the parent and appends the final component.
pub trait PathResolver: Send + Sync {
/// Resolve path to canonical form (no symlink following).
/// Normalizes `.` and `..` components only.
fn canonicalize(&self, path: &Path, fs: &dyn Fs) -> Result<PathBuf, FsError>;
/// Like canonicalize, but allows non-existent final component.
/// Default: canonicalize parent, append final component.
fn soft_canonicalize(&self, path: &Path, fs: &dyn Fs) -> Result<PathBuf, FsError> {
match path.parent() {
Some(parent) if !parent.as_os_str().is_empty() => {
let canonical_parent = self.canonicalize(parent, fs)?;
match path.file_name() {
Some(name) => Ok(canonical_parent.join(name)),
None => Ok(canonical_parent),
}
}
_ => self.canonicalize(path, fs), // Root or single component
}
}
}
/// Extension trait for symlink-aware resolution.
/// Backends implementing FsLink can use this for full resolution.
pub trait PathResolverWithLinks: PathResolver {
/// Resolve path following symlinks (requires FsLink backend).
fn canonicalize_following_links(&self, path: &Path, fs: &dyn FsLink) -> Result<PathBuf, FsError>;
/// Like canonicalize_following_links, but allows non-existent final component.
/// Default: canonicalize parent following links, append final component.
fn soft_canonicalize_following_links(&self, path: &Path, fs: &dyn FsLink) -> Result<PathBuf, FsError> {
match path.parent() {
Some(parent) if !parent.as_os_str().is_empty() => {
let canonical_parent = self.canonicalize_following_links(parent, fs)?;
match path.file_name() {
Some(name) => Ok(canonical_parent.join(name)),
None => Ok(canonical_parent),
}
}
_ => self.canonicalize_following_links(path, fs),
}
}
}
}
Built-in Implementations (in anyfs crate):
#![allow(unused)]
fn main() {
/// Default iterative resolver - walks path component by component.
/// Implements both PathResolver and PathResolverWithLinks.
pub struct IterativeResolver {
max_symlink_depth: usize, // Default: 40
}
impl PathResolver for IterativeResolver {
fn canonicalize(&self, path: &Path, fs: &dyn Fs) -> Result<PathBuf, FsError> {
// Normalize `.` and `..` only - no symlink following
self.normalize_path(path, fs)
}
// ...
}
impl PathResolverWithLinks for IterativeResolver {
fn canonicalize_following_links(&self, path: &Path, fs: &dyn FsLink) -> Result<PathBuf, FsError> {
// Full resolution with symlink following
self.resolve_with_symlinks(path, fs, self.max_symlink_depth)
}
// ...
}
/// No-op resolver for SelfResolving backends (OS handles resolution).
pub struct NoOpResolver;
/// LRU cache wrapper around another resolver.
pub struct CachingResolver<R: PathResolver> {
inner: R,
cache: Cache<PathBuf, PathBuf>, // LRU cache, bounded size
}
// Case-folding resolver is NOT built-in. Users can implement one via PathResolver
// trait if needed, but real-world demand is minimal since VRootFsBackend on
// Windows/macOS already gets case-insensitivity from the OS.
}
Integration with FileStorage:
#![allow(unused)]
fn main() {
pub struct FileStorage<B> {
backend: B,
resolver: Box<dyn PathResolver>, // Boxed: resolution is cold path
}
impl<B: Fs> FileStorage<B> {
pub fn new(backend: B) -> Self {
Self { backend, resolver: Box::new(IterativeResolver::default()) }
}
pub fn with_resolver(backend: B, resolver: impl PathResolver + 'static) -> Self {
Self { backend, resolver: Box::new(resolver) }
}
}
// Usage
let fs = FileStorage::new(backend); // Uses IterativeResolver
let fs = FileStorage::with_resolver(backend, CachingResolver::new(IterativeResolver::default()));
}
Relationship with FsPath Trait:
| Component | Responsibility |
|---|---|
PathResolver | Algorithm for resolution (first-class, testable, swappable) |
FsPath | Backend-level optimization hook (can delegate to resolver or override entirely) |
SelfResolving | Remains as marker OR becomes NoOpResolver assignment |
FsPath can delegate to the resolver:
#![allow(unused)]
fn main() {
pub trait FsPath: FsRead + FsLink {
fn resolver(&self) -> &dyn PathResolver {
static DEFAULT: IterativeResolver = IterativeResolver::new();
&DEFAULT
}
fn canonicalize(&self, path: &Path) -> Result<PathBuf, FsError> {
self.resolver().canonicalize(path, self)
}
}
}
Or backends can override entirely for optimized implementations (e.g., SQLite CTE).
Why This Design:
| Benefit | Explanation |
|---|---|
| Testable in isolation | Unit test resolvers without full backend setup |
| Benchmarkable | Profile resolution algorithms independently |
| Third-party extensible | Custom resolvers without touching Fs traits |
| Maintainable | Path resolution is one focused, isolated component |
| New capabilities | Case-insensitive, caching, Windows-style resolvers become easy |
| Backwards compatible | Existing FsPath overrides still work; resolver is additive |
Crate Placement:
| Component | Crate | Rationale |
|---|---|---|
PathResolver trait | anyfs-backend | Core contract, minimal deps |
IterativeResolver | anyfs | Default impl, needs Fs methods |
NoOpResolver | anyfs | For SelfResolving backends |
CachingResolver | anyfs | Optional, needs cache impl |
| FileStorage integration | anyfs | Uses resolvers for path handling |
Note: Case-folding resolvers are NOT built-in. The PathResolver trait allows users to implement custom resolvers if needed, but we don’t ship speculative features.
Example Use Cases:
#![allow(unused)]
fn main() {
// Default: case-sensitive, symlink-aware (IterativeResolver is ZST, zero-cost)
let fs = FileStorage::new(MemoryBackend::new());
// Caching for read-heavy workloads
let fs = FileStorage::with_resolver(
backend,
CachingResolver::new(IterativeResolver::default())
);
// Custom resolver (user-implemented)
let fs = FileStorage::with_resolver(backend, MyCustomResolver::new());
// Testing: verify resolution behavior in isolation
#[test]
fn test_symlink_loop_detection() {
let resolver = IterativeResolver::new();
let mock_fs = MockFs::with_symlink_loop();
let result = resolver.canonicalize(Path::new("/loop"), &mock_fs);
assert!(matches!(result, Err(FsError::InvalidData { .. })));
}
}
Conclusion: The PathResolver trait provides clean separation of concerns, making path resolution testable, benchmarkable, and extensible. It complements FsPath (backend optimization hook) and can replace or work alongside SelfResolving (via NoOpResolver).
ADR-034: LLM-Oriented Architecture (LOA)
Status: Accepted
Context: AnyFS is being developed with significant LLM assistance (GitHub Copilot, Claude, etc.). Traditional software architecture prioritizes maintainability, testability, and extensibility for human developers. However, when LLMs are part of the development workflow, additional constraints become essential:
- LLMs work best with limited context windows - they can’t “understand” an entire codebase
- LLMs excel at pattern matching - consistent structure enables better assistance
- LLMs need clear contracts - well-documented interfaces reduce hallucination
- LLMs benefit from isolated components - fixing one thing shouldn’t require understanding everything
These same properties also benefit:
- Open source contributors (quick onboarding)
- Code review (focused changes)
- Parallel development (independent components)
- AI-generated tests and documentation
Decision: Structure AnyFS using LLM-Oriented Architecture (LOA) - a methodology where every component is independently understandable, testable, and fixable with only local context.
The Five Pillars:
| Pillar | Description | Implementation |
|---|---|---|
| Single Responsibility | One file = one concept | quota.rs, iterative.rs, etc. |
| Contract-First | Traits define the spec | Documented trait invariants |
| Isolated Testing | Tests use mocks only | No real backends in unit tests |
| Rich Errors | Errors explain the fix | Context in every FsError variant |
| Boundary Docs | Examples at every API | Usage example in every doc comment |
File Structure Convention:
#![allow(unused)]
fn main() {
//! # Component Name
//!
//! ## Responsibility
//! - Single bullet point
//!
//! ## Dependencies
//! - Traits/types only
//!
//! ## Usage
//! ```rust
//! // Minimal example
//! ```
// ============================================================================
// Types
// ============================================================================
// ============================================================================
// Trait Implementations
// ============================================================================
// ============================================================================
// Public API
// ============================================================================
// ============================================================================
// Private Helpers
// ============================================================================
// ============================================================================
// Tests
// ============================================================================
}
Component Isolation Checklist:
- Single file per component
- Implements a trait with documented invariants
- Dependencies are traits/types, not implementations
- Tests use mocks, not real backends
- Error messages explain what went wrong and how to fix
- Doc comment shows standalone usage example
- No global state
- Send + Sync where required
LLM Prompting Patterns:
The architecture enables these clean prompts:
# Implement (user-provided resolver example)
Implement a case-folding resolver in your project.
Contract: Implement `PathResolver` trait from anyfs-backend.
Test: "/Foo/BAR" → "/foo/bar"
# Fix
Bug: Quota<B> doesn't account for existing file size.
File: src/middleware/quota.rs
Error: QuotaExceeded writing 50 bytes to 30-byte file with 100-byte limit.
# Review
Does this change maintain the PathResolver contract?
Are edge cases handled?
Are error messages informative?
Deliverables:
- AGENTS.md - Instructions for LLMs contributing to the codebase
- LLM Development Methodology Guide - Full methodology documentation
- llm-context.md - Context7-style API guide for LLMs using the library
Why This Design:
| Benefit | For LLMs | For Humans |
|---|---|---|
| Isolated components | Fits in context window | Easy to understand |
| Clear contracts | Reduces hallucination | Self-documenting |
| Consistent structure | Pattern matching works | Predictable codebase |
| Rich errors | Can suggest fixes | Quick debugging |
| Focused tests | Can verify changes | Fast CI |
Trade-offs:
| Approach | Pros | Cons |
|---|---|---|
| Deep abstraction | Maximum isolation | More files, more indirection |
| Monolithic design | Fewer files | LLMs can’t reason about it |
| LOA (chosen) | LLM-friendly + maintainable | Requires discipline |
Relationship to Other ADRs:
- ADR-030 (Layered traits): LOA extends this with per-file isolation
- ADR-033 (PathResolver): Example of LOA - resolver is isolated, testable, replaceable
- ADR-025 (Strategic Boxing): LOA prefers simplicity over micro-optimization
Conclusion: LLM-Oriented Architecture is not just about AI. It’s about creating a codebase where any component can be understood, tested, fixed, or replaced with only local context. This benefits LLMs, open source contributors, code reviewers, and future maintainers equally. As AI-assisted development becomes standard, LOA positions AnyFS as a reference implementation for sustainable human-AI collaboration.
See Also: LLM Development Methodology Guide
IndexedBackend Pattern
SQLite Metadata + Content-Addressed Blob Storage
This document describes the IndexedBackend architecture pattern: separating filesystem metadata (stored in SQLite) from file content (stored as blobs). This enables efficient queries, large file support, and flexible storage backends.
Ecosystem Implementation: The anyfs-indexed crate provides IndexedBackend as a production-ready implementation using local disk blobs. See the Backends Guide for usage. This document covers the underlying design pattern for those building custom implementations (e.g., with S3, cloud storage, or custom blob stores).
Overview
The IndexedBackend pattern separates:
- Metadata (directory structure, inodes, permissions) → SQLite
- Content (file bytes) → Content-Addressed Storage (CAS)
┌─────────────────────────────────────────────────────────┐
│ IndexedBackend (pattern) │
│ ┌─────────────────────┐ ┌────────────────────────┐ │
│ │ SQLite Metadata │ │ Blob Store (CAS) │ │
│ │ │ │ │ │
│ │ - inodes │ │ - content-addressed │ │
│ │ - dir_entries │ │ - deduplicated │ │
│ │ - blob references │ │ - S3, local, etc. │ │
│ │ - audit log │ │ │ │
│ └─────────────────────┘ └────────────────────────┘ │
└─────────────────────────────────────────────────────────┘
Custom backends can use S3, cloud storage, or other blob stores.
IndexedBackend implements a simpler variant with UUID-named local blobs
(optimized for streaming; see note below on storage models).
Why this pattern?
- SQLite is great for metadata queries (directory listings, stats, audit)
- Blob stores scale better for large file content
- Content-addressing enables deduplication
- Separating concerns enables independent scaling
Storage Model Variants
| Model | Blob Naming | Dedup | Best For |
|---|---|---|---|
| Content-Addressed | SHA-256 of content | ✅ Yes | Cloud/S3, archival, multi-tenant |
| UUID+Timestamp | {uuid}-{timestamp}.bin | ❌ No | Streaming large files, simplicity |
IndexedBackend uses UUID+Timestamp naming because:
- Large files can be streamed without buffering the entire file to compute a hash
- Write latency is consistent (no hash computation)
- Simpler garbage collection (delete blob when reference removed)
Custom implementations may prefer content-addressed storage when:
- Deduplication is valuable (many users uploading same files)
- Using cloud blob stores with native CAS support (S3, GCS)
- Building archival systems where write latency is acceptable
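As a toy illustration of the two naming schemes, the sketch below uses std's `DefaultHasher` as a stand-in for SHA-256 and an atomic counter as a stand-in for UUID+timestamp (both substitutions are for brevity only, not production choices):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::atomic::{AtomicU64, Ordering};

/// Content-addressed naming: identical bytes map to the same blob id.
/// (DefaultHasher stands in for SHA-256; a real CAS store must use a
/// cryptographic hash.)
fn content_addressed_id(data: &[u8]) -> String {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    format!("{:016x}", h.finish())
}

/// UUID+timestamp-style naming: every write gets a fresh id, so no
/// hashing pass over the data is required. (A counter stands in for a
/// real UUID and timestamp.)
static NEXT: AtomicU64 = AtomicU64::new(0);
fn uuid_style_id() -> String {
    format!("{:016x}.bin", NEXT.fetch_add(1, Ordering::Relaxed))
}

fn main() {
    let a = content_addressed_id(b"report.pdf bytes");
    let b = content_addressed_id(b"report.pdf bytes");
    assert_eq!(a, b); // same content, same id -> dedup for free

    let c = uuid_style_id();
    let d = uuid_style_id();
    assert_ne!(c, d); // every write is a new blob -> no dedup, no hash cost
}
```

The trade-off is visible in the code: the content-addressed path must consume the full data before it can name the blob, while the UUID path names it up front, which is what makes streaming writes cheap.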
Framework Validation
Do Current Traits Support This?
Yes. The Fs traits define operations, not storage implementation.
| Trait Method | Hybrid Implementation |
|---|---|
| `read(path)` | SQLite lookup → blob fetch |
| `write(path, data)` | Blob upload → SQLite update |
| `metadata(path)` | SQLite query only |
| `read_dir(path)` | SQLite query only |
| `remove_file(path)` | SQLite update (`refcount--`) |
| `rename(from, to)` | SQLite update only |
| `copy(from, to)` | SQLite update (`refcount++`) |
The traits don’t care where bytes come from - that’s the backend’s business.
Thread Safety
Current design requires &self methods with interior mutability. For hybrid:
#![allow(unused)]
fn main() {
use std::sync::{Arc, Mutex};
use rusqlite::Connection;
use tokio::sync::mpsc;

pub struct CustomIndexedBackend {
// SQLite needs single-writer (see "Write Queue" below)
metadata: Arc<Mutex<Connection>>,
// Blob store is typically already thread-safe
blobs: Arc<dyn BlobStore>,
// Write queue for serializing SQLite writes
write_tx: mpsc::Sender<WriteCmd>,
}
}
This aligns with ADR-023 (interior mutability).
Data Model
SQLite Schema
-- Inode table (one row per file/directory/symlink)
CREATE TABLE nodes (
inode INTEGER PRIMARY KEY,
parent INTEGER NOT NULL,
name TEXT NOT NULL,
node_type TEXT NOT NULL, -- 'file', 'dir', 'symlink'
size INTEGER NOT NULL DEFAULT 0,
mode INTEGER NOT NULL DEFAULT 420, -- 0o644
nlink INTEGER NOT NULL DEFAULT 1,
blob_id TEXT, -- NULL for directories
symlink_target TEXT, -- NULL unless symlink
created_at INTEGER NOT NULL,
modified_at INTEGER NOT NULL,
accessed_at INTEGER NOT NULL,
UNIQUE(parent, name)
);
-- Root directory (inode 1)
INSERT INTO nodes (inode, parent, name, node_type, size, mode, created_at, modified_at, accessed_at)
VALUES (1, 1, '', 'dir', 0, 493, strftime('%s', 'now'), strftime('%s', 'now'), strftime('%s', 'now'));
-- Blob reference tracking (for dedup + GC)
CREATE TABLE blobs (
blob_id TEXT PRIMARY KEY, -- sha256 hex
size INTEGER NOT NULL,
refcount INTEGER NOT NULL DEFAULT 0,
created_at INTEGER NOT NULL
);
-- Audit log (optional but recommended)
CREATE TABLE audit (
seq INTEGER PRIMARY KEY AUTOINCREMENT,
timestamp INTEGER NOT NULL,
operation TEXT NOT NULL,
path TEXT,
actor TEXT,
details TEXT -- JSON
);
-- Indexes
CREATE INDEX idx_nodes_parent ON nodes(parent);
CREATE INDEX idx_nodes_blob ON nodes(blob_id) WHERE blob_id IS NOT NULL;
CREATE INDEX idx_blobs_refcount ON blobs(refcount) WHERE refcount = 0;
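Path resolution against this schema walks the `UNIQUE(parent, name)` index one component at a time, starting from the root inode 1. A minimal in-memory model of that walk (a sketch; in the real backend each step is a `SELECT ... WHERE parent = ? AND name = ?` query):

```rust
use std::collections::HashMap;

/// Model of the nodes table's (parent, name) -> inode index.
fn resolve(index: &HashMap<(u64, String), u64>, path: &str) -> Option<u64> {
    let mut inode = 1u64; // root is inode 1 per the schema
    for comp in path.split('/').filter(|c| !c.is_empty()) {
        // One indexed lookup per path component.
        inode = *index.get(&(inode, comp.to_string()))?;
    }
    Some(inode)
}

fn main() {
    let mut index = HashMap::new();
    index.insert((1, "data".to_string()), 2);     // /data -> inode 2
    index.insert((2, "file.txt".to_string()), 3); // /data/file.txt -> inode 3

    assert_eq!(resolve(&index, "/data/file.txt"), Some(3));
    assert_eq!(resolve(&index, "/"), Some(1));
    assert_eq!(resolve(&index, "/missing"), None);
}
```

This is why `idx_nodes_parent` is critical: without it, each component lookup degrades to a table scan.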
Blob Store Interface
#![allow(unused)]
fn main() {
/// Content-addressed blob storage.
pub trait BlobStore: Send + Sync {
/// Store bytes, returns content hash (blob_id).
fn put(&self, data: &[u8]) -> Result<String, BlobError>;
/// Retrieve bytes by content hash.
fn get(&self, blob_id: &str) -> Result<Vec<u8>, BlobError>;
/// Check if blob exists.
fn exists(&self, blob_id: &str) -> Result<bool, BlobError>;
/// Delete blob (only call after refcount reaches 0).
fn delete(&self, blob_id: &str) -> Result<(), BlobError>;
/// Streaming read for large files.
fn open_read(&self, blob_id: &str) -> Result<Box<dyn Read + Send>, BlobError>;
/// Streaming write, returns blob_id on completion.
fn open_write(&self) -> Result<Box<dyn BlobWriter>, BlobError>;
}
pub trait BlobWriter: Write + Send {
/// Finalize the blob and return its content hash.
fn finalize(self: Box<Self>) -> Result<String, BlobError>;
}
}
Implementations could be:
- `LocalCasBackend` - local directory with content-addressed files
- `S3BlobStore` - S3-compatible object storage
- `MemoryBlobStore` - in-memory for testing
Implementation Sketch
Core Structure
#![allow(unused)]
fn main() {
use anyfs_backend::{FsRead, FsWrite, FsDir, FsError, Metadata, ReadDirIter, DirEntry, FileType};
use rusqlite::Connection;
use std::sync::{Arc, Mutex};
use std::path::{Path, PathBuf};
use tokio::sync::{mpsc, oneshot};
pub struct CustomIndexedBackend {
/// SQLite connection (metadata)
db: Arc<Mutex<Connection>>,
/// Content-addressed blob storage
blobs: Arc<dyn BlobStore>,
/// Write command queue (single-writer pattern)
write_tx: mpsc::UnboundedSender<WriteCmd>,
/// Background writer handle
_writer_handle: Arc<WriterHandle>,
}
enum WriteCmd {
Write {
path: PathBuf,
blob_id: String,
size: u64,
reply: oneshot::Sender<Result<(), FsError>>,
},
Remove {
path: PathBuf,
reply: oneshot::Sender<Result<(), FsError>>,
},
CreateDir {
path: PathBuf,
reply: oneshot::Sender<Result<(), FsError>>,
},
// ... other write operations
}
}
Read Operations (Direct)
Read operations can query SQLite and blob store directly (no queue needed):
#![allow(unused)]
fn main() {
impl FsRead for CustomIndexedBackend {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
// 1. Query SQLite for blob_id
let db = self.db.lock().map_err(|_| FsError::Backend("lock poisoned".into()))?;
let (blob_id, node_type): (Option<String>, String) = db.query_row(
"SELECT blob_id, node_type FROM nodes WHERE inode = (
SELECT inode FROM nodes WHERE parent = ? AND name = ?
)",
// ... path resolution params
|row| Ok((row.get(0)?, row.get(1)?)),
).map_err(|_| FsError::NotFound { path: path.to_path_buf() })?;
if node_type != "file" {
return Err(FsError::NotAFile { path: path.to_path_buf() });
}
let blob_id = blob_id.ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })?;
drop(db); // Release lock before blob fetch
// 2. Fetch from blob store
self.blobs.get(&blob_id)
.map_err(|e| FsError::Backend(e.to_string()))
}
fn exists(&self, path: &Path) -> Result<bool, FsError> {
let db = self.db.lock().map_err(|_| FsError::Backend("lock poisoned".into()))?;
// Pure SQLite query
let exists: bool = db.query_row(
"SELECT EXISTS(SELECT 1 FROM nodes WHERE parent = ? AND name = ?)",
// ... params
|row| row.get(0),
).unwrap_or(false);
Ok(exists)
}
fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
let db = self.db.lock().map_err(|_| FsError::Backend("lock poisoned".into()))?;
// Pure SQLite query - no blob store needed
db.query_row(
"SELECT node_type, size, mode, nlink, created_at, modified_at, accessed_at, inode
FROM nodes WHERE parent = ? AND name = ?",
// ... params
|row| {
let node_type: String = row.get(0)?;
Ok(Metadata {
file_type: match node_type.as_str() {
"file" => FileType::File,
"dir" => FileType::Directory,
"symlink" => FileType::Symlink,
_ => FileType::File,
},
size: row.get(1)?,
permissions: Some(row.get(2)?),
// ... other fields
})
},
).map_err(|_| FsError::NotFound { path: path.to_path_buf() })
}
// ... other FsRead methods
}
}
Write Operations (Two-Phase Commit)
Writes use a two-phase pattern: upload blob first, then commit SQLite:
#![allow(unused)]
fn main() {
impl FsWrite for CustomIndexedBackend {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
let path = path.to_path_buf();
// Phase 1: Upload blob (can fail independently)
let blob_id = self.blobs.put(data)
.map_err(|e| FsError::Backend(format!("blob upload failed: {}", e)))?;
// Phase 2: Commit metadata (via write queue)
let (tx, rx) = oneshot::channel();
self.write_tx.send(WriteCmd::Write {
path,
blob_id,
size: data.len() as u64,
reply: tx,
}).map_err(|_| FsError::Backend("write queue closed".into()))?;
// Wait for SQLite commit
rx.blocking_recv()
.map_err(|_| FsError::Backend("write cancelled".into()))?
}
fn remove_file(&self, path: &Path) -> Result<(), FsError> {
let path = path.to_path_buf();
// Queue the removal (blob cleanup happens in background via GC)
let (tx, rx) = oneshot::channel();
self.write_tx.send(WriteCmd::Remove { path, reply: tx })
.map_err(|_| FsError::Backend("write queue closed".into()))?;
rx.blocking_recv()
.map_err(|_| FsError::Backend("remove cancelled".into()))?
}
fn copy(&self, from: &Path, to: &Path) -> Result<(), FsError> {
// Copy is just a metadata operation - increment refcount, no blob copy!
let (tx, rx) = oneshot::channel();
self.write_tx.send(WriteCmd::Copy {
from: from.to_path_buf(),
to: to.to_path_buf(),
reply: tx,
}).map_err(|_| FsError::Backend("write queue closed".into()))?;
rx.blocking_recv()
.map_err(|_| FsError::Backend("copy cancelled".into()))?
}
// ... other FsWrite methods
}
}
Write Queue Worker
The single-writer pattern for SQLite:
#![allow(unused)]
fn main() {
async fn write_worker(
db: Arc<Mutex<Connection>>,
blobs: Arc<dyn BlobStore>,
mut rx: mpsc::UnboundedReceiver<WriteCmd>,
) {
while let Some(cmd) = rx.recv().await {
let result = {
let mut db = db.lock().unwrap();
match cmd {
WriteCmd::Write { path, blob_id, size, reply } => {
// NOTE: format! interpolation is for illustration only;
// a real implementation must use parameterized queries.
let result = db.execute_batch(&format!(r#"
BEGIN;
-- Upsert blob record
INSERT INTO blobs (blob_id, size, refcount, created_at)
VALUES ('{blob_id}', {size}, 1, strftime('%s', 'now'))
ON CONFLICT(blob_id) DO UPDATE SET refcount = refcount + 1;
-- Update or insert node
-- (simplified - real impl needs path resolution)
-- Audit log
INSERT INTO audit (timestamp, operation, path)
VALUES (strftime('%s', 'now'), 'write', '{path}');
COMMIT;
"#));
let _ = reply.send(result.map_err(|e| FsError::Backend(e.to_string())));
}
WriteCmd::Remove { path, reply } => {
// Decrement refcount (GC cleans up when refcount = 0)
let result = db.execute_batch(&format!(r#"
BEGIN;
-- Get blob_id before delete
-- Decrement refcount
-- Remove node
-- Audit log
COMMIT;
"#));
let _ = reply.send(result.map_err(|e| FsError::Backend(e.to_string())));
}
// ... other commands
}
};
}
}
}
Deduplication
Content-addressing gives you dedup for free:
#![allow(unused)]
fn main() {
impl BlobStore for LocalCasBackend {
fn put(&self, data: &[u8]) -> Result<String, BlobError> {
// Hash the content
let hash = sha256(data);
let blob_id = hex::encode(hash);
// Check if already exists
let blob_path = self.root.join(&blob_id[0..2]).join(&blob_id);
if blob_path.exists() {
// Already have this content - dedup!
return Ok(blob_id);
}
// Store new blob
std::fs::create_dir_all(blob_path.parent().unwrap())?;
std::fs::write(&blob_path, data)?;
Ok(blob_id)
}
}
}
Dedup in action:
- User A writes `report.pdf` (10 MB) → blob `abc123`, refcount = 1
- User B writes identical `report.pdf` → same blob `abc123`, refcount = 2
- Physical storage: 10 MB (not 20 MB)
Refcount Management
-- On file write (new reference to blob)
UPDATE blobs SET refcount = refcount + 1 WHERE blob_id = ?;
-- On file delete
UPDATE blobs SET refcount = refcount - 1 WHERE blob_id = ?;
-- On copy (no blob copy needed!)
UPDATE blobs SET refcount = refcount + 1 WHERE blob_id = ?;
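The same lifecycle can be modeled in memory to see the dedup accounting end to end (a sketch; the `Blobs` type and its methods are hypothetical stand-ins for the SQL above plus the blob store):

```rust
use std::collections::HashMap;

/// blob_id -> (bytes, refcount)
struct Blobs {
    store: HashMap<String, (Vec<u8>, u32)>,
}

impl Blobs {
    fn new() -> Self { Self { store: HashMap::new() } }

    /// Write: store content once, bump refcount on duplicates.
    fn write(&mut self, blob_id: &str, data: &[u8]) {
        let entry = self.store.entry(blob_id.to_string())
            .or_insert_with(|| (data.to_vec(), 0));
        entry.1 += 1;
    }

    /// Copy: a new reference to the same blob, no bytes copied.
    fn copy(&mut self, blob_id: &str) {
        if let Some(e) = self.store.get_mut(blob_id) { e.1 += 1; }
    }

    /// Delete: drop a reference; the blob is collected at refcount 0.
    fn delete(&mut self, blob_id: &str) {
        let orphaned = match self.store.get_mut(blob_id) {
            Some(e) => { e.1 -= 1; e.1 == 0 }
            None => false,
        };
        if orphaned { self.store.remove(blob_id); }
    }

    fn physical_bytes(&self) -> usize {
        self.store.values().map(|(d, _)| d.len()).sum()
    }
}

fn main() {
    let mut blobs = Blobs::new();
    blobs.write("abc123", &[0u8; 10]); // user A: refcount = 1
    blobs.write("abc123", &[0u8; 10]); // user B, same content: refcount = 2
    assert_eq!(blobs.physical_bytes(), 10); // 10 bytes stored, not 20

    blobs.copy("abc123"); // refcount = 3, still 10 bytes
    blobs.delete("abc123");
    blobs.delete("abc123");
    blobs.delete("abc123"); // refcount hits 0 -> blob collected
    assert_eq!(blobs.physical_bytes(), 0);
}
```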
SQLite Performance
The SQLite metadata database benefits from the same tuning as SqliteBackend:
| Setting | Default | Purpose | Tradeoff |
|---|---|---|---|
| `journal_mode` | WAL | Concurrent reads during writes | Creates `.wal`/`.shm` files |
| `synchronous` | FULL | Index integrity on power loss | Safe default, opt-in to NORMAL |
| `cache_size` | 16MB | Smaller cache for metadata-only | Tune based on index size |
| `busy_timeout` | 5000 | Gracefully handle lock contention | Prevents SQLITE_BUSY errors |
| `auto_vacuum` | INCREMENTAL | Reclaim space from deletions | Gradual space recovery |
Why FULL synchronous: Index corruption means paths no longer resolve to blobs—blobs become orphaned and unreachable. Use FULL as the safe default; opt-in to NORMAL only with battery-backed storage or when index can be rebuilt.
SQL Indexes (critical):
CREATE INDEX idx_nodes_parent ON nodes(parent);
CREATE INDEX idx_nodes_blob ON nodes(blob_id) WHERE blob_id IS NOT NULL;
CREATE INDEX idx_blobs_refcount ON blobs(refcount) WHERE refcount = 0;
Without proper indexes, path lookups become full table scans—catastrophic for large filesystems.
Connection pooling: 4-8 reader connections for concurrent metadata queries; single writer for updates. See SQLite Operations Guide for detailed patterns.
Garbage Collection
Blobs with refcount = 0 are orphans and can be deleted:
#![allow(unused)]
fn main() {
impl CustomIndexedBackend {
/// Run garbage collection (call periodically or on-demand).
pub fn gc(&self) -> Result<GcStats, FsError> {
let db = self.db.lock().map_err(|_| FsError::Backend("lock".into()))?;
// Find orphaned blobs
let orphans: Vec<String> = db.prepare(
"SELECT blob_id FROM blobs WHERE refcount = 0"
)?.query_map([], |row| row.get(0))?
.filter_map(|r| r.ok())
.collect();
drop(db);
// Delete from blob store
let mut deleted = 0;
for blob_id in &orphans {
if self.blobs.delete(blob_id).is_ok() {
deleted += 1;
}
}
// Remove from SQLite
let db = self.db.lock().unwrap();
db.execute(
"DELETE FROM blobs WHERE refcount = 0",
[],
)?;
Ok(GcStats { orphans_found: orphans.len(), blobs_deleted: deleted })
}
}
}
GC Safety:
- Never delete blobs referenced by snapshots
- Add a `snapshot_refs` table or use a `refcount` that includes snapshot references
- Run GC in background, not during writes
Snapshots and Backup
Creating a Snapshot
#![allow(unused)]
fn main() {
impl CustomIndexedBackend {
/// Create a point-in-time snapshot.
pub fn snapshot(&self, name: &str) -> Result<SnapshotId, FsError> {
let db = self.db.lock().unwrap();
db.execute_batch(&format!(r#"
BEGIN;
-- Record snapshot
INSERT INTO snapshots (name, created_at, root_manifest)
VALUES ('{name}', strftime('%s', 'now'),
(SELECT json_group_array(blob_id) FROM blobs WHERE refcount > 0));
-- Pin all current blobs (prevent GC)
UPDATE blobs SET refcount = refcount + 1
WHERE blob_id IN (SELECT blob_id FROM nodes WHERE blob_id IS NOT NULL);
COMMIT;
"#))?;
Ok(SnapshotId(name.to_string()))
}
/// Export as single portable artifact.
pub fn export(&self, dest: impl AsRef<Path>) -> Result<(), FsError> {
// 1. SQLite backup API for metadata
let db = self.db.lock().unwrap();
db.backup(rusqlite::DatabaseName::Main, dest.as_ref().join("metadata.db"), None)?;
// 2. Copy referenced blobs
let blob_ids: Vec<String> = db.prepare(
"SELECT DISTINCT blob_id FROM nodes WHERE blob_id IS NOT NULL"
)?.query_map([], |row| row.get(0))?
.filter_map(|r| r.ok())
.collect();
drop(db);
let blobs_dir = dest.as_ref().join("blobs");
std::fs::create_dir_all(&blobs_dir)?;
for blob_id in blob_ids {
let data = self.blobs.get(&blob_id)?;
std::fs::write(blobs_dir.join(&blob_id), data)?;
}
Ok(())
}
}
}
Middleware Integration
Middleware works unchanged - it wraps the hybrid backend like any other:
#![allow(unused)]
fn main() {
use anyfs::{FileStorage, QuotaLayer, TracingLayer, PathFilterLayer};
let backend = CustomIndexedBackend::open("drive.db", LocalCasBackend::new("./blobs"))?;
// Standard middleware stack
let backend = backend
.layer(QuotaLayer::builder()
.max_total_size(50 * 1024 * 1024 * 1024) // 50 GB
.build())
.layer(PathFilterLayer::builder()
.deny("**/.env")
.build())
.layer(TracingLayer::new());
let fs = FileStorage::new(backend);
// Use like any other filesystem
fs.write("/documents/report.pdf", &pdf_bytes)?;
}
Quota tracking note: QuotaLayer tracks logical size (what users see), not physical size (with dedup). For physical tracking, the backend could expose physical_usage() separately.
Async Considerations
The hybrid pattern benefits significantly from async (ADR-024):
| Operation | Sync Pain | Async Benefit |
|---|---|---|
| Blob upload to S3 | Blocks thread | Concurrent uploads |
| Multiple reads | Sequential | Parallel fetches |
| Write queue | blocking_recv() | Native async channel |
| GC | Blocks all ops | Background task |
When AsyncFs traits exist (ADR-024), the hybrid backend can use them naturally:
#![allow(unused)]
fn main() {
#[async_trait]
impl AsyncFsRead for CustomIndexedBackend {
async fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
let blob_id = self.lookup_blob_id(path).await?;
self.blobs.get_async(&blob_id).await // Non-blocking!
}
}
}
Identified Gaps
Areas where the current framework could be enhanced:
| Gap | Current State | Recommendation |
|---|---|---|
| Two-phase commit pattern | Not documented | Add to backend guide |
| Refcount/GC patterns | Not documented | Add section |
| Streaming large files | open_read/open_write exist | Document chunked patterns |
| Physical vs logical size | Quota tracks logical only | Consider PhysicalStats trait |
| Background tasks (GC) | No pattern | Document spawn pattern |
Summary
Framework validation: PASSED
The current AnyFS trait design supports hybrid backends:
- Traits define operations, not storage
- Interior mutability allows single-writer patterns
- Middleware composes unchanged
- Async strategy (ADR-024) enhances this pattern
Key patterns for hybrid backends:
- Single-writer queue for SQLite
- Two-phase commit (blob upload → SQLite commit)
- Content-addressing for dedup
- Refcounting for GC safety
- Snapshot pinning for backup safety
This validates that AnyFS is flexible enough for advanced storage architectures while maintaining its simple middleware composition model.
Zero-Cost Alternatives for I/O Operations
This document analyzes alternatives to dynamic dispatch (Box<dyn Trait>) for streaming I/O and directory iteration.
Decision: See ADR-025: Strategic Boxing for the formal decision.
TL;DR: We follow Tower/Axum's approach - zero-cost on the hot path (`read()`, `write()`), boxed at cold-path boundaries (`open_read()`, `read_dir()`). We avoid heap allocations and dynamic dispatch unless they buy flexibility with negligible performance impact.
Current Design (Dynamic Dispatch)
#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
}
pub trait FsDir: Send + Sync {
fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError>;
}
// Where ReadDirIter is:
pub struct ReadDirIter(Box<dyn Iterator<Item = Result<DirEntry, FsError>> + Send>);
}
Cost: One heap allocation per open_read(), open_write(), or read_dir() call.
Option 1: Associated Types (Classic Approach)
#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
type Reader: Read + Send;
fn open_read(&self, path: &Path) -> Result<Self::Reader, FsError>;
}
pub trait FsDir: Send + Sync {
type DirIter: Iterator<Item = Result<DirEntry, FsError>> + Send;
fn read_dir(&self, path: &Path) -> Result<Self::DirIter, FsError>;
}
}
Implementation
#![allow(unused)]
fn main() {
impl FsRead for MemoryBackend {
type Reader = std::io::Cursor<Vec<u8>>;
fn open_read(&self, path: &Path) -> Result<Self::Reader, FsError> {
let data = self.read(path)?;
Ok(std::io::Cursor::new(data))
}
}
impl FsDir for MemoryBackend {
type DirIter = std::vec::IntoIter<Result<DirEntry, FsError>>;
fn read_dir(&self, path: &Path) -> Result<Self::DirIter, FsError> {
let entries = self.collect_entries(path)?;
Ok(entries.into_iter())
}
}
}
Middleware Propagation Problem
#![allow(unused)]
fn main() {
impl<B: FsRead> FsRead for Quota<B> {
// Must define our own Reader type that wraps B::Reader
type Reader = QuotaReader<B::Reader>;
fn open_read(&self, path: &Path) -> Result<Self::Reader, FsError> {
let inner = self.inner.open_read(path)?;
Ok(QuotaReader::new(inner, self.usage.clone()))
}
}
// Every middleware needs a custom wrapper type
struct QuotaReader<R> {
inner: R,
usage: Arc<RwLock<QuotaUsage>>,
}
impl<R: Read> Read for QuotaReader<R> {
fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
// Track bytes read if needed
self.inner.read(buf)
}
}
}
The Type Explosion
With a middleware stack like Quota<PathFilter<Tracing<MemoryBackend>>>:
#![allow(unused)]
fn main() {
type FinalReader = QuotaReader<PathFilterReader<TracingReader<Cursor<Vec<u8>>>>>;
type FinalDirIter = QuotaIter<PathFilterIter<TracingIter<IntoIter<Result<DirEntry, FsError>>>>>;
}
Verdict
| Aspect | Assessment |
|---|---|
| Heap allocations | ✅ None |
| Type complexity | ❌ Exponential growth |
| Middleware authoring | ❌ Every middleware needs wrapper types |
| User ergonomics | ⚠️ Type annotations become unwieldy |
| Compile times | ❌ Longer due to monomorphization |
Not recommended as the primary API due to complexity explosion.
Option 2: RPITIT (Rust 1.75+)
Return Position Impl Trait in Traits allows:
#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError>;
}
pub trait FsDir: Send + Sync {
fn read_dir(&self, path: &Path)
-> Result<impl Iterator<Item = Result<DirEntry, FsError>> + Send, FsError>;
}
}
How It Works
The compiler infers a unique anonymous type for each implementor:
#![allow(unused)]
fn main() {
impl FsRead for MemoryBackend {
fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError> {
let data = self.read(path)?;
Ok(std::io::Cursor::new(data)) // Returns Cursor<Vec<u8>>, but caller sees impl Read
}
}
impl FsRead for SqliteBackend {
fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError> {
Ok(SqliteReader::new(self.conn.clone(), path)) // Different type, same interface
}
}
}
Middleware Still Works
#![allow(unused)]
fn main() {
impl<B: FsRead> FsRead for Tracing<B> {
fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError> {
let span = tracing::span!(Level::DEBUG, "open_read");
let _guard = span.enter();
self.inner.open_read(path) // Just forward - return type is inferred
}
}
}
The Catch: Object Safety
RPITIT makes traits non-object-safe. You cannot do:
#![allow(unused)]
fn main() {
// This won't compile with RPITIT
let backends: Vec<Box<dyn FsRead>> = vec![...];
}
Verdict
| Aspect | Assessment |
|---|---|
| Heap allocations | ✅ None |
| Type complexity | ✅ Hidden behind impl Trait |
| Middleware authoring | ✅ Simple forwarding |
| User ergonomics | ✅ Clean API |
| Object safety | ❌ Lost - can’t use dyn FsRead |
| Rust version | ⚠️ Requires 1.75+ |
Good for performance-critical paths but sacrifices dyn usage.
Option 3: Generic Associated Types (GATs)
For readers that borrow from the backend:
#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
type Reader<'a>: Read + Send where Self: 'a;
fn open_read(&self, path: &Path) -> Result<Self::Reader<'_>, FsError>;
}
}
Use Case: Zero-Copy Reads
#![allow(unused)]
fn main() {
impl FsRead for MemoryBackend {
type Reader<'a> = &'a [u8]; // Borrow directly from internal storage!
fn open_read(&self, path: &Path) -> Result<Self::Reader<'_>, FsError> {
// NOTE: sketch only - returning a borrow out of a lock guard does not
// compile as-is; a real implementation needs a guard-holding reader type.
let data = self.storage.read().unwrap();
let bytes = data.get(path)
.ok_or(FsError::NotFound { path: path.to_path_buf() })?;
Ok(bytes.as_slice())
}
}
}
Complexity
GATs are powerful but add significant complexity:
- Lifetime parameters propagate through middleware
- Not all backends can provide borrowed data (SQLite must copy)
- Makes trait definitions harder to understand
Verdict
| Aspect | Assessment |
|---|---|
| Heap allocations | ✅ Can be zero-copy |
| Type complexity | ❌ High (lifetimes everywhere) |
| Middleware authoring | ❌ Complex lifetime handling |
| Use case fit | ⚠️ Only benefits backends with owned data |
Overkill for most use cases. Consider only for specialized zero-copy scenarios.
Option 4: Hybrid Approach (Recommended)
Provide both dynamic and static APIs:
#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
/// Dynamic dispatch version (simple, flexible)
fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
}
/// Extension trait for zero-cost static dispatch
pub trait FsReadTyped: FsRead {
type Reader: Read + Send;
/// Static dispatch version (zero-cost, less flexible)
fn open_read_typed(&self, path: &Path) -> Result<Self::Reader, FsError>;
}
// Blanket impl for convenience when types align
impl<T: FsReadTyped> FsRead for T {
fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError> {
Ok(Box::new(self.open_read_typed(path)?))
}
}
}
Usage
#![allow(unused)]
fn main() {
// Default: dynamic dispatch (works everywhere)
let reader = fs.open_read("/file.txt")?;
// Performance-critical: static dispatch
let reader: MemoryReader = fs.open_read_typed("/file.txt")?;
}
Verdict
| Aspect | Assessment |
|---|---|
| Heap allocations | ✅ Optional (use _typed to avoid) |
| Type complexity | ✅ Hidden unless you opt-in |
| Middleware authoring | ✅ Only implement base trait |
| User ergonomics | ✅ Simple default, power when needed |
| Object safety | ✅ Base trait remains object-safe |
Best of both worlds - simple default, zero-cost opt-in.
Option 5: Callback-Based Iteration
Avoid returning iterators entirely:
#![allow(unused)]
fn main() {
pub trait FsDir: Send + Sync {
fn for_each_entry<F>(&self, path: &Path, f: F) -> Result<(), FsError>
where
F: FnMut(DirEntry) -> ControlFlow<(), ()>;
}
}
Usage
#![allow(unused)]
fn main() {
fs.for_each_entry("/dir", |entry| {
println!("{}", entry.name);
ControlFlow::Continue(())
})?;
}
Verdict
| Aspect | Assessment |
|---|---|
| Heap allocations | ✅ None |
| Ergonomics | ❌ Callbacks are awkward |
| Early exit | ✅ Via ControlFlow::Break |
| Composability | ❌ Can’t chain iterator methods |
Not recommended as primary API. Could be added as optimization option.
Option 6: Stack-Allocated Small Buffer
For directory iteration, most directories are small:
#![allow(unused)]
fn main() {
use smallvec::SmallVec;
pub struct ReadDirIter {
// Stack-allocate up to 32 entries, heap only if larger
entries: SmallVec<[Result<DirEntry, FsError>; 32]>,
index: usize,
}
}
Verdict
| Aspect | Assessment |
|---|---|
| Heap allocations | ⚠️ Avoided for small directories |
| Memory overhead | ⚠️ Larger stack frames |
| Dependencies | ⚠️ Adds smallvec crate |
Reasonable optimization for directory iteration specifically.
Recommendation
Primary API: Keep Dynamic Dispatch
#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
}
pub trait FsDir: Send + Sync {
fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError>;
}
}
Why:
- Simplicity - One type to learn, one API
- Object safety - Can use `Box<dyn Fs>` for runtime polymorphism
- Middleware simplicity - No wrapper types needed
- Actual cost is low - One allocation per stream open, not per read
Optional: Static Dispatch Extension (Fast Path)
For performance-critical code, offer typed variants. This is the first-class fast path for hot loops when the backend type is known:
#![allow(unused)]
fn main() {
pub trait FsReadTyped: FsRead {
type Reader: Read + Send;
fn open_read_typed(&self, path: &Path) -> Result<Self::Reader, FsError>;
}
}
Future: RPITIT When Object Safety Not Needed
If a user doesn’t need dyn Fs, they can define their own trait:
#![allow(unused)]
fn main() {
pub trait FsReadStatic: Send + Sync {
fn open_read(&self, path: &Path) -> Result<impl Read + Send, FsError>;
}
}
Cost Analysis: Is It Actually a Problem?
Heap Allocation Cost
| Operation | Allocations | Typical Size | Cost |
|---|---|---|---|
| `open_read()` | 1 | ~24-48 bytes (vtable + pointer) | ~20-50ns |
| `read()` (data) | 0-1 | File size | Dominates |
| `read_dir()` | 1 | ~24-48 bytes | ~20-50ns |
| Iteration | 0 | - | - |
The allocation is dwarfed by actual I/O time. For a 4KB file read from SQLite or disk, the Box allocation is <0.1% of total time.
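To make the comparison concrete, the two dispatch styles can be exercised side by side; they stream identical bytes, and only the boxed variant pays the per-open allocation (a sketch using `std::io::Cursor`; the function names are hypothetical):

```rust
use std::io::{Cursor, Read};

/// Dynamic dispatch: one heap allocation per open, caller sees a trait object.
fn open_boxed(data: Vec<u8>) -> Box<dyn Read + Send> {
    Box::new(Cursor::new(data))
}

/// Static dispatch: concrete type, no extra allocation beyond the data itself.
fn open_typed(data: Vec<u8>) -> Cursor<Vec<u8>> {
    Cursor::new(data)
}

fn main() {
    let data = b"hello".to_vec();

    let mut boxed = open_boxed(data.clone());
    let mut a = Vec::new();
    boxed.read_to_end(&mut a).unwrap();

    let mut typed = open_typed(data);
    let mut b = Vec::new();
    typed.read_to_end(&mut b).unwrap();

    assert_eq!(a, b); // identical bytes either way; dispatch is invisible to callers
}
```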
When It Matters
| Scenario | Matters? |
|---|---|
| Reading large files | No - I/O dominates |
| Reading many small files | Maybe - consider batching |
| Hot loop micro-benchmarks | Yes |
| Real-world applications | Rarely |
Conclusion
Dynamic dispatch is the right default. The cost is negligible for real workloads, and the ergonomic benefits are substantial. Offer static dispatch as an opt-in escape hatch for the rare cases where it matters.
Summary Decision Matrix
| Approach | Alloc-Free | Simple | Object-Safe | Recommended |
|---|---|---|---|---|
| Current (`Box<dyn>`) | ❌ | ✅ | ✅ | ✅ Default |
| Associated Types | ✅ | ❌ | ✅ | ❌ Too complex |
| RPITIT | ✅ | ✅ | ❌ | ⚠️ When no dyn needed |
| GATs | ✅ | ❌ | ❌ | ❌ Overkill |
| Hybrid | ✅ opt-in | ✅ | ✅ | ✅ Best of both |
| Callbacks | ✅ | ❌ | ✅ | ❌ Awkward API |
| SmallVec | ⚠️ | ✅ | ✅ | ⚠️ For ReadDirIter |
Indexing Middleware (Design Plan)
Status: Accepted (Future) — See ADR-031 Scope: Design plan only (no API break)
Summary
Provide a consistent, queryable index of file activity and metadata for real filesystems (SQLite default). The index tracks operations (create, write, rename, delete) and maintains a catalog of files for fast queries and statistics. This enables workflows like “manage a flash drive and query every change” and “mount a drive and get an implicit audit trail.”
Direction: Middleware-only. Indexing is a composable layer users opt into when they want a queryable catalog of file activity.
Goals
- Preserve std::fs-style DX via `FileStorage` (no change to core traits).
- Track file operations with timestamps in a durable index (SQLite default).
- Provide fast queries (by path, prefix, mtime, size, hash).
- Keep index consistent for operations executed through AnyFS.
- Keep the design open to future index engines via a small trait (SQLite default).
Non-Goals
- Full OS-level auditing outside AnyFS (requires kernel hooks).
- Mandatory hashing of all files (optional and expensive).
- Replacing `Tracing` (indexing is storage + query; tracing is instrumentation).
Architecture (Middleware-Only)
Indexing Middleware (Primary)
Shape: `Indexing<B>` where `B: Fs`
Layer: `IndexLayer` (builder-based, like `QuotaLayer`, `TracingLayer`)
Behavior: Intercepts operations and writes entries into an index (SQLite by default). Works on all backends (Memory, SQLite, VRootFs, custom). Guarantees apply only to operations that flow through AnyFS.
Pros
- Backend-agnostic.
- Useful for virtual backends too.
- No special-case OS behavior.
Cons
- External changes on real FS are not captured (unless a watcher/scan helper is added later).
Where It Fits
| Use Case | Recommended |
|---|---|
| AnyFS app wants an audit trail | Indexing<B> middleware |
| Virtual backend needs queryable catalog | Indexing<B> middleware |
| Real FS with external edits to track | Indexing middleware + future watcher/scan helper |
| Mounted drive where all access goes through AnyFS | Indexing middleware (enough) |
Consistency Model
- Through AnyFS: Strong consistency for index updates in strict mode.
- External OS changes: Not captured by default. A future watcher/scan helper can reconcile.
Modes:
#![allow(unused)]
fn main() {
enum IndexConsistency {
Strict, // If index update fails, return error from FS op
BestEffort, // FS op succeeds even if index update fails
}
}
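A sketch of how the two modes could gate the result of an operation whose index update fails (the `apply_write` helper is hypothetical; assume the underlying filesystem write already succeeded):

```rust
#[derive(Clone, Copy)]
enum IndexConsistency {
    Strict,     // index failure fails the FS op
    BestEffort, // index failure is swallowed
}

fn apply_write(
    mode: IndexConsistency,
    index_update: Result<(), String>,
) -> Result<(), String> {
    match (mode, index_update) {
        // Strict: propagate the index error to the caller.
        (IndexConsistency::Strict, Err(e)) => Err(format!("index update failed: {e}")),
        // BestEffort swallows index errors; success passes through either way.
        _ => Ok(()),
    }
}

fn main() {
    assert!(apply_write(IndexConsistency::Strict, Err("disk full".into())).is_err());
    assert!(apply_write(IndexConsistency::BestEffort, Err("disk full".into())).is_ok());
    assert!(apply_write(IndexConsistency::Strict, Ok(())).is_ok());
}
```

Note the asymmetry: in `BestEffort` mode the index can silently drift from the filesystem, which is exactly what a future watcher/scan helper would reconcile.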
Index Schema (SQLite)
Minimal schema focused on query speed and durability:
CREATE TABLE IF NOT EXISTS nodes (
path TEXT PRIMARY KEY,
parent_path TEXT NOT NULL,
file_type INTEGER NOT NULL, -- 0=file, 1=dir, 2=symlink
size INTEGER NOT NULL DEFAULT 0,
inode INTEGER,
nlink INTEGER,
permissions INTEGER,
created_at INTEGER,
modified_at INTEGER,
accessed_at INTEGER,
symlink_target TEXT,
hash BLOB, -- optional
exists INTEGER NOT NULL DEFAULT 1,
last_seen_at INTEGER NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_nodes_parent ON nodes(parent_path);
CREATE INDEX IF NOT EXISTS idx_nodes_mtime ON nodes(modified_at);
CREATE INDEX IF NOT EXISTS idx_nodes_hash ON nodes(hash);
CREATE TABLE IF NOT EXISTS ops (
id INTEGER PRIMARY KEY,
ts INTEGER NOT NULL,
op TEXT NOT NULL, -- "write", "rename", "remove", ...
path TEXT,
path_to TEXT,
bytes INTEGER,
status TEXT NOT NULL, -- "ok" | "err"
error TEXT
);
CREATE TABLE IF NOT EXISTS config (
key TEXT PRIMARY KEY,
value TEXT NOT NULL
);
Path normalization: Store virtual paths (what the user sees), not host paths. For host FS, optionally store host paths in a separate table if needed.
Operation Mapping
| Operation | Index Update |
|---|---|
| `write`, `append`, `truncate` | Upsert node, update size/mtime, log op |
| `create_dir`, `create_dir_all` | Insert dir nodes, log op |
| `remove_file` | Mark exists=0, log op |
| `remove_dir`, `remove_dir_all` | Mark subtree removed (prefix query), log op |
| `rename` | Update path + parent for subtree, log op |
| `copy` | Insert new node from source metadata, log op |
| `symlink`, `hard_link` | Insert node, set link metadata, log op |
| `read`/`read_range` | Optional op log only (configurable) |
Streaming writes: Wrap open_write() with a counting writer that records final size and timestamps on close.
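A minimal counting writer along those lines, using only `std::io` (a sketch; the real middleware would upsert the node row and log the op when the stream finishes):

```rust
use std::io::{self, Write};

/// Wraps any `Write`, tallies bytes accepted by the inner writer, and
/// reports the final size when the stream is finished.
struct CountingWriter<W: Write> {
    inner: W,
    bytes: u64,
}

impl<W: Write> CountingWriter<W> {
    fn new(inner: W) -> Self { Self { inner, bytes: 0 } }

    /// Finish the stream and return (inner writer, total bytes written).
    /// This is where the index update would happen.
    fn finish(mut self) -> io::Result<(W, u64)> {
        self.inner.flush()?;
        Ok((self.inner, self.bytes))
    }
}

impl<W: Write> Write for CountingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = self.inner.write(buf)?;
        self.bytes += n as u64; // count only bytes the inner writer accepted
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

fn main() -> io::Result<()> {
    let mut w = CountingWriter::new(Vec::new());
    w.write_all(b"hello ")?;
    w.write_all(b"world")?;
    let (buf, total) = w.finish()?;
    assert_eq!(total, 11);
    assert_eq!(buf, b"hello world".to_vec());
    Ok(())
}
```

Counting in `write` (rather than assuming `buf.len()`) keeps the tally correct even when the inner writer performs a short write.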
Configuration
#![allow(unused)]
fn main() {
pub struct IndexConfig {
pub index_file: PathBuf, // sidecar index file (SQLite default)
pub consistency: IndexConsistency,
pub track_reads: bool,
pub track_errors: bool,
pub track_metadata: bool,
pub content_hashing: ContentHashing, // None | OnWrite | OnDemand
pub initial_scan: InitialScan, // None | OnDemand | FullScan
}
}
Naming
- Middleware type: Indexing<B>
- Layer: IndexLayer
- Builder methods emphasize intent: index_file, consistency, track_*, content_hashing, initial_scan
Example Configuration
#![allow(unused)]
fn main() {
let backend = MemoryBackend::new()
.layer(IndexLayer::builder()
.index_file("index.db")
.consistency(IndexConsistency::Strict)
.track_reads(false)
.build());
}
Index Engine Abstraction (Future)
To keep the middleware ergonomic while enabling alternate engines, define a small storage trait and keep SQLite as the default implementation:
#![allow(unused)]
fn main() {
pub trait IndexStore: Send + Sync {
fn upsert_node(&self, node: IndexNode) -> Result<(), IndexError>;
fn mark_removed(&self, path: &Path) -> Result<(), IndexError>;
fn rename_prefix(&self, from: &Path, to: &Path) -> Result<(), IndexError>;
fn record_op(&self, entry: OpEntry) -> Result<(), IndexError>;
}
}
The default implementation uses SQLite at index_file. If/when alternate engines are needed, the IndexLayer builder can accept a boxed IndexStore for advanced use without introducing an enum.
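The subtree-rewriting part of rename_prefix is ordinary path manipulation (a sketch of the per-path remapping; a real IndexStore would instead issue one batched SQL UPDATE over the prefix):

```rust
use std::path::{Path, PathBuf};

/// Rewrite `path` from the `from` subtree into the `to` subtree.
/// Returns None if `path` is not inside `from`.
fn remap_path(path: &Path, from: &Path, to: &Path) -> Option<PathBuf> {
    // strip_prefix fails for paths outside the renamed subtree.
    let rest = path.strip_prefix(from).ok()?;
    Some(to.join(rest))
}
```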
Performance Notes
- Use WAL mode for concurrency.
- Batch updates for recursive operations (rename/remove_dir_all).
- Hashing is optional and should be off by default.
- Keep op logs bounded (optional retention policy).
For detailed SQLite tuning (pragmas, connection pooling, checkpointing), see the SQLite Operations Guide.
Security and Containment
- Index file should live outside the root path by default.
- For mounted drives, use a dedicated index path per mount.
- Respect PathFilter and Restrictions when operating through middleware.
Mounting Scenario
When mounted via anyfs (with fuse or winfsp feature flags), all access goes through AnyFS. The index becomes an implicit audit trail:
- Every file operation is logged.
- Queries reflect all operations routed through AnyFS.
Open Questions
- Should op logs be bounded by size/time by default?
- Do we need a query API in anyfs or a separate anyfs-index crate?
- Should middleware expose a read-only IndexStore handle for queries?
- Should we add a companion watcher/scan tool for external changes on a real FS?
Layered Traits (anyfs-backend)
AnyFS uses a layered trait architecture for maximum flexibility with minimal complexity.
See ADR-030 for the design rationale.
Trait Hierarchy
FsPosix
│
┌──────────────┼──────────────┐
│ │ │
FsHandles FsLock FsXattr
│ │ │
└──────────────┴──────────────┘
│
FsFuse ← FsFull + FsInode
│
┌──────────────┴──────────────┐
│ │
FsFull FsInode
│
│
├──────┬───────┬───────┬──────┐
│ │ │ │ │
FsLink FsPerm FsSync FsStats │
│ │ │ │ │
└──────┴───────┴───────┴──────┘
│
Fs ← Most users only need this
│
┌───────────┼───────────┐
│ │ │
FsRead FsWrite FsDir
Derived Traits (auto-impl)
───────────────────────────
FsPath: FsRead + FsLink
(path canonicalization)
Simple rule: Import Fs for basic use. Add traits as needed for advanced features.
Note: FsPath is a derived trait with a blanket impl. Any type implementing FsRead + FsLink automatically gets FsPath. SelfResolving is a marker trait - backends that implement it should be used with NoOpResolver in FileStorage (there is no automatic detection; use FileStorage::with_resolver(backend, NoOpResolver) explicitly). PathResolver is a strategy trait for pluggable path resolution (see ADR-033).
Layer 1: Core Traits (Required)
Thread Safety: All traits require Send + Sync and use &self for all methods. Backend implementers MUST use interior mutability (RwLock, Mutex, etc.) to ensure thread-safe concurrent access. See ADR-023 for rationale.
Path Parameters: Core traits use &Path so they are object-safe (dyn Fs works). For ergonomics, FileStorage and FsExt accept impl AsRef<Path> and forward to the core traits.
FsRead
#![allow(unused)]
fn main() {
pub trait FsRead: Send + Sync {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
fn read_to_string(&self, path: &Path) -> Result<String, FsError>;
fn read_range(&self, path: &Path, offset: u64, len: usize) -> Result<Vec<u8>, FsError>;
fn exists(&self, path: &Path) -> Result<bool, FsError>;
fn metadata(&self, path: &Path) -> Result<Metadata, FsError>;
fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
}
}
FsWrite
#![allow(unused)]
fn main() {
pub trait FsWrite: Send + Sync {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
fn append(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
fn remove_file(&self, path: &Path) -> Result<(), FsError>;
fn rename(&self, from: &Path, to: &Path) -> Result<(), FsError>;
fn copy(&self, from: &Path, to: &Path) -> Result<(), FsError>;
fn truncate(&self, path: &Path, size: u64) -> Result<(), FsError>;
fn open_write(&self, path: &Path) -> Result<Box<dyn Write + Send>, FsError>;
}
}
FsDir
#![allow(unused)]
fn main() {
pub trait FsDir: Send + Sync {
fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError>;
fn create_dir(&self, path: &Path) -> Result<(), FsError>;
fn create_dir_all(&self, path: &Path) -> Result<(), FsError>;
fn remove_dir(&self, path: &Path) -> Result<(), FsError>;
fn remove_dir_all(&self, path: &Path) -> Result<(), FsError>;
}
/// Iterator over directory entries. Wraps a boxed iterator for flexibility.
///
/// - Outer `Result` (from `read_dir()`) = "can I open this directory?"
/// - Inner `Result` (per item) = "can I read this entry?"
pub struct ReadDirIter(Box<dyn Iterator<Item = Result<DirEntry, FsError>> + Send + 'static>);
impl Iterator for ReadDirIter {
type Item = Result<DirEntry, FsError>;
fn next(&mut self) -> Option<Self::Item> { self.0.next() }
}
impl ReadDirIter {
pub fn new(iter: impl Iterator<Item = Result<DirEntry, FsError>> + Send + 'static) -> Self {
Self(Box::new(iter))
}
/// Create from a pre-collected vector (useful for middleware like Overlay).
pub fn from_vec(entries: Vec<Result<DirEntry, FsError>>) -> Self {
Self(Box::new(entries.into_iter()))
}
/// Collect all entries, short-circuiting on first error.
pub fn collect_all(self) -> Result<Vec<DirEntry>, FsError> {
self.collect()
}
}
}
Layer 2: Extended Traits (Optional)
FsLink
#![allow(unused)]
fn main() {
pub trait FsLink: Send + Sync {
fn symlink(&self, target: &Path, link: &Path) -> Result<(), FsError>;
fn hard_link(&self, original: &Path, link: &Path) -> Result<(), FsError>;
fn read_link(&self, path: &Path) -> Result<PathBuf, FsError>;
fn symlink_metadata(&self, path: &Path) -> Result<Metadata, FsError>;
}
}
FsPermissions
#![allow(unused)]
fn main() {
pub trait FsPermissions: Send + Sync {
fn set_permissions(&self, path: &Path, perm: Permissions) -> Result<(), FsError>;
}
}
FsSync
#![allow(unused)]
fn main() {
pub trait FsSync: Send + Sync {
fn sync(&self) -> Result<(), FsError>;
fn fsync(&self, path: &Path) -> Result<(), FsError>;
}
}
FsStats
#![allow(unused)]
fn main() {
pub trait FsStats: Send + Sync {
fn statfs(&self) -> Result<StatFs, FsError>;
}
}
FsPath (Optimizable)
Path canonicalization with a default implementation. Backends can override for optimized resolution.
#![allow(unused)]
fn main() {
pub trait FsPath: FsRead + FsLink {
/// Resolve all symlinks and normalize path (.., .).
/// Default: iterative resolution via read_link() and symlink_metadata().
fn canonicalize(&self, path: &Path) -> Result<PathBuf, FsError> {
// ... default impl ...
}
/// Like canonicalize, but allows non-existent final component.
fn soft_canonicalize(&self, path: &Path) -> Result<PathBuf, FsError> {
// ... default impl ...
}
}
impl<T: FsRead + FsLink> FsPath for T {}
}
SelfResolving (Marker)
Marker trait for backends that handle their own path resolution (e.g., VRootFsBackend, StdFsBackend). When using these backends, use NoOpResolver explicitly:
#![allow(unused)]
fn main() {
pub trait SelfResolving {}
// Usage:
let fs = FileStorage::with_resolver(VRootFsBackend::new("/data")?, NoOpResolver);
}
Layer 3: Inode Trait (For FUSE)
FsInode
Required for FUSE mounting (FUSE operates on inodes, not paths). Also enables correct hardlink reporting (same inode = same file, proper nlink count).
Note: FsLink defines hardlink creation (hard_link()). FsInode enables FUSE to track hardlinks via inode identity.
#![allow(unused)]
fn main() {
pub trait FsInode: Send + Sync {
fn path_to_inode(&self, path: &Path) -> Result<u64, FsError>;
fn inode_to_path(&self, inode: u64) -> Result<PathBuf, FsError>;
fn lookup(&self, parent_inode: u64, name: &OsStr) -> Result<u64, FsError>;
fn metadata_by_inode(&self, inode: u64) -> Result<Metadata, FsError>;
}
}
Layer 4: POSIX Traits (Full POSIX)
POSIX Types
#![allow(unused)]
fn main() {
/// Opaque file handle (inode-based for efficiency)
pub struct Handle(pub u64);
/// File open flags (mirrors POSIX)
#[derive(Clone, Copy, Debug)]
pub struct OpenFlags {
pub read: bool,
pub write: bool,
pub create: bool,
pub truncate: bool,
pub append: bool,
}
impl OpenFlags {
pub const READ: Self = Self { read: true, write: false, create: false, truncate: false, append: false };
pub const WRITE: Self = Self { read: false, write: true, create: true, truncate: true, append: false };
pub const READ_WRITE: Self = Self { read: true, write: true, create: false, truncate: false, append: false };
pub const APPEND: Self = Self { read: false, write: true, create: true, truncate: false, append: true };
}
/// File lock type (mirrors POSIX flock)
#[derive(Clone, Copy, Debug)]
pub enum LockType {
Shared, // Multiple readers
Exclusive, // Single writer
}
}
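For backends backed by a real filesystem, OpenFlags translates directly into std::fs::OpenOptions. A sketch (the OpenFlags struct is repeated from above; to_open_options is illustrative, not part of the contract):

```rust
use std::fs::OpenOptions;

#[derive(Clone, Copy, Debug)]
pub struct OpenFlags {
    pub read: bool,
    pub write: bool,
    pub create: bool,
    pub truncate: bool,
    pub append: bool,
}

/// Translate OpenFlags into the equivalent std::fs::OpenOptions.
fn to_open_options(flags: OpenFlags) -> OpenOptions {
    let mut opts = OpenOptions::new();
    opts.read(flags.read)
        .write(flags.write)
        .create(flags.create)
        .truncate(flags.truncate)
        .append(flags.append);
    opts
}
```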
FsHandles
#![allow(unused)]
fn main() {
pub trait FsHandles: Send + Sync {
fn open(&self, path: &Path, flags: OpenFlags) -> Result<Handle, FsError>;
fn read_at(&self, handle: Handle, buf: &mut [u8], offset: u64) -> Result<usize, FsError>;
fn write_at(&self, handle: Handle, data: &[u8], offset: u64) -> Result<usize, FsError>;
fn close(&self, handle: Handle) -> Result<(), FsError>;
}
}
FsLock
#![allow(unused)]
fn main() {
pub trait FsLock: Send + Sync {
fn lock(&self, handle: Handle, lock: LockType) -> Result<(), FsError>;
fn try_lock(&self, handle: Handle, lock: LockType) -> Result<bool, FsError>;
fn unlock(&self, handle: Handle) -> Result<(), FsError>;
}
}
FsXattr
#![allow(unused)]
fn main() {
pub trait FsXattr: Send + Sync {
fn get_xattr(&self, path: &Path, name: &str) -> Result<Vec<u8>, FsError>;
fn set_xattr(&self, path: &Path, name: &str, value: &[u8]) -> Result<(), FsError>;
fn remove_xattr(&self, path: &Path, name: &str) -> Result<(), FsError>;
fn list_xattr(&self, path: &Path) -> Result<Vec<String>, FsError>;
}
}
Convenience Supertraits
These are automatically implemented via blanket impls:
#![allow(unused)]
fn main() {
/// Basic filesystem - covers 90% of use cases
pub trait Fs: FsRead + FsWrite + FsDir {}
impl<T: FsRead + FsWrite + FsDir> Fs for T {}
/// Full filesystem with all std::fs features
pub trait FsFull: Fs + FsLink + FsPermissions + FsSync + FsStats {}
impl<T: Fs + FsLink + FsPermissions + FsSync + FsStats> FsFull for T {}
/// FUSE-mountable filesystem
pub trait FsFuse: FsFull + FsInode {}
impl<T: FsFull + FsInode> FsFuse for T {}
/// Full POSIX filesystem
pub trait FsPosix: FsFuse + FsHandles + FsLock + FsXattr {}
impl<T: FsFuse + FsHandles + FsLock + FsXattr> FsPosix for T {}
}
When to Use Each Level
| Level | Trait | Use When |
|---|---|---|
| 1 | Fs | Basic file operations (read, write, dirs) |
| 2 | FsFull | Need links, permissions, sync, or stats |
| 3 | FsFuse | FUSE mounting or hardlink support |
| 4 | FsPosix | Full POSIX (file handles, locks, xattr) |
Implementing Functions
Use trait bounds to specify requirements:
#![allow(unused)]
fn main() {
use anyfs::{FileStorage, Fs, FsLink, FsFuse, FsError, MountError};
// Works with any backend, keeps std::fs-style paths
fn process_files<B: Fs>(fs: &FileStorage<B>) -> Result<(), FsError> {
let data = fs.read("/input.txt")?;
fs.write("/output.txt", &data)?;
Ok(())
}
// Requires link support
fn create_backup<B: Fs + FsLink>(fs: &FileStorage<B>) -> Result<(), FsError> {
fs.hard_link("/data.txt", "/data.txt.bak")?;
Ok(())
}
// Requires FsFuse trait + fuse/winfsp feature
fn mount_filesystem(fs: impl FsFuse) -> Result<(), MountError> {
anyfs::MountHandle::mount(fs, "/mnt/myfs")?;
Ok(())
}
}
Extension Trait
FsExt provides convenience methods for any Fs backend:
#![allow(unused)]
fn main() {
pub trait FsExt: Fs {
/// Check if path is a file.
fn is_file(&self, path: impl AsRef<Path>) -> Result<bool, FsError>;
/// Check if path is a directory.
fn is_dir(&self, path: impl AsRef<Path>) -> Result<bool, FsError>;
/// JSON methods (require optional `serde` feature in anyfs-backend)
#[cfg(feature = "serde")]
fn read_json<T: DeserializeOwned>(&self, path: impl AsRef<Path>) -> Result<T, FsError>;
#[cfg(feature = "serde")]
fn write_json<T: Serialize>(&self, path: impl AsRef<Path>, value: &T) -> Result<(), FsError>;
}
// Blanket implementation
impl<B: Fs> FsExt for B {}
}
FileStorage (anyfs)
Ergonomic wrapper for std::fs-aligned API
Overview
FileStorage<B> is a thin wrapper that provides a familiar std::fs-aligned API with:
- B - Backend type (the only generic)
- Built-in path resolution via a boxed PathResolver (swappable at runtime)
It is the intended application-facing API: std::fs-style paths with object-safe core traits under the hood.
It does TWO things:
- Ergonomics (std::fs-aligned API with impl AsRef<Path> convenience)
- Path resolution for virtual backends (via a boxed PathResolver; resolution is a cold path, so boxing is acceptable)
All policy (limits, feature gates, logging) is handled by middleware, not FileStorage.
Why Only One Generic?
Previous designs used FileStorage<B, R, M> with three type parameters. We simplified to FileStorage<B>:
| Old Param | Purpose | Why Removed |
|---|---|---|
R (Resolver) | Swappable path resolution | Boxed internally—resolution is a cold path (ADR-025) |
M (Marker) | Compile-time safety | Users can create wrapper newtypes if needed |
Result: Simpler API for 90% of users. Those who need type-safe markers wrap FileStorage themselves.
Creating a Container
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, FileStorage};
// Simple: ergonomics + default path resolution
let fs = FileStorage::new(MemoryBackend::new());
}
With middleware (layer-based):
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, RestrictionsLayer, FileStorage};
let fs = FileStorage::new(
MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024)
.build())
.layer(RestrictionsLayer::builder()
.deny_permissions()
.build())
);
}
With custom resolver:
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, FileStorage};
use anyfs::resolvers::{CachingResolver, IterativeResolver};
// Custom resolver for read-heavy workloads
let fs = FileStorage::with_resolver(
MemoryBackend::new(),
CachingResolver::new(IterativeResolver::default())
);
}
Type-Safe Markers (User-Defined Wrappers)
If you need compile-time safety to prevent mixing filesystems, create wrapper newtypes:
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, FileStorage};
use anyfs_sqlite::SqliteBackend; // Ecosystem crate
// Define your own wrapper types
struct SandboxFs(FileStorage<MemoryBackend>);
struct UserDataFs(FileStorage<SqliteBackend>);
impl SandboxFs {
fn new() -> Self {
SandboxFs(FileStorage::new(MemoryBackend::new()))
}
}
// Type-safe function signatures prevent mixing
fn process_sandbox(fs: &SandboxFs) {
// Can only accept SandboxFs
}
fn save_user_file(fs: &UserDataFs, name: &str, data: &[u8]) {
// Can only accept UserDataFs
}
// Compile-time safety:
let sandbox = SandboxFs::new();
process_sandbox(&sandbox); // OK
// process_sandbox(&userdata); // Compile error! Wrong type
}
When to Use Wrapper Types
| Scenario | Use Wrapper? | Why |
|---|---|---|
| Single container | No | FileStorage<B> is sufficient |
| Multiple containers, same type | Yes | Prevent accidental mixing |
| Multi-tenant systems | Yes | Compile-time tenant isolation |
| Sandbox + user data | Yes | Never write user data to sandbox |
std::fs-aligned Methods
FileStorage mirrors std::fs naming:
| FileStorage | std::fs |
|---|---|
read() | std::fs::read |
read_to_string() | std::fs::read_to_string |
write() | std::fs::write |
read_dir() | std::fs::read_dir |
create_dir() | std::fs::create_dir |
create_dir_all() | std::fs::create_dir_all |
remove_file() | std::fs::remove_file |
remove_dir() | std::fs::remove_dir |
remove_dir_all() | std::fs::remove_dir_all |
rename() | std::fs::rename |
copy() | std::fs::copy |
metadata() | std::fs::metadata |
symlink_metadata() | std::fs::symlink_metadata |
read_link() | std::fs::read_link |
set_permissions() | std::fs::set_permissions |
When the backend implements extended traits (e.g., FsLink, FsInode, FsHandles), FileStorage forwards those methods too and keeps the same impl AsRef<Path> ergonomics for path parameters.
What FileStorage Does NOT Do
| Concern | Use Instead |
|---|---|
| Quota enforcement | Quota<B> |
| Feature gating | Restrictions<B> |
| Audit logging | Tracing<B> |
| Path containment | PathFilter middleware or VRootFsBackend containment |
FileStorage is not a policy layer. If you need policy, compose middleware.
FileStorage Implementation
#![allow(unused)]
fn main() {
use anyfs_backend::{Fs, PathResolver};
use anyfs::resolvers::IterativeResolver;
/// Ergonomic wrapper with single generic parameter.
pub struct FileStorage<B> {
backend: B,
resolver: Box<dyn PathResolver>, // Boxed: resolution is cold path
}
impl<B: Fs> FileStorage<B> {
/// Create with default resolver (IterativeResolver).
pub fn new(backend: B) -> Self {
FileStorage {
backend,
resolver: Box::new(IterativeResolver::new()),
}
}
/// Create with custom resolver.
pub fn with_resolver(backend: B, resolver: impl PathResolver + 'static) -> Self {
FileStorage {
backend,
resolver: Box::new(resolver),
}
}
/// Type-erase the backend for simpler types (opt-in boxing).
pub fn boxed(self) -> FileStorage<Box<dyn Fs>> {
FileStorage {
backend: Box::new(self.backend),
resolver: self.resolver,
}
}
}
}
Path Resolution
FileStorage handles path resolution for virtual backends via the boxed PathResolver. The default IterativeResolver provides symlink-aware canonicalization.
Backends implementing SelfResolving (like VRootFsBackend) should be paired with NoOpResolver so virtual resolution is skipped; the OS handles it.
Type Erasure (Opt-in)
When you need simpler types (e.g., storing in collections), use .boxed():
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, FileStorage, Fs};
use anyfs_sqlite::SqliteBackend; // Ecosystem crate
// Type-erased for uniform storage
let filesystems: Vec<FileStorage<Box<dyn Fs>>> = vec![
FileStorage::new(MemoryBackend::new()).boxed(),
FileStorage::new(SqliteBackend::open("a.db")?).boxed(),
FileStorage::new(SqliteBackend::open("b.db")?).boxed(),
];
}
When to use .boxed():
| Situation | Use Generic | Use .boxed() |
|---|---|---|
| Local variables | Yes | No |
| Function params | Yes (impl Fs) | No |
| Return types | Yes (impl Fs) | No |
| Collections of mixed backends | No | Yes |
| Struct fields (want simple type) | Maybe | Yes |
Direct Backend Access
Middleware layers attach to the backend itself; FileStorage is only the final ergonomic wrapper:
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, FileStorage};
let backend = MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024)
.build());
// Use FileStorage for std::fs-style paths
let fs = FileStorage::new(backend);
fs.write("/file.txt", b"data")?;
}
FileStorage<B> is part of the anyfs crate, not a separate crate.
Backends Guide
This guide explains each backend available for AnyFS—both built-in (in anyfs) and ecosystem crates (anyfs-sqlite, anyfs-indexed)—how they work internally, when to use them, and the trade-offs involved.
Quick Reference: Which Backend Should You Use?
TL;DR — Pick the first match from top to bottom:
| Your Situation | Best Choice | Why |
|---|---|---|
| Writing tests | MemoryBackend | Fast, isolated, no cleanup |
| Running in WASM/browser | MemoryBackend | Simplest option |
| Need encrypted single-file storage | anyfs-sqlite: SqliteBackend | AES-256 via encryption feature (ecosystem crate) |
| Need portable single-file database | anyfs-sqlite: SqliteBackend | Cross-platform, ACID (ecosystem crate) |
| Large files (>100MB) with path isolation | anyfs-indexed: IndexedBackend | Virtual paths + native disk I/O (ecosystem crate) |
| Containing untrusted code to a directory | VRootFsBackend | Prevents path traversal attacks |
| Working with real files in trusted environment | StdFsBackend | Direct OS operations |
| Need layered filesystem (container-like) | Overlay (middleware) | Base + writable upper layer |
⚠️ Security Warning: StdFsBackend provides NO isolation. Never use with untrusted input.
Ecosystem Crates: Complex backends like SqliteBackend and IndexedBackend live in separate crates (anyfs-sqlite, anyfs-indexed) because they require internal runtime complexity (connection pooling, sharding, chunking).
Backend Categories
AnyFS backends fall into two fundamental categories based on who resolves paths:
| Category | Path Resolution | Symlink Handling | Isolation |
|---|---|---|---|
| Type 1: Virtual Filesystem | PathResolver (pluggable) | Simulated by AnyFS | Complete |
| Type 2: Real Filesystem | Operating System | Delegated to OS | Partial/None |
Type 1: Virtual Filesystem Backends
These backends store filesystem data in an abstract format (memory, database, etc.). AnyFS handles path resolution via pluggable PathResolver (see ADR-033), including:
- Path traversal (.., .)
- Symlink following (simulated)
- Hard link tracking (simulated)
- Path normalization
Key benefit: Complete isolation from the host OS. Identical behavior across all platforms.
Type 2: Real Filesystem Backends
These backends delegate operations to the actual operating system. The OS handles path resolution, which means:
- Native symlink behavior
- Native permission enforcement
- Platform-specific edge cases
- Potential security considerations (path escapes)
Key benefit: Native performance and compatibility with existing files.
Type 1: Virtual Filesystem Backends
MemoryBackend
An in-memory filesystem. All data lives in RAM and is lost when the process exits.
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, FileStorage};
let fs = FileStorage::new(MemoryBackend::new());
fs.write("/data.txt", b"Hello, World!")?;
}
How It Works
- Files and directories stored in a tree structure (HashMap or similar)
- Symlinks stored as data pointing to target paths
- Hard links share the same underlying data node
- All operations are memory-only (no disk I/O)
- Supports snapshots via Clone and persistence via save_to()/load_from()
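A toy version of that tree shows why Clone doubles as a snapshot (a deliberately simplified sketch, not the actual MemoryBackend layout):

```rust
use std::collections::HashMap;

/// Simplified in-memory node: either file bytes or a directory of children.
#[derive(Clone)]
enum Node {
    File(Vec<u8>),
    Dir(HashMap<String, Node>),
}

impl Node {
    fn new_dir() -> Self {
        Node::Dir(HashMap::new())
    }

    /// Insert a file directly under this directory.
    fn write(&mut self, name: &str, data: &[u8]) {
        if let Node::Dir(children) = self {
            children.insert(name.to_string(), Node::File(data.to_vec()));
        }
    }

    fn read(&self, name: &str) -> Option<&[u8]> {
        match self {
            Node::Dir(children) => match children.get(name)? {
                Node::File(data) => Some(data.as_slice()),
                _ => None,
            },
            _ => None,
        }
    }
}
```

Because the whole filesystem is one value, `root.clone()` is an instant, fully independent snapshot.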
Performance
| Operation | Speed | Notes |
|---|---|---|
| Read/Write | ⚡ Very Fast | No I/O, pure memory operations |
| Path Resolution | ⚡ Very Fast | In-memory tree traversal |
| Large Files | ⚠️ Memory-bound | Limited by available RAM |
Advantages
- Fastest backend - no disk I/O overhead
- Deterministic - perfect for testing
- Portable - works on all platforms including WASM
- Snapshots - Clone creates instant backups
- No cleanup - no temp files to delete
Disadvantages
- Volatile - data lost on process exit (unless serialized)
- Memory-limited - large filesystems consume RAM
- No persistence - must explicitly save/load state
When to Use
| Use Case | Recommendation |
|---|---|
| Unit tests | ✅ Ideal - fast, isolated, deterministic |
| Integration tests | ✅ Ideal - no filesystem pollution |
| Temporary workspaces | ✅ Good - fast scratch space |
| Build caches | ✅ Good - if fits in memory |
| WASM/Browser | ✅ Ideal - simplest option |
| Large file storage | ❌ Avoid - use anyfs-sqlite or disk |
| Persistent data | ❌ Avoid - unless you handle serialization |
✅ USE MemoryBackend when:
- Writing unit tests (fast, isolated, deterministic)
- Writing integration tests (no filesystem pollution)
- Building temporary workspaces or scratch space
- Caching data that fits in memory
- Running in WASM/browser environments (simplest option)
- Need instant snapshots via Clone
❌ DON’T USE MemoryBackend when:
- Storing files larger than available RAM
- Data must survive process restart (use anyfs-sqlite)
- Working with existing files on disk (use VRootFsBackend)
SqliteBackend (Ecosystem Crate)
Crate: anyfs-sqlite. Complex backends live in separate crates; see the AGENTS.md “Crate Ecosystem” section.
Stores the entire filesystem in a single SQLite database file.
#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend;
use anyfs::FileStorage;
let fs = FileStorage::new(SqliteBackend::open("myfs.db")?);
fs.write("/documents/report.txt", b"Annual Report")?;
}
How It Works
- Single .db file contains all files, directories, and metadata
- Schema: nodes table (path, type, content, permissions, timestamps)
- Symlinks stored as rows with the target path in content
- Hard links share the same inode (row ID)
- Uses WAL mode for concurrent read access
- Connection pooling: multiple readers, single writer with batching
- Write batching: groups operations into transactions for efficiency
- Transactions ensure atomic operations
Key insight: “Writes are expensive.” SqliteBackend batches writes internally because one transaction per batch is far more efficient than one transaction per operation.
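The batching pattern can be sketched independently of SQLite: queue operations and flush them as one unit once a threshold is hit (BatchWriter and its string "ops" are illustrative; the real backend flushes each batch inside a single SQLite transaction):

```rust
/// Buffers write operations and flushes them in batches.
struct BatchWriter {
    pending: Vec<String>,              // queued operations (stand-in for real ops)
    batch_size: usize,
    flushed_batches: Vec<Vec<String>>, // each inner Vec = one "transaction"
}

impl BatchWriter {
    fn new(batch_size: usize) -> Self {
        Self { pending: Vec::new(), batch_size, flushed_batches: Vec::new() }
    }

    fn submit(&mut self, op: String) {
        self.pending.push(op);
        if self.pending.len() >= self.batch_size {
            self.flush();
        }
    }

    /// Commit all pending ops as one batch (one transaction in the real backend).
    fn flush(&mut self) {
        if !self.pending.is_empty() {
            self.flushed_batches.push(std::mem::take(&mut self.pending));
        }
    }
}
```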
Performance
| Operation | Speed | Notes |
|---|---|---|
| Read/Write | 🐢 Slower | SQLite query overhead |
| Path Resolution | 🐢 Slower | Database lookups per component |
| Transactions | ✅ Atomic | ACID guarantees |
| Large Files | 🟡 Varies | See note below |
Large file behavior: SQLite streams BLOB content incrementally via sqlite3_blob_read/write, so files don’t need to fit entirely in RAM. However, very large BLOBs (>100MB) can cause higher memory pressure during I/O operations due to SQLite’s internal buffering and page management. For frequent large file operations, consider IndexedBackend which uses native file I/O.
Performance note: SQLite performance varies significantly based on hardware, configuration, and workload. With proper tuning (WAL mode, connection pooling, write batching), a single SQLite database on modern hardware can achieve high throughput. See sqlite-operations.md for tuning guidance.
Advantages
- Single-file portability - entire filesystem in one .db file
- ACID transactions - atomic operations, crash recovery
- Cross-platform - works on all platforms including WASM
- Complete isolation - no interaction with host filesystem
- Queryable - can inspect with SQLite tools
- Optional encryption - AES-256 via SQLCipher with the encryption feature
Disadvantages
- Slower than memory - database overhead on every operation
- Single-writer - SQLite’s write lock limits concurrency
- Large file overhead - very large BLOBs (>100MB) have higher memory pressure due to SQLite buffering
When to Use
| Use Case | Recommendation |
|---|---|
| Portable storage | ✅ Ideal - single file, works everywhere |
| Embedded databases | ✅ Ideal - self-contained |
| Sandboxed environments | ✅ Good - complete isolation |
| Encrypted storage | ✅ Good - use open_encrypted() with feature |
| Archive/backup | ✅ Good - atomic, portable |
| Large media files | ✅ Works - higher memory pressure during I/O |
| High-throughput I/O | ⚠️ Tradeoff - database overhead vs MemoryBackend |
| External tool access | ❌ Avoid - files not on real filesystem |
✅ USE SqliteBackend when:
- Need portable, single-file storage (easy to copy, backup, share)
- Building embedded/self-contained applications
- Complete isolation from host filesystem is required
- Want encryption (use open_encrypted() with the encryption feature)
- Need ACID transactions and crash recovery
- Cross-platform consistency is critical
❌ DON’T USE SqliteBackend when:
- Files must be accessible to external tools (use VRootFsBackend)
- Minimizing memory pressure for very large files is critical (use anyfs-indexed: IndexedBackend)
💡 SqliteBackend vs IndexedBackend: Both provide complete path isolation. Choose SqliteBackend for single-file portability and portable storage. Choose IndexedBackend (anyfs-indexed) for very large files (>100MB) that need native disk streaming performance.
IndexedBackend (Ecosystem Crate)
Crate: anyfs-indexed. Complex backends live in separate crates; see the AGENTS.md “Crate Ecosystem” section.
A hybrid backend: virtual paths with disk-based content storage. Paths, directories, symlinks, and metadata are stored in an index database. File content is stored on the real filesystem as opaque blobs.
Key insight: Same isolation model as SqliteBackend, but file content stored externally for native I/O performance with large files.
#![allow(unused)]
fn main() {
use anyfs_indexed::IndexedBackend;
use anyfs::FileStorage;
// Files stored in ./storage/, index in ./storage/index.db
let fs = FileStorage::new(IndexedBackend::open("./storage")?);
fs.write("/documents/report.pdf", &pdf_bytes)?;
// Actually stored as: ./storage/blobs/a1b2c3d4-5678-...-1704067200.bin
}
How It Works
Virtual Path Real Storage
─────────────────────────────────────────────────────
/documents/report.pdf → ./storage/blobs/a1b2c3d4-...-1704067200.bin
/images/photo.jpg → ./storage/blobs/b2c3d4e5-...-1704067201.bin
/config.json → ./storage/blobs/c3d4e5f6-...-1704067202.bin
index.db contains:
┌─────────────────────────┬──────────────────────────────┬──────────┐
│ virtual_path │ blob_name │ metadata │
├─────────────────────────┼──────────────────────────────┼──────────┤
│ /documents/report.pdf │ a1b2c3d4-...-1704067200.bin │ {...} │
│ /images/photo.jpg │ b2c3d4e5-...-1704067201.bin │ {...} │
└─────────────────────────┴──────────────────────────────┴──────────┘
- Virtual filesystem, real content: directory structure, paths, symlinks, and metadata are virtual (stored in index.db); only raw file content lives on disk as opaque blobs
- Files stored with UUID + timestamp names (flat, meaningless filenames)
- The index.db SQLite database maps virtual paths to blob names
- Symlinks and hard links are simulated in the index (not OS symlinks)
- Path resolution handled by the AnyFS framework (Type 1 backend)
- File content streamed directly from disk (native I/O performance)
Performance
| Operation | Speed | Notes |
|---|---|---|
| Read/Write | 🟢 Fast | Native disk I/O for content |
| Path Resolution | 🟡 Moderate | Index lookup + disk access |
| Large Files | ✅ Excellent | Streamed directly from disk |
| Metadata Ops | 🟢 Fast | Index-only, no disk I/O |
Index Optimization
The SQLite index benefits from the same performance tuning as SqliteBackend:
| Setting | Default | Purpose |
|---|---|---|
journal_mode | WAL | Concurrent reads during metadata updates |
synchronous | FULL | Index integrity on power loss (safe default) |
cache_size | 16MB | Smaller cache for metadata-only index |
busy_timeout | 5000ms | Gracefully handle lock contention |
auto_vacuum | INCREMENTAL | Reclaim space from deleted entries |
Connection pooling: 4-8 reader connections for concurrent index queries; single writer for metadata updates. Blob I/O bypasses SQLite entirely, so the bottleneck is typically blob disk throughput, not index performance.
See anyfs-indexed#9 for detailed performance guidance.
Advantages
- Native file I/O - content stored as raw files, fast streaming
- Large file support - uses OS file I/O, avoids SQLite BLOB buffering overhead
- Complete path isolation - virtual paths, same as SqliteBackend
- Inspectable - can see blob files on disk (though with opaque names)
- Cross-platform - works identically on all platforms
Disadvantages
- Index dependency - losing index.db means losing the virtual structure (blobs become orphaned)
- Two-component backup - must copy the blob directory and index.db together
- Content exposure - blob files are readable on disk (paths are hidden, content is not)
- Not single-file portable - unlike SqliteBackend
When to Use
| Use Case | Recommendation |
|---|---|
| Large file storage | ✅ Ideal - native I/O performance |
| Media libraries | ✅ Ideal - stream large videos/images |
| Document management | ✅ Good - virtual paths + fast I/O |
| Sandboxed + large files | ✅ Ideal - virtual paths, real I/O |
| Single-file portability | ❌ Avoid - use anyfs-sqlite: SqliteBackend |
| Content confidentiality | ⚠️ Wrap - use Encryption middleware for protection |
| WASM/Browser | ❌ Avoid - requires real filesystem |
✅ USE IndexedBackend when:
- Storing large files (videos, images, documents >100MB)
- Need native I/O performance for streaming content
- Building media libraries or document management systems
- Want virtual path isolation but with real disk performance
- Files are large but path structure should be sandboxed
❌ DON’T USE IndexedBackend when:
- Need single-file portability (use anyfs-sqlite: SqliteBackend)
- Content must be hidden from host filesystem (use anyfs-sqlite: SqliteBackend with encryption)
- Need WASM/browser support (use MemoryBackend)
🔒 Encryption Tip: If you need large-file performance but content confidentiality matters, you can implement an `Encryption<B>` middleware wrapper to encrypt blob contents at rest. This is a user-defined middleware pattern (not built-in) - see the middleware implementation guide for how to create custom middleware. Alternatively, use `SqliteBackend` with SQLCipher encryption for simpler encrypted storage.
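The wrapper shape such a middleware would take can be sketched in miniature. Everything below is illustrative: the two-method `Fs` trait, `ToyBackend`, and the XOR "cipher" are stand-ins invented for this sketch (XOR is not real encryption; actual code would implement the full AnyFS trait set and use a real AEAD cipher):

```rust
use std::cell::RefCell;
use std::collections::HashMap;

// Reduced two-method stand-in for the full `Fs` contract (illustration only).
trait Fs {
    fn read(&self, path: &str) -> Vec<u8>;
    fn write(&self, path: &str, data: &[u8]);
}

// Toy in-memory backend so the example is self-contained.
#[derive(Default)]
struct ToyBackend { files: RefCell<HashMap<String, Vec<u8>>> }

impl Fs for ToyBackend {
    fn read(&self, path: &str) -> Vec<u8> {
        self.files.borrow().get(path).cloned().unwrap_or_default()
    }
    fn write(&self, path: &str, data: &[u8]) {
        self.files.borrow_mut().insert(path.to_string(), data.to_vec());
    }
}

// The wrapper shape: encrypt before delegating, decrypt after reading.
// XOR with one key byte is a placeholder cipher, NOT real encryption.
struct Encryption<B: Fs> { inner: B, key: u8 }

impl<B: Fs> Fs for Encryption<B> {
    fn read(&self, path: &str) -> Vec<u8> {
        self.inner.read(path).iter().map(|b| b ^ self.key).collect()
    }
    fn write(&self, path: &str, data: &[u8]) {
        let enc: Vec<u8> = data.iter().map(|b| b ^ self.key).collect();
        self.inner.write(path, &enc);
    }
}

fn main() {
    let fs = Encryption { inner: ToyBackend::default(), key: 0x5A };
    fs.write("/blob/0001", b"secret");
    assert_eq!(fs.read("/blob/0001"), b"secret".to_vec());       // transparent roundtrip
    assert_ne!(fs.inner.read("/blob/0001"), b"secret".to_vec()); // at-rest bytes differ
}
```

The callers see plaintext; only the inner backend ever stores ciphertext. This is the same intercept-on-read/write, delegate-everything-else pattern used by the built-in middleware.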
Type 2: Real Filesystem Backends
StdFsBackend
Direct delegation to std::fs. Every call maps 1:1 to the standard library.
#![allow(unused)]
fn main() {
use anyfs::{StdFsBackend, FileStorage, NoOpResolver};
// SelfResolving backends require explicit NoOpResolver
let fs = FileStorage::with_resolver(StdFsBackend::new(), NoOpResolver);
fs.write("/tmp/data.txt", b"Hello")?; // Actually writes to /tmp/data.txt
}
How It Works
- Every method directly calls the equivalent `std::fs` function
- Paths passed through unchanged
- OS handles all resolution, symlinks, permissions
- Implements the `SelfResolving` marker (use `NoOpResolver` to skip virtual resolution)
Performance
| Operation | Speed | Notes |
|---|---|---|
| Read/Write | 🟢 Normal | Native OS speed |
| Path Resolution | ⚡ Fast | OS kernel handles it |
| Symlinks | ✅ Native | OS behavior |
Advantages
- Zero overhead - direct OS calls
- Full compatibility - works with all existing files
- Native features - OS permissions, ACLs, xattrs
- Middleware-ready - add Quota, Tracing, etc. to real filesystem
Disadvantages
- No isolation - full filesystem access
- No containment - paths can escape anywhere
- Platform differences - Windows vs Unix behavior
- Security risk - must trust path inputs
When to Use
| Use Case | Recommendation |
|---|---|
| Adding middleware to real FS | ✅ Ideal - wrap with Quota, Tracing |
| Trusted environments | ✅ Good - when isolation not needed |
| Migration path | ✅ Good - gradually add AnyFS features |
| Full host FS features | ✅ Good - ACLs, xattrs, etc. |
| Untrusted input | ❌ Never - use VRootFsBackend |
| Sandboxing | ❌ Never - no containment whatsoever |
| Multi-tenant systems | ❌ Avoid - use virtual backends |
✅ USE StdFsBackend when:
- Adding middleware (Quota, Tracing, etc.) to real filesystem operations
- Operating in a fully trusted environment with controlled inputs
- Migrating existing code to AnyFS incrementally
- Need full access to host filesystem features (ACLs, xattrs)
- Building tools that work with user’s actual files
❌ DON’T USE StdFsBackend when:
- Handling untrusted path inputs (use VRootFsBackend)
- Any form of sandboxing is required (no containment!)
- Building multi-tenant systems (use virtual backends)
- Security isolation matters at all
⚠️ Security Warning: StdFsBackend provides ZERO isolation. Paths like `../../etc/passwd` will work. Only use with fully trusted, controlled inputs.
VRootFsBackend
Sets a directory as a virtual root. All operations are contained within it.
Feature: `vrootfs`
#![allow(unused)]
fn main() {
use anyfs::{VRootFsBackend, FileStorage, NoOpResolver};
// /home/user/sandbox becomes the virtual "/"
// SelfResolving backends require explicit NoOpResolver
let fs = FileStorage::with_resolver(
VRootFsBackend::new("/home/user/sandbox")?,
NoOpResolver
);
fs.write("/data.txt", b"Hello")?;
// Actually writes to: /home/user/sandbox/data.txt
fs.read("/../../../etc/passwd")?;
// Resolves to: /home/user/sandbox/etc/passwd (clamped!)
}
How It Works
- Configured with a real directory as the “virtual root”
- All paths are validated and clamped to stay within root
- Uses the `strict-path` crate for escape prevention
- Symlinks are followed but targets validated
- Implements the `SelfResolving` marker (OS handles resolution after validation)
Virtual Path Validation Real Path
───────────────────────────────────────────────────────────────
/data.txt → validate & join → /home/user/sandbox/data.txt
/../../../etc → clamp to root → /home/user/sandbox/etc
/link → /tmp → validate target → ERROR or clamped
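The "clamp to root" step in the table above can be sketched lexically with the standard library alone. This is illustrative only: the real backend delegates to the `strict-path` crate and additionally validates symlink targets, which no purely lexical check can cover.

```rust
use std::path::{Component, Path, PathBuf};

// Lexically clamp a virtual path under a real root. A ".." at the top of the
// virtual path is simply dropped, so the result can never climb above `root`.
fn clamp_to_root(root: &Path, virtual_path: &Path) -> PathBuf {
    let mut rel = PathBuf::new();
    for comp in virtual_path.components() {
        match comp {
            Component::Normal(name) => rel.push(name), // ordinary segment
            Component::ParentDir => { rel.pop(); }     // pop() at the top is a no-op: clamped
            _ => {}                                    // "/", ".", prefixes: ignored
        }
    }
    root.join(rel)
}

fn main() {
    let root = Path::new("/home/user/sandbox");
    assert_eq!(
        clamp_to_root(root, Path::new("/data.txt")),
        PathBuf::from("/home/user/sandbox/data.txt")
    );
    // The escape attempt from the earlier example is clamped back under the root:
    assert_eq!(
        clamp_to_root(root, Path::new("/../../../etc/passwd")),
        PathBuf::from("/home/user/sandbox/etc/passwd")
    );
}
```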
Performance
| Operation | Speed | Notes |
|---|---|---|
| Read/Write | 🟡 Moderate | Validation overhead |
| Path Resolution | 🐢 Slower | Extra I/O for symlink checks |
| Symlink Following | 🐢 Slower | Must validate each hop |
Advantages
- Path containment - cannot escape virtual root
- Real file access - native OS performance for content
- Symlink safety - targets validated against root
- Drop-in sandboxing - wrap existing directories
Disadvantages
- Performance overhead - validation on every operation
- Extra I/O - symlink following requires `lstat` calls
- Platform quirks - symlink behavior varies (especially Windows)
- Theoretical edge cases - TOCTOU races exist but are difficult to exploit
When to Use
| Use Case | Recommendation |
|---|---|
| User uploads directory | ✅ Ideal - contain user content |
| Plugin sandboxing | ✅ Good - limit plugin file access |
| Chroot-like isolation | ✅ Good - without actual chroot |
| AI agent workspaces | ✅ Good - bound agent to directory |
| Real FS + path containment | ✅ Ideal - native I/O with boundaries |
| Maximum security | ⚠️ Careful - theoretical TOCTOU exists |
| Cross-platform symlinks | ⚠️ Careful - Windows behavior differs |
| Complete host isolation | ❌ Avoid - use SqliteBackend instead |
✅ USE VRootFsBackend when:
- Containing user-uploaded content to a specific directory
- Sandboxing plugins, extensions, or untrusted code
- Need chroot-like isolation without actual chroot privileges
- Building AI agent workspaces with filesystem boundaries
- Want real filesystem performance with path containment
❌ DON’T USE VRootFsBackend when:
- Maximum security required (theoretical TOCTOU edge cases exist - use MemoryBackend)
- Need highest I/O performance (validation adds overhead)
- Cross-platform symlink consistency is critical (Windows differs)
- Want complete isolation from host (use SqliteBackend)
🔒 Encryption Tip: For sensitive data in sandboxed directories (user uploads, plugin workspaces, AI agent data), consider implementing an `Encryption<B>` middleware wrapper. This is a user-defined middleware pattern - you would create a custom `Layer` that encrypts data before delegating to the inner backend. See the middleware implementation guide for the pattern.
Composition Middleware
Overlay<Base, Upper>
Union filesystem middleware combining a read-only base with a writable upper layer.
Note: Overlay is middleware (in `anyfs/middleware/overlay.rs`), not a standalone backend. It composes two backends into a layered view.
#![allow(unused)]
fn main() {
use anyfs::{VRootFsBackend, MemoryBackend, Overlay, FileStorage};
// Base: read-only template
let base = VRootFsBackend::new("/var/templates")?;
// Upper: writable scratch layer
let upper = MemoryBackend::new();
let fs = FileStorage::new(Overlay::new(base, upper));
// Read: checks upper first, falls back to base
let data = fs.read("/config.txt")?;
// Write: always goes to upper
fs.write("/config.txt", b"modified")?;
// Delete: creates "whiteout" in upper, shadows base
fs.remove_file("/unwanted.txt")?;
}
How It Works
┌─────────────────────────────────────────────────┐
│ Overlay<B, U> │
├─────────────────────────────────────────────────┤
│ Read: upper.exists(path)? │
│ → upper.read(path) │
│ : base.read(path) │
│ │
│ Write: upper.write(path, data) │
│ (base unchanged) │
│ │
│ Delete: upper.mark_whiteout(path) │
│ (shadows base, doesn't delete it) │
│ │
│ List: merge(base.read_dir(), upper.read_dir())│
│ - exclude whiteouts │
└─────────────────────────────────────────────────┘
┌──────────────┐
│ Upper │ ← Writes go here
│ (MemoryFs) │ ← Modifications stored here
│ │ ← Whiteouts (deletions) here
└──────┬───────┘
│ if not found
▼
┌──────────────┐
│ Base │ ← Read-only layer
│ (SqliteFs) │ ← Original/template data
│ │ ← Never modified
└──────────────┘
- Reads: Check upper layer first, fall back to base
- Writes: Always go to upper layer (base is read-only)
- Deletes: Create “whiteout” marker in upper (shadows base file)
- Directory listing: Merge both layers, exclude whiteouts
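These four rules can be modeled with plain hash maps. The sketch below is illustrative only: the real Overlay composes two `Fs` backends and returns `Result<_, FsError>`, not `Option`, but the whiteout and fallback logic is the same.

```rust
use std::collections::{HashMap, HashSet};

// Minimal in-memory model of overlay semantics.
struct Overlay {
    base: HashMap<String, Vec<u8>>,   // read-only layer, never modified
    upper: HashMap<String, Vec<u8>>,  // writable layer
    whiteouts: HashSet<String>,       // deletion markers shadowing the base
}

impl Overlay {
    // Read: whiteout wins, then upper, then fall back to base.
    fn read(&self, path: &str) -> Option<&Vec<u8>> {
        if self.whiteouts.contains(path) { return None; }
        self.upper.get(path).or_else(|| self.base.get(path))
    }
    // Write: always to upper; a new write cancels any earlier whiteout.
    fn write(&mut self, path: &str, data: Vec<u8>) {
        self.whiteouts.remove(path);
        self.upper.insert(path.to_string(), data);
    }
    // Delete: mark a whiteout so the base copy is shadowed, not destroyed.
    fn remove(&mut self, path: &str) {
        self.upper.remove(path);
        self.whiteouts.insert(path.to_string());
    }
    // List: merge both layers, drop whiteouts and duplicates.
    fn list(&self) -> Vec<&str> {
        let mut names: Vec<&str> = self.base.keys().chain(self.upper.keys())
            .map(|s| s.as_str())
            .filter(|p| !self.whiteouts.contains(*p))
            .collect();
        names.sort();
        names.dedup();
        names
    }
}

fn main() {
    let mut fs = Overlay {
        base: HashMap::from([("/config.txt".to_string(), b"template".to_vec())]),
        upper: HashMap::new(),
        whiteouts: HashSet::new(),
    };
    assert_eq!(fs.read("/config.txt"), Some(&b"template".to_vec())); // base fallback
    fs.write("/config.txt", b"modified".to_vec());
    assert_eq!(fs.read("/config.txt"), Some(&b"modified".to_vec())); // upper wins
    fs.remove("/config.txt");
    assert_eq!(fs.read("/config.txt"), None);                        // whiteout shadows base
}
```

Discarding `upper` and `whiteouts` restores the pristine base, which is exactly the instant-rollback property described below.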
Performance
| Operation | Speed | Notes |
|---|---|---|
| Read (upper hit) | ⚡ Fast | Single layer lookup |
| Read (base fallback) | 🟡 Moderate | Two-layer lookup |
| Write | Depends on upper | Upper layer speed |
| Directory listing | 🐢 Slower | Must merge both layers |
Advantages
- Copy-on-write semantics - modifications don’t affect base
- Instant rollback - discard upper layer to reset
- Space efficient - only changes stored in upper
- Template pattern - share base across multiple instances
- Testing isolation - test against real data without modifying it
Disadvantages
- Complexity - whiteout handling, merge logic
- Directory listing overhead - must combine and filter
- Two backends to manage - lifecycle of both layers
- Not true CoW - doesn’t deduplicate at block level
When to Use
| Use Case | Recommendation |
|---|---|
| Container images | ✅ Ideal - base image + writable layer |
| Template filesystems | ✅ Ideal - shared base, per-user upper |
| Testing with real data | ✅ Ideal - modify without consequences |
| Rollback capability | ✅ Good - discard upper to reset |
| Git-like branching | ✅ Good - branch = new upper layer |
| Simple use cases | ❌ Overkill - use single backend |
| Block-level CoW | ❌ Avoid - Overlay is file-level |
| Dir listing perf | ❌ Avoid - merge overhead on listings |
✅ USE Overlay when:
- Building container-like systems (base image + writable layer)
- Sharing a template filesystem across multiple instances
- Testing against production data without modifying it
- Need instant rollback capability (discard upper layer)
- Implementing git-like branching at filesystem level
❌ DON’T USE Overlay when:
- Simple, single-purpose filesystem (unnecessary complexity)
- Need block-level copy-on-write (Overlay is file-level)
- Directory listing performance is critical (merge overhead)
- Don’t need layered semantics (use single backend)
Backend Selection Guide
Quick Decision Tree
Do you need persistence?
├─ No → MemoryBackend
└─ Yes
├─ Single portable file? → SqliteBackend
├─ Large files + path isolation? → IndexedBackend
└─ Access existing files on disk?
├─ Need containment? → VRootFsBackend
└─ Trusted environment? → StdFsBackend
Comparison Matrix
| Backend | Speed | Isolation | Persistence | Large Files | WASM |
|---|---|---|---|---|---|
| MemoryBackend | ⚡ Very Fast | ✅ Complete | ❌ None | ⚠️ RAM-limited | ✅ |
| SqliteBackend | 🐢 Slower | ✅ Complete | ✅ Single file | ✅ Supported | ✅ |
| IndexedBackend | 🟢 Fast | ✅ Complete | ✅ Directory | ✅ Native I/O | ❌ |
| StdFsBackend | 🟢 Normal | ❌ None | ✅ Native | ✅ Native | ❌ |
| VRootFsBackend | 🟡 Moderate | ✅ Strong | ✅ Native | ✅ Native | ❌ |
| Overlay† | Varies | Varies | Varies | Varies | Varies |
†Overlay is middleware that composes two backends; characteristics depend on the backends used.
By Use Case
| Use Case | Recommended |
|---|---|
| Unit testing | MemoryBackend |
| Integration testing | MemoryBackend or SqliteBackend |
| Portable application data | SqliteBackend |
| Encrypted storage | SqliteBackend (with encryption feature) |
| Large file + isolation | IndexedBackend |
| Media libraries | IndexedBackend |
| Plugin/agent sandboxing | VRootFsBackend |
| Adding middleware to real FS | StdFsBackend |
| Container-like isolation | Overlay<SqliteBackend, MemoryBackend> |
| Template with modifications | Overlay<Base, Upper> |
| WASM/Browser | MemoryBackend or SqliteBackend |
Platform Compatibility
| Backend | Windows | Linux | macOS | WASM |
|---|---|---|---|---|
| MemoryBackend | ✅ | ✅ | ✅ | ✅ |
| SqliteBackend | ✅ | ✅ | ✅ | ✅* |
| IndexedBackend | ✅ | ✅ | ✅ | ❌ |
| StdFsBackend | ✅ | ✅ | ✅ | ❌ |
| VRootFsBackend | ✅** | ✅ | ✅ | ❌ |
| Overlay† | ✅ | ✅ | ✅ | Varies |
* Requires wasm32-compatible SQLite build
** Windows symlinks require elevated privileges or Developer Mode
†Overlay is middleware; platform support depends on the backends composed
Common Mistakes to Avoid
| ❌ Mistake | ✅ Instead |
|---|---|
| Using StdFsBackend with user-provided paths | Use VRootFsBackend - it prevents `../../etc/passwd` attacks |
| Using MemoryBackend for data that must survive restart | Use SqliteBackend for persistence, or call `save_to()` to serialize |
| Expecting identical symlink behavior across platforms with VRootFsBackend | Use MemoryBackend or SqliteBackend for consistent cross-platform symlinks |
| Using Overlay when a simple backend would suffice | Keep it simple - use Overlay only when you need true layered semantics |
PathResolver: The Simple Explanation
What Problem Does It Solve?
Imagine you’re giving someone directions to a room in a building:
“Go to the office, then into the storage closet, then back out, then into the conference room.”
That’s a lot of steps! A smart person would simplify it:
“Just go to the conference room.”
PathResolver does exactly this for file paths.
The Problem: Messy Paths
When programs work with files, they often create messy paths like:
/home/user/../user/./documents/../documents/report.txt
This path says:
- Go to `/home/user`
- Go back up (`..`)
- Go to `user` again
- Stay here (`.`)
- Go to `documents`
- Go back up (`..`)
- Go to `documents` again
- Finally, `report.txt`
That’s exhausting! The simple answer is just:
/home/user/documents/report.txt
PathResolver’s job is to figure out the simple answer.
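The purely textual part of that job (collapsing `.` and `..`) can be sketched with the standard library alone. A real PathResolver must also consult the backend to follow symlinks; this sketch deliberately omits that.

```rust
use std::path::{Component, Path, PathBuf};

// Purely lexical normalization: collapse "." and ".." without touching storage.
fn normalize_lexically(path: &Path) -> PathBuf {
    let mut out = PathBuf::from("/");
    for comp in path.components() {
        match comp {
            Component::Normal(name) => out.push(name), // ordinary segment
            Component::ParentDir => { out.pop(); }     // ".." goes up one level
            _ => {}                                    // "/", ".", prefixes add nothing
        }
    }
    out
}

fn main() {
    let messy = Path::new("/home/user/../user/./documents/../documents/report.txt");
    assert_eq!(
        normalize_lexically(messy),
        PathBuf::from("/home/user/documents/report.txt")
    );
}
```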
Why Can’t the Backend Just Do This?
Good question! Here’s why we separated it:
1. Different Backends, Same Logic
Think of backends like different types of filing cabinets:
- MemoryBackend = Files in your brain (RAM)
- anyfs-sqlite: SqliteBackend = Files in a database (ecosystem crate)
- VRootFsBackend = Files on your hard drive
The path simplification logic is the same for all of them:
- `..` means "go up one level"
- `.` means "stay here"
- Symlinks mean "actually go over there instead"
Why write this logic three times? Write it once, use it everywhere.
2. We Can Test It Alone
If path resolution is buried inside each backend, testing is hard:
❌ To test path resolution, you need:
- A real backend
- Real files
- Complex setup
With PathResolver separated:
✅ To test path resolution, you need:
- Just the resolver
- Simple inputs and outputs
- No files required!
3. We Can Benchmark It
“Is our path resolution fast enough?”
If it’s mixed with everything else, you can’t measure it. Separated, you can:
#![allow(unused)]
fn main() {
// Easy to benchmark!
let resolver = IterativeResolver::new();
benchmark(|| resolver.canonicalize("/a/b/../c", &mock_fs));
}
4. We Can Swap It
Different situations need different approaches:
| Resolver | Best For |
|---|---|
| `IterativeResolver` | General use (walks path step by step) |
| `CachingResolver` | Repeated paths (remembers answers) |
| `NoOpResolver` | Real filesystem (OS already handles it) |
With separation, switching is one line:
#![allow(unused)]
fn main() {
use anyfs::resolvers::{CachingResolver, IterativeResolver};
// Default
let fs = FileStorage::new(backend);
// With caching (for performance)
let fs = FileStorage::with_resolver(
backend,
CachingResolver::new(IterativeResolver::default())
);
}
The Analogy: GPS Navigation
Think of PathResolver like a GPS system separate from your car:
| Component | In AnyFS | In a Car |
|---|---|---|
| Storage | Backend (MemoryBackend, SqliteBackend) | The roads themselves |
| Navigation | PathResolver | GPS device |
| Interface | FileStorage | Dashboard |
Why is GPS a separate device?
- ✅ You can upgrade the GPS without changing the car
- ✅ You can test the GPS in a simulator
- ✅ Different GPS apps can work in the same car
- ✅ The car maker doesn’t need to be a GPS expert
Same reasons we separated PathResolver!
What PathResolver Actually Does
Input: /home/user/../admin/./config.txt
↓
[PathResolver]
↓
Output: /home/admin/config.txt
Step by step:
- `/home/user` → go to user's home
- `..` → go back up to `/home`
- `admin` → go into admin folder
- `.` → stay here (ignore)
- `config.txt` → the file!
Result: /home/admin/config.txt
Symlinks Make It Interesting
Symlinks are like shortcuts. If /home/admin is actually a symlink pointing to /users/administrator, the resolver follows it:
Input: /home/admin/config.txt
(but /home/admin → /users/administrator)
↓
[PathResolver]
↓
Output: /users/administrator/config.txt
The Two Main Methods
canonicalize() - Strict Mode
“Give me the real, final path. Everything must exist.”
#![allow(unused)]
fn main() {
resolver.canonicalize("/a/b/../c/file.txt", &fs)
// Returns: /a/c/file.txt (if it exists)
// Error: if any part doesn't exist
}
soft_canonicalize() - Relaxed Mode
“Resolve what you can, but the last part doesn’t need to exist yet.”
#![allow(unused)]
fn main() {
resolver.soft_canonicalize("/a/b/../c/new_file.txt", &fs)
// Returns: /a/c/new_file.txt (even if new_file.txt doesn't exist)
// Error: only if /a/c doesn't exist
}
This is useful for creating new files—you need to know WHERE to create them, but they don’t exist yet!
Summary: Why Separate?
| Benefit | Explanation |
|---|---|
| Testable | Test path logic without touching real files |
| Benchmarkable | Measure performance in isolation |
| Swappable | Different resolvers for different needs |
| Maintainable | One place to fix bugs, benefits all backends |
| Understandable | Each piece has one job |
In Code
#![allow(unused)]
fn main() {
// The trait (the "job description")
// Only canonicalize() is required - soft_canonicalize has a default implementation
pub trait PathResolver: Send + Sync {
fn canonicalize(&self, path: &Path, fs: &dyn Fs) -> Result<PathBuf, FsError>;
// Default: canonicalize parent, append final component
// Handles edge cases (root path, empty parent)
fn soft_canonicalize(&self, path: &Path, fs: &dyn Fs) -> Result<PathBuf, FsError> {
match path.parent() {
Some(parent) if !parent.as_os_str().is_empty() => {
let canonical_parent = self.canonicalize(parent, fs)?;
match path.file_name() {
Some(name) => Ok(canonical_parent.join(name)),
None => Ok(canonical_parent),
}
}
_ => self.canonicalize(path, fs), // Root or single component
}
}
}
// For symlink-aware resolution (when backend implements FsLink):
pub trait PathResolverWithLinks: PathResolver {
fn canonicalize_following_links(&self, path: &Path, fs: &dyn FsLink) -> Result<PathBuf, FsError>;
// soft_canonicalize_following_links also has a default that delegates
}
// FileStorage uses it (boxed for flexibility)
pub struct FileStorage<B> {
backend: B, // Where files live
resolver: Box<dyn PathResolver>, // How to simplify paths (boxed: cold path)
}
}
That’s it! PathResolver answers one question: “What’s the real, simple path?”
The soft_canonicalize variant is just a convenience - it reuses canonicalize internally but allows creating new files.
Everything else—reading files, writing files, listing directories—is the backend’s job.
Which Crate Should I Use?
Decision Guide
| You want to… | Use |
|---|---|
| Build an application | anyfs |
| Use built-in backends (Memory, StdFs, VRootFs) | anyfs |
| Use built-in middleware (Quota, PathFilter, etc.) | anyfs |
| Use SQLite or IndexedBackend | anyfs-sqlite / anyfs-indexed |
| Implement a custom backend | anyfs-backend only |
| Implement custom middleware | anyfs-backend only |
Quick Examples
Simple usage
#![allow(unused)]
fn main() {
use anyfs::MemoryBackend;
use anyfs::FileStorage;
let fs = FileStorage::new(MemoryBackend::new());
fs.create_dir_all("/data")?;
fs.write("/data/file.txt", b"hello")?;
}
With middleware (quotas, sandboxing, tracing)
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, RestrictionsLayer, PathFilterLayer, TracingLayer};
use anyfs::FileStorage;
let stack = MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024)
.build())
.layer(PathFilterLayer::builder()
.allow("/workspace/**")
.deny("**/.env")
.build())
.layer(TracingLayer::new());
let fs = FileStorage::new(stack);
}
Custom backend implementation
#![allow(unused)]
fn main() {
use anyfs_backend::{FsRead, FsWrite, FsDir, FsError, Metadata, DirEntry};
use std::path::Path;
pub struct MyBackend;
// Implement the three core traits - Fs is auto-implemented via blanket impl
impl FsRead for MyBackend {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
todo!()
}
// ... 5 more FsRead methods
}
impl FsWrite for MyBackend {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
todo!()
}
// ... 6 more FsWrite methods
}
impl FsDir for MyBackend {
// ... 5 FsDir methods
}
// Total: 18 methods across FsRead + FsWrite + FsDir
}
Custom middleware implementation
#![allow(unused)]
fn main() {
use anyfs_backend::{Fs, Layer, FsError};
use std::path::Path;
pub struct MyMiddleware<B: Fs> {
inner: B,
}
impl<B: Fs> Fs for MyMiddleware<B> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
// Intercept, transform, or delegate
self.inner.read(path)
}
// ... implement all methods
}
pub struct MyMiddlewareLayer;
impl<B: Fs> Layer<B> for MyMiddlewareLayer {
type Backend = MyMiddleware<B>;
fn layer(self, backend: B) -> Self::Backend {
MyMiddleware { inner: backend }
}
}
}
Common Mistakes
- Don't depend on `anyfs` if you're only implementing a backend or middleware. Use `anyfs-backend`.
- Don't put policy in backends. Use middleware (Quota, PathFilter, etc.).
- Don’t put policy in FileStorage. It is an ergonomic wrapper with centralized path resolution, not a policy layer.
Consumer Documentation Planning
This document specifies what the Context7-style consumer documentation should contain when the AnyFS library is implemented. This is a planning/specification document, not actual API documentation.
Purpose
When AnyFS is implemented, we need a Context7-style reference document that LLMs can use to correctly consume the AnyFS API. This document specifies what that reference should contain.
Why Context7-style?
- LLMs need quick decision trees to select the right components
- Copy-paste-ready patterns reduce hallucination
- Common mistakes section prevents known pitfalls
- Trait hierarchy helps understand what to implement
Required Sections
The consumer documentation MUST include these sections:
1. Quick Decision Trees
Decision trees help LLMs quickly navigate to the right component. Include:
| Decision Tree | Purpose |
|---|---|
| Which Crate? | anyfs-backend vs anyfs |
| Which Backend? | Memory, SQLite, VRootFs, etc. |
| Which Middleware? | Quota, PathFilter, ReadOnly, etc. |
| Which Trait Level? | Fs, FsFull, FsFuse, FsPosix |
Format: ASCII tree diagrams with terminal answers.
Example structure (to be filled with actual API when implemented):
Is data persistence required?
├─ NO → MemoryBackend
└─ YES → Is encryption needed?
├─ YES → SqliteBackend with `encryption` feature
└─ NO → [continue decision tree...]
2. Common Patterns
Provide copy-paste-ready code for these scenarios:
| Pattern | Description |
|---|---|
| Simple File Operations | read, write, delete, check existence |
| Directory Operations | create, list, remove |
| Sandboxed AI Agent | Full middleware stack example |
| Persistent Database | SqliteBackend setup |
| Type-Safe Wrappers | User-defined newtypes for compile-time safety |
| Streaming Large Files | open_read/open_write usage |
Requirements for each pattern:
- Complete, runnable code blocks
- All imports included
- Proper error handling (no `.unwrap()`)
- Minimal code that demonstrates the concept
3. Trait Hierarchy Diagram
Visual representation of the trait hierarchy:
FsPosix ← Full POSIX (handles, locks, xattr)
↑
FsFuse ← FUSE-mountable (+ inodes)
↑
FsFull ← std::fs features (+ links, permissions, sync, stats)
↑
Fs ← Basic filesystem (90% of use cases)
↑
FsRead + FsWrite + FsDir ← Core traits
With clear guidance: “Implement the lowest level you need. Higher levels include all below.”
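The "higher levels include all below" mechanism relies on a blanket impl, which can be sketched in miniature. The trait names follow the document, but the single-method signatures and `MiniBackend` are simplifications invented for this sketch; the real traits return `Result<_, FsError>` and have many more methods.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

// Simplified stand-ins for the three core traits.
trait FsRead { fn read(&self, path: &str) -> Vec<u8>; }
trait FsWrite { fn write(&self, path: &str, data: &[u8]); }
trait FsDir { fn list(&self) -> Vec<String>; }

// `Fs` is the sum of the three; the blanket impl makes every type that
// implements all three automatically an `Fs` with no extra code.
trait Fs: FsRead + FsWrite + FsDir {}
impl<T: FsRead + FsWrite + FsDir> Fs for T {}

// A toy backend: implement the three core traits, get `Fs` for free.
// All methods take `&self` (interior mutability via Mutex), per the contract.
#[derive(Default)]
struct MiniBackend { files: Mutex<HashMap<String, Vec<u8>>> }

impl FsRead for MiniBackend {
    fn read(&self, path: &str) -> Vec<u8> {
        self.files.lock().expect("lock").get(path).cloned().unwrap_or_default()
    }
}
impl FsWrite for MiniBackend {
    fn write(&self, path: &str, data: &[u8]) {
        self.files.lock().expect("lock").insert(path.to_string(), data.to_vec());
    }
}
impl FsDir for MiniBackend {
    fn list(&self) -> Vec<String> {
        self.files.lock().expect("lock").keys().cloned().collect()
    }
}

// Generic code can now demand the composite trait.
fn roundtrip<B: Fs>(fs: &B) -> Vec<u8> {
    fs.write("/x", b"hi");
    fs.read("/x")
}

fn main() {
    let backend = MiniBackend::default();
    assert_eq!(roundtrip(&backend), b"hi".to_vec());
}
```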
4. Backend Implementation Pattern
Template for implementing custom backends. The consumer docs should include:
| Level | Traits to Implement | Result |
|---|---|---|
| Minimum | FsRead + FsWrite + FsDir | Fs |
| Extended | Add FsLink, FsPermissions, FsSync, FsStats | FsFull |
| FUSE | Add FsInode | FsFuse |
| POSIX | Add FsHandles, FsLock, FsXattr | FsPosix |
Each level should have a complete template showing all required method signatures.
5. Middleware Implementation Pattern
Template showing:
- How to wrap an inner backend with a generic type parameter
- Which methods to intercept vs delegate
- The `Layer` trait for `.layer()` syntax
- Common middleware patterns table:
| Pattern | Intercept | Delegate | Example |
|---|---|---|---|
| Logging | All (before/after) | All | Tracing |
| Block writes | Write methods → error | Read methods | ReadOnly |
| Transform data | read/write | Everything else | Encryption |
| Check access | All (before) | All | PathFilter |
| Enforce limits | Write methods (check size) | Read methods | Quota |
6. Adapter Patterns
Templates for interoperability:
| Adapter Type | Description |
|---|---|
| FROM external | Wrap external crate’s filesystem as AnyFS backend |
| TO external | Wrap AnyFS backend to satisfy external crate’s trait |
7. Error Handling Reference
All FsError variants with when to use each:
| Variant | When to Return |
|---|---|
| `NotFound` | Path doesn't exist |
| `AlreadyExists` | Path already exists (create conflict) |
| `NotAFile` | Expected file, got directory |
| `NotADirectory` | Expected directory, got file |
| `DirectoryNotEmpty` | Can't remove non-empty directory |
| `ReadOnly` | Write blocked by ReadOnly middleware |
| `AccessDenied` | Blocked by PathFilter or permissions |
| `QuotaExceeded` | Size/count limit exceeded |
| `NotSupported` | Backend doesn't support this operation |
| `Backend` | Backend-specific error |
8. Common Mistakes & Fixes
| Mistake | Fix |
|---|---|
| Using `unwrap()` | Always use `?` or handle `FsError` |
| Assuming paths normalized | Use canonicalize() first |
| Forgetting parent dirs | Use create_dir_all |
| Holding handles too long | Drop promptly |
| Mixing backend types | Use FileStorage::boxed() |
| Testing with real files | Use MemoryBackend |
Document Structure
When creating the actual consumer documentation, follow this structure:
# AnyFS Implementation Patterns
## Quick Decision Trees
### Which Crate Do I Need?
### Which Backend Should I Use?
### Do I Need Middleware?
### Which Trait Level?
## Common Patterns
### Simple File Operations
### Directory Operations
### Sandboxed AI Agent
### Persistent Database
### Type-Safe Wrapper Types
## Trait Hierarchy (Pick Your Level)
## Pattern 1: Implement a Backend
### Minimum: Implement Fs
### Add Links/Permissions: Implement FsFull
### Add FUSE Support: Implement FsFuse
## Pattern 2: Implement Middleware
### Template
### Common Middleware Patterns
## Pattern 3: Implement an Adapter
### Adapter FROM another crate
### Adapter TO another crate
## Error Handling Reference
## Common Mistakes & Fixes
## Quick Reference: What to Implement
Creation Guidelines
When creating the actual consumer documentation after implementation:
- Use actual tested code - Every example must compile and run
- Include all imports - LLMs need complete context
- Show error handling - Never use `.unwrap()` in examples
- Keep examples minimal - Shortest code that demonstrates the pattern
- Update with API changes - This doc must stay in sync with implementation
- Validate against real usage - Test each pattern before including it
Quality Checklist
Before publishing the consumer documentation:
- All code examples compile
- All code examples run without panics
- Decision trees lead to correct answers
- Error variants match actual `FsError` enum
- Trait hierarchy matches actual trait definitions
- Common mistakes reflect actual issues found in testing
Related Documents
| Document | Purpose |
|---|---|
| LLM Development Methodology | For implementers: how to structure code for LLM development |
| This document | Specification for consumer documentation |
| Backend Guide | Design for backend implementation |
| Middleware Tutorial | Design for middleware creation |
Tracking
This planning document should be replaced with actual consumer documentation when:
- AnyFS is implemented - The crates exist and compile
- API is stable - No major breaking changes expected
- Examples are tested - All patterns verified working
GitHub Issue: Create Context7-style consumer documentation
- Status: Blocked by AnyFS implementation
- Template: This planning document
LLM-Optimized Development Methodology
Purpose: This document defines the methodology for structuring AnyFS code so that each component is independently testable, reviewable, replaceable, and fixable—by both humans and LLMs—without requiring full project context.
Core Principle: Context-Independent Components
Every component in AnyFS should be understandable and modifiable with only local context. An LLM (or human contributor) should be able to:
- Understand a component by reading only its file + trait definition
- Test a component in isolation without the rest of the system
- Fix a bug by looking at only the failing component + error message
- Review changes without understanding the entire architecture
- Replace a component with an alternative implementation
This is achieved through strict separation of concerns, clear contracts (traits), and self-documenting structure.
The Five Pillars
1. Single Responsibility per File
Each file implements exactly one concept:
| File | Implements | Dependencies |
|---|---|---|
| `fs_read.rs` | `FsRead` trait | `FsError`, `Metadata` |
| `quota.rs` | `Quota<B>` middleware | `Fs` trait |
| `memory.rs` | `MemoryBackend` | `Fs`, `FsLink`, etc. |
| `iterative.rs` | `IterativeResolver` | `PathResolver` trait |
Why: An LLM can be given just the file + its dependencies. No need for “the big picture.”
2. Contract-First Design (Traits as Contracts)
Every component implements a well-defined trait. The trait IS the specification:
#![allow(unused)]
fn main() {
/// Read operations for a virtual filesystem.
///
/// # Contract
/// - All methods use `&self` (interior mutability)
/// - Thread-safe: `Send + Sync` required
/// - Errors are always `FsError`, never panic
///
/// # Implementor Checklist
/// - [ ] Handle non-existent paths with `FsError::NotFound`
/// - [ ] Handle non-UTF8 content in `read_to_string` with `FsError::InvalidData`
/// - [ ] `metadata()` follows symlinks; use `symlink_metadata()` for link info
pub trait FsRead: Send + Sync {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
// ...
}
}
LLM Instruction: “Implement FsRead for MyBackend. Follow the contract in the trait doc.”
3. Isolated Testing (No Integration Dependencies)
Each component has tests that run without external dependencies:
#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
use super::*;
// Mock only what's needed
struct MockFs {
files: HashMap<PathBuf, Vec<u8>>,
}
#[test]
fn quota_rejects_oversized_write() {
let mock = MockFs::new();
let quota = mock.layer(QuotaLayer::builder()
.max_file_size(100)
.build());
let result = quota.write(Path::new("/big.txt"), &[0u8; 200]);
assert!(matches!(result, Err(FsError::FileSizeExceeded { .. })));
}
}
}
Why: LLM can run tests for just the component being fixed. No database, no filesystem, no network.
4. Error Messages as Documentation
Errors must contain enough context to fix the problem without reading other code:
#![allow(unused)]
fn main() {
// ❌ Bad: Requires context to understand
Err(FsError::NotFound { path: path.to_path_buf() })
// ✅ Good: Self-explanatory
Err(FsError::NotFound {
path: path.to_path_buf(),
operation: "read",
context: "file does not exist or is a directory".into(),
})
}
LLM Instruction: “The error says ‘quota exceeded: limit 100MB, requested 150MB, usage 80MB’. Fix the code that’s writing 150MB.”
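As a dependency-free illustration of this principle, an error can carry the numbers needed to produce exactly such a message. The variant and field names below are hypothetical, not the real FsError definition:

```rust
use std::fmt;

// Illustrative error variant carrying enough context to act on.
#[derive(Debug)]
enum QuotaError {
    Exceeded { limit_mb: u64, requested_mb: u64, usage_mb: u64 },
}

impl fmt::Display for QuotaError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let QuotaError::Exceeded { limit_mb, requested_mb, usage_mb } = self;
        // The message is self-explanatory: limit, request, and current usage.
        write!(
            f,
            "quota exceeded: limit {}MB, requested {}MB, usage {}MB",
            limit_mb, requested_mb, usage_mb
        )
    }
}
```

Anyone reading the rendered message can diagnose the problem without opening the quota middleware's source.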
5. Documentation at Every Boundary
Every public item has documentation explaining:
- What it does (one line)
- When to use it (use case)
- How to use it (example)
- Why it exists (rationale, if non-obvious)
#![allow(unused)]
fn main() {
/// Path resolution strategy using iterative component-by-component traversal.
///
/// # When to Use
/// - Default resolver for virtual backends (MemoryBackend, SqliteBackend)
/// - When you need standard POSIX-like symlink resolution
///
/// # Example
/// ```rust
/// let resolver = IterativeResolver::new();
/// let canonical = resolver.canonicalize(Path::new("/a/b/../c"), &fs)?;
/// ```
///
/// # Performance
/// O(n) where n = number of path components. For deep paths with many symlinks,
/// consider `CachingResolver` wrapper.
pub struct IterativeResolver { /* ... */ }
}
File Structure Convention
Every implementation file follows this structure:
#![allow(unused)]
fn main() {
//! # Component Name
//!
//! Brief description of what this component does.
//!
//! ## Responsibility
//! - Single bullet point describing THE responsibility
//!
//! ## Dependencies
//! - List of traits/types this depends on
//!
//! ## Usage
//! ```rust
//! // Minimal working example
//! ```
use crate::{...}; // Minimal imports
// ============================================================================
// Types
// ============================================================================
/// Primary type for this component.
pub struct ComponentName { /* ... */ }
// ============================================================================
// Trait Implementations
// ============================================================================
impl SomeTrait for ComponentName {
// Implementation
}
// ============================================================================
// Public API
// ============================================================================
impl ComponentName {
/// Constructor with sensible defaults.
pub fn new() -> Self { /* ... */ }
/// Builder-style configuration.
pub fn with_option(self, value: T) -> Self { /* ... */ }
}
// ============================================================================
// Private Helpers
// ============================================================================
impl ComponentName {
fn internal_helper(&self) { /* ... */ }
}
// ============================================================================
// Tests
// ============================================================================
#[cfg(test)]
mod tests {
use super::*;
// Tests that verify the contract
}
}
LLM Prompting Patterns
Pattern 1: Implement a Component
Implement `CachingResolver` in `anyfs/src/resolvers/caching.rs`.
Contract: Implement `PathResolver` trait (see anyfs-backend/src/path_resolver.rs).
Requirements:
- Wrap another resolver with LRU cache
- Cache resolved canonical paths keyed by input path
- Bounded cache size (configurable max entries)
Test: Write a test verifying cache hit returns same result as cache miss.
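A minimal sketch of the caching idea behind Pattern 1, assuming nothing about the real PathResolver trait (the BoundedCache type and resolve_with helper are illustrative only):

```rust
use std::collections::HashMap;

// Bounded cache in front of an arbitrary resolve function.
struct BoundedCache {
    map: HashMap<String, String>,
    max_entries: usize,
}

impl BoundedCache {
    fn new(max_entries: usize) -> Self {
        BoundedCache { map: HashMap::new(), max_entries }
    }

    // Returns the cached result on a hit, otherwise computes and stores it.
    fn resolve_with<F: Fn(&str) -> String>(&mut self, input: &str, resolve: F) -> String {
        if let Some(hit) = self.map.get(input) {
            return hit.clone();
        }
        let out = resolve(input);
        // Crude bound: clear everything when full (a real LRU would evict one entry).
        if self.map.len() >= self.max_entries {
            self.map.clear();
        }
        self.map.insert(input.to_string(), out.clone());
        out
    }
}
```

The test the prompt asks for falls out directly: resolve the same path twice and assert the hit equals the miss.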
Pattern 2: Fix a Bug
Bug: `Quota<B>` doesn't account for existing file size when checking write limits.
File: src/middleware/quota.rs
Error: QuotaExceeded when writing 50 bytes to a 30-byte file with 100-byte limit.
Expected: Should succeed (30 + 50 = 80 < 100).
Fix the `check_write_quota` method.
Pattern 3: Add a Feature
Add `max_path_depth` limit to `Quota<B>` middleware.
File: src/middleware/quota.rs
Contract: Reject operations that would create paths deeper than the limit.
Example:
```rust
let fs = backend.layer(QuotaLayer::builder()
.max_path_depth(5)
.build());
fs.create_dir_all("/a/b/c/d/e/f")?; // Err: depth 6 > limit 5
```

### Pattern 4: Review a Change
Review this change to IterativeResolver:
- Does it maintain the PathResolver contract?
- Are edge cases handled (empty path, root path, circular symlinks)?
- Are error messages informative?
- Are tests sufficient?
[diff]
---
## Component Isolation Checklist
Before considering a component complete, verify:
- [ ] **Single file** - Component lives in one file (or one module with mod.rs)
- [ ] **Clear contract** - Implements a trait with documented invariants
- [ ] **Minimal dependencies** - Only depends on traits/types, not other implementations
- [ ] **Self-contained tests** - Tests use mocks, not real backends
- [ ] **Informative errors** - Error messages explain what went wrong and how to fix
- [ ] **Usage example** - Doc comment shows how to use in isolation
- [ ] **No global state** - All state is in the struct instance
- [ ] **Thread-safe** - `Send + Sync` where required
- [ ] **Documented edge cases** - What happens with empty input, None, errors?
---
## Open Source Contribution Benefits
This methodology directly enables:
| Benefit | How |
| --------------------------- | --------------------------------------------------------------- |
| **First-time contributors** | Can understand one component without reading the whole codebase |
| **Focused PRs** | Changes stay in one file, easy to review |
| **Parallel development** | Multiple contributors work on different components |
| **Quick onboarding** | Read the trait, implement the trait, done |
| **CI efficiency** | Test just the changed component |
---
## Anti-Patterns to Avoid
### ❌ Spaghetti Dependencies
```rust
// Bad: Middleware knows about specific backends
impl<B: Fs> Quota<B> {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
if let Some(sqlite) = self.inner.downcast_ref::<SqliteBackend>() {
// Special case for SQLite
}
}
}
```

### ❌ Hidden Context Requirements
#![allow(unused)]
fn main() {
// Bad: Requires knowing about global configuration
impl FsRead for MyBackend {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
let config = CONFIG.lock().unwrap(); // Hidden global!
// ...
}
}
}
### ❌ Tests That Require Setup
#![allow(unused)]
fn main() {
// Bad: Requires database, filesystem, network
#[test]
fn test_vrootfs_backend() {
let backend = VRootFsBackend::new("/tmp/test").unwrap(); // Creates real files!
// ...
}
}
### ❌ Vague Errors
#![allow(unused)]
fn main() {
// Bad: No context
Err(FsError::Backend("operation failed".into()))
}
Integration with Context7-style Documentation
When the project is complete, we will provide a consumer-facing LLM context document that:
- Explains the surface API (what to import, what to call)
- Provides decision trees (which backend? which middleware?)
- Shows complete, runnable examples
- Lists common mistakes and how to avoid them
This is separate from AGENTS.md (for contributors) and lives in Implementation Patterns.
Summary
| Principle | Implementation |
|---|---|
| Isolated | One file, one concept |
| Contracted | Traits define the spec |
| Testable | Mock-based unit tests |
| Debuggable | Rich error context |
| Documented | Examples at every boundary |
| LLM-Ready | Promptable patterns for common tasks |
By following this methodology, AnyFS becomes a codebase where any component can be understood, tested, fixed, or replaced by an LLM (or human) with only local context. This is the foundation for sustainable AI-assisted development.
Cross-Platform Virtual Drive Mounting
Mounting AnyFS backends as real filesystem mount points
Overview
AnyFS backends implementing FsFuse can be mounted as real filesystem drives that any application can access. This is part of the anyfs crate (behind feature flags: fuse for Linux/macOS, winfsp for Windows) because mounting is a core promise of AnyFS, not an optional extra.
Product Promise
Mounting is a core AnyFS promise: make filesystem composition easy, safe, and genuinely enjoyable for programmers. The mount API prioritizes:
- Easy onboarding (one handle, one builder, minimal boilerplate)
- Safe defaults (explicit read-only modes, clear errors, no hidden behavior)
- Delightful DX (predictable behavior, fast feedback, good docs)
Roadmap (MVP to Cross-Platform)
Phase 0: Design and API shape (complete)
- API spec defines MountHandle, MountBuilder, MountOptions, MountError
- Platform detection hooks (is_available) and consistent error mapping
- Examples and docs anchored in this guide

Acceptance: Spec review complete; API signatures consistent across docs; error mapping defined.
Phase 1: Linux FUSE MVP (read-only, pending)
- fuser adapter for lookup/getattr/readdir/read
- Read-only mount option; write ops return PermissionDenied

Acceptance: Mount/unmount works on Linux; read-only operations pass smoke tests; unmount-on-drop is reliable.
Phase 2: Linux FUSE read/write (pending)
- Full write path: create, write, rename, remove, link operations
- Capability reporting and correct metadata mapping

Acceptance: Conformance tests pass for FsFuse path/inode behavior; no panics; clean shutdown.
Phase 3: macOS parity (macFUSE, pending)
- Port the Linux FUSE adapter to macFUSE requirements
- Driver detection and install guidance

Acceptance: Mount/unmount works on macOS with core read/write flows.
Phase 4: Windows support (WinFsp, optional Dokan, pending)
- WinFsp adapter with required mapping for Windows semantics
- Optional Dokan path as an alternative provider

Acceptance: Mount/unmount works on Windows; driver detection errors are clear and actionable.
Non-goals
- Kernel drivers or kernel-space code
- WASM or browser environments
- Network filesystem protocols (NFS/SMB)
Platform Technologies
| Platform | Technology | Rust Crate | User Installation |
|---|---|---|---|
| Linux | FUSE | fuser | Usually pre-installed |
| macOS | macFUSE | fuser | macFUSE |
| Windows | WinFsp | winfsp | WinFsp |
| Windows | Dokan | dokan | Dokan |
Key insight: Linux and macOS both use FUSE (via fuser crate), but Windows requires a completely different API (WinFsp or Dokan).
Architecture
Unified Mount Trait
#![allow(unused)]
fn main() {
/// Platform-agnostic mount handle.
/// Drop to unmount.
pub struct MountHandle {
inner: Box<dyn MountHandleInner>,
}
impl MountHandle {
/// Mount a backend at the specified path.
///
/// Platform requirements:
/// - Linux: FUSE (usually available)
/// - macOS: macFUSE must be installed
/// - Windows: WinFsp or Dokan must be installed
pub fn mount<B: FsFuse>(backend: B, path: impl AsRef<Path>) -> Result<Self, MountError> {
#[cfg(unix)]
return fuse_mount(backend, path);
#[cfg(windows)]
return winfsp_mount(backend, path);
#[cfg(not(any(unix, windows)))]
return Err(MountError::PlatformNotSupported);
}
/// Check if mounting is available on this platform.
pub fn is_available() -> bool {
#[cfg(target_os = "linux")]
return check_fuse_available();
#[cfg(target_os = "macos")]
return check_macfuse_available();
#[cfg(windows)]
return check_winfsp_available() || check_dokan_available();
#[cfg(not(any(unix, windows)))]
return false;
}
/// Unmount the filesystem.
pub fn unmount(self) -> Result<(), MountError> {
self.inner.unmount()
}
}
impl Drop for MountHandle {
fn drop(&mut self) {
let _ = self.inner.unmount();
}
}
}
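The unmount-on-drop guarantee above can be illustrated with a dependency-free stand-in, where a shared flag plays the role of the mounted state (MockMount is hypothetical, not part of the API):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// RAII handle: "mounting" sets the flag, dropping clears it.
struct MockMount {
    mounted: Arc<AtomicBool>,
}

impl MockMount {
    fn mount(flag: Arc<AtomicBool>) -> Self {
        flag.store(true, Ordering::SeqCst);
        MockMount { mounted: flag }
    }
}

impl Drop for MockMount {
    fn drop(&mut self) {
        // Mirror of MountHandle::drop: best-effort unmount on scope exit.
        self.mounted.store(false, Ordering::SeqCst);
    }
}
```

This is the same RAII shape MountHandle uses: holding the handle keeps the mount alive; letting it go out of scope tears the mount down.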
Platform Adapters
┌─────────────────────────────────────────────────────────────┐
│ MountHandle (unified API) │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ │
│ │ FuseAdapter │ │ FuseAdapter │ │ WinFspAdapter │ │
│ │ (Linux) │ │ (macOS) │ │ (Windows) │ │
│ └──────┬──────┘ └──────┬──────┘ └──────────┬──────────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────┐ ┌─────────┐ ┌──────────┐ │
│ │ fuser │ │ fuser │ │ winfsp │ │
│ │ crate │ │ crate │ │ crate │ │
│ └────┬────┘ └────┬────┘ └────┬─────┘ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌─────────┐ ┌─────────┐ ┌──────────┐ │
│ │ FUSE │ │ macFUSE │ │ WinFsp │ │
│ │ (kernel)│ │ (kext) │ │ (driver) │ │
│ └─────────┘ └─────────┘ └──────────┘ │
│ │
└─────────────────────────────────────────────────────────────┘
Module Structure
Mounting is part of the anyfs crate:
anyfs/
src/
mount/
mod.rs # MountHandle, MountError, re-exports
error.rs # MountError definitions
handle.rs # MountHandle, MountOptions, builder
unix/
mod.rs # cfg(unix)
fuse_adapter.rs # FUSE implementation via fuser
windows/
mod.rs # cfg(windows)
winfsp_adapter.rs # WinFsp implementation
Feature Flags in anyfs Cargo.toml
[package]
name = "anyfs"
version = "0.1.0"
[dependencies]
anyfs-backend = { version = "0.1" }
[target.'cfg(unix)'.dependencies]
fuser = { version = "0.14", optional = true }
[target.'cfg(windows)'.dependencies]
winfsp = { version = "0.4", optional = true }
[features]
default = []
fuse = ["dep:fuser"] # Enable mounting on Linux/macOS
winfsp = ["dep:winfsp"] # Enable mounting on Windows
FUSE Adapter (Linux/macOS)
The FUSE adapter translates between fuser::Filesystem trait and our FsFuse trait:
#![allow(unused)]
fn main() {
use fuser::{Filesystem, Request, ReplyEntry, ReplyAttr, ReplyData, ReplyDirectory};
use anyfs_backend::{FsFuse, FsError, Metadata, FileType};
pub struct FuseAdapter<B: FsFuse> {
backend: B,
}
impl<B: FsFuse> Filesystem for FuseAdapter<B> {
fn lookup(&mut self, _req: &Request, parent: u64, name: &OsStr, reply: ReplyEntry) {
match self.backend.lookup(parent, name) {
Ok(inode) => {
match self.backend.metadata_by_inode(inode) {
Ok(meta) => reply.entry(&TTL, &to_fuse_attr(&meta), 0),
Err(e) => reply.error(to_errno(&e)),
}
}
Err(e) => reply.error(to_errno(&e)),
}
}
fn getattr(&mut self, _req: &Request, ino: u64, reply: ReplyAttr) {
match self.backend.metadata_by_inode(ino) {
Ok(meta) => reply.attr(&TTL, &to_fuse_attr(&meta)),
Err(e) => reply.error(to_errno(&e)),
}
}
fn read(&mut self, _req: &Request, ino: u64, _fh: u64, offset: i64, size: u32, _flags: i32, _lock: Option<u64>, reply: ReplyData) {
let path = match self.backend.inode_to_path(ino) {
Ok(p) => p,
Err(e) => return reply.error(to_errno(&e)),
};
match self.backend.read_range(&path, offset as u64, size as usize) {
Ok(data) => reply.data(&data),
Err(e) => reply.error(to_errno(&e)),
}
}
fn readdir(&mut self, _req: &Request, ino: u64, _fh: u64, offset: i64, mut reply: ReplyDirectory) {
let path = match self.backend.inode_to_path(ino) {
Ok(p) => p,
Err(e) => return reply.error(to_errno(&e)),
};
match self.backend.read_dir(&path) {
Ok(entries) => {
for (i, entry) in entries.iter().enumerate().skip(offset as usize) {
let file_type = match entry.file_type {
FileType::File => fuser::FileType::RegularFile,
FileType::Directory => fuser::FileType::Directory,
FileType::Symlink => fuser::FileType::Symlink,
};
if reply.add(entry.inode, (i + 1) as i64, file_type, &entry.name) {
break;
}
}
reply.ok();
}
Err(e) => reply.error(to_errno(&e)),
}
}
// ... write, create, mkdir, unlink, rmdir, rename, symlink, etc.
}
fn to_errno(e: &FsError) -> i32 {
match e {
FsError::NotFound { .. } => libc::ENOENT,
FsError::AlreadyExists { .. } => libc::EEXIST,
FsError::NotADirectory { .. } => libc::ENOTDIR,
FsError::NotAFile { .. } => libc::EISDIR,
FsError::DirectoryNotEmpty { .. } => libc::ENOTEMPTY,
FsError::AccessDenied { .. } => libc::EACCES,
FsError::ReadOnly { .. } => libc::EROFS,
FsError::QuotaExceeded { .. } => libc::ENOSPC,
_ => libc::EIO,
}
}
}
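For a feel of the mapping without pulling in libc, here is a self-contained sketch using the numeric Linux errno values (the SketchError type is illustrative; real code should use the libc constants):

```rust
// Simplified error type standing in for FsError in this sketch.
#[derive(Debug)]
enum SketchError {
    NotFound,
    AlreadyExists,
    ReadOnly,
    Other,
}

// Linux errno values written out for illustration (libc provides these).
fn sketch_errno(e: &SketchError) -> i32 {
    match e {
        SketchError::NotFound => 2,       // ENOENT
        SketchError::AlreadyExists => 17, // EEXIST
        SketchError::ReadOnly => 30,      // EROFS
        SketchError::Other => 5,          // EIO
    }
}
```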
WinFsp Adapter (Windows)
WinFsp has a different API but similar concepts:
#![allow(unused)]
fn main() {
use winfsp::filesystem::{FileSystem, FileSystemContext, FileInfo, DirInfo};
use anyfs_backend::{FsFuse, FsError};
pub struct WinFspAdapter<B: FsFuse> {
backend: B,
}
impl<B: FsFuse> FileSystem for WinFspAdapter<B> {
fn get_file_info(&self, file_context: &FileContext) -> Result<FileInfo, NTSTATUS> {
let meta = self.backend.metadata(&file_context.path)
.map_err(to_ntstatus)?;
Ok(to_file_info(&meta))
}
fn read(&self, file_context: &FileContext, buffer: &mut [u8], offset: u64) -> Result<usize, NTSTATUS> {
let data = self.backend.read_range(&file_context.path, offset, buffer.len())
.map_err(to_ntstatus)?;
buffer[..data.len()].copy_from_slice(&data);
Ok(data.len())
}
fn read_directory(&self, file_context: &FileContext, marker: Option<&str>, callback: impl FnMut(DirInfo)) -> Result<(), NTSTATUS> {
let entries = self.backend.read_dir(&file_context.path)
.map_err(to_ntstatus)?;
for entry in entries {
// ReadDirIter yields Result<DirEntry, FsError>
let entry = entry.map_err(to_ntstatus)?;
callback(to_dir_info(&entry));
}
Ok(())
}
// ... write, create, delete, rename, etc.
}
fn to_ntstatus(e: FsError) -> NTSTATUS {
match e {
FsError::NotFound { .. } => STATUS_OBJECT_NAME_NOT_FOUND,
FsError::AlreadyExists { .. } => STATUS_OBJECT_NAME_COLLISION,
FsError::AccessDenied { .. } => STATUS_ACCESS_DENIED,
FsError::ReadOnly { .. } => STATUS_MEDIA_WRITE_PROTECTED,
_ => STATUS_INTERNAL_ERROR,
}
}
}
Usage
Basic Mount
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, MountHandle};
// Create backend with middleware
let backend = MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024)
.build());
// Mount as drive
let mount = MountHandle::mount(backend, "/mnt/ramdisk")?;
// Now /mnt/ramdisk is a real mount point
// Any application can read/write files there
// Unmount when done (or on drop)
mount.unmount()?;
}
Windows Drive Letter
#![allow(unused)]
fn main() {
#[cfg(windows)]
let mount = MountHandle::mount(backend, "X:")?;
// Now X: is a virtual drive
}
Check Availability
#![allow(unused)]
fn main() {
if MountHandle::is_available() {
let mount = MountHandle::mount(backend, path)?;
} else {
eprintln!("Mounting not available. Install:");
#[cfg(target_os = "macos")]
eprintln!(" - macFUSE: https://osxfuse.github.io/");
#[cfg(windows)]
eprintln!(" - WinFsp: https://winfsp.dev/");
}
}
Mount Options
#![allow(unused)]
fn main() {
let mount = MountHandle::builder(backend)
.mount_point("/mnt/data")
.read_only(true) // Force read-only mount
.allow_other(true) // Allow other users (Linux/macOS)
.auto_unmount(true) // Unmount on process exit
.uid(1000) // Override UID (Linux/macOS)
.gid(1000) // Override GID (Linux/macOS)
.mount()?;
}
Error Handling
#![allow(unused)]
fn main() {
pub enum MountError {
/// Platform doesn't support mounting (e.g., WASM)
PlatformNotSupported,
/// Required driver not installed (macFUSE, WinFsp)
DriverNotInstalled {
driver: &'static str,
install_url: &'static str,
},
/// Mount point doesn't exist or isn't accessible
InvalidMountPoint { path: PathBuf },
/// Mount point already in use
MountPointBusy { path: PathBuf },
/// Permission denied (need root/admin)
PermissionDenied,
/// Backend error during mount
Backend(FsError),
/// Platform-specific error
Platform(String),
/// Missing mount point in options
MissingMountPoint,
}
}
Integration with Middleware
All middleware works transparently when mounted:
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, PathFilterLayer, TracingLayer, RateLimitLayer, MountHandle};
// Build secure, audited, rate-limited mount
let backend = MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(1024 * 1024 * 1024) // 1 GB
.build())
.layer(PathFilterLayer::builder()
.deny("**/.git/**")
.deny("**/.env")
.build())
.layer(RateLimitLayer::builder()
.max_ops(10000)
.per_second()
.build())
.layer(TracingLayer::new());
let mount = MountHandle::mount(backend, "/mnt/secure")?;
// External apps see a normal filesystem
// But all operations are:
// - Quota-limited
// - Path-filtered
// - Rate-limited
// - Traced/audited
// Imagine: A mounted "USB drive" that reports real-time IOPS
// to a Prometheus dashboard!
}
Real-Time Observability
Because the mount point sits on top of your middleware stack, you get live visibility into OS operations:
- Metrics: See live IOPS, throughput, and latency for your virtual drive in Grafana.
- Audit Logs: Record every file your legacy app touches.
- Virus Scanning: Scan files as the OS writes them, rejecting malware in real-time.
---
## Use Cases
### Temporary Workspace
```rust
let workspace = MemoryBackend::new();
let mount = MountHandle::mount(workspace, "/tmp/workspace")?;
// Run build tools that expect real filesystem
std::process::Command::new("cargo")
.current_dir("/tmp/workspace")
.arg("build")
.status()?;
```

### Portable Database as Drive
#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend; // Ecosystem crate
// User's files stored in SQLite
let db = SqliteBackend::open("user_files.db")?;
let mount = MountHandle::mount(db, "U:")?;
// User can browse U: in Explorer
// Files are actually in SQLite database
}
### Network Storage
#![allow(unused)]
fn main() {
// Remote backend (future anyfs-s3, anyfs-sftp, etc.)
let remote = S3Backend::new("my-bucket")?;
let cached = remote.layer(CacheLayer::builder()
.max_size(100 * 1024 * 1024)
.build());
let mount = MountHandle::mount(cached, "/mnt/cloud")?;
// Local apps see /mnt/cloud as regular filesystem
// Actually reads/writes to S3 with local caching
}
Platform Requirements Summary
| Platform | Driver | Install Command / URL |
|---|---|---|
| Linux | FUSE | Usually pre-installed. If not: apt install fuse3 |
| macOS | macFUSE | https://osxfuse.github.io/ |
| Windows | WinFsp | https://winfsp.dev/ (recommended) |
| Windows | Dokan | https://dokan-dev.github.io/ (alternative) |
Limitations
- Requires external driver - Users must install macFUSE (macOS) or WinFsp (Windows)
- Root/admin may be required - Some mount operations need elevated privileges
- Not available on WASM - Browser environment has no filesystem mounting
- Performance overhead - Userspace filesystem has kernel boundary crossing overhead
- Backend must implement FsFuse - requires the FsInode trait for inode operations
Alternative: No Mount Needed
For many use cases, mounting isn’t necessary. AnyFS backends can be used directly:
| Need | With Mounting | Without Mounting |
|---|---|---|
| Build tools | Mount, run tools | Use tool’s VFS plugin if available |
| File browser | Mount as drive | Build custom UI with AnyFS API |
| Backup | Mount, use rsync | Use AnyFS API directly |
| Database | Mount for SQL tools | Query SQLite directly |
Rule of thumb: Only mount when you need compatibility with external applications that expect real filesystem paths.
Tutorial: Building a TXT Backend (Yes, Really)
How to turn a humble text file into a functioning virtual filesystem
The Absurd Premise
What if your entire filesystem was just… a text file you can edit in Notepad?
path,type,mode,data
/,dir,755,
/hello.txt,file,644,SGVsbG8sIFdvcmxkIQ==
/docs,dir,755,
/docs/readme.md,file,644,IyBXZWxjb21lIQoKWWVzLCB0aGlzIGlzIGluIGEgLnR4dCBmaWxl
One line per file. Comma-separated. Base64 content. Open it in Notepad, edit a file, save, done.
Sounds ridiculous? It is. But it works. And building it teaches you everything about implementing AnyFS backends.
Let’s do this.
Why This Is Actually Useful
Beyond the memes, a TXT backend demonstrates:
- Backend flexibility - AnyFS doesn’t care how you store bytes
- Trait implementation - You’ll implement FsRead, FsWrite, FsDir
- Middleware composition - We’ll add Quota to prevent the file from exploding
- Real-world patterns - The same patterns apply to serious backends
- Separation of concerns - Backends just store bytes; FileStorage handles path resolution
Plus, you can literally edit your “filesystem” in Notepad. Try doing that with ext4.
Important: Backends receive already-resolved paths from FileStorage. You don’t need to handle .., symlinks, or normalization - that’s FileStorage’s job. Your backend just stores and retrieves bytes at the given paths.
The Format
One line per entry. Four comma-separated fields. Dead simple:
path,type,mode,data
| Field | Description | Example |
|---|---|---|
path | Absolute path | /docs/file.txt |
type | file or dir | file |
mode | Unix permissions (octal) | 644 |
data | Base64-encoded content | SGVsbG8= |
Directories have an empty data field. That’s the entire format. Open it in Notepad, add a line, and you’ve created a file.
Step 1: Data Structures
#![allow(unused)]
fn main() {
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use base64::{Engine as _, engine::general_purpose::STANDARD as BASE64};
/// A single entry in our TXT filesystem
#[derive(Clone, Debug)]
struct TxtEntry {
path: PathBuf,
is_dir: bool,
mode: u32,
content: Vec<u8>,
}
impl TxtEntry {
fn new_dir(path: impl Into<PathBuf>) -> Self {
Self {
path: path.into(),
is_dir: true,
mode: 0o755,
content: Vec::new(),
}
}
fn new_file(path: impl Into<PathBuf>, content: Vec<u8>) -> Self {
Self {
path: path.into(),
is_dir: false,
mode: 0o644,
content,
}
}
/// Serialize to a line: path,type,mode,data
fn to_line(&self) -> String {
let file_type = if self.is_dir { "dir" } else { "file" };
let data_b64 = if self.content.is_empty() {
String::new()
} else {
BASE64.encode(&self.content)
};
format!("{},{},{:o},{}", self.path.display(), file_type, self.mode, data_b64)
}
/// Parse from line: path,type,mode,data
fn from_line(line: &str) -> Result<Self, TxtParseError> {
let parts: Vec<&str> = line.splitn(4, ',').collect();
if parts.len() < 3 {
return Err(TxtParseError::InvalidFormat);
}
let content = if parts.len() == 4 && !parts[3].is_empty() {
BASE64.decode(parts[3]).map_err(|_| TxtParseError::InvalidBase64)?
} else {
Vec::new()
};
Ok(Self {
path: PathBuf::from(parts[0]),
is_dir: parts[1] == "dir",
mode: u32::from_str_radix(parts[2], 8)
.map_err(|_| TxtParseError::InvalidNumber)?,
content,
})
}
}
#[derive(Debug)]
enum TxtParseError {
InvalidFormat,
InvalidBase64,
InvalidNumber,
}
}
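To see the round-trip in action without the base64 crate, here is a self-contained variant that hex-encodes content instead (SketchEntry and its helpers are illustrative stand-ins for TxtEntry):

```rust
// Dependency-free round-trip sketch of the to_line/from_line pair.
// Hex stands in for Base64 so this runs without the `base64` crate.
#[derive(Debug, PartialEq)]
struct SketchEntry {
    path: String,
    is_dir: bool,
    mode: u32,
    content: Vec<u8>,
}

fn entry_to_line(e: &SketchEntry) -> String {
    let ty = if e.is_dir { "dir" } else { "file" };
    let data: String = e.content.iter().map(|b| format!("{:02x}", b)).collect();
    format!("{},{},{:o},{}", e.path, ty, e.mode, data)
}

fn entry_from_line(line: &str) -> Option<SketchEntry> {
    let parts: Vec<&str> = line.splitn(4, ',').collect();
    if parts.len() < 3 {
        return None;
    }
    let data = parts.get(3).copied().unwrap_or("");
    if data.len() % 2 != 0 {
        return None; // Hex payloads must have an even length.
    }
    let content = (0..data.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&data[i..i + 2], 16).ok())
        .collect::<Option<Vec<u8>>>()?;
    Some(SketchEntry {
        path: parts[0].to_string(),
        is_dir: parts[1] == "dir",
        mode: u32::from_str_radix(parts[2], 8).ok()?,
        content,
    })
}
```

Serializing and re-parsing an entry should give back exactly what went in, which is the invariant the real TxtEntry code must also hold.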
Step 2: The Backend Structure
#![allow(unused)]
fn main() {
use std::sync::{Arc, RwLock};
use std::fs::File;
use std::io::{BufRead, BufReader, Write};
/// A filesystem backend that stores everything in a .txt file.
///
/// Yes, this is cursed. Yes, it works. Yes, you can edit it in Notepad.
pub struct TxtBackend {
/// Path to the .txt file on the host filesystem
txt_path: PathBuf,
/// In-memory cache of entries (path -> entry)
entries: Arc<RwLock<HashMap<PathBuf, TxtEntry>>>,
}
impl TxtBackend {
/// Create a new TXT backend, loading from file if it exists
pub fn open(txt_path: impl Into<PathBuf>) -> Result<Self, FsError> {
let txt_path = txt_path.into();
let mut entries = HashMap::new();
// Always ensure root directory exists
entries.insert(PathBuf::from("/"), TxtEntry::new_dir("/"));
// Load existing entries if file exists
if txt_path.exists() {
let file = File::open(&txt_path)
.map_err(|e| FsError::Io {
operation: "open txt",
path: txt_path.clone(),
source: e,
})?;
for (line_num, line) in BufReader::new(file).lines().enumerate() {
let line = line.map_err(|e| FsError::Io {
operation: "read line",
path: txt_path.clone(),
source: e,
})?;
// Skip header line
if line_num == 0 && line.starts_with("path,") {
continue;
}
// Skip empty lines
if line.trim().is_empty() {
continue;
}
let entry = TxtEntry::from_line(&line)
.map_err(|_| FsError::CorruptedData {
path: txt_path.clone(),
details: format!("line {}", line_num + 1),
})?;
entries.insert(entry.path.clone(), entry);
}
}
Ok(Self {
txt_path,
entries: Arc::new(RwLock::new(entries)),
})
}
/// Create a new in-memory backend (won't persist to disk)
pub fn in_memory() -> Self {
let mut entries = HashMap::new();
entries.insert(PathBuf::from("/"), TxtEntry::new_dir("/"));
Self {
txt_path: PathBuf::from(":memory:"),
entries: Arc::new(RwLock::new(entries)),
}
}
/// Flush all entries to the .txt file
fn flush(&self) -> Result<(), FsError> {
// Skip if in-memory mode
if self.txt_path.as_os_str() == ":memory:" {
return Ok(());
}
let entries = self.entries.read().unwrap();
let mut file = File::create(&self.txt_path)
.map_err(|e| FsError::Io {
operation: "create txt",
path: self.txt_path.clone(),
source: e,
})?;
// Write header
writeln!(file, "path,type,mode,data")
.map_err(|e| FsError::Io {
operation: "write header",
path: self.txt_path.clone(),
source: e,
})?;
// Write entries (sorted for consistency)
let mut paths: Vec<_> = entries.keys().collect();
paths.sort();
for path in paths {
let entry = &entries[path];
writeln!(file, "{}", entry.to_line())
.map_err(|e| FsError::Io {
operation: "write entry",
path: path.clone(),
source: e,
})?;
}
Ok(())
}
}
}
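The flush logic above can be sketched against a String instead of a file, which makes the header-plus-sorted-entries ordering easy to verify (serialize_entries and its hardcoded mode are illustrative only):

```rust
use std::collections::HashMap;

// String-based sketch of flush(): header line first, then entries sorted by
// path so the .txt output is stable across runs.
fn serialize_entries(entries: &HashMap<String, (bool, String)>) -> String {
    let mut out = String::from("path,type,mode,data\n");
    let mut items: Vec<_> = entries.iter().collect();
    items.sort_by(|a, b| a.0.cmp(b.0));
    for (path, (is_dir, data)) in items {
        let ty = if *is_dir { "dir" } else { "file" };
        // Mode is hardcoded to 644 purely for this sketch.
        out.push_str(&format!("{},{},644,{}\n", path, ty, data));
    }
    out
}
```

Sorting before writing is what keeps diffs of the .txt file readable when you inspect it in a text editor.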
Step 3: Implement FsRead
Now the fun part - making it quack like a filesystem:
#![allow(unused)]
fn main() {
use anyfs_backend::{FsRead, FsError, Metadata, FileType, Permissions};
impl FsRead for TxtBackend {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
let entries = self.entries.read().unwrap();
let entry = entries.get(path)
.ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })?;
if entry.is_dir {
return Err(FsError::NotAFile { path: path.to_path_buf() });
}
Ok(entry.content.clone())
}
fn read_to_string(&self, path: &Path) -> Result<String, FsError> {
let bytes = self.read(path)?;
String::from_utf8(bytes)
.map_err(|_| FsError::InvalidData {
path: path.to_path_buf(),
details: "not valid UTF-8".to_string(),
})
}
fn read_range(&self, path: &Path, offset: u64, len: usize) -> Result<Vec<u8>, FsError> {
let content = self.read(path)?;
let start = offset as usize;
if start >= content.len() {
return Ok(Vec::new());
}
let end = (start + len).min(content.len());
Ok(content[start..end].to_vec())
}
fn exists(&self, path: &Path) -> Result<bool, FsError> {
let entries = self.entries.read().unwrap();
Ok(entries.contains_key(path))
}
fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
let entries = self.entries.read().unwrap();
let entry = entries.get(path)
.ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })?;
Ok(Metadata {
file_type: if entry.is_dir { FileType::Directory } else { FileType::File },
size: entry.content.len() as u64,
permissions: Permissions::from_mode(entry.mode),
created: std::time::UNIX_EPOCH, // TxtBackend doesn't track timestamps
modified: std::time::UNIX_EPOCH,
accessed: std::time::UNIX_EPOCH,
inode: 0, // No inode support
nlink: 1, // No hardlink support
})
}
fn open_read(&self, path: &Path) -> Result<Box<dyn std::io::Read + Send>, FsError> {
let content = self.read(path)?;
Ok(Box::new(std::io::Cursor::new(content)))
}
}
}
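The read_range slicing above is worth isolating, since the clamping rules are easy to get wrong; a standalone sketch (slice_range is a hypothetical helper, not part of the trait):

```rust
// Out-of-range offsets return an empty buffer, and short tails are
// truncated rather than erroring - the same rules as read_range above.
fn slice_range(content: &[u8], offset: u64, len: usize) -> Vec<u8> {
    let start = offset as usize;
    if start >= content.len() {
        return Vec::new();
    }
    let end = (start + len).min(content.len());
    content[start..end].to_vec()
}
```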
Step 4: Implement FsWrite
Where the magic happens - writing files to a text file:
#![allow(unused)]
fn main() {
use anyfs_backend::FsWrite;
impl FsWrite for TxtBackend {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
let path = path.to_path_buf();
// Ensure parent directory exists
if let Some(parent) = path.parent() {
let parent_str = parent.to_string_lossy();
if parent_str != "/" && !parent_str.is_empty() {
let entries = self.entries.read().unwrap();
if !entries.contains_key(parent) {
drop(entries);
return Err(FsError::NotFound {
path: parent.to_path_buf()
});
}
}
}
let mut entries = self.entries.write().unwrap();
// Check if it's a directory
if let Some(existing) = entries.get(&path) {
if existing.is_dir {
return Err(FsError::NotAFile { path });
}
}
// Create or update the file
let entry = TxtEntry::new_file(path.clone(), data.to_vec());
entries.insert(path, entry);
drop(entries);
self.flush()?;
Ok(())
}
fn append(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
let path = path.to_path_buf();
let mut entries = self.entries.write().unwrap();
let entry = entries.get_mut(&path)
.ok_or_else(|| FsError::NotFound { path: path.clone() })?;
if entry.is_dir {
return Err(FsError::NotAFile { path });
}
entry.content.extend_from_slice(data);
drop(entries);
self.flush()?;
Ok(())
}
fn remove_file(&self, path: &Path) -> Result<(), FsError> {
let path = path.to_path_buf();
let mut entries = self.entries.write().unwrap();
let entry = entries.get(&path)
.ok_or_else(|| FsError::NotFound { path: path.clone() })?;
if entry.is_dir {
return Err(FsError::NotAFile { path });
}
entries.remove(&path);
drop(entries);
self.flush()?;
Ok(())
}
fn rename(&self, from: &Path, to: &Path) -> Result<(), FsError> {
let from = from.to_path_buf();
let to = to.to_path_buf();
let mut entries = self.entries.write().unwrap();
let mut entry = entries.remove(&from)
.ok_or_else(|| FsError::NotFound { path: from.clone() })?;
entry.path = to.clone();
entries.insert(to, entry);
drop(entries);
self.flush()?;
Ok(())
}
fn copy(&self, from: &Path, to: &Path) -> Result<(), FsError> {
let from = from.to_path_buf();
let to = to.to_path_buf();
// Hold the write lock across the lookup and the insert (no lock gap)
let mut entries = self.entries.write().unwrap();
let source = entries.get(&from)
.ok_or_else(|| FsError::NotFound { path: from.clone() })?;
if source.is_dir {
return Err(FsError::NotAFile { path: from });
}
let mut new_entry = source.clone();
new_entry.path = to.clone();
entries.insert(to, new_entry);
drop(entries);
self.flush()?;
Ok(())
}
fn truncate(&self, path: &Path, size: u64) -> Result<(), FsError> {
let path = path.to_path_buf();
let mut entries = self.entries.write().unwrap();
let entry = entries.get_mut(&path)
.ok_or_else(|| FsError::NotFound { path: path.clone() })?;
if entry.is_dir {
return Err(FsError::NotAFile { path });
}
// resize (not Vec::truncate) so growing zero-pads, matching set_len semantics
entry.content.resize(size as usize, 0);
drop(entries);
self.flush()?;
Ok(())
}
fn open_write(&self, path: &Path) -> Result<Box<dyn std::io::Write + Send>, FsError> {
// For simplicity, we buffer writes and apply on drop
// A real implementation would be more sophisticated
let path = path.to_path_buf();
// Ensure file exists (create empty if not)
if !self.exists(&path)? {
self.write(&path, b"")?;
}
Ok(Box::new(TxtFileWriter {
backend: self.entries.clone(),
txt_path: self.txt_path.clone(),
path,
buffer: Vec::new(),
}))
}
}
/// Writer that buffers content and writes to TXT on drop
struct TxtFileWriter {
backend: Arc<RwLock<HashMap<PathBuf, TxtEntry>>>,
txt_path: PathBuf,
path: PathBuf,
buffer: Vec<u8>,
}
impl std::io::Write for TxtFileWriter {
fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
self.buffer.extend_from_slice(buf);
Ok(buf.len())
}
fn flush(&mut self) -> std::io::Result<()> {
Ok(())
}
}
impl Drop for TxtFileWriter {
fn drop(&mut self) {
let mut entries = self.backend.write().unwrap();
if let Some(entry) = entries.get_mut(&self.path) {
entry.content = std::mem::take(&mut self.buffer);
}
// Note: flush to disk happens on next explicit flush() call
}
}
}
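One subtlety worth knowing for truncate: std::fs::File::set_len zero-pads when the new length is larger than the old one, so a faithful backend should grow as well as shrink. Vec::resize gives exactly that behavior, while Vec::truncate only shrinks. A std-only demonstration:

```rust
fn main() {
    // resize grows by padding with the given byte (here 0), like set_len
    let mut content = b"hello".to_vec();
    content.resize(8, 0);
    assert_eq!(content, b"hello\0\0\0");
    // resize also shrinks, covering the Vec::truncate case
    content.resize(2, 0);
    assert_eq!(content, b"he");
    println!("{} bytes", content.len());
}
```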
Step 5: Implement FsDir
Directory operations to complete the Fs trait:
#![allow(unused)]
fn main() {
use anyfs_backend::{FsDir, DirEntry, FileType, ReadDirIter, FsError};
impl FsDir for TxtBackend {
fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
let path = path.to_path_buf();
let entries = self.entries.read().unwrap();
// Verify the path is a directory
let entry = entries.get(&path)
.ok_or_else(|| FsError::NotFound { path: path.clone() })?;
if !entry.is_dir {
return Err(FsError::NotADirectory { path });
}
// Find all direct children
let mut children = Vec::new();
for (child_path, child_entry) in entries.iter() {
if let Some(parent) = child_path.parent() {
if parent == path && child_path != &path {
children.push(DirEntry {
name: child_path.file_name()
.unwrap_or_default()
.to_string_lossy()
.into_owned(),
path: child_path.clone(),
file_type: if child_entry.is_dir {
FileType::Directory
} else {
FileType::File
},
size: child_entry.size,
inode: 0, // No inode support
});
}
}
}
// Sort for consistent ordering
children.sort_by(|a, b| a.name.cmp(&b.name));
// Wrap in ReadDirIter (items are Ok since we've already validated them)
Ok(ReadDirIter::new(children.into_iter().map(Ok)))
}
fn create_dir(&self, path: &Path) -> Result<(), FsError> {
let path = path.to_path_buf();
// One write lock makes the parent check and the insert atomic
let mut entries = self.entries.write().unwrap();
// Check parent exists and is a directory
if let Some(parent) = path.parent() {
let parent_str = parent.to_string_lossy();
if parent_str != "/" && !parent_str.is_empty() {
let parent_entry = entries.get(parent)
.ok_or_else(|| FsError::NotFound {
path: parent.to_path_buf()
})?;
if !parent_entry.is_dir {
return Err(FsError::NotADirectory {
path: parent.to_path_buf()
});
}
}
}
// Check if already exists
if entries.contains_key(&path) {
return Err(FsError::AlreadyExists { path, operation: "create_dir" });
}
entries.insert(path.clone(), TxtEntry::new_dir(path));
drop(entries);
self.flush()?;
Ok(())
}
fn create_dir_all(&self, path: &Path) -> Result<(), FsError> {
let path = path.to_path_buf();
// One write lock covers the existence checks and the inserts
let mut entries = self.entries.write().unwrap();
// Build list of directories to create
let mut to_create = Vec::new();
let mut current = path.clone();
loop {
if entries.contains_key(&current) {
break;
}
to_create.push(current.clone());
match current.parent() {
Some(parent) if !parent.as_os_str().is_empty() => {
current = parent.to_path_buf();
}
_ => break,
}
}
// Create directories from root to leaf
to_create.reverse();
for dir_path in to_create {
entries.insert(dir_path.clone(), TxtEntry::new_dir(dir_path));
}
drop(entries);
self.flush()?;
Ok(())
}
fn remove_dir(&self, path: &Path) -> Result<(), FsError> {
let path = path.to_path_buf();
// Can't remove root
if path.to_string_lossy() == "/" {
return Err(FsError::PermissionDenied {
path,
operation: "remove root directory"
});
}
// A single write lock makes the emptiness check and the removal atomic
let mut entries = self.entries.write().unwrap();
let entry = entries.get(&path)
.ok_or_else(|| FsError::NotFound { path: path.clone() })?;
if !entry.is_dir {
return Err(FsError::NotADirectory { path });
}
// Check if empty
let has_children = entries.keys().any(|p| {
p != &path && p.starts_with(&path)
});
if has_children {
return Err(FsError::DirectoryNotEmpty { path });
}
entries.remove(&path);
drop(entries);
self.flush()?;
self.flush()?;
Ok(())
}
fn remove_dir_all(&self, path: &Path) -> Result<(), FsError> {
let path = path.to_path_buf();
// Can't remove root
if path.to_string_lossy() == "/" {
return Err(FsError::PermissionDenied {
path,
operation: "remove root directory"
});
}
let mut entries = self.entries.write().unwrap();
// Verify it exists and is a directory
let entry = entries.get(&path)
.ok_or_else(|| FsError::NotFound { path: path.clone() })?;
if !entry.is_dir {
return Err(FsError::NotADirectory { path: path.clone() });
}
// Remove all entries under this path
let to_remove: Vec<_> = entries.keys()
.filter(|p| p.starts_with(&path))
.cloned()
.collect();
for p in to_remove {
entries.remove(&p);
}
drop(entries);
self.flush()?;
Ok(())
}
}
}
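The read_dir implementation above finds direct children by comparing each entry's parent() against the requested path. That check in isolation, runnable with std alone (is_direct_child is an illustrative helper, not AnyFS API):

```rust
use std::path::Path;

// A candidate is a direct child iff its parent is exactly the directory —
// grandchildren have a deeper parent, and the directory is not its own child.
fn is_direct_child(parent: &Path, candidate: &Path) -> bool {
    candidate.parent() == Some(parent)
}

fn main() {
    let dir = Path::new("/projects");
    assert!(is_direct_child(dir, Path::new("/projects/readme.md")));
    // grandchild: its parent is /projects/secret, not /projects
    assert!(!is_direct_child(dir, Path::new("/projects/secret/plans.txt")));
    // a directory is not its own child
    assert!(!is_direct_child(dir, Path::new("/projects")));
    println!("direct-child check behaves as expected");
}
```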
Step 6: Putting It All Together
Now you have a complete Fs implementation! Let’s use it:
use anyfs::{FileStorage, QuotaLayer, TracingLayer};
use anyfs_backend::FileType;
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Create our glorious TXT filesystem
let backend = TxtBackend::open("my_filesystem.txt")?
// Wrap it with middleware to prevent the file from exploding
.layer(QuotaLayer::builder()
.max_total_size(10 * 1024 * 1024) // 10 MB max
.max_file_size(1 * 1024 * 1024) // 1 MB per file
.build())
// Add tracing because why not
.layer(TracingLayer::new());
// Create the filesystem wrapper
let fs = FileStorage::new(backend);
// Use it like any other filesystem!
fs.create_dir_all("/projects/secret")?;
fs.write("/projects/secret/plans.txt", b"World domination via TXT")?;
fs.write("/projects/readme.md", b"# My TXT-backed project\n\nYes, really.")?;
// Read it back
let content = fs.read_to_string("/projects/secret/plans.txt")?;
println!("Plans: {}", content);
// List directory
for entry in fs.read_dir("/projects")? {
println!(" {} ({})", entry.name,
if entry.file_type == FileType::Directory { "dir" } else { "file" });
}
// Copy a file
fs.copy("/projects/readme.md", "/projects/readme_backup.md")?;
// Delete a file
fs.remove_file("/projects/readme_backup.md")?;
println!("\nNow open my_filesystem.txt in Notepad!");
Ok(())
}
The Result
After running the code, your my_filesystem.txt looks like:
path,type,mode,data
/,dir,755,
/projects,dir,755,
/projects/secret,dir,755,
/projects/secret/plans.txt,file,644,V29ybGQgZG9taW5hdGlvbiB2aWEgVFhU
/projects/readme.md,file,644,IyBNeSBUWFQtYmFja2VkIHByb2plY3QKClllcywgcmVhbGx5Lg==
Open it in Notepad. Marvel at your filesystem. Edit a line. Save. You just modified a file.
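Each line is just four comma-separated fields: path, type, octal mode, and base64 data. A standalone sketch of parsing one line (std only; parse_line is illustrative and assumes paths contain no commas — the real TxtBackend parser from the earlier steps may differ):

```rust
// Split a line of the on-disk format into its four fields.
fn parse_line(line: &str) -> Option<(String, bool, u32, String)> {
    let mut parts = line.splitn(4, ',');
    let path = parts.next()?.to_string();
    let is_dir = parts.next()? == "dir";
    // mode is stored as an octal string like "644"
    let mode = u32::from_str_radix(parts.next()?, 8).ok()?;
    let data_b64 = parts.next()?.to_string();
    Some((path, is_dir, mode, data_b64))
}

fn main() {
    let line = "/projects/secret/plans.txt,file,644,V29ybGQgZG9taW5hdGlvbiB2aWEgVFhU";
    let (path, is_dir, mode, data) = parse_line(line).unwrap();
    assert_eq!(path, "/projects/secret/plans.txt");
    assert!(!is_dir);
    assert_eq!(mode, 0o644);
    assert_eq!(data, "V29ybGQgZG9taW5hdGlvbiB2aWEgVFhU");
    println!("parsed {} (mode {:o})", path, mode);
}
```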
Why This Actually Matters
This ridiculous example demonstrates the power of AnyFS’s design:
- True backend abstraction - The FileStorage API doesn't know or care that it's backed by a text file
- Middleware just works - Quota and Tracing wrap your custom backend with zero extra code
- Type safety preserved - Compile-time guarantees work with any backend
- Easy to implement - ~250 lines for a complete working backend
- Testable - Use TxtBackend::in_memory() for fast tests
- Human-editable - Open it in Notepad, add a line, and you've created a file
Next Steps
If you’re feeling brave:
- Add symlink support - Implement the FsLink trait
- Make it async - Wrap with tokio::fs for the host CSV file
- Add compression - Gzip the base64 content
- Excel integration - Add formulas that compute file sizes (why not?)
Bonus: Mount It as a Drive
With the fuse feature enabled, you can mount your text file as a real filesystem:
#![allow(unused)]
fn main() {
use anyfs::MountHandle;
let backend = TxtBackend::open("filesystem.txt")?;
let mount = MountHandle::mount(backend, "/mnt/txt")?;
// Now /mnt/txt is a real mount point backed by a .txt file
// Any application can read/write files there
// The data goes into a text file you can edit in Notepad
// This is fine
}
The Moral
AnyFS doesn’t care where bytes come from or where they go. Memory, SQLite, a text file, a REST API, carrier pigeons with USB drives - if you can implement the traits, it’s a valid backend.
The middleware layer (quotas, sandboxing, rate limiting, logging) works transparently on any backend. That’s the power of good abstractions.
Now go build something less cursed. Or don’t. I’m not your supervisor.
“I store my production data in text files” - Nobody, ever (until now)
“Can I edit my filesystem in Notepad?” - Yes. Yes you can.
Tutorial: Building Your First Middleware
From zero to intercepting filesystem operations in 15 minutes
What is Middleware?
Middleware wraps a backend and intercepts operations. That’s it.
User Request → [Your Middleware] → [Backend] → Storage
↑ ↓
└── intercept ─────┘
You can:
- Block operations (ReadOnly, PathFilter)
- Transform data (Encryption, Compression)
- Count/Log operations (Counter, Tracing)
- Enforce limits (Quota, RateLimit)
Let’s build one.
The Simplest Middleware: Operation Counter
We’ll count every operation. That’s our entire goal.
Step 1: The Struct
#![allow(unused)]
fn main() {
use std::sync::atomic::{AtomicU64, Ordering};
/// Counts every operation performed on the wrapped backend.
pub struct Counter<B> {
inner: B, // The backend we're wrapping
pub count: AtomicU64, // Our counter
}
impl<B> Counter<B> {
pub fn new(inner: B) -> Self {
Self {
inner,
count: AtomicU64::new(0),
}
}
pub fn operations(&self) -> u64 {
self.count.load(Ordering::Relaxed)
}
}
}
That’s the entire struct. We wrap something (inner) and add our state (count).
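Why AtomicU64 with Ordering::Relaxed? A pure counter only needs an accurate total, not any ordering relative to other memory operations, and relaxed increments never lose counts even under contention. A std-only demonstration:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    let count = Arc::new(AtomicU64::new(0));
    // Four threads each add 1000; fetch_add is atomic, so no increments are lost
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let count = Arc::clone(&count);
            thread::spawn(move || {
                for _ in 0..1000 {
                    count.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(count.load(Ordering::Relaxed), 4000);
    println!("{}", count.load(Ordering::Relaxed));
}
```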
Step 2: Implement FsRead
Now we implement the same traits as the inner backend, intercepting each method:
#![allow(unused)]
fn main() {
use anyfs_backend::{FsRead, FsError, Metadata};
use std::path::Path;
impl<B: FsRead> FsRead for Counter<B> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
self.count.fetch_add(1, Ordering::Relaxed); // COUNT IT
self.inner.read(path) // DELEGATE
}
fn read_to_string(&self, path: &Path) -> Result<String, FsError> {
self.count.fetch_add(1, Ordering::Relaxed);
self.inner.read_to_string(path)
}
fn read_range(&self, path: &Path, offset: u64, len: usize) -> Result<Vec<u8>, FsError> {
self.count.fetch_add(1, Ordering::Relaxed);
self.inner.read_range(path, offset, len)
}
fn exists(&self, path: &Path) -> Result<bool, FsError> {
self.count.fetch_add(1, Ordering::Relaxed);
self.inner.exists(path)
}
fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
self.count.fetch_add(1, Ordering::Relaxed);
self.inner.metadata(path)
}
fn open_read(&self, path: &Path) -> Result<Box<dyn std::io::Read + Send>, FsError> {
self.count.fetch_add(1, Ordering::Relaxed);
self.inner.open_read(path)
}
}
}
The pattern is always the same:
- Do your thing (count)
- Call self.inner.method(args) (delegate)
Step 3: Implement FsWrite
Same pattern:
#![allow(unused)]
fn main() {
use anyfs_backend::FsWrite;
impl<B: FsWrite> FsWrite for Counter<B> {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
self.count.fetch_add(1, Ordering::Relaxed);
self.inner.write(path, data)
}
fn append(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
self.count.fetch_add(1, Ordering::Relaxed);
self.inner.append(path, data)
}
fn remove_file(&self, path: &Path) -> Result<(), FsError> {
self.count.fetch_add(1, Ordering::Relaxed);
self.inner.remove_file(path)
}
fn rename(&self, from: &Path, to: &Path) -> Result<(), FsError> {
self.count.fetch_add(1, Ordering::Relaxed);
self.inner.rename(from, to)
}
fn copy(&self, from: &Path, to: &Path) -> Result<(), FsError> {
self.count.fetch_add(1, Ordering::Relaxed);
self.inner.copy(from, to)
}
fn truncate(&self, path: &Path, size: u64) -> Result<(), FsError> {
self.count.fetch_add(1, Ordering::Relaxed);
self.inner.truncate(path, size)
}
fn open_write(&self, path: &Path) -> Result<Box<dyn std::io::Write + Send>, FsError> {
self.count.fetch_add(1, Ordering::Relaxed);
self.inner.open_write(path)
}
}
}
Step 4: Implement FsDir
#![allow(unused)]
fn main() {
use anyfs_backend::{FsDir, ReadDirIter};
impl<B: FsDir> FsDir for Counter<B> {
fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
self.count.fetch_add(1, Ordering::Relaxed);
self.inner.read_dir(path)
}
fn create_dir(&self, path: &Path) -> Result<(), FsError> {
self.count.fetch_add(1, Ordering::Relaxed);
self.inner.create_dir(path)
}
fn create_dir_all(&self, path: &Path) -> Result<(), FsError> {
self.count.fetch_add(1, Ordering::Relaxed);
self.inner.create_dir_all(path)
}
fn remove_dir(&self, path: &Path) -> Result<(), FsError> {
self.count.fetch_add(1, Ordering::Relaxed);
self.inner.remove_dir(path)
}
fn remove_dir_all(&self, path: &Path) -> Result<(), FsError> {
self.count.fetch_add(1, Ordering::Relaxed);
self.inner.remove_dir_all(path)
}
}
// Counter<B> now implements Fs when B: Fs (blanket impl)!
}
Step 5: Use It
use anyfs::{FileStorage, MemoryBackend};
fn main() -> Result<(), Box<dyn std::error::Error>> {
let fs = FileStorage::new(Counter::new(MemoryBackend::new()));
fs.write("/hello.txt", b"Hello, World!")?;
fs.read("/hello.txt")?;
fs.read("/hello.txt")?;
fs.exists("/hello.txt")?;
println!("Total operations: {}", fs.operations()); // 4
Ok(())
}
That’s it. You built middleware.
Adding .layer() Support
Want the fluent .layer() syntax? Add a Layer struct:
#![allow(unused)]
fn main() {
use anyfs_backend::{Layer, Fs};
/// Layer for creating Counter middleware.
pub struct CounterLayer;
impl<B: Fs> Layer<B> for CounterLayer {
type Backend = Counter<B>;
fn layer(self, backend: B) -> Counter<B> {
Counter::new(backend)
}
}
}
Now you can do:
#![allow(unused)]
fn main() {
use anyfs::FileStorage;
let fs = FileStorage::new(
MemoryBackend::new()
.layer(CounterLayer)
);
fs.write("/test.txt", b"data")?;
println!("Operations: {}", fs.operations());
}
A More Useful Middleware: SecretBlocker
Let’s build something practical - block access to files matching a pattern:
#![allow(unused)]
fn main() {
use anyfs_backend::{FsRead, FsWrite, FsDir, FsError, Metadata, ReadDirIter};
use std::path::Path;
/// Blocks access to files containing "secret" in the path.
pub struct SecretBlocker<B> {
inner: B,
}
impl<B> SecretBlocker<B> {
pub fn new(inner: B) -> Self {
Self { inner }
}
/// Check if path is forbidden.
fn is_secret(&self, path: &Path) -> bool {
path.to_string_lossy().to_lowercase().contains("secret")
}
/// Return error if path is secret.
fn check(&self, path: &Path) -> Result<(), FsError> {
if self.is_secret(path) {
Err(FsError::AccessDenied {
path: path.to_path_buf(),
reason: "secret files are blocked".to_string(),
})
} else {
Ok(())
}
}
}
}
Implement the Traits
#![allow(unused)]
fn main() {
impl<B: FsRead> FsRead for SecretBlocker<B> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
self.check(path)?; // BLOCK if secret
self.inner.read(path) // DELEGATE otherwise
}
fn read_to_string(&self, path: &Path) -> Result<String, FsError> {
self.check(path)?;
self.inner.read_to_string(path)
}
fn read_range(&self, path: &Path, offset: u64, len: usize) -> Result<Vec<u8>, FsError> {
self.check(path)?;
self.inner.read_range(path, offset, len)
}
fn exists(&self, path: &Path) -> Result<bool, FsError> {
self.check(path)?;
self.inner.exists(path)
}
fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
self.check(path)?;
self.inner.metadata(path)
}
fn open_read(&self, path: &Path) -> Result<Box<dyn std::io::Read + Send>, FsError> {
self.check(path)?;
self.inner.open_read(path)
}
}
impl<B: FsWrite> FsWrite for SecretBlocker<B> {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
self.check(path)?;
self.inner.write(path, data)
}
fn append(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
self.check(path)?;
self.inner.append(path, data)
}
fn remove_file(&self, path: &Path) -> Result<(), FsError> {
self.check(path)?;
self.inner.remove_file(path)
}
fn rename(&self, from: &Path, to: &Path) -> Result<(), FsError> {
self.check(from)?;
self.check(to)?; // Block both source and destination
self.inner.rename(from, to)
}
fn copy(&self, from: &Path, to: &Path) -> Result<(), FsError> {
self.check(from)?;
self.check(to)?;
self.inner.copy(from, to)
}
fn truncate(&self, path: &Path, size: u64) -> Result<(), FsError> {
self.check(path)?;
self.inner.truncate(path, size)
}
fn open_write(&self, path: &Path) -> Result<Box<dyn std::io::Write + Send>, FsError> {
self.check(path)?;
self.inner.open_write(path)
}
}
impl<B: FsDir> FsDir for SecretBlocker<B> {
fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
self.check(path)?;
self.inner.read_dir(path)
}
fn create_dir(&self, path: &Path) -> Result<(), FsError> {
self.check(path)?;
self.inner.create_dir(path)
}
fn create_dir_all(&self, path: &Path) -> Result<(), FsError> {
self.check(path)?;
self.inner.create_dir_all(path)
}
fn remove_dir(&self, path: &Path) -> Result<(), FsError> {
self.check(path)?;
self.inner.remove_dir(path)
}
fn remove_dir_all(&self, path: &Path) -> Result<(), FsError> {
self.check(path)?;
self.inner.remove_dir_all(path)
}
}
}
Use It
fn main() -> Result<(), Box<dyn std::error::Error>> {
use anyfs::{FileStorage, MemoryBackend};
let fs = FileStorage::new(SecretBlocker::new(MemoryBackend::new()));
// These work fine
fs.write("/public/data.txt", b"Hello!")?;
fs.read("/public/data.txt")?;
// These are blocked
assert!(fs.write("/secret/passwords.txt", b"hunter2").is_err());
assert!(fs.read("/my-secret-diary.txt").is_err());
assert!(fs.create_dir("/SECRET").is_err());
println!("Secret files successfully blocked!");
Ok(())
}
The Middleware Pattern Cheat Sheet
| What You Want | Intercept | Delegate | Return |
|---|---|---|---|
| Count operations | Before call | Always | Inner result |
| Block some paths | Before call | If allowed | Error or inner result |
| Block writes | Write methods | Read methods | Error or inner result |
| Transform data | read/write | Everything else | Modified data |
| Log operations | Before/after | Always | Inner result |
Three Types of Middleware
1. Pass-through with side effects (Counter, Logger)
#![allow(unused)]
fn main() {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
log::info!("Reading: {:?}", path); // Side effect
self.inner.read(path) // Always delegate
}
}
2. Conditional blocking (PathFilter, ReadOnly)
#![allow(unused)]
fn main() {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
if self.is_blocked(path) {
return Err(FsError::AccessDenied { ... }); // Block
}
self.inner.write(path, data) // Allow
}
}
3. Data transformation (Encryption, Compression)
#![allow(unused)]
fn main() {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
let encrypted = self.inner.read(path)?; // Get data
Ok(self.decrypt(&encrypted)) // Transform
}
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
let encrypted = self.encrypt(data); // Transform
self.inner.write(path, &encrypted) // Store
}
}
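To see the transform pattern end to end without any crypto machinery, here is a std-only toy (XorStore is illustrative, not AnyFS API, and XOR is not encryption — it just makes the transform-on-write / untransform-on-read symmetry visible):

```rust
use std::collections::HashMap;

// Toy "transform middleware" collapsed into one struct: bytes are XORed
// with a key on write and XORed again on read, recovering the original.
struct XorStore {
    key: u8,
    data: HashMap<String, Vec<u8>>,
}

impl XorStore {
    fn write(&mut self, path: &str, data: &[u8]) {
        // Transform on the way in, then store the transformed bytes
        let transformed: Vec<u8> = data.iter().map(|b| b ^ self.key).collect();
        self.data.insert(path.to_string(), transformed);
    }
    fn read(&self, path: &str) -> Option<Vec<u8>> {
        // Untransform on the way out
        self.data
            .get(path)
            .map(|d| d.iter().map(|b| b ^ self.key).collect())
    }
}

fn main() {
    let mut store = XorStore { key: 0x5A, data: HashMap::new() };
    store.write("/f", b"hello");
    // The stored bytes differ from the plaintext...
    assert_ne!(store.data["/f"], b"hello");
    // ...but a read transforms them back
    assert_eq!(store.read("/f").unwrap(), b"hello");
    println!("round trip ok");
}
```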
Example: Indexing Middleware (Future)
Use IndexLayer to keep a queryable index of file activity:
#![allow(unused)]
fn main() {
use anyfs::{IndexLayer, IndexConsistency, FileStorage, MemoryBackend};
let backend = MemoryBackend::new()
.layer(IndexLayer::builder()
.index_file("index.db")
.consistency(IndexConsistency::Strict)
.track_reads(false)
.build());
let fs = FileStorage::new(backend);
fs.write("/docs/hello.txt", b"hello")?;
}
Complete Example: ReadOnly Middleware
The classic - block all writes:
#![allow(unused)]
fn main() {
use anyfs_backend::{FsRead, FsWrite, FsDir, FsError, Metadata, ReadDirIter, Layer, Fs};
use std::path::Path;
/// Makes any backend read-only.
pub struct ReadOnly<B> {
inner: B,
}
impl<B> ReadOnly<B> {
pub fn new(inner: B) -> Self {
Self { inner }
}
}
// FsRead: delegate everything
impl<B: FsRead> FsRead for ReadOnly<B> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
self.inner.read(path)
}
fn read_to_string(&self, path: &Path) -> Result<String, FsError> {
self.inner.read_to_string(path)
}
fn read_range(&self, path: &Path, offset: u64, len: usize) -> Result<Vec<u8>, FsError> {
self.inner.read_range(path, offset, len)
}
fn exists(&self, path: &Path) -> Result<bool, FsError> {
self.inner.exists(path)
}
fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
self.inner.metadata(path)
}
fn open_read(&self, path: &Path) -> Result<Box<dyn std::io::Read + Send>, FsError> {
self.inner.open_read(path)
}
}
// FsWrite: block everything
impl<B: FsWrite> FsWrite for ReadOnly<B> {
fn write(&self, _: &Path, _: &[u8]) -> Result<(), FsError> {
Err(FsError::ReadOnly { operation: "write" })
}
fn append(&self, _: &Path, _: &[u8]) -> Result<(), FsError> {
Err(FsError::ReadOnly { operation: "append" })
}
fn remove_file(&self, _: &Path) -> Result<(), FsError> {
Err(FsError::ReadOnly { operation: "remove_file" })
}
fn rename(&self, _: &Path, _: &Path) -> Result<(), FsError> {
Err(FsError::ReadOnly { operation: "rename" })
}
fn copy(&self, _: &Path, _: &Path) -> Result<(), FsError> {
Err(FsError::ReadOnly { operation: "copy" })
}
fn truncate(&self, _: &Path, _: u64) -> Result<(), FsError> {
Err(FsError::ReadOnly { operation: "truncate" })
}
fn open_write(&self, _: &Path) -> Result<Box<dyn std::io::Write + Send>, FsError> {
Err(FsError::ReadOnly { operation: "open_write" })
}
}
// FsDir: delegate reads, block writes
impl<B: FsDir> FsDir for ReadOnly<B> {
fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
self.inner.read_dir(path) // Reading is OK
}
fn create_dir(&self, _: &Path) -> Result<(), FsError> {
Err(FsError::ReadOnly { operation: "create_dir" })
}
fn create_dir_all(&self, _: &Path) -> Result<(), FsError> {
Err(FsError::ReadOnly { operation: "create_dir_all" })
}
fn remove_dir(&self, _: &Path) -> Result<(), FsError> {
Err(FsError::ReadOnly { operation: "remove_dir" })
}
fn remove_dir_all(&self, _: &Path) -> Result<(), FsError> {
Err(FsError::ReadOnly { operation: "remove_dir_all" })
}
}
// Layer for .layer() syntax
pub struct ReadOnlyLayer;
impl<B: Fs> Layer<B> for ReadOnlyLayer {
type Backend = ReadOnly<B>;
fn layer(self, backend: B) -> Self::Backend {
ReadOnly::new(backend)
}
}
}
Usage
#![allow(unused)]
fn main() {
let fs = FileStorage::new(
MemoryBackend::new()
.layer(ReadOnlyLayer)
);
// Reads work
fs.exists("/anything")?;
// Writes fail
assert!(fs.write("/file.txt", b"data").is_err());
assert!(fs.create_dir("/new").is_err());
}
Stacking Middleware
Middleware composes naturally:
#![allow(unused)]
fn main() {
let fs = MemoryBackend::new()
.layer(SecretBlockerLayer) // Block secret files
.layer(ReadOnlyLayer) // Make read-only
.layer(CounterLayer); // Count operations
// Layers wrap from inside out. For a request:
// Counter (outermost) → ReadOnly → SecretBlocker → MemoryBackend (innermost)
// The innermost middleware (closest to backend) applies first to the actual operation.
}
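Ordering trips people up, so here is the wrap-and-delegate chain reduced to a std-only sketch (Svc and Tag are illustrative stand-ins for the Fs traits and a middleware, not AnyFS API):

```rust
// A minimal "service" trait: each call appends its name to a log.
trait Svc {
    fn call(&self, log: &mut Vec<&'static str>);
}

struct Backend;
impl Svc for Backend {
    fn call(&self, log: &mut Vec<&'static str>) {
        log.push("backend");
    }
}

// A middleware stand-in: intercept (log our name), then delegate inward.
struct Tag<S> {
    name: &'static str,
    inner: S,
}
impl<S: Svc> Svc for Tag<S> {
    fn call(&self, log: &mut Vec<&'static str>) {
        log.push(self.name);
        self.inner.call(log);
    }
}

fn main() {
    // Mirrors .layer(SecretBlocker).layer(ReadOnly).layer(Counter):
    // the last wrapper added is outermost and sees the request first.
    let stack = Tag {
        name: "counter",
        inner: Tag {
            name: "readonly",
            inner: Tag { name: "secret", inner: Backend },
        },
    };
    let mut log = Vec::new();
    stack.call(&mut log);
    assert_eq!(log, ["counter", "readonly", "secret", "backend"]);
    println!("{:?}", log);
}
```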
Middleware Checklist
Before publishing your middleware:
- Depends only on anyfs-backend
- Implements the same traits as the inner backend (FsRead, FsWrite, FsDir)
- Has a Layer implementation for .layer() syntax
- Documents which operations are intercepted vs delegated
- Handles errors properly (doesn’t panic)
- Is thread-safe (&self methods; use atomics/locks for state)
Summary
Middleware is just:
- A struct wrapping inner: B
- Implementing the same traits as B
- Intercepting some methods, delegating others
The three patterns:
- Side effects: Do something, then delegate
- Blocking: Check condition, return error or delegate
- Transform: Modify data on the way in/out
That’s it. Go build something useful.
“Middleware: because sometimes you need to do something between nothing and everything.”
Remote Backend Patterns
Building networked filesystem backends and clients
This guide covers patterns for exposing AnyFS backends over a network and building clients that mount remote filesystems.
Overview
A remote filesystem has three components:
┌─────────────┐ Network ┌─────────────┐ ┌─────────────┐
│ Client │ ←───────────────→ │ Server │ ──→ │ Backend │
│ (FUSE) │ RPC/REST │ (API) │ │ (Storage) │
└─────────────┘ └─────────────┘ └─────────────┘
User's Cloud SQLite/CAS
Machine Service Hybrid/etc
AnyFS backends are local by design. To go remote, you need:
- Server: Exposes backend operations over network
- Protocol: Wire format for requests/responses
- Client: Implements
Fstraits by calling server
Protocol Design
Operations to Expose
Map Fs trait methods to RPC operations:
| Trait Method | RPC Operation | Notes |
|---|---|---|
read(path) | Read(path, range?) | Support partial reads |
write(path, data) | Write(path, data) | Chunked for large files |
exists(path) | Exists(path) | Or combine with Metadata |
metadata(path) | Metadata(path) | Return full stat |
read_dir(path) | ListDir(path, cursor?) | Paginated for large dirs |
create_dir(path) | CreateDir(path) | |
create_dir_all(path) | CreateDirAll(path) | Or client-side loop |
remove_file(path) | Remove(path) | |
remove_dir(path) | RemoveDir(path) | |
remove_dir_all(path) | RemoveDirAll(path) | Recursive |
rename(from, to) | Rename(from, to) | |
copy(from, to) | Copy(from, to) | Server-side copy |
Request/Response Format
Use a simple, efficient format. Here’s a protobuf-style schema:
// requests.proto
message Request {
string request_id = 1; // For idempotency
string auth_token = 2; // Authentication
oneof operation {
ReadRequest read = 10;
WriteRequest write = 11;
MetadataRequest metadata = 12;
ListDirRequest list_dir = 13;
CreateDirRequest create_dir = 14;
RemoveRequest remove = 15;
RenameRequest rename = 16;
CopyRequest copy = 17;
}
}
message ReadRequest {
string path = 1;
optional uint64 offset = 2;
optional uint64 length = 3;
}
message WriteRequest {
string path = 1;
bytes data = 2;
bool append = 3;
}
message MetadataRequest {
string path = 1;
}
message ListDirRequest {
string path = 1;
optional string cursor = 2; // For pagination
optional uint32 limit = 3;
}
// ... other requests
message Response {
string request_id = 1;
bool success = 2;
oneof result {
ErrorResult error = 10;
ReadResult read = 11;
WriteResult write = 12;
MetadataResult metadata = 13;
ListDirResult list_dir = 14;
// ... others return empty success
}
}
message ErrorResult {
string code = 1; // "not_found", "permission_denied", etc.
string message = 2;
string path = 3;
}
message MetadataResult {
string file_type = 1; // "file", "dir", "symlink"
uint64 size = 2;
uint32 mode = 3;
optional uint64 created_at = 4;
optional uint64 modified_at = 5;
optional uint64 accessed_at = 6;
optional uint64 inode = 7;
optional uint32 nlink = 8;
}
message ListDirResult {
repeated DirEntry entries = 1;
optional string next_cursor = 2; // Null if no more
}
message DirEntry {
string name = 1;
string path = 2; // Full path to entry
string file_type = 3;
uint64 size = 4;
optional uint64 inode = 5;
}
Protocol Choices
| Protocol | Pros | Cons | Use When |
|---|---|---|---|
| gRPC | Fast, typed, streaming | Complex setup | High performance |
| REST/JSON | Simple, debuggable | Slower, no streaming | Compatibility |
| WebSocket | Bidirectional, real-time | More complex | Live updates |
| Custom TCP | Maximum control | Build everything | Special needs |
Recommendation: Start with gRPC (tonic in Rust). Fall back to REST for web clients.
Server Implementation
Basic Server Structure
#![allow(unused)]
fn main() {
use tonic::{transport::Server, Request, Response, Status};
use anyfs_backend::Fs;
use anyfs::FileStorage;
pub struct FsServer<B: Fs> {
backend: FileStorage<B>,
}
impl<B: Fs + Send + Sync + 'static> FsServer<B> {
pub fn new(backend: B) -> Self {
Self { backend: FileStorage::new(backend) }
}
pub async fn serve(self, addr: &str) -> Result<(), Box<dyn std::error::Error>> {
let addr = addr.parse()?;
Server::builder()
.add_service(FsServiceServer::new(self))
.serve(addr)
.await?;
Ok(())
}
}
#[tonic::async_trait]
impl<B: Fs + Send + Sync + 'static> FsService for FsServer<B> {
async fn read(
&self,
request: Request<ReadRequest>,
) -> Result<Response<ReadResponse>, Status> {
let req = request.into_inner();
let data = match req.length {
Some(len) => self.backend.read_range(&req.path, req.offset.unwrap_or(0), len as usize),
None => self.backend.read(&req.path),
};
match data {
Ok(bytes) => Ok(Response::new(ReadResponse {
data: bytes,
success: true,
error: None,
})),
Err(e) => Ok(Response::new(ReadResponse {
data: vec![],
success: false,
error: Some(fs_error_to_proto(e)),
})),
}
}
async fn write(
&self,
request: Request<WriteRequest>,
) -> Result<Response<WriteResponse>, Status> {
let req = request.into_inner();
let result = if req.append {
self.backend.append(&req.path, &req.data)
} else {
self.backend.write(&req.path, &req.data)
};
match result {
Ok(()) => Ok(Response::new(WriteResponse {
success: true,
error: None,
})),
Err(e) => Ok(Response::new(WriteResponse {
success: false,
error: Some(fs_error_to_proto(e)),
})),
}
}
// ... implement other methods
}
fn fs_error_to_proto(e: FsError) -> ProtoError {
match e {
FsError::NotFound { path } => ProtoError {
code: "not_found".into(),
message: "File not found".into(),
path: path.to_string_lossy().into(),
},
FsError::AlreadyExists { path, .. } => ProtoError {
code: "already_exists".into(),
message: "File already exists".into(),
path: path.to_string_lossy().into(),
},
// ... map other errors
_ => ProtoError {
code: "internal".into(),
message: e.to_string(),
path: String::new(),
},
}
}
}
Authentication Middleware
Add authentication as a tower layer:
#![allow(unused)]
fn main() {
use tonic::service::Interceptor;
use std::collections::HashSet;
use std::sync::Arc;
#[derive(Clone)]
pub struct AuthInterceptor {
valid_tokens: Arc<HashSet<String>>,
}
impl Interceptor for AuthInterceptor {
fn call(&mut self, mut request: Request<()>) -> Result<Request<()>, Status> {
let token = request
.metadata()
.get("authorization")
.and_then(|v| v.to_str().ok())
.map(|s| s.trim_start_matches("Bearer "));
match token {
Some(t) if self.valid_tokens.contains(t) => Ok(request),
_ => Err(Status::unauthenticated("Invalid or missing token")),
}
}
}
// Usage
Server::builder()
.add_service(FsServiceServer::with_interceptor(fs_server, auth_interceptor))
.serve(addr)
.await?;
}
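The interceptor's token handling, isolated as a std-only function you can run without tonic (authorize is an illustrative helper, not a tonic API):

```rust
use std::collections::HashSet;

// Strip the "Bearer " prefix and check membership in the valid-token set.
fn authorize(header: Option<&str>, valid: &HashSet<String>) -> Result<(), &'static str> {
    let token = header
        .map(|s| s.trim_start_matches("Bearer "))
        .ok_or("missing token")?;
    if valid.contains(token) {
        Ok(())
    } else {
        Err("invalid token")
    }
}

fn main() {
    let mut valid = HashSet::new();
    valid.insert("s3cr3t".to_string());
    assert!(authorize(Some("Bearer s3cr3t"), &valid).is_ok());
    assert!(authorize(Some("Bearer nope"), &valid).is_err());
    assert!(authorize(None, &valid).is_err());
    println!("auth checks pass");
}
```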
Rate Limiting
Protect against abuse:
#![allow(unused)]
fn main() {
use governor::{DefaultKeyedRateLimiter, Quota, RateLimiter};
use std::num::NonZeroU32;
pub struct RateLimitedServer<B: Fs> {
inner: FsServer<B>,
limiter: DefaultKeyedRateLimiter<String>, // Per-user keyed rate limiter
}
}
impl<B: Fs> RateLimitedServer<B> {
pub fn new(backend: B, requests_per_second: u32) -> Self {
let quota = Quota::per_second(NonZeroU32::new(requests_per_second).unwrap());
Self {
inner: FsServer::new(backend),
limiter: RateLimiter::keyed(quota),
}
}
async fn check_rate_limit(&self, user_id: &str) -> Result<(), Status> {
self.limiter
.check_key(&user_id.to_string())
.map_err(|_| Status::resource_exhausted("Rate limit exceeded"))?;
Ok(())
}
}
}
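The sketch above leans on the `governor` crate. The underlying idea, counting operations per key and rejecting once a window's budget is spent, can be shown with the standard library alone. `FixedWindowLimiter` is an illustrative name for this sketch, not an AnyFS or `governor` type:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// A minimal fixed-window, per-key rate limiter. Crates like `governor`
/// implement more sophisticated variants (token buckets, pluggable clocks).
pub struct FixedWindowLimiter {
    max_ops: u32,
    window: Duration,
    counters: HashMap<String, (Instant, u32)>,
}

impl FixedWindowLimiter {
    pub fn new(max_ops: u32, window: Duration) -> Self {
        Self { max_ops, window, counters: HashMap::new() }
    }

    /// Returns true if the operation is allowed for this key.
    pub fn check(&mut self, key: &str) -> bool {
        let now = Instant::now();
        let entry = self.counters.entry(key.to_string()).or_insert((now, 0));
        // Start a fresh window once the current one has elapsed.
        if now.duration_since(entry.0) >= self.window {
            *entry = (now, 0);
        }
        if entry.1 < self.max_ops {
            entry.1 += 1;
            true
        } else {
            false
        }
    }
}
```

Each key gets its own window, so one noisy user exhausting their budget does not affect others.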
Idempotency
Handle retried requests safely:
#![allow(unused)]
fn main() {
use std::collections::HashMap;
use std::sync::RwLock;
use std::time::{Duration, Instant};
/// `Response` stands in for the server's generated RPC response type.
pub struct IdempotencyCache {
cache: RwLock<HashMap<String, (Instant, Response)>>,
ttl: Duration,
}
impl IdempotencyCache {
pub fn new(ttl: Duration) -> Self {
Self {
cache: RwLock::new(HashMap::new()),
ttl,
}
}
/// Check if we've seen this request before.
pub fn get(&self, request_id: &str) -> Option<Response> {
let cache = self.cache.read().unwrap();
cache.get(request_id)
.filter(|(ts, _)| ts.elapsed() < self.ttl)
.map(|(_, resp)| resp.clone())
}
/// Store response for future duplicate requests.
pub fn put(&self, request_id: String, response: Response) {
let mut cache = self.cache.write().unwrap();
cache.insert(request_id, (Instant::now(), response));
}
/// Clean up expired entries (call periodically).
pub fn cleanup(&self) {
let mut cache = self.cache.write().unwrap();
cache.retain(|_, (ts, _)| ts.elapsed() < self.ttl);
}
}
}
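The cache above references the server's `Response` type, which comes from the generated protobuf code. The same TTL pattern as a self-contained sketch, made generic over the response type so it compiles without the RPC definitions (`TtlCache` is an illustrative name):

```rust
use std::collections::HashMap;
use std::sync::RwLock;
use std::time::{Duration, Instant};

/// Generic TTL-based idempotency cache: remembers a response per
/// request ID until the entry is older than `ttl`.
pub struct TtlCache<R: Clone> {
    entries: RwLock<HashMap<String, (Instant, R)>>,
    ttl: Duration,
}

impl<R: Clone> TtlCache<R> {
    pub fn new(ttl: Duration) -> Self {
        Self { entries: RwLock::new(HashMap::new()), ttl }
    }

    /// Return the cached response if it exists and has not expired.
    pub fn get(&self, request_id: &str) -> Option<R> {
        let entries = self.entries.read().unwrap();
        entries
            .get(request_id)
            .filter(|(ts, _)| ts.elapsed() < self.ttl)
            .map(|(_, resp)| resp.clone())
    }

    /// Record a response for future duplicate requests.
    pub fn put(&self, request_id: String, response: R) {
        self.entries
            .write()
            .unwrap()
            .insert(request_id, (Instant::now(), response));
    }
}
```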
Client Implementation
Remote Backend (Client-Side)
The client implements Fs traits by making RPC calls:
#![allow(unused)]
fn main() {
use anyfs_backend::{FsRead, FsWrite, FsDir, FsError, Metadata, ReadDirIter, DirEntry};
use std::path::Path;
pub struct RemoteBackend {
client: FsServiceClient<tonic::transport::Channel>,
auth_token: String,
}
impl RemoteBackend {
pub async fn connect(addr: &str, auth_token: String) -> Result<Self, FsError> {
let client = FsServiceClient::connect(addr.to_string())
.await
.map_err(|e| FsError::Backend(format!("connect failed: {}", e)))?;
Ok(Self { client, auth_token })
}
fn request<T>(&self, req: T) -> tonic::Request<T> {
let mut request = tonic::Request::new(req);
request.metadata_mut().insert(
"authorization",
format!("Bearer {}", self.auth_token).parse().unwrap(),
);
request
}
}
impl FsRead for RemoteBackend {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
// Note: This is sync, but we're calling async code
// In practice, use tokio::runtime::Handle or async traits
let rt = tokio::runtime::Handle::current();
rt.block_on(async {
let req = self.request(ReadRequest {
path: path.to_string_lossy().into(),
offset: None,
length: None,
});
let response = self.client.clone().read(req)
.await
.map_err(|e| FsError::Backend(format!("rpc failed: {}", e)))?
.into_inner();
if response.success {
Ok(response.data)
} else {
Err(proto_error_to_fs(response.error.unwrap()))
}
})
}
fn exists(&self, path: &Path) -> Result<bool, FsError> {
// Could be a dedicated RPC or use metadata
match self.metadata(path) {
Ok(_) => Ok(true),
Err(FsError::NotFound { .. }) => Ok(false),
Err(e) => Err(e),
}
}
fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
let rt = tokio::runtime::Handle::current();
rt.block_on(async {
let req = self.request(MetadataRequest {
path: path.to_string_lossy().into(),
});
let response = self.client.clone().metadata(req)
.await
.map_err(|e| FsError::Backend(format!("rpc failed: {}", e)))?
.into_inner();
if response.success {
Ok(proto_metadata_to_fs(response.metadata.unwrap()))
} else {
Err(proto_error_to_fs(response.error.unwrap()))
}
})
}
// ... other methods
}
impl FsWrite for RemoteBackend {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
let rt = tokio::runtime::Handle::current();
rt.block_on(async {
let req = self.request(WriteRequest {
path: path.to_string_lossy().into(),
data: data.to_vec(),
append: false,
});
let response = self.client.clone().write(req)
.await
.map_err(|e| FsError::Backend(format!("rpc failed: {}", e)))?
.into_inner();
if response.success {
Ok(())
} else {
Err(proto_error_to_fs(response.error.unwrap()))
}
})
}
// ... other methods
}
impl FsDir for RemoteBackend {
fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
let rt = tokio::runtime::Handle::current();
rt.block_on(async {
let mut all_entries = Vec::new();
let mut cursor = None;
// Paginate through all results
loop {
let req = self.request(ListDirRequest {
path: path.to_string_lossy().into(),
cursor: cursor.clone(),
limit: Some(1000),
});
let response = self.client.clone().list_dir(req)
.await
.map_err(|e| FsError::Backend(format!("rpc failed: {}", e)))?
.into_inner();
if !response.success {
return Err(proto_error_to_fs(response.error.unwrap()));
}
all_entries.extend(response.entries.into_iter().map(proto_entry_to_fs));
match response.next_cursor {
Some(c) => cursor = Some(c),
None => break,
}
}
Ok(ReadDirIter::new(all_entries.into_iter().map(Ok)))
})
}
// ... other methods
}
}
Caching Layer
Network calls are slow. Add caching:
#![allow(unused)]
fn main() {
use lru::LruCache; // external `lru` crate
use std::collections::HashMap;
use std::num::NonZeroUsize;
use std::path::{Path, PathBuf};
use std::sync::RwLock;
use std::time::{Duration, Instant};
/// Client-side cache for remote filesystem.
pub struct CachingBackend<B> {
inner: B,
metadata_cache: RwLock<HashMap<PathBuf, (Instant, Metadata)>>,
content_cache: RwLock<LruCache<PathBuf, Vec<u8>>>,
metadata_ttl: Duration,
max_cached_file_size: u64,
}
impl<B> CachingBackend<B> {
pub fn new(inner: B) -> Self {
Self {
inner,
metadata_cache: RwLock::new(HashMap::new()),
content_cache: RwLock::new(LruCache::new(NonZeroUsize::new(100).unwrap())), // cache up to 100 files
metadata_ttl: Duration::from_secs(5),
max_cached_file_size: 1024 * 1024, // 1 MB
}
}
/// Invalidate cache for a path (call after writes).
pub fn invalidate(&self, path: &Path) {
self.metadata_cache.write().unwrap().remove(path);
self.content_cache.write().unwrap().pop(path);
}
/// Invalidate everything under a directory.
pub fn invalidate_prefix(&self, prefix: &Path) {
let mut meta = self.metadata_cache.write().unwrap();
let mut content = self.content_cache.write().unwrap();
meta.retain(|k, _| !k.starts_with(prefix));
// LruCache doesn't have retain, so we'd need a different structure
}
}
impl<B: FsRead> FsRead for CachingBackend<B> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
// Check cache first
if let Some(data) = self.content_cache.read().unwrap().peek(path) {
return Ok(data.clone());
}
// Cache miss - fetch from remote
let data = self.inner.read(path)?;
// Cache if small enough
if data.len() as u64 <= self.max_cached_file_size {
self.content_cache.write().unwrap().put(path.to_path_buf(), data.clone());
}
Ok(data)
}
fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
// Check cache
{
let cache = self.metadata_cache.read().unwrap();
if let Some((ts, meta)) = cache.get(path) {
if ts.elapsed() < self.metadata_ttl {
return Ok(meta.clone());
}
}
}
// Cache miss
let meta = self.inner.metadata(path)?;
// Store in cache
self.metadata_cache.write().unwrap()
.insert(path.to_path_buf(), (Instant::now(), meta.clone()));
Ok(meta)
}
// ... other methods
}
impl<B: FsWrite> FsWrite for CachingBackend<B> {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
// Write through to remote
self.inner.write(path, data)?;
// Invalidate cache
self.invalidate(path);
Ok(())
}
// ... other methods - all invalidate cache after modifying
}
}
Cache Invalidation Strategies
| Strategy | How | When to Use |
|---|---|---|
| TTL | Expire after N seconds | Read-heavy, eventual consistency OK |
| Write-through | Invalidate on local write | Single client |
| Server push | WebSocket notifications | Real-time consistency |
| Version/ETag | Check version on read | Balance of consistency/perf |
Offline Mode
Handle network failures gracefully:
#![allow(unused)]
fn main() {
use anyfs::FileStorage;
use std::path::{Path, PathBuf};
use std::sync::RwLock;
use std::time::Instant;
pub struct OfflineCapableBackend<B> {
remote: FileStorage<B>,
local_cache: SqliteBackend, // Local SQLite for offline ops
mode: RwLock<ConnectionMode>,
pending_writes: RwLock<Vec<PendingWrite>>,
}
#[derive(Clone, Copy)]
enum ConnectionMode {
Online,
Offline,
Reconnecting,
}
struct PendingWrite {
path: PathBuf,
operation: WriteOperation,
timestamp: Instant,
}
enum WriteOperation {
Write(Vec<u8>),
Append(Vec<u8>),
Remove,
CreateDir,
// ...
}
impl<B: Fs> OfflineCapableBackend<B> {
fn is_online(&self) -> bool {
matches!(*self.mode.read().unwrap(), ConnectionMode::Online)
}
fn go_offline(&self) {
*self.mode.write().unwrap() = ConnectionMode::Offline;
}
fn try_reconnect(&self) -> bool {
*self.mode.write().unwrap() = ConnectionMode::Reconnecting;
// Try a simple operation
if self.remote.exists("/").is_ok() {
*self.mode.write().unwrap() = ConnectionMode::Online;
self.sync_pending_writes();
true
} else {
*self.mode.write().unwrap() = ConnectionMode::Offline;
false
}
}
fn sync_pending_writes(&self) {
let mut pending = self.pending_writes.write().unwrap();
for write in pending.drain(..) {
let result = match write.operation {
WriteOperation::Write(data) => self.remote.write(&write.path, &data),
WriteOperation::Append(data) => self.remote.append(&write.path, &data),
WriteOperation::Remove => self.remote.remove_file(&write.path),
WriteOperation::CreateDir => self.remote.create_dir(&write.path),
};
if result.is_err() {
// Put back and stop syncing
// (In practice, need conflict resolution)
break;
}
}
}
}
impl<B: FsRead> FsRead for OfflineCapableBackend<B> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
if self.is_online() {
match self.remote.read(path) {
Ok(data) => {
// Update local cache
let _ = self.local_cache.write(path, &data);
Ok(data)
}
Err(FsError::Backend(_)) => {
// Network error - go offline, try cache
self.go_offline();
self.local_cache.read(path)
}
Err(e) => Err(e),
}
} else {
// Offline - use cache
self.local_cache.read(path)
}
}
}
impl<B: FsWrite> FsWrite for OfflineCapableBackend<B> {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
// Always write to local cache
self.local_cache.write(path, data)?;
if self.is_online() {
match self.remote.write(path, data) {
Ok(()) => Ok(()),
Err(FsError::Backend(_)) => {
// Network error - queue for later
self.go_offline();
self.pending_writes.write().unwrap().push(PendingWrite {
path: path.to_path_buf(),
operation: WriteOperation::Write(data.to_vec()),
timestamp: Instant::now(),
});
Ok(()) // Return success - we wrote locally
}
Err(e) => Err(e),
}
} else {
// Offline - queue for later sync
self.pending_writes.write().unwrap().push(PendingWrite {
path: path.to_path_buf(),
operation: WriteOperation::Write(data.to_vec()),
timestamp: Instant::now(),
});
Ok(())
}
}
}
}
Conflict Resolution
When syncing offline writes, conflicts can occur:
#![allow(unused)]
fn main() {
enum ConflictResolution {
/// Server version wins (discard local changes)
ServerWins,
/// Client version wins (overwrite server)
ClientWins,
/// Keep both (rename local to .conflict)
KeepBoth,
/// Ask user
Manual,
}
fn resolve_conflict(
remote: &impl FsWrite,
path: &Path,
local_data: &[u8],
server_data: &[u8],
strategy: ConflictResolution,
) -> Result<(), FsError> {
match strategy {
ConflictResolution::ServerWins => {
// Discard local changes; the server version stays in place
Ok(())
}
ConflictResolution::ClientWins => {
// Overwrite server with local
remote.write(path, local_data)
}
ConflictResolution::KeepBoth => {
// Keep the server version; write the local copy to `<path>.conflict`
let conflict_path = PathBuf::from(format!("{}.conflict", path.display()));
remote.write(&conflict_path, local_data)?;
Ok(())
}
ConflictResolution::Manual => {
Err(FsError::Conflict { path: path.to_path_buf() })
}
}
}
}
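The strategy decision itself is a pure function, which makes it easy to unit-test separately from any backend. A sketch under the assumption that `KeepBoth` places the local copy at `<path>.conflict`; the `Resolved` type and `resolve` function are illustrative names, not part of the AnyFS API:

```rust
/// Outcome of resolving a sync conflict (illustrative sketch).
#[derive(Debug, PartialEq)]
pub enum Resolved {
    /// Keep the server bytes at the original path.
    Server(Vec<u8>),
    /// Push the client bytes to the original path.
    Client(Vec<u8>),
    /// Keep server bytes at the path, client bytes at `<path>.conflict`.
    Both { server: Vec<u8>, client_copy_path: String },
}

pub fn resolve(path: &str, local: &[u8], server: &[u8], keep_both: bool) -> Resolved {
    if local == server {
        // Identical content: no conflict at all.
        return Resolved::Server(server.to_vec());
    }
    if keep_both {
        Resolved::Both {
            server: server.to_vec(),
            client_copy_path: format!("{}.conflict", path),
        }
    } else {
        // This sketch defaults to client-wins when not keeping both.
        Resolved::Client(local.to_vec())
    }
}
```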
FUSE Client
Mount the remote filesystem locally using FUSE:
#![allow(unused)]
fn main() {
use bimap::BiMap; // external `bimap` crate for the inode <-> path mapping
use fuser::{Filesystem, MountOption, Request, ReplyData, ReplyEntry, ReplyAttr, ReplyDirectory};
use std::ffi::OsStr;
use std::path::{Path, PathBuf};
use std::sync::RwLock;
use std::sync::atomic::AtomicU64;
use std::time::Duration;
pub struct RemoteFuse<B: Fs> {
backend: B,
// Inode management for FUSE
inodes: RwLock<BiMap<u64, PathBuf>>,
next_inode: AtomicU64,
}
impl<B: Fs> Filesystem for RemoteFuse<B> {
fn lookup(&mut self, _req: &Request, parent: u64, name: &OsStr, reply: ReplyEntry) {
let parent_path = self.inode_to_path(parent);
let path = parent_path.join(name);
match self.backend.metadata(&path) {
Ok(meta) => {
let inode = self.path_to_inode(&path);
let attr = metadata_to_fuse_attr(inode, &meta);
reply.entry(&Duration::from_secs(1), &attr, 0);
}
Err(_) => reply.error(libc::ENOENT),
}
}
fn read(
&mut self,
_req: &Request,
ino: u64,
_fh: u64,
offset: i64,
size: u32,
_flags: i32,
_lock_owner: Option<u64>,
reply: ReplyData,
) {
let path = self.inode_to_path(ino);
match self.backend.read_range(&path, offset as u64, size as usize) {
Ok(data) => reply.data(&data),
Err(_) => reply.error(libc::EIO),
}
}
fn write(
&mut self,
_req: &Request,
ino: u64,
_fh: u64,
offset: i64,
data: &[u8],
_write_flags: u32,
_flags: i32,
_lock_owner: Option<u64>,
reply: fuser::ReplyWrite,
) {
let path = self.inode_to_path(ino);
// For simplicity, read-modify-write
// (Real impl would use open_write with seeking)
match self.backend.read(&path) {
Ok(mut content) => {
let offset = offset as usize;
if offset > content.len() {
content.resize(offset, 0);
}
if offset + data.len() > content.len() {
content.resize(offset + data.len(), 0);
}
content[offset..offset + data.len()].copy_from_slice(data);
match self.backend.write(&path, &content) {
Ok(()) => reply.written(data.len() as u32),
Err(_) => reply.error(libc::EIO),
}
}
Err(_) => reply.error(libc::EIO),
}
}
fn readdir(
&mut self,
_req: &Request,
ino: u64,
_fh: u64,
offset: i64,
mut reply: ReplyDirectory,
) {
let path = self.inode_to_path(ino);
match self.backend.read_dir(&path) {
Ok(entries) => {
let entries: Vec<_> = entries.filter_map(|e| e.ok()).collect();
for (i, entry) in entries.iter().enumerate().skip(offset as usize) {
let child_path = path.join(&entry.name);
let child_inode = self.path_to_inode(&child_path);
let file_type = match entry.file_type {
FileType::File => fuser::FileType::RegularFile,
FileType::Directory => fuser::FileType::Directory,
FileType::Symlink => fuser::FileType::Symlink,
};
if reply.add(child_inode, (i + 1) as i64, file_type, &entry.name) {
break; // Buffer full
}
}
reply.ok();
}
Err(_) => reply.error(libc::EIO),
}
}
// ... implement other FUSE methods
}
// Mount the remote filesystem
pub fn mount_remote(backend: impl Fs, mountpoint: &Path) -> Result<(), Box<dyn Error>> {
let fuse = RemoteFuse::new(backend);
fuser::mount2(
fuse,
mountpoint,
&[
MountOption::RO, // Or RW
MountOption::FSName("anyfs-remote".to_string()),
MountOption::AutoUnmount,
],
)?;
Ok(())
}
}
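The read-modify-write in the FUSE `write` handler above boils down to an offset splice: grow the buffer if the write starts or ends past EOF, then overwrite in place. As a standalone function (illustrative helper, not part of the AnyFS API):

```rust
/// Overwrite `data.len()` bytes of `content` starting at `offset`,
/// zero-filling any gap if `offset` is past the current end
/// (matching sparse-file semantics).
pub fn splice_at(content: &mut Vec<u8>, offset: usize, data: &[u8]) {
    if offset > content.len() {
        // Writing past EOF: pad the gap with zeros.
        content.resize(offset, 0);
    }
    if offset + data.len() > content.len() {
        // Extend the buffer so the slice assignment below fits.
        content.resize(offset + data.len(), 0);
    }
    content[offset..offset + data.len()].copy_from_slice(data);
}
```

Factoring this out keeps the FUSE callback itself thin and makes the edge cases (write past EOF, write extending the file) independently testable.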
Summary: Building a Cloud Filesystem
To build a complete cloud filesystem service:
Server Side
- Wrap your backend (e.g., `IndexedBackend` or custom) with middleware
- Expose via gRPC/REST server
- Add authentication, rate limiting, idempotency
Client Side
- Implement `RemoteBackend` that calls server RPCs
- Wrap with `CachingBackend` for performance
- Optionally add `OfflineCapableBackend` for offline support
- Mount via FUSE for native OS integration
Architecture
┌─────────────────────────────────────────────────────────────┐
│ Client Machine │
│ ┌─────────┐ ┌────────────┐ ┌───────────────────┐ │
│ │ FUSE │ → │ Caching │ → │ RemoteBackend │ │
│ │ Mount │ │ Backend │ │ (RPC Client) │ │
│ └─────────┘ └────────────┘ └─────────┬─────────┘ │
└────────────────────────────────────────────│───────────────┘
│ Network
┌────────────────────────────────────────────│───────────────┐
│ Server ▼ │
│ ┌─────────────────┐ ┌─────────────────────────────┐ │
│ │ RPC Server │ → │ Middleware Stack │ │
│ │ (Auth, Rate) │ │ Quota → Tracing → Backend │ │
│ └─────────────────┘ └─────────────┬───────────────┘ │
│ ▼ │
│ ┌─────────────────────────────┐ │
│ │ IndexedBackend │ │
│ │ SQLite Index + Disk Blobs │ │
│ └─────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
This gives you a complete cloud filesystem with:
- Native OS mounting (FUSE)
- Offline support
- Caching for performance
- Server-side quotas and logging
- Large file streaming performance
AnyFS: Comparison, Positioning & Honest Assessment
A comprehensive look at why AnyFS exists, how it compares, and where it falls short
Origin Story
AnyFS didn’t start as a filesystem abstraction. It started as a security problem.
The Path Security Problem
While exploring filesystem security, I created the strict-path crate to ensure that externally-sourced paths could never escape their boundaries. The approach: resolve a boundary path, resolve the provided path, and validate containment.
This proved far more challenging than expected. Attack vectors kept appearing:
- Symlinks pointing outside the boundary
- Windows junction points
- NTFS Alternate Data Streams (`file.txt:hidden:$DATA`)
- Windows 8.3 short names (`PROGRA~1`)
- Linux `/proc` magic symlinks that escape namespaces
- Unicode normalization tricks (NFC vs NFD)
- URL-encoded traversal (`%2e%2e`)
- TOCTOU race conditions
Eventually, strict-path addressed 19+ attack vectors, making it (apparently) comprehensive. But it came with costs:
- I/O overhead - Real filesystem resolution is expensive
- Existing paths only - `std::fs::canonicalize` requires paths to exist
- Residual TOCTOU risk - A symlink created between verification and operation (extremely rare, but possible)
The SQLite Revelation
Then a new idea emerged: What if the filesystem didn’t exist on disk at all?
A SQLite-backed virtual filesystem would:
- Eliminate path security issues - Paths are just database keys, not real files
- Be fully portable - A tenant’s entire filesystem in one `.db` file
- Have no TOCTOU - Database transactions are atomic
- Work on non-existing paths - No canonicalization needed
The Abstraction Need
But then: What if I wanted to switch from SQLite to something else later?
I didn’t want to rewrite code just to explore different backends. I needed an abstraction.
The Framework Vision
Research revealed that existing VFS solutions were either:
- Too simple - Just swappable backends, no policies
- Too fixed - Specific to one use case (AI agents, archives, etc.)
- Insecure - Basic `..` traversal prevention, missing 17+ attack vectors
My niche is security: isolating filesystems, limiting actions, controlling resources.
The Tower/Axum pattern for HTTP showed how to compose middleware elegantly. Why not apply the same pattern to filesystems?
Thus AnyFS: A composable middleware framework for filesystem operations.
The Landscape: What Already Exists
Rust Ecosystem
| Library | Stars | Downloads | Purpose |
|---|---|---|---|
| `vfs` | 464 | 1,700+ deps | Swappable filesystem backends |
| `virtual-filesystem` | ~30 | ~260/mo | Backends with basic sandboxing |
| `AgentFS` | New | Alpha | AI agent state management |
Other Languages
| Library | Language | Strength |
|---|---|---|
| fsspec | Python | Async, caching, 20+ backends |
| PyFilesystem2 | Python | Clean URL-based API |
| Afero | Go | Composition patterns |
| Apache Commons VFS | Java | Enterprise, many backends |
| System.IO.Abstractions | .NET | Testing, mirrors System.IO |
Honest Comparison
What Others Do Well
vfs crate:
- Mature (464 stars, 1,700+ dependent projects)
- Multiple backends (Memory, Physical, Overlay, Embedded)
- Async support (though being sunset)
- Simple, focused API
virtual-filesystem:
- ZIP/TAR archive support
- Mountable filesystem
- Basic sandboxing attempt
AgentFS:
- Purpose-built for AI agents
- SQLite backend with FUSE mounting
- Key-value store included
- Audit trail built-in
- Backed by Turso (funded company)
- TypeScript/Python SDKs
fsspec (Python):
- Block-wise caching (not just whole-file)
- Async-first design
- Excellent data science integration
What Others Do Poorly
Security in existing solutions is inadequate.
I examined virtual-filesystem’s SandboxedPhysicalFS. Here’s their entire security implementation:
#![allow(unused)]
fn main() {
impl PathResolver for SandboxedPathResolver {
fn resolve_path(root: &Path, path: &str) -> Result<PathBuf> {
let root = root.canonicalize()?;
let host_path = root.join(make_relative(path)).canonicalize()?;
if !host_path.starts_with(root) {
return Err(io::Error::new(ErrorKind::PermissionDenied, "Traversal prevented"));
}
Ok(host_path)
}
}
}
That’s it. ~10 lines covering 2 out of 19+ attack vectors.
| Attack Vector | virtual-filesystem | strict-path |
|---|---|---|
| Basic `..` traversal | ✅ | ✅ |
| Symlink following | ✅ | ✅ |
| NTFS Alternate Data Streams | ❌ | ✅ |
| Windows 8.3 short names | ❌ | ✅ |
| Unicode normalization | ❌ | ✅ |
| TOCTOU race conditions | ❌ | ✅ |
| Non-existing paths | ❌ FAILS | ✅ |
| URL-encoded traversal | ❌ | ✅ |
| Windows UNC paths | ❌ | ✅ |
| Linux /proc magic symlinks | ❌ | ✅ |
| Null byte injection | ❌ | ✅ |
| Unicode direction override | ❌ | ✅ |
| Windows reserved names | ❌ | ✅ |
| Junction point escapes | ❌ | ✅ |
| Coverage | 2/19 | 19/19 |
The vfs crate’s AltrootFS is similarly basic - just path prefix translation.
No middleware composition exists anywhere.
None of the filesystem libraries offer Tower-style middleware. You can’t do something like:
#![allow(unused)]
fn main() {
// Hypothetical - doesn't exist in other libraries
backend
.layer(QuotaLayer)
.layer(RateLimitLayer)
.layer(TracingLayer)
}
If you want quotas in vfs, you’d have to build it INTO each backend. Then build it again for the next backend.
What Makes AnyFS Unique
1. Middleware Composition (Nobody Else Has This)
#![allow(unused)]
fn main() {
let fs = MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024) // 100 MB
.build())
.layer(RateLimitLayer::builder()
.max_ops(100)
.per_second()
.build())
.layer(PathFilterLayer::builder()
.allow("/workspace/**")
.deny("/workspace/.git/**")
.build())
.layer(TracingLayer::new());
}
Add, remove, or reorder middleware without touching backends. Write middleware once, use with any backend.
2. Type-Safe Domain Separation (User-Defined Wrappers)
#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend; // Ecosystem crate
// Users who need type-safe domain separation can create wrapper types
struct SandboxFs(FileStorage<MemoryBackend>);
struct UserDataFs(FileStorage<SqliteBackend>);
let sandbox = SandboxFs(FileStorage::new(memory_backend));
let userdata = UserDataFs(FileStorage::new(sqlite_backend));
fn process_sandbox(fs: &SandboxFs) { ... }
process_sandbox(&sandbox); // OK
process_sandbox(&userdata); // COMPILE ERROR - different type
}
Compile-time prevention of mixing storage domains via user-defined wrapper types.
3. Backend-Agnostic Policies (Nobody Else Has This)
| Middleware | Function | Works on ANY backend |
|---|---|---|
| `Quota<B>` | Size/count limits | ✅ |
| `RateLimit<B>` | Ops per second | ✅ |
| `PathFilter<B>` | Path-based access control | ✅ |
| `Restrictions<B>` | Disable operations | ✅ |
| `Tracing<B>` | Audit logging | ✅ |
| `ReadOnly<B>` | Block all writes | ✅ |
| `Cache<B>` | LRU caching | ✅ |
| `Overlay<B1,B2>` | Union filesystem | ✅ |
4. Comprehensive Security Testing
The planned conformance test suite targets 50+ security tests covering:
- Path traversal (URL-encoded, backslash, mixed)
- Symlink attacks (escape, loops, TOCTOU)
- Platform-specific (NTFS ADS, 8.3 names, /proc)
- Unicode (normalization, RTL override, homoglyphs)
- Resource exhaustion
Derived from vulnerabilities in Apache Commons VFS, Afero, PyFilesystem2, and our own strict-path research.
Honest Downsides of AnyFS
1. We’re New, They’re Established
| Metric | vfs | AnyFS |
|---|---|---|
| Stars | 464 | 0 (new) |
| Dependent projects | 1,700+ | 0 (new) |
| Years maintained | 5+ | New |
| Contributors | 17 | 1 |
Reality: The vfs crate works fine for 90% of use cases. If you just need swappable backends for testing, vfs is battle-tested.
2. Complexity vs Simplicity
#![allow(unused)]
fn main() {
// vfs: Simple
let fs = MemoryFS::new();
fs.create_file("test.txt")?.write_all(b"hello")?;
// AnyFS: More setup if you use middleware
let backend = MemoryBackend::new()
.layer(QuotaLayer::builder().max_total_size(1024 * 1024).build()); // 1 MB
let fs = FileStorage::new(backend);
fs.write("/test.txt", b"hello")?;
}
If you don’t need middleware, AnyFS adds conceptual overhead.
3. Sync-Only (For Now)
AnyFS is sync-first. In an async-dominated ecosystem (Tokio, etc.), this may limit adoption.
fsspec (Python) and OpenDAL (Rust) are async-first. We’re not.
Mitigation: ADR-024 plans async support. Our Send + Sync bounds enable spawn_blocking wrappers today.
4. AgentFS Has Momentum for AI Agents
If you’re building AI agents specifically:
| Feature | AgentFS | AnyFS |
|---|---|---|
| SQLite backend | ✅ | ✅ |
| FUSE mounting | ✅ | Planned |
| Key-value store | ✅ | ❌ (different abstraction) |
| Tool call auditing | ✅ Built-in | Via Tracing middleware |
| TypeScript SDK | ✅ | ❌ |
| Python SDK | Coming | ❌ |
| Corporate backing | Turso | None |
AgentFS is purpose-built for AI agents with corporate resources. We’re a general-purpose framework.
5. Performance Overhead
Middleware composition has costs:
- Each layer adds a function call
- Quota tracking requires size accounting
- Rate limiting needs timestamp checks
For hot paths with millions of ops/second, this matters. For normal usage, it doesn’t.
6. Real Filesystem Security Has Limits
For VRootFsBackend (wrapping real filesystem):
- Still has I/O costs for path resolution
- Residual TOCTOU risk (extremely rare)
strict-pathcovers 19 vectors, but unknown unknowns exist
Virtual backends (Memory, SQLite) don’t have these issues - paths are just keys.
Feature Matrix
| Feature | AnyFS | vfs | virtual-fs | AgentFS | OpenDAL |
|---|---|---|---|---|---|
| Composable middleware | ✅ | ❌ | ❌ | ❌ | ✅ |
| Multiple backends | ✅ | ✅ | ✅ | ❌ | ✅ |
| SQLite backend | ✅ | ❌ | ❌ | ✅ | ❌ |
| Memory backend | ✅ | ✅ | ✅ | ❌ | ✅ |
| Quota enforcement | ✅ | ❌ | ❌ | ❌ | ❌ |
| Rate limiting | ✅ | ❌ | ❌ | ❌ | ❌ |
| Type-safe wrappers | ✅* | ❌ | ❌ | ❌ | ❌ |
| Path sandboxing | ✅ | Basic | Basic (2 vectors) | ❌ | ❌ |
| Async API | 🔜 | Partial | ❌ | ❌ | ✅ |
| std::fs-aligned API | ✅ | Custom | ✅ | ✅ | Custom |
| FUSE mounting | MVP scope | ❌ | ❌ | ✅ | ❌ |
| Conformance tests | Planned (80+) | Unknown | Unknown | Unknown | Unknown |
When to Use AnyFS
Good Fit
- Multi-tenant SaaS - Per-tenant quotas, path isolation, rate limiting
- Untrusted input sandboxing - Comprehensive path security
- Policy-heavy environments - When you need composable rules
- Backend flexibility - When you might swap storage later
- Type-safe domain separation - When mixing containers is dangerous
Not a Good Fit
- Simple testing - `vfs` is simpler if you just need a mock FS
- AI agent runtime - AgentFS has more features for that specific use case
- Cloud storage - OpenDAL is async-first with cloud backends
- Async-first codebases - Wait for AnyFS async support
- Must mount filesystem - Use `anyfs` with the `fuse`/`winfsp` feature flags
Summary
AnyFS exists because:
- Existing VFS libraries have basic, inadequate security (2/19 attack vectors)
- No filesystem library offers middleware composition
- No filesystem library offers type-safe domain separation
- Policy enforcement (quotas, rate limits, path filtering) doesn’t exist elsewhere
AnyFS is honest about:
- We’re new, `vfs` is established
- We add complexity if you don’t need middleware
- We’re sync-only for now
- AgentFS has more resources for AI-specific use cases
AnyFS is positioned as:
“Tower for filesystems” - Composable middleware over pluggable backends, with comprehensive security testing.
Sources
- vfs crate
- virtual-filesystem crate
- AgentFS
- strict-path
- soft-canonicalize
- In-Memory Filesystems in Rust - Performance analysis
- Rust Forum: Virtual Filesystems
- Prior Art Analysis - Detailed vulnerability research
Security Considerations
Security model, threat analysis, and containment guarantees
Overview
AnyFS is designed with security as a primary concern. Security policies are enforced via composable middleware, not hardcoded in backends or the container wrapper.
Threat Model
In Scope (Mitigated by Middleware)
| Threat | Description | Middleware |
|---|---|---|
| Path traversal | Access files outside allowed paths | PathFilter |
| Symlink attacks | Use symlinks to bypass controls | Backend-dependent (see below) |
| Resource exhaustion | Fill storage or create excessive files | Quota |
| Runaway processes | Excessive operations consuming resources | RateLimit |
| Unauthorized writes | Modifications to read-only data | ReadOnly |
| Sensitive file access | Access to .env, secrets, etc. | PathFilter |
Out of Scope
| Threat | Reason |
|---|---|
| Side-channel attacks | Requires OS-level mitigations |
| Physical access | Disk encryption is application’s responsibility |
| SQLite vulnerabilities | Upstream dependency; update regularly |
| Network attacks | AnyFS is local storage, not network-facing |
Security Architecture
1. Middleware-Based Policy
Security policies are composable middleware layers:
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, PathFilterLayer, RateLimitLayer, TracingLayer};
let secure_backend = MemoryBackend::new()
.layer(QuotaLayer::builder() // Limit resources
.max_total_size(100 * 1024 * 1024)
.build())
.layer(PathFilterLayer::builder() // Sandbox paths
.allow("/workspace/**")
.deny("**/.env")
.deny("**/secrets/**")
.build())
.layer(RateLimitLayer::builder() // Throttle operations
.max_ops(1000)
.per_second()
.build())
.layer(TracingLayer::new()); // Audit trail
}
2. Path Sandboxing (PathFilter)
PathFilter middleware restricts path access using glob patterns:
#![allow(unused)]
fn main() {
PathFilterLayer::builder()
.allow("/workspace/**") // Allow workspace access
.deny("**/.env") // Block .env files
.deny("**/secrets/**") // Block secrets directories
.deny("**/*.key") // Block key files
.build()
.layer(backend)
}
Guarantees:
- First matching rule wins
- No rule = denied (deny by default)
- `read_dir` filters denied entries from results
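The first-match-wins, deny-by-default evaluation described above can be sketched with plain code. Real `PathFilter` rules are glob patterns; this dependency-free sketch uses simple prefix matching purely to show the rule-ordering semantics (the `Rule` type and `is_allowed` function are illustrative, not the AnyFS API):

```rust
/// A single access rule over a path prefix (glob matching omitted).
pub enum Rule {
    Allow(&'static str),
    Deny(&'static str),
}

/// Evaluate rules in order; the first matching rule decides.
/// If no rule matches, the path is denied (deny by default).
pub fn is_allowed(rules: &[Rule], path: &str) -> bool {
    for rule in rules {
        match rule {
            Rule::Allow(prefix) if path.starts_with(*prefix) => return true,
            Rule::Deny(prefix) if path.starts_with(*prefix) => return false,
            _ => {}
        }
    }
    false // no rule matched: denied
}
```

Note that rule order matters: a deny for `/workspace/.git` must come before a broad allow for `/workspace`, or the allow will match first.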
3. Symlink Capability via Trait Bounds
Symlink/hard-link capability is determined by trait bounds, not middleware:
#![allow(unused)]
fn main() {
// MemoryBackend implements FsLink → symlinks work
let fs = FileStorage::new(MemoryBackend::new());
fs.symlink("/target", "/link")?; // ✅ Works
// Custom backend without FsLink → symlinks won't compile
let fs = FileStorage::new(MySimpleBackend::new());
fs.symlink("/target", "/link")?; // ❌ Compile error
}
If you don’t want symlinks: Use a backend that doesn’t implement FsLink.
The Restrictions middleware only controls permission operations:
#![allow(unused)]
fn main() {
RestrictionsLayer::builder()
.deny_permissions() // Block set_permissions() calls
.build()
.layer(backend)
}
Use cases:
- Sandboxing untrusted code (block permission changes)
- Read-only-ish environments (block permission mutations)
4. Resource Limits (Quota)
Quota middleware enforces capacity limits:
#![allow(unused)]
fn main() {
QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024) // 100 MB total
.max_file_size(10 * 1024 * 1024) // 10 MB per file
.max_node_count(10_000) // Max files/dirs
.max_dir_entries(1_000) // Max per directory
.max_path_depth(64) // Max nesting
.build()
.layer(backend)
}
Guarantees:
- Writes rejected when limits exceeded
- Streaming writes tracked via `CountingWriter`
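The accounting such a layer performs before delegating a write can be shown in a few lines. A minimal sketch, with only the total-size and per-file limits, using an illustrative `QuotaState` type rather than the actual Quota middleware:

```rust
/// Tracks bytes used against two of the limits above.
pub struct QuotaState {
    max_total_size: u64,
    max_file_size: u64,
    used: u64,
}

impl QuotaState {
    pub fn new(max_total_size: u64, max_file_size: u64) -> Self {
        Self { max_total_size, max_file_size, used: 0 }
    }

    /// Check a pending write of `len` bytes; on success, account for it.
    /// The real middleware would only then delegate to the inner backend.
    pub fn try_write(&mut self, len: u64) -> Result<(), &'static str> {
        if len > self.max_file_size {
            return Err("file exceeds max_file_size");
        }
        if self.used + len > self.max_total_size {
            return Err("write exceeds max_total_size");
        }
        self.used += len;
        Ok(())
    }
}
```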
5. Rate Limiting (RateLimit)
RateLimit middleware throttles operations:
#![allow(unused)]
fn main() {
RateLimitLayer::builder()
.max_ops(1000)
.per_second()
.build()
.layer(backend)
}
Guarantees:
- Operations rejected when limit exceeded
- Protects against runaway processes
6. Backend-Level Containment
Different backends achieve containment differently:
| Backend | Containment Mechanism |
|---|---|
| `MemoryBackend` | Isolated in process memory |
| `SqliteBackend` | Each container is a separate `.db` file |
| `IndexedBackend` | SQLite index + isolated blob directory (UUID-named blobs) |
| `StdFsBackend` | None - full filesystem access (do NOT use with untrusted input) |
| `VRootFsBackend` | Uses `strict-path::VirtualRoot` to contain paths |
⚠️ Warning:
PathFiltermiddleware onStdFsBackenddoes NOT provide sandboxing. The OS still resolves paths (including symlinks) beforePathFiltercan check them. For path containment with real filesystems, useVRootFsBackend.
7. Why Virtual Backends Are Inherently Safe
For MemoryBackend and SqliteBackend, the underlying storage is isolated from the host filesystem. There is no OS filesystem to exploit - paths operate entirely within the virtual structure.
Path resolution is symlink-aware but contained: FileStorage resolves paths by walking the virtual directory structure (using metadata() and read_link() on the backend), not the OS filesystem:
Virtual backend symlink example:
/foo/bar where bar → /other/place
/foo/bar/.. resolves to /other (following the symlink target's parent)
This is correct filesystem semantics - but it happens entirely within
the virtual structure. There is no host filesystem to escape to.
This means:
- No host filesystem access - symlinks point to paths within the virtual structure only
- No TOCTOU via OS state - resolution uses the backend’s own data
- Controlled by `PathResolver` - the default `IterativeResolver` follows symlinks when `FsLink` is available; custom resolvers can implement different behaviors
For VRootFsBackend (real filesystem), strict-path::VirtualRoot provides equivalent guarantees by validating and containing all paths before they reach the OS.
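This contained resolution can be sketched with an in-memory node table standing in for a real backend (names here are hypothetical; the actual `IterativeResolver` works against the `Fs` trait's `metadata()`/`read_link()` methods):

```rust
use std::collections::HashMap;

/// Stand-in node table: each absolute path maps to a directory or a symlink.
enum Node {
    Dir,
    Symlink(String),
}

/// Iterative, symlink-aware resolution over a purely virtual tree. Every
/// lookup consults our own table - the host filesystem is never touched.
fn resolve(nodes: &HashMap<String, Node>, path: &str) -> Option<String> {
    let mut resolved: Vec<String> = Vec::new();
    // Components still to process, reversed so we can pop from the end.
    let mut pending: Vec<String> = path
        .split('/')
        .filter(|c| !c.is_empty())
        .rev()
        .map(String::from)
        .collect();
    let mut hops = 0;
    while let Some(comp) = pending.pop() {
        match comp.as_str() {
            "." => {}
            ".." => {
                resolved.pop();
            }
            _ => {
                resolved.push(comp);
                let current = format!("/{}", resolved.join("/"));
                if let Some(Node::Symlink(target)) = nodes.get(&current) {
                    hops += 1;
                    if hops > 40 {
                        return None; // symlink loop guard
                    }
                    resolved.pop(); // replace the link with its target
                    if target.starts_with('/') {
                        resolved.clear(); // absolute target restarts from root
                    }
                    for c in target.split('/').filter(|c| !c.is_empty()).rev() {
                        pending.push(c.to_string());
                    }
                }
            }
        }
    }
    Some(format!("/{}", resolved.join("/")))
}
```

This reproduces the `/foo/bar/..` → `/other` semantics above: the symlink is followed first, then `..` is applied to the target's parent.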
8. Symlink Security: Virtual vs Real Backends
The security concern with symlinks is following them, not creating them.
Symlinks are just data. Creating /sandbox/link -> /etc/passwd is harmless. The danger is when reading /sandbox/link follows the symlink and accesses /etc/passwd.
| Backend Type | Symlink Creation | Symlink Following |
|---|---|---|
| `MemoryBackend` | Supported (`FsLink`) | `FileStorage` resolves (non-`SelfResolving`) |
| `SqliteBackend` | Supported (`FsLink`) | `FileStorage` resolves (non-`SelfResolving`) |
| `VRootFsBackend` | Supported (`FsLink`) | OS controls - `strict-path` prevents escapes |
Virtual Backends (Memory, SQLite)
Virtual backends that implement FsLink follow symlinks during FileStorage resolution. Symlink capability is determined by trait bounds:
- `MemoryBackend`: `FsLink` → supports symlinks
- `SqliteBackend`: `FsLink` → supports symlinks
- Custom backend without `FsLink` → no symlinks (compile-time enforced)
If you need symlink-free behavior, use a backend that does not implement FsLink.
This is the actual security feature - controlling whether symlinks are even possible via trait bounds.
Real Filesystem Backend (VRootFsBackend)
VRootFsBackend calls OS functions (std::fs::read(), etc.) which follow symlinks automatically. We cannot control this - the OS does the symlink resolution, not us.
strict-path::VirtualRoot prevents escapes:
User requests: /sandbox/link
link -> ../../../etc/passwd
strict-path: canonicalize(/sandbox/link) = /etc/passwd
strict-path: /etc/passwd is NOT within /sandbox → DENIED
This is “follow and verify containment” - symlinks are followed by the OS, but escapes are blocked by strict-path.
Limitation: Symlinks within the jail are followed. We cannot disable this without implementing custom path resolution (TOCTOU risk) or platform-specific hacks.
Summary
| Concern | Virtual Backend | VRootFsBackend |
|---|---|---|
| Symlink creation | Supported (FsLink) | Supported (FsLink) |
| Symlink following | FileStorage resolves (non-SelfResolving) | OS controls (strict-path prevents escapes) |
| Jail escape via symlink | No host FS to escape | Prevented by strict-path |
Secure Usage Patterns
AI Agent Sandbox
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, PathFilterLayer, RateLimitLayer, TracingLayer, FileStorage};
let sandbox = MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(50 * 1024 * 1024)
.max_file_size(5 * 1024 * 1024)
.build())
.layer(PathFilterLayer::builder()
.allow("/workspace/**")
.deny("**/.env")
.deny("**/secrets/**")
.build())
.layer(RateLimitLayer::builder()
.max_ops(1000)
.per_second()
.build())
.layer(TracingLayer::new());
let fs = FileStorage::new(sandbox);
// Agent code can only access /workspace, limited resources, audited
// Note: MemoryBackend implements FsLink, so symlinks work if needed
}
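The allow/deny semantics in the stack above can be sketched as deny-wins matching. This is an illustrative simplification: prefix/substring checks stand in for real glob patterns like `**/.env`, and the struct shape is an assumption.

```rust
/// Illustrative PathFilter semantics: deny rules win over allow rules,
/// and anything not explicitly allowed is rejected.
struct PathFilter {
    allow_prefixes: Vec<String>,
    deny_substrings: Vec<String>,
}

impl PathFilter {
    fn is_allowed(&self, path: &str) -> bool {
        if self.deny_substrings.iter().any(|d| path.contains(d.as_str())) {
            return false; // deny takes precedence
        }
        // Default-deny: only explicitly allowed prefixes pass.
        self.allow_prefixes.iter().any(|a| path.starts_with(a.as_str()))
    }
}
```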
Multi-Tenant Isolation
#![allow(unused)]
fn main() {
use anyfs::{QuotaLayer, FileStorage, Fs};
use anyfs_sqlite::SqliteBackend; // Ecosystem crate
fn create_tenant_storage(tenant_id: &str, quota_bytes: u64) -> FileStorage<impl Fs> {
let db_path = format!("tenants/{}.db", tenant_id);
let backend = QuotaLayer::builder()
.max_total_size(quota_bytes)
.build()
.layer(SqliteBackend::open(&db_path).unwrap());
FileStorage::new(backend)
}
// Complete isolation: separate database files
}
Read-Only Browsing
#![allow(unused)]
fn main() {
use anyfs::{VRootFsBackend, ReadOnly, FileStorage};
let readonly_fs = FileStorage::new(
ReadOnly::new(VRootFsBackend::new("/var/archive")?)
);
// All write operations return FsError::ReadOnly
}
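The ReadOnly middleware reduces to a thin wrapper that delegates reads and rejects writes. A std-only sketch with stand-in traits (the real `FsRead`/`FsWrite`/`FsError` live in `anyfs-backend` and differ in detail):

```rust
/// Stand-ins for the real anyfs-backend types, for illustration only.
#[derive(Debug, PartialEq)]
enum FsError {
    ReadOnly,
}

trait FsRead {
    fn read(&self, path: &str) -> Result<Vec<u8>, FsError>;
}
trait FsWrite {
    fn write(&mut self, path: &str, data: &[u8]) -> Result<(), FsError>;
}

/// ReadOnly wrapper: reads pass through, every write is rejected.
struct ReadOnly<B>(B);

impl<B: FsRead> FsRead for ReadOnly<B> {
    fn read(&self, path: &str) -> Result<Vec<u8>, FsError> {
        self.0.read(path)
    }
}

impl<B> FsWrite for ReadOnly<B> {
    fn write(&mut self, _path: &str, _data: &[u8]) -> Result<(), FsError> {
        Err(FsError::ReadOnly) // rejected regardless of the inner backend
    }
}

/// Trivial backend so the wrapper can be exercised.
struct Dummy;
impl FsRead for Dummy {
    fn read(&self, _path: &str) -> Result<Vec<u8>, FsError> {
        Ok(b"data".to_vec())
    }
}
```

Note the write path never even consults the inner backend, which is why the guarantee holds for any `B`.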
Security Checklist
For Application Developers
- Use `PathFilter` to sandbox untrusted code
- Use `Quota` to prevent resource exhaustion
- Use `Restrictions` when you need to disable risky operations
- Use `RateLimit` for untrusted/shared environments
- Use `Tracing` for audit trails
- Use separate backends for separate tenants
- Keep dependencies updated
For Backend Implementers
- Ensure paths cannot escape intended scope
- For filesystem backends: use `strict-path` for containment
- Handle concurrent access safely
- Don’t leak internal paths in errors
For Middleware Implementers
- Handle streaming I/O appropriately (wrap or block)
- Document which operations are intercepted
- Fail closed (deny on error)
Encryption and Integrity Protection
AnyFS’s design enables encryption at multiple levels. Understanding the difference between container-level and file-level protection is crucial for choosing the right approach.
Container-Level vs File-Level Protection
| Level | What’s Protected | Integrity | Implementation |
|---|---|---|---|
| Container-level | Entire storage medium (.db file, serialized state) | Full structure protected | Encrypted backend |
| File-level | Individual file contents | File contents only | Encryption middleware |
Key insight: File-level encryption alone is NOT sufficient. If an attacker can modify the container structure (directory tree, metadata, file names), they can sabotage integrity even without decrypting file contents.
Threat Analysis
| Threat | File-Level Encryption | Container-Level Encryption |
|---|---|---|
| Read file contents | Protected | Protected |
| Modify file contents | Detected (with AEAD) | Detected |
| Delete files | NOT protected | Protected |
| Rename/move files | NOT protected | Protected |
| Corrupt directory structure | NOT protected | Protected |
| Replay old file versions | NOT protected | Protected (with versioning) |
| Metadata exposure (filenames, sizes) | NOT protected | Protected |
Recommendation: For sensitive data, prefer container-level encryption. Use file-level encryption when you need selective access (some files encrypted, others not).
Container-Level Encryption
Option 1: SQLCipher Backend
SQLCipher provides transparent AES-256 encryption for SQLite. In AnyFS, encryption is a feature of SqliteBackend (from the anyfs-sqlite ecosystem crate), not a separate type:
#![allow(unused)]
fn main() {
/// SqliteBackend with encryption enabled (requires `encryption` feature).
/// Uses SQLCipher for transparent AES-256 encryption.
use anyfs_sqlite::SqliteBackend;
// Open with password (derives key via PBKDF2)
let backend = SqliteBackend::open_encrypted("secure.db", "password")?;
// Or open with raw 256-bit key
let backend = SqliteBackend::open_with_key("secure.db", &key)?;
// Change password on open database
backend.change_password("new_password")?;
}
What’s protected:
- All file contents
- All metadata (names, sizes, timestamps, permissions)
- Directory structure
- Inode mappings
- Everything in the `.db` file
Usage:
#![allow(unused)]
fn main() {
let backend = SqliteBackend::open_encrypted("secure.db", "correct-horse-battery-staple")?;
let fs = FileStorage::new(backend);
// If someone gets secure.db without the password, they see random bytes
}
Option 2: Encrypted Serialization (MemoryBackend)
For in-memory backends that need persistence:
#![allow(unused)]
fn main() {
impl MemoryBackend {
/// Serialize entire state to encrypted blob.
pub fn serialize_encrypted(&self, key: &[u8; 32]) -> Result<Vec<u8>, FsError> {
let plaintext = bincode::serialize(&self.state)?;
let nonce = generate_nonce();
let ciphertext = aes_gcm_encrypt(key, &nonce, &plaintext)?;
Ok([nonce.as_slice(), &ciphertext].concat())
}
/// Deserialize from encrypted blob.
pub fn deserialize_encrypted(data: &[u8], key: &[u8; 32]) -> Result<Self, FsError> {
let (nonce, ciphertext) = data.split_at(12);
let plaintext = aes_gcm_decrypt(key, nonce, ciphertext)?;
let state = bincode::deserialize(&plaintext)?;
Ok(Self { state })
}
}
}
Use case: Periodically save encrypted snapshots, load on startup.
File-Level Encryption (Middleware)
When you need selective encryption or per-file keys:
#![allow(unused)]
fn main() {
/// Middleware that encrypts file contents on write, decrypts on read.
/// Does NOT protect metadata, filenames, or directory structure.
pub struct FileEncryption<B> {
inner: B,
key: Secret<[u8; 32]>,
}
impl<B: Fs> FsWrite for FileEncryption<B> {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
// Encrypt content with authenticated encryption (AES-GCM)
let nonce = generate_nonce();
let ciphertext = aes_gcm_encrypt(&self.key, &nonce, data)?;
let encrypted = [nonce.as_slice(), &ciphertext].concat();
self.inner.write(path, &encrypted)
}
}
impl<B: Fs> FsRead for FileEncryption<B> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
let encrypted = self.inner.read(path)?;
let (nonce, ciphertext) = encrypted.split_at(12);
aes_gcm_decrypt(&self.key, nonce, ciphertext)
.map_err(|_| FsError::IntegrityError { path: path.as_ref().to_path_buf() })
}
}
}
Limitations:
- Filenames visible
- Directory structure visible
- File sizes visible (roughly - ciphertext slightly larger)
- Metadata unprotected
When to use:
- Some files need encryption, others don’t
- Different files need different keys
- Interop with systems that expect plaintext structure
Integrity Without Encryption
Sometimes you need tamper detection without hiding contents:
#![allow(unused)]
fn main() {
/// Middleware that adds HMAC to each file for integrity verification.
pub struct IntegrityVerified<B> {
inner: B,
key: Secret<[u8; 32]>,
}
impl<B: Fs> FsWrite for IntegrityVerified<B> {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
let mac = hmac_sha256(&self.key, data);
let protected = [data, mac.as_slice()].concat();
self.inner.write(path, &protected)
}
}
impl<B: Fs> FsRead for IntegrityVerified<B> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
let protected = self.inner.read(path)?;
let (data, mac) = protected.split_at(protected.len() - 32);
if !hmac_verify(&self.key, data, mac) {
return Err(FsError::IntegrityError { path: path.as_ref().to_path_buf() });
}
Ok(data.to_vec())
}
}
}
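The framing in the middleware above (`data || 32-byte tag`, split on read) can be exercised with a stand-in tag function. This is deliberately NOT cryptographic - it only demonstrates the byte layout; real code must use HMAC-SHA256 from a vetted crate.

```rust
/// Stand-in 32-byte tag: a keyed byte-fold. NOT cryptographically secure -
/// it exists only to demonstrate the data||tag framing.
fn toy_tag(key: &[u8; 32], data: &[u8]) -> [u8; 32] {
    let mut tag = *key;
    for (i, b) in data.iter().enumerate() {
        tag[i % 32] = tag[i % 32].wrapping_add(*b);
    }
    tag
}

/// Append the tag on write: data || tag.
fn protect(key: &[u8; 32], data: &[u8]) -> Vec<u8> {
    let mut out = data.to_vec();
    out.extend_from_slice(&toy_tag(key, data));
    out
}

/// Split and verify on read; None models FsError::IntegrityError.
fn verify(key: &[u8; 32], protected: &[u8]) -> Option<Vec<u8>> {
    if protected.len() < 32 {
        return None;
    }
    let (data, tag) = protected.split_at(protected.len() - 32);
    if toy_tag(key, data).as_slice() == tag {
        Some(data.to_vec())
    } else {
        None // tampered
    }
}
```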
RAM Encryption and Secure Memory
For high-security scenarios where memory dumps are a threat:
Threat Levels
| Threat | Mitigation | Library-Level? |
|---|---|---|
| Memory inspection after process exit | zeroize on drop | Yes |
| Core dumps | Disable via setrlimit | Yes (process config) |
| Swap file exposure | mlock() to pin pages | Yes (OS permitting) |
| Live memory scanning (same user) | OS process isolation | No |
| Cold boot attack | Hardware RAM encryption | No (Intel TME/AMD SME) |
| Hypervisor/DMA attack | SGX/SEV enclaves | No (hardware) |
Encrypted Memory Backend (Illustrative Pattern)
Note: `EncryptedMemoryBackend` is an illustrative pattern for users who need encrypted RAM storage. It is not a built-in backend. Users can implement this pattern using the guidance below.
Keep data encrypted even in RAM - decrypt only during active use:
#![allow(unused)]
fn main() {
use zeroize::{Zeroize, ZeroizeOnDrop};
use secrecy::Secret;
/// Memory backend that stores all data encrypted in RAM.
/// Plaintext exists only briefly during read operations.
pub struct EncryptedMemoryBackend {
/// All nodes stored as encrypted blobs
nodes: HashMap<PathBuf, EncryptedNode>,
/// Encryption key - auto-zeroized on drop
key: Secret<[u8; 32]>,
}
struct EncryptedNode {
/// Encrypted file content (nonce || ciphertext)
encrypted_data: Vec<u8>,
/// Metadata can be encrypted too, or stored in the encrypted blob
metadata: EncryptedMetadata,
}
impl FsRead for EncryptedMemoryBackend {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
let node = self.nodes.get(path.as_ref())
.ok_or_else(|| FsError::NotFound { path: path.as_ref().to_path_buf() })?;
// Decrypt - plaintext briefly in RAM
let plaintext = self.decrypt(&node.encrypted_data)?;
// Return owned Vec - caller responsible for zeroizing if sensitive
Ok(plaintext)
}
}
impl FsWrite for EncryptedMemoryBackend {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
// Encrypt immediately - plaintext never stored
let encrypted = self.encrypt(data)?;
self.nodes.insert(path.as_ref().to_path_buf(), EncryptedNode {
encrypted_data: encrypted,
metadata: self.encrypt_metadata(...)?,
});
Ok(())
}
}
impl Drop for EncryptedMemoryBackend {
fn drop(&mut self) {
// Zeroize all encrypted data (defense in depth)
for node in self.nodes.values_mut() {
node.encrypted_data.zeroize();
}
// Key is auto-zeroized via Secret<>
}
}
}
Serialization of Encrypted RAM
When persisting an encrypted memory backend:
#![allow(unused)]
fn main() {
impl EncryptedMemoryBackend {
/// Serialize to disk - data stays encrypted throughout.
/// RAM encrypted → Serialized encrypted → Disk encrypted
pub fn save_to_file(&self, path: &Path) -> Result<(), FsError> {
// Data is already encrypted in self.nodes
// Serialize the encrypted blobs directly - no decryption needed
let serialized = bincode::serialize(&self.nodes)?;
// Optionally add another encryption layer with different key
// (defense in depth: compromise of runtime key doesn't expose persisted data)
std::fs::write(path, &serialized)?;
Ok(())
}
/// Load from disk - data stays encrypted throughout.
/// Disk encrypted → Deserialized encrypted → RAM encrypted
pub fn load_from_file(path: &Path, key: Secret<[u8; 32]>) -> Result<Self, FsError> {
let serialized = std::fs::read(path)?;
let nodes = bincode::deserialize(&serialized)?;
Ok(Self { nodes, key })
}
}
}
Key property: Plaintext NEVER exists during save/load. Data flows:
Write: plaintext → encrypt → RAM (encrypted) → serialize → disk (encrypted)
Read: disk (encrypted) → deserialize → RAM (encrypted) → decrypt → plaintext
Secure Allocator Considerations
[dependencies]
# mimalloc secure mode zeros memory on free
mimalloc = { version = "0.1", features = ["secure"] }
Note: this prevents use-after-free information leaks, but it does NOT encrypt RAM contents, prevent live memory scanning, or protect against cold boot attacks.
For true defense against memory scanning, combine:
- `EncryptedMemoryBackend` (data encrypted at rest in RAM)
- `zeroize` (immediate cleanup of temporary plaintext)
- `mlock()` (prevent swapping sensitive pages)
- Minimize plaintext lifetime (decrypt → use → zeroize immediately)
Encryption Summary
| Approach | Protects Contents | Protects Structure | RAM Security | Persistence |
|---|---|---|---|---|
| `SqliteBackend` with encryption | Yes | Yes | No (SQLite uses plaintext RAM) | Encrypted `.db` file |
| `FileEncryption<B>` middleware | Yes | No | Depends on B | Depends on B |
| `EncryptedMemoryBackend` (illustrative) | Yes | Yes | Yes (encrypted in RAM) | Via `save_to_file()` |
| `IntegrityVerified<B>` middleware | No | No (files only) | No | Depends on B |
Recommended Configurations
Sensitive Data Storage
#![allow(unused)]
fn main() {
// Full protection: encrypted container + secure memory practices
let backend = SqliteBackend::open_encrypted("secure.db", password)?;
let fs = FileStorage::new(backend);
}
High-Security RAM Processing (Illustrative)
#![allow(unused)]
fn main() {
// Data never plaintext at rest (RAM or disk)
// Note: EncryptedMemoryBackend is user-implemented (see pattern above)
let backend = EncryptedMemoryBackend::new(derive_key(password));
// ... use fs ...
backend.save_to_file("snapshot.enc")?; // Persists encrypted
}
Selective File Encryption
#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend; // Ecosystem crate
// Some files encrypted, structure visible
let backend = FileEncryption::new(SqliteBackend::open("data.db")?)
.with_key(key);
}
TOCTOU-Proof Tenant Isolation with Virtual Backends
Why Virtual Backends Eliminate TOCTOU
Traditional path security libraries like strict-path work against a real filesystem:
┌─────────────────────────────────────────────────────────────────┐
│ REAL FILESYSTEM SECURITY │
│ │
│ Your Process OS Filesystem Other Processes │
│ ┌──────────┐ ┌───────────┐ ┌──────────────┐ │
│ │ Check │────────▶│ Canonical │◀────────│ Create │ │
│ │ path │ │ path │ │ symlink │ │
│ └──────────┘ └───────────┘ └──────────────┘ │
│ │ │ │ │
│ │ TOCTOU WINDOW │ │ │
│ ▼ ▼ ▼ │
│ ┌──────────┐ ┌───────────┐ ┌──────────────┐ │
│ │ Use │────────▶│ DIFFERENT │◀────────│ Modified! │ │
│ │ path │ │ path now! │ │ │ │
│ └──────────┘ └───────────┘ └──────────────┘ │
│ │
│ Problem: OS state can change between check and use │
└─────────────────────────────────────────────────────────────────┘
Virtual backends eliminate this entirely:
┌─────────────────────────────────────────────────────────────────┐
│ VIRTUAL BACKEND SECURITY │
│ │
│ Your Process │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ FileStorage │ │
│ │ ┌──────────┐ ┌───────────┐ ┌──────────────────┐ │ │
│ │ │ Resolve │───▶│ SQLite │───▶│ Return data │ │ │
│ │ │ path │ │ Transaction│ │ │ │ │
│ │ └──────────┘ └───────────┘ └──────────────────┘ │ │
│ │ │ │ │
│ │ ATOMIC - No external modification possible │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ No OS filesystem. No other processes. No TOCTOU. │
└─────────────────────────────────────────────────────────────────┘
Security Comparison: strict-path vs Virtual Backend
| Threat | strict-path (Real FS) | Virtual Backend |
|---|---|---|
| Path traversal | Prevented (canonicalize + verify) | Impossible (no host FS to traverse to) |
| Symlink race (TOCTOU) | Mitigated (canonicalize first) | Impossible (we control all symlinks) |
| External symlink creation | Vulnerable window exists | Impossible (single-process ownership) |
| Windows 8.3 short names | Partial (only existing files) | N/A (no Windows FS) |
| Namespace escapes (/proc) | Fixed in soft-canonicalize | Impossible (no /proc exists) |
| Concurrent modification | OS handles (may race) | Atomic (SQLite transactions) |
| Tenant A accessing Tenant B | Requires careful path filtering | Impossible (separate .db files) |
Encryption: Separation of Concerns
Design principle: Backends handle storage, middleware handles policy. Container-level encryption is the exception.
| Security Level | Implementation | Why |
|---|---|---|
| Locked (container) | SqliteBackend with encryption feature | Must encrypt entire .db file at storage level |
| Privacy (file contents) | FileEncryption<SqliteBackend> middleware | Content encryption is policy |
| Normal | SqliteBackend | User applies encryption as needed |
Why encryption is a feature, not a separate type:
- SQLCipher is a drop-in replacement for SQLite with identical API
- The only difference is how the connection is opened (with password/key)
- Connection must be opened with password before ANY query
- Cannot be added as middleware - it’s a property of the connection itself
- Everything is encrypted: file contents, filenames, directory structure, timestamps, inodes
SqliteBackend Encryption (Ecosystem Crate, feature: encryption)
Full container encryption using SQLCipher. Encryption is a feature of SqliteBackend, not a separate type:
#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend; // Ecosystem crate
/// Encryption methods are only available with the `encryption` feature.
/// Uses SQLCipher for transparent AES-256 encryption.
///
/// Without the password, the .db file is indistinguishable from random bytes.
// Open with password (derives key via PBKDF2)
let backend = SqliteBackend::open_encrypted("secure.db", "password")?;
// Open with raw 256-bit key (no key derivation)
let backend = SqliteBackend::open_with_key("secure.db", &key)?;
// Create new encrypted database
let backend = SqliteBackend::create_encrypted("new.db", "password")?;
// Change password on open database
backend.change_password("new_password")?;
}
What SQLCipher Encrypts
| Data | Encrypted? |
|---|---|
| File contents | Yes |
| Filenames | Yes |
| Directory structure | Yes |
| File sizes | Yes |
| Timestamps | Yes |
| Permissions | Yes |
| Inode mappings | Yes |
| SQLite metadata | Yes |
| Everything in the .db file | Yes |
Cargo Configuration
[dependencies]
# anyfs-sqlite ecosystem crate with optional encryption
anyfs-sqlite = { version = "0.1" } # No encryption
anyfs-sqlite = { version = "0.1", features = ["encryption"] } # With SQLCipher
Note: The encryption feature enables SQLCipher. When enabled, open_encrypted() and open_with_key() methods become available.
Achieving Security Modes with Composition
Users compose backends and middleware to achieve their desired security level:
Locked Mode (Full Container Encryption)
#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend; // Ecosystem crate with `encryption` feature
// Everything encrypted - password required to access anything
let backend = SqliteBackend::open_encrypted("tenant.db", "correct-horse-battery-staple")?;
let fs = FileStorage::new(backend);
// Without password: .db file is random bytes
// With password: full access to everything
}
Privacy Mode (Contents Encrypted, Metadata Visible)
#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend; // Ecosystem crate
// File contents encrypted, metadata (names, sizes, structure) visible
let backend = FileEncryption::new(
SqliteBackend::open("tenant.db")?
)
.with_key(content_key);
let fs = FileStorage::new(backend);
// Host can: list files, see sizes, run statistics
// Host cannot: read file contents
}
Normal Mode (No Encryption)
#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend; // Ecosystem crate
// No encryption - user encrypts sensitive files themselves
let backend = SqliteBackend::open("tenant.db")?;
let fs = FileStorage::new(backend);
// User applies per-file encryption as needed
}
Mode Comparison
| Aspect | Locked | Privacy | Normal |
|---|---|---|---|
| Implementation | SqliteBackend with encryption | FileEncryption<SqliteBackend> | SqliteBackend |
| File contents | Encrypted (SQLCipher) | Encrypted (AES-GCM) | Plaintext |
| Filenames | Encrypted | Visible | Visible |
| Directory structure | Encrypted | Visible | Visible |
| File sizes | Encrypted | Visible | Visible |
| Timestamps | Encrypted | Visible | Visible |
| Host can analyze | Nothing | Metadata only | Everything |
| Performance | Slowest (~10-15% overhead) | Medium | Fastest |
| Feature flag | encryption | middleware | (none) |
Why This Is TOCTOU-Proof
- No external filesystem - Paths exist only in our SQLite tables
- Atomic transactions - Path resolution + data access in single transaction
- Single-process ownership - No other process can modify the .db during operation
- We control symlinks - symlinks are just rows in the `nodes` table; we decide when to follow
- No OS involvement - the OS never resolves our virtual paths
#![allow(unused)]
fn main() {
// This is TOCTOU-proof:
impl SecureSqliteBackend {
fn resolve_and_read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
// Single transaction wraps everything
let tx = self.conn.transaction()?;
// 1. Resolve path (following symlinks in OUR table)
let inode = self.resolve_path_internal(&tx, path)?;
// 2. Read content
// No TOCTOU - same transaction, same snapshot
let data = tx.query_row(
"SELECT data FROM content WHERE inode = ?",
[inode],
|row| row.get(0)
)?;
// Transaction ensures atomicity
Ok(data)
}
}
}
Multi-Tenant Isolation
#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend; // Ecosystem crate with `encryption` feature
/// Each tenant gets their own .db file - complete physical isolation
fn create_tenant_storage(tenant_id: &str, encrypted: bool) -> impl Fs {
let path = format!("tenants/{}.db", tenant_id);
if encrypted {
let password = get_tenant_password(tenant_id);
SqliteBackend::open_encrypted(&path, &password).unwrap()
} else {
SqliteBackend::open(&path).unwrap()
}
}
// Tenant A literally cannot access Tenant B's data:
// - Different .db files
// - Different passwords (if encrypted)
// - No shared state whatsoever
// - No path filtering bugs possible - there's nothing to filter
}
Comparison with strict-path approach:
| Approach | Tenant Isolation |
|---|---|
| Shared filesystem + strict-path | Logical isolation (paths filtered) |
| Shared filesystem + PathFilter | Logical isolation (middleware enforced) |
| Separate .db file per tenant | Physical isolation (separate files) |
Physical isolation is strictly stronger - there’s no bug in path filtering that could leak data because there’s no shared data to leak.
Host Analysis with Privacy Mode
When using FileEncryption<SqliteBackend> (Privacy mode), the host can query metadata directly from SQLite:
#![allow(unused)]
fn main() {
// Host can analyze metadata without the content encryption key
fn get_tenant_statistics(tenant_db: &str) -> TenantStats {
// Connect directly to SQLite (no content key needed)
let conn = Connection::open(tenant_db)?;
let (file_count, dir_count, total_size) = conn.query_row(
"SELECT
COUNT(*) FILTER (WHERE node_type = 0),
COUNT(*) FILTER (WHERE node_type = 1),
SUM(size)
FROM nodes",
[],
|row| Ok((row.get(0)?, row.get(1)?, row.get(2)?))
)?;
TenantStats { file_count, dir_count, total_size }
}
// List all files (names visible, contents encrypted)
fn list_tenant_files(tenant_db: &str) -> Vec<FileInfo> {
let conn = Connection::open(tenant_db)?;
conn.prepare("SELECT name, size, modified_at FROM nodes WHERE node_type = 0")?
.query_map([], |row| Ok(FileInfo { ... }))?
.collect()
}
}
Replacing strict-path Usage
For projects currently using strict-path for tenant isolation:
Before (strict-path):
#![allow(unused)]
fn main() {
use strict_path::VirtualRoot;
fn handle_tenant_request(tenant_id: &str, requested_path: &str) -> Result<Vec<u8>> {
// Shared filesystem, path containment via strict-path
let root = VirtualRoot::new(format!("/data/tenants/{}", tenant_id))?;
let safe_path = root.resolve(requested_path)?; // TOCTOU window here
std::fs::read(safe_path) // Another process could have modified
}
}
After (SqliteBackend with encryption - ecosystem crate):
#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend; // Ecosystem crate with `encryption` feature
fn handle_tenant_request(tenant_id: &str, requested_path: &str) -> Result<Vec<u8>> {
// Separate encrypted database per tenant - no path containment needed
let backend = get_tenant_backend(tenant_id); // Cached connection
backend.read(requested_path) // Atomic, TOCTOU-proof
}
}
| Aspect | strict-path | Virtual Backend |
|---|---|---|
| Isolation model | Logical (path filtering) | Physical (separate files) |
| TOCTOU | Mitigated | Eliminated |
| External interference | Possible | Impossible |
| Symlink attacks | Resolved at check time | We control all symlinks |
| Cross-tenant leakage | Bug in filtering could leak | No shared data exists |
| Performance | Real FS I/O + canonicalization | SQLite (often faster for small files) |
| Encryption | Separate concern | Built-in (encryption feature) or middleware |
Known Limitations
- No ACLs: Simple permissions only (Unix mode bits)
- Side channels: Timing attacks, cache attacks require OS/hardware mitigations
- SQLite file access: Host OS can still access the `.db` file (use Locked mode for encryption)
For implementation details, see Architecture Decision Records.
AnyFS - Technical Comparison with Alternatives
This document compares AnyFS with existing Rust filesystem abstractions.
Executive Summary
AnyFS is to filesystems what Axum/Tower is to HTTP: a composable middleware stack with pluggable backends.
Key differentiators:
- Composable middleware - Stack quota, sandboxing, tracing, caching as independent layers
- Backend agnostic - Swap Memory/SQLite/RealFS without code changes
- Policy separation - Storage logic separate from policy enforcement
- Third-party extensibility - Custom backends and middleware depend only on `anyfs-backend`
Compared Solutions
| Solution | What it is | Middleware | Multiple Backends |
|---|---|---|---|
| `vfs` | VFS trait + backends | No | Yes |
| AgentFS | SQLite agent runtime | No | No (SQLite only) |
| OpenDAL | Object storage layer | Yes | Yes (cloud-focused) |
| AnyFS | VFS + middleware stack | Yes | Yes |
1. Architecture Comparison
vfs Crate
Path-based trait, no middleware pattern:
#![allow(unused)]
fn main() {
pub trait FileSystem: Send + Sync {
fn read_dir(&self, path: &str) -> VfsResult<Box<dyn Iterator<Item = String>>>;
fn open_file(&self, path: &str) -> VfsResult<Box<dyn SeekAndRead>>;
fn create_file(&self, path: &str) -> VfsResult<Box<dyn SeekAndWrite>>;
// ...
}
}
Limitations:
- No standard way to add quotas, logging, sandboxing
- Each concern must be built into backends or wrapped externally
- Path validation is backend-specific
AgentFS
SQLite-based agent runtime:
#![allow(unused)]
fn main() {
// Fixed to SQLite, includes KV store and tool auditing
let fs = AgentFS::open("agent.db")?;
fs.write_file("/path", data)?;
fs.kv_set("key", "value")?; // KV store bundled
fs.toolcall_start("tool")?; // Auditing bundled
}
Limitations:
- Locked to SQLite (no memory backend for testing, no real FS)
- Monolithic design (can’t use FS without KV/auditing)
- No composable middleware
AnyFS
Tower-style middleware + pluggable backends:
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, PathFilterLayer, RestrictionsLayer, TracingLayer, FileStorage};
// Compose middleware stack
let backend = MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024)
.build())
.layer(PathFilterLayer::builder()
.allow("/workspace/**")
.deny("**/.env")
.build())
.layer(TracingLayer::new());
let fs = FileStorage::new(backend);
}
Advantages:
- Add/remove middleware without touching backends
- Swap backends without touching middleware
- Third-party extensions via the `anyfs-backend` trait
2. Feature Comparison
| Feature | AnyFS | vfs | AgentFS | OpenDAL |
|---|---|---|---|---|
| Middleware pattern | Yes | No | No | Yes |
| Multiple backends | Yes | Yes | No | Yes |
| SQLite backend | Yes | No | Yes | No |
| Memory backend | Yes | Yes | No | Yes |
| Real FS backend | Yes | Yes | No | No |
| Quota enforcement | Middleware | Manual | No | No |
| Path sandboxing | Middleware | Manual | No | No |
| Feature gating | Middleware | No | No | No |
| Rate limiting | Middleware | No | No | No |
| Tracing/logging | Middleware | Manual | Built-in | Middleware |
| Streaming I/O | Yes | Yes | Yes | Yes |
| Async API | Future | Partial | No | Yes |
| POSIX extension | Future | No | No | No |
| FUSE mountable | Yes | No | No | No |
| KV store | No | No | Yes | No |
3. Middleware Stack
AnyFS middleware can intercept, transform, and control operations:
| Middleware | Intercepts | Action |
|---|---|---|
| `Quota` | Writes | Reject if over limit |
| `PathFilter` | All ops | Block denied paths |
| `Restrictions` | Permission changes | Block via `.deny_permissions()` |
| `RateLimit` | All ops | Throttle per second |
| `ReadOnly` | Writes | Block all writes |
| `Tracing` | All ops | Log with `tracing` crate |
| `DryRun` | Writes | Log without executing |
| `Cache` | Reads | LRU caching |
| `Overlay` | All ops | Union filesystem |
| Custom | Any | Encryption, compression, … |
4. Backend Trait
#![allow(unused)]
fn main() {
pub trait Fs: Send + Sync {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
fn open_write(&self, path: &Path) -> Result<Box<dyn Write + Send>, FsError>;
// ... methods aligned with std::fs
}
}
Design principles:
- `&Path` in core traits (object-safe); `FileStorage`/`FsExt` accept `impl AsRef<Path>` for ergonomics
- Aligned with `std::fs` naming
- Streaming I/O via `open_read`/`open_write`
- `Send` bound for async compatibility
5. When to Use What
| Use Case | Recommendation |
|---|---|
| Need composable middleware | AnyFS |
| Need backend flexibility | AnyFS |
| Need SQLite + Memory + RealFS | AnyFS |
| Need just VFS abstraction (no policies) | vfs |
| Need AI agent runtime with KV + auditing | AgentFS |
| Need cloud object storage | OpenDAL |
| Need async-first design | OpenDAL (or wait for AnyFS async) |
6. Deep Dive: vfs Crate Compatibility
The vfs crate is the most similar project. This section details why we don’t adopt their trait and how we’ll provide interop.
vfs::FileSystem Trait (Complete)
```rust
pub trait FileSystem: Send + Sync {
    // Required (9 methods)
    fn read_dir(&self, path: &str) -> VfsResult<Box<dyn Iterator<Item = String>>>;
    fn create_dir(&self, path: &str) -> VfsResult<()>;
    fn open_file(&self, path: &str) -> VfsResult<Box<dyn SeekAndRead>>;
    fn create_file(&self, path: &str) -> VfsResult<Box<dyn SeekAndWrite>>;
    fn append_file(&self, path: &str) -> VfsResult<Box<dyn SeekAndWrite>>;
    fn metadata(&self, path: &str) -> VfsResult<VfsMetadata>;
    fn exists(&self, path: &str) -> VfsResult<bool>;
    fn remove_file(&self, path: &str) -> VfsResult<()>;
    fn remove_dir(&self, path: &str) -> VfsResult<()>;

    // Optional - default to NotSupported (6 methods)
    fn set_creation_time(&self, path: &str, time: SystemTime) -> VfsResult<()>;
    fn set_modification_time(&self, path: &str, time: SystemTime) -> VfsResult<()>;
    fn set_access_time(&self, path: &str, time: SystemTime) -> VfsResult<()>;
    fn copy_file(&self, src: &str, dest: &str) -> VfsResult<()>;
    fn move_file(&self, src: &str, dest: &str) -> VfsResult<()>;
    fn move_dir(&self, src: &str, dest: &str) -> VfsResult<()>;
}
```
Feature Gap Analysis
| Feature | vfs | AnyFS | Gap |
|---|---|---|---|
| Basic read/write | Yes | Yes | - |
| Directory ops | Yes | Yes | - |
| Streaming I/O | Yes | Yes | - |
| `rename` | `move_file` | Yes | - |
| `copy` | `copy_file` | Yes | - |
| Symlinks | No | Yes | Critical |
| Hard links | No | Yes | Critical |
| Permissions | No | Yes | Critical |
| truncate | No | Yes | Missing |
| sync/fsync | No | Yes | Missing |
| statfs | No | Yes | Missing |
| read_range | No | Yes | Missing |
| symlink_metadata | No | Yes | Missing |
| Path type | &str | &Path (core) + impl AsRef<Path> in ergonomic layer | Different |
| Middleware | No | Yes | Architectural |
Why Not Adopt Their Trait?
- No symlinks/hardlinks - Can’t virtualize real filesystem semantics
- No permissions - Our `Restrictions` middleware needs `set_permissions` to gate
- No durability primitives - No `sync`/`fsync` for data integrity
- No middleware pattern - Their `VfsPath` bakes in behaviors we want composable
- `&str` paths - Core traits use `&Path` for object safety; ergonomics come from `FileStorage`/`FsExt`
Our trait is a strict superset. Everything vfs can do, we can do. The reverse is not true.
vfs Backends
| vfs Backend | AnyFS Equivalent | Notes |
|---|---|---|
| `PhysicalFS` | `StdFsBackend` | Both use real filesystem directly |
| `MemoryFS` | `MemoryBackend` | Both in-memory |
| `OverlayFS` | `Overlay<B1,B2>` | Both union filesystems |
| `AltrootFS` | `VRootFsBackend` | Both provide path containment |
| `EmbeddedFS` | (none) | Read-only embedded assets |
| (none) | `SqliteBackend` | We have SQLite |
Interoperability Plan
Future anyfs-vfs-compat crate provides bidirectional adapters:
```rust
use anyfs_vfs_compat::{VfsCompat, AnyFsCompat};

// Use a vfs backend in AnyFS
// Missing features return FsError::NotSupported
let backend = VfsCompat::new(vfs::MemoryFS::new());
let fs = FileStorage::new(backend);

// Use an AnyFS backend in vfs-based code
// Only exposes what vfs supports
let anyfs_backend = MemoryBackend::new();
let vfs_fs: Box<dyn vfs::FileSystem> = Box::new(AnyFsCompat::new(anyfs_backend));
```
Use cases:
- Migrate from `vfs` to AnyFS incrementally
- Use `vfs::EmbeddedFS` in AnyFS (read-only embedded assets)
- Use AnyFS backends in projects depending on `vfs`
7. Tradeoffs
AnyFS Advantages
- Composable middleware pattern
- Backend-agnostic
- Third-party extensibility
- Clean separation of concerns
- Full filesystem semantics (symlinks, permissions, durability)
AnyFS Limitations
- Sync-first (async planned)
- Smaller ecosystem (new project)
- Not full POSIX emulation
If this document conflicts with AGENTS.md or src/architecture/design-overview.md, treat those as authoritative.
AnyFS - Build vs. Reuse Analysis
Can your goals be achieved with existing crates, or does this project need to exist?
Core Requirements
- Backend flexibility - swap storage without changing application code
- Composable middleware - add/remove capabilities (quotas, sandboxing, logging)
- Tenant isolation - each tenant gets an isolated namespace
- Portable storage - single-file backend (SQLite) for easy move/copy/backup
- Filesystem semantics - `std::fs`-aligned operations including symlinks and hard links
- Path containment - prevent traversal attacks
What Already Exists
vfs crate (Rust)
What it provides:
- Filesystem abstraction with multiple backends
- MemoryFS, PhysicalFS, AltrootFS, OverlayFS, EmbeddedFS
What it lacks:
- SQLite backend
- Composable middleware pattern
- Quota/limit enforcement
- Policy layers (feature gating, path filtering)
AgentFS (Turso)
What it provides:
- SQLite-based filesystem for AI agents
- Key-value store
- Tool call auditing
- FUSE mounting
What it lacks:
- Multiple backend types (SQLite only)
- Composable middleware
- Backend-agnostic abstraction
rusqlite
What it provides: SQLite bindings, transactions, blobs.
What it lacks: Filesystem semantics, quota enforcement.
strict-path
What it provides: Path validation and containment (VirtualRoot).
What it lacks: Storage backends, filesystem API.
Gap Analysis
| Requirement | vfs | AgentFS | rusqlite | strict-path |
|---|---|---|---|---|
| Filesystem API | Yes | Yes | No | No |
| Multiple backends | Yes | No | N/A | No |
| SQLite backend | No | Yes | Yes (raw) | No |
| Composable middleware | No | No | No | No |
| Quota enforcement | No | No | Manual | No |
| Path sandboxing | Partial | No | Manual | Yes |
| Symlink/hard link control | Backend-dep | Yes | Manual | N/A |
Conclusion: No existing crate provides:
“Backend-agnostic filesystem abstraction with composable middleware for quotas, sandboxing, and policy enforcement.”
Why AnyFS Exists
AnyFS fills the gap by separating concerns:
| Crate | Responsibility |
|---|---|
| `anyfs-backend` | Trait (`Fs`, `Layer`) + types |
| `anyfs` | Backends + middleware + ergonomic wrapper (`FileStorage<B>`) |
The middleware pattern (like Tower/Axum) enables composition:
```rust
use anyfs::{MemoryBackend, QuotaLayer, PathFilterLayer, TracingLayer, FileStorage};

let backend = MemoryBackend::new()
    .layer(QuotaLayer::builder()
        .max_total_size(100 * 1024 * 1024)
        .build())
    .layer(PathFilterLayer::builder()
        .allow("/workspace/**")
        .build())
    .layer(TracingLayer::new());

let fs = FileStorage::new(backend);
fs.write("/workspace/doc.txt", b"hello")?;
```
Alternatives Considered
Option A: Implement SQLite backend for vfs crate
Pros: Ecosystem compatibility.
Cons:
- No middleware pattern for quotas/policies
- Would still need to build quota/sandboxing outside the trait
- Doesn’t solve the composability problem
Option B: Use AgentFS
Pros: Already exists, SQLite-based, FUSE support.
Cons:
- Locked to SQLite (can’t swap to memory/real FS)
- No composable middleware
- Includes KV store and auditing we may not need
Option C: AnyFS (recommended)
Pros:
- Backend-agnostic (swap storage without code changes)
- Composable middleware (add/remove capabilities)
- Clean separation of concerns
- Third-party extensibility
Cons:
- New project, not yet widely adopted
Recommendation
Build AnyFS with reusable primitives (rusqlite, strict-path, thiserror, tracing) but maintain the two-crate split. The middleware pattern is what makes the design both flexible and safe.
Compatibility option: Later, provide an adapter that implements vfs traits on top of Fs for projects that need vfs compatibility.
Prior Art Analysis: Filesystem Abstraction Libraries
This document analyzes filesystem abstraction libraries in other languages to learn from their successes, identify features we should adopt, and avoid known vulnerabilities.
Executive Summary
| Library | Language | Key Strength | Key Weakness | What We Can Learn |
|---|---|---|---|---|
| fsspec | Python | Async + caching + data science integration | No middleware composition | Caching strategies, async design |
| PyFilesystem2 | Python | Clean URL-based API | Symlink handling issues | Path normalization |
| Afero | Go | Composition (CopyOnWrite, Cache, BasePathFs) | Symlink escape in BasePathFs | Composition patterns |
| Apache Commons VFS | Java | Enterprise-grade, many backends | CVE: Path traversal with encoded .. | URL encoding attacks |
| System.IO.Abstractions | .NET | Perfect for testing, mirrors System.IO | No middleware/composition | MockFileSystem patterns |
| memfs | Node.js | Browser + Node unified API | Fork exists due to “longstanding bugs” | In-memory implementation |
| soft-canonicalize | Rust | Non-existing path resolution, TOCTOU-safe | Real FS only (not virtual) | Attack patterns to defend |
| strict-path | Rust | 19+ attack types blocked, type-safe markers | Real FS only (not virtual) | Attack catalog for testing |
Detailed Analysis
1. Python: fsspec
Repository: fsspec/filesystem_spec
What they do well:
- Unified Interface Across 20+ Backends
  - Local, S3, GCS, Azure, HDFS, HTTP, FTP, SFTP, ZIP, TAR, Git, etc.
  - Same API regardless of backend
- Sophisticated Caching
  ```python
  # Block-wise caching - only download accessed parts
  fs = fsspec.filesystem('blockcache', target_protocol='s3', cache_storage='/tmp/cache')
  # Whole-file caching
  fs = fsspec.filesystem('filecache', target_protocol='s3', cache_storage='/tmp/cache')
  ```
- Async Support
  - `AsyncFileSystem` base class for async implementations
  - Concurrent bulk operations (`cat` fetches many files at once)
  - Used by Dask for parallel data processing
- Data Science Integration
  - Native integration with Pandas, Dask, Intake
  - Parquet optimization with parallel chunk fetching
What we should adopt:
- Block-wise caching strategy (not just whole-file LRU)
- Async design from the start (our ADR-024 async plan)
- Consider “parts caching” for large file access patterns
What they lack that we have:
- No middleware composition pattern
- No quota/rate limiting built-in
- No path filtering/sandboxing
2. Python: PyFilesystem2
Repository: PyFilesystem/pyfilesystem2
What they do well:
- URL-based Filesystem Specification
  ```python
  from fs import open_fs
  home_fs = open_fs('osfs://~/')
  zip_fs = open_fs('zip://foo.zip')
  ftp_fs = open_fs('ftp://ftp.example.com')
  mem_fs = open_fs('mem://')
  ```
- Consistent Path Handling
  - Forward slashes everywhere (even on Windows)
  - Paths normalized automatically
- Glob Support Built-in
  ```python
  for match in fs.glob('**/*.py'):
      print(match.path)
  ```
Known Issues (from GitHub):
| Issue | Description | Impact |
|---|---|---|
| #171 | Symlink loops cause infinite recursion | DoS potential |
| #417 | No symlink creation support | Missing feature |
| #411 | Incorrect handling of symlinks with non-existing targets | Broken functionality |
| #61 | Symlinks not detected properly | Security concern |
Lessons for AnyFS:
- ⚠️ Symlink handling is complex - we must handle loops, non-existent targets, and escaping
- ✅ URL-based opening is convenient - consider for future
- ✅ Consistent path format - virtual backends use forward slashes internally; OS-backed backends follow OS semantics
3. Go: Afero
Repository: spf13/afero
What they do well:
- Composition Pattern (Similar to Ours!)
  ```go
  // Sandboxing
  baseFs := afero.NewOsFs()
  restrictedFs := afero.NewBasePathFs(baseFs, "/var/data")
  // Caching layer
  cachedFs := afero.NewCacheOnReadFs(baseFs, afero.NewMemMapFs(), time.Hour)
  // Copy-on-write
  cowFs := afero.NewCopyOnWriteFs(baseFs, afero.NewMemMapFs())
  ```
- io/fs Compatibility
  - Works with Go 1.16+ standard library interfaces
  - `ReadDirFS`, `ReadFileFS`, etc.
- Extensive Backend Support
  - OS, Memory, SFTP, GCS
  - Community: S3, MinIO, Dropbox, Google Drive, Git
Known Issues:
| Issue | Description | Our Mitigation |
|---|---|---|
| #282 | Symlinks in BasePathFs can escape jail | Use strict-path crate for VRootFsBackend |
| #88 | Symlink handling inconsistent | Document behavior clearly |
| #344 | BasePathFs fails when basepath is . | Test edge cases |
BasePathFs Symlink Escape Issue:
“SymlinkIfPossible will resolve the RealPath of underlayer filesystem before make a symlink. For example, creating a link like ‘/foo/bar’ -> ‘/foo/file’ will be transform into a link point to ‘/{basepath}/foo/file.’”
This means symlinks can potentially point outside the base path!
Our Solution:
- `VRootFsBackend` uses `strict-path` for real filesystem containment
- Virtual backends (Memory, SQLite) are inherently safe - paths are just keys
- `PathFilter` middleware provides an additional sandboxing layer
What we should verify:
- Test symlink creation pointing outside VRootFsBackend
- Test `..` in symlink targets
- Test symlink loops with max depth
4. Java: Apache Commons VFS
Repository: Apache Commons VFS
🔴 CRITICAL VULNERABILITY: CVE in versions < 2.10.0
The Bug:
```java
// FileObject API has resolveFile with scope parameter
FileObject file = baseFile.resolveFile("../secret.txt", NameScope.DESCENDENT);
// SHOULD throw exception - "../secret.txt" is not a descendent

// BUT with URL encoding:
FileObject escaped = baseFile.resolveFile("%2e%2e/secret.txt", NameScope.DESCENDENT);
// DOES NOT throw exception! Returns file outside base directory.
```
Root Cause: Path validation happened BEFORE URL decoding.
Lesson for AnyFS:
```rust
// WRONG - validate then decode
fn resolve(path: &str) -> Result<PathBuf, FsError> {
    validate_no_traversal(path)?;   // Checks for ".."
    let decoded = url_decode(path); // "../" appears after decode!
    Ok(PathBuf::from(decoded))
}

// CORRECT - decode then validate
fn resolve(path: &str) -> Result<PathBuf, FsError> {
    let decoded = url_decode(path);
    let normalized = normalize_path(&decoded); // Resolve all ".."
    validate_containment(&normalized)?;
    Ok(normalized)
}
```
Action Items:
- Add test: URL-encoded `%2e%2e` path traversal attempt
- Add test: double-encoding `%252e%252e`
- Ensure path normalization happens BEFORE validation
- Document in security model
5. .NET: System.IO.Abstractions
Repository: TestableIO/System.IO.Abstractions
What they do well:
- Perfect API Compatibility
  - Mirrors `System.IO` exactly
  - Drop-in replacement for testing
- MockFileSystem for Testing
  ```csharp
  var fileSystem = new MockFileSystem(new Dictionary<string, MockFileData>
  {
      { @"c:\myfile.txt", new MockFileData("Testing") },
      { @"c:\demo\jQuery.js", new MockFileData("jQuery content") },
  });

  // Use in tests
  var sut = new MyComponent(fileSystem);
  ```
- Analyzers Package
  - Roslyn analyzers warn when using `System.IO` directly
  - Guides developers to use abstractions
What they lack:
- No middleware/composition
- No caching layer
- No sandboxing/path filtering
- Testing-focused, not production backends
What we should adopt:
- Consider a Rust analyzer/clippy lint for `std::fs` usage
- The MockFileSystem pattern is similar to our `MemoryBackend`
6. Node.js: memfs + unionfs
Repository: streamich/memfs
What they do well:
- Browser + Node Unified
  - Works in browser via File System API
  - Same API as Node's `fs`
- Union Filesystem Composition
  ```js
  import { Union } from 'unionfs';
  import { fs as memfs } from 'memfs';
  import * as fs from 'fs';

  const ufs = new Union();
  ufs.use(fs);    // Real filesystem as base
  ufs.use(memfs); // Memory overlay
  ```
Known Issues:
“There is a fork of memfs maintained by SageMath (sagemathinc/memfs-js) which was created to fix 13 security vulnerabilities revealed by npm audit. This fork exists because, as their GitHub description notes, ‘there are longstanding bugs’ in the upstream memfs.”
Lesson: Even popular libraries can have security issues. Our conformance test suite should be comprehensive.
Vulnerabilities Summary
| Library | Vulnerability | Type | Our Mitigation |
|---|---|---|---|
| Apache Commons VFS | CVE (pre-2.10.0) | URL-encoded path traversal | Decode before validate |
| Afero (Go) | Issue #282, #88 | Symlink escape from BasePathFs | Use strict-path, test thoroughly |
| PyFilesystem2 | Issue #171 | Symlink loop causes infinite recursion | Loop detection with max depth |
| memfs (Node) | 13 vulns in npm audit | Various (unspecified) | Comprehensive test suite |
Features Comparison Matrix
| Feature | fsspec | PyFS2 | Afero | Commons VFS | System.IO.Abs | AnyFS |
|---|---|---|---|---|---|---|
| Middleware composition | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ |
| Quota enforcement | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Path sandboxing | ❌ | ❌ | ✅ | ✅ | ❌ | ✅ |
| Rate limiting | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
| Caching layer | ✅ | ❌ | ✅ | ❌ | ❌ | ✅ |
| Async support | ✅ | ❌ | ❌ | ❌ | ❌ | 🔜 |
| Block-wise caching | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ |
| URL-based opening | ✅ | ✅ | ❌ | ✅ | ❌ | ❌ |
| Union/overlay FS | ❌ | ❌ | ✅ | ❌ | ❌ | ✅ |
| Memory backend | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| SQLite backend | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
| FUSE mounting | ✅ | ❌ | ✅ | ❌ | ❌ | 🔜 |
| Type-safe wrappers* | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
Future Ideas to Consider
These are optional extensions inspired by other ecosystems. They are intentionally not part of the core scope.
Keep (add-ons that fit the current design):
- URL-based backend registry (`sqlite://`, `mem://`, `stdfs://`) as a helper crate, not in core APIs.
- Bulk operation helpers (`read_many`, `write_many`, `copy_many`, `glob`, `walk`) as `FsExt` or a utilities crate.
- Early async adapter crate (`anyfs-async`) to support remote backends without changing sync traits.
- Bash-style shell (example app or `anyfs-shell` crate) that routes `ls`/`cd`/`cat`/`cp`/`mv`/`rm`/`mkdir`/`stat` through `FileStorage` to demonstrate middleware and backend neutrality (navigation and file management only, not full bash scripting).
- Copy-on-write overlay middleware (Afero-style `CopyOnWriteFs`) as a specialized `Overlay` variant.
- Archive backends (zip/tar) as separate crates implementing `Fs` (PyFilesystem/fsspec-style).
Defer (valuable, but needs data or wider review):
- Range/block caching middleware for `read_range`-heavy workloads (fsspec-style block cache).
- Runtime capability discovery (`Capabilities` struct) for feature detection (symlink control, case sensitivity, max path length).
- Lint/analyzer to discourage direct `std::fs` usage in app code (System.IO.Abstractions-style).
- Retry/timeout middleware for remote backends (once remote backends exist).
Drop for now (adds noise or cross-platform complexity):
- Change notification support (optional `FsWatch` trait or polling middleware).
Security Tests to Add
Based on vulnerabilities found in other libraries, add these to our conformance test suite:
Path Traversal Tests
```rust
#[test]
fn test_url_encoded_path_traversal() {
    let fs = create_sandboxed_fs("/sandbox");
    // These should all fail or be contained
    assert!(fs.read("%2e%2e/etc/passwd").is_err());      // URL-encoded ../
    assert!(fs.read("%252e%252e/secret").is_err());      // Double-encoded
    assert!(fs.read("..%2f..%2fetc/passwd").is_err());   // Mixed encoding
    assert!(fs.read("....//....//etc/passwd").is_err()); // Extra dots
}

#[test]
fn test_symlink_escape() {
    let fs = create_sandboxed_fs("/sandbox");
    // Symlink pointing outside should fail or be contained
    assert!(fs.symlink("/etc/passwd", "/sandbox/link").is_err());
    assert!(fs.symlink("../../../etc/passwd", "/sandbox/link").is_err());
    // Even if the symlink is created, reading should fail
    fs.symlink("../secret", "/sandbox/link").ok();
    assert!(fs.read("/sandbox/link").is_err());
}

#[test]
fn test_symlink_loop_detection() {
    let fs = MemoryBackend::new();
    // Create loop: a -> b -> a
    fs.symlink("/b", "/a").unwrap();
    fs.symlink("/a", "/b").unwrap();
    // Should detect loop, not hang
    let result = fs.read("/a");
    assert!(matches!(result, Err(FsError::TooManySymlinks { .. })));
}
```
Resource Exhaustion Tests
```rust
#[test]
fn test_deep_directory_traversal() {
    let fs = create_fs_with_depth_limit(64);
    // Creating very deep paths should fail
    let deep_path = "/".to_string() + &"a/".repeat(100);
    assert!(fs.create_dir_all(&deep_path).is_err());
}

#[test]
fn test_many_open_handles() {
    let fs = create_fs();
    let mut handles = vec![];
    // Opening many files shouldn't crash
    for i in 0..10000 {
        fs.write(format!("/file{}", i), b"x").unwrap();
        if let Ok(h) = fs.open_read(format!("/file{}", i)) {
            handles.push(h);
        }
    }
    // Should either succeed or return a resource error, not crash
}
```
Action Items
High Priority
| Task | Source | Priority |
|---|---|---|
| Add URL-encoded path traversal tests | Apache Commons VFS CVE | 🔴 Critical |
| Add symlink escape tests for VRootFsBackend | Afero issues | 🔴 Critical |
| Add symlink loop detection | PyFilesystem2 #171 | 🔴 Critical |
| Verify `strict-path` handles all edge cases | Afero BasePathFs issues | 🔴 Critical |
Medium Priority (Future)
| Task | Source | Priority |
|---|---|---|
| Consider block-wise caching for large files | fsspec | 🟡 Enhancement |
| Add async support | fsspec async design | 🟡 Enhancement |
| URL-based filesystem specification | PyFilesystem2, Commons VFS | 🟢 Nice-to-have |
Documentation
| Task | Source |
|---|---|
| Document symlink behavior for each backend | All libraries have issues |
| Add security considerations for path handling | Apache Commons VFS CVE |
| Compare AnyFS to alternatives | This analysis |
Sibling Rust Projects: Path Security Libraries
AnyFS builds on foundational security work from two related Rust crates that specifically address path resolution vulnerabilities. These crates are planned to be used in AnyFS’s path handling implementation.
soft-canonicalize-rs
Repository: DK26/soft-canonicalize-rs
Purpose: Path canonicalization that works with non-existing paths—a critical gap in std::fs::canonicalize.
Security Features:
| Feature | Description | Attack Prevented |
|---|---|---|
| NTFS ADS validation | Blocks alternate data stream syntax | Hidden data, path escape |
| Symlink cycle detection | Bounded depth tracking | DoS via infinite loops |
| Path traversal clamping | Can’t ascend past root | Directory escape |
| Null byte rejection | Early validation | Null injection |
| TOCTOU resistance | Atomic-like resolution | Race conditions |
| Windows UNC handling | Normalizes extended paths | Path confusion |
| Linux namespace preservation | Uses proc-canonicalize | Container escape via /proc/PID/root |
Key Innovation: Anchored Canonicalization
```rust
// All paths (including symlink targets) are clamped to the anchor
let result = anchored_canonicalize("/workspace", user_input)?;
// If a symlink points to /etc/passwd, the result becomes /workspace/etc/passwd
```
This is exactly what VRootFsBackend needs for safe path containment.
strict-path-rs
Repository: DK26/strict-path-rs
Purpose: Type-safe path handling that prevents traversal attacks at compile time.
Two Modes:
| Mode | Behavior | Use Case |
|---|---|---|
| `StrictPath` | Returns `Err(PathEscapesBoundary)` on escape | Archive extraction, file uploads |
| `VirtualPath` | Clamps escape attempts within sandbox | Multi-tenant, per-user storage |
Documented Attack Coverage (19+ vulnerabilities):
| Attack Type | Description |
|---|---|
| Symlink/junction escapes | Follows and validates canonical paths |
| Windows 8.3 short names | Detects PROGRA~1 obfuscation |
| NTFS Alternate Data Streams | Blocks file.txt:hidden:$DATA |
| Zip Slip (CVE-2018-1000178) | Validates archive entries before extraction |
| TOCTOU (CVE-2022-21658) | Handles time-of-check-time-of-use races |
| Unicode/encoding bypasses | Normalizes path representations |
| Mixed separators | Handles / and \ on Windows |
| UNC path tricks | Prevents \\?\C:\..\..\ attacks |
Type-Safe Marker Pattern (mirrors AnyFS’s design!):
```rust
struct UserFiles;
struct SystemFiles;

fn process_user(f: &StrictPath<UserFiles>) { /* ... */ }
// Wrong marker type = compile error
```
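The same zero-cost technique can be reproduced in a few lines with `PhantomData`; this sketch is independent of the actual `strict-path` types, which are far richer:

```rust
use std::marker::PhantomData;
use std::path::{Path, PathBuf};

// Zero-sized marker types distinguish path domains at compile time.
struct UserFiles;
struct SystemFiles;

// A path tagged with the domain it belongs to; the marker costs nothing at runtime.
struct TaggedPath<Marker> {
    path: PathBuf,
    _marker: PhantomData<Marker>,
}

impl<Marker> TaggedPath<Marker> {
    fn new(path: impl Into<PathBuf>) -> Self {
        Self { path: path.into(), _marker: PhantomData }
    }
}

// Accepts only user-domain paths; a TaggedPath<SystemFiles> argument will not compile.
fn process_user(p: &TaggedPath<UserFiles>) -> &Path {
    &p.path
}

fn main() {
    let user = TaggedPath::<UserFiles>::new("/users/alice/notes.txt");
    println!("{}", process_user(&user).display());
    // let sys = TaggedPath::<SystemFiles>::new("/etc/passwd");
    // process_user(&sys); // <- rejected by the type checker
}
```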
Applicability to AnyFS
Important distinction:
| Backend Type | Storage Mechanism | Path Resolution Provider |
|---|---|---|
| `VRootFsBackend` | Real filesystem | OS (backend is `SelfResolving`) |
| `MemoryBackend` | HashMap keys | `FileStorage` (symlink-aware) |
| `SqliteBackend` | DB strings | `FileStorage` (symlink-aware) |
For virtual backends (Memory, SQLite, etc.):
- These third-party crates perform real filesystem resolution (they follow actual symlinks on disk)
- Virtual backends treat paths as keys, so these crates can’t help
- AnyFS implements its own path resolution in `FileStorage` that:
  - Walks path components via `metadata()` and `read_link()`
  - Resolves symlinks by reading targets from virtual storage
  - Handles `..` correctly after symlink resolution
  - Detects loops by tracking visited virtual paths
For `VRootFsBackend` only:
- Since it wraps the real filesystem, `strict-path` provides safe containment
- The backend implements `SelfResolving`, so `FileStorage` skips its own resolution
Security Tests Added to Conformance Suite
Based on these libraries, we’ve added tests for:
Windows-Specific:
- NTFS Alternate Data Streams (`file.txt:hidden`)
- Windows 8.3 short names (`PROGRA~1`)
- UNC path traversal (`\\?\C:\..\..\`)
- Reserved device names (CON, PRN, NUL)
- Junction point escapes

Linux-Specific:
- `/proc/PID/root` magic symlinks
- `/dev/fd/N` file descriptor symlinks
Unicode:
- NFC vs NFD normalization
- Right-to-Left Override (U+202E)
- Homoglyph confusion (Cyrillic vs Latin)
TOCTOU:
- Check-then-use race conditions
- Symlink target changes during resolution
Conclusion
What makes AnyFS unique:
- Middleware composition - Only Afero has this, and we do it better (Tower-style)
- Quota + rate limiting - No other library has built-in resource control
- Type-safe wrappers - Users can create wrapper newtypes for compile-time container isolation
- SQLite backend - No other abstraction library offers this
What we should learn from others:
- Path traversal via encoding - Apache Commons VFS vulnerability
- Symlink handling complexity - All libraries struggle with this
- Caching strategies - fsspec’s block-wise caching is sophisticated
- Async support - fsspec shows how to do this well
Critical security tests to add:
- URL-encoded path traversal (`%2e%2e`)
- Symlink escape from sandboxed directories
- Symlink loop detection
- Deep path exhaustion
Sources
External Libraries
- fsspec Documentation
- PyFilesystem2 GitHub
- Afero GitHub
- Apache Commons VFS
- System.IO.Abstractions GitHub
- memfs GitHub
Sibling Rust Projects
Vulnerability References
- Apache Commons VFS CVEs (NVD search)
- Afero BasePathFs Issue #282
- PyFilesystem2 Symlink Loop Issue #171
- CVE-2018-1000178 (Zip Slip)
- CVE-2022-21658 (TOCTOU in Rust std)
Benchmarking Plan
This document specifies the benchmarking strategy for AnyFS when the implementation exists. Functionality and security are the primary goals; performance validation is secondary but important.
Goals
- Validate design decisions - Confirm that the Tower-style middleware approach doesn’t introduce unacceptable overhead
- Identify optimization opportunities - Find hot paths that need attention
- Establish baselines - Know where we stand relative to alternatives
- Prevent regressions - Track performance across versions
Benchmark Categories
1. Backend Benchmarks
Compare AnyFS backends against equivalent solutions for their specific use cases.
MemoryBackend vs Alternatives
| Competitor | Use Case | Why Compare |
|---|---|---|
| `std::collections::HashMap` | Raw key-value baseline | Theoretical minimum overhead |
| `tempfile` + `std::fs` | In-memory temp files | Common testing approach |
| `vfs::MemoryFS` | Virtual filesystem | Direct competitor |
| `virtual-fs` | In-memory FS | Another VFS crate |
Metrics:
- Sequential read/write throughput (1KB, 64KB, 1MB, 16MB files)
- Random access latency (small reads at random offsets)
- Directory listing performance (10, 100, 1000, 10000 entries)
- Memory overhead per file/directory
SqliteBackend vs Alternatives
| Competitor | Use Case | Why Compare |
|---|---|---|
| `rusqlite` raw | Baseline SQLite performance | Measure our abstraction cost |
| `sled` | Embedded database | Alternative storage engine |
| `redb` | Embedded database | Modern alternative |
| File-per-record | Direct filesystem | Traditional approach |
Metrics:
- Insert throughput (batch vs individual)
- Read throughput (sequential vs random)
- Transaction overhead
- Database size vs raw file size
- Startup time (opening existing database)
VRootFsBackend vs Alternatives
| Competitor | Use Case | Why Compare |
|---|---|---|
| `std::fs` direct | Baseline filesystem | Measure containment overhead |
| `cap-std` | Capability-based FS | Security-focused alternative |
| `chroot` simulation | Traditional sandboxing | System-level approach |
Metrics:
- Path resolution overhead
- Symlink traversal cost
- Escape attempt detection cost
2. Middleware Overhead Benchmarks
Measure the cost of each middleware layer.
| Middleware | What to Measure |
|---|---|
| `Quota<B>` | Size tracking overhead per operation |
| `PathFilter<B>` | Glob matching cost per path |
| `ReadOnly<B>` | Should be zero (just an error return) |
| `RateLimit<B>` | Fixed-window counter check overhead |
| `Tracing<B>` | Span creation/logging cost |
| `Cache<B>` | Cache hit/miss latency difference |
Key question: What’s the cost of a 5-layer middleware stack vs direct backend access?
Target: Middleware overhead should be <5% of I/O time for typical operations.
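As a rough illustration of the fixed-window check such a `RateLimit` layer performs on each operation (the names and reset policy here are assumptions, not the real implementation):

```rust
use std::time::{Duration, Instant};

// Fixed-window counter: allow at most `limit` operations per window.
struct FixedWindow {
    limit: u32,
    window: Duration,
    window_start: Instant,
    count: u32,
}

impl FixedWindow {
    fn new(limit: u32, window: Duration) -> Self {
        Self { limit, window, window_start: Instant::now(), count: 0 }
    }

    // Returns true if the operation is allowed in the current window.
    fn check(&mut self) -> bool {
        let now = Instant::now();
        if now.duration_since(self.window_start) >= self.window {
            // New window: reset the counter.
            self.window_start = now;
            self.count = 0;
        }
        if self.count < self.limit {
            self.count += 1;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut rl = FixedWindow::new(2, Duration::from_secs(1));
    assert!(rl.check());
    assert!(rl.check());
    assert!(!rl.check()); // third op in the same window is throttled
    println!("fixed-window throttling works");
}
```

The per-operation cost is one clock read and a comparison, which is why the overhead budget for this layer is small.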
3. Composition Benchmarks
Measure real-world stacks, not isolated components.
AI Agent Sandbox Stack
Quota → PathFilter → RateLimit → Tracing → MemoryBackend
Compare against:
- Raw MemoryBackend (baseline)
- Manual checks in application code (alternative approach)
Persistent Database Stack
Cache → Tracing → SqliteBackend
Compare against:
- Raw SqliteBackend (baseline)
- Application-level caching (alternative approach)
4. Trait Implementation Benchmarks
Validate that strategic boxing doesn’t hurt performance.
| Operation | Expected Cost |
|---|---|
| `read()` / `write()` | Zero-cost (monomorphized) |
| `open_read()` → `Box<dyn Read>` | ~50ns allocation, negligible vs I/O |
| `read_dir()` → `ReadDirIter` | One allocation per call |
| `FileStorage::boxed()` | One-time cost at setup |
Competitor Matrix
By Use Case
| Use Case | AnyFS Component | Primary Competitors |
|---|---|---|
| Testing/mocking | MemoryBackend | tempfile, vfs::MemoryFS |
| Embedded database | SqliteBackend | sled, redb, raw SQLite |
| Sandboxed host access | VRootFsBackend | cap-std, chroot |
| Policy enforcement | Middleware stack | Manual application code |
| Union filesystem | Overlay | overlayfs (kernel), fuse-overlayfs |
Crate Comparison
| Crate | Strengths | Weaknesses | Compare For |
|---|---|---|---|
| `vfs` | Simple API | No middleware, limited features | API ergonomics |
| `virtual-fs` | WASM support | Less composable | Cross-platform |
| `cap-std` | Security-focused | Different abstraction level | Sandboxing |
| `tempfile` | Battle-tested | Not a VFS | Temp file operations |
| `include_dir` | Compile-time embedding | Read-only | Embedded assets |
Benchmark Infrastructure
Framework
Use criterion for statistical rigor:
- Warm-up iterations
- Outlier detection
- Comparison between runs
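The real suite should use `criterion`; as a dependency-free illustration of the basic measurement pattern (a warm-up pass, then timed iterations), here is a stdlib-only sketch with assumed names:

```rust
use std::time::Instant;

// Measure average nanoseconds per call of `f` after a warm-up pass.
fn bench<F: FnMut()>(name: &str, iters: u32, mut f: F) -> f64 {
    for _ in 0..iters / 10 {
        f(); // warm-up: populate caches, trigger lazy init
    }
    let start = Instant::now();
    for _ in 0..iters {
        f();
    }
    let ns = start.elapsed().as_nanos() as f64 / iters as f64;
    println!("{name}: {ns:.1} ns/iter");
    ns
}

fn main() {
    let data = vec![0u8; 64 * 1024];
    // Example workload: copying a 64 KB buffer (stand-in for a backend read).
    let ns = bench("copy 64KB", 1_000, || {
        let copy = data.clone();
        std::hint::black_box(&copy); // keep the optimizer from eliding the work
    });
    assert!(ns > 0.0);
}
```

`criterion` adds what this sketch lacks: statistical outlier rejection, confidence intervals, and run-to-run comparison.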
Test Data Sets
| Dataset | Contents | Purpose |
|---|---|---|
| Small files | 1000 files × 1KB | Metadata-heavy workload |
| Large files | 10 files × 100MB | Throughput workload |
| Deep hierarchy | 10 levels × 10 dirs | Path resolution stress |
| Wide directory | 1 dir × 10000 files | Listing performance |
| Mixed realistic | Project-like structure | Real-world simulation |
Reporting
Generate:
- Throughput charts (ops/sec, MB/sec)
- Latency histograms (p50, p95, p99)
- Memory usage graphs
- Comparison tables vs competitors
Performance Targets
These are aspirational targets to validate during implementation:
| Metric | Target | Rationale |
|---|---|---|
| Middleware overhead | <5% of I/O time | Composability shouldn’t cost much |
| MemoryBackend vs HashMap | <2x slower | Abstraction cost |
| SqliteBackend vs raw SQLite | <1.5x slower | Thin wrapper |
| VRootFsBackend vs std::fs | <1.2x slower | Path checking cost |
| 5-layer stack | <10% overhead | Real-world composition |
Benchmark Workflow
Development Phase
cargo bench --bench <component>
Run focused benchmarks during development to catch regressions.
Release Phase
cargo bench --all
Full benchmark suite before releases, with comparison to previous version.
CI Integration
- Run subset of benchmarks on PR (smoke test)
- Full benchmark suite on main branch
- Store results for trend analysis
Non-Goals
- Beating std::fs at raw I/O - We add abstraction; some overhead is acceptable
- Micro-optimizing cold paths - Focus on hot paths (read, write, metadata)
- Benchmark gaming - Optimize for real use cases, not synthetic benchmarks
Tracking
GitHub Issue: Implement benchmark suite
- Blocked by: Core AnyFS implementation
- Dependencies: criterion, test data generation
- Milestone: Post-1.0 (after functionality and security are solid)
Implementation Plan
This plan describes a phased rollout of the AnyFS ecosystem:
- anyfs-backend: Layered traits (Fs, FsFull, FsFuse, FsPosix) + Layer + types
- anyfs: Built-in backends + middleware (feature-gated) + FileStorage<B> ergonomic wrapper
Implementation Guidelines
These guidelines apply to ALL implementation work. Derived from analysis of issues in similar projects (vfs, agentfs).
1. No Panic Policy
NEVER panic in library code. Always return Result<T, FsError>.
- Audit all .unwrap() and .expect() calls - replace with ? or proper error handling
- Use ok_or_else(|| FsError::...) instead of .unwrap()
- Edge cases must return errors, not panic
- Test in constrained environments (WASM) to catch hidden panics
#![allow(unused)]
fn main() {
// BAD
let entry = self.entries.get(&path).unwrap();
// GOOD
let entry = self.entries.get(&path)
.ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })?;
}
2. Thread Safety Requirements
All backends must be safe for concurrent access:
- MemoryBackend: Use Arc<RwLock<...>> for internal state
- SqliteBackend: Use WAL mode, handle SQLITE_BUSY
- VRootFsBackend: File operations are inherently concurrent-safe
Required: Concurrent stress tests in conformance suite.
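The Arc<RwLock<...>> pattern above can be sketched as follows. This is a hedged illustration, not the real MemoryBackend: the struct and method names are stand-ins, and lock-poisoning handling is elided with unwrap for brevity (library code would map it to an error instead):

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

// Interior mutability behind Arc<RwLock<...>> lets every method take
// &self while remaining safe under concurrent access.
#[derive(Clone, Default)]
struct MemoryFiles {
    inner: Arc<RwLock<HashMap<String, Vec<u8>>>>,
}

impl MemoryFiles {
    fn write(&self, path: &str, data: &[u8]) {
        // unwrap on the lock is for brevity only; see the no-panic policy.
        self.inner.write().unwrap().insert(path.to_string(), data.to_vec());
    }
    fn read(&self, path: &str) -> Option<Vec<u8>> {
        self.inner.read().unwrap().get(path).cloned()
    }
}

fn main() {
    let fs = MemoryFiles::default();
    // Miniature stress test: 8 threads writing concurrently.
    let handles: Vec<_> = (0..8)
        .map(|i| {
            let fs = fs.clone();
            thread::spawn(move || fs.write(&format!("/t{i}.txt"), b"data"))
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(fs.inner.read().unwrap().len(), 8);
    println!("8 files written concurrently");
}
```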
3. Consistent Path Handling
FileStorage handles path resolution via pluggable PathResolver trait (see ADR-033):
- Always absolute paths internally
- Always / separator (even on Windows)
- Default IterativeResolver: symlink-aware canonicalization (not lexical)
- Handle edge cases: //, trailing /, empty string, circular symlinks
- Optional resolver: CachingResolver (for read-heavy workloads)
Public canonicalization API on FileStorage:
- canonicalize(path) - strict, all components must exist
- soft_canonicalize(path) - resolves existing components, appends non-existent ones lexically
- anchored_canonicalize(path, anchor) - sandboxed resolution
Standalone utility:
- normalize(path) - lexical cleanup only (collapses //, removes trailing /). Does NOT resolve . or ..
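A minimal sketch of what the lexical normalize() utility described above could look like - it collapses repeated slashes and drops a trailing slash, but deliberately leaves "." and ".." components untouched (the function body here is illustrative, not the actual implementation):

```rust
// Lexical cleanup only: no symlink or dot-segment resolution.
fn normalize(path: &str) -> String {
    let mut out = String::from("/");
    for comp in path.split('/').filter(|c| !c.is_empty()) {
        if out.len() > 1 {
            out.push('/');
        }
        out.push_str(comp);
    }
    out
}

fn main() {
    assert_eq!(normalize("//double//slashes//"), "/double/slashes");
    assert_eq!(normalize("/foo/../bar/"), "/foo/../bar"); // ".." NOT resolved
    assert_eq!(normalize("/"), "/");
    println!("ok");
}
```

Keeping ".." intact is the point: resolving it correctly requires knowing whether the preceding component is a symlink, which is the resolver's job, not a string operation.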
4. Error Type Design
FsError must be:
- Easy to pattern match
- Include context (path, operation)
- Derive thiserror for good messages
- Use #[non_exhaustive] for forward compatibility
#![allow(unused)]
fn main() {
#[non_exhaustive]
#[derive(Debug, thiserror::Error)]
pub enum FsError {
// Path/File Errors
#[error("not found: {path}")]
NotFound { path: PathBuf },
#[error("{operation}: already exists: {path}")]
AlreadyExists { path: PathBuf, operation: &'static str },
#[error("not a file: {path}")]
NotAFile { path: PathBuf },
#[error("not a directory: {path}")]
NotADirectory { path: PathBuf },
#[error("directory not empty: {path}")]
DirectoryNotEmpty { path: PathBuf },
// Permission/Access Errors
#[error("{operation}: permission denied: {path}")]
PermissionDenied { path: PathBuf, operation: &'static str },
#[error("access denied: {path} ({reason})")]
AccessDenied { path: PathBuf, reason: String },
#[error("read-only filesystem: {operation}")]
ReadOnly { operation: &'static str },
#[error("{operation}: feature not enabled: {feature}")]
FeatureNotEnabled { feature: &'static str, operation: &'static str },
// Resource Limit Errors (from Quota middleware)
#[error("quota exceeded: limit {limit}, requested {requested}, usage {usage}")]
QuotaExceeded { limit: u64, requested: u64, usage: u64 },
#[error("file size exceeded: {path} ({size} > {limit})")]
FileSizeExceeded { path: PathBuf, size: u64, limit: u64 },
#[error("rate limit exceeded: {limit}/s (window: {window_secs}s)")]
RateLimitExceeded { limit: u32, window_secs: u64 },
// ... see design-overview.md for complete list
}
}
See design-overview.md for the complete FsError definition.
5. Documentation Requirements
Every backend and middleware must document:
- Thread safety guarantees
- Performance characteristics
- Which operations are O(1) vs O(n)
- Any platform-specific behavior
Phase 1: anyfs-backend (core contract)
Goal: Define the stable backend interface using layered traits.
Layered Trait Architecture
FsPosix
│
┌──────────────┼──────────────┐
│ │ │
FsHandles FsLock FsXattr
│ │ │
└──────────────┴──────────────┘
│
FsFuse ← FsFull + FsInode
│
┌──────────────┴──────────────┐
│ │
FsFull FsInode
│
│
├──────┬───────┬───────┬──────┐
│ │ │ │ │
FsLink FsPerm FsSync FsStats │
│ │ │ │ │
└──────┴───────┴───────┴──────┘
│
Fs ← Most users only need this
│
┌───────────┼───────────┐
│ │ │
FsRead FsWrite FsDir
Core Traits (Layer 1 - Required)
- FsRead: read, read_to_string, read_range, exists, metadata, open_read
- FsWrite: write, append, remove_file, rename, copy, truncate, open_write
- FsDir: read_dir, create_dir, create_dir_all, remove_dir, remove_dir_all
Extended Traits (Layer 2 - Optional)
- FsLink: symlink, hard_link, read_link, symlink_metadata
- FsPermissions: set_permissions
- FsSync: sync, fsync
- FsStats: statfs
Inode Trait (Layer 3 - For FUSE)
- FsInode: path_to_inode, inode_to_path, lookup, metadata_by_inode
- No blanket/default implementation - must be explicitly implemented
- Required for FUSE mounting (FUSE operates on inodes, not paths)
- Enables correct hardlink reporting (same inode = same file, nlink count)
- Note: FsLink defines hardlink creation; FsInode enables FUSE to track them
- inode_to_path requires the backend to maintain path mappings
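The path-mapping bookkeeping an in-memory backend needs for FsInode can be sketched like this. The `InodeTable` type and its methods are hypothetical illustrations of the contract, not the actual API; note how a hardlink maps a second path to the same inode number:

```rust
use std::collections::HashMap;

const ROOT_INODE: u64 = 1;

// Incrementing counter assigns inode numbers; two maps keep the
// path<->inode association that inode_to_path requires.
struct InodeTable {
    next: u64,
    by_path: HashMap<String, u64>,
    by_inode: HashMap<u64, String>,
}

impl InodeTable {
    fn new() -> Self {
        let mut t = InodeTable { next: 2, by_path: HashMap::new(), by_inode: HashMap::new() };
        t.by_path.insert("/".into(), ROOT_INODE);
        t.by_inode.insert(ROOT_INODE, "/".into());
        t
    }

    fn path_to_inode(&mut self, path: &str) -> u64 {
        if let Some(&ino) = self.by_path.get(path) {
            return ino;
        }
        let ino = self.next;
        self.next += 1;
        self.by_path.insert(path.to_string(), ino);
        self.by_inode.insert(ino, path.to_string());
        ino
    }

    fn hard_link(&mut self, existing: &str, new_path: &str) {
        // Same inode = same file; inode_to_path keeps the first path.
        let ino = self.path_to_inode(existing);
        self.by_path.insert(new_path.to_string(), ino);
    }

    fn inode_to_path(&self, ino: u64) -> Option<&String> {
        self.by_inode.get(&ino)
    }
}

fn main() {
    let mut t = InodeTable::new();
    let a = t.path_to_inode("/a.txt");
    t.hard_link("/a.txt", "/b.txt");
    assert_eq!(t.path_to_inode("/b.txt"), a); // hardlinks share one inode
    assert_eq!(t.inode_to_path(ROOT_INODE).map(String::as_str), Some("/"));
    println!("inode bookkeeping ok");
}
```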
POSIX Traits (Layer 4 - Full POSIX)
- FsHandles: open, read_at, write_at, close
- FsLock: lock, try_lock, unlock
- FsXattr: get_xattr, set_xattr, remove_xattr, list_xattr
Convenience Supertraits
#![allow(unused)]
fn main() {
/// Basic filesystem - covers 90% of use cases
pub trait Fs: FsRead + FsWrite + FsDir {}
impl<T: FsRead + FsWrite + FsDir> Fs for T {}
/// Full filesystem with all std::fs features
pub trait FsFull: Fs + FsLink + FsPermissions + FsSync + FsStats {}
/// FUSE-mountable filesystem
pub trait FsFuse: FsFull + FsInode {}
/// Full POSIX filesystem
pub trait FsPosix: FsFuse + FsHandles + FsLock + FsXattr {}
}
Other Definitions
- Define Layer trait (Tower-style middleware composition)
- Define FsExt trait (extension methods for JSON, type checks)
- Define FsPath trait (path canonicalization with default impl, requires FsRead + FsLink)
- Define core types (Metadata, Permissions, FileType, DirEntry, StatFs)
- Define FsError with contextual variants (see guidelines above)
- Define ROOT_INODE = 1 constant
- Define SelfResolving marker trait (opt-in for backends that handle their own path resolution, e.g., VRootFsBackend)
Exit criteria: anyfs-backend stands alone with minimal dependencies (thiserror required; serde optional for JSON in FsExt).
Phase 2: anyfs (backends + middleware)
Goal: Provide reference backends and core middleware.
Path Resolution (FileStorage’s Responsibility)
FileStorage handles path resolution using its configured PathResolver:
- Walks the path component by component using metadata() and read_link()
- Handles .. correctly after symlink resolution (symlink-aware, not lexical)
- Default IterativeResolver follows symlinks for backends that implement FsLink
- Custom resolvers can implement different behaviors (e.g., no symlink following)
- Detects circular symlinks (max depth or visited set)
- Returns canonical resolved path to the backend
SelfResolving backends (StdFsBackend, VRootFsBackend) handle their own resolution. Use FileStorage::with_resolver(backend, NoOpResolver) explicitly.
Backends receive already-resolved paths - they just store/retrieve bytes.
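The iterative, symlink-aware walk described above can be sketched as follows. This is an illustration under stated assumptions, not the real IterativeResolver: the `links` map stands in for what the resolver would learn via metadata()/read_link(), and the depth counter is the circular-symlink guard:

```rust
use std::collections::HashMap;

// Resolve `path` against a map of symlink -> target, component by
// component, so that ".." applies AFTER symlink expansion.
fn resolve(path: &str, links: &HashMap<String, String>, max_depth: usize) -> Result<String, String> {
    let mut resolved: Vec<String> = Vec::new();
    // Components still to process, in a stack (rightmost on top).
    let mut pending: Vec<String> = path
        .split('/')
        .filter(|c| !c.is_empty())
        .map(String::from)
        .rev()
        .collect();
    let mut hops = 0;
    while let Some(comp) = pending.pop() {
        match comp.as_str() {
            "." => {}
            ".." => {
                resolved.pop(); // applied after symlink expansion, not lexically
            }
            _ => {
                resolved.push(comp);
                let current = format!("/{}", resolved.join("/"));
                if let Some(target) = links.get(&current) {
                    hops += 1;
                    if hops > max_depth {
                        return Err("too many levels of symbolic links".into());
                    }
                    resolved.pop(); // replace the link component with its target
                    if target.starts_with('/') {
                        resolved.clear(); // absolute target restarts at root
                    }
                    for c in target.split('/').filter(|c| !c.is_empty()).rev() {
                        pending.push(c.to_string());
                    }
                }
            }
        }
    }
    Ok(format!("/{}", resolved.join("/")))
}

fn main() {
    let mut links = HashMap::new();
    links.insert("/foo".to_string(), "/a/b".to_string());
    // ".." follows the symlink first: /foo/../bar → /a/bar, not /bar.
    assert_eq!(resolve("/foo/../bar", &links, 8).unwrap(), "/a/bar");
    links.insert("/x".to_string(), "/y".to_string());
    links.insert("/y".to_string(), "/x".to_string());
    assert!(resolve("/x", &links, 8).is_err()); // cycle caught by depth limit
    println!("ok");
}
```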
Backends (feature-gated)
Each backend implements the traits it supports:
- memory (default): MemoryBackend
  - Implements: Fs + FsLink + FsPermissions + FsSync + FsStats + FsInode = FsFuse
  - FileStorage handles path resolution (symlink-aware)
  - Inode source: internal node IDs (incrementing counter)
- stdfs (optional): StdFsBackend - direct std::fs delegation
  - Implements: FsPosix (all traits including Layer 4) + SelfResolving (OS handles resolution)
  - Inode source: OS inode numbers (std::fs::Metadata::ino())
  - No path containment - full filesystem access
  - Use when you only need middleware layers without sandboxing
- vrootfs (optional): VRootFsBackend using strict-path for containment
  - Implements: FsPosix (all traits including Layer 4) + SelfResolving (OS handles resolution, strict-path prevents escapes)
  - Inode source: OS inode numbers (std::fs::Metadata::ino())
Middleware
- Quota<B> + QuotaLayer - Resource limits
- Restrictions<B> + RestrictionsLayer - Runtime policy (.deny_permissions())
- PathFilter<B> + PathFilterLayer - Path-based access control
- ReadOnly<B> + ReadOnlyLayer - Block writes
- RateLimit<B> + RateLimitLayer - Operation throttling
- Tracing<B> + TracingLayer - Instrumentation
- DryRun<B> + DryRunLayer - Log without executing
- Cache<B> + CacheLayer - LRU read cache
- Overlay<B1,B2> + OverlayLayer - Union filesystem
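The wrapping pattern shared by all of these middleware can be sketched with ReadOnly, the simplest one. The trait and types below are deliberately minimal stand-ins (the real traits live in anyfs-backend, use &self, and carry richer types), so this is a shape illustration rather than the actual API:

```rust
use std::collections::HashMap;

// Minimal stand-in for the backend contract.
trait Backend {
    fn write(&mut self, path: &str, data: &[u8]) -> Result<(), String>;
}

struct Memory {
    files: HashMap<String, Vec<u8>>,
}

impl Backend for Memory {
    fn write(&mut self, path: &str, data: &[u8]) -> Result<(), String> {
        self.files.insert(path.to_string(), data.to_vec());
        Ok(())
    }
}

// ReadOnly middleware: implements the same trait, wraps any Backend,
// and rejects mutations instead of delegating them.
struct ReadOnly<B>(B);

impl<B: Backend> Backend for ReadOnly<B> {
    fn write(&mut self, path: &str, _data: &[u8]) -> Result<(), String> {
        Err(format!("read-only filesystem: write {path}"))
    }
}

fn main() {
    let mut fs = ReadOnly(Memory { files: HashMap::new() });
    assert!(fs.write("/a.txt", b"hi").is_err());
    println!("write blocked by ReadOnly middleware");
}
```

Because each middleware is generic over any B implementing the trait, stacks like Quota<ReadOnly<Memory>> compose with static dispatch, which is the Tower-style property the Layer trait packages up.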
FileStorage (Ergonomic Wrapper)
- FileStorage<B> - Thin wrapper with std::fs-aligned API
  - Generic backend B (no boxing, static dispatch)
  - Boxed PathResolver internally (cold path, boxing OK per ADR-025)
  - .boxed() method for opt-in type erasure when needed
  - Users who need type-safe domains create wrapper types: struct SandboxFs(FileStorage<B>)
- BackendStack builder for fluent middleware composition
- Accepts impl AsRef<Path> in FileStorage/FsExt (core traits use &Path)
- Delegates all operations to wrapped backend
Axum-style design: Zero-cost by default, type erasure opt-in.
Note: FileStorage contains NO policy logic. Policy is handled by middleware.
Exit criteria: Each backend implements the appropriate trait level (Fs, FsFull, FsFuse) and passes conformance suite. Each middleware wraps backends implementing the same traits. Applications can use FileStorage as drop-in for std::fs patterns.
Phase 3: Conformance test suite
Goal: Prevent backend divergence and validate middleware behavior.
Backend conformance tests
Conformance tests are organized by trait layer:
Layer 1: Fs (Core) - All backends MUST pass
- FsRead: read / read_to_string / read_range / exists / metadata / open_read
- FsWrite: write / append / remove_file / rename / copy / truncate / open_write
- FsDir: read_dir / create_dir* / remove_dir*
Layer 2: FsFull (Extended) - Backends that support these features
- FsLink: symlink / hard_link / read_link / symlink_metadata
- FsPermissions: set_permissions
- FsSync: sync / fsync
- FsStats: statfs
Layer 3: FsFuse (Inode) - Backends that support FUSE mounting
- FsInode: path_to_inode / inode_to_path / lookup / metadata_by_inode
Layer 4: FsPosix (Full POSIX) - Backends that support full POSIX
- FsHandles: open / read_at / write_at / close
- FsLock: lock / try_lock / unlock
- FsXattr: get_xattr / set_xattr / remove_xattr / list_xattr
Path Resolution Tests (virtual backends only)
- /foo/../bar resolves correctly when foo is a regular directory
- /foo/../bar resolves correctly when foo is a symlink (follows the symlink, then ..)
- Symlink chains resolve correctly (A → B → C → target)
- Circular symlink detection (A → B → A returns error, not infinite loop)
- Max symlink depth enforced (prevent deep chains)
- Reading a symlink follows the target (virtual backends)
Path Edge Cases (learned from vfs issues)
- //double//slashes// normalizes correctly
- Note: /foo/../bar requires resolution (see above), not simple normalization
- Trailing slashes handled consistently
- Empty path returns error (not panic)
- Root path / works correctly
- Very long paths (near OS limits)
- Unicode paths
- Paths with spaces and special characters
Thread Safety Tests (learned from vfs #72, #47)
- Concurrent read from multiple threads
- Concurrent write to different files
- Concurrent create_dir_all to the same path (must not race)
- Concurrent read_dir while modifying the directory
- Stress test: 100 threads, 1000 operations each
Error Handling Tests (learned from vfs #8, #23)
- Missing file returns NotFound, not panic
- Missing parent directory returns error, not panic
- Invalid UTF-8 in path returns error, not panic
- All error variants are matchable
Platform Tests
- Windows path separators (\ vs /)
- Case sensitivity differences
- Symlink behavior differences
Middleware tests
- Quota: Limit enforcement, usage tracking, streaming writes
- Restrictions: Permission blocking via .deny_permissions(), error messages
- PathFilter: Glob pattern matching, deny-by-default
- RateLimit: Throttling behavior, burst handling
- ReadOnly: All write operations blocked
- Tracing: Operations logged correctly
- Middleware composition order (inner to outer)
- Middleware with streaming I/O (wrappers work correctly)
No-Panic Tests
#![allow(unused)]
fn main() {
#[test]
fn no_panic_on_missing_file() {
let backend = create_backend();
let result = backend.read(std::path::Path::new("/nonexistent"));
assert!(matches!(result, Err(FsError::NotFound { .. })));
}
#[test]
fn no_panic_on_invalid_operation() {
let backend = create_backend();
backend.write(std::path::Path::new("/file.txt"), b"data").unwrap();
// Try to read directory on a file
let result = backend.read_dir(std::path::Path::new("/file.txt"));
assert!(matches!(result, Err(FsError::NotADirectory { .. })));
}
}
WASM Compatibility Tests (learned from vfs #68)
#![allow(unused)]
fn main() {
#[cfg(target_arch = "wasm32")]
#[wasm_bindgen_test]
fn memory_backend_works_in_wasm() {
let backend = MemoryBackend::new();
backend.write(std::path::Path::new("/test.txt"), b"hello").unwrap();
// Should not panic
}
}
Exit criteria: All backends pass same suite; middleware tests are backend-agnostic; zero panics in any test.
Phase 4: Documentation + examples
- Keep AGENTS.md and src/architecture/design-overview.md authoritative
- Provide an example per backend
- Provide backend implementer guide
- Provide middleware implementer guide
- Document performance characteristics per backend
- Document thread safety guarantees per backend
- Document platform-specific behavior
Phase 5: CI/CD Pipeline
Goal: Ensure quality across platforms and prevent regressions.
Cross-Platform Testing
# .github/workflows/ci.yml
strategy:
matrix:
os: [ubuntu-latest, windows-latest, macos-latest]
rust: [stable, beta]
Required CI checks:
- cargo test on all platforms
- cargo clippy -- -D warnings
- cargo fmt --check
- cargo doc --no-deps
- WASM build test: cargo build --target wasm32-unknown-unknown
Additional CI Jobs
- Miri (undefined behavior detection): cargo +nightly miri test
- Address Sanitizer: Detect memory issues
- Thread Sanitizer: Detect data races
- Coverage: Minimum 80% line coverage
Release Checklist
- All CI checks pass
- No new clippy warnings
- CHANGELOG updated
- Version bumped appropriately
- Documentation builds without warnings
Phase 6: Mounting Support (fuse, winfsp features)
Goal: Make mounting AnyFS stacks easy, safe, and enjoyable for programmers. Mounting is part of the anyfs crate behind feature flags.
Milestones
- Phase 0 (design complete): API shape and roadmap
  - MountHandle, MountBuilder, MountOptions, MountError
  - Platform detection hooks (is_available) and error mapping
  - Examples anchored in the mounting guide
- Phase 1: Linux FUSE MVP (read-only)
  - Lookup/getattr/readdir/read via fuser
  - Read-only mount option; write ops return PermissionDenied
- Phase 2: Linux FUSE read/write
- Create/write/rename/remove/link operations
- Capability reporting and metadata mapping
- Phase 3: macOS parity (macFUSE)
- Adapter compatibility + driver detection
- Phase 4: Windows support (WinFsp, optional Dokan)
- Windows-specific mapping + driver detection
Exit criteria: Phase 2 delivered with reliable mount/unmount, no panics, and smoke tests; macOS/Windows continue in subsequent milestones.
API sketch (subject to change):
#![allow(unused)]
fn main() {
use anyfs::{MemoryBackend, QuotaLayer, FsFuse, MountHandle};
// RAM drive with 1GB quota
let backend = MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(1024 * 1024 * 1024)
.build());
// Backend must implement FsFuse (includes FsInode)
let mount = MountHandle::mount(backend, "/mnt/ramdisk")?;
// Now it's a real mount point:
// $ df -h /mnt/ramdisk
// $ cp large_file.bin /mnt/ramdisk/ # fast!
// $ gcc -o /mnt/ramdisk/build ... # compile in RAM
}
Cross-Platform Support (planned):
| Platform | Provider | Rust Crate | Feature Flag | User Must Install |
|---|---|---|---|---|
| Linux | FUSE | fuser | fuse | fuse3 package |
| macOS | macFUSE | fuser | fuse | macFUSE |
| Windows | WinFsp | winfsp | winfsp | WinFsp |
The anyfs crate provides a unified API across platforms:
#![allow(unused)]
fn main() {
impl MountHandle {
#[cfg(unix)]
pub fn mount<B: FsFuse>(backend: B, path: impl AsRef<Path>) -> Result<Self, ...> {
// Uses fuser crate
}
#[cfg(windows)]
pub fn mount<B: FsFuse>(backend: B, path: impl AsRef<Path>) -> Result<Self, ...> {
// Uses winfsp crate
}
}
}
Creative Use Cases:
| Backend Stack | What You Get |
|---|---|
| MemoryBackend | RAM drive |
| MemoryBackend + Quota | RAM drive with size limit |
| SqliteBackend | Single-file portable drive |
| SqliteBackend (with SQLCipher) | Encrypted portable drive |
| Overlay<SqliteBackend, MemoryBackend> | Persistent base + RAM scratch layer |
| Cache<SqliteBackend> | SQLite with RAM read cache |
| Tracing<MemoryBackend> | RAM drive with full audit log |
| ReadOnly<SqliteBackend> | Immutable snapshot mount |
Example: AI Agent Sandbox
#![allow(unused)]
fn main() {
// Sandboxed workspace mounted as real filesystem
let sandbox = MountHandle::mount(
MemoryBackend::new()
.layer(PathFilterLayer::builder()
.allow("/**")
.deny("**/..*") // No hidden files
.build())
.layer(QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024)
.build()),
"/mnt/agent-workspace"
)?;
// Agent's tools can now use standard filesystem APIs
// All operations are sandboxed, logged, and quota-limited
}
Architecture:
┌────────────────────────────────────────────────┐
│ /mnt/myfs (FUSE mount point) │
├────────────────────────────────────────────────┤
│ anyfs::mount (fuse/winfsp feature) │
│ - Linux/macOS: fuser │
│ - Windows: winfsp │
├────────────────────────────────────────────────┤
│ Middleware stack (Quota, PathFilter, etc.) │
├────────────────────────────────────────────────┤
│ FsFuse (Memory, SQLite, etc.) │
│ └─ includes FsInode for efficient lookups │
│ │
│ Optional: FsPosix for locks/xattr │
└────────────────────────────────────────────────┘
Requirements:
- Backend must implement FsFuse (includes FsInode for efficient inode operations)
- Backends implementing FsPosix get full lock/xattr support
- Platform-specific FUSE provider must be installed
Future work (post-MVP)
- Async API (AsyncFs, AsyncFsFull, etc.)
- Import/export helpers (host path <-> container)
- Encryption middleware
- Compression middleware
- no_std support (learned from vfs #38)
- Batch operations for performance (learned from agentfs #130)
- URL-based backend registry helper (e.g., sqlite://, mem://)
- Copy-on-write overlay variant (Afero-style CopyOnWriteFs)
- Archive backends (zip/tar) as separate crates
- Indexing middleware with pluggable index backends (SQLite, PostgreSQL, MariaDB, etc.)
- Companion shell (anyfs-shell) for interactive exploration of backends and middleware
- Language bindings (anyfs-python via PyO3, C bindings) - see design-overview.md for approach
- Dynamic middleware plugin system (MiddlewarePlugin trait for runtime-loaded .so/.dll plugins)
- Metrics middleware with Prometheus exporter (GET /metrics endpoint)
- Configurable tracing/logging backends (structured logs, CEF events, remote sinks)
anyfs-shell - Local Companion Shell
Minimal interactive shell for exploring AnyFS behavior without writing a full app. This is a companion crate, not part of the core libraries.
Goals:
- Route all operations through FileStorage to exercise middleware and backend composition.
- Provide a familiar, low-noise CLI for navigation and file management.
- Keep scope intentionally small (no scripting, pipes, job control).
Command set:
- ls [path] - list directory entries (default: current directory)
- cd <path> - change working directory
- pwd - print current directory
- cat <path> - print file contents (UTF-8; error on invalid data)
- cp <src> <dst> - copy files
- mv <src> <dst> - rename/move files
- rm <path> - remove file
- mkdir <path> - create directory
- stat <path> - show metadata (type, size, times, permissions if supported)
- help, exit - basic shell control
Flags (minimal):
- ls -l - long listing with size/type and modified time (when available)
- mkdir -p - create intermediate directories
- rm -r - remove directory tree
Backend selection (initial sketch):
- --backend mem (default), --backend sqlite --db path, --backend stdfs --root path, --backend vrootfs --root path
- --config path to load a small TOML file describing the backend + middleware stack
Example session:
anyfs:/ > ls
docs tmp hello.txt
anyfs:/ > cat hello.txt
Hello!
anyfs:/ > stat docs
type=dir size=0 modified=2025-02-01T12:34:56Z
anyfs:/ > exit
anyfs-vfs-compat - Interop with vfs crate
Adapter crate for bidirectional compatibility with the vfs crate ecosystem.
Why not adopt their trait? The vfs::FileSystem trait is too limited:
- No symlinks, hard links, or permissions
- No sync/fsync for durability
- No truncate, statfs, or read_range
- No middleware composition pattern
Our layered traits are a superset - Fs covers everything vfs::FileSystem does, plus our extended traits add more.
Adapters:
#![allow(unused)]
fn main() {
// Wrap a vfs::FileSystem to use as AnyFS backend
// Only implements Fs (Layer 1) - no links, permissions, etc.
pub struct VfsCompat<F: vfs::FileSystem>(F);
impl<F: vfs::FileSystem> FsRead for VfsCompat<F> { ... }
impl<F: vfs::FileSystem> FsWrite for VfsCompat<F> { ... }
impl<F: vfs::FileSystem> FsDir for VfsCompat<F> { ... }
// VfsCompat<F> implements Fs via blanket impl
// Wrap an AnyFS backend to use as vfs::FileSystem
// Any backend implementing Fs works
pub struct AnyFsCompat<B: Fs>(B);
impl<B: Fs> vfs::FileSystem for AnyFsCompat<B> { ... }
}
Use cases:
- Migrate from vfs to AnyFS incrementally
- Use existing vfs backends (EmbeddedFS) in AnyFS
- Use AnyFS backends in projects that depend on vfs
Cloud Storage & Remote Access
The layered trait design enables building cloud storage services - each adapter requires only the traits it needs.
Architecture:
┌─────────────────────────────────────────────────────────────────────┐
│ YOUR SERVER │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │ Quota<Tracing<SqliteBackend>> (implements FsFuse) │ │
│ └───────────────────────────────────────────────────────────────┘ │
│ ▲ ▲ ▲ ▲ │
│ │ │ │ │ │
│ ┌────┴────┐ ┌─────┴─────┐ ┌─────┴─────┐ ┌─────┴─────┐ │
│ │ S3 API │ │ gRPC/REST │ │ NFS │ │ WebDAV │ │
│ │ (Fs) │ │ (Fs) │ │ (FsFuse) │ │ (FsFull)│ │
│ └────┬────┘ └─────┬─────┘ └─────┬─────┘ └─────┬─────┘ │
└─────────┼──────────────┼──────────────┼──────────────┼─────────────┘
│ │ │ │
▼ ▼ ▼ ▼
AWS SDK/CLI Your SDK/app mount /cloud mount /webdav
Future crates for remote access:
| Crate | Required Trait | Purpose |
|---|---|---|
| anyfs-s3-server | Fs | Expose as S3-compatible API (objects = files) |
| anyfs-sftp-server | FsFull | SFTP server with permissions/links |
| anyfs-ssh-shell | FsFuse | SSH server with FUSE-mounted home directories |
| anyfs-remote | Fs | RemoteBackend client (implements Fs) |
| anyfs-grpc | Fs | gRPC protocol adapter |
| anyfs-webdav | FsFull | WebDAV server (needs permissions) |
| anyfs-nfs | FsFuse | NFS server (needs inodes) |
anyfs-s3-server - S3-Compatible Object Storage
Expose any Fs backend as an S3-compatible API. Users access your storage with standard AWS SDKs.
#![allow(unused)]
fn main() {
use anyfs::{QuotaLayer, TracingLayer};
use anyfs_sqlite::SqliteBackend; // Ecosystem crate
use anyfs_s3_server::S3Server;
// Your storage backend with quotas and audit logging
let backend = SqliteBackend::open("storage.db")?
.layer(TracingLayer::new())
.layer(QuotaLayer::builder()
.max_total_size(100 * 1024 * 1024 * 1024) // 100GB
.build());
S3Server::new(backend)
.with_auth(auth_provider) // Your auth implementation
.with_bucket("user-files") // Virtual bucket name
.bind("0.0.0.0:9000")
.run()
.await?;
}
Client usage (standard AWS CLI/SDK):
# Upload a file
aws s3 cp document.pdf s3://user-files/ --endpoint-url http://yourserver:9000
# List files
aws s3 ls s3://user-files/ --endpoint-url http://yourserver:9000
# Download a file
aws s3 cp s3://user-files/document.pdf ./local.pdf --endpoint-url http://yourserver:9000
anyfs-remote - Remote Backend Client
An Fs implementation that connects to a remote server. Works with FileStorage or mounting.
#![allow(unused)]
fn main() {
use anyfs_remote::RemoteBackend;
use anyfs::FileStorage;
// Connect to your cloud service
let remote = RemoteBackend::connect("https://api.yourservice.com")
.with_auth(api_key)
.await?;
// Use like any other backend
let fs = FileStorage::new(remote);
fs.write("/documents/report.pdf", data)?;
}
Combined with FUSE for transparent mount:
#![allow(unused)]
fn main() {
use anyfs_remote::RemoteBackend;
use anyfs::MountHandle;
// Mount remote storage as local directory
let remote = RemoteBackend::connect("https://yourserver.com")?;
MountHandle::mount(remote, "/mnt/cloud")?;
// Now use standard filesystem tools:
// $ cp file.txt /mnt/cloud/
// $ ls /mnt/cloud/
// $ cat /mnt/cloud/file.txt
}
anyfs-grpc - gRPC Protocol
Efficient binary protocol for remote Fs access.
Server side:
#![allow(unused)]
fn main() {
use anyfs_grpc::GrpcServer;
let backend = SqliteBackend::open("storage.db")?;
GrpcServer::new(backend)
.bind("[::1]:50051")
.serve()
.await?;
}
Client side:
#![allow(unused)]
fn main() {
use anyfs_grpc::GrpcBackend;
let backend = GrpcBackend::connect("http://[::1]:50051").await?;
let fs = FileStorage::new(backend);
}
Multi-Tenant Cloud Storage Example
#![allow(unused)]
fn main() {
use anyfs::{QuotaLayer, PathFilterLayer, TracingLayer};
use anyfs_sqlite::SqliteBackend; // Ecosystem crate
use anyfs_s3_server::S3Server;
// Per-tenant backend factory
fn create_tenant_storage(tenant_id: &str, quota_bytes: u64) -> impl Fs {
let db_path = format!("/data/tenants/{}.db", tenant_id);
SqliteBackend::open(&db_path).unwrap()
.layer(TracingLayer::new()
.with_target(&format!("tenant.{}", tenant_id)))
.layer(PathFilterLayer::builder()
.allow("/**")
.deny("../**") // No path traversal
.build())
.layer(QuotaLayer::builder()
.max_total_size(quota_bytes)
.build())
}
// Tenant-aware S3 server
S3Server::new_multi_tenant(|request| {
let tenant_id = extract_tenant(request)?;
let quota = get_tenant_quota(tenant_id)?;
Ok(create_tenant_storage(tenant_id, quota))
})
.bind("0.0.0.0:9000")
.run()
.await?;
}
anyfs-sftp-server - SFTP Access with Shell Commands
Expose a FsFull backend as an SFTP server. Users connect with standard SSH/SFTP clients and navigate with familiar shell commands.
Architecture:
YOUR SERVER
┌───────────────┐     ┌───────────────────────────────────────┐
│ SFTP Server   │────▶│ User's isolated FileStorage           │
│ (anyfs-sftp)  │     │  └─▶ Quota<SqliteBackend>             │
└───────────────┘     │       └─▶ /data/users/alice.db        │
        ▲             └───────────────────────────────────────┘
        │ sftp://
  ┌─────┴─────┐
  │ Remote    │   $ cd /documents
  │ User      │   $ ls
  │ (shell)   │   $ put file.txt
  └───────────┘
Server implementation:
#![allow(unused)]
fn main() {
use anyfs::{QuotaLayer, TracingLayer};
use anyfs_sqlite::SqliteBackend; // Ecosystem crate
use anyfs_sftp_server::SftpServer;
// Per-user isolated backend factory
fn get_user_storage(username: &str) -> impl FsFull {
let db_path = format!("/data/users/{}.db", username);
SqliteBackend::open(&db_path).unwrap()
.layer(TracingLayer::new()
.with_target(&format!("user.{}", username)))
.layer(QuotaLayer::builder()
.max_total_size(10 * 1024 * 1024 * 1024) // 10GB per user
.build())
}
SftpServer::new(get_user_storage)
.with_host_key("/etc/ssh/host_key")
.bind("0.0.0.0:22")
.run()
.await?;
}
User experience (standard SFTP client):
$ sftp alice@yourserver.com
Connected to yourserver.com.
sftp> pwd
/
sftp> ls
documents/ photos/ backup/
sftp> cd documents
sftp> ls
report.pdf notes.txt
sftp> put local_file.txt
Uploading local_file.txt to /documents/local_file.txt
sftp> get notes.txt
Downloading /documents/notes.txt
sftp> mkdir projects
sftp> rm old_file.txt
All operations happen on the user’s isolated SQLite database on your server.
anyfs-ssh-shell - Full Shell Access with Sandboxed Home
Give users a real SSH shell where their home directory is backed by FsFuse.
Server implementation:
#![allow(unused)]
fn main() {
use anyfs::{QuotaLayer, MountHandle};
use anyfs_sqlite::SqliteBackend; // Ecosystem crate
use anyfs_ssh_shell::SshShellServer;
// On user login, mount their isolated storage as $HOME
fn on_user_login(username: &str) -> Result<(), Error> {
let db_path = format!("/data/users/{}.db", username);
let backend = SqliteBackend::open(&db_path)?
.layer(QuotaLayer::builder()
.max_total_size(10 * 1024 * 1024 * 1024)
.build());
let mount_point = format!("/home/{}", username);
MountHandle::mount(backend, &mount_point)?;
Ok(())
}
SshShellServer::new()
.on_login(on_user_login)
.bind("0.0.0.0:22")
.run()
.await?;
}
User experience (full shell):
$ ssh alice@yourserver.com
Welcome to YourServer!
alice@server:~$ pwd
/home/alice
alice@server:~$ ls -la
total 3
drwxr-xr-x 4 alice alice 4096 Dec 25 10:00 .
drwxr-xr-x 2 alice alice 4096 Dec 25 10:00 documents
drwxr-xr-x 2 alice alice 4096 Dec 25 10:00 photos
alice@server:~$ cat documents/notes.txt
Hello world!
alice@server:~$ echo "new content" > documents/new_file.txt
alice@server:~$ du -sh .
150M .
# Everything they do is actually stored in /data/users/alice.db on the server!
# They can use vim, gcc, python - all working on their isolated FsFuse backend
Isolated Shell Hosting Use Cases
| Use Case | Backend Stack | What Users Get |
|---|---|---|
| Shared hosting | Quota<SqliteBackend> | Shell + isolated home in SQLite |
| Dev containers | Overlay<BaseImage, MemoryBackend> | Shared base + ephemeral scratch |
| Coding education | Quota<MemoryBackend> | Temporary sandboxed environment |
| CI/CD runners | Tracing<MemoryBackend> | Audited ephemeral workspace |
| Secure file drop | PathFilter<SqliteBackend> | Write-only inbox directory |
Access Pattern Summary
| Access Method | Crate | Client Requirement | Best For |
|---|---|---|---|
| S3 API | anyfs-s3-server | AWS SDK (any language) | Object storage, web apps |
| SFTP | anyfs-sftp-server | Any SFTP client | Shell-like file access |
| SSH Shell | anyfs-ssh-shell + anyfs (fuse feature) | SSH client | Full shell with sandboxed home |
| gRPC | anyfs-grpc | Generated client | High-performance apps |
| REST | Custom adapter | HTTP client | Simple integrations |
| FUSE mount | anyfs (fuse feature) + anyfs-remote | FUSE installed | Transparent local access |
| WebDAV | anyfs-webdav | WebDAV client/OS | File manager access |
| NFS | anyfs-nfs | NFS client | Unix network shares |
Lessons Learned (Reference)
This plan incorporates lessons from issues in similar projects:
| Source | Issue | Lesson Applied |
|---|---|---|
| vfs #72 | RwLock panic | Thread safety tests |
| vfs #47 | create_dir_all race | Concurrent stress tests |
| vfs #8, #23 | Panics instead of errors | No-panic policy |
| vfs #24, #42 | Path inconsistencies | Path edge case tests |
| vfs #33 | Hard to match errors | Ergonomic FsError design |
| vfs #68 | WASM panics | WASM compatibility tests |
| vfs #66 | 'static confusion | Minimal trait bounds |
| agentfs #130 | Slow file deletion | Performance documentation |
| agentfs #129 | Signal handling | Proper Drop implementations |
See Lessons from Similar Projects for full analysis.
Backend Implementer’s Guide
This guide walks you through implementing a custom AnyFS backend.
Overview
AnyFS uses layered traits - you implement only what you need:
FsPosix (full POSIX)
│
FsFuse (FUSE-mountable)
│
FsFull (std::fs features)
│
Fs (basic - 90% of use cases)
│
FsRead + FsWrite + FsDir (core)
Key properties:
- Backends accept `&Path` for all path parameters
- Backends receive already-resolved paths - `FileStorage` handles path resolution via a pluggable `PathResolver` (see ADR-033); the default is `IterativeResolver` for symlink-aware resolution
- Backends handle storage only - just store/retrieve bytes at the given paths
- Policy (limits, feature gates) is handled by middleware, not backends
- Implement only the traits your backend supports
- Backends must be thread-safe - all trait methods take `&self`, so backends must use interior mutability (e.g. `RwLock`, `Mutex`) for synchronization
Dependency
Depend only on anyfs-backend:
[dependencies]
anyfs-backend = "0.1"
Choosing Which Traits to Implement
| Your Backend Supports | Implement |
|---|---|
| Basic file operations | Fs (= FsRead + FsWrite + FsDir) |
| Links, permissions, sync | Add FsLink, FsPermissions, FsSync, FsStats |
| Hardlinks, FUSE mounting | Add FsInode → becomes FsFuse |
| Full POSIX (handles, locks, xattr) | Add FsHandles, FsLock, FsXattr → becomes FsPosix |
Minimal Backend: Just Fs
#![allow(unused)]
fn main() {
use anyfs_backend::{FsRead, FsWrite, FsDir, FsError, Metadata, DirEntry, ReadDirIter};
use std::io::{Read, Write};
use std::path::{Path, PathBuf};
pub struct MyBackend {
// Your storage fields
}
// Implement FsRead
impl FsRead for MyBackend {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
todo!()
}
fn read_to_string(&self, path: &Path) -> Result<String, FsError> {
let data = self.read(path)?;
String::from_utf8(data).map_err(|e| FsError::Backend(e.to_string()))
}
fn read_range(&self, path: &Path, offset: u64, len: usize) -> Result<Vec<u8>, FsError> {
todo!()
}
fn exists(&self, path: &Path) -> Result<bool, FsError> {
todo!()
}
fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
todo!()
}
fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError> {
let data = self.read(path)?;
Ok(Box::new(std::io::Cursor::new(data)))
}
}
// Implement FsWrite
impl FsWrite for MyBackend {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
todo!()
}
fn append(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
todo!()
}
fn remove_file(&self, path: &Path) -> Result<(), FsError> {
todo!()
}
fn rename(&self, from: &Path, to: &Path) -> Result<(), FsError> {
todo!()
}
fn copy(&self, from: &Path, to: &Path) -> Result<(), FsError> {
todo!()
}
fn truncate(&self, path: &Path, size: u64) -> Result<(), FsError> {
todo!()
}
fn open_write(&self, path: &Path) -> Result<Box<dyn Write + Send>, FsError> {
todo!()
}
}
// Implement FsDir
impl FsDir for MyBackend {
fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
todo!()
}
fn create_dir(&self, path: &Path) -> Result<(), FsError> {
todo!()
}
fn create_dir_all(&self, path: &Path) -> Result<(), FsError> {
todo!()
}
fn remove_dir(&self, path: &Path) -> Result<(), FsError> {
todo!()
}
fn remove_dir_all(&self, path: &Path) -> Result<(), FsError> {
todo!()
}
}
// MyBackend now implements Fs automatically (blanket impl)!
}
Implementation Steps
Step 1: Pick a Data Model
Your backend needs internal storage. Options:
- HashMap-based: `HashMap<PathBuf, Entry>` for simple cases
- Tree-based: explicit directory tree structure
- Database-backed: SQLite, key-value store, etc.
Minimum metadata per entry:
- File type (file/directory/symlink)
- Size (for files)
- Content (for files)
- Timestamps (optional)
- Permissions (optional)
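As a concrete starting point, the HashMap-based model can be sketched like this. The `Entry` and `Store` names are illustrative, not part of the `anyfs-backend` contract:

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Illustrative entry type - one variant per file type.
#[derive(Debug, Clone)]
enum Entry {
    File { content: Vec<u8> },
    Dir,
    Symlink { target: PathBuf },
}

impl Entry {
    /// Size in bytes (files only; directories and symlinks report 0 here).
    fn size(&self) -> u64 {
        match self {
            Entry::File { content } => content.len() as u64,
            _ => 0,
        }
    }
}

/// Minimal flat storage: every path maps to one entry.
struct Store {
    entries: HashMap<PathBuf, Entry>,
}

impl Store {
    fn new() -> Self {
        let mut entries = HashMap::new();
        // The root directory always exists.
        entries.insert(PathBuf::from("/"), Entry::Dir);
        Self { entries }
    }
}
```

A flat map like this is the simplest model that still supports all of `Fs`; directory listing becomes a prefix scan over the keys.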
Step 2: Implement FsRead (Layer 1)
Start with read operations (easiest):
#![allow(unused)]
fn main() {
impl FsRead for MyBackend {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
fn read_to_string(&self, path: &Path) -> Result<String, FsError>;
fn read_range(&self, path: &Path, offset: u64, len: usize) -> Result<Vec<u8>, FsError>;
fn exists(&self, path: &Path) -> Result<bool, FsError>;
fn metadata(&self, path: &Path) -> Result<Metadata, FsError>;
fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError>;
}
}
Streaming implementation options:
For MemoryBackend or similar, you can use std::io::Cursor:
#![allow(unused)]
fn main() {
fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError> {
let data = self.read(path)?;
Ok(Box::new(std::io::Cursor::new(data)))
}
}
For VRootFsBackend, return the actual file handle:
#![allow(unused)]
fn main() {
fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError> {
let file = std::fs::File::open(self.resolve(path)?)?;
Ok(Box::new(file))
}
}
Step 3: Implement FsWrite (Layer 1)
#![allow(unused)]
fn main() {
impl FsWrite for MyBackend {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
fn append(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
fn remove_file(&self, path: &Path) -> Result<(), FsError>;
fn rename(&self, from: &Path, to: &Path) -> Result<(), FsError>;
fn copy(&self, from: &Path, to: &Path) -> Result<(), FsError>;
fn truncate(&self, path: &Path, size: u64) -> Result<(), FsError>;
fn open_write(&self, path: &Path) -> Result<Box<dyn Write + Send>, FsError>;
}
}
Note on truncate:
- If `size < current`: discard trailing bytes
- If `size > current`: extend with zero bytes
- Required for FUSE support and editor save operations
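For an in-memory backend, both cases of `truncate` reduce to a single `Vec::resize` call on the file's content (the free-function shape here is just for illustration):

```rust
/// Truncate or zero-extend a file's in-memory content to `size` bytes.
fn truncate_content(content: &mut Vec<u8>, size: u64) {
    // resize() handles both directions:
    // - shrinking discards trailing bytes
    // - growing appends zero bytes
    content.resize(size as usize, 0);
}
```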
Step 4: Implement FsDir (Layer 1)
#![allow(unused)]
fn main() {
impl FsDir for MyBackend {
fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError>;
fn create_dir(&self, path: &Path) -> Result<(), FsError>;
fn create_dir_all(&self, path: &Path) -> Result<(), FsError>;
fn remove_dir(&self, path: &Path) -> Result<(), FsError>;
fn remove_dir_all(&self, path: &Path) -> Result<(), FsError>;
}
}
Congratulations! After implementing FsRead, FsWrite, and FsDir, your backend implements Fs automatically (blanket impl). This covers 90% of use cases.
Optional: Layer 2 Traits
Add these if your backend supports the features:
FsLink - Symlinks and Hardlinks
#![allow(unused)]
fn main() {
impl FsLink for MyBackend {
fn symlink(&self, original: &Path, link: &Path) -> Result<(), FsError>;
fn hard_link(&self, original: &Path, link: &Path) -> Result<(), FsError>;
fn read_link(&self, path: &Path) -> Result<PathBuf, FsError>;
fn symlink_metadata(&self, path: &Path) -> Result<Metadata, FsError>;
}
}
- Symlinks store a target path as a string
- Hard links share content with the original (update link count)
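In a HashMap model, a symlink is just an entry that records its target path as data; `read_link` returns that target without following it. A sketch with illustrative names (a real backend would return `FsError` instead of `Option`):

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

/// Illustrative entry type - only the symlink variant shown.
enum Entry {
    Symlink { target: PathBuf },
}

struct Store {
    entries: HashMap<PathBuf, Entry>,
}

impl Store {
    /// Create a symlink: store the target path verbatim, without resolving it.
    fn symlink(&mut self, original: &Path, link: &Path) {
        self.entries.insert(
            link.to_path_buf(),
            Entry::Symlink { target: original.to_path_buf() },
        );
    }

    /// Return the stored target without following it (like read_link).
    fn read_link(&self, path: &Path) -> Option<PathBuf> {
        match self.entries.get(path) {
            Some(Entry::Symlink { target }) => Some(target.clone()),
            None => None,
        }
    }
}
```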
FsPermissions
#![allow(unused)]
fn main() {
impl FsPermissions for MyBackend {
fn set_permissions(&self, path: &Path, perm: Permissions) -> Result<(), FsError>;
}
}
FsSync - Durability
#![allow(unused)]
fn main() {
impl FsSync for MyBackend {
fn sync(&self) -> Result<(), FsError>;
fn fsync(&self, path: &Path) -> Result<(), FsError>;
}
}
- `sync()`: flush all pending writes to durable storage
- `fsync(path)`: flush pending writes for a specific file
- `MemoryBackend` can no-op these (volatile by design)
- `SqliteBackend`: `PRAGMA wal_checkpoint` or connection flush
- `VRootFsBackend`: `std::fs::File::sync_all()`
FsStats - Filesystem Stats
#![allow(unused)]
fn main() {
impl FsStats for MyBackend {
fn statfs(&self) -> Result<StatFs, FsError>;
}
}
Return filesystem capacity information:
#![allow(unused)]
fn main() {
StatFs {
total_bytes: 0, // 0 = unlimited
used_bytes: ...,
available_bytes: ...,
total_inodes: 0,
used_inodes: ...,
available_inodes: ...,
block_size: 4096,
max_name_len: 255,
}
}
Optional: Layer 3 - FsInode (For FUSE)
Implement FsInode if you need FUSE mounting or inode-based hardlink tracking:
#![allow(unused)]
fn main() {
impl FsInode for MyBackend {
fn path_to_inode(&self, path: &Path) -> Result<u64, FsError>;
fn inode_to_path(&self, inode: u64) -> Result<PathBuf, FsError>;
fn lookup(&self, parent_inode: u64, name: &OsStr) -> Result<u64, FsError>;
fn metadata_by_inode(&self, inode: u64) -> Result<Metadata, FsError>;
}
}
No blanket/default implementation - you must implement this trait explicitly if you need:
- FUSE mounting: FUSE operates on inodes, not paths
- Inode tracking for hardlinks: two paths share the same inode (note: `hard_link()` creation is in `FsLink`)
Level 1: Simple backend (no FsInode)
Don’t implement FsInode. The backend won’t support FUSE mounting. Hardlink creation via FsLink::hard_link() still works, but inode sharing won’t be tracked.
Level 2: Hardlink support
Override path_to_inode so hardlinked paths return the same inode:
#![allow(unused)]
fn main() {
struct Node {
id: u64, // Unique node ID (the inode)
nlink: u64, // Hard link count
content: Vec<u8>,
}
struct MemoryBackend {
// Locks omitted for brevity - a real backend wraps this state in RwLock (see Thread Safety)
next_id: u64,
nodes: HashMap<u64, Node>, // inode -> Node
paths: HashMap<PathBuf, u64>, // path -> inode
}
impl FsInode for MemoryBackend {
fn path_to_inode(&self, path: &Path) -> Result<u64, FsError> {
self.paths.get(path)
.copied()
.ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })
}
// ... implement others
}
impl FsLink for MemoryBackend {
fn hard_link(&self, original: &Path, link: &Path) -> Result<(), FsError> {
let inode = self.path_to_inode(original)?;
self.paths.insert(link.to_path_buf(), inode);
self.nodes.get_mut(&inode).unwrap().nlink += 1;
Ok(())
}
}
}
Level 3: Full FUSE efficiency
Override all 4 methods for O(1) inode operations:
#![allow(unused)]
fn main() {
impl FsInode for SqliteBackend {
fn path_to_inode(&self, path: &Path) -> Result<u64, FsError> {
self.conn.query_row(
"SELECT id FROM nodes WHERE path = ?",
[path.to_string_lossy()],
|row| Ok(row.get::<_, i64>(0)? as u64),
).map_err(|_| FsError::NotFound { path: path.to_path_buf() })
}
fn inode_to_path(&self, inode: u64) -> Result<PathBuf, FsError> {
self.conn.query_row(
"SELECT path FROM nodes WHERE id = ?",
[inode as i64],
|row| Ok(PathBuf::from(row.get::<_, String>(0)?)),
).map_err(|_| FsError::NotFound { path: format!("inode:{}", inode).into() })
}
fn lookup(&self, parent_inode: u64, name: &OsStr) -> Result<u64, FsError> {
self.conn.query_row(
"SELECT id FROM nodes WHERE parent_id = ? AND name = ?",
params![parent_inode as i64, name.to_string_lossy()],
|row| Ok(row.get::<_, i64>(0)? as u64),
).map_err(|_| FsError::NotFound { path: name.into() })
}
fn metadata_by_inode(&self, inode: u64) -> Result<Metadata, FsError> {
self.conn.query_row(
"SELECT type, size, nlink, created, modified FROM nodes WHERE id = ?",
[inode as i64],
|row| Ok(Metadata {
inode,
nlink: row.get(2)?,
// ...
}),
).map_err(|_| FsError::NotFound { path: format!("inode:{}", inode).into() })
}
}
}
Summary:
| Your Backend | Implement | Result |
|---|---|---|
| Simple (no hardlinks) | Nothing | Works with defaults |
| With hardlinks | FsInode::path_to_inode | Hardlinks work correctly |
| FUSE-optimized | Full FsInode | Maximum performance |
Optional: Layer 4 - POSIX Traits
For full POSIX semantics (file handles, locking, extended attributes):
FsHandles - File Handle Operations
#![allow(unused)]
fn main() {
impl FsHandles for MyBackend {
fn open(&self, path: &Path, flags: OpenFlags) -> Result<Handle, FsError>;
fn read_at(&self, handle: Handle, buf: &mut [u8], offset: u64) -> Result<usize, FsError>;
fn write_at(&self, handle: Handle, data: &[u8], offset: u64) -> Result<usize, FsError>;
fn close(&self, handle: Handle) -> Result<(), FsError>;
}
}
FsLock - File Locking
#![allow(unused)]
fn main() {
impl FsLock for MyBackend {
fn lock(&self, handle: Handle, lock: LockType) -> Result<(), FsError>;
fn try_lock(&self, handle: Handle, lock: LockType) -> Result<bool, FsError>;
fn unlock(&self, handle: Handle) -> Result<(), FsError>;
}
}
FsXattr - Extended Attributes
#![allow(unused)]
fn main() {
impl FsXattr for MyBackend {
fn get_xattr(&self, path: &Path, name: &str) -> Result<Vec<u8>, FsError>;
fn set_xattr(&self, path: &Path, name: &str, value: &[u8]) -> Result<(), FsError>;
fn remove_xattr(&self, path: &Path, name: &str) -> Result<(), FsError>;
fn list_xattr(&self, path: &Path) -> Result<Vec<String>, FsError>;
}
}
Note: Most backends don’t need Layer 4. Only implement if you’re wrapping a real filesystem (VRootFsBackend) or building a database that needs full POSIX semantics.
Error Handling
Return appropriate FsError variants:
| Situation | Error |
|---|---|
| Path doesn’t exist | FsError::NotFound { path, operation } |
| Path already exists | FsError::AlreadyExists { path, operation } |
| Expected file, got dir | FsError::NotAFile { path } |
| Expected dir, got file | FsError::NotADirectory { path } |
| Remove non-empty dir | FsError::DirectoryNotEmpty { path } |
| Internal error | FsError::Backend { message } |
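When a backend wraps real I/O, a small helper that maps `std::io::ErrorKind` onto these variants keeps error handling in one place. The `FsError` below is a simplified stand-in for illustration only; the real type has more variants and fields:

```rust
use std::io;
use std::path::{Path, PathBuf};

// Simplified stand-in for anyfs_backend::FsError, for illustration only.
#[derive(Debug, PartialEq)]
enum FsError {
    NotFound { path: PathBuf },
    AlreadyExists { path: PathBuf },
    Backend(String),
}

/// Map an io::Error to the matching FsError variant, keeping the path for context.
fn map_io_error(err: io::Error, path: &Path) -> FsError {
    match err.kind() {
        io::ErrorKind::NotFound => FsError::NotFound { path: path.to_path_buf() },
        io::ErrorKind::AlreadyExists => FsError::AlreadyExists { path: path.to_path_buf() },
        // Everything else becomes an opaque backend error with context.
        _ => FsError::Backend(format!("{}: {}", path.display(), err)),
    }
}
```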
What Backends Do NOT Do
| Concern | Where It Lives |
|---|---|
| Quota enforcement | Quota<B> middleware |
| Feature gating | Restrictions<B> middleware |
| Logging | Tracing<B> middleware |
| Ergonomic API | FileStorage<B> wrapper |
Backends focus on storage. Keep them simple.
Optional Optimizations
Some trait methods have default implementations that work universally but may be suboptimal for specific backends. You can override these for better performance.
Path Canonicalization (FsPath Trait)
The FsPath trait provides canonicalize() and soft_canonicalize() with default implementations that call read_link() and symlink_metadata() per path component.
Default behavior: O(n) calls for a path with n components
When to override:
- Your backend can resolve paths more efficiently (e.g., SQL query)
- Your backend delegates to OS (which has optimized syscalls)
SQLite Example - Single Query Resolution:
#![allow(unused)]
fn main() {
impl FsPath for SqliteBackend {
fn canonicalize(&self, path: &Path) -> Result<PathBuf, FsError> {
// Resolve entire path in one recursive CTE query
self.conn.query_row(
r#"
WITH RECURSIVE resolve(current, depth) AS (
SELECT :path, 0
UNION ALL
SELECT
CASE WHEN n.type = 'symlink'
THEN n.target
ELSE resolve.current
END,
depth + 1
FROM resolve
LEFT JOIN nodes n ON n.path = resolve.current
WHERE n.type = 'symlink' AND depth < 40
)
SELECT current FROM resolve ORDER BY depth DESC LIMIT 1
"#,
params![path.to_string_lossy()],
|row| Ok(PathBuf::from(row.get::<_, String>(0)?))
).map_err(|_| FsError::NotFound {
path: path.into(),
operation: "canonicalize"
})
}
}
}
VRootFsBackend Example - OS Delegation:
#![allow(unused)]
fn main() {
impl FsPath for VRootFsBackend {
fn canonicalize(&self, path: &Path) -> Result<PathBuf, FsError> {
// Delegate to OS, which uses optimized syscalls
let host_path = self.root.join(path.strip_prefix("/").unwrap_or(path));
let resolved = std::fs::canonicalize(&host_path)
.map_err(|_| FsError::NotFound {
path: path.into(),
operation: "canonicalize"
})?;
// Verify containment (security check)
if !resolved.starts_with(&self.root) {
return Err(FsError::AccessDenied {
path: path.into(),
reason: "path escapes root".into(),
});
}
// Convert back to virtual path
Ok(PathBuf::from("/").join(resolved.strip_prefix(&self.root).unwrap()))
}
}
}
Other Optimization Opportunities
| Method | Default | Optimization Opportunity |
|---|---|---|
canonicalize() | O(n) per component | SQL CTE, OS delegation |
create_dir_all() | Recursive create_dir() | Single SQL INSERT with path hierarchy |
remove_dir_all() | Recursive traversal | SQL DELETE with LIKE pattern |
copy() | read + write | Database-level copy, reflink |
General Pattern:
#![allow(unused)]
fn main() {
// Override any trait method with optimized implementation
impl FsDir for SqliteBackend {
fn create_dir_all(&self, path: &Path) -> Result<(), FsError> {
// Instead of calling create_dir() for each level,
// insert all parent paths in a single transaction
self.conn.execute_batch(&format!(
"INSERT OR IGNORE INTO nodes (path, type) VALUES {}",
generate_ancestor_values(path)
))?;
Ok(())
}
}
}
When NOT to optimize:
- `MemoryBackend`: in-memory operations are already fast; keep it simple
- Low-volume operations: optimize where it matters (hot paths)
- Prototype phase: Get correctness first, optimize later
See ADR-032 for the full design rationale.
Testing Your Backend
Use the conformance test suite:
#![allow(unused)]
fn main() {
#[cfg(test)]
mod tests {
use super::MyBackend;
use anyfs_backend::Fs;
fn create_backend() -> MyBackend {
MyBackend::new()
}
#[test]
fn test_write_read() {
let backend = create_backend();
backend.write(std::path::Path::new("/test.txt"), b"hello").unwrap();
let content = backend.read(std::path::Path::new("/test.txt")).unwrap();
assert_eq!(content, b"hello");
}
#[test]
fn test_create_dir() {
let backend = create_backend();
backend.create_dir(std::path::Path::new("/foo")).unwrap();
assert!(backend.exists(std::path::Path::new("/foo")).unwrap());
}
// ... more tests
}
}
Note on VRootFsBackend
If you are implementing a backend that wraps a real host filesystem directory, consider using `strict_path::VirtualPath` and `strict_path::VirtualRoot` (from the `strict-path` crate) internally for path containment. This ensures paths cannot escape the designated root directory.
This is an implementation choice for filesystem-based backends, not a requirement of the Fs trait.
For Middleware Authors: Wrapping Streams
Middleware that needs to intercept streaming I/O must wrap the returned Box<dyn Read/Write>.
CountingWriter Example
#![allow(unused)]
fn main() {
use std::io::{self, Write};
use std::sync::{Arc, atomic::{AtomicU64, Ordering}};
pub struct CountingWriter<W: Write> {
inner: W,
bytes_written: Arc<AtomicU64>,
}
impl<W: Write> CountingWriter<W> {
pub fn new(inner: W, counter: Arc<AtomicU64>) -> Self {
Self { inner, bytes_written: counter }
}
}
impl<W: Write + Send> Write for CountingWriter<W> {
fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
let n = self.inner.write(buf)?;
self.bytes_written.fetch_add(n as u64, Ordering::Relaxed);
Ok(n)
}
fn flush(&mut self) -> io::Result<()> {
self.inner.flush()
}
}
}
Using in Quota Middleware
#![allow(unused)]
fn main() {
impl<B: Fs> Fs for Quota<B> {
fn open_write(&self, path: &Path) -> Result<Box<dyn Write + Send>, FsError> {
// Check if we're at quota before opening
if self.usage.total_bytes >= self.limits.max_total_size {
return Err(FsError::QuotaExceeded { ... });
}
let inner = self.inner.open_write(path)?;
Ok(Box::new(CountingWriter::new(inner, self.usage.bytes_counter.clone())))
}
}
}
Alternatives to Wrapping
| Middleware | Alternative to wrapping |
|---|---|
| PathFilter | Check path at open time, pass stream through |
| ReadOnly | Block open_write entirely |
| RateLimit | Count the open call, not stream bytes |
| Tracing | Log the open call, pass stream through |
| DryRun | Return std::io::sink() instead of real writer |
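The DryRun row above is the whole trick: hand the caller a writer that accepts everything and stores nothing. A minimal sketch (the surrounding middleware type is assumed):

```rust
use std::io::{self, Write};

/// A dry-run open_write: callers can write as usual, but bytes are discarded.
fn open_write_dry_run() -> Box<dyn Write + Send> {
    // io::sink() implements Write + Send and swallows all input.
    Box::new(io::sink())
}
```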
Creating Custom Middleware
Custom middleware only requires anyfs-backend as a dependency - same as backends.
Dependency
[dependencies]
anyfs-backend = "0.1"
The Pattern (5 Minutes to Understand)
Middleware is just a struct that:
- Wraps another `Fs`
- Implements `Fs` itself
- Intercepts some methods, delegates others
#![allow(unused)]
fn main() {
// ┌─────────────────────────────────────┐
// │ Your Middleware │
// │ ┌─────────────────────────────────┐│
// │ │ Inner Backend (any Fs) ││
// │ └─────────────────────────────────┘│
// └─────────────────────────────────────┘
//
// Request → Middleware (intercept/modify) → Inner Backend
// Response ← Middleware (intercept/modify) ← Inner Backend
}
Simplest Possible Middleware: Operation Counter
This middleware counts how many operations are performed:
#![allow(unused)]
fn main() {
use anyfs_backend::{FsRead, FsWrite, FsDir, FsError, Metadata, DirEntry};
use std::sync::atomic::{AtomicU64, Ordering};
use std::path::{Path, PathBuf};
/// Counts all operations performed on the backend.
pub struct Counter<B> {
inner: B,
pub count: AtomicU64,
}
impl<B> Counter<B> {
pub fn new(inner: B) -> Self {
Self { inner, count: AtomicU64::new(0) }
}
pub fn operations(&self) -> u64 {
self.count.load(Ordering::Relaxed)
}
}
// Implement each trait the inner backend supports
impl<B: FsRead> FsRead for Counter<B> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
self.count.fetch_add(1, Ordering::Relaxed); // Count it
self.inner.read(path) // Delegate
}
fn exists(&self, path: &Path) -> Result<bool, FsError> {
self.count.fetch_add(1, Ordering::Relaxed);
self.inner.exists(path)
}
// ... repeat for all FsRead methods
}
impl<B: FsWrite> FsWrite for Counter<B> {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
self.count.fetch_add(1, Ordering::Relaxed); // Count it
self.inner.write(path, data) // Delegate
}
// ... repeat for all FsWrite methods
}
impl<B: FsDir> FsDir for Counter<B> {
// ... implement FsDir methods
}
// Counter<B> now implements Fs when B: Fs (blanket impl)
}
Usage:
#![allow(unused)]
fn main() {
let backend = Counter::new(MemoryBackend::new());
backend.write(std::path::Path::new("/file.txt"), b"hello")?;
backend.read(std::path::Path::new("/file.txt"))?;
backend.read(std::path::Path::new("/file.txt"))?;
println!("Operations: {}", backend.operations()); // 3
}
That’s it. That’s the entire pattern.
Adding a Layer (for .layer() syntax)
To enable the fluent .layer() syntax, add a Layer struct. The .layer() method
comes from the LayerExt trait which has a blanket impl for all Fs types:
#![allow(unused)]
fn main() {
use anyfs_backend::{Layer, LayerExt}; // LayerExt provides .layer() method
pub struct CounterLayer;
impl<B: Fs> Layer<B> for CounterLayer {
type Backend = Counter<B>;
fn layer(self, backend: B) -> Counter<B> {
Counter::new(backend)
}
}
}
Usage with .layer():
#![allow(unused)]
fn main() {
// LayerExt is re-exported from anyfs crate
use anyfs::LayerExt;
let backend = MemoryBackend::new()
.layer(CounterLayer);
}
Real Example: ReadOnly Middleware
A practical middleware that blocks all write operations:
#![allow(unused)]
fn main() {
pub struct ReadOnly<B> {
inner: B,
}
impl<B> ReadOnly<B> {
pub fn new(inner: B) -> Self {
Self { inner }
}
}
// FsRead: just delegate
impl<B: FsRead> FsRead for ReadOnly<B> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
self.inner.read(path)
}
fn exists(&self, path: &Path) -> Result<bool, FsError> {
self.inner.exists(path)
}
fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
self.inner.metadata(path)
}
// ... delegate all FsRead methods
}
// FsDir: delegate reads, block writes
impl<B: FsDir> FsDir for ReadOnly<B> {
fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
self.inner.read_dir(path)
}
fn create_dir(&self, _path: &Path) -> Result<(), FsError> {
Err(FsError::ReadOnly { operation: "create_dir" })
}
fn create_dir_all(&self, _path: &Path) -> Result<(), FsError> {
Err(FsError::ReadOnly { operation: "create_dir_all" })
}
fn remove_dir(&self, _path: &Path) -> Result<(), FsError> {
Err(FsError::ReadOnly { operation: "remove_dir" })
}
fn remove_dir_all(&self, _path: &Path) -> Result<(), FsError> {
Err(FsError::ReadOnly { operation: "remove_dir_all" })
}
}
// FsWrite: block all operations
impl<B: FsWrite> FsWrite for ReadOnly<B> {
fn write(&self, _path: &Path, _data: &[u8]) -> Result<(), FsError> {
Err(FsError::ReadOnly { operation: "write" })
}
fn remove_file(&self, _path: &Path) -> Result<(), FsError> {
Err(FsError::ReadOnly { operation: "remove_file" })
}
// ... block all FsWrite methods
}
}
Usage:
#![allow(unused)]
fn main() {
let backend = ReadOnly::new(MemoryBackend::new());
backend.read(std::path::Path::new("/file.txt")); // OK (if file exists)
backend.write(std::path::Path::new("/file.txt"), b""); // Error: ReadOnly
}
Middleware Decision Table
| What You Want | Intercept | Delegate | Example |
|---|---|---|---|
| Count operations | All methods (before) | All methods | Counter |
| Block writes | Write methods | Read methods | ReadOnly |
| Transform data | read/write | Everything else | Encryption |
| Check permissions | All methods (before) | All methods | PathFilter |
| Log operations | All methods (before) | All methods | Tracing |
| Enforce limits | Write methods (check size) | Read methods | Quota |
Macro for Boilerplate (Optional)
If you don’t want to manually delegate all 29 methods, you can use a macro:
#![allow(unused)]
fn main() {
macro_rules! delegate {
($self:ident, $method:ident, $($arg:ident),*) => {
$self.inner.$method($($arg),*)
};
}
impl<B: Fs> Fs for MyMiddleware<B> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
// Your logic here
delegate!(self, read, path)
}
fn exists(&self, path: &Path) -> Result<bool, FsError> {
delegate!(self, exists, path)
}
// ... etc
}
}
Or provide a delegate_all! macro in anyfs-backend that generates all the passthrough implementations.
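One possible shape for such a macro, shown on a toy trait so the sketch stays self-contained - the `Fs` method list is longer, but the mechanics are identical (all names here are illustrative):

```rust
// A toy trait standing in for one of the Fs traits.
trait Greeter {
    fn hello(&self, name: &str) -> String;
    fn bye(&self, name: &str) -> String;
}

struct Inner;
impl Greeter for Inner {
    fn hello(&self, name: &str) -> String { format!("hello {name}") }
    fn bye(&self, name: &str) -> String { format!("bye {name}") }
}

/// Middleware-style wrapper that should delegate everything.
struct Wrapper { inner: Inner }

// Generates one passthrough method per listed name.
macro_rules! delegate_methods {
    ($($method:ident),* $(,)?) => {
        $(
            fn $method(&self, name: &str) -> String {
                self.inner.$method(name)
            }
        )*
    };
}

impl Greeter for Wrapper {
    delegate_methods!(hello, bye);
}
```

A `delegate_all!`-style macro for the real traits would follow the same pattern, with one arm per method signature.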
Complete Example: Encryption Middleware
#![allow(unused)]
fn main() {
use anyfs_backend::{Fs, FsRead, FsWrite, FsDir, Layer, FsError, Metadata, DirEntry, ReadDirIter};
use std::io::{Read, Write};
use std::path::{Path, PathBuf};
/// Middleware that encrypts/decrypts file contents transparently.
pub struct Encrypted<B> {
inner: B,
key: [u8; 32],
}
impl<B> Encrypted<B> {
pub fn new(inner: B, key: [u8; 32]) -> Self {
Self { inner, key }
}
fn encrypt(&self, data: &[u8]) -> Vec<u8> {
// Your encryption logic here
data.iter().map(|b| b ^ self.key[0]).collect()
}
fn decrypt(&self, data: &[u8]) -> Vec<u8> {
// Your decryption logic here (symmetric for XOR)
self.encrypt(data)
}
}
// FsRead: decrypt on read
impl<B: FsRead> FsRead for Encrypted<B> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
let encrypted = self.inner.read(path)?;
Ok(self.decrypt(&encrypted))
}
fn exists(&self, path: &Path) -> Result<bool, FsError> {
self.inner.exists(path)
}
fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
self.inner.metadata(path)
}
// ... delegate other FsRead methods
}
// FsWrite: encrypt on write
impl<B: FsWrite> FsWrite for Encrypted<B> {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
let encrypted = self.encrypt(data);
self.inner.write(path, &encrypted)
}
// ... delegate/encrypt other FsWrite methods
}
// FsDir: just delegate (directories don't need encryption)
impl<B: FsDir> FsDir for Encrypted<B> {
fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
self.inner.read_dir(path)
}
fn create_dir(&self, path: &Path) -> Result<(), FsError> {
self.inner.create_dir(path)
}
// ... delegate other FsDir methods
}
// Encrypted<B> now implements Fs when B: Fs (blanket impl)
/// Layer for creating Encrypted middleware.
pub struct EncryptedLayer {
key: [u8; 32],
}
impl EncryptedLayer {
pub fn new(key: [u8; 32]) -> Self {
Self { key }
}
}
impl<B: Fs> Layer<B> for EncryptedLayer {
type Backend = Encrypted<B>;
fn layer(self, backend: B) -> Self::Backend {
Encrypted::new(backend, self.key)
}
}
}
Usage
#![allow(unused)]
fn main() {
use anyfs::MemoryBackend;
use my_middleware::{EncryptedLayer, Encrypted};
// Direct construction
let fs = Encrypted::new(MemoryBackend::new(), key);
// Or via Layer trait
let fs = MemoryBackend::new()
.layer(EncryptedLayer::new(key));
}
Middleware Checklist
- Depends only on `anyfs-backend`
- Implements the same traits as the inner backend (`FsRead`, `FsWrite`, `FsDir`, etc.)
- Implements `Layer<B>` for `MyMiddlewareLayer`
- Delegates unmodified operations to the inner backend
- Handles streaming I/O appropriately (wrap, pass through, or block)
- Documents which operations are intercepted vs delegated
Backend Checklist
- Depends only on `anyfs-backend`
- Implements core traits: `FsRead`, `FsWrite`, `FsDir` (= `Fs`)
- Optional: implements `FsLink`, `FsPermissions`, `FsSync`, `FsStats` (= `FsFull`)
- Optional: implements `FsInode` for FUSE support (= `FsFuse`)
- Optional: implements `FsHandles`, `FsLock`, `FsXattr` for POSIX (= `FsPosix`)
- Accepts `&Path` for all paths
- Returns correct `FsError` variants
- Passes conformance tests for implemented traits
- No panics (see below)
- Thread-safe (see below)
- Documents performance characteristics
Critical Implementation Guidelines
These guidelines are derived from issues found in similar projects (vfs, agentfs). All implementations MUST follow these.
1. No Panic Policy
NEVER use .unwrap() or .expect() in library code.
#![allow(unused)]
fn main() {
// BAD - will panic on missing file
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
let entry = self.entries.get(path).unwrap(); // PANIC!
Ok(entry.content.clone())
}
// GOOD - returns error
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
let entry = self.entries.get(path)
.ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })?;
Ok(entry.content.clone())
}
}
Edge cases that must NOT panic:
- File doesn’t exist
- Directory doesn’t exist
- Path is empty string
- Path is invalid UTF-8 (if using OsStr)
- Parent directory missing
- Trying to read a directory as a file
- Trying to list a file as a directory
- Concurrent access conflicts
2. Thread Safety (Required)
All trait methods use &self, not &mut self. This means backends MUST use interior mutability for thread-safe concurrent access.
Why &self?
- Enables concurrent access patterns (multiple readers, concurrent operations)
- Matches real filesystem semantics (concurrent access is normal)
- More flexible API (can share references without exclusive ownership)
Backend implementer responsibility:
- Use `RwLock`, `Mutex`, or similar for internal state
- Ensure operations are atomic (a single `write()` call shouldn't produce partial results)
- Handle lock poisoning gracefully
What the synchronization guarantees:
- Memory safety (no data corruption)
- Atomic operations (writes don’t interleave)
What it does NOT guarantee:
- Order of concurrent writes to the same path (last write wins - standard FS behavior)
#![allow(unused)]
fn main() {
use std::sync::{Arc, RwLock};
use std::collections::HashMap;
use std::path::{Path, PathBuf};
pub struct MemoryBackend {
entries: Arc<RwLock<HashMap<PathBuf, Entry>>>,
}
impl FsRead for MemoryBackend {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
let entries = self.entries.read()
.map_err(|_| FsError::Backend("lock poisoned".into()))?;
// ...
}
}
impl FsWrite for MemoryBackend {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
let mut entries = self.entries.write()
.map_err(|_| FsError::Backend("lock poisoned".into()))?;
// ...
}
}
}
Common race conditions to avoid:
- `create_dir_all` called concurrently for the same path
- `read` during `write` to the same file
- `read_dir` while the directory is being modified
- `rename` with concurrent access to source or destination
3. Path Resolution - NOT Your Job
Backends do NOT handle path resolution. FileStorage handles:
- Resolving `..` and `.` components
- Following symlinks for non-`SelfResolving` backends that implement `FsLink`
- Normalizing paths (`//` → `/`, trailing slashes, etc.)
- Walking the virtual directory structure
Your backend receives already-resolved, clean paths. Just store and retrieve bytes at those paths.
#![allow(unused)]
fn main() {
impl FsRead for MyBackend {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
// Path is already resolved - just use it directly
self.storage.get(path)
.cloned()
.ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })
}
}
}
Exception: If your backend wraps a real filesystem (like VRootFsBackend), implement SelfResolving to tell FileStorage to skip resolution - the OS handles it.
#![allow(unused)]
fn main() {
impl SelfResolving for VRootFsBackend {}
}
4. Error Messages
Include context in errors for debugging:
#![allow(unused)]
fn main() {
// BAD - no context
Err(FsError::NotFound)
// GOOD - includes path
Err(FsError::NotFound { path: path.to_path_buf() })
// BETTER - includes operation context
Err(FsError::Io {
path: path.to_path_buf(),
operation: "read",
source: io_error,
})
}
5. Drop Implementation
Ensure cleanup happens correctly:
#![allow(unused)]
fn main() {
impl Drop for SqliteBackend {
fn drop(&mut self) {
// Flush any pending writes
if let Err(e) = self.sync() {
eprintln!("Warning: failed to sync on drop: {}", e);
}
}
}
}
6. Performance Documentation
Document the complexity of operations:
#![allow(unused)]
fn main() {
/// Memory-based virtual filesystem backend.
///
/// # Performance Characteristics
///
/// | Operation | Complexity | Notes |
/// |-----------|------------|-------|
/// | `read` | O(1) | HashMap lookup |
/// | `write` | O(n) | n = data size |
/// | `read_dir` | O(k) | k = entries in directory |
/// | `create_dir_all` | O(d) | d = path depth |
/// | `remove_dir_all` | O(n) | n = total descendants |
///
/// # Thread Safety
///
/// All operations are thread-safe. Uses `RwLock` internally.
/// Multiple concurrent reads are allowed.
/// Writes are exclusive.
pub struct MemoryBackend { ... }
}
Testing Requirements
Your backend MUST pass these test categories:
Basic Functionality
#![allow(unused)]
fn main() {
#[test]
fn test_read_write_roundtrip() { ... }
#[test]
fn test_create_dir_and_list() { ... }
#[test]
fn test_remove_file() { ... }
}
Edge Cases (No Panics)
#![allow(unused)]
fn main() {
#[test]
fn test_read_nonexistent_returns_error() {
let backend = create_backend();
assert!(matches!(
backend.read(std::path::Path::new("/nonexistent")),
Err(FsError::NotFound { .. })
));
}
#[test]
fn test_read_dir_on_file_returns_error() {
let backend = create_backend();
backend.write(std::path::Path::new("/file.txt"), b"data").unwrap();
assert!(matches!(
backend.read_dir(std::path::Path::new("/file.txt")),
Err(FsError::NotADirectory { .. })
));
}
#[test]
fn test_empty_path_returns_error() {
let backend = create_backend();
assert!(backend.read(std::path::Path::new("")).is_err());
}
}
Thread Safety
#![allow(unused)]
fn main() {
#[test]
fn test_concurrent_reads() {
let backend = Arc::new(create_backend_with_data());
let handles: Vec<_> = (0..10).map(|_| {
let backend = backend.clone();
std::thread::spawn(move || {
for _ in 0..100 {
backend.read(std::path::Path::new("/test.txt")).unwrap();
}
})
}).collect();
for handle in handles {
handle.join().unwrap();
}
}
#[test]
fn test_concurrent_create_dir_all() {
// Backend methods take &self and synchronize internally - wrapping in an
// external RwLock would serialize the threads and test nothing
let backend = Arc::new(create_backend());
let handles: Vec<_> = (0..10).map(|_| {
let backend = backend.clone();
std::thread::spawn(move || {
// Should not panic or corrupt state
let _ = backend.create_dir_all(std::path::Path::new("/a/b/c/d"));
})
}).collect();
for handle in handles {
handle.join().unwrap();
}
}
}
Path Normalization
Note: These tests apply to `FileStorage` integration tests, NOT direct backend tests. Backends receive already-resolved paths from `FileStorage`. The tests below verify that `FileStorage` correctly normalizes paths before passing them to backends.
#![allow(unused)]
fn main() {
#[test]
fn test_filestorage_path_normalization() {
// Use FileStorage, not raw backend
let fs = FileStorage::new(create_backend());
fs.create_dir_all("/foo/bar").unwrap();
fs.write("/foo/bar/test.txt", b"data").unwrap();
// FileStorage resolves these before calling backend
assert_eq!(fs.read("/foo/bar/test.txt").unwrap(), b"data");
assert_eq!(fs.read("/foo/bar/../bar/test.txt").unwrap(), b"data");
assert_eq!(fs.read("/foo/./bar/test.txt").unwrap(), b"data");
}
// Direct backend calls should use clean paths only
#[test]
fn test_backend_with_clean_paths() {
let backend = create_backend();
backend.create_dir_all(std::path::Path::new("/foo/bar")).unwrap();
backend.write(std::path::Path::new("/foo/bar/test.txt"), b"data").unwrap();
// Backends receive clean, resolved paths
assert_eq!(backend.read(std::path::Path::new("/foo/bar/test.txt")).unwrap(), b"data");
}
}
MemoryBackend Snapshot & Restore
MemoryBackend supports cloning its entire state (snapshot) and serializing to bytes for persistence.
Core Concept
Snapshot = Clone the storage. That’s it.
#![allow(unused)]
fn main() {
// MemoryBackend implements Clone (custom impl, not derive)
pub struct MemoryBackend { ... }
impl Clone for MemoryBackend {
fn clone(&self) -> Self {
// Deep copy of Arc<RwLock<...>> contents
// ...
}
}
// Snapshot is just .clone()
let snapshot = fs.clone();
// Restore is just assignment
fs = snapshot;
}
API
#![allow(unused)]
fn main() {
impl MemoryBackend {
/// Clone the entire filesystem state.
/// This is a DEEP COPY - modifications to the clone don't affect the original.
/// Implemented via custom Clone (not #[derive(Clone)]) to ensure deep copy
/// of Arc<RwLock<...>> contents.
pub fn clone(&self) -> Self { ... }
/// Serialize to bytes for persistence/transfer.
pub fn to_bytes(&self) -> Result<Vec<u8>, FsError>;
/// Deserialize from bytes.
pub fn from_bytes(data: &[u8]) -> Result<Self, FsError>;
/// Save to file.
pub fn save_to(&self, path: impl AsRef<Path>) -> Result<(), FsError>;
/// Load from file.
pub fn load_from(path: impl AsRef<Path>) -> Result<Self, FsError>;
}
}
Usage
#![allow(unused)]
fn main() {
let mut fs = MemoryBackend::new();
fs.write(std::path::Path::new("/data.txt"), b"important")?;
// Snapshot = clone
let checkpoint = fs.clone();
// Do risky work...
fs.write(std::path::Path::new("/data.txt"), b"corrupted")?;
// Rollback = replace with clone
fs = checkpoint;
assert_eq!(fs.read(std::path::Path::new("/data.txt"))?, b"important");
}
Persistence
#![allow(unused)]
fn main() {
// Save to disk
fs.save_to("state.bin")?;
// Load from disk
let fs = MemoryBackend::load_from("state.bin")?;
}
SqliteBackend
SQLite already has persistence - the database file IS the snapshot. For explicit snapshots:
#![allow(unused)]
fn main() {
impl SqliteBackend {
/// Create an in-memory copy of the database.
pub fn clone_to_memory(&self) -> Result<Self, FsError>;
/// Backup to another file.
pub fn backup_to(&self, path: impl AsRef<Path>) -> Result<(), FsError>;
}
}
SQLite Operational Guide
Production-ready SQLite for filesystem backends
This guide covers everything you need to run SQLite-backed storage at scale.
Overview
SQLite is an excellent choice for filesystem backends:
- Single-file deployment (portable, easy backup)
- ACID transactions (data integrity)
- Rich query capabilities (dashboards, analytics)
- Proven at scale (handles terabytes)
But it has specific requirements for concurrent access that you must understand.
Real-World Performance Reference
A single SQLite database on modern hardware can scale remarkably well:
| Metric | Typical (8 vCPU, NVMe, 32GB RAM) |
|---|---|
| Read P95 | 8-12 ms |
| Write P95 (batched) | 25-40 ms |
| Peak throughput | ~25k requests/min |
| Database size | 18 GB |
Key insight: “Our breakthrough was not faster hardware. It was deciding that writes were expensive.”
The Golden Rule: Single Writer
SQLite supports many readers but only ONE writer at a time.
Even in WAL mode, concurrent writes will block. This isn’t a bug - it’s a design choice that enables SQLite’s reliability.
The Write Queue Pattern
Note: This pattern shows an async implementation using tokio for reference. The AnyFS API is synchronous - if you need async, wrap calls with `spawn_blocking`. See also the sync alternative using `std::sync::mpsc` below.
For filesystem backends, use a single-writer queue:
#![allow(unused)]
fn main() {
// Async variant (optional - requires tokio runtime)
use tokio::sync::{mpsc, oneshot};
use rusqlite::Connection;
pub struct SqliteBackend {
/// Read-only connection pool (many readers OK)
read_pool: Pool<Connection>,
/// Write commands go through this channel
write_tx: mpsc::UnboundedSender<WriteCmd>,
}
enum WriteCmd {
Write { path: PathBuf, data: Vec<u8>, reply: oneshot::Sender<Result<(), FsError>> },
Remove { path: PathBuf, reply: oneshot::Sender<Result<(), FsError>> },
Rename { from: PathBuf, to: PathBuf, reply: oneshot::Sender<Result<(), FsError>> },
CreateDir { path: PathBuf, reply: oneshot::Sender<Result<(), FsError>> },
// ...
}
// Single writer task
async fn writer_loop(conn: Connection, mut rx: mpsc::UnboundedReceiver<WriteCmd>) {
while let Some(cmd) = rx.recv().await {
match cmd {
WriteCmd::Write { path, data, reply } => {
let r = execute_write(&conn, &path, &data);
let _ = reply.send(r);
}
// ... handle other commands
}
}
}
}
Why this works:
- No `SQLITE_BUSY` errors (single writer = no contention)
- Predictable latency (queue depth = backpressure)
- Natural batching opportunity (combine multiple ops per transaction)
- Clean audit logging (all writes go through one place)
Sync Alternative (no tokio required):
#![allow(unused)]
fn main() {
// Sync variant using std channels
use std::sync::mpsc;
pub struct SqliteBackend {
read_pool: Pool<Connection>,
write_tx: mpsc::Sender<WriteCmd>,
}
// Writer runs in a dedicated thread
fn writer_thread(conn: Connection, rx: mpsc::Receiver<WriteCmd>) {
while let Ok(cmd) = rx.recv() {
match cmd {
WriteCmd::Write { path, data, reply } => {
let r = execute_write(&conn, &path, &data);
let _ = reply.send(r);
}
// ... handle other commands
}
}
}
}
“One Door” Principle: Once there is one door to the database, nobody can sneak in a surprise write on the request path. This architectural discipline—not just code—is what makes SQLite reliable at scale.
Write Batching: The Key to Performance
“One transaction per event is a tax. One transaction per batch is a different economy.”
Treat writes like a budget. The breakthrough is not faster hardware—it’s deciding that writes are expensive and batching them accordingly.
Batch writes into single transactions for dramatic performance improvement:
#![allow(unused)]
fn main() {
impl SqliteBackend {
fn flush_writes(&self) -> Result<(), FsError> {
let ops = self.write_queue.drain();
if ops.is_empty() { return Ok(()); }
let tx = self.conn.transaction()?;
for op in ops {
op.execute(&tx)?;
}
tx.commit()?; // One commit for many operations
Ok(())
}
}
}
Flush triggers:
- Batch size reached (e.g., 100 operations)
- Timeout elapsed (e.g., 50ms since first queued write)
- Explicit `sync()` call
- Read-after-write on same path (for consistency)
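These triggers can be sketched with the standard library alone. The `WriteQueue` type and its thresholds here are illustrative, not part of the real SqliteBackend:

```rust
use std::time::{Duration, Instant};

struct WriteQueue {
    pending: Vec<String>, // placeholder for queued operations
    first_queued_at: Option<Instant>,
    max_batch: usize,   // e.g., 100 operations
    max_wait: Duration, // e.g., 50ms since first queued write
}

impl WriteQueue {
    fn new(max_batch: usize, max_wait: Duration) -> Self {
        Self { pending: Vec::new(), first_queued_at: None, max_batch, max_wait }
    }

    fn push(&mut self, op: String) {
        if self.pending.is_empty() {
            // Timeout is measured from the FIRST queued write, so even a
            // trickle of writes flushes promptly.
            self.first_queued_at = Some(Instant::now());
        }
        self.pending.push(op);
    }

    /// True when the batch-size or timeout trigger has fired.
    fn should_flush(&self) -> bool {
        !self.pending.is_empty()
            && (self.pending.len() >= self.max_batch
                || self
                    .first_queued_at
                    .map_or(false, |t| t.elapsed() >= self.max_wait))
    }

    /// Hand the batch to the writer and reset the timer.
    fn drain(&mut self) -> Vec<String> {
        self.first_queued_at = None;
        std::mem::take(&mut self.pending)
    }
}
```

The explicit `sync()` and read-after-write triggers would simply call `drain()` unconditionally; they are about consistency, not throughput.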
WAL Mode (Required)
Always enable WAL (Write-Ahead Logging) mode for concurrent access.
Recommended Pragma Defaults
| Pragma | Default | Purpose | Tradeoff |
|---|---|---|---|
| `journal_mode` | `WAL` | Concurrent reads during writes | Creates .wal/.shm files |
| `synchronous` | `FULL` | Data integrity on power loss | Slower writes, safest default |
| `temp_store` | `MEMORY` | Faster temp operations | Uses RAM for temp tables |
| `cache_size` | `-32000` | 32MB page cache | Tune based on dataset size |
| `busy_timeout` | `5000` | Wait 5s on lock contention | Prevents SQLITE_BUSY errors |
| `foreign_keys` | `ON` | Enforce referential integrity | Slight overhead on writes |
#![allow(unused)]
fn main() {
fn open_connection(path: &Path) -> Result<Connection, rusqlite::Error> {
let conn = Connection::open(path)?;
conn.execute_batch("
PRAGMA journal_mode = WAL;
PRAGMA synchronous = FULL;
PRAGMA temp_store = MEMORY;
PRAGMA cache_size = -32000;
PRAGMA busy_timeout = 5000;
PRAGMA foreign_keys = ON;
")?;
Ok(conn)
}
}
Synchronous Mode: Safety vs Performance
| Mode | Behavior | Use When |
|---|---|---|
| `FULL` | Sync WAL before each commit | Default - data integrity is critical |
| `NORMAL` | Sync WAL before checkpoint only | High-throughput, battery-backed storage, or acceptable data loss |
| `OFF` | No syncs | Testing only, high corruption risk |
Why FULL is the default:
- `SqliteBackend` stores file content—losing data on power failure is unacceptable
- Consumer SSDs often lack power-loss protection
- Filesystem users expect durability guarantees
When to use NORMAL:
#![allow(unused)]
fn main() {
// Opt-in for performance when you have:
// - Enterprise storage with battery-backed write cache
// - UPS-protected systems
// - Acceptable risk of losing last few transactions
let backend = SqliteBackend::builder()
.synchronous(Synchronous::Normal)
.build()?;
}
Cache Size Tuning
The default 32MB cache is conservative. Tune based on your dataset:
| Dataset Size | Recommended Cache | Rationale |
|---|---|---|
| < 100MB | 8-16MB | Small datasets fit in cache easily |
| 100MB - 1GB | 32-64MB | Default is appropriate |
| 1GB - 10GB | 64-128MB | Larger cache reduces disk I/O |
| > 10GB | 128-256MB | Diminishing returns above this |
#![allow(unused)]
fn main() {
// Configure via builder
let backend = SqliteBackend::builder()
.cache_size_mb(64)
.build()?;
}
WAL vs Rollback Journal
| Aspect | WAL Mode | Rollback Journal |
|---|---|---|
| Concurrent reads during write | ✅ Yes | ❌ No (blocked) |
| Read performance | Faster | Slower |
| Write performance | Similar | Similar |
| File count | 3 files (.db, .wal, .shm) | 1-2 files |
| Crash recovery | Automatic | Automatic |
Always use WAL for filesystem backends.
WAL Checkpointing
WAL files grow until checkpointed. SQLite auto-checkpoints at 1000 pages, but you can control this:
#![allow(unused)]
fn main() {
// Manual checkpoint (call periodically or after bulk operations)
conn.execute_batch("PRAGMA wal_checkpoint(TRUNCATE);")?;
// Or configure auto-checkpoint threshold
conn.execute_batch("PRAGMA wal_autocheckpoint = 1000;")?; // pages
}
Checkpoint modes:
- `PASSIVE` - Checkpoint without blocking writers (may not complete)
- `FULL` - Wait for writers, then checkpoint completely
- `RESTART` - Like FULL, but also resets the WAL file
- `TRUNCATE` - Like RESTART, but truncates the WAL to zero bytes
For filesystem backends, run TRUNCATE checkpoint:
- During quiet periods
- After bulk imports
- Before backups
Busy Handling
Even with a write queue, reads might briefly block during checkpoints. Handle this gracefully:
#![allow(unused)]
fn main() {
fn open_connection(path: &Path) -> Result<Connection, rusqlite::Error> {
let conn = Connection::open(path)?;
// Wait up to 30 seconds if database is busy
conn.busy_timeout(Duration::from_secs(30))?;
// Or use a custom busy handler
conn.busy_handler(Some(|attempts| {
if attempts > 100 {
false // Give up after 100 retries
} else {
std::thread::sleep(Duration::from_millis(10 * attempts as u64));
true // Keep trying
}
}))?;
Ok(conn)
}
}
Never let SQLITE_BUSY propagate to users - it’s a transient condition.
Connection Pooling
For read operations, use a connection pool:
#![allow(unused)]
fn main() {
use r2d2::{Pool, PooledConnection};
use r2d2_sqlite::SqliteConnectionManager;
pub struct SqliteBackend {
read_pool: Pool<SqliteConnectionManager>,
write_tx: mpsc::UnboundedSender<WriteCmd>,
}
impl SqliteBackend {
pub fn open(path: impl AsRef<Path>) -> Result<Self, FsError> {
let manager = SqliteConnectionManager::file(path.as_ref())
.with_flags(rusqlite::OpenFlags::SQLITE_OPEN_READ_ONLY);
let read_pool = Pool::builder()
.max_size(10) // 10 concurrent readers
.build(manager)
.map_err(|e| FsError::Backend(e.to_string()))?;
// ... set up write queue
Ok(Self { read_pool, write_tx })
}
}
impl FsRead for SqliteBackend {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
let conn = self.read_pool.get()
.map_err(|e| FsError::Backend(format!("pool exhausted: {}", e)))?;
// Use read-only connection
query_file_content(&conn, path)
}
}
}
Pool sizing:
- Start with `max_size = CPU cores * 2`
- Monitor pool exhaustion
- Increase if reads queue up
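A hedged starting point for that rule of thumb, as a small helper (the function name and clamp range are assumptions, not AnyFS API; always tune by measurement):

```rust
use std::thread;

/// Illustrative default read-pool size: CPU cores * 2, clamped to a sane
/// range so tiny VMs still get a few connections and huge hosts don't open
/// dozens of idle ones.
fn default_read_pool_size() -> u32 {
    let cores = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(4); // conservative fallback if detection fails
    (cores * 2).clamp(4, 32) as u32
}
```

Feed the result into `Pool::builder().max_size(...)`, then watch for pool exhaustion before raising it.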
Vacuum and Maintenance
SQLite doesn’t automatically reclaim space from deleted data. You need VACUUM.
Auto-Vacuum (Recommended)
Enable incremental auto-vacuum for gradual space reclamation:
-- Set once when creating the database
PRAGMA auto_vacuum = INCREMENTAL;
-- Then periodically run (e.g., daily or after large deletes)
PRAGMA incremental_vacuum(1000); -- Free up to 1000 pages
Manual Vacuum
Full vacuum rebuilds the entire database (expensive but thorough):
#![allow(unused)]
fn main() {
impl SqliteBackend {
/// Compact the database. Call during maintenance windows.
pub fn vacuum(&self) -> Result<(), FsError> {
// Vacuum needs exclusive access - pause writes
let conn = self.get_write_connection()?;
// This can take a long time for large databases
conn.execute_batch("VACUUM;")?;
Ok(())
}
}
}
When to vacuum:
- After deleting >25% of data
- After schema migrations
- During scheduled maintenance
- Never during peak usage
Integrity Check
Periodically verify database integrity:
#![allow(unused)]
fn main() {
impl SqliteBackend {
pub fn check_integrity(&self) -> Result<bool, FsError> {
let conn = self.read_pool.get()?;
let result: String = conn.query_row(
"PRAGMA integrity_check;",
[],
|row| row.get(0),
)?;
Ok(result == "ok")
}
}
}
Run integrity checks:
- After crash recovery
- Before backups
- Periodically (weekly/monthly)
Schema Migrations
Filesystem schemas evolve. Handle migrations properly:
Version Tracking
-- Store schema version in the database
CREATE TABLE IF NOT EXISTS meta (
key TEXT PRIMARY KEY,
value TEXT
);
INSERT OR REPLACE INTO meta (key, value) VALUES ('schema_version', '1');
Migration Pattern
#![allow(unused)]
fn main() {
const CURRENT_VERSION: i32 = 3;
impl SqliteBackend {
fn migrate(&self, conn: &Connection) -> Result<(), FsError> {
let version: i32 = conn.query_row(
"SELECT COALESCE((SELECT value FROM meta WHERE key = 'schema_version'), '0')",
[],
|row| row.get::<_, String>(0)?.parse().map_err(|_| rusqlite::Error::InvalidQuery),
).unwrap_or(0);
if version < 1 {
self.migrate_v0_to_v1(conn)?;
}
if version < 2 {
self.migrate_v1_to_v2(conn)?;
}
if version < 3 {
self.migrate_v2_to_v3(conn)?;
}
conn.execute(
"INSERT OR REPLACE INTO meta (key, value) VALUES ('schema_version', ?)",
[CURRENT_VERSION.to_string()],
)?;
Ok(())
}
fn migrate_v1_to_v2(&self, conn: &Connection) -> Result<(), FsError> {
conn.execute_batch("
-- Add new column
ALTER TABLE nodes ADD COLUMN checksum TEXT;
-- Backfill (expensive but necessary)
-- UPDATE nodes SET checksum = compute_checksum(content) WHERE content IS NOT NULL;
")?;
Ok(())
}
}
}
Migration rules:
- Always wrap in transaction
- Test migrations on copy of production data
- Have rollback plan (backup before migration)
- Never delete columns in SQLite (not supported) - add new ones instead
Backup Strategies
Online Backup API (Recommended)
SQLite’s backup API creates consistent snapshots of a live database:
#![allow(unused)]
fn main() {
use rusqlite::backup::{Backup, Progress};
impl SqliteBackend {
/// Create a consistent backup while database is in use.
pub fn backup(&self, dest_path: impl AsRef<Path>) -> Result<(), FsError> {
let src = self.get_read_connection()?;
let mut dest = Connection::open(dest_path.as_ref())?;
let backup = Backup::new(&src, &mut dest)?;
// Copy in chunks (allows progress reporting)
loop {
// step() returns a StepResult, not a bool; Done means the copy finished
if matches!(backup.step(100)?, rusqlite::backup::StepResult::Done) { // 100 pages at a time
break;
}
// Optional: report progress
let progress = backup.progress();
println!("Backup: {}/{} pages", progress.pagecount - progress.remaining, progress.pagecount);
}
Ok(())
}
}
}
Benefits:
- No downtime (backup while serving requests)
- Consistent snapshot (point-in-time)
- Can copy to any destination (file, memory, network)
File Copy (Simple but Risky)
Only safe if database is not in use:
#![allow(unused)]
fn main() {
// DANGER: Only do this if no connections are open!
impl SqliteBackend {
pub fn backup_cold(&self, dest: impl AsRef<Path>) -> Result<(), FsError> {
// Ensure WAL is checkpointed first
let conn = self.get_write_connection()?;
conn.execute_batch("PRAGMA wal_checkpoint(TRUNCATE);")?;
drop(conn);
// Now safe to copy
std::fs::copy(&self.db_path, dest.as_ref())?;
Ok(())
}
}
}
Backup Schedule
| Scenario | Strategy |
|---|---|
| Development | Manual or none |
| Small production (<1GB) | Hourly online backup |
| Large production (>1GB) | Daily full + WAL archiving |
| Critical data | Continuous WAL shipping to replica |
Performance Tuning
Essential PRAGMAs
-- Safe defaults
PRAGMA journal_mode = WAL; -- Required for concurrent access
PRAGMA synchronous = FULL; -- Data integrity on power loss (default)
PRAGMA cache_size = -32000; -- 32MB cache (tune based on dataset)
PRAGMA temp_store = MEMORY; -- Temp tables in memory
-- Performance opt-in (when you have battery-backed storage)
-- PRAGMA synchronous = NORMAL; -- Faster, risk of data loss on power failure
-- PRAGMA cache_size = -128000; -- Larger cache for big datasets
-- PRAGMA mmap_size = 268435456; -- 256MB memory-mapped I/O
-- For read-heavy workloads
PRAGMA read_uncommitted = ON; -- Allow dirty reads (faster, use carefully)
-- For write-heavy workloads
PRAGMA wal_autocheckpoint = 10000; -- Checkpoint less frequently
Indexing Strategy
-- Essential indexes for filesystem operations
CREATE INDEX idx_nodes_parent ON nodes(parent_inode);
CREATE INDEX idx_nodes_name ON nodes(parent_inode, name);
-- For metadata queries
CREATE INDEX idx_nodes_type ON nodes(node_type);
CREATE INDEX idx_nodes_modified ON nodes(modified_at);
-- For GC queries
CREATE INDEX idx_blobs_orphan ON blobs(refcount) WHERE refcount = 0;
-- Composite indexes for common queries
CREATE INDEX idx_nodes_parent_type ON nodes(parent_inode, node_type);
Query Optimization
#![allow(unused)]
fn main() {
// BAD: Multiple queries
fn get_children_with_metadata(parent: i64) -> Vec<Node> {
let children = query("SELECT * FROM nodes WHERE parent = ?", [parent]);
for child in children {
let metadata = query("SELECT * FROM metadata WHERE inode = ?", [child.inode]);
// ...
}
}
// GOOD: Single query with JOIN
fn get_children_with_metadata(parent: i64) -> Vec<Node> {
query("
SELECT n.*, m.*
FROM nodes n
LEFT JOIN metadata m ON n.inode = m.inode
WHERE n.parent = ?
", [parent])
}
}
Prepared Statements
Always use prepared statements for repeated queries:
#![allow(unused)]
fn main() {
impl SqliteBackend {
fn prepare_statements(conn: &Connection) -> Result<Statements, FsError> {
Ok(Statements {
read_file: conn.prepare_cached(
"SELECT content FROM nodes WHERE parent_inode = ? AND name = ?"
)?,
list_dir: conn.prepare_cached(
"SELECT name, node_type, size FROM nodes WHERE parent_inode = ?"
)?,
// ... other common queries
})
}
}
}
Monitoring and Diagnostics
Key Metrics to Track
#![allow(unused)]
fn main() {
impl SqliteBackend {
pub fn stats(&self) -> Result<DbStats, FsError> {
let conn = self.read_pool.get()?;
Ok(DbStats {
// Database size
page_count: pragma_i64(&conn, "page_count"),
page_size: pragma_i64(&conn, "page_size"),
// WAL status
wal_pages: pragma_i64(&conn, "wal_checkpoint"),
// Cache efficiency
cache_hit: pragma_i64(&conn, "cache_hit"),
cache_miss: pragma_i64(&conn, "cache_miss"),
// Fragmentation
freelist_count: pragma_i64(&conn, "freelist_count"),
})
}
}
fn pragma_i64(conn: &Connection, name: &str) -> i64 {
conn.query_row(&format!("PRAGMA {}", name), [], |r| r.get(0)).unwrap_or(0)
}
}
Health Checks
#![allow(unused)]
fn main() {
impl SqliteBackend {
pub fn health_check(&self) -> HealthStatus {
// 1. Can we connect?
let conn = match self.read_pool.get() {
Ok(c) => c,
Err(e) => return HealthStatus::Unhealthy(format!("pool: {}", e)),
};
// 2. Is database intact?
let integrity: String = conn.query_row("PRAGMA integrity_check", [], |r| r.get(0))
.unwrap_or_else(|_| "error".to_string());
if integrity != "ok" {
return HealthStatus::Unhealthy(format!("integrity: {}", integrity));
}
// 3. Is WAL file reasonable size?
let wal_size = std::fs::metadata(format!("{}-wal", self.db_path))
.map(|m| m.len())
.unwrap_or(0);
if wal_size > 100 * 1024 * 1024 { // > 100MB
return HealthStatus::Degraded("WAL file large - checkpoint needed".into());
}
// 4. Is write queue backed up?
if self.write_queue_depth() > 1000 {
return HealthStatus::Degraded("Write queue backlog".into());
}
HealthStatus::Healthy
}
}
}
Common Pitfalls
1. Opening Too Many Connections
#![allow(unused)]
fn main() {
// BAD: New connection per operation
fn read(&self, path: &Path) -> Vec<u8> {
let conn = Connection::open(&self.db_path).unwrap(); // DON'T
// ...
}
// GOOD: Use connection pool
fn read(&self, path: &Path) -> Vec<u8> {
let conn = self.pool.get().unwrap(); // Reuse connections
// ...
}
}
2. Long-Running Transactions
#![allow(unused)]
fn main() {
// BAD: Transaction open while doing slow work
let tx = conn.transaction()?;
for file in files {
tx.execute("INSERT ...", [&file])?;
upload_to_s3(&file)?; // SLOW - blocks other writers!
}
tx.commit()?;
// GOOD: Minimize transaction scope
for file in files {
upload_to_s3(&file)?; // Do slow work outside transaction
}
let tx = conn.transaction()?;
for file in files {
tx.execute("INSERT ...", [&file])?; // Fast inserts only
}
tx.commit()?;
}
3. Ignoring SQLITE_BUSY
#![allow(unused)]
fn main() {
// BAD: Crash on busy
conn.execute("INSERT ...", [])?; // May return SQLITE_BUSY
// GOOD: Retry logic (or use busy_timeout)
loop {
match conn.execute("INSERT ...", []) {
Ok(_) => break,
Err(rusqlite::Error::SqliteFailure(e, _)) if e.code == ErrorCode::DatabaseBusy => {
std::thread::sleep(Duration::from_millis(10));
continue;
}
Err(e) => return Err(e.into()),
}
}
}
4. Forgetting to Checkpoint
// BAD: WAL grows forever
// (no checkpoint calls)
// GOOD: Periodic checkpoint
impl SqliteBackend {
pub fn maintenance(&self) -> Result<(), FsError> {
let conn = self.get_write_connection()?;
conn.execute_batch("PRAGMA wal_checkpoint(TRUNCATE);")?;
Ok(())
}
}
5. Not Using Transactions for Batch Operations
#![allow(unused)]
fn main() {
// BAD: 1000 separate transactions
for item in items {
conn.execute("INSERT ...", [item])?; // Each is auto-committed
}
// GOOD: Single transaction
let tx = conn.transaction()?;
for item in items {
tx.execute("INSERT ...", [item])?;
}
tx.commit()?; // 10-100x faster
}
SQLCipher (Encryption)
For encrypted databases, use SQLCipher:
#![allow(unused)]
fn main() {
use rusqlite::Connection;
fn open_encrypted(path: &Path, key: &str) -> Result<Connection, rusqlite::Error> {
let conn = Connection::open(path)?;
// Set encryption key (must be first operation)
conn.execute_batch(&format!("PRAGMA key = '{}';", key))?;
// Verify encryption is working
conn.execute_batch("SELECT count(*) FROM sqlite_master;")?;
// Now configure as normal
conn.execute_batch("
PRAGMA journal_mode = WAL;
PRAGMA synchronous = FULL;
")?;
Ok(conn)
}
}
Key management:
- Never hardcode keys
- Rotate keys periodically (requires re-encryption)
- Use key derivation (PBKDF2) for password-based keys
- Store key metadata separately from data
See Security Model for key rotation patterns.
Path Resolution Performance
The N-query problem is the dominant cost for SQLite filesystems.
With a parent/name schema, resolving /documents/2024/q1/report.pdf requires:
Query 1: SELECT inode FROM nodes WHERE parent=1 AND name='documents' → 2
Query 2: SELECT inode FROM nodes WHERE parent=2 AND name='2024' → 3
Query 3: SELECT inode FROM nodes WHERE parent=3 AND name='q1' → 4
Query 4: SELECT inode FROM nodes WHERE parent=4 AND name='report.pdf' → 5
Four round-trips for one file! Deep directory structures multiply this cost.
Solution 1: Path Caching (Recommended)
Cache resolved paths at the FileStorage layer using CachingResolver:
#![allow(unused)]
fn main() {
let backend = SqliteBackend::open("data.db")?;
let fs = FileStorage::with_resolver(
backend,
CachingResolver::new(IterativeResolver, 10_000) // 10K entry cache
);
}
Cache invalidation: Clear cache entries on rename/remove operations that affect path prefixes.
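That invalidation rule can be sketched as follows (the `PathCache` type is illustrative, not the real CachingResolver internals):

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

struct PathCache {
    entries: HashMap<PathBuf, u64>, // resolved path -> inode
}

impl PathCache {
    /// Remove the entry for `prefix` and every entry beneath it, e.g. after
    /// a rename or remove. `Path::starts_with` compares whole components,
    /// so invalidating "/foo" does not touch "/foobar".
    fn invalidate_prefix(&mut self, prefix: &Path) {
        self.entries.retain(|path, _| !path.starts_with(prefix));
    }
}
```

`retain` is O(cache size); for very large caches a prefix-ordered structure (e.g. a BTreeMap range scan) would bound the work to the affected subtree.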
Solution 2: Recursive CTE (Single Query)
Resolve entire path in one query using SQLite’s recursive CTE:
WITH RECURSIVE path_walk(depth, inode, remaining) AS (
-- Start at root (leading '/' stripped from the input path)
SELECT 0, 1, 'documents/2024/q1/report.pdf'
UNION ALL
-- Consume one component per step; remaining becomes '' on the last one
SELECT
pw.depth + 1,
n.inode,
CASE WHEN instr(pw.remaining, '/') > 0
THEN substr(pw.remaining, instr(pw.remaining, '/') + 1)
ELSE '' END
FROM path_walk pw
JOIN nodes n ON n.parent = pw.inode
AND n.name = CASE WHEN instr(pw.remaining, '/') > 0
THEN substr(pw.remaining, 1, instr(pw.remaining, '/') - 1)
ELSE pw.remaining END
WHERE pw.remaining != ''
)
SELECT inode FROM path_walk ORDER BY depth DESC LIMIT 1;
Note the CASE on the new remaining: without it, the last component would recurse forever, because substr(s, instr(s, '/') + 1) returns s unchanged when s contains no '/'.
Tradeoff: More complex query, but single round-trip. Best for deep paths without caching.
Solution 3: Full-Path Index (Alternative Schema)
Store full paths as keys for O(1) lookups:
CREATE TABLE nodes (
path TEXT PRIMARY KEY, -- '/documents/2024/q1/report.pdf'
parent_path TEXT NOT NULL,
name TEXT NOT NULL,
-- ... other columns
);
CREATE INDEX idx_nodes_parent_path ON nodes(parent_path);
Tradeoff: Instant lookups, but rename() must update all descendants’ paths.
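To make that cost concrete, here is a std-only sketch of the rewrite rename must perform under a full-path schema (in SQL this would be a single UPDATE over all matching rows; the function is illustrative):

```rust
/// Rewrite every stored path at or under `from` to live under `to`.
/// With a full-path schema this is O(descendants), unlike the O(1)
/// row update a parent/name schema needs.
fn rename_prefix(paths: &mut [String], from: &str, to: &str) {
    // "/a" must match "/a" and "/a/..." but never "/ab".
    let child_prefix = format!("{}/", from);
    for p in paths.iter_mut() {
        if p.as_str() == from {
            *p = to.to_string();
        } else if p.starts_with(&child_prefix) {
            let rest = p[from.len()..].to_string();
            *p = format!("{}{}", to, rest);
        }
    }
}
```

Every descendant row is touched, which is why the recommendation table below the full-path section steers write-heavy, rename-heavy workloads back to the parent/name schema.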
Recommendation
| Workload | Best Approach |
|---|---|
| Read-heavy, shallow paths | Parent/name + basic index |
| Read-heavy, deep paths | Parent/name + CachingResolver |
| Write-heavy with renames | Parent/name (rename is O(1)) |
| Read-dominated, few renames | Full-path index |
SqliteBackend Schema
The SqliteBackend (part of the anyfs crate) stores everything in SQLite, including file content:
CREATE TABLE nodes (
inode INTEGER PRIMARY KEY,
parent INTEGER NOT NULL,
name TEXT NOT NULL,
node_type INTEGER NOT NULL, -- 0=file, 1=dir, 2=symlink
content BLOB, -- File content (inline)
target TEXT, -- Symlink target
size INTEGER NOT NULL DEFAULT 0,
permissions INTEGER NOT NULL DEFAULT 420, -- 0o644
nlink INTEGER NOT NULL DEFAULT 1,
created_at INTEGER NOT NULL,
modified_at INTEGER NOT NULL,
accessed_at INTEGER NOT NULL,
UNIQUE(parent, name)
);
-- Root directory
INSERT INTO nodes (inode, parent, name, node_type, created_at, modified_at, accessed_at)
VALUES (1, 1, '', 1, unixepoch(), unixepoch(), unixepoch());
-- Indexes
CREATE INDEX idx_nodes_parent ON nodes(parent);
CREATE INDEX idx_nodes_parent_name ON nodes(parent, name);
Key design choices:
- Inline BLOBs: Simple, portable, single-file backup
- Integer node_type: Faster comparison than TEXT
- Parent/name unique: Enforces filesystem semantics at database level
BLOB Storage Strategies
Inline (SqliteBackend)
All content stored in nodes.content column:
| Pros | Cons |
|---|---|
| Single-file portability | Memory pressure for large files |
| Atomic operations | SQLite page overhead for small files |
| Simple backup/restore | WAL growth during large writes |
Best for: Files <10MB, portability-focused use cases.
External (IndexedBackend)
Content stored as files, SQLite holds only metadata:
| Pros | Cons |
|---|---|
| Native streaming I/O | Two-component backup |
| No memory pressure | Blob/index consistency risk |
| Efficient for large files | More complex implementation |
Best for: Large files, media libraries, streaming workloads.
Hybrid Approach (Future Consideration)
Inline small files, external for large:
#![allow(unused)]
fn main() {
const INLINE_THRESHOLD: usize = 64 * 1024; // 64KB
fn store_content(&self, data: &[u8]) -> Result<ContentRef, FsError> {
if data.len() <= INLINE_THRESHOLD {
Ok(ContentRef::Inline(data.to_vec()))
} else {
let blob_id = self.blob_store.put(data)?;
Ok(ContentRef::External(blob_id))
}
}
}
Tradeoff: Best of both worlds, but adds schema complexity.
When to Outgrow SQLite
From real-world experience:
“We eventually migrated, not because SQLite failed, but because our product changed. We added features that created heavier concurrent writes. That is when a single file stops being an advantage and starts being a ceiling.”
SQLite Works Well For
- Read-heavy workloads (feeds, search, file serving)
- Single-process applications
- Embedded/desktop applications
- Development and testing
- Workloads up to ~25k requests/minute (read-dominated)
Consider Migration When
| Signal | What It Means |
|---|---|
| Write contention dominates | Queue depth grows, latency spikes |
| Multi-process writes needed | SQLite’s single-writer limit |
| Horizontal scaling required | SQLite can’t distribute |
| Real-time sync across nodes | No built-in replication |
Migration Path
- Abstract early: Use AnyFS traits so backends are swappable
- Measure first: Profile before assuming SQLite is the bottleneck
- Consider IndexedBackend: External blobs reduce SQLite pressure
- Postgres/MySQL: When you truly need concurrent writes
Key insight: The architecture patterns (write batching, connection pooling, caching) transfer to any database. SQLite teaches discipline that scales.
Summary Checklist
Before deploying SQLite backend to production:
Architecture:
- Single-writer queue implemented (“one door” principle)
- Connection pool for readers (4-8 connections)
- Write batching enabled (batch size + timeout flush)
- Path resolution strategy chosen (caching, CTE, or full-path)
Configuration:
- WAL mode enabled (`PRAGMA journal_mode = WAL`)
- `synchronous = FULL` (safe default, opt-in to NORMAL)
- Cache size tuned for dataset (default 32MB)
- Busy timeout configured (5+ seconds)
- Auto-vacuum configured (INCREMENTAL)
Indexes:
- Parent/name composite index for path lookups
- Indexes match actual query patterns (measure first!)
- Partial indexes for GC queries
Operations:
- Backup strategy in place (online backup API)
- Monitoring for WAL size, queue depth, cache hit ratio
- Integrity checks scheduled (weekly/monthly)
- Migration path for schema changes
“SQLite did not scale our app. Measurement, batching, and restraint did.”
Security Model
Threat modeling, encryption, and security hardening for AnyFS deployments
This guide covers security considerations for deploying AnyFS-based filesystems, from single-user local use to multi-tenant cloud services.
Threat Model
Actors
| Actor | Description | Trust Level |
|---|---|---|
| User | Legitimate filesystem user | Trusted for their data |
| Other User | Another tenant (multi-tenant) | Untrusted (isolation required) |
| Operator | System administrator | Trusted for ops, not data |
| Attacker | External malicious actor | Untrusted |
| Compromised Host | Server with attacker access | Assume worst case |
Assets to Protect
| Asset | Confidentiality | Integrity | Availability |
|---|---|---|---|
| File contents | High | High | High |
| File metadata (names, sizes) | Medium | High | High |
| Directory structure | Medium | High | Medium |
| Encryption keys | Critical | Critical | High |
| Audit logs | Medium | Critical | Medium |
| User credentials | Critical | Critical | High |
Attack Vectors
| Vector | Mitigation |
|---|---|
| Network interception | TLS for all traffic |
| Unauthorized access | Authentication + authorization |
| Data theft (at rest) | Encryption (SQLCipher) |
| Data theft (in memory) | Memory protection, key isolation |
| Tenant data leakage | Strict isolation, no cross-tenant dedup |
| Path traversal | PathFilter middleware, input validation |
| Denial of service | Rate limiting, quotas |
| Privilege escalation | Principle of least privilege |
| Audit tampering | Append-only logs, signatures |
Encryption at Rest
SQLCipher Integration
For encrypted SQLite backends, use SQLCipher:
#![allow(unused)]
fn main() {
use rusqlite::Connection;
pub struct EncryptedSqliteBackend {
conn: Connection,
}
impl EncryptedSqliteBackend {
/// Open an encrypted database.
///
/// # Security Notes
/// - Key should be 256 bits of cryptographically random data
/// - Or use a strong passphrase with proper key derivation
pub fn open(path: &Path, key: &EncryptionKey) -> Result<Self, FsError> {
let conn = Connection::open(path)
.map_err(|e| FsError::Backend(e.to_string()))?;
// Apply encryption key (MUST be first operation)
match key {
EncryptionKey::Raw(bytes) => {
// Raw 256-bit key (hex encoded for SQLCipher)
let hex_key = hex::encode(bytes);
conn.execute_batch(&format!("PRAGMA key = \"x'{}'\";", hex_key))?;
}
EncryptionKey::Passphrase(pass) => {
// Passphrase (SQLCipher uses PBKDF2 internally)
conn.execute_batch(&format!("PRAGMA key = '{}';", escape_sql(pass)))?;
}
}
// Verify the key is correct (a wrong key fails here)
conn.query_row("SELECT count(*) FROM sqlite_master;", [], |_row| Ok(()))
.map_err(|_| FsError::InvalidPassword)?;
// Configure after key is set
conn.execute_batch("
PRAGMA journal_mode = WAL;
PRAGMA synchronous = FULL;
")?;
Ok(Self { conn })
}
}
pub enum EncryptionKey {
/// Raw 256-bit key (32 bytes)
Raw([u8; 32]),
/// Passphrase (key derived via PBKDF2)
Passphrase(String),
}
}
Key Derivation
For passphrase-based keys, SQLCipher uses PBKDF2 internally. For custom key derivation:
#![allow(unused)]
fn main() {
use argon2::Argon2;
use rand::rngs::OsRng;
use rand::RngCore;
/// Derive a 256-bit key from a passphrase.
pub fn derive_key(passphrase: &str, salt: &[u8]) -> [u8; 32] {
let argon2 = Argon2::default();
let mut key = [0u8; 32];
argon2.hash_password_into(
passphrase.as_bytes(),
salt,
&mut key,
).expect("key derivation failed");
key
}
/// Generate a random salt for key derivation.
pub fn generate_salt() -> [u8; 16] {
let mut salt = [0u8; 16];
OsRng.fill_bytes(&mut salt);
salt
}
}
Salt storage: Store salt separately from encrypted data (e.g., in a key management service or config file).
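A common pattern is a sidecar salt file: read it if it exists, otherwise generate and persist one. The salt is not secret, but losing it makes the derived key unrecoverable, so back it up with the database. This is a hedged sketch using only the standard library; `load_or_create_salt` is a hypothetical helper, not part of AnyFS.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Read a 16-byte salt from `path`, or create one with `generate` if missing.
pub fn load_or_create_salt(
    path: &Path,
    generate: impl Fn() -> [u8; 16],
) -> io::Result<[u8; 16]> {
    match fs::read(path) {
        Ok(bytes) if bytes.len() == 16 => {
            let mut salt = [0u8; 16];
            salt.copy_from_slice(&bytes);
            Ok(salt)
        }
        // A salt file of the wrong size is corrupt; refuse to guess.
        Ok(_) => Err(io::Error::new(io::ErrorKind::InvalidData, "bad salt file")),
        Err(e) if e.kind() == io::ErrorKind::NotFound => {
            let salt = generate();
            fs::write(path, salt)?;
            Ok(salt)
        }
        Err(e) => Err(e),
    }
}
```

On second and later runs the stored salt is returned unchanged, so the same passphrase always derives the same key.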
Key Management
Key Lifecycle
┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐
│ Generate│ ──→ │ Store │ ──→ │ Use │ ──→ │ Rotate │
└─────────┘ └─────────┘ └─────────┘ └─────────┘
│
▼
┌─────────┐
│ Destroy │
└─────────┘
Key Generation
#![allow(unused)]
fn main() {
use rand::rngs::OsRng;
use rand::RngCore;
/// Generate a cryptographically secure 256-bit key.
pub fn generate_key() -> [u8; 32] {
let mut key = [0u8; 32];
OsRng.fill_bytes(&mut key);
key
}
/// Generate a key ID for tracking.
pub fn generate_key_id() -> String {
let mut id = [0u8; 16];
OsRng.fill_bytes(&mut id);
format!("key_{}", hex::encode(id))
}
}
Key Storage
Never store keys:
- In source code
- In plain text config files
- In the same location as encrypted data
- In environment variables (visible in process lists)
Recommended storage:
| Environment | Solution |
|---|---|
| Development | File with restricted permissions (0600) |
| Production (cloud) | KMS (AWS KMS, GCP KMS, Azure Key Vault) |
| Production (on-prem) | HSM or dedicated secrets manager |
| User devices | OS keychain (macOS Keychain, Windows Credential Manager) |
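For the development row above, a minimal sketch of a key file restricted to the owner (mode 0600 on Unix, set atomically at creation). The helper names are illustrative; production deployments should use a KMS instead.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Write a key file readable only by the owner. Fails if the file exists,
/// so an existing key is never silently overwritten.
pub fn write_dev_key(path: &Path, key: &[u8; 32]) -> io::Result<()> {
    let mut opts = OpenOptions::new();
    opts.write(true).create_new(true);
    #[cfg(unix)]
    {
        use std::os::unix::fs::OpenOptionsExt;
        opts.mode(0o600); // owner read/write only, applied at creation time
    }
    opts.open(path)?.write_all(key)
}

pub fn read_dev_key(path: &Path) -> io::Result<[u8; 32]> {
    let bytes = fs::read(path)?;
    if bytes.len() != 32 {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "key must be 32 bytes"));
    }
    let mut key = [0u8; 32];
    key.copy_from_slice(&bytes);
    Ok(key)
}
```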
#![allow(unused)]
fn main() {
/// Key storage abstraction.
pub trait KeyStore: Send + Sync {
/// Retrieve a key by ID.
fn get_key(&self, key_id: &str) -> Result<[u8; 32], KeyError>;
/// Store a new key, returns key ID.
fn store_key(&self, key: &[u8; 32]) -> Result<String, KeyError>;
/// Delete a key (after rotation).
fn delete_key(&self, key_id: &str) -> Result<(), KeyError>;
/// List all key IDs.
fn list_keys(&self) -> Result<Vec<String>, KeyError>;
}
/// AWS KMS implementation.
pub struct AwsKmsKeyStore {
client: aws_sdk_kms::Client,
master_key_id: String,
}
impl KeyStore for AwsKmsKeyStore {
fn get_key(&self, key_id: &str) -> Result<[u8; 32], KeyError> {
// Keys are stored encrypted in DynamoDB/S3; decrypt using KMS.
// The AWS SDK is async, so this sync trait method blocks on the call.
let encrypted = self.fetch_encrypted_key(key_id)?;
let decrypted = tokio::runtime::Handle::current().block_on(
self.client.decrypt()
.key_id(&self.master_key_id)
.ciphertext_blob(Blob::new(encrypted))
.send()
)?;
let plaintext = decrypted.plaintext().unwrap();
let mut key = [0u8; 32];
key.copy_from_slice(&plaintext.as_ref()[..32]);
Ok(key)
}
// ... other methods
}
}
Key Rotation
Regular key rotation limits damage from key compromise:
#![allow(unused)]
fn main() {
impl EncryptedSqliteBackend {
/// Rotate encryption key.
///
/// This re-encrypts the entire database with a new key.
/// Can take a long time for large databases.
pub fn rotate_key(&self, new_key: &EncryptionKey) -> Result<(), FsError> {
let new_key_sql = match new_key {
EncryptionKey::Raw(bytes) => format!("\"x'{}'\"", hex::encode(bytes)),
EncryptionKey::Passphrase(pass) => format!("'{}'", escape_sql(pass)),
};
// SQLCipher's PRAGMA rekey re-encrypts the database
self.conn.execute_batch(&format!("PRAGMA rekey = {};", new_key_sql))
.map_err(|e| FsError::Backend(format!("key rotation failed: {}", e)))?;
Ok(())
}
}
/// Key rotation schedule.
pub struct KeyRotationPolicy {
/// Maximum age of a key before rotation.
pub max_key_age: Duration,
/// Maximum amount of data encrypted with one key.
pub max_data_encrypted: u64,
/// Whether to auto-rotate.
pub auto_rotate: bool,
}
impl Default for KeyRotationPolicy {
fn default() -> Self {
Self {
max_key_age: Duration::from_secs(90 * 24 * 60 * 60), // 90 days
max_data_encrypted: 1024 * 1024 * 1024 * 100, // 100 GB
auto_rotate: true,
}
}
}
}
Rotation workflow:
- Generate new key
- Store new key in key store
- Re-encrypt database with PRAGMA rekey
- Update key ID reference
- Audit log the rotation
- After retention period, delete old key
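Deciding when that workflow fires follows directly from the policy fields shown above. A minimal sketch, with a hypothetical `KeyStatus` type standing in for whatever metadata your key store tracks:

```rust
use std::time::Duration;

/// Hypothetical per-key metadata tracked by the key store.
pub struct KeyStatus {
    pub age: Duration,
    pub bytes_encrypted: u64,
}

/// Rotate when either policy limit is reached, matching the
/// max_key_age / max_data_encrypted fields of KeyRotationPolicy.
pub fn needs_rotation(
    status: &KeyStatus,
    max_key_age: Duration,
    max_data_encrypted: u64,
) -> bool {
    status.age >= max_key_age || status.bytes_encrypted >= max_data_encrypted
}
```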
Multi-Tenant Isolation
Isolation Strategies
| Strategy | Isolation Level | Complexity | Use Case |
|---|---|---|---|
| Separate databases | Strongest | Low | Few large tenants |
| Separate tables | Strong | Medium | Many small tenants |
| Row-level | Moderate | High | Shared infrastructure |
Recommendation: Separate databases (one SQLite file per tenant).
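A hypothetical path layout for the separate-databases strategy: one SQLite file per tenant under a common root. Sanitizing the tenant ID prevents it from smuggling path separators into the layout.

```rust
use std::path::PathBuf;

/// Map a tenant ID to its database file, e.g. "/var/lib/anyfs/acme.db".
/// Characters outside [A-Za-z0-9_-] are stripped so a hostile ID like
/// "../evil" cannot escape the root directory.
pub fn tenant_db_path(root: &str, tenant_id: &str) -> PathBuf {
    let safe: String = tenant_id
        .chars()
        .filter(|c| c.is_ascii_alphanumeric() || *c == '-' || *c == '_')
        .collect();
    PathBuf::from(root).join(format!("{safe}.db"))
}
```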
Per-Tenant Keys
Each tenant should have their own encryption key:
#![allow(unused)]
fn main() {
pub struct MultiTenantBackend {
key_store: Arc<dyn KeyStore>,
tenant_backends: RwLock<HashMap<TenantId, Arc<EncryptedSqliteBackend>>>,
}
impl MultiTenantBackend {
/// Get or create backend for a tenant.
pub fn get_tenant(&self, tenant_id: &TenantId) -> Result<Arc<EncryptedSqliteBackend>, FsError> {
// Check cache
{
let backends = self.tenant_backends.read().unwrap();
if let Some(backend) = backends.get(tenant_id) {
return Ok(backend.clone());
}
}
// Create new backend
let key = self.key_store.get_key(&tenant_id.key_id())?;
let path = self.tenant_db_path(tenant_id);
let backend = Arc::new(EncryptedSqliteBackend::open(&path, &EncryptionKey::Raw(key))?);
// Cache it
let mut backends = self.tenant_backends.write().unwrap();
backends.insert(tenant_id.clone(), backend.clone());
Ok(backend)
}
}
}
Cross-Tenant Dedup Considerations
Warning: Cross-tenant deduplication can leak information.
Tenant A uploads secret.pdf (hash: abc123)
Tenant B uploads same file → instantly deduped → B knows A has that file
Options:
| Approach | Dedup Savings | Privacy |
|---|---|---|
| No cross-tenant dedup | None | Full privacy |
| Convergent encryption | Partial | Leaks file existence |
| Per-tenant keys before hash | None | Full privacy |
Recommendation: Only deduplicate within a tenant, not across tenants.
#![allow(unused)]
fn main() {
// Pattern for any hybrid backend (IndexedBackend or custom implementations)
// See hybrid-backend-design.md for the full pattern
impl IndexedBackend {
fn blob_id_for_tenant(&self, tenant_id: &TenantId, data: &[u8]) -> String {
// Include tenant ID in hash to prevent cross-tenant dedup
let mut hasher = Sha256::new();
hasher.update(tenant_id.as_bytes());
hasher.update(data);
hex::encode(hasher.finalize())
}
}
}
Audit Logging
What to Log
| Event | Severity | Data to Capture |
|---|---|---|
| File read | Info | path, user, timestamp, size |
| File write | Info | path, user, timestamp, size, hash |
| File delete | Warning | path, user, timestamp |
| Permission change | Warning | path, user, old/new perms |
| Login success | Info | user, IP, timestamp |
| Login failure | Warning | user, IP, timestamp, reason |
| Key rotation | Critical | key_id, user, timestamp |
| Admin action | Critical | action, user, timestamp |
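The table maps naturally onto an event struct. A minimal sketch — field names are assumptions chosen to line up with the audit log schema and logger examples in this section; a real logger would likely carry a richer structured `details` payload.

```rust
/// One audit log entry, mirroring the columns above.
#[derive(Debug, Clone)]
pub struct AuditEvent {
    pub event_type: String,       // e.g. "file_write"
    pub severity: String,         // "info", "warning", or "critical"
    pub actor: Option<String>,    // user ID, or None for system events
    pub actor_ip: Option<String>,
    pub resource: Option<String>, // path or resource ID
    pub action: String,
    pub details: String,          // JSON-encoded payload
}
```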
Audit Log Schema
CREATE TABLE audit_log (
seq INTEGER PRIMARY KEY AUTOINCREMENT,
timestamp INTEGER NOT NULL,
event_type TEXT NOT NULL,
severity TEXT NOT NULL, -- 'info', 'warning', 'critical'
actor TEXT, -- user ID or 'system'
actor_ip TEXT,
resource TEXT, -- path or resource ID
action TEXT NOT NULL,
details TEXT, -- JSON
signature BLOB -- HMAC for tamper detection
);
CREATE INDEX idx_audit_timestamp ON audit_log(timestamp);
CREATE INDEX idx_audit_actor ON audit_log(actor);
CREATE INDEX idx_audit_resource ON audit_log(resource);
Tamper-Evident Logging
Sign audit entries to detect tampering:
#![allow(unused)]
fn main() {
use hmac::{Hmac, Mac};
use sha2::Sha256;
type HmacSha256 = Hmac<Sha256>;
pub struct AuditLogger {
conn: Connection,
signing_key: [u8; 32],
prev_signature: RwLock<Vec<u8>>, // Chain signatures
}
impl AuditLogger {
pub fn log(&self, event: AuditEvent) -> Result<(), FsError> {
let timestamp = SystemTime::now()
.duration_since(UNIX_EPOCH)
.unwrap()
.as_secs() as i64;
let details = serde_json::to_string(&event.details)?;
// Create signature (includes previous signature for chaining)
let prev_sig = self.prev_signature.read().unwrap().clone();
let signature = self.sign_entry(timestamp, &event, &details, &prev_sig);
self.conn.execute(
"INSERT INTO audit_log (timestamp, event_type, severity, actor, actor_ip, resource, action, details, signature)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
params![
timestamp,
event.event_type,
event.severity,
event.actor,
event.actor_ip,
event.resource,
event.action,
details,
&signature[..],
],
)?;
// Update chain
*self.prev_signature.write().unwrap() = signature;
Ok(())
}
fn sign_entry(
&self,
timestamp: i64,
event: &AuditEvent,
details: &str,
prev_sig: &[u8],
) -> Vec<u8> {
let mut mac = HmacSha256::new_from_slice(&self.signing_key).unwrap();
mac.update(&timestamp.to_le_bytes());
mac.update(event.event_type.as_bytes());
mac.update(event.action.as_bytes());
mac.update(details.as_bytes());
mac.update(prev_sig); // Chain to previous entry
mac.finalize().into_bytes().to_vec()
}
/// Verify audit log integrity.
pub fn verify_integrity(&self) -> Result<bool, FsError> {
let mut prev_sig = Vec::new();
let mut stmt = self.conn.prepare(
"SELECT timestamp, event_type, severity, actor, actor_ip, resource, action, details, signature
FROM audit_log ORDER BY seq"
)?;
let rows = stmt.query_map([], |row| {
Ok(AuditRow {
timestamp: row.get(0)?,
event_type: row.get(1)?,
severity: row.get(2)?,
actor: row.get(3)?,
actor_ip: row.get(4)?,
resource: row.get(5)?,
action: row.get(6)?,
details: row.get(7)?,
signature: row.get(8)?,
})
})?;
for row in rows {
let row = row?;
let expected_sig = self.sign_entry(
row.timestamp,
&row.to_event(),
&row.details,
&prev_sig,
);
if expected_sig != row.signature {
return Ok(false); // Tampered!
}
prev_sig = row.signature;
}
Ok(true)
}
}
}
Audit Log Retention
#![allow(unused)]
fn main() {
impl AuditLogger {
/// Rotate old audit logs to cold storage.
pub fn rotate(&self, max_age: Duration) -> Result<RotationStats, FsError> {
let cutoff = SystemTime::now()
.duration_since(UNIX_EPOCH)
.unwrap()
.as_secs() as i64 - max_age.as_secs() as i64;
// Export old entries to archive
let old_entries: Vec<AuditRow> = self.conn.prepare(
"SELECT * FROM audit_log WHERE timestamp < ?"
)?.query_map([cutoff], |row| /* ... */)?
.collect::<Result<Vec<_>, _>>()?;
// Write to archive file (compressed, signed)
self.write_archive(&old_entries)?;
// Delete from active log
let deleted = self.conn.execute(
"DELETE FROM audit_log WHERE timestamp < ?",
[cutoff],
)?;
Ok(RotationStats { archived: old_entries.len(), deleted })
}
}
}
Access Control
Path-Based Access Control
Use PathFilterLayer middleware for path-based restrictions:
#![allow(unused)]
fn main() {
use anyfs::PathFilterLayer;
use anyfs_sqlite::SqliteBackend; // Ecosystem crate
let backend = SqliteBackend::open("data.db")?
.layer(PathFilterLayer::builder()
// Allow specific directories
.allow("/home/{user}/**")
.allow("/shared/**")
// Block sensitive paths
.deny("**/.env")
.deny("**/.git/**")
.deny("**/node_modules/**")
// Block by extension
.deny("**/*.key")
.deny("**/*.pem")
.build());
}
Role-Based Access Control
Implement RBAC at the application layer:
#![allow(unused)]
fn main() {
#[derive(Debug, Clone)]
pub enum Role {
Admin,
ReadWrite,
ReadOnly,
Custom(Vec<Permission>),
}
#[derive(Debug, Clone)]
pub enum Permission {
Read(PathPattern),
Write(PathPattern),
Delete(PathPattern),
Admin,
}
pub struct RbacMiddleware<B> {
inner: B,
user_roles: Arc<dyn RoleProvider>,
}
impl<B: FsRead> FsRead for RbacMiddleware<B> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
let user = current_user()?;
let role = self.user_roles.get_role(&user)?;
if !role.can_read(path) {
return Err(FsError::AccessDenied {
path: path.to_path_buf(),
reason: "insufficient permissions".into(),
});
}
self.inner.read(path)
}
}
impl<B: FsWrite> FsWrite for RbacMiddleware<B> {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
let user = current_user()?;
let role = self.user_roles.get_role(&user)?;
if !role.can_write(path) {
return Err(FsError::AccessDenied {
path: path.to_path_buf(),
reason: "write permission denied".into(),
});
}
self.inner.write(path, data)
}
}
}
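The middleware above calls `can_read`/`can_write` on `Role` without showing them. A minimal sketch of those helpers, using a plain path-prefix match as a stand-in for the `PathPattern` type (which is not defined in this excerpt) and a simplified `Custom` variant:

```rust
/// Simplified roles; the real Custom variant would hold Permission values.
pub enum Role {
    Admin,
    ReadWrite,
    ReadOnly,
    /// (path prefix, writable) pairs — an illustrative stand-in.
    Custom(Vec<(String, bool)>),
}

impl Role {
    pub fn can_read(&self, path: &str) -> bool {
        match self {
            Role::Admin | Role::ReadWrite | Role::ReadOnly => true,
            Role::Custom(rules) => {
                rules.iter().any(|(prefix, _)| path.starts_with(prefix))
            }
        }
    }

    pub fn can_write(&self, path: &str) -> bool {
        match self {
            Role::Admin | Role::ReadWrite => true,
            Role::ReadOnly => false,
            Role::Custom(rules) => {
                rules.iter().any(|(prefix, w)| *w && path.starts_with(prefix))
            }
        }
    }
}
```

Keeping the decision logic on `Role` means the middleware stays a thin gate: it resolves the user, asks the role, and either forwards to the inner backend or returns `AccessDenied`.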
Network Security
TLS Configuration
Always use TLS for network communication:
#![allow(unused)]
fn main() {
use tonic::transport::{Server, ServerTlsConfig, Identity, Certificate};
pub async fn serve_with_tls(
backend: impl Fs,
addr: &str,
cert_path: &Path,
key_path: &Path,
) -> Result<(), Box<dyn Error>> {
let cert = std::fs::read_to_string(cert_path)?;
let key = std::fs::read_to_string(key_path)?;
let identity = Identity::from_pem(cert, key);
let tls_config = ServerTlsConfig::new()
.identity(identity);
Server::builder()
.tls_config(tls_config)?
.add_service(FsServiceServer::new(FsServer::new(backend)))
.serve(addr.parse()?)
.await?;
Ok(())
}
}
Client Certificate Authentication (mTLS)
For high-security deployments, require client certificates:
#![allow(unused)]
fn main() {
use tonic::transport::ClientTlsConfig;
pub async fn connect_with_mtls(
addr: &str,
ca_cert: &Path,
client_cert: &Path,
client_key: &Path,
) -> Result<FsServiceClient<Channel>, Box<dyn Error>> {
let ca = std::fs::read_to_string(ca_cert)?;
let cert = std::fs::read_to_string(client_cert)?;
let key = std::fs::read_to_string(client_key)?;
let tls_config = ClientTlsConfig::new()
.ca_certificate(Certificate::from_pem(ca))
.identity(Identity::from_pem(cert, key));
let channel = Channel::from_shared(addr.to_string())?
.tls_config(tls_config)?
.connect()
.await?;
Ok(FsServiceClient::new(channel))
}
}
Security Checklist
Development
- No secrets in source code
- No secrets in logs
- Input validation on all paths
- Error messages don’t leak sensitive info
Deployment
- TLS enabled for all network traffic
- Encryption at rest (SQLCipher)
- Keys stored in secure key management system
- Key rotation policy defined and automated
- Audit logging enabled
- Rate limiting configured
- Quotas configured
Operations
- Regular security audits
- Vulnerability scanning
- Audit log review
- Key rotation executed
- Backup encryption verified
- Access reviews (who has what permissions)
Multi-Tenant
- Tenant isolation verified
- Per-tenant encryption keys
- No cross-tenant dedup (or risk accepted)
- Tenant data segregation in backups
Summary
| Layer | Protection |
|---|---|
| Transport | TLS, mTLS |
| Authentication | Tokens, certificates |
| Authorization | RBAC, PathFilter |
| Data at rest | SQLCipher encryption |
| Key management | KMS, rotation |
| Audit | Tamper-evident logging |
| Isolation | Per-tenant DBs and keys |
Security is not optional. Build it in from the start.
Testing Guide
Comprehensive testing strategy for AnyFS
Overview
AnyFS uses a layered testing approach:
| Layer | What it tests | Run with |
|---|---|---|
| Unit tests | Individual components | cargo test |
| Conformance tests | Backend trait compliance | cargo test --features conformance |
| Integration tests | Full stack behavior | cargo test --test integration |
| Stress tests | Concurrency & limits | cargo test --release -- --ignored |
| Platform tests | Cross-platform behavior | CI matrix |
1. Backend Conformance Tests
Every backend must pass the same conformance suite. This ensures backends are interchangeable.
Running Conformance Tests
#![allow(unused)]
fn main() {
use anyfs_test::{run_conformance_suite, ConformanceLevel};
#[test]
fn memory_backend_conformance() {
run_conformance_suite(
MemoryBackend::new(),
ConformanceLevel::Fs, // or FsFull, FsFuse, FsPosix
);
}
#[test]
fn vrootfs_backend_conformance() {
let temp = tempfile::tempdir().unwrap();
run_conformance_suite(
VRootFsBackend::new(temp.path()).unwrap(),
ConformanceLevel::FsFull,
);
}
}
Conformance Levels
FsPosix ──▶ FsHandles, FsLock, FsXattr tests
│
FsFuse ──▶ FsInode tests (path_to_inode, lookup, etc.)
│
FsFull ──▶ FsLink, FsPermissions, FsSync, FsStats tests
│
Fs ──▶ FsRead, FsWrite, FsDir tests (REQUIRED for all)
Core Tests (Fs level)
#![allow(unused)]
fn main() {
#[test]
fn test_write_and_read() {
let backend = create_backend();
backend.write(std::path::Path::new("/file.txt"), b"hello world").unwrap();
let content = backend.read(std::path::Path::new("/file.txt")).unwrap();
assert_eq!(content, b"hello world");
}
#[test]
fn test_read_nonexistent_returns_not_found() {
let backend = create_backend();
let result = backend.read(std::path::Path::new("/nonexistent.txt"));
assert!(matches!(result, Err(FsError::NotFound { .. })));
}
#[test]
fn test_create_dir_and_list() {
let backend = create_backend();
backend.create_dir(std::path::Path::new("/mydir")).unwrap();
backend.write(std::path::Path::new("/mydir/file.txt"), b"data").unwrap();
let entries: Vec<_> = backend.read_dir(std::path::Path::new("/mydir")).unwrap()
.collect::<Result<Vec<_>, _>>().unwrap();
assert_eq!(entries.len(), 1);
assert_eq!(entries[0].name, "file.txt");
}
#[test]
fn test_create_dir_all() {
let backend = create_backend();
backend.create_dir_all(std::path::Path::new("/a/b/c/d")).unwrap();
assert!(backend.exists(std::path::Path::new("/a/b/c/d")).unwrap());
}
#[test]
fn test_remove_file() {
let backend = create_backend();
backend.write(std::path::Path::new("/file.txt"), b"data").unwrap();
backend.remove_file(std::path::Path::new("/file.txt")).unwrap();
assert!(!backend.exists(std::path::Path::new("/file.txt")).unwrap());
}
#[test]
fn test_remove_dir_all() {
let backend = create_backend();
backend.create_dir_all(std::path::Path::new("/a/b/c")).unwrap();
backend.write(std::path::Path::new("/a/b/c/file.txt"), b"data").unwrap();
backend.remove_dir_all(std::path::Path::new("/a")).unwrap();
assert!(!backend.exists(std::path::Path::new("/a")).unwrap());
}
#[test]
fn test_rename() {
let backend = create_backend();
backend.write(std::path::Path::new("/old.txt"), b"data").unwrap();
backend.rename(std::path::Path::new("/old.txt"), std::path::Path::new("/new.txt")).unwrap();
assert!(!backend.exists(std::path::Path::new("/old.txt")).unwrap());
assert_eq!(backend.read(std::path::Path::new("/new.txt")).unwrap(), b"data");
}
#[test]
fn test_copy() {
let backend = create_backend();
backend.write(std::path::Path::new("/original.txt"), b"data").unwrap();
backend.copy(std::path::Path::new("/original.txt"), std::path::Path::new("/copy.txt")).unwrap();
assert_eq!(backend.read(std::path::Path::new("/original.txt")).unwrap(), b"data");
assert_eq!(backend.read(std::path::Path::new("/copy.txt")).unwrap(), b"data");
}
#[test]
fn test_metadata() {
let backend = create_backend();
backend.write(std::path::Path::new("/file.txt"), b"hello").unwrap();
let meta = backend.metadata(std::path::Path::new("/file.txt")).unwrap();
assert_eq!(meta.size, 5);
assert!(meta.file_type.is_file());
}
#[test]
fn test_append() {
let backend = create_backend();
backend.write(std::path::Path::new("/file.txt"), b"hello").unwrap();
backend.append(std::path::Path::new("/file.txt"), b" world").unwrap();
assert_eq!(backend.read(std::path::Path::new("/file.txt")).unwrap(), b"hello world");
}
#[test]
fn test_truncate() {
let backend = create_backend();
backend.write(std::path::Path::new("/file.txt"), b"hello world").unwrap();
backend.truncate(std::path::Path::new("/file.txt"), 5).unwrap();
assert_eq!(backend.read(std::path::Path::new("/file.txt")).unwrap(), b"hello");
}
#[test]
fn test_read_range() {
let backend = create_backend();
backend.write(std::path::Path::new("/file.txt"), b"hello world").unwrap();
let partial = backend.read_range(std::path::Path::new("/file.txt"), 6, 5).unwrap();
assert_eq!(partial, b"world");
}
}
Extended Tests (FsFull level)
#![allow(unused)]
fn main() {
#[test]
fn test_symlink() {
let backend = create_backend();
backend.write(std::path::Path::new("/target.txt"), b"data").unwrap();
backend.symlink(std::path::Path::new("/target.txt"), std::path::Path::new("/link.txt")).unwrap();
// read_link returns the target
assert_eq!(backend.read_link(std::path::Path::new("/link.txt")).unwrap(), Path::new("/target.txt"));
// reading the symlink follows it
assert_eq!(backend.read(std::path::Path::new("/link.txt")).unwrap(), b"data");
}
#[test]
fn test_hard_link() {
let backend = create_backend();
backend.write(std::path::Path::new("/original.txt"), b"data").unwrap();
backend.hard_link(std::path::Path::new("/original.txt"), std::path::Path::new("/hardlink.txt")).unwrap();
// Both point to same data
assert_eq!(backend.read(std::path::Path::new("/hardlink.txt")).unwrap(), b"data");
// Metadata shows nlink > 1
let meta = backend.metadata(std::path::Path::new("/original.txt")).unwrap();
assert!(meta.nlink >= 2);
}
#[test]
fn test_symlink_metadata() {
let backend = create_backend();
backend.write(std::path::Path::new("/target.txt"), b"data").unwrap();
backend.symlink(std::path::Path::new("/target.txt"), std::path::Path::new("/link.txt")).unwrap();
// symlink_metadata returns metadata of the symlink itself
let meta = backend.symlink_metadata(std::path::Path::new("/link.txt")).unwrap();
assert!(meta.file_type.is_symlink());
}
#[test]
fn test_set_permissions() {
let backend = create_backend();
backend.write(std::path::Path::new("/file.txt"), b"data").unwrap();
backend.set_permissions(std::path::Path::new("/file.txt"), Permissions::from_mode(0o644)).unwrap();
let meta = backend.metadata(std::path::Path::new("/file.txt")).unwrap();
assert_eq!(meta.permissions.mode() & 0o777, 0o644);
}
#[test]
fn test_sync() {
let backend = create_backend();
backend.write(std::path::Path::new("/file.txt"), b"data").unwrap();
// Should not error
backend.sync().unwrap();
backend.fsync(std::path::Path::new("/file.txt")).unwrap();
}
#[test]
fn test_statfs() {
let backend = create_backend();
let stats = backend.statfs().unwrap();
// No universal invariant on total_bytes (memory backends may report 0);
// the real assertion is that statfs() succeeded above
let _ = stats.total_bytes;
}
}
2. Middleware Tests
Each middleware is tested in isolation and in combination.
Quota Tests
#![allow(unused)]
fn main() {
#[test]
fn test_quota_blocks_when_exceeded() {
let backend = MemoryBackend::new()
.layer(QuotaLayer::builder().max_total_size(100).build());
let fs = FileStorage::new(backend);
let result = fs.write("/big.txt", &[0u8; 200]);
assert!(matches!(result, Err(FsError::QuotaExceeded { .. })));
}
#[test]
fn test_quota_allows_within_limit() {
let backend = MemoryBackend::new()
.layer(QuotaLayer::builder().max_total_size(1000).build());
let fs = FileStorage::new(backend);
fs.write("/small.txt", &[0u8; 100]).unwrap();
assert!(fs.exists("/small.txt").unwrap());
}
#[test]
fn test_quota_tracks_deletes() {
let backend = MemoryBackend::new()
.layer(QuotaLayer::builder().max_total_size(100).build());
let fs = FileStorage::new(backend);
fs.write("/file.txt", &[0u8; 50]).unwrap();
fs.remove_file("/file.txt").unwrap();
// Should be able to write again after delete
fs.write("/file2.txt", &[0u8; 50]).unwrap();
}
#[test]
fn test_quota_max_file_size() {
let backend = MemoryBackend::new()
.layer(QuotaLayer::builder().max_file_size(50).build());
let fs = FileStorage::new(backend);
let result = fs.write("/big.txt", &[0u8; 100]);
assert!(matches!(result, Err(FsError::QuotaExceeded { .. })));
}
#[test]
fn test_quota_streaming_write() {
let backend = MemoryBackend::new()
.layer(QuotaLayer::builder().max_total_size(100).build());
let fs = FileStorage::new(backend);
let mut writer = fs.open_write("/file.txt").unwrap();
writer.write_all(&[0u8; 50]).unwrap();
writer.write_all(&[0u8; 50]).unwrap();
drop(writer);
// Next write should fail
let result = fs.write("/file2.txt", &[0u8; 10]);
assert!(matches!(result, Err(FsError::QuotaExceeded { .. })));
}
}
Restrictions Tests
#![allow(unused)]
fn main() {
#[test]
fn test_restrictions_blocks_permissions() {
let backend = MemoryBackend::new()
.layer(RestrictionsLayer::builder().deny_permissions().build());
let fs = FileStorage::new(backend);
fs.write("/file.txt", b"data").unwrap();
let result = fs.set_permissions("/file.txt", Permissions::from_mode(0o644));
assert!(matches!(result, Err(FsError::FeatureNotEnabled { .. })));
}
#[test]
fn test_restrictions_allows_links() {
// Restrictions doesn't block FsLink - capability is via trait bounds
let backend = MemoryBackend::new()
.layer(RestrictionsLayer::builder().deny_permissions().build());
let fs = FileStorage::new(backend);
fs.write("/target.txt", b"data").unwrap();
fs.symlink("/target.txt", "/link.txt").unwrap(); // Works - MemoryBackend: FsLink
fs.hard_link("/target.txt", "/hardlink.txt").unwrap(); // Works too
}
#[test]
fn test_restrictions_blocks_mode_changes() {
let backend = MemoryBackend::new()
.layer(RestrictionsLayer::builder().deny_permissions().build());
let fs = FileStorage::new(backend);
fs.write("/file.txt", b"data").unwrap();
let result = fs.set_permissions("/file.txt", Permissions::from_mode(0o777));
assert!(matches!(result, Err(FsError::FeatureNotEnabled { .. })));
}
}
PathFilter Tests
#![allow(unused)]
fn main() {
#[test]
fn test_pathfilter_allows_matching() {
let backend = MemoryBackend::new()
.layer(PathFilterLayer::builder().allow("/workspace/**").build());
let fs = FileStorage::new(backend);
fs.create_dir_all("/workspace/project").unwrap();
fs.write("/workspace/project/file.txt", b"data").unwrap();
}
#[test]
fn test_pathfilter_blocks_non_matching() {
let backend = MemoryBackend::new()
.layer(PathFilterLayer::builder().allow("/workspace/**").build());
let fs = FileStorage::new(backend);
let result = fs.write("/etc/passwd", b"data");
assert!(matches!(result, Err(FsError::AccessDenied { .. })));
}
#[test]
fn test_pathfilter_deny_overrides_allow() {
let backend = MemoryBackend::new()
.layer(PathFilterLayer::builder()
.allow("/workspace/**")
.deny("**/.env")
.build());
let fs = FileStorage::new(backend);
let result = fs.write("/workspace/.env", b"SECRET=xxx");
assert!(matches!(result, Err(FsError::AccessDenied { .. })));
}
#[test]
fn test_pathfilter_read_dir_filters() {
let mut inner = MemoryBackend::new();
inner.write(std::path::Path::new("/workspace/allowed.txt"), b"data").unwrap();
inner.write(std::path::Path::new("/workspace/.env"), b"secret").unwrap();
let backend = inner
.layer(PathFilterLayer::builder()
.allow("/workspace/**")
.deny("**/.env")
.build());
let fs = FileStorage::new(backend);
let entries: Vec<_> = fs.read_dir("/workspace").unwrap()
.collect::<Result<Vec<_>, _>>().unwrap();
// .env should be filtered out
assert_eq!(entries.len(), 1);
assert_eq!(entries[0].name, "allowed.txt");
}
}
ReadOnly Tests
#![allow(unused)]
fn main() {
#[test]
fn test_readonly_blocks_writes() {
let mut inner = MemoryBackend::new();
inner.write(std::path::Path::new("/file.txt"), b"original").unwrap();
let backend = ReadOnly::new(inner);
let fs = FileStorage::new(backend);
let result = fs.write("/file.txt", b"modified");
assert!(matches!(result, Err(FsError::ReadOnly { .. })));
let result = fs.remove_file("/file.txt");
assert!(matches!(result, Err(FsError::ReadOnly { .. })));
}
#[test]
fn test_readonly_allows_reads() {
let mut inner = MemoryBackend::new();
inner.write(std::path::Path::new("/file.txt"), b"data").unwrap();
let backend = ReadOnly::new(inner);
let fs = FileStorage::new(backend);
assert_eq!(fs.read("/file.txt").unwrap(), b"data");
}
}
Middleware Composition Tests
#![allow(unused)]
fn main() {
#[test]
fn test_middleware_composition_order() {
// Quota inside, Restrictions outside
let backend = MemoryBackend::new()
.layer(QuotaLayer::builder().max_total_size(100).build())
.layer(RestrictionsLayer::builder().deny_permissions().build());
let fs = FileStorage::new(backend);
// Write should hit quota
let result = fs.write("/big.txt", &[0u8; 200]);
assert!(matches!(result, Err(FsError::QuotaExceeded { .. })));
}
#[test]
fn test_layer_syntax() {
// All configurable middleware use builder pattern (per ADR-022)
let backend = MemoryBackend::new()
.layer(QuotaLayer::builder().max_total_size(1000).build())
.layer(RestrictionsLayer::builder().deny_permissions().build())
.layer(TracingLayer::new()); // TracingLayer has sensible defaults
let fs = FileStorage::new(backend);
fs.write("/test.txt", b"data").unwrap();
}
}
3. FileStorage Tests
#![allow(unused)]
fn main() {
#[test]
fn test_filestorage_type_inference() {
// Type should be inferred
let fs = FileStorage::new(MemoryBackend::new());
// No explicit type needed
}
#[test]
fn test_filestorage_wrapper_types() {
// Users who need type-safe domain separation create wrapper types
struct SandboxFs(FileStorage<MemoryBackend>);
struct ProductionFs(FileStorage<MemoryBackend>);
let sandbox = SandboxFs(FileStorage::new(MemoryBackend::new()));
let prod = ProductionFs(FileStorage::new(MemoryBackend::new()));
fn only_sandbox(_fs: &SandboxFs) {}
only_sandbox(&sandbox); // Compiles
// only_sandbox(&prod); // Would not compile - different type
}
}
4. Integration Tests (Real Filesystem)
Tests that use real filesystem backends (VRootFsBackend, tempfile) are integration tests, not unit tests.
#![allow(unused)]
fn main() {
#[test]
fn test_filestorage_boxed_with_real_fs() {
// This is an INTEGRATION test - uses real filesystem via tempfile
let fs1 = FileStorage::new(MemoryBackend::new()).boxed();
let temp = tempfile::tempdir().unwrap();
let fs2 = FileStorage::with_resolver(
VRootFsBackend::new(temp.path()).unwrap(),
NoOpResolver
).boxed();
// Both can be stored in same collection
let _filesystems: Vec<FileStorage<Box<dyn Fs>>> = vec![fs1, fs2];
}
}
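These integration tests follow the standard Cargo convention of compiling each file under `tests/` as its own crate. A typical layout (an illustrative sketch, not mandated by AnyFS) looks like:

```
my-backend/
├── src/
│   └── lib.rs           # unit tests live here, in #[cfg(test)] mod tests
└── tests/
    └── integration.rs   # integration tests; may use tempfile and VRootFsBackend
```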
5. Error Handling Tests
#![allow(unused)]
fn main() {
#[test]
fn test_error_not_found() {
let fs = FileStorage::new(MemoryBackend::new());
match fs.read("/nonexistent") {
Err(FsError::NotFound { path, operation }) => {
assert_eq!(path, Path::new("/nonexistent"));
assert_eq!(operation, "read");
}
_ => panic!("Expected NotFound error"),
}
}
#[test]
fn test_error_already_exists() {
let fs = FileStorage::new(MemoryBackend::new());
fs.create_dir("/mydir").unwrap();
match fs.create_dir("/mydir") {
Err(FsError::AlreadyExists { path, .. }) => {
assert_eq!(path, Path::new("/mydir"));
}
_ => panic!("Expected AlreadyExists error"),
}
}
#[test]
fn test_error_not_a_directory() {
let fs = FileStorage::new(MemoryBackend::new());
fs.write("/file.txt", b"data").unwrap();
match fs.read_dir("/file.txt") {
Err(FsError::NotADirectory { path }) => {
assert_eq!(path, Path::new("/file.txt"));
}
_ => panic!("Expected NotADirectory error"),
}
}
#[test]
fn test_error_directory_not_empty() {
let fs = FileStorage::new(MemoryBackend::new());
fs.create_dir("/mydir").unwrap();
fs.write("/mydir/file.txt", b"data").unwrap();
match fs.remove_dir("/mydir") {
Err(FsError::DirectoryNotEmpty { path }) => {
assert_eq!(path, Path::new("/mydir"));
}
_ => panic!("Expected DirectoryNotEmpty error"),
}
}
}
6. Concurrency Tests
#![allow(unused)]
fn main() {
#[test]
fn test_concurrent_reads() {
let mut backend = MemoryBackend::new();
backend.write(std::path::Path::new("/file.txt"), b"data").unwrap();
let backend = Arc::new(RwLock::new(backend));
let handles: Vec<_> = (0..10).map(|_| {
let backend = Arc::clone(&backend);
thread::spawn(move || {
let guard = backend.read().unwrap();
guard.read(std::path::Path::new("/file.txt")).unwrap()
})
}).collect();
for handle in handles {
assert_eq!(handle.join().unwrap(), b"data");
}
}
#[test]
fn test_concurrent_create_dir_all() {
let backend = Arc::new(Mutex::new(MemoryBackend::new()));
let handles: Vec<_> = (0..10).map(|_| {
let backend = Arc::clone(&backend);
thread::spawn(move || {
let mut guard = backend.lock().unwrap();
// Multiple threads creating same path should not race
guard.create_dir_all(std::path::Path::new("/a/b/c/d")).unwrap();
})
}).collect();
for handle in handles {
handle.join().unwrap();
}
assert!(backend.lock().unwrap().exists(std::path::Path::new("/a/b/c/d")).unwrap());
}
#[test]
#[ignore] // Run with: cargo test --release -- --ignored
fn stress_test_concurrent_operations() {
let backend = Arc::new(Mutex::new(MemoryBackend::new()));
let handles: Vec<_> = (0..100).map(|i| {
let backend = Arc::clone(&backend);
thread::spawn(move || {
for j in 0..100 {
let path = format!("/thread_{}/file_{}.txt", i, j);
let mut guard = backend.lock().unwrap();
guard.create_dir_all(std::path::Path::new(&format!("/thread_{}", i))).ok();
guard.write(std::path::Path::new(&path), b"data").unwrap();
drop(guard);
let guard = backend.lock().unwrap();
let _ = guard.read(std::path::Path::new(&path));
}
})
}).collect();
for handle in handles {
handle.join().unwrap();
}
}
}
7. Path Edge Case Tests
#![allow(unused)]
fn main() {
#[test]
fn test_path_normalization() {
let fs = FileStorage::new(MemoryBackend::new());
fs.write("/a/b/../c/file.txt", b"data").unwrap();
// Should be accessible via normalized path
assert_eq!(fs.read("/a/c/file.txt").unwrap(), b"data");
}
#[test]
fn test_double_slashes() {
let fs = FileStorage::new(MemoryBackend::new());
fs.write("//a//b//file.txt", b"data").unwrap();
assert_eq!(fs.read("/a/b/file.txt").unwrap(), b"data");
}
#[test]
fn test_root_path() {
let fs = FileStorage::new(MemoryBackend::new());
// ReadDirIter is an iterator, use collect_all() to check contents
let entries = fs.read_dir("/").unwrap().collect_all().unwrap();
assert!(entries.is_empty());
}
#[test]
fn test_empty_path_returns_error() {
let fs = FileStorage::new(MemoryBackend::new());
let result = fs.read("");
assert!(result.is_err());
}
#[test]
fn test_unicode_paths() {
let fs = FileStorage::new(MemoryBackend::new());
fs.write("/文件/データ.txt", b"data").unwrap();
assert_eq!(fs.read("/文件/データ.txt").unwrap(), b"data");
}
#[test]
fn test_paths_with_spaces() {
let fs = FileStorage::new(MemoryBackend::new());
fs.write("/my folder/my file.txt", b"data").unwrap();
assert_eq!(fs.read("/my folder/my file.txt").unwrap(), b"data");
}
}
8. No-Panic Guarantee Tests
#![allow(unused)]
fn main() {
#[test]
fn no_panic_missing_file() {
let fs = FileStorage::new(MemoryBackend::new());
let _ = fs.read("/missing"); // Should return Err, not panic
}
#[test]
fn no_panic_missing_parent() {
let fs = FileStorage::new(MemoryBackend::new());
let _ = fs.write("/missing/parent/file.txt", b"data"); // Should return Err
}
#[test]
fn no_panic_read_dir_on_file() {
let fs = FileStorage::new(MemoryBackend::new());
fs.write("/file.txt", b"data").unwrap();
let _ = fs.read_dir("/file.txt"); // Should return Err, not panic
}
#[test]
fn no_panic_remove_nonempty_dir() {
let fs = FileStorage::new(MemoryBackend::new());
fs.create_dir("/dir").unwrap();
fs.write("/dir/file.txt", b"data").unwrap();
let _ = fs.remove_dir("/dir"); // Should return Err, not panic
}
}
9. Symlink Security Tests
#![allow(unused)]
fn main() {
// Virtual backend symlink resolution (always follows for FsLink backends)
#[test]
fn test_virtual_backend_symlink_following() {
let mut backend = MemoryBackend::new();
backend.write(std::path::Path::new("/target.txt"), b"secret").unwrap();
backend.symlink(std::path::Path::new("/target.txt"), std::path::Path::new("/link.txt")).unwrap();
assert_eq!(backend.read(std::path::Path::new("/link.txt")).unwrap(), b"secret");
}
#[test]
fn test_symlink_chain_resolution() {
let mut backend = MemoryBackend::new();
backend.write(std::path::Path::new("/target.txt"), b"data").unwrap();
backend.symlink(std::path::Path::new("/target.txt"), std::path::Path::new("/link1.txt")).unwrap();
backend.symlink(std::path::Path::new("/link1.txt"), std::path::Path::new("/link2.txt")).unwrap();
// Should follow chain
assert_eq!(backend.read(std::path::Path::new("/link2.txt")).unwrap(), b"data");
}
#[test]
fn test_symlink_loop_detection() {
let mut backend = MemoryBackend::new();
backend.symlink(std::path::Path::new("/link2.txt"), std::path::Path::new("/link1.txt")).unwrap();
backend.symlink(std::path::Path::new("/link1.txt"), std::path::Path::new("/link2.txt")).unwrap();
let result = backend.read(std::path::Path::new("/link1.txt"));
assert!(matches!(result, Err(FsError::SymlinkLoop { .. })));
}
#[test]
fn test_virtual_symlink_cannot_escape() {
let mut backend = MemoryBackend::new();
// Create a symlink pointing "outside" - but in virtual backend, paths are just keys
backend.symlink(std::path::Path::new("../../../etc/passwd"), std::path::Path::new("/link.txt")).unwrap();
// Reading should fail (target doesn't exist), not read real /etc/passwd
let result = backend.read(std::path::Path::new("/link.txt"));
assert!(matches!(result, Err(FsError::NotFound { .. })));
}
}
VRootFsBackend Containment Tests
#![allow(unused)]
fn main() {
#[test]
fn test_vroot_prevents_path_traversal() {
let temp = tempfile::tempdir().unwrap();
let backend = VRootFsBackend::new(temp.path()).unwrap();
let fs = FileStorage::new(backend);
// Attempt to escape via ..
let result = fs.read("/../../../etc/passwd");
assert!(matches!(result, Err(FsError::AccessDenied { .. })));
}
#[test]
fn test_vroot_prevents_symlink_escape() {
let temp = tempfile::tempdir().unwrap();
std::fs::write(temp.path().join("file.txt"), b"data").unwrap();
// Create symlink pointing outside the jail
#[cfg(unix)]
std::os::unix::fs::symlink("/etc/passwd", temp.path().join("escape")).unwrap();
let backend = VRootFsBackend::new(temp.path()).unwrap();
let fs = FileStorage::new(backend);
// Reading should be blocked by strict-path
let result = fs.read("/escape");
assert!(matches!(result, Err(FsError::AccessDenied { .. })));
}
#[test]
fn test_vroot_allows_internal_symlinks() {
let temp = tempfile::tempdir().unwrap();
std::fs::write(temp.path().join("target.txt"), b"data").unwrap();
#[cfg(unix)]
std::os::unix::fs::symlink("target.txt", temp.path().join("link.txt")).unwrap();
let backend = VRootFsBackend::new(temp.path()).unwrap();
let fs = FileStorage::new(backend);
// Internal symlinks should work
assert_eq!(fs.read("/link.txt").unwrap(), b"data");
}
#[test]
fn test_vroot_canonicalizes_paths() {
let temp = tempfile::tempdir().unwrap();
let backend = VRootFsBackend::new(temp.path()).unwrap();
let fs = FileStorage::new(backend);
fs.create_dir("/a").unwrap();
fs.write("/a/file.txt", b"data").unwrap();
// Access via normalized path
assert_eq!(fs.read("/a/../a/./file.txt").unwrap(), b"data");
}
}
10. RateLimit Middleware Tests
#![allow(unused)]
fn main() {
#[test]
fn test_ratelimit_allows_within_limit() {
let backend = MemoryBackend::new()
.layer(RateLimitLayer::builder().max_ops(10).per_second().build());
let fs = FileStorage::new(backend);
// Should succeed within limit
for i in 0..5 {
fs.write(format!("/file{}.txt", i), b"data").unwrap();
}
}
#[test]
fn test_ratelimit_blocks_when_exceeded() {
let backend = MemoryBackend::new()
.layer(RateLimitLayer::builder().max_ops(3).per_second().build());
let fs = FileStorage::new(backend);
fs.write("/file1.txt", b"data").unwrap();
fs.write("/file2.txt", b"data").unwrap();
fs.write("/file3.txt", b"data").unwrap();
let result = fs.write("/file4.txt", b"data");
assert!(matches!(result, Err(FsError::RateLimitExceeded { .. })));
}
#[test]
fn test_ratelimit_resets_after_window() {
let backend = MemoryBackend::new()
.layer(RateLimitLayer::builder().max_ops(2).per(Duration::from_millis(100)).build());
let fs = FileStorage::new(backend);
fs.write("/file1.txt", b"data").unwrap();
fs.write("/file2.txt", b"data").unwrap();
// Wait for window to reset
std::thread::sleep(Duration::from_millis(150));
// Should succeed again
fs.write("/file3.txt", b"data").unwrap();
}
#[test]
fn test_ratelimit_counts_all_operations() {
let backend = MemoryBackend::new()
.layer(RateLimitLayer::builder().max_ops(3).per_second().build());
let fs = FileStorage::new(backend);
fs.write("/file.txt", b"data").unwrap(); // 1
let _ = fs.read("/file.txt"); // 2
let _ = fs.exists("/file.txt"); // 3
let result = fs.metadata("/file.txt");
assert!(matches!(result, Err(FsError::RateLimitExceeded { .. })));
}
}
11. Tracing Middleware Tests
#![allow(unused)]
fn main() {
use std::sync::{Arc, Mutex};
#[derive(Default)]
struct TestLogger {
logs: Arc<Mutex<Vec<String>>>,
}
impl TestLogger {
fn entries(&self) -> Vec<String> {
self.logs.lock().unwrap().clone()
}
}
#[test]
fn test_tracing_logs_operations() {
let logger = TestLogger::default();
let logs = Arc::clone(&logger.logs);
let backend = MemoryBackend::new()
.layer(TracingLayer::new()
.with_logger(move |op| {
logs.lock().unwrap().push(op.to_string());
}));
let fs = FileStorage::new(backend);
fs.write("/file.txt", b"data").unwrap();
fs.read("/file.txt").unwrap();
let entries = logger.entries();
assert!(entries.iter().any(|e| e.contains("write")));
assert!(entries.iter().any(|e| e.contains("read")));
}
#[test]
fn test_tracing_includes_path() {
let logger = TestLogger::default();
let logs = Arc::clone(&logger.logs);
let backend = MemoryBackend::new()
.layer(TracingLayer::new()
.with_logger(move |op| {
logs.lock().unwrap().push(op.to_string());
}));
let fs = FileStorage::new(backend);
fs.write("/important/secret.txt", b"data").unwrap();
let entries = logger.entries();
assert!(entries.iter().any(|e| e.contains("/important/secret.txt")));
}
#[test]
fn test_tracing_logs_errors() {
let logger = TestLogger::default();
let logs = Arc::clone(&logger.logs);
let backend = MemoryBackend::new()
.layer(TracingLayer::new()
.with_logger(move |op| {
logs.lock().unwrap().push(op.to_string());
}));
let fs = FileStorage::new(backend);
let _ = fs.read("/nonexistent.txt");
let entries = logger.entries();
assert!(entries.iter().any(|e| e.contains("NotFound") || e.contains("error")));
}
#[test]
fn test_tracing_with_span_context() {
use tracing::{info_span, Instrument};
use futures::FutureExt; // `now_or_never()` comes from the futures crate
let backend = MemoryBackend::new().layer(TracingLayer::new());
let fs = FileStorage::new(backend);
async {
fs.write("/async.txt", b"data").unwrap();
}
.instrument(info_span!("test_operation"))
.now_or_never();
}
}
12. Backend Interchangeability Tests
#![allow(unused)]
fn main() {
/// Ensure all backends can be used interchangeably
fn generic_filesystem_test<B: Fs>(mut backend: B) {
backend.create_dir(std::path::Path::new("/test")).unwrap();
backend.write(std::path::Path::new("/test/file.txt"), b"hello").unwrap();
assert_eq!(backend.read(std::path::Path::new("/test/file.txt")).unwrap(), b"hello");
backend.remove_dir_all(std::path::Path::new("/test")).unwrap();
assert!(!backend.exists(std::path::Path::new("/test")).unwrap());
}
#[test]
fn test_memory_backend_interchangeable() {
generic_filesystem_test(MemoryBackend::new());
}
#[test]
fn test_sqlite_backend_interchangeable() {
let (backend, _temp) = temp_sqlite_backend();
generic_filesystem_test(backend);
}
#[test]
fn test_vroot_backend_interchangeable() {
let temp = tempfile::tempdir().unwrap();
let backend = VRootFsBackend::new(temp.path()).unwrap();
generic_filesystem_test(backend);
}
#[test]
fn test_middleware_stack_interchangeable() {
let backend = MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(1024 * 1024)
.build())
.layer(TracingLayer::new());
generic_filesystem_test(backend);
}
}
13. Property-Based Tests
#![allow(unused)]
fn main() {
use proptest::prelude::*;
proptest! {
#[test]
fn prop_write_read_roundtrip(data: Vec<u8>) {
let mut backend = MemoryBackend::new();
backend.write(std::path::Path::new("/file.bin"), &data).unwrap();
let read_data = backend.read(std::path::Path::new("/file.bin")).unwrap();
prop_assert_eq!(data, read_data);
}
#[test]
fn prop_path_normalization_idempotent(path in "[a-z/]{1,50}") {
let mut backend = MemoryBackend::new();
if let Ok(()) = backend.create_dir_all(std::path::Path::new(&path)) {
// Creating again should either succeed or return AlreadyExists
let result = backend.create_dir_all(std::path::Path::new(&path));
prop_assert!(result.is_ok() || matches!(result, Err(FsError::AlreadyExists { .. })));
}
}
#[test]
fn prop_quota_never_exceeds_limit(
file_count in 1..10usize,
file_sizes in prop::collection::vec(1..100usize, 1..10)
) {
let limit = 500usize;
let backend = MemoryBackend::new()
.layer(QuotaLayer::builder().max_total_size(limit as u64).build());
let fs = FileStorage::new(backend);
let mut total_written = 0usize;
for (i, size) in file_sizes.into_iter().take(file_count).enumerate() {
let data = vec![0u8; size];
match fs.write(format!("/file{}.txt", i), &data) {
Ok(()) => total_written += size,
Err(FsError::QuotaExceeded { .. }) => break,
Err(e) => panic!("Unexpected error: {:?}", e),
}
}
prop_assert!(total_written <= limit);
}
}
}
14. Snapshot & Restore Tests
#![allow(unused)]
fn main() {
// MemoryBackend implements Clone - that's the snapshot mechanism
#[test]
fn test_clone_creates_independent_copy() {
let mut original = MemoryBackend::new();
original.write(std::path::Path::new("/file.txt"), b"original").unwrap();
// Clone = snapshot
let mut snapshot = original.clone();
// Modify original
original.write(std::path::Path::new("/file.txt"), b"modified").unwrap();
original.write(std::path::Path::new("/new.txt"), b"new").unwrap();
// Snapshot is unchanged
assert_eq!(snapshot.read(std::path::Path::new("/file.txt")).unwrap(), b"original");
assert!(!snapshot.exists(std::path::Path::new("/new.txt")).unwrap());
}
#[test]
fn test_checkpoint_and_rollback() {
let mut fs = MemoryBackend::new();
fs.write(std::path::Path::new("/important.txt"), b"original").unwrap();
// Checkpoint = clone
let checkpoint = fs.clone();
// Do risky work
fs.write(std::path::Path::new("/important.txt"), b"corrupted").unwrap();
// Rollback = replace with checkpoint
fs = checkpoint;
assert_eq!(fs.read(std::path::Path::new("/important.txt")).unwrap(), b"original");
}
#[test]
fn test_persistence_roundtrip() {
let temp = tempfile::tempdir().unwrap();
let path = temp.path().join("state.bin");
let mut fs = MemoryBackend::new();
fs.write(std::path::Path::new("/data.txt"), b"persisted").unwrap();
// Save
fs.save_to(&path).unwrap();
// Load
let restored = MemoryBackend::load_from(&path).unwrap();
assert_eq!(restored.read(std::path::Path::new("/data.txt")).unwrap(), b"persisted");
}
#[test]
fn test_to_bytes_from_bytes() {
let mut fs = MemoryBackend::new();
fs.create_dir_all(std::path::Path::new("/a/b/c")).unwrap();
fs.write(std::path::Path::new("/a/b/c/file.txt"), b"nested").unwrap();
let bytes = fs.to_bytes().unwrap();
let restored = MemoryBackend::from_bytes(&bytes).unwrap();
assert_eq!(restored.read(std::path::Path::new("/a/b/c/file.txt")).unwrap(), b"nested");
}
#[test]
fn test_from_bytes_invalid_data() {
let result = MemoryBackend::from_bytes(b"garbage");
assert!(result.is_err());
}
}
15. Running Tests
# All tests
cargo test
# Specific backend conformance
cargo test memory_backend_conformance
cargo test sqlite_backend_conformance
# Middleware tests
cargo test quota
cargo test restrictions
cargo test pathfilter
# Stress tests (release mode)
cargo test --release -- --ignored
# With coverage
cargo tarpaulin --out Html
# Cross-platform (CI)
cargo test --target x86_64-unknown-linux-gnu
cargo test --target x86_64-pc-windows-msvc
cargo test --target x86_64-apple-darwin
# WASM
cargo test --target wasm32-unknown-unknown
16. Test Utilities
#![allow(unused)]
fn main() {
// In anyfs-test crate
/// Create a temporary backend for testing
pub fn temp_vrootfs_backend() -> (VRootFsBackend, tempfile::TempDir) {
let temp = tempfile::tempdir().unwrap();
let backend = VRootFsBackend::new(temp.path()).unwrap();
(backend, temp)
}
/// Run a test against multiple backends
pub fn test_all_backends<F>(test: F)
where
F: Fn(&mut dyn Fs),
{
// Memory
let mut backend = MemoryBackend::new();
test(&mut backend);
// VRootFs (real filesystem with containment)
let (mut backend, _temp) = temp_vrootfs_backend();
test(&mut backend);
}
}
Conformance Test Suite
Verify your backend or middleware works correctly with AnyFS
This document provides a complete test suite skeleton that any backend or middleware implementer can use to verify their implementation meets the AnyFS trait contracts.
Overview
The conformance test suite verifies:
- Correctness - Operations behave as specified
- Error handling - Correct errors for edge cases
- Thread safety - Safe concurrent access
- Trait contracts - All trait requirements met
Test Levels
| Level | Traits Tested | When to Use |
|---|---|---|
| Core | FsRead, FsWrite, FsDir (= Fs) | All backends |
| Full | + FsLink, FsPermissions, FsSync, FsStats | Extended backends |
| Fuse | + FsInode | FUSE-mountable backends |
| Posix | + FsHandles, FsLock, FsXattr | Full POSIX backends |
Quick Start
For Backend Implementers
Add this to your Cargo.toml:
[dev-dependencies]
anyfs-test = "0.1" # Conformance test suite
Then in your test file:
#![allow(unused)]
fn main() {
use anyfs_test::prelude::*;
// Tell the test suite how to create your backend
fn create_backend() -> MyBackend {
MyBackend::new()
}
// Run all Fs-level tests
anyfs_test::generate_fs_tests!(create_backend);
// If you implement FsFull traits:
// anyfs_test::generate_fs_full_tests!(create_backend);
// If you implement FsFuse traits:
// anyfs_test::generate_fs_fuse_tests!(create_backend);
}
For Middleware Implementers
#![allow(unused)]
fn main() {
use anyfs_test::prelude::*;
use anyfs::MemoryBackend;
// Wrap MemoryBackend with your middleware
fn create_backend() -> MyMiddleware<MemoryBackend> {
MyMiddleware::new(MemoryBackend::new())
}
// Run all tests through your middleware
anyfs_test::generate_fs_tests!(create_backend);
}
Core Test Suite (Fs Traits)
Copy this entire module into your test file and customize create_backend().
#![allow(unused)]
fn main() {
//! Conformance tests for Fs trait implementations.
//!
//! To use: implement `create_backend()` and include this module.
use anyfs_backend::{Fs, FsRead, FsWrite, FsDir, FsError, FileType, Metadata, ReadDirIter};
use std::path::Path;
use std::sync::Arc;
use std::thread;
/// Create a fresh backend instance for testing.
/// Implement this for your backend.
fn create_backend() -> impl Fs {
todo!("Return your backend here")
}
// ============================================================================
// FsRead Tests
// ============================================================================
mod fs_read {
use super::*;
#[test]
fn read_existing_file() {
let fs = create_backend();
fs.write(std::path::Path::new("/test.txt"), b"hello world").unwrap();
let content = fs.read(std::path::Path::new("/test.txt")).unwrap();
assert_eq!(content, b"hello world");
}
#[test]
fn read_nonexistent_returns_not_found() {
let fs = create_backend();
let result = fs.read(std::path::Path::new("/nonexistent.txt"));
assert!(matches!(result, Err(FsError::NotFound { .. })));
}
#[test]
fn read_directory_returns_not_a_file() {
let fs = create_backend();
fs.create_dir(std::path::Path::new("/mydir")).unwrap();
let result = fs.read(std::path::Path::new("/mydir"));
assert!(matches!(result, Err(FsError::NotAFile { .. })));
}
#[test]
fn read_to_string_valid_utf8() {
let fs = create_backend();
fs.write(std::path::Path::new("/text.txt"), "hello unicode: 你好".as_bytes()).unwrap();
let content = fs.read_to_string(std::path::Path::new("/text.txt")).unwrap();
assert_eq!(content, "hello unicode: 你好");
}
#[test]
fn read_to_string_invalid_utf8_returns_error() {
let fs = create_backend();
fs.write(std::path::Path::new("/binary.bin"), &[0xFF, 0xFE, 0x00, 0x01]).unwrap();
let result = fs.read_to_string(std::path::Path::new("/binary.bin"));
assert!(result.is_err());
}
#[test]
fn read_range_full_file() {
let fs = create_backend();
fs.write(std::path::Path::new("/data.bin"), b"0123456789").unwrap();
let content = fs.read_range(std::path::Path::new("/data.bin"), 0, 10).unwrap();
assert_eq!(content, b"0123456789");
}
#[test]
fn read_range_partial() {
let fs = create_backend();
fs.write(std::path::Path::new("/data.bin"), b"0123456789").unwrap();
let content = fs.read_range(std::path::Path::new("/data.bin"), 3, 4).unwrap();
assert_eq!(content, b"3456");
}
#[test]
fn read_range_past_end_returns_available() {
let fs = create_backend();
fs.write(std::path::Path::new("/data.bin"), b"0123456789").unwrap();
let content = fs.read_range(std::path::Path::new("/data.bin"), 8, 100).unwrap();
assert_eq!(content, b"89");
}
#[test]
fn read_range_offset_past_end_returns_empty() {
let fs = create_backend();
fs.write(std::path::Path::new("/data.bin"), b"0123456789").unwrap();
let content = fs.read_range(std::path::Path::new("/data.bin"), 100, 10).unwrap();
assert!(content.is_empty());
}
#[test]
fn exists_for_existing_file() {
let fs = create_backend();
fs.write(std::path::Path::new("/exists.txt"), b"data").unwrap();
assert!(fs.exists(std::path::Path::new("/exists.txt")).unwrap());
}
#[test]
fn exists_for_nonexistent_file() {
let fs = create_backend();
assert!(!fs.exists(std::path::Path::new("/nonexistent.txt")).unwrap());
}
#[test]
fn exists_for_directory() {
let fs = create_backend();
fs.create_dir(std::path::Path::new("/mydir")).unwrap();
assert!(fs.exists(std::path::Path::new("/mydir")).unwrap());
}
#[test]
fn metadata_for_file() {
let fs = create_backend();
fs.write(std::path::Path::new("/file.txt"), b"hello").unwrap();
let meta = fs.metadata(std::path::Path::new("/file.txt")).unwrap();
assert_eq!(meta.file_type, FileType::File);
assert_eq!(meta.size, 5);
}
#[test]
fn metadata_for_directory() {
let fs = create_backend();
fs.create_dir(std::path::Path::new("/mydir")).unwrap();
let meta = fs.metadata(std::path::Path::new("/mydir")).unwrap();
assert_eq!(meta.file_type, FileType::Directory);
}
#[test]
fn metadata_for_nonexistent_returns_not_found() {
let fs = create_backend();
let result = fs.metadata(std::path::Path::new("/nonexistent"));
assert!(matches!(result, Err(FsError::NotFound { .. })));
}
#[test]
fn open_read_and_consume() {
let fs = create_backend();
fs.write(std::path::Path::new("/stream.txt"), b"streaming content").unwrap();
let mut reader = fs.open_read(std::path::Path::new("/stream.txt")).unwrap();
let mut buf = Vec::new();
std::io::Read::read_to_end(&mut reader, &mut buf).unwrap();
assert_eq!(buf, b"streaming content");
}
}
// ============================================================================
// FsWrite Tests
// ============================================================================
mod fs_write {
use super::*;
#[test]
fn write_creates_new_file() {
let fs = create_backend();
fs.write(std::path::Path::new("/new.txt"), b"new content").unwrap();
assert!(fs.exists(std::path::Path::new("/new.txt")).unwrap());
assert_eq!(fs.read(std::path::Path::new("/new.txt")).unwrap(), b"new content");
}
#[test]
fn write_overwrites_existing_file() {
let fs = create_backend();
fs.write(std::path::Path::new("/file.txt"), b"original").unwrap();
fs.write(std::path::Path::new("/file.txt"), b"replaced").unwrap();
assert_eq!(fs.read(std::path::Path::new("/file.txt")).unwrap(), b"replaced");
}
#[test]
fn write_to_nonexistent_parent_returns_not_found() {
let fs = create_backend();
let result = fs.write(std::path::Path::new("/nonexistent/file.txt"), b"data");
assert!(matches!(result, Err(FsError::NotFound { .. })));
}
#[test]
fn write_empty_file() {
let fs = create_backend();
fs.write(std::path::Path::new("/empty.txt"), b"").unwrap();
assert!(fs.exists(std::path::Path::new("/empty.txt")).unwrap());
assert_eq!(fs.read(std::path::Path::new("/empty.txt")).unwrap(), b"");
}
#[test]
fn write_binary_data() {
let fs = create_backend();
let binary: Vec<u8> = (0..=255).collect();
fs.write(std::path::Path::new("/binary.bin"), &binary).unwrap();
assert_eq!(fs.read(std::path::Path::new("/binary.bin")).unwrap(), binary);
}
#[test]
fn append_to_existing_file() {
let fs = create_backend();
fs.write(std::path::Path::new("/log.txt"), b"line1\n").unwrap();
fs.append(std::path::Path::new("/log.txt"), b"line2\n").unwrap();
assert_eq!(fs.read(std::path::Path::new("/log.txt")).unwrap(), b"line1\nline2\n");
}
#[test]
fn append_to_nonexistent_returns_not_found() {
let fs = create_backend();
let result = fs.append(std::path::Path::new("/nonexistent.txt"), b"data");
assert!(matches!(result, Err(FsError::NotFound { .. })));
}
#[test]
fn remove_file_existing() {
let fs = create_backend();
fs.write(std::path::Path::new("/delete-me.txt"), b"bye").unwrap();
fs.remove_file(std::path::Path::new("/delete-me.txt")).unwrap();
assert!(!fs.exists(std::path::Path::new("/delete-me.txt")).unwrap());
}
#[test]
fn remove_file_nonexistent_returns_not_found() {
let fs = create_backend();
let result = fs.remove_file(std::path::Path::new("/nonexistent.txt"));
assert!(matches!(result, Err(FsError::NotFound { .. })));
}
#[test]
fn remove_file_on_directory_returns_not_a_file() {
let fs = create_backend();
fs.create_dir(std::path::Path::new("/mydir")).unwrap();
let result = fs.remove_file(std::path::Path::new("/mydir"));
assert!(matches!(result, Err(FsError::NotAFile { .. })));
}
#[test]
fn rename_file() {
let fs = create_backend();
fs.write(std::path::Path::new("/old.txt"), b"content").unwrap();
fs.rename(std::path::Path::new("/old.txt"), std::path::Path::new("/new.txt")).unwrap();
assert!(!fs.exists(std::path::Path::new("/old.txt")).unwrap());
assert!(fs.exists(std::path::Path::new("/new.txt")).unwrap());
assert_eq!(fs.read(std::path::Path::new("/new.txt")).unwrap(), b"content");
}
#[test]
fn rename_overwrites_destination() {
let fs = create_backend();
fs.write(std::path::Path::new("/src.txt"), b"source").unwrap();
fs.write(std::path::Path::new("/dst.txt"), b"destination").unwrap();
fs.rename(std::path::Path::new("/src.txt"), std::path::Path::new("/dst.txt")).unwrap();
assert!(!fs.exists(std::path::Path::new("/src.txt")).unwrap());
assert_eq!(fs.read(std::path::Path::new("/dst.txt")).unwrap(), b"source");
}
#[test]
fn rename_nonexistent_returns_not_found() {
let fs = create_backend();
let result = fs.rename(std::path::Path::new("/nonexistent.txt"), std::path::Path::new("/new.txt"));
assert!(matches!(result, Err(FsError::NotFound { .. })));
}
#[test]
fn copy_file() {
let fs = create_backend();
fs.write(std::path::Path::new("/original.txt"), b"data").unwrap();
fs.copy(std::path::Path::new("/original.txt"), std::path::Path::new("/copy.txt")).unwrap();
assert!(fs.exists(std::path::Path::new("/original.txt")).unwrap());
assert!(fs.exists(std::path::Path::new("/copy.txt")).unwrap());
assert_eq!(fs.read(std::path::Path::new("/copy.txt")).unwrap(), b"data");
}
#[test]
fn copy_nonexistent_returns_not_found() {
let fs = create_backend();
let result = fs.copy(std::path::Path::new("/nonexistent.txt"), std::path::Path::new("/copy.txt"));
assert!(matches!(result, Err(FsError::NotFound { .. })));
}
#[test]
fn truncate_shrink() {
let fs = create_backend();
fs.write(std::path::Path::new("/file.txt"), b"0123456789").unwrap();
fs.truncate(std::path::Path::new("/file.txt"), 5).unwrap();
assert_eq!(fs.read(std::path::Path::new("/file.txt")).unwrap(), b"01234");
}
#[test]
fn truncate_expand() {
let fs = create_backend();
fs.write(std::path::Path::new("/file.txt"), b"abc").unwrap();
fs.truncate(std::path::Path::new("/file.txt"), 6).unwrap();
let content = fs.read(std::path::Path::new("/file.txt")).unwrap();
assert_eq!(content.len(), 6);
assert_eq!(&content[..3], b"abc");
// Expanded bytes should be zero
assert!(content[3..].iter().all(|&b| b == 0));
}
#[test]
fn truncate_to_zero() {
let fs = create_backend();
fs.write(std::path::Path::new("/file.txt"), b"content").unwrap();
fs.truncate(std::path::Path::new("/file.txt"), 0).unwrap();
assert_eq!(fs.read(std::path::Path::new("/file.txt")).unwrap(), b"");
}
#[test]
fn open_write_and_close() {
let fs = create_backend();
{
let mut writer = fs.open_write(std::path::Path::new("/stream.txt")).unwrap();
std::io::Write::write_all(&mut writer, b"streamed").unwrap();
}
// Content should be visible after writer is dropped
assert_eq!(fs.read(std::path::Path::new("/stream.txt")).unwrap(), b"streamed");
}
}
// ============================================================================
// FsDir Tests
// ============================================================================
mod fs_dir {
use super::*;
#[test]
fn create_dir_single() {
let fs = create_backend();
fs.create_dir(std::path::Path::new("/newdir")).unwrap();
assert!(fs.exists(std::path::Path::new("/newdir")).unwrap());
let meta = fs.metadata(std::path::Path::new("/newdir")).unwrap();
assert_eq!(meta.file_type, FileType::Directory);
}
#[test]
fn create_dir_already_exists_returns_error() {
let fs = create_backend();
fs.create_dir(std::path::Path::new("/existing")).unwrap();
let result = fs.create_dir(std::path::Path::new("/existing"));
assert!(matches!(result, Err(FsError::AlreadyExists { .. })));
}
#[test]
fn create_dir_parent_not_exists_returns_not_found() {
let fs = create_backend();
let result = fs.create_dir(std::path::Path::new("/nonexistent/child"));
assert!(matches!(result, Err(FsError::NotFound { .. })));
}
#[test]
fn create_dir_all_nested() {
let fs = create_backend();
fs.create_dir_all(std::path::Path::new("/a/b/c/d")).unwrap();
assert!(fs.exists(std::path::Path::new("/a")).unwrap());
assert!(fs.exists(std::path::Path::new("/a/b")).unwrap());
assert!(fs.exists(std::path::Path::new("/a/b/c")).unwrap());
assert!(fs.exists(std::path::Path::new("/a/b/c/d")).unwrap());
}
#[test]
fn create_dir_all_partially_exists() {
let fs = create_backend();
fs.create_dir(std::path::Path::new("/exists")).unwrap();
fs.create_dir_all(std::path::Path::new("/exists/new/nested")).unwrap();
assert!(fs.exists(std::path::Path::new("/exists/new/nested")).unwrap());
}
#[test]
fn create_dir_all_already_exists_is_ok() {
let fs = create_backend();
fs.create_dir_all(std::path::Path::new("/a/b/c")).unwrap();
// Should not error
fs.create_dir_all(std::path::Path::new("/a/b/c")).unwrap();
}
#[test]
fn read_dir_empty() {
let fs = create_backend();
fs.create_dir(std::path::Path::new("/empty")).unwrap();
let entries: Vec<_> = fs.read_dir(std::path::Path::new("/empty")).unwrap()
.filter_map(|e| e.ok())
.collect();
assert!(entries.is_empty());
}
#[test]
fn read_dir_with_files() {
let fs = create_backend();
fs.create_dir(std::path::Path::new("/parent")).unwrap();
fs.write(std::path::Path::new("/parent/file1.txt"), b"1").unwrap();
fs.write(std::path::Path::new("/parent/file2.txt"), b"2").unwrap();
fs.create_dir(std::path::Path::new("/parent/subdir")).unwrap();
let mut entries: Vec<_> = fs.read_dir(std::path::Path::new("/parent")).unwrap()
.filter_map(|e| e.ok())
.collect();
entries.sort_by(|a, b| a.name.cmp(&b.name));
assert_eq!(entries.len(), 3);
assert_eq!(entries[0].name, "file1.txt");
assert_eq!(entries[0].file_type, FileType::File);
assert_eq!(entries[1].name, "file2.txt");
assert_eq!(entries[2].name, "subdir");
assert_eq!(entries[2].file_type, FileType::Directory);
}
#[test]
fn read_dir_nonexistent_returns_not_found() {
let fs = create_backend();
let result = fs.read_dir(std::path::Path::new("/nonexistent"));
assert!(matches!(result, Err(FsError::NotFound { .. })));
}
#[test]
fn read_dir_on_file_returns_not_a_directory() {
let fs = create_backend();
fs.write(std::path::Path::new("/file.txt"), b"data").unwrap();
let result = fs.read_dir(std::path::Path::new("/file.txt"));
assert!(matches!(result, Err(FsError::NotADirectory { .. })));
}
#[test]
fn remove_dir_empty() {
let fs = create_backend();
fs.create_dir(std::path::Path::new("/todelete")).unwrap();
fs.remove_dir(std::path::Path::new("/todelete")).unwrap();
assert!(!fs.exists(std::path::Path::new("/todelete")).unwrap());
}
#[test]
fn remove_dir_not_empty_returns_error() {
let fs = create_backend();
fs.create_dir(std::path::Path::new("/notempty")).unwrap();
fs.write(std::path::Path::new("/notempty/file.txt"), b"data").unwrap();
let result = fs.remove_dir(std::path::Path::new("/notempty"));
assert!(matches!(result, Err(FsError::DirectoryNotEmpty { .. })));
}
#[test]
fn remove_dir_nonexistent_returns_not_found() {
let fs = create_backend();
let result = fs.remove_dir(std::path::Path::new("/nonexistent"));
assert!(matches!(result, Err(FsError::NotFound { .. })));
}
#[test]
fn remove_dir_on_file_returns_not_a_directory() {
let fs = create_backend();
fs.write(std::path::Path::new("/file.txt"), b"data").unwrap();
let result = fs.remove_dir(std::path::Path::new("/file.txt"));
assert!(matches!(result, Err(FsError::NotADirectory { .. })));
}
#[test]
fn remove_dir_all_recursive() {
let fs = create_backend();
fs.create_dir_all(std::path::Path::new("/root/a/b")).unwrap();
fs.write(std::path::Path::new("/root/file.txt"), b"data").unwrap();
fs.write(std::path::Path::new("/root/a/nested.txt"), b"nested").unwrap();
fs.remove_dir_all(std::path::Path::new("/root")).unwrap();
assert!(!fs.exists(std::path::Path::new("/root")).unwrap());
assert!(!fs.exists(std::path::Path::new("/root/a")).unwrap());
assert!(!fs.exists(std::path::Path::new("/root/file.txt")).unwrap());
}
#[test]
fn remove_dir_all_nonexistent_returns_not_found() {
let fs = create_backend();
let result = fs.remove_dir_all(std::path::Path::new("/nonexistent"));
assert!(matches!(result, Err(FsError::NotFound { .. })));
}
}
// ============================================================================
// Edge Case Tests
// ============================================================================
mod edge_cases {
use super::*;
#[test]
fn root_directory_exists() {
let fs = create_backend();
assert!(fs.exists(std::path::Path::new("/")).unwrap());
let meta = fs.metadata(std::path::Path::new("/")).unwrap();
assert_eq!(meta.file_type, FileType::Directory);
}
#[test]
fn read_dir_root() {
let fs = create_backend();
fs.write(std::path::Path::new("/file.txt"), b"data").unwrap();
let entries: Vec<_> = fs.read_dir(std::path::Path::new("/")).unwrap()
.filter_map(|e| e.ok())
.collect();
assert!(!entries.is_empty());
}
#[test]
fn cannot_remove_root() {
let fs = create_backend();
let result = fs.remove_dir(std::path::Path::new("/"));
assert!(result.is_err());
}
#[test]
fn cannot_remove_root_all() {
let fs = create_backend();
let result = fs.remove_dir_all(std::path::Path::new("/"));
assert!(result.is_err());
}
#[test]
fn file_at_root_level() {
let fs = create_backend();
fs.write(std::path::Path::new("/rootfile.txt"), b"at root").unwrap();
assert!(fs.exists(std::path::Path::new("/rootfile.txt")).unwrap());
assert_eq!(fs.read(std::path::Path::new("/rootfile.txt")).unwrap(), b"at root");
}
#[test]
fn deeply_nested_path() {
let fs = create_backend();
let deep_path = "/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p";
fs.create_dir_all(std::path::Path::new(deep_path)).unwrap();
fs.write(std::path::Path::new(&format!("{}/file.txt", deep_path)), b"deep").unwrap();
assert_eq!(
fs.read(std::path::Path::new(&format!("{}/file.txt", deep_path))).unwrap(),
b"deep"
);
}
#[test]
fn unicode_filename() {
let fs = create_backend();
fs.write(std::path::Path::new("/文件.txt"), b"chinese").unwrap();
fs.write(std::path::Path::new("/файл.txt"), b"russian").unwrap();
fs.write(std::path::Path::new("/αρχείο.txt"), b"greek").unwrap();
assert_eq!(fs.read(std::path::Path::new("/文件.txt")).unwrap(), b"chinese");
assert_eq!(fs.read(std::path::Path::new("/файл.txt")).unwrap(), b"russian");
assert_eq!(fs.read(std::path::Path::new("/αρχείο.txt")).unwrap(), b"greek");
}
#[test]
fn filename_with_spaces() {
let fs = create_backend();
fs.write(std::path::Path::new("/file with spaces.txt"), b"spaced").unwrap();
assert!(fs.exists(std::path::Path::new("/file with spaces.txt")).unwrap());
assert_eq!(fs.read(std::path::Path::new("/file with spaces.txt")).unwrap(), b"spaced");
}
#[test]
fn filename_with_special_chars() {
let fs = create_backend();
fs.write(std::path::Path::new("/file-name_123.test.txt"), b"special").unwrap();
assert!(fs.exists(std::path::Path::new("/file-name_123.test.txt")).unwrap());
}
#[test]
fn large_file() {
let fs = create_backend();
let large_data: Vec<u8> = (0..1_000_000).map(|i| (i % 256) as u8).collect();
fs.write(std::path::Path::new("/large.bin"), &large_data).unwrap();
assert_eq!(fs.read(std::path::Path::new("/large.bin")).unwrap(), large_data);
}
#[test]
fn many_files_in_directory() {
let fs = create_backend();
fs.create_dir(std::path::Path::new("/many")).unwrap();
for i in 0..100 {
fs.write(std::path::Path::new(&format!("/many/file_{:03}.txt", i)), format!("{}", i).as_bytes()).unwrap();
}
let entries: Vec<_> = fs.read_dir(std::path::Path::new("/many")).unwrap()
.filter_map(|e| e.ok())
.collect();
assert_eq!(entries.len(), 100);
}
#[test]
fn overwrite_larger_with_smaller() {
let fs = create_backend();
fs.write(std::path::Path::new("/file.txt"), b"this is a longer content").unwrap();
fs.write(std::path::Path::new("/file.txt"), b"short").unwrap();
assert_eq!(fs.read(std::path::Path::new("/file.txt")).unwrap(), b"short");
}
#[test]
fn overwrite_smaller_with_larger() {
let fs = create_backend();
fs.write(std::path::Path::new("/file.txt"), b"short").unwrap();
fs.write(std::path::Path::new("/file.txt"), b"this is a longer content").unwrap();
assert_eq!(fs.read(std::path::Path::new("/file.txt")).unwrap(), b"this is a longer content");
}
}
// ============================================================================
// Security Tests (Learned from Prior Art Vulnerabilities)
// ============================================================================
mod security {
use super::*;
// ------------------------------------------------------------------------
// Path Traversal Tests (Apache Commons VFS CVE-inspired)
// ------------------------------------------------------------------------
#[test]
fn reject_dotdot_traversal() {
let fs = create_backend();
fs.create_dir(std::path::Path::new("/sandbox")).unwrap();
fs.write(std::path::Path::new("/secret.txt"), b"secret").unwrap();
// Direct `..` traversal must be blocked or normalized. Either outcome
// is acceptable: the operation errors, or the path normalizes to
// /secret.txt. In sandboxed backends it must NOT escape the sandbox.
let _result = fs.read(std::path::Path::new("/sandbox/../secret.txt"));
}
#[test]
fn reject_url_encoded_dotdot() {
let fs = create_backend();
fs.create_dir(std::path::Path::new("/sandbox")).unwrap();
// URL-encoded path traversal: %2e = '.', %2f = '/'
// This caused CVE in Apache Commons VFS
let result = fs.read(std::path::Path::new("/sandbox/%2e%2e/etc/passwd"));
assert!(result.is_err(), "URL-encoded path traversal must be rejected");
// Double-encoded traversal
let result = fs.read(std::path::Path::new("/sandbox/%252e%252e/etc/passwd"));
assert!(result.is_err(), "Double URL-encoded traversal must be rejected");
}
#[test]
fn reject_backslash_traversal() {
let fs = create_backend();
fs.create_dir(std::path::Path::new("/sandbox")).unwrap();
// Windows-style path traversal
let result = fs.read(std::path::Path::new("/sandbox\\..\\secret.txt"));
assert!(result.is_err(), "Backslash traversal must be rejected");
}
#[test]
fn reject_mixed_slash_traversal() {
let fs = create_backend();
fs.create_dir(std::path::Path::new("/sandbox")).unwrap();
// Mixed forward/backward slashes
let result = fs.read(std::path::Path::new("/sandbox/..\\..\\secret.txt"));
assert!(result.is_err(), "Mixed slash traversal must be rejected");
let result = fs.read(std::path::Path::new("/sandbox\\../secret.txt"));
assert!(result.is_err(), "Mixed slash traversal must be rejected");
}
#[test]
fn reject_null_byte_injection() {
let fs = create_backend();
fs.write(std::path::Path::new("/safe.txt"), b"safe").unwrap();
fs.write(std::path::Path::new("/safe.txt.bak"), b"backup").unwrap();
// Null byte injection: /safe.txt\0.bak -> /safe.txt
let result = fs.read(std::path::Path::new("/safe.txt\0.bak"));
// Should either reject or not truncate at null
if let Ok(content) = result {
// If it succeeds, it should read the full path, not truncated
assert_ne!(content, b"safe", "Null byte must not truncate path");
}
}
// ------------------------------------------------------------------------
// Symlink Security Tests (Afero-inspired)
// ------------------------------------------------------------------------
#[test]
fn symlink_cannot_escape_sandbox() {
// This test is for sandboxed backends (e.g., VRootFs)
// Regular backends may allow this, which is fine
let fs = create_backend();
// Attempt to create symlink pointing outside virtual root
let _result = fs.symlink(std::path::Path::new("/etc/passwd"), std::path::Path::new("/escape_link"));
// Sandboxed backends MUST reject this
// Non-sandboxed backends may allow it
// The key is: reading through the symlink must not expose
// content outside the sandbox
}
#[test]
fn symlink_to_absolute_path_outside() {
let fs = create_backend();
fs.create_dir(std::path::Path::new("/sandbox")).unwrap();
fs.write(std::path::Path::new("/sandbox/safe.txt"), b"safe").unwrap();
// Symlink pointing to absolute path outside sandbox
// In sandboxed context, this must either:
// 1. Reject symlink creation, or
// 2. Resolve relative to sandbox root
let _result = fs.symlink(std::path::Path::new("/../../../etc/passwd"), std::path::Path::new("/sandbox/link"));
// Behavior depends on backend type
}
#[test]
fn relative_symlink_traversal() {
let fs = create_backend();
fs.create_dir(std::path::Path::new("/sandbox")).unwrap();
fs.create_dir(std::path::Path::new("/sandbox/subdir")).unwrap();
fs.write(std::path::Path::new("/secret.txt"), b"secret outside sandbox").unwrap();
// Relative symlink that traverses up and out
let _ = fs.symlink(std::path::Path::new("../../secret.txt"), std::path::Path::new("/sandbox/subdir/link"));
// If symlink was created, reading through it in a sandboxed
// context must not expose /secret.txt
}
// ------------------------------------------------------------------------
// Symlink Loop Detection Tests (PyFilesystem2-inspired)
// ------------------------------------------------------------------------
#[test]
fn detect_direct_symlink_loop() {
let fs = create_backend();
// Self-referential symlink
let _ = fs.symlink(std::path::Path::new("/loop"), std::path::Path::new("/loop"));
// Reading must detect the loop
let result = fs.read(std::path::Path::new("/loop"));
// TooManySymlinks is the expected error, though any error counts
// as detecting the loop.
assert!(result.is_err(), "Direct symlink loop must be detected");
}
#[test]
fn detect_indirect_symlink_loop() {
let fs = create_backend();
// Two symlinks pointing to each other: a -> b, b -> a
let _ = fs.symlink(std::path::Path::new("/b"), std::path::Path::new("/a"));
let _ = fs.symlink(std::path::Path::new("/a"), std::path::Path::new("/b"));
// Reading either must detect the loop
let result = fs.read(std::path::Path::new("/a"));
// TooManySymlinks is the expected error, though any error counts
// as detecting the loop.
assert!(result.is_err(), "Indirect symlink loop must be detected");
}
#[test]
fn detect_deep_symlink_chain() {
let fs = create_backend();
// Create a long chain of symlinks
// link_0 -> link_1 -> link_2 -> ... -> link_N
for i in 0..100 {
let _ = fs.symlink(
std::path::Path::new(&format!("/link_{}", i + 1)),
std::path::Path::new(&format!("/link_{}", i))
);
}
fs.write(std::path::Path::new("/link_100"), b"target").unwrap();
// Following the chain should either succeed or fail with TooManySymlinks.
// It must NOT cause a stack overflow or an infinite loop.
let _result = fs.read(std::path::Path::new("/link_0"));
// Either outcome is acceptable (a backend may allow deep chains);
// the key requirement is that resolution terminates.
}
#[test]
fn symlink_loop_with_directories() {
let fs = create_backend();
fs.create_dir(std::path::Path::new("/dir1")).unwrap();
fs.create_dir(std::path::Path::new("/dir2")).unwrap();
// Create directory symlink loop
let _ = fs.symlink(std::path::Path::new("/dir2"), std::path::Path::new("/dir1/link_to_dir2"));
let _ = fs.symlink(std::path::Path::new("/dir1"), std::path::Path::new("/dir2/link_to_dir1"));
// Attempting to read a file through the loop
let result = fs.read(std::path::Path::new("/dir1/link_to_dir2/link_to_dir1/link_to_dir2/file.txt"));
assert!(result.is_err(), "Directory symlink loop must be detected");
}
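The loop tests above all admit the same implementation strategy: resolve symlinks with a bounded follow counter rather than explicit cycle detection. A self-contained sketch of that technique — the symlink table is modeled here as a plain map, and `MAX_SYMLINK_DEPTH = 40` is an assumption matching FUSE's conventional limit, not a value the spec mandates:

```rust
use std::collections::HashMap;

// Assumed limit; FUSE conventionally allows 40 symlink follows.
const MAX_SYMLINK_DEPTH: usize = 40;

// Follows `path` through the symlink table. Returns the final non-link
// target, or None once the depth budget is exhausted (a loop or an
// overlong chain), which a backend would surface as TooManySymlinks.
fn resolve_symlinks(links: &HashMap<String, String>, path: &str) -> Option<String> {
    let mut current = path.to_string();
    for _ in 0..=MAX_SYMLINK_DEPTH {
        match links.get(&current) {
            Some(target) => current = target.clone(), // follow one hop
            None => return Some(current),             // not a symlink: done
        }
    }
    None // budget exhausted without reaching a non-link
}
```

Because every chain consumes the same counter, direct loops (`/loop -> /loop`), indirect loops, and 100-deep chains all terminate without any cycle bookkeeping.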
// ------------------------------------------------------------------------
// Resource Exhaustion Tests
// ------------------------------------------------------------------------
#[test]
fn reject_excessive_symlink_depth() {
let fs = create_backend();
// FUSE typically limits to 40 symlink follows
// We should have a reasonable limit (e.g., 40-256)
const MAX_EXPECTED_DEPTH: u32 = 256;
// Create chain that exceeds expected limit
for i in 0..MAX_EXPECTED_DEPTH + 10 {
let _ = fs.symlink(
std::path::Path::new(&format!("/excessive_{}", i + 1)),
std::path::Path::new(&format!("/excessive_{}", i))
);
}
// Create actual target
fs.write(std::path::Path::new(&format!("/excessive_{}", MAX_EXPECTED_DEPTH + 10)), b"data").unwrap();
// Should reject or limit, not follow indefinitely
let _result = fs.read(std::path::Path::new("/excessive_0"));
// Either succeeds (the backend allows this depth) or errors;
// the key requirement is that it must not hang or OOM.
}
// ------------------------------------------------------------------------
// Path Normalization Tests (FileStorage Integration)
// ------------------------------------------------------------------------
//
// NOTE: Path normalization (`.`, `..`, `//`) is handled by FileStorage,
// NOT by backends. Backends receive already-resolved, clean paths.
// These tests verify FileStorage + backend work together correctly.
//
// See testing-guide.md for the full FileStorage path normalization suite.
// Backend conformance tests should only use clean paths like "/parent/file.txt".
#[test]
fn path_normalization_removes_dots() {
// Test through FileStorage, not raw backend
let fs = anyfs::FileStorage::new(create_backend());
fs.create_dir("/parent").unwrap();
fs.write("/parent/file.txt", b"content").unwrap();
// FileStorage normalizes paths before passing to backend
assert_eq!(fs.read("/parent/./file.txt").unwrap(), b"content");
assert_eq!(fs.read("/parent/subdir/../file.txt").unwrap(), b"content");
}
#[test]
fn path_normalization_removes_double_slashes() {
// Test through FileStorage, not raw backend
let fs = anyfs::FileStorage::new(create_backend());
fs.write("/file.txt", b"content").unwrap();
// FileStorage normalizes double slashes
assert_eq!(fs.read("//file.txt").unwrap(), b"content");
assert!(fs.read("/parent//file.txt").is_err()); // parent doesn't exist
}
#[test]
fn trailing_slash_handling() {
// Test through FileStorage, not raw backend
let fs = anyfs::FileStorage::new(create_backend());
fs.create_dir("/mydir").unwrap();
fs.write("/mydir/file.txt", b"content").unwrap();
// Directory with trailing slash - FileStorage normalizes
assert!(fs.exists("/mydir/").unwrap());
// File with trailing slash - implementation-defined behavior
// FileStorage may normalize or reject
}
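The normalization these tests exercise is purely lexical, so no backend round-trips are needed. A minimal sketch of the kind of component walk FileStorage is described as performing — assuming absolute, `/`-separated paths with `..` clamped at the root (the function name and exact trailing-slash behavior are illustrative, not the crate's API):

```rust
// Lexically normalizes an absolute path: drops empty components
// (produced by `//` and trailing `/`) and `.`, and resolves `..`
// by popping, never rising above "/".
fn normalize(path: &str) -> String {
    let mut parts: Vec<&str> = Vec::new();
    for comp in path.split('/') {
        match comp {
            "" | "." => {}            // skip "//", leading "/", and "."
            ".." => { parts.pop(); }  // pop one level; clamp at root
            other => parts.push(other),
        }
    }
    format!("/{}", parts.join("/"))
}
```

Under this scheme `/parent/./file.txt`, `//file.txt`, and `/parent/subdir/../file.txt` all reach the backend as clean paths, which is exactly why backend conformance tests only need to cover clean inputs.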
// ------------------------------------------------------------------------
// Windows-Specific Security Tests (from soft-canonicalize/strict-path)
// ------------------------------------------------------------------------
#[test]
#[cfg(windows)]
fn reject_ntfs_alternate_data_streams() {
let fs = create_backend();
fs.write(std::path::Path::new("/file.txt"), b"main content").unwrap();
// NTFS ADS: file.txt:hidden_stream
// Attacker may try to hide data or escape paths via ADS
let result = fs.read(std::path::Path::new("/file.txt:hidden"));
assert!(result.is_err(), "NTFS ADS must be rejected");
let result = fs.read(std::path::Path::new("/file.txt:$DATA"));
assert!(result.is_err(), "NTFS ADS with $DATA must be rejected");
let result = fs.read(std::path::Path::new("/file.txt::$DATA"));
assert!(result.is_err(), "NTFS default stream syntax must be rejected");
// ADS in directory path (traversal attempt)
let result = fs.read(std::path::Path::new("/dir:ads/../secret.txt"));
assert!(result.is_err(), "ADS in directory path must be rejected");
}
#[test]
#[cfg(windows)]
fn reject_windows_8_3_short_names() {
let fs = create_backend();
fs.create_dir(std::path::Path::new("/Program Files")).unwrap();
fs.write(std::path::Path::new("/Program Files/secret.txt"), b"secret").unwrap();
// 8.3 short names can be used to obfuscate paths
// PROGRA~1 is the typical short name for "Program Files"
// Virtual filesystems should either:
// 1. Not support 8.3 names at all (reject)
// 2. Resolve them consistently to the same canonical path
// Test that we don't accidentally create different files
let result1 = fs.exists(std::path::Path::new("/Program Files/secret.txt"));
let result2 = fs.exists(std::path::Path::new("/PROGRA~1/secret.txt"));
// Either both exist (resolved) or short name doesn't exist (rejected)
// Key: they must NOT be different files
if result1.unwrap_or(false) && result2.unwrap_or(false) {
// If both exist, they must have same content
let content1 = fs.read(std::path::Path::new("/Program Files/secret.txt")).unwrap();
let content2 = fs.read(std::path::Path::new("/PROGRA~1/secret.txt")).unwrap();
assert_eq!(content1, content2, "8.3 names must resolve to same file");
}
}
#[test]
#[cfg(windows)]
fn reject_windows_unc_traversal() {
let fs = create_backend();
fs.create_dir(std::path::Path::new("/sandbox")).unwrap();
// Extended-length path prefix traversal
let result = fs.read(std::path::Path::new("\\\\?\\C:\\..\\..\\etc\\passwd"));
assert!(result.is_err(), "UNC extended path traversal must be rejected");
// Device namespace
let result = fs.read(std::path::Path::new("\\\\.\\C:\\secret.txt"));
assert!(result.is_err(), "Device namespace paths must be rejected");
// UNC server path
let result = fs.read(std::path::Path::new("\\\\server\\share\\..\\..\\secret.txt"));
assert!(result.is_err(), "UNC server paths must be rejected");
}
#[test]
#[cfg(windows)]
fn reject_windows_reserved_names() {
let fs = create_backend();
// Windows reserved device names (CON, PRN, AUX, NUL, COM1-9, LPT1-9)
// These can cause hangs or unexpected behavior
let reserved_names = ["CON", "PRN", "AUX", "NUL", "COM1", "LPT1"];
for name in reserved_names {
let _ = fs.write(std::path::Path::new(&format!("/{}", name)), b"data");
// Should either reject or handle safely (not hang)
let _ = fs.write(std::path::Path::new(&format!("/{}.txt", name)), b"data");
// CON.txt-style names are also problematic on Windows
}
}
#[test]
#[cfg(windows)]
fn reject_windows_junction_escape() {
// Junction points are Windows' equivalent of directory symlinks
// They can be used for sandbox escape similar to symlinks
let fs = create_backend();
fs.create_dir(std::path::Path::new("/sandbox")).unwrap();
// If backend supports junctions, they must be contained like symlinks
// The test setup would require actual junction creation capability
// This documents the requirement even if not all backends support it
}
// ------------------------------------------------------------------------
// Linux-Specific Security Tests (from soft-canonicalize/strict-path)
// ------------------------------------------------------------------------
#[test]
#[cfg(target_os = "linux")]
fn reject_proc_magic_symlinks() {
// /proc/PID/root and similar "magic" symlinks can escape namespaces
// Virtual filesystems wrapping real FS must not follow these
let fs = create_backend();
// These paths are only relevant for backends that wrap real filesystem
// In-memory backends naturally don't have this issue
// /proc/self/root points to the filesystem root, even in containers
// Following it would escape chroot/container boundaries
let _result = fs.read(std::path::Path::new("/proc/self/root/etc/passwd"));
// Either NotFound (good: the path doesn't exist in the VFS)
// or handled safely (does not escape the actual container)
}
#[test]
#[cfg(target_os = "linux")]
fn reject_dev_fd_symlinks() {
let fs = create_backend();
// /dev/fd/N symlinks to open file descriptors
// Could be used to access files outside sandbox
let _result = fs.read(std::path::Path::new("/dev/fd/0"));
// Should fail or be isolated from the real /dev/fd
}
// ------------------------------------------------------------------------
// Unicode Security Tests (from strict-path)
// ------------------------------------------------------------------------
#[test]
fn unicode_normalization_consistency() {
let fs = create_backend();
// NFC vs NFD normalization: é can be:
// - U+00E9 (precomposed, NFC)
// - U+0065 U+0301 (decomposed, NFD: e + combining acute)
let nfc = "/caf\u{00E9}.txt"; // precomposed
let nfd = "/cafe\u{0301}.txt"; // decomposed
fs.write(std::path::Path::new(nfc), b"coffee").unwrap();
// If the backend normalizes, both spellings access the same file;
// if it doesn't, the NFD spelling should not exist.
// Key: must NOT create two different files that look identical.
let _nfc_exists = fs.exists(std::path::Path::new(nfc));
let _nfd_exists = fs.exists(std::path::Path::new(nfd));
// Document the backend's behavior:
// either both true (normalized) or only NFC true (strict).
}
#[test]
fn reject_unicode_direction_override() {
let fs = create_backend();
// Right-to-Left Override (U+202E) can make paths appear different
// "secret\u{202E}txt.exe" displays as "secretexe.txt" in some contexts
let malicious_path = "/secret\u{202E}txt.exe";
// Should either reject or sanitize bidirectional control characters
let _result = fs.write(std::path::Path::new(malicious_path), b"data");
}
#[test]
fn reject_unicode_homoglyphs() {
let fs = create_backend();
// Cyrillic 'а' (U+0430) looks like Latin 'a' (U+0061)
let latin_path = "/data/file.txt";
let cyrillic_path = "/d\u{0430}ta/file.txt"; // Cyrillic 'а'
fs.create_dir(std::path::Path::new("/data")).unwrap();
fs.write(std::path::Path::new(latin_path), b"real content").unwrap();
// The two paths must NOT silently resolve to the same file:
// either the Cyrillic path is NotFound, or it is a distinct file.
if let Ok(content) = fs.read(std::path::Path::new(cyrillic_path)) {
assert_ne!(content, b"real content", "Homoglyph path must not alias the Latin path");
}
}
#[test]
fn reject_null_in_unicode() {
let fs = create_backend();
// Null can be encoded in various ways
// UTF-8 null is just 0x00, but check overlong encodings aren't decoded
let path_with_null = "/file\u{0000}name.txt";
let result = fs.write(std::path::Path::new(path_with_null), b"data");
assert!(result.is_err(), "Embedded null must be rejected");
}
// ------------------------------------------------------------------------
// TOCTOU Race Condition Tests (from soft-canonicalize/strict-path)
// ------------------------------------------------------------------------
#[test]
fn toctou_check_then_use() {
let fs = Arc::new(create_backend());
fs.create_dir(std::path::Path::new("/uploads")).unwrap();
// Simulate TOCTOU: check if path is safe, then use it
// An attacker might change the filesystem between check and use
let fs_checker = fs.clone();
let fs_writer = fs.clone();
// This test documents the requirement for atomic operations
// or proper locking in security-critical paths
// Thread 1: Check then write
let checker = thread::spawn(move || {
for i in 0..100 {
let path = format!("/uploads/file_{}.txt", i);
// Check
if !fs_checker.exists(std::path::Path::new(&path)).unwrap_or(true) {
// Use (potential race window here)
let _ = fs_checker.write(std::path::Path::new(&path), b"data");
}
}
});
// Thread 2: Rapid file creation/deletion
let writer = thread::spawn(move || {
for i in 0..100 {
let path = format!("/uploads/file_{}.txt", i);
let _ = fs_writer.write(std::path::Path::new(&path), b"attacker");
let _ = fs_writer.remove_file(std::path::Path::new(&path));
}
});
checker.join().unwrap();
writer.join().unwrap();
// Test passes if no panic/crash occurs
// Real protection requires atomic create-if-not-exists operations
}
#[test]
fn symlink_toctou_during_resolution() {
let fs = Arc::new(create_backend());
fs.create_dir(std::path::Path::new("/safe")).unwrap();
fs.write(std::path::Path::new("/safe/target.txt"), b"safe content").unwrap();
fs.write(std::path::Path::new("/unsafe.txt"), b"unsafe content").unwrap();
// Attacker rapidly changes symlink target during path resolution
let fs_attacker = fs.clone();
let fs_reader = fs.clone();
let attacker = thread::spawn(move || {
for _ in 0..100 {
// Create symlink to safe target
let _ = fs_attacker.remove_file(std::path::Path::new("/safe/link.txt"));
let _ = fs_attacker.symlink(std::path::Path::new("/safe/target.txt"), std::path::Path::new("/safe/link.txt"));
// Quickly change to unsafe target
let _ = fs_attacker.remove_file(std::path::Path::new("/safe/link.txt"));
let _ = fs_attacker.symlink(std::path::Path::new("/unsafe.txt"), std::path::Path::new("/safe/link.txt"));
}
});
let reader = thread::spawn(move || {
for _ in 0..100 {
// Try to read through symlink
// Must not accidentally read /unsafe.txt if sandboxed
let _ = fs_reader.read(std::path::Path::new("/safe/link.txt"));
}
});
attacker.join().unwrap();
reader.join().unwrap();
// For sandboxed backends: must never return content from /unsafe.txt
// This test verifies the implementation doesn't have TOCTOU in symlink resolution
}
}
// ============================================================================
// Thread Safety Tests
// ============================================================================
mod thread_safety {
use super::*;
#[test]
fn concurrent_reads() {
let fs = Arc::new(create_backend());
fs.write(std::path::Path::new("/shared.txt"), b"shared content").unwrap();
let handles: Vec<_> = (0..10)
.map(|_| {
let fs = fs.clone();
thread::spawn(move || {
for _ in 0..100 {
let content = fs.read(std::path::Path::new("/shared.txt")).unwrap();
assert_eq!(content, b"shared content");
}
})
})
.collect();
for handle in handles {
handle.join().unwrap();
}
}
#[test]
fn concurrent_writes_different_files() {
let fs = Arc::new(create_backend());
let handles: Vec<_> = (0..10)
.map(|i| {
let fs = fs.clone();
thread::spawn(move || {
let path = format!("/file_{}.txt", i);
for j in 0..100 {
fs.write(std::path::Path::new(&path), format!("{}:{}", i, j).as_bytes()).unwrap();
}
})
})
.collect();
for handle in handles {
handle.join().unwrap();
}
// Verify all files exist
for i in 0..10 {
assert!(fs.exists(std::path::Path::new(&format!("/file_{}.txt", i))).unwrap());
}
}
#[test]
fn concurrent_create_dir_all_same_path() {
let fs = Arc::new(create_backend());
let handles: Vec<_> = (0..10)
.map(|_| {
let fs = fs.clone();
thread::spawn(move || {
// All threads try to create the same path
let _ = fs.create_dir_all(std::path::Path::new("/a/b/c/d"));
})
})
.collect();
for handle in handles {
handle.join().unwrap();
}
// Path should exist regardless of race
assert!(fs.exists(std::path::Path::new("/a/b/c/d")).unwrap());
}
#[test]
fn read_during_write() {
let fs = Arc::new(create_backend());
fs.write(std::path::Path::new("/changing.txt"), b"initial").unwrap();
let fs_writer = fs.clone();
let writer = thread::spawn(move || {
for i in 0..100 {
fs_writer.write(std::path::Path::new("/changing.txt"), format!("version {}", i).as_bytes()).unwrap();
}
});
let fs_reader = fs.clone();
let reader = thread::spawn(move || {
for _ in 0..100 {
// Should not panic or return garbage
let result = fs_reader.read(std::path::Path::new("/changing.txt"));
assert!(result.is_ok());
}
});
writer.join().unwrap();
reader.join().unwrap();
}
#[test]
fn metadata_consistency() {
let fs = Arc::new(create_backend());
fs.write(std::path::Path::new("/meta.txt"), b"content").unwrap();
let handles: Vec<_> = (0..10)
.map(|_| {
let fs = fs.clone();
thread::spawn(move || {
for _ in 0..100 {
let meta = fs.metadata(std::path::Path::new("/meta.txt")).unwrap();
// Size should be consistent
assert!(meta.size > 0);
}
})
})
.collect();
for handle in handles {
handle.join().unwrap();
}
}
}
// ============================================================================
// No Panic Tests (Edge Cases That Must Not Crash)
// ============================================================================
mod no_panic {
use super::*;
#[test]
fn empty_path_does_not_panic() {
let fs = create_backend();
// These should return errors, not panic
let _ = fs.read(std::path::Path::new(""));
let _ = fs.write(std::path::Path::new(""), b"data");
let _ = fs.metadata(std::path::Path::new(""));
let _ = fs.exists(std::path::Path::new(""));
let _ = fs.read_dir(std::path::Path::new(""));
}
#[test]
fn path_with_null_does_not_panic() {
let fs = create_backend();
// Paths with null bytes should error or be handled gracefully
let _ = fs.read(std::path::Path::new("/file\0name.txt"));
let _ = fs.write(std::path::Path::new("/file\0name.txt"), b"data");
}
#[test]
fn very_long_path_does_not_panic() {
let fs = create_backend();
let long_name = "a".repeat(10000);
let long_path = format!("/{}", long_name);
// Should error gracefully, not panic
let _ = fs.write(std::path::Path::new(&long_path), b"data");
let _ = fs.read(std::path::Path::new(&long_path));
}
#[test]
fn very_long_filename_does_not_panic() {
let fs = create_backend();
let long_name = format!("/{}.txt", "x".repeat(1000));
let _ = fs.write(std::path::Path::new(&long_name), b"data");
}
#[test]
fn read_after_remove_does_not_panic() {
let fs = create_backend();
fs.write(std::path::Path::new("/temp.txt"), b"data").unwrap();
fs.remove_file(std::path::Path::new("/temp.txt")).unwrap();
// Should return NotFound, not panic
let result = fs.read(std::path::Path::new("/temp.txt"));
assert!(matches!(result, Err(FsError::NotFound { .. })));
}
#[test]
fn double_remove_does_not_panic() {
let fs = create_backend();
fs.write(std::path::Path::new("/temp.txt"), b"data").unwrap();
fs.remove_file(std::path::Path::new("/temp.txt")).unwrap();
// Second remove should error, not panic
let result = fs.remove_file(std::path::Path::new("/temp.txt"));
assert!(matches!(result, Err(FsError::NotFound { .. })));
}
}
}
Extended Test Suite (FsFull Traits)
For backends implementing FsFull:
#![allow(unused)]
fn main() {
mod fs_full {
use super::*;
use anyfs_backend::{FsLink, FsPermissions, FsSync, FsStats, Permissions};
// Only run these if the backend implements FsFull traits
fn create_full_backend() -> impl Fs + FsLink + FsPermissions + FsSync + FsStats {
todo!("Return your FsFull backend")
}
// ========================================================================
// FsLink Tests
// ========================================================================
mod fs_link {
use super::*;
#[test]
fn create_symlink() {
let fs = create_full_backend();
fs.write(std::path::Path::new("/target.txt"), b"target content").unwrap();
fs.symlink(std::path::Path::new("/target.txt"), std::path::Path::new("/link.txt")).unwrap();
assert!(fs.exists(std::path::Path::new("/link.txt")).unwrap());
let meta = fs.symlink_metadata(std::path::Path::new("/link.txt")).unwrap();
assert_eq!(meta.file_type, FileType::Symlink);
}
#[test]
fn read_symlink() {
let fs = create_full_backend();
fs.write(std::path::Path::new("/target.txt"), b"content").unwrap();
fs.symlink(std::path::Path::new("/target.txt"), std::path::Path::new("/link.txt")).unwrap();
let target = fs.read_link(std::path::Path::new("/link.txt")).unwrap();
assert_eq!(target.to_string_lossy(), "/target.txt");
}
#[test]
fn hard_link() {
let fs = create_full_backend();
fs.write(std::path::Path::new("/original.txt"), b"shared content").unwrap();
fs.hard_link(std::path::Path::new("/original.txt"), std::path::Path::new("/hardlink.txt")).unwrap();
// Both paths should have the same content
assert_eq!(fs.read(std::path::Path::new("/original.txt")).unwrap(), b"shared content");
assert_eq!(fs.read(std::path::Path::new("/hardlink.txt")).unwrap(), b"shared content");
// Modifying one should affect the other
fs.write(std::path::Path::new("/hardlink.txt"), b"modified").unwrap();
assert_eq!(fs.read(std::path::Path::new("/original.txt")).unwrap(), b"modified");
}
#[test]
fn symlink_metadata_vs_metadata() {
let fs = create_full_backend();
fs.write(std::path::Path::new("/target.txt"), b"content").unwrap();
fs.symlink(std::path::Path::new("/target.txt"), std::path::Path::new("/link.txt")).unwrap();
// symlink_metadata returns the symlink's metadata
let sym_meta = fs.symlink_metadata(std::path::Path::new("/link.txt")).unwrap();
assert_eq!(sym_meta.file_type, FileType::Symlink);
// metadata (if it follows symlinks) returns target's metadata
// Note: behavior depends on implementation
}
}
// ========================================================================
// FsPermissions Tests
// ========================================================================
mod fs_permissions {
use super::*;
#[test]
fn set_permissions() {
let fs = create_full_backend();
fs.write(std::path::Path::new("/file.txt"), b"data").unwrap();
fs.set_permissions(std::path::Path::new("/file.txt"), Permissions::from_mode(0o755)).unwrap();
let meta = fs.metadata(std::path::Path::new("/file.txt")).unwrap();
assert_eq!(meta.permissions, Some(0o755));
}
#[test]
fn set_permissions_nonexistent_returns_not_found() {
let fs = create_full_backend();
let result = fs.set_permissions(std::path::Path::new("/nonexistent"), Permissions::from_mode(0o644));
assert!(matches!(result, Err(FsError::NotFound { .. })));
}
}
// ========================================================================
// FsSync Tests
// ========================================================================
mod fs_sync {
use super::*;
#[test]
fn sync_does_not_error() {
let fs = create_full_backend();
fs.write(std::path::Path::new("/file.txt"), b"data").unwrap();
// sync() should complete without error
fs.sync().unwrap();
}
#[test]
fn fsync_specific_file() {
let fs = create_full_backend();
fs.write(std::path::Path::new("/file.txt"), b"data").unwrap();
fs.fsync(std::path::Path::new("/file.txt")).unwrap();
}
#[test]
fn fsync_nonexistent_returns_not_found() {
let fs = create_full_backend();
let result = fs.fsync(std::path::Path::new("/nonexistent.txt"));
assert!(matches!(result, Err(FsError::NotFound { .. })));
}
}
// ========================================================================
// FsStats Tests
// ========================================================================
mod fs_stats {
use super::*;
#[test]
fn statfs_returns_valid_stats() {
let fs = create_full_backend();
let stats = fs.statfs().unwrap();
// Basic sanity checks
assert!(stats.block_size > 0);
// available should not exceed total (if total is reported)
if stats.total_bytes > 0 {
assert!(stats.available_bytes <= stats.total_bytes);
}
}
}
}
}
FUSE Test Suite (FsFuse Traits)
For backends implementing FsFuse:
#![allow(unused)]
fn main() {
mod fs_fuse {
use super::*;
use anyfs_backend::FsInode;
fn create_fuse_backend() -> impl Fs + FsInode {
todo!("Return your FsFuse backend")
}
#[test]
fn path_to_inode_consistency() {
let fs = create_fuse_backend();
fs.write(std::path::Path::new("/file.txt"), b"data").unwrap();
let inode1 = fs.path_to_inode(std::path::Path::new("/file.txt")).unwrap();
let inode2 = fs.path_to_inode(std::path::Path::new("/file.txt")).unwrap();
// Same path should always return same inode
assert_eq!(inode1, inode2);
}
#[test]
fn inode_to_path_roundtrip() {
let fs = create_fuse_backend();
fs.write(std::path::Path::new("/file.txt"), b"data").unwrap();
let inode = fs.path_to_inode(std::path::Path::new("/file.txt")).unwrap();
let path = fs.inode_to_path(inode).unwrap();
assert_eq!(path.to_string_lossy(), "/file.txt");
}
#[test]
fn lookup_child() {
let fs = create_fuse_backend();
fs.create_dir(std::path::Path::new("/parent")).unwrap();
fs.write(std::path::Path::new("/parent/child.txt"), b"data").unwrap();
let parent_inode = fs.path_to_inode(std::path::Path::new("/parent")).unwrap();
let child_inode = fs.lookup(parent_inode, std::ffi::OsStr::new("child.txt")).unwrap();
let expected_inode = fs.path_to_inode(std::path::Path::new("/parent/child.txt")).unwrap();
assert_eq!(child_inode, expected_inode);
}
#[test]
fn metadata_by_inode() {
let fs = create_fuse_backend();
fs.write(std::path::Path::new("/file.txt"), b"content").unwrap();
let inode = fs.path_to_inode(std::path::Path::new("/file.txt")).unwrap();
let meta = fs.metadata_by_inode(inode).unwrap();
assert_eq!(meta.file_type, FileType::File);
assert_eq!(meta.size, 7);
}
#[test]
fn root_inode_is_one() {
let fs = create_fuse_backend();
let root_inode = fs.path_to_inode(std::path::Path::new("/")).unwrap();
// By FUSE convention, root inode is 1
assert_eq!(root_inode, 1);
}
#[test]
fn different_files_different_inodes() {
let fs = create_fuse_backend();
fs.write(std::path::Path::new("/file1.txt"), b"data1").unwrap();
fs.write(std::path::Path::new("/file2.txt"), b"data2").unwrap();
let inode1 = fs.path_to_inode(std::path::Path::new("/file1.txt")).unwrap();
let inode2 = fs.path_to_inode(std::path::Path::new("/file2.txt")).unwrap();
assert_ne!(inode1, inode2);
}
#[test]
fn hard_links_same_inode() {
let fs = create_fuse_backend();
fs.write(std::path::Path::new("/original.txt"), b"data").unwrap();
fs.hard_link(std::path::Path::new("/original.txt"), std::path::Path::new("/link.txt")).unwrap();
let inode1 = fs.path_to_inode(std::path::Path::new("/original.txt")).unwrap();
let inode2 = fs.path_to_inode(std::path::Path::new("/link.txt")).unwrap();
// Hard links must share the same inode
assert_eq!(inode1, inode2);
}
}
}
Middleware Test Suite
For middleware implementers, verify the middleware doesn’t break the underlying backend:
#![allow(unused)]
fn main() {
mod middleware_tests {
use super::*;
use anyfs::MemoryBackend;
/// Your middleware wrapping a known-good backend.
fn create_middleware() -> MyMiddleware<MemoryBackend> {
MyMiddleware::new(MemoryBackend::new())
}
// Run all standard Fs tests through the middleware
// This ensures the middleware doesn't break basic functionality
#[test]
fn passthrough_read_write() {
let fs = create_middleware();
fs.write(std::path::Path::new("/test.txt"), b"data").unwrap();
assert_eq!(fs.read(std::path::Path::new("/test.txt")).unwrap(), b"data");
}
#[test]
fn passthrough_directories() {
let fs = create_middleware();
fs.create_dir_all(std::path::Path::new("/a/b/c")).unwrap();
assert!(fs.exists(std::path::Path::new("/a/b/c")).unwrap());
}
// Add middleware-specific tests here
// e.g., for a Quota middleware:
#[test]
fn quota_blocks_oversized_write() {
let fs = QuotaMiddleware::new(MemoryBackend::new())
.with_max_file_size(100);
let result = fs.write(std::path::Path::new("/big.txt"), &vec![0u8; 200]);
assert!(matches!(result, Err(FsError::QuotaExceeded { .. })));
}
#[test]
fn quota_allows_within_limit() {
let fs = QuotaMiddleware::new(MemoryBackend::new())
.with_max_file_size(100);
fs.write(std::path::Path::new("/small.txt"), &vec![0u8; 50]).unwrap();
assert!(fs.exists(std::path::Path::new("/small.txt")).unwrap());
}
}
}
Running the Tests
Basic Usage
# Run all conformance tests
cargo test --test conformance
# Run specific test module
cargo test --test conformance fs_read
# Run with output
cargo test --test conformance -- --nocapture
# Run thread safety tests on a single test thread
RUST_TEST_THREADS=1 cargo test --test conformance thread_safety
CI Integration
# .github/workflows/test.yml
name: Conformance Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: dtolnay/rust-toolchain@stable
      - name: Run conformance tests
        run: cargo test --test conformance
      - name: Run thread safety tests
        run: cargo test --test conformance thread_safety -- --test-threads=1
Test Checklist
Before releasing your backend or middleware:
Core Tests (Required)
- All fs_read tests pass
- All fs_write tests pass
- All fs_dir tests pass
- All edge_cases tests pass
- All security tests pass
- All thread_safety tests pass
- All no_panic tests pass
Extended Tests (If Implementing FsFull)
- All fs_link tests pass
- All fs_permissions tests pass
- All fs_sync tests pass
- All fs_stats tests pass
FUSE Tests (If Implementing FsFuse)
- All fs_fuse tests pass
- Root inode is 1
- Hard links share inodes
Middleware Tests
- Basic passthrough works
- Middleware-specific behavior tested
- Error cases handled correctly
Summary
This conformance test suite provides:
- Complete coverage of all Fs trait operations
- Edge case testing for robustness
- Security tests learned from vulnerabilities in prior art (Apache Commons VFS, Afero, PyFilesystem2)
- Thread safety verification for concurrent access
- No-panic guarantees for invalid inputs
- Extended tests for FsFull and FsFuse traits
- Middleware testing patterns
Security Tests Cover:
- Path traversal attacks: URL-encoded %2e%2e, backslash traversal, null byte injection
- Symlink escape: Preventing sandbox escape via symlinks
- Symlink loops: Direct loops, indirect loops, deep chains
- Resource exhaustion: Limits on symlink depth
- Path canonicalization: Dot removal, double slash normalization
- Windows-specific (from soft-canonicalize/strict-path):
  - NTFS Alternate Data Streams
  - Windows 8.3 short names
  - UNC path traversal
  - Reserved device names
  - Junction point escapes
- Linux-specific: Magic symlinks (/proc/PID/root), /dev/fd escapes
- Unicode: NFC/NFD normalization, RTL override, homoglyphs
- TOCTOU: Race conditions in check-then-use and symlink resolution
Copy the relevant test modules, implement create_backend(), and run the tests. If they all pass, your backend/middleware is AnyFS-compatible.
Middleware Implementation Guide
This document provides implementation sketches for all AnyFS middleware, verifying that each is implementable within our framework.
Verdict: All 9 middleware are implementable. Some have interesting challenges documented below.
Implementation Pattern
All middleware follow the same pattern:
#![allow(unused)]
fn main() {
pub struct MiddlewareName<B> {
inner: B,
state: MiddlewareState, // Interior mutability if needed
}
impl<B: Fs> FsRead for MiddlewareName<B> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
// 1. Pre-check (validate, log, check limits)
// 2. Delegate to inner.read(path)
// 3. Post-process (update state, transform result)
}
}
// Implement FsWrite, FsDir similarly...
// Blanket impl for Fs is automatic
}
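The pre-check/delegate/post-process shape can be exercised without any AnyFS types. The sketch below uses an illustrative `KvRead` trait and `Counting` wrapper (not the real `Fs` trait or any AnyFS middleware) purely to show the three steps:

```rust
use std::cell::Cell;

// Illustrative stand-in for a backend trait (not the real Fs trait).
trait KvRead {
    fn read(&self, key: &str) -> Result<String, String>;
}

struct MemoryKv;
impl KvRead for MemoryKv {
    fn read(&self, key: &str) -> Result<String, String> {
        if key == "hello" { Ok("world".into()) } else { Err("not found".into()) }
    }
}

// Middleware: pre-check, delegate, post-process.
struct Counting<B> {
    inner: B,
    calls: Cell<u32>, // interior mutability, since trait methods take &self
}

impl<B: KvRead> KvRead for Counting<B> {
    fn read(&self, key: &str) -> Result<String, String> {
        // 1. Pre-check (validate)
        if key.is_empty() {
            return Err("empty key".into());
        }
        // 2. Delegate to the inner backend
        let result = self.inner.read(key);
        // 3. Post-process (update state)
        self.calls.set(self.calls.get() + 1);
        result
    }
}

fn main() {
    let kv = Counting { inner: MemoryKv, calls: Cell::new(0) };
    assert_eq!(kv.read("hello").unwrap(), "world");
    assert!(kv.read("").is_err()); // rejected by the pre-check, never delegated
    assert_eq!(kv.calls.get(), 1); // only the delegated call was counted
    println!("ok");
}
```

Note that the rejected call never reaches step 3: pre-check failures short-circuit before the inner backend or the counters are touched.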
1. ReadOnly
Complexity: Trivial
State: None
Dependencies: None
Implementation
#![allow(unused)]
fn main() {
pub struct ReadOnly<B> {
inner: B,
}
impl<B> ReadOnly<B> {
pub fn new(inner: B) -> Self {
Self { inner }
}
}
impl<B: FsRead> FsRead for ReadOnly<B> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
self.inner.read(path) // Pass through
}
fn read_to_string(&self, path: &Path) -> Result<String, FsError> {
self.inner.read_to_string(path) // Pass through
}
fn read_range(&self, path: &Path, offset: u64, len: usize) -> Result<Vec<u8>, FsError> {
self.inner.read_range(path, offset, len) // Pass through
}
fn exists(&self, path: &Path) -> Result<bool, FsError> {
self.inner.exists(path) // Pass through
}
fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
self.inner.metadata(path) // Pass through
}
fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError> {
self.inner.open_read(path) // Pass through
}
}
impl<B: FsWrite> FsWrite for ReadOnly<B> {
fn write(&self, path: &Path, _data: &[u8]) -> Result<(), FsError> {
Err(FsError::ReadOnly { path: path.to_path_buf(), operation: "write" })
}
fn append(&self, path: &Path, _data: &[u8]) -> Result<(), FsError> {
Err(FsError::ReadOnly { path: path.to_path_buf(), operation: "append" })
}
fn remove_file(&self, path: &Path) -> Result<(), FsError> {
Err(FsError::ReadOnly { path: path.to_path_buf(), operation: "remove_file" })
}
fn rename(&self, from: &Path, _to: &Path) -> Result<(), FsError> {
Err(FsError::ReadOnly { path: from.to_path_buf(), operation: "rename" })
}
fn copy(&self, from: &Path, _to: &Path) -> Result<(), FsError> {
Err(FsError::ReadOnly { path: from.to_path_buf(), operation: "copy" })
}
fn truncate(&self, path: &Path, _size: u64) -> Result<(), FsError> {
Err(FsError::ReadOnly { path: path.to_path_buf(), operation: "truncate" })
}
fn open_write(&self, path: &Path) -> Result<Box<dyn Write + Send>, FsError> {
Err(FsError::ReadOnly { path: path.to_path_buf(), operation: "open_write" })
}
}
impl<B: FsDir> FsDir for ReadOnly<B> {
fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
self.inner.read_dir(path) // Pass through (reading)
}
fn create_dir(&self, path: &Path) -> Result<(), FsError> {
Err(FsError::ReadOnly { path: path.to_path_buf(), operation: "create_dir" })
}
fn create_dir_all(&self, path: &Path) -> Result<(), FsError> {
Err(FsError::ReadOnly { path: path.to_path_buf(), operation: "create_dir_all" })
}
fn remove_dir(&self, path: &Path) -> Result<(), FsError> {
Err(FsError::ReadOnly { path: path.to_path_buf(), operation: "remove_dir" })
}
fn remove_dir_all(&self, path: &Path) -> Result<(), FsError> {
Err(FsError::ReadOnly { path: path.to_path_buf(), operation: "remove_dir_all" })
}
}
}
Verdict: ✅ Trivially Implementable
No challenges. Pure delegation for reads, error return for writes.
2. Restrictions
Complexity: Simple
State: Configuration flags only
Dependencies: None
Note: Symlink/hard-link capability is determined by trait bounds (B: FsLink), not middleware. Restrictions only controls permission-related operations.
Implementation
#![allow(unused)]
fn main() {
pub struct Restrictions<B> {
inner: B,
deny_permissions: bool,
}
pub struct RestrictionsBuilder {
deny_permissions: bool,
}
impl RestrictionsBuilder {
pub fn deny_permissions(mut self) -> Self {
self.deny_permissions = true;
self
}
pub fn build<B>(self, inner: B) -> Restrictions<B> {
Restrictions {
inner,
deny_permissions: self.deny_permissions,
}
}
}
// FsRead, FsDir, FsLink: pure delegation (Restrictions doesn't block these)
impl<B: FsLink> FsLink for Restrictions<B> {
fn symlink(&self, target: &Path, link: &Path) -> Result<(), FsError> {
self.inner.symlink(target, link) // Pure delegation
}
fn hard_link(&self, original: &Path, link: &Path) -> Result<(), FsError> {
self.inner.hard_link(original, link) // Pure delegation
}
fn read_link(&self, path: &Path) -> Result<PathBuf, FsError> {
self.inner.read_link(path)
}
fn symlink_metadata(&self, path: &Path) -> Result<Metadata, FsError> {
self.inner.symlink_metadata(path)
}
}
impl<B: FsPermissions> FsPermissions for Restrictions<B> {
fn set_permissions(&self, path: &Path, perm: Permissions) -> Result<(), FsError> {
if self.deny_permissions {
return Err(FsError::FeatureNotEnabled {
path: path.to_path_buf(),
feature: "permissions",
operation: "set_permissions",
});
}
self.inner.set_permissions(path, perm)
}
}
}
Verdict: ✅ Trivially Implementable
Simple flag check on set_permissions(). Link operations delegate to inner backend.
3. Tracing
Complexity: Simple
State: Configuration only
Dependencies: tracing crate
Implementation
#![allow(unused)]
fn main() {
use tracing::{debug, Level};
pub struct Tracing<B> {
inner: B,
target: &'static str,
level: Level,
}
impl<B: FsRead> FsRead for Tracing<B> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
let span = tracing::span!(Level::DEBUG, "fs::read", ?path);
let _guard = span.enter();
let result = self.inner.read(path);
match &result {
Ok(data) => debug!(bytes = data.len(), "read succeeded"),
Err(e) => debug!(?e, "read failed"),
}
result
}
fn exists(&self, path: &Path) -> Result<bool, FsError> {
let span = tracing::span!(Level::DEBUG, "fs::exists", ?path);
let _guard = span.enter();
let result = self.inner.exists(path);
debug!(?result, "exists check");
result
}
// ... similar for all other methods
}
// FsWrite and FsDir follow the same pattern
}
Verdict: ✅ Trivially Implementable
Pure instrumentation wrapper. No state mutation, no complex logic.
4. RateLimit
Complexity: Moderate
State: Counter + timestamp (requires interior mutability)
Dependencies: None (uses std::time)
Algorithm: Fixed-window counter (simpler than token bucket, sufficient for most use cases)
Implementation
#![allow(unused)]
fn main() {
use std::time::{Duration, Instant};
use std::sync::RwLock;
pub struct RateLimit<B> {
inner: B,
max_ops: u32,
window: Duration,
state: RwLock<RateLimitState>,
}
struct RateLimitState {
window_start: Instant,
count: u32,
}
impl<B> RateLimit<B> {
fn check_rate_limit(&self, path: &Path) -> Result<(), FsError> {
let mut state = self.state.write().unwrap();
let now = Instant::now();
if now.duration_since(state.window_start) >= self.window {
// Window expired, reset
state.window_start = now;
state.count = 1;
return Ok(());
}
if state.count >= self.max_ops {
return Err(FsError::RateLimitExceeded {
path: path.to_path_buf(),
limit: self.max_ops,
window_secs: self.window.as_secs(),
});
}
state.count += 1;
Ok(())
}
}
impl<B: FsRead> FsRead for RateLimit<B> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
self.check_rate_limit(path)?;
self.inner.read(path)
}
fn exists(&self, path: &Path) -> Result<bool, FsError> {
self.check_rate_limit(path)?;
self.inner.exists(path)
}
// ... all methods call check_rate_limit(path) first
}
}
Considerations
- Fixed window vs sliding window: Fixed window is simpler and sufficient for most use cases.
- Thread safety: Uses RwLock for state. Could optimize with atomics for a lock-free path.
- What counts as an operation? Each method call counts as 1 operation.
Verdict: ✅ Implementable
Straightforward with interior mutability.
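The window-reset logic can be demonstrated standalone. `FixedWindow` below is an illustrative reduction of the RateLimit state (not the middleware itself), using only std:

```rust
use std::sync::RwLock;
use std::time::{Duration, Instant};

// Standalone sketch of the fixed-window counter used by RateLimit.
struct FixedWindow {
    max_ops: u32,
    window: Duration,
    state: RwLock<(Instant, u32)>, // (window_start, count)
}

impl FixedWindow {
    fn new(max_ops: u32, window: Duration) -> Self {
        Self { max_ops, window, state: RwLock::new((Instant::now(), 0)) }
    }

    fn check(&self) -> Result<(), &'static str> {
        let mut state = self.state.write().unwrap();
        let now = Instant::now();
        if now.duration_since(state.0) >= self.window {
            *state = (now, 1); // window expired: reset and admit this op
            return Ok(());
        }
        if state.1 >= self.max_ops {
            return Err("rate limit exceeded");
        }
        state.1 += 1;
        Ok(())
    }
}

fn main() {
    // 2 ops per (long) window: the third call in the same window is rejected.
    let limiter = FixedWindow::new(2, Duration::from_secs(3600));
    assert!(limiter.check().is_ok());
    assert!(limiter.check().is_ok());
    assert!(limiter.check().is_err());

    // A zero-length window resets on every call, so nothing is rejected.
    let free = FixedWindow::new(1, Duration::ZERO);
    assert!(free.check().is_ok());
    assert!(free.check().is_ok());
    println!("ok");
}
```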
5. DryRun
Complexity: Moderate
State: Operation log
Dependencies: None
Implementation
#![allow(unused)]
fn main() {
use std::sync::RwLock;
pub struct DryRun<B> {
inner: B,
operations: RwLock<Vec<String>>,
}
impl<B> DryRun<B> {
pub fn operations(&self) -> Vec<String> {
self.operations.read().unwrap().clone()
}
pub fn clear(&self) {
self.operations.write().unwrap().clear();
}
fn log(&self, op: String) {
self.operations.write().unwrap().push(op);
}
}
impl<B: FsRead> FsRead for DryRun<B> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
// Reads execute normally - we need real state to test against
self.inner.read(path)
}
// All read operations pass through unchanged
}
impl<B: FsWrite> FsWrite for DryRun<B> {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
self.log(format!("write {} ({} bytes)", path.display(), data.len()));
Ok(()) // Don't actually write
}
fn remove_file(&self, path: &Path) -> Result<(), FsError> {
self.log(format!("remove_file {}", path.display()));
Ok(()) // Don't actually remove
}
fn open_write(&self, path: &Path) -> Result<Box<dyn Write + Send>, FsError> {
self.log(format!("open_write {}", path.display()));
// Return a sink that discards all writes
Ok(Box::new(std::io::sink()))
}
// ... similar for all write operations
}
impl<B: FsDir> FsDir for DryRun<B> {
fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
self.inner.read_dir(path) // Pass through
}
fn create_dir(&self, path: &Path) -> Result<(), FsError> {
self.log(format!("create_dir {}", path.display()));
Ok(())
}
// ... similar for all directory mutations
}
}
Semantics Clarification
DryRun is NOT an isolation layer. It’s for answering “what would this code do?”
- Reads see the real backend state (unchanged from before DryRun was applied)
- Writes are logged but not executed
- After a dry write, reads won’t see the change (because it wasn’t written)
This is intentional. For isolation, use MemoryBackend::clone() for snapshots.
Verdict: ✅ Implementable
The semantics are clear once documented. Uses std::io::sink() for discarding streamed writes.
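The log-and-discard behavior is easy to sketch with std alone. `DryLog` below is an illustrative stand-in for DryRun (not the middleware itself), recording write intents while routing the bytes to `std::io::sink()`:

```rust
use std::io::Write;
use std::sync::RwLock;

// Standalone sketch of DryRun's logging: mutations are recorded, not executed.
struct DryLog {
    operations: RwLock<Vec<String>>,
}

impl DryLog {
    fn new() -> Self {
        Self { operations: RwLock::new(Vec::new()) }
    }

    // Pretend-write: log the intent, discard the bytes via std::io::sink().
    fn write_file(&self, path: &str, data: &[u8]) -> std::io::Result<()> {
        self.operations
            .write()
            .unwrap()
            .push(format!("write {} ({} bytes)", path, data.len()));
        std::io::sink().write_all(data) // always succeeds; data goes nowhere
    }

    fn operations(&self) -> Vec<String> {
        self.operations.read().unwrap().clone()
    }
}

fn main() {
    let dry = DryLog::new();
    dry.write_file("/data/report.txt", b"hello").unwrap();
    dry.write_file("/data/report.txt", b"world!").unwrap();
    assert_eq!(
        dry.operations(),
        vec![
            "write /data/report.txt (5 bytes)".to_string(),
            "write /data/report.txt (6 bytes)".to_string(),
        ]
    );
    println!("ok");
}
```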
6. PathFilter
Complexity: Moderate
State: Compiled glob patterns
Dependencies: globset crate
Implementation
#![allow(unused)]
fn main() {
use globset::{Glob, GlobSet, GlobSetBuilder};
pub struct PathFilter<B> {
inner: B,
rules: Vec<PathRule>,
compiled: GlobSet, // For efficient matching
}
enum PathRule {
Allow(String),
Deny(String),
}
impl<B> PathFilter<B> {
fn check_access(&self, path: &Path) -> Result<(), FsError> {
let path_str = path.to_string_lossy();
for rule in &self.rules {
match rule {
PathRule::Allow(pattern) => {
if glob_matches(pattern, &path_str) {
return Ok(());
}
}
PathRule::Deny(pattern) => {
if glob_matches(pattern, &path_str) {
return Err(FsError::AccessDenied {
path: path.to_path_buf(),
reason: format!("path matches deny pattern: {}", pattern),
});
}
}
}
}
// Default: deny if no rules matched
Err(FsError::AccessDenied {
path: path.to_path_buf(),
reason: "no matching allow rule".to_string(),
})
}
}
impl<B: FsRead> FsRead for PathFilter<B> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
self.check_access(path)?;
self.inner.read(path)
}
// ... all methods check access first
}
impl<B: FsDir> FsDir for PathFilter<B> {
fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
self.check_access(path)?;
let inner_iter = self.inner.read_dir(path)?;
// Filter the iterator to exclude denied entries
Ok(ReadDirIter::new(FilteredDirIter {
inner: inner_iter,
rules: self.rules.clone(), // Copy rules for filtering
}))
}
}
// Custom iterator that filters denied entries
struct FilteredDirIter {
inner: ReadDirIter,
rules: Vec<PathRule>,
}
impl Iterator for FilteredDirIter {
type Item = Result<DirEntry, FsError>;
fn next(&mut self) -> Option<Self::Item> {
loop {
match self.inner.next()? {
Ok(entry) => {
if self.is_allowed(&entry.path) {
return Some(Ok(entry));
}
// Skip denied entries (don't reveal their existence)
}
Err(e) => return Some(Err(e)),
}
}
}
}
}
Considerations
- Rule evaluation order: First match wins, consistent with firewall rules.
- Default policy: Deny if no rules match (secure by default).
- Directory listing: Filters out denied entries so their existence isn’t revealed.
- Parent directory access: If you allow /workspace/**, accessing /workspace itself needs to be allowed.
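First-match-wins plus deny-by-default can be sketched without globset. Plain prefix matching stands in for glob patterns here, and `Rule`/`check_access` are illustrative names, not the PathFilter API:

```rust
// First-match-wins rule evaluation, with deny-by-default.
// Prefix matching stands in for globset patterns to keep this dependency-free.
enum Rule {
    Allow(&'static str),
    Deny(&'static str),
}

fn check_access(rules: &[Rule], path: &str) -> Result<(), String> {
    for rule in rules {
        match rule {
            Rule::Allow(prefix) if path.starts_with(prefix) => return Ok(()),
            Rule::Deny(prefix) if path.starts_with(prefix) => {
                return Err(format!("path matches deny rule: {}", prefix));
            }
            _ => continue, // rule didn't match this path; try the next one
        }
    }
    // Default policy: deny if no rule matched (secure by default).
    Err("no matching allow rule".to_string())
}

fn main() {
    let rules = [
        Rule::Deny("/workspace/secrets"), // more specific rule listed first
        Rule::Allow("/workspace"),
    ];
    assert!(check_access(&rules, "/workspace/src/main.rs").is_ok());
    assert!(check_access(&rules, "/workspace/secrets/key.pem").is_err());
    assert!(check_access(&rules, "/etc/passwd").is_err()); // deny by default
    println!("ok");
}
```

Because the first matching rule wins, the deny for /workspace/secrets must precede the broader allow for /workspace, exactly as with firewall rules.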
Implementation Detail: ReadDirIter Filtering
Our ReadDirIter type needs to support wrapping. Options:
#![allow(unused)]
fn main() {
// Option 1: ReadDirIter is a trait object
pub struct ReadDirIter(Box<dyn Iterator<Item = Result<DirEntry, FsError>> + Send>);
// Option 2: ReadDirIter has a filter method
impl ReadDirIter {
pub fn filter<F>(self, predicate: F) -> ReadDirIter
where
F: Fn(&DirEntry) -> bool + Send + 'static
{ ... }
}
}
Recommendation: Option 1 (trait object) is more flexible and aligns with open_read/open_write returning Box<dyn ...>.
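Option 1 can be prototyped in a few lines. The sketch below wraps a boxed iterator over `String`s rather than `DirEntry`/`FsError` to stay self-contained, and `filter_entries` is an illustrative name:

```rust
// Sketch of Option 1: a newtype over a boxed iterator that supports filtering.
struct ReadDirIter(Box<dyn Iterator<Item = Result<String, String>> + Send>);

impl ReadDirIter {
    fn filter_entries<F>(self, predicate: F) -> ReadDirIter
    where
        F: Fn(&str) -> bool + Send + 'static,
    {
        // Errors pass through untouched; only Ok entries are filtered.
        ReadDirIter(Box::new(self.0.filter(move |item| match item {
            Ok(entry) => predicate(entry.as_str()),
            Err(_) => true,
        })))
    }
}

impl Iterator for ReadDirIter {
    type Item = Result<String, String>;
    fn next(&mut self) -> Option<Self::Item> {
        self.0.next()
    }
}

fn main() {
    let entries = vec![
        Ok("/a/visible.txt".to_string()),
        Ok("/a/secret.txt".to_string()),
        Err("io error".to_string()),
    ];
    let iter = ReadDirIter(Box::new(entries.into_iter()));
    let filtered: Vec<_> = iter.filter_entries(|p| !p.contains("secret")).collect();
    assert_eq!(filtered.len(), 2); // secret entry hidden, error preserved
    assert_eq!(filtered[0], Ok("/a/visible.txt".to_string()));
    println!("ok");
}
```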
Verdict: ✅ Implementable
Requires ReadDirIter to be a trait object wrapper (already the case) so we can filter entries.
7. Cache
Complexity: Moderate
State: LRU cache with entries
Dependencies: lru crate (or custom implementation)
Implementation
#![allow(unused)]
fn main() {
use lru::LruCache;
use std::sync::RwLock;
use std::time::{Duration, Instant};
pub struct Cache<B> {
inner: B,
cache: RwLock<LruCache<PathBuf, CacheEntry>>,
max_entry_size: usize,
ttl: Duration,
}
struct CacheEntry {
data: Vec<u8>,
metadata: Metadata,
inserted_at: Instant,
}
impl<B: FsRead> FsRead for Cache<B> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
// Check cache
{
let cache = self.cache.read().unwrap();
if let Some(entry) = cache.peek(path) {
if entry.inserted_at.elapsed() < self.ttl {
return Ok(entry.data.clone());
}
}
}
// Cache miss - fetch from backend
let data = self.inner.read(path)?;
// Store in cache if not too large
if data.len() <= self.max_entry_size {
let metadata = self.inner.metadata(path)?;
let mut cache = self.cache.write().unwrap();
cache.put(path.to_path_buf(), CacheEntry {
data: data.clone(),
metadata,
inserted_at: Instant::now(),
});
}
Ok(data)
}
fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
// Check cache for metadata
{
let cache = self.cache.read().unwrap();
if let Some(entry) = cache.peek(path) {
if entry.inserted_at.elapsed() < self.ttl {
return Ok(entry.metadata.clone());
}
}
}
// Fetch from backend
self.inner.metadata(path)
}
fn open_read(&self, path: &Path) -> Result<Box<dyn Read + Send>, FsError> {
// DO NOT CACHE - streams are for large files
self.inner.open_read(path)
}
fn exists(&self, path: &Path) -> Result<bool, FsError> {
// Could cache this too, or derive from metadata cache
{
let cache = self.cache.read().unwrap();
if let Some(entry) = cache.peek(path) {
if entry.inserted_at.elapsed() < self.ttl {
return Ok(true); // If in cache, it exists
}
}
}
self.inner.exists(path)
}
}
impl<B: FsWrite> FsWrite for Cache<B> {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
let result = self.inner.write(path, data)?;
// Invalidate cache entry
let mut cache = self.cache.write().unwrap();
cache.pop(path);
Ok(result)
}
fn remove_file(&self, path: &Path) -> Result<(), FsError> {
let result = self.inner.remove_file(path)?;
// Invalidate cache entry
let mut cache = self.cache.write().unwrap();
cache.pop(path);
Ok(result)
}
// ... all mutations invalidate cache
}
}
What Gets Cached
| Method | Cached? | Reason |
|---|---|---|
| read() | Yes | Small files benefit from caching |
| read_to_string() | Yes | Same as read |
| read_range() | Maybe | Could cache full file, serve ranges from cache |
| metadata() | Yes | Frequently accessed |
| exists() | Derived | Can derive from metadata cache |
| open_read() | No | Streams are for large files that shouldn’t be cached |
| read_dir() | Maybe | Directory listings change frequently |
Verdict: ✅ Implementable
Standard LRU cache pattern. Key decision: don’t cache open_read() streams.
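The freshness check at the heart of the cache can be shown with a plain HashMap. `TtlCache` below is an illustrative reduction (no LRU eviction, no lru crate):

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Standalone sketch of the TTL check: an entry is served only while fresh.
struct TtlCache {
    ttl: Duration,
    entries: HashMap<String, (Vec<u8>, Instant)>,
}

impl TtlCache {
    fn get(&self, key: &str) -> Option<&[u8]> {
        self.entries.get(key).and_then(|(data, inserted_at)| {
            if inserted_at.elapsed() < self.ttl {
                Some(data.as_slice()) // fresh: cache hit
            } else {
                None // stale: caller falls through to the backend
            }
        })
    }

    fn put(&mut self, key: &str, data: Vec<u8>) {
        self.entries.insert(key.to_string(), (data, Instant::now()));
    }
}

fn main() {
    let mut cache = TtlCache { ttl: Duration::from_secs(60), entries: HashMap::new() };
    cache.put("/file.txt", b"hello".to_vec());
    assert_eq!(cache.get("/file.txt"), Some(b"hello".as_slice()));

    // With a zero TTL every entry is immediately stale.
    let mut stale = TtlCache { ttl: Duration::ZERO, entries: HashMap::new() };
    stale.put("/file.txt", b"hello".to_vec());
    assert_eq!(stale.get("/file.txt"), None);
    println!("ok");
}
```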
8. Quota
Complexity: High
State: Usage counters (requires accurate tracking)
Dependencies: None
The Challenge
Quota must track:
- Total bytes used
- Total file count
- Total directory count
- Per-directory entry count (optional)
- Maximum path depth (optional)
The tricky part: streaming writes via open_write(). We must track bytes as they’re written, not just when the operation completes.
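A minimal counting writer illustrates the idea. `CountingWriter` and its fields here are illustrative, not the real QuotaWriter, and this is a single-writer sketch: the check-then-add is not atomic across threads, so concurrent writers would need to reserve the budget atomically.

```rust
use std::io::{self, Write};
use std::sync::{Arc, atomic::{AtomicU64, Ordering}};

// Sketch of a counting writer: each chunk is charged against a shared budget
// *before* it reaches the inner writer, so the quota holds mid-stream too.
struct CountingWriter<W: Write> {
    inner: W,
    used: Arc<AtomicU64>,
    max_total: u64,
}

impl<W: Write> Write for CountingWriter<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let used = self.used.load(Ordering::SeqCst);
        if used + buf.len() as u64 > self.max_total {
            return Err(io::Error::new(io::ErrorKind::Other, "quota exceeded"));
        }
        let n = self.inner.write(buf)?;
        // Charge only what was actually accepted downstream.
        self.used.fetch_add(n as u64, Ordering::SeqCst);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

fn main() {
    let used = Arc::new(AtomicU64::new(0));
    let mut w = CountingWriter { inner: Vec::<u8>::new(), used: Arc::clone(&used), max_total: 10 };
    assert!(w.write(b"hello").is_ok()); // 5 of 10 bytes used
    assert!(w.write(b"world").is_ok()); // 10 of 10 bytes used
    assert!(w.write(b"!").is_err());    // 11th byte rejected mid-stream
    assert_eq!(used.load(Ordering::SeqCst), 10);
    println!("ok");
}
```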
Implementation
#![allow(unused)]
fn main() {
use std::sync::{Arc, RwLock};
use std::io::Write;
pub struct Quota<B> {
inner: B,
config: QuotaConfig,
usage: Arc<RwLock<QuotaUsage>>,
}
struct QuotaConfig {
max_total_size: Option<u64>,
max_file_size: Option<u64>,
max_node_count: Option<u64>,
max_dir_entries: Option<u64>, // Max entries per directory
max_path_depth: Option<usize>,
}
/// Current usage statistics.
#[derive(Debug, Clone, Default)]
pub struct Usage {
pub total_size: u64,
pub file_count: u64,
pub dir_count: u64,
}
/// Configured limits.
#[derive(Debug, Clone)]
pub struct Limits {
pub max_total_size: Option<u64>,
pub max_file_size: Option<u64>,
pub max_node_count: Option<u64>,
pub max_dir_entries: Option<u64>,
pub max_path_depth: Option<usize>,
}
/// Remaining capacity.
#[derive(Debug, Clone)]
pub struct Remaining {
pub bytes: Option<u64>,
pub nodes: Option<u64>,
pub can_write: bool,
}
struct QuotaUsage {
total_size: u64,
file_count: u64,
dir_count: u64,
}
impl Default for QuotaUsage {
fn default() -> Self {
Self { total_size: 0, file_count: 0, dir_count: 0 }
}
}
impl<B> Quota<B> {
/// Get current usage statistics.
pub fn usage(&self) -> Usage {
let u = self.usage.read().unwrap();
Usage {
total_size: u.total_size,
file_count: u.file_count,
dir_count: u.dir_count,
}
}
/// Get configured limits.
pub fn limits(&self) -> Limits {
Limits {
max_total_size: self.config.max_total_size,
max_file_size: self.config.max_file_size,
max_node_count: self.config.max_node_count,
max_dir_entries: self.config.max_dir_entries,
max_path_depth: self.config.max_path_depth,
}
}
/// Get remaining capacity.
pub fn remaining(&self) -> Remaining {
let u = self.usage.read().unwrap();
let bytes = self.config.max_total_size.map(|max| max.saturating_sub(u.total_size));
let nodes = self.config.max_node_count.map(|max| max.saturating_sub(u.file_count + u.dir_count));
Remaining {
bytes,
nodes,
can_write: bytes.map(|b| b > 0).unwrap_or(true),
}
}
}
impl<B: Fs> Quota<B> {
/// Create Quota middleware with explicit config.
/// Prefer `QuotaLayer::builder()` for the Layer pattern.
pub fn with_config(inner: B, config: QuotaConfig) -> Result<Self, FsError> {
// IMPORTANT: Scan backend to initialize usage counters
let usage = Self::scan_usage(&inner)?;
Ok(Self {
inner,
config,
usage: Arc::new(RwLock::new(usage)),
})
}
fn scan_usage(backend: &B) -> Result<QuotaUsage, FsError> {
let mut usage = QuotaUsage::default();
Self::scan_dir(backend, Path::new("/"), &mut usage)?;
Ok(usage)
}
fn scan_dir(backend: &B, path: &Path, usage: &mut QuotaUsage) -> Result<(), FsError> {
for entry in backend.read_dir(path)? {
let entry = entry?;
let meta = backend.metadata(&entry.path)?;
if meta.is_file() {
usage.file_count += 1;
usage.total_size += meta.size;
} else if meta.is_dir() {
usage.dir_count += 1;
Self::scan_dir(backend, &entry.path, usage)?;
}
}
Ok(())
}
fn check_size_limit(&self, path: &Path, additional_bytes: u64) -> Result<(), FsError> {
let usage = self.usage.read().unwrap();
if let Some(max) = self.config.max_total_size {
if usage.total_size + additional_bytes > max {
return Err(FsError::QuotaExceeded {
path: path.to_path_buf(),
limit: max,
requested: additional_bytes,
usage: usage.total_size,
});
}
}
Ok(())
}
fn check_node_limit(&self, path: &Path) -> Result<(), FsError> {
if let Some(max) = self.config.max_node_count {
let usage = self.usage.read().unwrap();
if usage.file_count + usage.dir_count >= max {
return Err(FsError::QuotaExceeded {
path: path.to_path_buf(),
limit: max,
requested: 1,
usage: usage.file_count + usage.dir_count,
});
}
}
Ok(())
}
fn check_dir_entries(&self, parent: &Path) -> Result<(), FsError>
where B: FsDir {
if let Some(max) = self.config.max_dir_entries {
// Count entries in parent directory
let count = self.inner.read_dir(parent)?
.filter(|e| e.is_ok())
.count() as u64;
if count >= max {
return Err(FsError::QuotaExceeded {
path: parent.to_path_buf(),
limit: max,
requested: 1,
usage: count,
});
}
}
Ok(())
}
fn check_path_depth(&self, path: &Path) -> Result<(), FsError> {
if let Some(max) = self.config.max_path_depth {
let depth = path.components().count();
if depth > max {
return Err(FsError::QuotaExceeded {
path: path.to_path_buf(),
limit: max as u64,
requested: depth as u64,
usage: depth as u64,
});
}
}
Ok(())
}
}
impl<B: FsWrite + FsRead + FsDir> FsWrite for Quota<B> {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
let new_size = data.len() as u64;
// Check path depth limit
self.check_path_depth(path)?;
// Check per-file limit
if let Some(max) = self.config.max_file_size {
if new_size > max {
return Err(FsError::FileSizeExceeded {
path: path.to_path_buf(),
size: new_size,
limit: max,
});
}
}
// Get old size; distinguish "missing" from "empty" so an existing
// zero-byte file isn't miscounted as a new node
let old_size = match self.inner.metadata(path) {
Ok(meta) => Some(meta.size),
Err(FsError::NotFound { .. }) => None,
Err(e) => return Err(e),
};
// If creating a new file, check node count and dir entries
let is_new_file = old_size.is_none();
if is_new_file {
self.check_node_limit(path)?;
if let Some(parent) = path.parent() {
self.check_dir_entries(parent)?;
}
}
let size_delta = new_size as i64 - old_size.unwrap_or(0) as i64;
if size_delta > 0 {
self.check_size_limit(path, size_delta as u64)?;
}
// Perform write
self.inner.write(path, data)?;
// Update usage
let mut usage = self.usage.write().unwrap();
usage.total_size = (usage.total_size as i64 + size_delta) as u64;
if is_new_file {
usage.file_count += 1;
}
Ok(())
}
fn open_write(&self, path: &Path) -> Result<Box<dyn Write + Send>, FsError> {
let path = path.to_path_buf();
// Get the underlying writer
let inner_writer = self.inner.open_write(&path)?;
// Wrap in a counting writer
Ok(Box::new(QuotaWriter {
inner: inner_writer,
path,
bytes_written: 0,
usage: Arc::clone(&self.usage),
max_file_size: self.config.max_file_size,
max_total_size: self.config.max_total_size,
}))
}
}
/// Wrapper that counts bytes and enforces quota on streaming writes
struct QuotaWriter {
inner: Box<dyn Write + Send>,
path: PathBuf,
bytes_written: u64,
usage: Arc<RwLock<QuotaUsage>>,
max_file_size: Option<u64>,
max_total_size: Option<u64>,
}
impl Write for QuotaWriter {
fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
let additional = buf.len() as u64;
// Check per-file limit
if let Some(max) = self.max_file_size {
if self.bytes_written + additional > max {
return Err(std::io::Error::new(
std::io::ErrorKind::Other,
"file size limit exceeded"
));
}
}
// Check total size limit
if let Some(max) = self.max_total_size {
let usage = self.usage.read().unwrap();
if usage.total_size + additional > max {
return Err(std::io::Error::new(
std::io::ErrorKind::Other,
"quota exceeded"
));
}
}
// Write to inner
let written = self.inner.write(buf)?;
// Update counters
self.bytes_written += written as u64;
let mut usage = self.usage.write().unwrap();
usage.total_size += written as u64;
Ok(written)
}
fn flush(&mut self) -> std::io::Result<()> {
self.inner.flush()
}
}
impl Drop for QuotaWriter {
fn drop(&mut self) {
// If we need to track "committed" vs "in-progress" writes,
// this is where we'd finalize the accounting
}
}
impl<B: FsDir + FsRead> FsDir for Quota<B> {
fn create_dir(&self, path: &Path) -> Result<(), FsError> {
// Check path depth
self.check_path_depth(path)?;
// Check node count
self.check_node_limit(path)?;
// Check parent directory entries
if let Some(parent) = path.parent() {
self.check_dir_entries(parent)?;
}
// Create directory
self.inner.create_dir(path)?;
// Update usage
let mut usage = self.usage.write().unwrap();
usage.dir_count += 1;
Ok(())
}
// create_dir_all, remove_dir, etc. delegate similarly
// ...
}
}
Challenges and Solutions
| Challenge | Solution |
|---|---|
| Initial usage unknown | Scan backend on construction |
| Streaming writes | QuotaWriter wrapper counts bytes |
| Concurrent writes | RwLock on usage counters |
| File replacement | Calculate delta (new_size - old_size) |
| New file detection | Check metadata() for NotFound before write |
| Accurate accounting | Update counters after successful operations |
| Node count limit | Check before creating files/directories |
| Dir entries limit | Count parent entries before creating child |
| Path depth limit | Count path components on create |
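The "calculate delta" rule can be made concrete with a small standalone sketch (`apply_delta` is an illustrative helper, not part of the AnyFS API):

```rust
/// Replacing a file changes total usage by (new_size - old_size).
/// Clamped at zero so a bad delta can never underflow the counter.
fn apply_delta(total_size: u64, old_size: u64, new_size: u64) -> u64 {
    (total_size as i64 + new_size as i64 - old_size as i64).max(0) as u64
}

fn main() {
    // 100 bytes tracked; a 30-byte file is overwritten with 50 bytes
    assert_eq!(apply_delta(100, 30, 50), 120);
    // Shrinking the file reduces usage
    assert_eq!(apply_delta(100, 30, 10), 80);
    println!("ok");
}
```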
Edge Cases
- Partial write failure: if `inner.write()` fails, don't update counters.
- Streaming write failure: `QuotaWriter` updates optimistically; on error, may need rollback.
- Rename: doesn't change total size.
- Copy: adds destination size.
- Append: adds appended bytes only.
Verdict: ✅ Implementable
One of the two most complex middleware (alongside Overlay), but built on well-understood patterns. The QuotaWriter wrapper is the key insight.
9. Overlay<Lower, Upper>
Complexity: High State: Two backends + whiteout tracking Dependencies: None
Overlay Semantics (Docker-style)
- Lower layer (base): Read-only source
- Upper layer: Writable overlay
- Whiteouts: Files named `.wh.<filename>` mark deletions
- Opaque directories: `.wh..wh..opq` hides the entire lower directory
Implementation
#![allow(unused)]
fn main() {
pub struct Overlay<Lower, Upper> {
lower: Lower,
upper: Upper,
}
impl<Lower, Upper> Overlay<Lower, Upper> {
const WHITEOUT_PREFIX: &'static str = ".wh.";
const OPAQUE_MARKER: &'static str = ".wh..wh..opq";
fn whiteout_path(path: &Path) -> PathBuf {
let parent = path.parent().unwrap_or(Path::new("/"));
let name = path.file_name().unwrap_or_default();
parent.join(format!("{}{}", Self::WHITEOUT_PREFIX, name.to_string_lossy()))
}
fn is_whiteout(name: &str) -> bool {
name.starts_with(Self::WHITEOUT_PREFIX)
}
fn original_name(whiteout_name: &str) -> &str {
&whiteout_name[Self::WHITEOUT_PREFIX.len()..]
}
}
impl<Lower: FsRead, Upper: FsRead> FsRead for Overlay<Lower, Upper> {
fn read(&self, path: &Path) -> Result<Vec<u8>, FsError> {
// Check if whiteout exists in upper
let whiteout = Self::whiteout_path(path);
if self.upper.exists(&whiteout).unwrap_or(false) {
return Err(FsError::NotFound { path: path.to_path_buf() });
}
// Try upper first
match self.upper.read(path) {
Ok(data) => return Ok(data),
Err(FsError::NotFound { .. }) => {}
Err(e) => return Err(e),
}
// Fall back to lower
self.lower.read(path)
}
fn exists(&self, path: &Path) -> Result<bool, FsError> {
// Check whiteout first
let whiteout = Self::whiteout_path(path);
if self.upper.exists(&whiteout).unwrap_or(false) {
return Ok(false); // Whited out = doesn't exist
}
// Check upper, then lower
if self.upper.exists(path).unwrap_or(false) {
return Ok(true);
}
self.lower.exists(path)
}
fn metadata(&self, path: &Path) -> Result<Metadata, FsError> {
// Check whiteout
let whiteout = Self::whiteout_path(path);
if self.upper.exists(&whiteout).unwrap_or(false) {
return Err(FsError::NotFound { path: path.to_path_buf() });
}
// Upper first, then lower
match self.upper.metadata(path) {
Ok(meta) => return Ok(meta),
Err(FsError::NotFound { .. }) => {}
Err(e) => return Err(e),
}
self.lower.metadata(path)
}
}
impl<Lower: FsRead, Upper: Fs> FsWrite for Overlay<Lower, Upper> {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError> {
// Remove whiteout if it exists
let whiteout = Self::whiteout_path(path);
let _ = self.upper.remove_file(&whiteout); // Ignore if doesn't exist
// Write to upper
self.upper.write(path, data)
}
fn remove_file(&self, path: &Path) -> Result<(), FsError> {
// Try to remove from upper
let _ = self.upper.remove_file(path);
// If file exists in lower, create whiteout
if self.lower.exists(path).unwrap_or(false) {
let whiteout = Self::whiteout_path(path);
self.upper.write(&whiteout, b"")?; // Create whiteout marker
}
Ok(())
}
fn rename(&self, from: &Path, to: &Path) -> Result<(), FsError> {
// Copy-on-write: read from overlay, write to upper, whiteout original
let data = self.read(from)?;
self.write(to, &data)?;
self.remove_file(from)?;
Ok(())
}
}
impl<Lower: FsRead + FsDir, Upper: Fs> FsDir for Overlay<Lower, Upper> {
fn read_dir(&self, path: &Path) -> Result<ReadDirIter, FsError> {
// Check for opaque marker
let opaque_marker = path.join(Self::OPAQUE_MARKER);
let is_opaque = self.upper.exists(&opaque_marker).unwrap_or(false);
// Get entries from upper
let mut entries: HashMap<String, DirEntry> = HashMap::new();
let mut whiteouts: HashSet<String> = HashSet::new();
if let Ok(upper_iter) = self.upper.read_dir(path) {
for entry in upper_iter {
let entry = entry?;
let name = entry.name.clone();
if Self::is_whiteout(&name) {
whiteouts.insert(Self::original_name(&name).to_string());
} else if name != Self::OPAQUE_MARKER {
entries.insert(name, entry);
}
}
}
// Merge lower entries (unless opaque)
if !is_opaque {
if let Ok(lower_iter) = self.lower.read_dir(path) {
for entry in lower_iter {
let entry = entry?;
let name = entry.name.clone();
// Skip if already in upper or whited out
if !entries.contains_key(&name) && !whiteouts.contains(&name) {
entries.insert(name, entry);
}
}
}
}
// Convert to iterator
let entries_vec: Vec<_> = entries.into_values().map(Ok).collect();
Ok(ReadDirIter::from_vec(entries_vec))
}
fn create_dir(&self, path: &Path) -> Result<(), FsError> {
// Remove whiteout if exists
let whiteout = Self::whiteout_path(path);
let _ = self.upper.remove_file(&whiteout);
self.upper.create_dir(path)
}
fn remove_dir(&self, path: &Path) -> Result<(), FsError> {
// Try to remove from upper
let _ = self.upper.remove_dir(path);
// If exists in lower, create whiteout
if self.lower.exists(path).unwrap_or(false) {
let whiteout = Self::whiteout_path(path);
self.upper.write(&whiteout, b"")?;
}
Ok(())
}
}
}
Key Concepts
| Concept | Description |
|---|---|
| Whiteout | .wh.<name> file in upper marks deletion of <name> from lower |
| Opaque | .wh..wh..opq file in a directory hides all lower entries |
| Copy-on-write | First write copies from lower to upper, then modifies |
| Merge | read_dir() combines both layers, respecting whiteouts |
Challenges
- Whiteout storage: Whiteouts are regular files - backend doesn’t need special support.
- Directory listing merge: Must be memory-buffered to remove duplicates and whiteouts.
- Rename: Implemented as copy + delete (standard CoW pattern).
- Symlinks in lower: Need to handle carefully - symlink targets might point to lower layer.
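The merge rule (upper wins, whiteouts hide, remaining lower entries show through) can be demonstrated standalone, with plain string lists standing in for real directory entries:

```rust
use std::collections::HashSet;

const WHITEOUT_PREFIX: &str = ".wh.";

/// Merge a directory listing: upper entries win, `.wh.<name>` markers hide
/// lower entries, and remaining lower entries show through.
fn merge_listing(upper: &[&str], lower: &[&str]) -> Vec<String> {
    let mut whiteouts = HashSet::new();
    let mut seen = HashSet::new();
    let mut entries = Vec::new();
    for name in upper {
        if let Some(original) = name.strip_prefix(WHITEOUT_PREFIX) {
            whiteouts.insert(original);
        } else {
            seen.insert(*name);
            entries.push(name.to_string());
        }
    }
    for name in lower {
        if !seen.contains(name) && !whiteouts.contains(name) {
            entries.push(name.to_string());
        }
    }
    entries.sort();
    entries
}

fn main() {
    let merged = merge_listing(
        &["patched.conf", ".wh.removed.log"],         // upper layer
        &["removed.log", "base.txt", "patched.conf"], // lower layer
    );
    assert_eq!(merged, ["base.txt", "patched.conf"]);
    println!("ok");
}
```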
ReadDirIter Consideration
For Overlay, we need to buffer the merged directory listing. This means ReadDirIter must support construction from a Vec:
#![allow(unused)]
fn main() {
impl ReadDirIter {
pub fn from_vec(entries: Vec<Result<DirEntry, FsError>>) -> Self {
Self(Box::new(entries.into_iter()))
}
}
}
Verdict: ✅ Implementable
The most complex middleware, but uses well-established patterns from OverlayFS. Key insight: whiteouts are just marker files, no special backend support needed.
Summary
| Middleware | Complexity | Key Implementation Insight |
|---|---|---|
| ReadOnly | Trivial | Block all writes |
| Restrictions | Simple | Flag checks |
| Tracing | Simple | Wrap operations in spans |
| RateLimit | Moderate | Atomic counter + time window |
| DryRun | Moderate | Log writes, return Ok without executing |
| PathFilter | Moderate | Glob matching + filtered ReadDirIter |
| Cache | Moderate | LRU cache, invalidate on writes |
| Quota | High | Usage counters + QuotaWriter wrapper |
| Overlay | High | Whiteout markers + merged directory listing |
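The "atomic counter + time window" insight behind RateLimit can be sketched as a fixed-window limiter (illustrative only; `FixedWindow` is not an AnyFS type):

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;
use std::time::{Duration, Instant};

struct FixedWindow {
    window: Duration,
    max_ops: u64,
    count: AtomicU64,
    window_start: Mutex<Instant>,
}

impl FixedWindow {
    fn new(window: Duration, max_ops: u64) -> Self {
        Self {
            window,
            max_ops,
            count: AtomicU64::new(0),
            window_start: Mutex::new(Instant::now()),
        }
    }

    /// Returns true if the operation is allowed in the current window.
    fn try_acquire(&self) -> bool {
        let mut start = self.window_start.lock().unwrap();
        if start.elapsed() >= self.window {
            // Window rolled over: reset the counter
            *start = Instant::now();
            self.count.store(0, Ordering::Relaxed);
        }
        // fetch_add returns the previous value
        self.count.fetch_add(1, Ordering::Relaxed) < self.max_ops
    }
}

fn main() {
    let limiter = FixedWindow::new(Duration::from_secs(60), 2);
    assert!(limiter.try_acquire());
    assert!(limiter.try_acquire());
    assert!(!limiter.try_acquire()); // third op in the window is rejected
    println!("ok");
}
```

A real middleware would return `FsError::RateLimitExceeded` instead of `false` and wrap every `Fs` method with this check.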
Required Framework Features
These middleware implementations assume:
- `ReadDirIter` is a trait object wrapper - allows filtering and composition
- All methods use `&self` - interior mutability for state
- `FsError` has all necessary variants - ReadOnly, RateLimitExceeded, QuotaExceeded, AccessDenied, FeatureNotEnabled
All of these are already part of our design. All middleware are implementable.
Appendix: Layer Trait Implementation
Each middleware provides a corresponding Layer type for composition:
#![allow(unused)]
fn main() {
// Example for Quota
pub struct QuotaLayer {
config: QuotaConfig,
}
impl QuotaLayer {
pub fn builder() -> QuotaLayerBuilder<Unconfigured> {
QuotaLayerBuilder::new()
}
}
impl<B: Fs> Layer<B> for QuotaLayer {
type Backend = Quota<B>;
fn layer(self, backend: B) -> Self::Backend {
// Layer::layer is infallible; a failed initial scan surfaces at composition time
Quota::with_config(backend, self.config)
.expect("quota initialization failed")
}
}
// Usage:
let fs = MemoryBackend::new()
.layer(QuotaLayer::builder()
.max_total_size(100_000_000)
.build());
}
Lessons from Similar Projects
Analysis of issues from vfs and agentfs to inform AnyFS design.
This chapter documents problems encountered by similar projects and how AnyFS addresses them. These lessons are incorporated into our Implementation Plan and Backend Guide.
Summary
| Priority | Issue | AnyFS Response |
|---|---|---|
| 1 | Panics instead of errors | No-panic policy, always return Result |
| 2 | Thread safety problems | Concurrent stress tests required |
| 3 | Inconsistent path handling | Normalize in one place, test edge cases |
| 4 | Poor error ergonomics | FsError with context fields |
| 5 | Missing documentation | Performance & thread safety docs required |
| 6 | Platform issues | Cross-platform CI pipeline |
1. Thread Safety Issues
What Happened
Root cause: Insufficient synchronization in concurrent access patterns.
AnyFS Response
- Test concurrent operations explicitly - stress test with multiple threads
- Document thread safety guarantees per backend
- `Fs: Send` bound is intentional
- `MemoryBackend` uses `Arc<RwLock<...>>` for interior mutability
Required tests:
#![allow(unused)]
fn main() {
#[test]
fn test_concurrent_create_dir_all() {
// Backends take `&self` (ADR-023: interior mutability), so threads
// can share a single instance without an external lock
let backend = Arc::new(create_backend());
let handles: Vec<_> = (0..10).map(|_| {
let backend = Arc::clone(&backend);
std::thread::spawn(move || {
let _ = backend.create_dir_all(std::path::Path::new("/a/b/c/d"));
})
})
}).collect();
for handle in handles {
handle.join().unwrap();
}
}
}
2. Panics Instead of Errors
What Happened
| Project | Issue | Problem |
|---|---|---|
| vfs | #8 | AltrootFS panics when file doesn’t exist |
| vfs | #23 | Unhandled edge cases cause panics |
| vfs | #68 | MemoryFS panics in WebAssembly |
Root cause: Using .unwrap() or .expect() on fallible operations.
AnyFS Response
No-panic policy: Never use .unwrap() or .expect() in library code.
#![allow(unused)]
fn main() {
// BAD - will panic
let entry = self.entries.get(&path).unwrap();
// GOOD - returns error
let entry = self.entries.get(&path)
.ok_or_else(|| FsError::NotFound { path: path.to_path_buf() })?;
}
Edge cases that must return errors (not panic):
- File doesn’t exist
- Directory doesn’t exist
- Path is empty string
- Invalid UTF-8 in path
- Parent directory missing
- Type mismatch (file vs directory)
- Concurrent access conflicts
3. Path Handling Inconsistencies
What Happened
| Project | Issue | Problem |
|---|---|---|
| vfs | #24 | Inconsistent path definition across backends |
| vfs | #42 | Path join doesn’t behave Unix-like |
| vfs | #22 | Non-UTF-8 path support questions |
Root cause: Each backend implemented path handling differently.
AnyFS Response
- Normalize paths in ONE place (FileStorage resolver for virtual backends; `SelfResolving` backends delegate to the OS)
- Consistent semantics: always absolute, always `/` separator
- Use `&Path` in core traits for object safety; provide `impl AsRef<Path>` at the ergonomic layer (FileStorage/FsExt)
Required conformance tests:
| Input | Expected Output |
|---|---|
| `/foo/../bar` | `/bar` |
| `/foo/./bar` | `/foo/bar` |
| `//double//slash` | `/double/slash` |
| `/` | `/` |
| `` (empty) | Error |
| `/foo/bar/` | `/foo/bar` |
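A self-contained normalizer satisfying the table above (a lexical sketch; the real FileStorage resolver is additionally symlink-aware):

```rust
use std::path::{Component, Path, PathBuf};

/// Lexically normalize an absolute virtual path: collapse `.` and `..`,
/// deduplicate separators, strip trailing slashes. `..` at the root stays at `/`.
fn normalize(path: &str) -> Result<PathBuf, String> {
    if path.is_empty() {
        return Err("empty path".into());
    }
    let mut out = PathBuf::from("/");
    for comp in Path::new(path).components() {
        match comp {
            Component::RootDir | Component::CurDir => {}
            // pop() at "/" is a no-op, so "/.." resolves to "/"
            Component::ParentDir => { out.pop(); }
            Component::Normal(name) => out.push(name),
            Component::Prefix(_) => return Err("unsupported prefix".into()),
        }
    }
    Ok(out)
}

fn main() {
    assert_eq!(normalize("/foo/../bar").unwrap(), PathBuf::from("/bar"));
    assert_eq!(normalize("/foo/./bar").unwrap(), PathBuf::from("/foo/bar"));
    assert_eq!(normalize("//double//slash").unwrap(), PathBuf::from("/double/slash"));
    assert_eq!(normalize("/").unwrap(), PathBuf::from("/"));
    assert!(normalize("").is_err());
    assert_eq!(normalize("/foo/bar/").unwrap(), PathBuf::from("/foo/bar"));
    println!("ok");
}
```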
4. Static Lifetime Requirements
What Happened
| Project | Issue | Problem |
|---|---|---|
| vfs | #66 | Why does filesystem require 'static? |
Root cause: Design decision that confused users and limited flexibility.
AnyFS Response
- Avoid `'static` bounds unless necessary
- Our design: `Fs: Send` (not `'static`)
- Document why bounds exist when needed
5. Missing Symlink Support
What Happened
| Project | Issue | Problem |
|---|---|---|
| vfs | #81 | Symlink support missing entirely |
Root cause: Symlinks are complex and were deferred indefinitely.
AnyFS Response
- Symlinks supported via `FsLink` trait - backends that implement `FsLink` support symlinks
- Compile-time capability - no `FsLink` impl = no symlinks (won't compile)
- Bound resolution depth (default: 40 hops)
- `strict-path` prevents symlink escapes in `VRootFsBackend`
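The bounded-resolution rule can be sketched with a plain map standing in for `read_link()` (illustrative only; none of these names are AnyFS APIs):

```rust
use std::collections::HashMap;

/// Follow symlinks up to `max_hops` times; error out on longer chains,
/// which also catches cycles.
fn resolve(links: &HashMap<&str, &str>, start: &str, max_hops: u32) -> Result<String, String> {
    let mut current = start.to_string();
    for _ in 0..max_hops {
        match links.get(current.as_str()) {
            Some(target) => current = target.to_string(),
            None => return Ok(current), // not a link: fully resolved
        }
    }
    Err(format!("too many levels of symbolic links: {start}"))
}

fn main() {
    let links = HashMap::from([("/a", "/b"), ("/b", "/c")]);
    assert_eq!(resolve(&links, "/a", 40).unwrap(), "/c");

    let cycle = HashMap::from([("/x", "/y"), ("/y", "/x")]);
    assert!(resolve(&cycle, "/x", 40).is_err());
    println!("ok");
}
```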
6. Error Type Ergonomics
What Happened
| Project | Issue | Problem |
|---|---|---|
| vfs | #33 | Error type hard to match programmatically |
Root cause: Error enum wasn’t designed for pattern matching.
AnyFS Response
FsError includes context and is easy to match:
#![allow(unused)]
fn main() {
#[derive(Debug, thiserror::Error)]
pub enum FsError {
#[error("not found: {path}")]
NotFound { path: PathBuf },
#[error("{operation}: already exists: {path}")]
AlreadyExists { path: PathBuf, operation: &'static str },
#[error("quota exceeded: limit {limit}, requested {requested}, usage {usage}")]
QuotaExceeded { limit: u64, requested: u64, usage: u64 },
#[error("feature not enabled: {feature} ({operation})")]
FeatureNotEnabled { feature: &'static str, operation: &'static str },
#[error("permission denied: {path} ({operation})")]
PermissionDenied { path: PathBuf, operation: &'static str },
// ...
}
}
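Matching with context then reads naturally. A trimmed-down, dependency-free stand-in for `FsError` (the real enum derives `thiserror::Error` and carries more variants):

```rust
use std::path::PathBuf;

#[non_exhaustive]
#[derive(Debug)]
enum FsError {
    NotFound { path: PathBuf },
    QuotaExceeded { limit: u64, requested: u64, usage: u64 },
}

/// Callers can match on variants and pull out the context fields directly.
fn describe(err: &FsError) -> String {
    match err {
        FsError::NotFound { path } => format!("not found: {}", path.display()),
        FsError::QuotaExceeded { limit, requested, usage } => {
            format!("quota exceeded: limit {limit}, requested {requested}, usage {usage}")
        }
    }
}

fn main() {
    let err = FsError::NotFound { path: PathBuf::from("/missing.txt") };
    assert_eq!(describe(&err), "not found: /missing.txt");
    println!("ok");
}
```

Downstream crates must add a `_` arm because of `#[non_exhaustive]`; only the defining crate may match exhaustively.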
7. Seek + Write Operations
What Happened
| Project | Issue | Problem |
|---|---|---|
| vfs | #35 | Missing file positioning features |
Root cause: Initial API was too simple.
AnyFS Response
- Streaming I/O: `open_read`/`open_write` return `Box<dyn Read/Write + Send>`
- Seek support varies by backend - document which support it
- Consider future: `open_read_seek` variant or capability query
8. Read-Only Filesystem Request
What Happened
| Project | Issue | Problem |
|---|---|---|
| vfs | #58 | Request for immutable filesystem |
Root cause: No built-in way to enforce read-only access.
AnyFS Response
Already solved: ReadOnly<B> middleware blocks all writes.
#![allow(unused)]
fn main() {
use anyfs::{ReadOnly, FileStorage};
use anyfs_sqlite::SqliteBackend; // Ecosystem crate
let readonly_fs = FileStorage::new(
ReadOnly::new(SqliteBackend::open("archive.db")?)
);
// All write operations return FsError::ReadOnly
}
This validates our middleware approach.
9. Performance Issues
What Happened
Root cause: SQLite operations not optimized, FUSE overhead.
AnyFS Response
- Batch operations where possible in `SqliteBackend`
- Use transactions for multi-file operations
- Document performance characteristics per backend
- Keep mounting optional - core AnyFS stays a library; mount concerns are behind feature flags (`fuse`, `winfsp`)
Documentation requirement:
#![allow(unused)]
fn main() {
/// # Performance Characteristics
///
/// | Operation | Complexity | Notes |
/// |-----------|------------|-------|
/// | `read` | O(1) | Single DB query |
/// | `write` | O(n) | n = data size |
/// | `remove_dir_all` | O(n) | n = descendants |
pub struct SqliteBackend { ... }
}
10. Signal Handling / Shutdown
What Happened
| Project | Issue | Problem |
|---|---|---|
| agentfs | #129 | Doesn’t shutdown on SIGTERM |
Root cause: FUSE mount cleanup issues.
AnyFS Response
- Core stays a library - daemon/mount shutdown concerns are behind feature flags
- Ensure `Drop` implementations clean up properly
- `SqliteBackend` flushes on drop
#![allow(unused)]
fn main() {
impl Drop for SqliteBackend {
fn drop(&mut self) {
if let Err(e) = self.sync() {
eprintln!("Warning: failed to sync on drop: {}", e);
}
}
}
}
11. Platform Compatibility
What Happened
Root cause: Platform-specific FUSE variants.
AnyFS Response
- We isolate this - core traits stay pure; FUSE lives behind feature flags (`fuse`, `winfsp`) in the `anyfs` crate
- Cross-platform by design - Memory and SQLite work everywhere
- `VRootFsBackend` uses `strict-path` which handles Windows/Unix
CI requirement:
strategy:
matrix:
os: [ubuntu-latest, windows-latest, macos-latest]
12. Multiple Sessions / Concurrent Access
What Happened
| Project | Issue | Problem |
|---|---|---|
| agentfs | #126 | Can’t have multiple sessions on same filesystem |
Root cause: Locking/concurrency design.
AnyFS Response
SqliteBackenduses WAL mode for concurrent readers- Document concurrency model per backend
MemoryBackendusesArc<RwLock<...>>for sharing
Issues We Already Avoid
Our design decisions already prevent these problems:
| Problem in Others | AnyFS Solution |
|---|---|
| No middleware pattern | Tower-style composable middleware |
| No quota enforcement | Quota<B> middleware |
| No read-only mode | ReadOnly<B> middleware |
| Symlink complexity | FsLink trait (compile-time) |
| Path escape via symlinks | strict-path canonicalization |
| FUSE complexity | Isolated behind feature flags |
| SQLite-only | Multiple backends |
| Monolithic features | Composable middleware |
References
- rust-vfs Issues
- agentfs Issues
- Implementation Plan - incorporates these lessons
- Backend Guide - implementation requirements
Open Questions & Future Considerations
Status: Resolved (future considerations tracked) Last Updated: 2025-12-28
This document captures previously open questions and design considerations. Unless explicitly marked as future, the items below are resolved.
Note: Final decisions live in the Architecture Decision Records.
Symlink Security: Following vs Creating
Status: Resolved
Decision
- Symlink support is a backend capability (via `FsLink`). `FileStorage` resolves paths via a pluggable `PathResolver` for non-`SelfResolving` backends.
- The default `IterativeResolver` follows symlinks when `FsLink` is available. Custom resolvers can implement different behaviors.
- `SelfResolving` backends delegate to the OS. `strict-path` prevents escapes.
Implications
- If you need symlink-free semantics, use a backend that does not implement `FsLink`; or, with a backend that does implement `FsLink`, ensure no preexisting symlinks exist in the data (the trait provides creation capability, but you can choose not to call those methods).
- `Restrictions` middleware controls permission-related operations only, not symlink creation (which is a trait-level capability).
Why
- Virtual backends have no host filesystem to escape to; symlink resolution stays inside the virtual structure.
- OS-backed backends cannot reliably disable symlink following without TOCTOU risks or platform-specific hacks.
Virtual vs Real Backends: Path Resolution
Status: Resolved (see also ADR-033 for PathResolver)
Question: Should path resolution logic be different for virtual backends (memory, SQLite) vs filesystem-based backends (StdFsBackend, VRootFsBackend)?
Resolution: FileStorage handles path resolution via pluggable PathResolver trait for non-SelfResolving backends. SelfResolving backends delegate to the OS, so FileStorage does not pre-resolve paths for them.
| Backend Type | Path Resolution | Symlink Handling |
|---|---|---|
| MemoryBackend | PathResolver (default: IterativeResolver) | Resolver follows symlinks |
| SqliteBackend | PathResolver (default: IterativeResolver) | Resolver follows symlinks |
| VRootFsBackend | OS (implements SelfResolving) | OS follows symlinks (strict-path prevents escapes) |
Key design decisions:
- Backends that wrap a real filesystem implement the `SelfResolving` marker trait to tell `FileStorage` to skip resolution:
#![allow(unused)]
fn main() {
impl SelfResolving for VRootFsBackend {}
}
- Path resolution is pluggable via the `PathResolver` trait (ADR-033). Built-in resolvers include:
  - `IterativeResolver` - default symlink-aware resolution (when backend implements `FsLink`)
  - `NoOpResolver` - for `SelfResolving` backends
  - `CachingResolver` - LRU cache wrapper around another resolver
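A minimal LRU sketch for the CachingResolver idea (linear-scan for brevity; a production cache would pair a HashMap with a linked list for O(1) access; nothing here is AnyFS API):

```rust
use std::collections::VecDeque;

/// Capacity-bounded LRU: most recently used entries live at the back.
struct Lru {
    cap: usize,
    items: VecDeque<(String, String)>, // (raw path, resolved path)
}

impl Lru {
    fn new(cap: usize) -> Self {
        Self { cap, items: VecDeque::new() }
    }

    fn get(&mut self, key: &str) -> Option<String> {
        let pos = self.items.iter().position(|(k, _)| k == key)?;
        let entry = self.items.remove(pos)?;
        let val = entry.1.clone();
        self.items.push_back(entry); // refresh recency
        Some(val)
    }

    fn put(&mut self, key: String, val: String) {
        if self.items.len() == self.cap {
            self.items.pop_front(); // evict least recently used
        }
        self.items.push_back((key, val));
    }
}

fn main() {
    let mut cache = Lru::new(2);
    cache.put("/a/link".into(), "/a/target".into());
    cache.put("/b".into(), "/b".into());
    assert_eq!(cache.get("/a/link"), Some("/a/target".to_string())); // refreshes it
    cache.put("/c".into(), "/c".into()); // evicts /b, the least recently used
    assert_eq!(cache.get("/b"), None);
    assert_eq!(cache.get("/a/link"), Some("/a/target".to_string()));
    println!("ok");
}
```

A resolver wrapper would consult the cache before delegating, and invalidate entries on writes that change link structure.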
Compression and Encryption
Question: Does the current design allow backends to compress/decompress or encrypt/decrypt files transparently?
Answer: Yes. The backend receives the data and stores it however it wants. A backend could:
- Compress data before writing to SQLite
- Encrypt blobs with a user-provided key
- Use a remote object store with encryption at rest
This is an implementation detail of the backend, not visible to the FileStorage API.
Hooks and Callbacks
Question: Should AnyFS support hooks or callbacks for file operations (e.g., audit logging, validation)?
Considerations:
- AgentFS (see comparison below) provides audit logging as a core feature
- Hooks add complexity but enable powerful use cases
- Could be implemented as a middleware pattern around FileStorage
Resolution: Implemented via Tracing middleware. Users can also wrap FileStorage or backends for custom hooks.
AgentFS Comparison
Note: There are two projects named “AgentFS”:
| Project | Description |
|---|---|
| tursodatabase/agentfs | Full AI agent runtime (Turso/libSQL) |
| cryptopatrick/agentfs | Related to AgentDB abstraction layer |
This section focuses on Turso’s AgentFS, which has a published spec.
What AgentFS Provides
AgentFS is an agent runtime, not just a filesystem. It provides three integrated subsystems:
- Virtual Filesystem - POSIX-like, inode-based, chunked storage in SQLite
- Key-Value Store - Agent state and context storage
- Tool Call Audit Trail - Records all tool invocations for debugging/compliance
AnyFS vs AgentFS: Different Abstractions
| Concern | AnyFS | AgentFS |
|---|---|---|
| Scope | Filesystem abstraction | Agent runtime |
| Filesystem | Full | Full |
| Key-Value store | Not our domain | Included |
| Tool auditing | Tracing middleware | Built-in |
| Backends | Memory, SQLite, VRootFs, custom | SQLite only (spec) |
| Middleware | Composable layers | Monolithic |
Relationship Options
AnyFS could be used BY AgentFS:
- AgentFS could implement its filesystem portion using the `Fs` trait
- Our middleware (Quota, PathFilter, etc.) would work with their system
AgentFS-compatible backend for AnyFS:
- Someone could implement `Fs` using AgentFS's SQLite schema
- Would enable interop with AgentFS tooling
What we should NOT do:
- Add KV store to `Fs` (different abstraction, scope creep)
- Add tool call auditing to core trait (that's what `Tracing` middleware is for)
When to Use Which
| Use Case | Recommendation |
|---|---|
| Need just filesystem operations | AnyFS |
| Need composable middleware (quota, sandboxing) | AnyFS |
| Need full agent runtime (FS + KV + auditing) | AgentFS |
| Need multiple backend types (memory, real FS) | AnyFS |
| Need AgentFS-compatible SQLite format | AgentFS or custom AnyFS backend |
Takeaway
AnyFS and AgentFS solve different problems at different layers:
- AnyFS = filesystem abstraction with composable middleware
- AgentFS = complete agent runtime with integrated storage
They can complement each other rather than compete.
VFS Crate Comparison
The vfs crate provides virtual filesystem abstractions with:
- PhysicalFS: Host filesystem access
- MemoryFS: In-memory storage
- AltrootFS: Rooted filesystem (similar to our VRootFsBackend)
- OverlayFS: Layered filesystem
- EmbeddedFS: Compile resources into binary
Similarities with AnyFS:
- Trait-based abstraction over storage
- Memory and physical filesystem backends
Differences:
- VFS doesn’t have SQLite backend
- VFS doesn’t have policy/quota layer
- AnyFS focuses on isolation and limits
Why not use VFS? VFS is a good library, but AnyFS’s design goals differ:
- We want SQLite as a first-class backend
- We need quota/limit enforcement
- We want feature whitelisting (least privilege)
FUSE Mount Support
Status: Designed - Part of anyfs crate (feature flags: fuse, winfsp)
What is FUSE? FUSE (Filesystem in Userspace) allows implementing filesystems in userspace rather than kernel code. It enables:
- Mounting any backend as a real filesystem
- Using standard Unix tools (ls, cat, etc.) on AnyFS containers
- Integration with existing workflows
Resolution: Part of anyfs crate with feature flags:
- Linux: FUSE (native) - `fuse` feature
- macOS: macFUSE - `fuse` feature
- Windows: WinFsp - `winfsp` feature
See Cross-Platform Mounting for full details.
Type-System Protection for Cross-Container Operations
Status: Resolved - User-defined wrapper types
Question: Should we use the type system to prevent accidentally mixing data between containers?
Resolution: Users who need type-safe domain separation can create wrapper types:
#![allow(unused)]
fn main() {
use anyfs_sqlite::SqliteBackend; // Ecosystem crate
// User-defined wrapper types provide compile-time safety
struct SandboxFs(FileStorage<MemoryBackend>);
struct UserDataFs(FileStorage<SqliteBackend>);
let sandbox = SandboxFs(FileStorage::new(MemoryBackend::new()));
let userdata = UserDataFs(FileStorage::new(SqliteBackend::open("data.db")?));
fn process_sandbox(fs: &SandboxFs) { /* only accepts SandboxFs */ }
process_sandbox(&sandbox); // OK
process_sandbox(&userdata); // Compile error - different type!
}
This approach avoids generic parameter complexity while still enabling compile-time safety when needed. See FileStorage for details.
Naming Considerations
Based on review feedback, the following naming concerns were raised:
| Current Name | Concern | Alternatives Considered |
|---|---|---|
anyfs-traits | “traits” is vague | anyfs-backend (adopted) |
anyfs-container | Could imply Docker | Merged into anyfs (adopted) |
anyfs | Sounds like Hebrew “ani efes” (I am zero) | anyfs retained for simplicity |
Decision: Renamed anyfs-traits to anyfs-backend. Merged anyfs-container into anyfs.
POSIX Behavior
Question: How POSIX-compatible should AnyFS be?
Answer: AnyFS is not a POSIX emulator. We use std::fs-like naming and semantics for familiarity, but we don’t aim for full POSIX compliance. Specific differences:
- Symlink-aware path resolution (FileStorage walks the virtual structure using `metadata()` and `read_link()`)
- No file descriptors or open file handles in the basic API
- Simplified permissions model
- No device files, FIFOs, or sockets
Async Support
Question: Should Fs traits be async?
Decision: Sync-first, async-ready (see ADR-010).
Rationale:
- Built-in backends are naturally synchronous (std::fs, memory)
- Ecosystem backends are also sync (e.g., rusqlite is sync)
- No runtime dependency (tokio/async-std) required
- Rust 1.75+ has native async traits, so adding later is low-cost
Async-ready design:
- Traits require `Send` - compatible with async executors
- Return types are `Result<T, FsError>` - works with async
- No hidden blocking state
- Methods are stateless per-call
Future path: When needed (e.g., S3/network backends), add parallel AsyncFs trait:
- Separate trait, not replacing `Fs`
- Blanket impl possible via `spawn_blocking`
- No breaking changes to existing sync API
Summary
| Topic | Decision |
|---|---|
| Symlink security | Backend-defined (FsLink); VRootFsBackend uses strict-path for containment |
| Path resolution | FileStorage (symlink-aware); VRootFs = OS via SelfResolving |
| Compression/encryption | Backend responsibility |
| Hooks/callbacks | Tracing middleware |
| FUSE mount | Part of anyfs crate (fuse, winfsp feature flags) |
| Type-system protection | User-defined wrapper types (e.g., struct SandboxFs(FileStorage<B>)) |
| POSIX compatibility | Not a goal |
| truncate | Added to FsWrite |
| sync / fsync | Added to FsSync |
| Async support | Sync-first, async-ready (ADR-010) |
| Layer trait | Tower-style composition (ADR-011) |
| Logging | Tracing with tracing ecosystem (ADR-012) |
| Extension methods | FsExt (ADR-013) |
| Zero-copy bytes | Optional bytes feature (ADR-014) |
| Error context | Contextual FsError (ADR-015) |
| BackendStack builder | Fluent API via .layer() |
| Path-based access control | PathFilter middleware (ADR-016) |
| Read-only mode | ReadOnly middleware (ADR-017) |
| Rate limiting | RateLimit middleware (ADR-018) |
| Dry-run testing | DryRun middleware (ADR-019) |
| Read caching | Cache middleware (ADR-020) |
| Union filesystem | Overlay middleware (ADR-021) |
Design Review: Rust Community Alignment
This document critically reviews AnyFS design decisions against Rust community expectations and best practices. The goal is to identify potential friction points before implementation.
Summary
| Category | Issues Found | Status |
|---|---|---|
| Critical (must fix) | 2 | ✅ Fixed |
| Should fix | 4 | 🟡 In progress |
| Document clearly | 3 | 🟢 Ongoing |
| Non-issues | 5 | ✅ Verified |
✅ Critical Issues (Fixed)
1. FsError Missing #[non_exhaustive] — FIXED
Problem: Our FsError enum doesn’t have #[non_exhaustive]. This is a semver hazard.
Status: ✅ Fixed in design-overview.md. FsError now has #[non_exhaustive], thiserror::Error derive, and From<std::io::Error> impl.
#![allow(unused)]
fn main() {
// Current (problematic)
pub enum FsError {
NotFound { path: PathBuf },
AlreadyExists { path: PathBuf, operation: &'static str },
// ...
}
// If we add a variant in 1.1:
pub enum FsError {
NotFound { path: PathBuf },
AlreadyExists { path: PathBuf, operation: &'static str },
TooManySymlinks { path: PathBuf }, // NEW - breaks exhaustive matches!
}
}
Impact: Users with exhaustive matches will get compile errors on minor version bumps.
Fix:
#![allow(unused)]
fn main() {
#[non_exhaustive]
#[derive(Debug, thiserror::Error)]
pub enum FsError {
#[error("not found: {path}")]
NotFound { path: PathBuf },
// ...
}
}
Also needed:
- impl std::error::Error for FsError
- impl From<std::io::Error> for FsError
- Consider #[non_exhaustive] on struct variants too
2. Documentation Shows &mut self Despite ADR-023 — FIXED
Problem: Several code examples still show &mut self or &mut impl Fs.
Status: ✅ Fixed. All examples in design-overview.md and files-container.md now use &self.
#![allow(unused)]
fn main() {
// In design-overview.md line 346:
fn with_symlinks(fs: &mut (impl Fs + FsLink)) { // WRONG
fs.write("/target.txt", b"content")?;
fs.symlink("/target.txt", "/link.txt")?;
}
// Should be:
fn with_symlinks(fs: &(impl Fs + FsLink)) { // Correct
fs.write("/target.txt", b"content")?;
fs.symlink("/target.txt", "/link.txt")?;
}
}
Impact: Contradicts ADR-023 (interior mutability). Confuses implementers.
Fix: Audit all examples and ensure &self everywhere.
🟡 Should Fix
3. Sync-Only Design May Limit Adoption
Problem: No async support. Many modern Rust projects are async-first.
Current stance (ADR-024): Sync now, async later via parallel traits.
Community expectation: Projects like tokio, async-std are dominant. Users may expect:
#![allow(unused)]
fn main() {
async fn read(&self, path: &Path) -> Result<Vec<u8>, FsError>;
}
Mitigation:
- Document clearly: “Sync-first by design, async planned”
- Ensure Send + Sync bounds enable a spawn_blocking wrapper
- Consider shipping an anyfs-async adapter crate early
Recommendation: Acceptable. Async support is a future consideration.
4. Interior Mutability May Surprise Users
Problem: &self for write operations is unusual in Rust.
#![allow(unused)]
fn main() {
// Our design
pub trait FsWrite: Send + Sync {
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
}
// What users might expect (std::io::Write pattern)
pub trait FsWrite {
fn write(&mut self, path: &Path, data: &[u8]) -> Result<(), FsError>;
}
}
Why we chose this (ADR-023):
- Filesystems are shared resources
- Enables concurrent access
- Matches how std::fs::write() works (takes a path, not a mutable handle)
Potential friction:
- Users may try to use &mut self patterns
- May conflict with borrow-checker mental models
Mitigation:
- Document prominently with rationale
- Show examples of concurrent usage
- Explain: “Like std::fs, not like std::io::Write”
Recommendation: Keep the design, but add prominent documentation.
5. Layer Trait Doesn’t Match Tower Exactly
Problem: Tower’s Layer trait has a different signature:
#![allow(unused)]
fn main() {
// Tower's Layer
pub trait Layer<S> {
type Service;
fn layer(&self, inner: S) -> Self::Service;
}
// Our Layer (appears to be)
pub trait Layer<B: Fs> {
type Backend: Fs;
fn layer(self, backend: B) -> Self::Backend;
}
}
Differences:
- Tower uses &self, we use self (consumes the layer)
- Tower calls it Service, we call it Backend
- Tower doesn’t require bounds on S
Impact: Users familiar with Tower may be confused.
Options:
- Match Tower exactly - maximum familiarity
- Keep our design - self consumption is arguably cleaner for our use case
- Document differences - explain why we diverge
Recommendation: Document the differences. Our self consumption prevents accidental reuse of configured layers, which is appropriate for our use case.
6. No #[must_use] on Results
Problem: Functions returning Result should have #[must_use] to catch ignored errors.
#![allow(unused)]
fn main() {
// Current
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
// Better
#[must_use]
fn write(&self, path: &Path, data: &[u8]) -> Result<(), FsError>;
}
Impact: Users might accidentally ignore errors.
Fix: Add #[must_use] to all Result-returning methods. Note that Result is itself a #[must_use] type, so the compiler already warns on ignored results; explicit per-method attributes mainly let us attach a tailored diagnostic message.
🟢 Document Clearly
7. Path Semantics Are Virtual, Not OS
Consideration: Our paths are virtual filesystem paths, not OS paths.
#![allow(unused)]
fn main() {
// On Windows, this works the same as on Unix:
fs.write("/documents/file.txt", data)?; // Forward slashes always
}
Potential confusion:
- Windows users might expect backslashes
- Path normalization rules may differ from OS
Mitigation: Document:
- “Paths are virtual, always use forward slashes”
- “Path resolution is platform-independent”
- Show examples on Windows
8. Fs as Marker Trait Pattern
Pattern:
#![allow(unused)]
fn main() {
pub trait Fs: FsRead + FsWrite + FsDir {}
impl<T: FsRead + FsWrite + FsDir> Fs for T {}
}
This is valid Rust but may surprise some users. They might expect:
#![allow(unused)]
fn main() {
pub trait Fs {
fn read(...);
fn write(...);
// etc
}
}
Why we do it:
- Granular traits for partial implementations
- Middleware only needs to implement what it wraps
Mitigation: Document the pattern clearly with examples.
9. Builder Pattern Requires Configuration
Pattern:
#![allow(unused)]
fn main() {
// This won't compile - no build() on unconfigured builder
let quota = QuotaLayer::builder().build(); // Error!
// Must configure at least one limit
let quota = QuotaLayer::builder()
.max_total_size(1_000_000)
.build(); // OK
}
This is intentional (ADR-022) but may surprise users expecting defaults.
Mitigation: Clear error messages and documentation.
✅ Non-Issues (We’re Doing It Right)
10. Object-Safe Path Parameters
✅ Core traits use &Path; ergonomics come from FileStorage/FsExt.
11. Send + Sync Requirements
✅ Standard for thread-safe abstractions. Enables use across async boundaries.
12. Feature-Gated Backends
✅ Standard Cargo pattern. Reduces compile time for unused backends.
13. Strategic Boxing (ADR-025)
✅ Matches Tower/Axum approach. Well-documented rationale.
14. Generic Middleware Composition
✅ Zero-cost abstractions. Idiomatic Rust.
Action Items
Before MVP
| Priority | Issue | Action |
|---|---|---|
| 🔴 Critical | FsError non_exhaustive | Add #[non_exhaustive] and thiserror derive |
| 🔴 Critical | &mut in examples | Audit all examples for &self consistency |
| 🟡 Should | #[must_use] | Add to all Result-returning methods |
| 🟢 Document | Interior mutability | Add prominent section explaining why |
| 🟢 Document | Path semantics | Add section on virtual paths |
Should Fix
| Priority | Issue | Action |
|---|---|---|
| 🟡 Should | Async support | Ship anyfs-async or document workaround |
| 🟡 Should | Layer trait docs | Document differences from Tower |
| 🟢 Document | Marker trait pattern | Explain Fs = FsRead + FsWrite + FsDir |
Comparison to Axum’s Success Factors
| Factor | Axum | AnyFS | Assessment |
|---|---|---|---|
| Tower integration | Native | Inspired by | 🟡 Different but similar |
| Async support | Yes | No (planned) | 🟡 Gap, but documented |
| Error handling | thiserror | Planned | 🔴 Must add |
| Documentation | Excellent | In progress | 🟡 Continue |
| Examples | Comprehensive | In progress | 🟡 Continue |
| Ecosystem fit | tokio native | std::fs native | ✅ Different target |
Conclusion
Overall assessment: The design is sound and follows Rust best practices. The main gaps are:
- Critical: #[non_exhaustive] on FsError (semver hazard)
- Critical: Inconsistent &mut in examples (contradicts ADR-023)
- Important: No async yet (but documented path forward)
- Minor: Documentation gaps (being addressed)
With these fixes, the design should be well-received by the Rust community.
License
This project is dual-licensed to allow for open collaboration on the design while ensuring the resulting code examples can be freely used in software implementations.
Documentation (Text and Media)
The text, diagrams, and other media in this design manual are licensed under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.
You are free to:
- Share — copy and redistribute the material in any medium or format
- Adapt — remix, transform, and build upon the material for any purpose, even commercially
Under the following terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made
- ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original
Code Samples
Any code snippets, examples, or software implementation details contained within this manual are dual-licensed under your choice of:
- MIT License
- Apache License, Version 2.0
This is the same licensing model used by the Rust ecosystem.
Summary
| Content Type | License |
|---|---|
| Documentation text | CC BY-SA 4.0 |
| Diagrams and media | CC BY-SA 4.0 |
| Code snippets | MIT OR Apache-2.0 |
| Example implementations | MIT OR Apache-2.0 |
Attribution
When attributing this work, please use:
AnyFS Design Manual by David Krasnitsky, licensed under CC BY-SA 4.0 (documentation) and MIT/Apache-2.0 (code samples). https://github.com/DK26/anyfs-design-manual