The Journey to strict-path

Why Do I Need This Library?

strict-path grew out of discovering security gaps in path handling and iteratively building a comprehensive solution. This is the development journey that led to the crate.

Why Use This Crate (TL;DR)

Path security is not about comparing strings. It requires:

  • Full normalization/canonicalization that works even when targets don’t exist
  • Safe symlink/junction handling (with cycle detection and boundary enforcement)
  • Windows-specific defenses (8.3 short names, UNC/verbatim prefixes, ADS)
  • Unicode/encoding awareness (mixed separators, normalization differences)

strict-path solves this class of problems comprehensively, then encodes the guarantees in the type system. If a StrictPath<Marker> exists, it’s proven to be inside its boundary by construction.

The Development Process Story

The Simple Beginning

It started as an apparently simple idea: a crate that validates paths by making sure, through canonicalization, that they fall within an expected boundary. The concept was straightforward: a type that performs the validation (PathValidator) and a byproduct type that serves as proof of validation (JailedPath).

That wasn't too hard to do... except...

The First Major Obstacle

The std::fs::canonicalize Problem: Rust's standard library function std::fs::canonicalize() only works with paths that already exist and returns an error otherwise. This was a fundamental limitation that broke the entire concept.

Existing Crates Were Insufficient: Other Rust crates were only offering lexical path resolution, ignoring symlinks and unable to deliver the promise of canonicalized/realpath values without demanding that the target path must exist.

This was a big problem! How could I validate that a location for a future file is within a legal boundary if the file doesn't exist yet?
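The limitation is easy to reproduce with the standard library alone. The following is a minimal sketch; the helper name `canonicalize_future_file` is hypothetical and only exists for illustration.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Tries to canonicalize the location of a file that has not been created yet.
/// `std::fs::canonicalize` must resolve the path on the real filesystem,
/// so a missing target is reported as an error instead of a resolved path.
fn canonicalize_future_file(dir: &Path, file_name: &str) -> io::Result<PathBuf> {
    fs::canonicalize(dir.join(file_name))
}
```

Calling this helper with an existing directory and a file name that does not exist yet returns an `Err`, which means there is no canonical path to compare against the boundary.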

The Python Inspiration

A quick search revealed that Python had already faced this exact problem and solved it in Python 3.6 by adding the following feature: pathlib.Path.resolve(strict=False).

That's when I realized I'd need to create another crate! One that mimics that same logic—both to solve my problem and as an opportunity to give back to the Rust community.

Enter soft-canonicalize: This became the foundation crate that would enable proper path validation without requiring file existence.

Building soft-canonicalize

I asked an LLM agent to fetch Python's implementation unit tests, translate them to Rust, and run them over our soft-canonicalize implementation. This revealed gaps in my own implementation and led me to ask for the same algorithm that Python uses (later modified for optimizations and CVE resolutions).

Voilà! I had a working soft-canonicalize crate, so I could publish it and continue work on my jailed-path crate.

From here, the path guarantee became practical: validate first (without requiring existence), then operate safely.
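To illustrate the general idea of canonicalizing a path whose tail does not exist, here is a deliberately simplified sketch using only the standard library. It is not the algorithm used by soft-canonicalize: it assumes any canonicalization error means "try the parent", and it ignores dangling symlinks, symlink cycles, and Windows-specific forms. The function name is hypothetical.

```rust
use std::env;
use std::fs;
use std::io;
use std::path::{Component, Path, PathBuf};

/// Canonicalizes the deepest ancestor that exists, then re-applies the
/// remaining (non-existing) components lexically on top of it.
fn soft_canonicalize_sketch(path: &Path) -> io::Result<PathBuf> {
    let absolute = if path.is_absolute() {
        path.to_path_buf()
    } else {
        env::current_dir()?.join(path)
    };

    // Walk upward until some ancestor can be canonicalized.
    let mut existing: &Path = absolute.as_path();
    let mut pending: Vec<Component<'_>> = Vec::new();
    let mut resolved = loop {
        match fs::canonicalize(existing) {
            Ok(real) => break real,
            // Simplifying assumption: every error is treated as "does not exist".
            Err(err) => match (existing.parent(), existing.components().next_back()) {
                (Some(parent), Some(last)) => {
                    pending.push(last);
                    existing = parent;
                }
                _ => return Err(err),
            },
        }
    };

    // The base is canonical (no symlinks), so `..` can be applied lexically.
    for component in pending.into_iter().rev() {
        match component {
            Component::CurDir => {}
            Component::ParentDir => {
                resolved.pop();
            }
            other => resolved.push(other.as_os_str()),
        }
    }
    Ok(resolved)
}
```

With a result like this, a boundary check can run before the target file is ever created.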

The Marker Type Innovation

Continuing work on JailedPath, I realized that sometimes we might wish to have more than one validated path, but how could we identify them correctly? That's when I came up with the Marker type idea: simply create your very own Marker type, providing additional context for the compiler and allowing us to prevent mixing up paths!
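The marker idea can be sketched with `PhantomData`. All names below (`CheckedPath`, `UserUploads`, `AppConfig`, `assume_validated`, `store_upload`) are hypothetical stand-ins, not the crate's actual types, and the constructor skips real validation to keep the example short.

```rust
use std::marker::PhantomData;
use std::path::PathBuf;

// Zero-sized marker types: they carry no data, only compile-time meaning.
struct UserUploads;
struct AppConfig;

/// A path tagged with the boundary it belongs to.
struct CheckedPath<Marker> {
    inner: PathBuf,
    _marker: PhantomData<Marker>,
}

impl<Marker> CheckedPath<Marker> {
    /// Demo-only constructor; a real implementation would validate here.
    fn assume_validated(path: impl Into<PathBuf>) -> Self {
        CheckedPath { inner: path.into(), _marker: PhantomData }
    }

    fn describe(&self) -> String {
        self.inner.display().to_string()
    }
}

/// Only accepts paths that were tagged as user uploads.
fn store_upload(target: &CheckedPath<UserUploads>) -> String {
    format!("storing upload at {}", target.describe())
}
```

Passing a `CheckedPath<AppConfig>` to `store_upload` is a type error, so mixing up two validated locations is caught at compile time rather than at runtime.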

Security Research and CVE Analysis

OK, now we have a really cool JailedPath crate! Let's further validate that we are safe by researching CVEs.

Oops! It looked like we had some gaps in our soft-canonicalize crate. I spent additional time improving its correctness, resilience, and performance. I also created comprehensive Python benchmarks to compare soft-canonicalize performance against Python's C language implementation. That took a while to perfect, but it was worth it because it improves scalability under heavy use.

The Virtual Path Discovery

Researching existing alternatives, I discovered a use case for virtual paths—paths that are clamped to a virtual root. This made me reconsider my own use case for creating this crate, revealing a lot of potential.
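As a rough illustration of what "clamped to a virtual root" means, the sketch below joins user input onto a root purely lexically and lets `..` stop at that root. It is an assumption-laden simplification: the function name is hypothetical, and it does not resolve symlinks the way a canonicalizing implementation would.

```rust
use std::ffi::OsStr;
use std::path::{Component, Path, PathBuf};

/// Joins `user_input` onto `virtual_root`, treating the root as "/".
/// Attempts to climb above the root are clamped instead of rejected.
fn clamp_to_virtual_root(virtual_root: &Path, user_input: &str) -> PathBuf {
    let mut kept: Vec<&OsStr> = Vec::new();
    for component in Path::new(user_input).components() {
        match component {
            Component::Normal(name) => kept.push(name),
            // Popping an empty stack does nothing: that is the clamp.
            Component::ParentDir => {
                kept.pop();
            }
            // A leading "/" (or a Windows prefix) just means "the virtual root".
            Component::RootDir | Component::Prefix(_) | Component::CurDir => {}
        }
    }
    let mut clamped = virtual_root.to_path_buf();
    clamped.extend(kept);
    clamped
}
```

With a root of `/srv/tenant-a`, the input `../../etc/passwd` ends up at `/srv/tenant-a/etc/passwd`: the user sees a filesystem of their own and cannot address anything above it.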

I started wondering if this should be our default behavior. Eventually, I came to this conclusion: all I needed was a secure, validated Path type. So I applied the KISS principle (Keep It Simple, Stupid) and decided that the core JailedPath should simply represent a path that has been validated.

However, there were clear uses for VirtualPath. After long consideration about whether this should be in a different crate, I decided to keep it inside JailedPath because:

  • They share the same foundation
  • I didn't want to scatter logic across two crates
  • It's easier to maintain, use, and perform transitions between the two

The Great Renaming Journey

From PathValidator to Jail

My first gut feeling was that while our PathValidator type was quite self-explanatory, it felt like an extra tool we needed to carry around. I was aiming to simplify the developer experience. PathValidator seemed easy to understand but not fun, with no clear relation to JailedPath.

So I decided to rename PathValidator to Jail. It made sense: we set up a jail and then validate paths against it.

From Jail to PathBoundary

Eventually, Jail didn't feel completely right either, only because we were also supporting VirtualPath (created from a VirtualRoot). I realized that a newcomer (or someone returning to code after a long while) might get confused about what behavior to expect from a Jail and JailedPath type.

The API Surface Problem

Watching LLM agents generate faulty code showed me how the API was being misused. This motivated me to reduce the API surface to the minimum required and ensure all methods are explicit about the difference between JailedPath and VirtualPath. No vague method names (such as as_ref()). No Path type escapes: otherwise the LLM would simply defeat the purpose of my crate by accessing the inner path and calling .join() on it.

The Problem with .join(): The result of Path::join() is not validated. It keeps any .. components as-is, so the resulting path can easily escape the boundary. And joining an absolute path completely replaces the path it is being joined to.
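Both behaviors can be observed with the standard library. The sketch below assumes Unix-style paths; `naive_join` is a hypothetical helper that mirrors what generated code tends to do.

```rust
use std::path::{Path, PathBuf};

/// Joins untrusted input onto a base directory with plain `Path::join`,
/// performing no validation at all.
fn naive_join(base: &str, user_input: &str) -> PathBuf {
    Path::new(base).join(user_input)
}
```

On a Unix-like system, `naive_join("/srv/uploads", "/etc/passwd")` yields `/etc/passwd`, because an absolute argument replaces the base. `naive_join("/srv/uploads", "../../etc/passwd")` yields `/srv/uploads/../../etc/passwd`, which still passes a `starts_with("/srv/uploads")` check even though the operating system will resolve it to a location outside the base.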

The "Three join() Problem"

This led me to the "Three join() problem"—each time I saw a generated .join() in test code, I had to take a moment following the chain of methods to figure out if a join() belongs to Path, JailedPath, or VirtualPath.

This is where I decided that methods must be explicit. Seeing them in generated code helps immediately notice and understand their behavior:

  • jailedpath_join() vs virtualpath_join() vs join()

Seeing join() in our code would mean unsafe behavior that we could notice immediately.

This explicitness is critical for LLM- and review-friendly code: .strict_join(..)/.virtual_join(..) are visibly safe; raw Path::join stands out as a red flag.

Finding the Right Balance

Fixing my demo projects, these methods seemed verbose. Since they were very common, I decided on shorter, easier names:

  • jailed_join(), virtual_join()

But we're back to behavior differences. Seeing jailed_join(), what does it mean? We'd need to refer to docs. While docs are important, wouldn't it be nicer if we could understand from the method name what's happening?

The Final Names

Eventually (and finally), I did another rename:

  • JailedPath → StrictPath (clear that the path is restricted!)
  • Jail → PathBoundary (goes hand-in-hand with VirtualRoot)
  • strict_join() vs virtual_join() (perfect clarity!)
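To make the "strict" half of that naming concrete, here is a lexical sketch of a join that rejects escapes rather than clamping them. It is not the crate's implementation: `strict_join_sketch` and `EscapesBoundary` are hypothetical names, rejecting every absolute input is an assumption of this sketch, and symlinks are not resolved.

```rust
use std::ffi::OsStr;
use std::path::{Component, Path, PathBuf};

/// Error returned when the input would leave the boundary.
#[derive(Debug, PartialEq)]
struct EscapesBoundary;

/// Joins `user_input` onto `boundary`, failing if the result would escape it.
fn strict_join_sketch(boundary: &Path, user_input: &str) -> Result<PathBuf, EscapesBoundary> {
    let mut kept: Vec<&OsStr> = Vec::new();
    for component in Path::new(user_input).components() {
        match component {
            Component::Normal(name) => kept.push(name),
            Component::CurDir => {}
            // Climbing above the boundary is an error, not something to clamp.
            Component::ParentDir => {
                if kept.pop().is_none() {
                    return Err(EscapesBoundary);
                }
            }
            // Assumption of this sketch: absolute input is always rejected.
            Component::RootDir | Component::Prefix(_) => return Err(EscapesBoundary),
        }
    }
    let mut joined = boundary.to_path_buf();
    joined.extend(kept);
    Ok(joined)
}
```

The method name tells the reader which outcome to expect: a strict join on `../secret` fails loudly, whereas a virtual join on the same input would stay inside its root.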

Path Ergonomics and Safety

Path ergonomics were crucial! I wanted to be as ergonomic as possible without breaking our established safety rules—especially not leaking out a Path type that could do a .join().

Eventually, I came up with .interop_path(). It contains the suffix _path to hint that this is what API users need to interop VirtualPath and StrictPath directly in places where AsRef<Path> is expected. But we do not expose a Path type! Instead, we expose a borrow of an OsStr.

This is perfect! OsStr:

  • Implements AsRef<Path> for integration with everything expecting AsRef<Path>
  • Is cross-platform and fits the underlying operating system
  • Doesn't lose any data
  • Is what Path wraps anyway—we're just stripping off all the dangerous methods
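The interop idea can be sketched as a wrapper that never hands out a `Path`, only a borrowed `OsStr`. The type `OpaquePath` and the methods `assume_validated` and `interop_os_str` are hypothetical names for this illustration; only the fact that `OsStr` implements `AsRef<Path>` comes from the standard library.

```rust
use std::ffi::OsStr;
use std::fs;
use std::path::PathBuf;

/// A wrapper that keeps its inner path private.
struct OpaquePath {
    inner: PathBuf,
}

impl OpaquePath {
    /// Demo-only constructor; a real implementation would validate here.
    fn assume_validated(path: impl Into<PathBuf>) -> Self {
        OpaquePath { inner: path.into() }
    }

    /// Exposes the path as `&OsStr`, which has no `join` or `parent` methods.
    fn interop_os_str(&self) -> &OsStr {
        self.inner.as_os_str()
    }
}

/// `fs::metadata` accepts any `AsRef<Path>`, so the `&OsStr` works directly.
fn exists_on_disk(path: &OpaquePath) -> bool {
    fs::metadata(path.interop_os_str()).is_ok()
}
```

Callers can still pass the value to any standard-library or third-party function that takes `AsRef<Path>`, but they have to go out of their way to turn it back into a `Path` and call an unvalidated `.join()`.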

Escape hatches exist, but are explicit:

  • Borrow strict from virtual: vpath.as_unvirtual()
  • Ownership conversions: virtualize() / unvirtual() / unstrict() (use sparingly)

Feature Integration

I wanted to explore additional features by integrating with popular crates:

  • app-path: My own crate for easily referring to files near our executable, ensuring operations cannot escape our application directory
  • dirs: Cross-platform access to system directories
  • tempfile: Generate temporary directories with PathBoundary::try_new_temp()

API Simplification

I kept improving demo examples and API clarity. Eventually, I realized: StrictPath contains the boundary path within it, just as VirtualRoot contains its root path (which is a StrictPath).

I explored whether we could work with just 2 types: VirtualPath and StrictPath. While possible, it wouldn't be ideal—sometimes we want to be explicit about roots and boundaries as promises.

I decided to keep VirtualRoot and PathBoundary but make common usage more concise with StrictPath::with_boundary() and VirtualPath::with_root(). This made code much more concise while remaining highly readable.

Zero‑Trust vs Lexical Approaches

  • If you want a zero‑trust approach that covers (almost) everything that can go wrong, prefer canonicalized validation and joins. They resolve symlinks and normalize platform-specific forms before enforcement.
  • If you need maximum performance and you are absolutely certain symlinks cannot occur and paths are already canonical/normalized, a lexical solution from another crate may fit — but you accept the risk and narrower threat model.
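The difference between the two approaches shows up as soon as a symlink is involved. The sketch below contrasts a purely lexical containment check with one that canonicalizes both sides first. Both function names are hypothetical, and the canonical variant requires the candidate to exist because it relies on `std::fs::canonicalize`.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Lexical check: compares path components only, never touches the disk.
fn lexically_inside(boundary: &Path, candidate: &Path) -> bool {
    candidate.starts_with(boundary)
}

/// Canonical check: resolves symlinks on both sides before comparing.
fn canonically_inside(boundary: &Path, candidate: &Path) -> io::Result<bool> {
    let boundary = fs::canonicalize(boundary)?;
    let candidate = fs::canonicalize(candidate)?;
    Ok(candidate.starts_with(&boundary))
}
```

If `jail/link` is a symlink pointing to a sibling directory outside `jail`, the lexical check reports the link as inside the boundary, while the canonical check reports it as outside. That gap is the risk accepted when choosing a lexical solution.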

The Road to Publication

This was a long journey, but it isn't over yet. It's time to make this crate public, ensuring all generated docs are correct and we don't have leftovers.

The version is now good enough to be the first stable foundation for a security crate! I didn't expect it when I started, but at some point I began thinking of this crate as a potential new standard for securing paths, and I hope it catches on.

If this succeeds, I'd like to port it to other programming languages, starting with JavaScript, Java, and Python. In a way, I hope this will be for paths what prepared statements are for SQL: a fundamental security practice that becomes standard across the ecosystem.

Lessons Learned

The journey taught me several important lessons:

  1. Security requires iteration: Each security review revealed new edge cases
  2. API design is crucial: Small naming decisions have huge impacts on usability
  3. Ergonomics vs Safety: You can have both, but it requires careful design
  4. Community feedback matters: LLM-generated code revealed real usage patterns
  5. Standards evolve: What seems like a simple idea often grows into something much more comprehensive

The result is strict-path—a crate that not only solves the original path validation problem but provides a comprehensive, ergonomic, and secure foundation for all path operations in Rust applications.