Archive Extraction with Safety
Extract ZIP files and other archives safely without zip-slip vulnerabilities. This example shows how PathBoundary
detects and rejects malicious archive entries.
The Problem
Archive extractors are vulnerable to zip-slip attacks where malicious archives contain entries like:
- ❌
../../../etc/passwd
- Escapes to system files - ❌
..\\..\\windows\\system32\\evil.exe
- Escapes on Windows - ❌ Symlinks pointing outside the extraction directory
The Solution: Choose Based on Your Use Case
Production Archive Extraction: Use PathBoundary to Detect Attacks
Use PathBoundary
for production archive extraction. This detects malicious paths and allows you to:
- Log the attack attempt
- Reject the entire archive as compromised
- Alert administrators
- Take appropriate security action
When extracting archives in production:
- Escape attempts indicate a malicious archive
- You want to detect and reject the archive, not silently hide the attack
- The archive should be quarantined or deleted
- Users/admins should be alerted to the attempted attack
PathBoundary
returns Err(PathEscapesBoundary)
so you can handle the security event appropriately.
Research/Sandbox: Use VirtualRoot to Safely Analyze
Use VirtualRoot
when analyzing suspicious archives in a controlled environment:
- Malware analysis and security research
- Safely studying attack techniques
- Observing malicious behavior while containing it
- Testing archive parsing without risk
In research scenarios, you want to see what the malicious archive tries to do, but safely contained within a virtual boundary.
Recommended Patterns
- ✅ Use
create_parent_dir_all()
before writes to avoid race conditions - ✅ Always join via
virtual_join()
orstrict_join()
- never concatenate paths manually - ✅ Treat absolute, UNC, drive-relative, or namespace-prefixed paths as untrusted
- ✅ On Windows, NTFS Alternate Data Streams (ADS) like
"file.txt:stream"
are handled safely
Anti-Patterns (Don't Do This)
- ❌ Building paths with
format!
/push
/join
onstd::path::Path
without validation - ❌ Stripping
"../"
by string replacement - ❌ Allowing absolute paths through to the OS
- ❌ Treating encoded/unicode tricks (URL-encoded, dot lookalikes) as pre-sanitized
Complete Example with PathBoundary
use strict_path::{PathBoundary, StrictPath}; use std::fs; use std::io::Write; struct SafeArchiveExtractor { extraction_dir: PathBoundary, } impl SafeArchiveExtractor { fn new(extract_to: &str) -> Result<Self, Box<dyn std::error::Error>> { let extraction_dir = PathBoundary::try_new_create(extract_to)?; Ok(Self { extraction_dir }) } fn extract_entry(&self, entry_path: &str, content: &[u8]) -> Result<StrictPath, Box<dyn std::error::Error>> { // This automatically prevents zip-slip attacks let safe_path = self.extraction_dir.strict_join(entry_path)?; // Create parent directories and write the file safe_path.create_parent_dir_all()?; safe_path.write(content)?; println!("📦 Extracted: {entry_path} -> {}", safe_path.strictpath_display()); Ok(safe_path) } fn extract_mock_zip(&self) -> Result<Vec<StrictPath>, Box<dyn std::error::Error>> { // Simulate extracting a ZIP file with various entries let entries = vec![ ("readme.txt", b"Welcome to our software!" as &[u8]), ("src/main.rs", b"fn main() { println!(\"Hello!\"); }"), ("docs/api.md", b"# API Documentation"), ("config/settings.json", b"{ \"debug\": true }"), // These malicious entries would be automatically blocked: // ("../../../etc/passwd", b"hacked"), // ❌ Blocked! // ("..\\windows\\system32\\evil.exe", b"malware"), // ❌ Blocked! // ("/absolute/path/hack.txt", b"bad"), // ❌ Blocked! ]; let mut extracted_files = Vec::new(); for (entry_path, content) in entries { match self.extract_entry(entry_path, content) { Ok(safe_path) => extracted_files.push(safe_path), Err(e) => println!("⚠️ Blocked malicious entry '{}': {}", entry_path, e), } } Ok(extracted_files) } } fn main() -> Result<(), Box<dyn std::error::Error>> { let extractor = SafeArchiveExtractor::new("extracted_files")?; println!("🗃️ Extracting archive safely..."); let extracted = extractor.extract_mock_zip()?; println!("\n✅ Successfully extracted {} files:", extracted.len()); for file in &extracted { println!(" 📄 {}", file.strictpath_display()); } // Verify we can read the extracted files for file in &extracted { if file.strictpath_extension().and_then(|s| s.to_str()) == Some("txt") { let content = file.read_to_string()?; println!("📖 {}: {}", file.strictpath_display(), content.trim()); } } Ok(()) }
Handling Malicious Archives
When a malicious path is detected, you should:
#![allow(unused)] fn main() { fn extract_entry_with_security( extraction_dir: &PathBoundary, entry_path: &str, content: &[u8], ) -> Result<StrictPath, Box<dyn std::error::Error>> { match extraction_dir.strict_join(entry_path) { Ok(safe_path) => { // Valid path - extract normally safe_path.create_parent_dir_all()?; safe_path.write(content)?; Ok(safe_path) } Err(e) => { // Malicious path detected! eprintln!("🚨 SECURITY ALERT: Malicious archive entry detected!"); eprintln!(" Entry path: {}", entry_path); eprintln!(" Error: {}", e); eprintln!(" Action: Rejecting entire archive as compromised"); // Return error to stop extraction Err(format!("Archive contains malicious path: {}", entry_path).into()) } } } // Usage let entries = vec![ ("readme.txt", b"Safe content" as &[u8]), ("../../../etc/passwd", b"Malicious"), // Skipped with log ("docs/api.md", b"More safe content"), ]; let count = extract_all_resilient("extracted", entries)?; println!("✅ Successfully extracted {} files", count); }
Key Security Features
1. Bounded Extraction Directory
#![allow(unused)] fn main() { let extraction_dir = PathBoundary::try_new_create(extract_to)?; }
All extracted files must stay within this directory.
2. Automatic Malicious Path Detection
#![allow(unused)] fn main() { let safe_path = self.extraction_dir.strict_join(entry_path)?; }
This line does all the heavy lifting:
- Normalizes
../
sequences - Blocks absolute paths
- Prevents symlink escapes
- Returns an error for malicious paths
3. Parent Directory Creation
#![allow(unused)] fn main() { safe_path.create_parent_dir_all()?; }
Automatically creates any necessary parent directories within the boundary.
4. Type-Safe Returns
#![allow(unused)] fn main() { fn extract_entry(&self, entry_path: &str, content: &[u8]) -> Result<StrictPath, ...> }
Returning StrictPath
ensures extracted paths are always validated.
Attack Scenarios Prevented
Malicious Entry | StrictPath Result | VirtualPath Result |
---|---|---|
../../../etc/passwd | ❌ Error: path escapes boundary | ✅ Clamped to vroot /etc/passwd |
..\\windows\\system32\\evil.exe | ❌ Error: path escapes boundary | ✅ Clamped to vroot /windows/system32/evil.exe |
/var/www/html/shell.php | ❌ Error: absolute path rejected | ✅ Clamped to vroot /var/www/html/shell.php |
legitimate/../../etc/passwd | ❌ Normalized and blocked | ✅ Normalized and clamped |
Symlink to /etc/passwd | ❌ Target validated, error if outside | ✅ Target clamped to vroot /etc/passwd |
Note: For archive extraction, consider using VirtualPath
instead of StrictPath
to gracefully clamp malicious entries rather than rejecting them. This provides defense-in-depth: even hostile archives with absolute paths or symlinks are safely contained within the extraction directory.
Real ZIP Integration
With the zip
crate:
use strict_path::{PathBoundary, StrictPath}; use zip::ZipArchive; use std::fs::File; use std::io::Read; struct RealArchiveExtractor { extraction_dir: PathBoundary, } impl RealArchiveExtractor { fn new(extract_to: &str) -> Result<Self, Box<dyn std::error::Error>> { let extraction_dir = PathBoundary::try_new_create(extract_to)?; Ok(Self { extraction_dir }) } fn extract_zip(&self, zip_path: &str) -> Result<Vec<StrictPath>, Box<dyn std::error::Error>> { let file = File::open(zip_path)?; let mut archive = ZipArchive::new(file)?; let mut extracted_files = Vec::new(); for i in 0..archive.len() { let mut file = archive.by_index(i)?; let entry_path = file.name(); // Validate the entry path - blocks zip-slip automatically let safe_path = match self.extraction_dir.strict_join(entry_path) { Ok(path) => path, Err(e) => { println!("⚠️ Skipping malicious entry '{}': {}", entry_path, e); continue; } }; if file.is_dir() { safe_path.create_dir_all()?; } else { safe_path.create_parent_dir_all()?; let mut content = Vec::new(); file.read_to_end(&mut content)?; safe_path.write(&content)?; extracted_files.push(safe_path); println!("📦 Extracted: {}", entry_path); } } Ok(extracted_files) } } fn main() -> Result<(), Box<dyn std::error::Error>> { let extractor = RealArchiveExtractor::new("extracted")?; // Extract a real ZIP file safely let files = extractor.extract_zip("archive.zip")?; println!("✅ Extracted {} files", files.len()); Ok(()) }
TAR Archives
With the tar
crate:
#![allow(unused)] fn main() { use strict_path::{PathBoundary, StrictPath}; use tar::Archive; use std::fs::File; fn extract_tar(tar_path: &str, extract_to: &str) -> Result<Vec<StrictPath>, Box<dyn std::error::Error>> { let boundary = PathBoundary::try_new_create(extract_to)?; let mut extracted = Vec::new(); let file = File::open(tar_path)?; let mut archive = Archive::new(file); for entry in archive.entries()? { let mut entry = entry?; let entry_path = entry.path()?; let entry_path_str = entry_path.to_string_lossy(); // Validate each entry path let safe_path = match boundary.strict_join(&*entry_path_str) { Ok(path) => path, Err(e) => { println!("⚠️ Skipping malicious entry '{}': {}", entry_path_str, e); continue; } }; // Extract using the validated path entry.unpack(safe_path.interop_path())?; extracted.push(safe_path); println!("📦 Extracted: {}", entry_path_str); } Ok(extracted) } }
Advanced: Extraction with Filters
Skip certain files or enforce naming patterns:
#![allow(unused)] fn main() { impl SafeArchiveExtractor { fn extract_with_filter<F>( &self, entries: Vec<(&str, &[u8])>, filter: F, ) -> Result<Vec<StrictPath>, Box<dyn std::error::Error>> where F: Fn(&str) -> bool, { let mut extracted = Vec::new(); for (entry_path, content) in entries { // Apply custom filter if !filter(entry_path) { println!("⏭️ Skipped by filter: {}", entry_path); continue; } // Validate and extract match self.extract_entry(entry_path, content) { Ok(path) => extracted.push(path), Err(e) => println!("⚠️ Failed to extract '{}': {}", entry_path, e), } } Ok(extracted) } } // Usage: let extracted = extractor.extract_with_filter(entries, |path| { // Only allow certain file types path.ends_with(".txt") || path.ends_with(".md") || path.ends_with(".rs") })?; }
Temporary Extraction
Extract to a temporary directory for processing:
#![allow(unused)] fn main() { use strict_path::PathBoundary; use tempfile::TempDir; fn extract_to_temp(archive_path: &str) -> Result<(TempDir, Vec<StrictPath>), Box<dyn std::error::Error>> { // Create temp directory let temp = TempDir::new()?; // Create boundary from temp path let boundary = PathBoundary::try_new(temp.path())?; // Extract archive let extracted = extract_archive_to_boundary(&boundary, archive_path)?; // Return both TempDir (to keep it alive) and extracted paths Ok((temp, extracted)) } // Temp directory is automatically cleaned up when TempDir is dropped }
Testing Advice
Test your extraction code with a malicious archive corpus:
Test Cases to Include
- Directory traversal:
"../"
,"..\\"
,"legitimate/../../etc/passwd"
- Absolute paths:
"/var/www/evil"
,"C:\\windows\\system32\\evil.exe"
- Windows-specific:
- UNC paths:
"\\\\?\\C:\\windows\\evil"
- Drive-relative:
"C:..\\foo"
- ADS streams:
"decoy.txt:..\\..\\evil.exe"
- Reserved names:
"CON"
,"PRN"
,"AUX"
- UNC paths:
- Unicode tricks: Dot lookalikes, NFC vs NFD forms
- Long paths: Paths exceeding system limits
Assertions
#![allow(unused)] fn main() { #[test] fn test_archive_extraction_safety() { let boundary = PathBoundary::try_new_create("test_extract").unwrap(); // Should succeed assert!(boundary.strict_join("safe/path.txt").is_ok()); // Should fail assert!(boundary.strict_join("../../../etc/passwd").is_err()); assert!(boundary.strict_join("/absolute/path").is_err()); // Cleanup std::fs::remove_dir_all("test_extract").ok(); } }
Behavior Notes
- Virtual joins clamp traversal lexically to the virtual root
- System-facing escapes (via symlinks/junctions) are rejected during resolution
- Unicode is not normalized - NFC and NFD forms are stored as-is, both safely contained
- Hard links and privileged mount tricks are outside path-level protections (see README limitations)
Using VirtualPath for Extra Safety
For even safer archive extraction, consider using VirtualPath
instead of StrictPath
. This clamps malicious entries instead of rejecting them:
#![allow(unused)] fn main() { use strict_path::{VirtualRoot, VirtualPath}; struct VirtualArchiveExtractor { extraction_vroot: VirtualRoot, } impl VirtualArchiveExtractor { fn new(extract_to: &str) -> Result<Self, Box<dyn std::error::Error>> { let extraction_vroot = VirtualRoot::try_new_create(extract_to)?; Ok(Self { extraction_vroot }) } fn extract_entry(&self, entry_path: &str, content: &[u8]) -> Result<VirtualPath, Box<dyn std::error::Error>> { // Malicious paths are CLAMPED instead of rejected // "../../../etc/passwd" becomes safe "vroot/etc/passwd" // Absolute symlink targets are also clamped to vroot let safe_path = self.extraction_vroot.virtual_join(entry_path)?; safe_path.create_parent_dir_all()?; safe_path.write(content)?; println!("📦 Extracted: {} -> {}", entry_path, safe_path.virtualpath_display()); Ok(safe_path) } } }
Why VirtualPath for archives?
- ✅ Hostile entries with
../../../
are clamped, not rejected - ✅ Absolute symlink targets (e.g.,
link -> /etc/passwd
) are clamped to vroot - ✅ Archive extraction continues even with malicious entries
- ✅ Defense-in-depth: every entry is safely contained
- ✅ Perfect for untrusted archives from the internet
When to use each:
StrictPath
: Fail-fast validation — reject malicious archives immediatelyVirtualPath
: Graceful containment — clamp every entry to stay safe, continue extraction
Best Practices
- Always validate - Never trust archive entry paths
- Log suspicious entries - Track and alert on blocked paths
- Limit extraction size - Check total extracted size to prevent zip bombs
- Filter file types - Only extract expected file types
- Use temporary storage - Extract to temp directory first, then move to final location
- Consider VirtualPath - Use for untrusted archives to clamp rather than reject malicious entries
Integration Tips
With Web Uploads
#![allow(unused)] fn main() { async fn handle_upload(file: UploadedFile) -> Result<Vec<String>, AppError> { // Save uploaded file let temp_zip = save_upload(file).await?; // Extract safely let extractor = SafeArchiveExtractor::new("uploads/extracted")?; let files = extractor.extract_zip(&temp_zip)?; // Return list of extracted files Ok(files.iter() .map(|p| p.strictpath_display().to_string()) .collect()) } }
With Background Jobs
#![allow(unused)] fn main() { async fn extract_job(job_id: String, archive_path: String) -> Result<(), JobError> { let extract_dir = format!("jobs/{}/extracted", job_id); let extractor = SafeArchiveExtractor::new(&extract_dir)?; let files = extractor.extract_zip(&archive_path)?; // Store results in database for file in files { db_store_file(&job_id, file.strictpath_display())?; } Ok(()) } }
Next Steps
- See CLI Tool for handling user-provided file paths
- See Web Upload Service for combining uploads with safe storage