Testing & Accuracy
We obsess over accuracy. Here's the evidence: 200,000 validated samples, with the full methodology.
| F1 Score | Precision | Recall | Samples/sec |
|---|---|---|---|
| 97.38% | 97.70% | 97.06% | 354K |
Large-Scale Validation Results
December 2024 • 200,000 programmatically generated samples across 5 categories (40,000 each)
| True Positives | True Negatives | False Positives | False Negatives |
|---|---|---|---|
| 79,324 | 19,500 | 966 | 210 |
Per-Category Performance
40,000 samples were tested per category (200,000 total), each with an 80/20 split between valid sensitive data and invalid or edge-case inputs.
| Category | Samples | Precision | Recall | F1 Score |
|---|---|---|---|---|
| PII | 40,000 | 95.10% | 95.88% | 95.49% |
| Secrets | 40,000 | 100% | 90.46% | 94.99% |
| Contact | 40,000 | 98.77% | 99.99% | 99.37% |
| Financial | 40,000 | 97.54% | 100% | 98.76% |
| Healthcare | 40,000 | 97.32% | 99.00% | 98.15% |
| Overall | 200,000 | 97.70% | 97.06% | 97.38% |
Excellent Recall (99%+)
Contact (99.99%), Financial (100%), and Healthcare (99.00%) all reached at least 99% recall, meaning very few false negatives.
High Precision (97%+)
Four of the five categories exceeded 97% precision, and Secrets reached 100%: every item flagged as a secret in this test set was a true positive.
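The F1 column in the table above is the harmonic mean of precision and recall. The sketch below recomputes it for a few rows from the published percentages; the function name `f1_score` is ours for illustration and is not part of Redactorr. Because the table values are already rounded, recomputing this way can land one unit off in the last digit for some rows.

```python
def f1_score(precision_pct: float, recall_pct: float) -> float:
    """Harmonic mean of precision and recall, both given as percentages."""
    if precision_pct + recall_pct == 0:
        return 0.0
    return 2 * precision_pct * recall_pct / (precision_pct + recall_pct)


# (precision %, recall %) copied from the per-category table
rows = {
    "PII": (95.10, 95.88),
    "Secrets": (100.0, 90.46),
    "Healthcare": (97.32, 99.00),
    "Overall": (97.70, 97.06),
}

for name, (precision, recall) in rows.items():
    print(f"{name}: F1 = {f1_score(precision, recall):.2f}%")
```

The Secrets row shows why F1 matters: perfect precision still yields an F1 below 95% because recall is the weaker of the two.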
Industry Benchmark Comparison
How Redactorr compares with alternative tools, based on published benchmarks.
| Tool | F1 Score | Speed | Data Privacy |
|---|---|---|---|
| Redactorr | 97.38% | 354K/sec | 100% Local |
| Open Source Libraries | ~79% | Varies | Self-hosted |
| Cloud NLP Services | ~65% | API | Cloud |
| Enterprise Cloud DLP | ~35-50% | API | Cloud |
Test Methodology
Sample Generation
- 200,000 programmatically generated samples (40,000 per category)
- Realistic data patterns matching real-world formats
- 80% valid sensitive data, 20% edge cases/invalid
- Context simulation with surrounding text
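As a rough illustration of the generation steps listed above, the sketch below builds labelled SSN-shaped samples with an 80/20 valid/invalid split and wraps each one in surrounding text. It is not Redactorr's actual generator: the helper names, the context templates, and the choice of the never-issued `000` area number for invalid cases are all assumptions made for this example.

```python
import random

# Hypothetical context templates; the real test set's wording is not published here.
CONTEXTS = [
    "Customer record, SSN: {value}, verified today.",
    "Please update my file. My social is {value}.",
    "Ref {value} was entered in the intake form.",
]


def make_ssn_like(rng: random.Random, valid: bool) -> str:
    """Return an SSN-shaped string; invalid ones use area 000, which is never issued."""
    area = rng.randint(1, 665) if valid else 0
    group = rng.randint(1, 99)
    serial = rng.randint(1, 9999)
    return f"{area:03d}-{group:02d}-{serial:04d}"


def generate_samples(n: int, valid_ratio: float = 0.8, seed: int = 0) -> list:
    """Build n labelled samples (valid_ratio of them valid), then shuffle."""
    rng = random.Random(seed)  # fixed seed keeps the test set reproducible
    n_valid = round(n * valid_ratio)
    samples = []
    for i in range(n):
        is_valid = i < n_valid
        value = make_ssn_like(rng, is_valid)
        text = rng.choice(CONTEXTS).format(value=value)
        samples.append({"text": text, "value": value, "label": is_valid})
    rng.shuffle(samples)
    return samples


samples = generate_samples(1_000)
print(sum(s["label"] for s in samples), "valid of", len(samples))
```

Keeping the ground-truth label next to each sample is what makes it possible to count true and false positives afterwards.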
Test Categories
- PII: SSN, DOB, Driver's License, Passport, Names
- Secrets: API Keys, Tokens, Passwords, AWS Keys, Private Keys
- Contact: Email, Phone, IP Address, Physical Address
- Financial: Credit Cards, Bank Accounts, IBAN, Crypto Wallets
- Healthcare: MRN, Insurance ID, NPI, DEA Numbers, ICD Codes
Speed Measurement
| Total test duration (200K) | Samples processed | Processing speed |
|---|---|---|
| 565 ms | 200,000 | 354K/sec |
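The processing speed is simply samples processed divided by elapsed time: 200,000 samples in 0.565 seconds. The sketch below shows that arithmetic plus a minimal timing harness. Both helpers are hypothetical, and `detect` stands in for any detection callable, not Redactorr's real entry point.

```python
import time
from typing import Callable, Iterable, Tuple


def throughput(samples_processed: int, elapsed_seconds: float) -> float:
    """Samples processed per second."""
    return samples_processed / elapsed_seconds


def time_detector(detect: Callable[[str], object], texts: Iterable[str]) -> Tuple[int, float]:
    """Run detect over every text and return (count, elapsed seconds)."""
    start = time.perf_counter()  # monotonic, high-resolution clock
    count = 0
    for text in texts:
        detect(text)
        count += 1
    return count, time.perf_counter() - start


# Figures from the speed measurement: 200,000 samples in 565 ms
print(f"{throughput(200_000, 0.565):,.0f} samples/sec")
```

Rounded to the nearest thousand, the result matches the 354K/sec figure reported above.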
What We Detect
Our detection engine includes 2,910 enterprise-grade patterns covering 50+ countries across these categories:
Personal Identifiers
- ✓ Social Security Numbers
- ✓ Driver's Licenses
- ✓ Passport Numbers
- ✓ National IDs
Contact Information
- ✓ Email Addresses
- ✓ Phone Numbers
- ✓ Physical Addresses
- ✓ IP Addresses
Financial Data
- ✓ Credit Card Numbers
- ✓ Bank Accounts
- ✓ Routing Numbers
- ✓ IBAN (40+ countries)
Credentials
- ✓ API Keys
- ✓ Passwords
- ✓ Access Tokens
- ✓ Secret Keys
Infrastructure
- ✓ Database Connections
- ✓ Server URLs
- ✓ Internal Hostnames
- ✓ AWS/GCP Keys
Healthcare
- ✓ Medical Record Numbers
- ✓ Patient IDs
- ✓ Health Plan Numbers
- ✓ NPI Numbers
See It In Action
BEFORE
Name: John Smith
SSN: 487-52-8914
Email: john.smith@acme.com
API Key: sk_live_4eC39HqLyjWDarjtT1zdp7dc
Card: 4532-8891-2234-5567
AFTER
Name: [NAME_a7f2]
SSN: [SSN_b3c1]
Email: [EMAIL_d4e8]
API Key: [API_KEY_f9a2]
Card: [CREDIT_CARD_c5b7]
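To give a feel for how a before/after transformation like this can work, here is a minimal sketch that swaps matches for typed placeholder tokens. It is not Redactorr's engine. The three regular expressions are deliberately simple, the `token_for` and `redact` helpers are hypothetical, and the hash-derived suffix is an assumption, so the tokens it produces will not match the ones shown above. Names and API keys are left out because they need more than a short pattern.

```python
import hashlib
import re

# Illustrative patterns only; a production engine needs far stricter validation.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "CREDIT_CARD": re.compile(r"\b\d{4}(?:-\d{4}){3}\b"),
}


def token_for(label: str, value: str) -> str:
    """Build a placeholder like [SSN_xxxx]; the suffix is the first 4 hex digits of a SHA-256 hash."""
    suffix = hashlib.sha256(value.encode("utf-8")).hexdigest()[:4]
    return f"[{label}_{suffix}]"


def redact(text: str) -> str:
    """Replace every pattern match with its typed placeholder token."""
    for label, pattern in PATTERNS.items():
        # Bind label as a default argument so each lambda keeps its own label
        text = pattern.sub(lambda m, label=label: token_for(label, m.group(0)), text)
    return text


sample = "SSN: 487-52-8914\nEmail: john.smith@acme.com\nCard: 4532-8891-2234-5567"
print(redact(sample))
```

Deriving the suffix from the value means the same input always maps to the same token, which keeps references consistent within a document. An unkeyed hash of a low-entropy value such as an SSN can be brute-forced, though, so a real tool would more likely use a keyed hash or a random mapping.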
Ready to See For Yourself?
Try Redactorr free and see the accuracy in action.
Start Redacting Free →