Modules Overview
Prompt Refiner is organized into 5 core modules plus measurement utilities.
The 5 Core Modules
1. Cleaner - Clean Dirty Data
The Cleaner module removes unwanted artifacts from your text.
Operations:
- StripHTML - Remove or convert HTML tags
- NormalizeWhitespace - Collapse excessive whitespace
- FixUnicode - Remove problematic Unicode characters
- JsonCleaner - Strip nulls/empties from JSON, minify
When to use:
- Processing web-scraped content
- Cleaning user-generated text
- Compressing JSON from RAG APIs
- Normalizing text from various sources
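For example, a minimal cleaning pass might look like the sketch below; it assumes a composed pipeline exposes a process() method, which this page only uses indirectly through TokenTracker:

from prompt_refiner import StripHTML, NormalizeWhitespace

# Strip HTML tags, then collapse the leftover whitespace
cleaner = StripHTML() | NormalizeWhitespace()
# Calling process() on the composed pipeline is an assumption here
clean_text = cleaner.process("<p>Hello   <b>world</b>!</p>")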
2. Compressor - Reduce Size
The Compressor module reduces token count while preserving meaning.
Operations:
- TruncateTokens - Truncate text to a token limit, respecting sentence boundaries
- Deduplicate - Remove similar or duplicate content
When to use:
- Fitting content within context windows
- Optimizing RAG retrieval results
- Reducing API costs
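A compression step can be sketched the same way; Deduplicate's constructor arguments are assumed defaults, and max_tokens is the only TruncateTokens parameter shown elsewhere on this page:

from prompt_refiner import Deduplicate, TruncateTokens

# Remove near-duplicate passages first, then truncate to the token budget
compressor = Deduplicate() | TruncateTokens(max_tokens=1000)
# process() on the composed pipeline is assumed, as in the Cleaner sketch
compressed = compressor.process("Your retrieved passages here...")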
3. Scrubber - Security & Privacy
The Scrubber module protects sensitive information.
Operations:
- RedactPII - Automatically redact personally identifiable information
When to use:
- Before sending data to external APIs
- Compliance with privacy regulations
- Protecting user data in logs
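A minimal redaction sketch, again assuming the operation can be run via process():

from prompt_refiner import RedactPII

scrubber = RedactPII()
# PII such as emails and phone numbers is redacted before the text leaves your system
safe_text = scrubber.process("Contact jane.doe@example.com or call 555-0100.")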
4. Packer - Context Budget Management
The Packer module manages context budgets with intelligent priority-based item selection.
Operations:
- MessagesPacker - Pack items for chat completion APIs
- TextPacker - Pack items for text completion APIs
When to use:
- RAG applications with multiple documents
- Chatbots with conversation history
- Managing context windows with size limits
- Combining system prompts, user input, and documents
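A full MessagesPacker walkthrough appears under Combining Modules below. The sketch here uses TextPacker and assumes it mirrors that interface (same constructor keywords, an add() method, and pack() returning a single prompt string); none of that is confirmed on this page:

from prompt_refiner import TextPacker, ROLE_CONTEXT

# Assumed to mirror MessagesPacker: system/query keywords plus add()
packer = TextPacker(
    system="You are a helpful assistant.",
    query="What is prompt-refiner?",
)
packer.add("Background document text...", role=ROLE_CONTEXT)
prompt = packer.pack()  # assumed to return one assembled prompt string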
5. Strategy - Preset Strategies
The Strategy module provides benchmark-tested preset strategies for quick setup.
Strategies:
- MinimalStrategy - 4.3% reduction, 98.7% quality
- StandardStrategy - 4.8% reduction, 98.4% quality
- AggressiveStrategy - 15% reduction, 96.4% quality
When to use:
- Quick setup without manual configuration
- When you want benchmark-tested optimization defaults
- As a base to extend with additional custom operations
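As a hedged sketch, a preset can stand in wherever a hand-built pipeline is used; how a strategy is actually invoked is not shown on this page, so wrapping it in TokenTracker (as in the Pipeline Example under Combining Modules) is an assumption:

from prompt_refiner import StandardStrategy, TokenTracker, character_based_counter

# Assumes a strategy is interchangeable with a hand-built pipeline
with TokenTracker(StandardStrategy(), character_based_counter) as tracker:
    result = tracker.process("Your text here...")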
Measurement Utilities
Analyzer - Measure Impact
The Analyzer module measures optimization impact but does not transform prompts. Use it to track token savings and demonstrate ROI.
Operations:
- TokenTracker - Measure token savings and calculate ROI
- Token Counters - Built-in functions for counting tokens
When to use:
- Demonstrating cost savings to stakeholders
- A/B testing optimization strategies
- Monitoring optimization impact over time
- Calculating ROI for prompt optimization
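The Pipeline Example under Combining Modules shows TokenTracker end to end. The snippet below only sketches the built-in counters, assuming they are plain callables that take a string and return a token count:

from prompt_refiner import character_based_counter

# Assumed signature: counter(text) -> int
estimated_tokens = character_based_counter("How many tokens is this prompt?")
print(estimated_tokens)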
Combining Modules
The real power comes from combining modules:
Pipeline Example
from prompt_refiner import (
    TokenTracker,             # Analyzer
    StripHTML,                # Cleaner
    NormalizeWhitespace,      # Cleaner
    TruncateTokens,           # Compressor
    RedactPII,                # Scrubber
    character_based_counter,  # Token counter
)

original_text = "Your text here..."

# Build pipeline
pipeline = (
    StripHTML()
    | NormalizeWhitespace()
    | TruncateTokens(max_tokens=1000)
    | RedactPII()
)

# Track optimization with TokenTracker
with TokenTracker(pipeline, character_based_counter) as tracker:
    result = tracker.process(original_text)

# Show token savings
stats = tracker.stats
print(f"Saved {stats['saved_tokens']} tokens ({stats['saving_percent']})")
Packer Example
from prompt_refiner import (
    MessagesPacker,
    ROLE_SYSTEM,
    ROLE_QUERY,
    ROLE_CONTEXT,
    StripHTML,
)

# Manage RAG context for chat APIs with automatic priorities
packer = MessagesPacker(
    system="You are a helpful assistant.",
    query="What is prompt-refiner?",
)

# Add retrieved documents with automatic cleaning
for doc in retrieved_docs:
    packer.add(
        doc.content,
        role=ROLE_CONTEXT,  # Auto-assigned PRIORITY_HIGH
        refine_with=StripHTML(),
    )

messages = packer.pack()  # Returns List[Dict] directly
Module Relationships
graph LR
A[Raw Input] --> B[Cleaner]
B --> C[Compressor]
C --> D[Scrubber]
D --> E[Optimized Output]
E -.-> F[Analyzer<br/>Measurement Only]
G[Multiple Items] --> H[Packer]
H --> I[Packed Context]
Note: Analyzer (dotted line) measures but doesn't transform the output.
Best Practices
- Order matters: Clean before compressing, compress before redacting
- Use Packer for RAG: When managing multiple documents with priorities
- Test your pipeline: Different inputs may need different operations
- Measure impact: Use TokenTracker to track token savings and demonstrate ROI
- Start simple: Begin with one module and add more as needed