Tools Module
The Tools module optimizes AI agent function calling by compressing tool schemas and responses. It achieves a 57% average token reduction while remaining 100% lossless.
SchemaCompressor
Compress tool/function schemas (OpenAI, Anthropic format) while preserving 100% of protocol fields.
prompt_refiner.tools.SchemaCompressor
Bases: Refiner
Compress tool schemas to save tokens while preserving functionality.
This operation compresses tool schema definitions (e.g., OpenAI function calling schemas) by removing documentation overhead while keeping all protocol-level fields intact.
What is modified:
- description fields (truncated and cleaned)
- title fields (removed if configured)
- examples fields (removed if configured)
- markdown formatting (removed if configured)
- excessive whitespace

What is never modified:
- name
- type
- properties
- required
- enum
- any other protocol-level fields
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| drop_examples | bool | Remove examples fields | True |
| drop_titles | bool | Remove title fields | True |
| drop_markdown_formatting | bool | Remove markdown formatting | True |
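As a minimal sketch of non-default settings, using only the flags documented in the table above (for instance, keeping examples fields when they double as few-shot hints):

```python
from prompt_refiner import SchemaCompressor

# Keep examples fields but still drop titles and markdown formatting
compressor = SchemaCompressor(
    drop_examples=False,
    drop_titles=True,
    drop_markdown_formatting=True,
)
```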
Example
from prompt_refiner import SchemaCompressor

tool = {
    "type": "function",
    "function": {
        "name": "search_flights",
        "description": (
            "Search for available flights between two airports. "
            "This is a very long description with examples..."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "origin": {
                    "type": "string",
                    "description": "Origin airport IATA code, like LAX"
                }
            }
        }
    }
}
compressor = SchemaCompressor(drop_markdown_formatting=True)
compressed = compressor.process(tool)
# Markdown removed, tokens saved!
Use Cases
- Function Calling: Reduce token usage in OpenAI/Anthropic function schemas
- Agent Systems: Optimize tool definitions in agent prompts
- Cost Reduction: Save 20-60% tokens on verbose tool schemas
- Context Management: Fit more tools within token budget
Initialize schema compressor. Parameters are as listed in the table above.
Source code in src/prompt_refiner/tools/schema_compressor.py
Functions
process
Process a single tool schema and return compressed JSON.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| tool | JSON | Tool schema dictionary (e.g., OpenAI function calling schema) | required |

Returns:

| Type | Description |
|---|---|
| JSON | Compressed tool schema dictionary |
Example
tool = {
    "type": "function",
    "function": {
        "name": "search",
        "description": "Search for items...",
        "parameters": {...}
    }
}
compressor = SchemaCompressor()
compressed = compressor.process(tool)
Source code in src/prompt_refiner/tools/schema_compressor.py
Key Features
- 57% average reduction across 20 real-world API schemas
- 100% lossless - all protocol fields preserved (name, type, required, enum)
- 100% callable (20/20 validated) - all compressed schemas work correctly with OpenAI function calling
- 70%+ reduction on enterprise APIs (HubSpot, Salesforce, OpenAI)
- Works with OpenAI and Anthropic function calling format
Examples
from prompt_refiner import SchemaCompressor
# Basic usage
tool_schema = {
"type": "function",
"function": {
"name": "search_products",
"description": "Search for products in the e-commerce catalog...",
"parameters": {
"type": "object",
"properties": {
"query": {"type": "string", "description": "Search query..."}
},
"required": ["query"]
}
}
}
compressor = SchemaCompressor()
compressed = compressor.process(tool_schema)
# Result: 30-70% smaller, functionally identical
# With Pydantic
from pydantic import BaseModel, Field
from openai import OpenAI, pydantic_function_tool
from prompt_refiner import SchemaCompressor
class SearchInput(BaseModel):
query: str = Field(description="The search query...")
category: str | None = Field(default=None, description="Filter by category...")
# Generate and compress schema
tool_schema = pydantic_function_tool(SearchInput, name="search")
compressed = SchemaCompressor().process(tool_schema)
# Use with OpenAI
client = OpenAI()
response = client.chat.completions.create(
model="gpt-4",
messages=[...],
tools=[compressed] # Compressed but functionally identical
)
# Batch compression
tools = [search_schema, create_schema, update_schema, delete_schema]
compressor = SchemaCompressor()
compressed_tools = [compressor.process(tool) for tool in tools]
# Use all compressed tools
response = client.chat.completions.create(
model="gpt-4",
messages=[...],
tools=compressed_tools
)
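The Key Features above state the compressor also works with Anthropic's function calling format, where tools put name, description, and input_schema at the top level. A hedged sketch, assuming process() accepts the Anthropic-style dict directly:

```python
from prompt_refiner import SchemaCompressor

# Anthropic tool format: name / description / input_schema at the top level
anthropic_tool = {
    "name": "search_products",
    "description": "Search for products in the e-commerce catalog...",
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Search query..."}
        },
        "required": ["query"],
    },
}

compressed = SchemaCompressor().process(anthropic_tool)
# name, the input_schema types, and required stay intact; descriptions shrink
```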
What Gets Compressed
✅ Optimized (Documentation):
- description fields (main source of verbosity)
- Redundant explanations and examples
- Marketing language and filler words
- Overly detailed parameter descriptions
❌ Never Modified (Protocol):
- Function name
- Parameter names
- Parameter type (string, number, boolean, etc.)
- required fields list
- enum values
- default values
- JSON structure
100% Lossless
SchemaCompressor never modifies protocol fields. The compressed schema is functionally identical to the original - LLMs will call the function with the same arguments.
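To make the guarantee concrete, an illustrative before/after sketch. The exact shortened wording is up to the library; the point is which fields change and which do not:

```python
original = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the **current** weather. Returns rich data...",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "title": "City",
                    "description": "**Name** of the city",
                    "examples": ["Paris", "Tokyo"],
                }
            },
            "required": ["city"],
        },
    },
}

# After compression (illustrative): title and examples are gone, markdown
# is stripped, and the description is shorter -- while name, type,
# properties, and required are identical. The "city" property becomes:
#   {"type": "string", "description": "Name of the city"}
```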
ResponseCompressor
Compress verbose API/tool responses before sending back to the LLM.
prompt_refiner.tools.ResponseCompressor
ResponseCompressor(
drop_keys=None,
drop_null_fields=True,
drop_empty_fields=True,
max_depth=8,
add_truncation_marker=True,
truncation_suffix="… (truncated)",
)
Bases: Refiner
Compress tool responses to reduce token usage before sending to LLM.
This operation compresses JSON-like tool responses by removing verbose content while preserving essential information. Perfect for agent systems that need to fit tool outputs within LLM context windows.
What is modified:
- Long strings (truncated to 512 chars)
- Long lists (truncated to 16 items)
- Debug/trace/log fields (removed if in drop_keys)
- Null values (removed if drop_null_fields=True)
- Empty containers (removed if drop_empty_fields=True)
- Deep nesting (truncated beyond max_depth)

What is preserved:
- Overall structure (dict keys, list order)
- Essential data fields
- Numbers and booleans (never modified)
- Type information
IMPORTANT: Use this ONLY for LLM-facing payloads. Do NOT use compressed output for business logic or APIs that expect complete data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| drop_keys | Set[str] \| None | Field names to remove (default: debug, trace, logs, etc.) | None |
| drop_null_fields | bool | Remove fields with None values | True |
| drop_empty_fields | bool | Remove empty strings/lists/dicts | True |
| max_depth | int | Maximum nesting depth before truncation | 8 |
| add_truncation_marker | bool | Add markers when truncating | True |
| truncation_suffix | str | Suffix for truncated content | '… (truncated)' |
Example
from prompt_refiner import ResponseCompressor
# Compress API response before sending to LLM
compressor = ResponseCompressor()
response = {
    "results": ["item1", "item2"] * 100,  # 200 items
    "debug": {"trace": "..."},
    "data": "x" * 1000
}
compressed = compressor.process(response)
# Result: results limited to 16 items, debug removed, data truncated to 512 chars
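As the IMPORTANT note above warns, the compressed copy is for the LLM only. A minimal sketch of the pattern, keeping the full response for business logic while the model sees the compressed view (flight_api, booking_service, and messages are hypothetical names for illustration):

```python
import json
from prompt_refiner import ResponseCompressor

full_response = flight_api.search("LAX", "JFK")  # hypothetical API call
llm_view = ResponseCompressor().process(full_response)

# Business logic keeps the complete payload...
booking_service.record(full_response)  # hypothetical downstream consumer
# ...while the LLM only ever sees the compressed copy.
messages.append({"role": "tool", "content": json.dumps(llm_view)})
```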
Use Cases
- Agent Systems: Compress verbose tool outputs before sending to LLM
- API Integration: Reduce token usage from third-party API responses
- Cost Optimization: Save 30-70% tokens on verbose tool responses
- Context Management: Fit more tool results within token budget
Initialize ResponseCompressor with compression settings.
Source code in src/prompt_refiner/tools/response_compressor.py
Functions
process
Compress tool response data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| response | JSON | Tool response as dict | required |

Returns:

| Type | Description |
|---|---|
| JSON | Compressed response as dict |
Example
response = {
    "results": [{"data": "x" * 1000}] * 100,
    "debug": {"trace": "..."}
}
compressor = ResponseCompressor()
compressed = compressor.process(response)
# Result: debug removed, results truncated, data shortened
Source code in src/prompt_refiner/tools/response_compressor.py
Key Features
- 25.8% average reduction on 20 real-world API responses (range: 14-53%)
- Removes debug/trace/logs fields automatically
- Truncates long strings (> 512 chars) and lists (> 16 items)
- Preserves essential data structure
- 52.7% reduction on verbose responses like Stripe Payment API
Examples
from prompt_refiner import ResponseCompressor
# Basic usage
api_response = {
"results": [
{"id": 1, "name": "Product A", "price": 29.99},
{"id": 2, "name": "Product B", "price": 39.99},
# ... 100 more results
],
"debug_info": {
"query_time_ms": 45,
"cache_hit": True,
"server": "api-01"
},
"trace_id": "abc123...",
"logs": ["Started query", "Fetched from DB", ...]
}
compressor = ResponseCompressor()
compact = compressor.process(api_response)
# Result: Essential data kept, debug/trace/logs removed, long lists truncated
# In agent workflow
from prompt_refiner import SchemaCompressor, ResponseCompressor
import openai
import json

client = openai.OpenAI()
# 1. Compress tool schema
tool_schema = {...}
compressed_schema = SchemaCompressor().process(tool_schema)
# 2. Call LLM with compressed schema
response = client.chat.completions.create(
model="gpt-4",
messages=[{"role": "user", "content": "Search for Python books"}],
tools=[compressed_schema]
)
# 3. Execute tool
tool_call = response.choices[0].message.tool_calls[0]
function_args = json.loads(tool_call.function.arguments)
tool_response = search_books(**function_args) # Verbose response
# 4. Compress response before sending to LLM
compact_response = ResponseCompressor().process(tool_response)
# 5. Continue conversation with compressed response
final_response = client.chat.completions.create(
model="gpt-4",
messages=[
{"role": "user", "content": "Search for Python books"},
response.choices[0].message,
{
"role": "tool",
"tool_call_id": tool_call.id,
"content": json.dumps(compact_response) # Compressed
}
]
)
What Gets Compressed
Removed:
- Debug fields (debug_info, trace_id, _debug)
- Log fields (logs, log, trace, _trace)
- Excessive metadata
Truncated:
- Long strings (> 512 chars)
- Long lists (> 16 items)
- Deep nesting (beyond max_depth, default 8 levels)

Preserved:
- Essential data fields
- JSON structure
- Data types
Configuration
ResponseCompressor ships with sensible defaults:
- String limit: 512 characters
- List limit: 16 items
- Max depth: 8 levels (adjustable via max_depth)
- Drop nulls: True (adjustable via drop_null_fields)
- Drop empty containers: True (adjustable via drop_empty_fields)
No Configuration Needed
The defaults work well for most API responses; for typical use, no configuration is required.
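When the defaults don't fit, the constructor parameters documented above can be adjusted. A minimal sketch (the field name internal_metadata is a hypothetical example):

```python
from prompt_refiner import ResponseCompressor

compressor = ResponseCompressor(
    drop_keys={"debug", "trace", "logs", "internal_metadata"},  # fields to strip outright
    drop_null_fields=True,
    drop_empty_fields=False,  # keep empty strings/lists/dicts
    max_depth=4,              # truncate nesting earlier than the default 8
    add_truncation_marker=True,
    truncation_suffix="…",
)
compact = compressor.process(api_response)
```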
Cost Savings
Typical savings for different agent sizes using GPT-4 ($0.03/1k input tokens):
| Agent Size | Tools | Calls/Day | Monthly Savings | Annual Savings |
|---|---|---|---|---|
| Small | 5 | 100 | $44 | $528 |
| Medium | 10 | 500 | $541 | $6,492 |
| Large | 20 | 1,000 | $3,249 | $38,988 |
| Enterprise | 50 | 5,000 | $40,664 | $487,968 |
Based on a 56.9% average schema reduction.
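A back-of-the-envelope sketch of how such figures are derived. The tokens-saved-per-call value below is an assumption back-solved from the Medium row; measure your own with the tiktoken snippet under "Monitor Token Savings" below:

```python
# Rough monthly-savings estimate for compressed schemas.
PRICE_PER_1K_INPUT = 0.03  # GPT-4 input price used in the table above

def monthly_savings(num_tools: int, calls_per_day: int,
                    tokens_saved_per_call: int = 120) -> float:
    """Estimated dollars saved per month from schema compression."""
    tokens_saved = num_tools * calls_per_day * 30 * tokens_saved_per_call
    return tokens_saved * PRICE_PER_1K_INPUT / 1000

print(f"${monthly_savings(10, 500):,.0f}/month")  # ≈ $540, close to the Medium row
```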
Best Practices
1. Compress Schemas Once, Reuse
# At application startup
compressor = SchemaCompressor()
COMPRESSED_TOOLS = [
compressor.process(search_schema),
compressor.process(create_schema),
compressor.process(update_schema)
]
# In agent loop - reuse compressed schemas
response = client.chat.completions.create(
model="gpt-4",
messages=messages,
tools=COMPRESSED_TOOLS # Reuse
)
2. Always Compress Responses
# Always compress before sending to LLM
tool_response = api.search(query)
compact = ResponseCompressor().process(tool_response)
messages.append({"role": "tool", "content": json.dumps(compact)})
3. Monitor Token Savings
import tiktoken
encoder = tiktoken.encoding_for_model("gpt-4")
original_tokens = len(encoder.encode(json.dumps(original_schema)))
compressed_tokens = len(encoder.encode(json.dumps(compressed_schema)))
print(f"Saved {original_tokens - compressed_tokens} tokens")
print(f"Reduction: {(1 - compressed_tokens/original_tokens)*100:.1f}%")
Benchmark Results
See comprehensive benchmark results for detailed performance on 20 real-world API schemas.