Tools Module

The Tools module optimizes AI agent function calling by compressing tool schemas and responses. In benchmarks it achieves a 57% average token reduction while remaining 100% lossless: every protocol-level field is preserved.
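A minimal quick-start sketch (tool_schema and tool_response stand in for your own objects):

from prompt_refiner import SchemaCompressor, ResponseCompressor

# Compress a tool schema once, at startup
compressed_schema = SchemaCompressor().process(tool_schema)  # tool_schema: your schema dict

# Compress each tool response before it goes back to the LLM
compact_response = ResponseCompressor().process(tool_response)  # tool_response: your tool output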

SchemaCompressor

Compress tool/function schemas (OpenAI, Anthropic format) while preserving 100% of protocol fields.

prompt_refiner.tools.SchemaCompressor

SchemaCompressor(
    drop_examples=True,
    drop_titles=True,
    drop_markdown_formatting=True,
)

Bases: Refiner

Compress tool schemas to save tokens while preserving functionality.

This operation compresses tool schema definitions (e.g., OpenAI function calling schemas) by removing documentation overhead while keeping all protocol-level fields intact.

What is modified:
  • description fields (truncated and cleaned)
  • title fields (removed if configured)
  • examples fields (removed if configured)
  • markdown formatting (removed if configured)
  • excessive whitespace

What is never modified:
  • name
  • type
  • properties
  • required
  • enum
  • any other protocol-level fields

Parameters:

Name                      Type  Description                 Default
drop_examples             bool  Remove examples fields      True
drop_titles               bool  Remove title fields         True
drop_markdown_formatting  bool  Remove markdown formatting  True

Example

from prompt_refiner import SchemaCompressor

tools = [{
    "type": "function",
    "function": {
        "name": "search_flights",
        "description": "Search for available flights between two airports. "
                       "This is a very long description with examples...",
        "parameters": {
            "type": "object",
            "properties": {
                "origin": {
                    "type": "string",
                    "description": "Origin airport IATA code, like LAX"
                }
            }
        }
    }
}]

compressor = SchemaCompressor(drop_markdown_formatting=True)
compressed = compressor.process(tools)

# Markdown removed, tokens saved!

Use Cases
  • Function Calling: Reduce token usage in OpenAI/Anthropic function schemas
  • Agent Systems: Optimize tool definitions in agent prompts
  • Cost Reduction: Save 20-60% tokens on verbose tool schemas
  • Context Management: Fit more tools within token budget
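For the context-management case, a hedged sketch that checks whether the compressed tools fit a budget (the 4,000-token budget and raw_tools are illustrative, not library constants):

import json
import tiktoken
from prompt_refiner import SchemaCompressor

encoder = tiktoken.encoding_for_model("gpt-4")
TOKEN_BUDGET = 4_000  # illustrative budget

compressor = SchemaCompressor()
compressed_tools = [compressor.process(t) for t in raw_tools]  # raw_tools: your schema list

total = sum(len(encoder.encode(json.dumps(t))) for t in compressed_tools)
if total > TOKEN_BUDGET:
    print(f"Tools use {total} tokens; trim descriptions or drop tools")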

Initialize schema compressor.

Source code in src/prompt_refiner/tools/schema_compressor.py
def __init__(
    self,
    drop_examples: bool = True,
    drop_titles: bool = True,
    drop_markdown_formatting: bool = True,
):
    """
    Initialize schema compressor.

    Args:
        drop_examples: Remove examples fields (default: True)
        drop_titles: Remove title fields (default: True)
        drop_markdown_formatting: Remove markdown formatting (default: True)
    """
    self.drop_examples = drop_examples
    self.drop_titles = drop_titles
    self.drop_markdown_formatting = drop_markdown_formatting

Functions

process
process(tool)

Process a single tool schema and return compressed JSON.

Parameters:

Name  Type  Description                                                     Default
tool  JSON  Tool schema dictionary (e.g., OpenAI function calling schema)   required

Returns:

Type  Description
JSON  Compressed tool schema dictionary

Example

tool = {
    "type": "function",
    "function": {
        "name": "search",
        "description": "Search for items...",
        "parameters": {...}
    }
}
compressor = SchemaCompressor()
compressed = compressor.process(tool)

Source code in src/prompt_refiner/tools/schema_compressor.py
def process(self, tool: JSON) -> JSON:
    """
    Process a single tool schema and return compressed JSON.

    Args:
        tool: Tool schema dictionary (e.g., OpenAI function calling schema)

    Returns:
        Compressed tool schema dictionary

    Example:
        >>> tool = {
        ...     "type": "function",
        ...     "function": {
        ...         "name": "search",
        ...         "description": "Search for items...",
        ...         "parameters": {...}
        ...     }
        ... }
        >>> compressor = SchemaCompressor()
        >>> compressed = compressor.process(tool)
    """
    # Compress the tool
    return self._compress_single_tool(tool)

Key Features

  • 57% average reduction across 20 real-world API schemas
  • 100% lossless - all protocol fields preserved (name, type, required, enum)
  • 100% callable (20/20 validated) - all compressed schemas work correctly with OpenAI function calling
  • 70%+ reduction on enterprise APIs (HubSpot, Salesforce, OpenAI)
  • Works with the OpenAI and Anthropic function calling formats

Examples

from prompt_refiner import SchemaCompressor

# Basic usage
tool_schema = {
    "type": "function",
    "function": {
        "name": "search_products",
        "description": "Search for products in the e-commerce catalog...",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query..."}
            },
            "required": ["query"]
        }
    }
}

compressor = SchemaCompressor()
compressed = compressor.process(tool_schema)
# Result: 30-70% smaller, functionally identical

# With Pydantic
from pydantic import BaseModel, Field
from openai import OpenAI, pydantic_function_tool
from prompt_refiner import SchemaCompressor

class SearchInput(BaseModel):
    query: str = Field(description="The search query...")
    category: str | None = Field(default=None, description="Filter by category...")

# Generate and compress schema
tool_schema = pydantic_function_tool(SearchInput, name="search")
compressed = SchemaCompressor().process(tool_schema)

# Use with OpenAI
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4",
    messages=[...],
    tools=[compressed]  # Compressed but functionally identical
)

# Batch compression
tools = [search_schema, create_schema, update_schema, delete_schema]
compressor = SchemaCompressor()
compressed_tools = [compressor.process(tool) for tool in tools]

# Use all compressed tools
response = client.chat.completions.create(
    model="gpt-4",
    messages=[...],
    tools=compressed_tools
)

What Gets Compressed

✅ Optimized (Documentation):
  • description fields (main source of verbosity)
  • Redundant explanations and examples
  • Marketing language and filler words
  • Overly detailed parameter descriptions

❌ Never Modified (Protocol):
  • Function name
  • Parameter names
  • Parameter type (string, number, boolean, etc.)
  • required fields list
  • enum values
  • default values
  • JSON structure

100% Lossless

SchemaCompressor never modifies protocol fields. The compressed schema is functionally identical to the original: LLMs will call the function with the same arguments.
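To make the guarantee concrete, a small sketch that spot-checks protocol fields after compression (the assertions are illustrative, not part of the library):

from prompt_refiner import SchemaCompressor

compressed = SchemaCompressor().process(tool_schema)  # tool_schema: an OpenAI-style schema

orig_fn, comp_fn = tool_schema["function"], compressed["function"]

# Protocol fields survive compression unchanged
assert comp_fn["name"] == orig_fn["name"]
assert comp_fn["parameters"].get("required") == orig_fn["parameters"].get("required")
assert set(comp_fn["parameters"]["properties"]) == set(orig_fn["parameters"]["properties"])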


ResponseCompressor

Compress verbose API/tool responses before sending back to the LLM.

prompt_refiner.tools.ResponseCompressor

ResponseCompressor(
    drop_keys=None,
    drop_null_fields=True,
    drop_empty_fields=True,
    max_depth=8,
    add_truncation_marker=True,
    truncation_suffix="… (truncated)",
)

Bases: Refiner

Compress tool responses to reduce token usage before sending to LLM.

This operation compresses JSON-like tool responses by removing verbose content while preserving essential information. Perfect for agent systems that need to fit tool outputs within LLM context windows.

What is modified:
  • Long strings (truncated to 512 chars)
  • Long lists (truncated to 16 items)
  • Debug/trace/log fields (removed if in drop_keys)
  • Null values (removed if drop_null_fields=True)
  • Empty containers (removed if drop_empty_fields=True)
  • Deep nesting (truncated beyond max_depth)

What is preserved:
  • Overall structure (dict keys, list order)
  • Essential data fields
  • Numbers and booleans (never modified)
  • Type information

IMPORTANT: Use this ONLY for LLM-facing payloads. Do NOT use compressed output for business logic or APIs that expect complete data.

Parameters:

Name                   Type             Description                                                 Default
drop_keys              Set[str] | None  Field names to remove (default: debug, trace, logs, etc.)  None
drop_null_fields       bool             Remove fields with None values                              True
drop_empty_fields      bool             Remove empty strings/lists/dicts                            True
max_depth              int              Maximum nesting depth before truncation                     8
add_truncation_marker  bool             Add markers when truncating                                 True
truncation_suffix      str              Suffix for truncated content                                '… (truncated)'

Example

from prompt_refiner import ResponseCompressor

# Compress API response before sending to LLM
compressor = ResponseCompressor()
response = {
    "results": ["item1", "item2"] * 100,  # 200 items
    "debug": {"trace": "..."},
    "data": "x" * 1000
}
compressed = compressor.process(response)

# Result: results limited to 16 items, debug removed, data truncated to 512 chars

Use Cases
  • Agent Systems: Compress verbose tool outputs before sending to LLM
  • API Integration: Reduce token usage from third-party API responses
  • Cost Optimization: Save 30-70% tokens on verbose tool responses
  • Context Management: Fit more tool results within token budget
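Because compressed output is for LLM-facing payloads only (see the warning above), a common pattern is to keep both views; a sketch (api, query, and messages are illustrative):

import json
from prompt_refiner import ResponseCompressor

full_response = api.search(query)  # your own tool call
llm_view = ResponseCompressor().process(full_response)

# Business logic reads the complete response...
ids = [item["id"] for item in full_response["results"]]

# ...while only the compressed view enters the prompt
messages.append({"role": "tool", "content": json.dumps(llm_view)})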

Initialize ResponseCompressor with compression settings.

Source code in src/prompt_refiner/tools/response_compressor.py
def __init__(
    self,
    drop_keys: Set[str] | None = None,
    drop_null_fields: bool = True,
    drop_empty_fields: bool = True,
    max_depth: int = 8,
    add_truncation_marker: bool = True,
    truncation_suffix: str = "… (truncated)",
):
    """Initialize ResponseCompressor with compression settings."""
    # Hardcoded limits for simplicity
    self.max_string_chars = 512
    self.max_list_items = 16
    self.drop_keys = drop_keys or {
        "debug",
        "trace",
        "traces",
        "stack",
        "stacktrace",
        "logs",
        "logging",
    }
    self.drop_null_fields = drop_null_fields
    self.drop_empty_fields = drop_empty_fields
    self.max_depth = max_depth
    self.add_truncation_marker = add_truncation_marker
    self.truncation_suffix = truncation_suffix

Functions

process
process(response)

Compress tool response data.

Parameters:

Name      Type  Description            Default
response  JSON  Tool response as dict  required

Returns:

Type  Description
JSON  Compressed response as dict

Example

response = {
    "results": [{"data": "x" * 1000}] * 100,
    "debug": {"trace": "..."}
}
compressor = ResponseCompressor()
compressed = compressor.process(response)

# Result: debug removed, results truncated, data shortened
Source code in src/prompt_refiner/tools/response_compressor.py
def process(self, response: JSON) -> JSON:
    """
    Compress tool response data.

    Args:
        response: Tool response as dict

    Returns:
        Compressed response as dict

    Example:
        >>> response = {
        ...     "results": [{"data": "x" * 1000}] * 100,
        ...     "debug": {"trace": "..."}
        ... }
        >>> compressor = ResponseCompressor()
        >>> compressed = compressor.process(response)
        >>> # Result: debug removed, results truncated, data shortened
    """
    # Compress the response
    return self._compress_any(response, depth=0)

Key Features

  • 25.8% average reduction on 20 real-world API responses (range: 14-53%)
  • Removes debug/trace/logs fields automatically
  • Truncates long strings (> 512 chars) and lists (> 16 items)
  • Preserves essential data structure
  • 52.7% reduction on verbose responses such as the Stripe Payment API

Examples

from prompt_refiner import ResponseCompressor

# Basic usage
api_response = {
    "results": [
        {"id": 1, "name": "Product A", "price": 29.99},
        {"id": 2, "name": "Product B", "price": 39.99},
        # ... 100 more results
    ],
    "debug_info": {
        "query_time_ms": 45,
        "cache_hit": True,
        "server": "api-01"
    },
    "trace_id": "abc123...",
    "logs": ["Started query", "Fetched from DB", ...]
}

compressor = ResponseCompressor()
compact = compressor.process(api_response)
# Result: Essential data kept, debug/trace/logs removed, long lists truncated

# In agent workflow
from prompt_refiner import SchemaCompressor, ResponseCompressor
import openai
import json

client = openai.OpenAI()

# 1. Compress tool schema
tool_schema = {...}
compressed_schema = SchemaCompressor().process(tool_schema)

# 2. Call LLM with compressed schema
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Search for Python books"}],
    tools=[compressed_schema]
)

# 3. Execute tool
tool_call = response.choices[0].message.tool_calls[0]
function_args = json.loads(tool_call.function.arguments)
tool_response = search_books(**function_args)  # Verbose response

# 4. Compress response before sending to LLM
compact_response = ResponseCompressor().process(tool_response)

# 5. Continue conversation with compressed response
final_response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "Search for Python books"},
        response.choices[0].message,
        {
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(compact_response)  # Compressed
        }
    ]
)

What Gets Compressed

Removed:
  • Debug fields (debug_info, trace_id, _debug)
  • Log fields (logs, log, trace, _trace)
  • Excessive metadata

Truncated:
  • Long strings (> 512 chars)
  • Long lists (> 16 items)
  • Deep nesting (beyond max_depth, default 8 levels)

Preserved:
  • Essential data fields
  • JSON structure
  • Data types

Configuration

ResponseCompressor hardcodes two limits for simplicity and exposes the rest as constructor parameters:

  • String limit: 512 characters (hardcoded)
  • List limit: 16 items (hardcoded)
  • Max depth: 8 levels (max_depth parameter)
  • Drop nulls: True (drop_null_fields parameter)
  • Drop empty containers: True (drop_empty_fields parameter)

Sensible Defaults

The defaults work well for most API responses, so no configuration is needed for the common case; override the constructor parameters only when you need different behavior.
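When the defaults don't fit, the constructor parameters shown in the source above can be overridden; a sketch (the field names in drop_keys are illustrative):

from prompt_refiner import ResponseCompressor

compressor = ResponseCompressor(
    drop_keys={"debug", "telemetry", "audit_log"},  # replaces the default drop set
    max_depth=4,                                    # truncate nesting sooner
    truncation_suffix=" …",                         # custom truncation marker
)
compact = compressor.process(verbose_response)  # verbose_response: your tool output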


Cost Savings

Typical savings for different agent sizes using GPT-4 ($0.03/1k input tokens):

Agent Size  Tools  Calls/Day  Monthly Savings  Annual Savings
Small           5        100             $44            $528
Medium         10        500            $541          $6,492
Large          20      1,000          $3,249         $38,988
Enterprise     50      5,000         $40,664        $487,968

Based on the 56.9% average schema reduction.
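The arithmetic behind these estimates, as a sketch (tokens_per_schema is an illustrative assumption, so the result will not match the table exactly):

# Estimated monthly savings from schema compression
tools = 10                      # tools sent with every request
calls_per_day = 500             # LLM calls per day
tokens_per_schema = 400         # assumed average schema size (illustrative)
reduction = 0.569               # 56.9% average schema reduction
price_per_token = 0.03 / 1000   # GPT-4 input pricing

saved = tools * tokens_per_schema * reduction * calls_per_day * 30
print(f"${saved * price_per_token:,.0f}/month")  # ≈ $1,024 with these assumptions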


Best Practices

1. Compress Schemas Once, Reuse

# At application startup
compressor = SchemaCompressor()
COMPRESSED_TOOLS = [
    compressor.process(search_schema),
    compressor.process(create_schema),
    compressor.process(update_schema)
]

# In agent loop - reuse compressed schemas
response = client.chat.completions.create(
    model="gpt-4",
    messages=messages,
    tools=COMPRESSED_TOOLS  # Reuse
)

2. Always Compress Responses

# Always compress before sending to LLM
tool_response = api.search(query)
compact = ResponseCompressor().process(tool_response)
messages.append({"role": "tool", "content": json.dumps(compact)})

3. Monitor Token Savings

import tiktoken

encoder = tiktoken.encoding_for_model("gpt-4")

original_tokens = len(encoder.encode(json.dumps(original_schema)))
compressed_tokens = len(encoder.encode(json.dumps(compressed_schema)))

print(f"Saved {original_tokens - compressed_tokens} tokens")
print(f"Reduction: {(1 - compressed_tokens/original_tokens)*100:.1f}%")

Benchmark Results

See the comprehensive benchmark results for detailed performance on 20 real-world API schemas.
