Chain tool calls together reliably by helping your AI agents memorize execution-proven code.
Drop custom tools and go from 80% to 99% reliability by running only the code your agent previously generated that satisfied users.
Raysurfer surfaces the best code LLMs need, the moment they need it.
Inputs
LLM agents repeat the same patterns constantly. Raysurfer retrieves proven code and runs it with new inputs—no regeneration needed.
{
  "user": "Update our Q3 revenue ($1.02B) in the quarterly report and sync it with the investor deck."
}

I need to understand the task and find the relevant files.

Calling list_files("./reports/")...
quarterly_report.xlsx, investor_deck.pptx, Q3_data.csv

Found the files. Let me read the quarterly report first.

Calling read_file("quarterly_report.xlsx")...
| Quarter | Revenue | Growth |
| Q3 2024 | $1.02B  | +11%   |

Found Q3 revenue. Now I need to read the investor deck.

Calling read_file("investor_deck.pptx")...
Slide 4: Q3 Financials - Revenue: $1.02B (+11% YoY)

I also need to check the board presentation and analyst briefing.

Calling read_file("board_presentation.pptx")...
Slide 12: Financial Summary - Q3 Revenue: $1.02B (+11% YoY)

Found another file to update. Let me check the analyst briefing.

Calling read_file("analyst_briefing.docx")...
Section 3: Q3 Performance - Revenue figure: $1.02B (+11% YoY)

Need to also check the internal dashboard config.

Calling read_file("dashboard_config.json")...
{ "q3_revenue": null, "last_updated": "2024-09-01" }

Context limit approaching. Summarizing conversation...
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Summary: User wants to update Q3 revenue ($1.02B) in
6 files: quarterly_report, investor_deck, board_presentation,
analyst_briefing, dashboard_config, and sync all references.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

I'll spawn subagents to update the Excel and PowerPoint files.

Calling spawn_subagents(["report", "deck", "board"])...

┌─ Subagent A ──────────────────────────┐
│ Updating quarterly_report.xlsx... │
│ > LLM thinking... │
│ > Reading cell C4 │
│ > LLM thinking... │
│ > Writing $1.02B to C4 │
│ > LLM thinking... │
│ > Updating formulas in D4:F4 │
│ > LLM thinking... │
│ > Recalculating dependent cells │
│ > Formatting as currency │
│ ✓ Done │
└───────────────────────────────────────┘
┌─ Subagent B ──────────────────────────┐
│ Updating investor_deck.pptx... │
│ > LLM thinking... │
│ > Finding slide 4 │
│ > LLM thinking... │
│ > Locating revenue placeholder │
│ > LLM thinking... │
│ > Replacing with $1.02B │
│ > Updating chart data │
│ > Regenerating chart visuals │
│ ✓ Done │
└───────────────────────────────────────┘
┌─ Subagent C ──────────────────────────┐
│ Updating board_presentation.pptx... │
│ > LLM thinking... │
│ > Finding slide 12 │
│ > LLM thinking... │
│ > Locating financial summary │
│ > LLM thinking... │
│ > Replacing revenue figure │
│ > Updating executive summary │
│ ✓ Done │
└───────────────────────────────────────┘

Subagents completed. Now I need to update the document files.

Checking results from first batch...
✓ quarterly_report.xlsx: C4 = $1.02B, formulas updated
✓ investor_deck.pptx: Slide 4 updated with chart
✓ board_presentation.pptx: Slide 12 updated

First batch complete. Now spawning subagents for documents.

Calling spawn_subagents(["briefing", "config"])...

┌─ Subagent D ──────────────────────────┐
│ Updating analyst_briefing.docx... │
│ > LLM thinking... │
│ > Parsing document structure │
│ > LLM thinking... │
│ > Finding Section 3 │
│ > LLM thinking... │
│ > Locating revenue placeholder │
│ > LLM thinking... │
│ > Replacing with $1.02B │
│ > Updating table of contents │
│ > Checking cross-references │
│ ✓ Done │
└───────────────────────────────────────┘
┌─ Subagent E ──────────────────────────┐
│ Updating dashboard_config.json... │
│ > LLM thinking... │
│ > Reading current config │
│ > LLM thinking... │
│ > Validating JSON schema │
│ > LLM thinking... │
│ > Updating q3_revenue field │
│ > LLM thinking... │
│ > Updating last_updated timestamp │
│ > Writing validated config │
│ ✓ Done │
└───────────────────────────────────────┘

Second batch complete. Now I need to verify all changes.

Calling verify_all_files()...
✓ analyst_briefing.docx: Section 3 updated
✓ dashboard_config.json: q3_revenue = 1.02B

All files updated. Now I need to sync the data warehouse.

Calling spawn_subagents(["warehouse", "cache"])...

┌─ Subagent F ──────────────────────────┐
│ Syncing data warehouse... │
│ > LLM thinking... │
│ > Connecting to warehouse │
│ > LLM thinking... │
│ > Querying current Q3 records │
│ > LLM thinking... │
│ > Preparing update statement │
│ > LLM thinking... │
│ > Executing UPDATE query │
│ > LLM thinking... │
│ > Verifying row count │
│ > Committing transaction │
│ ✓ Done │
└───────────────────────────────────────┘
┌─ Subagent G ──────────────────────────┐
│ Invalidating cache entries... │
│ > LLM thinking... │
│ > Connecting to Redis │
│ > LLM thinking... │
│ > Finding Q3 revenue keys │
│ > LLM thinking... │
│ > Deleting stale cache entries │
│ > LLM thinking... │
│ > Pre-warming cache with new data │
│ > Verifying cache consistency │
│ ✓ Done │
└───────────────────────────────────────┘

Context limit approaching. Summarizing progress...
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Updated: quarterly_report, investor_deck, board_presentation,
analyst_briefing, dashboard_config, data warehouse, cache.
Remaining: notification system, audit log.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Need to notify stakeholders and update audit log.

Calling spawn_subagents(["notify", "audit"])...

┌─ Subagent H ──────────────────────────┐
│ Sending stakeholder notifications... │
│ > LLM thinking... │
│ > Fetching stakeholder list │
│ > LLM thinking... │
│ > Composing email template │
│ > LLM thinking... │
│ > Personalizing for CFO │
│ > LLM thinking... │
│ > Personalizing for IR team │
│ > LLM thinking... │
│ > Sending via SendGrid │
│ > Verifying delivery status │
│ ✓ Done │
└───────────────────────────────────────┘
┌─ Subagent I ──────────────────────────┐
│ Updating audit log... │
│ > LLM thinking... │
│ > Creating audit entry │
│ > LLM thinking... │
│ > Recording file changes │
│ > LLM thinking... │
│ > Adding user attribution │
│ > LLM thinking... │
│ > Generating change hash │
│ > Writing to immutable log │
│ ✓ Done │
└───────────────────────────────────────┘

All subagents completed. Let me run final verification.

Calling verify_all_systems()...
✓ Warehouse: 1 row updated
✓ Cache: 3 keys invalidated, 3 pre-warmed
✓ Notifications: 5 emails sent
✓ Audit log: Entry #48291 created

Running comprehensive verification across all systems.

Calling run_integration_tests()...
Running 12 integration tests...
✓ File consistency check: PASSED
✓ Database integrity: PASSED
✓ Cache coherence: PASSED

All tests passed. Generating comprehensive summary report.

Calling generate_summary()...
Summary report generated: update_summary_2024Q3.pdf

Summary generated. Finalizing and preparing response...

Done! Updated Q3 revenue to $1.02B across 6 files,
synced warehouse, invalidated cache, notified 5 stakeholders,
and created audit log entry.

{
  "user": "Update our Q3 revenue ($1.02B) in the quarterly report and sync it with the investor deck."
}

Understanding task: update Q3 revenue in report and deck.

Searching for cached solution...
> Match: update_financials_and_sync.py
> Similarity: 0.96 | Verdict: +52 | Runs: 8,431

> Running cached code with params:
>   revenue=$1.02B, files=[report, deck, board, briefing, config, warehouse]

✓ Updated all 6 files and synced systems in 6s

For LLMs that generate code and execute it live: cache what works, skip what doesn't.
Code files generated by LLMs are cached. Retrieve and run proven code instead of regenerating.
Track which code executions succeeded or failed. Future agents retrieve successful code and avoid patterns that didn't work.
B2B vertical AI code is predictable. The same report generator, the same API client: perfect for caching and reuse.
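The cache-first flow above can be sketched in a few lines. This is an illustrative, self-contained sketch, not the Raysurfer API: the in-memory `code_cache`, `generate_code`, and `execute` functions are hypothetical stand-ins for an LLM call and a sandboxed runner.

```python
# Illustrative sketch of a cache-first execution loop (not the Raysurfer API).
# Cached snippets carry a verdict count; only proven code is reused.

code_cache = {}  # task description -> {"code": str, "verdict": int}

def generate_code(task):
    # Hypothetical stand-in for an LLM call that writes fresh code.
    return f"print('completed: {task}')"

def execute(code, params):
    # Hypothetical stand-in for a sandboxed runner; returns success/failure.
    return True

def run_task(task, params):
    cached = code_cache.get(task)
    if cached and cached["verdict"] > 0:
        code = cached["code"]          # cache hit: skip regeneration
    else:
        code = generate_code(task)     # cache miss: generate once, then cache
        code_cache[task] = {"code": code, "verdict": 0}
    ok = execute(code, params)
    # Thumbs up on success, thumbs down on failure.
    code_cache[task]["verdict"] += 1 if ok else -1
    return ok

run_task("update quarterly report", {"revenue": "$1.02B"})  # generates
run_task("update quarterly report", {"revenue": "$1.10B"})  # reuses cached code
```

The second call runs the already-cached code with new parameters, which is the core of the regeneration savings.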
.search() to retrieve cached snippets. .upload() to cache new ones.
Call .search() with a natural language query. Hybrid search finds the most relevant cached code from prior agent runs.
Call .upload() with the task, file, and result. Raysurfer indexes it with semantic embeddings for future retrieval.
Code that works gets thumbs up, code that fails gets thumbs down. Verdict-aware scoring improves retrieval over time.
from raysurfer import AsyncRaySurfer

rs = AsyncRaySurfer()

# Retrieve cached snippets
results = await rs.search("Update quarterly report")

# Cache new code after execution
await rs.upload(task, file, succeeded)

Also available as a drop-in replacement for the Claude Agent SDK that handles caching automatically via RaysurferClient.
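The verdict-aware scoring described above can be approximated locally. A minimal sketch, assuming a simple blend of text similarity and accumulated thumbs-up/down verdicts; the `difflib` similarity and the 0.1 verdict weight are illustrative choices, not Raysurfer's actual ranking:

```python
import difflib

def similarity(query, text):
    # Illustrative lexical similarity; a production system would use embeddings.
    return difflib.SequenceMatcher(None, query.lower(), text.lower()).ratio()

def score(query, snippet, verdict_weight=0.1):
    # Blend relevance with execution track record (thumbs up minus thumbs down).
    return similarity(query, snippet["task"]) + verdict_weight * snippet["verdict"]

cache = [
    {"task": "update quarterly report revenue", "verdict": 52},
    {"task": "update quarterly report revenue", "verdict": -3},
    {"task": "generate invoice pdf", "verdict": 10},
]

best = max(cache, key=lambda s: score("Update quarterly report", s))
# The proven snippet outranks an equally relevant one that kept failing.
```

Two snippets can match a query equally well; the one with the better execution history wins, which is why failure tracking improves retrieval over time.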
Why are we all paying to regenerate the same tokens?
Every time your agent runs, you wait for tokens to generate. The same patterns. The same outputs. Every. Single. Time.
You're paying for tokens. You're waiting for generation. For code that's already been generated somewhere else.
Raysurfer retrieves and runs proven code from previous executions. No waiting. No regenerating. Just execute.
Stop watching your agent think. Get instant results from code that already works.
Perfect for long-running tasks: Dynamic code generation becomes trivial when your generated code is already context-managed. No more iteration loops. No more regeneration cycles. The code exists, it's been validated, it just needs to be executed.
“More tokens output causes a decrease in accuracy causes even more token output.”
Break the cycle. The median LLM agent task in B2B SaaS is surprisingly low-variance: everything a human eventually views is just HTML, a PDF, or a doc.
Track the user satisfaction rate of running different pieces of code
Estimate from the code and logs what user queries this code would best solve
.search() and .upload() — two calls, that's it
Free tier includes 500 API requests. No credit card required.