AI Optimizer
How it works

Route repeated AI work through one local path.

AI Optimizer runs locally and sits between your workflow and the selected provider API, helping reduce repeated spend with caching and clearer request visibility.

[Screenshot: annotated AI Optimizer app showing setup steps and live cache-hit stats.]
A single local setup path: activate the app, choose your provider, add your API key, start the proxy, and watch requests and cache hits update live.
1. Activate the app and choose your provider

Enter your license, then select OpenAI or Anthropic in the desktop app. AI Optimizer supports one active provider at a time in v2.2.0.

2. Save the API key for that provider

You can store both keys in the app, but the proxy routes traffic only to the currently selected provider.

3. Start the local proxy

Once the proxy is running, AI Optimizer listens on http://localhost:3000/v1.

4. Point your workflow at localhost

Your tools, scripts, or automations send requests to AI Optimizer first instead of calling the upstream provider directly.

5. Cache repeated requests and track behavior

Repeated work can be served locally while the app shows request totals, cache hits, and hit rate so you can confirm the optimizer is doing useful work.
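The stats in step 5 come down to simple counters: total requests, cache hits, and the ratio between them. A minimal sketch of how such tracking could work (the names here are illustrative, not AI Optimizer's internals):

```python
class CacheStats:
    """Illustrative request/cache-hit counters, similar in spirit to
    what the AI Optimizer dashboard reports (not its actual code)."""

    def __init__(self):
        self.requests = 0
        self.hits = 0

    def record(self, served_from_cache: bool) -> None:
        self.requests += 1
        if served_from_cache:
            self.hits += 1

    @property
    def hit_rate(self) -> float:
        # Fraction of requests served locally; 0.0 before any traffic.
        return self.hits / self.requests if self.requests else 0.0


stats = CacheStats()
for cached in (False, True, True, False, True):
    stats.record(cached)

print(stats.requests, stats.hits, round(stats.hit_rate, 2))  # 5 3 0.6
```

A rising hit rate here is the signal that repeated work is being served locally instead of billed upstream.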

Typical config change

Most workflows only need one practical change.

For many OpenAI-compatible tools, the main setup step is routing traffic through AI Optimizer locally:

OPENAI_BASE_URL=http://localhost:3000/v1

That lets your workflow hit the local optimizer first instead of going directly to the provider API every time.
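One way to picture what that change does: many OpenAI-compatible clients read a base URL from the environment (falling back to the hosted endpoint) and prefix every request path with it. A hedged sketch of that lookup, not any particular SDK's code:

```python
import os


def resolve_endpoint(path: str) -> str:
    """Build a full request URL the way many OpenAI-compatible clients
    do: base URL from the environment, hosted API as the fallback."""
    base = os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1")
    return base.rstrip("/") + "/" + path.lstrip("/")


# Without the override, traffic goes straight to the provider.
print(resolve_endpoint("chat/completions"))

# With the one-line config change, the same code hits AI Optimizer first.
os.environ["OPENAI_BASE_URL"] = "http://localhost:3000/v1"
print(resolve_endpoint("chat/completions"))
```

The workflow code itself never changes; only the base URL it resolves against does.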

Supported providers

AI Optimizer 2.2.0 supports both OpenAI and Anthropic. You can store both API keys in the app and choose one active provider at a time through the desktop UI.

Provider-aware caching

Cache behavior is provider-aware: cached OpenAI and Anthropic responses are kept separate, so requests to one provider never collide with entries from the other, even though both flow through the same local proxy.
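A common way to get that isolation is to fold the provider name into the cache key, so identical payloads sent to different providers can never share an entry. A sketch of the idea (not AI Optimizer's actual key scheme):

```python
import hashlib
import json


def cache_key(provider: str, model: str, messages: list) -> str:
    """Provider-aware cache key: the provider name is part of the hashed
    payload, so identical requests to different providers never collide."""
    payload = json.dumps(
        {"provider": provider, "model": model, "messages": messages},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()


msgs = [{"role": "user", "content": "Summarize today's build failures."}]
k_openai = cache_key("openai", "default", msgs)
k_anthropic = cache_key("anthropic", "default", msgs)

print(k_openai != k_anthropic)  # True: same body, separate cache entries
```

Because the key is a deterministic hash, the exact same request always maps to the same entry, while any change in provider, model, or message content produces a fresh one.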

How caching helps

Many developer tools, scripts, cron jobs, and agent workflows repeat the same or nearly identical requests. Without a local control layer, every repeat call can hit the provider API at full cost.

Adjustable cache TTL

AI Optimizer includes an adjustable cache TTL, which is especially useful for recurring jobs, cron-style workflows, and repeat-heavy automations where the same request pattern shows up on a predictable schedule.
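The mechanism behind a TTL cache is easy to picture: each response is stored with a timestamp and served only while it is younger than the configured TTL. A minimal sketch (illustrative only, not AI Optimizer's implementation):

```python
import time


class TTLCache:
    """Minimal TTL cache: entries expire ttl seconds after insertion."""

    def __init__(self, ttl: float):
        self.ttl = ttl
        self._store = {}  # key -> (timestamp, value)

    def set(self, key, value):
        self._store[key] = (time.monotonic(), value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        stored_at, value = entry
        if time.monotonic() - stored_at > self.ttl:
            del self._store[key]  # expired: evict and miss
            return default
        return value


# A recurring job repeating the same request within the TTL is a cache hit.
cache = TTLCache(ttl=0.05)
cache.set("daily-report-prompt", "cached response")
print(cache.get("daily-report-prompt"))  # served locally
time.sleep(0.06)
print(cache.get("daily-report-prompt"))  # expired -> None, call goes upstream
```

Tuning the TTL to match the job's schedule is what makes this pay off: long enough that repeats land inside the window, short enough that stale answers age out.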

Anthropic support in v2.2.0: Anthropic support in this release is focused on chat completions through the same local proxy flow. OpenAI support remains broader for workflows that rely on additional OpenAI-specific endpoints.

Start with the workflow you already have.

Install AI Optimizer, choose your provider, point traffic at localhost, and start reducing repeated API waste without rebuilding the way your team already works.

Start free trial