
Evolution: From Dumb Pipes to Intelligent Gateways

Sebastian Schkudlara · Jan 28, 2026 · 2 mins read

Let’s be honest: the local AI landscape is plagued by “Dumb Pipes.”

You know the drill: you run a proxy, it takes a request, matches a regex, and forwards it to some model. It works. It’s fast. But it’s also deaf, dumb, and blind.

It doesn’t remember that you prefer Claude for Python but Llama 3 for creative writing. It doesn’t know that your local Ollama instance is currently struggling or that a specific provider is timing out. It certainly doesn’t care that you’re about to hit a hard API quota.

switchAILocal started as one of those pipes. But today, we’re changing the game. We’re killing the pipe and building a Gateway.

The Paradigm Shift: Stateless to Stateful

The biggest maturity milestone for any infrastructure is the move from Stateless (fresh start every time) to Stateful (learning and remembering).

We are introducing the Option C Hybrid Architecture. This isn’t just a fancy name—it’s a philosophical stance on how AI infrastructure should behave.

```mermaid
graph TD
    User[User Request] --> Proxy[Go Proxy Core]
    Proxy -->|Reflex Tier| Reflex[Regex Matcher]
    Proxy -->|Semantic Tier| Brain[Intelligence Service]
    Brain -->|Query| State[(State Box)]
    State -->|Context| Brain
    Brain -->|Routing Decision| Proxy
    Proxy -->|Forward| Model[AI Provider]
```

The 3 Core Pillars

  1. Plugin Independence (The Spine): The core proxy functionality must never break. If the brain crashes (or is simply turned off), the spine keeps you walking.
  2. Graceful Degradation (The Safety Net): Features are tiers, not dominoes.
    • Tier 3 (Cognitive): “Let me think about this…” (Smart but slow)
    • Tier 2 (Semantic): “I’ve seen this before!” (Fast and smart)
    • Tier 1 (Reflex): “Just do it.” (Instant)

  If Tier 3 fails, we drop to Tier 2. If that fails, Tier 1 takes over. You never get a 500 decision.
  3. Service Independence (The Opt-In): You don’t need the brain. By default, switchAILocal remains the lightweight, blazing-fast tool you love. But flip the `intelligence: true` switch, and the system wakes up.
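The degradation ladder in pillar 2 can be sketched in Go. This is illustrative only: the tier functions, model names, and regex are stand-ins I invented, not the switchAILocal API. The point is the shape of the fallback: a failure at a smarter tier silently drops to the next one, so the caller always gets a decision.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// Decision is the model a tier selects for a request.
type Decision struct {
	Model string
	Tier  int
}

// cognitiveRoute (Tier 3) would call the full intelligence service.
// Stubbed to fail here, simulating the brain being offline.
func cognitiveRoute(prompt string) (Decision, error) {
	return Decision{}, errors.New("intelligence service unreachable")
}

// semanticRoute (Tier 2) would consult the embedded vector engine.
// Also stubbed to fail, so the example exercises the final fallback.
func semanticRoute(prompt string) (Decision, error) {
	return Decision{}, errors.New("no cached embedding match")
}

// reflexRoute (Tier 1) is the instant regex matcher; it never errors.
var pythonRe = regexp.MustCompile(`(?i)\b(def|import|python)\b`)

func reflexRoute(prompt string) Decision {
	if pythonRe.MatchString(prompt) {
		return Decision{Model: "claude-sonnet", Tier: 1}
	}
	return Decision{Model: "llama3", Tier: 1}
}

// route walks the tiers from smartest to simplest: a failure at one
// tier drops to the next, so the caller never sees an error.
func route(prompt string) Decision {
	if d, err := cognitiveRoute(prompt); err == nil {
		return d
	}
	if d, err := semanticRoute(prompt); err == nil {
		return d
	}
	return reflexRoute(prompt)
}

func main() {
	d := route("please refactor this python import block")
	fmt.Printf("model=%s tier=%d\n", d.Model, d.Tier)
}
```

Because Tier 1 is total (it always returns something), the chain as a whole is total: features are tiers, not dominoes.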

What’s New?

We’ve rewritten the rulebook with a new Intelligence Service in Go:

  • Discovery Service: It doesn’t wait for config files. It proactively scans your ports (Ollama, LM Studio) and finds models.
  • Dynamic Matrix: Static configs are dead. The router builds a living routing matrix based on what’s actually alive.
  • Semantic Tier: We embedded a vector engine directly into the binary. It “understands” your prompt’s intent in <20ms.

The Agentic Difference

This is the difference between a tool handling your traffic and an agent managing your workflow.

Your proxy should know your intent (“This looks like a complex refactoring task”). It should check its memory (“The user usually selects Sonnet 3.5 for this”). It should verify health (“Sonnet is active, but my quota is low; let’s check whether DeepSeek can handle this instead”).
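That intent → memory → health chain can be sketched as a three-stage pipeline. Every name, model, and threshold below is invented for illustration; the real classifier and quota check would obviously be richer than a length heuristic and a boolean.

```go
package main

import "fmt"

// classifyIntent guesses the task category from the prompt.
// A toy stand-in: long prompts are treated as complex refactors.
func classifyIntent(prompt string) string {
	if len(prompt) > 40 {
		return "complex-refactor"
	}
	return "quick-question"
}

// memory maps an intent to the model the user usually picks for it.
var memory = map[string]string{
	"complex-refactor": "sonnet-3.5",
	"quick-question":   "llama3",
}

// quotaLow simulates the provider health/quota check.
func quotaLow(model string) bool { return model == "sonnet-3.5" }

// decide chains intent -> memory -> health: if the remembered
// favorite is quota-constrained, fall back to a verified alternative.
func decide(prompt string) string {
	intent := classifyIntent(prompt)
	preferred := memory[intent]
	if quotaLow(preferred) {
		return "deepseek"
	}
	return preferred
}

func main() {
	fmt.Println(decide("This looks like a complex refactoring task spanning files"))
	// Quota on the remembered favorite is low, so the agent
	// swaps in the alternative instead of failing.
}
```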

And then? It should just work.

This is Part 1 of a series. Next up: A deep dive into the 20ms routing engine that makes this possible.

Bridging Architecture & Execution

Struggling to implement Agentic AI or Enterprise Microservices in your organization? I help CTOs and technical leaders transition from architectural bottlenecks to production-ready systems.

View My Architect Portfolio & Contact
Written by Sebastian Schkudlara
Hi, I am Sebastian Schkudlara, the author of Jevvellabs. I hope you enjoy my blog!