
Polymorphism in Practice: Architecture of a Universal Gateway

Sebastian Schkudlara · Jan 25, 2026 · 2 mins read

The challenge of building an “AI Gateway” isn’t connecting to APIs. It’s connecting to everything else. When we designed switchAILocal, we didn’t just want another proxy for OpenAI. We wanted a “Universal Adapter” that could handle the chaos of the real world: local binaries, proprietary protocols, and cloud endpoints.

The result is the Polymorphic Executor Pattern.

The Core Abstraction

To the outside world (your IDE), switchAILocal looks like a simple HTTP server. But inside, it’s a shape-shifting engine. It relies on a single Go interface:

type Executor interface {
    Execute(ctx context.Context, request *Request) (*Response, error)
}

This simple contract allows us to implement widely different behaviors that share the same API surface.

The Three Modes of Execution

This polymorphism allows switchAILocal to support three distinct “species” of AI providers simultaneously:

1. Network Executors (The Standard)

  • What they do: Make HTTP/SSE requests to a remote server.
  • Examples: ollama:, openai:, lmstudio:.
  • Use Case: Connecting to a standard local inference server or a public cloud API.

2. Process Executors (The Unique)

  • What they do: Spawn a local subprocess and manage stdin/stdout.
  • Examples: geminicli:, claudecli:, vibe:.
  • Use Case: This is our unique “No-Key” solution. By wrapping the CLI tool directly, we leverage the user’s existing authenticated session (stored in the system keychain) without ever handling secrets in the gateway configuration.

3. Bridge Executors (The Future)

  • What they do: Establish encrypted tunnels to remote agents.
  • Examples: opencode:.
  • Use Case: Connecting to complex agentic coding environments.

The Translation Pipeline: A Universal “Babel Fish”

Supporting these diverse backends creates a data problem.

  • Ollama speaks HTTP JSON.
  • Gemini CLI speaks command-line arguments.
  • Cursor speaks OpenAI JSON.

To solve this, we built a Translation Pipeline that acts as a universal interpreter.

  1. Ingest: Takes the standard OpenAI request from your client.
  2. Normalize: Converts it to an internal canonical schema.
  3. Adapt: Transforms it for the specific target.
    • For Ollama: Remaps the JSON to match the Ollama API.
    • For CLI: Converts the JSON prompt into shell arguments (e.g., gemini prompt "...").
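The three steps above can be sketched as plain functions. The canonical schema, the simplified map payloads, and the helper names (`normalize`, `adaptOllama`, `adaptCLI`) are all assumptions for illustration, not the gateway's real pipeline:

```go
package main

import "fmt"

// canonicalRequest is an assumed internal schema; the real
// gateway's fields will differ.
type canonicalRequest struct {
	Model  string
	Prompt string
}

// normalize converts an OpenAI-style chat payload (simplified
// here to a flat map) into the canonical form.
func normalize(openai map[string]string) canonicalRequest {
	return canonicalRequest{Model: openai["model"], Prompt: openai["content"]}
}

// adaptOllama remaps the canonical request to Ollama-style keys.
func adaptOllama(c canonicalRequest) map[string]string {
	return map[string]string{"model": c.Model, "prompt": c.Prompt}
}

// adaptCLI turns the canonical request into shell arguments.
func adaptCLI(c canonicalRequest) []string {
	return []string{"gemini", "prompt", c.Prompt}
}

func main() {
	in := map[string]string{"model": "llama3", "content": "hi"}
	c := normalize(in)
	fmt.Println(adaptOllama(c)["prompt"]) // hi
	fmt.Println(adaptCLI(c)[0])           // gemini
}
```

Normalizing once and adapting per target keeps the number of translators linear in the number of backends, rather than quadratic in every client/backend pairing.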

Resilience by Design

Because this system is polymorphic, it allows for Interchangeable Failover. If geminicli: (a Process Executor) fails, the router can instantly retry the request on gemini-api: (a Network Executor). Use whatever works.

This architecture is what makes switchAILocal the most robust foundation for local AI development.

Bridging Architecture & Execution

Struggling to implement Agentic AI or Enterprise Microservices in your organization? I help CTOs and technical leaders transition from architectural bottlenecks to production-ready systems.

View My Architect Portfolio & Contact
Written by Sebastian Schkudlara
Hi, I am Sebastian Schkudlara, the author of Jevvellabs. I hope you enjoy my blog!