The price-performance frontier on foundation models has been moving every quarter for two years. Anyone who picked "the model" in early 2024 has either re-architected since or is paying 4× for results another vendor can deliver today. Locking architecture to a single provider is not a one-time cost. It is a recurring tax.
The integration surface is the moat
What we have learnt: the durable layer is not the model. It is the surface around it: the prompt management, the eval harness, the safety filters, the audit log. Build that layer with a model-shaped hole in the middle. Swap providers behind the interface. Your product code never has to know.
A practical architecture sketch
```ts
// The provider interface, your product code only sees this.
interface ChatProvider {
  complete(req: ChatRequest): Promise<ChatResponse>;
  embed(text: string): Promise<number[]>;
}

// Implementations are swappable behind config.
class OpenAIProvider implements ChatProvider { /* ... */ }
class AnthropicProvider implements ChatProvider { /* ... */ }
class HostedLlamaProvider implements ChatProvider { /* ... */ }

// Eval + guardrails wrap whichever provider you select.
const ai = withEvals(withGuardrails(loadProvider(config)));
```
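The sketch above leaves `withEvals`, `withGuardrails` and `loadProvider` undefined. One way they might look is shown below, as a minimal runnable sketch rather than a definitive implementation. The `ChatRequest` and `ChatResponse` shapes, the `EchoProvider` stub, the `registry` keys, the blocked-pattern rule and the `evalLog` record format are all assumptions made for illustration. A real guardrail or eval layer would be considerably richer.

```ts
// Hypothetical request/response shapes; the original sketch does not define them.
interface ChatRequest { prompt: string; }
interface ChatResponse { text: string; provider: string; }

interface ChatProvider {
  complete(req: ChatRequest): Promise<ChatResponse>;
  embed(text: string): Promise<number[]>;
}

// Stand-in provider (assumption) so the sketch runs without network access or API keys.
class EchoProvider implements ChatProvider {
  constructor(private readonly name: string) {}
  async complete(req: ChatRequest): Promise<ChatResponse> {
    return { text: `echo: ${req.prompt}`, provider: this.name };
  }
  async embed(text: string): Promise<number[]> {
    return [text.length]; // placeholder vector, not a real embedding
  }
}

// Config selects an implementation by key; product code never names a vendor.
type ProviderConfig = { provider: string };
const registry: Record<string, () => ChatProvider> = {
  "vendor-a": () => new EchoProvider("vendor-a"),
  "vendor-b": () => new EchoProvider("vendor-b"),
};

function loadProvider(config: ProviderConfig): ChatProvider {
  const factory = registry[config.provider];
  if (!factory) throw new Error(`Unknown provider: ${config.provider}`);
  return factory();
}

// Guardrail wrapper: rejects prompts matching a blocked pattern before any provider call.
// The default pattern (an SSN-like number) is only an example rule.
function withGuardrails(
  inner: ChatProvider,
  blocked: RegExp = /\b\d{3}-\d{2}-\d{4}\b/,
): ChatProvider {
  return {
    async complete(req) {
      if (blocked.test(req.prompt)) throw new Error("Blocked by guardrail");
      return inner.complete(req);
    },
    embed: (text) => inner.embed(text),
  };
}

// Eval wrapper: records one entry per successful completion, whichever provider served it.
interface EvalRecord { provider: string; promptChars: number; latencyMs: number; }
const evalLog: EvalRecord[] = [];

function withEvals(inner: ChatProvider, log: EvalRecord[] = evalLog): ChatProvider {
  return {
    async complete(req) {
      const start = Date.now();
      const res = await inner.complete(req);
      log.push({
        provider: res.provider,
        promptChars: req.prompt.length,
        latencyMs: Date.now() - start,
      });
      return res;
    },
    embed: (text) => inner.embed(text),
  };
}

// Same composition as in the sketch above; only the config value names a vendor.
const ai = withEvals(withGuardrails(loadProvider({ provider: "vendor-a" })));
```

Because each wrapper both takes and returns a `ChatProvider`, the wrappers compose in any order and apply unchanged when the config key switches from `"vendor-a"` to `"vendor-b"`.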
The trade-off (and why it is worth it)
Going model-agnostic costs you something. You cannot use the latest provider-specific feature on day one. You spend more upfront on the abstraction. In return: when a new model lands at half the price, you flip a config and pay half. When a vendor changes their pricing or their terms, you have a credible alternative. In a clinical setting, you have a real story to tell about not being locked into a single inference provider.
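The "flip a config" step can be made concrete by letting your own eval results gate the choice: pick the cheapest provider that still clears your quality bar. The sketch below is one hedged way to express that. The `Candidate` shape, the `pickProvider` helper and every number in it are made-up placeholders, not real vendor prices or benchmark scores.

```ts
// All names and numbers here are illustrative assumptions.
interface Candidate {
  id: string;
  usdPerMillionTokens: number; // placeholder price
  evalScore: number;           // pass rate on your own eval suite, 0..1
}

// Return the cheapest candidate whose eval score meets the threshold.
function pickProvider(candidates: Candidate[], minScore: number): Candidate {
  const passing = candidates.filter((c) => c.evalScore >= minScore);
  if (passing.length === 0) {
    throw new Error("No candidate meets the eval threshold");
  }
  return passing.reduce((best, c) =>
    c.usdPerMillionTokens < best.usdPerMillionTokens ? c : best,
  );
}

const candidates: Candidate[] = [
  { id: "incumbent", usdPerMillionTokens: 10, evalScore: 0.92 },
  { id: "newcomer", usdPerMillionTokens: 5, evalScore: 0.93 },
  { id: "cheap-but-weak", usdPerMillionTokens: 1, evalScore: 0.7 },
];

const chosen = pickProvider(candidates, 0.9);
console.log(chosen.id); // prints "newcomer"
```

The cheapest option loses here because it fails the eval threshold, which is the point of keeping the eval harness on your side of the interface: price alone never decides the swap.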
Zowork is a healthcare and behavioral health AI engineering team. For a decade we’ve shipped clinical platforms. Now we’re building the AI that runs underneath them.