All products

AI Infrastructure

When to run a model yourself, and when to use a cloud provider, it's a harder question than vendors make it sound. We map your supply chain risk, evaluate sovereign options, and build the architecture that gets results without locking you in.


Options

Three deployment models.

Cloud, hybrid, or self-hosted, each with fixed pricing and managed ongoing. The right choice depends on your classification level, data sensitivity, and usage volume.

Opinionated
Custom
Setup fee, fixed price
$8,500
+ $2,200/mo managed
GST exclusive · Confirmed before we start

The most opinionated option, Azure OpenAI, our reference architecture, running in your environment within days. We've chosen the stack. You get capability fast. We design the integration architecture, implement it, and manage it ongoing.

What's included
  • Azure OpenAI Service configuration and deployment
  • API gateway and rate-limiting layer
  • Content filtering and safety controls appropriate to your classification
  • Integration with your existing systems (REST/webhooks)
  • Ongoing management and model version upgrades
  • Usage monitoring and cost controls
Talk to us about cloud deployment
Pricing breakdown
  • Setup and implementation $8,500
  • Ongoing management $2,200/mo

Talk to us

Fixed price confirmed before we start. No change orders.

Project fee, fixed price
$18,000
+ $1,800/mo managed
GST exclusive · Confirmed before we start

A split environment where some decisions are yours to make, which workloads go where, what sensitivity threshold triggers local inference. More flexibility, more complexity. Common for agencies that need sovereignty for sensitive prompts but want cloud inference for lower-sensitivity tasks. We design the routing logic and manage both environments.

What's included
  • Workload classification and routing design
  • Cloud inference (Azure OpenAI) for eligible workloads
  • On-premise or Azure Gov inference for sensitive workloads
  • Supply chain risk assessment for each model used
  • Data residency documentation for IRAP/ISM purposes
  • Ongoing management of both environments
Talk to us about hybrid deployment
Pricing breakdown
  • Project delivery $18,000
  • Ongoing management $1,800/mo

Talk to us

Fixed price confirmed before we start. No change orders.

Project fee, fixed price
$26,000
one-time
GST exclusive · Confirmed before we start

Your infrastructure, your model, your constraints. Fully sovereign and fully custom, this is the most complex option to scope and deploy, and the one that requires the most from your environment. The model runs on your infrastructure, on-premise, air-gapped, or in a dedicated Azure Government region. No data leaves your environment.

What's included
  • Model selection and licensing assessment
  • Infrastructure specification (GPU requirements, networking)
  • Deployment: Ollama / vLLM / Azure AI on Government region
  • Integration layer (OpenAI-compatible API endpoint)
  • Quantisation and performance tuning
  • Runbook and operational documentation
  • Handover and knowledge transfer

Ongoing management available as an add-on ($1,200/mo)

Talk to us about self-hosted deployment
Pricing breakdown
  • Full project delivery $26,000
  • Ongoing management (optional) $1,200/mo

Talk to us

One-time fixed project price. Ongoing management is an optional add-on.

The questions we answer first

What shapes the right answer for you.

These four factors determine the architecture. We work through each one before recommending anything.

01
Classification level

What classification does your data operate at? PROTECTED, OFFICIAL, or SENSITIVE? This determines whether a cloud provider can legally hold your data and what sovereign options are viable.

02
Supply chain risk

Who trained the model, on what data, and under what jurisdiction? For government use, the model's provenance matters as much as where it runs.

03
Workload sensitivity

Not all prompts carry the same risk. A document summariser and a legal advice tool have very different requirements, even if they use the same model.

04
Cost at scale

Cloud inference is cheap at low volume and expensive at scale. Self-hosted is expensive upfront and cheap at scale. We model the crossover for your expected usage before recommending.


Architecture

How we build it

Three deep dives into the real problems behind AI infrastructure for government.

Self-hosted LLM on Azure for PROTECTED classification workloads Running a capable model inside the boundary when the data can't leave it. Hybrid AI: local inference with cloud API fallback Keeping sensitive inference local while reaching for the cloud when it's safe. Connecting agency data to an AI assistant without vendor lock-in Wiring data to a model through an abstraction you can swap out later.

The right answer depends on your environment.

Tell us your classification level and what you're trying to do. We'll tell you which model makes sense.