Headland: AI Infrastructure

Options

Three deployment models.

Cloud, hybrid, or self-hosted, each with fixed pricing and managed ongoing. The right choice depends on your classification level, data sensitivity, and usage volume.

Opinionated

Custom

Setup fee, fixed price

$8,500

+ $2,200/mo managed

GST exclusive · Confirmed before we start

The most opinionated option, Azure OpenAI, our reference architecture, running in your environment within days. We've chosen the stack. You get capability fast. We design the integration architecture, implement it, and manage it ongoing.

What's included

Azure OpenAI Service configuration and deployment
API gateway and rate-limiting layer
Content filtering and safety controls appropriate to your classification
Integration with your existing systems (REST/webhooks)
Ongoing management and model version upgrades
Usage monitoring and cost controls

Talk to us about cloud deployment

Pricing breakdown

Setup and implementation $8,500
Ongoing management $2,200/mo

Talk to us

Fixed price confirmed before we start. No change orders.

Project fee, fixed price

$18,000

+ $1,800/mo managed

GST exclusive · Confirmed before we start

A split environment where some decisions are yours to make, which workloads go where, what sensitivity threshold triggers local inference. More flexibility, more complexity. Common for agencies that need sovereignty for sensitive prompts but want cloud inference for lower-sensitivity tasks. We design the routing logic and manage both environments.

What's included

Workload classification and routing design
Cloud inference (Azure OpenAI) for eligible workloads
On-premise or Azure Gov inference for sensitive workloads
Supply chain risk assessment for each model used
Data residency documentation for IRAP/ISM purposes
Ongoing management of both environments

Talk to us about hybrid deployment

Pricing breakdown

Project delivery $18,000
Ongoing management $1,800/mo

Talk to us

Fixed price confirmed before we start. No change orders.

Project fee, fixed price

$26,000

one-time

GST exclusive · Confirmed before we start

Your infrastructure, your model, your constraints. Fully sovereign and fully custom, this is the most complex option to scope and deploy, and the one that requires the most from your environment. The model runs on your infrastructure, on-premise, air-gapped, or in a dedicated Azure Government region. No data leaves your environment.

What's included

Model selection and licensing assessment
Infrastructure specification (GPU requirements, networking)
Deployment: Ollama / vLLM / Azure AI on Government region
Integration layer (OpenAI-compatible API endpoint)
Quantisation and performance tuning
Runbook and operational documentation
Handover and knowledge transfer

Ongoing management available as an add-on ($1,200/mo)

Talk to us about self-hosted deployment

Pricing breakdown

Full project delivery $26,000
Ongoing management (optional) $1,200/mo

Talk to us

One-time fixed project price. Ongoing management is an optional add-on.

The questions we answer first

What shapes the right answer for you.

These four factors determine the architecture. We work through each one before recommending anything.

Classification level

What classification does your data operate at? PROTECTED, OFFICIAL, or SENSITIVE? This determines whether a cloud provider can legally hold your data and what sovereign options are viable.

Supply chain risk

Who trained the model, on what data, and under what jurisdiction? For government use, the model's provenance matters as much as where it runs.

Workload sensitivity

Not all prompts carry the same risk. A document summariser and a legal advice tool have very different requirements, even if they use the same model.

Cost at scale

Cloud inference is cheap at low volume and expensive at scale. Self-hosted is expensive upfront and cheap at scale. We model the crossover for your expected usage before recommending.

AI Infrastructure

Three deployment models.

What shapes the right answer for you.

How we build it

The right answer depends on your environment.