Keep your code where it belongs: Running GitHub Copilot CLI with your own models on Azure Local + GitHub Enterprise Server
For a lot of the customers I work with — regulated industries, public sector, sovereign clouds, and teams operating under strict data residency requirements — "just send your prompts to a public AI endpoint" isn't an option. Source code, IP, and context can't leave the boundary. But that shouldn't mean giving up the developer productivity gains of an AI coding assistant.
That's exactly the gap that GitHub Copilot CLI's Bring Your Own Key (BYOK) support closes. Pair it with Azure Local and GitHub Enterprise Server (GHES), and you get a fully on-premises, residency-aligned AI coding workflow — your repos in GHES, your model running on Azure Local, and Copilot CLI stitching it all together on the developer's machine.
What BYOK gives you
Copilot CLI can be pointed at any of three provider types instead of GitHub-hosted models; the two that matter for this pattern are Azure OpenAI and any OpenAI-compatible endpoint, both covered below.
The model you bring must support tool calling and streaming, and GitHub recommends a context window of at least 128k tokens for best results.
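If you want to sanity-check a candidate model before wiring it into Copilot CLI, a quick curl against its OpenAI-compatible endpoint will tell you whether it accepts a tools array and returns a streamed response. The host, port, and model name below are placeholders; the tool definition is just a dummy for the test:

# Verify the model handles both "stream": true and a "tools" array
curl -N http://your-local-endpoint:PORT/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "YOUR-MODEL-NAME",
    "stream": true,
    "messages": [{"role": "user", "content": "Which files changed?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "list_files",
        "description": "List files in the working directory",
        "parameters": {"type": "object", "properties": {}}
      }
    }]
  }'

If the response streams back chunks and doesn't reject the tools field, the runtime meets the basic requirements.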
The on-prem / sovereign pattern
A typical setup for a data-residency-sensitive customer looks like this:
- GitHub Enterprise Server hosts the repositories inside the boundary.
- The model runs on Azure Local (for example via Foundry Local, vLLM, or another OpenAI-compatible runtime).
- GitHub Copilot CLI runs on the developer's machine and is pointed at the in-boundary model endpoint via BYOK.
Connecting to Azure OpenAI
# The base URL includes the deployment path on your Azure OpenAI resource
export COPILOT_PROVIDER_BASE_URL=https://YOUR-RESOURCE-NAME.openai.azure.com/openai/deployments/YOUR-DEPLOYMENT-NAME
export COPILOT_PROVIDER_TYPE=azure
export COPILOT_PROVIDER_API_KEY=YOUR-AZURE-API-KEY
# For Azure, the model name is simply your deployment name
export COPILOT_MODEL=YOUR-DEPLOYMENT-NAME
copilot
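Before launching Copilot, it can be worth confirming the deployment responds at all. A minimal check (the api-version shown is only an example; use one your resource supports):

# Optional sanity check against the Azure OpenAI deployment
curl "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/deployments/YOUR-DEPLOYMENT-NAME/chat/completions?api-version=2024-10-21" \
  -H "Content-Type: application/json" \
  -H "api-key: YOUR-AZURE-API-KEY" \
  -d '{"messages": [{"role": "user", "content": "ping"}]}'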
Connecting to a local OpenAI-compatible runtime (Foundry Local, vLLM, Ollama)
export COPILOT_PROVIDER_BASE_URL=http://your-local-endpoint:PORT
export COPILOT_MODEL=YOUR-MODEL-NAME
copilot
No API key is required if your local runtime doesn't use authentication.
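As a concrete illustration (not the only option), here's roughly what that looks like with vLLM serving a model on its default port. Substitute your own runtime and model name, and note that some runtimes expose the OpenAI-compatible API under a /v1 path:

# Example only: serve a model with vLLM (OpenAI-compatible API, port 8000 by default)
vllm serve YOUR-MODEL-NAME

# Point Copilot CLI at the local endpoint (append /v1 if your runtime requires it)
export COPILOT_PROVIDER_BASE_URL=http://localhost:8000
export COPILOT_MODEL=YOUR-MODEL-NAME
copilot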
The piece people miss: offline mode
For air-gapped and sovereign environments:
export COPILOT_OFFLINE=true
With COPILOT_OFFLINE=true, Copilot CLI will not phone home to GitHub's servers. Combined with a provider endpoint that's also inside your boundary (Azure Local, on-prem Foundry Local, etc.), prompts and code context stay entirely within your environment.
One caveat straight from the docs: offline mode only guarantees full network isolation if your provider is also local or inside the same isolated environment. Point COPILOT_PROVIDER_BASE_URL at a remote endpoint and your context travels to that endpoint — regardless of the offline flag. Architect accordingly.
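Putting the pieces together, a fully in-boundary configuration looks something like this (the internal hostname is a placeholder):

# In-boundary model endpoint + offline mode = nothing leaves the environment
export COPILOT_OFFLINE=true
export COPILOT_PROVIDER_BASE_URL=http://model-host.internal:PORT
export COPILOT_MODEL=YOUR-MODEL-NAME
copilot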
Why this matters
For customers who have been told "AI coding assistants and data residency can't coexist," the combination of GHES + Azure Local + Copilot CLI BYOK + offline mode is a concrete answer: source stays in GHES, prompts and code context go only to a model you run on Azure Local, and nothing has to transit a public AI endpoint.
If you're in financial services, healthcare, government, defense, or any team with a "nothing leaves our tenant" mandate, this is worth a closer look.
Full reference: Using your own LLM models in GitHub Copilot CLI
Views are my own and do not represent my employer.