Keep your code where it belongs: Running GitHub Copilot CLI with your own models on Azure Local + GitHub Enterprise Server
For a lot of the customers I work with — regulated industries, public sector, sovereign clouds, and teams operating under strict data residency requirements — "just send your prompts to a public AI endpoint" isn't an option. Source code, IP, and context can't leave the boundary. But that shouldn't mean giving up the developer productivity gains of an AI coding assistant.
That's exactly the gap that GitHub Copilot CLI's Bring Your Own Key (BYOK) support closes. Pair it with Azure Local and GitHub Enterprise Server (GHES), and you get a fully on-premises, residency-aligned AI coding workflow — your repos in GHES, your model running on Azure Local, and Copilot CLI stitching it all together on the developer's machine.
What BYOK gives you
Copilot CLI can be pointed at any of three provider types instead of GitHub-hosted models; the two that matter for this pattern are Azure OpenAI and any OpenAI-compatible endpoint, both covered below.
The model you bring must support tool calling and streaming, and GitHub recommends a context window of at least 128k tokens for best results.
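If you want to sanity-check a candidate model before wiring it into Copilot CLI, a quick curl against its OpenAI-compatible endpoint will tell you whether it accepts a tools array and returns a streamed response. The host, port, and model name below are placeholders; the tool definition is just a dummy for the test:

# Verify the model handles both "stream": true and a "tools" array
curl -N http://your-local-endpoint:PORT/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "YOUR-MODEL-NAME",
    "stream": true,
    "messages": [{"role": "user", "content": "Which files changed?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "list_files",
        "description": "List files in the working directory",
        "parameters": {"type": "object", "properties": {}}
      }
    }]
  }'

If the response streams back chunks and doesn't reject the tools field, the runtime meets the basic requirements.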
The on-prem / sovereign pattern
A typical setup for a data-residency-sensitive customer looks like this:
- GitHub Enterprise Server hosts the repositories inside the boundary.
- The model runs on Azure Local (for example via Foundry Local, vLLM, or another OpenAI-compatible runtime).
- GitHub Copilot CLI runs on the developer's machine and is pointed at the in-boundary model endpoint via BYOK.
Connecting to Azure OpenAI
# The base URL includes the deployment path on your Azure OpenAI resource
export COPILOT_PROVIDER_BASE_URL=https://YOUR-RESOURCE-NAME.openai.azure.com/openai/deployments/YOUR-DEPLOYMENT-NAME
export COPILOT_PROVIDER_TYPE=azure
export COPILOT_PROVIDER_API_KEY=YOUR-AZURE-API-KEY
# For Azure, the model name is simply your deployment name
export COPILOT_MODEL=YOUR-DEPLOYMENT-NAME
copilot
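Before launching Copilot, it can be worth confirming the deployment responds at all. A minimal check (the api-version shown is only an example; use one your resource supports):

# Optional sanity check against the Azure OpenAI deployment
curl "https://YOUR-RESOURCE-NAME.openai.azure.com/openai/deployments/YOUR-DEPLOYMENT-NAME/chat/completions?api-version=2024-10-21" \
  -H "Content-Type: application/json" \
  -H "api-key: YOUR-AZURE-API-KEY" \
  -d '{"messages": [{"role": "user", "content": "ping"}]}'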
Connecting to a local OpenAI-compatible runtime (Foundry Local, vLLM, Ollama)
export COPILOT_PROVIDER_BASE_URL=http://your-local-endpoint:PORT
export COPILOT_MODEL=YOUR-MODEL-NAME
copilot
No API key is required if your local runtime doesn't use authentication.
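As a concrete illustration (not the only option), here's roughly what that looks like with vLLM serving a model on its default port. Substitute your own runtime and model name, and note that some runtimes expose the OpenAI-compatible API under a /v1 path:

# Example only: serve a model with vLLM (OpenAI-compatible API, port 8000 by default)
vllm serve YOUR-MODEL-NAME

# Point Copilot CLI at the local endpoint (append /v1 if your runtime requires it)
export COPILOT_PROVIDER_BASE_URL=http://localhost:8000
export COPILOT_MODEL=YOUR-MODEL-NAME
copilot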
The piece people miss: offline mode
For air-gapped and sovereign environments:
export COPILOT_OFFLINE=true
With COPILOT_OFFLINE=true, Copilot CLI will not phone home to GitHub's servers. Combined with a provider endpoint that's also inside your boundary (Azure Local, on-prem Foundry Local, etc.), prompts and code context stay entirely within your environment.
One caveat straight from the docs: offline mode only guarantees full network isolation if your provider is also local or inside the same isolated environment. Point COPILOT_PROVIDER_BASE_URL at a remote endpoint and your context travels to that endpoint — regardless of the offline flag. Architect accordingly.
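Putting the pieces together, a fully in-boundary configuration looks something like this (the internal hostname is a placeholder):

# In-boundary model endpoint + offline mode = nothing leaves the environment
export COPILOT_OFFLINE=true
export COPILOT_PROVIDER_BASE_URL=http://model-host.internal:PORT
export COPILOT_MODEL=YOUR-MODEL-NAME
copilot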
Why this matters
For customers who have been told "AI coding assistants and data residency can't coexist," the combination of GHES + Azure Local + Copilot CLI BYOK + offline mode is a concrete answer: source stays in GHES, prompts and code context go only to a model you run on Azure Local, and nothing has to transit a public AI endpoint.
If you're in financial services, healthcare, government, defense, or any team with a "nothing leaves our tenant" mandate, this is worth a closer look.
Full reference: Using your own LLM models in GitHub Copilot CLI
Views are my own and do not represent my employer.