
Getting BYOK Working with Ollama and Gemma 4

GitHub Copilot CLI now supports BYOK: bring your own model via Ollama, Azure AI Foundry, Anthropic, or any OpenAI-compatible endpoint. Here's what I learned getting it working with a local Gemma 4 instance.

Andrea Griffiths · 4 min read


Tested on Copilot CLI, April 8, 2026.

I’ve been testing GitHub Copilot CLI’s BYOK (Bring Your Own Key) feature with a local Ollama instance running Gemma 4. The official docs cover the setup well. If you haven’t read them, start there: Using your own LLM models in GitHub Copilot CLI. This is what I ran into after that.

Before anything else: is Ollama actually running?

Do this before you touch any environment variables:

curl http://localhost:11434/api/tags

You’ll see your models listed if Ollama is up. If you get connection refused, it’s not running. Check your system tray or run ollama serve first.

Sounds obvious until you spend 20 minutes debugging env vars when the real problem was a stopped process.
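When curl succeeds, the response is JSON with a models array, and each entry's name field is the exact string you'll need later. Here's a quick way to pull just the names out without jq. The response below is a hypothetical, abridged sample; in practice, replace it with `resp=$(curl -s http://localhost:11434/api/tags)`:

```shell
# Hypothetical abridged /api/tags response; substitute:
#   resp=$(curl -s http://localhost:11434/api/tags)
resp='{"models":[{"name":"gemma4:31b"},{"name":"llama3:8b"}]}'

# Extract the "name" fields: grep each name pair, then take the value
# between the third and fourth double quotes.
names=$(printf '%s' "$resp" | grep -o '"name":"[^"]*"' | cut -d'"' -f4)
echo "$names"
# gemma4:31b
# llama3:8b
```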

Set your environment variables

export COPILOT_PROVIDER_BASE_URL=http://localhost:11434
export COPILOT_MODEL=gemma4:31b

Two things that tripped me up here. First: use the exact model name from ollama list. Not gemma4, not Gemma-4-31B. It has to match exactly, so gemma4:31b it is.

Second: don’t set COPILOT_PROVIDER_API_KEY for local Ollama. It doesn’t need auth, and an empty key can cause failures that’ll send you on a wild goose chase.
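A small guard catches name mismatches before you ever launch the CLI. In this sketch, available stands in for the first column of ollama list output (a hypothetical sample; in practice, `available=$(ollama list | awk 'NR>1 {print $1}')`):

```shell
# Hypothetical sample of installed models; in practice:
#   available=$(ollama list | awk 'NR>1 {print $1}')
available='gemma4:31b
llama3:8b'
want='gemma4:31b'

# grep -x forces a whole-line match, so a partial name like "gemma4" won't pass.
if printf '%s\n' "$available" | grep -qx "$want"; then
  export COPILOT_MODEL="$want"
  echo "COPILOT_MODEL=$COPILOT_MODEL"
else
  echo "'$want' not found; check 'ollama list'" >&2
fi
```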

Default port is 11434, but if you’ve changed it, you can verify with:

# macOS/Linux
lsof -i :11434

# Windows
netstat -an | findstr 11434

A note on Anthropic models

The docs suggest COPILOT_PROVIDER_BASE_URL may be optional when using Anthropic as your provider. In my testing (Copilot CLI, April 8, 2026), the CLI required it anyway. If you hit validation errors pointing at Anthropic, set it explicitly:

export COPILOT_PROVIDER_BASE_URL=https://api.anthropic.com
export COPILOT_MODEL=claude-sonnet-4-20250514
export COPILOT_PROVIDER_API_KEY=your-key-here

Skip the base URL and you’ll hit a validation error whose message doesn’t point to the missing variable. The docs and the CLI disagree here; set it anyway and move on.

Test it

copilot

If it connects, Gemma 4 starts responding. If not, run through the short checklist: Ollama is running, model name matches ollama list exactly, and COPILOT_PROVIDER_API_KEY is not set.

That last one specifically:

unset COPILOT_PROVIDER_API_KEY
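That checklist is easy to script as a pre-flight step. A minimal sketch, assuming Ollama's default port (the curl check needs a live instance; the key check is pure shell):

```shell
# Pre-flight sketch: enforce the "no API key for local Ollama" rule,
# then probe the endpoint.
if [ -n "${COPILOT_PROVIDER_API_KEY+x}" ]; then
  echo "unsetting COPILOT_PROVIDER_API_KEY for local Ollama" >&2
  unset COPILOT_PROVIDER_API_KEY
fi

# -f makes curl fail on HTTP errors, -s keeps it quiet.
if curl -sf http://localhost:11434/api/tags > /dev/null 2>&1; then
  echo "ollama: reachable"
else
  echo "ollama: not responding (try 'ollama serve')" >&2
fi
echo "api key: ${COPILOT_PROVIDER_API_KEY:-unset}"
```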

Switching between local and GitHub-hosted models

This is the part that tripped me up most. At the time of writing (April 8, 2026), the /model picker inside Copilot CLI only showed GitHub-hosted models in my environment. Claude, GPT-4o, the usuals. Your local BYOK models didn’t show up there.

BYOK is session-level. You’re either using your env vars or you’re using GitHub-hosted models. To switch, you exit Copilot and restart with different vars:

# Local Ollama
export COPILOT_PROVIDER_BASE_URL=http://localhost:11434
export COPILOT_MODEL=gemma4:31b
copilot

# Back to GitHub-hosted (exit copilot first)
unset COPILOT_PROVIDER_BASE_URL
unset COPILOT_MODEL
copilot

I got tired of typing that out, so I made shell aliases:

alias copilot-local='COPILOT_PROVIDER_BASE_URL=http://localhost:11434 COPILOT_MODEL=gemma4:31b copilot'
alias copilot-github='copilot'
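If you rotate between several local models, a shell function works better than a fixed alias, since the model becomes a parameter. copilot_local is my name for it, not an official command, and it defaults to gemma4:31b:

```shell
# Sketch: like the alias above, but the model name is an argument.
copilot_local() {
  COPILOT_PROVIDER_BASE_URL=http://localhost:11434 \
  COPILOT_MODEL="${1:-gemma4:31b}" \
  copilot
}

# Usage:
#   copilot_local              # gemma4:31b
#   copilot_local llama3:8b    # any other installed model
```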

What this actually gets you

Your compute, your model, your control.

GitHub adds model support fast, but the hosted catalog reflects general-purpose models. BYOK is where specialized ones come in.

If your organization already runs models through Azure AI Foundry, this is the most natural path. Deploy a model in Foundry (a fine-tuned GPT, Mistral, Llama, or a domain-specific model your team trained) and point Copilot CLI at it:

export COPILOT_PROVIDER_BASE_URL=https://YOUR-RESOURCE-NAME.openai.azure.com/openai/deployments/YOUR-DEPLOYMENT-NAME
export COPILOT_PROVIDER_TYPE=azure
export COPILOT_PROVIDER_API_KEY=your-azure-api-key
export COPILOT_MODEL=YOUR-DEPLOYMENT-NAME

Medical, legal, finance teams running fine-tuned models on private infrastructure: this is how you bring that into your Copilot CLI workflow without anything leaving your Azure tenant.

The same pattern works for any OpenAI-compatible endpoint. OpenAI directly, Mistral, Cohere, Together AI, a self-hosted fine-tune. Set COPILOT_PROVIDER_BASE_URL and COPILOT_MODEL, plus COPILOT_PROVIDER_API_KEY where the endpoint requires auth, same as above.

# Example: OpenAI directly
export COPILOT_PROVIDER_BASE_URL=https://api.openai.com/v1
export COPILOT_PROVIDER_API_KEY=sk-your-key
export COPILOT_MODEL=gpt-4o

# Example: Mistral via their API
export COPILOT_PROVIDER_BASE_URL=https://api.mistral.ai
export COPILOT_MODEL=mistral-medium-latest
export COPILOT_PROVIDER_API_KEY=your-mistral-key

# Example: a self-hosted fine-tune
export COPILOT_PROVIDER_BASE_URL=https://your-internal-endpoint
export COPILOT_MODEL=your-model-name
export COPILOT_PROVIDER_API_KEY=your-key-here

For compatible local or self-hosted setups, BYOK can enable Copilot CLI workflows without external model calls. If you’re working in an air-gapped environment or just offline, that’s useful in a way that cloud-only tooling can’t replicate.
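One cheap safeguard for that scenario: check that the configured base URL actually points at the local box before starting a session. A sketch, with an allow-list that's purely my assumption; extend it with your own internal hostnames:

```shell
# Sketch: classify the configured endpoint before an "offline" session.
# The localhost patterns are assumptions; add your internal hosts as needed.
base="${COPILOT_PROVIDER_BASE_URL:-http://localhost:11434}"

case "$base" in
  http://localhost:*|http://127.0.0.1:*) verdict=local ;;
  *) verdict=remote ;;
esac
echo "$base -> $verdict"
```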

Always verify provider-specific model IDs in current docs. Exact identifiers vary by provider and registry.

About the Author: Andrea Griffiths is a Senior Developer Advocate at GitHub, where she helps engineering teams adopt and scale developer technologies. She's passionate about making technical concepts accessible—to both humans and AI agents. Connect with her on LinkedIn, GitHub, or Twitter/X.