Zero-Cost AI: How to Access Free LLM Models via OpenRouter
Experimenting with LLMs usually means one of two things: either you pick a provider, register a credit card, and hope you don't accidentally burn through your budget — or you limit yourself to whatever local model fits on your laptop. Neither feels great when you're just trying to learn or prototype something.
OpenRouter offers a third path. It's a unified API gateway that sits in front of hundreds of models from dozens of providers — and a meaningful subset of those models are completely free to use. You get a real API endpoint, real models (including capable ones like Google's Gemma series), and zero cost, with just an account and an API key.
This article walks you through everything: creating an account, finding free models, making your first request, and integrating OpenRouter into TypeScript or Python projects.
Why Use OpenRouter for Free Models
You might wonder — why go through a gateway instead of calling a provider's free tier directly? A few reasons make OpenRouter worth it:
One key, many models. Instead of managing separate API keys for Google, Meta, Mistral, and others, you authenticate with a single OpenRouter key. Switching models means changing one string in your code — nothing else.
OpenAI-compatible API. OpenRouter mirrors the OpenAI Chat Completions API format. Any code you've already written against the OpenAI SDK works with OpenRouter by just swapping the base URL and API key. No new SDK to learn.
Transparent model catalog. Every model's context length, pricing, and provider are listed on a single page. Free models are clearly tagged, so there's no guesswork about what you'll be charged.
Rate limit visibility. OpenRouter surfaces rate limit info per model. You can see what you're getting before you commit to a model in your app.
Create an OpenRouter Account
Getting started takes about two minutes:
- Go to openrouter.ai and sign up — you can use Google, GitHub, or email.
- Once signed in, click the OpenRouter label in the top navigation.
- Select Get API Key from the dropdown menu.
- Click Create Key, give it a name, and copy the generated value — it starts with sk-or-v1-....
Store this key as an environment variable in your shell or .env file:
export OPENROUTER_API_KEY="sk-or-v1-..."
No credit card is required to access free-tier models.
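Since the rest of your code will read this variable, it helps to fail fast with a clear message when it's missing. A minimal Python sketch (the `require_api_key` helper name is ours, not part of any SDK):

```python
import os

def require_api_key(env=os.environ) -> str:
    """Fetch the OpenRouter key, failing fast with a clear message if missing."""
    key = env.get("OPENROUTER_API_KEY")
    if not key:
        raise RuntimeError(
            "OPENROUTER_API_KEY is not set; export it or add it to your .env file"
        )
    return key
```

Passing the environment as a parameter keeps the helper easy to test without touching the real shell environment.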
Find and Select Free Models
Head to openrouter.ai/models and use the Free filter in the top toolbar. This shows every model currently available at zero cost.
Free models on OpenRouter are identified by a :free suffix in their model ID. That suffix is load-bearing — it's part of the string you pass to the API, not just a UI label.
As of writing, Google's Gemma 3 family offers three strong free options at different sizes:
- google/gemma-3-27b-it:free — 27B parameters, best quality, higher latency
- google/gemma-3-12b-it:free — 12B parameters, good balance
- google/gemma-3-4b-it:free — 4B parameters, fastest responses
The it suffix stands for instruction-tuned — these are fine-tuned variants designed for chat and instruction-following tasks, not raw text completion. They handle everyday prompts well and support a 131k token context window.
When choosing a model size, think about what you're optimizing for:
- Speed and low latency → use 4b
- Balanced quality and responsiveness → use 12b
- Best output quality, latency less critical → use 27b
You can also check a model's detail page on OpenRouter to see its context length, which providers serve it, and any throughput restrictions.
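Because the :free suffix is part of the model ID itself, you can also discover free models programmatically. A sketch using OpenRouter's model-list endpoint (this assumes the documented GET /api/v1/models response shape, a {"data": [...]} array where each entry has an "id" field):

```python
import json
import urllib.request

def free_model_ids(models: list[dict]) -> list[str]:
    """Return the IDs of models whose ID carries the ':free' suffix."""
    return [m["id"] for m in models if m["id"].endswith(":free")]

if __name__ == "__main__":
    # GET /api/v1/models lists the full catalog; no API key is needed for this call.
    with urllib.request.urlopen("https://openrouter.ai/api/v1/models") as resp:
        data = json.load(resp)
    for model_id in free_model_ids(data["data"]):
        print(model_id)
```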
Make Your First Request
Before wiring OpenRouter into a full app, it's worth making a raw API call to confirm everything is working.
Here is the example directly from the OpenRouter quickstart documentation:
Python (using requests):
import requests
import json

response = requests.post(
    url="https://openrouter.ai/api/v1/chat/completions",
    headers={
        "Authorization": "Bearer <OPENROUTER_API_KEY>",
        "HTTP-Referer": "<YOUR_SITE_URL>",  # Optional. Site URL for rankings on openrouter.ai.
        "X-Title": "<YOUR_SITE_NAME>",  # Optional. Site title for rankings on openrouter.ai.
    },
    data=json.dumps({
        "model": "google/gemma-3-12b-it:free",
        "messages": [
            {
                "role": "user",
                "content": "What is the meaning of life?"
            }
        ]
    })
)

print(response.json()["choices"][0]["message"]["content"])
TypeScript (using fetch):
fetch("https://openrouter.ai/api/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer <OPENROUTER_API_KEY>",
    "HTTP-Referer": "<YOUR_SITE_URL>", // Optional. Site URL for rankings on openrouter.ai.
    "X-Title": "<YOUR_SITE_NAME>", // Optional. Site title for rankings on openrouter.ai.
    "Content-Type": "application/json"
  },
  body: JSON.stringify({
    "model": "google/gemma-3-27b-it:free",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What is the meaning of life?"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://live.staticflickr.com/3851/14825276609_098cac593d_b.jpg"
            }
          }
        ]
      }
    ]
  })
})
  .then(res => res.json())
  .then(data => console.log(data.choices[0].message.content));
The HTTP-Referer and X-Title headers are optional — they're used to attribute your app on OpenRouter's public rankings page, but they don't affect functionality.
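One thing the quickstart snippets gloss over: on free models, a response can carry an error body instead of choices (for example when you hit a rate limit), and indexing into choices then raises a confusing KeyError. A small defensive parsing helper, assuming OpenRouter's {"error": {...}} error envelope:

```python
def extract_reply(payload: dict) -> str:
    """Pull the assistant message out of a chat-completions response,
    surfacing an error body as a readable exception instead of a KeyError."""
    if "error" in payload:
        err = payload["error"]
        raise RuntimeError(f"OpenRouter error {err.get('code')}: {err.get('message')}")
    return payload["choices"][0]["message"]["content"]
```

Use it in place of the raw `response.json()["choices"][0]...` chain: `print(extract_reply(response.json()))`.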
Integrate with Your Apps
Once the basics work, you'll typically want to parameterize the model selection and handle responses cleanly. Here's a practical pattern in Python.
Python — model switcher pattern:
from openai import OpenAI
import os

FREE_MODELS = {
    "fast": "google/gemma-3-4b-it:free",
    "balanced": "google/gemma-3-12b-it:free",
    "quality": "google/gemma-3-27b-it:free",
}

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

def chat(prompt: str, tier: str = "balanced") -> str:
    completion = client.chat.completions.create(
        model=FREE_MODELS[tier],
        messages=[{"role": "user", "content": prompt}],
    )
    return completion.choices[0].message.content

# Usage
reply = chat("Explain closures in JavaScript", tier="balanced")
print(reply)
The Python pattern uses the OpenAI SDK pointed at OpenRouter's base URL — no new dependencies, no new API surface to learn. You can extend it by reading the model tier from an environment variable, making it easy to switch models without touching application code.
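That environment-variable extension might look like this; the LLM_TIER variable name is our own choice, not an OpenRouter convention:

```python
import os

FREE_MODELS = {
    "fast": "google/gemma-3-4b-it:free",
    "balanced": "google/gemma-3-12b-it:free",
    "quality": "google/gemma-3-27b-it:free",
}

def model_from_env(env=os.environ) -> str:
    """Resolve the model ID from an LLM_TIER variable, defaulting to 'balanced'."""
    tier = env.get("LLM_TIER", "balanced")
    if tier not in FREE_MODELS:
        raise ValueError(f"LLM_TIER must be one of {sorted(FREE_MODELS)}, got {tier!r}")
    return FREE_MODELS[tier]
```

Deployment config (or a one-off `LLM_TIER=quality python app.py`) then picks the model without any code change.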
Limits, Costs, and Reliability
Free models on OpenRouter are genuinely free — there's no hidden credit consumption for the :free variants. That said, there are a few things worth knowing before you build on top of them:
Rate limits. Free-tier models are subject to rate limits that are lower than paid usage. Limits vary per model and can change. OpenRouter's FAQ documents the current limits and how they're calculated. For most prototyping and personal projects, the limits are comfortable — for high-throughput production use, you'd want to move to a paid model or provider.
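A common way to soften those limits in practice is a small retry wrapper with exponential backoff. A sketch, deliberately not tied to any SDK's exception types (the string-match predicate for detecting a 429 is a simplistic placeholder; in real code, match on your client's actual rate-limit exception):

```python
import time

def retry_on_rate_limit(call, max_attempts: int = 4, base_delay: float = 1.0,
                        is_rate_limit=lambda e: "429" in str(e), sleep=time.sleep):
    """Invoke call(), retrying with exponential backoff on rate-limit errors.
    Re-raises immediately on other errors or once attempts are exhausted."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            if not is_rate_limit(exc) or attempt == max_attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))  # waits 1s, 2s, 4s, ...
```

Wrapping calls like `retry_on_rate_limit(lambda: chat("..."))` keeps the backoff logic out of your application code.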
Availability. Because free models are served through shared infrastructure, you may occasionally hit higher latency during peak times. The 4b variant tends to be most reliably fast; the 27b is more sensitive to load.
No SLA. Free models don't come with uptime guarantees. If reliability matters for your use case, OpenRouter's paid routing features (including automatic provider fallback) are worth looking at.
Models come and go. The set of available free models changes over time. What's free today might not be free next month. Building your model selection behind a config variable (rather than hardcoding it deep in logic) makes it easy to swap out.
Summary
OpenRouter makes free LLM access practical for developers. You get a single API key, an OpenAI-compatible endpoint, and a rotating set of capable models — all without a credit card. The Gemma 3 family is a solid starting point: 4b for speed, 12b for everyday tasks, 27b when output quality matters most.
The integration pattern is minimal — just swap your OpenAI base URL and key, and you're running. From there, parameterizing the model selection behind a config gives you the flexibility to upgrade or swap models as the free tier evolves.
If you're starting a new project and want to prototype quickly without worrying about API costs, OpenRouter's free tier is worth making your default starting point.
