Showing 21 of 21 models
Claude Haiku 4.5
anthropic/claude-haiku-4.5
Anthropic's fastest and most affordable Claude model — ideal for low-latency, high-volume tasks
Claude Opus 4.6
anthropic/claude-opus-4.6
Anthropic's most intelligent model — state-of-the-art reasoning, extended thinking, 1M-token context, and top-tier coding capabilities
Claude Sonnet 4.6
anthropic/claude-sonnet-4.6
Anthropic's flagship production model — 1M context, best balance of intelligence and speed
DeepSeek R1 Distill Qwen 32B
deepseek/deepseek-r1-distill-qwen-32b
DeepSeek R1 distilled into Qwen 32B — strong reasoning at zero cost on Cloudflare edge
DeepSeek V3.2
deepseek/deepseek-v3.2
DeepSeek's flagship open-source MoE model — strong reasoning, coding, and math
Gemini 2.5 Flash
google/gemini-2.5-flash
Google's fast multimodal model with 1M context, vision, and audio I/O
Gemini 3 Flash (Preview)
google/gemini-3-flash-preview
Google's next-generation Gemini 3 Flash — 1M context, enhanced reasoning and multimodal
Gemini 3.1 Flash Lite (Preview)
google/gemini-3.1-flash-lite-preview
Ultra-affordable Gemini 3.1 Flash Lite — very cheap, 1M context, multimodal
Gemma 3 12B
google/gemma-3-12b-it
Google's Gemma 3 12B — compact and efficient, free on Cloudflare Workers AI edge
Gemma 4 26B
google/gemma-4-26b-a4b-it
Google's Gemma 4 26B — next-generation open model with improved reasoning and efficiency
Gemma 4 31B
google/gemma-4-31b-it
Google's Gemma 4 31B — larger open model with stronger reasoning, instruction following, and multilingual performance
Llama 3.1 405B
meta-llama/llama-3.1-405b-instruct
Flagship open-source model matching frontier performance at 128k context
Llama 3.1 70B
meta-llama/llama-3.1-70b-instruct
Best open-source model for most production use cases with 128k context
Llama 3.1 8B Instruct
meta-llama/llama-3.1-8b-instruct
Meta's Llama 3.1 8B — lightweight, fast, and free via Cloudflare Workers AI edge
Llama 3.3 70B Instruct
meta-llama/llama-3.3-70b-instruct
Meta's Llama 3.3 70B — high-quality open model, available free on Cloudflare Workers AI
Mistral Small 3.1 24B
mistralai/mistral-small-3.1-24b-instruct
Mistral Small 3.1 — efficient 24B model with vision, free on Cloudflare Workers AI
GPT-4o Mini
openai/gpt-4o-mini
Cost-efficient small model that outperforms GPT-3.5 Turbo on most benchmarks
GPT-4o
openai/gpt-4o
Omni model with native vision, audio, and text at GPT-4 intelligence level
Qwen 2.5 72B Instruct
qwen/qwen2.5-72b-instruct
Alibaba's Qwen 2.5 72B — strong multilingual reasoning, free on Cloudflare Workers AI
Qwen 3.5 9B
qwen/qwen3.5-9b
Alibaba's Qwen 3.5 9B — compact, efficient, and very affordable reasoning model
Step 3.5 Flash
stepfun/step-3.5-flash
StepFun Step 3.5 Flash — fast, affordable model with strong multilingual support