Which free AI coding tools work without a VPN in China?

SiliconFlow (20 million tokens on signup), Alibaba Bailian (70 million tokens across 70+ models), and Zhipu GLM (20 million tokens) offer domestic endpoints. Pair them with OpenCode or Codex CLI by pointing base_url to the provider API—no proxy required.

The 2026 Complete Guide to Getting Free AI Coding Tokens: Gemini CLI, Codex, OpenCode, Copilot and More

In June 2026, free AI coding capacity is larger than most teams assume: Gemini CLI OAuth delivers 1,000 requests per day, Codex CLI runs on ChatGPT free accounts for a limited window, and OpenCode routes through domestic APIs at zero subscription cost. This guide gives you a quota decision matrix, five setup steps, token-saving tactics, and a path to keep agents online on a remote Mac before the June 18, 2026 Gemini CLI cutoff.

1. Three pain points: quota illusion, network gates, intermittent uptime

Pain point one: quota illusion. Many tools advertise “free” while hiding hard ceilings. GitHub Copilot Free and Cursor Hobby both cap premium agent calls at 50 per month—roughly one or two serious refactoring sessions if you rely on multi-file edits. Gemini CLI’s OAuth tier looks generous at 1,000 requests per day, but that number is separate from AI Studio API keys and from Antigravity CLI’s post-migration 20-request free tier documented in our Gemini CLI policy guide. Treat RPM (requests per minute) and RPD (requests per day) as independent budgets; hitting 429 errors mid-sprint usually means you skipped the fine print, not that the service is broken.

Pain point two: network gates. Gemini CLI and Codex CLI assume stable outbound access to Google and OpenAI endpoints. Developers in regions without direct routes pay latency and reliability tax through proxies—cost that never appears on a pricing page. Domestic providers remove that friction: SiliconFlow, Alibaba Bailian, and Zhipu GLM expose OpenAI-compatible APIs on mainland infrastructure. OpenCode and Codex CLI accept custom base_url values, so “free” should mean free after network costs, not free plus a VPN subscription you forgot to amortize.

Pain point three: intermittent uptime. CLI agents, OpenClaw gateways, and Telegram bots fail quietly when the host sleeps. A closed laptop lid kills OAuth refresh loops, aborts long-running refactors, and leaves chat channels in “delivered but no reply” states. Free tokens cannot compensate for a host that is offline twelve hours a day. The fix is architectural: move agents to an always-on Apple Silicon node with launchd supervision, then sync workspaces over SFTP or rsync—the same pattern we document for OpenClaw on rented Mac mini M4.

One deadline overrides all three pain points: on June 18, 2026, Google ends bundled Gemini CLI access for personal subscribers (free, Pro, and Ultra). That is the last window to harvest OAuth-backed 1,000-request days before you must pivot to Antigravity CLI, a paid API key, or domestic alternatives. If you have not logged in yet, treat the next nine days as a migration sprint, not an open-ended trial.

2. 2026 free-tier decision matrix (reference table)

Use this matrix to pick a primary tool and a backup before you install anything. Numbers reflect public tiers as of June 2026; verify on each vendor’s billing page before production lock-in.

Tool / platform	Free quota	Direct access (no VPN)	Best reason to use it
Gemini CLI (OAuth)	1,000 req/day, 60 req/min	Requires proxy in CN	No card; use before 6/18 cutoff
Codex CLI	ChatGPT free account (limited window)	Yes with domestic API endpoint	OS-level sandbox; GPT-5.3-Codex class models
OpenCode	Client free; pay per API call	Yes when API is domestic	75+ providers; large community footprint
OpenClaw	Gateway free; model cost via API	Depends on provider	Multi-channel agents (Telegram, WeChat ClawBot)
GitHub Copilot Free	2,000 completions + 50 premium/mo	Yes	Students get full Pro via verification
Cursor Hobby	2,000 Tab + 50 slow premium/mo	Yes	Full VS Code fork with agent panel
SiliconFlow	20M tokens (signup grant)	Yes (CN)	DeepSeek, Qwen, GLM-5 routing
Alibaba Bailian	70M tokens (70+ models)	Yes (CN)	Qwen 3.5 family coverage
Zhipu GLM	20M tokens (signup grant)	Yes (CN)	Strong Chinese-English codegen

A practical split: developers with reliable Google access run Gemini CLI primary + SiliconFlow backup; proxy-free regions run OpenCode + Bailian + SiliconFlow dual keys. IDE-centric workflows stack Cursor Hobby + Copilot Free for inline completion while terminal agents handle repo-wide refactors. Automation teams add OpenClaw when messages must arrive on Telegram or WeChat—see our ClawBot install matrix for channel-specific gates.

3. Domestic API providers and endpoint wiring

SiliconFlow grants 20 million tokens at registration with no expiry on the promotional balance for many accounts; Alibaba Bailian advertises 70 million tokens across more than 70 models including the Qwen 3.5 line; Zhipu AI ships roughly 20 million tokens for new GLM API users. All three expose OpenAI-compatible HTTPS endpoints, which matters because Codex CLI and OpenCode already speak that protocol—you are swapping hostname and key, not rewriting agent logic.

For Codex CLI, point the client at SiliconFlow by editing ~/.codex/config.toml:

# ~/.codex/config.toml excerpt
openai_base_url = "https://api.siliconflow.cn/v1"
# Set OPENAI_API_KEY in your shell profile to the SiliconFlow key

Export the key, then validate:

export OPENAI_API_KEY="sk-..."
codex doctor

OpenCode users can run /connect inside the TUI or edit ~/.config/opencode/config.json to register Bailian or Zhipu providers. OpenClaw operators inject the same keys into openclaw.json provider blocks—details in the OpenClaw installation guide. Keep keys out of git: use environment variables on the host and sync only redacted config templates over SFTP.

Model choice affects burn rate. Flash-class models on Bailian or SiliconFlow cost fewer tokens per turn than flagship reasoning models; start with them for exploration commands and reserve large models for merge-blocking reviews. Domestic quotas look huge on paper, but a single /init-style repository scan can consume hundreds of thousands of tokens in one shot—avoid that pattern entirely (covered in Section 6).

4. Six CLI and IDE free tiers at a glance

Gemini CLI installs with npm i -g @google/gemini-cli, authenticates through browser OAuth, and enforces 1,000 daily requests at 60 RPM until June 18, 2026. It excels at terminal-native exploration with Google’s latest Gemini models and requires no billing account for the bundled tier. After the cutoff, Antigravity CLI’s free layer drops to 20 requests per day—a 98% reduction—so treat current OAuth access as a deprecating asset, not a permanent entitlement.

Codex CLI from OpenAI targets developers who want an OS-sandboxed coding agent with ChatGPT login. The free ChatGPT tier access is time-boxed in 2026 marketing; pairing the binary with domestic API keys removes dependence on OpenAI’s consumer login entirely while keeping sandbox semantics. Run codex in a git worktree you are willing to let the agent modify.

OpenCode is provider-agnostic terminal software: zero subscription, pay only what your API bills. With 75+ integrated providers and a large open-source community, it is the default answer when Gemini OAuth disappears and you refuse closed-source Antigravity terms. Configuration is JSON-first and plays well with multiple backup keys.

OpenClaw is not a model vendor—it is a gateway that routes inbound messages from Telegram, Slack, or WeChat ClawBot to whichever CLI provider you configure. Budget for API tokens separately, but the gateway software itself costs nothing. Teams already running OpenClaw should add domestic keys now so June 18 does not take down every channel at once.

GitHub Copilot Free delivers 2,000 inline completions and 50 premium chat requests monthly inside VS Code, JetBrains, and Neovim. Verified students and maintainers of popular open-source repos can upgrade to Copilot Pro at no charge—worth the fifteen-minute verification if you live in IDE completions. Copilot does not replace terminal agents for whole-repo refactors; pair it with CLI tools instead of choosing one.

Cursor Hobby mirrors Copilot’s ceiling—2,000 Tab completions and 50 slow premium requests per month—in a VS Code fork with a stronger agent sidebar. It is ideal for evaluating Cursor’s UX before paying for Pro. Heavy users should read our Cursor Agent Skills guide and offload batch work to Gemini or Codex on a remote host. Routing multiple models through OpenRouter is covered in the OpenRouter CLI ranking article if you outgrow single-vendor free tiers.

5. Five-step setup HowTo

Pick primary and backup from the matrix. If Google endpoints are reachable, declare Gemini CLI primary and register SiliconFlow as backup. If not, standardize on OpenCode with Bailian primary and SiliconFlow secondary. Write the choice in your team README so nobody installs a third CLI silently.
Obtain credentials before June 18. Complete Gemini OAuth in a browser session tied to the machine that will run nightly jobs—refresh tokens bind to environment. Register domestic APIs with phone or corporate verification; store keys in 1Password or macOS Keychain, not plaintext dotfiles in git.
Install CLIs and pin config. Global npm installs for Gemini and Codex; OpenCode via their installer script. Set model IDs explicitly—gemini-2.5-pro vs flash tiers changes cost by an order of magnitude. Run gemini --version, codex doctor, or OpenCode’s health check before declaring victory.
Move agents to an always-on Mac. Laptop hosts fail OAuth refresh after sleep; launchd on a Mac mini or rented remote Mac keeps gateways alive. Install OpenClaw or bare CLI agents there, restrict filesystem permissions per our production workspace guides, and expose SSH plus SFTP for operators.
Sync code and config over SFTP or rsync. Edit locally, execute remotely, and treat ~/.openclaw, ~/.codex, and project trees as sync targets with exclude files for node_modules and secrets. Atomic releases beat live-editing production config over SSH when something breaks at 2 a.m.

After step five, run a deliberate burn test: execute twenty representative agent turns (read file, edit, run tests) and multiply by your expected daily sessions. If projected RPD exceeds 80% of any free tier, switch backup keys automatically via OpenCode provider priority or OpenClaw fallback providers before users notice latency spikes.

6. Token-saving tactics and backup strategy

Free tiers reward discipline more than clever prompts. Scope files narrowly: pass explicit paths instead of repo roots. One file per request beats “fix the whole service” monoliths that balloon context windows. Start with flash models for discovery; promote to pro or reasoning models only when tests fail. On Bailian and SiliconFlow, flash SKUs can be ten times cheaper per million tokens than flagship entries—check the dashboard unit price before defaulting to the largest name.

Never run repository-wide init scans. Claude Code’s /init, Codex bulk indexing, and similar “learn the codebase” commands ingest every file once. On a 500-file TypeScript monorepo that single command can burn 200,000–400,000 tokens—potentially 1–2% of a 20M SiliconFlow grant in minutes. Maintain a hand-written AGENTS.md or Cursor skill file instead; our Skills article shows the lighter pattern.

Set quota alerts at 80%. Domestic consoles expose usage graphs; Gemini and Copilot show monthly caps in settings. Automate a webhook or email when daily burn crosses 800 of 1,000 Gemini requests so you switch to backup keys before hard 429s. Rotate providers by time of day: Gemini OAuth during US morning hours, domestic APIs during CN peak latency windows—smooths rate limits without paying for burst capacity.

Pre-June 18 sprint: schedule batch refactors, documentation passes, and test generation while Gemini OAuth still delivers 1,000 requests daily. Parallel-register Bailian and SiliconFlow even if you do not need them yet; signup grants do not require immediate consumption. Post-cutoff, Antigravity’s 20-request free tier is a tasting menu, not a workload plan—budget paid keys or domestic APIs for anything that runs unattended.

7. FAQ

Q: Is Gemini CLI completely free? Yes, today. OAuth with a normal Google account yields 1,000 requests per day and 60 per minute without a credit card. Google stops bundled Gemini CLI for personal plans on June 18, 2026. After that, use Antigravity CLI (closed source, 20 free requests/day), bring your own Gemini API key to the open-source CLI, or pivot to OpenCode with domestic providers.

Q: Which free tools work without a VPN in mainland China? SiliconFlow (20M tokens), Alibaba Bailian (70M tokens), and Zhipu GLM (20M tokens) all offer mainland endpoints. Wire them into OpenCode or Codex CLI via OpenAI-compatible URLs. GitHub Copilot, Cursor Hobby, and OpenClaw gateways also operate without Google access if model keys point domestically.

Q: Is Cursor Hobby enough for daily development? For evaluation, yes; for production agent loops, no. Fifty slow premium requests per month exhausts quickly when each refactor spans multiple files. Use Cursor for Tab completion, push heavy agent sessions to Gemini CLI or Codex on a remote Mac, and apply for student Pro if eligible. Copilot Free carries the same 50-request premium ceiling—stacking both IDEs doubles inline completion but not agent depth.

Q: How do OpenClaw and OpenCode differ? OpenCode is a terminal client you invoke manually or from scripts. OpenClaw is a persistent gateway that connects chat channels to agents. Many teams run OpenClaw on a server with OpenCode or Codex as the execution backend. Budget API tokens for both; neither includes model inference in the software price.

Q: What happens on June 19 if I did nothing? Gemini CLI OAuth calls start returning auth or quota errors for personal subscribers. OpenClaw Telegram bots and cron-driven scripts fail together unless you already configured fallback providers. The recovery cost is higher than spending one afternoon on steps 1–5 above—treat the deadline like a certificate expiry, not a surprise maintenance window.

8. Summary: free tokens are fuel; an always-on Mac is the engine

2026 delivers an unusually rich set of zero-subscription AI coding tools—Gemini CLI’s 1,000 daily requests, Codex and OpenCode with domestic APIs, Copilot and Cursor for IDE-native completion, and OpenClaw for channel automation. The value is real, but it depends on three aligned factors: quota headroom (know your RPM/RPD), network path (domestic APIs when Google or OpenAI routes are costly), and host uptime (agents that sleep when your laptop lid closes).

Local machines hit hidden limits fast. OAuth refresh breaks after sleep, Windows and WSL hosts suspend background daemons, and low-memory VPS instances OOM during parallel tool calls—symptoms look like “AI stopped responding” when the root cause is infrastructure. Moving CLI agents and OpenClaw gateways to an always-on Apple Silicon remote Mac with launchd supervision, then syncing workspaces over SFTP or rsync, converts promotional token grants into dependable daily output.

SFTPMAC remote Mac rental targets exactly this stack: 7×24 nodes tuned for OpenClaw, Codex, Gemini CLI, and Cursor remote workflows—native macOS toolchains, stable outbound networking for OAuth, and SFTP-first workspace sync aligned with the runbooks in our Gemini policy and OpenRouter CLI articles. If you are harvesting free tiers before June 18, host where the tokens actually get spent—not on a machine that powers down at dinner.