2026 Hermes Agent Skills Advanced Guide: SKILL.md, Bundles & GEPA Self-Evolution Decision Guide
In early 2026 Nous Research shipped Hermes Agent, and within two months the project crossed 160,000 GitHub stars. The headline is not a bigger model—it is "the agent that grows with you." Under that promise sits a standardized, evolvable, cross-session Skills system. This guide skips first-time install (see our Hermes Agent step-by-step install guide) and goes straight to the architecture decisions: SKILL.md format, Progressive Disclosure, Skill Bundles, conditional activation, Tap publishing, and how GEPA + DSPy turn skills into assets that improve with every run.
1. Three pain points: why the Hermes Skills system deserves deep study
Many developers treat Hermes as another chat wrapper and miss the module with the highest long-term ROI:
- Runaway token cost: stuffing SOPs into the system prompt burns tokens on every turn. Progressive Disclosure keeps skills at zero cost until activated.
- Knowledge that never compounds: one-off prompts vanish with the session. Skills are cross-session procedural memory—versionable, shareable, and evolvable.
- Workflows that need repeated slash commands: complex tasks require multiple /skill-name invocations. Skill Bundles pack related skills into one slash command.
Advanced readers should answer five questions before shipping: How does Progressive Disclosure control tokens? How do conditional activations work? How do Bundles trigger full workflows? How does DSPy + GEPA auto-improve Skills? What community repos are worth installing today? The sections below address each one.
2. Core concepts: Skills are not Prompts, Skills are not Memory
Confusing the three leads to wrong architecture choices. A useful mnemonic: Prompt = sticky note (valid for this turn); Memory = notebook (permanent notes, always nearby); Skill = SOP manual (step-by-step procedure, opened when needed).
| Dimension | Plain Prompt | Memory | Skills |
|---|---|---|---|
| Persistence | Current conversation | Cross-session, permanent | Cross-session, permanent |
| Load timing | Always in context | Injected every session | On demand (key difference) |
| Token cost | Every turn | Small and stable | Zero until activated |
| Content type | Any intent description | User preferences / facts | Procedural steps (how to do something) |
| Maintainer | User manually | Agent automatically | User and Agent |
| Shareability | Awkward | Private | Publishable as community Tap |
3. SKILL.md format and Progressive Disclosure
3.1 Base structure (agentskills.io open standard)
All Hermes Skills follow the agentskills.io open standard, ensuring portability across Hermes, Claude Code, and Cursor.
---
name: my-skill                # Required: lowercase + hyphens, max 64 chars
description: |                # Required: max 1024 chars; start with "Use when..."
  Use when the user needs to [...].
  Handles [...] and [...].
version: 1.0.0
license: MIT
compatibility: Requires git, docker
allowed-tools: Bash(git:*) Read
metadata:
  hermes:
    tags: [devops, automation]
    category: software-development
    related_skills: [github-pr-workflow, test-driven-development]
    requires_toolsets: [terminal]
    fallback_for_toolsets: [web]
---
# My Skill Title
## Overview
## When to Use
## Procedure
## Common Pitfalls
## Verification Checklist
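The two required fields above can be checked before a skill is published. The following is a minimal sketch, not the official validator: the function name `check_frontmatter` is hypothetical, the limits are copied from the comments in the frontmatter example, and allowing digits in `name` is an assumption.

```python
import re

# Limits copied from the comments in the frontmatter example above.
# Allowing digits in the name is an assumption; the source only says
# "lowercase + hyphens".
NAME_PATTERN = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")
MAX_NAME_CHARS = 64
MAX_DESCRIPTION_CHARS = 1024


def check_frontmatter(name: str, description: str) -> list[str]:
    """Return a list of problems; an empty list means nothing was flagged."""
    problems = []
    if not name:
        problems.append("name is required")
    elif len(name) > MAX_NAME_CHARS:
        problems.append(f"name exceeds {MAX_NAME_CHARS} characters")
    elif not NAME_PATTERN.fullmatch(name):
        problems.append("name must use lowercase letters, digits, and hyphens")

    text = description.strip()
    if not text:
        problems.append("description is required")
    else:
        if len(text) > MAX_DESCRIPTION_CHARS:
            problems.append(f"description exceeds {MAX_DESCRIPTION_CHARS} characters")
        if not text.startswith("Use when"):
            problems.append('description should start with "Use when"')
    return problems


if __name__ == "__main__":
    print(check_frontmatter("my-skill", "Use when the user needs to review a pull request."))
    print(check_frontmatter("My_Skill", "Helps with code."))
```

A check like this is cheap to run in CI on every SKILL.md change; it complements a standards-level validator and does not replace one.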
3.2 Skill directory layout (modular design)
~/.hermes/skills/
└── my-category/
    └── my-skill/
        ├── SKILL.md      # Main file (core steps; aim for ≤500 lines)
        ├── references/   # API references (loaded on demand)
        ├── templates/    # Reusable templates
        └── scripts/      # Scripts the agent can execute directly
3.3 Progressive Disclosure: three load levels (token control core)
| Level | Content | Trigger | Token cost |
|---|---|---|---|
| Level 0 | name + description | Every session start, all skills | ~3K tokens (all skills combined) |
| Level 1 | Full SKILL.md body | User /skill-name or LLM decides needed | Depends on file length |
| Level 2 | references/ and scripts/ files | LLM decides during execution | Per file, on demand |
Writing tip: description is the only Level 0 signal—the LLM uses it to decide whether to load the full skill. Clarify when to use rather than what it is.
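To make the three load levels concrete, here is a rough cost model. It is a sketch under stated assumptions: the `Skill` class and both cost functions are hypothetical, and the 4-characters-per-token ratio is a common heuristic for English text, not a Hermes figure.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    description: str
    body: str  # full SKILL.md text below the frontmatter


def estimate_tokens(text: str) -> int:
    # Heuristic (assumption): roughly 4 characters per token, rounded up.
    return (len(text) + 3) // 4


def level0_cost(skills: list[Skill]) -> int:
    """Tokens paid at every session start: name + description of every skill."""
    return sum(estimate_tokens(s.name + " " + s.description) for s in skills)


def level1_cost(skills: list[Skill], activated: set[str]) -> int:
    """Extra tokens paid only for skills whose body is actually loaded."""
    return sum(estimate_tokens(s.body) for s in skills if s.name in activated)


if __name__ == "__main__":
    library = [
        Skill("github-code-review", "Use when reviewing a pull request.", "step\n" * 300),
        Skill("arxiv", "Use when the user asks about recent papers.", "step\n" * 200),
    ]
    print("Level 0:", level0_cost(library))
    print("Level 1 (one skill active):", level1_cost(library, {"arxiv"}))
```

The point of the model is the asymmetry: Level 0 grows with the number of installed skills, while Level 1 grows only with the skills a session actually activates.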
4. Skill Bundles: one command triggers a complete workflow
Skill Bundles are a 2026 Hermes addition and remain underused. A Bundle is lightweight YAML that packs multiple related skills into one slash command. Running /bundle-name loads every listed skill at once.
File location: ~/.hermes/skill-bundles/<slug>.yaml
name: backend-dev
description: |
  Full backend feature workflow: code review, TDD, and PR management.
skills:
  - github-code-review
  - test-driven-development
  - github-pr-workflow
instruction: |
  Always write failing tests first before implementation.
  Never push directly to main.
Advanced scenarios:
- AI researcher workflow (research-session): arxiv + deep-research + plan + excalidraw, to query recent papers and sketch architecture each session.
- MLOps deploy pipeline (mlops-deploy): vllm + llama-cpp + github-pr-workflow + systematic-debugging, to benchmark inference before and after deploy and log quantization settings.
Bundle priority rules:
- When a Bundle and a single Skill share a name, the Bundle wins.
- Missing skills in a Bundle are skipped without error; the loader reports what was absent.
- Bundles do not alter the system prompt, so they stay Prompt Cache friendly.
hermes bundles create backend-dev \
--skills github-code-review,test-driven-development,github-pr-workflow \
--instruction "Always write failing tests first"
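The Bundle priority rules in this section can be sketched as a small resolver. This illustrates the described behavior only, not Hermes's loader: `resolve_command` and its data shapes are hypothetical.

```python
def resolve_command(
    name: str,
    bundles: dict[str, list[str]],
    installed_skills: set[str],
) -> tuple[list[str], list[str]]:
    """Resolve a slash command to (skills_to_load, missing_skills)."""
    if name in bundles:
        # Rule 1: a Bundle wins over a single Skill with the same name.
        wanted = bundles[name]
        # Rule 2: missing skills are skipped, but reported back to the caller.
        loaded = [s for s in wanted if s in installed_skills]
        missing = [s for s in wanted if s not in installed_skills]
        return loaded, missing
    if name in installed_skills:
        return [name], []
    raise KeyError(f"no bundle or skill named {name!r}")


if __name__ == "__main__":
    bundles = {
        "backend-dev": [
            "github-code-review",
            "test-driven-development",
            "github-pr-workflow",
        ]
    }
    installed = {"github-code-review", "github-pr-workflow"}
    loaded, missing = resolve_command("backend-dev", bundles, installed)
    print("loaded:", loaded)
    print("missing:", missing)
```

Returning the missing names instead of raising mirrors the "skipped without error, but reported" rule, so a partially installed Bundle still does useful work.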
5. Conditional activation: environment-aware skills
Skills can auto-show or hide based on which tools are available in the current session. Configure under metadata.hermes:
metadata:
  hermes:
    requires_toolsets: [web]
    requires_tools: [web_search]
    fallback_for_toolsets: [browser]
    fallback_for_tools: [browser_navigate]
| Field | Behavior |
|---|---|
| requires_toolsets | Hide this skill when listed toolsets are missing |
| requires_tools | Hide this skill when listed tools are missing |
| fallback_for_toolsets | Hide when listed toolsets exist (acts as fallback) |
| fallback_for_tools | Hide when listed tools exist |
Classic pattern: free vs paid search. Set duckduckgo-search with fallback_for_tools: [web_search]: when FIRECRAWL_KEY or BRAVE_SEARCH_KEY is configured, paid web_search activates and the DuckDuckGo skill disappears to save tokens; when the API is unavailable, the fallback surfaces automatically.
Platform-aware example: telegram-notify can set requires_toolsets: [messaging] and platforms: [telegram, discord]. The hermes skills TUI lets you toggle skills independently for CLI, Telegram, and Discord.
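The four fields reduce to one visibility check per skill. The sketch below is an interpretation, not Hermes's implementation: `Activation` and `is_visible` are hypothetical names, and it assumes that every `requires_*` entry must be present while any single `fallback_for_*` entry being present is enough to hide the skill.

```python
from dataclasses import dataclass, field


@dataclass
class Activation:
    requires_toolsets: list[str] = field(default_factory=list)
    requires_tools: list[str] = field(default_factory=list)
    fallback_for_toolsets: list[str] = field(default_factory=list)
    fallback_for_tools: list[str] = field(default_factory=list)


def is_visible(rule: Activation, toolsets: set[str], tools: set[str]) -> bool:
    """Decide whether a skill should be listed in the current session."""
    # Assumption: all required toolsets and tools must be available.
    if not set(rule.requires_toolsets) <= toolsets:
        return False
    if not set(rule.requires_tools) <= tools:
        return False
    # Assumption: a fallback hides itself as soon as any primary option exists.
    if set(rule.fallback_for_toolsets) & toolsets:
        return False
    if set(rule.fallback_for_tools) & tools:
        return False
    return True


if __name__ == "__main__":
    duckduckgo = Activation(fallback_for_tools=["web_search"])
    print(is_visible(duckduckgo, toolsets=set(), tools={"web_search"}))
    print(is_visible(duckduckgo, toolsets=set(), tools=set()))
```

With this reading, the free-versus-paid search pattern falls out directly: the DuckDuckGo skill is listed only in sessions where `web_search` is absent.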
6. Skills Hub and open-source ecosystem
6.1 Official install channels
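# Install from the official Skills Hub
hermes skills install official/research/arxiv
# Install from a direct URL to a SKILL.md file
hermes skills install https://example.com/SKILL.md --name my-skill
# Install from a GitHub repository path
hermes skills install github:openai/skills/k8s
# Subscribe to a Tap (a GitHub repo that publishes skills)
hermes skills tap add github:my-org/my-skills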
6.2 Repos worth bookmarking
| Repository | Description | Highlights |
|---|---|---|
| ChuckSRQ/awesome-hermes-skills | Curated production-grade skills | Deep Research, MLOps, Apple integrations; 23 skills with GitHub Copilot |
| amanning3390/hermeshub | Community skill registry | Security scanning, API, and marketplace features |
| kevinnft/ai-agent-skills | 191 skills, 28 categories | One-command install for Hermes, Claude Code, and Cursor |
| NousResearch/hermes-agent | Official main repo | Authoritative source with all built-in Skills and authoring specs |
The agentskills.io open standard means Skills work across Hermes, Claude Code, Cursor, and OpenCode. Validate compliance with skills-ref validate ./my-skill.
7. Publishing your own Skill Tap: team and community sharing
Create a GitHub repo as a Tap so your team—or the wider community—can subscribe to your skill set. This is one of the least-documented advanced techniques.
my-skills-tap/
├── skills.sh.json
├── mlops/vllm-deploy/SKILL.md
├── research/paper-summarizer/SKILL.md
└── README.md
skills.sh.json controls Hub category display. Team deployment:
hermes skills tap add github:your-org/your-skills-tap
hermes skills tap add github:your-org/private-skills --token $GH_TOKEN
hermes skills tap update
hermes skills tap list
Version management tip: put ~/.hermes/skills/ under Git. Across devices, git pull && hermes skills reset syncs and rebuilds built-in skills.
8. Self-evolving skills with GEPA + DSPy
This is Hermes's most distinctive capability versus peer tools. GEPA (Genetic-Pareto Prompt Evolution) is an ICLR 2026 Oral result, integrated in hermes-agent-self-evolution. The idea: do not fine-tune model weights—analyze execution traces, generate variants, and run multi-objective Pareto optimization on SKILL.md text. Each optimization run costs roughly $2–10 (API calls only; no GPU required).
8.1 GEPA five-stage evolution pipeline
- Stage 1 — trace collection: read full reasoning traces from SQLite (tool calls, branches, errors).
- Stage 2 — reflective failure analysis: LLM produces actionable side information—not just "it failed," but why.
- Stage 3 — targeted mutation: generate 10–20 SKILL.md variants aimed at failure causes.
- Stage 4 — multi-objective Pareto evaluation: optimize success rate, token efficiency, and speed together.
- Stage 5 — human review PR: best variant becomes a PR; ships after human approval.
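Stage 4 keeps only the variants that no other variant beats on every objective. The sketch below shows that Pareto filter in isolation; it is not the GEPA implementation, and the `Variant` fields and sample numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    success_rate: float  # higher is better
    tokens: int          # lower is better
    seconds: float       # lower is better


def dominates(a: Variant, b: Variant) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    no_worse = (
        a.success_rate >= b.success_rate
        and a.tokens <= b.tokens
        and a.seconds <= b.seconds
    )
    strictly_better = (
        a.success_rate > b.success_rate
        or a.tokens < b.tokens
        or a.seconds < b.seconds
    )
    return no_worse and strictly_better


def pareto_front(variants: list[Variant]) -> list[Variant]:
    """Keep every variant that no other variant dominates."""
    return [v for v in variants if not any(dominates(o, v) for o in variants)]


if __name__ == "__main__":
    candidates = [
        Variant("v1", success_rate=0.90, tokens=1200, seconds=30.0),
        Variant("v2", success_rate=0.80, tokens=900, seconds=25.0),
        Variant("v3", success_rate=0.70, tokens=1300, seconds=40.0),
    ]
    for v in pareto_front(candidates):
        print(v.name)
```

Here v3 drops out because v1 is better on all three objectives, while v1 and v2 both survive: one is more reliable, the other cheaper and faster. Picking among the survivors is where the human review in Stage 5 comes in.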
8.2 Quick start
git clone https://github.com/NousResearch/hermes-agent-self-evolution
cd hermes-agent-self-evolution && pip install -r requirements.txt
export HERMES_AGENT_PATH=~/.hermes

# Evolve a skill against synthetic evaluation data
python -m evolution.skills.evolve_skill \
  --skill github-code-review --iterations 10 --eval-source synthetic

# Evolve the same skill against traces from the session database
python -m evolution.skills.evolve_skill \
  --skill github-code-review --iterations 10 --eval-source sessiondb
8.3 Four safety guardrails
- Full test suite: pytest tests/ -q must pass 100%.
- Size limits: Skills ≤ 15KB; tool descriptions ≤ 500 characters.
- Prompt Cache compatibility: no mid-session edits that invalidate cache.
- Semantic preservation: variants must not drift from the skill's core purpose.
8.4 Five-phase evolution roadmap (official status)
| Phase | Target | Engine | Status |
|---|---|---|---|
| Phase 1 | Skill files (SKILL.md) | DSPy + GEPA | Shipped |
| Phase 2 | Tool descriptions | DSPy + GEPA | Planned |
| Phase 3 | System prompt fragments | DSPy + GEPA | Planned |
| Phase 4 | Tool implementation code | Darwinian Evolver | Planned |
| Phase 5 | Continuous improvement loop | Automated pipeline | Planned |
Cross-platform trace fusion: because Skills follow agentskills.io, you can feed Claude Code or Gemini CLI traces into GEPA:
python -m evolution.skills.evolve_skill \
--skill github-code-review --iterations 10 --eval-source mixed \
--trace-dirs ~/.claude/traces,~/.hermes/sessions
9. Plugin skills: extending Hermes boundaries
Plugins package skills under a namespace (plugin:skill): they do not appear in the default skills_list (less noise), activate only on explicit user invocation (opt-in), and skills inside a plugin can reference each other.
skill_view("superpowers:writing-plans")
# Loading also surfaces sibling skills in the same plugin
Declare in the plugin plugin.yaml:
name: my-hermes-plugin
skills:
  - name: writing-plans
    path: skills/writing-plans/SKILL.md
  - name: editing
    path: skills/editing/SKILL.md
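The plugin:skill naming and the sibling lookup can be sketched with a parsed manifest. This is an illustration only: `split_qualified_name` and `sibling_skills` are hypothetical helpers, and the manifest is assumed to be the plugin.yaml above already loaded into a dict.

```python
from typing import Optional


def split_qualified_name(qualified: str) -> tuple[Optional[str], str]:
    """Split 'plugin:skill' into its parts; a bare skill name has no plugin."""
    plugin, sep, skill = qualified.partition(":")
    if not sep:
        return None, qualified
    if not plugin or not skill:
        raise ValueError(f"malformed skill name: {qualified!r}")
    return plugin, skill


def sibling_skills(manifest: dict, skill: str) -> list[str]:
    """List the other skills declared in the same plugin manifest."""
    names = [entry["name"] for entry in manifest["skills"]]
    if skill not in names:
        raise KeyError(f"{skill!r} is not declared in plugin {manifest['name']!r}")
    return [n for n in names if n != skill]


if __name__ == "__main__":
    manifest = {
        "name": "my-hermes-plugin",
        "skills": [
            {"name": "writing-plans", "path": "skills/writing-plans/SKILL.md"},
            {"name": "editing", "path": "skills/editing/SKILL.md"},
        ],
    }
    print(split_qualified_name("superpowers:writing-plans"))
    print(sibling_skills(manifest, "writing-plans"))
```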
10. Advanced authoring tips (engineer view)
10.1 description drives activation precision
Too vague: Helps with code. Better: Use when reviewing a pull request... Do NOT use for writing new code.
10.2 Pitfalls separate good skills from noise
High-quality Pitfalls name specific failure modes, root causes, and fix steps—fragile CSS selectors, GitHub API rate limits, large diff token overflow, and chunking strategies to handle them.
10.3 Scripting and skill_manage
Reference scripts/extract_schema.py in the Procedure section and, if it fails, have the agent load references/manual-extract.md. Agents can maintain skills themselves via skill_manage(action='patch') or skill_manage(action='create'). Set skills.agent_writes_require_approval: true in config.yaml to put a human approval gate in front of those writes.
10.4 Skill size control
| Skill size | Recommendation |
|---|---|
| < 500 lines | Keep everything in SKILL.md |
| 500–1000 lines | Move deep reference to references/ |
| > 1000 lines | Split strongly; consider two skills |
| > 15KB | Exceeds GEPA limit; must split |
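The table above translates into a simple check that can run before committing a skill. This is a sketch: `size_recommendation` is a hypothetical helper, and reading "15KB" as 15 * 1024 bytes of UTF-8 text is an assumption.

```python
# Assumption: "15KB" is read as 15 * 1024 bytes of UTF-8 encoded text.
GEPA_MAX_BYTES = 15 * 1024


def size_recommendation(skill_text: str) -> str:
    """Map a SKILL.md's size to the recommendation from the table above."""
    size_bytes = len(skill_text.encode("utf-8"))
    line_count = len(skill_text.splitlines())
    # The byte limit is checked first because it is the only hard limit.
    if size_bytes > GEPA_MAX_BYTES:
        return "Exceeds GEPA limit; must split"
    if line_count > 1000:
        return "Split strongly; consider two skills"
    if line_count >= 500:
        return "Move deep reference to references/"
    return "Keep everything in SKILL.md"


if __name__ == "__main__":
    sample = "## Procedure\n" + "1. Do the next step.\n" * 600
    print(size_recommendation(sample))
```

Measuring bytes rather than characters matters for non-ASCII skills: a Chinese character takes three bytes in UTF-8, so such a file reaches the byte limit with far fewer characters.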
11. Case study: building Skills for your tech blog workflow
Build a blog-workflow Bundle packing seo-keyword-research, outline-generator, code-example-validator, bilingual-checker, and publish-to-platform:
name: blog-workflow
description: Full tech blog writing workflow.
skills:
  - seo-keyword-research
  - outline-generator
  - code-example-validator
  - bilingual-checker
  - publish-to-platform
instruction: |
  Always research SEO keywords before writing.
  Ensure all code examples are tested and runnable.
  Generate both Chinese and English title options.
A custom seo-keyword-research skill should query long-tail terms at session start ("how to X," "X vs Y," platform-specific agent terminology), output 3–5 primary keywords plus 10–15 long-tail entries, and note terminology differences across locales.
12. FAQ
Q: What is the difference between Skills and MCP?
Skills are procedural knowledge documents; MCP is a tool interface. MCP provides database access; a Skill teaches the agent how to run a migration correctly. They complement each other.
Q: Why does the agent still use an old Skill after I edited it?
Run /reset for a new session, or reinstall with --now (which breaks Prompt Cache).
Q: Are GEPA-evolved skills safe?
Every evolved variant must pass the four guardrails and a human PR review, and the semantic preservation check guards against drift from the skill's core purpose. Still review every diff before merging.
Q: How do I reuse Hermes Skills in Claude Code?
Copy to ~/.claude/skills/ or use the kevinnft/ai-agent-skills multi-platform installer.
Q: Does Chinese content in Skills hurt token efficiency?
Chinese runs roughly 1–1.5 tokens per character, similar to English density. Keep description in English (or bilingual) for sharper LLM matching.
13. Resources and remote Mac 7x24 decision
13.1 Official and community resources
- Hermes Agent official docs · Chinese docs
- Skills system reference · agentskills.io open standard
- GEPA self-evolution tool · GEPA algorithm · DSPy framework
- Related on SFTPMAC: Cursor Agent Skills complete guide · OpenRouter CLI and Hermes selection guide
13.2 Deployment scenario decision matrix
| Scenario | Local Mac / laptop | Remote Mac 7x24 (SFTPMAC) |
|---|---|---|
| GEPA evolution + sessiondb | Lid close breaks traces; incomplete samples | Continuous session traces; richer evolution data |
| Telegram/Discord Gateway | Sleep and Wi-Fi drops take the bot offline | launchd supervision keeps Gateway online |
| Team Tap sync | Each developer's ~/.hermes drifts | Unified node + SFTP/rsync for skills directory |
| Skill Bundles long sessions | Memory and tokens compete with other apps | Apple Silicon unified memory; stable multi-skill runs |
13.3 Summary: from local experiments to production agent nodes
This guide covered the full Hermes Skills stack: concept comparison, SKILL.md and Progressive Disclosure, Skill Bundles, conditional activation, Hub ecosystem, Tap publishing, GEPA self-evolution, Plugin extensions, authoring tips, and a blog workflow case study. Mastering it upgrades your agent from a disposable prompt into a versionable, shareable, self-improving procedural asset.
A local Hermes install has clear limits: closing the laptop lid breaks connectivity, GEPA sessiondb traces stay incomplete, the Telegram Gateway drops when the machine sleeps, and team Taps drift across developer machines. Teams that need 7x24 uptime for evolution traces, Gateway hosting, or unified ~/.hermes/skills/ sync should consider running Hermes on an Apple Silicon remote Mac, which offers native launchd supervision, macOS-aligned toolchains, and SFTP/rsync for secure skills directory sync.
SFTPMAC remote Mac rental targets Hermes Agent Skills workflows: 7x24 Gateway, continuous GEPA trace collection, and team Tap directory sync—a better production entry than a home Mac pulling double duty. Start with your first SKILL.md today and let the agent compound from every session.