2026 Hermes Agent Skills Advanced Guide: SKILL.md, Bundles & GEPA Self-Evolution Decision Guide

In early 2026 Nous Research shipped Hermes Agent, and within two months the project crossed 160,000 GitHub stars. The headline is not a bigger model—it is "the agent that grows with you." Under that promise sits a standardized, evolvable, cross-session Skills system. This guide skips first-time install (see our Hermes Agent step-by-step install guide) and goes straight to the architecture decisions: SKILL.md format, Progressive Disclosure, Skill Bundles, conditional activation, Tap publishing, and how GEPA + DSPy turn skills into assets that improve with every run.

1. Three pain points: why the Hermes Skills system deserves deep study

Many developers treat Hermes as another chat wrapper and miss the module with the highest long-term ROI:

Runaway token cost: stuffing SOPs into the system prompt burns tokens on every turn. Progressive Disclosure keeps skills at zero cost until activated.
Knowledge that never compounds: one-off prompts vanish with the session. Skills are cross-session procedural memory—versionable, shareable, and evolvable.
Workflows that need repeated slash commands: complex tasks require multiple /skill-name invocations. Skill Bundles pack related skills into one slash command.

Advanced readers should answer five questions before shipping: How does Progressive Disclosure control tokens? How do conditional activations work? How do Bundles trigger full workflows? How does DSPy + GEPA auto-improve Skills? What community repos are worth installing today? The sections below address each one.

2. Core concepts: Skills are not Prompts, Skills are not Memory

Confusing the three leads to wrong architecture choices. A useful mnemonic: Prompt = sticky note (valid for this turn); Memory = notebook (permanent notes, always nearby); Skill = SOP manual (step-by-step procedure, opened when needed).

Dimension	Plain Prompt	Memory	Skills
Persistence	Current conversation	Cross-session, permanent	Cross-session, permanent
Load timing	Always in context	Injected every session	On demand (key difference)
Token cost	Every turn	Small and stable	Zero until activated
Content type	Any intent description	User preferences / facts	Procedural steps (how to do something)
Maintainer	User manually	Agent automatically	User and Agent
Shareability	Awkward	Private	Publishable as community Tap

3. SKILL.md format and Progressive Disclosure

3.1 Base structure (agentskills.io open standard)

All Hermes Skills follow the agentskills.io open standard, ensuring portability across Hermes, Claude Code, and Cursor.

---
name: my-skill                    # Required: lowercase + hyphens, max 64 chars
description: |                    # Required: max 1024 chars; start with "Use when..."
  Use when the user needs to [...].
  Handles [...] and [...].
version: 1.0.0
license: MIT
compatibility: Requires git, docker
allowed-tools: Bash(git:*) Read
metadata:
  hermes:
    tags: [devops, automation]
    category: software-development
    related_skills: [github-pr-workflow, test-driven-development]
    requires_toolsets: [terminal]
    fallback_for_toolsets: [web]
---

# My Skill Title

## Overview
## When to Use
## Procedure
## Common Pitfalls
## Verification Checklist
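
The two required fields carry hard limits, so they are easy to lint before publishing. The following is a minimal sketch, not part of Hermes or the agentskills.io tooling: `check_frontmatter` is a hypothetical helper, and the exact character set for `name` (digits allowed, no leading or trailing hyphen) is an assumption layered on the "lowercase + hyphens" rule above.

```python
import re

# Limits come from the comments in the frontmatter example above.
MAX_NAME_CHARS = 64
MAX_DESCRIPTION_CHARS = 1024
# Assumption: digits are allowed alongside lowercase letters and hyphens,
# and a name may not start or end with a hyphen.
NAME_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def check_frontmatter(name: str, description: str) -> list:
    """Return a list of problems; an empty list means both fields pass."""
    problems = []
    if not name:
        problems.append("name is required")
    elif len(name) > MAX_NAME_CHARS:
        problems.append(f"name is longer than {MAX_NAME_CHARS} characters")
    elif not NAME_PATTERN.fullmatch(name):
        problems.append("name must use lowercase letters, digits, and hyphens")

    text = description.strip()
    if not text:
        problems.append("description is required")
    else:
        if len(text) > MAX_DESCRIPTION_CHARS:
            problems.append(
                f"description is longer than {MAX_DESCRIPTION_CHARS} characters"
            )
        if not text.startswith("Use when"):
            problems.append('description should start with "Use when"')
    return problems
```

Calling `check_frontmatter("my-skill", "Use when the user needs to review a PR.")` returns an empty list, while a name such as `My_Skill` or a description such as `Helps with code.` each produce one problem.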

3.2 Skill directory layout (modular design)

~/.hermes/skills/
└── my-category/
    └── my-skill/
        ├── SKILL.md              # Main file (core steps; aim for ≤500 lines)
        ├── references/           # API references (loaded on demand)
        ├── templates/            # Reusable templates
        └── scripts/              # Scripts the agent can execute directly

3.3 Progressive Disclosure: three load levels (token control core)

Level	Content	Trigger	Token cost
Level 0	`name` + `description`	Every session start, all skills	~3K tokens (all skills combined)
Level 1	Full SKILL.md body	User `/skill-name` or LLM decides needed	Depends on file length
Level 2	`references/` and `scripts/` files	LLM decides during execution	Per file, on demand

Writing tip: description is the only Level 0 signal, and the LLM uses it to decide whether to load the full skill. Describe when to use the skill rather than what it is.
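
The load levels can be pictured as a small lazy index. This is an illustrative sketch only, not Hermes's loader: `SkillIndex`, its methods, and the two sample skills are hypothetical, and Level 2 files would follow the same lazy pattern.

```python
from typing import Callable, Dict, List


class SkillIndex:
    """Keeps Level 0 metadata in memory and reads Level 1 bodies lazily."""

    def __init__(self) -> None:
        self._descriptions: Dict[str, str] = {}
        self._readers: Dict[str, Callable[[], str]] = {}
        self._bodies: Dict[str, str] = {}

    def register(self, name: str, description: str,
                 read_body: Callable[[], str]) -> None:
        # Only name + description are stored up front (Level 0).
        self._descriptions[name] = description
        self._readers[name] = read_body

    def level0_listing(self) -> str:
        """Text that is present at every session start: one line per skill."""
        return "\n".join(f"{n}: {d}" for n, d in self._descriptions.items())

    def activate(self, name: str) -> str:
        """Level 1: read the full body the first time the skill is needed."""
        if name not in self._bodies:
            self._bodies[name] = self._readers[name]()
        return self._bodies[name]

    def loaded(self) -> List[str]:
        return list(self._bodies)


# Made-up skills for illustration. A real reader might be
# lambda: Path(".../SKILL.md").read_text(encoding="utf-8").
index = SkillIndex()
index.register("arxiv", "Use when the user asks about recent papers.",
               lambda: "# arXiv\n## Procedure\n1. Search...\n")
index.register("github-pr-workflow",
               "Use when opening or updating a pull request.",
               lambda: "# PR Workflow\n## Procedure\n1. Branch...\n")
```

Until `index.activate("arxiv")` runs, only the two one-line descriptions exist in the listing, which is why a precise description matters more than a long body.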

4. Skill Bundles: one command triggers a complete workflow

Skill Bundles are a 2026 Hermes addition and remain underused. A Bundle is a lightweight YAML file that packs multiple related skills into one slash command. Running /bundle-name loads every listed skill at once.

File location: ~/.hermes/skill-bundles/<slug>.yaml

name: backend-dev
description: |
  Full backend feature workflow — code review, TDD, and PR management.
skills:
  - github-code-review
  - test-driven-development
  - github-pr-workflow
instruction: |
  Always write failing tests first before implementation.
  Never push directly to main.

Advanced scenarios:

AI researcher workflow (research-session): arxiv + deep-research + plan + excalidraw—query recent papers and sketch architecture each session.
MLOps deploy pipeline (mlops-deploy): vllm + llama-cpp + github-pr-workflow + systematic-debugging—benchmark inference before and after deploy and log quantization settings.

Bundle priority rules:

When a Bundle and a single Skill share a name, the Bundle wins.
Missing skills in a Bundle are skipped without error; the loader reports what was absent.
Bundles do not alter the system prompt, so they stay Prompt Cache friendly.
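
The first two rules above can be sketched as a small resolver. This is a hypothetical illustration of the described behavior, not Hermes's implementation; `resolve_command` and the sample data are made up.

```python
from typing import Dict, List, Set, Tuple


def resolve_command(command: str,
                    bundles: Dict[str, List[str]],
                    installed: Set[str]) -> Tuple[List[str], List[str]]:
    """Return (skills_to_load, missing_skills) for a slash command name."""
    if command in bundles:
        # A Bundle shadows a single Skill that shares its name.
        wanted = bundles[command]
        to_load = [s for s in wanted if s in installed]
        # Missing skills are skipped without error but still reported.
        missing = [s for s in wanted if s not in installed]
        return to_load, missing
    if command in installed:
        return [command], []
    raise KeyError(f"unknown command: /{command}")


# Sample data: one bundle, and an installed skill that shares its name.
BUNDLES = {
    "backend-dev": [
        "github-code-review",
        "test-driven-development",
        "github-pr-workflow",
    ],
}
INSTALLED = {"github-code-review", "github-pr-workflow", "backend-dev"}
```

With this data, `/backend-dev` resolves to the bundle rather than the same-named skill, loads the two installed skills, and reports `test-driven-development` as absent.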

hermes bundles create backend-dev \
  --skills github-code-review,test-driven-development,github-pr-workflow \
  --instruction "Always write failing tests first"

5. Conditional activation: environment-aware skills

Skills can auto-show or hide based on which tools are available in the current session. Configure under metadata.hermes:

metadata:
  hermes:
    requires_toolsets: [web]
    requires_tools: [web_search]
    fallback_for_toolsets: [browser]
    fallback_for_tools: [browser_navigate]

Field	Behavior
requires_toolsets	Hide this skill when listed toolsets are missing
requires_tools	Hide this skill when listed tools are missing
fallback_for_toolsets	Hide when listed toolsets exist (acts as fallback)
fallback_for_tools	Hide when listed tools exist

Classic pattern: free vs paid search. Give the duckduckgo-search skill fallback_for_tools: [web_search]. When FIRECRAWL_KEY or BRAVE_SEARCH_KEY is configured, the paid web_search tool is available and the DuckDuckGo skill is hidden to save tokens; when no key is configured, web_search is unavailable and the fallback skill surfaces automatically.
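
The four fields reduce to a single visibility check. The sketch below is an assumption-laden illustration rather than Hermes's code: `is_visible` is a hypothetical function, and it assumes that every `requires_*` entry must be present while any one `fallback_for_*` entry being present is enough to hide the skill.

```python
from typing import Dict, List, Set


def is_visible(hermes_meta: Dict[str, List[str]],
               toolsets: Set[str],
               tools: Set[str]) -> bool:
    """Decide whether a skill is shown, given the session's toolsets and tools."""
    # requires_*: hide when a listed item is missing
    # (assumption: all listed items must be available).
    if not set(hermes_meta.get("requires_toolsets", [])) <= toolsets:
        return False
    if not set(hermes_meta.get("requires_tools", [])) <= tools:
        return False
    # fallback_for_*: hide when a listed item exists
    # (assumption: any single match hides the fallback).
    if set(hermes_meta.get("fallback_for_toolsets", [])) & toolsets:
        return False
    if set(hermes_meta.get("fallback_for_tools", [])) & tools:
        return False
    return True


# The free-search fallback from the pattern above.
DUCKDUCKGO_META = {"fallback_for_tools": ["web_search"]}
```

For the DuckDuckGo example, `is_visible(DUCKDUCKGO_META, set(), set())` is true, and adding `web_search` to the available tools flips it to false.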

Platform-aware example: telegram-notify can set requires_toolsets: [messaging] and platforms: [telegram, discord]. The hermes skills TUI lets you toggle skills independently for CLI, Telegram, and Discord.

6. Skills Hub and open-source ecosystem

6.1 Official install channels

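# Install a skill from the official hub
hermes skills install official/research/arxiv
# Install a single SKILL.md from a URL under a chosen name
hermes skills install https://example.com/SKILL.md --name my-skill
# Install a skill from a GitHub repository path
hermes skills install github:openai/skills/k8s
# Subscribe to a GitHub repository as a Tap
hermes skills tap add github:my-org/my-skills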

6.2 Repos worth bookmarking

Repository	Description	Highlights
ChuckSRQ/awesome-hermes-skills	Curated production-grade skills	Deep Research, MLOps, Apple integrations; 23 skills with GitHub Copilot
amanning3390/hermeshub	Community skill registry	Security scanning, API, and marketplace features
kevinnft/ai-agent-skills	191 skills, 28 categories	One-command install for Hermes, Claude Code, and Cursor
NousResearch/hermes-agent	Official main repo	Authoritative source with all built-in Skills and authoring specs

The agentskills.io open standard means Skills work across Hermes, Claude Code, Cursor, and OpenCode. Validate compliance with skills-ref validate ./my-skill.

7. Publishing your own Skill Tap: team and community sharing

Create a GitHub repo as a Tap so your team—or the wider community—can subscribe to your skill set. This is one of the least-documented advanced techniques.

my-skills-tap/
├── skills.sh.json
├── mlops/vllm-deploy/SKILL.md
├── research/paper-summarizer/SKILL.md
└── README.md

skills.sh.json controls Hub category display. Team deployment:

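# Subscribe to the team Tap
hermes skills tap add github:your-org/your-skills-tap
# Subscribe to a private Tap, authenticating with a GitHub token
hermes skills tap add github:your-org/private-skills --token $GH_TOKEN
# Refresh skills from subscribed Taps
hermes skills tap update
# Show subscribed Taps
hermes skills tap list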

Version management tip: put ~/.hermes/skills/ under Git. To sync another device, run git pull && hermes skills reset, which pulls your changes and rebuilds the built-in skills.

8. Self-evolving skills with GEPA + DSPy

This is Hermes's most distinctive capability versus peer tools. GEPA (Genetic-Pareto Prompt Evolution) is an ICLR 2026 Oral result, integrated in hermes-agent-self-evolution. The idea: do not fine-tune model weights—analyze execution traces, generate variants, and run multi-objective Pareto optimization on SKILL.md text. Each optimization run costs roughly $2–10 (API calls only; no GPU required).

8.1 GEPA five-stage evolution pipeline

Stage 1 — trace collection: read full reasoning traces from SQLite (tool calls, branches, errors).
Stage 2 — reflective failure analysis: LLM produces actionable side information—not just "it failed," but why.
Stage 3 — targeted mutation: generate 10–20 SKILL.md variants aimed at failure causes.
Stage 4 — multi-objective Pareto evaluation: optimize success rate, token efficiency, and speed together.
Stage 5 — human review PR: best variant becomes a PR; ships after human approval.
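
Stage 4 relies on Pareto dominance: a variant survives unless another variant is at least as good on every objective and strictly better on one. The sketch below shows only that selection step in plain Python. It is not the GEPA or DSPy implementation, and `VariantScore`, `pareto_front`, and the scores are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VariantScore:
    name: str
    success_rate: float  # higher is better
    tokens: int          # lower is better
    seconds: float       # lower is better


def dominates(a: VariantScore, b: VariantScore) -> bool:
    """True when a is no worse than b everywhere and better somewhere."""
    no_worse = (a.success_rate >= b.success_rate
                and a.tokens <= b.tokens
                and a.seconds <= b.seconds)
    strictly_better = (a.success_rate > b.success_rate
                       or a.tokens < b.tokens
                       or a.seconds < b.seconds)
    return no_worse and strictly_better


def pareto_front(variants: List[VariantScore]) -> List[VariantScore]:
    """Keep every variant that no other variant dominates."""
    return [v for v in variants
            if not any(dominates(other, v) for other in variants)]


# Invented scores for three SKILL.md variants.
CANDIDATES = [
    VariantScore("variant-a", success_rate=0.90, tokens=4000, seconds=30.0),
    VariantScore("variant-b", success_rate=0.85, tokens=2500, seconds=25.0),
    VariantScore("variant-c", success_rate=0.80, tokens=4200, seconds=35.0),
]
```

Here variant-c is dominated by variant-a on all three objectives, while variant-a and variant-b trade success rate against cost, so both stay on the front for the later review stage.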

8.2 Quick start

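# Clone the self-evolution tool and install its dependencies
git clone https://github.com/NousResearch/hermes-agent-self-evolution
cd hermes-agent-self-evolution && pip install -r requirements.txt
# Point the tool at your Hermes directory
export HERMES_AGENT_PATH=~/.hermes

# Evolve a skill using the synthetic evaluation source
python -m evolution.skills.evolve_skill \
    --skill github-code-review --iterations 10 --eval-source synthetic

# Evolve the same skill using the sessiondb evaluation source
python -m evolution.skills.evolve_skill \
    --skill github-code-review --iterations 10 --eval-source sessiondb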

8.3 Four safety guardrails

Full test suite: pytest tests/ -q must pass 100%.
Size limits: Skills ≤ 15KB; tool descriptions ≤ 500 characters.
Prompt Cache compatibility: no mid-session edits that invalidate cache.
Semantic preservation: variants must not drift from the skill's core purpose.
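
The size guardrail is the easiest of the four to check locally before starting an evolution run. A minimal sketch, assuming "15KB" means 15 * 1024 bytes of UTF-8 text and that tool descriptions are counted in characters; `size_violations` is a hypothetical helper, not part of hermes-agent-self-evolution.

```python
from typing import Dict, List

MAX_SKILL_BYTES = 15 * 1024        # assumption: 15KB read as 15 * 1024 bytes
MAX_TOOL_DESCRIPTION_CHARS = 500   # limit stated in the guardrail list


def size_violations(skill_text: str,
                    tool_descriptions: Dict[str, str]) -> List[str]:
    """Return one message per exceeded limit; an empty list means all pass."""
    violations = []
    # Measure encoded bytes, not characters: CJK text takes 3 bytes per
    # character in UTF-8, so it reaches the limit sooner than ASCII.
    size = len(skill_text.encode("utf-8"))
    if size > MAX_SKILL_BYTES:
        violations.append(
            f"skill is {size} bytes, limit is {MAX_SKILL_BYTES}"
        )
    for tool, description in tool_descriptions.items():
        if len(description) > MAX_TOOL_DESCRIPTION_CHARS:
            violations.append(
                f"{tool} description is {len(description)} characters, "
                f"limit is {MAX_TOOL_DESCRIPTION_CHARS}"
            )
    return violations
```

In practice the skill text would come from reading SKILL.md, and a non-empty result would be a reason to move material into references/ before evolving the skill.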

8.4 Five-phase evolution roadmap (official status)

Phase	Target	Engine	Status
Phase 1	Skill files (SKILL.md)	DSPy + GEPA	Shipped
Phase 2	Tool descriptions	DSPy + GEPA	Planned
Phase 3	System prompt fragments	DSPy + GEPA	Planned
Phase 4	Tool implementation code	Darwinian Evolver	Planned
Phase 5	Continuous improvement loop	Automated pipeline	Planned

Cross-platform trace fusion: because Skills follow agentskills.io, you can feed Claude Code or Gemini CLI traces into GEPA:

python -m evolution.skills.evolve_skill \
    --skill github-code-review --iterations 10 --eval-source mixed \
    --trace-dirs ~/.claude/traces,~/.hermes/sessions

9. Plugin skills: extending Hermes boundaries

Plugins package skills under a namespace (plugin:skill): they do not appear in the default skills_list (less noise), activate only on explicit user invocation (opt-in), and skills inside a plugin can reference each other.

skill_view("superpowers:writing-plans")
# Loading also surfaces sibling skills in the same plugin

Declare the skills in the plugin's plugin.yaml:

name: my-hermes-plugin
skills:
  - name: writing-plans
    path: skills/writing-plans/SKILL.md
  - name: editing
    path: skills/editing/SKILL.md
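
The plugin:skill naming and the sibling behavior can be illustrated with a small lookup. This is a hypothetical sketch, not the Hermes plugin loader: `split_qualified_name`, `sibling_skills`, and the `MANIFEST` dictionary (a hand-written mirror of the plugin.yaml above) are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def split_qualified_name(qualified: str) -> Tuple[str, str]:
    """Split 'plugin:skill' into its two parts, rejecting malformed names."""
    plugin, separator, skill = qualified.partition(":")
    if not separator or not plugin or not skill:
        raise ValueError(f"expected 'plugin:skill', got {qualified!r}")
    return plugin, skill


def sibling_skills(manifest: Dict, skill: str) -> List[str]:
    """List the other skills declared in the same plugin manifest."""
    names = [entry["name"] for entry in manifest["skills"]]
    if skill not in names:
        raise KeyError(f"{skill} is not declared in {manifest['name']}")
    return [name for name in names if name != skill]


# Dictionary form of the plugin.yaml shown above.
MANIFEST = {
    "name": "my-hermes-plugin",
    "skills": [
        {"name": "writing-plans", "path": "skills/writing-plans/SKILL.md"},
        {"name": "editing", "path": "skills/editing/SKILL.md"},
    ],
}
```

Splitting `my-hermes-plugin:writing-plans` yields the plugin and skill names, and the sibling lookup for `writing-plans` returns `editing`, matching the idea that loading one plugin skill surfaces the others.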

10. Advanced authoring tips (engineer view)

10.1 description drives activation precision

Too vague: "Helps with code." Better: "Use when reviewing a pull request... Do NOT use for writing new code."

10.2 Pitfalls separate good skills from noise

High-quality Pitfalls name specific failure modes, their root causes, and the fix steps. Examples: fragile CSS selectors, GitHub API rate limits, and token overflow on large diffs, paired with the chunking strategy that handles it.

10.3 Scripting and skill_manage

In Procedure, point the agent at a script such as scripts/extract_schema.py and tell it to load references/manual-extract.md if the script fails. Agents can maintain skills themselves through skill_manage with action='patch' or action='create'. Set skills.agent_writes_require_approval: true in config.yaml to add a human approval gate.

10.4 Skill size control

Skill size	Recommendation
< 500 lines	Keep everything in SKILL.md
500–1000 lines	Move deep reference to references/
> 1000 lines	Split strongly; consider two skills
> 15KB	Exceeds GEPA limit; must split

11. Case study: building Skills for your tech blog workflow

Build a blog-workflow Bundle packing seo-keyword-research, outline-generator, code-example-validator, bilingual-checker, and publish-to-platform:

name: blog-workflow
description: Full tech blog writing workflow.
skills:
  - seo-keyword-research
  - outline-generator
  - code-example-validator
  - bilingual-checker
  - publish-to-platform
instruction: |
  Always research SEO keywords before writing.
  Ensure all code examples are tested and runnable.
  Generate both Chinese and English title options.

A custom seo-keyword-research skill should query long-tail terms at session start ("how to X," "X vs Y," platform-specific agent terminology), output 3–5 primary keywords plus 10–15 long-tail entries, and note terminology differences across locales.

12. FAQ

Q: What is the difference between Skills and MCP?
Skills are procedural knowledge documents; MCP is a tool interface. MCP provides database access; a Skill teaches the agent how to run a migration correctly. They complement each other.

Q: Why does the agent still use an old Skill after I edited it?
The current session cached the old version. Run /reset for a new session, or reinstall with --now to force a refresh (which breaks Prompt Cache and increases token spend).

Q: Are GEPA-evolved skills safe?
Evolved variants must pass the four guardrails plus human PR review, and the semantic preservation check keeps a variant from drifting away from the skill's core purpose. Still review every diff before merge.

Q: How do I reuse Hermes Skills in Claude Code?
Copy to ~/.claude/skills/ or use the kevinnft/ai-agent-skills multi-platform installer.

Q: Does Chinese content in Skills hurt token efficiency?
Chinese runs roughly 1–1.5 tokens per character, similar to English density. Keep description in English (or bilingual) for sharper LLM matching.

13. Resources and remote Mac 7x24 decision

13.1 Official and community resources

Hermes Agent official docs · Chinese docs
Skills system reference · agentskills.io open standard
GEPA self-evolution tool · GEPA algorithm · DSPy framework
Related on SFTPMAC: Cursor Agent Skills complete guide · OpenRouter CLI and Hermes selection guide

13.2 Deployment scenario decision matrix

Scenario	Local Mac / laptop	Remote Mac 7x24 (SFTPMAC)
GEPA evolution + sessiondb	Lid close breaks traces; incomplete samples	Continuous session traces; richer evolution data
Telegram/Discord Gateway	Sleep and Wi-Fi drops take the bot offline	launchd supervision keeps Gateway online
Team Tap sync	Each developer's ~/.hermes drifts	Unified node + SFTP/rsync for skills directory
Skill Bundles long sessions	Memory and tokens compete with other apps	Apple Silicon unified memory; stable multi-skill runs

13.3 Summary: from local experiments to production agent nodes

This guide covered the full Hermes Skills stack: concept comparison, SKILL.md and Progressive Disclosure, Skill Bundles, conditional activation, Hub ecosystem, Tap publishing, GEPA self-evolution, Plugin extensions, authoring tips, and a blog workflow case study. Mastering it upgrades your agent from a disposable prompt into a versionable, shareable, self-improving procedural asset.

Local Hermes has clear limits: closing the laptop lid breaks connectivity, GEPA sessiondb traces stay incomplete, the Telegram Gateway drops when the machine sleeps, and team Taps drift across developer machines. Teams that need 7x24 uptime for evolution traces, Gateway hosting, or unified ~/.hermes/skills/ sync should run Hermes on an Apple Silicon remote Mac, which offers native launchd supervision, macOS-aligned toolchains, and SFTP/rsync for secure skills directory sync.

SFTPMAC remote Mac rental targets Hermes Agent Skills workflows: 7x24 Gateway, continuous GEPA trace collection, and team Tap directory sync—a better production entry than a home Mac pulling double duty. Start with your first SKILL.md today and let the agent compound from every session.