opensourceprojects.dev

A broadsheet for software that doesn't ask for your email

SkillClaw evolves agent skills collectively from real conversations

SkillClaw: Evolving Agent Skills From Real Conversations

Imagine you've built an AI agent that handles customer support. It starts out competent, but after a few hundred real conversations, you realize it's missing a bunch of useful skills. To close those gaps by hand, you'd have to analyze transcripts, write new prompts, and retrain. That's slow and brittle.

SkillClaw takes a different approach. It watches actual conversations between users and your agent, figures out where the agent is lacking, and autonomously evolves new skills to fill those gaps. It's like having a junior developer who learns from live production data and ships patches on their own.

What It Does

SkillClaw is a framework that continuously improves a conversational agent's capabilities by analyzing real dialogue logs. It operates in a loop:

  1. Collects conversations between users and the agent.
  2. Identifies skill gaps: moments where the agent fails to provide a useful response or misses an opportunity.
  3. Generates new skills in the form of modular, reusable functions (think tool calls or specialized prompts).
  4. Validates them against held-out conversation examples to prevent regressions.
  5. Integrates them back into the agent's runtime.

The result is an agent that gets smarter over time without a human having to patch every missing edge case by hand.
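
To make the loop concrete, here is a minimal Python sketch of one pass through the five steps. It is not SkillClaw's actual code: the Skill record, the keyword matching in answer, the score function, and the propose callback (which stands in for an LLM call) are all assumptions invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Skill:
    """Hypothetical skill record; the real project may store skills differently."""
    name: str      # short identifier
    trigger: str   # keyword that routes a message to this skill (toy matching)
    response: str  # what the agent says when the skill fires


def answer(skills: List[Skill], message: str) -> Optional[str]:
    """Return the response of the first matching skill, or None if none applies."""
    for skill in skills:
        if skill.trigger in message.lower():
            return skill.response
    return None


def score(skills: List[Skill], holdout: List[Tuple[str, str]]) -> int:
    """Count held-out (question, expected answer) pairs the skills get right."""
    return sum(answer(skills, q) == expected for q, expected in holdout)


def evolve(skills: List[Skill], logs: List[str],
           holdout: List[Tuple[str, str]],
           propose: Callable[[str], Skill]) -> List[Skill]:
    """One pass of the loop over collected user messages (step 1)."""
    for message in logs:
        if answer(skills, message) is not None:
            continue                      # step 2: no gap, the agent already answers
        candidate = propose(message)      # step 3: generate a candidate skill
        trial = [candidate] + skills      # the newest skill gets priority
        if score(trial, holdout) >= score(skills, holdout):  # step 4: validate
            skills = trial                # step 5: integrate
    return skills


# Toy data: a hand-written proposer instead of an LLM.
proposals = {
    "how do i get a refund": Skill("refund", "refund", "Refunds take 5 days."),
    "are you open?": Skill("open", "you", "Yes."),  # trigger is far too broad
}
start = [Skill("hours", "hours", "We open at 9.")]
holdout = [("what are your hours?", "We open at 9.")]
result = evolve(start, list(proposals), holdout, proposals.__getitem__)
print([s.name for s in result])
```

In this toy run the "refund" skill is accepted because the held-out answer about opening hours is unaffected, while the over-broad "open" skill is rejected because it would hijack that answer.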

Why It's Cool

A few things stand out about this approach:

  • Data-driven, not prompt-engineered. Instead of guessing what skills your agent needs, SkillClaw lets real user interactions drive the evolution. It only adds skills that solve actual observed failures.
  • No manual curation. You don't need to sift through hundreds of logs to find patterns. The system does that automatically.
  • Safe self-improvement. New skills are tested against a validation set before deployment. If a proposed skill makes things worse, it gets rejected.
  • Modular and composable. Each skill is a standalone unit. You can inspect, edit, or reuse them across different agents. This means you can audit what your agent learned and why.
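
As a rough illustration of the "safe self-improvement" idea, the sketch below compares per-example quality scores on a validation set with and without a proposed skill. The function name validation_gate, the max_drop tolerance, and the score dictionaries are hypothetical; the article does not describe how SkillClaw actually scores or thresholds skills.

```python
from typing import Dict, List, Tuple


def validation_gate(baseline: Dict[str, float],
                    candidate: Dict[str, float],
                    max_drop: float = 0.0) -> Tuple[bool, List[str]]:
    """Decide whether a proposed skill may ship.

    baseline / candidate map a validation example id to a quality score
    without / with the new skill. The skill is accepted only if no example
    drops by more than max_drop and at least one example improves.
    Returns (accepted, ids of regressed examples) so a rejection can be audited.
    """
    regressed = [ex for ex, base in baseline.items()
                 if base - candidate.get(ex, 0.0) > max_drop]
    improved = any(candidate.get(ex, 0.0) > base
                   for ex, base in baseline.items())
    return (not regressed and improved), sorted(regressed)


# Toy scores, invented for the example.
baseline = {"refund": 0.2, "shipping": 0.9, "hours": 0.8}
helps = {"refund": 0.9, "shipping": 0.9, "hours": 0.8}
hurts = {"refund": 0.9, "shipping": 0.4, "hours": 0.8}

print(validation_gate(baseline, helps))
print(validation_gate(baseline, hurts))
```

Returning the list of regressed examples alongside the decision keeps the gate auditable: a rejected skill comes with the exact cases it made worse.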

Practical use cases: chatbots that handle niche product questions without hand-writing every rule, coding assistants that pick up new library-specific patterns from developer chats, or customer support agents that automatically learn how to handle new product categories.

How to Try It

SkillClaw is available on GitHub. Here's the quick start:

git clone https://github.com/AMAP-ML/SkillClaw.git
cd SkillClaw
pip install -r requirements.txt

You'll need a running conversational agent (the repo supports several LLM backends) and a way to feed it real conversation logs. The README includes a full example using a simple FAQ bot. Start with the provided demo_conversations.json to see how SkillClaw detects gaps and generates new skills.

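from skillclaw import SkillClawEngine

# your_agent is a placeholder for the conversational agent you already run.
engine = SkillClawEngine(agent=your_agent, skill_store="skills.json")
# Analyze the logged conversations; learned skills end up in skills.json.
engine.learn_from_logs("conversations.json")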

After running, check the skills.json file to see what your agent learned. You can also manually approve or reject skills before they go live.
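
If you want to script that review step, something like the following could work. Treat it purely as a sketch: the article does not document the layout of skills.json, so the fields used here ("name", "status", "source_conversations") and the review_skills helper are assumptions, and the real file may look quite different.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def review_skills(path: Path, decide: Callable[[Dict], bool]) -> List[Dict]:
    """Mark every pending skill as approved or rejected and save the file.

    Assumes (hypothetically) that the store is a JSON list of objects
    with a "status" field; already-reviewed skills are left untouched.
    """
    skills = json.loads(Path(path).read_text(encoding="utf-8"))
    for skill in skills:
        if skill.get("status") == "pending":
            skill["status"] = "approved" if decide(skill) else "rejected"
    Path(path).write_text(json.dumps(skills, indent=2), encoding="utf-8")
    return skills


# Demo against a throwaway file with invented contents.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "skills.json"
    store.write_text(json.dumps([
        {"name": "refund_policy", "status": "pending",
         "source_conversations": ["c17", "c42"]},
        {"name": "small_talk", "status": "pending",
         "source_conversations": []},
        {"name": "opening_hours", "status": "approved",
         "source_conversations": ["c03"]},
    ]), encoding="utf-8")
    # Example policy: approve only skills backed by at least one conversation.
    reviewed = review_skills(
        store, decide=lambda s: len(s["source_conversations"]) > 0)
    statuses = {s["name"]: s["status"] for s in reviewed}

print(statuses)
```

The decide callback is where a human or a stricter policy plugs in; replacing the lambda with an interactive prompt would turn this into a manual approval pass.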

Final Thoughts

SkillClaw targets a real pain point. We all want agents that improve over time, but manual skill engineering doesn't scale. This project automates the dull part: finding patterns in failed conversations and turning them into reusable capabilities.

Is it perfect? Probably not yet. The quality of generated skills depends on the underlying LLM, and you'll want to keep an eye on validation thresholds. But as a foundation for self-improving agents, it's a promising direction. If you're building conversational AI and tired of manually patching your agent after every edge case, give it a spin. It might save you a lot of log reading.


Last updated: June 6, 2026 at 10:09 AM