# Review a skill against best practices

Skills provide procedural knowledge and specific workflows that agents load when relevant. This guide explains how to run **skill reviews** against best practices; the next page covers using the automated **optimize** option to address issues before deploying a skill to your team.

## TL;DR

Compare your skill against best practices. For example:

* Examine the skill's description, determine whether the wording is effective, and assess how that affects activation.

## Why review skills?

Skills encode team knowledge and workflows. Skill reviews help you:

* Assess if skills conform to the skills standard
* Validate skill content quality (how likely skill is to help) before deploying to your team
* Validate skill description quality (how likely skill is to activate) before deploying to your team

## Viewing skill reviews

In the [Tessl Registry](https://tessl.io/registry), skill reviews show multiple scores.

**Example**: [React development skill review](https://tessl.io/skills/github/softaworks/agent-toolkit/react-dev/review?showAll=)

**Review Score**: Overall quality assessment (0-100%)

* Weighted average of the three subcomponent scores below

**Validation Checks**: Validates that skill follows the criteria at <https://agentskills.io/specification>

* Checks covering line count, frontmatter, schema, license and metadata
* Pass/warning/fail deterministic grading
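For reference, here is a minimal SKILL.md frontmatter sketch that would satisfy the checks above. The field values are illustrative only; see the [specification](https://agentskills.io/specification) for the authoritative schema:

```yaml
---
name: debug-api-endpoints
description: >-
  Tests API endpoints by sending HTTP requests and validating responses.
  Use when debugging REST APIs, testing endpoints, or inspecting HTTP
  request/response cycles.
license: MIT
---
```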

**Implementation Score**: LLM-as-a-judge review of the SKILL.md body, graded on:

* Conciseness
* Actionability
* Workflow clarity
* Progressive disclosure

**Activation Score**: LLM-as-a-judge review of the description, assessing how likely agents are to use the skill, graded on:

* Specificity
* Completeness
* Trigger Term Quality
* Distinctiveness / Conflict Risk

Each skill review includes detailed validation results showing what passed, what needs improvement, and specific recommendations.

**What scores mean:**

* **90%+ Review Score**: Skill conforms well to best practices
* **70-89% Review Score**: Good skill; minor improvements may be needed
* **Below 70%**: Likely needs work before deployment

Use these scores to choose quality skills and identify what to fix in your own skills before publishing.

## Automatic review on publish

When you publish a skill to the registry using `tessl skill publish`, skill reviews run automatically:

```bash
# Publish skill with automatic skill review
tessl skill publish ./<path to skill>
```

**What happens automatically:**

* Skill is linted for format and structure
* Skill review is performed
* Review scores are calculated and displayed in the registry

## Reviewing skills locally

Before publishing skills, validate them locally:

```bash
# Validate skill format and structure
tessl skill lint ./<path to skill folder>

# Get detailed quality review
tessl skill review ./<path to SKILL.md folder>
```

Fix any issues locally, then publish. The registry will show updated review scores. To fix issues surfaced by skill reviews automatically, use the **optimize** option.

Here's a sample output of `tessl skill review` in the CLI:

```shell
$ tessl skill review skills/debug-api-endpoints/SKILL.md

Validation Checks

  ✔ skill_md_line_count - SKILL.md line count is 152 (<= 500)
  ✔ frontmatter_valid - YAML frontmatter is valid
  ✔ name_field - 'name' field is valid: 'debug-api-endpoints'
  ✔ description_field - 'description' field is valid (59 chars)
  ✔ description_voice - 'description' uses third person voice
  ⚠ description_trigger_hint - Description may be missing an explicit 'when to use' trigger hint (e.g., 'Use when...')
  ✔ compatibility_field - 'compatibility' field not present (optional)
  ✔ allowed_tools_field - 'allowed-tools' field not present (optional)
  ⚠ metadata_version - 'metadata' field is not a dictionary
  ✔ metadata_field - 'metadata' field not present (optional)
  ⚠ license_field - 'license' field is missing
  ✔ frontmatter_unknown_keys - No unknown frontmatter keys found
  ✔ body_present - SKILL.md body is present
  ✔ body_examples - Examples detected (code fence or 'Example' wording)
  ✔ body_output_format - Output/return/format terms detected
  ✔ body_steps - Step-by-step structure detected (ordered list)

Overall: PASSED (0 errors, 3 warnings)

Judge Evaluation

  Description: 22%
    specificity: 1/4 - The description uses vague language ('test or debug API endpoints systematically') without listing any concrete actions like 'send requests', 'validate responses', 'check status codes', or 'inspect headers'.
    trigger_term_quality: 2/4 - Contains some relevant keywords ('test', 'debug', 'API endpoints') that users might say, but misses common variations like 'REST', 'HTTP requests', 'curl', 'postman', 'API calls', or specific verbs like 'hit an endpoint'.
    completeness: 1/4 - The description only addresses 'when' (framed as a trigger) but completely lacks the 'what' - it never explains what capabilities or actions the skill provides. This inverts the typical problem but still fails completeness.
    distinctiveness_conflict_risk: 2/4 - The API testing domain is somewhat specific, but 'test or debug' is broad enough to potentially conflict with general debugging skills, testing frameworks, or other API-related skills without clear differentiation.

    Assessment: This description is structured as a 'when' clause only, completely omitting what the skill actually does. While it identifies a reasonable use case (API testing/debugging), it lacks concrete actions, comprehensive trigger terms, and sufficient detail to distinguish it from other testing or API-related skills.

    Suggestions:
      - Add specific capabilities the skill provides, e.g., 'Send HTTP requests, inspect response headers and bodies, validate status codes, test authentication flows'
      - Expand trigger terms to include natural variations: 'REST API', 'HTTP requests', 'curl', 'API calls', 'endpoints', 'request/response'
      - Restructure to lead with 'what' then 'when': 'Tests API endpoints by sending HTTP requests and validating responses. Use when debugging REST APIs, testing endpoints, or inspecting HTTP request/response cycles.'

  Content: 77%
    conciseness: 2/4 - The content is reasonably efficient but includes some unnecessary explanation that Claude would already know (e.g., explaining what 401/403/404 mean, basic concepts like 'XSS attempts blocked'). The structure is good but could be tightened.
    actionability: 3/4 - Provides concrete, executable curl commands with expected outputs for each testing phase. The examples are copy-paste ready and cover specific scenarios with clear expected results.
    workflow_clarity: 3/4 - Clear 4-step sequential workflow with explicit ordering (Infrastructure → Security → Validation → Functionality). The troubleshooting section provides a decision tree for debugging, and the 'Test in order' best practice reinforces the validation checkpoint approach.
    progressive_disclosure: 2/4 - Content is well-organized with clear sections and headers, but it's a monolithic document that could benefit from splitting detailed test cases into separate reference files. For a skill of this length (~150 lines), some content like the detailed troubleshooting tips could be externalized.

    Assessment: This is a solid, actionable skill with a clear systematic workflow and executable examples. The main weakness is verbosity - it explains concepts Claude already knows (HTTP status codes, basic security concepts) and could be more concise. The structure is excellent but the document length suggests some content could be split into reference files.

    Suggestions:
      - Remove explanatory text for concepts Claude knows (e.g., 'Request without Authorization header' after 'No token returns 401' is redundant)
      - Consider extracting the detailed test cases for each step into a separate TESTS.md reference file, keeping SKILL.md as a concise overview
      - Consolidate the 'What to check' code blocks - some curl examples are repetitive and could be reduced to one representative example per step

Average Score: 50%
```
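If you want to act on a review result in a script, one option is to parse the summary line. The sketch below assumes the CLI output ends with an `Average Score: NN%` line, as in the sample above, and uses the 70% threshold from the score guidance earlier on this page. The parsing is a sketch, not a supported output contract:

```shell
# Simulated review output; in practice, capture it with:
#   review_output=$(tessl skill review ./<path to SKILL.md folder>)
review_output="Average Score: 50%"

# Extract the numeric score from the summary line
score=$(printf '%s\n' "$review_output" | grep -oE 'Average Score: [0-9]+' | grep -oE '[0-9]+')

if [ "${score:-0}" -ge 70 ]; then
  echo "Score ${score}% - OK to publish"
else
  echo "Score ${score}% is below 70 - address the suggestions before publishing"
fi
```

Note that a machine-readable output format, if the CLI offers one, would be more robust than matching the human-readable summary.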

## Next steps

* [Optimize a skill using best practices](/evaluate/optimize-a-skill-using-best-practices.md) - Automate fixing issues in your skill
* [Evaluate skill quality using scenarios](/evaluate/evaluate-skill-quality-using-scenarios.md) - Evaluate skill effect on agent performance
* [Evaluating documentation](/evaluate/evaluating-documentation.md) - Measure documentation effectiveness
* [Creating skills](/create/creating-skills.md) - Write better skills
* [Publishing skills](/distribute/distributing-via-registry.md) - Share reviewed skills

## Related resources

* [Agent Skills Specification](https://agentskills.io/specification)
* [Glossary: Evaluations](/reference/glossary.md#evaluations-evals)
* [Glossary: Skills](/reference/glossary.md#skill)


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.tessl.io/evaluate/evaluating-skills.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
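For example, from a shell you could URL-encode a question and issue the request with `curl`. This is a sketch: the question text is hypothetical, and `python3` is used here only as a convenient URL encoder:

```shell
question="How do I fix a low activation score?"

# URL-encode the question (spaces, punctuation, etc.)
encoded=$(python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1]))' "$question")

url="https://docs.tessl.io/evaluate/evaluating-skills.md?ask=${encoded}"
echo "$url"
# curl -s "$url"   # uncomment to send the GET request
```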
