---
name: seogent-seo
description: >
  Run SEO audits using the SEOgent CLI. Use when the user asks to
  check SEO, audit a website, find SEO issues, fix meta tags,
  optimize images, check structured data, or improve search rankings.
  Handles scanning, polling for results, and applying fixes.
---

# SEOgent SEO Scanner Skill

## Overview
This skill uses the SEOgent CLI to perform comprehensive SEO audits
on websites and guide the agent through fixing discovered issues.

## Reference Files
This skill directory contains additional resources — consult these
before your first scan:
- **`scan.sh`** — Ready-to-run shell script that handles the full
  scan → poll → results workflow automatically. Use as a reference
  for correct flag usage, status checking, and polling patterns.
- **`examples/sample-output.json`** — Realistic example of the full
  JSON response structure. Reference this to understand the shape of
  results before parsing real output.
- **`examples/fix-workflow.md`** — Step-by-step walkthrough of fixing
  site-wide vs page-specific issues with concrete CLI examples.
- **`examples/cross-domain-workflow.md`** — Walkthrough of scanning a
  live production site and applying fixes in a local development project.

## Prerequisites
- SEOgent CLI installed: `npm install -g seogent`
- API key set via `seogent auth <token>` or `export SEOGENT_API_KEY=sk_...`

## Scanning Workflow (3 steps)

Every scan follows this pattern. Run each step as a **separate command**.

### Step 1: Start the scan
```bash
seogent scan <domain> --quiet --dev
```
This returns JSON immediately with a `scan_id`. Extract it:
```json
{ "scan_id": "xxx", "status": "pending", ... }
```

Variations:
```bash
# Scan specific URLs
seogent scan <url1> --urls <url2> --urls <url3> --quiet --dev

# Include performance metrics (Core Web Vitals)
seogent scan <domain> --performance --quiet --dev

# Check for dead links and broken images
seogent scan <domain> --link-check --quiet --dev

# Limit discovered pages
seogent scan <domain> --max-pages 50 --quiet --dev
```

### Step 2: Poll for completion
The most efficient pattern is to combine a wait with the status check:
```bash
sleep 15 && seogent status <scan_id> --quiet --dev
```
This waits 15 seconds then polls in a single command.

Check the `status` field in the response:
- `"pending"`, `"crawling"`, or `"analyzing"` → run the combined sleep+status command again
- `"completed"` → move to step 3
- `"failed"` → report error to user

Do NOT use a bash while-loop — run each poll as its own command so
you can read the output.

### Step 3: Fetch results
```bash
seogent results <scan_id> --quiet --dev
```

### Filtering results
```bash
# Only pages with issues
seogent results <scan_id> --issues-only --quiet --dev

# Only high-severity and above
seogent results <scan_id> --min-severity high --quiet --dev

# Paginate large result set
seogent results <scan_id> --per-page 50 --cursor <next_cursor> --quiet --dev
```

### Handling large results
Results for sites with 30+ pages are typically 200KB–600KB+ and will
exceed context limits if read directly.

When results are large:
1. The CLI output will be auto-saved to a persisted file. Note the file path.
2. **Delegate analysis to a general-purpose subagent** using the Task tool
   with `subagent_type: "general-purpose"`. Give it the file path and a
   detailed prompt asking it to:
   - Summarize the `average_score` and `summary` breakdown (excellent/good/needs_work/poor)
   - Review `site_checks` for site-wide config issues
   - Identify **site-wide issues** — the same warning appearing on 80%+ of pages
     indicates a template or layout bug (fix once, improves everywhere)
   - Identify **page-specific issues** — warnings on only 1-2 pages
   - Return a **prioritized fix list** grouped by blast radius (site-wide first)
3. Use the subagent's analysis to present findings and apply fixes.

## Reading Results

All output is JSON on stdout. The full structure is:

```
{
  average_score,                    // Overall SEO score (0-100)
  summary {
    excellent,                      // Count of pages scoring 90+
    good,                           // Count of pages scoring 70-89
    needs_work,                     // Count of pages scoring 50-69
    poor                            // Count of pages scoring 0-49
  },
  site_checks {
    checks[{
      key,                          // e.g. "robots_txt", "sitemap"
      name,                         // Human-readable name
      status,                       // "passed" | "failed" | "warning"
      message,                      // Description of the finding
      category                      // e.g. "crawlability", "indexability"
    }],
    duplicate_titles {
      count, found,
      duplicates[]                  // Groups of pages sharing the same title
    },
    duplicate_descriptions {
      count, found,
      duplicates[]                  // Groups of pages sharing the same description
    }
  },
  top_issues[{
    issue,                          // Human-readable issue description
    count                           // Number of pages affected
  }],
  results {
    data[{
      url,                          // Page URL
      score,                        // Page score (0-100)
      grade,                        // Letter grade (A, B, C, D, F)
      failed_checks[],              // Strings — things that failed
      warnings[],                   // Strings — things that need attention
      all_checks[{
        name, key, status,          // Same fields as site_checks
        message, category, weight   // Plus weight for scoring impact
      }]
    }],
    next_cursor,                    // For pagination
    prev_cursor,
    per_page
  }
}
```

## Analysis Strategy

Before jumping to fixes, follow this process to interpret results:

1. **Check `site_checks` first** — these are site-wide config issues like
   missing robots.txt, missing sitemap, duplicate titles/descriptions.
   These are the highest-leverage fixes.

2. **Identify baseline pages** — find pages you expect to be "clean" (e.g.,
   the homepage or a well-maintained product page). If those pages still
   have warnings, those issues are likely unintentional template-level bugs.

3. **Categorize by scope**:
   - **Site-wide template issues**: The same warning on most pages means it's
     in the base layout or a shared partial. Fix once → improves every page.
   - **Page-specific issues**: Warning on only 1-2 pages means it's in that
     page's content or template. Fix individually.

4. **Check for intentional issues** — if the project documents deliberate SEO
   gaps (e.g., a test/demo site with known issues), cross-reference before
   fixing. Don't "fix" things that are intentionally configured that way.

5. **Prioritize by impact** — fix site-wide issues first (highest blast
   radius), then page-specific issues sorted by page importance/traffic.

## Applying Fixes

The SEOgent CLI identifies issues and provides fix recommendations.
The agent's job is to read these recommendations and apply fixes
directly in the codebase:

1. Run a scan and review the JSON results
2. For each issue, read the `fix` field for guidance
3. Locate the relevant source files in the project
4. Apply the recommended changes (meta tags, alt text, schema markup, etc.)
5. Re-scan only the affected URLs to verify the fixes

## Common High-Impact Fixes

These fixes consistently produce the largest score improvements, listed
in rough priority order:

1. **Canonical tags** — add self-referencing `<link rel="canonical">` tags
   to the base template. Generate the URL from the request URI so each page
   gets its own canonical automatically.

2. **Open Graph tags** — add `og:title`, `og:description`, `og:image` to
   the base template `<head>`. Use the page's existing title/description
   meta values so they stay in sync.

3. **JSON-LD structured data** — add schema.org markup:
   - `Product` schema for product pages
   - `Article` schema for blog posts
   - `Organization` schema for the company/about page
   - Match the schema type to the page content.

4. **Heading hierarchy** — ensure headings follow H1 → H2 → H3 with no
   skipped levels. Common mistake: using `<h4>` or `<h5>` in footers or
   navigation. Use `<p>`, `<span>`, or styled `<div>` elements instead.

5. **Image lazy loading** — add `loading="lazy"` to below-fold images.
   Do NOT add it to hero images or anything above the fold — lazy loading
   above-fold content hurts perceived performance.

6. **robots.txt + sitemap.xml** — ensure both exist and are accessible.

7. **Title length** — titles under 30 characters are flagged. Pad short
   titles with the brand name or descriptors (e.g., "Products" →
   "Products | Acme — Quality Widgets").

## Iterative Fix & Verify Workflow

After applying fixes, re-scan only the specific pages that had issues
to verify the fixes were applied correctly. This is faster and more
efficient than re-scanning the entire site.

### Step-by-step

1. Run the initial full-site scan (3-step async flow above).

2. Extract the URLs that had issues from the results JSON.

3. Apply fixes to the relevant source files in the codebase.

4. Re-scan only the affected URLs:
```bash
seogent scan https://example.com/page-with-issues \
  --urls https://example.com/another-page \
  --urls https://example.com/third-page \
  --quiet --dev
```

5. Poll status, then fetch results (steps 2-3 of scanning workflow).

6. If issues remain, repeat steps 3-5 until clean.

## Common Workflows

### Full audit with report
1. Start scan: `seogent scan <domain> --quiet --dev`
2. Poll: `sleep 15 && seogent status <scan_id> --quiet --dev` (repeat until completed)
3. Fetch: `seogent results <scan_id> --quiet --dev`
4. Analyze using the Analysis Strategy above
5. Present results using the format below
6. Offer to fix issues in the codebase

### Fix-and-verify cycle
1. Scan the site (async 3-step flow)
2. Apply fixes directly in the project source files
3. Re-scan only the affected URLs to verify
4. Repeat until all critical issues are resolved

### Performance-focused audit
Include Core Web Vitals in the scan:
```bash
seogent scan <domain> --performance --quiet --dev
```

### Dead link audit
Check for broken links and missing images across the site:
```bash
seogent scan <domain> --link-check --quiet --dev
```

## Presenting Results to User

When presenting scan results:

1. **Summary table** — show overall score, grade breakdown
   (excellent/good/needs_work/poor counts), and before/after comparison
   if this is a re-scan.

2. **Site-level issues first** — report findings from `site_checks`
   (robots.txt, sitemap, duplicate titles/descriptions). These affect
   the whole site and are highest priority.

3. **Group page issues by scope**:
   - **Site-wide template fixes** (high leverage) — issues appearing on
     most pages. Note how many pages each fix will improve.
   - **Individual page fixes** — issues specific to 1-2 pages.

4. **Note blast radius** — for each fix, state how many pages it will
   improve (e.g., "Adding canonical tags to the base template fixes
   this on all 32 pages").

5. **Always present the plan and ask before making changes** — don't
   start editing files without user approval.

6. **After fixes, re-scan and compare** — show a before/after score
   comparison to demonstrate improvement.

## Cross-Domain Workflow: Scan Live, Fix Local

Use this workflow when you scan a **production site** (e.g., `https://example.com`)
but need to apply fixes in a **local development project** (e.g., `https://example.test`).

See `examples/cross-domain-workflow.md` for a complete walkthrough.

### When to use this workflow
- The user asks to scan a live/production URL but the working directory contains
  the local source code for that site
- The scan domain differs from the local development domain
- The user explicitly says "scan X, fix locally" or similar

### Step 1: Establish domain mapping

Ask the user (or infer from context) two things:
1. **Scan target** — the live URL to scan (e.g., `https://example.com`)
2. **Local equivalent** — the local dev URL (e.g., `https://example.test`)

When processing scan results, replace the scan domain with the local domain
in all URLs. Use the URL **path** (not domain) to locate source files:
```
https://example.com/about  →  path: /about  →  find source file for /about
```

### Step 2: Detect project type

Check the working directory for signature files to determine the project type.
This tells you how URL paths map to source files.

| Project Type | Detection Signals | URL → File Mapping |
|---|---|---|
| **Laravel/Blade** | `artisan`, `routes/web.php`, no Inertia middleware | Check `routes/web.php` → controller → Blade view in `resources/views/` |
| **Laravel/Inertia** | `artisan`, `HandleInertiaRequests` middleware | Check `routes/web.php` → controller → `resources/js/Pages/{path}.tsx` |
| **Astro** | `astro.config.mjs` | `src/pages/{path}.astro` or `src/pages/{path}/index.astro` |
| **Next.js** | `next.config.js` or `next.config.mjs` | `app/{path}/page.tsx` or `pages/{path}.tsx` |
| **Static HTML** | `index.html` in root, no framework config | `{path}/index.html` or `{path}.html` |
| **Craft CMS** | `craft` binary, `templates/` dir | `templates/{path}.twig` or `templates/{path}/index.twig` |
| **WordPress** | `wp-config.php` | Theme template hierarchy in `wp-content/themes/{active-theme}/` |
| **Hugo** | `hugo.toml` or `config.toml` with Hugo config | `content/{path}.md` + `layouts/` for templates |
| **Eleventy** | `.eleventy.js` or `eleventy.config.js` | `src/{path}.njk` (or `.md`, `.html`) |

Detection steps:
1. Use Glob to check for signature files in the working directory root
2. If multiple signals match (e.g., `artisan` + `HandleInertiaRequests`), use the
   most specific match (Laravel/Inertia over plain Laravel)
3. If no signals match, ask the user what project type this is

### Step 3: Resolve source files

For each issue in the scan results, find the corresponding source file:

**Site-wide issues** (canonical tags, OG tags, base layout concerns):
1. Find the **base layout/template** file — this is the highest-value target:
   - Laravel/Blade: `resources/views/layouts/app.blade.php` or similar
   - Laravel/Inertia: the root layout component, often in `resources/js/Layouts/`
   - Astro: `src/layouts/Layout.astro` or `src/layouts/Base.astro`
   - Craft CMS: `templates/_layout.twig` or `templates/_base.twig`
   - Static HTML: the common `<head>` section (may need fixing in every file)
2. One fix to the base layout improves every page

**Page-specific issues** (missing meta description, heading hierarchy on one page):
1. Extract the URL path from the result (e.g., `/about`)
2. Use the project-type mapping above to find the source file
3. If path-based mapping fails, search for content strings from the scan
   (like the page `<title>` text) using Grep to locate the right file
4. If still ambiguous, ask the user which file corresponds to the URL

### Step 4: Apply fixes

Apply fixes to the resolved source files following the same approach as the
standard fix workflow. The only difference is you're working with local files
that serve a different domain than what was scanned.

### Step 5: Verify locally

After applying fixes, re-scan using the **local domain** to verify:
```bash
seogent scan https://example.test/about \
  --urls https://example.test/contact \
  --quiet --dev
```

This verifies fixes without deploying to production. Use the `--dev` flag
which is already standard for all scans.

### Fallback: when mapping is ambiguous

If you cannot confidently map a URL to a source file:
1. Search for unique content strings from that page using Grep
2. Check the routing configuration for the project type
3. If still unclear, **ask the user** — don't guess and modify the wrong file

## Gotchas

Things learned the hard way:

- **Results can include 404 pages** — the scanner discovers pages via
  internal links. If broken links exist, those 404 pages appear in
  results with low scores. These inflate "needs work" counts but aren't
  fixable SEO pages — fix the broken links instead.

- **Large results exceed context** — sites with 30+ pages produce
  200KB–600KB of JSON. Always delegate large result analysis to a
  subagent (see "Handling large results" above).
