SEOgent is an automated SEO auditing platform. You give it a URL, it crawls the site, analyzes every page against 30+ checks, and returns structured results that both humans and AI agents can act on.
This doc covers what happens under the hood — from initial crawl to final report — so you understand exactly what SEOgent is checking and how to use the results.
Every scan moves through four stages:
Pending → Crawling → Analyzing → Completed
If credits run out mid-scan, you'll still get partial results for every page analyzed up to that point.
SEOgent supports two crawl modes that determine how it builds the list of URLs to scan.
Discover mode automatically finds pages on your site. The process starts with your robots.txt: SEOgent looks for a `Sitemap:` directive and uses the referenced sitemap to build the URL list.

This is the recommended mode for most sites. Set a `max_pages` limit to control scope — useful for large sites where you want to audit a representative sample rather than every page.
Manual mode skips discovery entirely. You provide an explicit list of URLs and SEOgent crawls only those pages. This is useful when you want to audit a specific, known set of pages rather than the whole site.
Before crawling begins, SEOgent runs checks against your site's configuration. These aren't tied to any single page — they apply to the entire domain.
SEOgent fetches and parses your robots.txt to check:
| Check | What It Looks For |
|---|---|
| robots.txt exists | Is the file accessible at `/robots.txt`? |
| No blanket block | Does `User-agent: *` with `Disallow: /` block all crawlers? |
| Search engines allowed | Are Googlebot and Bingbot free of explicit blocks? |
| Sitemap directive | Does robots.txt reference a sitemap? |
| Reasonable crawl delay | Is `Crawl-delay` absent or set to 10 seconds or less? |
| AI bot access | Are GPTBot, ClaudeBot, ChatGPT-User, Google-Extended, and PerplexityBot allowed? |
The AI bot access check is particularly relevant for GEO (Generative Engine Optimization). If your robots.txt blocks AI crawlers, your content won't appear in AI-generated answers regardless of how well-optimized it is.
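As an illustration, a robots.txt that keeps AI crawlers enabled while still protecting a private path might look like this (a sketch — the paths and policy are examples, not recommendations for every site):

```
# Explicitly allow AI crawlers
User-agent: GPTBot
User-agent: ClaudeBot
Allow: /

# Default rules for all other crawlers
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
```

Grouping several `User-agent` lines above one rule set is valid robots.txt syntax, so AI bots can share a policy without repetition.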
SEOgent verifies that your sitemap is accessible. It checks the URL from robots.txt first, then probes /sitemap.xml and /sitemap_index.xml as fallbacks.
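The probe order described above can be sketched as a small function — `sitemap_candidates` is an illustrative name, not part of SEOgent's API:

```python
from urllib.parse import urljoin

def sitemap_candidates(base_url, robots_sitemap=None):
    """Ordered list of sitemap URLs to probe: the robots.txt
    Sitemap: directive first, then the conventional fallbacks."""
    candidates = []
    if robots_sitemap:
        candidates.append(robots_sitemap)
    for path in ("/sitemap.xml", "/sitemap_index.xml"):
        url = urljoin(base_url, path)
        if url not in candidates:  # don't probe the same URL twice
            candidates.append(url)
    return candidates

print(sitemap_candidates("https://example.com",
                         robots_sitemap="https://example.com/custom-map.xml"))
```

The robots.txt URL always wins; the fallbacks are only probed if it is missing or distinct.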
Once all pages are analyzed, SEOgent scans for:

- Duplicate `<title>` tags
- Duplicate meta descriptions

These are grouped by the duplicated value, showing which URLs share it. Duplicate titles and descriptions dilute your search presence and confuse both users and search engines about which page to rank.
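The grouping itself is straightforward; a minimal sketch (the `find_duplicates` helper is illustrative, not SEOgent's internal API):

```python
from collections import defaultdict

def find_duplicates(pages):
    """Group URLs by a shared value (e.g. <title> text) and keep
    only values that appear on more than one page."""
    groups = defaultdict(list)
    for url, value in pages.items():
        groups[value.strip()].append(url)
    return {value: urls for value, urls in groups.items() if len(urls) > 1}

titles = {
    "/": "Acme Widgets",
    "/products/blue": "Buy Widgets",
    "/products/red": "Buy Widgets",
}
print(find_duplicates(titles))  # {'Buy Widgets': ['/products/blue', '/products/red']}
```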
Every crawled page is analyzed against checks organized into categories. Each check has a weight (how much it affects the overall score) and a severity (error vs. warning).
| Check | What It Looks For | Weight |
|---|---|---|
| Title tag exists | Page has a `<title>` tag | 10 |
| Title length | Between 30–60 characters | 5 |
| Meta description exists | Page has `<meta name="description">` | 10 |
| Meta description length | Between 70–160 characters | 5 |
| Canonical URL | `<link rel="canonical">` is present | 5 |
| Viewport meta tag | Mobile-responsive viewport is set | 5 |
| Language attribute | `<html lang="...">` is set | 3 |
| Open Graph tags | `og:title`, `og:description`, `og:image` present | 3 |
| Check | What It Looks For | Weight |
|---|---|---|
| H1 heading | Page has an H1 heading | 10 |
| Single H1 | No more than one H1 on the page | 5 |
| Heading hierarchy | No skipped levels (e.g., H1 → H3 without H2) | 4 |
| Content length | At least 300 words of body text | 8 |
| Internal links | Page links to other pages on the same domain | 5 |
| Image alt attributes | All images have descriptive alt text | 8 |
| Image lazy loading | Images use `loading="lazy"` (checked when 2+ images) | 3 |
| Orphan anchors | No `<a>` tags without `href` attributes | 3 |
| Check | What It Looks For | Weight |
|---|---|---|
| HTTPS | Page is served over a secure connection | 5 |
| HTML5 DOCTYPE | `<!DOCTYPE html>` is present | 2 |
| Character encoding | `<meta charset>` or Content-Type header | 2 |
| No noindex | Page isn't accidentally blocked from indexing | 8 |
| No redirect chains | Fewer than 2 redirects to reach the page | 7 |
| HTTP status code | Page returns 2xx, not 4xx or 5xx | 10 |
| SEO-friendly URL | Short path, hyphens not underscores, few query params | 3 |
| Hreflang tags | International language targeting tags are present | 2 |
| Check | What It Looks For | Weight |
|---|---|---|
| JSON-LD exists | At least one `<script type="application/ld+json">` block | 5 |
| Valid JSON | All JSON-LD blocks contain parseable JSON | 5 |
| Has @type | Every block defines a schema type | 3 |
| Rich-result eligible | Uses types Google supports (Article, Product, FAQPage, etc.) | 2 |
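For reference, a minimal block that would satisfy all four checks might look like this — the values are illustrative, not taken from SEOgent's docs:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "What Is Generative Engine Optimization?",
  "datePublished": "2025-01-15",
  "author": { "@type": "Person", "name": "Jane Doe" }
}
</script>
```

It exists, parses as JSON, declares an `@type`, and uses a rich-result-eligible type (`Article`).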
These checks evaluate how well your content is structured for AI-generated answers and featured snippets. They only run on pages with 300+ words of content.
| Check | What It Looks For | Weight |
|---|---|---|
| Answer blocks | FAQ accordions, question headings, or definition lists | 5 |
| FAQ/HowTo schema | FAQPage, HowTo, or Question/Answer schema types in JSON-LD | 5 |
| Data tables | Structured comparison tables with headers and multiple rows | 3 |
Answer blocks matter because search engines and AI models extract direct answers from well-structured content. A page with a clear <h2>What is X?</h2> followed by a concise paragraph is far more likely to be quoted in an AI answer or featured snippet than a wall of unstructured text.
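A sketch of the shape the check rewards — a question heading followed by a direct, self-contained answer:

```html
<h2>What is structured data?</h2>
<p>Structured data is machine-readable markup (usually JSON-LD) that
describes a page's content so search engines and AI models can
interpret it without guessing.</p>
```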
When link checking is enabled, SEOgent verifies every link and image on each page:
| Check | What It Looks For | Weight |
|---|---|---|
| No dead links | Internal and external links return valid responses | 8 |
| No broken images | Images return 2xx status codes (not 4xx/5xx) | 7 |
The crawler makes HEAD requests to external URLs and cross-references internal links against all crawled pages. Results distinguish between internal dead links (pages on your site that 404) and external dead links (third-party URLs that no longer work).
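The classification step can be sketched as a pure function — names and the "unknown external counts as dead" choice are assumptions of this sketch, not SEOgent's documented behavior:

```python
from urllib.parse import urlparse

def classify_dead_links(page_links, crawled_ok, external_status, domain):
    """Split a page's dead links into internal (on-site URLs not found
    among successfully crawled pages) and external (third-party URLs
    whose HEAD request did not return a 2xx/3xx status)."""
    internal_dead, external_dead = [], []
    for link in page_links:
        if urlparse(link).netloc == domain:
            if link not in crawled_ok:
                internal_dead.append(link)
        else:
            status = external_status.get(link, 0)  # unknown -> treated as dead
            if not (200 <= status < 400):
                external_dead.append(link)
    return internal_dead, external_dead
```

For example, an on-site URL absent from the crawled set lands in the internal bucket, while a third-party URL that returned 404 lands in the external one.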
Each check carries a weight between 2 and 10. The page score is calculated as:
score = (sum of passed check weights / total check weights) × 100
A page that passes all checks scores 100. High-weight checks like "has title" (10) and "has H1" (10) have more impact than low-weight checks like "has hreflang" (2).
Checks that fail as warnings (like title length being slightly off) still reduce the score, but they're presented separately from hard errors (like a missing title entirely) so you can prioritize fixes.
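The scoring formula above is simple enough to encode directly — the check dictionaries here use assumed field names for illustration:

```python
def page_score(checks):
    """checks: [{'key': str, 'weight': int, 'passed': bool}, ...]
    Score = passed weight / total weight, scaled to 0-100."""
    total = sum(c["weight"] for c in checks)
    passed = sum(c["weight"] for c in checks if c["passed"])
    return round(passed / total * 100) if total else 0

checks = [
    {"key": "title_exists", "weight": 10, "passed": True},
    {"key": "h1_exists",    "weight": 10, "passed": True},
    {"key": "hreflang",     "weight": 2,  "passed": False},
]
print(page_score(checks))  # 91 -- the failed 2-point check costs ~9 points
```

Note how the low-weight hreflang failure barely moves the score, while failing a 10-point check would cost far more.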
When performance scanning is enabled, SEOgent measures Core Web Vitals and Lighthouse metrics for your pages.
| Metric | What It Is |
|---|---|
| LCP (Largest Contentful Paint) | How long until the main content is visible |
| FCP (First Contentful Paint) | Time to first rendered content |
| CLS (Cumulative Layout Shift) | Visual stability — how much the page shifts during load |
| INP (Interaction to Next Paint) | Responsiveness to user input |
| TTFB (Time to First Byte) | Server response time |
| TBT (Total Blocking Time) | How long the main thread is blocked |
| SI (Speed Index) | How quickly visible content fills the viewport |
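As a rough triage for these metrics, the widely published "good" thresholds from Google's Core Web Vitals guidance can be encoded directly — these numbers come from that public guidance, not from SEOgent's own grading bands, which may differ:

```python
# "Good" thresholds per Google's Core Web Vitals guidance (assumption:
# SEOgent may use different bands).
GOOD_THRESHOLDS = {
    "LCP": 2500,   # ms
    "FCP": 1800,   # ms
    "CLS": 0.1,    # unitless layout-shift score
    "INP": 200,    # ms
    "TTFB": 800,   # ms
}

def rate(metric, value):
    """Return 'good' if the measured value meets the threshold."""
    return "good" if value <= GOOD_THRESHOLDS[metric] else "needs improvement"

print(rate("LCP", 1900))  # good
print(rate("CLS", 0.25))  # needs improvement
```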
Running Lighthouse on every page of a large site would be wasteful. Most pages built from the same template perform similarly — /products/blue-widget and /products/red-widget likely share the same layout, CSS, and JavaScript.
SEOgent uses pattern matching to group URLs by their template structure, then tests one representative page per group. Here's how it works:
| URL | Detected Pattern | Group |
|---|---|---|
| `/` | `/` | Homepage |
| `/products/blue-widget` | `/products/{slug}` | Product page |
| `/products/red-widget` | `/products/{slug}` | Product page (same group) |
| `/blog/2025/01/my-post` | `/blog/{year}/{month}/{slug}` | Blog post |
| `/blog/2025/03/other-post` | `/blog/{year}/{month}/{slug}` | Blog post (same group) |
| `/users/42` | `/users/{id}` | User profile |
| `/users/187` | `/users/{id}` | User profile (same group) |
The pattern matcher recognizes:

- Hyphenated slugs like `my-great-post`
- Numeric IDs like `42`
- UUIDs in the 8-4-4-4-12 hex format
- Sibling inference: if `/category/esports` and `/category/gaming` both exist, it infers `{slug}` even without hyphens

The result: a site with 500 pages might only need 8–12 performance tests. The homepage is always included, and groups are sorted alphabetically for consistency.
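A simplified sketch of segment-level pattern matching — it handles numeric IDs, UUIDs, and hyphenated slugs, but not the year/month distinction or sibling inference SEOgent also performs:

```python
import re

UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", re.I
)

def url_pattern(path):
    """Collapse a URL path into a template: numeric segments become
    {id}, UUIDs become {uuid}, hyphenated slugs become {slug}."""
    parts = []
    for seg in path.strip("/").split("/"):
        if not seg:
            continue
        if seg.isdigit():
            parts.append("{id}")
        elif UUID_RE.match(seg):
            parts.append("{uuid}")
        elif "-" in seg:
            parts.append("{slug}")
        else:
            parts.append(seg)  # literal segment, e.g. "products"
    return "/" + "/".join(parts) if parts else "/"

print(url_pattern("/products/blue-widget"))  # /products/{slug}
print(url_pattern("/users/42"))              # /users/{id}
print(url_pattern("/"))                      # /
```

Pages whose paths collapse to the same template share a group, and one representative per group gets a performance test.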
SEOgent's structured output is designed to be actionable for both human developers and AI coding agents.
The SEOgent CLI and REST API return machine-readable JSON that agents can parse and act on directly. A typical agent workflow:

1. Start a scan with `seogent scan https://example.com` or `POST /api/scans`
2. Parse the structured results for failed checks
3. Apply a fix — for example, adding a missing `<title>` tag to the layout
4. Re-scan to confirm the fix

Every check result includes a key, category, weight, and status — so agents can programmatically decide what to fix first and how to verify the fix worked.
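A prioritization sketch using that shape — the field values below are illustrative, not a real API response:

```python
# Hypothetical check results; field names follow the key/category/
# weight/status description above.
results = [
    {"key": "title_exists", "category": "metadata", "weight": 10,
     "status": "fail", "severity": "error"},
    {"key": "title_length", "category": "metadata", "weight": 5,
     "status": "fail", "severity": "warning"},
    {"key": "h1_exists", "category": "content", "weight": 10,
     "status": "pass", "severity": "error"},
]

# Fix hard errors before warnings; within each tier, heaviest weight first.
failed = [r for r in results if r["status"] == "fail"]
failed.sort(key=lambda r: (r["severity"] != "error", -r["weight"]))
for r in failed:
    print(r["key"])  # title_exists, then title_length
```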
The web dashboard presents the same results for human review, grouped by check category.
SEOgent's checks map to three overlapping optimization strategies:
SEO (Search Engine Optimization) — The core checks. Titles, descriptions, headings, internal links, canonical URLs, structured data, and technical health. These directly affect how search engines crawl, index, and rank your pages.
AEO (Answer Engine Optimization) — The answer block and schema checks. FAQ accordions, question headings, definition lists, data tables, and FAQ/HowTo schema give AI models and featured-snippet algorithms structured content to extract answers from. A page that clearly answers "What is X?" with a concise paragraph under an H2 is far more likely to be cited.
GEO (Generative Engine Optimization) — The AI bot access check and structured data checks. If your robots.txt blocks ClaudeBot or GPTBot, your content is invisible to AI training and retrieval systems. JSON-LD structured data helps AI models understand what your page is about, not just what text it contains.
For automated workflows, you can provide a webhook URL when starting a scan. SEOgent sends a POST request when the scan completes, including the scan ID, domain, average score, and a summary of results. This lets you trigger downstream actions — Slack notifications, CI pipeline steps, or agent-based fix workflows — without polling.
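A receiving-side sketch — the payload field names here are assumed for illustration, not SEOgent's documented webhook schema:

```python
import json

# Hypothetical webhook payload; real field names may differ.
payload = json.loads("""{
  "scan_id": "abc123",
  "domain": "example.com",
  "average_score": 72,
  "summary": {"errors": 14, "warnings": 31}
}""")

# Example downstream action: fail a CI step when the score drops
# below a team-chosen threshold.
THRESHOLD = 80
if payload["average_score"] < THRESHOLD:
    print(f"SEO score {payload['average_score']} below {THRESHOLD}: failing build")
```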
Scans are priced per page. Standard HTML analysis is the base cost, with optional add-ons for link checking and performance scanning. Performance scans use intelligent page grouping, so a 500-page site might only incur performance costs for 8–12 representative pages. See Pricing for current rates.