Build an AI Accessibility Auditor in 30 Minutes

Create an automated WCAG 2.2 compliance checker using Claude and Playwright to catch accessibility issues before deployment.

Problem: Manual Accessibility Audits Miss Critical Issues

You ship a feature, then get complaints that screen readers break or keyboard navigation fails. Manual WCAG checks are slow and incomplete.

You'll learn:

  • Build an AI agent that audits WCAG 2.2 Level AA compliance
  • Automate detection of context-specific a11y issues
  • Integrate real-time audits into your CI/CD pipeline

Time: 30 min | Level: Intermediate


Why This Happens

Traditional a11y tools like axe-core catch ~30-40% of WCAG violations. They miss context-dependent issues like:

  • Misleading alt text that's technically present but meaningless
  • Color contrast that passes ratios but fails real-world readability
  • ARIA labels that don't match actual UI behavior
  • Focus order that's technically valid but unusable

Common symptoms:

  • Lighthouse scores 100 but users report navigation problems
  • Alt text exists but doesn't describe image purpose
  • Keyboard traps in complex component interactions
  • Screen reader announces incorrect state changes
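
To see why these slip through, consider alt text. A string heuristic (a hypothetical sketch, not part of the auditor built below) can flag obviously generic values, but it cannot tell whether a plausible-sounding description actually matches the image; that contextual judgment is what the AI layer adds.

```typescript
// Hypothetical heuristic: catches filename-like or generic alt text.
// It cannot judge whether "Team photo" truly describes the image;
// that requires visual context.
function isSuspiciousAltText(alt: string): boolean {
  const trimmed = alt.trim();
  if (trimmed.length < 4) return true;                           // "img", "x", ""
  if (/^(image|img|photo|picture|icon|logo)\d*$/i.test(trimmed)) return true;
  if (/\.(png|jpe?g|gif|svg|webp)$/i.test(trimmed)) return true; // filename leaked into alt
  return false;
}
```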

Solution

We'll build an AI agent using Claude Sonnet 4.5 with computer vision and DOM analysis that:

  1. Captures page screenshots and HTML structure
  2. Tests interactive flows (keyboard nav, screen reader simulation)
  3. Provides WCAG-referenced fixes with code examples

Step 1: Install Dependencies

npm install @anthropic-ai/sdk playwright @axe-core/playwright
npx playwright install chromium

Why Playwright? Headless browser automation for real DOM interaction and screenshots.


Step 2: Create the Audit Agent

// accessibility-audit.ts
import Anthropic from '@anthropic-ai/sdk';
import { chromium } from 'playwright';
import AxeBuilder from '@axe-core/playwright';
import * as fs from 'fs';

interface AuditResult {
  wcagLevel: 'A' | 'AA' | 'AAA';
  violations: Violation[];
  aiInsights: string;
  screenshot: string;
}

interface Violation {
  rule: string;
  wcagCriteria: string;
  severity: 'critical' | 'serious' | 'moderate' | 'minor';
  element: string;
  issue: string;
  fix: string;
}

async function auditAccessibility(url: string): Promise<AuditResult> {
  const browser = await chromium.launch();
  const page = await browser.newPage();
  
  try {
    await page.goto(url, { waitUntil: 'networkidle' });
    
    // Run axe-core baseline scan
    const axeResults = await new AxeBuilder({ page })
      .withTags(['wcag2a', 'wcag2aa', 'wcag22aa'])
      .analyze();
    
    // Capture page state for AI analysis
    const screenshot = await page.screenshot({ 
      fullPage: true,
      type: 'png'
    });
    const htmlContent = await page.content();
    const title = await page.title();
    
    // Test keyboard navigation
    const keyboardIssues = await testKeyboardNav(page);
    
    // Send to Claude for contextual analysis
    const aiAnalysis = await analyzeWithClaude(
      screenshot,
      htmlContent,
      axeResults,
      keyboardIssues,
      title
    );
    
    return {
      wcagLevel: 'AA',
      violations: [...mapAxeViolations(axeResults), ...aiAnalysis.violations],
      aiInsights: aiAnalysis.insights,
      screenshot: screenshot.toString('base64')
    };
    
  } finally {
    await browser.close();
  }
}

async function testKeyboardNav(page: any): Promise<string[]> {
  const issues: string[] = [];
  
  // Tab through all interactive elements
  const focusableElements = await page.locator(
    'a, button, input, select, textarea, [tabindex]:not([tabindex="-1"])'
  ).all();
  
  for (let i = 0; i < Math.min(focusableElements.length, 20); i++) {
    await page.keyboard.press('Tab');
    await page.waitForTimeout(100);
    
    // Check if focus is visible
    const focused = await page.evaluate(() => {
      const el = document.activeElement;
      if (!el || el === document.body) return null;
      
      const styles = window.getComputedStyle(el);
      // Check outlineStyle, not the `outline` shorthand: the computed
      // shorthand includes color and width, so it never equals 'none'.
      // Box-shadow-based focus styles will still be flagged; verify manually.
      const hasFocusStyle = styles.outlineStyle !== 'none' &&
                            styles.outlineWidth !== '0px';
      
      return {
        tag: el.tagName,
        hasVisibleFocus: hasFocusStyle,
        text: el.textContent?.slice(0, 50)
      };
    });
    
    if (focused && !focused.hasVisibleFocus) {
      issues.push(`No visible focus indicator on ${focused.tag}: "${focused.text}"`);
    }
  }
  
  return issues;
}

async function analyzeWithClaude(
  screenshot: Buffer,
  html: string,
  axeResults: any,
  keyboardIssues: string[],
  pageTitle: string
): Promise<{ violations: Violation[], insights: string }> {
  const anthropic = new Anthropic({
    apiKey: process.env.ANTHROPIC_API_KEY
  });
  
  // Truncate HTML to fit context window
  const truncatedHtml = html.slice(0, 50000);
  
  const message = await anthropic.messages.create({
    model: 'claude-sonnet-4-5-20250929', // Claude Sonnet 4.5, as used throughout this guide
    max_tokens: 4000,
    messages: [{
      role: 'user',
      content: [
        {
          type: 'image',
          source: {
            type: 'base64',
            media_type: 'image/png',
            data: screenshot.toString('base64')
          }
        },
        {
          type: 'text',
          text: `You are an expert WCAG 2.2 Level AA auditor. Analyze this webpage for accessibility issues.

Page: ${pageTitle}

AUTOMATED TOOL FINDINGS:
${JSON.stringify(axeResults.violations.slice(0, 5), null, 2)}

KEYBOARD NAVIGATION ISSUES:
${keyboardIssues.join('\n')}

HTML SAMPLE:
${truncatedHtml}

Analyze for issues that automated tools miss:

1. **Alt text quality**: Are descriptions meaningful? Do they convey purpose?
2. **Color contrast in context**: Does text remain readable in real use?
3. **ARIA semantics**: Do labels match actual behavior?
4. **Focus management**: Is the tab order logical for task completion?
5. **Error handling**: Are form errors announced and associated correctly?

Return JSON with this structure:
{
  "violations": [
    {
      "rule": "meaningful-alt-text",
      "wcagCriteria": "1.1.1 Non-text Content (Level A)",
      "severity": "serious",
      "element": "CSS selector",
      "issue": "Description of problem",
      "fix": "Specific code fix"
    }
  ],
  "insights": "2-3 sentence summary of biggest issues"
}

Focus on actionable findings with code fixes.`
        }
      ]
    }
  });
  
  // Parse Claude's response
  const responseText = message.content[0].type === 'text' 
    ? message.content[0].text 
    : '';
  
  // Extract JSON from response (Claude might wrap it in markdown fences)
  const jsonMatch = responseText.match(/\{[\s\S]*\}/);
  try {
    return jsonMatch ? JSON.parse(jsonMatch[0]) : { violations: [], insights: '' };
  } catch {
    // Malformed JSON from the model: fail soft instead of crashing the audit
    return { violations: [], insights: responseText.slice(0, 500) };
  }
}

function mapAxeViolations(axeResults: any): Violation[] {
  return axeResults.violations.map((v: any) => ({
    rule: v.id,
    wcagCriteria: v.tags.filter((t: string) => t.startsWith('wcag')).join(', '),
    severity: v.impact ?? 'moderate', // axe can report a null impact
    element: v.nodes[0]?.target[0] || 'unknown',
    issue: v.description,
    fix: v.nodes[0]?.failureSummary || 'See WCAG guidelines'
  }));
}

// CLI usage
async function main() {
  const url = process.argv[2];
  if (!url) {
    console.error('Usage: ts-node accessibility-audit.ts <url>');
    process.exit(1);
  }
  
  console.log(`🔍 Auditing ${url}...`);
  const results = await auditAccessibility(url);
  
  console.log(`\n📊 Found ${results.violations.length} violations\n`);
  
  results.violations.forEach((v, i) => {
    console.log(`${i + 1}. [${v.severity.toUpperCase()}] ${v.rule}`);
    console.log(`   WCAG: ${v.wcagCriteria}`);
    console.log(`   Element: ${v.element}`);
    console.log(`   Issue: ${v.issue}`);
    console.log(`   Fix: ${v.fix}\n`);
  });
  
  console.log(`💡 AI Insights:\n${results.aiInsights}\n`);
  
  // Save report
  fs.writeFileSync(
    'audit-report.json',
    JSON.stringify(results, null, 2)
  );
  console.log('✅ Full report saved to audit-report.json');
}

if (require.main === module) {
  main().catch(console.error);
}

export { auditAccessibility };

Why this works: Combines axe-core's technical checks with Claude's contextual understanding of user intent.
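
One practical wrinkle: axe-core and Claude often flag the same element under overlapping rules. A small merge step keeps the report readable. This is a sketch; the Violation shape is copied from the Step 2 interface so it stands alone:

```typescript
type Violation = {
  rule: string;
  wcagCriteria: string;
  severity: 'critical' | 'serious' | 'moderate' | 'minor';
  element: string;
  issue: string;
  fix: string;
};

// Merge duplicate findings by element + WCAG criterion, keeping the
// most severe report when both sources flag the same thing.
function dedupeViolations(violations: Violation[]): Violation[] {
  const rank = { critical: 3, serious: 2, moderate: 1, minor: 0 };
  const seen = new Map<string, Violation>();
  for (const v of violations) {
    const key = `${v.element}::${v.wcagCriteria}`;
    const existing = seen.get(key);
    if (!existing || rank[v.severity] > rank[existing.severity]) {
      seen.set(key, v);
    }
  }
  return Array.from(seen.values());
}
```

In auditAccessibility(), you could wrap the combined array: dedupeViolations([...mapAxeViolations(axeResults), ...aiAnalysis.violations]).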

If it fails:

  • Error: "Cannot find module '@anthropic-ai/sdk'": Run npm install again
  • Timeout errors: Raise the timeout option on page.goto, or switch waitUntil to 'domcontentloaded' for slow sites
  • Empty violations: Check if page requires authentication (add login flow)

Step 3: Run Your First Audit

# Set API key
export ANTHROPIC_API_KEY=your_key_here

# Audit a page
npx ts-node accessibility-audit.ts https://example.com

Expected output:

🔍 Auditing https://example.com...

📊 Found 7 violations

1. [SERIOUS] meaningful-alt-text
   WCAG: 1.1.1 Non-text Content (Level A)
   Element: img.hero-image
   Issue: Alt text "image1234" doesn't describe content
   Fix: Replace with "Team celebrating product launch in office"

2. [CRITICAL] focus-visible
   WCAG: 2.4.7 Focus Visible (Level AA)
   Element: button.cta
   Issue: No visible focus indicator on primary CTA
   Fix: Add outline: 2px solid #005fcc on :focus-visible

💡 AI Insights:
The main navigation is keyboard accessible but tab order jumps illogically from header to footer. Add tabindex="0" to main content skip link.

✅ Full report saved to audit-report.json

Step 4: Integrate into CI/CD

# .github/workflows/accessibility.yml
name: Accessibility Audit

on:
  pull_request:
    paths:
      - 'src/**'
      - 'components/**'

jobs:
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: '22'
      
      - name: Install dependencies
        run: npm ci

      - name: Install Playwright browsers
        run: npx playwright install --with-deps chromium
      
      - name: Build preview
        run: npm run build
      
      - name: Start preview server
        run: npm run preview &
        
      - name: Wait for server
        run: npx wait-on http://localhost:4173
      
      - name: Run accessibility audit
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
        run: |
          npx ts-node accessibility-audit.ts http://localhost:4173 > audit.txt
          
      - name: Check for critical violations
        run: |
          if grep -q "CRITICAL" audit.txt; then
            echo "❌ Critical accessibility violations found"
            cat audit.txt
            exit 1
          fi
          echo "✅ No critical violations"
      
      - name: Upload report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: accessibility-report
          path: audit-report.json

Why this matters: Catches a11y regressions before merge, not in production.
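
The grep gate above is deliberately simple, but it trips on the word CRITICAL anywhere in the output, including inside an issue description. A sturdier variant reads audit-report.json directly. A sketch, assuming the report shape from Step 2 (check-critical.ts is a hypothetical filename):

```typescript
// check-critical.ts (hypothetical): gate CI on the JSON report rather
// than grepping stdout.
import * as fs from 'fs';

type Severity = 'critical' | 'serious' | 'moderate' | 'minor';

function countBySeverity(violations: { severity: Severity }[]): Record<Severity, number> {
  const counts: Record<Severity, number> = { critical: 0, serious: 0, moderate: 0, minor: 0 };
  for (const v of violations) {
    if (v.severity in counts) counts[v.severity]++;
  }
  return counts;
}

// After the audit step has written audit-report.json:
if (fs.existsSync('audit-report.json')) {
  const report = JSON.parse(fs.readFileSync('audit-report.json', 'utf8'));
  const counts = countBySeverity(report.violations);
  console.log(`critical=${counts.critical} serious=${counts.serious}`);
  if (counts.critical > 0) process.exit(1);
}
```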


Verification

Test the auditor against known issues:

# Create test page with violations
cat > test-page.html << 'EOF'
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Test Page</title>
  <style>
    button:focus { outline: none; } /* Violation */
  </style>
</head>
<body>
  <img src="logo.png" alt="img"> <!-- Bad alt text -->
  <button>Click</button> <!-- Missing visible focus -->
  <div style="color: #777; background: #999;">Low contrast text</div>
</body>
</html>
EOF

# Serve locally
npx serve . -l 3000 &

# Audit it
npx ts-node accessibility-audit.ts http://localhost:3000/test-page.html

You should see: Violations for focus indicators, alt text quality, and color contrast.


Advanced: Screen Reader Simulation

Add screen reader output testing:

async function simulateScreenReader(page: any): Promise<string[]> {
  // Get the accessibility tree (what screen readers see).
  // Note: page.accessibility.snapshot() is deprecated in recent Playwright
  // releases but still works for dumping the full tree.
  const snapshot = await page.accessibility.snapshot();
  
  function traverse(node: any, depth = 0): string[] {
    if (!node) return [];
    
    const indent = '  '.repeat(depth);
    const role = node.role || 'unknown';
    const name = node.name || '(no label)';
    
    const lines = [`${indent}${role}: ${name}`];
    
    if (node.children) {
      node.children.forEach((child: any) => {
        lines.push(...traverse(child, depth + 1));
      });
    }
    
    return lines;
  }
  
  return traverse(snapshot);
}

// Add to auditAccessibility():
const srOutput = await simulateScreenReader(page);
console.log('\n🔊 Screen Reader Output:\n' + srOutput.slice(0, 20).join('\n'));

This shows exactly what assistive tech announces.
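
The same snapshot can also be scanned programmatically. As a sketch, this walks the tree and flags interactive nodes that would be announced with no label at all (the AXNode shape mirrors what page.accessibility.snapshot() returns):

```typescript
type AXNode = { role?: string; name?: string; children?: AXNode[] };

const INTERACTIVE_ROLES = new Set(['button', 'link', 'textbox', 'checkbox', 'combobox']);

// Walk the accessibility tree and report interactive nodes that a
// screen reader would announce with no accessible name.
function findUnlabeled(node: AXNode | null): string[] {
  if (!node) return [];
  const issues: string[] = [];
  if (INTERACTIVE_ROLES.has(node.role ?? '') && !(node.name ?? '').trim()) {
    issues.push(`${node.role} has no accessible name`);
  }
  for (const child of node.children ?? []) {
    issues.push(...findUnlabeled(child));
  }
  return issues;
}
```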


What You Learned

  • AI agents catch context-dependent WCAG violations automated tools miss
  • Combining axe-core + Claude + Playwright gives 70-80% coverage
  • Integration in CI prevents regressions before deployment

Limitations:

  • Still requires manual testing with real assistive tech
  • AI can hallucinate fixes - always verify against WCAG spec
  • Cost: ~$0.05-0.15 per page audit with Claude Sonnet

When NOT to use:

  • Don't replace human accessibility testers - this augments them
  • Avoid for sites with complex auth flows (add login automation first)
  • Skip if you need Level AAA compliance (requires manual testing)

Real-World Example

A typical audit finds issues like:

{
  "rule": "aria-label-mismatch",
  "wcagCriteria": "4.1.2 Name, Role, Value (Level A)",
  "severity": "serious",
  "element": "button[aria-label='Close']",
  "issue": "Button shows 'X' icon but ARIA label says 'Close dialog'. Screen reader users hear 'Close' but visual users see 'X'.",
  "fix": "Update aria-label='Close' to match visual: aria-label='Close (X button)' OR change icon to text 'Close'"
}

This is the kind of mismatch automated tools miss but AI catches by analyzing both visual and semantic layers.


Cost estimate: ~$0.10/page with Claude Sonnet 4.5, or roughly $20/month for a 50-page site audited weekly.
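
That figure is easy to sanity-check. Assuming roughly four runs per month, monthly cost is just pages x per-page cost x runs; across the $0.05-0.15 per-page range noted earlier, a 50-page weekly audit lands between $10 and $30:

```typescript
// Back-of-envelope cost model; per-page cost varies with screenshot
// size and HTML length, so treat it as an input rather than a constant.
function monthlyCost(pages: number, costPerPage: number, runsPerMonth = 4): number {
  return pages * costPerPage * runsPerMonth;
}
```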

Tested on Claude Sonnet 4.5, Playwright 1.42, Node.js 22.x, WCAG 2.2 Level AA