AI Citation Tracking
Daily monitoring of brand visibility across ChatGPT, Perplexity, Gemini, and Google AI Overviews — with email alerts when citation status changes.
The Citation Tracker answers a question that's gone from "nice to have" to existential for any modern brand: "Do we appear in AI answers when someone asks the questions our customers ask?"
It runs the queries you care about against ChatGPT, Perplexity, Gemini, and Google AI Overviews on a daily cron, captures whether your brand appears (and where), and emails you the moment that status changes.
What it does
- You add tracked queries, e.g. `best AI marketing agent`, `top tools for SaaS SEO`. Optionally specify the brand terms / domains the classifier should scan for. Empty list = inferred from `customer.brandName` + `customer.website` host.
- A daily cron fires each query against each engine in parallel.
- Results are persisted as immutable rows: `appeared` (true/false/null), `position` (when listed), `citationType` (primary/mentioned/cited-source), full `rawResponse` (truncated to 16 KB for replay-debugging).
- Status changes trigger an email alert (flipping into / out of presence, or moving in/out of the top-3) when `alertOnChange` is on.
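The brand-term fallback in the first bullet could look roughly like the sketch below. Only `brandName` and `website` come from the text; `CustomerLike`, `hostOf`, and `resolveBrandTerms` are hypothetical names, and stripping a leading `www.` is an assumption rather than documented behaviour.

```typescript
// Hypothetical shape: only brandName and website are named in the docs.
interface CustomerLike {
  brandName: string;
  website: string; // e.g. "https://www.example.com/pricing"
}

// Pull the host out of a URL-ish string with a regex, so the sketch
// does not depend on the URL global being typed in the environment.
function hostOf(website: string): string | null {
  const m = /^(?:[a-z][a-z0-9+.-]*:\/\/)?(?:www\.)?([^\/:?#\s]+)/i.exec(website.trim());
  const host = m?.[1];
  return host ? host.toLowerCase() : null;
}

// Explicit terms win; an empty list falls back to brand name + website host.
function resolveBrandTerms(explicit: string[], customer: CustomerLike): string[] {
  const cleaned = explicit.map((t) => t.trim()).filter((t) => t.length > 0);
  if (cleaned.length > 0) return cleaned;

  const terms: string[] = [];
  const name = customer.brandName.trim();
  if (name) terms.push(name);
  const host = hostOf(customer.website);
  if (host) terms.push(host);
  return terms;
}
```

With an empty list, a customer named "Acme" at `https://www.acme.io/pricing` would be scanned for `Acme` and `acme.io` under these assumptions.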
Configuration
| Property | Value |
|---|---|
| Phase | 7 |
| Schema | TrackedQuery, CitationResult |
| Migration | prisma/migrations/20260428000400_citation_tracking/ |
| Engine clients | apps/api/agent/lib/citation-tracking/engines/* |
| Runner | apps/api/agent/lib/citation-tracking/runner.ts |
| Classifier | apps/api/agent/lib/citation-tracking/classify.ts |
| Alert sink | apps/api/agent/lib/citation-tracking/alert-sink.ts |
| UI | /intelligence/citations |
| Cron | daily-citation-tracking — 0 8 * * * (daily 8 AM UTC) |
| Workflow | apps/api/agent/workflows/citation-tracking.ts (registered as a global Inngest function — single run, not per-customer fan-out) |
Engines and credentials
Each engine has a graceful "no credential" fallback: if the relevant key isn't set we record appeared = NULL with a friendly errorMessage rather than failing the whole run.
| Engine | Credential | Implementation |
|---|---|---|
| ChatGPT | OPENAI_API_KEY | Official openai SDK, model gpt-4o-mini, max 600 tokens |
| Perplexity | PERPLEXITY_API_KEY | REST → https://api.perplexity.ai/chat/completions, model sonar (online) |
| Gemini | GOOGLE_GENERATIVE_AI_API_KEY | REST → Generative Language API, model gemini-1.5-flash-latest |
| Google AI Overviews | SERPER_API_KEY or SERPAPI_KEY | Composes AI-overview text + answer-box + top-5 organic into one classified blob |
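As a rough illustration of the "no credential" fallback combined with the parallel fan-out, the sketch below skips an engine whose key is missing and still returns a row for it. The engine identifiers, `EngineProbe`, `probeEngine`, and `runQuery` are hypothetical; the real engine clients are not shown in this document.

```typescript
// Engine identifiers are assumed for illustration.
type Engine = "chatgpt" | "perplexity" | "gemini" | "google-ai-overviews";

interface EngineResult {
  engine: Engine;
  appeared: boolean | null; // null means the engine was skipped
  errorMessage?: string;
  rawResponse?: string;
}

interface EngineProbe {
  engine: Engine;
  apiKey: string | undefined; // supplied by the caller, e.g. from the environment
  ask: (query: string, apiKey: string) => Promise<string>;
}

// One engine: record a skipped row instead of throwing when the key is absent.
async function probeEngine(
  p: EngineProbe,
  query: string,
  detect: (text: string) => boolean,
): Promise<EngineResult> {
  if (!p.apiKey) {
    return {
      engine: p.engine,
      appeared: null,
      errorMessage: `${p.engine}: API key not configured, check skipped`,
    };
  }
  const rawResponse = await p.ask(query, p.apiKey);
  return { engine: p.engine, appeared: detect(rawResponse), rawResponse };
}

// All engines in parallel; a missing key on one engine does not fail the run.
function runQuery(
  probes: EngineProbe[],
  query: string,
  detect: (text: string) => boolean,
): Promise<EngineResult[]> {
  return Promise.all(probes.map((p) => probeEngine(p, query, detect)));
}
```

Network errors are deliberately left unhandled here, since the text only describes the missing-key case.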
Classifier heuristics
Pure-function — apps/api/agent/lib/citation-tracking/classify.ts. Designed so the runner can mock engines and test classification in isolation.
- Appeared: case-insensitive substring match on any `brandTerm` (whole-word OR hostname).
- Position: line-by-line list parser that recognises `1.`, `1)`, `(1)`, `-`, `*`, `•` markers. Honours declared numbering.
- Citation type:
  - `primary`: at position 1, OR within first 240 chars near a strong-recommendation phrase (e.g. "top pick", "#1", "best overall").
  - `cited-source`: appears in markdown link `[text](url)` or under a `Sources:` / `References:` block.
  - `mentioned`: anything else with `appeared = true`.
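The heuristics above can be sketched as a single pure function. This is one possible reading, not the contents of `classify.ts`: the phrase list holds only the three examples from the text, "near" is interpreted as brand and phrase both falling inside the first 240 characters, the precedence primary > cited-source > mentioned is assumed, and list items under a `Sources:` / `References:` block are excluded from position counting.

```typescript
type CitationType = "primary" | "mentioned" | "cited-source";

interface Classification {
  appeared: boolean;
  position: number | null;
  citationType: CitationType | null;
}

// Assumed phrase list; the text gives only these three as examples.
const STRONG_PHRASES = ["top pick", "#1", "best overall"];

// Matches "(1)", "1." / "1)", or a "-", "*", "•" bullet at the start of a line.
const LIST_MARKER = /^\s*(?:\((\d+)\)|(\d+)[.)]|[-*•])\s+/;

const escapeRegex = (s: string): string => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

function classify(text: string, brandTerms: string[]): Classification {
  // Case-insensitive match that must not sit inside a longer word.
  const matchers = brandTerms
    .map((t) => t.trim())
    .filter((t) => t.length > 0)
    .map((t) => new RegExp(`(?<![\\w-])${escapeRegex(t)}(?![\\w-])`, "i"));
  const hit = (s: string): boolean => matchers.some((m) => m.test(s));

  if (!hit(text)) return { appeared: false, position: null, citationType: null };

  // Assumption: a Sources:/References: block is not part of the ranked list.
  const sourcesAt = text.search(/^[ \t]*(?:sources|references):/im);
  const body = sourcesAt >= 0 ? text.slice(0, sourcesAt) : text;

  // Position: declared numbers win; plain bullets are counted in order.
  let position: number | null = null;
  let bulletCount = 0;
  for (const line of body.split(/\r?\n/)) {
    const m = LIST_MARKER.exec(line);
    if (!m) continue;
    bulletCount += 1;
    const declared = m[1] ?? m[2];
    if (hit(line)) {
      position = declared !== undefined ? Number(declared) : bulletCount;
      break;
    }
  }

  const head = text.slice(0, 240);
  const headLower = head.toLowerCase();
  const strongNearTop = hit(head) && STRONG_PHRASES.some((p) => headLower.includes(p));
  if (position === 1 || strongNearTop) {
    return { appeared: true, position, citationType: "primary" };
  }

  const links: string[] = text.match(/\[[^\]]*\]\([^)]*\)/g) ?? [];
  const inSources = sourcesAt >= 0 && hit(text.slice(sourcesAt));
  if (links.some((l) => hit(l)) || inSources) {
    return { appeared: true, position, citationType: "cited-source" };
  }
  return { appeared: true, position, citationType: "mentioned" };
}
```

Because the function takes plain strings and returns a plain object, it can be exercised against canned engine responses without any network access.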
API endpoints
All under /api/citation-tracking/:customerId/.... Auth: API key, tenant-scoped via getTenantId(c).
| Method | Path | RBAC | Purpose |
|---|---|---|---|
| POST | /queries | editor | Create a tracked query ({ query, engines?, brandTerms?, alertOnChange? }) |
| GET | /queries | viewer | List the customer's enabled queries with the latest snapshot per engine |
| PATCH | /queries/:queryId | editor | Toggle alertOnChange / enabled |
| DELETE | /queries/:queryId | editor | Soft-delete (sets enabled = false; results retained for trend charts) |
| GET | /queries/:queryId/results | viewer | Last 30 daily results × 4 engines, newest first, for the trend sparkline |
| POST | /queries/:queryId/run-now | editor | Manual trigger: calls the same runCitationCheck the cron does |
Tenant scoping: every handler resolves customerId via getTenantId(c), so a leaked API key cannot expose another tenant's data. Soft-delete keeps history queryable; hard-delete cascades through Prisma.
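For orientation, a create-query call might be assembled as in the sketch below. The path prefix and body fields come from the table above; `buildCreateQueryRequest`, `RequestSpec`, and the `cus_123` identifier are hypothetical, and the auth header is left out because the text does not say how the API key is transmitted.

```typescript
// Body fields mirror the POST /queries row; the element type of engines is assumed.
interface CreateTrackedQueryBody {
  query: string;
  engines?: string[];
  brandTerms?: string[];
  alertOnChange?: boolean;
}

// Hypothetical transport-agnostic description of the request.
interface RequestSpec {
  method: "POST";
  path: string;
  body: string;
}

function buildCreateQueryRequest(customerId: string, body: CreateTrackedQueryBody): RequestSpec {
  if (body.query.trim().length === 0) throw new Error("query must not be empty");
  return {
    method: "POST",
    path: `/api/citation-tracking/${encodeURIComponent(customerId)}/queries`,
    body: JSON.stringify(body),
  };
}

// Example: track one query with alerts on and brand terms left to inference.
const createRequest = buildCreateQueryRequest("cus_123", {
  query: "best AI marketing agent",
  alertOnChange: true,
});
```

The resulting spec can be handed to whichever HTTP client the caller already uses, with the API key attached there.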
Change-detection rules
Implemented in isMeaningfulChange(prev, next):
- ✅ Flip `appeared = true ↔ false`
- ✅ Move into / out of top-3 by `position`
- ✅ `citationType` changes (e.g. `mentioned` → `primary`)
- ❌ A run with `appeared = null` (engine skipped) on either side never counts; this prevents false alerts when a key is briefly missing.
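The four rules above translate fairly directly into code. The following is a minimal sketch under the assumption that a snapshot carries just `appeared`, `position`, and `citationType`; the `Snapshot` type and `inTop3` helper are illustrative, and the real `isMeaningfulChange` may differ in detail.

```typescript
type CitationType = "primary" | "mentioned" | "cited-source";

// Assumed minimal snapshot shape for one query/engine/day.
interface Snapshot {
  appeared: boolean | null; // null means the engine was skipped
  position: number | null;
  citationType: CitationType | null;
}

const inTop3 = (s: Snapshot): boolean => s.position !== null && s.position <= 3;

function isMeaningfulChange(prev: Snapshot, next: Snapshot): boolean {
  // A skipped run on either side never counts.
  if (prev.appeared === null || next.appeared === null) return false;
  // Presence flipped in either direction.
  if (prev.appeared !== next.appeared) return true;
  // Crossed the top-3 boundary in either direction.
  if (inTop3(prev) !== inTop3(next)) return true;
  // Citation type changed, e.g. mentioned to primary.
  return prev.citationType !== next.citationType;
}
```

Checking the `null` case first is what keeps a temporarily missing key from being read as "dropped out of the answer".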
Testing
- Pure-function tests: `__tests__/classify.test.ts` (18 tests, all passing). Cover regex escape, list parsing, citation-type detection, and change-detection edge cases.
- Engine clients: adapter pattern lets each engine be unit-tested with a stubbed network adapter. The `probeChatGPT(input, { adapter: { complete: stub }})` shape is the contract.
- E2E: `runCitationCheck(...)` is wired so a real PrismaClient + mock probers can exercise the persistence + alert path end-to-end.
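To show how the adapter contract enables network-free tests, here is a small stand-in. Only the `(input, { adapter: { complete } })` call shape is taken from the text; the `ProbeInput` and `ProbeResult` fields, the signature of `complete`, and the body of this `probeChatGPT` are assumptions made for the sketch and are not the real client.

```typescript
// Hypothetical adapter contract: the text only shows { adapter: { complete } }.
interface CompletionAdapter {
  complete(prompt: string): Promise<string>;
}

interface ProbeInput {
  query: string;
  brandTerms: string[];
}

interface ProbeResult {
  appeared: boolean;
  rawResponse: string;
}

// Stand-in for the real client: sends the query through the adapter and
// does a naive case-insensitive scan of the response.
async function probeChatGPT(
  input: ProbeInput,
  deps: { adapter: CompletionAdapter },
): Promise<ProbeResult> {
  const rawResponse = await deps.adapter.complete(input.query);
  const lower = rawResponse.toLowerCase();
  const appeared = input.brandTerms.some((t) => lower.includes(t.toLowerCase()));
  return { appeared, rawResponse };
}

// Stub that records prompts and returns a canned answer, so no network is touched.
const seenPrompts: string[] = [];
const stub = async (prompt: string): Promise<string> => {
  seenPrompts.push(prompt);
  return "1. Acme\n2. Foo";
};

function runStubbedProbe(): Promise<ProbeResult> {
  return probeChatGPT(
    { query: "best AI marketing agent", brandTerms: ["acme"] },
    { adapter: { complete: stub } },
  );
}
```

Injecting the adapter as an argument, rather than importing an SDK inside the probe, is what lets the same function run against a live client in production and a stub in tests.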
Why this exists
AI citation visibility is the new SERP rank. SEO told you when you fell off page 1; nothing told you when ChatGPT stopped recommending you. This closes that gap.