Behind the scenes: making 296 tool pages agree with themselves
UDT ships a lot of pages: currently 328 tool pages, 11 category pages, and 365 blog posts. When you maintain a corpus that size, the failure mode you fear most is the one nobody is looking at: silent drift between two representations of the same content, which the rendered page gives no sign of because nothing on it compares one representation against the other. Last week we found a quiet drift problem across 64 of the legacy tool pages, fixed it, and then added a verify-script invariant so it can't recur. This is the engineering note.
The two FAQs on every page
Every UDT tool page carries two parallel FAQ representations. The visible one is a list of <details> accordions that a human reads on the page. The invisible one is a JSON-LD FAQPage schema block in the head that a search engine reads for FAQ rich results. The two are supposed to carry the same set of questions, and on newer pages they do, because the v28 build pipeline emits both from a single source. But pages predating v28 were hand-authored at different times, sometimes with one half edited and the other left behind. The result was 168 legacy tool pages with no audit history.
The audit
A 60-line Python script walked all 168 pages, pulled the schema question set and the visible-accordion question set, normalized both (strip HTML, fold curly quotes, lowercase, drop trailing punctuation), and compared. The result was sobering:
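The normalization and comparison steps can be sketched as follows. This is a minimal illustration, not the actual audit script: the function names are hypothetical, and entity decoding and whitespace collapsing are extra assumptions beyond the four steps listed above.

```python
import html
import re

# Map curly quotes to their straight equivalents.
_CURLY = str.maketrans({
    "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"',
})


def normalize(question):
    """Reduce a question to a comparable stem (hypothetical helper)."""
    text = re.sub(r"<[^>]+>", "", question)   # strip HTML tags
    text = html.unescape(text)                # assumption: decode entities too
    text = text.translate(_CURLY)             # fold curly quotes
    text = " ".join(text.split()).lower()     # collapse whitespace, lowercase
    return text.rstrip("?.!:; ")              # drop trailing punctuation


def compare(schema_questions, visible_questions):
    """Return (schema-only stems, visible-only stems) for one page."""
    schema = {normalize(q) for q in schema_questions}
    visible = {normalize(q) for q in visible_questions}
    return schema - visible, visible - schema
```

Two empty sets mean the page matches. Anything left in either set is a question that one representation carries and the other lacks.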
100 pages matched cleanly. 50 had a visible accordion shorter than the schema (someone added new schema questions for SEO without adding the matching visible accordion entries). 12 were in real drift: both sides had questions the other lacked. 5 had a richer visible accordion than schema (in 4 cases intentionally: the long-form legal generators ship 12-16 visible questions but only the canonical 8 in schema). 1 was a single missing schema entry. Those five buckets account for all 168 pages.
The 12 drift cases were the ones that gave us pause. Drift means two authors at two times each thought their version was correct, neither knew about the other, and the page silently shipped two diverging question sets: one for humans, one for crawlers. Nobody noticed because the page renders fine.
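In set terms, the audit buckets fall out of two differences per page. The sketch below assumes both question sets are already normalized. The bucket labels are hypothetical names chosen for illustration, and the "richer visible accordion" and "single missing schema entry" results both land in the same set-level bucket here.

```python
def classify(schema, visible):
    """Bucket one page by comparing its two normalized question sets."""
    schema_only = schema - visible    # in JSON-LD, missing from the accordion
    visible_only = visible - schema   # in the accordion, missing from JSON-LD
    if not schema_only and not visible_only:
        return "match"
    if schema_only and visible_only:
        return "drift"                # each side has questions the other lacks
    if schema_only:
        return "visible-short"        # accordion lags behind the schema
    return "schema-short"             # schema lags behind the accordion
```

Only the "drift" bucket needs a human decision about which side to trust. The two one-sided buckets can be fixed by copying questions in a single direction.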
The remediation
The 50 visible-short cases were mechanical: a script reads the schema, finds the questions missing from the accordion, renders matching <details> blocks, and appends them. That came to 79 questions across 50 pages, and every run is idempotent (each script leaves a marker comment so reruns are no-ops). The 12 drift cases used a union strategy: keep both sets in both representations, deduplicated by normalized stem. Schema-only questions were authored by an SEO-aware mind; visible-only questions were authored by a UX-aware mind; neither was wrong. Union preserves both kinds of value.
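Both remediation strategies are short enough to sketch. This is an illustration under assumptions, not the scripts we ran: the marker string, the function names, and the choice of <div> for answer markup are all hypothetical, and the marker is assumed to be written into the page itself.

```python
import html

# Hypothetical marker comment; its presence means the backfill already ran.
MARKER = "<!-- faq-parity-backfill -->"


def render_details(question, answer):
    """Render one accordion entry (answer markup is an assumption)."""
    return (
        "<details>\n"
        f"  <summary>{html.escape(question)}</summary>\n"
        f"  <div>{html.escape(answer)}</div>\n"
        "</details>"
    )


def backfill(accordion_html, missing):
    """Append (question, answer) pairs once; reruns return the input unchanged."""
    if MARKER in accordion_html or not missing:
        return accordion_html
    blocks = "\n".join(render_details(q, a) for q, a in missing)
    return f"{accordion_html}\n{MARKER}\n{blocks}"


def union_by_stem(schema_questions, visible_questions, stem):
    """Merge both lists, keeping the first wording seen for each stem."""
    merged = {}
    for question in [*schema_questions, *visible_questions]:
        merged.setdefault(stem(question), question)
    return list(merged.values())
```

The marker check is what makes a rerun a no-op. In the union, the schema wording wins when both sides carry the same stem, which is an arbitrary tie-break chosen for this sketch.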
The post-remediation re-audit found something we hadn't been counting: some legacy pages have two FAQ schema blocks (a head block and a body-end block) that share questions, which inflated the original counts. After deduplication, 27 of the 64 remediated tools landed below the 8-question target the post-v28 standard sets. Authoring 1-3 new questions per tool brought all 27 up. Total new Q&A pairs written this session: 44.
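The double-counting effect is easy to reproduce. The sketch below assumes each schema block has already been reduced to a list of question strings; the function name and the inline stem function are hypothetical, and the default target of 8 is the post-v28 standard mentioned above.

```python
def shortfall(schema_blocks, target=8):
    """Count how many new questions a page needs after cross-block dedup."""
    def stem(question):
        # Simplified stand-in for the full normalization.
        return question.strip().lower().rstrip("?")

    unique = {stem(q) for block in schema_blocks for q in block}
    return max(0, target - len(unique))


head_block = ["A?", "B?", "C?", "D?", "E?", "F?"]
body_block = ["a?", "B?", "G?"]  # two repeats of head questions, one new one

raw_count = len(head_block) + len(body_block)  # 9, looks like it meets the target
needed = shortfall([head_block, body_block])   # 7 unique questions, so 1 is needed
```

Counting raw entries makes this page look compliant. Counting unique stems shows it is one question short.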
The invariant
Fixing the drift once is satisfying but worthless if it can drift again. The whole point of the audit was to motivate an invariant: a verify-script check that runs on every commit and fails the build if schema and visible diverge. So we wrote one.
I-TOOL-PAGE-011, as it appears in the project's invariants doc, says: on every tool page that uses the <details> accordion pattern, the union of all FAQPage mainEntity names across every parseable JSON-LD block (deduplicated by normalized stem) must equal the set of <summary> texts inside those <details> blocks (also deduplicated). One small subtlety: identification is summary-only, not summary-plus-answer-div, because one legacy page (the disclaimer generator) uses <p> for answer markup instead of <div>, and parity is about the question set, not the markup of the answer.
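A check in the spirit of I-TOOL-PAGE-011 can be built on the standard library alone. This is a sketch under assumptions, not the verify script itself: the class and function names are hypothetical, the stem function is simplified, and JSON-LD that nests FAQPage under @graph is not handled.

```python
import json
from html.parser import HTMLParser


class FaqExtractor(HTMLParser):
    """Collect raw JSON-LD blocks and <summary> texts inside <details>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.jsonld_blocks = []
        self.summaries = []
        self._details_depth = 0
        self._in_jsonld = False
        self._in_summary = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld, self._buf = True, []
        elif tag == "details":
            self._details_depth += 1
        elif tag == "summary" and self._details_depth:
            self._in_summary, self._buf = True, []

    def handle_data(self, data):
        if self._in_jsonld or self._in_summary:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.jsonld_blocks.append("".join(self._buf))
            self._in_jsonld = False
        elif tag == "summary" and self._in_summary:
            self.summaries.append("".join(self._buf))
            self._in_summary = False
        elif tag == "details" and self._details_depth:
            self._details_depth -= 1


def schema_questions(blocks):
    """Gather FAQPage mainEntity names from every parseable block."""
    names = []
    for raw in blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # unparseable blocks are skipped, not treated as empty
        for node in data if isinstance(data, list) else [data]:
            if not isinstance(node, dict) or node.get("@type") != "FAQPage":
                continue
            entities = node.get("mainEntity", [])
            if isinstance(entities, dict):
                entities = [entities]
            names += [e["name"] for e in entities
                      if isinstance(e, dict) and "name" in e]
    return names


def stem(text):
    """Simplified normalization: whitespace, case, trailing punctuation."""
    return " ".join(text.split()).lower().rstrip("?.!:; ")


def parity_errors(page_html):
    """Return the stems present on only one side; empty means parity."""
    parser = FaqExtractor()
    parser.feed(page_html)
    parser.close()
    schema = {stem(q) for q in schema_questions(parser.jsonld_blocks)}
    visible = {stem(q) for q in parser.summaries}
    return sorted(schema ^ visible)
```

A build hook would call parity_errors on each in-scope page and fail if any list is non-empty. Reading only the <summary> text, and ignoring whatever element holds the answer, mirrors the summary-only identification described above.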
The check runs in about 1.5 seconds across all 296 in-scope tool pages and the build is green. The 32 v28+ audio and video tools that use a different non-accordion FAQ rendering are out of scope by design — they're a separate problem for a separate session.
What this is and isn't
This isn't a glamorous post. There's no shiny new feature, no new tool, no new model. It's a thousand-line patch that makes two parts of every page agree with each other, and a 60-line verify rule that keeps them honest. But this is exactly the kind of work the corpus needs in its middle age, and exactly the kind of work it's easy to skip because nobody is asking for it. If you maintain a large static site with multiple representations of the same content — schema vs visible, sitemap vs filesystem, breadcrumbs vs URL structure — pick the pair most prone to drift, audit it once, and add the invariant. Future-you doing the next audit will thank present-you.