Built by people who think a comp ladder deserves the same engineering as a trading desk.
If "How it works" explains the discipline, this is the engine room. Two AI models in conversation, a stealth-driven browser pulling the comps you'd see if you opened Fanatics Collect yourself, a time-series database purpose-built for tick data, and a few classical statistics doing the steady work in the background. No mysticism. No black box.
One pipeline. Six stages.
A tick is born when a sale closes on a marketplace, an article gets posted on a forum, or a player drops 50. From birth to the price you see — six stages, each auditable. We don't hide the steps; you can click any forecast and walk back through them.
The agent runs hourly. WebSockets push updates to your screen the moment new ticks land — no refresh, no polling, no "data as of 12 hours ago" disclaimers.
One agent loop · auditable per stage · hourly cron
A web camera. A three-signal edge detector.
Bulk scan runs entirely in Safari over getUserMedia. Every ~180ms we sample the framing rect into a 120×<H> canvas, derive a luma map, and compute three signals:
- inner detail: stddev of luma inside the card-shaped guide rect (a real card has print + image variance; an empty desk doesn't)
- inner-vs-outer contrast: inner stddev minus outer stddev, so a busy desk doesn't fake fill
- edge density: fraction of pixels along the rect's perimeter with a strong luma gradient (catches white-bordered cards on dark surfaces even when the card face is featureless)
All three must clear for ~1.2s of consecutive samples before we fire. Stack-bulk mode arms a 1.2s settling delay after each capture so the camera doesn't re-fire mid-flip on the residual scene. Concurrency on submit is capped at 3 in flight.
components/scan/camera-capture.tsx · app/scan/bulk/page.tsx
- 01 · idle · empty: luma stddev < 18, no card in frame
- 02 · idle · moving: card detected · |Δ| > 9, operator still moving
- 03 · armed · steady: all 3 signals pass · confidence bar fills
- 04 · armed · captured: snap → upload → camera remounts (1.2s arm delay)
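The three signals can be sketched outside the browser as well. The following is a minimal Python illustration, not the code in camera-capture.tsx: only the "luma stddev < 18 means empty" cut-off comes from the state list above, while `CONTRAST_MIN`, `EDGE_DENSITY_MIN`, `GRADIENT_MIN`, and the function names are hypothetical placeholders.

```python
from statistics import pstdev

# Only the 18.0 figure comes from the text; the other thresholds are assumed.
INNER_STDDEV_MIN = 18.0
CONTRAST_MIN = 5.0        # assumed
EDGE_DENSITY_MIN = 0.5    # assumed
GRADIENT_MIN = 40.0       # assumed "strong" luma step across the border


def signals(luma, rect):
    """Return (inner_detail, contrast, edge_density) for one luma frame.

    luma is a list of rows of 0-255 values; rect is (top, left, bottom, right),
    bottom/right exclusive, and must sit at least one pixel inside the frame.
    """
    top, left, bottom, right = rect
    inner, outer = [], []
    for y, row in enumerate(luma):
        for x, v in enumerate(row):
            (inner if top <= y < bottom and left <= x < right else outer).append(v)
    inner_detail = pstdev(inner)
    contrast = inner_detail - pstdev(outer)

    # Pair each perimeter pixel with its neighbour just outside the rect.
    pairs = []
    for x in range(left, right):
        pairs.append((luma[top][x], luma[top - 1][x]))
        pairs.append((luma[bottom - 1][x], luma[bottom][x]))
    for y in range(top, bottom):
        pairs.append((luma[y][left], luma[y][left - 1]))
        pairs.append((luma[y][right - 1], luma[y][right]))
    strong = sum(1 for a, b in pairs if abs(a - b) >= GRADIENT_MIN)
    return inner_detail, contrast, strong / len(pairs)


def all_clear(luma, rect):
    """True when all three signals pass for this single sample."""
    detail, contrast, density = signals(luma, rect)
    return (detail >= INNER_STDDEV_MIN
            and contrast >= CONTRAST_MIN
            and density >= EDGE_DENSITY_MIN)
```

At roughly 180ms per sample, a ~1.2s hold works out to about seven consecutive samples for which `all_clear` returns true.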
Four canonical card APIs. In parallel.
app/services/card_enrichment/ wires four authoritative sources behind a single enrich(EnrichmentInput) entry point. Each adapter normalizes to a shared CanonicalCard shape (category, set, year, manufacturer, subject, card_number, parallel, image_url, reference_url, confidence, source).
- PSA cert: Bearer token, /publicapi/cert/GetByCertNumber (confidence 1.0)
- Scryfall: no auth, /cards/search?q=…&order=edhrec (0.95)
- pokemontcg.io: optional X-Api-Key, /v2/cards (0.92)
- ComicVine: api_key query param, /search?resources=issue with a /volumes fallback (0.9)
All four fire under asyncio.gather. Adapters whose API key is missing are skipped silently. A failure in any one adapter doesn't take the others down. When the top-confidence candidate clears 0.9 and the trigram search has no hit, the scan endpoint materializes the catalog row on the spot, audit-logged with card.auto_create_from_enrichment and the source provenance.
app/services/card_enrichment/{psa,scryfall,pokemontcg,comicvine}.py
PSA (cert lookup) · Scryfall (MTG canonical) · pokemontcg (Pokémon TCG) · ComicVine (Marvel issues)
→ CanonicalCard[] sorted by confidence × source priority
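The fan-out with failure isolation can be sketched with `asyncio.gather(..., return_exceptions=True)`. This is an illustration under assumptions: the three adapter functions are stand-ins that make no network calls, `CanonicalCard` is trimmed to three fields, `enrich` takes a plain query string rather than an EnrichmentInput, and results are sorted by confidence alone rather than confidence × source priority.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class CanonicalCard:
    subject: str
    confidence: float
    source: str


async def scryfall(query):          # stand-in adapter, no network call
    return [CanonicalCard(query, 0.95, "scryfall")]


async def pokemontcg(query):        # stand-in adapter, no network call
    return [CanonicalCard(query, 0.92, "pokemontcg")]


async def comicvine(query):         # simulates an upstream failure
    raise RuntimeError("upstream 503")


async def enrich(query, adapters):
    # return_exceptions=True returns a failure as a value instead of raising,
    # so one failing adapter does not discard the others' results.
    results = await asyncio.gather(*(a(query) for a in adapters),
                                   return_exceptions=True)
    cards = []
    for result in results:
        if isinstance(result, Exception):
            continue                # a real service would log this
        cards.extend(result)
    return sorted(cards, key=lambda c: c.confidence, reverse=True)


cards = asyncio.run(enrich("Black Lotus", [scryfall, pokemontcg, comicvine]))
```

Skipping an adapter with no API key then amounts to leaving it out of the `adapters` list before the call.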
Two models in conversation. One reads. The other audits.
A single AI model reading its own work has a known failure mode: it agrees with itself. So we don't let the same model be both classifier and judge. Cardpulse uses Claude Haiku 4.5 — the faster, cheaper model — to read each forum post, news article, and performance event, and produce a sentiment score with a confidence weight.
Whenever Haiku's output is loud (high magnitude or low confidence), Claude Sonnet 4.6 (the slower, more careful model) takes a second pass. The auditor asks: does the article actually justify this magnitude? Is the kind right? Is this a confident-wrong answer? Three verdicts: pass, regenerate, reject. Rejected high-magnitude signals get logged for human review rather than silently dropped. Nothing big and bullish ever moves a price without a paper trail.
The agent also runs a one-click scout on demand: search Reddit and ESPN for a player, score every fresh post, regenerate the forecast — about ten seconds end-to-end.
Evaluator-optimizer pattern · different models = no self-affirmation · scout latency ~10s
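The routing between classifier and auditor might look like the sketch below. The numeric cut-offs, the `Signal` fields, and the outcome labels are all assumptions; the text only specifies "high magnitude or low confidence", the three verdicts, and that rejected high-magnitude signals go to human review. What happens to a rejected low-magnitude signal is not stated, so the `discard` branch is a guess.

```python
from dataclasses import dataclass

# Assumed cut-offs; the page does not publish the real ones.
LOUD_MAGNITUDE = 0.7
LOW_CONFIDENCE = 0.5


@dataclass
class Signal:
    magnitude: float   # signed sentiment score, assumed to lie in [-1, 1]
    confidence: float  # classifier's confidence weight, assumed in [0, 1]


def needs_audit(signal):
    """A signal is 'loud' when it is large or the classifier is unsure."""
    return (abs(signal.magnitude) >= LOUD_MAGNITUDE
            or signal.confidence < LOW_CONFIDENCE)


def route(signal, verdict=None):
    """Decide what happens to a classified signal.

    verdict is the auditor's answer ('pass', 'regenerate', 'reject') and is
    only consulted when the signal is loud.
    """
    if not needs_audit(signal):
        return "accept"
    if verdict == "pass":
        return "accept"
    if verdict == "regenerate":
        return "reclassify"
    if verdict == "reject":
        # High-magnitude rejects are kept for review; the low-magnitude
        # branch is an assumption, not something the page specifies.
        if abs(signal.magnitude) >= LOUD_MAGNITUDE:
            return "human_review"
        return "discard"
    raise ValueError("loud signal requires an auditor verdict")
```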
Sonnet sees the rationale. Can downgrade. Never upgrade.
After the rules pipeline produces a candidate Buy/Hold/Sell, the reviewer agent (app/agents/conviction_reviewer.py) gets the same evidence and one permission: downgrade Buy/Sell to Hold when the evidence is thin. It can never upgrade. Cost-gated to fire only on directional candidates with sentiment consulted — the Hold path skips it entirely.
The system prompt lists durability bands (single missed practice → keep; ACL/retirement → keep; trade rumor with no named team → downgrade) and asymmetric defaults (Keep when in doubt, since the rules already gated). When it overrides, the reviewer's rationale replaces the formulaic one verbatim.
Cached `ephemeral` system prompt · returns `keep | downgrade` JSON
- 01 · rules: Bayesian-shrunk avg + sqrt-time scaling + signal-mass gate
- 02 · candidate: buy / hold / sell, what the rules say
- 03 · reviewer: Sonnet · keep | downgrade · 28-word rationale
- 04 · persisted: predictions row · rationale = reviewer's text when overridden
asymmetric: caution-only · default to keep
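The caution-only rule is easiest to trust when the code makes an upgrade unrepresentable. A minimal sketch, assuming a hypothetical `apply_review` helper (the real logic lives in app/agents/conviction_reviewer.py and is not shown on this page):

```python
def apply_review(candidate, verdict, formulaic_rationale, reviewer_rationale):
    """Return (final_call, rationale) under a caution-only reviewer.

    candidate is 'buy', 'hold' or 'sell'; verdict is 'keep' or 'downgrade'.
    """
    if candidate == "hold":
        # The Hold path never reaches the reviewer, so nothing can change.
        return "hold", formulaic_rationale
    if verdict == "downgrade":
        # The only permitted override: a directional call becomes Hold and
        # the reviewer's rationale replaces the formulaic one.
        return "hold", reviewer_rationale
    # 'keep', and anything unrecognised, falls back to the rules' own call.
    return candidate, formulaic_rationale
```

No branch returns a directional call that the rules did not already produce, which is the whole guarantee.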
Each source carries its own batting average. We publish the receipts.
Reddit isn't ESPN isn't a Beckett pop-report update. Each signal source has its own track record — how often did its calls correctly predict the next-7-day price direction? We update each source's posterior every night, weight signals by it inside the forecast, and publish the running numbers right here. The same posture as the calibration page: claimed accuracy and realized accuracy on the same screen.
| Source | Direction match | Calls evaluated | Last updated |
|---|---|---|---|
| Loading current track records… | | | |
Direction-match rate is the Beta-Binomial posterior mean α / (α + β). Each source starts at α=2, β=2 (a 50/50 prior), then updates one for one as 7-day-elapsed signals get joined to realized price moves. Warming sources have fewer than 10 calls evaluated; the running rate stabilizes as the sample grows.
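The update rule described above is small enough to show in full. This sketch uses the stated prior (α=2, β=2) and the stated 10-call warming threshold; the `SourceRecord` class and its attribute names are illustrative, not the production schema.

```python
from dataclasses import dataclass


@dataclass
class SourceRecord:
    alpha: float = 2.0   # prior pseudo-count of correct direction calls
    beta: float = 2.0    # prior pseudo-count of incorrect direction calls

    def update(self, direction_matched):
        """Fold in one evaluated call, one for one."""
        if direction_matched:
            self.alpha += 1
        else:
            self.beta += 1

    @property
    def calls_evaluated(self):
        # Subtract the four pseudo-counts contributed by the prior.
        return int(self.alpha + self.beta - 4)

    @property
    def match_rate(self):
        # Posterior mean of the Beta distribution.
        return self.alpha / (self.alpha + self.beta)

    @property
    def warming(self):
        return self.calls_evaluated < 10
```

For example, a source with 6 matches and 2 misses sits at α=8, β=4, a posterior mean of 8/12 ≈ 0.667, and is still warming with 8 calls evaluated.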
The comp ladders you'd see if you opened the tab yourself.
Fanatics Collect and Goldin — the two heavyweight auction-house archives — render their results inside JavaScript single-page apps. There's no public API, and the server response is a shell with no prices in it. So Cardpulse drives a real browser: Patchright (a stealth fork of Playwright) running headless Chromium that types into the search box, waits for the React app to hydrate, and reads the rendered DOM the same way your eye would.
One browser context handles the entire refresh — about 700 variants per pass, ~2 seconds each. The session reuses pages across calls, respects per-source rate limits (one in-flight search at a time, max 20 calls per minute), and identifies as a real Chrome on macOS so we look like the humans the sites actually serve.
Patchright stealth · shared browser context · ~25 min per refresh
- sales-history.fanaticscollect.com: type · enter · eval
- goldin.co/buy/?show_only=sold_items: URL · wait · eval
- Browse API · oauth2: JSON · paginate
- pricecharting.com/api: JSON · score-match
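A per-source cap such as "max 20 calls per minute" can be enforced with a rolling window of timestamps. The sketch below is one common way to do it, not necessarily how Cardpulse does; `MinuteWindowLimiter` is a hypothetical name, and the clock is passed in explicitly so the behaviour is easy to check.

```python
from collections import deque


class MinuteWindowLimiter:
    """Allow at most max_calls in any rolling window of window_s seconds."""

    def __init__(self, max_calls=20, window_s=60.0):
        self.max_calls = max_calls
        self.window_s = window_s
        self._stamps = deque()   # start times of calls still inside the window

    def wait_time(self, now):
        """Seconds to wait before the next call may start (0.0 if none)."""
        # Drop calls that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            return 0.0
        # Full window: wait until the oldest call ages out.
        return self.window_s - (now - self._stamps[0])

    def record(self, now):
        """Register that a call started at time now."""
        self._stamps.append(now)
```

The "one in-flight search at a time" rule is a separate constraint; in an asyncio scraper it would typically be a single `asyncio.Lock` held around each search.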
A database that knows about time.
Every tick lands in a TimescaleDB hypertable — Postgres with a time-aware extension. Rows are partitioned automatically by 7-day chunks, indexed by (variant_id, ts DESC) so the catalog's "last 30 days" median runs as a tight index scan instead of a full table read.
A continuous aggregate rolls every variant up into a daily sold-and-ask bucket on a schedule, so the price chart on every detail panel renders from pre-computed data — no per-request aggregation latency. New ticks force a refresh of the affected chunk so the chart never lies about staleness.
Postgres 17 · TimescaleDB · 7-day chunks · price_ticks_daily continuous aggregate
CREATE TABLE price_ticks (
    ts                TIMESTAMPTZ NOT NULL,
    variant_id        UUID        NOT NULL,
    source            TEXT        NOT NULL,
    source_listing_id TEXT,
    source_url        TEXT        NOT NULL,
    price_cents       BIGINT      NOT NULL,
    shipping_cents    BIGINT,
    currency          CHAR(3)     DEFAULT 'USD',
    kind              TEXT        CHECK (kind IN
                          ('sold','active','bid','ask','auction_close')),
    raw_payload       JSONB,
    ingested_at       TIMESTAMPTZ DEFAULT now()
);
SELECT create_hypertable('price_ticks', 'ts',
    chunk_time_interval => INTERVAL '7 days');
CREATE INDEX ON price_ticks (variant_id, ts DESC);
CREATE INDEX ON price_ticks (source, ts DESC);
Two old ideas, applied with discipline.
The textbook answer to outliers is the Tukey 1.5×IQR fence: clean for normal data, brittle on the small samples and heavy tails you see in card markets. Cardpulse uses the Iglewicz–Hoaglin modified z-score, a robust statistic forensic accountants use for messy real-world data. Each tick gets a score M; any tick with |M| > 3.5 is flagged and lifted out of the median. It is never silently dropped: the outlier still appears in the comp ladder so you can decide.
The confidence state machine sits on top: every variant is classified as HOT / WARM / COLD / DARK on every refresh, based on tick count, recency, and source disagreement (coefficient of variation across the comp sources). An EVENT-DRIVEN modifier amplifies the state when the news cycle is loud enough to override the ladder.
Iglewicz–Hoaglin |M|>3.5 · 4-state FSM · re-classified per refresh
for each tick price x:
M = 0.6745 × (x − median) / MAD
where MAD = median absolute deviation from the median.
decision:
if |M| > 3.5 → flag (don't drop)
else → include in median
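The pseudocode above translates almost line for line into Python. In this sketch the function names are illustrative, and the handling of MAD = 0 (flag nothing) is an assumption, since the page does not say what happens when more than half the prices are identical.

```python
from statistics import median


def modified_z_scores(prices):
    """Iglewicz-Hoaglin modified z-score for each price."""
    med = median(prices)
    mad = median(abs(p - med) for p in prices)
    if mad == 0:
        # The score is undefined here; this sketch flags nothing rather
        # than divide by zero (an assumption, not the documented behaviour).
        return [0.0 for _ in prices]
    return [0.6745 * (p - med) / mad for p in prices]


def split_outliers(prices, threshold=3.5):
    """Return (kept, flagged); flagged ticks stay visible, just not in the median."""
    scores = modified_z_scores(prices)
    kept = [p for p, m in zip(prices, scores) if abs(m) <= threshold]
    flagged = [p for p, m in zip(prices, scores) if abs(m) > threshold]
    return kept, flagged


kept, flagged = split_outliers([100, 102, 98, 101, 99, 400])
headline = median(kept)
```

With these six prices the median is 100.5 and the MAD is 1.5, so the 400 tick scores about 134.7 and is flagged, while the remaining five give a headline median of 100.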
| State | Tick count | Recency |
|---|---|---|
| HOT | n≥10 | <14d |
| WARM | n≥3 | <45d |
| COLD | n<3 | <45d |
| DARK | n≈0 | any |
Pricing 1/1s and limited variants without faking direct comps.
A 1/1 sells once a lifetime. The standard 30-day median collapses, and most pricing tools either silently refuse or return a number with no methodology. Cardpulse runs an explicit borrowing graph with five named edges, each with its own statistical rationale.
Edges A–E pull comps from progressively further distance: cross-year siblings of the same insert (compact 4-year tricube kernel), same-year cross-player comps adjusted by an empirical-Bayes-shrunk player-tier multiplier, the player's broader 1/1 ceiling capped at 20% weight share, the pop class anchor multiplied by a fitted scarcity log-ratio, and a cross-product sanity floor capped at 15%.
Per-comp weight = tricube(distance) × tricube(recency) × (1 − suspicion)². Per-edge headline = weighted median of log-prices. Per-edge variance = 400-sample bootstrap. Cross-edge pool = inverse-variance with hard share caps and excess redistribution. The pooled posterior shrinks toward a product-line prior via Normal-Normal conjugacy, then the CI inflates by an explicit borrowing distance (1 − max admitted edge share, λ = 0.6) so a borrowed estimate is structurally wider than a direct-comp one.
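The per-comp weight and the per-edge headline can be sketched as follows. The tricube kernel and the weight formula are as stated above; the assumption that distance and recency arrive pre-scaled to [0, 1] by their bandwidths, and the function names, are illustrative. Bootstrap variance, cross-edge pooling, and the prior shrinkage are left out.

```python
import math


def tricube(u):
    """Tricube kernel: (1 - |u|^3)^3 inside |u| < 1, zero outside."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def comp_weight(distance, recency, suspicion):
    # distance and recency are assumed pre-scaled to [0, 1] by their bandwidths.
    return tricube(distance) * tricube(recency) * (1 - suspicion) ** 2


def weighted_median_log_price(comps):
    """comps: iterable of (price, weight). Returns the headline price."""
    ranked = sorted((math.log(p), w) for p, w in comps if w > 0)
    half = sum(w for _, w in ranked) / 2
    running = 0.0
    for log_price, w in ranked:
        running += w
        if running >= half:
            # First comp at which cumulative weight reaches half the total.
            return math.exp(log_price)
    raise ValueError("no comps with positive weight")
```

Squaring (1 − suspicion) means a comp at 0.5 suspicion keeps only a quarter of its weight, so doubtful sales fade faster than a linear penalty would allow.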
Pathology checks refuse single-comp dominance (Logoman/Shield outlier z-test against leave-one-out cluster stats), high mean suspicion, and stale edges where the dominant 80% of weight is pre-540d. Cross-grade comps survive only when a learned grade-bridge multiplier admits at the product line — refused otherwise. Borrowed estimates cap at WARM confidence and demote to COLD/DARK as borrowing distance widens.
A separate calibration cohort tracks bucketed PIT histograms (A-dominant / B-dominant / C-dominant / mixed). Empirical 80% coverage on A+B-dominant ≥ 75% AND log-MAE < 0.35 is the gate that earns borrowed estimates promotion into the predictor's anchor blend.
services/borrowed_strength.py · 5 edges · tricube kernel · bootstrap CI · EB shrinkage
Seven layers, each with a receipt.
Most pricing tools stop at a single number. Cardpulse keeps going. Every variant gets a 30-day and 90-day forecast as a true 80% prediction band — not a fudged ±8%. The buy / hold / sell conviction is a function of the band's overlap with current price, plus the agent's net sentiment.
The number itself is a sum of layered drivers. A regression baseline against the recent comp ladder. A peer-borrow from comparable variants when the card itself is thin. A sentiment drift from the agent's signal aggregate. A grade-tier multiplier (top tiers amplify sentiment harder than mids). A rookie premium when the card is an RC (rookies trade on narrative arc). A rarity premium for short-print, low-pop, and 1-of-1 variants. A calendar bump when a postseason game, set release, or theatrical premiere sits inside the horizon.
Each layer carries its own contribution percentage and confidence score. Click into any forecast and walk back through them.
Layered predictor v2 · 7 drivers · 30d + 90d horizons · auditable per layer
Same listing. Same price. One row.
Adapters re-emit the same eBay / PriceCharting listing every hourly refresh, often at the same price. Without a guard, price_ticks bloats by ~24× per live listing per day — same data, no new information. The persister checks for an identical-price tick in a 90-minute window before insert and skips when matched. Real price changes still create a new row.
The same hygiene runs at every agent boundary: news_articles dedup on URL or content_hash; sentiment_signals via partial-UNIQUE on dedup_key; forecast_eval has a NOT EXISTS guard with a 23h window; the predictor cron only re-runs on variants whose ticks moved in the last 90 minutes — so same-price reposts no longer trigger re-prediction either.
app/jobs/refresh.py · _DEDUP_LOOKUP_SQL · 90-minute window
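The insert guard can be sketched in memory. This is not the contents of _DEDUP_LOOKUP_SQL: `recent_ticks` stands in for the database lookup, and matching a listing on `source` plus `source_listing_id` is an assumption about what "the same listing" means.

```python
from datetime import datetime, timedelta

DEDUP_WINDOW = timedelta(minutes=90)


def should_insert(new_tick, recent_ticks):
    """Skip a tick that repeats the same listing at the same price within 90 min.

    Ticks are dicts with 'source', 'source_listing_id', 'price_cents', 'ts'.
    """
    for old in recent_ticks:
        same_listing = (old["source"] == new_tick["source"]
                        and old["source_listing_id"] == new_tick["source_listing_id"])
        same_price = old["price_cents"] == new_tick["price_cents"]
        in_window = timedelta(0) <= new_tick["ts"] - old["ts"] <= DEDUP_WINDOW
        if same_listing and same_price and in_window:
            return False
    # A new listing, a changed price, or a stale match all create a new row.
    return True
```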
The same comp never lands twice.
Scrape pipelines drift. The same article shows up under a tracking-parameter URL one day and a clean URL the next. Cross-posts of a Reddit thread surface under different paths. Deduping on URL alone leaves the door open for the same content to enter the database three times.
Cardpulse stamps every article and every signal with a sha256 hash over a canonical content fingerprint: URL with tracking params stripped, headline, body, author, published-at. Two identical articles produce identical hashes. A unique index on the hash means even a hostile replay just hits the constraint and walks away clean.
Migration 0014 · sha256 over canonical-URL + content · partial UNIQUE indexes
url_a: reddit.com/r/x/?utm_src=feed
url_b: www.reddit.com/r/x
canonical: reddit.com/r/x (both)
sha256: be8283db8d40…0f9c (both)
on insert:
url_a → stored
url_b → UNIQUE constraint · skipped
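The canonicalize-then-hash step in the example above can be sketched with the standard library. The normalization rules here (lowercase host, strip a leading `www.`, drop `utm_`-prefixed params, trim the trailing slash) and the field separator are assumptions chosen to reproduce the url_a / url_b example; the rules in migration 0014 may differ.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit

TRACKING_PREFIXES = ("utm_",)          # assumed list of tracking params


def canonical_url(url):
    """Reduce a URL to host + path + non-tracking query."""
    # urlsplit only recognises the host when the URL has a '//' authority mark.
    parts = urlsplit(url if "//" in url else "//" + url)
    host = parts.netloc.lower().removeprefix("www.")
    query = urlencode([(k, v) for k, v in parse_qsl(parts.query)
                       if not k.lower().startswith(TRACKING_PREFIXES)])
    path = parts.path.rstrip("/")
    return host + path + ("?" + query if query else "")


def content_hash(url, headline, body, author, published_at):
    """sha256 over the canonical fingerprint fields, hex encoded."""
    fields = [canonical_url(url), headline, body, author, published_at]
    # A unit-separator character, unlikely to appear in real text, keeps
    # field boundaries unambiguous.
    payload = "\x1f".join(fields).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

Both example URLs reduce to `reddit.com/r/x`, so two otherwise identical articles hash to the same value and the second insert hits the unique index.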
When a comp lands, your screen updates.
The hourly cron triggers the refresh. Every adapter pass yields ticks that get persisted, and every persisted tick fires a broadcast onto a Redis pub/sub channel. The web app keeps a WebSocket open to that channel. The tick lands in your browser within hundreds of milliseconds of the database write — no polling, no manual refresh.
The same signal triggers the alert engine: any active rule whose variant just got a new tick gets re-evaluated. If the rule trips, an event lands in your inbox with a 24-hour debounce so a flapping price doesn't spam you.
Redis pub/sub · WebSocket fan-out · 24h alert debounce
end-to-end latency: ~250ms
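The 24-hour debounce is the simplest part of the alert path to illustrate. A minimal in-memory sketch, assuming a hypothetical `AlertDebouncer` keyed by rule id (a production version would presumably persist the last-sent time rather than hold it in a dict):

```python
from datetime import datetime, timedelta

DEBOUNCE = timedelta(hours=24)


class AlertDebouncer:
    def __init__(self):
        self._last_sent = {}   # rule_id -> datetime of the last delivered alert

    def should_send(self, rule_id, now):
        """True if the rule may alert now; records the send when it does."""
        last = self._last_sent.get(rule_id)
        if last is not None and now - last < DEBOUNCE:
            return False       # still inside the quiet period
        self._last_sent[rule_id] = now
        return True
```

Because a suppressed trip does not reset the timer, a price that flaps all day produces one alert per 24 hours rather than none at all.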
Boring tech, applied carefully. No buzzword soup.
Cardpulse runs on technology that's well-understood, broadly deployed, and unlikely to disappear in a year. Every choice in the stack is one we'd defend in a code review.
- Backend · Python 3.12: FastAPI 0.136 · Pydantic 2.13 · SQLAlchemy 2.0 (async) · LangGraph
- Database · Postgres 17: TimescaleDB · pgvector · pgcrypto · trigram search
- AI · Anthropic Claude: Haiku 4.5 (classifier) · Sonnet 4.6 (critic) · Claude Agent SDK
- Scraping · Patchright: stealth Playwright fork · headless Chromium · per-source rate policies
- Frontend · Next.js 16: React 19 · React Query · Tailwind v4 · shadcn/new-york · Turbopack
- Infra · Local-first dev: Docker Compose · Cloudflare tunnel · Arq cron worker · Redis pub/sub
Want the same story without the engineering?
How it works walks through the same machinery from a collector's seat — comp ladders, outlier discipline, sentiment as a forward lever. Same content, different lens.