About this research
Geometry of Poker is a computational research visualization. It maps poker states to feature vectors, embeds them in three dimensions, and renders an explorable point cloud. This page summarizes methodology without overstating conclusions.
Research question
When hero-centric poker states are mapped to a fixed feature vector built from exact combinatorics and engineered summaries, does unsupervised dimensionality reduction reveal coherent low-dimensional structure that can be explored interactively?
We do not presuppose a geometric shape. Structure must emerge from data.
Poker state
Each state: exactly two hero hole cards and zero, three, four, or five community cards (preflop through river).
Validation rejects duplicate cards, invalid lengths, and hero-on-board collisions.
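The three rejection rules above can be sketched in a few lines. This is an illustrative sketch only, not the project's validator: the two-character card encoding (rank then suit, such as "As") and the name validate_state are assumptions made for the example.

```python
RANKS = "23456789TJQKA"
SUITS = "cdhs"
VALID_BOARD_LENGTHS = {0, 3, 4, 5}  # preflop, flop, turn, river


def validate_state(hero, board):
    """Return a list of validation errors; an empty list means the state is valid.

    Cards are assumed to be two-character strings such as "As" or "7h".
    """
    errors = []
    if len(hero) != 2:
        errors.append("hero must hold exactly two cards")
    if len(board) not in VALID_BOARD_LENGTHS:
        errors.append("board must have 0, 3, 4, or 5 cards")
    for card in list(hero) + list(board):
        if len(card) != 2 or card[0] not in RANKS or card[1] not in SUITS:
            errors.append(f"unknown card: {card!r}")
    # Duplicates inside the hero hand or inside the board.
    if len(set(hero)) != len(hero) or len(set(board)) != len(board):
        errors.append("duplicate card")
    # A hero card that also appears among the community cards.
    if set(hero) & set(board):
        errors.append("hero card also appears on the board")
    return errors
```

Returning every error at once, rather than raising on the first, lets a viewer show all problems with a manually entered hand in a single pass.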
The postflop state space is far too large to enumerate, so we use seeded random sampling per street.
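A minimal sketch of seeded per-street sampling, assuming a 52-card deck encoded as rank-plus-suit strings. The function name sample_states and the street labels are hypothetical, and the project's sampler may differ, for example in how it treats repeated states.

```python
import random

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [r + s for r in RANKS for s in SUITS]
BOARD_CARDS = {"preflop": 0, "flop": 3, "turn": 4, "river": 5}


def sample_states(street, count, seed):
    """Draw `count` random (hero, board) states for one street from a seeded RNG."""
    rng = random.Random(seed)  # private generator, so global random state is untouched
    n_board = BOARD_CARDS[street]
    states = []
    for _ in range(count):
        # Sampling without replacement guarantees no duplicate or colliding cards.
        cards = rng.sample(DECK, 2 + n_board)
        states.append((tuple(cards[:2]), tuple(cards[2:])))
    return states
```

Calling sample_states("flop", 1000, seed) twice with the same seed yields the same list, which is what makes a recorded seed sufficient to regenerate a dataset. This sketch does not deduplicate states across draws.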
Feature vector (66 dimensions, compact mode)
Exact combinatorial outputs from the C++ poker-calculations core: equity vs a uniform random villain, hand category, runout quantiles, vulnerability (pNuts, pDominated).
Derived deterministic features: board texture, draw counts via single-card enumeration.
Summary statistics: card-removal gradient aggregates, category-transition entropy (flop only).
Street-aware masking: unavailable feature groups are zero-filled and paired with explicit availability flags; values are never NaN.
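The masking rule can be illustrated with a toy two-group layout. The group names, widths, and the placement of the flag directly after each group are assumptions for this sketch; the real 66-dimensional schema is defined in the repository, not here. The only detail taken from this page is that the category-transition entropy group exists on the flop alone.

```python
import math

# Hypothetical layout: group name -> (width, streets on which the group is defined).
GROUPS = {
    "equity": (1, {"preflop", "flop", "turn", "river"}),
    "transition_entropy": (1, {"flop"}),  # flop only, per the feature list above
}


def assemble(street, raw):
    """Concatenate groups, zero-filling unavailable ones and appending one flag each."""
    vector = []
    for name, (width, streets) in GROUPS.items():
        if street in streets:
            values = [float(v) for v in raw[name]]
            if len(values) != width or any(math.isnan(v) for v in values):
                raise ValueError(f"bad values for available group {name!r}")
            vector.extend(values + [1.0])  # data followed by availability flag = 1
        else:
            vector.extend([0.0] * width + [0.0])  # zero fill, availability flag = 0
    return vector
```

The flag lets a downstream model or reader tell a true zero from a masked value, and rejecting NaN at assembly time keeps it out of the scaler and PCA stages.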
Embedding pipeline
Per street: StandardScaler → PCA (95% variance target) → UMAP (3D) → HDBSCAN clustering.
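One plausible reading of the "95% variance target" is: keep the smallest number of leading principal components whose explained-variance ratios sum to at least 0.95. The helper below sketches that selection rule in plain Python. The name components_for_variance is hypothetical, and the pipeline may instead delegate this choice to its PCA library.

```python
def components_for_variance(explained_variance_ratio, target=0.95):
    """Return the smallest number of leading components whose cumulative
    explained-variance ratio reaches `target`.

    `explained_variance_ratio` is assumed to be sorted in descending order,
    as PCA implementations conventionally report it.
    """
    cumulative = 0.0
    for n, ratio in enumerate(explained_variance_ratio, start=1):
        cumulative += ratio
        if cumulative >= target:
            return n
    # Target never reached (for example, a truncated list): keep everything.
    return len(explained_variance_ratio)
```

With ratios of 0.5, 0.3, 0.1, 0.06, and 0.04, the first three components explain about 90% of the variance, so four are kept.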
UMAP parameters and random seed are recorded in viewer-manifest.json.
Evaluation metrics (trustworthiness, kNN overlap) are in analysis-report.md per street.
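This page names kNN overlap without defining it. A common definition, assumed here, is the mean fraction of each point's k nearest neighbors that are shared between the high-dimensional space and the embedding. The brute-force sketch below illustrates that definition; the report's exact formula may differ, and the function names are hypothetical.

```python
import math


def knn_indices(points, k):
    """For each point, return the set of indices of its k nearest other points."""
    result = []
    for i, p in enumerate(points):
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        result.append(set(order[:k]))
    return result


def knn_overlap(high, low, k):
    """Mean fraction of k nearest neighbors shared between two spaces.

    `high` and `low` list the same points in the same order, for example
    PCA coordinates and 3D UMAP coordinates. 1.0 means every neighborhood
    is preserved; 0.0 means none are.
    """
    hi = knn_indices(high, k)
    lo = knn_indices(low, k)
    return sum(len(a & b) for a, b in zip(hi, lo)) / (k * len(high))
```

The brute-force neighbor search is quadratic in the number of points, so it suits small samples; a spatial index would be needed at dataset scale.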
Manual hand projection
User-selected cards are validated, featurized, and projected into the learned geometry.
Primary path: scale → PCA → UMAP transform when available; fallback: kNN interpolation in PCA space.
The viewer shows the projection method, neighbor IDs, and neighbor distances. Treat a projection whose nearest neighbors are distant as low confidence.
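The fallback path can be sketched as inverse-distance-weighted interpolation: find the k nearest training points in PCA space, then average their 3D embedding coordinates with weights proportional to 1/distance. The weighting scheme, the default k, and the name knn_project are assumptions for illustration; this page does not specify them.

```python
import math


def knn_project(query_pca, train_pca, train_xyz, k=5, eps=1e-12):
    """Interpolate a 3D position for `query_pca` from its k nearest neighbors.

    Returns (xyz, neighbor_ids, distances), with neighbors ordered from
    nearest to farthest in PCA space.
    """
    nearest = sorted(
        (math.dist(query_pca, p), i) for i, p in enumerate(train_pca)
    )[:k]
    # eps avoids division by zero when the query coincides with a training point.
    weights = [1.0 / (d + eps) for d, _ in nearest]
    total = sum(weights)
    xyz = tuple(
        sum(w * train_xyz[i][axis] for w, (_, i) in zip(weights, nearest)) / total
        for axis in range(3)
    )
    return xyz, [i for _, i in nearest], [d for d, _ in nearest]
```

Returning the neighbor IDs and distances alongside the position gives a caller what it needs to display them and to flag projections whose neighbors are all far away.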
What we claim vs. what we do not
We claim
- Deterministic feature extraction from a documented schema
- Reproducible sampling and embedding given recorded seeds
- Quantitative embedding diagnostics alongside visualization
- Interactive exploration via a single GPU point cloud
We do not claim
- Clusters prove optimal poker strategy
- UMAP distances are perfect strategic distances
- Uniform-villain equity equals game-theoretic EV
- Demo/synthetic embeddings represent real strategic structure
Four kinds of evidence
Always distinguish these when interpreting the visualization:
- Exact combinatorial outputs: equityVsRandom, categoryIndex, runout quantiles
- Engineered features: boardConnectivityScore, removalGradientMean
- Dimensionality-reduction artifacts: UMAP xyz, HDBSCAN cluster id
- Interpretive observations: human cluster descriptions, not theorems
Reproducibility
- Dataset seed: manifest.json per street
- Feature schema version: retained-features.json
- UMAP seed and hyperparameters: viewer-manifest.json
- Report the UMAP random_state with any published figure, because the embedding is seed-sensitive
Current data status
Artifacts may be generated from demo synthetic features for pipeline validation. Demo embeddings do not represent strategic poker structure. Real datasets require native poker-calculations feature extraction via pnpm generate:all.
Full documentation (repository)
- docs/research-methodology.md
- docs/performance-analysis.md
- docs/manifold-findings.md
- docs/limitations.md
- docs/cppcon-talk-outline.md
- docs/quant-firm-project-summary.md