About this research

Geometry of Poker is a computational research visualization. It maps poker states to feature vectors, embeds them in three dimensions, and renders an explorable point cloud. This page summarizes methodology without overstating conclusions.

Research question

When hero-centric poker states are mapped to a fixed feature vector built from exact combinatorics and engineered summaries, does unsupervised dimensionality reduction reveal coherent low-dimensional structure that can be explored interactively?

We do not presuppose a geometric shape. Structure must emerge from data.

Poker state

Each state consists of exactly two hero hole cards and zero, three, four, or five community cards (preflop, flop, turn, or river).

Validation rejects duplicate cards, invalid lengths, and hero-on-board collisions.
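The three rejection rules above can be sketched as follows. This is a minimal illustration, not the project's validator: the two-character card encoding (rank then suit, such as "As") and the function name validate_state are assumptions made for the example.

```python
RANKS = "23456789TJQKA"
SUITS = "cdhs"
VALID_BOARD_LENGTHS = {0, 3, 4, 5}  # preflop, flop, turn, river


def validate_state(hero, board):
    """Return a list of validation errors; an empty list means the state is valid.

    Cards are assumed to be two-character strings such as "As" or "7h".
    """
    errors = []
    for card in list(hero) + list(board):
        if len(card) != 2 or card[0] not in RANKS or card[1] not in SUITS:
            errors.append(f"unrecognized card: {card!r}")
    if len(hero) != 2:
        errors.append(f"hero must hold exactly 2 cards, got {len(hero)}")
    if len(board) not in VALID_BOARD_LENGTHS:
        errors.append(f"board must have 0, 3, 4, or 5 cards, got {len(board)}")
    if len(set(hero)) != len(hero) or len(set(board)) != len(board):
        errors.append("duplicate card within hero or board")
    if set(hero) & set(board):
        errors.append("hero card also appears on the board")
    return errors
```

Returning every error at once, rather than stopping at the first, lets a caller report all problems with a hand-entered state in one pass.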

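The postflop state space is far too large to enumerate (about 26 million hero-plus-flop card combinations and about 2.8 billion on the river, before suit symmetries), so we draw a seeded random sample per street.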
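A minimal sketch of seeded per-street sampling, assuming a two-character card encoding and a hypothetical sample_states helper. The project's actual sampler, and any deduplication or stratification it applies, are not described here.

```python
import random

DECK = [rank + suit for rank in "23456789TJQKA" for suit in "cdhs"]
BOARD_CARDS = {"preflop": 0, "flop": 3, "turn": 4, "river": 5}


def sample_states(street, count, seed):
    """Draw `count` random (hero, board) states for one street from a seeded RNG."""
    rng = random.Random(seed)  # private generator, independent of global random state
    n_board = BOARD_CARDS[street]
    states = []
    for _ in range(count):
        # Sampling without replacement guarantees no duplicate or colliding cards.
        cards = rng.sample(DECK, 2 + n_board)
        hero = tuple(sorted(cards[:2]))
        board = tuple(sorted(cards[2:]))
        states.append((hero, board))
    return states
```

Calling sample_states("flop", 1000, seed=42) twice yields the same list, which is the property that makes a recorded seed sufficient to regenerate a dataset. This sketch does not remove repeated states across draws.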

Feature vector (66 dimensions, compact mode)

Exact combinatorial outputs from the C++ poker-calculations core: equity vs a uniform random villain, hand category, runout quantiles, vulnerability (pNuts, pDominated).

Derived deterministic features: board texture, draw counts via single-card enumeration.

Summary statistics: card-removal gradient aggregates, category-transition entropy (flop only).
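As an illustration of the entropy summary, the sketch below computes the Shannon entropy of a list of hand-category outcomes. It assumes that "category transition" refers to the distribution of categories the hero's hand could move to on later cards, and that entropy is measured in bits; both are assumptions, as is the function name.

```python
import math
from collections import Counter


def transition_entropy(categories):
    """Shannon entropy (in bits) of the empirical distribution of category labels."""
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Sum of p * log2(1/p) over the observed categories.
    return sum((c / total) * math.log2(total / c) for c in counts.values())
```

A hand whose category never changes across the enumerated cards scores 0 bits, while a hand split evenly between two outcome categories scores 1 bit.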

Street-aware masking: feature groups that are unavailable on a given street are set to zero and accompanied by explicit availability flags; they are never NaN.
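The masking rule can be sketched as below. The group table is hypothetical: only the flop-only status of the entropy feature comes from the text above, and the group names, widths, and flag layout (one flag appended after each group) are assumptions for illustration.

```python
import math

# (group name, width, streets on which the group is available) -- hypothetical table.
GROUPS = [
    ("equity", 1, {"preflop", "flop", "turn", "river"}),
    ("transition_entropy", 1, {"flop"}),
]


def mask_features(street, raw):
    """Flatten feature groups into a fixed-length vector with availability flags."""
    vector = []
    for name, width, streets in GROUPS:
        if street in streets:
            values = [float(v) for v in raw[name]]
            if len(values) != width or any(math.isnan(v) for v in values):
                raise ValueError(f"bad values for available group {name!r}")
            vector.extend(values)
            vector.append(1.0)  # availability flag: group is present
        else:
            vector.extend([0.0] * width)  # zero-fill instead of NaN
            vector.append(0.0)  # availability flag: group is masked
    return vector
```

The flag lets a downstream model distinguish a masked zero from a feature whose value happens to be zero, and keeping NaN out of the vector means scaling and PCA need no special handling.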

Embedding pipeline

Per street: StandardScaler → PCA (95% variance target) → UMAP (3D) → HDBSCAN clustering.
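To make the 95% variance target concrete, the sketch below picks the smallest number of leading principal components whose cumulative explained variance reaches a target fraction. It is a standard-library illustration of the selection rule only, with a hypothetical function name. The name StandardScaler suggests scikit-learn, whose PCA accepts a fractional n_components for this purpose, but that the pipeline relies on it is an assumption.

```python
def components_for_variance(explained_variance, target=0.95):
    """Smallest k such that the first k components explain at least `target` of variance.

    `explained_variance` must be sorted in descending order, as PCA produces it.
    """
    total = sum(explained_variance)
    if total <= 0:
        raise ValueError("total variance must be positive")
    cumulative = 0.0
    for k, variance in enumerate(explained_variance, start=1):
        cumulative += variance
        if cumulative / total >= target:
            return k
    return len(explained_variance)
```

Because the cut is chosen per street, the number of retained components can differ between preflop, flop, turn, and river datasets.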

UMAP parameters and random seed are recorded in viewer-manifest.json.

Evaluation metrics (trustworthiness, kNN overlap) are in analysis-report.md per street.
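One common way to define kNN overlap is the average fraction of each point's k nearest neighbors in the input space that remain among its k nearest neighbors in the embedding. The brute-force sketch below follows that definition, which may differ in detail from the one used in the report; the function names are hypothetical.

```python
import math


def knn_indices(points, k):
    """Set of indices of the k nearest neighbors of each point, excluding itself."""
    result = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        result.append(set(others[:k]))
    return result


def knn_overlap(high_dim, low_dim, k):
    """Mean fraction of k-nearest-neighbor sets shared between two spaces (0 to 1)."""
    high = knn_indices(high_dim, k)
    low = knn_indices(low_dim, k)
    shared = sum(len(h & l) for h, l in zip(high, low))
    return shared / (k * len(high))
```

A score of 1.0 means every local neighborhood is preserved exactly. The quadratic cost of this version is only suitable for small samples.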

Manual hand projection

User-selected cards are validated, featurized, and projected into the learned geometry.

Primary path: scale → PCA → UMAP transform when available; fallback: kNN interpolation in PCA space.

The viewer shows the projection method, neighbor IDs, and neighbor distances. Treat a projection whose nearest neighbors are distant as low confidence.
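The fallback path can be sketched as an inverse-distance-weighted average of the 3D coordinates of the nearest training points in PCA space. The weighting scheme, the default k, and the function name knn_project are assumptions; the text above states only that the fallback interpolates over kNN in PCA space.

```python
import math


def knn_project(query_pca, train_pca, train_xyz, k=5, eps=1e-12):
    """Place a new point in the 3D embedding from its k nearest neighbors in PCA space.

    Returns (xyz, neighbors), where neighbors is a list of (training index, distance).
    """
    nearest = sorted((math.dist(query_pca, p), i) for i, p in enumerate(train_pca))[:k]
    # Closer neighbors get larger weights; eps avoids division by zero on exact matches.
    weights = [1.0 / (d + eps) for d, _ in nearest]
    total = sum(weights)
    xyz = tuple(
        sum(w * train_xyz[i][axis] for w, (_, i) in zip(weights, nearest)) / total
        for axis in range(3)
    )
    neighbors = [(i, d) for d, i in nearest]
    return xyz, neighbors
```

Returning the neighbor indices and distances alongside the coordinates gives the caller what it needs to display them and to flag projections whose neighbors are far away.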

What we claim vs. what we do not

We claim

Deterministic feature extraction from a documented schema
Reproducible sampling and embedding given recorded seeds
Quantitative embedding diagnostics alongside visualization
Interactive exploration via a single GPU point cloud

We do not claim

Clusters prove optimal poker strategy
UMAP distances are perfect strategic distances
Uniform-villain equity equals game-theoretic EV
Demo/synthetic embeddings represent real strategic structure

Four kinds of evidence

Always distinguish these when interpreting the visualization:

Exact combinatorial outputs: equityVsRandom, categoryIndex, runout quantiles
Engineered features: boardConnectivityScore, removalGradientMean
Dimensionality-reduction artifacts: UMAP xyz, HDBSCAN cluster id
Interpretive observations: human cluster descriptions — not theorems

Reproducibility

Dataset seed: manifest.json per street
Feature schema version: retained-features.json
UMAP seed and hyperparameters: viewer-manifest.json
Report the UMAP random_state with any published figure, because the embedding is seed-sensitive

Current data status

Artifacts may be generated from demo synthetic features for pipeline validation. Demo embeddings do not represent strategic poker structure. Real datasets require native poker-calculations feature extraction via pnpm generate:all.

Full documentation (repository)

docs/research-methodology.md
docs/performance-analysis.md
docs/manifold-findings.md
docs/limitations.md
docs/cppcon-talk-outline.md
docs/quant-firm-project-summary.md