About this research

Geometry of Poker is a computational research visualization. It maps poker states to feature vectors, embeds them in three dimensions, and renders an explorable point cloud. This page summarizes methodology without overstating conclusions.

Research question

When hero-centric poker states are mapped to a fixed feature vector built from exact combinatorics and engineered summaries, does unsupervised dimensionality reduction reveal coherent low-dimensional structure that can be explored interactively?

We do not presuppose a geometric shape. Structure must emerge from data.

Poker state

Each state consists of exactly two hero hole cards plus zero, three, four, or five community cards (preflop, flop, turn, or river).

Validation rejects duplicate cards, invalid lengths, and hero-on-board collisions.
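The three rejection rules can be sketched as follows. This is a minimal illustration, not the project's validator: the two-character card encoding (rank then suit, such as "As") and the function name validate_state are assumptions made for the example.

```python
RANKS = "23456789TJQKA"
SUITS = "cdhs"
VALID_BOARD_SIZES = {0, 3, 4, 5}  # preflop, flop, turn, river


def validate_state(hero, board):
    """Return a list of error strings; an empty list means the state is valid."""
    errors = []
    if len(hero) != 2:
        errors.append("hero must have exactly two hole cards")
    if len(board) not in VALID_BOARD_SIZES:
        errors.append("board must have 0, 3, 4, or 5 cards")
    # Assumed encoding: one rank character followed by one suit character.
    for card in list(hero) + list(board):
        if len(card) != 2 or card[0] not in RANKS or card[1] not in SUITS:
            errors.append(f"unrecognized card: {card!r}")
    # Duplicates within the hole cards or within the board.
    if len(set(hero)) != len(hero) or len(set(board)) != len(board):
        errors.append("duplicate card")
    # The same card in hero's hand and on the board.
    if set(hero) & set(board):
        errors.append("hero card also appears on the board")
    return errors
```

Returning every error rather than stopping at the first one lets a card picker report all problems with a selection at once.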

The postflop state space is far too large to enumerate exhaustively, so we draw a seeded random sample per street.
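To give a sense of scale and of what seeded sampling means in practice, here is a small sketch. The function names, the card encoding, and the choice to count boards as unordered sets without suit-symmetry reduction are all assumptions for illustration; the project's sampler may differ.

```python
import random
from math import comb

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [r + s for r in RANKS for s in SUITS]
BOARD_SIZE = {"preflop": 0, "flop": 3, "turn": 4, "river": 5}


def count_states(street):
    """Hero/board combinations, treating the board as an unordered set."""
    return comb(52, 2) * comb(50, BOARD_SIZE[street])


def sample_states(street, n, seed):
    """Draw n random (hero, board) states for one street from a seeded RNG."""
    rng = random.Random(seed)  # local RNG, so the seed fully determines the output
    k = BOARD_SIZE[street]
    states = []
    for _ in range(n):
        cards = rng.sample(DECK, 2 + k)  # without replacement: no card collisions
        states.append((tuple(cards[:2]), tuple(cards[2:])))
    return states
```

Under these counting assumptions, count_states("flop") is 25,989,600, and the turn and river counts are larger still. Calling sample_states twice with the same street, size, and seed returns identical states, which is the property that makes a recorded seed useful. This sketch does not deduplicate across draws.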

Feature vector (66 dimensions, compact mode)

Exact combinatorial outputs from the C++ poker-calculations core: equity vs a uniform random villain, hand category, runout quantiles, vulnerability (pNuts, pDominated).

Derived deterministic features: board texture, draw counts via single-card enumeration.

Summary statistics: card-removal gradient aggregates, category-transition entropy (flop only).

Street-aware masking: feature groups that are unavailable on a given street are set to zero and paired with explicit availability flags; they are never NaN.
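The masking idea can be illustrated with a toy layout. The group names, widths, and street assignments below are hypothetical (only "transition entropy is flop only" comes from this page); the real schema lives in the repository.

```python
import math

# Hypothetical groups: name -> (width, streets on which the group is defined).
GROUPS = {
    "equity": (2, {"preflop", "flop", "turn", "river"}),
    "board_texture": (3, {"flop", "turn", "river"}),
    "transition_entropy": (1, {"flop"}),
}


def mask_features(street, raw):
    """Build a fixed-length vector: each group's values, then its availability flag."""
    vector = []
    for name, (width, streets) in GROUPS.items():
        available = street in streets
        # Unavailable groups are filled with zeros instead of NaN.
        values = raw[name] if available else [0.0] * width
        if len(values) != width:
            raise ValueError(f"group {name!r} expects {width} values")
        if any(math.isnan(v) for v in values):
            raise ValueError(f"group {name!r} contains NaN")
        vector.extend(float(v) for v in values)
        # The flag lets downstream code tell a masked zero from a true zero.
        vector.append(1.0 if available else 0.0)
    return vector
```

The vector length is the same on every street, so one scaler and one PCA can be fitted per street without special-casing missing columns.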

Embedding pipeline

Per street: StandardScaler → PCA (95% variance target) → UMAP (3D) → HDBSCAN clustering.
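The first two stages can be sketched in plain Python to show what they compute. This is an illustration under assumptions, not the project's code: standardize mimics a z-score scaler with population standard deviation, and components_for_variance picks the smallest number of principal components whose cumulative variance share reaches the target. In scikit-learn the latter roughly corresponds to passing a fraction such as PCA(n_components=0.95). UMAP and HDBSCAN are not reproduced here.

```python
from statistics import fmean, pstdev


def standardize(rows):
    """Z-score each column; constant columns are left centered with scale 1."""
    columns = list(zip(*rows))
    means = [fmean(c) for c in columns]
    scales = [pstdev(c) or 1.0 for c in columns]  # avoid dividing by zero
    return [[(v - m) / s for v, m, s in zip(row, means, scales)] for row in rows]


def components_for_variance(explained_variance, target=0.95):
    """Smallest k such that the top-k components reach the target variance share."""
    total = sum(explained_variance)
    if total <= 0:
        raise ValueError("explained variance must sum to a positive value")
    ordered = sorted(explained_variance, reverse=True)
    running = 0.0
    for k, variance in enumerate(ordered, start=1):
        running += variance
        if running / total >= target:
            return k
    return len(ordered)
```

For example, a variance spectrum of [8, 1, 0.6, 0.4] needs three components, because the first two explain only 90% of the total.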

UMAP parameters and random seed are recorded in viewer-manifest.json.

Evaluation metrics (trustworthiness, kNN overlap) are in analysis-report.md per street.
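As an illustration of the second metric, kNN overlap can be computed as the average fraction of each point's k nearest neighbors that are shared between the high-dimensional space and the embedding. The brute-force sketch below assumes Euclidean distance in both spaces and uses hypothetical function names; the definition used in analysis-report.md may differ in detail. For trustworthiness, scikit-learn provides sklearn.manifold.trustworthiness.

```python
import math


def knn_indices(points, k):
    """Indices of the k nearest other points for every point (brute force)."""
    result = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        result.append(set(others[:k]))
    return result


def knn_overlap(high_dim, low_dim, k):
    """Mean fraction of each point's k nearest neighbors shared by both spaces."""
    high = knn_indices(high_dim, k)
    low = knn_indices(low_dim, k)
    shared = sum(len(h & l) for h, l in zip(high, low))
    return shared / (k * len(high_dim))
```

A value of 1.0 means every local neighborhood is preserved exactly, and values near 0 mean the embedding has scrambled local structure. The brute-force search is quadratic in the number of points, so it suits spot checks rather than full datasets.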

Manual hand projection

User-selected cards are validated, featurized, and projected into the learned geometry.

Primary path: scale → PCA → UMAP transform when available; fallback: kNN interpolation in PCA space.

The viewer shows the projection method, neighbor IDs, and neighbor distances. Treat a projection whose nearest neighbors are distant as low confidence.
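One plausible form of the kNN fallback is inverse-distance weighting: find the k nearest training points in PCA space and average their 3D embedding coordinates, weighting closer neighbors more heavily. The weighting scheme, the default k, and the function name below are assumptions for illustration; this page does not specify them.

```python
import math


def knn_project(query_pca, train_pca, train_xyz, k=5, eps=1e-12):
    """Inverse-distance-weighted 3D position from the k nearest PCA-space neighbors."""
    nearest = sorted(
        (math.dist(query_pca, p), i) for i, p in enumerate(train_pca)
    )[:k]
    # eps keeps the weight finite when the query coincides with a training point.
    weights = [1.0 / (d + eps) for d, _ in nearest]
    total = sum(weights)
    xyz = tuple(
        sum(w * train_xyz[i][axis] for w, (_, i) in zip(weights, nearest)) / total
        for axis in range(3)
    )
    neighbor_ids = [i for _, i in nearest]
    neighbor_distances = [d for d, _ in nearest]
    return xyz, neighbor_ids, neighbor_distances
```

Returning the neighbor IDs and distances alongside the position gives the caller what it needs to flag a projection as low confidence when those distances are large.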

What we claim vs. what we do not

We claim

  • Deterministic feature extraction from a documented schema
  • Reproducible sampling and embedding given recorded seeds
  • Quantitative embedding diagnostics alongside visualization
  • Interactive exploration via a single GPU point cloud

We do not claim

  • Clusters prove optimal poker strategy
  • UMAP distances are perfect strategic distances
  • Uniform-villain equity equals game-theoretic EV
  • Demo/synthetic embeddings represent real strategic structure

Four kinds of evidence

Always distinguish these when interpreting the visualization:

  • Exact combinatorial outputs: equityVsRandom, categoryIndex, runout quantiles
  • Engineered features: boardConnectivityScore, removalGradientMean
  • Dimensionality-reduction artifacts: UMAP xyz, HDBSCAN cluster id
  • Interpretive observations: human cluster descriptions, not theorems

Reproducibility

  • Dataset seed: manifest.json per street
  • Feature schema version: retained-features.json
  • UMAP seed and hyperparameters: viewer-manifest.json
  • Report the UMAP random_state with any published figure, because the embedding is seed-sensitive

Current data status

Artifacts may be generated from demo synthetic features for pipeline validation. Demo embeddings do not represent strategic poker structure. Real datasets require native poker-calculations feature extraction via pnpm generate:all.

Full documentation (repository)

  • docs/research-methodology.md
  • docs/performance-analysis.md
  • docs/manifold-findings.md
  • docs/limitations.md
  • docs/cppcon-talk-outline.md
  • docs/quant-firm-project-summary.md