Link Detection
How KTX's relationship detection performs on real-world schemas.
KTX infers foreign key relationships between tables even when the database declares no primary keys or foreign key constraints. This is critical for analytics warehouses, where constraints are rarely enforced. This page documents the methodology, scoring pipeline, and a reproducible benchmark you can run yourself.
What this measures
Most analytics warehouses — Snowflake, BigQuery, Redshift — don't enforce referential integrity constraints. Tables like `fct_product_events` reference `dim_accounts` by convention (`account_id` → `id`), but nothing in the schema says so.
KTX's relationship detection discovers these links automatically. The benchmark measures how accurately it recovers known foreign key relationships from a schema with all declared constraints removed — the hardest operating mode.
Metrics tracked:
- Accepted — relationships scored above the accept threshold (default 0.85) and written to the project manifest
- Review — relationships scored between the review threshold (0.55) and accept threshold, flagged for human review
- Rejected — relationships scored below the review threshold
- Skipped — relationships not evaluated (e.g., filtered by candidate limits)
Methodology
Detection pipeline
Relationship detection runs as a multi-stage pipeline during `ktx dev scan`:
- Candidate generation — scans the schema for potential FK relationships using multiple heuristics: exact column name matches, normalized table name matching, name inflection (singular/plural), column suffix patterns (`_id`, `_key`, `_code`, `_uuid`), self-references (`parent_id`, `manager_id`), and optionally embedding similarity and LLM proposals.
- Column profiling — samples up to 10,000 rows per column (configurable via `profile_sample_rows`) to collect statistics: row counts, null rates, distinct value counts, uniqueness ratios, sample values, and text length ranges.
- Validation — tests each candidate relationship against actual data by measuring target uniqueness, source coverage, violation ratio, and value overlap between child and parent columns.
- Scoring — combines seven weighted signals into a confidence score:
| Signal | Weight | What it captures |
|---|---|---|
| Name similarity | 0.24 | How closely column/table names match FK conventions |
| Value overlap | 0.22 | What percentage of FK values exist in the PK column |
| Profile uniqueness | 0.22 | How unique the target column values are |
| Type compatibility | 0.10 | Whether data types are compatible (hard gate — score is 0 if incompatible) |
| Embedding similarity | 0.10 | Semantic similarity between column names |
| Profile null rate | 0.08 | Presence of non-null values |
| Structural prior | 0.04 | Baseline structural hints from schema conventions |
Each signal is normalized to [0, 1], multiplied by its weight, and summed. The final confidence is 0.56 + (weighted_sum × 0.65), clamped to [0, 1].
- Graph resolution — resolves conflicts when multiple candidates target the same column, detects primary keys (by name pattern and validation), and classifies each relationship into `accepted`, `review`, or `rejected` based on thresholds.
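The scoring formula above can be sketched in code. The signal names and weights mirror the table, and the hard gate on type compatibility is applied before the weighted sum; the function itself is a hypothetical illustration, not KTX's actual implementation:

```typescript
// Signal weights from the table above (they sum to 1.0).
const WEIGHTS: Record<string, number> = {
  nameSimilarity: 0.24,
  valueOverlap: 0.22,
  profileUniqueness: 0.22,
  typeCompatibility: 0.1,
  embeddingSimilarity: 0.1,
  profileNullRate: 0.08,
  structuralPrior: 0.04,
};

// Each signal is assumed to already be normalized to [0, 1].
function confidence(signals: Record<string, number>): number {
  // Hard gate: incompatible types zero the score outright.
  if ((signals.typeCompatibility ?? 0) === 0) return 0;
  let weightedSum = 0;
  for (const [name, weight] of Object.entries(WEIGHTS)) {
    weightedSum += weight * (signals[name] ?? 0);
  }
  // Final confidence: 0.56 + (weighted_sum × 0.65), clamped to [0, 1].
  const raw = 0.56 + weightedSum * 0.65;
  return Math.min(1, Math.max(0, raw));
}
```

Note that with every signal at zero (but compatible types) the formula still floors at 0.56 before clamping, which is why the type gate has to short-circuit to zero rather than rely on the weighted sum.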
Threshold configuration
```yaml
scan:
  relationships:
    accept_threshold: 0.85
    review_threshold: 0.55
```

Relationships scoring above `accept_threshold` are automatically accepted into the project manifest. Those between `review_threshold` and `accept_threshold` are flagged for analyst review. Below `review_threshold`, they're rejected.
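Classification against these thresholds reduces to a pair of comparisons. A minimal sketch, assuming inclusive boundaries (KTX's exact handling of a score landing precisely on 0.85 or 0.55 may differ):

```typescript
type Verdict = "accepted" | "review" | "rejected";

// Hypothetical classifier using the default thresholds from the config above.
function classify(confidence: number, accept = 0.85, review = 0.55): Verdict {
  if (confidence >= accept) return "accepted";
  if (confidence >= review) return "review";
  return "rejected";
}

// classify(0.9) → "accepted"; classify(0.6) → "review"; classify(0.4) → "rejected"
```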
Test fixture
The benchmark uses the Orbit-style product warehouse — a synthetic schema modeled after a real SaaS analytics warehouse with all declared constraints removed. The fixture is a SQLite database with 6 tables:
| Table | Role | Estimated rows |
|---|---|---|
| dim_accounts | Dimension | 3 |
| dim_users | Dimension | 4 |
| dim_workspaces | Dimension | 4 |
| fct_product_events | Fact | 5 |
| fct_invoices | Fact | 3 |
| support_tickets | Fact | 4 |
Ground truth: 6 primary keys (one `id` column per table) and 9 foreign key relationships, all `many_to_one`:
| Source column | Target |
|---|---|
| dim_users.account_id | dim_accounts.id |
| dim_workspaces.account_id | dim_accounts.id |
| dim_workspaces.user_id | dim_users.id |
| fct_product_events.account_id | dim_accounts.id |
| fct_product_events.user_id | dim_users.id |
| fct_product_events.workspace_id | dim_workspaces.id |
| fct_invoices.account_id | dim_accounts.id |
| support_tickets.account_id | dim_accounts.id |
| support_tickets.user_id | dim_users.id |
The fixture runs in multiple modes to isolate the contribution of each pipeline stage: with LLM disabled, profiling disabled, validation disabled, and embeddings disabled.
Results
Results for the default configuration will be added after the benchmark run is finalized.
Reproducing the benchmark
Prerequisites
- Node.js 22+
- pnpm
- The KTX repository cloned and dependencies installed (`pnpm install`)
Running
From the repository root:
```shell
pnpm run relationships:verify-orbit
```

This runs `ktx dev scan` against the bundled SQLite fixture with enrichment disabled, then generates a verification report at `examples/orbit-relationship-verification/reports/orbit-verification.md`. The report includes the full relationship summary, enrichment details, artifact paths, and any warnings.
Custom project
To run verification against your own database (e.g., a local Orbit project):
```shell
KTX_ORBIT_PROJECT_DIR=/path/to/your-project pnpm run relationships:verify-orbit
```

Configuration
The benchmark project configuration lives at `examples/orbit-relationship-verification/ktx.yaml`:
```yaml
scan:
  enrichment:
    backend: none
  relationships:
    enabled: true
    llm_proposals: false
    accept_threshold: 0.85
    review_threshold: 0.55
    profile_sample_rows: 10000
    validation_concurrency: 4
```

Adjust `accept_threshold` and `review_threshold` to see how threshold changes affect the accepted/review/rejected distribution. Lower thresholds accept more relationships (higher recall, lower precision); higher thresholds are more conservative.
Broader benchmark suite
Beyond the Orbit fixture, KTX includes a full benchmark corpus at `packages/context/test/fixtures/relationship-benchmarks/` with fixtures across multiple tiers:
- Unit — minimal schemas testing individual heuristics
- Row-bearing — small schemas with data for validation testing
- Product — full warehouse schemas like the Orbit fixture
Fixtures from public datasets (Chinook, Sakila, AdventureWorks, Northwind) supplement the synthetic fixtures. The benchmark runner measures precision, recall, and F1 for both primary key and foreign key detection across all fixtures and modes.
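Measuring precision, recall, and F1 against a fixture's ground truth reduces to set comparison over detected versus known edges. A minimal sketch, where the `source->target` edge-string encoding and the function name are illustrative assumptions, not KTX's actual report format:

```typescript
// Compare predicted FK edges against ground truth.
// Edges are encoded as strings, e.g. "dim_users.account_id->dim_accounts.id".
function prf(predicted: Set<string>, truth: Set<string>) {
  let tp = 0;
  for (const edge of predicted) if (truth.has(edge)) tp++;
  const precision = predicted.size ? tp / predicted.size : 0;
  const recall = truth.size ? tp / truth.size : 0;
  const f1 =
    precision + recall ? (2 * precision * recall) / (precision + recall) : 0;
  return { precision, recall, f1 };
}
```

Lowering the thresholds trades these metrics against each other: more accepted edges raises recall (more true links recovered) at the cost of precision (more spurious links accepted).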