
Primary Sources

Connect KTX to PostgreSQL, Snowflake, BigQuery, ClickHouse, MySQL, SQL Server, or SQLite.

KTX connects to your data warehouse or database to scan schemas, discover relationships, and execute semantic layer queries. Each connection is defined in ktx.yaml under the connections key.

All connectors share these conventions:

  • Sensitive values support env:VAR_NAME (read from environment) and file:/path/to/secret (read from file) references
  • Connections are read-only — KTX never writes to your database
  • Schema scanning discovers tables, columns, types, and constraints automatically
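
The env:/file: reference convention can be illustrated with a small resolver. This is an illustrative sketch only — the function name and exact behavior are assumptions, not KTX's actual implementation:

```python
import os

def resolve_secret(value: str) -> str:
    """Resolve an env:VAR_NAME or file:/path reference; return plain values as-is.

    Illustrative sketch only -- not KTX's actual resolver.
    """
    if value.startswith("env:"):
        # Read from the environment; raises KeyError if the variable is unset.
        return os.environ[value[len("env:"):]]
    if value.startswith("file:"):
        # Read the secret from a file, stripping a trailing newline.
        with open(os.path.expanduser(value[len("file:"):])) as f:
            return f.read().strip()
    # Plain literal value -- returned unchanged.
    return value
```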

PostgreSQL

The most fully featured connector. It supports schema introspection, foreign key detection, column statistics, and Historic SQL via pg_stat_statements.

Connection config

ktx.yaml
connections:
  my-postgres:
    driver: postgres
    url: postgresql://user:password@host:5432/database
    schema: public
    readonly: true

Or with individual fields:

ktx.yaml
connections:
  my-postgres:
    driver: postgres
    host: localhost
    port: 5432
    database: analytics
    username: ktx_reader
    password: env:PG_PASSWORD
    schemas:
      - public
      - analytics
    ssl: true
    readonly: true

Authentication

Method          Config
Password        password: env:PG_PASSWORD or password: file:/path/to/secret
Connection URL  url: env:DATABASE_URL
SSL             ssl: true, optionally rejectUnauthorized: false for self-signed certs

Features

Feature              Supported  Notes
Tables & views       Yes        Via pg_catalog
Primary keys         Yes        Via information_schema.table_constraints
Foreign keys         Yes        Full constraint detection
Row count estimates  Yes        Via pg_class.reltuples
Column statistics    Yes        Requires pg_read_all_stats role
Historic SQL         Yes        Via pg_stat_statements extension
Table sampling       Yes        TABLESAMPLE SYSTEM

Historic SQL

PostgreSQL Historic SQL mines real query patterns from pg_stat_statements. This is the most mature local Historic SQL path and helps KTX understand how your team actually queries the data.

Requirements:

  • pg_stat_statements extension enabled
  • pg_read_all_stats role granted to the KTX user
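
Assuming a KTX database user named ktx_reader (as in the config above), both requirements can be satisfied like this, run as a superuser. Note that pg_stat_statements must also be listed in shared_preload_libraries, which requires a server restart:

```sql
-- Enable the extension (its library must already be in shared_preload_libraries)
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Let the KTX user read the statistics views
GRANT pg_read_all_stats TO ktx_reader;
```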

Config options:

historicSql:
  minCalls: 5           # Minimum call count to include a query template
  maxTemplatesPerRun: 5000

Dialect notes

  • SQL generation uses LIMIT/OFFSET pagination
  • Named parameters converted to positional ($1, $2, ...)
  • Supports COUNT(*) FILTER (WHERE ...) for null analysis
  • Full support for PostgreSQL types: uuid, jsonb, timestamptz, numeric, text[], etc.
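
As a sketch of the FILTER clause used for null analysis (table and column names here are hypothetical, not the exact queries KTX issues):

```sql
SELECT
  count(*)                              AS total_rows,
  count(*) FILTER (WHERE email IS NULL) AS null_emails
FROM users;
```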

Snowflake

Connects via the Snowflake SDK. Supports multi-schema scanning, RSA key authentication, and Historic SQL configuration for Snowflake query history.

Connection config

ktx.yaml
connections:
  my-snowflake:
    driver: snowflake
    account: xy12345
    warehouse: ANALYTICS_WH
    database: PROD
    schema_name: PUBLIC
    username: KTX_SERVICE
    password: env:SNOWFLAKE_PASSWORD
    role: ANALYST
    readonly: true

For multiple schemas:

    schema_names:
      - PUBLIC
      - ANALYTICS
      - STAGING

Authentication

Method        Config
Password      password: env:SNOWFLAKE_PASSWORD
RSA key pair  authMethod: rsa, privateKey: file:~/.ssh/snowflake_key.pem, optional passphrase
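
Putting the RSA fields from the table above together, a key-pair connection might look like the sketch below. Field placement follows the password example above, and the passphrase environment variable name is an assumption:

```yaml
connections:
  my-snowflake:
    driver: snowflake
    account: xy12345
    warehouse: ANALYTICS_WH
    database: PROD
    schema_name: PUBLIC
    username: KTX_SERVICE
    authMethod: rsa
    privateKey: file:~/.ssh/snowflake_key.pem
    passphrase: env:SNOWFLAKE_KEY_PASSPHRASE   # optional, for encrypted keys
    role: ANALYST
    readonly: true
```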

Features

Feature              Supported     Notes
Tables & views       Yes           Via INFORMATION_SCHEMA.TABLES
Primary keys         Yes           Via table constraints
Foreign keys         No            Not available in Snowflake
Row count estimates  Yes           From INFORMATION_SCHEMA.TABLES.ROW_COUNT
Column statistics    No
Historic SQL         Configurable  Query-history settings can be stored; local CLI Historic SQL ingest currently uses the Postgres path
Table sampling       Yes

Historic SQL

Snowflake Historic SQL settings describe how query history should be sampled when that runtime path is available.

historicSql:
  windowDays: 90
  redactionPatterns: []
  serviceAccountUserPatterns: []

Dialect notes

  • All identifiers are uppercase by default (case-insensitive matching)
  • Connection context set per query (USE ROLE, USE WAREHOUSE, USE DATABASE, USE SCHEMA)
  • Parameter binding uses positional ? placeholders
  • Date values normalized to ISO 8601 strings

BigQuery

Authenticates via GCP service account credentials. Supports multi-dataset scanning and Historic SQL configuration for INFORMATION_SCHEMA.JOBS_BY_PROJECT.

Connection config

ktx.yaml
connections:
  my-bigquery:
    driver: bigquery
    credentials_json: file:~/.config/gcloud/bq-service-account.json
    dataset_id: analytics
    location: US
    readonly: true

For multiple datasets:

    dataset_ids:
      - analytics
      - marketing
      - finance

Authentication

Method                Config
Service account JSON  credentials_json: file:/path/to/key.json
Environment variable  credentials_json: env:GCP_CREDENTIALS_JSON

The project ID is extracted automatically from the service account JSON file.

Features

Feature              Supported     Notes
Tables & views       Yes           Including materialized views and external tables
Primary keys         No
Foreign keys         No            Not available in BigQuery
Row count estimates  Yes           From table metadata
Column statistics    No
Historic SQL         Configurable  Query-history settings can be stored; local CLI Historic SQL ingest currently uses the Postgres path
Table sampling       Yes

Historic SQL

BigQuery Historic SQL settings describe how INFORMATION_SCHEMA.JOBS_BY_PROJECT should be sampled when that runtime path is available.

historicSql:
  windowDays: 90
  redactionPatterns: []
  serviceAccountUserPatterns: []

Dialect notes

  • Parameter binding uses named @param syntax
  • Arrays flattened to comma-separated strings in results
  • Location specified at query execution time
  • Supports maxBytesBilled and jobTimeoutMs limits
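
As a sketch of the named @param binding style (dataset, table, and parameter names here are hypothetical):

```sql
SELECT event_name, count(*) AS events
FROM analytics.events
WHERE event_date >= @start_date
GROUP BY event_name;
```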

ClickHouse

Connects over HTTP (port 8123) or HTTPS (port 8443). Supports the ClickHouse native type system including Nullable, LowCardinality, and Array wrappers.

Connection config

ktx.yaml
connections:
  my-clickhouse:
    driver: clickhouse
    url: http://localhost:8123/analytics
    readonly: true

Or with individual fields:

ktx.yaml
connections:
  my-clickhouse:
    driver: clickhouse
    host: clickhouse.internal
    port: 8123
    database: analytics
    username: default
    password: env:CH_PASSWORD
    ssl: false
    readonly: true

Authentication

Method      Config
Basic auth  username + password (HTTP basic auth)
No auth     The default user (default) with no password
HTTPS       Set ssl: true (uses port 8443 by default)

Features

Feature              Supported  Notes
Tables & views       Yes        Via system.tables, engine-based detection
Primary keys         Yes        Via system.columns
Foreign keys         No         Not a ClickHouse concept
Row count estimates  Yes        Via system.parts aggregation
Column statistics    No
Historic SQL         No
Table sampling       Yes

Dialect notes

  • Parameter binding uses {param:Type} syntax (e.g., {database:String})
  • Detects views vs. tables by engine name (View, MaterializedView)
  • Handles Nullable(T) and LowCardinality(Nullable(T)) type wrappers
  • Dictionary tables are excluded from scanning
  • Results returned in JSONCompact or JSONEachRow format
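
As a sketch of the {param:Type} binding style, querying a system table:

```sql
SELECT name, engine
FROM system.tables
WHERE database = {database:String}
```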

MySQL

Standard MySQL/MariaDB connector with full foreign key support and schema introspection.

Connection config

ktx.yaml
connections:
  my-mysql:
    driver: mysql
    url: mysql://user:password@host:3306/database
    readonly: true

Or with individual fields:

ktx.yaml
connections:
  my-mysql:
    driver: mysql
    host: mysql.internal
    port: 3306
    database: analytics
    username: ktx_reader
    password: env:MYSQL_PASSWORD
    ssl: true
    readonly: true

Authentication

Method          Config
Password        password: env:MYSQL_PASSWORD or password: file:/path/to/secret
SSL             ssl: true or ssl: { rejectUnauthorized: false }
URL parameters  ?ssl=true or ?sslmode=required in connection URL

Features

Feature              Supported  Notes
Tables & views       Yes        Via INFORMATION_SCHEMA.TABLES
Primary keys         Yes        Via KEY_COLUMN_USAGE
Foreign keys         Yes        Via REFERENTIAL_CONSTRAINTS
Row count estimates  Yes        From TABLE_ROWS (InnoDB estimate)
Column statistics    No
Historic SQL         No
Table sampling       Yes        Uses RAND() filter

Dialect notes

  • Parameter binding uses positional ? placeholders
  • Uses LIMIT X OFFSET Y for pagination
  • Single database per connection (no multi-schema)
  • Supports 20+ MySQL types including enum, json, datetime, decimal
  • Table comments extracted with InnoDB metadata prefix stripping
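
A RAND()-filter sample of the kind described above might look like this sketch (the table name is hypothetical, and the exact query KTX issues may differ):

```sql
SELECT *
FROM orders
WHERE RAND() < 0.01   -- keep roughly 1% of rows
LIMIT 100;
```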

SQL Server

Connects to Microsoft SQL Server and Azure SQL. Supports multi-schema scanning with dbo as the default schema.

Connection config

ktx.yaml
connections:
  my-sqlserver:
    driver: sqlserver
    url: mssql://user:password@host:1433/database?trustServerCertificate=true
    readonly: true

Or with individual fields:

ktx.yaml
connections:
  my-sqlserver:
    driver: sqlserver
    host: sql.internal
    port: 1433
    database: Analytics
    username: ktx_reader
    password: env:MSSQL_PASSWORD
    schema: dbo
    trustServerCertificate: true
    readonly: true

For multiple schemas:

    schemas:
      - dbo
      - analytics
      - staging

Authentication

Method                Config
SQL Server auth       username + password
Encrypted connection  Always enabled; trustServerCertificate: true for self-signed certs

Features

Feature              Supported  Notes
Tables & views       Yes        Via INFORMATION_SCHEMA.TABLES
Primary keys         Yes        Via TABLE_CONSTRAINTS and KEY_COLUMN_USAGE
Foreign keys         Yes        Via REFERENTIAL_CONSTRAINTS
Row count estimates  Yes        Via sys.dm_db_partition_stats
Column statistics    No
Historic SQL         No
Table sampling       Yes
Nested analysis      No

Dialect notes

  • Parameter binding uses @paramName syntax
  • Row limiting uses SELECT TOP N * FROM (query) AS ktx_query_result
  • Encryption is always required; certificate validation is optional
  • Multi-schema support with per-schema isolation
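
The TOP-based row limiting described above can be sketched as follows (the inner query is hypothetical; ktx_query_result is the alias named above):

```sql
SELECT TOP 100 *
FROM (
  SELECT order_id, total
  FROM dbo.orders
) AS ktx_query_result;
```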

SQLite

File-based connector using better-sqlite3. Ideal for local development, embedded analytics, or testing.

Connection config

ktx.yaml
connections:
  my-sqlite:
    driver: sqlite
    path: ./data/warehouse.sqlite
    readonly: true

Path supports multiple formats:

# Relative path (resolved against project directory)
path: ./warehouse.sqlite

# Absolute path
path: /var/data/analytics.db

# Home directory expansion
path: ~/data/warehouse.sqlite

# Environment variable
path: env:SQLITE_DB_PATH

# URL format
url: sqlite:///path/to/db.sqlite

Authentication

No authentication required — SQLite is file-based. The file must be readable by the process running KTX.

Features

Feature              Supported  Notes
Tables & views       Yes        Via sqlite_master
Primary keys         Yes        Via PRAGMA table_info()
Foreign keys         Yes        Via PRAGMA foreign_key_list() (requires PRAGMA foreign_keys = ON)
Row count estimates  Yes        Exact count via SELECT COUNT(*)
Column statistics    No
Historic SQL         No
Table sampling       Yes
Nested analysis      No

Dialect notes

  • Synchronous query execution (no connection pooling)
  • Parameter binding uses :paramName syntax
  • Uses LIMIT X OFFSET Y for pagination
  • SQLite type affinity system: TEXT, NUMERIC, INTEGER, REAL, BLOB
  • Foreign key enforcement requires explicit PRAGMA foreign_keys = ON
  • In-memory databases supported with path: ":memory:" (for testing)
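
KTX itself uses better-sqlite3 (Node), but the :paramName binding style and in-memory mode can be illustrated with Python's stdlib sqlite3, which shares both conventions:

```python
import sqlite3

# In-memory database, as with path: ":memory:" in ktx.yaml
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enables foreign key enforcement
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (:name)", {"name": "ada"})

# Named :paramName binding in a query
row = conn.execute(
    "SELECT name FROM users WHERE id = :id", {"id": 1}
).fetchone()
print(row[0])  # → ada
```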