Data model overview
Data Vault 2.0 schema — Hubs, Links, Satellites used by portfolIQ.
Data model
portfolIQ uses Data Vault 2.0 — a modeling methodology designed for auditability, historization, and parallel loading. All tables live in the dv schema of PostgreSQL.
Architecture
Three entity types form the backbone:
- Hubs — unique business keys (e.g.
coingecko_id). Immutable identifiers. One row per entity. - Links — relationships between Hubs (e.g. asset ↔ protocol). Many-to-many capable.
- Satellites — attributes and metrics, all historized (append-only or SCD-II). Never modified — only new rows inserted.
Entity diagram
hub_asset ──────────────────────────────────────────────────────
│ asset_hk (SHA-256 of coingecko_id) │
│ │
├── sat_asset_metadata (name, ticker, tier, supply) │
├── sat_asset_market (price, mcap, change_24h) │
├── sat_asset_vwap_consensus (VWAP daily, 5 exchanges) │
├── sat_asset_fundamentals (P/S, P/R, fees, revenue) │
├── sat_asset_ai_analysis (classification, summary, sentiment)
├── sat_asset_onchain_core (tx_count, fees, active_addr)
└── sat_asset_onchain_metrics (Realized Cap, MVRV, SOPR, NUPL)
[BTC only — Growth tier]
link_asset_protocol
├── asset_hk → hub_asset
└── protocol_hk → hub_protocol
hub_protocol
└── sat_protocol_fundamentals (DeFiLlama fees, revenue 30d)
Hub: hub_asset
| Column | Type | Description |
|---|---|---|
asset_hk | varchar(256) | Primary key — SHA-256 of coingecko_id. Stable identifier. |
coingecko_id | text | Business key — CoinGecko slug (e.g. bitcoin, ethereum). |
load_ts | timestamptz | Row insertion timestamp. Immutable. |
record_source | text | System that created this hub record. |
Satellite: sat_asset_metadata
Descriptive metadata about the asset. SCD-II pattern: load_end_ts IS NULL = current record.
| Column | Type | Description |
|---|---|---|
asset_hk | varchar(256) | FK → hub_asset. |
ticker | text | Trading symbol (e.g. BTC, ETH). |
name | text | Human-readable name. |
coingecko_rank | integer | CoinGecko market cap rank at load time. |
tier_crypto | integer | portfolIQ tier: 1 (top 10), 2 (11-200), 3 (201-1000). |
is_stablecoin | boolean | Whether the asset is classified as a stablecoin. |
chain | text | Primary blockchain (e.g. ethereum, bitcoin). |
load_ts | timestamptz | Row insertion timestamp. |
load_end_ts | timestamptz | NULL = current. Non-null = superseded. |
Satellite: sat_asset_market
Daily price and market cap snapshots. Grain: (asset_hk, snapshot_date).
| Column | Type | Description |
|---|---|---|
asset_hk | varchar(256) | FK → hub_asset. |
snapshot_date | date | UTC date of the snapshot. |
price_usd | numeric(28,10) | Price in USD at close. |
market_cap_usd | numeric(28,2) | Market capitalization in USD. |
change_24h_pct | numeric(10,4) | 24h price change percentage. |
volume_usd | numeric(28,2) | 24h trading volume. |
Satellite: sat_asset_vwap_consensus
Daily VWAP computed across 5 exchanges (Binance, Kraken, Coinbase, Bybit, OKX). Legally redistributable — methodology disclosed.
| Column | Type | Description |
|---|---|---|
asset_hk | varchar(256) | FK → hub_asset. |
candle_date | date | UTC date. |
vwap_usd | numeric(28,10) | Volume-Weighted Average Price. |
volume_usd_total | numeric(28,2) | Total volume across all included exchanges. |
exchange_count | integer | Number of exchanges that contributed (min 1, max 5). |
methodology_version | text | vwap-v1.1 — see methodology docs. |
Satellite: sat_asset_fundamentals
Protocol valuation ratios (P/S, P/R) sourced from DeFiLlama (MIT licence). Grain: (asset_hk, snapshot_date).
| Column | Type | Description |
|---|---|---|
asset_hk | varchar(256) | FK → hub_asset. |
snapshot_date | date | UTC date. |
fees_30d_usd | numeric(28,2) | Protocol fees in last 30 days. |
revenue_30d_usd | numeric(28,2) | Protocol revenue in last 30 days (subset of fees). |
ps_ratio | numeric(18,4) | P/S ratio. See ps_ratio_label for denominator used. |
ps_ratio_label | text | 'P/S (revenue_30d)' or 'P/S approx (fees_30d)'. |
pr_ratio | numeric(18,4) | P/R ratio (NULL when revenue unavailable). |
methodology_version | text | defillama-v1. |
Satellite: sat_asset_ai_analysis
AI-generated analysis blocks. Historized by prompt version — old analyses preserved when prompts change.
| Column | Type | Description |
|---|---|---|
asset_hk | varchar(256) | FK → hub_asset. |
analysis_type | text | token_classification | fundamental_summary | news_summary. |
generated_at | timestamptz | When the analysis was generated. |
model | text | Model used (e.g. claude-haiku-4-5, claude-sonnet-4-6). |
prompt_version | text | Prompt version tag (e.g. classify-v2). |
content | jsonb | Analysis payload. Structure varies by analysis_type. |
cost_usd_micros | integer | Inference cost in micro-USD. |
Satellite: sat_asset_onchain_core
Daily on-chain activity metrics for BTC and ETH. Sources: mempool.space + Blockstream (BTC), public Ethereum RPC (ETH).
| Column | Type | Description |
|---|---|---|
asset_hk | varchar(256) | FK → hub_asset. |
metric_date | date | UTC date. |
active_addresses | bigint | Approximate active address count. |
tx_count | bigint | Confirmed transactions in the 24h window. |
fees_total_usd | numeric(20,8) | Total on-chain fees paid in USD. |
avg_fee_usd | numeric(20,8) | Average fee per transaction. |
source_rpc | text | Data source (e.g. mempool.space + blockstream.info). |
eth_tx_count_24h | bigint | ETH-specific: confirmed transactions (24h). |
eth_avg_gas_price_gwei | numeric(20,8) | ETH-specific: average gas price in Gwei. |
eth_gas_utilization_pct | numeric(8,6) | ETH-specific: gas utilization ratio (0.0–1.0). |
eth_source_rpc | text | ETH-specific: RPC endpoint used. |
Satellite: sat_asset_onchain_metrics
Advanced BTC on-chain metrics (Realized Cap, MVRV, SOPR, NUPL, HODL waves). Computed from bigquery-public-data.crypto_bitcoin (MIT licence). Growth tier only.
| Column | Type | Description |
|---|---|---|
asset_hk | varchar(256) | FK → hub_asset. |
metric_date | date | UTC date. |
realized_cap_usd | numeric(28,2) | Realized Cap: sum of each UTXO at its last-moved price. |
mvrv_ratio | numeric(18,8) | Market Cap / Realized Cap. |
nupl | numeric(18,8) | Net Unrealized Profit/Loss. |
sopr | numeric(18,8) | Spent Output Profit Ratio. |
hodl_waves | jsonb | Supply distribution by UTXO age — 10 buckets. |
methodology_version | text | onchain-realized-metrics-v1.0. |
classification | text | Always 'proprietary' — never redistributed raw. |
Data freshness
| Model | Update frequency | Source |
|---|---|---|
| sat_asset_metadata | Daily | CoinGecko Demo API |
| sat_asset_market | Daily | CoinGecko Demo API |
| sat_asset_vwap_consensus | Daily | 5 exchanges (Binance, Kraken, Coinbase, Bybit, OKX) |
| sat_asset_fundamentals | Daily | DeFiLlama /overview/fees |
| sat_asset_ai_analysis | On change | Claude Haiku / Sonnet |
| sat_asset_onchain_core | Daily | mempool.space, public ETH RPC |
| sat_asset_onchain_metrics | Daily | BigQuery public crypto_bitcoin |
Notes
- All timestamps are UTC.
- All monetary amounts are in USD.
asset_hkisSHA-256(lower(coingecko_id))encoded as hex.- Models tagged
meta.public_doc: truein dbt are documented here. Models withmeta.public_doc: false(e.g.sat_asset_onchain_metrics) are proprietary and not exposed in the public dbt package. - Not financial advice. Factual data only. Methodology disclosed.