SOP: finding tissue-by-age expression data
Use this SOP when populating tissue expression information for a protein page, or when checking whether a mouse expression-with-age finding has a counterpart in human tissue data. The primary resources are GTEx (human, age-stratified), Human Protein Atlas (human, cell/tissue-type level), and the Aging Atlas (multi-species, aging-specific multi-omic).
When to use these resources
- Filling the gtex-aging-correlation: frontmatter field on a type: protein page.
- Writing a “Tissue expression” or “Age-dependent expression” section.
- Cross-checking mouse single-cell results against human bulk/single-cell data.
- Answering “Does this gene go up or down with age in tissue X in humans?”
- Assessing whether a ubiquitously expressed gene has tissue-specific aging effects.
Resource 1: GTEx Portal (Human bulk RNA-seq, age-stratified)
What it is: Genotype-Tissue Expression project — post-mortem donor RNA-seq from 54 tissues, n ~1000 donors, with age bins 20–29, 30–39, 40–49, 50–59, 60–69, 70–79. GTEx v10 is the current dataset.
Primary API endpoint: https://gtexportal.org/api/v2/
Step 1 — Resolve gene symbol to versioned ENSG ID
GTEx requires the ENSG ID with version suffix (e.g., ENSG00000133116.8 for KL/Klotho). Look this up first:
curl "https://gtexportal.org/api/v2/reference/gene?geneId=KL&gencodeVersion=v39&genomeBuild=GRCh38%2Fhg38&pageSize=1&format=json"
Record the gencodeId field (e.g., ENSG00000133116.8).
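The same lookup can be scripted. The sketch below rebuilds the lookup URL and pulls gencodeId out of a decoded response. It assumes the records sit under a top-level "data" list (as they do in the geneExpression example later in this SOP); the function names and the hand-written sample payload are illustrative only.

```python
from urllib.parse import urlencode

GTEX_API = "https://gtexportal.org/api/v2"

def gene_lookup_url(symbol, gencode_version="v39", genome_build="GRCh38/hg38"):
    """Build the reference/gene lookup URL for one gene symbol."""
    params = {
        "geneId": symbol,
        "gencodeVersion": gencode_version,
        "genomeBuild": genome_build,  # urlencode escapes "/" as %2F
        "pageSize": 1,
        "format": "json",
    }
    return f"{GTEX_API}/reference/gene?{urlencode(params)}"

def extract_gencode_id(payload):
    """Return the versioned ENSG ID, or None if the lookup found nothing.

    Assumption: records are listed under payload["data"].
    """
    records = payload.get("data", [])
    return records[0].get("gencodeId") if records else None

# Offline illustration with a hand-written payload (not a real API response)
sample = {"data": [{"geneSymbol": "KL", "gencodeId": "ENSG00000133116.8"}]}
print(gene_lookup_url("KL"))
print(extract_gencode_id(sample))
```

Returning None on an empty result makes it easy to stop before querying the expression endpoints with a missing ID.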
Step 2 — Get median expression per tissue (GTEx v10)
curl "https://gtexportal.org/api/v2/expression/medianGeneExpression?datasetId=gtex_v10&gencodeId=ENSG00000133116.8&format=json"
Returns one record per GTEx tissue with median (TPM) and tissueSiteDetailId. Useful for establishing tissue specificity and highlighting high-expression tissues.
Confirmed working: KL (Klotho) returns 54 tissue records; kidney cortex tops the list at ~18.8 TPM, consistent with known biology.
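To pick out the high-expression tissues, the records can be sorted by median. This is a minimal sketch that assumes the response wraps its records in a top-level "data" list; the median and tissueSiteDetailId field names come from the step above, while the function name and the sample values (apart from the ~18.8 TPM kidney cortex figure) are placeholders.

```python
def top_tissues(payload, n=3):
    """Return the n tissues with the highest median TPM as (tissue, median) pairs."""
    records = payload.get("data", [])  # assumption: records live under "data"
    ranked = sorted(records, key=lambda r: r["median"], reverse=True)
    return [(r["tissueSiteDetailId"], r["median"]) for r in ranked[:n]]

# Hand-written payload for illustration; only the kidney value echoes the SOP
sample = {"data": [
    {"tissueSiteDetailId": "Liver", "median": 0.4},
    {"tissueSiteDetailId": "Kidney_Cortex", "median": 18.8},
    {"tissueSiteDetailId": "Lung", "median": 1.2},
]}
for tissue, tpm in top_tissues(sample, n=2):
    print(tissue, tpm)
```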
Step 3 — Age-stratified expression
✅ Updated 2026-05-21 — the GTEx v2 API does expose age-bracket-stratified data via an undocumented (or poorly-documented) parameter on the /expression/geneExpression endpoint. Prior SOP statement (“The public API does not expose per-sample age-labelled data directly”) was incorrect or out of date.
Method (verified working 2026-05-21):
# Per-sample expression returned as one array per 10-year age bracket
curl "https://gtexportal.org/api/v2/expression/geneExpression\
?datasetId=gtex_v10\
&gencodeId=ENSG00000198911.12\
&tissueSiteDetailId=Liver\
&attributeSubset=ageBracket\
&format=json"
Returns one record per age bracket (20-29, 30-39, 40-49, 50-59, 60-69, 70-79), each with a data field (array of per-sample TPM values) and a subsetGroup field naming the bracket.
Caveats specific to this method:
- Returns per-sample TPM values (post-normalization) but does NOT return donor IDs, sex, or other covariates — only the bracketed array. Useful for computing Spearman ρ across bracket midpoints; not useful for multivariate adjustment.
- Sample sizes are unequal across brackets. As of 2026-05-21, the typical liver sample distribution is roughly: 20-29: n≈8, 30-39: n≈18, 40-49: n≈38, 50-59: n≈99, 60-69: n≈93, 70-79: n≈6. The 20-29 and 70-79 bins are small and underpowered.
- Bulk RNA-seq is subject to cell-composition confounding — see § “Interpreting age-expression correlations” below. When bulk shows weak/null and single-cell shows clear directional change, trust the snRNA-seq result.
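Because bracket sizes are so uneven, it can help to count samples per bracket before computing any correlation. The sketch below does that on a decoded response; the min_n threshold of 10 is an arbitrary illustrative cut-off, not a GTEx or SOP convention, and the sample payload only reproduces the approximate liver counts listed above with dummy TPM values.

```python
def bracket_sizes(payload, min_n=10):
    """Map each age bracket to (n_samples, is_underpowered).

    min_n is a hypothetical threshold chosen for illustration.
    """
    sizes = {}
    for rec in payload["data"]:
        n = len(rec["data"])  # one TPM value per sample
        sizes[rec["subsetGroup"]] = (n, n < min_n)
    return sizes

# Dummy payload: real counts from the caveat above, placeholder TPM values
counts = {"20-29": 8, "30-39": 18, "40-49": 38, "50-59": 99, "60-69": 93, "70-79": 6}
sample = {"data": [{"subsetGroup": b, "data": [1.0] * n} for b, n in counts.items()]}

sizes = bracket_sizes(sample)
for bracket, (n, small) in sizes.items():
    print(bracket, n, "underpowered" if small else "ok")
```

Recording the per-bracket n alongside any reported ρ makes it clear how much of the signal rests on the sparse youngest and oldest bins.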
Computing Spearman ρ across bracket midpoints (Python):
import json
import urllib.request
from scipy.stats import spearmanr  # third-party; assumes SciPy is installed

midpoints = {'20-29':24.5,'30-39':34.5,'40-49':44.5,'50-59':54.5,'60-69':64.5,'70-79':74.5}
url = 'https://gtexportal.org/api/v2/expression/geneExpression?datasetId=gtex_v10&gencodeId=ENSG00000198911.12&tissueSiteDetailId=Liver&attributeSubset=ageBracket&format=json'
with urllib.request.urlopen(url) as r:
    d = json.load(r)
# Pair every per-sample TPM value with the midpoint of its age bracket
samples = [(midpoints[rec['subsetGroup']], v) for rec in d['data'] for v in rec['data']]
# Spearman ρ across (age_midpoint, TPM) tuples; n = sum of bracket sizes
ages, tpms = zip(*samples)
rho, p_value = spearmanr(ages, tpms)

Alternative routes (still valid backup paths):
- GTEx v10 age-stratified summary files: available from the GTEx portal downloads page under “Gene expression” → “Age-stratified median TPM.” Download and filter locally.
- GTEx multi-tissue aging analyses: peer-reviewed papers (e.g., GTEx Consortium 2020, Science; Oliva et al. 2020, Science) report genome-wide age-eQTL and age-correlated expression with Spearman ρ per tissue. These are the best citable source when gtex-aging-correlation: needs published-paper provenance rather than direct-query provenance.
- recount3 / GTEx in R: The recount3 Bioconductor package provides programmatic access to per-sample expression matrices and donor metadata, so expression can be correlated with donor age and adjusted for sex, RIN, ischemic time, etc.
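library(recount3)
# recount3 hosts GTEx v8 (not v10). Its GTEx projects are tissue-level,
# so load SKIN and subset to the sub-site afterwards; confirm project
# names with available_projects().
rse <- create_rse_manual(
  project = "SKIN",
  project_home = "data_sources/gtex",
  organism = "human",
  annotation = "gencode_v29",
  type = "gene"
)
# Sub-site and age are expected in colData as gtex.smtsd and gtex.age;
# verify the column names with colnames(colData(rse)) before relying on them.
rse <- rse[, rse$gtex.smtsd == "Skin - Sun Exposed (Lower leg)"]

Step 4 — Recording gtex-aging-correlation: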
This R22 frontmatter field on type: protein pages encodes directional age-correlation across GTEx tissues:
gtex-aging-correlation: "pan-tissue: ρ=−0.15 (negative, kidney cortex strongest); data from Oliva-2020"
Encoding convention (R22 decision): Record as a free-text string (not a structured object) because correlation direction and magnitude vary widely by tissue and the wiki should summarize the pattern rather than store a full 54-tissue table. Use one of these canonical forms:
- "pan-tissue: ρ=X (direction)" if the correlation is consistent pan-tissue
- "tissue-specific: ρ=X in <tissue> only; flat elsewhere" if heterogeneous
- "not-significant pan-tissue" if no tissue reaches p<0.05 for age-correlation after multiple-testing correction
- null + #gap/gtex-not-queried if not yet populated
Cite the supporting paper or dataset version in the body (not in frontmatter).
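To keep the strings consistent across pages, the canonical forms can be generated rather than typed by hand. The helper below is a hypothetical sketch: the function name and the kind labels are invented for illustration, and only the output strings follow the convention above.

```python
def format_gtex_aging(kind, rho=None, direction=None, tissue=None):
    """Return the frontmatter string for one canonical form, or None for null."""
    def fmt(r):
        # The SOP example writes negatives with U+2212 (−), so match that
        return f"{r:.2f}".replace("-", "\u2212")

    if kind == "pan-tissue":
        return f"pan-tissue: ρ={fmt(rho)} ({direction})"
    if kind == "tissue-specific":
        return f"tissue-specific: ρ={fmt(rho)} in {tissue} only; flat elsewhere"
    if kind == "not-significant":
        return "not-significant pan-tissue"
    if kind == "not-queried":
        return None  # record null and tag the page #gap/gtex-not-queried
    raise ValueError(f"unknown form: {kind!r}")

print(format_gtex_aging("pan-tissue", rho=-0.15, direction="negative"))
print(format_gtex_aging("tissue-specific", rho=0.22, tissue="kidney cortex"))
```

The ρ values in the usage lines are placeholders, not measured correlations.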
Resource 2: Human Protein Atlas
What it is: HPA provides protein-level (immunohistochemistry, immunofluorescence) and RNA-level (bulk RNA-seq, single-cell RNA-seq) expression across tissues and cell types. Covers ~20,000 human proteins.
Primary API endpoint: https://www.proteinatlas.org/api/
# Tissue expression summary for a gene (returns JSON array)
curl "https://www.proteinatlas.org/api/search_download.php?search=KL&format=json&columns=g,gs,t,scl&compress=no"
Key columns to request (columns= parameter):
- g: gene symbol
- gs: gene synonyms
- t: tissue expression level (HPA RNA)
- scl: subcellular location
- up: UniProt accession
- rnatsm: RNA tissue specificity
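When different pages need different columns, the query URL can be assembled programmatically. This sketch only builds the URL from the parameters shown in the curl example; the function name is illustrative, and it makes no assumption about the key names in the JSON that HPA returns.

```python
from urllib.parse import urlencode

HPA_SEARCH = "https://www.proteinatlas.org/api/search_download.php"

def hpa_search_url(gene, columns=("g", "gs", "t", "scl")):
    """Build an HPA search_download URL for one gene and a set of column codes."""
    params = {
        "search": gene,
        "format": "json",
        "columns": ",".join(columns),
        "compress": "no",
    }
    # safe="," keeps the comma-separated column list readable in the URL
    return f"{HPA_SEARCH}?{urlencode(params, safe=',')}"

print(hpa_search_url("KL"))
print(hpa_search_url("KL", columns=("g", "up", "rnatsm")))
```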
HPA tissue specificity categories:
- tissue enriched: ≥4× higher mRNA level in one tissue than in any other (HPA's current threshold)
- group enriched: elevated in a small tissue group
- tissue enhanced: moderately elevated
- low tissue specificity: ubiquitous
- not detected
What to extract for wiki pages:
- Tissue specificity category (place in “Tissue expression” section)
- Top 2–3 tissues with highest expression
- Whether HPA IHC shows age-related change in antibody staining (check the “Pathology Atlas” section for age correlation where available)
HPA Proteomic Atlas: https://www.proteinatlas.org/humanproteome/proteome+age — lists proteins with significant age-associated abundance changes in plasma. Check this for secreted or circulating proteins.
Resource 3: Aging Atlas (CNCB)
What it is: The Aging Atlas database (https://ngdc.cncb.ac.cn/aging/) from the China National Center for Bioinformation. Hosts curated multi-omic aging datasets: transcriptomics, epigenomics, and single-cell data across multiple species and tissues with age metadata.
Note on API access: The Aging Atlas does not currently expose a programmatic REST API at the /api/ path tested as of 2026-05-05. Data is available via the web portal and downloadable bulk files. Use the portal interactively or download tissue-specific aging transcriptomics matrices.
What to extract:
- Age-correlated DEGs (differentially expressed genes) in specific tissues
- Epigenomic aging marks per tissue
- Cross-species conservation of age-related expression changes
Citing Aging Atlas data: Record as “Aging Atlas v2.0 (ngdc.cncb.ac.cn/aging), accessed YYYY-MM-DD”, plus the DOI of the associated paper: 10.1093/nar/gkaa894 (Zhang et al. 2020, Nucleic Acids Research).
Interpreting age-expression correlations
Bulk vs single-cell confounds: Bulk tissue RNA-seq correlations with age can reflect cell-composition shifts (e.g., increased myeloid infiltration in aged tissue) rather than per-cell expression changes. Always note which data type underlies the claim, and tag #gap/cell-composition-confound if only bulk data is available.
Age bins in GTEx: GTEx v10 donors span ages 20–79 in 10-year bins. The oldest bin (70–79) is underrepresented. Extrapolation beyond 80 years is speculative.
GTEx donor selection bias: GTEx donors are largely healthy decedents (trauma, sudden illness), which introduces a “healthy survivor” bias. Very old, frail individuals are underrepresented, so claims about “expression in the very old” should note this.
Effect size vs significance: GTEx aging analyses often have modest effect sizes (|ρ| 0.1–0.3) even at significant p-values due to large n. Note the effect size alongside p-values.
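A quick calculation shows why a small ρ can still be highly significant. The sketch below uses the Fisher z-transform approximation for Spearman's ρ (variance ≈ 1.06/(n−3)); it is an approximation for illustration, not the test used in any GTEx paper, and the sample sizes are hypothetical.

```python
import math

def spearman_p_approx(rho, n):
    """Approximate two-sided p-value for Spearman's rho via Fisher's z-transform."""
    z = math.atanh(rho) * math.sqrt((n - 3) / 1.06)
    # Two-sided tail of the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2))

rho = 0.15
for n in (30, 800):  # hypothetical small and large sample sizes
    p = spearman_p_approx(rho, n)
    print(f"n={n}: rho={rho}, rho^2={rho**2:.4f}, p≈{p:.2g}")
```

With the same ρ, the large-n case clears conventional significance thresholds while the squared correlation stays near 2%, which is why the effect size should be reported next to the p-value.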
Evidence table for tissue-expression claims
For cross-species expression claims, add the standard extrapolation table:
| Dimension | Status |
|---|---|
| Expression pattern conserved in humans? | yes / partial / no / unknown |
| Age-correlation direction conserved? | yes / partial / no / unknown |
| Single-cell resolution available? | yes / no |
Workflow for populating gtex-aging-correlation:
- Look up versioned ENSG ID via GTEx reference endpoint.
- Run median tissue expression query to establish tissue specificity.
- Search PubMed for "<gene>" AND ("GTEx" OR "age-correlated expression") to find published age-eQTL or age-correlation analyses that provide citable ρ values.
- If no paper: note as null + #gap/gtex-not-queried.
- Fill frontmatter field using the free-text convention above.
- Write a “Tissue expression” section in the page body with 1–3 sentences + the citations.
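The PubMed step can be made repeatable by generating the search term per gene. The sketch below builds the term from the workflow and wraps it in a PubMed web search link; the function names are illustrative, and the use of the pubmed.ncbi.nlm.nih.gov term parameter is an addition that the SOP itself does not specify.

```python
from urllib.parse import urlencode

def pubmed_term(gene):
    """Build the search term from the workflow for one gene symbol."""
    return f'"{gene}" AND ("GTEx" OR "age-correlated expression")'

def pubmed_search_url(gene):
    """Wrap the term in a PubMed web search URL (assumed entry point)."""
    return "https://pubmed.ncbi.nlm.nih.gov/?" + urlencode({"term": pubmed_term(gene)})

print(pubmed_term("KL"))
print(pubmed_search_url("KL"))
```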
See also
- finding-protein-data — UniProt, canonical identity
- finding-singlecell-aging — single-cell age resolution
- finding-aging-specific — Aging Atlas bulk-data context
- finding-population-evidence — GTEx age-eQTL as instrument for MR