Gene2AI Genomics Analysis API v5.1 — AI-powered, population-aware genetic analysis with 122K+ SNP markers, 14 PGx genes, LD proxy fallback, HLA allele typing, CYP450 metabolizer phenotyping (incl. CYP1A2), NAT2 acetylator typing, and nutrition-intervention-oriented output.
v5.1 extends the analysis engine with nutrition-intervention-oriented output, enabling downstream AI agents and supplement formulation systems to consume structured ingredient guidance directly from genomic findings.
| Category | Change |
|---|---|
| nutrition | New fields: snpDetails, ingredientGuidance, relatedLabIndicators, pathway, risk, risk_level |
| health_risks | New field: nutritionRelevance with relevant ingredients and monitoring indicators |
| cyp450 | New gene: CYP1A2 (caffeine metabolism). New field: supplementRelevance on all 4 CYP genes |
| meta | New field: snpCoverage, a per-category SNP coverage breakdown |
The need field in nutrition findings is preserved. The new risk and risk_level fields mirror the same value. Existing integrations require no changes.

The Gene2AI Genomics Analysis API processes raw genetic data files (23andMe V3/V4/V5, AncestryDNA V1/V2, WeGene) and returns AI-enriched health insights in structured JSON format. The API supports format auto-detection. The analysis pipeline includes:
All output is in English. All confidence values are numeric (0.0–1.0).
Base URL: https://api.gene2.ai

All REST endpoints are accessible at this base URL. Use /api/upload-url and /api/query-result for production.
23andMe
V3 (Illumina OmniExpress): ~960K SNPs
V4 (Illumina HumanOmniExpress): ~570K SNPs
V5 / V5.1 (Illumina GSA): ~640K SNPs
Format: Tab-separated, lines starting with # are comments. Columns: rsid, chromosome, position, genotype.

AncestryDNA
V1 (Illumina OmniExpress): ~700K SNPs
V2 (Illumina GSA): ~670K SNPs
Format: Tab-separated, lines starting with # are comments. Columns: rsid, chromosome, position, allele1, allele2.

WeGene
Standard chip: ~1.35M SNPs (GRCh37, plus strand)
Format: Tab-separated, lines starting with # are comments. Columns: rsid, chromosome, position, genotype. Same layout as 23andMe, but the file also includes WeGene internal IDs (ws/wi/w0/w1/w2 prefixes) and indel genotypes (DD/II/DI).
KB coverage: high direct rsID overlap with the 122K+ markers. Indel genotypes are skipped because they are not applicable to SNP-based analysis.
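
The three layouts above differ mainly in whether the genotype arrives as one column or as two allele columns. As a rough illustration, the following sketch normalizes lines from any of them into a common record. It is not the API's auto-detection logic, which is not documented here; `Call`, `parse_raw_lines`, and the rule of skipping rows whose position is not numeric (to tolerate an uncommented header row) are assumptions made for this example.

```python
from typing import Iterable, Iterator, NamedTuple


class Call(NamedTuple):
    """One genotype call in a vendor-neutral shape (hypothetical type)."""
    rsid: str
    chromosome: str
    position: int
    genotype: str


# WeGene indel codes named in the format notes; they are skipped here as well.
INDEL_GENOTYPES = {"DD", "II", "DI"}


def parse_raw_lines(lines: Iterable[str]) -> Iterator[Call]:
    """Yield calls from 4-column (23andMe, WeGene) or 5-column (AncestryDNA) rows."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue  # blank line or comment
        cols = line.split("\t")
        if len(cols) == 4:
            rsid, chrom, pos, genotype = cols
        elif len(cols) == 5:
            rsid, chrom, pos, allele1, allele2 = cols
            genotype = allele1 + allele2
        else:
            continue  # unknown layout, ignored in this sketch
        if not pos.isdigit():
            continue  # assumption: a non-numeric position marks a header row
        if genotype in INDEL_GENOTYPES:
            continue
        yield Call(rsid, chrom, int(pos), genotype)
```

Passing an open file object, for example `parse_raw_lines(open("genome_data.txt"))`, streams the calls without loading the whole file into memory.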
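All API calls require HMAC-MD5 authentication via HTTP headers. Note that the token is a truncated MD5 digest of the concatenated verify ID, secret, and time string, not an HMAC in the RFC 2104 sense. Three headers are required:

| Header | Description |
|---|---|
| id | Verify ID: "UP02" for upload, "GE01" for query |
| token | HMAC-MD5 token (16 hex chars) |
| time | UTC time string in format "YYYY-MM-DD HH:MM" |

# Token = md5(verify_id + secret + time)[4:20]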
# Time format: "YYYY-MM-DD HH:MM" (UTC)
import hashlib
from datetime import datetime, timezone
verify_id = "UP02"
secret = "your_secret_here"
time_str = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M")
raw = f"{verify_id}{secret}{time_str}"
full_hash = hashlib.md5(raw.encode()).hexdigest()
token = full_hash[4:20]  # 16 hex chars

POST /api/upload-url
Creates an analysis job and returns an upload URL for the genetic data file.

Request:
id: UP02
token: a1b2c3d4e5f6a7b8
time: 2026-03-02 14:30
Content-Type: application/json

{
"key": "gene2ai_user12345_1709389800",
"type": "txt", // "txt" | "zip"
"format": "23andme" // "23andme" | "ancestry" | "wegene" | "auto"
}

Response:
{
"code": 0,
"msg": "success",
"data": {
"key": "gene2ai_user12345_1709389800",
"url": "https://api.gene2.ai/api/genomics/upload/gene2ai_user12345_..."
}
}

| Code | Meaning |
|---|---|
| 0 | Job created successfully |
| 1 | Invalid token: HMAC validation failed |
| 2 | Missing required fields (key, type, or format) |
| 3 | Invalid format or type value |

curl -X POST 'https://api.gene2.ai/api/upload-url' \
-H 'Content-Type: application/json' \
-H 'id: UP02' \
-H 'token: YOUR_TOKEN' \
-H 'time: 2026-03-02 14:30' \
-d '{
"key": "gene2ai_user12345_1709389800",
"type": "txt",
"format": "23andme"
}'

The analysis result contains 9 categories of findings. Fields marked with v5.1 are new in this version. All new fields are optional and only present when relevant data exists.
nutrition[] (expanded in v5.1)
| Field | Type | Description |
|---|---|---|
| nutrient | string | Nutrient name, e.g. "Folate (MTHFR)" |
| need | string | "decreased" \| "normal" \| "increased". Preserved for backward compatibility |
| risk (v5.1) | string | Same value as need. Alias for clearer semantics in risk-oriented contexts |
| risk_level (v5.1) | string | Same value as need. Alias for structured consumption |
| confidence | number | 0.0–1.0 |
| gene | string | Gene symbol, e.g. "MTHFR" |
| snps | string[] | rsID list |
| snpDetails (v5.1) | object[] | Per-SNP detail: rsid, gene, genotype, effect description |
| ingredientGuidance (v5.1) | object | primaryIngredients[], supportingIngredients[], avoidIngredients[], doseModifier, rationale |
| relatedLabIndicators (v5.1) | string[] | Lab tests to monitor, e.g. ["Serum Folate", "Homocysteine"] |
| pathway (v5.1) | string | Metabolic pathway enum (see below) |
| populationNote? | string | Population-specific context |

Pathway values: folate_methylation | vitamin_d_metabolism | vitamin_b12_metabolism | omega3_fatty_acid | iron_metabolism | antioxidant_defense | vitamin_a_metabolism | vitamin_c_metabolism | choline_metabolism | detoxification

standard | increased | high | reduced | avoid

health_risks[] (expanded in v5.1)
| Field | Type | Description |
|---|---|---|
| condition | string | Condition name |
| risk | string | "low" \| "average" \| "slightly_elevated" \| "elevated" \| "high" |
| confidence | number | 0.0–1.0 |
| snps | string[] | rsID list |
| description | string | LLM-enriched description |
| nutritionRelevance (v5.1) | object | relevantIngredients[] and monitorIndicators[] for this condition |
| populationNote? | string | Population-specific context |
cyp450.genes[] (expanded in v5.1)
| Field | Type | Description |
|---|---|---|
| gene | string | "CYP2C19" \| "CYP2D6" \| "CYP2C9" \| "CYP1A2" (new) |
| diplotype | string | Star allele diplotype, e.g. "*1/*2", "*1F/*1F" |
| activityScore | number | Combined activity score |
| phenotype | string | Metabolizer phenotype |
| drugRecommendations | object[] | Drug-specific recommendations |
| supplementRelevance (v5.1) | object | affectedSupplements[], guidanceNotes, monitorIndicators[] |
| confidence | number | 0.0–1.0 |
| confidenceTier | string | "high" \| "moderate" \| "low" \| "insufficient" |
| coverage | object | totalDefiningSNPs, foundInData, missing, missingRsids[] |
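
As an illustration of how the cyp450.genes[] fields in the table above might be consumed, the sketch below condenses each gene entry into a small summary. The helper name `summarize_cyp_genes`, the choice to treat only the "high" and "moderate" tiers as trusted, and the derived `definingSnpFraction` value are assumptions made for this example; none of them is part of the API.

```python
from typing import Any

# Assumption for this sketch: only these confidence tiers are acted on downstream.
ACCEPTED_TIERS = ("high", "moderate")


def summarize_cyp_genes(genes: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Condense cyp450.genes[] entries, tolerating absent optional fields."""
    summary = []
    for g in genes:
        coverage = g.get("coverage") or {}
        total = coverage.get("totalDefiningSNPs", 0)
        found = coverage.get("foundInData", 0)
        relevance = g.get("supplementRelevance") or {}  # optional v5.1 field
        summary.append({
            "gene": g["gene"],
            "phenotype": g.get("phenotype"),
            "trusted": g.get("confidenceTier") in ACCEPTED_TIERS,
            # Share of star-allele-defining SNPs present in the uploaded file.
            "definingSnpFraction": found / total if total else None,
            "affectedSupplements": relevance.get("affectedSupplements", []),
        })
    return summary
```

Reading supplementRelevance through `.get()` keeps the same code working for genes or results where that optional field is absent.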
| Category | Fields | Values |
|---|---|---|
| drug_response | drug, sensitivity, gene, recommendation, snps, populationNote? | sensitivity: normal \| increased \| reduced |
| traits | trait, value, confidence, gene, snps | confidence: 0.0–1.0 |
| ancestry | regions[].region, regions[].percentage | percentage: 0–100 |
| apoe | genotype, alleles, alzheimerRisk, cardiovascularNote, confidence, description, snps | alzheimerRisk: reduced \| average \| slightly_elevated \| elevated \| high |
| hla | alleles[].allele, carrier, confidence, drugAssociations[] | 9 HLA alleles via tag SNP inference |
| nat2 | diplotype, phenotype, acetylatorStatus, drugRecommendations[] | phenotype: Rapid \| Intermediate \| Slow Acetylator |
| coverage | total_snps_in_kb, found_in_data, found_via_proxy, missing, coverage_pct | coverage_pct: 0.0–100.0 |
| snpCoverage[] (v5.1) | category, totalSnps, directHits, proxyHits, normalDefaults, misses, coveragePct | Per-category breakdown (v5.1) |
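
The per-category snpCoverage entries can be used to flag categories where the uploaded chip covers few of the knowledge-base markers. The sketch below is one possible way to do that. The function name `coverage_report`, the 80% threshold, and the derived `proxyShare` value are assumptions for illustration only; the code reads coveragePct as reported and does not try to recompute it.

```python
from typing import Any


def coverage_report(snp_coverage: list[dict[str, Any]],
                    min_pct: float = 80.0) -> list[dict[str, Any]]:
    """Return one row per category, lowest coverage first."""
    rows = []
    for entry in snp_coverage:
        total = entry["totalSnps"]
        rows.append({
            "category": entry["category"],
            "coveragePct": entry["coveragePct"],
            # Fraction of the category's SNPs resolved through LD proxies.
            "proxyShare": entry["proxyHits"] / total if total else 0.0,
            "low": entry["coveragePct"] < min_pct,  # min_pct is an arbitrary example threshold
        })
    return sorted(rows, key=lambda row: row["coveragePct"])
```

Sorting by coveragePct puts the weakest categories first, which is convenient when deciding which findings to present with a caveat.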
v5.1 is fully backward compatible with v5.0 integrations. All existing fields retain their names, types, and value ranges. The following design decisions ensure zero-disruption upgrades:
| Concern | Resolution |
|---|---|
| nutrition.need vs risk | need is preserved. risk and risk_level are additive aliases with the same value. Consumers can use either. |
| New optional fields | All v5.1 fields (snpDetails, ingredientGuidance, nutritionRelevance, supplementRelevance, snpCoverage) are optional. Absent when no matching knowledge data exists. |
| CYP1A2 addition | cyp450.genes[] now contains 4 entries instead of 3. Consumers iterating over the array will automatically include CYP1A2. |
| meta.version | Changed from "5.0" to "5.1". meta.kbVersion remains "v5.0-gwas-cpic". |
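
A client that has to handle both v5.0 and v5.1 results can follow the resolutions in the table above with a few defensive accessors. The sketch below is illustrative: the helper names are hypothetical, and looking genes up by symbol rather than by array position is a suggested pattern, not a requirement stated by the API.

```python
from typing import Any


def nutrition_level(finding: dict[str, Any]) -> str:
    """Prefer the v5.1 aliases, then fall back to the original 'need' field."""
    for key in ("risk_level", "risk", "need"):
        value = finding.get(key)
        if value is not None:
            return value
    raise KeyError("finding has none of: risk_level, risk, need")


def primary_ingredients(finding: dict[str, Any]) -> list[str]:
    """Return primaryIngredients, or an empty list when the optional field is absent."""
    guidance = finding.get("ingredientGuidance") or {}
    return guidance.get("primaryIngredients", [])


def genes_by_symbol(cyp450: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Index cyp450.genes[] by gene symbol so the array length does not matter."""
    return {g["gene"]: g for g in cyp450.get("genes", [])}
```

With this shape, `genes_by_symbol(result["cyp450"]).get("CYP1A2")` simply returns `None` for a v5.0 result instead of raising an error.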
Different DTC chip versions (23andMe V3/V4/V5, AncestryDNA V1/V2) cover different sets of SNPs. When a target marker is not present in the user's data, the system automatically looks up linkage disequilibrium (LD) proxy variants — nearby SNPs that are strongly correlated (R² ≥ 0.7) with the target.
The LD proxy database contains 1,222 pre-computed proxy entries for 124 target SNPs, sourced from LDlink using the 1000 Genomes Phase 3 reference panel (GRCh37). Proxies are population-specific, with R² values recorded for EUR, EAS, AFR, SAS, and AMR super-populations.
When a proxy is used, the confidence score is automatically adjusted by multiplying it by the proxy's R² value. The coverage object in the result reports how many markers were resolved by direct hit, by proxy fallback, or were missing entirely.
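
The fallback described above can be sketched as follows. This is a simplified model rather than the service's implementation: the `Proxy` type, the `resolve_marker` function, and the rule of picking the highest-R² candidate are assumptions, and the sketch leaves out the allele mapping between proxy and target that a real pipeline would need.

```python
from dataclasses import dataclass
from typing import Optional

R2_THRESHOLD = 0.7  # minimum R² for a usable proxy, as stated in the text


@dataclass
class Proxy:
    """One pre-computed proxy entry (hypothetical shape)."""
    target_rsid: str
    proxy_rsid: str
    r2_by_population: dict[str, float]  # keys such as "EUR", "EAS", "AFR", "SAS", "AMR"


def resolve_marker(
    target_rsid: str,
    genotypes: dict[str, str],
    proxies: list[Proxy],
    population: str,
    base_confidence: float,
) -> Optional[tuple[str, str, float, str]]:
    """Return (rsid_used, genotype, confidence, source), or None if the marker is missing."""
    if target_rsid in genotypes:
        return target_rsid, genotypes[target_rsid], base_confidence, "direct"
    candidates = []
    for p in proxies:
        r2 = p.r2_by_population.get(population, 0.0)
        if (p.target_rsid == target_rsid and p.proxy_rsid in genotypes
                and r2 >= R2_THRESHOLD):
            candidates.append((r2, p))
    if not candidates:
        return None
    # Assumption: prefer the proxy most strongly correlated in this population.
    r2, best = max(candidates, key=lambda c: c[0])
    # Confidence is scaled by R², as described above.
    return best.proxy_rsid, genotypes[best.proxy_rsid], base_confidence * r2, "proxy"
```

For example, a marker with base confidence 0.5 that is resolved through a proxy with R² = 0.8 in the chosen population would be reported with confidence 0.4.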
import hashlib
import time
from datetime import datetime, timezone

import requests

BASE_URL = "https://api.gene2.ai"


def make_token(verify_id, secret):
    """Return (token, time_str) for the given verify ID and secret."""
    t = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M")
    raw = f"{verify_id}{secret}{t}"
    h = hashlib.md5(raw.encode()).hexdigest()
    return h[4:20], t


# Step 1: Request upload URL
token, time_str = make_token("UP02", "your_upload_secret")
resp = requests.post(
    f"{BASE_URL}/api/upload-url",
    headers={
        "Content-Type": "application/json",
        "id": "UP02",
        "token": token,
        "time": time_str,
    },
    json={
        "key": "gene2ai_user001_" + str(int(time.time())),
        "type": "txt",
        "format": "23andme",
    },
)
data = resp.json()
assert data["code"] == 0
job_key = data["data"]["key"]
upload_url = data["data"]["url"]

# Step 2: Upload the file
with open("genome_data.txt", "rb") as f:
    requests.put(upload_url, data=f.read(),
                 headers={"Content-Type": "text/plain"})

# Step 3: Poll for results (a fresh token is generated for every request)
while True:
    token, time_str = make_token("GE01", "your_query_secret")
    resp = requests.post(
        f"{BASE_URL}/api/query-result",
        headers={
            "Content-Type": "application/json",
            "id": "GE01",
            "token": token,
            "time": time_str,
        },
        json={"key": job_key},
    )
    result = resp.json()
    status = result["data"]["status"]
    if status in ("succeeded", "failed"):
        if status == "succeeded":
            r = result["data"]["result"]
            # v5.1: Access new nutrition fields
            for n in r["nutrition"]:
                if n.get("ingredientGuidance"):
                    print(f"{n['nutrient']}: {n['ingredientGuidance']['doseModifier']}")
                    print(f" Primary: {n['ingredientGuidance']['primaryIngredients']}")
                    print(f" Lab: {n.get('relatedLabIndicators', [])}")
            # v5.1: Access CYP1A2 supplement relevance
            for g in r["cyp450"]["genes"]:
                if g.get("supplementRelevance"):
                    print(f"{g['gene']} ({g['phenotype']}): {g['supplementRelevance']['guidanceNotes']}")
        break
    time.sleep(5)

import crypto from 'crypto';
import fs from 'fs';

const BASE_URL = 'https://api.gene2.ai';

function makeToken(verifyId, secret) {
  const now = new Date();
  const time = now.toISOString().slice(0, 16).replace('T', ' ');
  const raw = verifyId + secret + time;
  const hash = crypto.createHash('md5').update(raw).digest('hex');
  return { token: hash.slice(4, 20), time };
}

async function analyzeGenome(filePath, format = '23andme') {
  // Step 1: Request upload URL
  const { token: upToken, time: upTime } = makeToken('UP02', 'your_upload_secret');
  const key = 'gene2ai_' + Date.now();
  const uploadResp = await fetch(BASE_URL + '/api/upload-url', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', id: 'UP02', token: upToken, time: upTime },
    body: JSON.stringify({ key, type: 'txt', format })
  });
  const { data } = await uploadResp.json();

  // Step 2: Upload file
  const fileData = fs.readFileSync(filePath);
  await fetch(data.url, {
    method: 'PUT',
    headers: { 'Content-Type': 'text/plain' },
    body: fileData
  });

  // Step 3: Poll for results
  while (true) {
    const { token: qToken, time: qTime } = makeToken('GE01', 'your_query_secret');
    const resp = await fetch(BASE_URL + '/api/query-result', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json', id: 'GE01', token: qToken, time: qTime },
      body: JSON.stringify({ key })
    });
    const result = await resp.json();
    if (['succeeded', 'failed'].includes(result.data.status)) {
      return result.data;
    }
    await new Promise(r => setTimeout(r, 5000));
  }
}

// Usage
analyzeGenome('./genome_data.txt', '23andme').then(console.log);

v5.1: Nutrition-intervention-oriented output: ingredientGuidance, snpDetails, pathway, relatedLabIndicators for 18 P0 genes. nutritionRelevance for 13 health_risk categories. CYP1A2 star allele typing (4 alleles, 4 drugs). supplementRelevance for all 4 CYP genes. Per-category snpCoverage metadata.
v5.0: GWAS Catalog + CPIC expansion to 122K+ SNP markers, 24K+ genes. 14 PGx genes (10 new CPIC genes). LD proxy fallback (1,222 proxies). HLA allele typing (9 alleles). NAT2 acetylator typing (18 alleles, CPIC 2025). Population-aware analysis (5 super-populations).
Initial release with CYP2C19/CYP2D6/CYP2C9 typing, 5 analysis categories, basic SNP knowledge base.