Biological Age Marker (BAM) Criteria
We created the Biological Age Marker (BAM) criteria to validate biological age markers for Blueprint’s whole-body age rejuvenation protocols.
BAM criteria checklist
A biological age marker should meet all BAM criteria.
- Classification system: The marker should be assigned to a named category (whole body, organ system, organ, sub-organ, tissue, cell, or sub-cellular process), recorded as classification system metadata.
- Markers should be combined to better predict organ biological ages and diverse clinical outcomes: Markers can be grouped and weighted together to calculate the biological age of a sub-organ anatomical region, tissue, or organ, e.g., ‘lens age’; organs can be weighted together to calculate organ system biological age; organs or organ systems can be weighted together to calculate whole-body biological age; and any of the above can be grouped to predict a large, stackable amount of clinical outcome risk.
One should appreciate that integrating hundreds or thousands of markers, e.g., an average of 10+ per organ, may be needed to predict organ, organ system, and whole-body biological age, >95% accurate multi-decade or lifetime all-cause mortality risk, or all age-related disease clinical outcomes with any theoretical certainty. In addition, other non-age-related but clinical outcome predicting (NARB-COP) biomarkers, extrinsic aging biomarkers, and accelerated aging biomarkers may need to be included for specific individuals.
Biological age markers can also represent less anatomically distinct entities, such as multi-organ system age markers, e.g., ‘exercise performance age’; all-cause mortality predictors, e.g., ‘PBMC WBC DNAm GrimAge’; sub-cellular structures, e.g., ‘mitochondrial age’; or a SENS, López-Otín, or similar hallmark-of-aging damage type, e.g., ‘extracellular matrix extracellular crosslinking age.’
Independent risk factors can also be stacked within an organ, organ system, or all-cause mortality related clinical outcome. For example, PWV + intima-media thickness + 2D intima-media + 3D intima-media + Augmentation Index + central BP + 24-hour BP + longitudinal strain + maximum diameters + quantitative calcification + plaque characteristics at multiple major arteries, as relevant, may better predict arterial age and its sub-organ region standard deviation ages than any one or subset of these markers alone. Evidence for such multi-marker prediction of organ-related clinical outcomes should be accumulated and appraised for each organ.
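The grouping and weighting of markers into organ and organ system ages can be sketched as a nested weighted mean. This is only an illustration under stated assumptions: the function name `weighted_age`, the marker names, the ages, and the weights below are all hypothetical, and a weighted mean is just one possible way to combine markers.

```python
def weighted_age(ages, weights):
    """Weighted mean of biological ages; both dicts share the same keys."""
    total_weight = sum(weights[name] for name in ages)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(ages[name] * weights[name] for name in ages) / total_weight


# Hypothetical marker-level ages (years) and weights for one organ.
arterial_markers = {"pwv": 42.0, "intima_media_thickness": 46.0, "central_bp": 40.0}
arterial_weights = {"pwv": 2.0, "intima_media_thickness": 1.0, "central_bp": 1.0}
arterial_age = weighted_age(arterial_markers, arterial_weights)

# Organ ages can then be combined the same way into an organ system age.
cardiovascular_age = weighted_age(
    {"arteries": arterial_age, "heart": 44.0},  # hypothetical organ ages
    {"arteries": 1.0, "heart": 1.0},            # hypothetical equal weights
)
print(arterial_age, cardiovascular_age)  # 42.5 43.25
```

In practice the weights would need to come from evidence on how well each marker predicts the relevant clinical outcomes; the equal and 2:1 weights here are placeholders.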
- Should have a biological age resolution of 5 years, ideally 2 years, AFTER adjusting for patient, assay, and operator coefficients of variation:
Should have a biological age resolution of at least 5 years, ideally 2 years, after adjusting for individual intra-day and intra-month variation. This means the test result should be able to distinguish people of different biological ages. For example, AST is not a biological age marker because data suggests it only rises from 24 to 28 IU/L between ages 20 and 80. The delta is therefore 4 IU/L over 60 years of biological age change: 60/4 = a resolution of 15 years per IU/L. Taking into account ±2 IU/L of intra-day, intra-month, or intra-assay variation (which may be generous), the resolution becomes 60/(4 − (2 + 2)) = 60/0, i.e., infinite: a marker that cannot meet the criteria.
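The AST arithmetic can be reproduced with a short sketch. The function name `resolution_years` and its parameters are hypothetical; the calculation simply follows the text by subtracting the variation band from the marker's total change over the age span.

```python
import math


def resolution_years(age_span_years, marker_change, variation_band=0.0):
    """Years of biological age per unit of marker change left after noise.

    Follows the arithmetic in the text: the variation band is subtracted
    from the marker's total change over the age span. If nothing is left,
    the marker cannot resolve age at all and the result is infinity.
    """
    usable_change = marker_change - variation_band
    if usable_change <= 0:
        return math.inf
    return age_span_years / usable_change


# AST example from the text: 24 -> 28 IU/L between ages 20 and 80.
raw = resolution_years(80 - 20, 28 - 24)               # 60 / 4
adjusted = resolution_years(80 - 20, 28 - 24, 2 + 2)   # 60 / (4 - 4)
print(raw, adjusted)  # 15.0 inf
```

A marker would pass this criterion only if the adjusted figure came out at 5 years or less, ideally 2.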
- Should make biogerontological sense: Should fit with the current understanding of aging biology while also appreciating the unknowns of biogerontology.
- Efforts should be made to eliminate or reduce the effect of neutral false positives:
Markers that change with age reliably but have a neutral effect on clinical outcome when rejuvenated, e.g., potentially some forms of beta-amyloid (removing it has no benefit or harm, in some populations).
- Efforts should be made to eliminate or reduce the effect of negative false positives:
Markers that change with age reliably but have a negative effect on clinical outcomes when rejuvenated, e.g., potentially long-term rejuvenation of IGF-1 levels (young, 20s-level IGF-1 is linked to worse outcomes in multi-confounder observational studies, preclinical lifespan studies, and human lifespan studies (Laron syndrome)), or, e.g., liver sinusoidal endothelial senescent cells (certain types of analysis may cause hepatic failure).
- Efforts should be made to highlight the chance of true positives:
Markers that change reliably with age and have a positive effect on clinical outcomes when rejuvenated; such markers should be highlighted with an extensive evidence summary that supports the marker’s standing as one whose rejuvenation reliably improves clinical outcomes.
- External validity in measurement methods:
Should be measured by externally valid (real-world reproducible) methods, including the same equipment used to derive the reference range or a validatable surrogate, with sufficient sensitivity, specificity, and technical procedures (e.g., Withings Body Cardio scales are validated against SphygmoCor for carotid-femoral pulse wave velocity with around 85% accuracy, whereas fingertip PWV measures are not validated as such and should not be used). For example, SphygmoCor and other applanation tonometry devices, ultrasound-based PWV measurements, and Withings Body Cardio each have their own ‘age graph,’ with different reference ranges for each device. However, the shapes of the graphs are typically similar, just translated. Each biological age marker should have its data plotted against chronological age on the X-axis, with the device and methodology used to measure it recorded as metadata to this graph.
- Coefficient of variation control at n=1:
Multiple readings should be taken to reduce the coefficient of variation across patient, assay, and operator to an acceptable level, taking into account biological age resolution, finances, and test risks. In addition, the mean of multiple readings of the same marker should be used to reduce noise, for example, triplicate measurements of Diagnostics AGE volar forearm autofluorescence with no confounding factors, across multiple days, rather than single measures; or a 7- or 30-day mean of WHOOP 5-minute deep sleep RMSSD HRV rather than one day.
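As a minimal sketch of why repeated readings help, the snippet below summarizes a set of replicate readings by their mean, coefficient of variation, and standard error of the mean. The function name `summarize_readings` and the readings themselves are hypothetical. The standard error shrinks with the square root of the number of readings only if the readings are independent, which is an assumption here.

```python
import math
import statistics


def summarize_readings(readings):
    """Return mean, coefficient of variation (%), and standard error of the mean."""
    if len(readings) < 2:
        raise ValueError("need at least two readings to estimate variation")
    mean = statistics.mean(readings)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    sd = statistics.stdev(readings)  # sample standard deviation
    return {
        "mean": mean,
        "cv_percent": 100.0 * sd / mean,
        "sem": sd / math.sqrt(len(readings)),
    }


# Hypothetical triplicate readings of one marker (arbitrary units).
triplicate = summarize_readings([2.0, 2.2, 1.8])
print(triplicate)
```

The mean, rather than any single reading, would then be the value converted to a biological age, and the coefficient of variation indicates whether more readings are needed to reach the target resolution.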
- Chrono-age independency:
The biological age calculation should not require chronological age as an input.
- Marker dependent X-axis limits:
Should only be used to calculate biological age over the age range for which it is relevant (e.g., Augmentation Index only changes between ages 18 and 50, after which the median gradient = 0; another example is pediatric limits).
- Good healthy reference populations:
Reference graphs should be based on a healthy population with good inclusion-exclusion criteria, ideally with a high longevity level 1 score (score/30) (150 mins+ weekly moderate or vigorous MET exercise, high AHEI-2010 diet scores, calorie-restricted with optimal nutrition, BMI 18.5-22.5, no significant conditions, never smoker, alcohol maximum two units per day).
- Non-linear or linear curve options:
All graphs should be fitted with the best-fitting curve, such as linear, exponential, logarithmic, or polynomial (degree 2 to 5), rather than defaulting to a linear fit without considering and discussing non-linear alternatives.
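One way to compare curve shapes, sketched here with the standard library only, is to fit each candidate and compare residual sums of squares. The function names and the synthetic data are hypothetical. Only the three two-parameter shapes (linear, exponential, logarithmic) are compared; polynomials have more parameters and would need a penalty for the extra flexibility, which this sketch does not attempt. The exponential curve is fitted on log-transformed values, a common simplification.

```python
import math


def least_squares_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_candidates(ages, values):
    """Fit linear, exponential, and logarithmic curves; return {name: (predict, rss)}."""
    fits = {}

    a, b = least_squares_line(ages, values)
    fits["linear"] = lambda x, a=a, b=b: a * x + b

    if all(v > 0 for v in values):  # exponential fit needs log(value)
        a, b = least_squares_line(ages, [math.log(v) for v in values])
        fits["exponential"] = lambda x, a=a, b=b: math.exp(a * x + b)

    if all(x > 0 for x in ages):  # logarithmic fit needs log(age)
        a, b = least_squares_line([math.log(x) for x in ages], values)
        fits["logarithmic"] = lambda x, a=a, b=b: a * math.log(x) + b

    # Compare all candidates on the same scale: residual sum of squares.
    return {
        name: (predict, sum((predict(x) - y) ** 2 for x, y in zip(ages, values)))
        for name, predict in fits.items()
    }


def best_fit_name(ages, values):
    """Name of the candidate curve with the smallest residual sum of squares."""
    results = fit_candidates(ages, values)
    return min(results, key=lambda name: results[name][1])


# Synthetic data for illustration only: a marker that grows exponentially with age.
ages = list(range(20, 81, 10))
values = [5.0 * math.exp(0.03 * age) for age in ages]
print(best_fit_name(ages, values))  # exponential
```

With real reference data the residuals will never be this clean, so the choice of curve should still be discussed rather than taken from the lowest residual alone.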
- Demographic-specific reference ranges and efforts to achieve them:
Gender, ethnicity, height, weight, body surface area, and other confounders should be taken into account to customize biological age equations to ever more specific demographics; demographics should be highlighted in the equation name.
- Should represent intrinsic aging, not extrinsic or accelerated aging:
Should not be conflated with accelerated aging markers; for example, quantitative shear wave elastography or magnetic resonance elastography of the liver are not known to increase with intrinsic aging but are still crucial for clinical outcomes prediction at n=1. Hence NAFLD, NASH, and liver fibrosis are not age-related diseases but rather show accelerated aging. They may be able to form part of an accelerated aging or clinical outcomes risk equation, or be used to score that organ, system, or tissue as older than one’s chronological age, but not younger, and hence cannot prove intrinsic rejuvenation. On the other hand, liver parenchymal volume, diffusion-weighted quantitative MRI markers, and senescent cell or fenestration density on biopsy change with age and are more likely to be considered for review as intrinsic aging markers.
We have begun measuring biological age markers in all 78 organs where possible, prioritizing when necessary the organs with the highest area under the curve for predicting all-cause mortality, as we believe it is less credible to attempt to prove a cure for aging without such data. Furthermore, measuring all 78 organs enables a zero-assumption model, as it is unclear what percentage of all-cause mortality risk at any given chronological age, and more so at n=1, is contributed by each organ. Nail cancer or nail fungal infections may be clinically significant in some individuals. Maybe grey hair due to catalase enzyme failure has systemic consequences? Hair and nails are the two most extreme examples of organs that are theoretically the least clinically important for the average person.
There is a lack of consensus on what makes good criteria for organ biological age. Biological ages are rarely used in clinical practice, other than a limited number of lung (FEV1/FVC/PEFR/DLCO) or arterial (PWV) markers. Some non-clinical services, such as consumer apps, may give a ‘skin age’ (Visia 7 TruSkin age) or ‘hearing age’; however, these markers are of unclear validity. Likewise, exercise testing services and the American College of Sports Medicine (ACSM) may use ‘exercise marker performance age,’ typically only for VO2 max, despite the ACSM guidelines having age-related reference ranges for many other fitness tests.
Previous publications were taken into account, such as the recommendations for aging biomarkers by Butler et al. [Butler R.N., 2004]. Comments:
- change with age → there is a need to clarify how much markers need to change with age to have clinical and statistical significance, taking into account intra/inter-assay, subject, and operator variation and biological age resolution.
- predicting death better than calendar age → death is the least powered marker in healthy populations; other clinical outcomes, high-level surrogates, and low-level surrogate markers are now additional options (note the paper was from 2004)
- to determine the early stages of a specific pathology, in particular an age-related disease → age-related diseases should not be conflated with ICD-11, which misses many unclassified age-related diseases; early stages should not be presumed necessary, as later-stage detection is better than nothing
- be minimally invasive: does not require major surgery or a painful procedure → some biopsy-based methods may be necessary, given with anesthetic, similar to ICD-11 disease diagnosis requirements. It is unclear how ‘minimally invasive’ is defined, as this is highly subjective.
And Moskalev [Moskalev A., 2019] recommendations:
- Have a high sensitivity to early signs of aging of the body → this should not be a requirement, as it implies low sensitivity is as clinically useless as no sensitivity, which makes no sense.
- Be predictable over the foreseeable time frame → it is unclear what this means: predictable for the individual in response to intervention, or something else?
- Have low analytical variability – be reliable and reproducible.
Jylhävä et al. 2017:
“Most of the biological age predictors we have discussed have little or no interaction with each other. Thus, effects are independent of each other and may describe different parts of the aging process. A combination of markers would increase the predictive power” → I agree with this. However, the authors focus solely on one type of PBMC telomere length, one type of WBC DNAm test, and some simple composite biomarker scores, which are insufficient to capture 100% of ACM.
As you can see, PBMC median telomere length does not predict all-cause mortality, and other tests may predict semi-stackable ACM risk.
- Similarly, Li and colleagues (2020) found that FI + GrimAge had stacked predictivity
- “Integration of multiple biomarkers can be even more powerful. The Dunedin Study 91 has focused on middle-aged people and used different measurements (telomere lengths, epigenetic clocks, and clinical biomarker composites) and compared their performance in predicting health status, as measured by physical functionality, cognitive decline, and subjective signs of aging. The three types of measurements in this study do not correlate with each other, suggesting that there is no single index of biological age.” → more evidence of semi-stacking
The American Federation for Aging Research (AFAR) has proposed the following criteria for a biomarker of aging:
- It must predict the rate of aging → This is a tautology: the biomarker you measure to ascertain the rate of aging must predict the rate of aging, so the criterion does not make sense.
- It must monitor a fundamental process that underlies the aging process, not the effects of disease → It is impossible to know the difference in most cases, so this is not a valuable criterion.
- It must be tested repeatedly without harming the person → This is not specific enough: financial harm can occur from too many MRIs, venous collapse can arise from too many blood tests, and ‘repeatedly’ is non-specific.
- It must be something that works in humans and laboratory animals → it is unclear what ‘works’ means.
Like AGREE-II, AMSTAR-II, STROBE, and other checklists, this checklist can be used by journals, funders, investors, researchers, and clinicians.