Age estimation using DNA methylation technique in forensics: a systematic review

Abstract

Background

In addition to the DNA sequence, epigenetic markers have become substantial forensic tools during the last decade. Estimating the age of an individual from human biological remains may provide information for a forensic investigation. Molecular age estimation can be based on telomere length, mRNA mutations, or sjTRECs, but the accuracy is not sufficient for forensic practice because of the high margin of error.

Main body

One solution to this problem is to use DNA methylation methods. DNA methylation markers for tissue identification at age-associated CpG sites have been suggested as the most informative biomarkers for estimating the age of an unknown donor. This review aims to give an overview of DNA methylation profiling for estimating age in cases of forensic relevance and of the important aspects in determining the mean absolute deviation (MAD) or mean absolute error (MAE) of the estimated age. Online database searching was performed through PubMed, Scopus, and Google Scholar with keywords selected for forensic age estimation. Thirty-two studies were included in the review, with various DNA sample types, most commonly blood. Pyrosequencing and EpiTYPER were the methods most often used in DNA analysis. The MAD of the estimates from DNA methylation was about 3 to 5 years, which was better than other methods such as those based on telomere length or signal-joint T-cell receptor excision circles. The ELOVL2 gene was a commonly used DNA methylation marker in age estimation.

Conclusion

DNA methylation is a favorable candidate for estimating the age at the time of death in forensic profiling, with an uncertainty (mean absolute deviation) of about 3 to 5 years in the predicted age. The sample type, the platform techniques used, and the methods used to construct age-predictive models were important in determining the accuracy expressed as mean absolute deviation or mean absolute error. The DNA methylation outcome suggests good potential to support conventional STR profiling in forensic cases.

Background

The study of epigenetics refers to the heritable changes in gene function that cannot be explained by DNA sequence changes (Deans and Maggert 2015; Felsenfeld 2014). The term “epigenome” refers to the overall epigenetic status of a cell, parallel to the term “genome”. The epigenome is the set of chemical modifications to the DNA that alter gene expression. Epigenetic changes control how and when genes are turned on or off, which regulates protein production in certain cells. Epigenetic modification types include DNA methylation, histone modification, and chromatin structuring. DNA methylation is a common type of epigenetic modification. The chromatin proteins associated with DNA may be activated or silenced so that a cell expresses only the genes necessary for an activity such as the production of a certain protein (Bird 2007; Vidaki et al. 2013). DNA methylation plays an important role in embryonic development, reprogramming, transcription, imprinting, chromosomal stability, and X-chromosome inactivation. The epigenetic pattern is preserved during cell division in the same way as the DNA sequence is inherited from one generation to the next. However, it can change over an individual’s lifetime (Kanherkar et al. 2014). Epigenetic marks can be affected by environmental exposure, such as diet and smoking (Lee and Pausova 2013).

In mammalian cells, DNA methylation primarily affects cytosines that are followed by guanines in the 5′-3′ direction of the DNA double helix, and it adds a methyl group (-CH3) to the 5′ carbon (C5) of the cytosine. These 5′-3′ CG methylation sites in DNA are called “CpG” dinucleotides, which are mostly methylated in the human genome (Ehrlich et al. 1982). Unmethylated CpGs are predominantly encountered in groups called “CpG islands”, which are 300–3000 bp long with high CG density (>55% CG content) and are mostly located at the promoters of housekeeping genes (Antequerra and Bird 1993; Espada and Esteller 2010). In the last decades, studies have shown that certain CpG sites often become either hypermethylated or hypomethylated as age increases (Zhang et al. 2011). Hypermethylation (excessive methylation) or hypomethylation (loss of appropriate methylation) can promote carcinogenesis within a living individual (Auerkari 2006).
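The length and CG-content thresholds quoted above can be checked on a sequence with a few lines of code. The following is a minimal sketch only: the function names and the toy sequence are illustrative assumptions, and only the two thresholds given in the text are applied.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 to 1.0)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def count_cpg(seq: str) -> int:
    """Number of 5'-CG-3' dinucleotides on the given strand."""
    seq = seq.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")


def meets_island_criteria(seq: str, min_len: int = 300, max_len: int = 3000,
                          min_gc: float = 0.55) -> bool:
    """Apply only the length and CG-content thresholds quoted in the text."""
    return min_len <= len(seq) <= max_len and gc_content(seq) > min_gc


# Toy sequence (hypothetical, not a real promoter): 40 repeats of a 10-base unit.
toy = "CGGCGCATCG" * 40
print(len(toy), round(gc_content(toy), 2), count_cpg(toy), meets_island_criteria(toy))
# prints: 400 0.8 120 True
```

The thresholds are exposed as keyword arguments so that a stricter or looser island definition can be tried without changing the functions.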

In a crime investigation scene where highly limited biological remains are found, such as blood, semen, tissue, or saliva, accurate age estimation can be important for the police to narrow down the identity of a victim or criminal. The traditional materials required for age estimation, such as large pieces of skeletal remains, are not always available in crime scenes (Feng et al. 2018). In order to estimate human age, several molecular-based strategies have been proposed, such as telomere repeat length that decreases with increasing age (Weidner et al. 2014), mRNA mutations that accumulate with increasing age, T-cell DNA rearrangements (sjTREC) (Zubakov et al. 2016), age-dependent deletions of mitochondrial DNA, or protein alterations such as the racemization of aspartic acid and advanced glycation end-products (Wochna et al. 2018). Nonetheless, only DNA methylation has provided an acceptable accuracy that is clinically useful (Freire-Aradas et al. 2017).

A study of DNA methylation has provided a forensic method for epigenetic female sex typing. The method is based on the methylation pattern at a repetitive DXZ4 locus that is highly methylated on the active X chromosome but hypomethylated on the inactive X chromosome. The PCR protocol to detect the latter is very sensitive and only requires 50 pg of DNA for female sex typing (Naito et al. 1993). The methyl group that DNA methylation adds at the 5′ position of cytosine residues remains in the extracted DNA, so this epigenetic marker is compatible with the standard procedures of forensics (Bird 2002; Sijen 2015). The analysis of DNA methylation patterns in forensics may give hints on pathological conditions (Klutstein et al. 2016) or circumstances that lead to death (Virani et al. 2016) and indicate the age of the DNA donor (Feng et al. 2018). This review aims to address DNA methylation-based age estimation and the important aspects of its uncertainty in forensic applications.

Main text

Methods

An online literature search was performed in the Scopus, Google Scholar, and PubMed/MEDLINE databases using the keywords “age estimation” OR “age determination” AND “DNA methylation” AND “forensic”. The guidelines of the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) were used for the systematic review (Moher et al. 2009).

The inclusion and exclusion criteria were determined as follows. The inclusion criteria were studies describing DNA methylation analysis for age estimation, with or without other methods, with no restriction on sample size or age range, but restricted to reports in the English language, publication within 2014–2019, and topics related to forensic studies. The exclusion criteria were review studies, age estimation without molecular analysis, and abstracts without a full paper available.

Study selection

After the initial screening of titles and abstracts, the full articles were read for possible inclusion in the review. Full articles that met the inclusion criteria and none of the exclusion criteria were considered eligible.

Results

Literature search and screening

The analysis initially included 495 studies, and after the removal of duplicate articles, 462 were left for screening. After excluding 340 articles by the relevance of title and abstract, 122 full-text articles were left. After further careful screening for more detailed contents, 91 full articles were excluded, leaving 31 eligible full articles. The procedure of the data selection is presented in Fig. 1.
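The screening counts reported above can be cross-checked with simple arithmetic. The short sketch below uses only the numbers given in the text; the variable names are illustrative.

```python
# Counts as reported in the text for each screening stage.
identified = 495
after_duplicates = 462
excluded_title_abstract = 340
excluded_full_text = 91

duplicates_removed = identified - after_duplicates
full_text_assessed = after_duplicates - excluded_title_abstract
eligible = full_text_assessed - excluded_full_text

print(duplicates_removed, full_text_assessed, eligible)
# prints: 33 122 31
```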

Fig. 1 Flow chart of screening analysis

Data extraction

From the studies comparing the different methods of forensic age estimation, the following data were extracted: name of the first author, year of publication, methods, source of samples, number of samples, age (in years), and age prediction (in years) as MAD and RMSE/SEE (Table 1). For the studies on DNA methylation methods, the following data were extracted: name of the first author, year of publication, population, source of samples, age range (in years), sample size, CpG coverage, gene(s), technique/input DNA for bisulfite conversion, statistical model, and age prediction by MAD or MAE in years (Table 2).

Table 1 Selected methods of forensic age estimation
Table 2 DNA methylation methods for age estimation in forensic studies

Table 1 shows previous studies on DNA methylation-based age estimation together with other methods based on telomere length, mRNA methylation, and signal-joint T-cell receptor excision circles (sjTRECs). The uncertainty levels (as MAD) in age prediction are compiled in Fig. 2.

Fig. 2 The age prediction (MAD) with different methods (years)

In the included studies (Table 2), the populations and the sample types varied, as seen in Figs. 3 and 4, respectively. The study age range was 0–104 years, and the number of samples ranged from 16 to 725, as shown in Figs. 5 and 6, respectively. The CpG coverage in the studies ranged from 1 to 32 CpGs. Various candidate genes were used for age prediction. The ELOVL2 gene was most frequently used in studies with different body fluids and teeth (Fig. 7). The techniques used in the studies are compiled in Fig. 8.

Fig. 3 The sample population (%)

Fig. 4 The sample types (%)

Fig. 5 The age range (years)

Fig. 6 The samples origin (%)

Fig. 7 The genes mostly used in the studies (%)

Fig. 8 The techniques used in the studies (%)

The platforms in age prediction used the multivariate linear regression model (MLRM), SNaPshot, methylation sensitive-high resolution melting (MS-HRM), EpiTYPER, next-generation sequencing (NGS), massively parallel sequencing (MPS), support vector regression model (SVRM), multivariate quadratic regression model (MQDRM), multivariate quantile regression model (MQTRM), random forest regression (RFR), generalized regression neural networks (GRNN), neural network (NN), artificial neural network (ANN), R-models, or combinations, as shown in Fig. 9.

Fig. 9 The platform of age prediction model used in the study (%)

The uncertainty in the predicted age ranged in MAD from ±1.2 (Giuliani et al. 2016) to 7.87 years (Huang et al. 2015), in MAE from ±0.94 (Freire-Aradas et al. 2018) to 7.45 years (Vidaki et al. 2017), and in RMSE from ±4.03 (Hong et al. 2019) to 11.1 years (Aliferi et al. 2018). The MAD and MAE levels for mixed samples are presented in Fig. 10.
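As an illustration of how such uncertainty measures are obtained from paired chronological and predicted ages, the sketch below computes a mean absolute deviation, a median absolute error, and a root mean square error. The function names and the age values are assumptions made for illustration, and individual studies may define MAD or MAE slightly differently.

```python
import math
import statistics


def mean_absolute_deviation(predicted, actual):
    """Mean of |predicted - actual| over all samples, in years."""
    return statistics.fmean(abs(p - a) for p, a in zip(predicted, actual))


def median_absolute_error(predicted, actual):
    """Median of |predicted - actual| over all samples, in years."""
    return statistics.median(abs(p - a) for p, a in zip(predicted, actual))


def root_mean_square_error(predicted, actual):
    """Square root of the mean squared difference, in years."""
    return math.sqrt(statistics.fmean((p - a) ** 2 for p, a in zip(predicted, actual)))


# Hypothetical ages in years, invented for illustration only.
chronological = [20, 35, 50, 65, 80]
predicted = [23, 33, 54, 60, 90]

print(mean_absolute_deviation(predicted, chronological))
print(median_absolute_error(predicted, chronological))
print(round(root_mean_square_error(predicted, chronological), 2))
```

Because the squared term weights large errors more heavily, the RMSE of a dataset is never smaller than its mean absolute deviation, which is consistent with the higher RMSE values reported above.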

Fig. 10 The age prediction in mixed samples in MAD (years), *in MAE (years)

Discussion

Age estimation is important in forensic cases involving persons of unknown age, in fraud cases, and in other legal matters such as victim identification. Several molecular methods can be used to estimate human age, such as those based on telomere length, mRNA, DNA rearrangement or sjTREC, and aspartic amino acid (Asp) racemization, all of which change with increasing age (Zubakov et al. 2016).

Telomeres are located at the terminal regions of chromosomes and protect chromosome ends. Shortening of the telomeres leads to cell senescence, characterized by the incapacity of the cell to replicate. The measurement of telomere length for the estimation of human age was first published using the Southern blot technique (Butler et al. 1998). Current methods of measuring telomere length for age prediction have been presented in two studies (Weidner et al. 2014; Zubakov et al. 2016), in which the MAD of the predicted age was more than 10 years, while it was 5 years for the DNA methylation method. The telomere length-based approach is hence not sufficient for forensic practice because of the high margin of error.

By identifying mRNA markers via microarray screening and validating them with TaqMan qRT-PCR profiling, an age prediction model can be built. The correlation between gene expression and age has been used to find the strongest of nine mRNA candidate markers. The MAD for mRNA-based age prediction was about 9 years, i.e., more than the 5 years for DNA methylation-based prediction (Zubakov et al. 2016).

The sjTREC levels in the blood decrease with increasing age. The sjTRECs are episomal DNA molecules, by-products of T-cell somatic rearrangements in the T-cell receptor loci in order to recognize a wide range of foreign antigens. These molecules do not replicate and are progressively lost during subsequent cell divisions (Yamanoi et al. 2018). The MAD for sjTREC-based age prediction is 9–10 years, again more than the 4–5 years for DNA methylation-based age prediction (Cho et al. 2017).

The sjTREC-based methods are only applicable with a limited range of tissues under specific conditions (fresh blood samples and tissues of fresh cadavers) and do not meet the requirements of robustness under variable environmental factors and accurate estimation models (Zubakov et al. 2016).

Other methods of molecular age determination include Asp racemization (Hartomo et al. n.d.). Racemization is a first-order kinetic reaction in which the amino acid changes from the levo (l) to the dextro (d) form. Aspartic acid (Asp) is a component of proteins in many human tissues, including the teeth. Asp is optically active because of its asymmetric carbon atom and has the highest racemization rate of all amino acids (Ogino and Onino 1988). In cartilage, bone, and teeth, the d form accumulates at a low, temperature-dependent rate that is linear with age. The d/l ratio may therefore be used to estimate chronological age. In dentin, the MAD of the estimated chronological age was approximately 3 years (Ohtani and Yamamoto 2010).
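For reference, reversible first-order racemization is commonly written in the integrated form below. Here $k$ is the rate constant, $t$ is the age, $D/L$ is the measured d/l ratio, and the constant accounts for the d form already present at $t = 0$. This is a general sketch of the kinetics and not the specific calibration used in the cited studies.

```latex
\ln\!\left(\frac{1 + D/L}{1 - D/L}\right) = 2kt + \text{constant}
```

For small d/l ratios the left-hand side is approximately $2\,(D/L)$, which is why the d/l ratio itself appears roughly linear with age.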

The DNA methylation-based methods have developed rapidly since the first relevant studies on DNA methylation and age estimation were published (Naito et al. 1993). Studies comparing DNA methylation-based age estimation with other methods showed that, e.g., sjTREC-based methods alone gave a MAD of about 10 years, while DNA methylation gave a MAD of about 4 years. Combining sjTRECs and DNA methylation yielded even higher predictive accuracy, with a MAD of about 3.3 years (Cho et al. 2017), while a combined set of five DNA methylation markers and one mRNA marker gave a MAD of 4.6 years (Zubakov et al. 2016).

In line with increasing age, DNA hypomethylation increases across the genome (affecting intronic, exonic, promoter, and intergenic regions); in other words, the global level of methylated genomic DNA decreases as a person ages (Wilson et al. 1987). However, DNA methylation assays are also subject to variation in reproducibility depending on the type of tissue analyzed, because some of the 5mC methylation marks in DNA are tissue-specific (Rana 2018). To examine DNA methylation in different types of tissue, Horvath (2013) developed a multi-tissue age predictor, which allowed estimating the DNA methylation age in most tissues. The age predictor used 8000 samples from 82 Illumina DNA methylation array datasets, covering 51 healthy tissues and cell types. The multi-tissue age predictor is freely available (Horvath 2013).

The CpG coverage can be adapted to the sample source, for example when buccal swabs are used as the DNA source for age prediction. Because buccal swabs contain both buccal epithelial cells and leukocytes, two additional cell type-specific CpGs were included in a multivariate model, and these actually improved the epigenetic age prediction (Eipel et al. 2016). Different oral tissue sources showed different MADs: the MAD was 1.2 years for cementum, 2.3 years for dental pulp, 7 years for dentin (Giuliani et al. 2016), 6 years for saliva, and 7.7 years for cigarette butts (Hamano et al. 2017).

Age prediction was more accurate for younger individuals, and the accuracy decreased with increasing age. Predictions correct within 5 years were achieved in 86.7% of cases in the 2–19 years age category, decreasing to 50% in the 60–75 years age category (Zbiec-Piekarska et al. 2015a). Validation of the age-prediction model for young age ranges showed a MAE of ±1.25 years in the 2–18 years donor age range, while the MAE was ±3.07 years in the adult populations (Freire-Aradas et al. 2018). The CpG site markers whose methylation decreased with age were CCDC102B, ASP, C1orf132, and chr16:85395429, while methylation of ELOVL2, FHL2, and PDE4C increased with age (Park et al. 2016). On the other hand, young age tends to be overestimated, while older age tends to be underestimated more often (Naue et al. 2017). An experimental study showed that ELOVL2, FHL2, PENK, and KLF14 did not display an age-related change in gene expression in peripheral blood mononucleated cells (Steegenga et al. 2014).
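The share of predictions that fall within a fixed tolerance of the chronological age, such as the 5-year figure quoted above, can be computed per age category. The sketch below shows one way to do this; the function names, the category bounds, and the sample values are assumptions for illustration and are not data from the cited studies.

```python
def fraction_within(pairs, tolerance=5.0):
    """Share of (chronological, predicted) pairs with error within +/- tolerance years."""
    if not pairs:
        return float("nan")
    hits = sum(1 for actual, pred in pairs if abs(pred - actual) <= tolerance)
    return hits / len(pairs)


def by_age_category(pairs, categories, tolerance=5.0):
    """Group pairs by chronological-age category (inclusive bounds) and score each group."""
    result = {}
    for low, high in categories:
        group = [(a, p) for a, p in pairs if low <= a <= high]
        result[(low, high)] = fraction_within(group, tolerance)
    return result


# Hypothetical (chronological, predicted) ages in years, invented for illustration.
samples = [(5, 6), (12, 15), (18, 17), (62, 70), (68, 66), (74, 81), (70, 73)]
categories = [(2, 19), (60, 75)]
rates = by_age_category(samples, categories)
print(rates)
```

Reporting this rate alongside the MAD makes the age dependence of the error visible, which a single pooled MAD would hide.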

The ELOVL2 locus in blood provides very good information on human chronological age, and its methylation did not change significantly after 4 weeks of storage at room temperature, although the rate of positive PCR results gradually decreased with increasing storage time (Zbiec-Piekarska et al. 2015b). The ELOVL2 gene was used most often, especially in blood and bloodstain samples (75%), as seen in Fig. 7. ELOVL2 also appeared to be an excellent age predictor across multiple ethnic groups such as Polish (Zbiec-Piekarska et al. 2015a,b), Koreans (Cho et al. 2017), and Singaporeans (Thong et al. 2017). ELOVL2 was not affected by disease, so it appears suitable for forensic age prediction (Spolnicka et al. 2018). ELOVL2 is also a stable marker and shows a strong positive correlation between methylation and age in other samples such as teeth (Giuliani et al. 2016; Bekaert et al. 2015b), buccal swabs (Bekaert et al. 2015a; Giuliani et al. 2016; Jung et al. 2019), saliva (Jung et al. 2019), and even cigarette butts (Hamano et al. 2017). The PDE4C gene was used in 33.3% of studies using blood samples, teeth, and buccal swabs. Eipel et al. demonstrated that methylation of PDE4C (cg17861230) has a higher correlation with chronological age in saliva and buccal swabs than in blood. Semen samples, in contrast, were mostly analyzed with NOX4 (cg06979108), followed by TTC7B (cg06304190) and cg12837463, which has no associated gene (Lee et al. 2015; Li et al. 2017; Lee et al. 2018; Richards et al. 2019).

For the target sites or CpG coverage and the age prediction accuracy, three target sites have been suggested as a preferable number for practical reasons (Weidner et al. 2014; Park et al. 2016), while one study suggested two target sites (Hamano et al. 2017). The age differential in methylation might be similar or significantly disparate between different tissues, depending on the specific CpG site. Therefore, in designing an age-prediction model, the method should be investigated thoroughly for multi-tissue forensic applicability (Aliferi et al. 2018).

Epigenetic studies are best performed by comparing monozygotic twins because they share the same genetic basis. Twins initially display the same methylation patterns and gradually show more differences over time (Kader and Ghai 2015). A specific forensic marker for discriminating monozygotic twins is based on differences in LINE-1 interspersed repeat sequences (Xu et al. 2015b). Analysis of 31 CpG sites from three loci in buccal samples of identical twins demonstrated that at least one CpG site was significantly differently methylated in all twin pairs (p < 0.05), and the highest number of significantly different CpG sites was six (Richards et al. 2019). The sampling of reference subjects from monozygotic twin pairs is often favored for investigating environmental influences on age prediction models since monozygotic twins usually have a similar growth environment (Xu et al. 2015a; Vidaki et al. 2017).

The collected samples in the studies mostly use blood from a donor or cadaver, but one study used samples from both healthy subjects and cadavers collected within 10 days and found no significant changes between living and dead body samples in the methylation status (Hamano et al. 2016). DNA methylation is also stable in bloodstains obtained from peripheral blood in both FTA cards and gauze exposed at room temperature for about 3 months (Peng et al. 2019).

Chronological age prediction from a forensic setting usually gives no information regarding possible disease status; therefore, age prediction is also performed in deceased subjects (Spolnicka et al. 2018; Vidaki et al. 2017; Horvath 2013). The biological age is relevant for the onset and progression rate of many diseases. Chronological age and biological age differences are important in forensic studies. Biological aging is influenced by cellular and molecular aging including changes in dysregulated nutrition, cell senescence, stem cell exhaustion, and disease-related factors (Bell et al. 2019). In one study, blood-related diseases showed high MAEs of the predicted age, with the highest MAE for anemia at 14.38 years, while schizophrenia showed the lowest age-prediction error of 5.03 years (Vidaki et al. 2017). In another study, a group with early-onset Alzheimer’s disease was predicted to be 1.7 years older than the chronological age of patients. The genes C1orf132 and ELOVL2 were stable in the three groups of early-onset Alzheimer’s disease, late-onset Alzheimer’s disease, and Graves’ disease. Therefore, they can be used as predictors of chronological age in forensic investigations (Spolnicka et al. 2018). ELOVL2 or ELOVL fatty acid elongase 2, also known as SSC2, is located on human chromosome 6 (6p24.2) (Jakobsson et al. 2006). In forensics, ELOVL2 is a promising candidate marker for age estimation because of the strong correlation of its methylation with age and the wide range of methylation changes during aging (Zbiec-Piekarska et al. 2015b).

The pyrosequencing method was used in 13 out of 32 studies and is considered a gold standard for detecting DNA methylation. Pyrosequencing gives a detailed profile and accurate pattern of DNA methylation within 100 bases from the pyrosequencing binding sites. The ratio of the nucleotides T and C determines the methylation degree at each CpG site in a sequence. Bisulfite conversion changes unmethylated cytosines to uracil, while methylated cytosines remain cytosines. This is a quantitative technique, which can detect low methylation of up to 5%, and it can be used for multiplex assays (Kurdyukov and Bullock 2016).
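The principle described above can be sketched in code: unmethylated cytosines are read as T after bisulfite conversion and amplification, methylated cytosines are still read as C, and the methylation level at a CpG follows from the C and T signals. The function names, the toy sequence, and the signal values are assumptions for illustration, not the output of any instrument software.

```python
def bisulfite_convert(seq: str, methylated_positions: set) -> str:
    """Simulate bisulfite conversion followed by PCR on one strand.

    Unmethylated C is read as T (C -> U, then T after amplification);
    methylated C stays C. Positions are 0-based indices into seq.
    """
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def methylation_percent(c_signal: float, t_signal: float) -> float:
    """Methylation level at one CpG from the C and T signals: C / (C + T) * 100."""
    total = c_signal + t_signal
    if total == 0:
        raise ValueError("no signal at this position")
    return 100.0 * c_signal / total


# Toy example: only the C at index 2 (part of a CpG) is methylated.
print(bisulfite_convert("ACCGTCA", {2}))  # prints: ATCGTTA
print(methylation_percent(30, 70))        # prints: 30.0
```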

NGS can detect DNA methylation differences in bisulfite-converted DNA fragments with an overall performance of <0.05 standard deviation. Other advantages include high sensitivity, multiplexing capabilities, and the potential for merging with other DNA marker analysis (Vidaki et al. 2017; Horvath 2013).

The disadvantage of pyrosequencing and NGS is that they are time-consuming and expensive (Mawlood et al. 2016); therefore, new methods were developed, such as MS-HRM, which can indicate methylation status more efficiently in terms of labor, time, and cost (Hamano et al. 2016; Hamano et al. 2017). MS-HRM is a method for measuring methylation profiles in which PCR amplification of bisulfite-treated DNA is followed by melting analysis. MS-HRM only requires qPCR, less time, and a gDNA amount of 20 ng/gene, whereas pyrosequencing needs 150 ng of gDNA. However, MS-HRM cannot measure the methylation rates of individual CpG sites and is subject to PCR bias, such as that caused by intrinsic differences in the amplification efficiency of templates or by self-annealing of templates in the late stages of amplification (Hamano et al. 2017).

Other methylation detection methods include EpiTYPER (Feng et al. 2018; Zubakov et al. 2016; Freire-Aradas et al. 2016; Freire-Aradas et al. 2018; Peng et al. 2019), massively parallel sequencing or MPS (Naue et al. 2018), and single-base extension such as the SNaPshot technique. EpiTYPER is a sequencing method based on mass spectrometric bisulfite analysis. This technique indicates region-specific DNA methylation and is fast and accurate, but it carries a high cost in forensic service (Suchiman et al. 2015). A single EpiTYPER run yields 126 triplicate measurements, which, together with the required controls, are obtained from a 384-well PCR plate (Suchiman et al. 2015). Therefore, EpiTYPER is useful for measuring relatively large numbers of samples. MPS is a high-throughput approach to DNA sequencing in which millions of short reads are sequenced per instrument run (Richards et al. 2018). The main advantage of MPS is its multiplexing capability, which allows simultaneous detection of multiple CpG sites from different genomic locations in a single reaction. MPS also has high sensitivity with single-base resolution and has been successfully applied to forensic analysis (Aliferi et al. 2018). The disadvantages include the high recommended DNA input (200–500 ng) due to the extensive DNA fragmentation and loss during the bisulfite conversion process (Richards et al. 2018).

The small amount of DNA commonly found in forensic cases increases the margin of error of DNA methylation levels (Naue et al. 2018). Degraded and forensically relevant materials often contain inhibitors that can prevent DNA amplification, and STR typing often fails to produce full DNA profiles from such samples. Therefore, shorter markers such as single-nucleotide polymorphisms (SNPs) and mini-STRs can be used with the SNaPshot approach (Zar et al. 2018). SNP genotyping allows the identification of highly degraded biological samples. In the multiplex methylation SNaPshot method, the needed amount of bisulfite-converted DNA is only about 4 ng; therefore, it can be used in routine forensic laboratory analysis (Hong et al. 2017). The optimum gDNA input before bisulfite conversion is on average 50 ng (Aliferi et al. 2018), but depending on the sample, reliable identification was possible down to 10 ng for blood and saliva and down to 0.1 ng for semen (Silva et al. 2016).

Identifying age-associated DNA methylation sites requires prediction models. MLRM was used in most studies of this review. Weidner et al. proposed an age-prediction model with only three CpG sites using MLRM and pyrosequencing (Weidner et al. 2014). Models constructed for blood data by applying MLRM with pyrosequencing achieved a MAD of about 3–4 years (Zubakov et al. 2016; Zbiec-Piekarska et al. 2015a; Park et al. 2016). MLRM based on SNaPshot data also provided predicted ages from semen (Lee et al. 2015; Lee et al. 2018) or saliva (Hong et al. 2017; Hong et al. 2019; Jung et al. 2019) with a MAD of 3–5 years. A disadvantage of the multivariate linear model is that it is too simple to explain the relationship between DNA methylation and age. DNA methylation changes much faster (3- to 4-fold) during childhood than in adulthood, so the changes were more accurately modeled with a logarithmic age function (Alisch et al. 2012). Therefore, some studies proposed the MQDRM, which performed well in samples from both living and deceased individuals (Bekaert et al. 2015b).
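To make the idea of a multivariate linear regression model concrete, the sketch below fits age as a linear function of methylation levels at three CpG sites by ordinary least squares. Everything in it is an assumption for illustration: the CpG sites are unnamed, the methylation values are synthetic and noise-free (generated from age = 5 + 90·m1 − 40·m2 + 30·m3), and the coefficients are not those of any published model.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_mlr(features, ages):
    """Ordinary least squares via the normal equations; returns [intercept, b1, b2, ...]."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ages)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_age(coefs, feature):
    """Predicted age = intercept + sum of coefficient * methylation level."""
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], feature))


# Synthetic methylation fractions (0 to 1) at three hypothetical CpG sites.
methylation = [
    (0.30, 0.70, 0.40), (0.45, 0.60, 0.70), (0.60, 0.75, 0.50),
    (0.70, 0.40, 0.55), (0.85, 0.35, 0.30), (0.50, 0.50, 0.60),
]
ages = [16.0, 42.5, 44.0, 68.5, 76.5, 48.0]  # generated from the formula above

coefs = fit_mlr(methylation, ages)
print([round(c, 3) for c in coefs])
print(round(predict_age(coefs, (0.40, 0.60, 0.50)), 3))
```

With real data the fit would not be exact, and the residuals between predicted and chronological age are what the MAD values reported in the reviewed studies summarize.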

Another statistical model is the MQTRM, in which the prediction is not hindered by the prediction error that increases with age, because the model establishes age-specific prediction intervals that are updated each time new data contribute to it (Freire-Aradas et al. 2016).

Hong et al. suggested that different platforms give different MADs between chronological and predicted ages. The ages predicted from MPS and SNaPshot data from the same individuals differed greatly, so they built platform-independent age-predictive models using a neural network (NN) and MLRM. The NN was tuned to have five and two neurons on layer 1 and layer 2, respectively, and the MLRM was tuned as well. The results demonstrated different MADs: 3.19 years for NN and 3.69 years for MLRM analysis (Hong et al. 2019).

The ANN model was believed to improve the prediction accuracy because it has the ability to recognize complex patterns in chronological age traits and seems to be a good alternative compared with the traditional parametric methods such as multiple linear regression models (Vidaki et al. 2017). ANN could eliminate the problem of nonlinear patterns but had a slightly lower prediction accuracy than NN (Spolnicka et al. 2018).

Aliferi et al. used GRNN and ANN modeling for age prediction, and the R project was employed to test 14 regression methods. Using the same samples, both the GRNN networks and the R model subsets were trained and blind-tested. GRNN has a disadvantage when used with a small training dataset (n < 1000) because of its susceptibility to overfitting and loss of generalizability (Vidaki et al. 2017; Aliferi et al. 2018).

Xu et al. compared age-prediction models in selected 11 CpG loci, including MLRM, multivariate nonlinear regression, back-propagation NN, and SVRM. They found that SVRM was the best model with the least MAD and superior to MLRM (Xu et al. 2015a). Other studies have used RFR, which allowed the selection and incorporation of linear and nonlinear markers (Naue et al. 2017). The established models from several studies provide an online calculator that is freely accessible to calculate predicted age (Feng et al. 2018; Weidner et al. 2014; Horvath 2013).

The methods and their advantages, limitations, and observed performance in DNA methylation-based age prediction in the studies of this review were hence quite variable. In general, the best-performing methods of DNA methylation-based age prediction showed MAD of around 3 to 5 years. In the forensic field, DNA methylation should therefore provide fair information about the remains of an unknown individual and his/her age. As before, it remains likely that future development in the assessment methods and techniques will reduce the associated limitations such as time and cost of analysis, and possibly allow for improved accuracy in the predicted age.

DNA methylation is not only age-specific but also influenced by diet, lifestyle, smoking, ancestry, and other factors that cannot be excluded in the studies. Lifestyle and genetic factors are associated with the level of variation in DNA methylation despite their stability as epigenetic markers (Xia et al. 2014). Therefore, further study is suggested on DNA methylation markers for age estimation in e.g. different ethnic groups.

Conclusions

DNA methylation is a favorable candidate in estimating the age at the time of death in forensic profiling. DNA methylation changes rapidly up to adulthood and the uncertainty (e.g., as mean absolute deviation or MAD) of the age estimates is under favorable circumstances about 3 to 5 years. The important aspects that influence the MAD include the available tissue or body fluid used for samples, analysis methods and platforms used according to the type of samples, and ways to construct the age predictive models. Developments in the methods of DNA methylation profiling and these studies are important in supporting conventional STR profiling to solve forensic cases in the future.

Availability of data and materials

Not applicable.

Abbreviations

ANN: Artificial neural network
DNA: Deoxyribonucleic acid
gDNA: Genomic deoxyribonucleic acid
GRNN: Generalized regression neural networks
MAD: Mean absolute deviation
MAE: Median absolute error
MLRM: Multivariate linear regression method
MPS: Massively parallel sequencing
MQDRM: Multivariate quadratic regression model
MQTRM: Multivariate quantile regression model
MS-HRM: Methylation sensitive-high resolution melting
N: Number of samples
NA: Not available
NGS: Next generation sequencing
PCR: Polymerase chain reaction
PRISMA: Preferred reporting items for systematic reviews and meta-analyses
RFR: Random forest regression
RMSE: Root mean square error
RT-PCR: Real-time polymerase chain reaction
SEE: Standard error of the estimate
sjTRECs: Signal-joint T-cell receptor excision circles
SVRM: Support vector regression model

Acknowledgements

The authors would like to thank Universitas Indonesia for the library facility support and Enago (www.enago.com) for the English language review.

Author information

Contributions

CM conducted the literature research, analysis, and interpretation of data and drafted the work. EIA contributed to the design of the study, supervised the work, revised it critically for important intellectual content, and approved the version to be published. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to Elza Ibrahim Auerkari.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that there are no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Maulani, C., Auerkari, E.I. Age estimation using DNA methylation technique in forensics: a systematic review. Egypt J Forensic Sci 10, 38 (2020). https://doi.org/10.1186/s41935-020-00214-2

  • DOI: https://doi.org/10.1186/s41935-020-00214-2
