Sex estimation using the human vertebra: a systematic review

The vertebral column has been used in forensic studies because of its weight-bearing function and relative density. Sex estimation is one of the essential elements of an anthropological examination, as it may narrow down the possibility of a match by half. Hence, it is crucial to derive population-specific reference data for each vertebra for sex estimation. This systematic review explored which vertebrae are the most sexually dimorphic according to conventional anthropometric analysis. A comprehensive electronic search was conducted in Scopus, Web of Science (WOS) and EBSCO Medline for relevant studies published between 2008 and 2020. The main inclusion criteria were studies in English and studies on sex estimation by morphometric analysis of vertebrae using CT scans or dry bone. Only studies on adult human vertebrae were analysed. The literature search identified 84 potentially relevant articles, of which 19 fulfilled the inclusion criteria. This review included studies on the cervical, thoracic and lumbar vertebrae in different populations. The vertebral column demonstrated significant sexual dimorphism with variable prediction accuracies, and the vertebral body was found to be sexually dimorphic. High accuracy of sex classification was provided by the second cervical, twelfth thoracic and first lumbar vertebrae, especially when they were used in combination.


Background
Identification of unknown skeletal remains has been a challenge for forensic anthropologists, especially in mass disasters and in cases of advanced decomposition. Sex estimation is one of the most important elements of an anthropological examination. It facilitates the estimation of other parameters such as age at death, race and stature, and is relevant to studies of compounding biological factors such as pathological conditions, environment and dietary habits (Cattaneo 2007; Marlow and Kozieradzka-Ogunmakin 2016).
It has been reported that the pelvis is the most accurate bone for sex estimation (Bruzek and Murail 2006; Marlow and Kozieradzka-Ogunmakin 2016; Hora and Sládek 2018). Once sex has been estimated, a forensic scientist may be able to narrow down the possibility of a match by half (Robinson and Bidmos 2009). Accurate estimation of sex (100% accuracy) can be achieved when the pelvis and skull are combined (El Dine and El Shafei 2015). However, the pelvis and skull are not always preserved because of severe fragmentation and decomposition. Hence, it is equally important to explore sex differences in other bones (Ramadan et al. 2017).
A vertebra comprises outer dense cortical bone and inner cancellous bone, which together contribute to its weight-bearing function (Tan et al. 2004; Gülek et al. 2007; Yu et al. 2008; El Dine and El Shafei 2015).
The vertebral column has been used in forensic investigation because of its comparative thickness and capacity to resist mechanical forces. In several cases, the vertebral column was considered to be a well-preserved skeletal element (Hora and Sládek 2018; Padovan et al. 2019), as it usually survives relatively harsh conditions such as bushfires, mass disasters and other natural disruptions (Blau and Briggs 2011).
Metric analysis is straightforward to handle with statistical methods (Pretorius et al. 2006). It involves collecting linear distances, angles and distance ratios from either dry bone or radiological images (Slice 2007; Omar et al. 2019). Morphological analysis is rather subjective, as it explores two-dimensional (2D) or three-dimensional (3D) shapes for sex estimation in multiple bones (Teodoru-Raghina et al. 2017; Choong et al. 2020). The studies on vertebrae in this review mainly focus on sexual dimorphism in different populations using conventional metric analysis. This review summarises the morphometric parameters for sex estimation in different types of vertebrae by both radiographic methods and dry bone measurements.

Methods
This review followed the PRISMA protocol for systematic reviews (Fig. 1). It was registered in the International Prospective Register of Systematic Reviews (PROSPERO), the international database of registered systematic reviews (No: CRD42021252590).

Literature review
The focus of this review was to identify relevant research studies on sexual dimorphism in vertebral bones by morphometric analysis. Scopus, Web of Science (WOS) and EBSCO Medline were used to search for articles from health science journals published between 2008 and 2020. The search strategy combined three sets of terms, with truncation, in the advanced search of each database: (1) gender OR sex* (2) estimati* OR determin* OR assess* OR identificat* OR characterist* OR dimorphi* OR stud* (3) vertebr* OR spin* OR cervi* OR thora* OR lumba*.
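The three term sets can be assembled into a single Boolean search string, as in the minimal sketch below. It assumes that terms within a set are joined with OR and that the three sets are joined with AND; the review lists the sets but does not state the joining operator or any database-specific field tags, so both are assumptions here, and the function name `build_query` is hypothetical.

```python
# Term sets copied from the search strategy; '*' is the truncation wildcard.
TERM_SETS = [
    ["gender", "sex*"],
    ["estimati*", "determin*", "assess*", "identificat*",
     "characterist*", "dimorphi*", "stud*"],
    ["vertebr*", "spin*", "cervi*", "thora*", "lumba*"],
]


def build_query(term_sets):
    """Join terms with OR inside each set, then join the sets with AND (assumed)."""
    groups = ["(" + " OR ".join(terms) + ")" for terms in term_sets]
    return " AND ".join(groups)


query = build_query(TERM_SETS)
print(query)
```

Each database has its own syntax for field restrictions and date limits, so a string built this way would still need adapting before use.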

Selection criteria
The results were selected from articles published in English, including the abstracts. The selected studies performed morphometric analysis of human vertebrae for sex estimation by computed tomography (CT) scan or on dry bones. Reviews, news items, editorials, letters and case reports were excluded from the review.

Data extraction and managing references
The articles were screened in three stages before they were included in the review. Firstly, titles were screened, and articles that did not meet the inclusion criteria were excluded. Next, the abstracts of the remaining articles were reviewed, and those that did not meet the inclusion criteria were excluded. Finally, the remaining articles were screened to exclude papers that were not within the scope of the review. Duplicates were removed, and the remaining papers were assessed again by at least two reviewers.
Before the data extraction phase, the reviewers approved the full papers that matched the inclusion criteria, and any differences of opinion were resolved by discussion among the reviewers. Data extraction was performed independently, for data validation, using a data collection form. The following data were recorded from each study: title, authors and year; types of vertebrae examined; subjects or samples; study methods; results; and remarks or conclusions.

Inclusion and exclusion criteria
The inclusion criteria were primary studies on adult humans; studies on vertebrae; studies using morphometric analysis with calipers, CT scan or MRI; and studies using discriminant function analysis or regression analysis.

Results
The literature search identified 84 related articles. The reviewers assessed all of the articles against the inclusion and exclusion criteria based on their titles and abstracts. Twenty papers were removed as they were unrelated to single vertebral bone analysis, sexual dimorphism or morphometric measurements, and were not conducted within the field of forensic anthropology. The remaining 19 articles matched the inclusion and exclusion criteria, and hence were included in the review. The selection process of the systematic review is presented in a flow chart (Fig. 1).

Study characteristics
The characteristics of the articles are presented in Table 1. Briefly, all of the articles were published between 2008 and 2020 and comprised studies of vertebrae using 3D images or dry bones for sex estimation. By bone type, seven studies focused on cervical vertebrae (Marlow and Pastor 2011; Bethard and Seet 2013; Gama et al. 2015; Torimitsu et al. 2016; Kaeswaren and Hackman 2019; Padovan et al. 2019; Rozendaal et al. 2020), two studies on thoracic vertebrae (Yu et al. 2008; Hou et al. 2012) and seven studies on lumbar vertebrae (Zheng et al. 2012; Ostrofsky and Churchill 2015; Ramadan et al. 2017; Oura et al. 2018; Decker et al. 2019; Azofra-Monge and Alemán Aguilera 2020; Suwanlikhid et al. 2020). Two studies were done on thoracic and lumbar vertebrae (El Dine and El Shafei 2015; Garoufi et al. 2020), and one study on cervical and thoracic vertebrae (Amores et al. 2014).
In the review, nine studies used radiologic images to analyse the bones (Yu et al. 2008; Zheng et al. 2012; Hou et al. 2012; El Dine and El Shafei 2015; Torimitsu et al. 2016; Ramadan et al. 2017; Oura et al. 2018; Decker et al. 2019; Garoufi et al. 2020), whilst ten studies were conducted on dry bones (Marlow and Pastor 2011; Bethard and Seet 2013; Amores et al. 2014; Gama et al. 2015; Ostrofsky and Churchill 2015; Kaeswaren and Hackman 2019; Padovan et al. 2019; Azofra-Monge and Alemán Aguilera 2020; Suwanlikhid et al. 2020; Rozendaal et al. 2020). All of the studies used various experimental designs on several populations by the conventional morphometric method, and were analysed by either discriminant function or regression analysis.

Table 1 (partial; entries grouped by study)

Gama et al. (2015)
• Acquired 13 measurements from the skeletal collections by sliding calipers with an approximation of 0.5 mm.
• Performed a two-tailed t-test to analyse the differences in measurements between males and females.
• Used a logistic regression model to construct estimation models.
• The most dimorphic dimensions were the LMA (11.18%) and DSMC (10.6%).
• The most predictive variables were LMA, DSMC, CMA and LMFS (right side).
• The resulting model identified sex in 89.7% of cases in the training set, whilst sex was accurately identified in 86.7% of cases in the testing set.
• The second cervical vertebra was useful for sex estimation with accuracies that ranged from 86.7 to 89.7%.

Torimitsu et al. (2016)
• Nine measurements were collected from cadavers by PMCT scanning, followed by forensic autopsy.
• Measurements of DMFS and LMA on the C2 vertebra achieved expected cross-validated accuracies of 83.5%
• CT scan of the C2 vertebra showed good estimation of sex with a high accuracy rate.

Rozendaal et al. (2020)
• Samples were taken from the skeletal cemetery collections
• Three measurements were taken from each of the seven cervical vertebrae: maximum cervical vertebral body height (CHT), cervical anterior-posterior diameter (CAP) and cervical transverse diameter (CTR).
• Discriminant functions were generated for each cervical vertebra, using all three measurements, to establish whether sex could be estimated from a single vertebra.
• Multivariate discriminant functions were produced using all seven cervical vertebrae to investigate whether a combination of vertebrae may be used for sex estimation.
• The functions that achieved predicted accuracies of 80% or greater were cross-validated on independent samples of 32 individuals from the skeletal collections.
• Results indicated that the CAP measurement did not demonstrate sexual dimorphism, whilst CHT and CTR demonstrated a significant difference between males and females (p < 0.002) (except for CTR of C1 and C2).
• Using combinations of all three measurements for sex estimation from a single vertebra, the accuracies ranged from 66.9 to 74% for males and from 70.2 to 79.5% for females.
• This study produced seven discriminant function equations using 20 measurements from all seven cervical vertebrae, which achieved an overall accuracy rate of greater than 80%. The cross-validation test showed that among these functions, only four achieved accuracies equal to or greater than their predictive accuracies. The results indicated that the C2 and C5 vertebrae were the most sexually dimorphic bones.
• The discriminant function equations achieved accuracy rates of 84.5% for cervical vertebrae (used in combination) in the European population.

El Dine and El Shafei (2015)
• Used 24 linear measurements and four ratios from multi-slice computed tomography (MSCT) images.
• A t-test was used to establish the difference between sexes.
• Unstandardized coefficient.
• About 14 out of 24 linear measurements showed significant sex differences using T12 vertebra (predictive accuracy ranged from 49% to 85.5%), with three variables, i.e. lower endplate depth (EPDl),
• The T12 vertebra demonstrated better sex estimation than L1 in the Egyptians. The accuracy increased when T12 and L1 were used in combination as sex predictors.

Oura et al. (2018)
• All the measurements were greater in men than in women.
• The multivariate regression analyses, which included the mean width, depth and height of the L4 vertebra, yielded good sex accuracies in all age groups (86.4%, 87.7% and 82.8% at the ages of 20, 30 and 46, respectively).
• The classification accuracy for sex was consistently higher for females than males.
• The width, depth and height of the L4 vertebral body were found to be useful for sex estimation.

Cervical vertebrae for sex determination
The C2 vertebra, also known as the axis, is commonly employed for sex estimation, as it has well-described morphological characteristics and is mostly well preserved even in adverse conditions (Gama et al. 2015). Besides being sexually dimorphic, the C1 and C7 vertebrae were also found to be useful for sex estimation (Amores et al. 2014; Padovan et al. 2019).
An older study by Wescott (2000) was excluded from the review, as it has been re-evaluated by many researchers in subsequent sex estimation studies (Wescott 2000; Marlow and Pastor 2011; Bethard and Seet 2013). Continuous analysis and re-evaluation are vital to increase the accuracy and precision of existing methods. Marlow and Pastor (2011) adopted Wescott's eight projected measurements and added the width of the vertebral foramen (WVF) on the C2 vertebra in adults from the Spitalfields anatomical collection held at the Natural History Museum, London. The discriminant function analysis correctly classified 83.3% of males and females, which was higher than that achieved by Wescott (2000). The conclusions of Wescott (2000) and Marlow and Pastor (2011) were re-evaluated on modern American skeletal collections in Tennessee, in which the C2 vertebra showed an accuracy rate of 86.7% (Bethard and Seet 2013). Gama et al. (2015) developed simple predictive models for the C2 vertebra in the Portuguese population based on logistic regression, with accuracies that ranged from 86.7 to 89.7%. In that study, 13 dimensions of the C2 vertebra were measured by adopting Wescott's method (2000), which included the sagittal maximum body diameter (DSMC), maximum width of the axis (LMA), maximum width of the right superior facet (LMFSD) and maximum length of the axis (CMA). Torimitsu et al. (2016) conducted CT measurements on the C2 vertebra, which provided good classification of sex. In that study, two variables, i.e. maximum distance between superior articular facets (DMFS) and LMA, were demonstrated to have accuracy rates of 83.5% and 83.1%, respectively.
Other dimensions of C2 vertebra such as the maximum length of the axis (XSL), maximum width of vertebral foramen (WVF), odontoid process sagittal diameter (DSD) and maximum distance between superior facets (SFB) were analysed for biological sex estimation (Marlow and Pastor 2011;Gama et al. 2015;Torimitsu et al. 2016).
Methods for sex estimation are important, as many of the bones retrieved have undergone severe fragmentation by the time of recovery. In the event of bone fragmentation due to advanced decomposition, the results from these studies have to be taken with caution. Parts of the vertebra such as the spinous process, transverse process and superior articular facets may be damaged, and hence rendered unmeasurable for research or casework. With such circumstances in mind, one study was conducted on wet disarticulated cervical vertebrae (C1-C7) from white Scottish human cadavers (Kaeswaren and Hackman 2019), whilst another study was done on all of the cervical vertebrae from cemetery collections of European ancestry (Rozendaal et al. 2020). Both studies revealed that the C2 and C5 vertebrae were sexually dimorphic, with two measured variables acting as good sex predictors, i.e. the vertebral body height (CHT) and transverse diameter of vertebral foramen (CTR).
It is worth noting that classification rates of at least 80% were considered useful for sex estimation methods (Torimitsu et al. 2016). In many studies, measured variables of the C2 vertebra, including the length (Marlow and Pastor 2011; Bethard and Seet 2013; Gama et al. 2015), width (Gama et al. 2015; Torimitsu et al. 2016), sagittal diameter of the vertebral body (Gama et al. 2015; Torimitsu et al. 2016) and the distance between superior articular facets, were demonstrated to have high discriminant power with accuracies exceeding 80% (Marlow and Pastor 2011; Torimitsu et al. 2016). In other studies, the height of the vertebral body and transverse diameter of the foramen provided a high degree of sexual dimorphism (Kaeswaren and Hackman 2019; Rozendaal et al. 2020). Amores et al. (2014) demonstrated good discriminant power, with 80% accuracy, for measured variables of the C7 vertebra from laboratory skeletal collections in southern Spain, which comprised the length of the vertebral foramen (LVF) and the length and width of the inferior vertebral body (LIVB and WIVB).

Thoracic vertebrae for sex estimation
Under the subject heading of thoracic vertebrae, five articles were identified (Yu et al. 2008; Hou et al. 2012; Amores et al. 2014; El Dine and El Shafei 2015; Garoufi et al. 2020). Besides being a transitional vertebra, the twelfth thoracic (T12) vertebra has distinct morphological characteristics, which can be easily identified in a disarticulated skeleton. Three articles focused on combinations of the T12, C7, first lumbar (L1) and first thoracic (T1) vertebrae (Amores et al. 2014; El Dine and El Shafei 2015; Garoufi et al. 2020). In different populations, the T12 vertebra was shown to be sexually dimorphic with an accuracy rate exceeding 83%. The T1 vertebra had the highest degree of sexual dimorphism (88.8% accuracy), followed by the T12 vertebra (84.2%) (Garoufi et al. 2020).
CT scan analysis of the coronal diameter of superior endplate of the vertebral body (sBDc), ratio of the anterior to middle height of the vertebral body (BHm/BHp) and length of left mammillary process and pedicle (lM and PL) of T12 vertebra in the Korean population showed a high degree of sexual dimorphism with 90% accuracy rate (Yu et al. 2008).
Additionally, analysis of variables from CT images of the T12 vertebra in the Chinese population showed a high degree of sexual dimorphism with 94.6% accuracy (Hou et al. 2012). These variables included the superior maximum sagittal diameter of the vertebral body endplate (sBDsm), the inferior length of the whole vertebra (iVL), the distance between superior articular processes (sAD) and one ratio, the ratio of anterior to posterior height of the vertebral body (BHa/BHp). Amores et al. (2014) demonstrated that the length of the inferior surface of the vertebral body (LIVB) of the T12 vertebra showed the highest dimorphism values, with 80% accuracy.
El Dine and El Shafei (2015) adopted the method of Yu et al. (2008), in which 24 linear measurements and 4 ratios were taken by multi-slice computed tomography (MSCT) on the T12 vertebra in the Egyptian population. In the analysis, 14 measured variables exhibited significant sex differences, and three variables produced more than 80% predictive accuracy, i.e. the lower endplate depth (EPDl), upper endplate width (EPWu) and superior vertebral length (VLs). By regression analysis, this study generated 93.1% accuracy for sex estimation, which was comparable to that of Yu et al. (2008) (90%). Yu et al. (2008) studied the T12 vertebra for sex classification. In that study, the vertebral body endplate and the sagittal length of the T12 vertebra were found to be sexually dimorphic with accuracies exceeding 80%. The measured variables comprised sBDc, the superior maximum coronal diameter of the endplate of the vertebral body (sBDcm), the coronal diameter of the endplate on the inferior plane (iBDc) and the maximum coronal diameter of the endplate on the inferior plane (iBDcm). Additionally, the coronal diameter of the superior endplate of the vertebral body (sBDc), the ratio of the anterior to middle height of the body (BHm/BHp) and the lengths of the left mammillary process and pedicle of the T12 vertebra (lM and PL) were also found to be sexually dimorphic with a 90% accuracy rate (Yu et al. 2008).
In a study of the T12 vertebra in the Chinese population by Hou et al. (2012), the vertebral body measurements (sBDs, sBDsm, iBDc, etc.) demonstrated accuracies exceeding 80%, with the sagittal length of the vertebra (iVL) showing 90% accuracy. Discriminant function analysis (DFA) produced a discriminant equation with 94.2% accuracy, which was based on the superior sagittal diameter of the vertebral body endplate (sBDsm), the ratio of anterior to posterior height of the vertebral body (BHa/BHp) and non-vertebral body measurements, i.e. the distance between superior articular processes (sAD) and iVL. In addition, the vertebral sagittal lengths (iVL and sVL), which measure the distance from the anterior edge of the vertebral body to the posterior edge of the spinous process at the inferior and superior planes, were found to be highly accurate, with iVL yielding the highest accuracy rate of 90% (Hou et al. 2012).
El Dine and El Shafei (2015) conducted a study on the T12 and first lumbar (L1) vertebrae in Egyptians. The results revealed that sVL and iVL (vertebral sagittal lengths) showed a 93.1% accuracy rate by multiple regression analysis. Also, the lower endplate depth (EPDl), upper endplate width (EPWu) and superior vertebral sagittal length (sVL) each demonstrated an accuracy exceeding 80%. Garoufi et al. (2020) studied three vertebrae, i.e. T1, T12 and L1, in the Greek population, and reported a high degree of sexual dimorphism in the T1 vertebra (90% cross-validated accuracy) based on the maximum vertebral length (mVL) and maximum distance (mTD). The T12 and L1 vertebrae demonstrated accuracies ranging from 75 to 83% (Garoufi et al. 2020). Amores et al. (2014) studied the C7 and T12 vertebrae in the Mediterranean population, and demonstrated a high degree of sexual dimorphism with an 80% accuracy rate by discriminant function analysis. The equations were based on the length of the inferior surface of the vertebral body and the width and length of the vertebral foramen of the C7 vertebra, and the length of the inferior surface of the vertebral body (LIVB) of the T12 vertebra.
It can be concluded that the vertebral body and the sagittal length of the thoracic vertebra play a crucial role in sex estimation. Accurate sex prediction is more likely when the vertebrae are complete and intact, particularly for the sagittal length of the vertebra, which spans the vertebral body and the spinous process.

Lumbar vertebrae for sex estimation
The first lumbar vertebra (L1) was shown to be sexually dimorphic by discriminant function analysis (Zheng et al. 2012). Analysis of 34 traits (29 linear measurements and 5 ratios) on the L1 vertebra in the Chinese population revealed accuracies ranging from 57.1 to 86.6% (Zheng et al. 2012). Five linear measurements associated with the vertebral body (EPWu 86.6%, EPDm 86.2%, EPWl 85.2%, EPDl 84.3% and EPDu 83.3%) gave accuracy rates exceeding 80%, with EPWu showing the highest accuracy. The discriminant function produced was based on the upper endplate width (EPWu), middle endplate depth (EPDm) and left pedicle height (PHl), with an accuracy of 88.6%.
El Dine and El Shafei (2015) studied the L1 and T12 vertebrae, and demonstrated varying degrees of sexual dimorphism, with accuracies ranging from 47 to 79% based on seven linear measurements and one ratio on the L1 vertebra. By comparison, Zheng et al. (2012), who focused on the L1 vertebra alone, achieved a higher accuracy rate (57.1-86.6%) than El Dine and El Shafei (2015) (79%) based on the upper endplate depth (EPDu). Also, Zheng et al. (2012) (2015). Ramadan et al. (2017) analysed 15 linear measurements on the L1 vertebra in Egyptians by CT scan, adopting the methods of Zheng et al. (2012) and El Dine and El Shafei (2015). The results of discriminant function analysis showed that EPWu had an accuracy of 84.6%, and that nearly all of the measurements were significantly greater in males than in females. Ostrofsky and Churchill (2015) performed physical osteological examination (POE) on all of the lumbar vertebrae (L1-L5) in South Africans, and revealed that sex was correctly predicted from L1 to L4 with accuracies exceeding 80%. Measured variables such as the vertebral body superior dorso-ventral diameter (BSDVD) and body superior transverse diameter (BSTD) on the L1 vertebra gave an accuracy rate of 87.1% by discriminant function analysis (Ostrofsky and Churchill 2015), which was comparable to other studies based on the measured variables EPDm and EPWu/l (Zheng et al. 2012; Ramadan et al. 2017).
Other studies on the lumbar vertebrae have also shown them to be useful for sex estimation (Decker et al. 2019; Azofra-Monge and Alemán Aguilera 2020; Suwanlikhid et al. 2020). Decker et al. (2019) utilised abdominal CT scans of living patients in a modern adult population, whilst Azofra-Monge and Alemán Aguilera (2020) used dry bone samples from a laboratory skeletal collection in Spain. All of the studies displayed similar trends of accuracy by discriminant function analysis, which were comparable to those of Ostrofsky and Churchill (2015). Similarly, Suwanlikhid et al. (2020) analysed dry bones from laboratory skeletal collections in Thailand and measured nine variables, i.e. area and surface axis, on all of the lumbar vertebrae. They documented that precision rates were higher in the upper lumbar vertebrae than in the lower lumbar vertebrae, which corroborated the findings of Decker et al. (2019).
The vertebral body was found to be sexually dimorphic by discriminant function analysis, as evidenced by measured variables such as the upper endplate depth (EPDu) and upper endplate width (EPWu) (Zheng et al. 2012; Ostrofsky and Churchill 2015; El Dine and El Shafei 2015; Ramadan et al. 2017; Suwanlikhid et al. 2020). Physical osteological examination (POE) of the L1 vertebra in the South African population revealed that EPWu attained an accuracy of 87.1% (Ostrofsky and Churchill 2015), whilst EPDu showed a lower accuracy of 81.8% (Suwanlikhid et al. 2020). Decker et al. (2019) studied 30 linear measurements, the wedging angle and five aspect ratios on all of the lumbar vertebrae in the North American population, and reported overall accuracies of 81.2% to 85.1% by discriminant function analysis, with the highest accuracy achieved by L3 (85.1%). By using multilevel measurements, a higher accuracy of 92.2% was achieved (Decker et al. 2019). Oura et al. (2018) studied the width, depth and height of the L4 vertebra in Northern Finns by MRI scan, and showed that these dimensions were sexually dimorphic, with accuracy rates exceeding 80% in all three age groups (20, 30 and 46 years). The L4 vertebra was chosen as it was easily accessible in both axial and sagittal slices in MRI scans, and the results showed a lower accuracy rate (82.8%) in the 46-year-old sample than in the 20- or 30-year-old samples (86.4% and 87.7%, respectively) (Oura et al. 2018).

Discussion
On initial screening of the abstracts and full texts, 19 studies were identified (Fig. 1). Forensic anthropologists have relied heavily on the vertebrae for estimation of sex in forensic casework. Of the 24 vertebrae (excluding the sacrum and coccyx), the C2, T12 and L1 vertebrae were predominantly utilised for estimation of sex in several populations. Although the cranium and pelvis were commonly used for sex estimation, acceptable accuracies have been shown by studies on vertebral bones. Rozendaal et al. (2020) reported accurate sex estimation in a combination study of all of the cervical vertebrae (C1 to C7). A total of 25 discriminant functions were generated from each vertebra with 80% accuracy, and 100% accuracy was obtained when the fourth cervical (C4) and C2 vertebrae were combined (Kaeswaren and Hackman 2019). Similarly, the accuracy of sexing a skeleton differed between using L1 only and using both L1 and T12 vertebrae, rising from 68 to 93.3% (El Dine and El Shafei 2015). It was also demonstrated that the accuracy rose from 81.2-85.1% for individual lumbar vertebrae to about 92.2% when all of the lumbar vertebrae (L1-L5) were used in combination (Decker et al. 2019).
Accuracy varied considerably between types of studies and between populations. Metric methods for sex estimation are usually population-specific; hence, they should only be used for individuals from the same population (Bruzek and Murail 2006). Accounting for population variation and the scientific study of inherited human variation can only enhance the accuracy and reliability of the methods used and their outcomes (Kimmerle et al. 2008). For instance, the L3 vertebra in the North American population showed an accuracy of 85.1% (Decker et al. 2019), whilst the L3 vertebra in the South African population showed an accuracy of 87.1% (Ostrofsky and Churchill 2015).
Discriminant function analysis (DFA) is a common statistical technique applied in forensic anthropology (Gama et al. 2015). The discriminant function is generated by multiplying each vertebral measurement by its coefficient and summing the products, usually together with a constant. The resulting discriminant score is compared with a value that acts as the cut-off point between males and females, also known as the sectioning point. Scores greater than the sectioning point are predicted as male, whilst scores smaller than the sectioning point are predicted as female (Omar et al. 2021). However, some studies have used logistic regression analysis instead of DFA (Gama et al. 2015; El Dine and El Shafei 2015; Oura et al. 2018; Azofra-Monge and Alemán Aguilera 2020). Logistic regression is a discriminative statistical model analogous to DFA, but it is considered more robust and flexible in terms of data assumptions (Gama et al. 2015; Klales et al. 2020).
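The scoring rule described above can be illustrated with a short sketch. The coefficients, constant and measurements below are hypothetical values chosen for illustration and are not taken from any of the reviewed studies; only the variable names EPWu and EPDm come from the text. The sectioning point is taken here as the midpoint between the male and female mean scores, which is a common convention when the groups are of similar size, but this is an assumption rather than the rule used by any particular study.

```python
from statistics import mean

# Hypothetical unstandardised coefficients and constant, for illustration only.
COEFFICIENTS = {"EPWu": 0.25, "EPDm": 0.20}
CONSTANT = -16.0


def discriminant_score(measurements_mm):
    """Return constant + sum(coefficient * measurement)."""
    return CONSTANT + sum(
        coef * measurements_mm[name] for name, coef in COEFFICIENTS.items()
    )


def sectioning_point(male_scores, female_scores):
    """Midpoint between the two group mean scores (assumed convention)."""
    return (mean(male_scores) + mean(female_scores)) / 2


def predict_sex(score, cut_off):
    """Scores above the sectioning point are classed as male, otherwise female."""
    return "male" if score > cut_off else "female"


# Made-up reference measurements in millimetres.
males = [{"EPWu": 44.0, "EPDm": 32.0}, {"EPWu": 46.0, "EPDm": 34.0}]
females = [{"EPWu": 38.0, "EPDm": 28.0}, {"EPWu": 40.0, "EPDm": 30.0}]

cut_off = sectioning_point(
    [discriminant_score(m) for m in males],
    [discriminant_score(f) for f in females],
)
unknown = {"EPWu": 45.0, "EPDm": 33.0}
print(predict_sex(discriminant_score(unknown), cut_off))
```

In practice the coefficients and sectioning point would come from a published, population-specific equation rather than from values like these.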
Discriminant analysis models should be evaluated and cross-validated to estimate the accuracy of the classification and to provide a good estimate of model performance. Cross-validation is one of the most practical methods for estimating predictive model performance (Kuligowski et al. 2016). In some studies, the methods were created and tested on the sample population, but were not cross-referenced to other populations. Some studies used cross-validation analysis to evaluate the discriminant capacity of their classification models (Yu et al. 2008; Zheng et al. 2012; Amores et al. 2014; Ostrofsky and Churchill 2015; Decker et al. 2019; Garoufi et al. 2020), whereas other studies did not perform cross-validation analysis (El Dine and El Shafei 2015; Ramadan et al. 2017; Suwanlikhid et al. 2020). Several studies used both training and test samples to validate their discriminant functions, whereby the training sample is used to develop sex prediction models, and the model built on the training sample is then applied to the test sample for prediction. In one study, discriminant functions were validated in both the training sample (190 individuals) and the test sample (47 individuals), whereby sex was correctly estimated in 89.7% and 86.7% of cases, respectively (Gama et al. 2015). However, the model was biased towards males in the training sample.
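The difference between accuracy measured on the sample used to build a model and cross-validated accuracy can be sketched as follows. This is a minimal illustration that assumes a one-variable classifier with a midpoint cut-off and uses leave-one-out cross-validation; the reviewed studies used multivariate models and do not all specify their cross-validation scheme, and the measurements below are made up.

```python
from statistics import mean


def fit_cut_off(samples):
    """Fit a one-variable classifier: midpoint between male and female means."""
    male = [x for x, sex in samples if sex == "M"]
    female = [x for x, sex in samples if sex == "F"]
    return (mean(male) + mean(female)) / 2


def classify(value, cut_off):
    return "M" if value > cut_off else "F"


def accuracy(samples, cut_off):
    """Share of samples whose predicted sex matches the recorded sex."""
    hits = sum(classify(x, cut_off) == sex for x, sex in samples)
    return hits / len(samples)


def leave_one_out_accuracy(samples):
    """Refit on all samples but one, then test on the held-out sample."""
    hits = 0
    for i, (x, sex) in enumerate(samples):
        training = samples[:i] + samples[i + 1:]
        hits += classify(x, fit_cut_off(training)) == sex
    return hits / len(samples)


# Made-up measurements (mm) with recorded sex, for illustration only.
data = [(44.1, "M"), (46.3, "M"), (41.2, "M"), (39.0, "M"),
        (38.2, "F"), (40.1, "F"), (37.5, "F"), (41.9, "F")]

resubstitution = accuracy(data, fit_cut_off(data))
cross_validated = leave_one_out_accuracy(data)
print(resubstitution, cross_validated)
```

With this toy data the cross-validated figure is lower than the figure obtained on the sample used for fitting, which is the kind of optimism that cross-validation is meant to expose.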
Marlow and Pastor (2011) and Bethard and Seet (2013) re-evaluated the discriminant functions for sex estimation on the C2 vertebra, derived from the Hamann-Todd and Terry osteological collections of black and white specimens, by adopting Wescott's method. Analysis of contemporary Americans in the Tennessee skeletal collections showed an 80% accuracy rate (Bethard and Seet 2013), and re-evaluation of Wescott's method on the eighteenth- to nineteenth-century museum collection of European ancestry in London, UK, reached an overall classification accuracy of 76.99% (Marlow and Pastor 2011).
Vertebral anatomy is directly affected by the dynamics of growth. The growth process can cause numerous changes in bone structure and the surrounding tissue. From the onset of puberty, the duration of growth can be quite variable, whereby females tend to have a vertical growth spurt, whilst males tend to have a horizontal growth spurt (Taylor and Twomey 1984; Azofra-Monge and Alemán Aguilera 2020). The vertebrae are particularly exposed to mechanical forces because they form the backbone of the body, and even a small amount of rotation of the vertebral column may contribute to a change in their size and shape. Diet is an important factor that may influence bone balance and quality, particularly in the elderly, owing to factors such as poor diet or energy balance. Whilst environmental factors and genes may affect growth hormone function and the control of bone development, other factors such as dietary pattern, daily physical activity and mechanical loading may contribute to changes in bone density, mass and strength, and hence to sexual variation (Torimitsu et al. 2016; Gilsanz et al. 2018; Muñoz-hernandez et al. 2018).
Based on linear measurements of the vertebrae, males were found to have greater bone dimensions than females. Linear measurements represent bone size, whilst ratios, which combine several linear measurements, represent bone shape (Hou et al. 2012). Studies that used ratios of linear measurements have shown that the accuracies obtained from ratios were statistically lower than those obtained from linear measurements (Yu et al. 2008; Zheng et al. 2012; Hou et al. 2012; El Dine and El Shafei 2015). Sexual dimorphism may also be analysed further by geometric morphometric analysis, which allows structural modularity to be assessed and shape classification criteria to be optimized.
Two types of measurement technique are used on bones: CT-derived methods and dry bone measurements. Anthropological techniques have relied heavily on metric variables taken on dry bones, which may be obtained from anthropological museum collections or laboratories, and considerable effort and resources are required to improve access to a good range of biological samples. In contrast, CT-derived methods have been shown to be more readily available, minimally invasive and more efficient for research and identification purposes, in both dead bodies and living subjects. Moreover, a difference of only 2 mm in measurement error was reported between dry bone measurement and the imaging-derived method (Stull et al. 2014). This suggests that both methods are acceptable and feasible for anthropologists to use as tools for sex classification.

Strengths and limitations of the review
Many studies have demonstrated the impact of sex estimation formulae on skeletal remains in forensic casework. The systematic review identified 19 research articles. A critical review of these methods is highly relevant for deciding which vertebrae and/or parts of the vertebra are important for identifying unknown subjects. Based on the accuracy scores, the 12th thoracic vertebra showed the highest sexual dimorphism. In future work, a meta-analysis of the vertebrae may be performed to reflect more accurately the methodological quality of the evidence in sex estimation studies over the past few years.
Several limitations were identified in this review. Although classification by age group is important for narrowing the identification pool, the studies did not analyse different age groups separately, which may have affected the results. Further, the review may have missed some relevant studies, as only three databases were searched.

Conclusions
It can be concluded that the vertebrae provide good accuracy for sex estimation, and that most vertebral dimensions are population-specific. Although individual vertebrae have been studied to evaluate sexual dimorphism, the percentage accuracy was found to increase when vertebrae were studied in combination. The results also showed that, for all vertebrae, the most sexually dimorphic area was the vertebral body. Further studies may be needed to determine sexual dimorphism in other areas and traits of the vertebrae, and to support a more advanced meta-analysis.