In anthropometric studies, the determination of measurement error is essential (Ulijaszek and Lourie 1994). In the present study, all measurements exhibited an acceptable measurement error, indicating high repeatability and precision, and their respective R coefficients demonstrated that all measurements may be regarded as reliable. These findings are in agreement with those reported by Toneva et al. (2016). Furthermore, most of the measurements are below the intra- and inter-observer %TEM thresholds of 1% and 1.5%, respectively, which are usually considered acceptable for skilled anthropometrists (Perini et al. 2005). Although the photogrammetric 3D modeling of the mandibles was not part of the present study, the reported cross-validated accuracy between digital and manual measurements, which was part of the digital documentation process, is smaller than the inter- and intra-observer absolute TEMs reported in the present study. Hence, the present results and DFs can be applied to either digital or manual measurements without any inter-method measurement error weakening their utility.
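For readers unfamiliar with the error statistics discussed above, the following minimal sketch shows how the absolute TEM, the relative TEM (%TEM), and the coefficient of reliability R are commonly computed for two repeated measurement sessions. It assumes the standard two-session formulas (TEM = sqrt(Σd²/2n), %TEM = 100·TEM/mean, R = 1 − TEM²/SD²) and uses the pooled sample variance for SD², which is one common convention. The function names and the ramus height values are hypothetical and are not taken from the study data.

```python
import math
from statistics import mean, variance


def tem(first, second):
    """Absolute technical error of measurement for two sessions.

    Uses TEM = sqrt(sum(d^2) / (2n)), where d is the difference between
    the paired measurements and n is the number of specimens.
    """
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("both sessions must measure the same specimens")
    squared_diffs = sum((a - b) ** 2 for a, b in zip(first, second))
    return math.sqrt(squared_diffs / (2 * len(first)))


def relative_tem(first, second):
    """%TEM: absolute TEM expressed as a percentage of the grand mean."""
    grand_mean = mean(list(first) + list(second))
    return 100 * tem(first, second) / grand_mean


def reliability(first, second):
    """Coefficient of reliability R = 1 - TEM^2 / SD^2.

    Assumption: SD^2 is taken as the sample variance of all pooled
    measurements; other definitions of the total variance exist.
    """
    total_variance = variance(list(first) + list(second))
    return 1 - tem(first, second) ** 2 / total_variance


# Hypothetical ramus height values (mm) from two measurement sessions.
session_1 = [60.0, 62.0, 58.0, 65.0]
session_2 = [61.0, 62.0, 57.0, 65.0]

print(f"TEM  = {tem(session_1, session_2):.3f} mm")
print(f"%TEM = {relative_tem(session_1, session_2):.3f} %")
print(f"R    = {reliability(session_1, session_2):.3f}")
```

In this toy example the %TEM falls below the 1% intra-observer threshold cited above, which is the kind of check applied to each measurement.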
The aim of the present study was twofold. First, we aimed to identify the mandibular measurements that yield the highest sex discriminant capacity in a modern Greek population sample and to use them to produce multivariate DFs that can most successfully estimate sex. Second, we investigated the performance of previously published DFs derived from different population samples. Population specificity is a long-standing and well-established observation in osteometric studies (Giles 1964; İşcan and Steyn 1999; Franklin et al. 2007) and is also evident in our results. Nevertheless, the rationale of our approach was not limited to a mere validation of population specificity; it further aimed to identify particular morphometric traits that consistently yield high sex classification results among different population groups, even though their expression patterns may vary among these groups.
Our findings regarding the most sex discriminant mandibular traits were consistent with earlier studies by Franklin et al. (2006, 2008), who found that the coronoid height, the ramus height, and the maximum mandibular length univariately exhibited the most pronounced sexual dimorphism. Furthermore, the DF that combines these measurements from the Greek population sample (Function 10, Table 5) yielded a cross-validated accuracy of 84.1%. Although the cranium and the pelvis provide much more reliable sex estimates (Oikonomopoulou et al. 2017; Bertsatos et al. 2018), our results verify that the mandible can be useful for estimating sex in a forensic context when cranial and pelvic elements are missing or deteriorated.
Franklin et al. (2008) studied 225 individuals (120 male; 105 female) from five local populations of indigenous South Africans in order to produce a series of mandibular metric standards for sex estimation, which resulted in Functions 2, 3, and 4 (see Table 7). The corresponding DFs based on the Greek sample (Tables 5 and 6) yielded the highest classification accuracies observed in the present study, ranging from 84.3% to 85.7%. More specifically, Function 3, which includes nine mandibular measurements, yielded a classification accuracy on our sample (84.8%) similar to that reported by Franklin and colleagues on their population sample (84%). Furthermore, applying the original DF (Franklin et al. 2008) to the Greek population sample also resulted in similar accuracy (83.84%). However, the accuracy of Functions 2 and 4, which utilize three and four mandibular measurements, respectively, was higher for the DFs derived from the Greek population (Function 2: right side 85.2%, left side 84.3%; Function 4: right side 85.7%, left side 84.8%) than for their counterparts derived from the indigenous South African population (Function 2: 81.8%; Function 4: 82.7%). Additionally, applying the original DFs to the Greek population sample yielded even lower classification scores (see Table 7). Although Functions 2 and 4 performed differently between the two distinct population groups, which can be attributed to population specificity, the overall performance of these three functions implies some merit to the metric standards proposed by Franklin et al. (2008).
Steyn and İşcan (1998) evaluated sexual dimorphism in the cranium and the mandible of South African Whites and developed osteometric standards to determine sex. They studied 91 South African Whites (44 males, 47 females) from cadaver collections housed at the Universities of Pretoria and Witwatersrand (Dart Collection). Although comparing their results to our corresponding DF results (Function 1) revealed similar classification accuracy (approximately 81.5%), applying the original DF (Steyn and İşcan 1998) to the modern Greek population sample yielded much lower accuracy (left side: 63.78%; right side: 68.11%). The same pattern was observed in the remaining comparisons between different population samples. The corresponding DFs yielded similar classification accuracy within their respective reference population samples, but applying the original DFs (from other population samples) to the Greek sample resulted in substantially reduced accuracy.
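The comparison described above, in which a DF derived from one population is applied unchanged to a sample from another, can be sketched as follows. This is only an illustration of the general procedure for a linear discriminant function with a sectioning point. The coefficients, constant, measurement names, and individuals are hypothetical and do not correspond to any published function or to the study data, and the convention that scores above the sectioning point indicate males is likewise an assumption.

```python
def discriminant_score(measurements, coefficients, constant):
    """Linear discriminant score: sum of coefficient * measurement, plus constant."""
    return sum(coefficients[name] * measurements[name] for name in coefficients) + constant


def classify(score, sectioning_point=0.0):
    """Assumed convention: scores above the sectioning point are classified as male."""
    return "M" if score > sectioning_point else "F"


def accuracy(sample, coefficients, constant, sectioning_point=0.0):
    """Percentage of individuals whose estimated sex matches the known sex."""
    correct = sum(
        classify(discriminant_score(m, coefficients, constant), sectioning_point) == sex
        for m, sex in sample
    )
    return 100 * correct / len(sample)


# Hypothetical DF "published" for another population (not a real function).
coefficients = {"ramus_height": 0.1, "coronoid_height": 0.1}
constant = -12.0

# Hypothetical test sample: (measurements in mm, known sex).
test_sample = [
    ({"ramus_height": 65.0, "coronoid_height": 66.0}, "M"),
    ({"ramus_height": 58.0, "coronoid_height": 57.0}, "F"),
    ({"ramus_height": 61.0, "coronoid_height": 60.0}, "F"),
    ({"ramus_height": 63.0, "coronoid_height": 64.0}, "M"),
]

print(f"Accuracy of the original DF on the test sample: "
      f"{accuracy(test_sample, coefficients, constant):.1f}%")
```

Because the coefficients and the sectioning point are fixed by the reference population, any shift in mean size or in the degree of dimorphism in the test population moves individuals across the sectioning point, which is one way population specificity can lower accuracy.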
More specifically, Saini et al. (2011), working on 116 dry adult mandibles of a Northern Indian population sample from the Department of Forensic Medicine in India, reported a classification accuracy of 80.2% for both Functions 5 and 6, which on our sample yielded approximately 82% and 79%, respectively, with small deviations on each side. Lin et al. (2014) produced their DFs from cranial CT scans of 120 males and 120 females from Seoul St. Mary’s Hospital. Most of their reported DFs were based on a different set of measurements from those used in the present study, which prevented a direct comparison with their most accurate DFs in terms of correct sex classification. Nevertheless, the available comparisons showed similar classification scores for Function 7 (77.2% on the Greek sample; 80.8% on the Korean sample) and identical accuracy for Function 8 (80.4%). Similarly, the DF reported by Dong and colleagues based on a contemporary Han Chinese population sample yielded 83.3% classification accuracy (Dong et al. 2015), whereas the corresponding DF from the Greek sample produced 82.7% and 84.9% cross-validated classification scores for the left- and right-side measurements, respectively.
The present study produced a number of suitable DFs based on mandibular measurements that can be used for sexing unidentified individuals assumed to belong to the modern Greek population. Apart from verifying that morphometric DFs based on human bones most often exhibit population specificity, hence the need for studies on different population samples, the comparative part of this work also revealed some interesting aspects. The observation that the same combinations of measurements yield similar sex discriminant capacity in different population samples, despite their respective DFs being population specific, implies that the magnitude of sexual dimorphism in certain mandibular morphometric traits is similar and shared across different populations, although its expression may follow different patterns in each population group. Regarding the near-identical results of Function 3 (Franklin et al. 2008), and especially the near-identical classification accuracy obtained when applying the DF derived from indigenous South Africans to the Greek population, we cannot conclude whether this observation is a mere statistical coincidence between the two reference samples or the result of both population groups sharing similar environmental and developmental factors that led to a similar expression of sexual dimorphism in their mandibles. Further work is necessary to this end, and studies on shared datasets comprising diverse population samples may expand our insight into this matter and provide more reliable sex discriminant tools for forensic practice.