Classification

XYOM ANALYZES, CLASSIFICATION

Input:

A single file containing data which may be:

  • Global size (one-column matrix)
  • DISTANCES (square matrix)
    • Mahalanobis distances (i.e., Euclidean distances computed from …_DF.txt)
    • Euclidean distances computed from …_PC.txt
  • PRINCIPAL COMPONENTS (rectangular matrix)
  • TABLE (rectangular matrix)
    • matrix of traditional measurements
    • matrix of shape variables (matrix of …_NEF, matrix of …_ORP, etc.)
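
To make the three matrix shapes concrete, here is a minimal Python sketch (standard library only). The numeric values and the helper name euclidean_distance_matrix are illustrative assumptions, not part of XYOM. The sketch treats each row as one item; whether XYOM computes distances between individuals or between group means is not stated here.

```python
import math

def euclidean_distance_matrix(rows):
    """Square matrix of pairwise Euclidean distances between the rows."""
    return [[math.dist(a, b) for b in rows] for a in rows]

# Hypothetical one-column matrix (e.g. a global size estimate per item).
global_size = [[1.2], [1.5], [1.1]]

# Hypothetical rectangular matrix: 3 rows (items) x 2 columns (variables).
table = [[0.0, 0.0],
         [3.0, 4.0],
         [6.0, 8.0]]

# Square matrix: as many rows and columns as there are items in `table`.
dist = euclidean_distance_matrix(table)
for row in dist:
    print(row)
```

The result is symmetric with zeros on the diagonal, which is what distinguishes a DISTANCES input from a rectangular TABLE or PC input.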

Analyses:

  • if input data are labelled as TABLE
    • Between-group PCA (gPCA). See: Mitteroecker, P., Bookstein, F., 2011. Linear Discrimination, Ordination, and the Visualization of Selection Gradients in Modern Morphometrics. Evol Biol. 38, 100-114. doi:10.1007/s11692-011-9109-8.
    • Validated classification based on Maximum likelihood method (CCCMLi). See: Dujardin, J.P., Dujardin, S., Kaba, D., Santillán-Guayasamín, S., Villacís, A. G., Piyaselakul, S., Sumruayphol, S., Samung, Y., Morales-Vargas, R., 2017. The maximum likelihood identification method applied to insect morphometric data. Zoological Systematics, 42(1), 46-58.
    • Validated classification based on Mahalanobis distance method (CCCMaha). Be careful about statistical assumptions. See: Kovarovic, K., Aiello, L. C., Cardini, A., Lockwood, C. A., 2011. Discriminant function analyses in archaeology: are classification rates too good to be true? Journal of Archaeological Science, 38: 3006-3018.
    • PCA (Principal Component Analysis)
  • if input data are labelled as PC
    • DA (Discriminant Analysis). In XYOM, a DA always assumes that the input data are PCs; it keeps only the first PCs, up to the number of individuals in the smallest group minus one.
    • Validated classification based on Maximum likelihood method (CCCMLi)
    • Validated classification based on Mahalanobis distance method (CCCMaha)
    • HC BOOTSTRAP (hierarchical clustering): single linkage algorithm, with bootstraps of the input data and of the resulting trees. See: Couette, S., G. Escarguel, and S. Montuire. 2005. Constructing, bootstrapping, and comparing morphometric and phylogenetic trees: A case study of new world monkeys (Platyrrhini, Primates). Journal of Mammalogy, vol. 86, no. 4, pp. 773–781. See also: Morales Vargas, R. E., N. Phumala-Morales, T. Tsunoda, C. Apiwathnasorn, and J.-P. Dujardin, 2013. The phenetic structure of Aedes albopictus. Infection, Genetics and Evolution, vol. 13, no. 1, pp. 242–251.
      No graphical output.
  • if input data are labelled as DISTANCES (Mahalanobis, Euclidean)
    • HC (hierarchical clustering): single linkage dendrogram
  • if input data are labelled as Global size
    • Validated classification based on Maximum likelihood method (CCCMLi)
    • One-way ANOVA: parametric ANOVA output, plus a non-parametric estimation of statistical significance.
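
Two items in the list above can be sketched in a few lines of Python (standard library only). The first function restates the DA rule for the number of PCs kept. The other two compute a one-way ANOVA F statistic and a permutation-based p-value. Using a permutation test for the "non-parametric estimation" is an assumption made for illustration; the function names, the number of permutations, and the sample values are hypothetical and do not describe XYOM's internals.

```python
import random

def retained_pcs(group_sizes):
    """DA rule from the text: smallest group size minus one."""
    return min(group_sizes) - 1

def f_statistic(groups):
    """One-way ANOVA F: between-group over within-group mean square."""
    values = [x for g in groups for x in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def permutation_p_value(groups, n_perm=999, seed=0):
    """Assumed non-parametric test: share of label shufflings with F >= observed."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                      # reassign values to groups at random
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)             # count the observed labelling itself

# Hypothetical global size values for two groups.
sizes_by_group = [[1.0, 1.1, 0.9, 1.05], [3.0, 3.2, 2.9, 3.1]]
print(retained_pcs([12, 8, 15]))                 # 7
print(f_statistic(sizes_by_group))
print(permutation_p_value(sizes_by_group))
```

With groups of 12, 8 and 15 individuals, the rule keeps the first 7 PCs, so the smallest group caps how many PCs a DA can use.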

Output:

  • Report:
    • HC: tree in Newick format, with bootstrap values if the input is PC
    • Validated classification: scores of correct assignments
    • gPCA: a detailed report of the individuals assigned to the wrong group
  • Graphic:
    • HC: tree (single linkage dendrogram) if the input is DISTANCES
    • PCA, gPCA, DA: factor map of first two multivariate factors
    • One-way ANOVA: quantile boxes
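
As an illustration of the HC output, the following minimal Python sketch (standard library only) runs single linkage clustering on a small distance matrix and writes the resulting topology in Newick format. It is not XYOM's implementation: the function name, labels and distances are hypothetical, and branch lengths and bootstrap values are left out.

```python
def single_linkage_newick(names, dist):
    """Single linkage clustering of a square distance matrix, as a Newick string.

    names: list of labels; dist: symmetric matrix given as a list of lists.
    Only the topology is written (no branch lengths, no bootstrap values).
    """
    # Each cluster is a pair: (Newick text, indices of its members).
    clusters = [(name, [i]) for i, name in enumerate(names)]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the two closest members.
                d = min(dist[i][j] for i in clusters[a][1] for j in clusters[b][1])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = ("(" + clusters[a][0] + "," + clusters[b][0] + ")",
                  clusters[a][1] + clusters[b][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters[0][0] + ";"

# Hypothetical distances between three groups A, B and C.
labels = ["A", "B", "C"]
distances = [[0.0, 1.0, 4.0],
             [1.0, 0.0, 3.0],
             [4.0, 3.0, 0.0]]
tree = single_linkage_newick(labels, distances)
print(tree)   # (C,(A,B));
```

A and B are joined first because they are the closest pair; C then joins that cluster at the distance to its nearest member, B.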