**Input (2 files):**

Input files may contain various kinds of data.

- RAW COORDINATES of anatomical LANDMARKS
- RAW COORDINATES of PSEUDOlandmarks (outlines)
- GLOBAL SIZE (centroid size, perimeter, square root area)
- TABLE OF TRADITIONAL MENSURATIONS.

Whatever the kind of input data, you will be asked to enter them in two separate files:

**UNKNOWN**: the file containing the data to be tentatively identified.

**REFERENCE**: the file containing all the reference data. These data must be organized as successive groups in a single matrix, so you must also provide the **subdivision argument**, which describes the successive groups of the reference matrix.

Unknown and Reference data must have the **same kind of variables** in the **same order** (and therefore the **same number of columns**).
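The two-file requirement can be checked before running any analysis. The sketch below is only an illustration and not part of XYOM: the function name `check_inputs` is hypothetical, and it assumes the subdivision argument lists the number of rows in each successive reference group.

```python
def check_inputs(unknown, reference, subdivision):
    """Return one group label per reference row, or raise ValueError.

    Assumption: `subdivision` lists the number of rows in each successive
    reference group, e.g. [3, 2] for a 5-row reference matrix.
    """
    # Every row of both files must have the same number of columns
    n_cols = {len(row) for row in unknown} | {len(row) for row in reference}
    if len(n_cols) != 1:
        raise ValueError(f"rows differ in number of columns: {sorted(n_cols)}")
    # The group sizes must cover the reference matrix exactly
    if sum(subdivision) != len(reference):
        raise ValueError("subdivision does not add up to the reference row count")
    # Expand group sizes into row labels: [3, 2] -> [0, 0, 0, 1, 1]
    return [g for g, size in enumerate(subdivision) for _ in range(size)]


# Made-up data: 2 unknown rows, 5 reference rows in two groups (3 + 2)
unknown = [[1.0, 2.0], [1.5, 2.5]]
reference = [[1.1, 2.1], [0.9, 1.9], [1.0, 2.2], [3.0, 4.0], [3.1, 4.2]]
labels = check_inputs(unknown, reference, [3, 2])
```

A check of this kind cannot verify that the variables are in the same order in both files; that remains the user's responsibility.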

**Analyses:** identification statistics may use three approaches:

- **Mahalanobis distances**: the unknown individuals are “projected” into the discriminant space computed on the reference matrix only.
- **Maximum likelihood** method: see *Jean-Pierre Dujardin, Sebastien Dujardin, Dramane Kaba, Soledad Santillan-Guayasamin, Anita G. Villacis, Sitha Piyaselakul, Suchada Sumruayphol, Ronald Morales Vargas. 2017. The maximum likelihood identification method applied to insect morphometric data. Zoological Systematics (2017), DOI: 10.11865/zs.2017*
- **Artificial Neural Network (ANN)**, using the Multilayer Perceptron: requires an existing and adequate weights file.
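To make the Mahalanobis approach concrete, here is a minimal NumPy sketch. It is a simplified stand-in, not XYOM's implementation: it assumes a pooled within-group covariance estimated from the reference data only, and assigns each unknown individual to the reference group with the nearest mean. The function name `mahalanobis_assign` and the data are hypothetical.

```python
import numpy as np


def mahalanobis_assign(unknown, reference, labels):
    """Assign each unknown row to the reference group with the nearest mean."""
    unknown = np.asarray(unknown, dtype=float)
    reference = np.asarray(reference, dtype=float)
    labels = np.asarray(labels)
    groups = np.unique(labels)
    means = np.array([reference[labels == g].mean(axis=0) for g in groups])
    # Pooled within-group covariance, computed on the reference matrix only
    centered = reference - means[np.searchsorted(groups, labels)]
    pooled = centered.T @ centered / (len(reference) - len(groups))
    inv = np.linalg.pinv(pooled)
    # Squared distance of every unknown to every group mean
    diff = unknown[:, None, :] - means[None, :, :]
    d2 = np.einsum("ugi,ij,ugj->ug", diff, inv, diff)
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative rounding errors
    return groups[d2.argmin(axis=1)], np.sqrt(d2)


# Made-up data: two well-separated reference groups of four individuals each
reference = [[0, 0], [1, 0], [0, 1], [1, 1],
             [10, 10], [11, 10], [10, 11], [11, 11]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
unknown = [[0.5, 0.4], [10.2, 10.6]]
assigned, distances = mahalanobis_assign(unknown, reference, labels)
```

Here `assigned` holds one tentative group label per unknown individual and `distances` holds one row per unknown and one column per reference group.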

**Output:**

- A report is issued indicating the **tentative assignment** of each individual to a given reference group.

**IDENTIFICATION section of XYOM: Maximum Likelihood or Mahalanobis distances methods:**

*According to the kind of variable declared by the user, XYOM automatically performs some data transformations, as described below.*

- In case of RAW COORDINATES of anatomical LANDMARKS
  - **Concatenating** raw landmarks of unknown and reference individuals
  - **GPA** (Generalised Procrustes Analysis) on these concatenated data
  - **PCA** on the resulting **ORP** (orthogonal projections, also called Procrustes residuals)
  - Splitting the file of PCs of shape into two files, one containing the PCs corresponding to the unknown individuals and one containing the PCs corresponding to the reference individuals.
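The landmark steps can be sketched with a bare-bones GPA in NumPy. This is an illustration under stated assumptions, not XYOM's code: the helper names are hypothetical, the rotation step does not exclude reflections, and the orthogonal projection and PCA steps are left out so that only the concatenate, superimpose and split logic is shown.

```python
import numpy as np


def center_scale(shape):
    """Remove position and size: centre on the centroid, unit centroid size."""
    shape = shape - shape.mean(axis=0)
    return shape / np.linalg.norm(shape)


def rotate_onto(shape, target):
    """Orthogonal Procrustes fit of `shape` onto `target`."""
    u, _, vt = np.linalg.svd(shape.T @ target)
    return shape @ (u @ vt)


def gpa(shapes, n_iter=10):
    """Iteratively align all configurations to their running consensus."""
    aligned = np.array([center_scale(s) for s in shapes])
    consensus = aligned[0]
    for _ in range(n_iter):
        aligned = np.array([rotate_onto(s, consensus) for s in aligned])
        consensus = center_scale(aligned.mean(axis=0))
    return aligned, consensus


def rot(deg):
    """2D rotation matrix, used only to build the made-up example data."""
    t = np.radians(deg)
    return np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])


# Made-up data: the same 4-landmark shape, moved, rotated and rescaled
base = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 1.0], [0.0, 1.5]])
unknown = np.array([2.0 * base @ rot(30) + 5.0])
reference = np.array([base, 0.5 * base @ rot(-70) - 1.0, 3.0 * base @ rot(120)])

merged = np.concatenate([unknown, reference])   # unknown rows first
aligned, consensus = gpa(merged)                # one GPA on the merged data
aligned_unknown = aligned[:len(unknown)]        # split back by row position
aligned_reference = aligned[len(unknown):]
```

Because the superimposition uses a consensus computed from all individuals, unknown and reference specimens have to go through the same GPA, which is why the files are merged first and split afterwards.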

- In case of RAW COORDINATES of PSEUDOlandmarks (outlines)
  - **EFA** (Elliptic Fourier Analysis) **separately** on UNKNOWN and REFERENCE data.
    - For the EFA, it is not mandatory to concatenate unknown and reference data into one file for subsequent comparisons. Indeed, when size and shape are separated, each individual is treated independently of the others. Unlike the GPA, shape generation in the EFA (NEF) does not need the consensus of the data.
  - Removing the first three columns of **NEFs** (Normalised Elliptic Fourier coefficients) on both UNKNOWN and REFERENCE data.
    - For each individual, the first three coefficients become “degenerate” in the process of rotation and size correction.
  - **Taking the same number of NEFs for each individual** (the same number of columns in every row), using as a maximum the number of columns of the individual described by the fewest coefficients, whether an unknown or a reference specimen.
    - This step might reduce the power of shape reconstruction for the remaining individuals, but it is mandatory for identification statistics.
  - **Concatenating** NEFs of unknown and reference specimens (MISCELLANEOUS, Working on data files, Concatenate, By Rows).
    - This merging of the two files (Unknown NEF and Reference NEF) is required by the next step, the Principal Component Analysis (PCA).
  - **PCA** on the merged NEF data.
  - Splitting the file of PCs of shape into two files, one containing the PCs corresponding to the unknown individuals and one containing the PCs corresponding to the reference individuals.
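The NEF preparation steps for outlines (dropping the first three coefficients, trimming every individual to a common length, then merging by rows) can be sketched in plain Python. The function name `trim_nef` and all coefficient values are made up for illustration; XYOM performs these steps internally.

```python
def trim_nef(unknown_nef, reference_nef, n_drop=3):
    """Drop the first `n_drop` coefficients, then cut every row to the
    length of the shortest remaining row (unknown or reference)."""
    rows = [row[n_drop:] for row in unknown_nef + reference_nef]
    n_keep = min(len(row) for row in rows)
    rows = [row[:n_keep] for row in rows]
    return rows[:len(unknown_nef)], rows[len(unknown_nef):]


# Made-up NEF rows with 8, 7 and 9 coefficients respectively
unknown_nef = [[1.0, 0.0, 0.0, 0.31, 0.05, -0.02, 0.01, 0.004]]
reference_nef = [
    [1.0, 0.0, 0.0, 0.29, 0.04, -0.01, 0.02],
    [1.0, 0.0, 0.0, 0.33, 0.06, -0.03, 0.02, 0.001, 0.0005],
]

unknown_trim, reference_trim = trim_nef(unknown_nef, reference_nef)
merged = unknown_trim + reference_trim   # concatenation by rows, unknown first
```

In this example the shortest individual keeps four coefficients after the first three are removed, so every row of `merged` is cut to four columns before the PCA.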

- In case of TRADITIONAL MENSURATIONS (this could also be a table of NEFs)
  - **Concatenating** unknown and reference data
  - **PCA** on these data
  - Splitting the file of PCs into two files, one containing the PCs corresponding to the unknown individuals and one containing the PCs corresponding to the reference individuals.
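The concatenate, PCA and split sequence shared by all three cases can be sketched as follows. This is a generic PCA via singular value decomposition, given as an assumption about what such a step looks like and not as XYOM's own routine; `pca_scores` and the measurements are hypothetical.

```python
import numpy as np


def pca_scores(data, n_components=None):
    """Principal component scores of the rows of `data`."""
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)
    # SVD of the centred matrix; u * s gives the scores, ordered by variance
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return (u * s)[:, :n_components]


# Made-up table of three traditional measurements per individual
unknown = np.array([[2.1, 1.0, 0.5], [3.9, 2.1, 1.1]])
reference = np.array([[2.0, 1.0, 0.5], [2.2, 1.1, 0.6],
                      [4.0, 2.0, 1.0], [4.1, 2.2, 1.2]])

merged = np.vstack([unknown, reference])       # unknown rows first
scores = pca_scores(merged, n_components=2)    # one PCA on the merged data
pcs_unknown = scores[:len(unknown)]            # split back by row position
pcs_reference = scores[len(unknown):]
```

Running a single PCA on the merged table places unknown and reference individuals on the same axes, so their scores are directly comparable once the file is split again.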

**IDENTIFICATION section of XYOM: Artificial Neural Network (ANN) method**

- RAW COORDINATES of anatomical LANDMARKS not allowed
- RAW COORDINATES of PSEUDOlandmarks (outlines) not allowed
- Normalized Elliptic Fourier coefficients (NEF) allowed only under some conditions
- For any other kind of data, please remember that a prior learning step in the MACHINE LEARNING section of XYOM is mandatory.