This example shows how to use the CLIC method to identify or characterize your own specimens. Apart from downloading a few images from the CLIC page (Step 1) and digitizing them (Step 2), the procedure is fast: it takes longer to read than to apply.

OBJECTIVE and SUMMARY

==============================================

You want to use geometric morphometrics to identify a single female specimen of Bactrocera tau collected in Thailand, where at least two cryptic species are known to occur: A and C (Sangvorn & Dujardin, 2009).

The steps presented here may serve as practical guidelines to avoid wasting computer time. Five main steps are necessary, and at least four new folders will be created:

Step 1 - DOWNLOADING the REFERENCE IMAGES

Step 2 - DIGITIZING the IMAGES (COO)

Step 3 - PREPARATION of the TEST FOLDER

Step 4 - Creation of the REFERENCE file (TET)

Step 5 - IDENTIFICATION TESTS (MOG)

==============================================

Step 1 . DOWNLOADING REFERENCE IMAGES

==============================================

To find out which cryptic species your unknown female B. tau belongs to, you need reference images of female B. tau "A" and female B. tau "C".

==============================================

Step 2 . DIGITIZING IMAGES, use of the COO module

==============================================

The reference images obtained in Step 1 allow you to get the reference coordinates. You need software to collect these coordinates; in CLIC, this is the job of the COO module.

==============================================

Step 3 . PREPARATION OF THE TEST FOLDER

==============================================

Be careful: the folders A and C you just created contain different reference images, but they contain text files having the same names (see previous steps).

You could rename these files in place, but it is better to CREATE A NEW FOLDER and copy the files there under NEW NAMES. You therefore now create the fourth folder, named Btau_TEST (the first three were A, C and Btau_UNKN).
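The copy-and-rename step can also be scripted. The following is a minimal sketch, not part of CLIC: the shared file name coord_format.txt is hypothetical (use whatever name your A and C folders actually share), and the rule of inserting a group tag after the first underscore is only an assumption chosen to produce names such as coord_a_format.txt.

```python
import shutil
from pathlib import Path


def target_name(shared_name, group):
    """Insert a group tag after the first underscore of the shared name."""
    stem, sep, rest = shared_name.partition("_")
    # Fall back to a simple prefix when the name contains no underscore.
    return f"{stem}_{group}_{rest}" if sep else f"{group}_{shared_name}"


def collect(source_dirs, shared_name, dest):
    """Copy the same-named file from each folder into dest under a new name.

    source_dirs maps a group tag (e.g. "a") to its folder (e.g. "A").
    """
    dest = Path(dest)
    dest.mkdir(exist_ok=True)
    copied = []
    for group, folder in source_dirs.items():
        target = dest / target_name(shared_name, group)
        if target.exists():
            # Never overwrite silently: that is the mix-up this step avoids.
            raise FileExistsError(target)
        shutil.copy(Path(folder) / shared_name, target)
        copied.append(target.name)
    return copied
```

A call such as collect({"a": "A", "c": "C", "unkn": "Btau_UNKN"}, "coord_format.txt", "Btau_TEST") would fill the test folder, assuming each source folder really contains a file with that shared name.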

==============================================

Step 4 - Creation of the total REFERENCE file, use of TET

==============================================

  • 4/1. If you followed the previous steps carefully, everything is now ready to create, within the Btau_TEST folder, the file holding all the reference coordinates. This file is the concatenation of coord_a_format.txt and coord_c_format.txt. Concatenation is a job for TET. You open TET,

    you load coord_a_format.txt ,

    you ask for row concatenation and

    you load coord_c_format.txt .

    You then push the red button at the bottom right, which lets you save the concatenated megafile under a NEW NAME, say:

    a_c.txt

     

  • 4/2. The following files are now present in the Btau_TEST folder:

    coord_a_format.txt

    coord_c_format.txt

    coord_unkn_format.txt

    a_c.txt

    a_c_log.txt

     

    We have reached the end of the most important steps. The MOG module now has everything it needs to run your identification test.
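For readers who want to check what row concatenation amounts to, here is a minimal sketch outside CLIC. It assumes (this is not confirmed by the text) that a ..._format.txt file holds one specimen per row, as whitespace-separated numbers with no header line. It does not reproduce TET itself or its log file.

```python
def concatenate_rows(first_text, second_text):
    """Stack the rows of two coordinate tables, the second below the first."""
    rows = []
    width = None
    for text in (first_text, second_text):
        for line in text.splitlines():
            if not line.strip():
                continue  # skip blank lines
            n_fields = len(line.split())
            if width is None:
                width = n_fields  # the first row fixes the expected width
            elif n_fields != width:
                # Specimens digitized with different landmark counts
                # cannot be stacked into a single table.
                raise ValueError(
                    f"expected {width} values per row, got {n_fields}"
                )
            rows.append(line.strip())
    return "\n".join(rows) + "\n"
```

In practice you would read the two reference files as text, pass them to concatenate_rows, and write the result to a_c.txt. The width check is there because every specimen must have the same number of landmarks.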

==============================================

Step 5 - IDENTIFICATION TESTS, use of MOG

==============================================

In the example at hand (see previous steps), the identification step is performed entirely by one module, MOG, using files from a single folder, Btau_TEST.

With MOG, open the "a_c.txt" file containing the coordinates of the reference images. The MOG module allows:

  • 1. Obtaining size and shape variables
  • 2. The addition of unknown specimens
  • 3. The classification of the unknown specimens

    • # 1. Obtaining centroid sizes (CS) and shape variables (Residual coordinates, Procrustes residuals, PW, RW).

      Files generated will be :

      a_c_CS.txt

      a_c_ALIGNED.txt

      a_c_PrRes.txt

      a_c_PW

      a_c_RW

         At this stage, you can perform a PCA on Procrustes Residuals. File generated will be

      a_c_PrCp

       

    • # 2. The introduction of your own, unknown specimens as external data (button EXT/UNKN).

         * After you enter the external data file coord_unkn_format.txt, the following files are generated (this step is rather slow):

      a_c_PW_base

      a_c_PW_unkn

      a_c_PW_base_unkn

         These files contain the PW computed on the grand total, i.e. the reference specimens plus the unknown specimen(s).

         

    • # 3. Identifying/classifying "unknown" specimen(s)

      There are two ways of classifying, one based on Procrustes distances, the other, more powerful but requiring larger sample sizes (***), on Mahalanobis distances.

         * Procrustes classification. After the PW on the grand total have been computed, a first classification of your unknown specimen(s) is performed automatically on the basis of Procrustes distances. The process is rather slow. It uses two algorithms, one based on the shortest Procrustes distance to each consensus (one per reference species), the other on the K nearest neighbors method (KNN). The two do not necessarily agree, and the next approach (DA and Mahalanobis classification) is preferred.

         * Mahalanobis classification. This classification does not start automatically. After the Procrustes classification, you must ask for a discriminant analysis (DA). You cannot choose the input file: the DA uses as input the PW of the reference images (a_c_PW_base). You are then allowed to call your unknown specimen(s) as supplementary data (yes, another EXT/UNKN button): the program uses the PW in a_c_PW_unkn as supplementary data.

         *  Thus, the EXT/UNKN button of the DA will automatically open the file containing the PW relative to the unknown specimen(s) and compute their position in the discriminant space obtained from the PW of the reference specimens. The classification algorithm is the one based on the shortest Mahalanobis distance to each consensus.
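To make the quantities above concrete, here is a minimal sketch (not the MOG implementation) of centroid size and of the "shortest Procrustes distance to each consensus" rule for 2D landmarks. It assumes the consensus configurations are already available as lists of (x, y) pairs, uses the partial Procrustes distance, and leaves out KNN and the Mahalanobis classification. All function names are hypothetical.

```python
import math


def centroid_size(landmarks):
    """Square root of the summed squared distances to the centroid."""
    n = len(landmarks)
    cx = sum(x for x, _ in landmarks) / n
    cy = sum(y for _, y in landmarks) / n
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in landmarks))


def preshape(landmarks):
    """Center on the centroid and scale to unit centroid size.

    Each landmark is returned as a complex number x + iy, which makes
    the optimal 2D rotation easy to express.
    """
    n = len(landmarks)
    cx = sum(x for x, _ in landmarks) / n
    cy = sum(y for _, y in landmarks) / n
    cs = centroid_size(landmarks)
    return [complex(x - cx, y - cy) / cs for x, y in landmarks]


def procrustes_distance(a, b):
    """Partial Procrustes distance after removing position, size, rotation."""
    za, zb = preshape(a), preshape(b)
    # |<za, zb>| is the best achievable overlap over all rotations.
    overlap = abs(sum(p * q.conjugate() for p, q in zip(za, zb)))
    return math.sqrt(max(0.0, 2.0 - 2.0 * overlap))


def classify(unknown, consensus_by_group):
    """Return the group whose consensus is closest to the unknown specimen."""
    return min(
        consensus_by_group,
        key=lambda g: procrustes_distance(unknown, consensus_by_group[g]),
    )
```

With toy consensus shapes {"A": [(0, 0), (4, 0), (0, 3)], "C": [(0, 0), (4, 0), (2, 1)]}, a rotated, enlarged and shifted copy of the second triangle is assigned to "C", because position, size and orientation do not contribute to the distance.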

    ==============================================

    (*)  You can leave the CLIC environment here and switch to your preferred software, since the coord_Sept7.txt from the A folder, the coord_Sept7.txt from the C folder and the coord_Sept8.txt are in the TPS format, usable by TPS and some other scripts (see http:://www.edu.).

    (**) The ..._format.txt and ..._DB.txt files can also be obtained from the coord_... file by using the TET module. You open TET, then open the raw coordinates file and ask for the first transformation, called

    .tps file (from TPSdig or COO) >> ..._format.txt + _DB.txt

    (***) "Larger sample sizes" refers here to the reference specimens, since the discriminant model is constructed from them.