MLP Configuration by XYOM

Configuration of multilayer perceptron (MLP) training in XYOM.

(see the summary at the end of this page)

The type of training is “mini-batch”. Mini-batch training divides the training data records into groups of approximately equal size, then updates the … weights after passing one group; that is, mini-batch training uses information from a group of records. The process then recycles the data groups if necessary. Mini-batch training … may be best for “medium-size” datasets.
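
For illustration, here is a minimal sketch of how records can be grouped for mini-batch training (the function name and the use of numpy are assumptions; XYOM’s internals are not shown):

```python
import numpy as np

def mini_batches(n_records, batch_size, rng=np.random.default_rng()):
    """Yield index groups of approximately equal size; in mini-batch
    training the weights are updated after each group, and the groups
    are recycled at the next pass if necessary."""
    order = rng.permutation(n_records)
    for start in range(0, n_records, batch_size):
        yield order[start:start + batch_size]
```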

  • Data normalized or not?
    • By default, XYOM does not normalize the data. This parameter is the first one in the parameter list and is automatically set to “0”. However, experience with XYOM suggests that input data should frequently be normalized, especially when they are of different kinds (for instance, linear measurements between anatomical points combined with coordinates of pseudolandmarks).
    • The user is given the option to ask XYOM to normalize the data. Normalized values fall between 0 and 1: the conversion formula is (x − min)/(max − min). Data are normalized by columns, as sketched below.
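
A minimal sketch of this per-column min-max normalization (the function name and the guard for constant columns are assumptions, not XYOM code):

```python
import numpy as np

def normalize_columns(x):
    """Map each column of x to [0, 1] via (x - min) / (max - min)."""
    col_min = x.min(axis=0)
    col_max = x.max(axis=0)
    # Assumption: constant columns are left at 0 to avoid division by zero.
    span = np.where(col_max > col_min, col_max - col_min, 1.0)
    return (x - col_min) / span
```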


  • Input dimension
    This is the number of columns of the reference file, which must be the same as the number of columns of any other input file, such as the “fake unknown” file. It is automatically computed and set by XYOM.


  • Output dimension
    This is the number of successive groups in the reference file, i.e., the number of elements in the subdivision array.

    • The subdivision must be set by the user; XYOM does the rest. For instance, if the subdivision of the reference data is set to “20,24,29”, the “output dimension” is 3.
    • Output content
      Automatically set by XYOM. In our example (“20,24,29”), the output content is the correct assignment for each reference individual: a matrix with 73 rows (20 + 24 + 29) showing the expected output as follows (see the sketch after this list):

      • for the first 20 rows: [1, 0, 0],
      • for the next 24 rows: [0, 1, 0],
      • for the final 29 rows: [0, 0, 1].
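
As an illustration, a minimal sketch of how such an expected-output matrix can be built from the subdivision (the function name is an assumption; XYOM’s internals are not shown):

```python
import numpy as np

def one_hot_targets(subdivision):
    """Build the expected-output matrix from a subdivision such as [20, 24, 29].

    Returns a (sum(subdivision), len(subdivision)) matrix of 0/1 rows.
    """
    n_groups = len(subdivision)                  # output dimension (3 in the example)
    targets = np.zeros((sum(subdivision), n_groups))
    start = 0
    for group, size in enumerate(subdivision):
        targets[start:start + size, group] = 1.0  # e.g. [1, 0, 0] for the first group
        start += size
    return targets

# one_hot_targets([20, 24, 29]) -> 73 rows: [1,0,0] x20, [0,1,0] x24, [0,0,1] x29
```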



  • The number of neurons per layer. “For most problems, one could probably get decent performance (even without a second optimization step) by setting the hidden layer configuration using just two rules: (i) number of hidden layers equals one; and (ii) the number of neurons in that layer is the mean of the neurons in the input and output layers” (see https://stackoverflow.com/questions/39553489/how-to-detemine-total-hidden-layer-node-and-output-node). Current experience with XYOM suggests that, when using landmark-derived data, the number of neurons should be very close to the output dimension. The number of neurons per hidden layer is therefore automatically set to the output dimension, but the user is given the possibility to indicate a different number (see the configuration sketch below).
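
For illustration only, a comparable default configuration expressed with scikit-learn’s MLPClassifier (an assumption for the sake of example; XYOM does not necessarily use this library, and the exact solver and activation are not specified here):

```python
from sklearn.neural_network import MLPClassifier

output_dim = 3  # from the subdivision "20,24,29"

# One hidden layer whose size defaults to the output dimension,
# learning rate 0.015, and a 30000-epoch cap, mirroring the defaults above.
mlp = MLPClassifier(
    hidden_layer_sizes=(output_dim,),
    learning_rate_init=0.015,
    max_iter=30000,
)
```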


  • The training set and the testing set.
    The training set is selected at random from the reference data, taking care to extract data from each group (up to 50% of the smallest group). The testing set (or “validation” set) is what remains after the training set has been selected, so it is also randomly constituted. Both sets are automatically built by XYOM; a sketch of such a stratified split is shown below.
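
A minimal sketch of a stratified split of this kind (reading “up to 50% of the smallest group” as drawing that same number of records from every group is an assumption, as are the names):

```python
import numpy as np

def split_reference(data, subdivision, rng=np.random.default_rng()):
    """Draw a training sample from each successive group of the reference
    data; whatever remains forms the testing ("validation") set."""
    per_group = min(subdivision) // 2   # up to 50% of the smallest group
    train_idx, test_idx, start = [], [], 0
    for size in subdivision:
        rows = rng.permutation(np.arange(start, start + size))
        train_idx.extend(rows[:per_group])
        test_idx.extend(rows[per_group:])
        start += size
    return data[np.array(train_idx)], data[np.array(test_idx)]
```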


  • The learning rate is set to 0.015 by XYOM; it can be modified by the user. Intuitively, increasing the learning rate should speed up MLP training. In practice, the reverse can also happen, seemingly freezing the MLP; it is then better to decrease the selected value. A sketch of the role of this parameter is shown below.
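
As a reminder of what this parameter controls, here is a generic gradient-descent weight update (standard textbook form, not XYOM source; eta is the learning rate):

```python
def update_weights(weights, gradients, eta=0.015):
    """One gradient-descent step: too large an eta can make training
    oscillate or stall ("freeze"); too small makes it slow."""
    return [w - eta * g for w, g in zip(weights, gradients)]
```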


  • There are 3 evaluation steps:
    • 1/ The training error, derived from the amount of difference between the observed and predicted assignments of the trained individuals. Its tentative limit is set by XYOM to:
      0.05 * output dimension.
      If this limit cannot be reached after 30000 epochs (iterations), it is automatically increased. Other strategies will be implemented in the future.
    • 2/ The second step is related to another kind of error. It is the percentage of correctly classified individuals among the testing set (or validation set). The value is initially set by XYOM to 75%. The learning process above (see 1/) is repeated if this percentage falls below that score. The user can modify the expected score (please beware of possible “overfitting”).
    • 3/ The last step allows the user to decide whether to reconfigure the MLP or not. It is the percentage of correctly classified individuals among the untrained individuals (or “fake unknown”, i.e., the first input file you entered along with the reference file to start the MLP learning). It is a different estimation from the second one, since the fake unknowns are untrained individuals: they did not participate in the learning process. The user decides whether this final estimation is acceptable. A sketch of the first two steps is given below.
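
A minimal sketch of how the first two evaluation steps could chain together (the control flow, the increase factor, and the mlp.fit / mlp.accuracy helpers are all assumptions, not XYOM’s actual API; the third step is a human decision):

```python
def train_with_checks(mlp, train_set, test_set, output_dim, target_score=0.75):
    error_limit = 0.05 * output_dim                    # step 1: tentative error limit
    while True:
        error = mlp.fit(train_set, max_epochs=30000)   # hypothetical helper
        if error > error_limit:
            error_limit *= 1.5                         # "automatically increased"; factor is an assumption
            continue
        if mlp.accuracy(test_set) >= target_score:     # step 2: testing-set score
            return                                     # step 3 (fake unknowns) is left to the user
```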


  • Summary
    • Parameters that must be provided by the user:
      •  Output dimension (it is given by the “subdivision” of the reference data)
    • Parameters that the user can modify:
      • Normalization of the data, default is “0” (no normalization)
      • Number of hidden layers, default is “1”
      • Number of neurons per layer, default is the output dimension
      • Learning rate, default is 0.015
      • Correctly classified objects in the testing set, default is 75%
    • Parameters that cannot be modified by the user
      • Input dimension
      • Training and testing set composition
      • The user cannot specify a different number of neurons for each hidden layer.
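
The defaults above can be recapped in one place (a descriptive summary as a Python dict; the key names are assumptions, not XYOM identifiers):

```python
defaults = {
    "normalize": 0,             # user-modifiable
    "hidden_layers": 1,         # user-modifiable
    "neurons_per_layer": None,  # defaults to the output dimension, user-modifiable
    "learning_rate": 0.015,     # user-modifiable
    "target_test_score": 0.75,  # user-modifiable
    # input dimension and the training/testing split are set by XYOM itself
}
```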