Cross-validation
Cross-validation is mainly used when the task of the model is prediction.
The way it works is by partitioning the train_data into complementary subsets. For instance, in k-fold cross-validation the data is divided into k equally sized folds; training is done on (k-1) folds, and the performance of the model during the training phase is analysed on the one held-out fold (called the validation / test set). The experiment is repeated k times, so that each of the k folds is used as the development_set exactly once. The final predictive performance of the model is computed from the performance of all k experiments, typically by averaging them.
Cross-validation is a recommended technique for tuning the hyperparameters of a model [13].

Accuracy rate (Classification Accuracy):
Accuracy rate, or Classification Accuracy, is the number of correctly classified samples divided by the number of all predictions made. In simple words, the accuracy rate is how often the classifier is correct. (Evaluation metrics that are defined in the range of 0 to 1, such as the accuracy rate, are in practice normally multiplied by 100 to turn them into a percentage.) The formula is as follows [18]:
$$\text{Accuracy\_rate} = \frac{\text{number of correctly classified samples}}{\text{total number of predictions}}$$

Classification_Error:
There is also the Classification_Error, which is the complementary concept of the Accuracy_rate (1 - Accuracy_rate).
It computes how often the classifier is incorrect. The formula is as follows [18]:
$$\text{Classification\_Error} = 1 - \text{Accuracy\_rate} = \frac{\text{number of misclassified samples}}{\text{total number of predictions}}$$

Null_accuracy (most_frequent_accuracy):
Null_accuracy is the accuracy that the system would achieve if it assigned the most frequent label to every input. It serves as a base_line accuracy against which the accuracy achieved by the system is compared.

Precision (Confidence):
Precision is the number of correctly labeled samples of a class (its true positives, TP) divided by the number of all instances that the model retrieved for that class [26].

Recall (Sensitivity):
Recall is the number of correctly labeled samples of a class divided by the number of all relevant samples, i.e. all samples that truly belong to that class [26].
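The k-fold procedure, the accuracy rate, the classification error and the null accuracy described above can be sketched together in a few lines. The following is only a minimal illustration under stated assumptions: the helper names (`k_fold_indices`, `cross_validate_majority`), the toy label list and the choice of a most-frequent-label classifier as the "model" are all hypothetical and are not taken from the cited sources.

```python
from collections import Counter


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels (between 0 and 1)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def null_accuracy(y_true):
    """Accuracy obtained by always predicting the most frequent label."""
    most_common_count = Counter(y_true).most_common(1)[0][1]
    return most_common_count / len(y_true)


def cross_validate_majority(labels, k):
    """Run k-fold cross-validation with a most-frequent-label classifier."""
    scores = []
    for held_out in k_fold_indices(len(labels), k):
        held_out_set = set(held_out)
        # Train on the other (k-1) folds.
        train = [y for i, y in enumerate(labels) if i not in held_out_set]
        # "Training" here only means memorising the majority training label.
        majority = Counter(train).most_common(1)[0][0]
        # Evaluate on the single held-out fold.
        y_val = [labels[i] for i in held_out]
        scores.append(accuracy(y_val, [majority] * len(y_val)))
    return scores


# Hypothetical toy data: 7 "pos" and 3 "neg" labels.
labels = ["pos", "pos", "neg", "pos", "neg", "pos", "pos", "neg", "pos", "pos"]
fold_scores = cross_validate_majority(labels, k=5)
mean_accuracy = sum(fold_scores) / len(fold_scores)

print("per-fold accuracy:", fold_scores)
print("mean accuracy:", mean_accuracy)
print("classification error:", 1 - mean_accuracy)
print("null accuracy on all labels:", null_accuracy(labels))
```

Because the stand-in model always predicts the majority label, its cross-validated accuracy stays close to the null accuracy; a real classifier would be expected to exceed that base line.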
F_measure:
The F_measure (F-score), also known as F1, is the special case of Fβ in which β = 1. The F1_measure combines precision and recall: it is approximately the average of the two when they are close, and is more generally their harmonic mean [26]. The mathematical formula for calculating the F1_measure is the following:
$$F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$
The more general mathematical formula, for the Fβ_measure, is the following:
$$F_\beta = (1 + \beta^2) \cdot \frac{\text{Precision} \cdot \text{Recall}}{\beta^2 \cdot \text{Precision} + \text{Recall}}$$

Micro-average and Macro-average:
There are different averages that can be considered when calculating precision, recall, and F-measure in classification tasks.
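Micro-average and Macro-average are two of them. The formulas to compute the Micro-average and the Macro-average in the case of binary classification are provided below (the indices 1 and 2 refer to class 1 and class 2; for example, TP1 is the TP of class 1 and TP2 is the TP of class 2) [24]:
$$\text{Precision}_{micro} = \frac{TP_1 + TP_2}{TP_1 + TP_2 + FP_1 + FP_2}, \qquad \text{Recall}_{micro} = \frac{TP_1 + TP_2}{TP_1 + TP_2 + FN_1 + FN_2}$$
$$\text{Precision}_{macro} = \frac{\text{Precision}_1 + \text{Precision}_2}{2}, \qquad \text{Recall}_{macro} = \frac{\text{Recall}_1 + \text{Recall}_2}{2}$$
The Micro-average F-score and the Macro-average F-score are each the harmonic mean of the corresponding averaged precision and recall.

Support:
The support of each class is the number of samples of the true response that lie in that class; in other words, support is the number of occurrences of each class in y_true [24].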
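As an illustration of the precision, recall, F-measure, averaging and support definitions above, the following sketch computes them from two label lists. It is a minimal example under assumptions: the function names (`per_class_counts`, `safe_div`, `f_beta`, `averaged_scores`) and the toy labels are hypothetical, and returning 0.0 when a denominator is zero is an assumed convention, not something prescribed by the cited sources.

```python
from collections import Counter


def per_class_counts(y_true, y_pred, label):
    """Return (TP, FP, FN) when `label` is treated as the positive class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    return tp, fp, fn


def safe_div(num, den):
    """Divide, returning 0.0 for a zero denominator (assumed convention)."""
    return num / den if den else 0.0


def f_beta(precision, recall, beta=1.0):
    """F_beta score; beta=1 gives F1, the harmonic mean of precision and recall."""
    b2 = beta ** 2
    return safe_div((1 + b2) * precision * recall, b2 * precision + recall)


def averaged_scores(y_true, y_pred):
    """Micro- and macro-averaged (precision, recall, F1) plus per-class support."""
    labels = sorted(set(y_true) | set(y_pred))
    counts = {c: per_class_counts(y_true, y_pred, c) for c in labels}

    # Micro-average: pool the counts of all classes, then compute the ratios.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro_p, micro_r = safe_div(tp, tp + fp), safe_div(tp, tp + fn)

    # Macro-average: compute the ratios per class, then take the plain mean.
    macro_p = sum(safe_div(t, t + f) for t, f, _ in counts.values()) / len(labels)
    macro_r = sum(safe_div(t, t + n) for t, _, n in counts.values()) / len(labels)

    return {
        "micro": (micro_p, micro_r, f_beta(micro_p, micro_r)),
        "macro": (macro_p, macro_r, f_beta(macro_p, macro_r)),
        # Support: number of occurrences of each class in y_true.
        "support": dict(Counter(y_true)),
    }


# Hypothetical binary example with classes 1 and 2.
y_true = [1, 1, 1, 1, 2, 2]
y_pred = [1, 1, 1, 2, 2, 1]
scores = averaged_scores(y_true, y_pred)
for name, value in scores.items():
    print(name, value)
```

The macro F-score here follows the definition given in the text (harmonic mean of the macro-averaged precision and recall). Some libraries instead average the per-class F-scores, which can give a slightly different number.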
Confusion matrix:
A confusion matrix is a table that visualizes how well a model performs. Rows represent the samples assigned to each class by the prediction, and columns represent the samples belonging to each true class [26].
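A confusion matrix with the layout described above (rows for predicted classes, columns for true classes) can be built by simple counting. The sketch below is a minimal illustration; the function name `confusion_matrix`, the label names and the toy data are assumptions made for the example.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Count matrix with rows = predicted class and columns = true class."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[p]][index[t]] += 1
    return matrix


# Hypothetical toy data with two classes.
y_true = ["cat", "cat", "cat", "dog", "dog", "cat"]
y_pred = ["cat", "dog", "cat", "dog", "dog", "cat"]
labels = ["cat", "dog"]
matrix = confusion_matrix(y_true, y_pred, labels)

# Print the table: one row per predicted class, one column per true class.
print("pred/true", *labels, sep="\t")
for label, row in zip(labels, matrix):
    print(label, *row, sep="\t")

# The diagonal holds the correctly classified samples.
correct = sum(matrix[i][i] for i in range(len(labels)))
print("accuracy:", correct / len(y_true))
```

Some libraries use the transposed layout; scikit-learn's `confusion_matrix`, for example, places the true classes in the rows and the predicted classes in the columns, so the orientation should be checked before reading off false positives and false negatives.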