All functions

decisiontree()

Decision tree. Trains a decision tree on the given training dataset and uses it to predict the classification of the test dataset. The resulting accuracy, sensitivity and specificity are returned, as well as a tree summary.
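As an illustration of the three metrics this entry mentions, the following is a minimal Python sketch of how accuracy, sensitivity and specificity can be derived from actual and predicted binary labels. It is not feamiR's code; the function name `classification_metrics` and the toy labels are assumptions made for this example.

```python
def classification_metrics(actual, predicted, positive=1):
    """Return (accuracy, sensitivity, specificity) for binary labels.

    Hypothetical helper for illustration; not part of feamiR.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against a class being absent from the test set.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


# Toy labels (made up): 1 = positive interaction, 0 = negative.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 0, 1, 0, 1]
accuracy, sensitivity, specificity = classification_metrics(actual, predicted)
```

Sensitivity is the fraction of true positives recovered and specificity the fraction of true negatives recovered, so a model can score well on accuracy while doing poorly on one of the two.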

dtreevoting()

Decision tree voting scheme. Implements a feature selection approach based on Decision Trees, using a voting scheme across the top levels of trees trained on multiple subsamples.
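One way to picture a voting scheme of this kind is sketched below in Python. It assumes each trained tree has already been summarised as a list of levels, each level listing the features split on at that depth, and that every appearance in the top levels counts as one equally weighted vote. The data layout, the equal weighting, the function name `vote_top_levels` and the feature names are all assumptions for illustration; the package may weight or count votes differently.

```python
from collections import Counter


def vote_top_levels(trees, num_levels=2):
    """Tally one vote per feature appearance in the top levels of each tree.

    `trees` is a list of trees; each tree is a list of levels (root first),
    and each level is a list of feature names used at that depth.
    Hypothetical representation, chosen only for this sketch.
    """
    votes = Counter()
    for levels in trees:
        for level in levels[:num_levels]:
            votes.update(level)
    return votes


# Made-up summaries of three trees trained on different subsamples.
trees = [
    [["seed_match"], ["gc_content", "au_start"]],
    [["seed_match"], ["au_start", "length"]],
    [["gc_content"], ["seed_match", "length"], ["au_start"]],
]
votes = vote_top_levels(trees, num_levels=2)
ranking = votes.most_common()  # features ordered by vote count
```

Features that repeatedly appear near the root across subsamples collect the most votes and are the natural candidates to keep.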

eGA()

Embryonic Genetic Algorithm. Feature selection based on Embryonic Genetic Algorithms. It performs feature selection by maintaining an ongoing set of 'good' features, which is improved run by run. It outputs training and test accuracy, sensitivity and specificity, and a list of <=k features.

feamiR

feamiR: Classification and feature selection for microRNA/mRNA interactions

forwardfeatureselection()

Forward Feature Selection. Performs forward feature selection on the given list of features, placing them in order of discriminative power using a given model on the given dataset up to the accuracy plateau.
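The greedy idea behind forward selection can be sketched in Python as follows. The sketch assumes a caller-supplied `score` callback that returns the accuracy of some model trained on a given feature subset; that callback, the function name `forward_select`, the `tolerance` parameter and the toy gains are hypothetical and do not reflect the package's actual arguments.

```python
def forward_select(features, score, tolerance=1e-9):
    """Greedily add the feature that most improves `score` until it plateaus.

    Returns the selected features, in the order they were added, and the
    final score. Illustrative sketch only.
    """
    selected = []
    best = float("-inf")
    remaining = list(features)
    while remaining:
        # Evaluate each remaining feature added to the current selection.
        candidate, candidate_score = max(
            ((f, score(selected + [f])) for f in remaining),
            key=lambda pair: pair[1],
        )
        if candidate_score <= best + tolerance:
            break  # accuracy plateau: no remaining feature helps
        selected.append(candidate)
        remaining.remove(candidate)
        best = candidate_score
    return selected, best


# Toy scoring function: a baseline of 0.5 plus made-up per-feature gains.
gains = {"a": 0.20, "b": 0.10, "c": 0.0, "d": 0.05}


def toy_score(subset):
    return 0.5 + sum(gains[f] for f in subset)


selected, best = forward_select(["a", "b", "c", "d"], toy_score)
```

In the toy run the features are added in order of their gain and the loop stops before adding `c`, which contributes nothing.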

geneticalgorithm()

Standard Genetic Algorithm. Implements a standard genetic algorithm using the GA package (ga) with a fitness function specialised for feature selection.

preparedataset()

Dataset preparation. Performs all the preparation necessary for feamiR analysis, taking a set of mRNAs, a set of miRNAs and an interaction dataset and creating corresponding positive and negative datasets for ML modelling.

randomforest()

Random Forest. Trains a random forest on the training dataset and uses it to predict the classification of the test dataset. The resulting accuracy, sensitivity and specificity are returned, as well as a summary of the importance of features in the dataset.

rfgini()

Random Forest cumulative MeanDecreaseGini feature selection. Implements a feature selection approach based on cumulative MeanDecreaseGini using Random Forests trained on multiple subsamples.
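A cumulative importance measure of this kind can be sketched in Python as below. It assumes the per-feature MeanDecreaseGini values from each subsample's forest are already available as dictionaries and simply sums them per feature. The function name `cumulative_importance`, the feature names and the values are made up for illustration, and the package may aggregate or normalise differently.

```python
from collections import defaultdict


def cumulative_importance(runs):
    """Sum per-feature importance over all runs and rank features by total.

    `runs` is a list of dicts mapping feature name -> importance value
    (for example MeanDecreaseGini from one forest). Illustrative only.
    """
    totals = defaultdict(float)
    for importances in runs:
        for feature, value in importances.items():
            totals[feature] += value
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up importance values from forests trained on two subsamples.
runs = [
    {"seed_match": 4.0, "gc_content": 1.5, "length": 0.5},
    {"seed_match": 3.5, "gc_content": 2.0, "length": 0.25},
]
ranking = cumulative_importance(runs)
```

Summing over subsamples smooths out features that look important in a single forest only by chance.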

runallmodels()

Run all models. Trains and tests Decision Tree, Random Forest and SVM models on 100 subsamples and provides a summary of the results, so that the best model can be selected. The kernel and number of trees chosen by select_svm_kernel and select_rf_numtrees should be used for SVM and Random Forest respectively. This function can also be used to inform feature selection, using a Decision Tree voting scheme and a Random Forest measure based on the Gini index.

selectrfnumtrees()

Tuning number of trees hyperparameter. Trains random forests with a range of numbers of trees using cross validation, so that the optimal number can be identified from the resulting plot.

selectsvmkernel()

Tuning SVM kernel. Trains SVMs with a range of kernels (linear, polynomial degree 2, 3 and 4, radial and sigmoid) using cross validation, so that the optimal kernel can be chosen from the resulting plots. If specified (by showplots=F), the plots are saved as JPEG files.

svm()

SVM

svmlinear()

Linear SVM. Implements a linear SVM using the general svm function (for ease of use in feature selection).

svmpolynomial2()

Polynomial degree 2 SVM. Implements a polynomial degree 2 SVM using the general svm function (for ease of use in feature selection).

svmpolynomial3()

Polynomial degree 3 SVM. Implements a polynomial degree 3 SVM using the general svm function (for ease of use in feature selection).

svmpolynomial4()

Polynomial degree 4 SVM. Implements a polynomial degree 4 SVM using the general svm function (for ease of use in feature selection).

svmradial()

Radial SVM. Implements a radial SVM using the general svm function (for ease of use in feature selection).

svmsigmoid()

Sigmoid SVM. Implements a sigmoid SVM using the general svm function (for ease of use in feature selection).