Abstract:
In recent years, the data science and remote sensing communities have begun to converge, driven by user-friendly programming tools, access to high-end consumer computing power, and the availability of free satellite data. In particular, publicly available data from the European Space Agency’s Sentinel missions have been used in a range of remote sensing applications. However, studies that use these data to assess how well machine learning algorithms discriminate Lantana camara L. are lacking. In this study, I compare the classification performance of six non-parametric algorithms: support vector machines (SVM), random forests (RF), gradient boosting, k-nearest neighbors (KNN), neural networks (NNET), and decision trees. The study area contains the study species, Lantana, as well as six other land-cover and land-use (LCLU) classes. The mono-temporal satellite data used for the classification were downloaded in October, the same period in which the Lantana presence data were collected in the field. A total of 7811 samples were extracted from the training polygons. Using stratified random sampling, the samples were divided into training (60%) and evaluation (40%) subsets. Accuracy was assessed with metrics derived from error matrices, with overall accuracy serving as the primary criterion for ranking the algorithms. The highest overall accuracy was achieved by random forest, gradient boosting, and the neural network, each at 1.0000 (100%). These were closely followed by the support vector machine (0.9987), k-nearest neighbors (0.9971), and, lastly, the decision tree (0.9891).
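The evaluation protocol summarized above (a stratified 60/40 split followed by overall accuracy computed from an error matrix) can be illustrated with a minimal sketch. This is not the study's implementation: the abstract does not name the software used, and the function names, class labels, and sample counts below are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.6, seed=0):
    """Split sample indices into training and evaluation subsets,
    sampling each class separately so class proportions are preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, eval_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = round(train_fraction * len(indices))
        train_idx.extend(indices[:n_train])
        eval_idx.extend(indices[n_train:])
    return train_idx, eval_idx


def error_matrix(reference, predicted, classes):
    """Build an error (confusion) matrix: rows are reference classes,
    columns are predicted classes."""
    position = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for ref, pred in zip(reference, predicted):
        matrix[position[ref]][position[pred]] += 1
    return matrix


def overall_accuracy(matrix):
    """Overall accuracy = correctly classified samples (diagonal) / all samples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical toy labels (not the study's 7811 samples or its LCLU classes).
labels = ["lantana"] * 50 + ["forest"] * 30 + ["water"] * 20
train_idx, eval_idx = stratified_split(labels, train_fraction=0.6)

# Hypothetical reference and predicted labels for a tiny evaluation set.
reference = ["lantana", "lantana", "forest", "forest"]
predicted = ["lantana", "forest", "forest", "forest"]
matrix = error_matrix(reference, predicted, classes=["lantana", "forest"])
accuracy = overall_accuracy(matrix)
```

Because the split is made within each class, every class contributes 60% of its samples to training and 40% to evaluation, which keeps rare classes represented in both subsets.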