Abstract:
The growing use of automated, sensor-based systems for real-time water quality monitoring both demands and makes possible the timely identification of unexpected values. Anomalies are often caused by technical issues, and at the rate at which data arrive, detecting problematic data manually is impractical. This article focuses on machine learning approaches to anomaly detection in water quality data. Four machine learning anomaly detection techniques for time series are analysed: the local outlier factor, the isolation forest, the extended isolation forest and the robust random cut forest. Subsets of data collected from sensors deployed at a water treatment plant in Nyeri, Kenya, were used to carry out extensive experiments with these techniques on the turbidity and pH parameters. The local outlier factor algorithm correctly detected all outliers in both subsets, unlike the other algorithms considered. In the primary experiment, the local outlier factor was also the fastest. It was also the easiest to use, provided that optimum parameters were selected. Moreover, the analysis of the four techniques demonstrated that machine learning, with or without training, is a powerful tool for water quality anomaly detection and hence a feasible approach.
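
As an illustration of the local outlier factor idea referred to above, the following is a minimal sketch in plain Python. It is not the implementation used in the experiments. The function name `local_outlier_factor`, the neighbourhood size `k=3`, the threshold of 2.0 and the pH-like readings are all assumptions made for illustration; the readings are synthetic and are not taken from the Nyeri data. The sketch also simplifies the standard definition by taking exactly `k` neighbours and ignoring ties.

```python
def local_outlier_factor(values, k=3):
    """Return one LOF score per reading (simplified, 1-D, exactly k neighbours).

    Scores close to 1 mean a point is about as dense as its neighbours;
    scores well above 1 suggest an outlier.
    """
    n = len(values)
    neighbours, k_dist = [], []
    for i in range(n):
        # Distances from reading i to every other reading, nearest first.
        dists = sorted((abs(values[i] - values[j]), j) for j in range(n) if j != i)
        neighbours.append([j for _, j in dists[:k]])
        k_dist.append(dists[k - 1][0])  # distance to the k-th nearest neighbour

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], abs(values[i] - values[j])) for j in neighbours[i]]
        lrd.append(1.0 / (sum(reach) / k + 1e-12))  # epsilon guards against zero

    # LOF: mean density of the neighbours divided by the point's own density.
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)]


# Synthetic pH-like readings (illustrative only); index 6 is an injected spike.
readings = [7.1, 7.2, 7.15, 7.05, 7.3, 7.25, 11.8, 7.18, 7.22, 7.08]
scores = local_outlier_factor(readings, k=3)

# Hypothetical decision rule: flag readings whose score exceeds 2.0.
flagged = [i for i, s in enumerate(scores) if s > 2.0]
print("flagged indices:", flagged)
```

Both the neighbourhood size and the threshold are the kind of parameters whose selection the abstract notes as a precondition for ease of use; the values chosen here suit only this toy series.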