Focus
Environmental Engineering, Machine Learning, Water Quality Prediction
Motivation
Sustainability, Accuracy, Data-Driven Modeling
About the project
This research compares how well machine learning and deep learning models predict dissolved oxygen (DO) levels in water bodies, a key indicator of aquatic health and environmental stability. As urbanization, pollution, and climate change increasingly threaten water systems, accurate DO forecasting has become vital for effective water resource management. The study compares six predictive models: three traditional machine learning approaches (Ridge Regression, Random Forest, and Histogram-based Gradient Boosting Regressor) and three deep learning architectures (Long Short-Term Memory, LSTM; Gated Recurrent Unit, GRU; and Temporal Convolutional Network, TCN).
The paper uses a publicly available dataset of water quality parameters such as turbidity, pH, salinity, and temperature, and evaluates each model with four statistical metrics: root mean squared error (RMSE), mean absolute error (MAE), R², and Nash-Sutcliffe Efficiency (NSE). Traditional models, particularly the tree-based methods, deliver strong baseline performance because they are robust and interpretable on smaller datasets. Deep learning models, on the other hand, show potential advantages in capturing complex temporal and nonlinear relationships, especially with larger and more continuous data streams.
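As an illustration of the four metrics named above, here is a minimal sketch in plain Python. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the paper. The summary does not say which definition of R² the study used, so this sketch assumes the squared Pearson correlation; if R² were instead computed as 1 - SSE/SST, it would be numerically identical to NSE.

```python
import math


def regression_metrics(observed, predicted):
    """Return RMSE, MAE, R² (squared Pearson r), and NSE for two equal-length lists."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and the same length")

    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]

    # Sum of squared errors between observations and predictions.
    sse = sum(e * e for e in errors)
    rmse = math.sqrt(sse / n)
    mae = sum(abs(e) for e in errors) / n

    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n

    # Total variability of the observations and of the predictions around their means.
    sst = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if sst == 0 or var_pred == 0:
        raise ValueError("metrics are undefined when observed or predicted values are constant")

    # Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches predicting the observed mean.
    nse = 1 - sse / sst

    # Assumption: R² taken as the squared Pearson correlation coefficient.
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    r2 = (cov * cov) / (sst * var_pred)

    return {"rmse": rmse, "mae": mae, "r2": r2, "nse": nse}


# Hypothetical DO readings in mg/L, for illustration only.
scores = regression_metrics([8.0, 7.5, 6.0, 9.0], [7.5, 7.5, 6.5, 8.5])
print(scores)
```

RMSE and MAE are in the same units as DO, while R² and NSE are unitless, which is one reason to report both kinds together when comparing models.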
The study concludes that while deep learning models such as LSTM and TCN exhibit superior temporal learning capabilities, traditional algorithms like Random Forest remain highly competitive when data is limited. This finding suggests that the most effective approach may depend less on model complexity than on matching the model's inductive bias to the nature of the data. Ultimately, the paper contributes to the growing field of environmental informatics by showing how data-driven prediction models can support sustainability goals and more adaptive, evidence-based water management systems.
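To make the point about inductive bias concrete: sequence models such as LSTM, GRU, and TCN typically consume ordered windows of past readings rather than independent rows. The sketch below shows one common way to build such windows. The helper `make_windows` and its `window` and `horizon` parameters are hypothetical, and the summary does not describe how the paper actually prepared its inputs.

```python
def make_windows(series, window, horizon=1):
    """Split a time-ordered series into (input window, future target) pairs.

    Each input holds `window` consecutive readings; the target is the reading
    `horizon` steps after the end of that window.
    """
    if window < 1 or horizon < 1:
        raise ValueError("window and horizon must be positive integers")

    inputs, targets = [], []
    # The last valid start leaves room for the full window plus the forecast horizon.
    for start in range(len(series) - window - horizon + 1):
        end = start + window
        inputs.append(series[start:end])
        targets.append(series[end + horizon - 1])
    return inputs, targets


# Hypothetical DO readings in time order, for illustration only.
do_readings = [7.9, 8.1, 8.0, 7.6, 7.4]
x, y = make_windows(do_readings, window=3)
print(x)  # [[7.9, 8.1, 8.0], [8.1, 8.0, 7.6]]
print(y)  # [7.6, 7.4]
```

Windowing shrinks the number of training examples relative to the raw series, which may be part of why tabular models stay competitive on short records.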