Initial Experiments with a Scalable Machine Learning Based Approach for Downscaling the MOD16A2 Evapotranspiration Product

Objective

Over the years, cropping has intensified greatly, and with it the water used for cropping. Evapotranspiration (ET) is a good proxy for crop water usage in different regions. The finest ET product that is currently available and accessible, provided by MODIS, has a resolution of 500 m. Our model downscales it to 30 m, which lets us observe finer relative differences in ET values. ET values at this finer resolution are needed for better local decision making.

Method

In our research, we focus on different regions of India, defined in terms of Agro-Ecological Zones (AEZs).

As features for our model, we use satellite data from Landsat and meteorological variables from NASA's GLDAS system. These are available at different spatial and temporal resolutions.

To align them spatially, all variables were first brought to a resolution of 500 meters by aggregating their rasters. This resolution was chosen because the MODIS dataset is available at 500 meters. To align them temporally, we only use data points for which both MODIS and Landsat data are available and for which data acquisition for their rasters begins on the same day.
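The two alignment steps can be illustrated with a minimal sketch. It assumes rasters are already loaded as NumPy arrays on a common grid and that an integer aggregation factor is acceptable; since 500 m is not an integer multiple of 30 m, a real pipeline would need a proper resampling tool instead. The function names `block_mean` and `common_start_dates`, and all dates below, are hypothetical and not taken from our code.

```python
from datetime import date

import numpy as np


def block_mean(raster, factor):
    """Aggregate a 2-D raster by averaging non-overlapping factor x factor blocks.

    Rows and columns that do not fill a complete block are dropped.
    """
    rows, cols = raster.shape
    rows -= rows % factor
    cols -= cols % factor
    trimmed = raster[:rows, :cols]
    # Split each axis into (number of blocks, block size), then average each block.
    blocks = trimmed.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))


def common_start_dates(modis_dates, landsat_dates):
    """Keep only the dates on which both products have a raster starting."""
    return sorted(set(modis_dates) & set(landsat_dates))


# Toy 4 x 4 "fine" raster aggregated by a factor of 2 (illustrative only).
fine = np.arange(16, dtype=float).reshape(4, 4)
coarse = block_mean(fine, 2)

# Hypothetical acquisition start dates for the two products.
modis_dates = [date(2022, 1, 1), date(2022, 1, 9), date(2022, 1, 17)]
landsat_dates = [date(2022, 1, 9), date(2022, 1, 25)]
matched = common_start_dates(modis_dates, landsat_dates)
```

Only the matched dates would contribute training samples in this sketch; everything else is discarded.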

We then train a random forest (RF) model using these input features on the MODIS dataset. For inference, we bring the same input features to a resolution of 30 meters and provide them as input to the trained RF model, which returns an output raster at a resolution of 30 meters.
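A minimal sketch of this train-coarse, predict-fine idea is shown below, using scikit-learn's `RandomForestRegressor` on synthetic data. The number of features, the synthetic relationship between features and ET, and the hyperparameters are all assumptions made for illustration; they are not the settings used in our experiments.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_features = 4  # assumed number of input features, for illustration only

# Synthetic training set standing in for 500 m pixels:
# one row of features per pixel, with the coarse ET value as target.
X_coarse = rng.random((500, n_features))
et_coarse = 10 * X_coarse[:, 0] + 5 * X_coarse[:, 1] + rng.normal(0, 0.1, 500)

model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X_coarse, et_coarse)

# Synthetic feature stack standing in for the same features at 30 m:
# shape is (rows, cols, n_features).
fine_stack = rng.random((60, 60, n_features))
rows, cols, _ = fine_stack.shape

# Flatten to one row per pixel, predict, and reshape back into a raster.
et_fine = model.predict(fine_stack.reshape(-1, n_features)).reshape(rows, cols)
```

Because a random forest averages training targets within its leaves, the predicted 30 m values cannot fall outside the range of ET values seen at 500 m during training, which is one reason absolute values may need separate calibration.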

**Methodology of Downscaling ET values**

Results and Conclusions

The downscaling model was evaluated for AEZs 10, 11, 12 and 13. For each AEZ, we trained on data from the years 2016-21 and validated on data from the year 2022, with all points sampled from that single AEZ; we refer to this as the single-AEZ model. For AEZ-10, this model shows an R2 score of 0.88, an RMSE of 4.65 and an NRMSE of 0.39.
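For reference, the three metrics can be computed as in the sketch below. R2 and RMSE follow their standard definitions. NRMSE has several common normalizations (mean, range or standard deviation of the observations); this summary does not say which one was used, so the `norm` parameter and its default are assumptions. The toy numbers are illustrative and unrelated to the reported scores.

```python
import numpy as np


def r2_score(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return float(1 - ss_res / ss_tot)


def rmse(obs, pred):
    """Root mean squared error."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))


def nrmse(obs, pred, norm="mean"):
    """RMSE divided by a normalizer; the choice of normalizer is an assumption."""
    obs = np.asarray(obs, float)
    if norm == "mean":
        scale = obs.mean()
    elif norm == "range":
        scale = obs.max() - obs.min()
    elif norm == "std":
        scale = obs.std()
    else:
        raise ValueError(f"unknown normalization: {norm}")
    return rmse(obs, pred) / float(scale)


# Toy observed and predicted values (illustrative only).
obs = [10.0, 20.0, 30.0, 40.0]
pred = [12.0, 18.0, 33.0, 37.0]
scores = (r2_score(obs, pred), rmse(obs, pred), nrmse(obs, pred))
```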

We also trained the model on data from the years 2016-21 pooled from all the AEZs (10-13) and validated it on data taken from particular AEZs, to see the effect of more diverse training data. This pooled model shows an R2 score of 0.87, an RMSE of 4.62 and an NRMSE of 0.39, which is similar to the single-AEZ model. This suggests that a single model can be built for multiple AEZs, which makes the process less computationally expensive.
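The difference between the two setups comes down to which samples enter the training set. The sketch below shows one way to express that split; the record layout (dictionaries with `aez` and `year` keys) and the function name `split_samples` are hypothetical.

```python
def split_samples(samples, train_years, val_year, val_aez, pooled):
    """Return (train, validation) lists for one validation AEZ.

    pooled=False: train only on samples from val_aez (single-AEZ setup).
    pooled=True:  train on samples from every AEZ (pooled setup).
    Validation always uses val_year samples from val_aez only.
    """
    train = [
        s for s in samples
        if s["year"] in train_years and (pooled or s["aez"] == val_aez)
    ]
    val = [s for s in samples if s["year"] == val_year and s["aez"] == val_aez]
    return train, val


# Hypothetical records: one per AEZ and year, standing in for real pixel samples.
samples = [
    {"aez": aez, "year": year}
    for aez in (10, 11, 12, 13)
    for year in range(2016, 2023)
]
train_years = range(2016, 2022)  # 2016 through 2021

single_train, single_val = split_samples(samples, train_years, 2022, 10, pooled=False)
pooled_train, pooled_val = split_samples(samples, train_years, 2022, 10, pooled=True)
```

In both cases the validation set is identical, so the scores of the single-AEZ and pooled models are directly comparable.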

Overall, we found that the downscaling model captures the trends, seasonality, and other patterns present in evapotranspiration data, and can be used for relative comparison at the very least. However, the outputs did not closely match in situ lysimeter data collected by IMD stations, so the downscaling still needs work before it gives accurate absolute values of ET. Possible reasons include the need for better calibration methods, meteorological variables that are too coarse to capture fine-scale local variations, and perhaps poor-quality in situ data.

More details can be found in our paper: Initial Experiments with a Scalable Machine Learning Based Approach for Downscaling the MOD16A2 Evapotranspiration Product - V. Jingar, S. Sahoo, Siddharth S., D. Sharma, S.A. Mehta, and A. Seth. ACM COMPASS, 2024.

The code and data for downscaling of ET are available on GitHub here, and for calibration of ET here.