Action

ACTION!

In this lesson, we generate the MONICA files needed to simulate the yield using the traditional techniques. The first step is to generate the input data; the model is then calibrated by changing the cultivar parameters. Finally, we generate the simulated data that assist the DSM procedure. For a deeper discussion of model calibration, we recommend Wallach et al. (2021).

Step 1:

The first step is to generate the MONICA files required for the crop yield simulation.

To create MONICA files for all the selected locations, we recommend first creating one template by hand that contains the characteristics shared by the other profiles. Once the files are set up for the first location, it becomes easier to adapt them to the other locations.

The script for this stage is 2.1_generate_mon_files.py. It copies the weather file and the template files located in ./monica_files/, applies the changes, and saves the results in the folder ./monica_files/local_files_fit/. The template files are crop_template.json, sim_template.json, and site_template.json.
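The copy-and-modify pattern described above can be sketched as follows. This is only an illustration under stated assumptions, not the contents of 2.1_generate_mon_files.py: the function `make_site_file`, the JSON keys `Latitude`, `Slope`, and `Layers`, and the location records are hypothetical placeholders, and a temporary directory stands in for the real output folder.

```python
import copy
import json
import tempfile
from pathlib import Path


def make_site_file(template: dict, location: dict, out_dir: Path) -> Path:
    """Write one location-specific JSON file derived from a shared template."""
    site = copy.deepcopy(template)  # never mutate the shared template
    # Hypothetical keys: overwrite only the values that differ per location.
    site["Latitude"] = location["lat"]
    site["Slope"] = location["slope"]
    out_path = out_dir / f"site_{location['id']}.json"
    out_path.write_text(json.dumps(site, indent=2))
    return out_path


# Stand-in template and locations (illustrative values only).
template = {"Latitude": 0.0, "Slope": 0.0, "Layers": ["unchanged"]}
locations = [
    {"id": 1, "lat": 52.5, "slope": 0.01},
    {"id": 2, "lat": 52.6, "slope": 0.03},
]

out_dir = Path(tempfile.mkdtemp())  # stands in for the real output folder
written = [make_site_file(template, loc, out_dir) for loc in locations]
```

Working on a deep copy keeps the template intact, so every location starts from the same baseline and only the listed fields differ between the generated files.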

These files already contain the proper fertilizers and layer configurations, among other small changes to the original files that are downloaded with MONICA.

We encourage participants to open the template JSON files and the generated JSON files, and to compare them with the file ./data/true_data/selection_GIS_32633_ws.csv. A detailed explanation of the content of the files is available on the MONICA webpage.

Step 2:

In this step, we fit the parameters to calibrate the model so that it predicts the yield more accurately. The LMA algorithm is used to estimate 7 variables for winter rye: the base temperature for the crop and cultivar, and the temperature sums of the phenological stages (5 stages).

The boundaries for this step are critical and must be set with careful study; otherwise, overfitting is likely. The script for this stage is 2.2.a_monica_fitting.py. The weights for the OBJ and the initial guess of the parameters (required by this algorithm) are also set at this stage. The respective variables are bounds for the parameter boundaries, weights for the weights, and params_ini for the initial parameter set.

The weights variable is a list containing 2 values: the first relates to the yield, and the second to the correlation between the LAI and the NDWI.
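As a rough sketch of how such a two-term weighting might enter an objective function, consider the example below. The exact form of the OBJ used in 2.2.a_monica_fitting.py is not given here, so the squared relative yield error, the term `1 - r` for the correlation, and all numeric values (including the two-parameter `params_ini` and `bounds`) are assumptions chosen purely for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def objective(sim_yield, obs_yield, sim_lai, obs_ndwi, weights):
    """Weighted sum of a yield term and a LAI/NDWI correlation term (assumed form)."""
    yield_term = ((sim_yield - obs_yield) / obs_yield) ** 2
    corr_term = 1.0 - pearson(sim_lai, obs_ndwi)  # close to 0 when strongly correlated
    return weights[0] * yield_term + weights[1] * corr_term


weights = [0.7, 0.3]                  # illustrative: [yield, LAI-NDWI correlation]
params_ini = [1.0, 150.0]             # hypothetical initial guess
bounds = [(0.0, 5.0), (50.0, 400.0)]  # hypothetical (lower, upper) per parameter

score = objective(
    sim_yield=5.5, obs_yield=5.0,
    sim_lai=[0.5, 1.0, 2.0, 3.0], obs_ndwi=[0.1, 0.2, 0.4, 0.6],
    weights=weights,
)
```

Raising the first weight makes the fit favor matching the observed yield, while raising the second favors agreement between the simulated LAI dynamics and the NDWI series.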

After the fitting, the fitted parameters are saved in ./data/fit_params/fit_params.json.

Note: As explained in the theory section, the LMA algorithm requires the derivative of MONICA's output with respect to each parameter. Since this derivative cannot be obtained analytically, it is approximated numerically:

\[ \frac{dy}{dp} \approx \frac{f\left( p + h \right) - f\left( p \right)}{h}\]

where \(p\) is the parameter, \(y\) the MONICA output, \(h\) a very small disturbance in the parameter, and \(y = f(p)\).

Consequently, MONICA must be run twice to obtain each numerical derivative: once at \(p\) and once at \(p + h\). Even though MONICA must run twice for each iteration, the LMA algorithm remains faster than most alternative optimization methods.
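The forward-difference approximation can be illustrated with a minimal sketch. Here `toy_model` is a hypothetical stand-in for a MONICA run (a simple analytic function, not the real model), and the call counter only serves to show that each derivative costs two model evaluations.

```python
def forward_difference(f, params, index, h=1e-6):
    """Approximate the partial derivative of f with respect to params[index]."""
    shifted = list(params)
    shifted[index] += h                  # disturb a single parameter by h
    return (f(shifted) - f(params)) / h  # two evaluations of f


calls = 0


def toy_model(params):
    """Hypothetical stand-in for one MONICA run; counts how often it is invoked."""
    global calls
    calls += 1
    base_temp, temp_sum = params
    return 0.01 * temp_sum - 0.5 * base_temp ** 2


# Derivative with respect to the first parameter at (1.0, 150.0).
grad = forward_difference(toy_model, [1.0, 150.0], index=0)
```

For this toy function the exact derivative with respect to the first parameter is `-base_temp`, so `grad` should be close to -1.0. The choice of `h` is a trade-off: too large and the approximation is biased, too small and floating-point round-off dominates.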

Step 3:

After the fitting, the same algorithm is run again with small changes, this time only once, to generate the output for each selected location. This allows us to calculate the RMSE between the model output (prediction) and the observed ground values (such as crop yield). The script file is named 2.2.b_monica_fitting.py, and the output folder containing all the simulated results is ./monica_files/local_files_fit/results/.
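The RMSE between predictions and observations can be computed as in the short sketch below. The yield values are made up for illustration and do not come from the tutorial data; in practice they would be read from the simulated results and the observed ground values.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between two equally long sequences."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative values only (for example, yield per location).
simulated_yield = [5.0, 6.0, 7.0]
observed_yield = [4.0, 6.0, 9.0]

error = rmse(simulated_yield, observed_yield)
```

The RMSE is expressed in the same unit as the compared variable, which makes it easy to judge against the typical yield level of the locations.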

In this tutorial, the MONICA files are configured to output the data as CSV files.

Tasks
  1. The fitted parameters are the base temperature and the phenological stage temperature sums 1, 2, 3, 4, and 5. Why were these parameters selected? Are there other good options? (This specific task requires extra time to complete!)
  2. Which other methods could be used to fit the parameters in the presented scenario?