2.5.

Reference data for validation

(Approx. 30 min reading incl. references)

To validate EO-derived phenometrics, reference data from ground observations are needed. Different data sources can be used to validate phenology, such as ground phenological observations, PhenoCams or data from national or international networks such as PEP (http://www.pep725.eu/) (Tian et al. 2021). One of the most valuable references for phenological applications across Germany is the database maintained by the German Meteorological Service (DWD) (Kaspar et al. 2015). This database contains phenological observations from 6504 sites, collected by volunteers who visit the same field 3-5 times per week throughout the growing season. Because data collection relies on volunteers, the records of some stations may be irregular or discontinuous and may contain geolocation errors. To overcome these limitations, the PHASE dataset (Möller et al. 2023), a Germany-wide and spatiotemporally consistent time series of interpolated DWD phenological phases on a 1 × 1 km grid, is available in the BonaRes Repository (Specka et al. 2019). This dataset is used for relative comparisons with the derived LSP metrics.

Another reference widely used for comparative studies is the collection of products from the Copernicus Land Monitoring Service (CLMS), derived from Sentinel-2 imagery at 10 m spatial resolution (Smets et al. 2021). Here, the annual products of the Pan-European High-Resolution Vegetation Phenology and Productivity (HR-VPP) suite are used for the phenometrics comparison, with the extracted LSP metrics given in DOY for SoS (HR-VPP SOSD), PoS (HR-VPP MAXD) and EoS (HR-VPP EOSD).

Figure 7: Comparison of the derived SoS with the references (HR-VPP SOSD and DWD emergence stage), exemplified for a selected field in the federal state of Lower Saxony, Germany; Source: Main-Knorn et al., 2024.

Performance of data fusion

The performance of the S2-PS fusion methods is assessed using the Mean Squared Error (MSE), calculated as the average of the squared differences between the reference VI values (\(yt_i\)) and the predicted VI values (\(yp_i\)):

\[ MSE = \frac{1}{n} \sum_{i=1}^{n} (yt_i - yp_i)^2\]
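The MSE formula above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `mean_squared_error` and the VI values are hypothetical and not taken from the study.

```python
def mean_squared_error(reference, predicted):
    """Average of squared differences between reference and predicted VI values."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    if len(reference) == 0:
        raise ValueError("at least one value pair is required")
    n = len(reference)
    return sum((yt - yp) ** 2 for yt, yp in zip(reference, predicted)) / n


# Hypothetical VI values for five pixels (illustration only, not real data).
reference_vi = [0.25, 0.5, 0.75, 0.5, 0.25]
predicted_vi = [0.25, 0.75, 0.5, 0.5, 0.0]

mse = mean_squared_error(reference_vi, predicted_vi)
print(f"MSE = {mse:.4f}")
```

Because the differences are squared, a few pixels with large fusion errors dominate the result, which is why the MSE is usually read together with a spatial view of the errors.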

Additionally, absolute error maps are generated to visually assess the spatial distribution of fusion errors. To determine whether the RF or the RF-pp approach provides higher fusion performance, the Structural Similarity Index (SSIM) is calculated for randomly selected image pairs. SSIM accounts for the similarity in luminance, contrast and structure between the original and the synthetic image (Wang et al. 2004). It measures the fidelity of two compared images, with values ranging from −1 to 1, where 1 indicates perfect structural similarity.
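As a rough illustration of these two checks, the sketch below computes a per-pixel absolute error map and a simplified SSIM in plain Python. Several things here are assumptions rather than the procedure used in the study: the SSIM is evaluated once over the whole image instead of in local sliding windows as proposed by Wang et al. (2004), the VI values are assumed to be scaled to the range 0 to 1 (`data_range=1.0`), and the function names and pixel values are hypothetical. The constants `k1=0.01` and `k2=0.03` are the defaults from the original SSIM paper.

```python
from statistics import fmean


def absolute_error_map(original, synthetic):
    """Per-pixel absolute error between two images given as lists of rows."""
    return [[abs(o - s) for o, s in zip(row_o, row_s)]
            for row_o, row_s in zip(original, synthetic)]


def global_ssim(original, synthetic, data_range=1.0, k1=0.01, k2=0.03):
    """Simplified SSIM using one window that covers the whole image."""
    x = [v for row in original for v in row]
    y = [v for row in synthetic for v in row]
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("images must be non-empty and of equal size")
    # Small constants that stabilise the division for flat image regions.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    mu_x, mu_y = fmean(x), fmean(y)
    var_x = fmean([(a - mu_x) ** 2 for a in x])
    var_y = fmean([(b - mu_y) ** 2 for b in y])
    cov_xy = fmean([(a - mu_x) * (b - mu_y) for a, b in zip(x, y)])
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical 2 x 2 VI images (illustration only, not real data).
original_img = [[0.5, 0.25], [0.75, 1.0]]
synthetic_img = [[0.5, 0.5], [0.5, 1.0]]

error_map = absolute_error_map(original_img, synthetic_img)
ssim_value = global_ssim(original_img, synthetic_img)
print(error_map)
print(f"global SSIM = {ssim_value:.3f}")
```

For real imagery a windowed implementation is the more common choice, for example `structural_similarity` in scikit-image, since local windows capture where in the scene the structure is degraded.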

Validation of phenometrics

The LSP metrics derived from both the S2 and the Fused datasets are compared with the references, namely the interpolated DWD BonaRes dataset and the HR-VPP products. The comparison combines a visual inspection of the spatiotemporal patterns with a statistical analysis based on the Root Mean Squared Error (RMSE), the Mean Absolute Error (MAE) and the bias.

\[ MAE = \frac{1}{n} \sum_{i=1}^{n} \left| xt_i - xp_i \right|\]

The MAE is the average of the absolute differences between the reference DOY values (\(xt_i\)) and the predicted values (\(xp_i\)). It weights all deviations equally and is therefore less sensitive to outliers than MSE and RMSE. Because the phenological stages defined by the DWD are associated with BBCH codes, the following phenometrics and corresponding codes are evaluated:

Phenometrics	Phenological phase	DWD_id	BBCH_id
SoS	Emergence / Beginning of growth	12	10
	Beginning of shooting/Stem elongation	67	31
PoS	Tip of tassel emergence	65	53
EoS	Early dough ripening	20	83
	Harvest	24	99

Table 3: LSP metrics associated with the phenological stages of maize, as defined by the DWD and BBCH codes
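The three error statistics used for the phenometrics validation (RMSE, MAE and bias) can be sketched as follows. This is a minimal example under stated assumptions: the bias is taken as the mean of predicted minus reference DOY, so a positive value means the satellite-derived date is later than the reference; the source does not state its sign convention. The function name and the DOY values are hypothetical.

```python
from math import sqrt


def error_statistics(reference_doy, predicted_doy):
    """Return RMSE, MAE and bias (in days) for paired DOY values."""
    if len(reference_doy) != len(predicted_doy):
        raise ValueError("reference and predicted must have the same length")
    if len(reference_doy) == 0:
        raise ValueError("at least one value pair is required")
    # Assumed sign convention: predicted minus reference.
    diffs = [xp - xt for xt, xp in zip(reference_doy, predicted_doy)]
    n = len(diffs)
    return {
        "rmse": sqrt(sum(d * d for d in diffs) / n),
        "mae": sum(abs(d) for d in diffs) / n,
        "bias": sum(diffs) / n,
    }


# Hypothetical SoS dates (DOY) for four fields (illustration only).
reference_sos = [120, 125, 130, 135]
predicted_sos = [123, 121, 130, 140]

stats = error_statistics(reference_sos, predicted_sos)
print(stats)
```

In this example the RMSE is larger than the MAE because squaring gives the 5-day deviation more weight, which illustrates why the MAE is the less outlier-sensitive of the two.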

Impact of method selection on LSP accuracy

To assess how the temporal density of the data and the choice of retrieval method affect the accuracy of the phenometrics, the averaged differences in the SoS and EoS error metrics between the Fused and S2 datasets (Fused – S2) are calculated. These differences, along with the percentage improvement (or decline) in accuracy for each retrieval method, are visualized using box plots to highlight the distribution and spread of the error metrics. Negative differences mean that the error is lower with the Fused dataset, i.e. that performance has improved.

\[ \text{Improvement }(\%) = \frac{\text{S2 metric} - \text{Fused metric}}{\text{S2 metric}} \times 100\]
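A small sketch of both quantities, the Fused – S2 difference and the percentage improvement, is given below. The retrieval method names and RMSE values are hypothetical placeholders, not results from the study.

```python
def fused_minus_s2(s2_metric, fused_metric):
    """Difference in an error metric; negative means Fused has the lower error."""
    return fused_metric - s2_metric


def improvement_percent(s2_metric, fused_metric):
    """Relative change of an error metric; positive means Fused improved on S2."""
    if s2_metric == 0:
        raise ValueError("S2 metric must be non-zero")
    return (s2_metric - fused_metric) / s2_metric * 100


# Hypothetical RMSE values in days for two placeholder retrieval methods.
s2_rmse = {"method_a": 10.0, "method_b": 8.0}
fused_rmse = {"method_a": 7.5, "method_b": 9.0}

results = {
    name: {
        "difference": fused_minus_s2(s2_rmse[name], fused_rmse[name]),
        "improvement_pct": improvement_percent(s2_rmse[name], fused_rmse[name]),
    }
    for name in s2_rmse
}
for name, values in results.items():
    print(name, values)
```

The two quantities have opposite signs by construction: a negative difference corresponds to a positive percentage improvement, and a positive difference to a decline.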

References:

Kaspar, F., Zimmermann, K., Polte-Rudolf, C. (2015) An overview of the phenological observation network and the phenological database of Germany's national meteorological service (Deutscher Wetterdienst). Adv. Sci. Res. 11, 93-99.
Möller, M., Beyer, F., Dierks, M., Horney, P., Baumann, P., Svoboda, N., Gerstmann, H. (2023) Germany-wide time series of interpolated phenological observations of main crop types between 1993 and 2021.
Smets, B., Cai, Z., Eklundh, L., Tian, F., Bonte, K., Van Hoolst, R., Van De Kerchove, R., A., S., De Roo, B., Jacobs, T., Camacho, F., Sánchez-Zapero, J., Swinnen, E., Scheifinger, H., Hufkens, K., Jönsson, P. (2021) HR-VPP Product User Manual Seasonal Trajectories and VPP parameters.
Specka, X., Gärtner, P., Hoffmann, C., Svoboda, N., Stecker, M., Einspanier, U., Senkler, K., Zoarder, M.A.M., Heinrich, U. (2019) The BonaRes metadata schema for geospatial soil-agricultural research data - Merging INSPIRE and DataCite metadata schemes. Computers & Geosciences 132, 33-41.
Tian, F., Cai, Z.Z., Jin, H.X., Hufkens, K., Scheifinger, H., Tagesson, T., Smets, B., Van Hoolst, R., Bonte, K., Ivits, E., Tong, X.Y., Ardö, J., Eklundh, L. (2021) Calibrating vegetation phenology from Sentinel-2 using eddy covariance, PhenoCam, and PEP725 networks across Europe. Remote Sensing of Environment 260.
Wang, Z., et al. (2004) Image quality assessment: From error visibility to structural similarity. IEEE Transactions on Image Processing 13(4), 600-612.