Module 1

Data Collection and Pre-Processing

Objectives: Gather, prepare, and preprocess all spatial and ground-truth data required for DSM.
1. Data Sources
  • Field data: crop, management, fertilizer types and application, yield (if available).
  • Weather data: from DWD (Deutscher Wetterdienst).
  • Satellite data: Sentinel-2 Level-2A (S2L).
  • Auxiliary data: elevation (DEM), land use, boundaries.
2. Spatiotemporal Processing of Sentinel-2
  • Filter: Remove cloud-contaminated and low-quality pixels.
  • Temporal smoothing: Apply the Whittaker smoother to each pixel over time.
  • Spatial smoothing: Apply a Gaussian filter to reduce local noise.

Output: Daily, 10 m resolution crop growth indicators from sowing to harvest.
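
The filtering and spatial smoothing steps could be sketched as follows. This is a minimal sketch with NumPy, not the course implementation: it assumes the Level-2A Scene Classification (SCL) layer is used for quality filtering, and the set of rejected classes (`BAD_SCL`), the function names, and the default `sigma` are illustrative choices. The Gaussian filter assumes a gap-free 2-D image, which holds once temporal smoothing has filled the missing dates.

```python
import numpy as np

# SCL classes treated as unusable in this sketch (an assumption, adjust as needed):
# 0 no data, 1 saturated/defective, 3 cloud shadow, 8 cloud (medium probability),
# 9 cloud (high probability), 10 thin cirrus, 11 snow or ice.
BAD_SCL = (0, 1, 3, 8, 9, 10, 11)


def valid_mask(scl):
    """Return True where a pixel is considered usable according to the SCL layer."""
    return ~np.isin(scl, BAD_SCL)


def gaussian_kernel(sigma, radius):
    """1-D Gaussian kernel of length 2 * radius + 1, normalized to sum to 1."""
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()


def gaussian_smooth(img, sigma=1.0):
    """Separable Gaussian filter on a 2-D array, with edge padding."""
    radius = int(3 * sigma + 0.5)  # truncate the kernel at about 3 sigma
    k = gaussian_kernel(sigma, radius)
    padded = np.pad(np.asarray(img, dtype=float), radius, mode="edge")
    # Filter along rows first, then along columns.
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)
```

Pixels where `valid_mask` is False would receive weight 0 in the temporal smoothing; `sigma` is expressed in pixels, so `sigma=1.0` corresponds to 10 m on the 10 m grid.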

WHITTAKER SMOOTHING THEORY

The Whittaker smoother (Eilers, 2003) is a method used to remove noise from time-series data while preserving the main signal trend. It is especially effective for satellite-derived vegetation indices (e.g., NDVI, NDWI) that are irregularly sampled or affected by clouds.

The method minimizes a penalized least-squares objective function that balances two opposing goals:

(1) fidelity to the observed data, and (2) smoothness of the resulting curve.

Formally:

\[ \hat{z} = \mathop{\arg\min}\limits_{z} \left\{ \sum_{i=1}^{n} w_i \left( y_i - z_i \right)^2 + \lambda \sum_{i=3}^{n} \left( \Delta^2 z_i \right)^2 \right\}\]

where:

  • \(\hat{z}\) is the resulting smoothed series
  • \(y_i\) are the observed values
  • \(z_i\) are the smoothed values to be estimated
  • \(w_i\) are weights (e.g., \(0\) for invalid or cloudy pixels, \(1\) for valid data)
  • \(\Delta^2 z_i = z_i - 2 z_{i-1} + z_{i-2}\) is the second-order difference (a measure of curvature), defined for \(i \geq 3\)
  • \(\lambda\) is the smoothing parameter that controls the trade-off between smoothness and fidelity

Low \(\lambda\): curve follows the data closely (less smoothing).
High \(\lambda\): curve becomes smoother but may lose local detail.

The method is deterministic, fast, and robust to gaps in the data.

It is particularly suitable for remote sensing time series, as it can interpolate missing dates and suppress short-term fluctuations caused by atmospheric noise or sensor artifacts.

In this course, the Whittaker smoother is applied temporally to each pixel of the Sentinel-2 time series (NDWI and BSI), providing continuous daily values between sowing and harvest.
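
Setting the gradient of the objective to zero yields the linear system \((W + \lambda D^{\top} D)\, z = W y\), where \(W\) is the diagonal matrix of weights and \(D\) is the second-difference matrix. The sketch below solves that system for one pixel with NumPy. It is a minimal illustration, not the course implementation: the function name, the dense solver, and the synthetic data are assumptions, and the series is assumed to lie on a regular daily grid with at least two valid observations.

```python
import numpy as np


def whittaker_smooth(y, w, lam=10.0):
    """Whittaker smoother with second-order differences on a regular grid.

    y   : observations, one per day (values where w == 0 are ignored)
    w   : weights, 0 for missing or cloudy days, 1 for valid days
    lam : smoothing parameter lambda
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    n = y.size
    # D is the (n - 2) x n second-difference matrix.
    D = np.diff(np.eye(n), n=2, axis=0)
    # Normal equations of the penalized least-squares objective.
    A = np.diag(w) + lam * D.T @ D
    # Replace gap values (possibly NaN) by 0; their weight is 0 anyway.
    y_filled = np.where(w > 0, y, 0.0)
    return np.linalg.solve(A, w * y_filled)


# Synthetic example (illustrative values only): a linear trend observed every 5th day.
days = np.arange(30)
truth = 0.2 + 0.02 * days
w = np.zeros(30)
w[::5] = 1.0
y = np.where(w > 0, truth, np.nan)
z = whittaker_smooth(y, w, lam=100.0)  # daily values, gaps interpolated
```

Because a straight line has zero second differences, the smoother reproduces this synthetic trend on every day, including the gaps. For full images, a sparse or banded solver would be a more practical choice than the dense matrix used here.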

3. Derived Indices

Compute indices such as:

  • BSI (Bare Soil Index): derived from bare soil scenes (no crop).
  • NDWI (Normalized Difference Water Index): proxy for canopy water content.

For Sentinel-2 Level-2A, the BSI and NDWI formulas use bands B02, B04, B08, and B11 (blue, red, near-infrared, and shortwave infrared, respectively):

\[ BSI = \frac{(B_{11} + B_{04}) - (B_{08} + B_{02})}{(B_{11} + B_{04}) + (B_{08} + B_{02})}\]

\[ NDWI = \frac{(B_{08} - B_{11})}{(B_{08} + B_{11})}\]

Attendees are expected to be familiar with the Sentinel-2 Level-2A (S2L) products. For reference, the official European Space Agency webpage is: https://www.esa.int/Applications/Observing_the_Earth/Copernicus/Sentinel-2
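
The two index formulas can be evaluated per pixel as in the following sketch. It is a minimal NumPy illustration under stated assumptions: the function names are hypothetical, the inputs are surface reflectance arrays already resampled to a common grid (B11 is delivered at 20 m, the other three bands at 10 m), and pixels with a zero denominator are set to NaN.

```python
import numpy as np


def normalized_ratio(a, b):
    """Return (a - b) / (a + b), with NaN where the denominator is zero."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    den = a + b
    out = np.full(den.shape, np.nan)
    np.divide(a - b, den, out=out, where=den != 0)
    return out


def bsi(b02, b04, b08, b11):
    """Bare Soil Index from blue, red, near-infrared and shortwave infrared."""
    b02, b04, b08, b11 = (np.asarray(b, dtype=float) for b in (b02, b04, b08, b11))
    return normalized_ratio(b11 + b04, b08 + b02)


def ndwi(b08, b11):
    """Normalized Difference Water Index from near-infrared and shortwave infrared."""
    return normalized_ratio(b08, b11)


# Illustrative reflectance values for two pixels (not real data).
b02 = np.array([0.05, 0.0])
b04 = np.array([0.10, 0.0])
b08 = np.array([0.30, 0.0])
b11 = np.array([0.20, 0.0])
bsi_values = bsi(b02, b04, b08, b11)
ndwi_values = ndwi(b08, b11)
```

Both indices are bounded between -1 and 1 for non-negative reflectances; the second pixel, with all bands at zero, yields NaN.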

4. Sampling Design

Select points representing field variability:

  • Extremes and averages: choose locations at local maxima, local minima, and near-mean values of the NDWI, BSI, and elevation maps.
  • Start with about 20 points; more than 50 is preferable for higher accuracy.
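
A first set of candidate locations could be derived as in the sketch below. It is a simplified illustration, not the course procedure: the function name and dictionary layout are hypothetical, and it picks only the global minimum, the global maximum, and the pixel closest to the mean of each layer, whereas the selection described above also considers local extremes. Pixels outside the field are assumed to be NaN.

```python
import numpy as np


def candidate_points(layers):
    """For each 2-D layer, return the (row, col) of its minimum, its maximum,
    and the pixel closest to its mean. NaN pixels are ignored."""
    points = {}
    for name, arr in layers.items():
        arr = np.asarray(arr, dtype=float)
        flat_picks = {
            "min": np.nanargmin(arr),
            "max": np.nanargmax(arr),
            "mean": np.nanargmin(np.abs(arr - np.nanmean(arr))),
        }
        # Convert flat indices back to (row, col) pairs of plain ints.
        points[name] = {
            key: tuple(int(i) for i in np.unravel_index(idx, arr.shape))
            for key, idx in flat_picks.items()
        }
    return points


# Tiny illustrative layer (not real data); NaN marks a pixel outside the field.
example = {"ndwi": np.array([[0.1, 0.5], [0.3, np.nan]])}
picks = candidate_points(example)
```

With three layers this gives at most nine locations, so reaching the recommended number of points would require adding local extremes or intermediate values.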

For each point, collect (or assume) soil data at three depths:

  • 0.0–0.3 m
  • 0.3–0.6 m
  • 0.6–0.9 m

Laboratory analysis: texture (sand, clay) and soil organic carbon (SOC).

Outcome: A set of spatially distributed ground-truth soil profiles linked to satellite indicators.