When Can We Predict Fire Severity?

A temporal analysis of the 2025 Grampians (Gariwerd) bushfire

After every major Australian bushfire, land managers face the same urgent question: where did the fire burn most severely? The answer drives everything from emergency erosion control and water catchment protection to wildlife rescue and seed dispersal. But the standard tool — a satellite severity map — takes 2–4 weeks of waiting for cloud-free post-fire imagery. During that window, critical decisions are made with incomplete information.

What if we didn’t have to wait? What can landscape data, weather records, and real-time satellite hotspots tell us before the smoke clears? And once we do get post-fire imagery, does a single spectral index work everywhere, and how much does the ecological context matter?

In December 2024, dry lightning ignited a bushfire that burned approximately 135,000 hectares across the Grampians National Park (Gariwerd). A second lightning-ignited wave on 27 January 2025 extended the fire into the southern ranges, ultimately burning ~80% of the national park. This analysis tracks which data sources become available at each stage of the fire — from pre-fire landscape through real-time detection to months of recovery — and measures how much each one contributes to predicting final burn severity.

Rapid severity mapping matters because not all of the park burned equally. Within large fire footprints, unburned and low-severity patches function as refugia — safe havens from which biodiversity can recolonise the surrounding landscape (Keppel et al. 2012). With ~20% of the park unburned and a further 14,724 ha at low severity, identifying these refugia is a conservation priority. After the 2019–20 Black Summer fires, rapid severity data proved essential for overlaying with species distributions to prioritise conservation triage (Dickman et al. 2020). The 2025 Grampians fire continues a trajectory of unprecedented mega-fire seasons linked to climate modes (Nolan et al. 2020), making rapid severity assessment increasingly critical for post-fire conservation response.

Grampians National Park

The Grampians (Gariwerd) National Park in western Victoria — 168,000 ha of sandstone ranges, eucalypt forest, heathland, and riparian gullies. The white dashed line shows the park boundary. The cyan line shows the fire perimeter: ~135,000 ha, roughly 80% of the park.

The burn scar

Differenced Normalised Burn Ratio (dNBR) from Sentinel-2 imagery reveals the severity pattern. Green = unburned/regrowth, red = high severity. 63% of the burned park area was moderate-high or high severity (38,045 + 29,952 ha).

Severity classes

Classified using standard USGS dNBR thresholds into four severity classes. The fire was predominantly severe, consistent with extreme heat, strong winds, and prolonged drought during the 2024–25 fire season.

Severity distribution

Class	dNBR range	Area (ha)	%
Low	0.10 – 0.27	14,724	14%
Moderate-Low	0.27 – 0.44	24,658	23%
Moderate-High	0.44 – 0.66	38,045	35%
High	> 0.66	29,952	28%

Signal validation

Before using dNBR as a severity metric, we confirmed it measures fire effects and not seasonal phenology. Within the park, unburned areas show near-zero change (median dNBR = -0.043), while burned areas show a strong signal (+0.504). Farmland is unreliable — crop senescence produces dNBR of +0.367 even without fire. All analysis below uses the national park boundary to exclude agricultural confounders.

Can we predict severity before the fire? (Months before fire)

Pre-fire landscape variables are weak individual predictors. The strongest is NDVI (r = +0.260), explaining only ~7% of variance. Topography is weaker still.

Topography (always available)

Slope (r = -0.192) provides weak signal. Steeper terrain can channel fire behaviour, but topography alone is a poor predictor of severity outcomes.

Pre-fire vegetation (months before)

NDVI shows the fuel load landscape before the fire. Brown = sparse vegetation, dark green = dense canopy. Correlation with severity: r = +0.260 — dense vegetation burns more severely, but the relationship is weak.

Fire history (years before)

Time since last fire ranges from 11 years (2014 Northern Complex) to 19 years (2006 Mt Lubra). Areas that burned more recently may carry less fuel, but the limited range of fire histories within the park means this predictor contributes little: its permutation importance in the random forest is ΔR² = 0.013, the second-lowest among landscape variables (only topographic position is lower, at 0.010).

During-fire satellite signals (during fire, hours to days)

VIIRS hotspot detections from NASA’s Suomi-NPP and NOAA-20 satellites provide the first during-fire data, measuring fire intensity (energy release rate) rather than burn outcomes. Detection count (r = -0.370) is stronger than any pre-fire variable — pixels detected as active fire on more passes tended to burn less severely, likely indicating slower-moving flanking fire.

VIIRS fire intensity (hours–days)

Maximum brightness temperature (VIIRS I-4 channel, Bright_ti4) shows where fire burned hottest. Brighter colours indicate higher radiative energy, but the correlation with final severity is weak (r = -0.125), possibly because each 375m pixel mixes high- and low-intensity burning.

Post-fire indices (2–8 weeks after fire)

Post-fire satellite imagery carries the strongest signal. Multiple complementary indices — each sensing different physical properties — confirm fire effects:

dNBR: the standard (2–8 weeks)

Differenced Normalised Burn Ratio uses near-infrared and shortwave-infrared reflectance to detect vegetation loss. Available 2–4 weeks after containment from cloud-free Sentinel-2 imagery. This is both our primary severity metric and response variable.

Red-edge chlorophyll, dCIre (2–8 weeks)

The red-edge chlorophyll index (r = +0.678 with dNBR) detects canopy chlorophyll loss via Sentinel-2 bands B7/B5 — a spectral region invisible to traditional SWIR indices. Particularly effective in dense, wet forests where dNBR saturates.

SAR structural damage, dVH (2–8 weeks)

Sentinel-1 C-band SAR (r = +0.513) detects structural canopy damage through radar backscatter — completely sensor-independent from optical indices. Works through cloud cover and at night.

Land surface temperature, dLST (2–8 weeks)

Landsat thermal imagery (r = +0.693 with dNBR) captures ground heating from exposed soil and ash — a different physical mechanism from optical reflectance. Areas where canopy was removed show increased surface temperature.

14-month recovery (2–14 months)

NDVI recovery rate (the slope of monthly greenness over 14 months, capturing ecosystem response rather than direct fire effects) is the strongest single predictor (r = +0.710). The positive sign means severely burned areas show steeper regreening slopes, likely because they start from a lower post-fire NDVI baseline, not that they return to pre-fire condition sooner. But this signal takes months to accumulate and serves as a retrospective scientific benchmark, not an operational input: no manager can wait 14 months for severity data.

NDVI recovery: slow but progressing (2–14 months after fire)

The recovery trajectory shows a seasonal dip during winter (Jun–Aug 2025) before resuming upward, consistent with eucalypt resprouting phenology. Resprouter-dominated communities (common in the Grampians) recover faster spectrally than obligate seeders, and stratifying recovery by vegetation type would strengthen this analysis (Gibson & Hislop 2022).

Three decision points, three data windows

Fire severity prediction follows a clear temporal gradient, and each stage maps to a distinct management decision:

1. Immediate reconnaissance targeting (T4, during the fire to days after): Landscape, weather, and VIIRS hotspot data explain ~46% of severity variance. While the fire is still burning or smoke still obscures the ground, this is enough to prioritise aerial reconnaissance flights and flag likely high-severity zones for rapid assessment crews.

2. Rehabilitation planning (T5, weeks post-fire): The first cloud-free post-fire Sentinel-2 pass produces the largest single accuracy gain (+0.33 R²). This is the decision-grade severity map — the one that drives erosion control, seed dispersal priorities, salvage assessments, and fauna habitat triage.

3. Recovery monitoring (T6, months post-fire): The 14-month NDVI recovery signal adds only marginal further accuracy (+0.01 R²). Its value is retrospective: confirming that the T5 severity map was correct and tracking whether rehabilitation interventions are working. No operational decision should wait for this data.

The cross-fire validation reveals an important caveat: dNBR sensitivity appears ecosystem-dependent. In dry sclerophyll forest (Grampians), standard thresholds discriminate severity well. The Otway comparison (wet temperate forest) suggests red-edge indices may be needed in denser canopies, though this conclusion rests on a small sample (~15 polygons, 1 crown-fire polygon) and should be confirmed with larger datasets. Generalisable severity mapping across Victoria’s diverse vegetation types likely requires multi-index approaches.

Next steps

Several directions would strengthen and extend this work:

Ecological ground-truthing: Satellite indices measure spectral change, not ecological condition directly. Field surveys assessing canopy scorch, understorey condition, soil hydrophobicity, and fauna habitat loss would calibrate what dNBR severity classes mean for biodiversity and ecosystem function in the Grampians.
Multi-index severity mapping: Developing vegetation-type-specific severity metrics — combining dNBR, dCIre, and SAR — to handle the signal differences between dry sclerophyll, wet forest, and heathland across Victoria.
Larger cross-fire validation: The Otway finding rests on ~15 polygons. Extending validation to more of the 16 Victorian fires with aerial-photography severity data (Collins et al. 2018) would clarify where dNBR works and where it doesn’t.
Operational T4 severity estimates: Integrating real-time VIIRS and Himawari hotspot data with pre-fire landscape layers into emergency response workflows, so preliminary severity maps are available while the fire is still active.
Recovery stratified by vegetation type: Resprouters and obligate seeders recover at different rates — stratifying the 14-month NDVI recovery signal by functional type would better characterise post-fire trajectories.

Validating dNBR as a severity metric

This entire analysis uses dNBR as both the response variable and the primary severity metric. But does dNBR actually correspond to real-world fire damage? This is not a trivial assumption — dNBR measures spectral change in satellite reflectance, not ecological condition directly. To test this, we compared Landsat-derived dNBR against independently measured severity from aerial photography at two fires in contrasting forest types.

The 2006 Grampians fire: ground truth from the air

The 2006 Mt Lubra fire (120,380 ha) burned the same Grampians landscape as the 2024–25 event. Its severity was mapped under Victoria’s Post-fire Burn Classification Procedure (McCarthy et al. 2017): aircraft flown at ~2,700 feet captured high-resolution (~15–25 cm) colour and near-infrared photography shortly after the fire. Trained interpreters classified 3,013 polygons into LOW, MEDIUM, and HIGH severity classes based on crown scorch percentages and understorey condition, detail that can only come from visual interpretation of aerial imagery, not satellite data.

This is fully independent of our satellite analysis. The 2006 fire predates Sentinel-2 (launched 2015) and the Gibson et al. (2020) satellite-based severity methodology now used operationally by DEECA. The aerial photography classification shares no data source with our Landsat 5 TM dNBR computation. It is among the 16 Victorian fires (2006–2016) severity-mapped from aerial photography documented by Collins et al. (2018).

We extracted Landsat 5 TM dNBR over the 2006 fire area and sampled 1,515 points across all severity polygons. The results:

DEECA class	n	dNBR median	dNBR mean	SD
LOW	283	0.287	0.306	0.272
MEDIUM	728	0.352	0.358	0.234
HIGH	504	0.616	0.591	0.203

dNBR shows clear monotonic separation across aerial-photography-derived severity classes. Medians increase from 0.29 (LOW) to 0.35 (MEDIUM) to 0.62 (HIGH). Spearman rank correlation at the point level: ρ = 0.44 (p < 0.001). At the polygon level (aggregating per-polygon medians to avoid pseudoreplication across 3,013 polygons): ρ = 0.54. A Kruskal-Wallis test confirmed highly significant differences across classes (H = 328.3, p < 0.001, η² = 0.22).
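As a quick consistency check, the reported effect size can be recovered from the H statistic. The sketch below assumes the common rank-based estimator η² = (H − k + 1) / (n − k), with k = 3 classes and n taken from the class counts in the table above. The source does not state which estimator was used, so the formula choice and the function name are assumptions.

```python
def kruskal_eta_squared(h_stat: float, n_obs: int, n_groups: int) -> float:
    """Rank-based eta-squared for a Kruskal-Wallis H statistic (assumed estimator)."""
    return (h_stat - n_groups + 1) / (n_obs - n_groups)

# Values reported in the text: H = 328.3, three classes, 283 + 728 + 504 points.
n_points = 283 + 728 + 504
eta_sq = kruskal_eta_squared(328.3, n_points, 3)
print(n_points, round(eta_sq, 2))
```

Under that assumption the class counts sum to the stated sample size and the result rounds to the reported η².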

Cohen’s Kappa between dNBR threshold classes (He et al. 2024) and DEECA classes was κ = 0.24 (overall accuracy 49%). This is “fair” agreement under the Landis & Koch (1977) scale (κ = 0.21–0.40), reflecting genuine overlap between severity classes: fire severity is a continuum, and boundaries between LOW/MEDIUM/HIGH are inherently fuzzy. The polygon-level rank correlation (ρ = 0.54 across 3,013 polygons) is the more informative metric: dNBR ranks severity correctly on average even when categorical thresholds don’t align with DEECA’s classification boundaries.
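For readers unfamiliar with the statistic, a minimal sketch of how overall accuracy and Cohen’s κ follow from a confusion matrix is given below. The 3×3 matrix is hypothetical and is not the study’s data; it only illustrates why κ sits well below raw accuracy once chance agreement is removed.

```python
def cohen_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    """
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    return observed, (observed - expected) / (1 - expected)

# Hypothetical counts for LOW, MEDIUM, HIGH (illustrative only).
example = [[50, 30, 20],
           [30, 40, 30],
           [10, 20, 70]]
accuracy, kappa = cohen_kappa(example)
print(round(accuracy, 2), round(kappa, 2))
```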

The Otway fire: suggestive evidence that dNBR fails in wet forest

This validation is ecosystem-specific. At the Otway fire (wet temperate forest, same 2024–25 season), dNBR showed near-zero correlation with DEECA severity (ρ = -0.03, κ = 0.005). All severity classes had nearly identical dNBR medians (~0.42), consistent with dNBR saturation in dense, wet canopies where leaf area index exceeds 4–7 (He et al. 2024). The red-edge chlorophyll index dCIre showed clear separation instead (ρ = 0.39). However, this finding is based on only ~15 severity polygons (with just 1 BURNT_3 polygon), so it should be treated as suggestive rather than definitive.

Implication for this analysis: dNBR is a valid severity metric in the Grampians’ dry sclerophyll and heathland vegetation. The 2006 validation — using the same landscape, the same vegetation types, and fully independent aerial photography data — provides direct evidence that satellite-derived dNBR captures the severity gradient that matters ecologically. The Otway comparison suggests this conclusion may not transfer to wet forest, though the small sample size warrants caution.

Otway fire: dNBR appears to fail in wet forest

At the Otway fire (wet temperate forest, ~15 polygons), all DEECA severity classes had nearly identical dNBR medians (~0.42). dNBR saturates in dense, wet canopies (LAI 4–7+). The red-edge index dCIre showed clear separation instead (ρ = 0.39). Small sample size means this finding is suggestive.

2006 Grampians: dNBR validated

The 2006 Mt Lubra fire, severity-mapped from aerial NIR photography by FFMVic (3,013 polygons). Landsat dNBR shows monotonic separation: LOW 0.29 → MEDIUM 0.35 → HIGH 0.62. Polygon-level ρ = 0.54.

Temporal prediction: when does each source contribute?

T1–T2 Pre-fire (landscape + drought): R² = 0.34 (weak but real)
+ T3–T4 Fire weather + VIIRS: R² = 0.46 (~46% of variance predictable before post-fire imagery)
+ T5 Post-fire indices: R² = 0.80, the largest single jump (+0.33). Conservative SAR-only estimate (T5a): R² = 0.58; the rest partly reflects predicting one Sentinel-2 index from another.
+ T6 Recovery: R² = 0.81 (full model; retrospective benchmark, not operational)

Six cumulative Random Forest models (T1–T6) were trained on 10,000 stratified sample points. All R² values are from spatial block cross-validation (~2km blocks), guarding against inflated accuracy from spatial autocorrelation. Notably, ~46% of severity variance is predictable before or during the fire, enabling meaningful preliminary severity maps within hours of fire passage.

The modest contribution of landscape variables alone is consistent with the ‘muting’ effect described by Collins et al. (2019): under severe fire weather, topographic and fuel-age effects on burn severity collapse, reducing the predictive value of pre-fire landscape mapping. This underscores the importance of real-time fire monitoring data — as extreme fire weather becomes more frequent under climate change, operational severity assessment increasingly depends on active-fire and post-fire observations rather than landscape risk mapping alone.

What this means for managers

These temporal tiers map directly onto operational decision points. During active fire (T4), VIIRS hotspot intensity combined with landscape and weather data can guide reconnaissance flight targeting and initial rapid assessment — identifying the areas most likely to have burned severely while smoke still obscures the ground. Once the first cloud-free post-fire image arrives (T5, typically 2–4 weeks after containment), adding spectral and SAR indices produces the largest single accuracy gain (+0.33 R²), delivering reliable fine-scale severity mapping for rehabilitation planning — erosion control, seed dispersal priorities, and fauna habitat triage. The 14-month recovery signal (T6) adds only marginal further accuracy (+0.01); its real value is retrospective confirmation that the severity patterns identified at T5 were correct, not an input managers can wait for.

Feature importance

Permutation importance from the full (T6) random forest model — the drop in R² when each predictor is randomly shuffled. Higher values indicate greater contribution to severity prediction.

Tier	Predictor	ΔR²
Landscape	Pre-fire NDVI	0.197
	Elevation	0.041
	Slope	0.025
	Northness	0.016
	Time since fire	0.013
	Topographic position	0.010
Drought	Soil moisture (30d)	0.014
	VPD (pre-fire)	0.012
Fire weather	VPD (burn day)	0.011
	Wind speed	0.006
	Max temperature	0.005
Satellite intensity	VIIRS detection count	0.063
	Himawari cumulative FRP	0.016
	Himawari fire duration	0.016
	VIIRS FRP	0.013
	Himawari max FRP	0.008
	VIIRS max brightness	0.005
Post-fire indices	dCIre (red-edge)	0.944
	dVH (SAR)	0.069
Recovery	NDVI recovery rate	0.291

Methods & Technical Details

Detailed methodology, cross-validation diagnostics, benchmark comparisons, and data sources.

This analysis asks: at which point in time can we predict burn severity, and how much does each data source contribute?

Methods

Study area and response variable

Analysis was restricted to the burned area within the Grampians National Park boundary (WDPA), excluding surrounding farmland where crop phenology confounds the dNBR signal. The response variable is dNBR (differenced Normalised Burn Ratio), computed as pre-fire NBR minus post-fire NBR, where NBR = (NIR − SWIR) / (NIR + SWIR) from Sentinel-2 surface reflectance (bands B8A and B12, 20m resolution). B8A (narrow NIR, 20m) is used rather than B8 (broad NIR, 10m) for spectral consistency with B12 at 20m native resolution, avoiding resampling artefacts.

dNBR measures burn severity, the first-order spectral signature of organic matter consumption (Keeley 2009), rather than ecological fire severity directly. The chain from spectral change to ecological condition involves multiple inferential steps; we validate dNBR against independent aerial photography interpretation from the 2006 Grampians fire (see Cross-fire validation below).

Pre-fire composite: median of cloud-free scenes from 1 Oct – 10 Dec 2024. Post-fire composite: median from 15 Feb – 15 Apr 2025. Cloud masking used the Scene Classification Layer, retaining pixels in SCL classes 4–7. Severity was classified using standard USGS thresholds: Low (0.10–0.27), Moderate-Low (0.27–0.44), Moderate-High (0.44–0.66), High (>0.66).
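The response-variable calculation can be sketched for a single pixel as follows. This is a minimal illustration, assuming NBR = (NIR − SWIR) / (NIR + SWIR) with B8A as NIR and B12 as SWIR; the reflectance values are hypothetical, and the handling of values that fall exactly on a class boundary is an assumption, since the source gives only the ranges.

```python
def nbr(nir: float, swir: float) -> float:
    """Normalised Burn Ratio from NIR (B8A) and SWIR (B12) reflectance."""
    return (nir - swir) / (nir + swir)

def dnbr(pre_nir, pre_swir, post_nir, post_swir):
    """dNBR = pre-fire NBR minus post-fire NBR."""
    return nbr(pre_nir, pre_swir) - nbr(post_nir, post_swir)

def severity_class(value: float) -> str:
    """Map dNBR to the four classes quoted above (boundary handling assumed)."""
    if value > 0.66:
        return "High"
    if value >= 0.44:
        return "Moderate-High"
    if value >= 0.27:
        return "Moderate-Low"
    if value >= 0.10:
        return "Low"
    return "Unburned/unclassified"

# Hypothetical reflectances for one pixel (illustrative values only).
d = dnbr(pre_nir=0.30, pre_swir=0.10, post_nir=0.15, post_swir=0.20)
print(round(d, 3), severity_class(d))
```

In practice the same arithmetic is applied per pixel to the pre-fire and post-fire median composites rather than to single scenes.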

Sampling

10,000 sample points were extracted using stratified random sampling (2,500 per severity class) within the park burnt area at 100m spacing (seed=42). For each point, all predictor values were extracted from the corresponding satellite imagery via Google Earth Engine. Points retain their geographic coordinates for spatial cross-validation. Note: equal allocation across severity classes oversamples rarer classes relative to their landscape prevalence, so R² values reflect balanced-sample performance rather than the natural severity distribution.
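The equal-allocation step can be sketched with NumPy as below. This is an illustration only: the candidate pixels are synthetic, the function name is hypothetical, and the 100m minimum-spacing constraint and the Earth Engine extraction are not reproduced.

```python
import numpy as np

def stratified_sample(classes: np.ndarray, per_class: int, seed: int = 42) -> np.ndarray:
    """Return indices with `per_class` random picks (no repeats) from each class."""
    rng = np.random.default_rng(seed)
    picks = []
    for label in np.unique(classes):
        candidates = np.flatnonzero(classes == label)
        picks.append(rng.choice(candidates, size=per_class, replace=False))
    return np.concatenate(picks)

# Synthetic candidate pixels with class shares similar to the area table.
rng = np.random.default_rng(0)
pixel_class = rng.choice(4, size=50_000, p=[0.14, 0.23, 0.35, 0.28])
idx = stratified_sample(pixel_class, per_class=2_500)
counts = np.bincount(pixel_class[idx])
print(len(idx), counts.tolist())
```

Because each class contributes 2,500 points regardless of its area share, the sample is balanced by construction, which is the oversampling of rarer classes noted above.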

Predictor variables

Predictors were organised into six cumulative temporal tiers, reflecting the order in which data becomes available during a fire event:

T1 — Landscape (pre-fire): Pre-fire NDVI (Sentinel-2 median, Oct–Dec 2024) as a fuel load proxy. Slope, northness (cosine of aspect; positive = north-facing = drier in the southern hemisphere), elevation, and Topographic Position Index (TPI, elevation relative to 300m neighbourhood mean, distinguishing gullies from ridges) from SRTM 30m DEM. ESA WorldCover v200 vegetation type, one-hot encoded (tree/shrub/grass). Time since last fire (years), derived from the DEECA Fire Severity (Wildfire) shapefile — most of the Grampians last burned in 2006 or 2013–14; areas with no fire history were assigned a default of 99 years. Following Bradstock et al. (2010) and Collins et al. (2019), fuel age was included as a landscape predictor, though its contribution was modest given the limited range of fire histories within the park.
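Two of the terrain predictors are simple enough to sketch directly. The example below computes northness as the cosine of aspect and a TPI as elevation minus a neighbourhood mean. The square moving window and its size are assumptions: the source specifies a 300m neighbourhood but not its shape, and on a 30m DEM a half-window of 5 pixels gives a 330m window.

```python
import numpy as np

def northness(aspect_deg):
    """Cosine of aspect: +1 for north-facing, -1 for south-facing slopes."""
    return np.cos(np.radians(aspect_deg))

def tpi(dem: np.ndarray, half_window: int) -> np.ndarray:
    """Elevation minus the mean of a square (2*half_window+1) neighbourhood."""
    padded = np.pad(dem, half_window, mode="edge")
    size = 2 * half_window + 1
    total = np.zeros_like(dem, dtype=float)
    # Sum every shifted copy of the DEM that falls inside the window.
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + dem.shape[0], dx:dx + dem.shape[1]]
    return dem - total / size**2

# Synthetic DEM: flat ground with a single 9 m bump in the centre.
dem = np.zeros((7, 7))
dem[3, 3] = 9.0
bump_tpi = tpi(dem, half_window=1)[3, 3]
print(northness(0.0), northness(180.0), bump_tpi)
```

Positive TPI marks ridges and local high points; negative TPI marks gullies.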

T2 — + Drought (pre-fire, time-varying): Antecedent soil moisture (ERA5-Land volumetric soil water layer 1, mean of 30 days before fire start: 17 Nov – 16 Dec 2024). Pre-fire vapour pressure deficit (VPD) derived from ERA5-Land 2m temperature and dewpoint, Oct–Dec 2024, using the Magnus formula: VPD = e_s(T) − e_a(T_d), where e_s = 0.6108 × exp(17.27T / (T + 237.3)).
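A minimal sketch of the VPD calculation follows, using the saturation vapour pressure formula quoted above. With these coefficients, temperature is in °C and pressure in kPa; ERA5-Land supplies 2m temperature and dewpoint in kelvin, so the conversion is included. Actual vapour pressure is taken as the saturation value at the dewpoint. The function names and example values are illustrative.

```python
import math

def saturation_vapour_pressure(temp_c: float) -> float:
    """Saturation vapour pressure (kPa) at temp_c (deg C), formula quoted above."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))

def vpd_kpa(temp_k: float, dewpoint_k: float) -> float:
    """VPD (kPa) from 2m temperature and dewpoint given in kelvin."""
    temp_c = temp_k - 273.15
    dew_c = dewpoint_k - 273.15
    return saturation_vapour_pressure(temp_c) - saturation_vapour_pressure(dew_c)

# Hypothetical hot, dry afternoon: 40 deg C air temperature, 5 deg C dewpoint.
example_vpd = vpd_kpa(313.15, 278.15)
print(round(example_vpd, 1))
```

When air temperature equals dewpoint the air is saturated and the function returns zero.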

T3 — + Fire weather (during fire): Each sample point was assigned a burn date from the earliest VIIRS active fire detection (system:time_start) at that location. ERA5-Land daily maximum temperature, 10m wind speed (from U/V components), and VPD were then extracted for that specific date. ERA5 resolution is ~9km, so points sharing a burn date receive identical weather values—the model learns which day of burning matters, not spatial weather variation.

T4 — + Satellite intensity (during fire): VIIRS (S-NPP + NOAA-20, 375m): maximum brightness temperature (Bright_ti4), maximum fire radiative power (FRP), and detection count across all passes during Dec 2024 – Feb 2025. Himawari-9 (JAXA P-Tree L2 Wildfire product, ~2km, 10-min intervals): cumulative FRP, maximum FRP, and fire duration (hours between first and last detection within 2km of each sample point).
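The per-point detection metrics described in T3 and T4 (burn date from the earliest detection, detection count, and hours between first and last detection) reduce to simple operations on a list of timestamps. The sketch below assumes the detections for one sample point have already been collected. The timestamps are hypothetical, and in the study the burn date comes from VIIRS while the duration comes from Himawari, so the function would be applied per sensor.

```python
from datetime import datetime

def detection_summary(timestamps):
    """Summarise active-fire detections for one sample point.

    Returns (burn_date, detection_count, duration_hours).
    """
    ordered = sorted(timestamps)
    first, last = ordered[0], ordered[-1]
    duration_hours = (last - first).total_seconds() / 3600
    return first.date(), len(ordered), duration_hours

# Hypothetical detections at one point (times are illustrative only).
hits = [
    datetime(2024, 12, 21, 3, 40),
    datetime(2024, 12, 20, 15, 10),
    datetime(2024, 12, 21, 15, 10),
]
burn_date, count, hours = detection_summary(hits)
print(burn_date, count, hours)
```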

T5 — + Post-fire indices (2–8 weeks): dCIre (differenced red-edge Chlorophyll Index: B7/B5 − 1, pre minus post; formulation after Gitelson et al. 2006), sensitive to canopy chlorophyll loss. dVH (differenced Sentinel-1 SAR VH backscatter, IW mode, pre minus post), detecting structural canopy damage independent of cloud cover.
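Both T5 indices are pre-minus-post differences, which the sketch below makes explicit. It assumes the red-edge chlorophyll index CIre = B7/B5 − 1 as stated above and VH backscatter already expressed in dB; all input values are hypothetical.

```python
def ci_red_edge(b7: float, b5: float) -> float:
    """Red-edge chlorophyll index from Sentinel-2 B7 and B5 reflectance."""
    return b7 / b5 - 1.0

def dcire(pre_b7, pre_b5, post_b7, post_b5):
    """Pre-fire minus post-fire CIre; positive values indicate chlorophyll loss."""
    return ci_red_edge(pre_b7, pre_b5) - ci_red_edge(post_b7, post_b5)

def dvh(pre_vh_db: float, post_vh_db: float) -> float:
    """Pre-fire minus post-fire VH backscatter (dB); positive values indicate loss."""
    return pre_vh_db - post_vh_db

# Hypothetical values for one burned pixel.
example_dcire = dcire(pre_b7=0.30, pre_b5=0.10, post_b7=0.18, post_b5=0.12)
example_dvh = dvh(pre_vh_db=-14.0, post_vh_db=-16.5)
print(round(example_dcire, 2), example_dvh)
```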

T6 — + Recovery (months): NDVI recovery rate, computed as the linear slope of 14 monthly NDVI composites from Feb 2025 to Mar 2026.
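The recovery rate is an ordinary least-squares slope, which can be sketched with `numpy.polyfit`. The example assumes one NDVI value per monthly composite, equally spaced in time; the series is synthetic.

```python
import numpy as np

def recovery_rate(ndvi_series) -> float:
    """Least-squares slope of NDVI against month index (NDVI units per month)."""
    months = np.arange(len(ndvi_series))
    slope, _intercept = np.polyfit(months, ndvi_series, deg=1)
    return float(slope)

# Synthetic 14-composite series rising linearly from 0.30 to 0.56.
series = 0.30 + 0.02 * np.arange(14)
rate = recovery_rate(series)
print(round(rate, 3))
```

A single linear slope ignores the winter dip visible in the real trajectory, so it summarises the net trend rather than the seasonal shape.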

AlphaEarth foundation model comparison

To test whether learned representations outperform hand-engineered pre-fire features, two additional standalone models were trained using 64-dimensional embeddings from Google DeepMind’s AlphaEarth Foundations (GOOGLE/SATELLITE_EMBEDDING/V1/ANNUAL in GEE). AlphaEarth is a foundation model trained on multi-sensor time series (Sentinel-2, Sentinel-1, Landsat) that encodes vegetation characteristics, phenology, surface moisture, and topographic context into dense unit-length vectors at 10m resolution (AlphaEarth Foundations, 2025).

AE_2023: 64 embedding dimensions from the 2023 annual composite — guaranteed pre-fire, representing landscape conditions one year prior to the fire.

AE_2024: 64 embedding dimensions from the 2024 annual composite — the most recent pre-fire snapshot, but potentially contaminated by ~2 weeks of fire signal (fire started 17 Dec 2024). Comparing AE_2024 vs AE_2023 provides a leakage check: a large difference would indicate the 2024 composite has encoded fire damage rather than pre-fire landscape.

These models were trained head-to-head against T1 using the same 10,000 samples, spatial block CV, and max_features tuning. Unlike the cumulative T1–T6 models, AE models use only embedding features — no hand-picked variables.

Bivariate analysis

Pearson correlations between each predictor and dNBR were computed from 5,000 stratified sample points (1,250 per severity class) to assess individual predictor strength.

Random Forest regression

Six cumulative Random Forest models (T1–T6) were trained using scikit-learn (RandomForestRegressor, 500 trees, oob_score=True, random_state=42). Each model adds a tier of predictors to the previous, quantifying the incremental variance explained at each temporal stage.
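The cumulative design can be sketched as follows: each tier's predictors are appended to those of all earlier tiers before a new model is fitted. This is a minimal illustration on synthetic data, not the study's pipeline. The column names are hypothetical abbreviations, the data are random, and 100 trees are used instead of 500 to keep the example fast.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Abbreviated, hypothetical column names: one or two predictors per tier.
TIERS = {
    "T1": ["ndvi_pre", "slope"],
    "T2": ["soil_moisture"],
    "T3": ["vpd_burn_day"],
    "T4": ["viirs_count"],
    "T5": ["dcire", "dvh"],
    "T6": ["ndvi_recovery"],
}

def cumulative_feature_sets(tiers):
    """Yield (tier name, all columns available up to and including that tier)."""
    columns = []
    for name, added in tiers.items():
        columns = columns + added
        yield name, list(columns)

# Synthetic table standing in for the real sample points.
rng = np.random.default_rng(42)
table = {c: rng.normal(size=400) for cols in TIERS.values() for c in cols}
y = 0.2 * table["ndvi_pre"] + 0.8 * table["dcire"] + rng.normal(scale=0.3, size=400)

oob_r2 = {}
for tier, cols in cumulative_feature_sets(TIERS):
    features = np.column_stack([table[c] for c in cols])
    model = RandomForestRegressor(n_estimators=100, oob_score=True, random_state=42)
    model.fit(features, y)
    oob_r2[tier] = model.oob_score_
```

In this toy setup the out-of-bag R² jumps at T5 because the synthetic response was built mainly from `dcire`; the pattern is by construction and says nothing about the real data.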

Random Forest models have been shown to improve fire severity mapping accuracy by 6–21% over single-index thresholding in south-eastern Australian eucalypt systems (Collins et al. 2018). Unlike classification-based approaches that predict discrete severity classes (Collins et al. 2018; Gibson et al. 2020), we use RF regression on continuous dNBR, which captures the full gradient of spectral change but does not directly predict ecologically meaningful severity categories.

Cross-validation

Three R² estimates are reported for each model:

Spatial block CV (primary): sample points were assigned to ~2km grid blocks based on their coordinates. GroupKFold (5 folds) holds out entire blocks, preventing the model from memorising spatial patterns. This is the most conservative of the three estimates.
Random CV: standard 5-fold CV with shuffling. Expected to be higher than spatial CV due to spatial autocorrelation leakage.
Out-of-bag (OOB): the inherent RF bootstrap estimate.
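The spatial blocking step can be sketched as below: each point is assigned to a square grid block, and scikit-learn's `GroupKFold` keeps all points from a block in the same fold. The example assumes coordinates in a projected system with metre units; the coordinates are synthetic and the helper name is hypothetical.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

def block_ids(easting_m, northing_m, block_size_m=2_000):
    """Assign each point to a square grid block; returns one integer label per point."""
    col = np.floor(np.asarray(easting_m) / block_size_m).astype(int)
    row = np.floor(np.asarray(northing_m) / block_size_m).astype(int)
    _, labels = np.unique(np.stack([row, col], axis=1), axis=0, return_inverse=True)
    return labels.ravel()

# Synthetic projected coordinates (metres) over a 20 km x 20 km area.
rng = np.random.default_rng(42)
x = rng.uniform(0, 20_000, size=1_000)
y = rng.uniform(0, 20_000, size=1_000)
groups = block_ids(x, y)

# Count blocks that appear in both the training and the test part of any fold.
leaks = 0
splitter = GroupKFold(n_splits=5)
for train_idx, test_idx in splitter.split(x.reshape(-1, 1), groups=groups):
    leaks += len(set(groups[train_idx]) & set(groups[test_idx]))
print(len(set(groups)), leaks)
```

The same `groups` array can be passed to `cross_val_score` or `cross_val_predict` to obtain spatially held-out R² values; random CV differs only in ignoring the block labels.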

The empirical variogram of T6 spatial-CV residuals has a fitted range of 2,099m, which slightly exceeds the 2km block size. Roberts et al. (2017) recommend block size ≥ 2× the autocorrelation range (here ≈ 4.2km), so the primary 2km estimate may be slightly optimistic. A block sensitivity analysis at 4.5km (R² = 0.79), which meets that criterion, and at 8km provides a conservative co-primary estimate: R² drops only modestly with larger blocks, indicating that the spatial CV results are robust.

Model diagnostics

Permutation importance (10 repeats) from the full T6 model identifies which predictors drive predictions. Partial dependence plots (50-point grid) for the top 6 predictors reveal non-linear relationships and thresholds. A 2D interaction PDP for the top two predictors from different time groups shows whether predictor effects are conditional on each other. Predicted vs actual scatter uses spatial-CV held-out predictions to visualise model calibration.
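Permutation importance with 10 repeats can be sketched using `sklearn.inspection.permutation_importance`. The data below are synthetic, with one strong, one weak, and one irrelevant predictor. Scoring on a held-out subset is an assumption, since the source does not say which data the importance was evaluated on.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

# Synthetic predictors: column 0 strong, column 1 weak, column 2 pure noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 3))
y = 1.0 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(scale=0.2, size=500)

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X[:350], y[:350])

# Mean drop in R² over 10 shuffles of each column, evaluated on held-out rows.
result = permutation_importance(
    model, X[350:], y[350:], n_repeats=10, random_state=42, scoring="r2"
)
ranking = np.argsort(result.importances_mean)[::-1]
```

`result.importances_mean` holds the ΔR² per predictor and `result.importances_std` the spread across repeats. Correlated predictors can share importance, which matters when reading values for spectrally related indices.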

FRP validation

To test whether satellite-measured fire intensity corresponds to spectral severity, Pearson and Spearman correlations were computed between dNBR and each FRP metric (VIIRS FRP, VIIRS brightness temperature, Himawari cumulative and maximum FRP). To address the resolution mismatch between VIIRS (375m) and dNBR (20m), sample points were aggregated to ~375m grid cells before computing correlations, with mean dNBR compared against the (constant) FRP value within each VIIRS pixel.
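The aggregation step can be sketched with NumPy: points are binned into square cells, dNBR is averaged per cell, and the correlation is computed on cell means. The data are synthetic, the grid is a simple square tiling rather than the actual VIIRS pixel footprints, and only Pearson's r is shown.

```python
import numpy as np

def aggregate_to_cells(x_m, y_m, dnbr, frp, cell_m=375):
    """Mean dNBR and mean FRP per square grid cell (cells in sorted order)."""
    rows = np.floor(np.asarray(y_m) / cell_m).astype(int)
    cols = np.floor(np.asarray(x_m) / cell_m).astype(int)
    _, inverse = np.unique(np.stack([rows, cols], axis=1), axis=0, return_inverse=True)
    inverse = inverse.ravel()
    n = np.bincount(inverse)
    return np.bincount(inverse, weights=dnbr) / n, np.bincount(inverse, weights=frp) / n

# Synthetic points: FRP is constant within each cell, dNBR is noisy per point.
rng = np.random.default_rng(42)
x = rng.uniform(0, 3_750, size=2_000)
y = rng.uniform(0, 3_750, size=2_000)
frp = 10 + 3 * np.floor(x / 375) + np.floor(y / 375)
dnbr = 0.02 * frp + rng.normal(scale=0.2, size=2_000)

cell_dnbr, cell_frp = aggregate_to_cells(x, y, dnbr, frp)
r_points = np.corrcoef(dnbr, frp)[0, 1]
r_cells = np.corrcoef(cell_dnbr, cell_frp)[0, 1]
```

Averaging removes within-cell dNBR noise, so the cell-level correlation in this toy example is higher than the point-level one; it also avoids counting many points under the same FRP value as independent observations.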

Signal validation

A 4-zone comparison (park vs farmland × burnt vs unburnt) validates that dNBR measures fire effects rather than seasonal phenology. Unburned park areas should show near-zero dNBR; farmland dNBR is expected to be confounded by crop senescence.

Detailed Results

Signal validation: dNBR works in native vegetation but not farmland

A 4-zone comparison (park vs farmland × burnt vs unburnt) revealed a critical confound in raw dNBR. Within the national park, the signal is clear:

Park burnt: median dNBR = +0.504 (SD 0.228)
Park unburnt: median dNBR = -0.043 (SD 0.113)

The unburned native vegetation shows effectively zero dNBR change, confirming that the index is measuring fire effects and not seasonal phenology. However, farmland tells a different story:

Farm burnt: median dNBR = +0.445 (SD 0.201)
Farm unburnt: median dNBR = +0.367 (SD 0.287)

Unburned farmland shows a median dNBR of +0.367, a value that would fall in the moderate-low severity class (0.27–0.44) despite the absence of fire, driven by crop senescence between the spring pre-fire and late-summer post-fire composites. The high standard deviation (0.287) reflects the patchwork of irrigated, dryland, and fallow paddocks. This confirms that dNBR-based severity classification is only meaningful within native vegetation. All subsequent analysis uses the Grampians National Park boundary (from the World Database on Protected Areas) to exclude agricultural confounders.

To test whether SAR-based severity avoids this confound, the same 4-zone comparison was applied to Sentinel-1 dVH:

Park burnt: median dVH = +2.188 (SD 1.537)
Park unburnt: median dVH = +0.281 (SD 0.939)
Farm burnt: median dVH = +3.312 (SD 2.550)
Farm unburnt: median dVH = +3.375 (SD 3.717)

Within the park, dVH discriminates cleanly: burned vegetation shows a median backscatter loss of +2.2 dB versus +0.3 dB in unburned areas. However, farmland shows substantial dVH change even without fire (median +3.4 dB), likely from crop harvest and ploughing altering surface roughness between the spring pre-fire and late-summer post-fire SAR composites. dVH does not separate burned from unburned farmland (+3.3 vs +3.4 dB), so neither optical (dNBR) nor SAR (dVH) indices reliably isolate fire effects in agricultural landscapes. This reinforces the decision to restrict analysis to the national park.

Severity distribution

Within the burned portion of the national park, severity was classified using standard USGS dNBR thresholds:

Class	dNBR range	Area (ha)	%
Low	0.10 – 0.27	14,724	14%
Moderate-Low	0.27 – 0.44	24,658	23%
Moderate-High	0.44 – 0.66	38,045	35%
High	> 0.66	29,952	28%

The fire was predominantly moderate-to-high severity, with moderate-high (38,045 ha) and high severity (29,952 ha) together accounting for 63% of the classified burned area. This is consistent with the fire conditions: multiple ignitions during extreme heat, strong winds, and prolonged drought.

Note: these thresholds derive from US FIREMON standards. He et al. (2024) showed that vegetation-specific thresholds improve severity classification accuracy in SE Australian forests. The area estimates above should be interpreted as approximate; the Random Forest regression below uses continuous dNBR values and is not affected by threshold choice.

Bivariate correlations

The central research question—at which point in time can we predict severity—is addressed by examining correlations between dNBR and predictors available at each time step:

Time available	Predictor	r with dNBR
Pre-fire	Pre-fire NDVI	+0.260
	Elevation	-0.252
	Slope	-0.192
	TPI	-0.142
	Northness	-0.032
During fire	VIIRS detection count	-0.370
	VIIRS max brightness temp	-0.125
Post-fire	dCIre (red-edge)	+0.678
	dVH (SAR structural)	+0.513
Recovery	NDVI recovery rate	+0.710

Key finding: Pre-fire landscape variables are weak individual predictors of severity. The strongest pre-fire predictor is NDVI (r = +0.260), explaining only ~7% of dNBR variance. Topographic variables are weaker still. This does not support the initial hypothesis that pre-fire fuel load and topography might explain 30–50% of severity variance. However, this correlation is partly an arithmetic artefact: dNBR = NBR_pre − NBR_post, so greener pre-fire vegetation mechanically produces larger dNBR at equivalent burn damage (Gale & Cary 2022). The observed r may overstate the genuine fuel-severity relationship.

During-fire data adds meaningful signal. VIIRS detection count (r = -0.370) is a stronger individual predictor than any pre-fire variable. Pixels detected as active fire on more satellite passes tended to burn less severely, likely because repeated detections indicate slower-moving, lower-intensity flanking fire rather than a single high-intensity passage. However, detection count is also confounded by cloud and smoke obscuration, orbit timing, and scan angle (Bradstock et al. 2010), so this interpretation should be treated with caution. Maximum brightness temperature was less informative (r = -0.125), possibly due to the coarse 375m VIIRS resolution mixing high and low intensity pixels.

Post-fire indices carry the strongest signal. The red-edge chlorophyll index dCIre (r = +0.678) and SAR structural change dVH (r = +0.513) are both strongly correlated with dNBR. These are available within 2–4 weeks of fire containment, confirming that early post-fire imagery—even a single cloud-free Sentinel-2 pass—is far more informative than any pre-fire or during-fire data alone. Note that dCIre and dNBR share the same Sentinel-2 imagery (Fernandez-Manso et al. 2016), so their strong correlation partly reflects spectral redundancy between red-edge and SWIR bands rather than independent confirmation. In contrast, dVH from Sentinel-1 SAR provides the only truly sensor-independent severity measure, detecting structural canopy damage through an entirely different sensing modality (Hosseini & Lim 2023).

Recovery trajectory is the strongest single predictor (r = +0.710), but requires months of post-fire monitoring. This is consistent with the ecological interpretation that severity is ultimately defined by the biological response, not the spectral snapshot.

NDVI recovery trajectory

Monthly NDVI monitoring shows the burned area began at ~0.30 immediately post-fire (Feb 2025) and recovered to ~0.57 by March 2026—a 14-month gain of approximately 0.27 NDVI units. However, this remains well below the unburned reference of ~0.65, indicating that full canopy recovery has not occurred after one year. The recovery trajectory shows a seasonal dip during winter (Jun–Aug 2025) before resuming upward, consistent with the known phenology of eucalypt resprouting in this region. However, resprouter-dominated communities (common in the Grampians) recover faster spectrally than obligate seeders, and stratifying recovery by vegetation type would strengthen this analysis (Gibson & Hislop 2022; Caccamo et al. 2015).
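One way to read these figures is as the share of the initial NDVI deficit that has closed. The sketch below uses the three values quoted above; the "fraction recovered" metric itself is an illustrative assumption, not a quantity reported in the analysis.

```python
def recovery_fraction(ndvi_post: float, ndvi_now: float, ndvi_ref: float) -> float:
    """Share of the post-fire NDVI deficit (relative to an unburned reference) closed so far."""
    return (ndvi_now - ndvi_post) / (ndvi_ref - ndvi_post)


# Values quoted in the text: ~0.30 post-fire, ~0.57 latest, ~0.65 unburned reference.
fraction = recovery_fraction(ndvi_post=0.30, ndvi_now=0.57, ndvi_ref=0.65)
remaining_gap = 0.65 - 0.57  # NDVI units still to recover
```

On these rounded inputs roughly three quarters of the deficit has closed, with about 0.08 NDVI units remaining.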

Discussion

Bivariate correlations

The initial hypothesis—that pre-fire fuel load and topography might explain a large proportion of severity variance—was not supported by bivariate analysis. Individual pre-fire predictors explained less than 7% of variance each. The progression of bivariate predictive power follows a clear temporal gradient:

Pre-fire: weak (|r| ≤ 0.26) → During fire: moderate (|r| ≤ 0.37) → Post-fire: strong (|r| ≤ 0.68) → Recovery: strongest (|r| = 0.71)
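Squaring each |r| gives the share of dNBR variance a single linear predictor can account for, which is where the "~7%" figure for pre-fire NDVI comes from. A minimal sketch using the stage maxima quoted above (the dictionary name is illustrative):

```python
# Largest |r| reported at each stage of the temporal gradient.
stage_max_abs_r = {
    "pre-fire": 0.26,
    "during fire": 0.37,
    "post-fire": 0.68,
    "recovery": 0.71,
}

# For a bivariate linear relationship, variance explained is r squared.
variance_explained = {stage: r ** 2 for stage, r in stage_max_abs_r.items()}

for stage, r2 in variance_explained.items():
    print(f"{stage}: {100 * r2:.1f}% of dNBR variance")
```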

The weak pre-fire signal is not simply a matter of missing variables — it reflects a fundamental interaction between fire weather and landscape. Bradstock et al. (2010) showed that weather, fuel, and terrain interact multiplicatively: under moderate conditions, topographic features like gullies and moist aspects buffer severity, but under extreme fire weather these protective effects collapse. Collins et al. (2019) formalised this as the ‘muting’ effect — the influence of topography and fuel age on burn severity diminishes as fire weather severity increases. The Grampians fire burned under prolonged drought and extreme heat, conditions where landscape predictors would be expected to contribute least.

Multivariate Random Forest analysis

To move beyond bivariate correlations, a Random Forest regression was trained on 10,000 stratified sample points with cumulative temporal predictor sets (T1–T6). All R² values below are from spatial block cross-validation (~2km blocks, GroupKFold), which guards against inflated accuracy from spatial autocorrelation. The variogram range (2099m) slightly exceeds the block size; a conservative 4.5km block sensitivity check yields R² = 0.79 (see Cross-validation section for details).
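The idea behind spatial block cross-validation can be sketched in a few lines: label each sample with the grid block it falls in, then keep whole blocks together when splitting. The analysis uses scikit-learn's GroupKFold for the split; the pure-Python `group_k_fold` below is a simplified stand-in so the example runs without dependencies, and the function names, the 500 m point grid, and the 5-fold setting are all assumptions for illustration.

```python
from collections import defaultdict


def block_id(x_m, y_m, block_size_m=2000.0):
    """Grid-block label for a point given in projected metres."""
    return (int(x_m // block_size_m), int(y_m // block_size_m))


def group_k_fold(groups, n_splits=5):
    """Yield (train_idx, test_idx) pairs with every group confined to one fold.

    Groups are placed largest-first into the currently smallest fold,
    a simple balancing heuristic (not scikit-learn's exact implementation).
    """
    members = defaultdict(list)
    for i, g in enumerate(groups):
        members[g].append(i)
    if len(members) < n_splits:
        raise ValueError("need at least as many groups as folds")
    folds = [[] for _ in range(n_splits)]
    for g in sorted(members, key=lambda g: (-len(members[g]), g)):
        target = min(range(n_splits), key=lambda f: len(folds[f]))
        folds[target].extend(members[g])
    n = len(groups)
    for test in folds:
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        yield train, sorted(test)


# Hypothetical sample points on a 10 km x 10 km extent, one every 500 m.
points = [(x, y) for x in range(0, 10_000, 500) for y in range(0, 10_000, 500)]
groups = [block_id(x, y) for x, y in points]
splits = list(group_k_fold(groups, n_splits=5))
```

Because no block contributes points to both sides of a split, nearby (and therefore correlated) samples cannot leak from training into testing. Rerunning with a larger `block_size_m` corresponds to the more conservative sensitivity check described above.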

T1 Landscape only: R² = 0.25 — pre-fire predictors combined explain ~25% of variance. Non-linear interactions (captured by RF but missed by Pearson r) roughly double the explanatory power compared to bivariate analysis, but landscape alone remains a weak predictor. This is consistent with the ‘muting’ hypothesis (Collins et al. 2019): under extreme fire weather, topographic and fuel-age controls on burn severity are overridden, collapsing the predictive value of pre-fire landscape variables.

T2 + Drought: R² = 0.34 — antecedent soil moisture and VPD add +0.09, confirming that drought conditioning meaningfully affects severity outcomes.

T3 + Fire weather: R² = 0.38 — ERA5 temperature, wind, and VPD on the day of burning add a further +0.04.

T4 + Satellite intensity: R² = 0.46 — VIIRS brightness temperature, FRP, and detection count add +0.09. Real-time satellite fire detection provides meaningful additional information beyond weather alone.

T5 + Post-fire indices: R² = 0.80 — adding dCIre and dVH produces the single largest jump (+0.33). Ablation decomposition: SAR-only (dVH) achieves R² = 0.5786, while optical-only (dCIre) achieves R² = 0.7979. Since dCIre shares Sentinel-2 imagery with the dNBR response variable, much of its contribution reflects spectral redundancy rather than independent ecological information. The genuinely sensor-independent signal comes from SAR structural change (dVH).

T6 Full model (retrospective benchmark): R² = 0.81 — 14-month recovery rate adds a marginal further +0.01. This tier validates severity patterns identified earlier but is not an operational input — no management decision can wait 14 months for input data. Spatial CV RMSE: 0.098 dNBR.

Benchmark comparison: Gibson et al. (2020)

Our approach complements existing operational products such as the NSW Fire Extent and Severity Map (FESM), which uses Sentinel-2 RF classification trained on aerial photography (Gibson et al. 2020).

Gibson et al. (2020) developed the operational fire severity mapping methodology now used by DEECA (formerly DELWP) across Victoria. Their approach trained a Random Forest classifier on 8 differenced Sentinel-2 spectral indices (dNBR, dNDVI, dNDWI, dNBR2, dMSAVI2, dMIRBI, dCSI, and dSAVI) against reference labels derived from high-resolution (<35 cm) post-fire aerial near-infrared photography acquired over 12 fires in the 2018–19 season. The classifier mapped each pixel into one of 7 categories: two non-severity categories (no data, non-woody vegetation) and five severity classes, namely unburnt (>90% green canopy), low canopy scorch (<20%), medium canopy scorch (20–80%), high canopy scorch (>80%), and canopy burnt (>20% foliage consumed). Overall accuracy was 85% (Kappa = 0.81).

A key finding of Gibson et al. was that no single spectral index performed best across all vegetation types. In dry eucalypt forest—the dominant type in the Grampians—dNBR ranked among the top indices but was outperformed by dNDWI and dMIRBI in some severity classes. The multi-index RF approach outperformed all individual indices (Collins et al. 2018), particularly for the problematic medium severity class where single-index accuracy was lowest (61%). This supports our choice of a multivariate RF framework over simple dNBR thresholding.

Per-class accuracy in Gibson et al. followed a U-shaped pattern: high for unburnt (88%) and canopy burnt (97%), lower for intermediate classes (low scorch 75%, medium scorch 61%). This is consistent with the well-established finding that spectral indices separate the extremes of the severity gradient more reliably than intermediate states, where mixed pixel effects and variable canopy architecture create ambiguity (Collins et al. 2018; Hammill & Bradstock 2006).

Our approach differs from Gibson et al. in several respects. First, we frame severity as a continuous regression on dNBR rather than categorical classification. This avoids the threshold-sensitivity problem but introduces the circularity concern that dNBR is both our response variable and a proxy for ecological outcome. The T5 ablation above quantifies this: SAR-based dVH provides the only truly sensor-independent severity signal (R² = 0.58), while much of the optical index contribution reflects spectral redundancy. Second, our temporal tier design (T1–T6) addresses a question Gibson et al. did not: when can severity be predicted, and how much does each data source contribute? The finding that pre-fire and during-fire data (T1–T4) explain R² = 0.46 suggests meaningful severity forecasting is possible before post-fire imagery becomes available. Third, we incorporate SAR data (Sentinel-1 dVH) and active fire metrics (VIIRS FRP, Himawari), which Gibson et al. did not consider—though their aerial photography calibration provides a stronger link to canopy-level ecological condition than any of our purely satellite-derived predictors.

DEECA’s severity mapping for the 2024–25 Grampians fire is not yet available (all 136,640 ha remain classified as BURNT_UNKNOWN in the fire history WFS as of April 2026). When released, it will provide a valuable semi-independent benchmark—semi-independent because both products share Sentinel-2 as a common data source, though the Gibson et al. methodology uses 8 indices in a classification framework calibrated against aerial photography, a substantially different analytical pathway from our single-index continuous regression.

Broader context: Australian fire severity mapping

Three approaches to satellite-based fire severity mapping represent different operational trade-offs: GEEBAM (DAWE 2020) provides rapid national coverage within days using a single spectral index; the NSW Fire Extent and Severity Map (Gibson et al. 2020) delivers operational state-level accuracy using RF classification trained on aerial photography; our temporal decomposition approach explores when severity becomes predictable and which data sources contribute at each stage.

The most directly comparable national-scale product is GEEBAM (Google Earth Engine Burnt Area Map; DAWE 2020), which applies relativised dNBR thresholds calibrated by bioregion and NVIS vegetation type, achieving 48–82% accuracy at four classes and 72–92% at two. Its bioregion-specific calibration addresses the pre-fire vegetation variability that fixed USGS thresholds miss, but GEEBAM acknowledged poor delineation of low-severity areas—the class most critical for identifying refugia. Our approach differs in three ways: we use continuous RF regression rather than categorical thresholding, we incorporate multi-sensor predictors beyond optical indices (SAR, FRP, weather), and our temporal tier framework addresses when severity can be predicted, not just what burned.

Cross-fire validation: Otway fire (2024–25)

To test whether our spectral indices correspond to independently-derived severity, we performed a cross-fire validation using the Otway fire (Chapple Vale, Barwon South West, same 2024–25 season). The Otway Ranges support wet eucalypt forest dominated by Eucalyptus regnans with rainforest understorey — structurally very different from the Grampians’ dry sclerophyll woodland, with pre-fire NBR averaging 0.56 (vs ~0.4 in the Grampians). While the Grampians fire severity remains unclassified, DEECA has completed severity mapping for the Otway fire using their operational Random Forest methodology (Gibson et al. 2020). We extracted dNBR, RBR, dCIre, and dVH over the Otway fire perimeter using identical Sentinel-2 temporal windows, cloud masking, and band mathematics as the Grampians analysis, then compared these continuous indices against DEECA’s categorical severity classes.

The Otway fire (509 ha) was classified into 3 severity classes (BURNT_2F, BURNT_2P, BURNT_3). A total of 500 random sample points were extracted across all severity polygons. The table below shows how our spectral indices separate DEECA’s severity classes:

DEECA class	n	dNBR mean	dNBR median	RBR mean	dVH mean	dCIre mean
BURNT_2F	59	0.3362	0.4291	0.2095	1.0999	0.6006
BURNT_2P	91	0.4126	0.4203	0.2728	1.0047	0.8810
BURNT_3	350	0.3798	0.4290	0.2307	1.2651	1.3623

Spearman rank correlations between DEECA severity (ordinal) and each continuous index were: dNBR ρ = -0.03 (p = 0.494), RBR ρ = -0.12 (p = 0.006), dCIre ρ = 0.39 (p < 0.001), dVH ρ = 0.06 (p = 0.165). A Kruskal-Wallis test confirmed significant dCIre differences across classes (H = 83.8, p < 0.001, η² = 0.165). Cohen’s Kappa between dNBR threshold classes (He et al. 2024: 0.27/0.44/0.66) and DEECA classes was κ = 0.01.
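The reported effect size is consistent with the usual eta-squared estimate for a Kruskal-Wallis test, η² = (H − k + 1) / (n − k). The sketch below assumes that is the formula used; k = 3 classes and n = 500 points come from the table above.

```python
def kruskal_eta_squared(h_statistic: float, n_groups: int, n_total: int) -> float:
    """Eta-squared effect size derived from a Kruskal-Wallis H statistic."""
    return (h_statistic - n_groups + 1) / (n_total - n_groups)


n_total = 59 + 91 + 350  # sample points per DEECA class
eta_sq = kruskal_eta_squared(h_statistic=83.8, n_groups=3, n_total=n_total)
```

With H = 83.8 this gives (83.8 − 2) / 497, which rounds to the 0.165 quoted in the text.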

dNBR does not appear to discriminate DEECA severity classes at the Otway fire (though this conclusion rests on ~15 polygons with only 1 crown-fire polygon). Medians were nearly identical across all three classes (range 0.42–0.43), and 33% of pixels within the highest-severity (BURNT_3) polygon had dNBR < 0.27 (classified “low” by standard thresholds). In contrast, dCIre showed clear between-class separation, with mean values increasing from 0.60 (BURNT_2F) to 0.88 (BURNT_2P) to 1.36 (BURNT_3). Relativising to RBR (dNBR / [NBR_pre + 1.001]) did not rescue discrimination: RBR showed a weak negative correlation (ρ = -0.12), confirming that the problem is not pre-fire vegetation scaling but a fundamental insensitivity of NIR/SWIR bands to the scorch–consumption gradient in dense wet canopies. The red-edge chlorophyll index captures severity gradients in wet eucalypt forest that SWIR-based indices miss, consistent with He et al. (2024) on the need for vegetation-specific approaches.
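The relativisation and thresholding steps referred to above can be sketched as follows. The RBR formula and the 0.27/0.44/0.66 cut points are taken from the text; only the "low" label is stated there, so the other three class names are placeholders, and the input values are rounded figures used for illustration.

```python
import bisect

# dNBR cut points quoted in the text (attributed there to He et al. 2024).
THRESHOLDS = (0.27, 0.44, 0.66)
# Only "low" is named in the text; the remaining labels are assumed placeholders.
LABELS = ("low", "moderate", "high", "very high")


def rbr(dnbr: float, nbr_pre: float) -> float:
    """Relativised Burn Ratio: dNBR / (NBR_pre + 1.001)."""
    return dnbr / (nbr_pre + 1.001)


def classify_dnbr(dnbr: float) -> str:
    """Map a dNBR value to a threshold class; values below 0.27 are 'low'."""
    return LABELS[bisect.bisect_right(THRESHOLDS, dnbr)]


# Same dNBR under the two pre-fire NBR levels mentioned for each landscape.
rbr_wet_forest = rbr(dnbr=0.42, nbr_pre=0.56)  # Otway-like pre-fire NBR
rbr_dry_forest = rbr(dnbr=0.42, nbr_pre=0.40)  # Grampians-like pre-fire NBR
```

Relativising shrinks the wet-forest value more than the dry-forest one, but it is a monotonic rescaling for a given pre-fire NBR, so it cannot create class separation that the underlying dNBR lacks.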

Several design limitations constrain this comparison. The BURNT_3 (crown fire) class was represented by a single 335 ha polygon, so its 350 sample points reflect within-polygon spatial variation rather than between-event class-level behaviour. The effective sample size for between-class inference is ~15 polygons (8 BURNT_2P, 6 BURNT_2F, 1 BURNT_3), not the nominal 500 points. The absence of BURNT_1 (low severity) limits the severity range tested, and the Otway’s wet eucalypt forest differs structurally from the Grampians’ dry sclerophyll and heathland. The contrast between the strong Grampians RF performance (spatial CV R² = 0.81 in dry sclerophyll) and the null dNBR result at Otway is suggestive: it points toward ecosystem-dependent dNBR sensitivity, though the small Otway sample size means this should be confirmed with larger wet-forest datasets before drawing definitive conclusions. Multi-index approaches incorporating red-edge and SAR data are likely needed for generalisable severity mapping across Victorian vegetation types.

Cross-fire validation: 2006 Grampians fire (Mt Lubra)

To test whether dNBR discriminates fire severity in dry sclerophyll forest (the Grampians ecosystem), we performed a historical cross-fire validation using the 2006 Mt Lubra fire (120,380 ha). This fire burned the same Grampians landscape as the 2024–25 event and was severity-mapped by DEECA from aerial photography interpretation (FIRESEVWF dataset, 3,013 classified polygons). We extracted Landsat 5 TM spectral indices (dNBR, dNDVI, dNBR2) using pre-fire (Sep–Dec 2005) and post-fire (Mar–Jun 2006) composites (12 combined L5+L7 scenes per window), then compared against DEECA’s three-class severity classification.

The 2006 fire was classified into 3 severity classes (LOW, MEDIUM, HIGH). A total of 1,515 sample points were extracted across all severity polygons.

Spearman rank correlation between DEECA severity and dNBR was ρ = 0.44 (p < 0.001). At the polygon level (aggregating per-polygon medians to avoid pseudoreplication), the correlation strengthened to ρ = 0.54 (p < 0.001). Cohen’s Kappa between dNBR threshold classes (He et al. 2024) and DEECA classes was κ = 0.24 (overall accuracy 49%).
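Cohen's Kappa discounts the agreement two classifications would reach by chance, which is why 49% overall accuracy corresponds to a much lower κ. A minimal sketch follows; the two label lists are invented, and the final line simply rearranges κ = (p_o − p_e) / (1 − p_e) to show the chance agreement implied by the rounded figures above.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's Kappa for two equal-length label sequences.

    Undefined (division by zero) when chance agreement is exactly 1.
    """
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in set(count_a) | set(count_b)) / n ** 2
    return (observed - expected) / (1 - expected)


def implied_chance_agreement(observed: float, kappa: float) -> float:
    """Solve kappa = (p_o - p_e) / (1 - p_e) for the chance agreement p_e."""
    return (observed - kappa) / (1 - kappa)


# Hypothetical threshold-derived classes vs reference classes.
threshold_classes = ["low", "low", "med", "med", "high", "high"]
reference_classes = ["low", "med", "med", "med", "high", "low"]
kappa = cohens_kappa(threshold_classes, reference_classes)

# Using the rounded values reported for the 2006 fire (accuracy 0.49, kappa 0.24).
chance_2006 = implied_chance_agreement(observed=0.49, kappa=0.24)
```

On the rounded inputs the implied chance agreement is about one third, which is what three classes would give if the marginals were roughly balanced.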

dNBR shows clear monotonic separation of severity classes in dry sclerophyll forest. Medians increased from 0.29 (LOW) to 0.35 (MEDIUM) to 0.62 (HIGH), compared to the Otway result where all classes had identical dNBR medians (~0.42). This confirms that dNBR sensitivity is ecosystem-dependent: it saturates in wet temperate forest (Otway, LAI 4–7+) but discriminates severity effectively in the dry sclerophyll and heathland that characterises the Grampians.

Critically, the 2006 severity classification is fully independent of satellite spectral data. The fire predates Sentinel-2 (launched 2015) and the Gibson et al. (2020) satellite-based RF methodology. The original severity codes describe crown scorch percentages and understorey condition at a level of detail (e.g., “30–65% of crowns scorched, understorey is a mosaic of scorched, burnt and unburnt areas”) that can only be derived from visual interpretation of high-resolution post-fire aerial photography — the established reference method for fire severity assessment. This classification system matches the Victorian Post-fire Burn Classification Procedure used by FFMVic, and the 2006 Grampians fire is among the 16 Victorian fires (2006–2016) that were severity-mapped from aerial photography in Collins et al. (2018). Unlike the Otway cross-validation, where both our dNBR and DEECA’s classification derive from the same Sentinel-2 imagery, the 2006 comparison involves no shared data source: aerial photography vs Landsat 5 TM.

The 2006 validation also avoids the statistical limitations that undermined the Otway comparison: it has 3,013 polygons across three classes (vs 15 polygons at the Otway), a full LOW/MEDIUM/HIGH severity gradient (vs no BURNT_1 at the Otway), and vegetation structurally identical to the 2024–25 fire. The validation uses Landsat 5 TM rather than Sentinel-2, but dNBR is spectrally comparable between sensors (both use NIR/SWIR2 normalised difference; Parks et al. 2014). The Landsat sensor lacks Sentinel-2’s red-edge bands, so dCIre could not be tested in the dry forest context.

Independent sensor validation

To test whether dNBR severity corresponds to physically independent measurements, we correlated it against sensors operating at different wavelengths and sensing modalities. The table below shows Pearson correlations between dNBR and each validation layer at the 10,000 sample points:

Sensor	Metric	Wavelength	r vs dNBR	n
Sentinel-2 red-edge (dCIre)	dCIre	Red-edge (740 nm)	+0.669	10000
Landsat thermal (dLST)	dLST	Thermal IR (10.9 μm)	+0.693	1861
Sentinel-1 C-band SAR (dVH)	dVH	C-band (5.6 cm)	+0.513	10000
PALSAR-2 L-band SAR (dHV)	dHV_lband	L-band (23.5 cm)	+0.181	1861

These correlations form a validation hierarchy ranging from spectrally redundant to fully independent:

Spectrally redundant: dCIre (r = +0.669) — shares Sentinel-2 imagery with dNBR. High correlation confirms internal consistency but does not validate independently.

Semi-independent: dLST (r = +0.693) — Landsat thermal emission from exposed soil and ash. Different physics from S2 reflectance, but both respond to vegetation loss. Captures below-canopy ground heating that optical indices miss.

Semi-independent: dVH (r = +0.513) — Sentinel-1 C-band SAR. Detects leaf/small-branch structural change through radar backscatter. Fully sensor-independent from Sentinel-2.

Independent: dHV (r = +0.181) — ALOS-2 PALSAR-2 L-band SAR. Penetrates canopy (23.5 cm wavelength) and is sensitive to trunk and large-branch structural damage that C-band cannot detect. Validated for Australian eucalypt fire mapping by Collins et al. (2019).

Pre-fire canopy height was investigated but ruled out as a validation layer. Analysis found near-uniform canopy height across severity classes in the Grampians (r = 0.04), consistent with severity being driven by fire weather and topography rather than canopy structure in this landscape.

Alpha Earth embeddings vs mechanistic features

AE_2023 (pre-fire embeddings): R² = 0.33 — 64 learned embedding dimensions explain 33% of severity variance. This exceeds T1’s landscape-only model (R² = 0.25) but only matches T2 (landscape + drought, R² = 0.34) and falls short of T3 (+ weather, R² = 0.38) and T4 (+ satellite, R² = 0.46).

AE_2024 (same-year embeddings): R² = 0.36 — marginally higher than AE_2023 (+0.02), suggesting minimal fire contamination in the 2024 annual composite despite the fire starting in mid-December.

The Alpha Earth comparison is informative but does not support replacing mechanistic features with learned embeddings. While AE_2023 outperforms T1’s 8 landscape features alone, it only matches what 10 interpretable variables (T2) already achieve — using 64 black-box dimensions to do so. The embeddings add no new predictive signal beyond what hand-engineered features plus antecedent drought already capture, and they sacrifice interpretability entirely. The 64 dimensions have no physical meaning, making them unsuitable for understanding why certain areas burn more severely.

Critically, the embeddings cannot incorporate fire-day weather or satellite-detected fire intensity (T3, T4), which add a further +0.12 R² beyond T2. For pre-fire severity prediction, domain-specific features with temporal specificity outperform general-purpose landscape representations.

Overall, ~46% of severity variance is predictable before or during the fire (T4 spatial CV R² = 0.46), a meaningful improvement over the bivariate result. The remaining variance is only recoverable with post-fire satellite imagery. For operational forecasting, this means landscape + drought + weather + VIIRS data could produce a useful preliminary severity map within hours of fire passage, well before cloud-free optical imagery becomes available.

Foundation model embeddings as landscape descriptors

The Alpha Earth comparison tested whether a general-purpose foundation model could outperform domain-specific feature engineering for pre-fire severity prediction. The answer is no: AE_2023 (R² = 0.33) matches but does not exceed the combined explanatory power of landscape features plus antecedent drought (T2, R² = 0.34). The 64-dimensional embeddings likely encode a mixture of the same information captured by NDVI, terrain, and vegetation type, plus some implicit drought signal from seasonal spectral patterns — but this does not translate into additional predictive power.

The result underscores the value of mechanistically-reasoned features for fire severity analysis. Hand-picked variables like soil moisture and VPD provide equivalent predictive power to a foundation model trained on petabytes of global satellite data, while remaining interpretable and directly informative for land management. The temporal specificity of features like burn-date weather and satellite-detected fire intensity (T3, T4) — which annual embeddings cannot capture — adds a further +0.12 R² that no landscape embedding can provide.

Alpha Earth embeddings may still have value for rapid transferability to new fires where the full GEE feature extraction pipeline has not been set up, since the embeddings are pre-computed globally. But for a single well-studied fire, purpose-built features are preferable.

Signal validation

The 4-zone validation (park/farmland × burnt/unburnt) highlights an important methodological caution: raw dNBR is unreliable in agricultural landscapes. Seasonal crop phenology produces dNBR signals comparable to low-severity fire. The Grampians National Park boundary from WDPA provided a clean analytical mask, and the near-zero dNBR in unburned park areas confirms the index is valid within this domain. Note that the analysis is restricted to burned area within the national park, which represents a subset of the total ~135,000 ha fire extent; severity patterns in surrounding farmland and state forest are excluded.

Spatial block cross-validation R² is consistently lower than random cross-validation (e.g., T6 spatial 0.81 vs random 0.84), confirming that spatial autocorrelation inflates naive accuracy estimates. The spatial CV results provide an honest assessment of how the model would generalise to unseen areas of the fire.

Limitations

The weak FRP–severity correlations represent a confirmed null result rather than a data gap. Resolution mismatch is the primary driver: dNBR is measured at 20m, VIIRS FRP at 375m, and Himawari FRP at ~2km, meaning each FRP pixel integrates over vastly heterogeneous severity outcomes. This finding is consistent with Nguyen et al. (2024), who found that FRP–severity relationships weaken substantially at coarser resolutions, and Heward et al. (2013), who reported similarly weak correlations in Australian eucalypt forests. ERA5 weather is at ~9km resolution, meaning all points within a grid cell on a given burn date share the same weather values; finer-resolution weather (e.g., ACCESS or downscaled reanalysis) could improve T3. Sentinel-1 SAR data (dVH) may underperform in areas of steep topography, where geometric distortion (layover, foreshortening, and radar shadow) is greatest. The recovery trajectory is based on 14 months of data; longer monitoring would improve severity confirmation.
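The scale of the resolution mismatch is easy to quantify: the number of 20 m dNBR pixels covered by one coarser pixel is the squared ratio of the pixel sizes. The sketch below uses the nominal resolutions quoted above and ignores scan-angle growth of the real sensor footprints.

```python
def fine_pixels_per_coarse(coarse_m: float, fine_m: float = 20.0) -> float:
    """Number of fine-resolution pixels covered by one coarse pixel (nominal sizes)."""
    return (coarse_m / fine_m) ** 2


per_viirs = fine_pixels_per_coarse(375)      # VIIRS active fire, 375 m
per_himawari = fine_pixels_per_coarse(2000)  # Himawari FRP, ~2 km
per_era5 = fine_pixels_per_coarse(9000)      # ERA5-Land weather, ~9 km

print(f"VIIRS: {per_viirs:.0f}, Himawari: {per_himawari:.0f}, ERA5-Land: {per_era5:.0f}")
```

Each VIIRS value therefore stands in for roughly 350 separate dNBR measurements, and each Himawari value for about 10,000.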

We use raw dNBR rather than relativised alternatives such as RBR (dNBR / [NBR_pre + 1.001]; Parks et al. 2014). Relativisation adjusts for pre-fire vegetation differences and is preferable for cross-ecosystem comparison, as demonstrated by GEEBAM’s national deployment. A sensitivity analysis retraining key tiers with RBR as the target confirmed that within the Grampians the difference is modest: T1 R² = 0.22 vs 0.25, T4 = 0.44 vs 0.46, T6 = 0.79 vs 0.81 — RBR is slightly harder to predict because relativisation removes the pre-fire vegetation signal that the RF already captures via pre-fire NDVI. However, the Otway cross-validation demonstrated that RBR does not rescue dNBR in wet eucalypt forest (ρ = -0.12), indicating that the cross-ecosystem limitation of NIR/SWIR indices is not a scaling problem that relativisation can address. Extending this approach to wet forest vegetation types would benefit from red-edge indices (dCIre) or a multi-index response variable rather than RBR alone.

Low-severity burns, where fire affects the understorey beneath intact canopy, are the most challenging to detect from optical sensors (Collins et al. 2018). Our inclusion of Sentinel-1 SAR backscatter change (dVH) partially addresses this, as SAR penetrates the canopy to detect structural changes invisible to optical indices.

Next steps

Integration of Himawari geostationary FRP (10-minute temporal resolution, ~2km spatial) could fill the gap between VIIRS passes and improve during-fire severity prediction. Higher-resolution fire weather data from ACCESS or BoM station observations could improve T3. DEECA’s operational fire severity classification for the Grampians is pending (currently BURNT_UNKNOWN); when released, comparison using the methodology of Gibson et al. (2020) would provide a semi-independent benchmark, though both approaches share Sentinel-2 as a common data source. Truly independent validation would require aerial photography or spaceborne LiDAR (GEDI/ICESat-2) canopy height change.

Data Sources

Dataset	Source	Resolution
Fire perimeter	Vic Gov Fire History WFS (season 2025)	Vector
Park boundary	WDPA (WCMC/WDPA/current/polygons)	Vector
Sentinel-2 SR	COPERNICUS/S2_SR_HARMONIZED	10–20m
Sentinel-1 SAR	COPERNICUS/S1_GRD (IW, VH)	10m
VIIRS active fire	NASA LANCE SNPP + NOAA-20 C2	375m
DEM	USGS SRTM 30m	30m
ERA5-Land daily	ECMWF/ERA5_LAND/DAILY_AGGR	~9km
Vegetation type	ESA WorldCover v200	10m
Alpha Earth embeddings	GOOGLE/SATELLITE_EMBEDDING/V1/ANNUAL	10m, 64-d
Landsat 8/9 LST	LANDSAT/LC08-09/C02/T1_L2 (ST_B10)	100m
ALOS-2 PALSAR-2	JAXA/ALOS/PALSAR-2/Level2_2/ScanSAR (HV)	25m

Analysis by James Maino. Built with Google Earth Engine, Leaflet, and Chart.js.

Zone	dNBR mean	dNBR median	dNBR SD
Park burnt	+0.495	+0.504	0.228
Park unburnt	-0.039	-0.043	0.113
Farm burnt	+0.460	+0.445	0.201
Farm unburnt	+0.358	+0.367	0.287
