// GEOSPATIAL AI
Satellite-based forest fire prediction for Uttarakhand using multi-source geospatial data and XGBoost.
Architected a predictive system for forest fire risk in Uttarakhand by integrating multi-source geospatial data from NASA FIRMS, ISRO Bhuvan/Bhoonidhi, and ECMWF ERA5. An XGBoost classifier was tuned for extreme class imbalance across 450,000+ samples and reached 75% recall on fire events.
Uttarakhand is one of India's most fire-prone states; its dense forest cover and challenging terrain make ground-based monitoring impractical. Early prediction from satellite data can support a timely forest department response and evacuation. The core challenge is extreme class imbalance: fire events are rare relative to non-fire observations.
Name
Multi-Source Geospatial Fusion Dataset
Metadata Matrix Size
450,000+ samples
Telemetry Source
NASA FIRMS, ISRO Bhuvan, ISRO Bhoonidhi, ECMWF ERA5
Pipeline Preprocessing Steps
Core Pattern
XGBoost Classifier
Technology Stack
scikit-learn + XGBoost
Key Components
Data Acquisition
Download ERA5, FIRMS, DEM (digital elevation model), and LULC (land use/land cover) datasets for the Uttarakhand bounding box.
Raster Preprocessing
GDAL-based CRS alignment, resampling, and clipping to the study region.
Feature Engineering
Per-pixel feature extraction: wind speed, humidity, NDVI, slope, elevation, LULC class.
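A minimal sketch of how three of these per-pixel features could be derived. It assumes wind speed comes from ERA5's eastward and northward 10 m wind components, NDVI from near-infrared and red reflectance, and slope from central differences on the DEM. The function names and the toy DEM are illustrative and are not taken from the project code.

```python
import math

def wind_speed(u10, v10):
    """Wind speed magnitude (m/s) from eastward (u) and northward (v) components."""
    return math.hypot(u10, v10)

def ndvi(nir, red):
    """Normalized Difference Vegetation Index; 0.0 when both bands are zero."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0

def slope_degrees(dem, row, col, cell_size):
    """Slope in degrees at an interior DEM cell, using central differences.

    dem is a list of rows of elevations (m); cell_size is the grid spacing (m).
    """
    dz_dx = (dem[row][col + 1] - dem[row][col - 1]) / (2 * cell_size)
    dz_dy = (dem[row + 1][col] - dem[row - 1][col]) / (2 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))

# Toy DEM (hypothetical): a plane rising 30 m per 30 m cell toward the east.
toy_dem = [
    [0.0, 30.0, 60.0],
    [0.0, 30.0, 60.0],
    [0.0, 30.0, 60.0],
]
centre_slope = slope_degrees(toy_dem, 1, 1, cell_size=30.0)
```

On the toy DEM the centre cell has a rise equal to its run, so the slope is 45 degrees. Humidity, elevation, and LULC class would be read directly from their rasters rather than derived.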
Label Generation
FIRMS fire hotspots matched to the spatial grid to produce per-cell fire labels.
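One way the hotspot-to-grid matching could look, as a sketch only. It assumes a regular latitude/longitude grid with row 0 at the northern edge and treats each hotspot as a (lat, lon) pair. The grid origin, cell size, and hotspot coordinates below are illustrative values, not the project's actual grid, and temporal matching (assigning detections to a date) is left out.

```python
import math

def cell_index(lat, lon, lat_max, lon_min, cell_deg):
    """Row/col of the grid cell containing a point; row 0 is the northern edge."""
    row = math.floor((lat_max - lat) / cell_deg)
    col = math.floor((lon - lon_min) / cell_deg)
    return row, col

def label_grid(hotspots, lat_max, lon_min, n_rows, n_cols, cell_deg):
    """Return an n_rows x n_cols grid with 1 where any hotspot falls, else 0."""
    labels = [[0] * n_cols for _ in range(n_rows)]
    for lat, lon in hotspots:
        row, col = cell_index(lat, lon, lat_max, lon_min, cell_deg)
        # Hotspots outside the study grid are ignored.
        if 0 <= row < n_rows and 0 <= col < n_cols:
            labels[row][col] = 1
    return labels

# Illustrative inputs: one hotspot inside the grid, one far outside it.
hotspots = [(30.3, 78.1), (25.0, 70.0)]
labels = label_grid(hotspots, lat_max=31.5, lon_min=77.5,
                    n_rows=4, n_cols=6, cell_deg=0.5)
```

Cells without a detection default to the non-fire class, which is where the class imbalance comes from.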
Class Balancing
SMOTE oversampling plus XGBoost's scale_pos_weight to handle the 450,000+ imbalanced samples.
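For illustration, the usual starting value for scale_pos_weight is the ratio of negative to positive training samples, which is the heuristic suggested in the XGBoost parameter documentation. The sketch below computes it for a made-up label list; the 5% fire rate is hypothetical and not the project's actual class ratio.

```python
def scale_pos_weight(labels):
    """Ratio of negative to positive samples, for use as XGBoost's scale_pos_weight."""
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0:
        raise ValueError("no positive samples; the ratio is undefined")
    return neg / pos

# Hypothetical training labels: 50 fire cells among 1,000 samples.
example_labels = [1] * 50 + [0] * 950
weight = scale_pos_weight(example_labels)  # 950 / 50 = 19.0
```

If SMOTE has already oversampled the fire class, the ratio would presumably be computed on the resampled training set, since applying the original ratio on top of oversampling corrects for the imbalance twice.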
Training
XGBoost with early stopping and recall-focused threshold tuning.
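Recall-focused threshold tuning can be sketched as follows: given predicted fire probabilities on a validation set, pick the highest decision threshold that still meets a target recall, so that precision is sacrificed no more than necessary. The selection rule, the helper names, and the scores below are assumptions for illustration; the source does not state how the threshold was chosen.

```python
def recall_at(y_true, scores, threshold):
    """Recall of the positive class when predicting 1 for scores >= threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    return tp / (tp + fn) if (tp + fn) else 0.0

def pick_threshold(y_true, scores, target_recall):
    """Highest observed score usable as a threshold that meets the target recall.

    Returns None if no threshold reaches the target (e.g. no positives).
    """
    for t in sorted(set(scores), reverse=True):
        if recall_at(y_true, scores, t) >= target_recall:
            return t
    return None

# Made-up validation labels and model scores.
y_val = [1, 1, 1, 1, 0, 0, 0, 0]
p_val = [0.9, 0.7, 0.4, 0.2, 0.6, 0.3, 0.1, 0.05]
best = pick_threshold(y_val, p_val, target_recall=0.75)
```

With these toy values the chosen threshold is 0.4, which catches three of the four fire samples at the cost of one false alarm (the non-fire sample scored 0.6).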
Evaluation
Spatial holdout evaluation on unseen geographic tiles.
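A spatial holdout differs from a random split in that whole geographic tiles are withheld, so nearby and highly correlated pixels cannot appear on both sides. The sketch below assumes square tiles defined in degrees and samples stored as dicts with "lat" and "lon" keys; the tile size, held-out tile, and coordinates are hypothetical.

```python
import math

def tile_id(lat, lon, tile_deg):
    """Integer (row, col) identifier of the tile containing a point."""
    return (math.floor(lat / tile_deg), math.floor(lon / tile_deg))

def spatial_split(samples, tile_deg, holdout_tiles):
    """Split samples into (train, test); every sample in a held-out tile goes to test."""
    train, test = [], []
    for s in samples:
        if tile_id(s["lat"], s["lon"], tile_deg) in holdout_tiles:
            test.append(s)
        else:
            train.append(s)
    return train, test

# Illustrative samples; the first two fall in the same 0.5-degree tile.
samples = [
    {"lat": 30.2, "lon": 78.6, "fire": 1},
    {"lat": 30.3, "lon": 78.7, "fire": 0},
    {"lat": 29.4, "lon": 79.5, "fire": 0},
    {"lat": 31.0, "lon": 80.2, "fire": 0},
]
train, test = spatial_split(samples, tile_deg=0.5, holdout_tiles={(60, 157)})
```

Both samples in tile (60, 157) land in the test set together, and no tile contributes to both sets.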
75%
Recall (Fire Events)
450,000+
Samples Processed
4
Data Sources