Model-agnostic interpretability of deep learning models for car emissions assessment

This work has been partially supported by the Spoke 9 “Digital Society & Smart Cities” of ICSC – Centro Nazionale di Ricerca in High Performance-Computing, Big Data and Quantum Computing, funded by the European Union – NextGenerationEU (PNRR-HPC, CUP: E63C22000980007)

Alfonso Iodice D’Enza, Andrea Mancuso, Zubair Ahmad, Francesco Palumbo

drive green, drive safe

sustainability

aggressive driving can increase fuel consumption and CO₂ emissions by up to 40% [McConky et al., 2018; Xu et al., 2017].
green driving improves efficiency and reduces emissions [Zhou et al., 2016].

safety

aggressive driving is correlated with higher crash risk [Adavikottu & Velaga, 2021].
structured green driving programs have achieved up to 10% fuel savings and a 33% reduction in property-damage accidents [Nævestad, 2022].

Promoting and incentivizing green driving for sustainability and safety, while delivering clear benefits to insurers, fleet operators, and policymakers.

ECOSCORING project

It is a project aimed at developing an eco-scoring system to assess driving behavior and promote eco-friendly driving practices via discounts on insurance premiums.

The ECOSCORING project stems from a partnership between Intesa Sanpaolo Insurance and the University of Naples Federico II.

boxes data

Customers of the insurance company have telematics devices (black boxes) installed in their cars.

The insurance company provided access to the box data of 100,000 customers over 6 months (approximately 600 million records).

boxes-recorded features:

date and timestamp of the recording
latitude and longitude
speed and distance traveled between two recordings
the boxes record data every 2 minutes during trips

insurance company customers-related features:

car characteristics (e.g., fuel type, engine size, power, segment, registration year)
customer characteristics (e.g. age class, geographical area)

intuition

Driving behavior/style impacts emissions beyond “the more you drive, the more you pollute”.
Eco-driving features (e.g., smooth acceleration, steady speed) can reduce emissions: they are not directly observed in the boxes data.
two minutes in between recordings is too coarse to capture them.

tentative solution: focus on driving styles

detect aggressive driving patterns from the boxes data using API services (e.g., Google Maps or TomTom)

for each road segment (between two recordings), get the expected travel time at the same time of day and day of the week it was driven.
compare expected vs actual time → infer if the driver was aggressive or not.
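
The comparison above can be sketched as follows. This is a minimal illustration, not the project's method: the function names and the 1.25 threshold are hypothetical, and the expected travel time is assumed to come from a routing API.

```python
def speed_ratio(expected_s: float, actual_s: float) -> float:
    """Expected / actual travel time; values above 1 mean faster than expected."""
    if expected_s <= 0 or actual_s <= 0:
        raise ValueError("travel times must be positive")
    return expected_s / actual_s


def is_aggressive(expected_s: float, actual_s: float, threshold: float = 1.25) -> bool:
    """Flag a segment that was covered markedly faster than expected.

    The 1.25 threshold is a hypothetical placeholder, not a calibrated value.
    """
    return speed_ratio(expected_s, actual_s) > threshold


# Example: a segment expected to take 120 seconds.
print(is_aggressive(120, 90))   # covered in 90 s, ratio about 1.33
print(is_aggressive(120, 125))  # covered in 125 s, ratio about 0.96
```

A ratio rather than a difference keeps short and long segments comparable.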

what’s good: the expected time takes into account contextual factors (e.g., traffic, road type).

what’s bad:

free access to these services is limited (e.g., 2500 requests/day for Google Maps).
too expensive for large-scale applications.
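
A back-of-the-envelope calculation shows the scale of the problem. The sketch assumes one API request per recorded segment, which is an assumption: the slides only give the free quota and the approximate record count.

```python
import math

RECORDS = 600_000_000          # approximate number of box records
FREE_REQUESTS_PER_DAY = 2_500  # free quota quoted above for Google Maps

# Assumption: one request per record (i.e., per road segment).
days_needed = math.ceil(RECORDS / FREE_REQUESTS_PER_DAY)
years_needed = days_needed / 365

print(f"{days_needed:,} days (about {years_needed:.0f} years) on the free tier")
```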

use free services? e.g. OpenStreetMap:

it does not provide travel time estimates
the number of requests per minute and/or per day is still limited: too slow for large-scale applications.

change of plans: estimate emissions

estimate emissions directly enriching the boxes data with contextual information (e.g., road slope, type, time of the day/day of the week).
use a state-of-the-art emission model (pre-trained) to generate training data.

MOVES

The MOVES framework refers to the Motor Vehicle Emission Simulator (MOVES):

MOVES is a state-of-the-art framework for emission modeling from telematics data [Koupal et al., 2003], developed and maintained by the U.S. Environmental Protection Agency (EPA).
It is continually updated, with extensive empirical validation against laboratory testing, roadside monitoring, and fuel consumption studies.
It provides legally defensible estimates [Park et al., 2016].

MOVES

The MOVES framework is rigid and computationally intensive.

Impractical for real-time eco-scoring assessment.
Derivative tools (e.g., MOVEStar, [Wang et al., 2020]) improve usability but lose accuracy.
Surrogate models trained on MOVES outputs, combining telematics with contextual data, have been proposed in the literature [Xu et al., 2021; Chen et al., 2020].

Among them, NeuralMOVES shows desirable features [Ramirez-Sanchez et al., 2025].

NeuralMOVES

A lightweight neural-network surrogate of the EPA’s MOVES model.

NeuralMOVES is trained on millions of MOVES-generated scenarios:

it essentially learns to emulate MOVES.
the discrepancies between NeuralMOVES and MOVES predictions are limited for most pollutants and driving conditions.
the pre-trained NeuralMOVES model is publicly available

NeuralMOVES

We use the pre-trained NeuralMOVES model to estimate emissions from the boxes data.

input features:

Vehicle dynamics: speed, acceleration.
Vehicle characteristics: type, age, fuel type.
Environmental context: road grade, temperature, humidity, traffic.

to use NeuralMOVES for emission estimation, data pre-processing and enrichment are needed

learning flow

pre-processing and enrichment

NeuralMOVES required arguments

road slope
air temperature and humidity
acceleration

None of them is available in the boxes data.

Enrichment process

slope: altitude is retrieved from latitude/longitude for each pair of recordings; the slope is then derived via basic geometry.
temperature and humidity: from a MeteoWeb.it dataset; each recording is assigned the temperature and humidity of the nearest province (straight-line distance) for the month of the recording.
acceleration: \(\frac{Speed_2 - Speed_1}{\Delta_{time}}\) between two successive records in a trip.
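
The three enrichment steps can be sketched as below. This is an illustration under stated assumptions, not the project's pipeline: speeds are assumed to be in km/h, the recorded distance is assumed to be the distance travelled along the road, "straight-line distance" is taken as the great-circle (haversine) distance, and the province coordinates are approximate and purely illustrative. All function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def slope_percent(alt1_m, alt2_m, path_m):
    """Road grade in percent: rise over horizontal run.

    Assumes path_m is the distance travelled along the road, so the
    horizontal run follows from Pythagoras.
    """
    rise = alt2_m - alt1_m
    run_sq = path_m ** 2 - rise ** 2
    if run_sq <= 0:
        return float("nan")  # inconsistent altitude/distance pair
    return 100.0 * rise / math.sqrt(run_sq)


def acceleration_ms2(speed1_kmh, speed2_kmh, dt_s):
    """(Speed_2 - Speed_1) / delta_time, converted from km/h to m/s^2."""
    if dt_s <= 0:
        raise ValueError("dt_s must be positive")
    return (speed2_kmh - speed1_kmh) / 3.6 / dt_s


def nearest_province(lat, lon, centroids):
    """Name of the province whose reference point is closest to (lat, lon)."""
    return min(centroids, key=lambda name: haversine_km(lat, lon, *centroids[name]))


# Approximate, illustrative coordinates only.
centroids = {"Napoli": (40.85, 14.27), "Roma": (41.90, 12.50)}
print(nearest_province(40.80, 14.30, centroids))
print(round(slope_percent(100.0, 110.0, 1000.0), 2))
print(round(acceleration_ms2(36.0, 72.0, 120.0), 3))
```

With records two minutes apart, the acceleration obtained this way is an average over the whole interval, so short bursts of acceleration or braking are smoothed out.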

data cleaning and taming?

of course!

model specification

our modeling task

NeuralMOVES produces emissions per segment (every 2 minutes).
We want a surrogate model to describe the drivers of emissions at the customer level.
Target: co2_g_km (grams per km).
Features:
- Driving behavior (avg speed, variability, trip structure)
- Vehicle (type, power, registration year)
- Driver (age, geography)

The surrogate should be accurate but above all interpretable.
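
Moving from segment-level estimates to a customer-level target could look like the sketch below. The field names (`customer_id`, `distance_km`, `co2_g`, `speed_kmh`) are hypothetical, and it is assumed that each segment carries an emission estimate in grams; the actual feature set is richer than the three quantities shown.

```python
from collections import defaultdict
from statistics import mean, pstdev


def customer_features(segments):
    """Aggregate segment-level records into one feature row per customer."""
    by_customer = defaultdict(list)
    for seg in segments:
        by_customer[seg["customer_id"]].append(seg)

    features = {}
    for cid, segs in by_customer.items():
        total_km = sum(s["distance_km"] for s in segs)
        total_co2 = sum(s["co2_g"] for s in segs)
        speeds = [s["speed_kmh"] for s in segs]
        features[cid] = {
            # Target: total grams divided by total km (distance-weighted).
            "co2_g_km": total_co2 / total_km if total_km > 0 else float("nan"),
            "avg_speed": mean(speeds),
            "speed_sd": pstdev(speeds),
        }
    return features


# Toy records, for illustration only.
segments = [
    {"customer_id": "A", "distance_km": 1.0, "co2_g": 150.0, "speed_kmh": 30.0},
    {"customer_id": "A", "distance_km": 3.0, "co2_g": 450.0, "speed_kmh": 50.0},
    {"customer_id": "B", "distance_km": 2.0, "co2_g": 400.0, "speed_kmh": 20.0},
]
print(customer_features(segments))
```

Dividing total grams by total kilometers, rather than averaging per-segment rates, keeps very short segments from dominating the target.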

candidate models

ensemble tree-based models

boosting (XGBoost)
Random Forest

additive models

generalized additive models (GAM)
- smooth effect curves
- great for communicating insights
Elastic Net: penalized linear regression
- sparse, linear baseline
- checks whether linear terms already suffice

Model comparison: accuracy vs interpretability

XGBoost: best MAE, robust for typical drivers
Random Forest: best RMSE, stable predictions
GAM: interpretable effects, but less accurate
Elastic Net: weak baseline, shows need for nonlinear models
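
Two models can rank differently under MAE and RMSE because RMSE penalizes large errors more heavily. The toy numbers below are invented for illustration and are not the project's results: model A is exact except for one large miss, while model B is moderately off everywhere.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


y = [100.0, 100.0, 100.0, 100.0]       # toy targets (g/km)
pred_a = [100.0, 100.0, 100.0, 140.0]  # exact for most, one large miss
pred_b = [115.0, 85.0, 115.0, 85.0]    # moderately off everywhere

print("A:", mae(y, pred_a), rmse(y, pred_a))
print("B:", mae(y, pred_b), rmse(y, pred_b))
```

Here A has the lower MAE and B the lower RMSE, the same kind of split reported above for XGBoost and Random Forest.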

global model-agnostic interpretation methods

What drives emissions? (XGBoost VIP)

variable importance: average contribution in error reduction

Driving behavior dominates → duration, speed profile, short trips.
Vehicle features matter less once behavior is accounted for.
Registration year (proxy for emissions standards) also plays a role.

The surrogate confirms that how you drive matters at least as much as what you drive.
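
The importance shown above is specific to tree boosting. A model-agnostic alternative, named here plainly as a different technique, is permutation importance: shuffle one feature and measure how much the error grows. The sketch below uses a toy model and toy data; all names are hypothetical.

```python
import random


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, y, n_features, n_repeats=5, seed=0):
    """Average increase in MAE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mae(y, [predict(r) for r in rows])
    importances = []
    for j in range(n_features):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild the rows with feature j replaced by its shuffled values.
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mae(y, [predict(r) for r in shuffled]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_predict(row):
    """Toy model: the prediction depends on feature 0 only."""
    return 2.0 * row[0]


rows = [[float(i), float(10 - i)] for i in range(8)]
y = [toy_predict(r) for r in rows]
imp = permutation_importance(toy_predict, rows, y, n_features=2)
print(imp)
```

A feature the model ignores gets an importance of exactly zero, since shuffling it leaves every prediction unchanged.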

SHAP Values

What are SHAP values?

SHAP (SHapley Additive exPlanations):
A method from cooperative game theory applied to ML interpretability.

game theory to ML

Game → Model prediction function 𝑓
Players → Features
Coalition → Subset of features
Payout → Model output
Gain → discrepancy between the prediction with and without the feature

The Shapley value is the average marginal contribution of a feature value across all possible coalitions.
SHAP uses sampling-based approximations to estimate the Shapley values.
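
For a handful of players the Shapley value can be computed exactly by enumerating all coalitions, which makes the definition concrete. The sketch below uses a toy payout function with invented numbers; the feature names are hypothetical. In practice the number of coalitions grows exponentially with the number of features, which is why approximations are used.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values: weighted average marginal contribution of each player."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):  # k = size of the coalition without p
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def toy_value(coalition):
    """Toy payout: two additive effects plus one interaction (invented numbers)."""
    v = 0.0
    if "avg_speed" in coalition:
        v += 10.0
    if "car_age" in coalition:
        v += 4.0
    if "avg_speed" in coalition and "short_trips" in coalition:
        v += 6.0  # interaction, split evenly between the two features
    return v


players = ["avg_speed", "short_trips", "car_age"]
phi = shapley_values(players, toy_value)
print(phi)
```

The values sum to the payout of the full coalition minus that of the empty one (the efficiency property), and the interaction term is shared equally between the two features involved.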

SHAP beeswarm


Accumulated Local Effects (ALE) plots

ALE plots show the average effect of a feature on the model’s predictions,
computed locally within bins of the feature range.
Unlike PDPs (Partial Dependence Plots), ALE avoids bias from extrapolations into sparse regions.
Interpretation:
- x-axis = feature values
- y-axis = change in predicted emissions relative to baseline
- slope/shape = how the model reacts to changes in that feature
ALE complements SHAP:
- SHAP → how features explain individual predictions
- ALE → how features influence the model globally
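
A simplified first-order ALE computation for a single feature might look like this. It is a sketch under assumptions: bin edges are supplied by hand (quantile-based edges are the usual choice), rows are plain lists, and the toy model is invented for illustration.

```python
def ale_1d(predict, rows, j, edges):
    """Centered first-order ALE of feature j, evaluated at the upper edge of each bin."""
    n_bins = len(edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for row in rows:
        # Locate the bin containing row[j]; out-of-range values are clamped.
        k = 0
        while k < n_bins - 1 and row[j] > edges[k + 1]:
            k += 1
        lo, hi = list(row), list(row)
        lo[j], hi[j] = edges[k], edges[k + 1]
        # Local effect: prediction change across the bin, other features kept as observed.
        sums[k] += predict(hi) - predict(lo)
        counts[k] += 1
    local = [s / c if c else 0.0 for s, c in zip(sums, counts)]

    # Accumulate the local effects across bins.
    accumulated, running = [], 0.0
    for effect in local:
        running += effect
        accumulated.append(running)

    # Center so that the data-weighted average effect is zero.
    center = sum(a * c for a, c in zip(accumulated, counts)) / sum(counts)
    return [a - center for a in accumulated]


def toy_predict(row):
    """Toy model, linear in both features."""
    return 3.0 * row[0] + row[1]


rows = [[0.5, 1.0], [1.5, 2.0], [2.5, 0.0], [3.5, 5.0]]
ale = ale_1d(toy_predict, rows, j=0, edges=[0.0, 1.0, 2.0, 3.0, 4.0])
print(ale)
```

Because each difference is taken only for observations that fall in the bin, the model is never evaluated at feature combinations far from the data, which is the advantage over PDPs noted above.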

ALE: trip duration

Interpretation
- Longer trips → steadily increasing emissions per km.
- Indeed, the more you drive, the more you pollute.

ALE: average speed

Interpretation
- Very low speeds (0–10 km/h) → high emissions (stop-and-go traffic).
- Around 30–40 km/h → lowest emissions (efficient steady cruising).

ALE: speed variation

Interpretation
- Small variability (steady driving) → lower emissions.
- Large variability (frequent acceleration/braking) → higher emissions.

local model-agnostic interpretation methods

customer A: speed variability profile

Interpretation

– Very high variation → higher emissions (stop–go or erratic).

– Moderate variation (10–20 km/h) → lowest emissions.

customer B: short trips profile

cp_rate_short_trip

Interpretation

– Moderate ratio (20–50%) → lowest emissions.

– High ratio (>80%) → strong increase in emissions (frequent cold starts).

customer C: old car

Interpretation

Emissions decrease clearly with newer registration years.
Old vehicles dominate baseline CO₂, regardless of driving style.
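
Per-customer profiles of this kind can be produced by a what-if (ceteris paribus) computation: take one customer, vary a single feature over a grid, and keep everything else fixed. That the profiles above were obtained this way is an assumption suggested by the `cp_` prefix; the model and numbers below are a toy illustration with hypothetical names.

```python
def ceteris_paribus(predict, instance, feature, grid):
    """Predictions for one instance as a single feature varies over a grid."""
    profile = []
    for value in grid:
        what_if = dict(instance)  # copy, so the original instance is untouched
        what_if[feature] = value
        profile.append((value, predict(what_if)))
    return profile


def toy_predict(x):
    """Toy emission model: lowest at a moderate speed variability (invented)."""
    return 100.0 + 2.0 * abs(x["speed_sd"] - 15.0) + x["car_age"]


customer = {"speed_sd": 30.0, "car_age": 10.0}
profile = ceteris_paribus(toy_predict, customer, "speed_sd", [5.0, 15.0, 25.0])
for value, prediction in profile:
    print(value, prediction)
```

Comparing the customer's current value with the minimum of the profile points to the behavioral change with the largest predicted benefit.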

conclusions

In large-scale applications, using a large pre-trained network (when available) is often the way to go.
Interpretability is key to gathering insights and justifying predicted values.
Models can be translated into actionable eco-driving recommendations for each customer.
Next step (already in progress): integrate the recommendations into the eco-scoring system of the insurer.
To assess and identify aggressive driving patterns, higher-granularity data (e.g., 1-second intervals) would be ideal.
It would then be possible to correlate aggressive driving directly with both safety and sustainability.

References

McConky, K., Chen, R. B., & Gavi, G. R. (2018). A comparison of motivational and informational contexts for improving eco-driving performance. Transportation Research Part F, 52, 62–74. Elsevier.
Xu, Y., Li, H., Liu, H., Rodgers, M. O., & Guensler, R. L. (2017). Eco-driving for transit: An effective strategy to conserve fuel and emissions. Applied Energy, 194, 784–797. Elsevier.
Zhou, M., Jin, H., & Wang, W. (2016). A review of vehicle fuel consumption models to evaluate eco-driving and eco-routing. Transportation Research Part D, 49, 203–218. Elsevier.
Adavikottu, A., & Velaga, N. R. (2021). Analysis of factors influencing aggressive driver behavior and crash involvement. Traffic Injury Prevention, 22(sup1), S21–S26. Taylor & Francis.
Nævestad, T.-O. (2022). Eco driving as a road safety measure: Before and after study of three companies. Transportation Research Part F, 91, 95–115. Elsevier.
Koupal, J., Cumberworth, M., Michaels, H., & Beardsley, M. (2003). Design and implementation of MOVES: EPA’s new generation mobile source emission model. Ann Arbor, 1001(48), 105.
Park, S., Lee, J.-B., & Lee, C. (2016). State-of-the-art automobile emissions models… KSCE Journal of Civil Engineering, 20(3), 1053–1065. Elsevier.
Wang, Z., Wu, G., & Scora, G. (2020). MOVESTAR: An open-source vehicle fuel and emission model based on USEPA MOVES. arXiv:2008.04986.
Xu, J., Tu, R., Ahmed, U., Amirjamshidi, G., Hatzopoulou, M., & Roorda, M. J. (2021). An eco-score system… Transportation Research Part D, 95, 102866. Elsevier.
Alam, M. S., & McNabola, A. (2018). Network-wide traffic and environmental impacts… Transportation Planning and Technology, 41(3), 244–264. Taylor & Francis.
Chen, M.-C., Yeh, C.-T., & Wang, Y.-S. (2020). Eco-driving for urban bus with big data analytics. Journal of Ambient Intelligence and Humanized Computing, 1–13. Springer.
Ramirez-Sanchez, E., Tang, C., Xu, Y., … Wu, C. (2025). NeuralMOVES: A lightweight and microscopic vehicle emission estimation model… arXiv:2502.04417.
Svozil, D., Kvasnicka, V., & Pospichal, J. (1997). Introduction to multi-layer feed-forward neural networks. Chemometrics and Intelligent Laboratory Systems, 39(1), 43–62. Elsevier.