COMPARISON OF THE EFFICIENCY OF UNSUPERVISED MACHINE LEARNING METHODS FOR DETECTING ANOMALIES IN OBD2 DATA
DOI:
https://doi.org/10.31891/2219-9365-2025-81-52
Keywords:
unsupervised machine learning, remote diagnostics, OBD2, diagnostic trouble codes
Abstract
This article reports an experimental study on identifying anomalous engine conditions from signals recorded by vehicle sensors. The data were acquired through an OBD2 adapter from a diesel Honda CR-V in good working order. The signals analyzed are vehicle speed, engine coolant temperature, engine oil temperature, engine speed (RPM), absolute accelerator pedal position, engine load, and fuel consumption.
Before the anomaly detection algorithms were applied, the raw data were preprocessed in three steps: filtering to remove noise and inconsistent records, organization into a consistent structure, and normalization so that differences in scale and range between signals do not let any single feature dominate the analysis.
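The filtering and normalization steps described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the column names in `SIGNALS`, the rule of discarding any record with a missing or non-finite value, and the choice of z-score scaling are all assumptions, since the abstract does not specify them.

```python
import math
from statistics import mean, pstdev

# Hypothetical field names; the abstract lists the signals but not how they are labeled.
SIGNALS = ["speed_kmh", "coolant_temp_c", "oil_temp_c", "rpm",
           "pedal_pos_pct", "engine_load_pct", "fuel_rate_lph"]

def drop_incomplete(rows):
    """Filtering step (assumed rule): keep rows where every signal is a finite number."""
    clean = []
    for row in rows:
        values = [row.get(name) for name in SIGNALS]
        if all(isinstance(v, (int, float)) and math.isfinite(v) for v in values):
            clean.append([float(v) for v in values])
    return clean

def standardize(matrix):
    """Normalization step (assumed z-score): rescale each column to mean 0, std 1."""
    columns = list(zip(*matrix))
    means = [mean(col) for col in columns]
    # A constant column has zero spread; divide by 1.0 instead to avoid a ZeroDivisionError.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in matrix]
```

After `standardize(drop_incomplete(rows))`, every signal is expressed in units of its own standard deviation, so distance-based detectors no longer weight RPM (thousands) more heavily than pedal position (tens).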
The core of this research is a comparative evaluation of eight unsupervised machine learning methods for anomaly detection in engine health monitoring: Isolation Forest, which isolates outliers efficiently through random partitioning; the One-Class Support Vector Machine (OCSVM), which learns a boundary around normal data; Autoencoder Reconstruction Error, which flags samples that the trained network reconstructs poorly; Principal Component Analysis (PCA), which uses dimensionality reduction to expose deviations from the principal components; K-means Distance, which flags data points far from cluster centroids; the Local Outlier Factor (LOF), which compares the local density of a point with that of its neighbors; the Gaussian Mixture Model (GMM), which models the data as a mixture of Gaussian distributions and flags low-probability points; and Deep Support Vector Data Description (Deep SVDD), a deep learning approach that learns a compact hypersphere around normal data.
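As one concrete instance of the methods listed above, the K-means Distance idea can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the implementation evaluated in the article: the function names, the number of clusters, and the `contamination` threshold rule are hypothetical choices. Centroids are fitted on data assumed to be normal, and new samples are scored by their distance to the nearest centroid.

```python
import math
import random

def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns k centroids fitted to `points`."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the previous centroid if a cluster went empty
                centroids[i] = [sum(axis) / len(members) for axis in zip(*members)]
    return centroids

def anomaly_scores(points, centroids):
    """Score = distance to the nearest centroid; larger means more unusual."""
    return [min(math.dist(p, c) for c in centroids) for p in points]

def flag_anomalies(scores, contamination=0.05):
    """Flag the highest-scoring `contamination` fraction (an assumed threshold rule)."""
    n_flagged = max(1, round(contamination * len(scores)))
    cutoff = sorted(scores, reverse=True)[n_flagged - 1]
    return [s >= cutoff for s in scores]
```

The other detectors follow the same fit, score, threshold pattern with a different scoring function, which is what makes a side-by-side comparison possible. Ready-made implementations of several of them exist in scikit-learn (for example `IsolationForest`, `OneClassSVM`, `LocalOutlierFactor`, and `GaussianMixture`).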
The study compares how well these unsupervised learning techniques detect abnormal engine states in real-world vehicle sensor data. The results are relevant to the development of proactive vehicle maintenance systems, where early detection of potential engine malfunctions contributes to vehicle safety and reliability, and they offer guidance for selecting an appropriate anomaly detection approach for automotive diagnostics.
License
Copyright (c) 2025 Любомир МАТІЙЧУК, Володимир ГОТОВИЧ, Віталій БОНАР

This work is licensed under a Creative Commons Attribution 4.0 International License.