Machine learning for anomaly detection in the automotive industry

Most innovative manufacturing companies use a distributed system of independent sensors and modules on machines along their production lines and throughout their factories, to aggregate data in a central database – a single source of truth. This data can then be analyzed for anomalies, also known as outliers.

[Figure: Anomaly detection example. Sometimes anomalies in data stick out like a sore thumb, but other times they are harder to find.]

Precision manufacturers, such as automakers, rely on anomaly detection to ensure healthy distributed manufacturing systems, to optimize production throughput, and to maintain product quality standards.

Complexities of anomaly detection

Quality control programs for manufacturing use anomaly detection as one method of identifying outliers in a dataset. Anomalies could be signs of abnormal network traffic, they could indicate a malfunctioning sensor on the production line, or they could point to data that needs to be cleaned before it is analyzed. Analyzing, monitoring, and managing manufacturing data systems can be a complex exercise.

Anomaly detection can be used on numerous sources of data to pinpoint errors on the production line, enhance root cause analysis, and notify technical teams of any issues that require their attention – but it is not a simple operation.

The problem with traditional anomaly detection systems is that they are difficult to build from scratch: they require extensive domain knowledge and a deep understanding of how to analyze production data. Manually built anomaly detection solutions are not only expensive but also rely on fixed thresholds, which limits their capabilities and makes them prone to failure, because they cannot scale to the needs of automotive manufacturing data analytics.

Anomaly detection using machine learning

Anomaly detection solutions that employ machine learning (ML) are powerful tools, especially in precision manufacturing environments in the automotive industry. They can process massive amounts of production data to identify vulnerabilities, pinpoint existing anomalies, and predict failures before they occur.

Machine learning and artificial intelligence (ML/AI) systems are helping engineers work more efficiently and improving their anomaly detection success rate. ML/AI systems are faster and more accurate at identifying anomalous data than traditional methods, and they can conduct comprehensive, real-time analysis on large datasets to deliver actionable insights and improved results.

[Figure: Man running slowly versus car driving fast. Left: the speed at which a human can identify anomalous data. Right: the speed at which an ML/AI solution can identify anomalous data.]

Three styles of machine learning used for anomaly detection

Machine learning uses algorithms to detect, analyze, and react to anomalies in data. The algorithms are classified into three styles of machine learning, according to whether they process labeled data, unlabeled data, or both.

1. Supervised machine learning

In a supervised machine learning environment, the algorithms use labeled data, where every datapoint in the dataset is already labeled as anomalous or nominal.

[Figure: Supervised machine learning diagram]

2. Unsupervised machine learning

In an unsupervised machine learning environment, the algorithms analyze datasets with unlabeled datapoints.

[Figure: Unsupervised machine learning diagram]

3. Semi-supervised machine learning

Semi-supervised machine learning combines the two styles above: the algorithms analyze datasets that contain both labeled and unlabeled data.

The style of machine learning used for anomaly detection depends on the available manufacturing data and the project. Sometimes, manufacturers use a combination of styles across production lines and factories.
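As a rough illustration of how the first two styles differ, here is a minimal Python sketch that uses only the standard library. The torque readings, the function names, and the 3.5 cut-off are hypothetical choices made for this example, not values from a real production line. The supervised function learns a cut-off from readings that are already labeled, while the unsupervised function needs no labels and uses a median-based score (the modified z-score), which is just one simple technique among many.

```python
from statistics import median


def fit_supervised_threshold(values, labels):
    """Supervised: learn a cut-off from readings labeled True (anomalous) or False (nominal).

    Assumes anomalies are unusually HIGH readings and that the two classes do not overlap.
    """
    nominal = [v for v, is_anomaly in zip(values, labels) if not is_anomaly]
    anomalous = [v for v, is_anomaly in zip(values, labels) if is_anomaly]
    # Place the cut-off midway between the largest nominal and the smallest anomalous reading.
    return (max(nominal) + min(anomalous)) / 2


def unsupervised_flags(values, cutoff=3.5):
    """Unsupervised: flag readings that sit far from the median, with no labels required."""
    med = median(values)
    # Median absolute deviation (MAD): a spread estimate that outliers barely affect.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return [False] * len(values)
    # The 0.6745 factor rescales the MAD so the score is comparable to a standard z-score.
    return [abs(0.6745 * (v - med) / mad) > cutoff for v in values]


# Hypothetical torque readings; the last one is an obvious outlier.
readings = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 14.5]
labels = [False, False, False, False, False, False, True]

threshold = fit_supervised_threshold(readings, labels)
flags = unsupervised_flags(readings)
print(threshold)
print(flags)
```

A semi-supervised variant could, for example, use a handful of labeled readings to tune the `cutoff` value applied to a much larger set of unlabeled readings.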


Three elements of a successful ML/AI anomaly detection system

Implementing an effective ML/AI anomaly detection system requires a great deal of data, an understanding of how the available data is structured, and an understanding of how it relates to the problem at hand.

1. Big data

The larger the dataset, the better ML/AI anomaly detection software can perform. Machine learning requires large datasets to make predictions that can be validated, because the bigger the dataset, the more clearly anomalous data stands out.

2. Data structure

Structured data is the easiest to work with, as it is clearly labeled and defined. Any anomalous data within a structured dataset is easily identified as an item or event that deviates from the majority of the data and the predefined behaviours. For example, if a scale on a production line is set to signal any units with a mass exceeding 30 kilograms (the threshold), a unit weighing 33 kilograms would send a signal to the manufacturing data collection software that it has exceeded the threshold, indicating an anomaly.
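The scale example can be sketched in a few lines of Python. The 30 kg threshold comes from the example above; the function name and the sample readings are hypothetical and only illustrate the check.

```python
MASS_THRESHOLD_KG = 30.0  # threshold from the scale example


def exceeds_threshold(mass_kg, threshold_kg=MASS_THRESHOLD_KG):
    """Return True when a unit's mass is strictly above the threshold."""
    return mass_kg > threshold_kg


# Hypothetical readings from the production-line scale, in kilograms.
readings_kg = [28.4, 29.9, 33.0, 30.0]

# Units that would send a signal to the data collection software.
flagged = [m for m in readings_kg if exceeds_threshold(m)]
print(flagged)  # [33.0]
```

A unit weighing exactly 30 kilograms is not flagged here, because the example describes a mass exceeding the threshold.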

Unstructured data, on the other hand, is much harder to work with, as there are no set parameters or definitions of the dataset. For example, data could come from a JPEG file and be in the form of pixels or a sequence of characters that the ML/AI anomaly detection algorithm is unable to identify, making the data unusable until it is structured and labeled.

[Figure: Structured vs unstructured data]

How useful is ML anomaly detection in the automotive industry?


Acerta has helped leading automotive manufacturers conduct comprehensive manufacturing data analysis to detect anomalies and uncover actionable insights during production. Here are some real-world use cases:

  1. Leading European driveline supplier: A client wanted to improve their end-of-line gearbox testing to identify units that were likely to fail under warranty. Acerta’s engineers were provided with a very large, decentralized, and unlabeled dataset that was derived from a very limited number of test units. This posed a challenge, as the 10 TB dataset came from such a small sample size. Acerta’s engineers used noise, vibration, and harshness (NVH) testing and advanced anomaly detection to examine the data and pinpoint early indicators of future gearbox failures. The outcome was an 89% classification accuracy in predicting failure during the warranty period and anticipated savings of €2M per plant from reduced annual warranty costs.

  2. Tier-1 engine supplier: A client needed to improve their vehicle diagnostics by predicting and identifying the causes of engine failure across multiple engine platforms. The client needed a solution that could predict failures from data gathered from five different engine platforms, each of which had various test profiles. They also required that two different machine learning approaches be implemented, classification and anomaly detection, and they wanted an accuracy of 80% or better for failure prediction. Acerta’s machine learning models identified 100% of the suspect signals in the engine test signal data and exceeded the client’s accuracy target, predicting engine failures with over 93% accuracy.

  3. Major axle assembly supplier: A client wanted to leverage machine learning on their production line to identify anomalous data. They needed help locating the source(s) of failure in their axle assemblies to cut the rate of product failure and related rework. The client’s production lines involved more than 20 different operations that generated over 200 measurements per unit, making it very difficult to narrow down the sources of failures. Acerta implemented the LinePulse solution – adapted and customized to fit the plant network infrastructure – and achieved a 65% reduction in failure and rework rates, along with the associated cost reductions.

Ready to improve part quality using ML for anomaly detection?

By leveraging the machine learning and statistical process control (SPC) capabilities of our LinePulse solution, we can help you analyze the mountains of time-series datapoints from your signals, identify the connections between anomalies, and expose opportunities to improve your part quality.

Want to translate your complex data into actionable insights? Get in touch.
