What is anomaly detection in manufacturing?

Leading manufacturers are implementing Industry 4.0 initiatives to collect and analyze data for anomalies that could affect product quality and production operations. When analyzing data, anomaly detection is the process of identifying and observing rare items, events, patterns, and outliers that differ significantly from a dataset’s normal behavior.

In a manufacturing context, anomalies are variations from the norm, and they can have both positive and negative implications. For example, anomalous data can reveal problems, like a technical error on the production line. But an anomaly can also highlight an opportunity by revealing a way to improve a manufacturing process.

Data mining and anomalies 

Industry 4.0 has led to a dramatic increase in the number of software solutions and analytics programs available to help companies collect, measure, and manage data from all aspects of their operations. The data mined can include an enormous amount of information from across the manufacturing line. This dataset encompasses data patterns that indicate normal operations, and any unforeseen change to these patterns is interpreted as an anomaly.  

When analyzing data from the production line, it is important to look for more than changes in patterns and outliers. A lack of change can also indicate an anomaly if it runs contrary to what was expected for that metric.
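The "no change" case can be illustrated with a minimal sketch. It assumes a stuck sensor shows up as a run of readings with almost no spread. The function name `find_flatlines`, the window size, the `min_spread` threshold, and the vibration readings are all hypothetical, chosen only for illustration.

```python
from statistics import pstdev

def find_flatlines(values, window=5, min_spread=1e-6):
    """Return the start index of every window whose values barely change.

    A window counts as flat when its population standard deviation
    is at or below min_spread (an assumed, illustrative threshold).
    """
    flat_starts = []
    for start in range(len(values) - window + 1):
        chunk = values[start:start + window]
        if pstdev(chunk) <= min_spread:
            flat_starts.append(start)
    return flat_starts

# Hypothetical vibration readings: the signal normally jitters,
# but the sensor sticks at 0.42 from index 4 through index 9.
readings = [0.40, 0.44, 0.39, 0.45, 0.42, 0.42,
            0.42, 0.42, 0.42, 0.42, 0.41, 0.46]

flat = find_flatlines(readings, window=5)
print(flat)  # start indices of the flat windows
```

Every stuck value here would pass a simple range check, which is why the check looks at variation rather than at the values themselves. In practice the window and threshold would depend on how much the metric is expected to move.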

Key machine learning anomaly detection methods 

Machine learning (ML) enables manufacturers to leverage the wealth of data they collect, giving them the insights they need to identify patterns, detect anomalies, and pinpoint outliers. The three key machine learning anomaly detection methods are: 

  1. Unsupervised: Unsupervised anomaly detection, enabled by machine learning, uses unlabelled data to uncover patterns in a dataset, including identifying anomalies and trends across a supply chain, or interpreting historical and current sensor data to predict and prevent equipment downtime.
  2. Supervised: Supervised anomaly detection models use machine learning to analyze datasets that have been labelled as normal or abnormal, to solve well-defined problems, such as predicting and identifying product defects based on known (labelled) failure data.
  3. Semi-supervised: Semi-supervised anomaly detection is a combination of the above and uses both labelled and unlabelled data. For example, in the automotive industry it can use tested (labelled) parts data and compare it to untested (unlabelled) parts data to predict which untested parts will fail.

Because labelling data is such a painstaking exercise, most manufacturers work with unlabelled or partially labelled data. A combination of the methods can be used, depending on the type of data, when it was captured, where it was captured, and the application.
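As one small illustration of the unsupervised case, the sketch below scores unlabelled readings with the modified z-score (based on the median and the median absolute deviation) and flags values above the commonly cited cutoff of 3.5. This is a simple statistical stand-in, not a machine learning model and not any particular product's method. The torque values and function names are hypothetical.

```python
from statistics import median

def modified_z_scores(values):
    """Score each value by its distance from the median, scaled by the MAD."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # More than half the values are identical, so the score is
        # undefined; this sketch simply flags nothing in that case.
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]

def flag_outliers(values, threshold=3.5):
    """Return the indices whose modified z-score exceeds the threshold."""
    scores = modified_z_scores(values)
    return [i for i, z in enumerate(scores) if abs(z) > threshold]

# Hypothetical, unlabelled torque readings from a fastening station.
torque = [10.1, 9.9, 10.0, 10.2, 9.8, 10.1, 14.7, 10.0, 9.9]

print(flag_outliers(torque))  # indices of the readings flagged as anomalous
```

No labels are needed: the data itself defines what "normal" looks like. A supervised approach would instead learn from readings already marked as good or defective.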

Anomaly detection in time-series data 

Time-series data are observations recorded in sequence over time. Each data point is timestamped and assigned a value at the moment it is measured. This data can be used to forecast expected behavior, to flag anomalies that depart from it, and to uncover outliers among the extreme data points in the dataset. Anomalies within the dataset are divided into three main categories:

  1. Global outliers: Also known as point anomalies, global outliers exist outside the entirety of a dataset. They are the data points that deviate the most from other data within a given dataset.
    For example, if a control chart displays a data point outside the specified range, this can be considered a global outlier.

  2. Contextual outliers: Also known as conditional outliers, these anomalies differ greatly from other data within the same dataset, based on a specific context or unique condition.
    An example of a contextual outlier can be illustrated on a series of temperature readings collected over time that follow a predictable pattern. A machine alternately heats up and cools in a cyclical manner as it operates. If the machine cools when it is expected to heat, this represents an anomaly, even though the unexpected cooling falls within the expected range of overall temperature.

  3. Collective outliers: A group of data points that, taken together, deviates significantly from the rest of the dataset. On their own, the points are not necessarily outliers, but combined with another time-series dataset they collectively act like outliers.
    Collective outliers can be more challenging to spot, since they require comparing similar points across different datasets. One example is the comparison of power supply data for different machines. If the power for the whole plant were to falter, a dip would be seen across all machine data at the same time. This finding would clearly indicate an issue with the central power supply, as opposed to a specific machine.
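The difference between global and contextual outliers can be sketched with the heating and cooling example above. The code below is a simplified illustration under stated assumptions: the control limits are taken as the baseline mean plus or minus three standard deviations (real control charts often estimate the spread differently), the machine is assumed to follow a known four-step temperature cycle, and all names, readings, and tolerances are hypothetical.

```python
from statistics import mean, pstdev

def control_limits(baseline, sigmas=3.0):
    """Return (lower, upper) limits from known-good baseline readings."""
    center = mean(baseline)
    spread = pstdev(baseline)
    return center - sigmas * spread, center + sigmas * spread

def global_outliers(values, lower, upper):
    """Indices of readings that fall outside the overall limits."""
    return [i for i, v in enumerate(values) if v < lower or v > upper]

def contextual_outliers(values, expected_cycle, tolerance):
    """Indices of readings that differ from the value expected
    at that point in the cycle by more than the tolerance."""
    period = len(expected_cycle)
    return [i for i, v in enumerate(values)
            if abs(v - expected_cycle[i % period]) > tolerance]

# Assumed heat/cool cycle in degrees Celsius, and known-good baseline data.
expected_cycle = [60, 70, 80, 70]
baseline = [60, 70, 80, 70, 61, 69, 79, 71]

# Hypothetical readings over three cycles.
# Index 6 cools to 60 when it should heat to 80; index 10 spikes to 95.
readings = [60, 70, 80, 70, 61, 69, 60, 71, 59, 70, 95, 70]

lower, upper = control_limits(baseline)
global_hits = global_outliers(readings, lower, upper)
contextual_hits = contextual_outliers(readings, expected_cycle, tolerance=5)

print(global_hits)      # readings outside the overall limits
print(contextual_hits)  # readings that are wrong for their point in the cycle
```

The reading of 60 at index 6 sits comfortably inside the overall limits, so only the context-aware check catches it. The spike to 95 fails both checks.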

Using machine learning and Statistical Process Control (SPC), manufacturers can identify how outliers relate to their own time series and, more importantly, to each other.
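The plant-wide power example can be sketched by comparing dips across machines. This is a minimal illustration, assuming a "dip" means a reading more than 20% below that machine's own median and that all series share the same timestamps. The machine names, power values, and function names are hypothetical.

```python
from statistics import median

def dip_indices(values, drop_fraction=0.2):
    """Return the set of time indices where a reading falls more than
    drop_fraction below the series' own median."""
    baseline = median(values)
    cutoff = baseline * (1 - drop_fraction)
    return {i for i, v in enumerate(values) if v < cutoff}

def plant_wide_dips(power_by_machine, drop_fraction=0.2):
    """Return the time indices at which every machine dips together."""
    dip_sets = [dip_indices(values, drop_fraction)
                for values in power_by_machine.values()]
    if not dip_sets:
        return []
    return sorted(set.intersection(*dip_sets))

# Hypothetical power draw (kW) per machine, sampled at the same timestamps.
power_by_machine = {
    "press":    [5.0, 5.1, 4.9, 3.2, 5.0, 5.1],
    "welder":   [8.0, 7.9, 8.1, 5.5, 8.0, 6.0],
    "conveyor": [2.0, 2.1, 2.0, 1.3, 2.0, 2.0],
}

print(plant_wide_dips(power_by_machine))          # dips shared by all machines
print(dip_indices(power_by_machine["welder"]))    # dips for the welder alone
```

A dip shared by every machine at the same timestamp points toward the central supply, while a dip seen on only one machine (the welder's last reading here) points toward that machine.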


Ready to turn anomalies into opportunities? 

By leveraging the anomaly detection, SPC, and machine learning capabilities of our LinePulse solution, we can help you monitor and analyze your production in real time to detect anomalies in your manufacturing data and uncover opportunities to improve your part quality. 

Want to translate your complex product data into actionable insights? Get in touch. 
