Machine learning for manufacturing

When it comes to automotive applications of artificial intelligence, most people will think of autonomous vehicles. And yet, machine learning has the potential to be much more valuable in the factory than on the road, at least for the foreseeable future. Machine learning is emerging as a key Quality 4.0 technology to help manufacturers produce better quality products.

Machine learning is a term thrown around so often that it can easily be disregarded as just another Industry 4.0 buzzword. But in fact, it is a crucial piece of technology that is helping manufacturers improve key manufacturing KPIs. Manufacturers must understand the different applications of machine learning, and how to leverage it in their own plant, to stay competitive.

What can machine learning do for manufacturing?

Incorporating a machine learning platform into a manufacturing facility gives engineers more insight into their production processes. Tasks that have traditionally taken hours of manual labor, such as aggregating machine signal data to identify trends, can be automated and completed in minutes or less. In this case, machine learning isn’t competing with statistical process control (SPC) or other traditional manufacturing methodologies; it’s augmenting them.

Machine learning can also help identify the cause of defects or failed tests. Root cause analysis can involve dealing with hundreds or thousands of signals from the production line, depending on the scope of the issue and how much data is collected. Using automated signal pruning, machine learning can reduce the number of signals requiring manual investigation by more than 99%, reducing the time needed for root cause analysis from weeks to hours.
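
As a rough illustration of signal pruning, the sketch below ranks signals by how strongly they correlate with a pass/fail outcome and keeps only the top few for manual investigation. It is a minimal example under stated assumptions, not the method used by any particular platform: the signal names, the data, and the `prune_signals` helper are all hypothetical, and a simple Pearson correlation stands in for whatever ranking a production system would use.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant signal says nothing about the outcome
    return cov / sqrt(var_x * var_y)

def prune_signals(signals, outcome, keep=2):
    """Rank signals by |correlation| with the outcome and keep the top few."""
    ranked = sorted(
        signals,
        key=lambda name: abs(pearson(signals[name], outcome)),
        reverse=True,
    )
    return ranked[:keep]

# Hypothetical data: one value per unit for each signal; 1 = failed test.
signals = {
    "press_force": [10.1, 10.0, 12.9, 10.2, 13.1, 12.8],
    "torque": [5.0, 5.1, 4.9, 5.0, 5.2, 5.0],
    "ambient_temp": [21.0, 21.0, 21.0, 21.0, 21.0, 21.0],
}
failed = [0, 0, 1, 0, 1, 1]

shortlist = prune_signals(signals, failed, keep=1)
print(shortlist)
```

With hundreds of signals the same ranking step leaves engineers a short list to inspect by hand, which is where the time savings come from.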

Machine learning can also benefit manufacturing by enhancing end-of-line testing, particularly for complex assemblies such as engines or transmissions. By generating predictions about how likely units are to pass or fail an end-of-line test, machine learning removes the need to test every unit at the end of the line.

Finally, manufacturers can use the insights gained from machine learning not only to solve production problems, but to avoid them altogether. By identifying the key points during assembly that contribute to product failures, a machine learning model can recommend which parts to mate together in order to minimize the potential for quality escapes from completed assemblies. This approach has been shown to reduce the rework rate for axle assemblies by 65%.

Ultimately, how beneficial machine learning can be to manufacturing depends on the plant and the data being collected from it. With the right set of information and the right machine learning approach, manufacturers can reduce their scrap and rework rates, improve product quality, and increase production throughput.

Machine learning needs manufacturing data

In order to provide accurate analysis and insights, machine learning models need to be fed with adequate data. There are many potential sources of data in manufacturing, not just the obvious ones like machine sensors and PLCs. Other manufacturing data sources, including business transactions, maintenance records, and warranty information, can all be valuable.

In 2024, a lot of manufacturing data is being collected, and most manufacturers are not utilizing it effectively. By some estimates, more than two thirds of all manufacturing data collected goes unused. But just because data is being collected doesn’t mean that machine learning analysis can be executed flawlessly. Manufacturers face many challenges when it comes to collecting data for machine learning analysis, such as:

  • Not understanding how much manufacturing data is “enough” to provide accurate analysis
  • How to get data from legacy machines that weren’t designed to collect it
  • How to record data from manual operations or inspections
  • Data that is siloed at different machines or in different databases
  • Custom software applications that are challenging to get data out of
  • Rigid rules from corporate IT groups that can limit the flow of data

Problems with manufacturing data collection must be addressed before investing in machine learning analysis.

Machine learning and the role of the Manufacturing Execution System (MES)

A manufacturing execution system (MES) monitors the process of transforming raw materials into finished products. The MES is where all product and production data is collected and aggregated. An MES is often connected to an organization’s enterprise resource planning (ERP) system, in which case everything from procurement to accounting to maintenance, even scrap and rework rates, is available in one place.

An MES can be incredibly valuable for accessing data needed by a machine learning platform. In terms of industrial data, the two biggest differences between a line with MES and one without (or one that hasn’t integrated it completely) are data traceability and completeness.

Data traceability

Traceability is what gives a unit on an assembly line its identity. Serial numbers or bar codes are common ways that units in production are identified and can be “traced” through the data.

Traceability is essential for machine learning applications in manufacturing. If you want to use machine learning to determine how OP10 is affecting OP20, you need traceability to link the data coming from those two operations.
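
To make the OP10/OP20 example concrete, here is a minimal sketch of linking two operations' data by serial number. The record layout, field names, and the `link_by_serial` helper are assumptions made for illustration; a real line would pull these rows from an MES or a database.

```python
# Hypothetical records from two operations, keyed by unit serial number.
op10 = [
    {"serial": "A001", "press_force_kn": 12.4},
    {"serial": "A002", "press_force_kn": 13.9},
    {"serial": "A003", "press_force_kn": 12.6},
]
op20 = [
    {"serial": "A001", "torque_nm": 45.2},
    {"serial": "A003", "torque_nm": 44.8},
]

def link_by_serial(upstream, downstream):
    """Join two operations' records on serial number (inner join)."""
    downstream_by_serial = {row["serial"]: row for row in downstream}
    linked = []
    for row in upstream:
        match = downstream_by_serial.get(row["serial"])
        if match is not None:
            # Merge both operations' measurements into one row per unit.
            linked.append({**row, **match})
    return linked

linked = link_by_serial(op10, op20)
for row in linked:
    print(row)
```

Unit A002 drops out because it has no OP20 record. Without a shared serial number, none of the rows could be matched at all.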

Data completeness

Data completeness can be understood in terms of granularity. Think of measuring the distance between two features on a unit. In most cases, a line worker would use three probes for each surface, but only record one number for the distance. That may have been good enough in the past, but it’s not sufficient if you want to leverage machine learning to improve manufacturing quality.

Machine learning models would need all three measurements from that example in order to represent the part’s planar features as accurately as possible. Put simply: the more granular the data, the more opportunities for insight.

Do you need an MES to use machine learning?

While it’s possible to attain the levels of data traceability and completeness necessary for machine learning without an MES, those who do have an MES tend to be in a much better position to use machine learning compared to those who don’t.

On the other hand, having an MES in place doesn’t guarantee that a manufacturer has all the data necessary to utilize machine learning. Manufacturing execution systems typically don’t record time-based data. For a stamping operation, for example, they will only store the final or maximum force and the time of day when the operation took place, rather than recording force vs time data during each operation. Machine learning models need the latter sort of information to deliver the most accurate results.
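
The sketch below shows why the full force vs time curve matters. Two hypothetical stamping strokes reach the same peak force, so a record that keeps only the maximum cannot tell them apart, while features computed from the whole curve can. The sample values, the sampling period, and the `curve_features` helper are illustrative assumptions.

```python
def curve_features(force_curve, sample_period_s):
    """Summarize a force-vs-time curve from one stamping stroke."""
    peak = max(force_curve)
    peak_index = force_curve.index(peak)
    # Trapezoidal integral of force over time (force x seconds).
    impulse = sum(
        (a + b) / 2 * sample_period_s
        for a, b in zip(force_curve, force_curve[1:])
    )
    return {
        "peak_force": peak,
        "time_to_peak_s": peak_index * sample_period_s,
        "impulse": impulse,
    }

# Two hypothetical strokes with the same peak force but different shapes.
stroke_a = [0.0, 20.0, 40.0, 20.0, 0.0]
stroke_b = [0.0, 5.0, 10.0, 40.0, 0.0]

features_a = curve_features(stroke_a, sample_period_s=0.01)
features_b = curve_features(stroke_b, sample_period_s=0.01)
print(features_a)
print(features_b)
```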

Types of machine learning

Machine learning uses a variety of algorithms to solve problems. These algorithms are commonly grouped into three categories by the way they learn. The learning style of an algorithm depends on how it ingests data, and whether that data is labelled or unlabelled.

Labelled data has a tag or classification, whereas unlabelled data does not. Think of units coming off an end-of-line test: If those units are identified as passing or failing in the data, then the data set is labelled. If the units do not have any indication of which parts passed and which parts failed, then the data set is unlabelled.

There are three styles of machine learning:

  1. Supervised learning: algorithms that use labelled data 
  2. Unsupervised learning: algorithms that use unlabelled data 
  3. Semi-supervised learning: algorithms that use partially labelled data 

The different applications of machine learning in manufacturing will fall into one of these three styles, depending on the available data and the algorithms required to solve the problem at hand.
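
As a small illustration of how the available labels point toward a learning style, the sketch below inspects a list of end-of-line results in which `None` marks a unit with no recorded result. The `learning_style` helper is hypothetical and only encodes the three definitions above.

```python
def learning_style(labels):
    """Suggest a learning style from the labels available (None = unlabelled)."""
    labelled = sum(1 for label in labels if label is not None)
    if labelled == 0:
        return "unsupervised"
    if labelled == len(labels):
        return "supervised"
    return "semi-supervised"

# Hypothetical end-of-line results: some units were never tested.
eol_results = ["PASS", "FAIL", None, "PASS", None]
print(learning_style(eol_results))
```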

[Figure: supervised ML, supervised anomaly detection, and unsupervised anomaly detection]

Applying machine learning to manufacturing

Supervised machine learning applications

Improving end-of-line (EOL) testing

Supervised learning uses labelled data to solve problems that have a well-defined failure mode or threshold for failure. By feeding a machine learning model historical testing data, for example, from a noise, vibration and harshness (NVH) test, the model can learn to identify “normal” behaviour and use that to predict whether a unit will pass or fail the test, before units actually reach the test station. The model is then able to classify and label new parts in real time.

After a supervised learning model has been trained on historical data, it can be deployed via a machine learning platform or in a Docker container directly on the line. The model can then conduct real-time data analysis on parts coming off the line, identifying defects in testing and predicting, for example, problematic frequencies from NVH testing.
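
The train-then-predict workflow described above can be sketched with a deliberately simple nearest-centroid classifier. This is a stand-in chosen because it fits in a few lines, not the algorithm any specific platform uses; the feature names (`vibration_rms`, `bearing_press_force`) and all values are hypothetical.

```python
from math import dist

def train_centroids(features, labels):
    """Compute the mean feature vector (centroid) for each label."""
    grouped = {}
    for vector, label in zip(features, labels):
        grouped.setdefault(label, []).append(vector)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }

def predict(centroids, vector):
    """Return the label whose centroid is closest to the new unit."""
    return min(centroids, key=lambda label: dist(centroids[label], vector))

# Hypothetical history: [vibration_rms, bearing_press_force] per unit,
# labelled with the result of the end-of-line test.
history = [
    [0.21, 1.0], [0.19, 1.1], [0.22, 0.9],
    [0.48, 1.6], [0.52, 1.7], [0.50, 1.5],
]
results = ["PASS", "PASS", "PASS", "FAIL", "FAIL", "FAIL"]

model = train_centroids(history, results)

# Score new units from upstream data, before they reach the test station.
prediction_a = predict(model, [0.20, 1.05])
prediction_b = predict(model, [0.49, 1.55])
print(prediction_a, prediction_b)
```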

Machine vision inspection

One common application of supervised learning in manufacturing is machine vision. These systems use image processing to automatically conduct analysis to predict failures and prevent stoppages on the line. They can also be used in automated inspection and process control to ensure better quality output. 

In production, image processing commonly involves tracking objects in two-dimensional space, such as components on an assembly line. However, as the technology continues to improve, 3D machine vision can coordinate the complex movement of components in 3D space.

Similar to the testing example discussed above, machine vision systems can be trained to identify broken or damaged parts on the line and then classify new parts as either broken or normal. However, vision systems are inherently limited to visual differences. By the time a vision system identifies an error in a part, the only recourse is to rework or scrap it. Ideally, machine learning should eliminate problems before they happen, rather than simply replacing human inspectors.

Unsupervised machine learning applications

Supply chain optimization

Unsupervised learning is very useful for finding the underlying structure of a dataset. It can analyze the complex interactions between different signals, providing insights into subtle trends and identifying otherwise invisible issues with an assembly line or even a whole supply chain.

Currently, manufacturers are using AI and machine learning algorithms to identify the factors most likely to impact production volume, such as weather, consumer demand, or political considerations. These predictions are then used to optimize allocation of staff and inventory, among other resources, to ensure that operations run smoothly from end to end.  

Predicting Remaining Useful Life (RUL)

Unsupervised machine learning can also be used to determine when machinery might fail, comparing signals such as temperature, pressure, and usage frequency (among others) to historical failures. While there are obviously many consumer applications for improving estimations of remaining useful life, it’s also useful for manufacturers, as it reduces both the frequency of maintenance and the time spent on it.
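
One very simple way to picture a remaining-useful-life estimate is to fit a straight line to a degrading signal and project when it will cross a failure threshold. The sketch below does exactly that with ordinary least squares. It is an illustrative simplification, not the approach described above: the threshold, the signal, and the assumption of linear degradation are all hypothetical.

```python
def fit_line(times, values):
    """Ordinary least-squares fit of values = slope * time + intercept."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    slope = sum(
        (t - mean_t) * (v - mean_v) for t, v in zip(times, values)
    ) / sum((t - mean_t) ** 2 for t in times)
    return slope, mean_v - slope * mean_t

def remaining_useful_life(times, values, failure_threshold):
    """Project hours until the trend line reaches the failure threshold."""
    slope, intercept = fit_line(times, values)
    if slope <= 0:
        return None  # no upward trend, so no crossing can be projected
    crossing_time = (failure_threshold - intercept) / slope
    return max(0.0, crossing_time - times[-1])

# Hypothetical bearing temperature readings taken every 100 operating hours.
hours = [0, 100, 200, 300, 400]
bearing_temp_c = [60.0, 62.0, 64.0, 66.0, 68.0]

rul_hours = remaining_useful_life(hours, bearing_temp_c, failure_threshold=80.0)
print(rul_hours)
```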

Semi-supervised machine learning applications

Semi-supervised learning is a good option when dealing with partially labelled data. Semi-supervised models are trained on a combination of labelled and unlabelled data, typically much more of the latter than the former. 

For example, a manufacturer could use semi-supervised machine learning algorithms to process a dataset from a production cycle in which some, but not all, units were tested. By comparing the labelled passing or failing units with the unlabelled (i.e., untested) units, a semi-supervised model can generate predictions of which untested units would be most likely to pass or fail the test, reducing the risk of quality escapes without comprehensive testing.
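
The partially tested production run described above can be sketched with self-training, one common semi-supervised technique. In this minimal version, the untested unit closest to any labelled unit takes that unit's label and joins the labelled pool, and the process repeats until every unit has a label. The two-feature data and the `self_train` helper are hypothetical, and the nearest-neighbour rule is an assumption chosen for brevity.

```python
from math import dist

def self_train(points, labels):
    """Self-training with a 1-nearest-neighbour rule (None = untested unit)."""
    labels = list(labels)
    if all(label is None for label in labels):
        raise ValueError("at least one labelled unit is required")
    while None in labels:
        best = None  # (distance, unlabelled index, label to assign)
        for i, label_i in enumerate(labels):
            if label_i is not None:
                continue
            for j, label_j in enumerate(labels):
                if label_j is None:
                    continue
                d = dist(points[i], points[j])
                if best is None or d < best[0]:
                    best = (d, i, label_j)
        # Pseudo-label the most confident (closest) untested unit.
        labels[best[1]] = best[2]
    return labels

# Hypothetical production run: only two of eight units were tested.
units = [
    [1.0, 1.0], [1.2, 1.1], [1.5, 1.3], [1.9, 1.6],
    [5.0, 5.0], [4.6, 4.8], [4.1, 4.4], [3.6, 4.0],
]
tested = ["PASS", None, None, None, "FAIL", None, None, None]

predicted = self_train(units, tested)
print(predicted)
```

Because each newly labelled unit helps label its neighbours, the two tested units are enough to label the whole run in this toy case.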

Building machine learning models for manufacturing

1. Define the problem

  • Understand the manufacturing process: Get a clear picture of the manufacturing process, the type of data available, and the specific problems or questions you want the ML model to address (e.g., predicting equipment failures, optimizing production efficiency, detecting defects).
  • Set clear objectives: Define what success looks like for the project, including the key performance indicators (KPIs) to be improved.

2. Collect and prepare the data

  • Data collection: Gather historical and real-time data from the manufacturing process.
  • Data cleaning: Since manufacturing data is generated by disparate sources, it needs to be cleaned, with all null, missing, or duplicate values removed.
  • Data transformation: Since most machine learning algorithms can only process numerical values, non-numerical features (like a string) need to be converted and represented numerically. Some linear models and neural networks also have a fixed number of input nodes, which means the size of all inputs must be the same.
  • Feature selection and engineering: Identify the most relevant features (data inputs) that influence the outcome you’re interested in. Create new features from existing data to improve the model’s performance.
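
The cleaning and transformation steps above can be sketched in a few lines. This example drops rows with missing values and exact duplicates, then one-hot encodes a categorical field so that it is represented numerically. The field names and the `clean` and `one_hot` helpers are hypothetical.

```python
def clean(rows, required):
    """Drop rows with missing required values and exact duplicate rows."""
    seen = set()
    cleaned = []
    for row in rows:
        if any(row.get(field) is None for field in required):
            continue
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned

def one_hot(rows, field):
    """Replace a categorical field with one 0/1 column per category."""
    categories = sorted({row[field] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != field}
        for category in categories:
            new_row[f"{field}_{category}"] = 1 if row[field] == category else 0
        encoded.append(new_row)
    return encoded

# Hypothetical raw records; the serial is kept for traceability, not as a feature.
raw = [
    {"serial": "A001", "station": "press_1", "force_kn": 12.4},
    {"serial": "A001", "station": "press_1", "force_kn": 12.4},  # duplicate
    {"serial": "A002", "station": "press_2", "force_kn": None},  # missing value
    {"serial": "A003", "station": "press_2", "force_kn": 12.9},
]

cleaned = clean(raw, required=["serial", "station", "force_kn"])
encoded = one_hot(cleaned, "station")
for row in encoded:
    print(row)
```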

3. Choose a machine learning algorithm

  • Research algorithms: Depending on your problem type (classification, regression, clustering, etc.), choose a suitable machine learning algorithm. Common choices include decision trees, support vector machines, neural networks, and ensemble methods like random forests.
  • Considerations: Take into account the complexity of the problem, the size and type of your data, and the computational resources available.

4. Train the model

  • Split the data: Divide your data into training and testing sets to evaluate the model’s performance. How you split depends on the problem. For example, when building a machine learning model for predictive maintenance problems, the time sequence is crucial, so the model should be trained on earlier data and tested on later data. In contrast, for classification problems, such as distinguishing defective and non-defective parts, the time sequence is irrelevant, but the proportion of defective and non-defective parts should be the same in the training and test sets.
  • Model training: Use the training data to teach the model to make predictions or decisions based on the input features. Usually, manufacturing problems fall into one of two umbrella categories: classification (e.g., pass vs fail from an end-of-line test) or regression (e.g., predicting cutting tool wear over time).
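
The two splitting strategies mentioned above can be sketched as follows. The first keeps time order, so the model is tested on data that comes after its training data. The second splits each class separately, so both sets contain a similar share of defective parts. Function names, the 25% test fraction, and the data are illustrative assumptions.

```python
import random

def time_ordered_split(rows, test_fraction=0.25):
    """Train on the earliest rows and test on the latest (rows sorted by time)."""
    cut = int(len(rows) * (1 - test_fraction))
    return rows[:cut], rows[cut:]

def stratified_split(rows, label_of, test_fraction=0.25, seed=0):
    """Split so each class appears in similar proportion in both sets."""
    rng = random.Random(seed)
    by_label = {}
    for row in rows:
        by_label.setdefault(label_of(row), []).append(row)
    train, test = [], []
    for group in by_label.values():
        group = group[:]  # copy so the caller's data is not shuffled
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test

# Predictive maintenance: hypothetical readings already sorted by time.
readings = list(range(8))
train_readings, test_readings = time_ordered_split(readings)

# Classification: 4 defective and 12 non-defective hypothetical units.
units = [(f"u{i}", "FAIL" if i < 4 else "PASS") for i in range(16)]
train_units, test_units = stratified_split(units, label_of=lambda unit: unit[1])
print(train_readings, test_readings)
print(len(train_units), len(test_units))
```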

5. Optimize and tune the model

  • Hyperparameter tuning: Experiment with different settings for the model’s hyperparameters to find the most effective combination.
  • Feature engineering revisited: Consider adding, removing, or transforming features based on model performance. Since each and every problem is unique, machine learning models must also be tuned and tweaked to yield the best performance possible.
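
Hyperparameter tuning can be as simple as trying a handful of candidate settings and keeping the one that scores best on held-out data. The sketch below tunes `k` for a small k-nearest-neighbours classifier on a single hypothetical feature. The classifier, the candidate values, and the data (which includes one deliberately noisy training label) are all assumptions for illustration.

```python
from collections import Counter

def knn_predict(train, x, k):
    """Majority label among the k training points nearest to x (1-D feature)."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, validation, k):
    """Fraction of validation points classified correctly for a given k."""
    hits = sum(knn_predict(train, x, k) == label for x, label in validation)
    return hits / len(validation)

def grid_search(train, validation, candidate_ks):
    """Return the k with the best validation accuracy (first wins ties)."""
    return max(candidate_ks, key=lambda k: accuracy(train, validation, k))

# Hypothetical data; the FAIL at 1.5 is a noisy label inside the PASS region.
train = [
    (1.0, "PASS"), (1.2, "PASS"), (1.4, "PASS"), (1.5, "FAIL"), (1.6, "PASS"),
    (3.0, "FAIL"), (3.2, "FAIL"), (3.4, "FAIL"), (3.6, "FAIL"),
]
validation = [(1.48, "PASS"), (1.53, "PASS"), (3.1, "FAIL"), (3.5, "FAIL")]

best_k = grid_search(train, validation, candidate_ks=[1, 3, 5])
print(best_k)
```

With `k=1` the noisy label misleads the model on nearby units, while larger values of `k` outvote it, which is why the search moves away from `k=1` here.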

6. Deploy the model

  • Integration: Integrate the model into the manufacturing process. This could involve developing custom software or using an existing platform.
  • Real-time monitoring: Set up systems to feed real-time data into the model and act on its predictions or insights.
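
A monitoring loop of the kind described above might look like the sketch below, which scores each incoming reading and raises an alert when the predicted failure probability crosses a threshold. The `toy_model` function is a hypothetical stand-in for a trained model, and the threshold and field names are assumptions.

```python
def monitor(readings, predict_failure_probability, alert_threshold=0.8):
    """Score each incoming reading and yield an alert for risky units."""
    for reading in readings:
        probability = predict_failure_probability(reading)
        if probability >= alert_threshold:
            yield {"serial": reading["serial"], "probability": probability}

def toy_model(reading):
    """Hypothetical stand-in for a trained model: risk grows with vibration."""
    return min(1.0, reading["vibration_rms"] / 0.5)

# Hypothetical stream of readings arriving from the line.
stream = [
    {"serial": "A001", "vibration_rms": 0.20},
    {"serial": "A002", "vibration_rms": 0.45},
    {"serial": "A003", "vibration_rms": 0.60},
]

alerts = list(monitor(stream, toy_model))
for alert in alerts:
    print(alert)
```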

Can I buy a machine learning model?

Fortunately, it is no longer necessary to employ a team of data scientists at your manufacturing plant to build and deploy machine learning models. Depending on the problem you would like to solve, there are already vendors who have created tools and software that apply machine learning models to the problems you face on the shop floor.

We created LinePulse to give you the insights of machine learning without ever needing to manipulate your data. LinePulse is a predictive quality platform that leverages machine learning to predict defects and accelerate root cause analysis. Best of all, it is designed to be used right on the shop floor, without a background in data science.

Machine learning in manufacturing glossary

Artificial intelligence: The implementation of human-like reasoning in computational systems

Classification problem: A predictive modelling problem where a discrete class label is predicted from input data

Deep learning: An approach to machine learning that uses artificial neural networks consisting of individual nodes connected and structured into input layers, hidden layers, and output layers

Hyperparameters: Parameters whose values control the learning process and determine the values of model parameters that a learning algorithm ends up learning

Labelled data: Data that includes tags or metadata which provides additional information about individual data points, e.g., PASS/FAIL

Learning style: A way to group machine learning algorithms according to their input data (e.g., labelled or unlabelled)

Machine learning: An approach to artificial intelligence that uses algorithms or statistical models to perform tasks without explicit instructions

Manufacturing Execution System (MES): Industrial software used for monitoring the production process

Regression problem: A predictive modelling problem where the predicted output is continuous, or a real value

Semi-supervised learning: A method of machine learning in which algorithms are trained on a combination of labelled and unlabelled data

Supervised learning: A method of machine learning in which a model is trained on labelled data and presented with a desired output; typically used to solve classification or regression problems

Test set: A subset of data used to evaluate a trained machine learning model’s performance

Training set (or Training data): A subset of data used to teach a machine learning model how to predict a target outcome.

Unlabelled data: Data that lacks tags or explanations of what the individual data points represent

Unsupervised learning: A method of machine learning in which models utilize unlabelled data to group points or features; typically used to solve clustering or dimensionality reduction problems
