Detecting defective crimps using AI

Time: 2024-11-08

CFM reliably detects many defects, including strip length errors, missing strands, and insulation in the crimp. Photo courtesy of Partex Marking Systems

Crimp force monitoring (CFM) has long been the standard for detecting faults in wire assemblies. The technology reliably detects many defects, including wrong strip length, missing strands, wrong wire cross section, wrong terminal, inconsistent terminal material, insulation in the crimp, wrong insertion depth, and wrong crimp height.

In CFM, piezoelectric sensors measure the force applied to the terminal assembly and the resulting material displacement. After several reference crimps are made, each subsequent crimp is compared to this known-good reference. If the force and displacement are within the specified tolerance, the crimp is good. If they fall outside those limits, the crimp is rejected as bad.
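
The comparison against a reference can be sketched in a few lines of Python. This is a simplified illustration, not how any particular CFM product evaluates a crimp: the function names, the 5% tolerance, and the force values are all assumptions made up for the example, and commercial monitors typically use more elaborate criteria.

```python
def mean_curve(curves):
    """Average several reference crimps sample by sample."""
    return [sum(samples) / len(samples) for samples in zip(*curves)]


def within_tolerance(curve, reference, tolerance):
    """Return True if every sample of `curve` stays within a band around
    `reference`. The band is `tolerance` (a fraction) of the reference peak."""
    band = tolerance * max(reference)
    return all(abs(c - r) <= band for c, r in zip(curve, reference))


# Three hypothetical reference crimps (force in arbitrary units).
references = [
    [0.0, 2.0, 6.0, 10.0, 4.0],
    [0.0, 2.2, 6.1, 10.2, 4.1],
    [0.0, 1.8, 5.9, 9.8, 3.9],
]
reference = mean_curve(references)

good = [0.0, 2.1, 6.0, 10.1, 4.0]
bad = [0.0, 1.5, 4.5, 8.0, 3.0]  # force too low throughout the stroke

print(within_tolerance(good, reference, 0.05))  # True
print(within_tolerance(bad, reference, 0.05))   # False
```

The sketch also makes the setup burden visible: the reference curves and the tolerance have to be chosen again for every new wire and terminal combination.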

Despite its simplicity and accuracy, CFM has some drawbacks. First, the technology is expensive. Each crimping machine needs its own monitor.

The researchers collected data from a crimping machine operated by a wiring harness manufacturing plant. The machine runs several shifts every day and is equipped with a CFM system. Source: Dongguk University

Another problem is setting tolerances. Generating reference samples and collecting data takes a lot of time and skill, and the process must be repeated for each new wire and terminal. Much depends on the skill of the technician.

Scalability is another challenge. As production volume and product variety increase, CFM systems may struggle to maintain efficiency and accuracy.

To address these challenges, CFM systems can be enhanced with artificial intelligence (AI). AI continuously learns from real-time data, enabling it to adapt to a wide range of manufacturing processes and external conditions. This adaptability greatly reduces the need for frequent recalibration of the system. In addition, AI-based systems do not require the operator to have expertise in data processing, making them more accessible.

AI can also enhance the scalability of manufacturing operations by effectively managing data from multiple production lines and adapting to changes in product types without requiring extensive reconfiguration. This flexibility can help manufacturers respond quickly to market demands and diversify their products.

This diagram shows the process of using an AI model for fault detection during a crimping operation. Initially, reference data is collected manually. RSDS is then applied to the data to generate synthetic anomaly data by scaling specific regions up or down. The data is then augmented with noise drawn from a Laplace distribution to increase the number of samples and make training more robust. Finally, the augmented data set is used to train an MLP-based AI model. Source: Dongguk University

However, several challenges must be addressed before AI can be introduced into crimp systems. First, changes in the crimping process could render existing AI models obsolete due to changes in the data scale. For example, changing the wire type may alter the overall data scale, invalidating the previously established model.

Another challenge is the lack of data from defective crimps. This data is very important for training AI models. Unpredictable defects can occur, so the more defect data a model has, the more accurate it will be. Some anomaly detection algorithms, such as Isolation Forest, can be trained on normal data alone to detect unknown defects. However, they may not guarantee adequate detection accuracy for all potential failures, which makes them less suitable for quality control in actual manufacturing.

To address these challenges, a fault detection system is proposed that employs AI with region-selective data scaling (RSDS). RSDS generates synthetic anomaly data from reference data by scaling specific regions of the data up or down. This allows the fault detection system to train an AI model on a dataset made up entirely of normal operational data and still achieve high accuracy in detecting faults.

In this study, a multi-layer perceptron (MLP) classification model was trained entirely on normal data and was able to effectively distinguish between normal and abnormal cases. To validate the system, 15 unique raw data sets were collected from a real-world wiring harness manufacturing facility, and the system was compared against four anomaly detection algorithms: Isolation Forest, one-class autoencoder, k-means, and histogram-based outlier score (HBOS).

This graph shows the RSDS method that the researchers used to generate synthetic anomaly data. When actual data about defects is lacking, this data can help train AI models. Instead of uniformly scaling the entire curve, this method divides it into smaller regions and then applies scaling selectively. Source: Dongguk University

AI for manufacturing data

Supervised learning has been used to detect faults in different industrial processes. Its ability to learn from labeled data and predict outcomes makes it a powerful tool for fault detection and classification, especially in complex manufacturing processes. This approach has been used in industries such as semiconductor manufacturing, where early detection of failures can save significant time and costs. The technology has also been applied to motor manufacturing to optimize processes such as hairpin windings.

However, supervised learning requires a large amount of labeled data to train the model, and the process of collecting and labeling the data is time-consuming and expensive.

To deal with such problems, consider using unsupervised learning and outlier analysis methods. These methods can extract meaningful features from raw data and efficiently process large amounts of unlabeled data. They help address the complexity of manufacturing environments and provide effective diagnostic tools that do not require predefined labels.

However, the utility of these unsupervised learning methods is not without limits. Often, the feature selection process may include noise or irrelevant features, which can adversely affect accuracy. It also requires large amounts of unlabeled data to achieve a satisfactory level of performance.

To compensate for these shortcomings, semi-supervised learning techniques can be used. These techniques combine the benefits of supervised and unsupervised learning by selectively incorporating labeled data, drawn from a larger pool of unlabeled data, into the training process. This approach makes the most of limited data and can further improve fault diagnosis by integrating several classifiers, which helps reduce the risk of incorporating noise or irrelevant features. This can increase the variety and robustness of the learning process.

This diagram depicts the manufacturing process, where the fault detection system analyzes each data set sequentially. Source: Dongguk University

Despite these advances, a key challenge remains in the process of training the model. For fault detection, these models require data from both normal and abnormal classes to be trained effectively. However, in the actual manufacturing process, obtaining abnormal data is a challenge due to the unpredictability of defects.

Anomaly detection algorithms can solve these problems by training the model using only normal data. A number of anomaly detection techniques have been proposed to identify outliers relative to normal data. Often, existing machine learning algorithms are adapted for outlier detection. For example, decision trees provide a simple, rule-based way to identify anomalies by detecting deviations from typical patterns. These algorithms implement one-class training by learning the boundaries and features of the normal class from a predominantly normal data set.

Neural networks can also be used for anomaly detection because of their ability to capture complex relationships. For example, autoencoders can use their reconstruction error to distinguish abnormal states from normal data. Clustering techniques such as k-means also remain powerful for anomaly detection: they group similar data and highlight outliers in sparsely populated clusters.

Suggested method

To address the limitations of machine learning, anomaly detection algorithms have been proposed to train models using only a single class of data. However, in actual manufacturing, there are few reference datasets available to train AI models. A model trained with only a small amount of normal data will show poor performance when faced with a variety of previously unseen anomalies. In addition, these algorithms can suffer from overfitting, especially when the normal data is not representative of all possible normal behavior.

Setting appropriate thresholds for classifying anomalies is another challenge. Achieving high-precision fault detection requires a careful balance between model sensitivity and specificity.

To develop a practical fault detection system, raw data must be collected from actual manufacturing processes rather than theoretical simulations. In our study, we collected data from crimping machines operated by wiring harness manufacturing plants.

The machine runs multiple shifts a day and is specifically designed to produce wire harnesses for various electronic components. It is equipped with a CFM system.

The researchers' AI system (far right) is better at detecting defective crimps than other well-known AI models. Source: Dongguk University

Fifteen datasets were collected between April 19 and May 8, 2023. A total of 23,383 individual crimp records were collected. The CFM system provides a time stamp for each crimp, as well as a quality label ("good" or "bad"). About 200 data points are collected per crimp, with one data point collected every 5 milliseconds. According to the CFM system, 23,286 entries were marked as good and 97 entries were marked as bad. Bad crimps are mainly attributed to problems such as damaged insulation, which leaves the wires exposed, and improper crimping, which leads to weak electrical connections and impairs the overall function of the wiring harness.

The scale of the data, even for data sets collected on the same day, can vary significantly, posing a major challenge to developing general-purpose AI models for defect detection. The data collected on April 19, April 26 and May 4 showed stark differences. This inconsistency is due not only to product variability, but also to sensor sensitivity issues and fluctuations in environmental conditions. Given these variable and inconsistent scales, it is critical to reset the AI model for each unique manufacturing setup to ensure accurate detection of defects under these changing conditions.

These charts show six representative predictions, all accurately classified. The researchers' model accurately identified defects of different sizes and shapes. The blue lines represent the researchers' synthetic data. The green areas are actual crimp defects. The red line is an actual good crimp. Source: Dongguk University

Proposed fault detection system

Considering the limitations of traditional CFM and the challenge of applying conventional AI to fault detection, this paper proposes a new paradigm: a fault detection system based on AI and RSDS. This paradigm addresses the challenges posed by limited training data and unpredictable defects by using an anomaly-detection-based algorithm.

In the process, the initial reference data is collected manually by the operator. Then, RSDS is applied to the data to generate synthetic anomaly data by scaling specific regions of the data up or down. The data is then augmented with noise drawn from a Laplace distribution to increase the number of samples and make training more robust. After that, the augmented data set is used to train the system's AI model, which is an MLP.

The MLP consists of three layers: the input layer receives the initial data, the hidden layer processes and transforms this data through various calculations, and the output layer provides the final result or prediction based on the processed information. Once the model is trained, it begins to detect faults in the remaining upcoming crimp data.
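
As a rough illustration of that layered structure, the following Python sketch runs a single forward pass through an untrained network. The layer sizes (200 inputs, hidden layers of 64 and 32 neurons, seven output classes) follow the configuration reported in the results section of this article. Everything else, including the helper names, the random initial weights, and the softmax output, is an assumption made for the example rather than the researchers' implementation.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(values):
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Small random weights; training would adjust these by backpropagation."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(curve, layers):
    """Hidden layers use ReLU; the output layer yields class probabilities."""
    activations = curve
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(activations, weights, biases))


rng = random.Random(0)
sizes = [200, 64, 32, 7]  # input, two hidden layers, seven output classes
layers = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
probabilities = forward([0.5] * 200, layers)  # placeholder input curve
print(len(probabilities))  # 7
```

Because the weights are random, the probabilities are meaningless until the network has been trained; the sketch only shows how data flows from the input layer to the prediction.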

Artificial intelligence model

In real manufacturing scenarios, fault detection systems often have to classify defects without prior knowledge of them. For example, CFM systems can accurately detect faults using only 30 data points from normal manufacturing operations, without any defect data. However, training any AI model with just 30 data points is challenging. The reason is overfitting, where the model becomes overly tailored to the limited training data, reducing its ability to detect unseen defects. In addition, the absence of abnormal data in the initial set may hinder the AI's ability to recognize abnormal patterns and distinguish them from normal ones.

Given these challenges, MLPs are a suitable and technically sound option for several reasons. First, an MLP models both linear and nonlinear relationships through its structured layers of neurons, which makes it highly adaptable to different data patterns. Each neuron in these layers processes the input data.

An MLP requires at least two classes to train, so synthetic anomaly data needs to be created to train the model effectively. Generating and integrating synthetic anomaly data can introduce additional complexity and potential biases into the training process, so it requires a careful and strategic approach to ensure meaningful learning.

One possible approach is to scale the raw data up and down to create synthetic fault data. Randomly scaling the raw data seems to be a viable way to cover unexpected defects. However, this technique requires integrating many fault data classes, which increases the structural complexity of the model and extends the training time.

In contrast, uniformly applied scaling adjusts the entire data set consistently, potentially simulating various defect scenarios by systematically deviating from the original "normal" manufacturing data. However, uniform scaling of the entire data set can hinder classification performance because it works against the MLP's intrinsic learning mechanism.

Given that MLPs learn primarily by adjusting weights during backpropagation, uniform scaling (which essentially reduces differences in the data) may adversely affect the model's ability to distinguish inputs and adjust its weights effectively, potentially compromising its prediction accuracy and classification ability.

As a result, uniform scaling may distort the relative differences between input features, adversely affecting the classification performance and prediction accuracy of the MLP.

Region-selective data scaling solves these problems. The resulting anomaly data helps AI models generalize from fewer reference data points. Instead of scaling the entire curve evenly, this approach divides it into smaller regions and then applies scaling selectively. In this way, it avoids the problems associated with uniform scaling while also allowing a more systematic simulation of various defect scenarios. RSDS plays a crucial role in creating comprehensive anomaly data, allowing models to learn and adapt to different defect types even if actual defect data is not initially available.
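
A minimal Python sketch of the idea follows. The article does not state how many regions the researchers used or which scaling factors they applied, so the three equal regions and the factors 0.8 and 1.2 below are assumptions chosen only because they happen to yield six synthetic defect classes. The function names are likewise hypothetical.

```python
def scale_region(curve, start, stop, factor):
    """Return a copy of `curve` with samples in [start, stop) multiplied by `factor`."""
    return [v * factor if start <= i < stop else v for i, v in enumerate(curve)]


def rsds(curve, n_regions=3, factors=(0.8, 1.2)):
    """Generate one synthetic anomaly per (region, factor) pair.

    Returns a list of (label, anomalous_curve). Labels start at 1 so that
    label 0 can be reserved for normal data.
    """
    bounds = [round(k * len(curve) / n_regions) for k in range(n_regions + 1)]
    anomalies = []
    label = 1
    for r in range(n_regions):
        for factor in factors:
            scaled = scale_region(curve, bounds[r], bounds[r + 1], factor)
            anomalies.append((label, scaled))
            label += 1
    return anomalies


reference = [float(i % 50) for i in range(200)]  # placeholder curve, 200 samples
synthetic = rsds(reference)
print(len(synthetic))  # 6
```

Each synthetic curve differs from the reference in exactly one region, so the classifier sees where the deviation occurs as well as how large it is. Under these assumptions, 10 reference crimps would produce 60 synthetic anomalies.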

Given this strategy for generating synthetic anomaly data, it is critical to address the inherent data imbalance, especially with respect to anomaly data. Simply duplicating the synthetic anomaly data may increase the dataset size, but it does not introduce the variability the MLP needs to learn. This can hinder the model's learning during training.

Therefore, variety must be added to the training dataset, ensuring the quantity, quality, and diversity of data needed for more sophisticated learning. To meet this requirement, data augmentation is implemented by adding noise drawn from a Laplace distribution. This results in a wider range of diverse and challenging samples.
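
The augmentation step can be sketched as follows. Python's standard library has no Laplace sampler, so the example draws one as the difference of two independent exponential variables, which follows a zero-mean Laplace distribution. The noise scale, the number of copies, and the function names are assumptions; the article does not report the values the researchers used.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution with the given scale.

    The difference of two independent exponential variables with rate 1/scale
    is Laplace-distributed with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def augment(curve, copies, scale, rng):
    """Return `copies` noisy variants of `curve`."""
    return [[v + laplace_noise(scale, rng) for v in curve] for _ in range(copies)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
curve = [0.0, 2.0, 6.0, 10.0, 4.0]
variants = augment(curve, copies=10, scale=0.05, rng=rng)
print(len(variants), len(variants[0]))  # 10 5
```

Unlike plain duplication, every variant is slightly different, which gives the classifier the variability that copies alone cannot provide.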

Many fully automatic cutting, stripping and crimping machines are equipped with CFM technology. Photo courtesy of Schleuniger

Results and analysis

To validate the proposed fault detection system, 15 manufacturing datasets were tested. The data was obtained from an actual wiring harness manufacturing plant and was collected between April 19 and May 8, 2023. The data consists of 24,249 entries: 24,152 good crimps and 97 bad crimps.

It is important to emphasize that while the results of our AI model are not directly compared to the results of CFM, the data annotated by CFM proved very valuable for testing our AI model. The CFM system exhibits a commendable level of accuracy. However, it is not exempt from errors. A label obtained from CFM can be considered reliable at a 99% confidence level, leaving at least a 1% chance of inconsistency.

Experiments were conducted in specific scenarios to simulate real-world manufacturing. First, the system starts with the initial data set and checks that at least 10 reference data points are available. In practice, the collection and evaluation of reference data is assumed to be performed manually by the operator. In the experimental setup, however, the first 10 data points with a normal label are used to simplify the process.

The AI model is built after synthetic anomaly data is generated from the reference data. Once the system has processed all remaining data in a data set, it resets the AI model and moves on to the next data set, continuing until the last one is processed. This approach ensures that each dataset is paired with a dedicated AI model that is calibrated to match its unique characteristics.
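
The per-dataset loop can be sketched as below. The real training step (synthetic anomaly generation followed by MLP training) is replaced here by a deliberately trivial stand-in that only remembers the mean peak force, so `build_peak_model`, its 10% tolerance, and the toy curves are all assumptions for illustration.

```python
def build_peak_model(reference_curves, tolerance=0.1):
    """Stand-in for the real training step: remember the mean peak force
    and accept any curve whose peak lies within `tolerance` of it."""
    mean_peak = sum(max(c) for c in reference_curves) / len(reference_curves)
    return lambda curve: abs(max(curve) - mean_peak) <= tolerance * mean_peak


def process_datasets(datasets, build_model, n_reference=10):
    """Build a fresh model per dataset from its first `n_reference` curves,
    then classify that dataset's remaining curves (True means "good")."""
    results = []
    for curves in datasets:
        reference, remaining = curves[:n_reference], curves[n_reference:]
        model = build_model(reference)  # nothing is carried over between datasets
        results.append([model(curve) for curve in remaining])
    return results


# Two toy datasets on very different scales.
small = [[0.0, 1.0, 0.5]] * 10 + [[0.0, 1.02, 0.5], [0.0, 0.6, 0.3]]
large = [[0.0, 50.0, 20.0]] * 10 + [[0.0, 49.0, 20.0], [0.0, 80.0, 30.0]]
print(process_datasets([small, large], build_peak_model))
# [[True, False], [True, False]]
```

A single model shared across both toy datasets would fail on at least one of them, because their scales differ by a factor of 50. Rebuilding the model for each dataset sidesteps that problem.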

About 60 synthetic data sets were generated from the 10 reference data points. These were then expanded to 700 data sets using data augmentation. Because six defect types were created, seven label categories were established, including the normal category. In total, the training data consisted of 770 fully labeled data sets, each assigned to one of seven categories.
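
The article does not spell out how these counts combine. One reading that is consistent with all of them is sketched below; the tenfold augmentation factor is an inference from the totals, not a figure stated by the researchers.

```python
reference = 10    # raw reference crimps (from the article)
defect_types = 6  # synthetic defect classes (from the article)
copies = 10       # assumed augmentation factor, inferred from the totals

synthetic = reference * defect_types  # 60 synthetic anomalies
base = reference + synthetic          # 70 curves before augmentation
augmented = base * copies             # 700 noisy variants
total = base + augmented              # 770 training samples
classes = defect_types + 1            # 7 labels, including the normal class

print(synthetic, augmented, total, classes)  # 60 700 770 7
```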

It is worth noting that only 10 of the 770 data sets are raw reference data. To keep normalization consistent, a MinMax scaler is used.
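
Min-max scaling maps each feature to the range 0 to 1 using the minimum and maximum seen in the training data, and then applies that same mapping to every later sample. The sketch below scales each feature (each position on the curve) independently. Whether the researchers scaled per feature or over the whole dataset is not stated, so that choice and the function names are assumptions.

```python
def fit_min_max(rows):
    """Record the per-feature minimum and maximum over the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def transform(row, mins, maxs):
    """Map each feature to [0, 1] on the training range.

    Features that were constant in training map to 0.
    """
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, mins, maxs)]


train = [[0.0, 10.0, 4.0], [0.0, 12.0, 5.0], [0.0, 8.0, 3.0]]
mins, maxs = fit_min_max(train)
print(transform([0.0, 11.0, 4.5], mins, maxs))  # [0.0, 0.75, 0.75]
```

Fitting the ranges once on the training data and reusing them for incoming crimps is what keeps the normalization consistent between training and detection.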

After scaling, the data was used to train an MLP model with 200 input neurons and two hidden layers of 64 and 32 neurons, respectively. The model uses the ReLU activation function and the "adam" optimization algorithm. An "adaptive" learning rate was adopted, and the maximum number of iterations was set to 500. In the evaluation, accuracy and true negative rate (TNR) were used as the main metrics. Accuracy provides an overall assessment of the model's performance, while TNR specifically measures the system's proficiency in identifying defective items, a key aspect of manufacturing quality control.
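
The two metrics can be computed as in the sketch below, which treats "bad" (defective) as the negative class, as the article does. The function name is hypothetical. The example uses the label counts reported earlier for the CFM system (23,286 good, 97 bad) and a hypothetical detector that labels every crimp good, to show why accuracy alone is misleading on such imbalanced data.

```python
def accuracy_and_tnr(actual, predicted):
    """Return (accuracy, TNR) for sequences of "good"/"bad" labels.

    "bad" is the negative class, so TNR is the share of defective
    crimps that were correctly flagged as bad.
    """
    correct = sum(a == p for a, p in zip(actual, predicted))
    bad_total = sum(a == "bad" for a in actual)
    bad_caught = sum(a == "bad" and p == "bad" for a, p in zip(actual, predicted))
    accuracy = correct / len(actual)
    tnr = bad_caught / bad_total if bad_total else float("nan")
    return accuracy, tnr


actual = ["good"] * 23286 + ["bad"] * 97
always_good = ["good"] * len(actual)
accuracy, tnr = accuracy_and_tnr(actual, always_good)
print(f"{accuracy:.4f} {tnr:.2f}")  # 0.9959 0.00
```

A detector that never rejects anything scores above 99.5% accuracy while catching no defects at all, which is why TNR is reported alongside accuracy.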

To evaluate the effectiveness of this method in detecting defects, it was tested against four well-known anomaly detection algorithms: Isolation Forest, autoencoder, k-means, and histogram-based outlier score (HBOS).

The Isolation Forest algorithm uses a tree structure to identify anomalies, exploiting the fact that an anomaly can be isolated along a shorter path than a normal instance. To optimize its performance, a grid search was performed to determine the most suitable hyperparameters.

The k-means clustering algorithm is an unsupervised method commonly used in data analysis to divide data sets into different clusters. In this approach, if the distance from a data point to the center of the cluster exceeds a predetermined threshold set at the 95th percentile, the data point is marked as an anomaly.

The terminals must be connected to the wire with the proper force. Photo courtesy of Partex Marking Systems

In addition, an autoencoder, a neural network architecture known for its ability to perform dimensionality reduction, was implemented. A sample is flagged as an anomaly when its reconstruction error exceeds a threshold set at the 95th percentile of the training error.
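
Both the k-means and the autoencoder baselines reduce each sample to a single score (distance to the cluster center, or reconstruction error) and flag it when the score exceeds the 95th percentile of the training scores. That thresholding step can be sketched independently of the model that produces the scores. The nearest-rank percentile definition and the function names below are assumptions; the article does not say which percentile method was used.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile: the smallest value with at least q percent
    of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def flag_anomalies(train_scores, new_scores, q=95):
    """Flag every new score that exceeds the q-th percentile of the training scores."""
    threshold = percentile(train_scores, q)
    return [score > threshold for score in new_scores]


train_scores = [0.01 * i for i in range(1, 101)]  # placeholder scores, 0.01 to 1.00
print(flag_anomalies(train_scores, [0.5, 0.99, 2.0]))  # [False, True, True]
```

By construction, a 95th-percentile threshold places about 5% of the normal training scores above the cut, so a detector of this kind is expected to raise some false alarms on good crimps.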

Finally, the study makes use of HBOS, a convenient unsupervised technique that calculates outlier scores based on the distribution of data in a multidimensional space. These four methods were selected for their wide use and effectiveness, and because together they cover a range of anomaly detection techniques. To maintain a controlled comparison, the first 10 data points with a normal label are used as reference data for all algorithms.

The proposed system is notable for its excellent average accuracy of 99.95%. Its TNR was 85.72%, indicating its high sensitivity in detecting anomalies.

HBOS showed an impressive 99.56% accuracy. However, its TNR of 0% on all datasets indicates possible overfitting and a lack of effectiveness in detecting anomalies.

The k-means algorithm achieved 95.39% accuracy and a 93.44% TNR, which is problematic in a manufacturing environment. A difference of about 4.5 percentage points in accuracy relative to the proposed system may not seem like much, but it means a lot of misclassification. In addition, while k-means' TNR appears superior to the proposed system's 85.72%, the small number of true and false negatives suggests that the apparent advantage may not be significant.

The results of the Isolation Forest and autoencoder algorithms indicate underfitting. Specifically, Isolation Forest had an average accuracy of 40.42% and a TNR of 96%, while the autoencoder had an average accuracy of 68.92% and a TNR of 100%.

This study proposes a specific and systematic approach to improving quality control in wire harness crimping by integrating RSDS with AI. The approach uses RSDS to generate synthetic anomaly data, effectively addressing the challenge of having only a limited labeled data set available for robust AI training. Experiments conducted on real industrial datasets show that it is a promising alternative to CFM and that it has advantages over traditional anomaly detection algorithms. This suggests that the integration of AI could help improve manufacturing quality control.
