Is SPC obsolete for real-time monitoring in electronics manufacturing?
Statistical Process Control (SPC) has long been an important tactic for companies looking to ensure high product quality. In modern electronics manufacturing, however, the complexities involved mean that the fundamental requirement of process stability is not met. Combined with the ever-increasing amount of data collected, this makes SPC worthless as a high-level approach to quality management. An approach following Lean Six Sigma philosophy is superior at identifying and prioritizing relevant improvement initiatives.
SPC was introduced in the 1920s and was designed to address the manufacturing of that era. Its purpose was early detection of undesired behavior, allowing for early intervention and improvement. The limitations of SPC were set by the information technology available at the time, a landscape completely different from today's. Tracing Moore’s Law backwards, it is easy to accept that not only IT but also product complexity and capabilities were different from today's. In fact, the measurements taken in manufacturing operations back then bear no comparison to today’s situation. Add to this complexity factors such as globalized markets driving up manufacturing volumes, and the result is that the amount of output data today would be incomprehensible by 1920s standards.
SPC still appears to hold an important position at Original Equipment Manufacturers (OEMs). It is found in continuous manufacturing processes, where control limits are calculated in an attempt to detect out-of-control process parameters. In theory, such control limits help visualize when things are turning from good to worse. A fundamental assumption of SPC is that the process is stable: special cause (assignable) variation has been removed, so that only common cause variation, the natural noise of the process, remains. Any parameter that then starts to drift signals a new special cause, and that is the one you need to worry about.
An electronics product today can contain hundreds of components. It will go through many design modifications due to things such as component obsolescence. It will be tested at various stages of the assembly process, and it will be exposed to multiple firmware revisions, test software versions, test operators, variations in environmental factors, and so forth.
Example of High Dynamics
An example of this is Aidon, a manufacturer of smart metering products. According to their Head of Production, Petri Ounila, an average production batch
- contains 10,000 units
- has units containing over 350 electronic components each
- experiences more than 35 component changes throughout the build process
This gives them a “new” product or process roughly every 280th unit (10,000 units divided by some 35 changes). On top of this come changes to the test process, fixtures, test programs, instrumentation and more. The result is an estimated average of a “new process” every 10th unit or less. Put another way, that is 1,000 different processes in the manufacturing of a single batch.
How would you begin to eliminate special cause variations here?
And even if you managed, how would you go about implementing the alarm system? A tool in SPC, developed by the Western Electric Company back in 1956, is known as the Western Electric Rules, or WECO. It specifies certain rules whose violation justifies investigation, depending on how many standard deviations the observations lie from the mean. One problematic feature of WECO is that it will, on average, trigger a false alarm every 91.75 measurements.
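To make the rules concrete, here is a minimal Python sketch of the four basic Western Electric rules as they are commonly stated. The exact rule set varies slightly between references, so treat this as an illustration under that assumption. The function name weco_alarms is hypothetical, and the mean and sigma are assumed to come from a period when the process was known to be stable.

```python
from typing import List, Sequence


def weco_alarms(values: Sequence[float], mean: float, sigma: float) -> List[int]:
    """Return the indices at which at least one of the four basic rules fires.

    Rule 1: one point more than 3 sigma from the mean.
    Rule 2: two of three consecutive points more than 2 sigma out, same side.
    Rule 3: four of five consecutive points more than 1 sigma out, same side.
    Rule 4: eight consecutive points on the same side of the mean.
    """
    # Express every observation as a distance from the mean in sigma units.
    z = [(v - mean) / sigma for v in values]
    alarms = []
    for i in range(len(z)):
        fired = abs(z[i]) > 3  # Rule 1
        # Check the run rules once for each side of the centerline.
        for side in (1.0, -1.0):
            cur = side * z[i]
            w3 = [side * x for x in z[max(0, i - 2): i + 1]]
            w5 = [side * x for x in z[max(0, i - 4): i + 1]]
            w8 = [side * x for x in z[max(0, i - 7): i + 1]]
            # A run rule is flagged at the point that completes the pattern.
            rule2 = cur > 2 and sum(x > 2 for x in w3) >= 2
            rule3 = cur > 1 and sum(x > 1 for x in w5) >= 4
            rule4 = len(w8) == 8 and all(x > 0 for x in w8)
            fired = fired or rule2 or rule3 or rule4
        if fired:
            alarms.append(i)
    return alarms
```

Note that every additional rule adds another way for pure noise to raise an alarm, which is why the combined rule set produces false alarms far more often than the 3-sigma rule alone.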
False Alarms Everywhere!
Let’s say you have an annual production output of 10,000 units. Each unit is tested through 5 different processes, and each process has an average of 25 measurements. That is 1,250,000 measurements per year. With one false alarm per 91.75 measurements and 220 working days per year, you will on average get 62 false alarms per day.
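The arithmetic behind that figure can be checked with a few lines of Python. This is only a sketch of the calculation using the numbers from the example above; the function and parameter names are made up for illustration.

```python
def false_alarms_per_day(
    units_per_year: int = 10_000,
    processes_per_unit: int = 5,
    measurements_per_process: int = 25,
    measurements_per_false_alarm: float = 91.75,
    working_days_per_year: int = 220,
) -> float:
    """Average number of false alarms per working day for a stable process."""
    measurements_per_year = (
        units_per_year * processes_per_unit * measurements_per_process
    )
    alarms_per_year = measurements_per_year / measurements_per_false_alarm
    return alarms_per_year / working_days_per_year


# 1,250,000 measurements per year give roughly 62 false alarms per working day.
print(round(false_alarms_per_day()))
```

Changing the defaults shows how quickly the alarm load grows with volume: the count scales linearly with units, test processes and measurements per process.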
Let’s repeat that: assuming that you, against all odds and reason, were able to remove special cause variations, you would still be receiving 62 alarms every day. People receiving 62 emails per day from a single source would likely mute them, leaving important announcements unacknowledged, with no follow-up. SPC-savvy users will likely argue that there are ways to reduce this with new and improved analytical methods. “There are Nelson Rules, we have AIAG, and you should definitely use Juran Rules! What about this ground-breaking, state-of-the-art chart developed in the early 2000s, have you given it a go yet?”
So what? Even if we managed to reduce the number of false alarms to 5 per day, would that amount to a strategic alarm system? Adding actual process dynamics to the mix, can SPC provide a system that manufacturing managers can rely on, and that keeps their concerns and ulcers at bay?
What most companies do is make assumptions about a limited set of important parameters to monitor, and carefully track these by plotting them in control charts, X-mR charts or whatever they use to try to separate the wheat from the chaff. These KPIs are often captured and analyzed well downstream in the manufacturing process, often after multiple units have been combined into a system.
An obvious consequence of this is that problems are not detected where they happen, as they happen. The origin could easily be one of the components upstream, manufactured a month ago in a batch that has by now reached 50,000 units. A cost-failure relationship known as the 10x rule says that for each step in the manufacturing process a failure is allowed to pass through, the cost of fixing it increases by a factor of 10. A failure found at system level can mean that technicians need to pick the product apart, which allows new problems to arise. Should the failure be allowed to reach the field, the cost implications can be catastrophic. There are multiple examples from modern times where firms have had to declare bankruptcy, or seek bankruptcy protection, due to the prospect of massive recalls. A recent example is Takata, which filed for bankruptcy after a massive recall of airbag components that may exceed 100 million units.
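As a rough illustration of the 10x rule, the escalation can be written as a one-line formula. The sketch below is in Python; the function name and the starting cost of 1 are assumptions for illustration, not figures from any real product.

```python
def escalated_cost(cost_at_origin: float, steps_escaped: int, factor: float = 10.0) -> float:
    """Cost of fixing a failure after it has passed a number of process steps.

    Under the 10x rule the cost is multiplied by `factor` for every step the
    failure is allowed to pass through before it is caught.
    """
    if steps_escaped < 0:
        raise ValueError("steps_escaped must be zero or positive")
    return cost_at_origin * factor ** steps_escaped


# A fix costing 1 at component level, found three steps later at system level.
print(escalated_cost(1.0, 3))
```

The exponent is what makes late detection so expensive: each extra step multiplies the cost again, so catching a failure one step earlier saves a factor of ten rather than a fixed amount.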
One of the big inherent flaws of SPC, by the standards of modern approaches such as Lean Six Sigma, is that it makes assumptions about where problems are coming from. This is an obvious consequence of assuming stability in what are in reality highly dynamic factors, as mentioned earlier. Trending and tracking a limited set of KPIs only amplifies this flaw. This in turn kicks off improvement initiatives that are likely to fail to focus on your most pressing or most cost-efficient issues.
A Modern Approach
All this is accounted for in modern methods for quality management. In electronics manufacturing this starts with an honest recognition and monitoring of your First Pass Yield (FPY), or True FPY to be more precise. “True” means that every kind of failure must be accounted for, even if it only came from the test operator forgetting to plug in a cable. Every test after the first represents waste, resources the company could have spent better elsewhere. FPY is your single most important KPI, yet most OEMs have no real clue what theirs is.
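A minimal Python sketch of how True FPY could be computed from raw test records follows. The record layout (serial, step, attempt, passed) and all names are hypothetical; a real test data system will have its own schema. The one rule that matters here is that only the first attempt at each test step counts, whatever the reason for the failure.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class RunRecord:
    serial: str    # unit serial number
    step: str      # test step or station name
    attempt: int   # 1 for the first run at this step, 2 for the first retest, ...
    passed: bool


def true_first_pass_yield(records: Iterable[RunRecord]) -> float:
    """Share of units that passed every test step on the first attempt."""
    # Keep only the earliest attempt per unit and step; retests never count.
    first_attempt: Dict[Tuple[str, str], RunRecord] = {}
    for r in records:
        key = (r.serial, r.step)
        if key not in first_attempt or r.attempt < first_attempt[key].attempt:
            first_attempt[key] = r

    # A unit is first-pass only if all of its first attempts passed.
    unit_ok: Dict[str, bool] = {}
    for (serial, _step), r in first_attempt.items():
        unit_ok[serial] = unit_ok.get(serial, True) and r.passed

    if not unit_ok:
        raise ValueError("no test records")
    return sum(unit_ok.values()) / len(unit_ok)


records = [
    RunRecord("A", "ICT", 1, True), RunRecord("A", "FCT", 1, True),
    RunRecord("B", "ICT", 1, False),  # operator forgot a cable: still a failure
    RunRecord("B", "ICT", 2, True), RunRecord("B", "FCT", 1, True),
    RunRecord("C", "ICT", 1, True), RunRecord("C", "FCT", 1, False),
    RunRecord("C", "FCT", 2, True),
    RunRecord("D", "ICT", 1, True), RunRecord("D", "FCT", 1, True),
]
print(true_first_pass_yield(records))
```

In this made-up data set all four units eventually pass, so a yield based on final results would look perfect, while only two of the four units passed every step the first time.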
Real-time dashboards and drill-down capabilities allow you to quickly identify the contributors to poor performance. Here it is apparent that Product B has a single failure contributing to around 50% of the waste. There is no guarantee that Step 4 is included in the monitored KPIs within an SPC system, but it is critical that the trend is brought to your attention.
Knowing your FPY, you can break it down in parallel across different products, product families, factories, stations, fixtures, operators, and test operations. Having this data available in real time as dashboards gives you a powerful “captain’s view”. It lets you quickly drill down to understand the real origin of poor performance and make informed interventions. Distributing this insight as live dashboards to all involved stakeholders also contributes to enhanced accountability for quality. A good rule of thumb for dashboards is that unless the information is given to you, it won’t be acted on. We simply don’t have time to go looking for trouble. As a next step, it is critical that you are able to quickly drill down to a Pareto view of your most frequent failures, across any of these dimensions. At this point it could very well be that tools from SPC become relevant in order to learn more details. But now you know that you are applying them to something of high relevance, not based on educated guesses. You suddenly find yourself in a situation where you can prioritize initiatives based on a realistic cost-benefit ratio.
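The Pareto drill-down described above can be sketched in a few lines of Python. The failure records and their keys ("product", "step") are invented for illustration, as is the function name; any dimension present in your own data could be used the same way.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def pareto(failures: Sequence[Dict[str, str]], dimension: str) -> List[Tuple[str, int, float]]:
    """Rank failures along one dimension, most frequent first.

    Returns (value, failure count, cumulative share of all failures) rows.
    """
    counts = Counter(f[dimension] for f in failures)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for value, n in counts.most_common():
        cumulative += n
        rows.append((value, n, cumulative / total))
    return rows


failures = (
    [{"product": "B", "step": "Step 4"}] * 3
    + [{"product": "B", "step": "Step 2"}] * 2
    + [{"product": "A", "step": "Step 1"}]
)

# The same failure list can be sliced along any recorded dimension.
for row in pareto(failures, "step"):
    print(row)
for row in pareto(failures, "product"):
    print(row)
```

Because the ranking is computed from all recorded failures rather than from a preselected set of KPIs, the top contributor surfaces whether or not anyone thought to monitor it in advance.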
The presence of repair data in your system is also critical; it cannot be contained exclusively in an MES. Among other benefits, repair data supplies contextual information that improves root-cause analysis. From a human-resource point of view it can also tell you whether products are blindly retested, since normal process variation will sometimes bring the measurement within the pass-fail limits, or whether the product is in fact taken out of the standard manufacturing line and fixed, as intended. In short, quality-influencing actions come from informed decisions. Unless you have a data management approach that is able to give you the full picture across multiple operational dimensions, you can never optimize your product and process quality or company profits.
You can’t fix what you don’t measure.