Introduction to Failure Mode and Effects Analysis (FMEA)
There are numerous high-profile examples of product recalls resulting from poorly designed products and/or processes. These failures are debated in the public forum with manufacturers, service providers and suppliers being depicted as incapable of providing a safe product. Failure Mode and Effects Analysis, or FMEA, is a methodology aimed at allowing organizations to anticipate failure during the design stage by identifying all of the possible failures in a design or manufacturing process.
Developed in the 1950s, FMEA was one of the earliest structured reliability improvement methods. Today it is still a highly effective method of lowering the possibility of failure.
What is Failure Mode and Effects Analysis (FMEA)
Failure Mode and Effects Analysis (FMEA) is a structured approach to discovering potential failures that may exist within the design of a product or process.
Failure modes are the ways in which a process can fail. Effects are the ways that these failures can lead to waste, defects or harmful outcomes for the customer. Failure Mode and Effects Analysis is designed to identify, prioritize and limit these failure modes.
FMEA is not a substitute for good engineering. Rather, it enhances good engineering by applying the knowledge and experience of a Cross Functional Team (CFT) to review the design progress of a product or process by assessing its risk of failure.
Design FMEA (DFMEA) explores the possibility of product malfunctions, reduced product life, and safety and regulatory concerns derived from:
- Material Properties
- Interfaces with other components and/or systems
- Engineering Noise: environments, user profile, degradation, systems interactions
Process FMEA (PFMEA) discovers failures that impact product quality, reduce the reliability of the process, cause customer dissatisfaction, or create safety or environmental hazards, derived from:
- Human Factors
- Methods followed while processing
- Materials used
- Machines utilized
- Measurement systems' impact on acceptance
- Environmental factors' effect on process performance
Why Perform Failure Mode and Effects Analysis (FMEA)
Historically, the sooner a failure is discovered, the less it will cost. If a failure is discovered late in product development or at launch, the impact is far more costly and disruptive.
FMEA is one of many tools used to discover failure at its earliest possible point in product or process design. Discovering a failure early in Product Development (PD) using FMEA provides the benefits of:
- Multiple choices for Mitigating the Risk
- Higher capability of Verification and Validation of changes
- Collaboration between design of the product and process
- Improved Design for Manufacturing and Assembly (DFM/A)
- Lower cost solutions
- Legacy, Tribal Knowledge, and Standard Work utilization
Ultimately, this methodology is effective at identifying and correcting process failures early on so that you can avoid the nasty consequences of poor performance.
When to Perform Failure Mode and Effects Analysis (FMEA)
There are several times at which it makes sense to perform a Failure Mode and Effects Analysis:
- When you are designing a new product, process or service
- When you are planning on performing an existing process in a different way
- When you have a quality improvement goal for a specific process
- When you need to understand and improve the failures of a process
In addition, it is advisable to perform an FMEA periodically throughout the lifetime of a process. Quality and reliability must be consistently examined and improved for optimal results.
How to Perform Failure Mode and Effects Analysis (FMEA)
FMEA is performed in seven steps, with key activities at each step. The steps are separated to ensure that only the appropriate team members for each step are required to be present. The FMEA approach used by Quality-One has been developed to avoid typical pitfalls that make the analysis slow and ineffective. The Quality-One Three Path Model allows for prioritization of activity and efficient use of team time.
There are Seven Steps to Developing an FMEA:
1. FMEA Pre-Work and Assemble the FMEA Team
2. Path 1 Development (Requirements through Severity Ranking)
3. Path 2 Development (Potential Causes and Prevention Controls through Occurrence Ranking)
4. Path 3 Development (Testing and Detection Controls through Detection Ranking)
5. Action Priority & Assignment
6. Actions Taken / Design Review
7. Re-ranking RPN & Closure
The Steps for conducting FMEA are as follows:
1. FMEA Pre-Work and Assembly of the FMEA Team
Pre-work involves the collection and creation of key documents. FMEA works smoothly through the development phases when an investigation of past failures and preparatory documents is performed from its onset. Preparatory documents may include:
- Failure Mode Avoidance (FMA) Past Failure
- Boundary/Block Diagram (For the DFMEA)
- Parameter Diagram (For the DFMEA)
- Process Flow Diagram (For the PFMEA)
- Characteristics Matrix (For the PFMEA)
A pre-work Checklist is recommended for an efficient FMEA event. Checklist items may include:
- Requirements to be included
- Design and / or Process Assumptions
- Preliminary Bill of Material / Components
- Known causes from surrogate products
- Potential causes from interfaces
- Potential causes from design choices
- Potential causes from noises and environments
- Family or Baseline FMEA (Historical FMEA)
- Past Test and Control Methods used on similar products
2. Path 1 Development (Requirements through Severity Ranking)
Path 1 consists of inserting the functions, failure modes, effects of failure and Severity rankings. The pre-work documents assist in this task by taking information previously captured to populate the first few columns (depending on the worksheet selected) of the FMEA.
- Functions should be written in verb-noun context. Each function must have an associated measurable. Functions may include:
  - Wants, needs and desires translated
  - Specifications of a design
  - Government regulations
  - Program-specific requirements
  - Characteristics of product to be analyzed
  - Desired process outputs
- Failure Modes are written as anti-functions or anti-requirements in five potential ways:
  - Full function failure
  - Partial / degraded function failure
  - Intermittent function failure
  - Over function failure
  - Unintended function failure
- Effects are the results of failure, where each individual effect is given a Severity ranking. Actions are considered at this stage if the Severity is 9 or 10.
- Recommended Actions that change the product or process design may be considered to address Failure Modes with high Severity rankings (Safety and Regulatory).
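The Path 1 columns can be pictured as a small data structure. The following is only a minimal sketch: the class and field names are hypothetical, the 1 to 10 Severity scale is an assumption (the text only states that 9 and 10 trigger early action), and the example function "seal fluid" is invented for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class FailureType(Enum):
    """The five anti-function categories listed above."""
    FULL = "full function failure"
    PARTIAL = "partial / degraded function failure"
    INTERMITTENT = "intermittent function failure"
    OVER = "over function failure"
    UNINTENDED = "unintended function failure"


@dataclass
class Effect:
    description: str
    severity: int  # assumed 1-10 ranking scale

    def needs_early_action(self) -> bool:
        # Actions are considered during Path 1 when Severity is 9 or 10.
        return self.severity >= 9


@dataclass
class FailureMode:
    function: str      # verb-noun wording, e.g. "seal fluid" (hypothetical)
    measurable: str    # the measurable associated with the function
    failure_type: FailureType
    effects: List[Effect] = field(default_factory=list)

    def max_severity(self) -> int:
        # Highest Severity among this failure mode's effects.
        return max(effect.severity for effect in self.effects)


# Hypothetical worksheet row
leak = FailureMode(
    function="seal fluid",
    measurable="leak rate",
    failure_type=FailureType.PARTIAL,
    effects=[Effect("slow leak, customer annoyed", 6),
             Effect("fluid reaches hot surface", 10)],
)
flagged = [e.description for e in leak.effects if e.needs_early_action()]
```

Here `flagged` contains only the Severity 10 effect, which is the one that would prompt a Recommended Action before Paths 2 and 3 are developed.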
3. Path 2 Development (Potential Causes and Prevention Controls through Occurrence Ranking)
Causes are selected from the design / process inputs or past failures and placed in the Cause column when applicable to a specific failure mode. The columns completed in Path 2 are:
- Potential Causes / Mechanisms of Failure
- Current Prevention Controls (e.g. standard work, previously successful designs)
- Occurrence Rankings for each cause
- Classification of Special Characteristics, if indicated
- Actions are developed to address high risk Severity and Occurrence combinations, defined in the Quality-One Criticality Matrix
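A Severity and Occurrence combination can be classified with a simple lookup. This article does not give the zone boundaries of the Quality-One Criticality Matrix, so every cut-off in the sketch below is a placeholder assumption, as are the function name, the zone labels, and the 1 to 10 ranking scale. It illustrates the idea of a criticality check, not the actual matrix.

```python
def criticality_zone(severity: int, occurrence: int,
                     high_severity: int = 9,
                     severity_cutoff: int = 5,
                     occurrence_cutoff: int = 4) -> str:
    """Classify a Severity/Occurrence pair into an illustrative risk zone.

    All cut-offs are placeholder assumptions and should be replaced with
    the boundaries of whichever criticality matrix the team actually uses.
    """
    for name, value in (("severity", severity), ("occurrence", occurrence)):
        if not 1 <= value <= 10:  # assumed ranking scale
            raise ValueError(f"{name} must be between 1 and 10, got {value}")

    if severity >= high_severity:
        # Severity 9 or 10 is treated as high risk regardless of Occurrence.
        return "high"
    if severity >= severity_cutoff and occurrence >= occurrence_cutoff:
        return "medium"
    return "low"


# Hypothetical cause rows: (description, severity, occurrence)
rows = [("wrong torque", 10, 1), ("seal wear", 6, 5), ("label smudge", 3, 9)]
zones = {name: criticality_zone(s, o) for name, s, o in rows}
```

Keeping the cut-offs as parameters makes it easy to substitute the team's own matrix without touching the classification logic.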
4. Path 3 Development (Testing and Detection Controls through Detection Ranking)
Path 3 Development involves adding Detection Controls that verify that the design meets requirements (for Design FMEA) or that detect a cause and/or failure mode which, if undetected, may reach a customer (for Process FMEA).
- The columns completed in Path 3 are:
  - Detection Controls
  - Detection Ranking
- Actions are determined to improve the controls if they are insufficient for the risks determined in Paths 1 and 2. Recommended Actions should address weaknesses in the testing and/or control strategy.
- Review and updates of the Design Verification Plan and Report (DVP&R) or Control Plans are also possible outcomes of Path 3.
5. Action Priority & Assignment
The Actions that were previously determined in Paths 1, 2 or 3 are assigned a Risk Priority Number (RPN) for action follow-up.
RPN is calculated by multiplying the Severity, Occurrence and Detection Rankings for each potential failure / effect, cause and control combination. Actions should not be determined based on an RPN threshold value. Although thresholds are commonly used, the practice leads to poor team behavior. The columns completed are:
- Review Recommended Actions and assign RPN for additional follow-up
- Assign Actions to appropriate personnel
- Assign action due dates
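The RPN arithmetic described in this step is a single multiplication. The sketch below assumes each ranking is an integer from 1 to 10 (the scale is an assumption, as are the function name and the example rows). It also shows one reason a bare RPN threshold can mislead: two rows with very different Severity can produce the same RPN.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Return Severity x Occurrence x Detection for one worksheet row."""
    for name, value in (("severity", severity),
                        ("occurrence", occurrence),
                        ("detection", detection)):
        if not 1 <= value <= 10:  # assumed ranking scale
            raise ValueError(f"{name} must be between 1 and 10, got {value}")
    return severity * occurrence * detection


# Hypothetical rows: very different risk profiles can share one RPN.
safety_row = rpn(severity=10, occurrence=1, detection=3)    # 10 * 1 * 3 = 30
nuisance_row = rpn(severity=3, occurrence=2, detection=5)   # 3 * 2 * 5 = 30
```

Both rows score 30, yet only the first carries a Severity of 10, which is why Severity and criticality are reviewed alongside the RPN rather than replaced by it.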
6. Actions Taken / Design Review
FMEA Actions are closed when countermeasures have been taken and have successfully reduced risk. The purpose of an FMEA is to discover and mitigate risk. FMEAs that do not find risk are considered weak and non-value-added: the team's effort produced no improvement, so the time spent on the analysis was wasted.
7. Re-Ranking RPN and Closure
After successful confirmation of Risk Mitigation Actions, the Core Team or Team Leader will re-rank the appropriate ranking value (Severity, Occurrence or Detection). The new rankings are multiplied to obtain the new RPN. The original RPN is then compared to the revised RPN to confirm the relative improvement to the design or process. Columns completed in Step 7:
- Re-ranked Severity
- Re-ranked Occurrence
- Re-ranked Detection
- Re-ranked RPN
- Generate new Actions, repeating Step 5, until risk has been mitigated
- Comparison of initial RPN and revised RPN
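The comparison of initial and revised RPN can be expressed as a relative reduction. This is a minimal sketch with hypothetical names and invented ranking values; it assumes the same Severity x Occurrence x Detection product described in Step 5.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ranking:
    severity: int
    occurrence: int
    detection: int

    @property
    def rpn(self) -> int:
        # RPN is the product of the three rankings.
        return self.severity * self.occurrence * self.detection


def rpn_reduction(initial: Ranking, revised: Ranking) -> float:
    """Fraction by which the RPN fell (0.0 = unchanged, negative = worse)."""
    return (initial.rpn - revised.rpn) / initial.rpn


# Hypothetical row before and after a confirmed action
initial = Ranking(severity=7, occurrence=4, detection=5)   # RPN 140
revised = Ranking(severity=7, occurrence=2, detection=2)   # RPN 28
reduction = rpn_reduction(initial, revised)                # 112 / 140 = 0.8
```

A reduction close to zero, or a negative value, would suggest the action has not mitigated the risk and that new Actions should be generated.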
FMEA Document Analysis
Deciding when to take an action on the FMEA has historically been determined by RPN thresholds. Quality-One does not recommend the use of RPN thresholds for setting action targets. Such targets are believed to negatively change team behavior, because teams select the lowest rankings that get them below the threshold rather than rankings that reflect the actual risk requiring mitigation.
The analysis of an FMEA should include multiple level considerations, including:
- Severity of 9 / 10 or Safety and Regulatory alone (Failure Mode Actions)
- Criticality combinations for Severity and Occurrence (Cause Actions)
- Detection Controls (Test and Control Plan Actions)
- RPN Pareto
When completed, Actions move the risk from its current position in the Quality-One FMEA Criticality Matrix to a lower risk position.
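The RPN Pareto in the list above can be produced by sorting rows by RPN and accumulating each row's share of the total. The sketch below is illustrative only: the function name, the tuple layout, and the example causes are assumptions, and a Pareto view is meant to complement the Severity and criticality checks, not to replace them.

```python
from typing import List, Tuple


def rpn_pareto(rows: List[Tuple[str, int]]) -> List[Tuple[str, int, float]]:
    """Sort (label, rpn) rows by RPN, highest first, with cumulative share."""
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    total = sum(value for _, value in ordered)
    if total == 0:
        return []  # nothing to rank

    result = []
    running = 0
    for label, value in ordered:
        running += value
        result.append((label, value, running / total))
    return result


# Hypothetical causes and their RPNs
pareto = rpn_pareto([("cause A", 40), ("cause B", 120), ("cause C", 40)])
for label, value, cumulative in pareto:
    print(f"{label}: RPN {value}, cumulative {cumulative:.0%}")
```

Because Python's sort is stable, rows with equal RPN keep their original worksheet order.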
RPN Action Priority
When risk is determined to be unacceptable, Quality-One recommends a priority of action to be applied as follows:
- Error Proofing (Eliminate Failure Mode or Address Cause)
  - Failure Mode (Only Severity of 9 or 10)
  - Causes with High Occurrence
- Improve Potential Process Capability
  - Increase Tolerance (Tolerance Design)
  - Reduce Variation of the Process (Statistical Process Control and Process Capability)
- Improve Controls
  - Mistake Proofing of the tooling or process
  - Improve the inspection / evaluation techniques
FMEA Relationship to Problem Solving
The Failure Modes in an FMEA are equivalent to the Problem Statement or Problem Description in Problem Solving. Causes in an FMEA are equivalent to potential root causes in Problem Solving. Effects of failure in an FMEA are Problem Symptoms in Problem Solving. More examples of this relationship are:
- The problem statements and descriptions are linked between both documents. Problem solving methods are completed faster by utilizing easy to locate, pre-brainstormed information from an FMEA.
- Possible causes in an FMEA are immediately used to jump start Fishbone or Ishikawa diagrams. Brainstorming information that is already known is not a good use of time or resources.
- Data collected from problem solving is placed into an FMEA for future planning of new products or process quality. This allows an FMEA to consider actual failures, categorized as failure modes and causes, making the FMEA more effective and complete.
- The design or process controls in an FMEA are used in verifying the root cause and Permanent Corrective Action (PCA).
- The FMEA and Problem Solving reconcile each failure and cause by cross documenting failure modes, problem statements and possible causes.
This FMEA Example has one item with a progression through multiple recommended Actions. With each instance, the revised RPN has improved. The final RPN of 10 indicates the issue has been mitigated successfully. The new state should be captured as Standard Work.