FMEA: Explained

Failure mode and effects analysis

Failure mode and effects analysis (FMEA) is a method, first developed for systems engineering (http://en.wikipedia.org/wiki/Systems_engineering), that examines potential failures in products or processes. It may be used to evaluate risk management (http://en.wikipedia.org/wiki/Risk_management) priorities for mitigating known threat-vulnerabilities.
FMEA helps select remedial actions (http://en.wikipedia.org/wiki/Remedial_actions) that reduce cumulative impacts of life-cycle consequences (risks) from a systems failure (fault).
By adapting hazard tree analysis to facilitate visual learning (http://en.wikipedia.org/wiki/Visual_learning), this method illustrates connections between multiple contributing causes and cumulative (life-cycle) consequences.
It is used in many formal quality systems (http://en.wikipedia.org/wiki/Quality_Management_System) such as QS 9000 or ISO/TS 16949.
The basic process is to take a description of the parts of a system, and list the consequences if each part fails. In most formal systems, the consequences are then evaluated by three criteria and associated risk indices:

  • severity (S),
  • likelihood of occurrence (O), also often known as probability (P), and
  • inability of controls to detect it (D)
Each index ranges from 1 (lowest risk) to 10 (highest risk). The overall risk of each failure is called the Risk Priority Number (RPN) and is the product of the Severity (S), Occurrence (O), and Detection (D) rankings: RPN = S × O × D. The RPN (ranging from 1 to 1000) is used to prioritize all potential failures and to decide on actions that reduce the risk, usually by reducing the likelihood of occurrence and improving controls for detecting the failure.
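As a rough illustration of the RPN arithmetic, here is a minimal Python sketch. The `FailureMode` class and the three failure modes with their ratings are made up for the example; only the formula RPN = S × O × D and the 1 to 10 range come from the text.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # S, 1 (lowest risk) to 10 (highest risk)
    occurrence: int  # O, 1 to 10
    detection: int   # D, 1 (easily detected) to 10 (hard to detect)

    def rpn(self) -> int:
        # Reject ratings outside the 1-10 range before multiplying.
        for value in (self.severity, self.occurrence, self.detection):
            if not 1 <= value <= 10:
                raise ValueError("each index must be between 1 and 10")
        return self.severity * self.occurrence * self.detection


# Hypothetical failure modes and ratings, for illustration only.
modes = [
    FailureMode("seal leak", 7, 3, 4),
    FailureMode("sensor drift", 4, 6, 8),
    FailureMode("connector corrosion", 9, 2, 2),
]

# Highest RPN first, so the top of the list is addressed first.
ranked = sorted(modes, key=lambda m: m.rpn(), reverse=True)
for mode in ranked:
    print(mode.name, mode.rpn())
```

Sorting by RPN is only one way to prioritize; a team may still choose to act first on a high-severity item with a modest RPN.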

Applications
FMEA is most commonly applied to, but not limited to, design (Design FMEA) and manufacturing processes (Process FMEA).
Design failure modes effects analysis (DFMEA) identifies potential failures of a design before they occur. DFMEA then goes on to establish the potential effects of the failures, their cause, how often and when they might occur and their potential seriousness.
Process failure modes effects analysis (PFMEA) is a systemized group of activities intended to:

  • Recognize and evaluate the potential failure of a product/process and its effect,
  • Identify actions which could eliminate or reduce the occurrence, or improve detectability,
  • Document the process, and
  • Track changes incorporated into the process to avoid potential failures.

FMEA analysis is very important for dynamic positioning (http://en.wikipedia.org/wiki/Dynamic_positioning) systems.

    Disadvantages

    FMEA is useful mostly as a survey method to identify major failure modes in a system. It is not able to discover complex failure modes involving multiple failures or subsystems, or to discover expected failure intervals of particular failure modes. For these, a different method called fault tree analysis is used.

    History

    The FMEA process was originally developed by the US military in 1949 (http://en.wikipedia.org/wiki/1949) to classify failures “according to their impact on mission success and personnel/equipment safety”. FMEA was later used on the 1960s Apollo space missions. In the 1980s it was used by Ford to reduce risks after one model of car, the Pinto (http://en.wikipedia.org/wiki/Ford_Pinto), suffered a fault in several vehicles that caused the fuel tank to rupture and the car to burst into flames after crashes.

  • Hello Graham,
    In my particular case, I still don't understand very well how to evaluate Severity, Occurrence and Detection. My idea is to first assess the RPN for all functionalities that will be defined in the FS and then tested in the OQ.
    After reading your information: can the ranges be changed to other ranges, e.g. 1 to 3?
    I need to understand the FMEA procedure very well in order to build my SOP.

    A practical example:

    with my new ranges 1=low, 2=middle and 3=high

    My aim is to assess the field “expiration date”,
    S=3
    O=1
    D=1

    Is RPN=3 correct?
    

    Any information you can send me, such as examples or templates, will be of great help.

    Thanks very much!!!
    Carlos

    Graham provided a good overview of the process. What I normally see, though, is a multi-pass evaluation. First, identify all the risks and rate everything to establish the base RPN. Then, in cases where the RPN exceeds your pre-defined threshold for safety, identify mitigations that can reduce the RPN. Those mitigations form the basis for verifiable requirements.
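To make the multi-pass idea concrete, here is a small Python sketch. Everything specific in it is an assumption for illustration: the 1 to 3 scale, the threshold of 8, the two risks, and the mitigation that lowers occurrence and detection ratings.

```python
THRESHOLD = 8  # hypothetical pre-defined threshold; each team sets its own

# Hypothetical risks rated on an assumed 1-3 scale.
risks = {
    "wrong expiration date accepted": {"S": 3, "O": 2, "D": 2},
    "report header misaligned": {"S": 1, "O": 2, "D": 1},
}


def rpn(rating):
    return rating["S"] * rating["O"] * rating["D"]


# Pass 1: rate everything to establish the base RPN.
baseline = {name: rpn(rating) for name, rating in risks.items()}

# Pass 2: for risks above the threshold, apply a mitigation and re-rate.
# The mitigation below (for example an input check plus a review step)
# is assumed to lower occurrence and detection; severity is unchanged.
mitigations = {"wrong expiration date accepted": {"O": 1, "D": 1}}

residual = {}
for name, rating in risks.items():
    if baseline[name] > THRESHOLD:
        mitigated = {**rating, **mitigations.get(name, {})}
        residual[name] = rpn(mitigated)
    else:
        residual[name] = baseline[name]

for name in risks:
    print(name, baseline[name], "->", residual[name])
```

The re-rated values are only claims until the mitigation has been verified, which is why the mitigations feed into verifiable requirements.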

    Regarding your question about ranges for each element of the RPN, yes, you can change them. I’ve seen cases where 1-3 are used and cases where 1-10 are used.

    Just to be on the safe side, you are getting inputs on the rankings from multiple team members, aren’t you? In other words, you’re not just establishing the rankings yourself? It’s important to get a diversity of perspective when establishing the rankings.

    I mentioned above the pre-defined threshold for when you have to take further action. This, too, is defined by you. Most approaches I’ve seen lay out a grid with all possible RPNs and draw a diagonal line somewhere in there delineating the threshold. This is very simplified, of course, and you need to establish a rationale for the threshold. You must prove that, for those risks initially above the threshold, the mitigations are effective in controlling the risk.
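A rough sketch of such a grid in Python, assuming a 1 to 3 scale for each element and a made-up threshold of 8. It simply enumerates every rating combination and marks the ones whose RPN falls above the line; the function name `needs_mitigation` is hypothetical.

```python
from itertools import product

SCALE = (1, 2, 3)  # assumed 1-3 scale for S, O and D
THRESHOLD = 8      # hypothetical threshold; needs its own documented rationale


def needs_mitigation(s, o, d, threshold=THRESHOLD):
    """Return True when the RPN for this combination exceeds the threshold."""
    return s * o * d > threshold


# Every possible (S, O, D) combination mapped to its RPN.
grid = {(s, o, d): s * o * d for s, o, d in product(SCALE, repeat=3)}

# Combinations that land above the threshold and so require further action.
above = sorted(combo for combo, value in grid.items() if value > THRESHOLD)

for combo in above:
    print(combo, grid[combo])
```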

    You may want to pick up a copy of ISO 14971. Much more detail and guidance provided there for risk management in general.

    I think there is a little confusion here. The first and foremost thing to decide is what are you assessing for?

    In the validation business we have been instructed through the GMPs that we may vary the scope and intensity of our validation, provided it remains adequate.

    ‘Our’ risk assessment is the tool that we use to define this scope and intensity of such validation.

    We are not designing (where the FMEA is a great tool) anything, and neither have we any say in the formulations or the manufacturing process, these have already been subjected to many forms of review and risk assessment. We are charged with the simple task of documenting the most cost effective degree of validation that will ensure all company cGMP and regulatory requirements are satisfied.

    Alex Kennedy

    1. START
    2. COLLECTION OF PROCESS INFORMATION USING PROCESS FLOW CHART ETC.
    3. IDENTIFICATION OF POSSIBLE FAILURE MODES, THEIR CAUSES AND EFFECTS USING FAULT TREE ANALYSIS, FISHBONE DIAGRAM, 5 WHYS ETC.
    4. EVALUATION OF PROBABILITY OF OCCURRENCE, SEVERITY OF FAILURE & PROBABILITY OF DETECTION
    5. DETERMINE RISK PRIORITY NUMBER
    6. IF RPN GREATER THAN LIMIT, DEFINITION AND IMPLEMENTATION OF MEASURES.
    7. IF RPN SMALLER THAN LIMIT, THEN END OF PROCESS

    Yes, the range you assign, whether 1-3 or 1-10, is not mandatory, but the more values you provide (e.g. 1-10), the easier the later stage of risk analysis will be, e.g. when you have many risks and want to prioritize them. A 1-3 scale will give the same score to risks of different weight, whereas 1-10 weights the risks more accurately.

    This will help you at the later stage of risk analysis. Suppose you have many risks and want to prioritize them. With a 1-3 scale you will find clusters of risks with the same weight, say P = 3, S = 2 and D = 2, RPN = 12, which makes it difficult to prioritize your risks for taking measures. With 1-10 fewer risks will share the same weight, which makes it easier to prioritize them.
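One way to see the clustering effect is to count how many distinct RPN values each scale can produce at all. The sketch below is only an illustration; the helper name `rpn_counts` is made up, and it assumes the plain product S × P × D on both scales.

```python
from collections import Counter
from itertools import product


def rpn_counts(top):
    """Count how many (S, P, D) combinations give each RPN on a 1..top scale."""
    scale = range(1, top + 1)
    return Counter(s * p * d for s, p, d in product(scale, repeat=3))


small = rpn_counts(3)   # 1-3 scale
large = rpn_counts(10)  # 1-10 scale

print(len(small), "distinct RPN values from", sum(small.values()), "combinations")
print(len(large), "distinct RPN values from", sum(large.values()), "combinations")

# How many different rating combinations collapse onto RPN = 12 on the 1-3 scale.
print(small[12])
```

The wider scale offers many more distinct levels to separate risks, although even there different rating combinations can still share the same RPN.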

    Dear alex

    Pls assign the SPD values as follows and calculate the RPN number subsequently,

    Assigning severity as:

    1. High severity: Event that is expected to have a very significant negative impact. The impact shall have significant long term effects and potentially catastrophic short term effects. (3)
    2. Medium severity: Event that is expected to have a moderate impact. The impact shall have short to medium – term detrimental effects.(2)
    3. Low severity: Event that is expected to have no impact. (1)

    Assign probability as:

    1. High probability: Event is perceived to be highly likely. (3)
    2. Medium probability: Event is perceived to be reasonably likely.(2)
    3. Low Probability: Event is perceived to be unlikely. (1)

    Assigning Detectability as:

    1. High Detectability: Event has direct impact on product quality and the impact is easily detectable. (1)

    2. Medium Detectability: Event has direct impact on product quality and its impact might be detectable. (2)

    3. Low Detectability: Event may have some impact on product quality but it is difficult to detect. (3)

    5.0 FMEA team shall calculate the Risk Priority Number (RPN) for each failure mode by multiplying Severity (S), Probability (P) and Detection (D) values.
    RPN = S x P x D
    From 1 to 4: Low risk
    From 5 to 13: Moderate risk
    From 13 to 27: High risk
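The banding above could be sketched in Python roughly as follows. One assumption is flagged here and in the code: the listed ranges put 13 in two bands, so the sketch treats the high band as starting at 14, which changes nothing in practice because no product of three values from 1 to 3 equals 13. The function name `risk_class` is hypothetical.

```python
def risk_class(s, p, d):
    """Return (RPN, band) for S, P and D rated on the 1-3 scale."""
    for value in (s, p, d):
        if value not in (1, 2, 3):
            raise ValueError("S, P and D must each be 1, 2 or 3")
    rpn = s * p * d
    if rpn <= 4:
        return rpn, "low"
    # Assumption: moderate ends at 13 and high starts at 14.
    if rpn <= 13:
        return rpn, "moderate"
    return rpn, "high"


# Ratings from earlier in the thread: S=3, O=1, D=1 and P=3, S=2, D=2.
print(risk_class(3, 1, 1))
print(risk_class(2, 3, 2))
print(risk_class(3, 3, 2))
```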

    Let’s be clear that the method presented above is ONE way to quantify risk; not THE way. If a company can better manage risk by assigning numbers between 1…100 then they’re certainly allowed to do so. If a company has a different formula for calculating the RPN, then they’re certainly allowed to do so.

    (P.S. I presume your ‘high risk’ range is actually 14 to 27 - although it wouldn’t be possible, using your formula to get a 13, it’s always best to have clear delineation between categories.)

    Dear All,
    I think we are all on the same page; the difference is in understanding and interpretation.
    As the FMEA tool is applied specifically to identify the potential risk of failure in quality, process, equipment or method, we need to assess risk based on the severity, occurrence and detectability of the failure. Here I would like to emphasize detectability. Detectability should be judged on detection before the harm occurs rather than detection after the harm has occurred.
    As far as the scale for assessment is concerned, I feel it would be quite realistic to categorise harm/occurrence and detectability as follows.
    Very Low
    Low
    Moderately High
    High
    Very High
    We can very easily mould it into a scale of 1 - 5 with reasonable rationales.
    Here we need to use a reversed scale of 5 - 1 for detectability: if the probability of detection before harm is caused is low, it shall be ranked as 5.
    Further, it would be well justified to set the acceptability level below 50%, i.e. based on the mean value of the scale adopted, although criticality may influence it.
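A small Python sketch of the five-level scale with the reversed ranking for detectability. The function names are made up, and the sketch covers only the mapping of labels to ranks; the acceptability level is left out because its exact definition is a judgment call.

```python
# The five categories listed above, from lowest to highest.
LEVELS = ["Very Low", "Low", "Moderately High", "High", "Very High"]


def rank(level):
    """Rank for harm or occurrence: Very Low -> 1 ... Very High -> 5."""
    return LEVELS.index(level) + 1


def detectability_rank(level):
    """Reversed rank: a Very Low chance of detection before harm -> 5."""
    return len(LEVELS) + 1 - rank(level)


print(rank("High"), detectability_rank("High"))
print(rank("Very Low"), detectability_rank("Very Low"))
```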

    Rgds
    Ravi Dhanbhar

    Generally, no guideline prescribes how to assign the risk priority number. All the guidelines advise rating severity, probability and detectability to quantify the risk and then assigning controls to reduce it. So everyone assigns different values in a systematic way to quantify and mitigate the risk. Generally the RPN value is divided into three classes: high risk, moderate risk and low risk. If you want a finer classification, you can add three more classes: very low, moderately high and very high. The basic concept of risk management is to reduce the risk in the process or system. You can choose any values at your convenience. No one is wrong: you can choose 1-3 or 1-10.

    Regards

    I agree with the views expressed. Our effort should be to arrive at a precise risk factor which represents a realistic magnitude of harm.
    It can range from 1 to 3, 1 to 5 or 1 to 10.
    In GMP we know that too little and too much data both lead to confusion, hence an appropriate volume of data should be considered.
    Rgds
    Ravi Dhanbhar

    When should the post-mitigation risk assessment be done? My question is: after identifying/proposing the risks and control measures, or after completing the proposed actions?

    Lucky
    Sr.Analyst.
    Quality Assurance