Single Point of Failure
What it is: A single point of failure (SPOF) is any person, facility, piece of equipment, software, or other resource that has no redundancy and whose failure completely stops the process or system that depends on it. When it goes down, the day-to-day functioning of that process or system is disrupted.
Methods to Identify SPOFs
Obvious failure points can often be found by common sense. For the rest, a detailed risk assessment can be performed.
Risk assessment tools such as PFMEA (Process Failure Mode and Effects Analysis) can help find such failure points in any process.
During the design phase itself, DFMEA (Design Failure Mode and Effects Analysis) can be carried out.
HIRA (Hazard Identification and Risk Assessment) can be performed to find operation-related failure modes.
HAZOP (Hazard and Operability Study) is another widely used risk assessment tool in the process industry.
With these tools, most SPOFs can be found.
Methods to Mitigate Such Risks
All SPOFs found must be rated for Severity, Occurrence, and Detection so that mitigation plans can be prioritized.
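One common way to combine the three ratings into a priority is the Risk Priority Number used in FMEA, RPN = Severity x Occurrence x Detection. The sketch below assumes a 1 to 10 scale for each rating. The class name, the SPOF names, and all scores are hypothetical illustrations, not data from a real assessment.

```python
from dataclasses import dataclass


@dataclass
class Spof:
    name: str
    severity: int    # assumed scale: 1 (negligible) to 10 (plant stops)
    occurrence: int  # assumed scale: 1 (rare) to 10 (frequent)
    detection: int   # assumed scale: 1 (caught early) to 10 (undetectable)

    @property
    def rpn(self) -> int:
        # Risk Priority Number: product of the three ratings
        return self.severity * self.occurrence * self.detection


def prioritize(spofs):
    """Return the SPOFs sorted from highest to lowest RPN."""
    return sorted(spofs, key=lambda s: s.rpn, reverse=True)


# Hypothetical ratings, for illustration only
spofs = [
    Spof("Critical machine", severity=9, occurrence=4, detection=5),
    Spof("Database server", severity=8, occurrence=3, detection=3),
    Spof("Single trained operator", severity=6, occurrence=5, detection=2),
    Spof("Air compressor", severity=9, occurrence=2, detection=5),
]

for s in prioritize(spofs):
    print(f"{s.name}: RPN = {s.rpn}")
```

With these assumed scores the critical machine ranks first (RPN 180), so its mitigation plan would be implemented first.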
To mitigate such failures, the following actions can be taken:
Create a redundancy plan for such failures. Support from an internal or external team can be used for this.
Keep a buffer stock of such parts.
If the SPOF is a person whose skill is essential to day-to-day operation, additional people must be trained or hired.
Design a flexible system layout that eliminates the single point, so that other available systems can be used in case of failure.
Example:
Take the case of a manufacturing plant:
Among all the machines, some critical machines or machine parts exist in only one or two units, and if one fails, all manufacturing stops.
A malfunction of the database system can stop the internal working of all machines.
Some specific tasks can be done only by person X, because only that person has been trained for the task or no one else is interested in it.
Air (utility) is critical for some manufacturing operations, and its failure can stop the plant.
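The cases above share one pattern: the plant needs a resource and has no spare unit of it. A minimal sketch of that check follows, assuming a simple inventory that records how many units are available and how many are required to run. The function name, the inventory entries, and the counts are hypothetical.

```python
def find_spofs(resources):
    """Return the names of resources that have no spare unit.

    `resources` maps a name to (units_available, units_required).
    A resource is flagged when losing one unit would leave fewer
    units than the plant requires.
    """
    return [
        name
        for name, (available, required) in resources.items()
        if available - 1 < required
    ]


# Hypothetical plant inventory, for illustration only
plant = {
    "Critical machine": (1, 1),
    "Database server": (1, 1),
    "Operator trained on task X": (1, 1),
    "Air compressor": (1, 1),
    "Forklift": (3, 2),  # one spare unit, so not a SPOF
}

print(find_spofs(plant))
```

Counting required units, rather than only checking for a single unit, also catches the case where two machines exist but both are needed at the same time.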
To mitigate these:
Critical machine components must be procured and stored in inventory.
An additional standby server should be available in case any issues arise.
New staff must be hired, or existing staff trained, for the critical task.
A standby air compressor can be installed to mitigate a loss of air pressure.
Where these measures cannot be implemented, put a system in place that detects the failure before it occurs.
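Early detection can be as simple as a warning threshold set above the level at which the process actually fails. The sketch below applies this to air pressure. The function name, the threshold values (in bar), and the readings are all assumptions for illustration and are not taken from any standard or real plant.

```python
def check_pressure(reading_bar, warn_below=6.5, fail_below=6.0):
    """Classify one air-pressure reading.

    Both thresholds are assumed values: `fail_below` is the pressure at
    which the operation would stop, and `warn_below` is set higher so
    that a warning is raised before that point is reached.
    """
    if reading_bar < fail_below:
        return "FAIL"
    if reading_bar < warn_below:
        return "WARN"
    return "OK"


# Hypothetical readings showing a slow pressure drop
readings = [7.0, 6.8, 6.4, 6.1]
for r in readings:
    print(r, check_pressure(r))
```

Here the last two readings return "WARN" while the plant is still running, which gives maintenance time to act before the failure level is reached.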