In the first part, I talked about problems in general and chronic problems. Now I’m going to discuss the other type: emergencies, or acute problems.
Acute cases are more critical. We are unable to fix the equipment until we identify the cause of the problem. This is an extremely tense situation, especially in fast-paced environments. Usually, after the most common repair options have failed, technicians tend to start replacing every component in the system, sometimes all of them at once. This is highly ineffective and only complicates the situation further, because you can no longer tell which change, if any, made a difference. Consequently, the most important thing in these cases is to adopt a methodical approach, even if it seems slower. This approach is called the fault isolation process.
To illustrate this, I will describe one of my own experiences. Many years ago, a helicopter had a problem with the starter-generator in one of the turbines. The technicians replaced the component, but the problem persisted, preventing the aircraft from flying.
I was the maintenance and engineering manager, and to ensure service availability, I coordinated a replacement helicopter with the operations department to avoid canceling the flight task and to gain some time. However, I needed to put the aircraft back in service because it was scheduled for the following day, and I had no replacement for that time.
When I arrived at the workshop, the technicians had already replaced almost all of the system’s components with no beneficial results. So, I decided to start from scratch.
First, we analyzed the electrical system diagrams and formed hypotheses about the possible causes and the most critical subsystems. Then, I guided the technicians through a fault isolation process. We systematically eliminated every possible cause from the “suspects list” until we found the problem.
We tested every part and component, starting with the ones on the list made during the diagram analysis. Then, we eliminated the different areas of the wiring and installation, one by one, until we finally found a defective cable that carried a control signal from the starter to the control unit. It was located very deep inside the aircraft fuselage. Once it was detected, the technicians were able to replace it without any problems.
To summarize, in this case, the most important things to consider are:
- Do not replace all the components at the same time, because you will not know which one was at fault.
- For complex systems, keep notes of the actions performed, the measurements taken, and the spare parts replaced.
- First analyze the system schematics, in a group if possible, to understand the system and suggest possible causes.
- Start by testing the systems that are most likely to be the cause of the problem.
- Do not overlook simple causes like power sources, switches, cables, plugs, etc.
- Before discarding a possible cause after a test, be sure that the test eliminates all doubts about it. Otherwise, if you cannot find the problem, you will have to start over.
- Work fast but effectively. Sometimes operational pressures are extreme, but try to look at the big picture and realize that ultimately a methodical fault isolation process will be the most time-effective option.
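For readers who think in code, the checklist above can be sketched as a small Python routine. This is only an illustration under my own assumptions, not a maintenance tool: the `Suspect` class, the `isolate_fault` function, the likelihood values, and the stand-in test results are all hypothetical. It shows the core idea: rank the suspects, test them one at a time, and log every result.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Suspect:
    name: str
    likelihood: float          # rough estimate from the schematic review (assumed values)
    test: Callable[[], bool]   # returns True when the item passes its test


def isolate_fault(suspects: List[Suspect]) -> Tuple[Optional[str], List[Tuple[str, str]]]:
    """Test suspects one at a time, most likely first, and log every result."""
    log: List[Tuple[str, str]] = []
    ranked = sorted(suspects, key=lambda s: s.likelihood, reverse=True)
    for suspect in ranked:
        passed = suspect.test()
        log.append((suspect.name, "pass" if passed else "FAIL"))
        if not passed:
            # Stop at the first failed test: this is the isolated fault.
            return suspect.name, log
    # Every suspect passed: the list was incomplete or a test was not conclusive.
    return None, log


# Hypothetical suspects list; the lambdas stand in for real bench tests.
suspects = [
    Suspect("starter-generator", 0.5, lambda: True),
    Suspect("control unit", 0.3, lambda: True),
    Suspect("control signal cable", 0.1, lambda: False),
    Suspect("power source", 0.2, lambda: True),
]

fault, log = isolate_fault(suspects)
for name, result in log:
    print(f"{name}: {result}")
print("Isolated fault:", fault)
```

Note that the routine returns the full log even when no fault is found. In that case the notes tell you exactly which tests to revisit instead of starting over blind.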
What are your experiences in fault isolation and root cause analysis?
Which was your most difficult case?
Do you have a different approach?
Please share your viewpoint in the comment section!
Thanks for reading!