How To Prevent Unexpected Downtime In Heavy Industrial Operations

Unexpected machinery downtime is a major concern in every industry, but particularly in heavy industrial operations, where equipment is pushed hard every day. Such machinery often must operate in extremes of temperature, pressure, humidity, and dust, with numerous moving parts subject to repeated impact and vibration.

The real cost of running assets to failure

Reactive maintenance – waiting for something to break before acting – is still common in heavy operations, often because the upfront investment in alternatives feels harder to justify than the cost of an emergency repair crew. The math on that stops working the moment a critical asset goes down mid-shift and takes a production line with it.

Precision installation isn’t optional

Poorly performed maintenance routinely shortens asset life, and the good news is that the data needed to diagnose it has almost always already been collected by the system. Predictive maintenance aims to avoid machine failure by analyzing magnetic, thermographic, ultrasonic, and vibration data for signals that a failure is about to occur.

The technology exists to analyze the data for hundreds or even thousands of machines and report back that six of them need a belt replacement, one has a slight misalignment, two need bearing replacements, and one is simply operator error. Fix the belts, the alignment, and the bearings, and coach the operator on the appropriate response. The result: healthier machines, six months of life restored to the belts and bearings, and a free early warning for the operator.
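As a rough illustration of the fleet-wide screening described above, the sketch below compares each machine's latest vibration reading against its own history and flags the outliers. The function name, the machine IDs, the RMS velocity units, and the threshold values are all hypothetical. A real system would also need spectral analysis and expert judgment to tell a worn belt from a failing bearing; this only shows the first step of deciding where to look.

```python
from statistics import mean, stdev


def flag_machines(history, latest, z_threshold=3.0, min_spread=0.05):
    """Return sorted IDs of machines whose latest reading is far above baseline.

    history: dict of machine ID -> list of past RMS vibration readings
             (units are an assumption, e.g. mm/s)
    latest:  dict of machine ID -> most recent reading
    """
    flagged = []
    for machine_id, readings in history.items():
        # A baseline needs at least two past readings and a current value.
        if len(readings) < 2 or machine_id not in latest:
            continue
        baseline = mean(readings)
        # Floor the spread so a perfectly flat history cannot divide by zero.
        spread = max(stdev(readings), min_spread)
        z_score = (latest[machine_id] - baseline) / spread
        if z_score > z_threshold:
            flagged.append(machine_id)
    return sorted(flagged)


# Hypothetical example data.
history = {
    "pump-01": [2.0, 2.1, 1.9, 2.0],
    "fan-07": [3.0, 3.1, 2.9, 3.0],
}
latest = {"pump-01": 2.05, "fan-07": 4.5}
print(flag_machines(history, latest))
```

Comparing each machine against its own baseline, rather than a single fleet-wide limit, is one reasonable design choice here because normal vibration levels differ widely between asset types.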

But some number of companies that invested in predictive maintenance – let’s be kind and say 60% – dropped it within two years because “it didn’t work.” They were half right. The analytics on their own didn’t deliver those results. The companies and their vendors failed to understand the full package of work required to extract value from the parts of predictive maintenance they hadn’t mastered before.

A new file folder doesn’t make you better organized, and a new camera doesn’t make you a better photographer. A predictive system doesn’t give you the answers you need unless you know which questions to ask. That generally requires outside expertise, and for a fleet of tens of thousands of machines, it is likely to mean a few engineer-years of it.

Know when to bring in external specialists

There are plant assets that just shouldn’t be overhauled in-house, and centrifugal pumps are an excellent example. Internal components like impellers and wear rings cannot be successfully rebuilt in a maintenance shop. Made from hardened metal, these parts demand the tightest tolerances, which means access to both precision manufacturing equipment and calibrated, specialized balancing equipment. Environmental testing baths are also critical to establishing that the pump meets all design parameters.

To a warehouse or processing manager, cavitation damage can also look a great deal easier to address than it actually is. The damage cannot be observed while it is occurring, and finding it afterward usually takes inspection under light and magnification. The problem is often underestimated: it can persist in unmanaged equipment, and poorly rebuilt pumps can actually create new instances of it by introducing out-of-spec parts.

When a fluid-handling system is showing signs of internal wear, operations managers can limit risk by going directly to certified industrial pump repair specialists who can answer the question “Can this pump be economically returned to OEM specifications, and how would you go about it?” If they are not sure, you shouldn’t be either.

Operators are your first line of detection

Reliability engineers are the ones who design the monitoring systems. The operators are the ones who have to live with the machine. That is the key difference between the two roles: engineers are supplied with early warnings by their systems, while operators are in a position to catch them firsthand.

Autonomous maintenance is the part of the TPM (Total Productive Maintenance) philosophy that trains floor operators to perform small but critical equipment checks during their daily rounds and, more importantly, to log small but critical changes: a few drops of fluid, a noise on startup, a 10-degree heat increase on a gearbox housing. None of these are “constellations of ambulances,” as one of our industry clients put it, but they are all potential data points.
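To make the idea of logging small changes concrete, here is a minimal sketch of what an operator observation record and a simple trend check might look like. The `Observation` fields, the asset names, and the readings are hypothetical, and the choice of Celsius is an assumption, since the 10-degree figure above does not specify a unit.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Observation:
    """One entry from an operator's daily round (hypothetical schema)."""
    asset_id: str
    day: date
    housing_temp_c: float  # assumed unit: degrees Celsius
    note: str = ""


def temperature_rise_alerts(observations, rise_c=10.0):
    """Return (asset_id, rise) for assets whose latest logged temperature
    is at least rise_c above their earliest logged temperature."""
    by_asset = {}
    for obs in sorted(observations, key=lambda o: o.day):
        by_asset.setdefault(obs.asset_id, []).append(obs)

    alerts = []
    for asset_id, series in by_asset.items():
        if len(series) < 2:
            continue  # a single reading cannot show a trend
        rise = series[-1].housing_temp_c - series[0].housing_temp_c
        if rise >= rise_c:
            alerts.append((asset_id, rise))
    return alerts


# Hypothetical log entries from three days of rounds.
log = [
    Observation("gearbox-3", date(2024, 3, 1), 60.0),
    Observation("gearbox-3", date(2024, 3, 2), 64.0, "slight noise on startup"),
    Observation("gearbox-3", date(2024, 3, 3), 71.0),
    Observation("conveyor-1", date(2024, 3, 1), 45.0),
    Observation("conveyor-1", date(2024, 3, 3), 46.0),
]
print(temperature_rise_alerts(log))
```

The free-text `note` field matters as much as the number: a drip or a startup noise has no sensor reading, but it is still a data point once it is written down against an asset and a date.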

The importance of this methodology cannot be overstated. One of the most useful capabilities of a properly tuned CMMS (computerized maintenance management system) is that it can tell you up to two years ahead which parts you should order, leaving plenty of time to shop around for the best price. But to get that kind of helicopter view of your maintenance department, you must have granular, useful data from the people who stand by the equipment for eight hours a day.
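The forward-looking ordering described above depends on nothing more exotic than a clean history of when parts were replaced. The sketch below is a deliberately simple illustration of that reasoning, not how any particular CMMS works: it averages the intervals between logged replacements, projects the next one, and backs off by the supplier lead time. The function name, dates, and lead time are hypothetical.

```python
from datetime import date, timedelta


def next_order_date(replacement_dates, lead_time_days):
    """Estimate the latest date to order a part, based on the mean interval
    between past replacements. Returns None if there is too little history."""
    dates = sorted(replacement_dates)
    if len(dates) < 2:
        return None
    # Days between each consecutive pair of replacements.
    intervals = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    mean_interval = sum(intervals) / len(intervals)
    next_due = dates[-1] + timedelta(days=round(mean_interval))
    return next_due - timedelta(days=lead_time_days)


# Hypothetical bearing replacements, 100 days apart, with a 30-day lead time.
bearing_history = [date(2023, 1, 1), date(2023, 4, 11), date(2023, 7, 20)]
print(next_order_date(bearing_history, lead_time_days=30))
```

A mean interval is a crude estimator, and real forecasts would account for operating hours and condition data. The point is that even this crude version is impossible without consistent logging at the asset level.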

Don’t skip root cause analysis on unexpected failures

Replacing a failed component and putting an asset back into operation without conducting a Root Cause Analysis (RCA) is a common trigger for a repeat failure. The failed component is not the problem; it is the symptom. The problem could be pipe strain causing a pump casing to take load, fluid contamination leading to early internal failure, or a systemic process issue that has been slowly destroying the component for months.

RCA doesn’t have to be a long, drawn-out bureaucratic task for every failure. It needs to be enough a part of the organization’s DNA that someone asks the simple questions: what changed, what condition allowed the failure to occur, and does that condition still exist? If the answer to the last question is yes, the repair is likely to be a waste of time.
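One lightweight way to build those three questions into routine work is to make them a required part of closing out a repair. The sketch below is an assumption about how that could look in a simple script or work-order tool; the `RcaCheck` class and `ready_to_close` function are hypothetical names, not features of any specific system.

```python
from dataclasses import dataclass


@dataclass
class RcaCheck:
    """Answers to the three questions asked after an unexpected failure."""
    what_changed: str
    enabling_condition: str
    condition_still_exists: bool


def ready_to_close(check: RcaCheck) -> bool:
    """A repair is ready to close only when the first two questions have
    real answers and the enabling condition has been removed."""
    answered = bool(check.what_changed.strip()) and bool(check.enabling_condition.strip())
    return answered and not check.condition_still_exists


# Hypothetical example: the pump was repaired, but the pipe strain remains.
check = RcaCheck(
    what_changed="new discharge piping installed last quarter",
    enabling_condition="pipe strain loading the pump casing",
    condition_still_exists=True,
)
print(ready_to_close(check))
```

Keeping the check to three fields is intentional: it is short enough to fill in for every unexpected failure, which is what keeps it from turning into the bureaucratic exercise described above.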