Andrey Kostyukov, President, DYNAMICS Scientific Production Center USA, Inc.
It is well known that the primary costs and losses in petroleum refining come from sudden equipment failures. A distributed control system (DCS) supplies comprehensive information on the process and on the equipment’s operating modes. The health of the equipment during operation, however, escapes the attention of operators, who continuously change process modes while controlling the plant. Sustainable and reliable operation requires giving operators timely, unbiased information on how the operating mode affects equipment health before a failure occurs, while early diagnostics of defects still allows people to eliminate future issues at their emergent stage. To answer what benefits digital reliability could bring to a petroleum refinery, let’s consider the difference between reliability and digital reliability, what diagnostics and early diagnostics mean, how protection, condition monitoring, and diagnostic systems differ, and, finally, how 24/7 real-time diagnostics adds value to every processing facility.
The conventional approach to keeping machinery reliability at a certain level hasn’t changed over the last 70 years. It’s based on three primary pillars (Figure 1): a safety factor, which is the ratio of a machine’s structural capability to the actual applied load; the operational discipline of personnel, who always strive to do the right things in the right way; and failure recognition capability, which is the ability to predict a failure before it happens. At the beginning of the reliability era, the primary method for extending the reliable uptime of machinery was increasing the safety factor; however, raising structural capability far above the actual applied load drove up the cost of machinery, which hurt consumers’ interests. Customers demanded cheaper machines and agreed to trade excess structural capability for better operational discipline among staff.
Since the 1950s, several approaches to ensuring the reliability of machines have been developed and implemented, such as preventive maintenance, proactive maintenance, reliability-centered maintenance, and others, but all of them share two common issues: poor statistics on machinery lifespan and the human factor. To limit the influence of those factors on reliability, the main focus has shifted since the 1970s to increasing failure recognition capability. To this day, scientists, engineers, and practitioners use various instruments and methods to try to pinpoint the moment a machine begins to fail so that its breakdown can be prevented. We could name that era “Analog Reliability” because people strove to increase reliability by developing each of those pillars, making them stronger, clearer, and more precise.
“Digital Reliability” emerged at the beginning of the 1990s, when PCs were first used for reliability tasks. Since then, computers have played a significant role in the evolution of reliability theory and practice for several reasons: they compute complex functions very fast, process large amounts of data, and deliver information for better decision making to different places and to many people simultaneously. The advantages of digitalization are obvious. If everybody uses unbiased information about a particular situation, nobody needs to guess. No one has to make assumptions, weigh several scenarios, calculate their probabilities, and finally pick the most favorable decision, which in some cases might be wrong. Guessing simply makes no sense when the facts are available. Getting there, however, requires an infrastructure for collecting useful data from relevant sources, processing the data to extract information from it, and sharing that information with the right audience in a timely manner. We recognize data as useful only if it contains information about the object we are interested in. Furthermore, the sources of the data should be trustworthy; in other words, they must be directly related to the object being described. In the analog era, we lacked data to analyze because data gathering was very expensive. Nowadays, we can collect big data at a reasonable price, but unless we are able to get the proper data and process it promptly, we still face the same lack of information as before, and no return on the investment can be expected. Moreover, even if we extract valuable information from trustworthy sources, we cannot benefit from it unless we use it to make timely decisions. Thus, the most important advantages of Digital Reliability are the ability to extract preventive information from Big Data and to share it with everyone in a timely manner.
Let’s consider the difference between machinery diagnostics and early diagnostics. Diagnostics is the process of identifying a malfunction, and it consists of several steps. The first and most crucial step is collecting data about machinery health, which involves choosing the data type, data sources, data volume, frequency of data gathering, and the tools for gathering it. Any mistake at this step severely affects the diagnosis, making it either wrong or late. The curve in Figure 2 shows how costs and losses depend on when a fault is detected. Over its lifespan, equipment passes through three stages of degradation: non-linear wear, exponential wear, and critical wear. Red dots on the curve mark the boundaries between these stages. The diagnostic process has two distinct errors: an error of static recognition and an error of dynamic recognition (Kostyukov & Kostyukov, 2009). The error of static recognition occurs when a failure cannot be detected because either the wrong non-destructive testing (NDT) method is used or the proper NDT method is used incorrectly. The error of dynamic recognition occurs when a failure cannot be detected and prevented because the monitoring interval equals or exceeds the interval over which the defect evolves from emergence to failure. The later a defect is identified, the greater the consequences and expenses. Therefore, it is critical to identify the instant a defect begins, because that is the best opportunity to intervene and prevent a serious accident and possibly a shutdown due to equipment failure. While diagnostics itself is a process of defect detection, early diagnostics is a process of defect prevention.
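The error of dynamic recognition described above reduces to a comparison of two intervals. The following Python sketch illustrates that comparison only; the function name and the example intervals (a defect that evolves from emergence to failure in two hours, a monthly inspection route, and a three-minute monitoring cycle) are hypothetical values chosen for illustration, not figures from the cited work.

```python
from datetime import timedelta


def has_dynamic_recognition_error(
    monitoring_interval: timedelta,
    defect_evolution_interval: timedelta,
) -> bool:
    """Return True when monitoring is too infrequent to catch the defect.

    Follows the definition in the text: the error appears when the
    monitoring interval equals or exceeds the time a defect needs to
    evolve from emergence to failure.
    """
    return monitoring_interval >= defect_evolution_interval


# Hypothetical defect that goes from emergence to failure in two hours.
defect_evolution = timedelta(hours=2)

# Hypothetical monitoring schedules to compare against that defect.
monthly_route = timedelta(days=30)
continuous_monitoring = timedelta(minutes=3)

monthly_misses = has_dynamic_recognition_error(monthly_route, defect_evolution)
continuous_misses = has_dynamic_recognition_error(
    continuous_monitoring, defect_evolution
)

print(f"Monthly route misses the defect: {monthly_misses}")
print(f"3-minute monitoring misses the defect: {continuous_misses}")
```

Under these assumed numbers, the monthly route cannot catch the defect in time, while the short monitoring cycle leaves many measurements between emergence and failure.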
Existing solutions such as protection or condition monitoring systems focus on limiting consequences, not preventing breakdowns. A DCS merely shows the failure; at best, it correctly initiates an emergency shutdown to minimize the consequences of the equipment failure. It does not identify the fault at an early stage of degradation or take urgent action to eliminate it. By the time the protection system alarms, the breakdown is already happening. Even new AI software solutions work only at the end of the exponential wear stage because the AI analyzes the same parameters that protection, condition monitoring, or control systems measure. In addition, almost all condition monitoring systems require an expert to interpret their output. Without that interpretation, the operator cannot understand what should be done to prevent an impending failure. This is the main reason why most breakdowns and accidents happen on night or weekend shifts, when the expert is not on duty at the facility. To prevent not only failures but even repairs and maintenance, we need to identify emerging issues in the non-linear wear stage, when destructive forces are just beginning to degrade the equipment. Operators should play a crucial role in the reliability of the facility because they are there 24/7. In particular, operators need precise, proven, and timely information about what should be done to restore equipment health. With the COMPACS system diagnosing the health of every piece of equipment in a facility in real time and delivering precise, timely prescriptions to operators 24/7, an operator doesn’t need to be an expert in vibration analysis to know how and when to react. If operators identify and eliminate the destructive forces at the non-linear wear stage, safe and reliable operation will become a reality.
It is essential to recognize that only real-time diagnostic systems that have low errors of static and dynamic recognition of defects, and that can identify the destructive forces causing defects to emerge, can shift operational reliability and maintenance efficacy substantially toward maximum safety and uptime, thereby disrupting the conventional relationships among operators, the maintenance team, and management (Kostyukov, 2019a).
The real value that the real-time diagnostic COMPACS system brings to a petroleum refinery comes from several sources. First, it prevents accidents and shutdowns. According to an analysis of several trustworthy sources (Kostyukov, 2019b), average annual refinery losses due to unscheduled shutdowns were approximately $150 million in 2017; by transforming sudden machinery defects into gradual ones, refiners can avoid most incidents and the corresponding losses and save at least $100 million. Another source of value is extended uptime. By shortening turnaround outages and extending average uptime from 91% to 99%, an average refinery could earn roughly $100 million of additional profit. Finally, an average US refinery spends over $9 million annually on maintenance, which could be reduced significantly. Adding up the figures above, the additional profit from implementing the COMPACS system at an average US refinery could reach $200 million annually. However, achieving sustainable value from 24/7 real-time machinery diagnostics requires a paradigm shift in reliability management. Figure 3 shows the three information loops provided by the real-time diagnostic COMPACS system. The first loop is for operators: the system identifies malfunctions of every piece of equipment in the facility every three minutes and sends precise, timely information to operators because they are the only people in the entire facility who can prevent degradation at the emergent stage. This is why we promote the operator-driven reliability approach. When operators receive prescriptions, they need to take urgent action to improve equipment health and reliability, whether by acting themselves, calling the maintenance department, or scheduling contractors. The second information loop is for the maintenance team, which is directly involved in reliability. Using unbiased information from the COMPACS system, the maintenance team can prepare to perform a very complex repair in a very short time or schedule repairs according to the condition of the equipment. The third loop is for management. Managers receive unbiased information about machinery health, operators’ involvement in reliability, and the quality of the maintenance performed, which provides transparency and makes the benefits mentioned above achievable.
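The figures quoted above can be tallied in a few lines. The Python sketch below is only an illustration of that arithmetic: the dollar amounts are the article’s estimates, while the variable names and the maintenance reduction fraction are assumptions introduced here, since the text says only that maintenance spending could be reduced significantly.

```python
# All amounts are in millions of US dollars per year, as quoted in the text.
AVOIDED_SHUTDOWN_LOSSES = 100   # "at least $100 million" of ~$150 million losses
EXTENDED_UPTIME_PROFIT = 100    # uptime raised from 91% to 99%
MAINTENANCE_SPEND = 9           # "over $9 million" spent on maintenance


def total_annual_benefit(maintenance_reduction: float = 0.0) -> float:
    """Sum the three value sources named in the text.

    maintenance_reduction is a hypothetical fraction (0.0 to 1.0) of the
    maintenance spend that is saved; the article gives no number for it.
    """
    if not 0.0 <= maintenance_reduction <= 1.0:
        raise ValueError("maintenance_reduction must be between 0 and 1")
    maintenance_savings = MAINTENANCE_SPEND * maintenance_reduction
    return AVOIDED_SHUTDOWN_LOSSES + EXTENDED_UPTIME_PROFIT + maintenance_savings


# Total from the first two sources alone, with no maintenance savings.
baseline = total_annual_benefit()

# Assumed one-third cut in maintenance spend, purely for illustration.
with_maintenance_cut = total_annual_benefit(maintenance_reduction=1 / 3)

# Profit per percentage point of uptime implied by the 91% -> 99% estimate.
profit_per_uptime_point = EXTENDED_UPTIME_PROFIT / (99 - 91)

print(baseline, with_maintenance_cut, profit_per_uptime_point)
```

In this tally the avoided losses and the extended uptime account for nearly all of the total; whatever fraction of maintenance spending is assumed to be saved shifts the result by only a few million dollars.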
In summary, the key points of Digital Reliability based on 24/7 real-time machinery diagnostics are as follows. To reach the desired outcomes in safety and uptime, Digital Reliability requires a proper infrastructure for collecting useful data from trustworthy sources using a SCADA structure, physics-based artificial intelligence that operates on invariants to extract information on time, and a plant diagnostic network that delivers information simultaneously to all levels of decision making, from operators and engineers to managers. The COMPACS system is the only one that meets those requirements and helps refiners make the reliability paradigm shift by implementing the technology of Safe Resource-saving Operation and Maintenance of Machinery. It delivers timely and objective information about machinery health, operators’ involvement in reliability, and maintenance quality. Moreover, it provides financial benefits that exceed the investment in the solution by at least ten times, ensures the safety of process operations, and generates prosperity for everyone involved.