With increasing cyber-attacks on the control systems responsible for our critical infrastructure, it has become more urgent than ever for process manufacturers and utilities to understand the specific risks to their ongoing operations and take countermeasures.
Proper cybersecurity protection is, of course, part of the answer. But according to Pete Diffley, a veteran automation specialist who has been responsible for the uptime of manufacturing facilities for products ranging from clean water to contact lenses, true operational security requires more than just perimeter protection. Today he leads global partnerships for VTScada by Trihedral, a developer of Human Machine Interface (HMI) and Supervisory Control and Data Acquisition (SCADA) software. Control met with Diffley to better understand why access control is necessary but not sufficient, and what kind of mindset it takes to address operational risk holistically.
Q: How have cybersecurity threat vectors changed in the recent past?
A: In recent years, we have seen a dramatic increase in the number of ransomware attacks targeting critical infrastructure systems, with research suggesting that one in five of all ransomware attacks today target industrial companies. Ransomware represents an escalation of the cyber threat landscape because it is not just access to sensitive information that is at risk. When an Operational Technology (OT) system is successfully hijacked, profitable production or the continuity of clean water or electricity supply can be disrupted. Of course, paying the ransom represents another potentially significant financial impact. Today, the threat comes not only from sneaky hackers but also from large and very profitable criminal enterprises. And now that there is clear regulatory guidance, there is a very real possibility that individuals who fail to prioritize the necessary steps to stop a ransomware attack could face jail time.
Q: What kind of actions are government agencies recommending in response to these threats?
A: Just this summer, the Cybersecurity and Infrastructure Security Agency (CISA) and the National Security Agency (NSA) released a joint advisory that acknowledged the growing ransomware threat and reaffirmed many of the best practices already in place. The first is to create an accurate “as operated” OT network map immediately, and the second is to take prudent steps to harden those networks. The third is to create an OT systems resilience plan that identifies and assesses the risks to OT assets, so everyone knows what steps to take when things go wrong, all with a view to mitigating those risks in order of priority and consequence. Fourth is actually implementing that incident response plan. And fifth is implementing a continuous event monitoring system.
Q: External cyber threats are not the only risk to business continuity. What other types of activities should be part of a resilience plan?
A: The National Institute of Standards and Technology (NIST) defines operational resilience as “the ability of systems to withstand, absorb, and recover from, or adapt to, an adverse event during operation that may cause damage, destruction, or loss of ability to perform mission-related functions.” It is a broad definition that can include the failure of a non-redundant critical device due to more insidious causes. A thorough resilience plan should challenge the notion that the complex systems of systems that make up industrial processing units, and the human operators who oversee them, can always be relied on to behave in predictable and reliable ways. You plan how to respond to the possible, not the probable.
Q: The term “trust” is used in the context of cybersecurity, but often only about allowing a specific device or person access to a network. Should it go further?
A: I definitely believe it should. I just happened to be chatting with my teenage son the other day about how likely it is that a certain physical symptom is indicative of a more serious underlying condition. “Twenty percent,” he replied quickly, citing a presumably trustworthy .org site he found with a quick Google search. I replied that not everything posted on the internet (even on non-profit sites) is factual, and that some of what people post as true may be shaped by underlying motivations such as business interests, money or simple expediency.
Take the Log4j vulnerabilities, for example. Because Log4j is an open-source, Java-based logging library that developers routinely embed into larger application software, its vulnerability prompted a fresh reexamination of the chronic need to better manage software development supply chains. An industrial software application can contain hundreds of third-party components, any of which may become unlicensed or unsupported at any time. And if you read the license agreement before ticking the “I agree” box, you might be surprised to learn that the solution provider to whom you paid that hefty license fee has little or no responsibility, even if one of these third-party software components falls victim to a malware attack.
The expediency that motivates the use of tens or hundreds of pieces of code from other sources is an example of a mindset driven not by a desire to build the most resilient software possible, but by a desire for the fastest, cheapest solution that works effectively, at least for now.
In contrast, Trihedral has long resisted using third-party code in our VTScada software. We also use development processes designed to ensure the quality and security of our software releases. This means extensive testing to catch problems before they are pushed to our users, including design reviews before we start coding and code reviews by people other than the developer who wrote the code. Those processes were also key to the company’s rapid certification to the International Electrotechnical Commission’s IEC 62443-4-1 cybersecurity standard for industrial automation and control systems earlier this year.
Q: How well prepared are today’s OT systems to defend against resiliency threats that sometimes appear like wolves in trusted sheep’s clothing?
A: When it comes to ensuring resiliency, it’s important to be able to examine the behavioral side of the OT environment, but not in an overly onerous way, and that means devices and systems as well as human operators. We need guard rails that prevent otherwise familiar systems from stepping out of line. Artificial intelligence can be used to monitor devices and systems and raise an alert when they suddenly start behaving or communicating strangely. From an operator perspective, just as we have code reviewed by people other than those who wrote it, setpoint changes that fall outside of the established guard rails may require input from multiple operators. It is a sanity check to ensure that human operators, like our systems, are also up to date when making decisions.
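As a rough illustration of the multi-operator idea described above, the following Python sketch accepts a setpoint change immediately when it falls inside a guard rail, and otherwise holds it until a second operator signs off. This is not VTScada code or any vendor's API: the names `GuardRail`, `SetpointChange`, `approve` and `can_apply`, the tag name, and the numeric limits are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field


@dataclass
class GuardRail:
    """Hypothetical allowed band for one setpoint."""
    low: float
    high: float

    def allows(self, value: float) -> bool:
        return self.low <= value <= self.high


@dataclass
class SetpointChange:
    """A requested change, plus the operators who have approved it."""
    tag: str
    value: float
    requested_by: str
    approvals: set = field(default_factory=set)


def approve(change: SetpointChange, operator: str) -> None:
    """Record an approval; the requester cannot approve their own change."""
    if operator == change.requested_by:
        raise ValueError("requester cannot approve their own change")
    change.approvals.add(operator)


def can_apply(change: SetpointChange, rail: GuardRail,
              required_approvals: int = 1) -> bool:
    """Inside the guard rail: apply. Outside: need independent approvals."""
    if rail.allows(change.value):
        return True
    return len(change.approvals) >= required_approvals


if __name__ == "__main__":
    # Illustrative limits only, not real process values.
    rail = GuardRail(low=0.5, high=2.0)
    change = SetpointChange("chlorine_dose_mg_l", 3.5, requested_by="alice")
    print(can_apply(change, rail))   # held: outside the band, no approvals
    approve(change, "bob")
    print(can_apply(change, rail))   # released after a second operator agrees
```

The design choice worth noting is that routine changes inside the band are never slowed down, which keeps the check from becoming onerous; only the unusual requests need a second pair of eyes.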
In the meantime, one of the most important things to read is the end user license agreement (EULA) of the industrial application software that you use to control your process. The devil is in the details. How dependent is its functionality on third-party software components? Can it be used in a critical process at all, or has that use been “waived”?
Some software vendors feature a wall of logos of companies using their software to show that it is used by these big brands, hence “of course it will work for your application”. But is that really the case, or is the software only deployed in a non-critical area that can tolerate downtime?
A few years ago, as an end user, I took part in a large evaluation project that compared a number of well-known industrial software packages. I had the opportunity to ask various end users if they would use their chosen software in a “critical environment where their process depends on it.” Almost everyone said no; everyone, that is, except the users of VTScada, who all answered yes, absolutely.