By Morton Swimmer and Rainer Vosseler, Trend Micro Research
Only a few decades ago, logic in the devices and systems we used was implemented as mechanical actuators. If a train needed to be told to stop, a person would pull a lever to switch the signal to ‘danger’ (meaning stop) via cables or rods. If a train that was too tall headed into a tunnel that didn't have enough clearance, it would smash a tube of mercury, interrupting the flow of electricity to that track. (This can still be seen on the Northern Line in London.) The logic was relatively easy to understand and verify.
More recent systems used electromechanical components to manage the complexity of growing systems. Discrete logic circuits were used to handle more complex situations. If relays were used, the click that they made confirmed to the operator that their actions were executed. The logic could be verified with some persistence.
Now, the use of mechanical or electromechanical logic is rare. The demand for better performance and efficiency has led to the need for much more complex logic. To implement the logic of these mission-critical systems, manufacturers turn to microprocessors. These provide much more flexibility and can be reprogrammed much more easily than discrete logic ever could.
We are now at the point where we own devices that each contain many microprocessors. A typical smartphone will have at least one processor for the radio part and another for the computer part. But many of the sub-components in the phone are also driven by very small microprocessors. A modern PC contains microprocessors in places like the memory or disk storage.
The continuous evolution and widespread adoption of sophisticated software is helping many industries become more efficient and productive. It allows users and enterprises to connect, create, and achieve more than what was previously possible — so why is this a problem?
There is no well-defined fail-safe state
One striking piece of evidence that we need to be concerned about software is the considerable number of vulnerability entries published every year in the National Vulnerability Database (NVD), which is maintained by the US National Institute of Standards and Technology (NIST).
Compounding this, on a technical level we have lost the ability to verify the correct and safe operation of our logic by moving to microprocessors. Unless it is implemented externally, there is no well-defined fail-safe state for a software system. A train signal will default to ‘danger’ if the cable breaks, using gravity to drop the semaphore and create a safe condition. A software system has a far more complex structure and will not default to any well-defined state if it fails.
The problem stems from the fact that microprocessors are Turing-complete, which means that, in general, no procedure can decide non-trivial properties of the behavior of arbitrary software running on them (whether a given program will ever halt, for instance). This gives these systems power and flexibility but also rules out general guarantees about security and safety, such as a guarantee that a system will revert to a fail-safe state.
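A small illustration of why such statements are hard to make: the loop below is only a few lines long, yet whether it terminates for every positive integer is the Collatz conjecture, which remains unproven. The function name and the step cap are our own choices for this sketch.

```python
from typing import Optional


def collatz_steps(n: int, max_steps: int = 10_000) -> Optional[int]:
    """Count the steps until n reaches 1, or return None if the cap is hit.

    The cap (max_steps) is an assumption added for this sketch: without it,
    nobody can currently prove that the loop ends for every starting value.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        if steps >= max_steps:
            return None  # give up: we cannot tell "slow" from "never"
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps
```

If a question this simple cannot be settled by inspection, it is unsurprising that safety properties of large control software resist verification.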
Critical problems are being patched with software
Using software to help operate a machine, patch an issue, or maintain stable operations during an emergency is not new. Most modern fighter jets are designed as inherently unstable, and require software that helps pilots fly them safely. And for years, mechanical failures or malfunctions have often been fixed with a software update.
This is remarkable in some ways, but the idea that we can fix large pieces of critical machinery with software should give people some pause. Numerous industries rely heavily on software to operate efficiently and produce competitively, but some are more critical than others. Failed updates in some industries could actually create life-threatening situations; transportation is one such area. The aviation industry has some of the strictest standards in the world when it comes to software security; even so, this places a heavy burden on software that cannot foresee all situations.
It is an epistemological fallacy to believe we can understand all edge cases that need to be handled in software. The question that complex systems designers have to ask themselves is, what will happen if the software component completely fails? Will that lead to a catastrophic situation or will it be manageable?
There are possible risks below the microprocessor
Modern microprocessors implement a level of programming below the machine code layer in which all software is ultimately expressed. This is the microcode, which is often very chip-specific but allows the implementation of machine instructions to be modified should bugs be found. Some mitigations for the infamous Spectre and Meltdown family of vulnerabilities were delivered as microcode updates. There is some risk involved: if the patch goes wrong, the microprocessor is essentially broken. Servers running valuable processes cannot be rebooted arbitrarily; as such, these patches should not be taken lightly.
The possibility of modifying microcode means that even the chips we call microprocessors can be considered software. There has also been a recent uptick in interest in understanding and perhaps reprogramming microprocessors. Other classes of programmable chips exist as well: application-specific integrated circuits (ASICs), whose logic is fixed once for a specific task; field-programmable gate arrays (FPGAs), which can be reprogrammed at will; and digital signal processors (DSPs), which are used to process analog signals.
FPGAs are increasingly used as specialized co-processors for image processing, filtering network data, or other high-performance tasks. Entire, if simple, CPUs have been implemented on FPGAs. Not every future device will be built from generic reprogrammable chips, but in some applications this may make sense. A particular form of DSP has also become prevalent without anyone really noticing. To deal with variations in radio standards, mobile phones and other devices implement the radio module as a software-defined radio (SDR) so that a single module can be used across all standards with simple reprogramming. Widely available USB dongles for DVB-T TV reception, based on a Realtek chip, are in fact broadband SDR devices. Many projects have used this very inexpensive dongle to receive wireless signals that would otherwise have required much more expensive equipment.
As these different programmable processors become more widely known and available, and interest in them grows, they can more readily be used for malicious attacks.
Manufacturers use software to manipulate systems
Another consequence of software being in charge of systems is that it can be programmed by those in control to perform differently in various situations. Manufacturers across many industries push software updates on their clients, sometimes for necessary reasons but in ways that clients do not appreciate.
One example of this is planned obsolescence, in which companies manipulate devices to force an early retirement. Traditionally, this was done by designing the product to break after a certain number of years, ensuring that the customer buys another. More recently, it has been done in software, by preventing a device from upgrading to the latest software or by controlling device characteristics such as battery life.
Verification of software is generally difficult to do. As evidenced in the above examples, if those controlling the software are trying to control or hide something, they can do so very effectively. This wouldn’t have been possible during the days of vacuum tubes and push-rods.
Manufacturers own software and access to it — not users
Many high-end cars tend to come with more hardware installed than the customer actually requested. This allows features to be updated or added by reprogramming the software. Very often, this is just a register on the controller area network (CAN) bus that needs to be set, and the feature becomes available for a price. Once this became known, tuning companies and websites offered services or advice on how to enable unpaid-for features. Recently, car manufacturers have fought back by making it more difficult to enable such features. Still, the fact remains that cars can essentially be reprogrammed to change their feature sets.
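To make the idea of "just a register that needs to be set" concrete, here is a minimal sketch of a feature register modeled as a bitmask. The feature names and bit assignments are hypothetical; real vehicles use vendor-specific layouts, and the sketch does not talk to an actual CAN bus.

```python
from enum import IntFlag


class Feature(IntFlag):
    # Hypothetical bit assignments, invented for this illustration.
    HEATED_SEATS = 0x01
    ADAPTIVE_CRUISE = 0x02
    REMOTE_START = 0x04


def enable(register: int, feature: Feature) -> int:
    """Return the register value with the feature's bit set."""
    return int(register) | int(feature)


def is_enabled(register: int, feature: Feature) -> bool:
    """Check whether the feature's bit is set in the register value."""
    return bool(int(register) & int(feature))


# The car ships with the hardware for all three features,
# but only one bit is set at the factory.
factory = int(Feature.HEATED_SEATS)
unlocked = enable(factory, Feature.REMOTE_START)
```

Here `unlocked` differs from `factory` by a single bit, which is why such features have been attractive targets for third-party tuning.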
Farmers like to be self-reliant and are well known for modifying their equipment to suit their needs; but modern farm equipment repairs are often exclusive to manufacturers. As owners of this equipment, farmers would like to continue doing what they have done in the past: fix and modify what they own. Many forgo traditional repairs, and look for discreet ways to decode CAN buses. This allows farmers to modify these computers-on-wheels that they use on their farms while they fight for the right to repair.
The real issue is one of ownership. If you buy a vehicle, you expect to own its entirety, but the manufacturers have other ideas. Car manufacturers might include hardware or features in a car, but limit the right to access them unless customers pay. To control that access, they use software. Legally, this is a question for the courts: what is ownership in a world of hybrid software and hardware devices? The typical human expectation is that if you pay for an object, you own the entirety of it.
Assuming users own the entirety of a vehicle, what are the consequences of ‘hacking’ it? Could a remote start hack (nominally to preheat or cool a car) without a location sensor lead to carbon monoxide (CO) poisoning if the car is in an enclosed space? Often, there are reasons for not enabling features even if the hardware is present.
Conversely, features can be removed. Many corporate software control systems have the capability to remove software from desktops and, of course, servers. The loss of a feature can often be disturbing — you don’t want to have to learn that your car no longer auto-brakes while you are driving.
Updating software can be troublesome and costly
For a variety of reasons, software must be updated. In some cases, software gets updated to enable promised features that didn’t make it in time for shipping. It is now common to unpack some device, connect it to the internet and then download an update to get the features. This type of update doesn’t always go well — features may not get released until much later than expected, or the features simply don't work as promised.
Changes that are implemented through updates can also result in unexpected or unwanted behavior. For example, initial fixes for the Spectre bug on some architectures led to performance degradation. Besides the CPU performance hit, there were also storage performance issues in some cases. This, in turn, may affect the performance of some complex systems as the data retrieval rate suddenly plummets.
In general, devices that can be updated automatically are beneficial to users. Many bugs have been found in device software, and these are generally better fixed than left unpatched. The danger is that an update can render a device unusable. This is often the result of a poorly thought-out update plan or some variance in the hardware. It is not uncommon for bad (or counterfeit) chips to get integrated into a product, and their different behavior can lead to bricking.
The security of the update process needs to be well designed and properly orchestrated. If there are breaking changes, the order in which a client and a server get updated needs to be considered. Also, a large corporation may not be able to handle all devices updating themselves at the same time. Servers in particular are often difficult to update, yet updating them is often a prerequisite to updating clients.
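One way to reason about update order is to model the prerequisites as a dependency graph and sort it topologically. The sketch below uses Python's standard `graphlib` module; the component names and the shape of the fleet are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def update_order(depends_on: dict[str, set[str]]) -> list[str]:
    """Return an order in which each component follows its prerequisites.

    Raises graphlib.CycleError if the prerequisites are circular,
    in which case no valid update order exists.
    """
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical fleet: each key must be updated after everything in its set.
fleet = {
    "auth-server": set(),
    "api-server": {"auth-server"},
    "desktop-client": {"api-server"},
    "mobile-client": {"api-server"},
}
order = update_order(fleet)
```

With this input the servers always precede the clients in `order`. A real fleet adds version constraints, partial outages, and devices that update on their own schedule, none of which this sketch models.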
Modeling these dependencies and then getting the update order correct is a nearly insurmountable challenge. To complicate matters further, server updates in the DevOps world are no longer absolute but often use ‘canaries’ to trigger a roll-back to the previous state if some issue pops up. While this is a mostly sound strategy, it makes it difficult to create a plan and then trust that it will be executed as written.
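The canary pattern can be sketched in a few lines. This is a simplified model under our own assumptions: the `deploy`, `healthy`, and `rollback` callbacks and the host names are hypothetical stand-ins for whatever a real deployment tool provides.

```python
from typing import Callable


def canary_rollout(
    hosts: list[str],
    deploy: Callable[[str], None],
    healthy: Callable[[str], bool],
    rollback: Callable[[str], None],
    canary_count: int = 1,
) -> bool:
    """Update a few canary hosts first; roll them back and stop if any is unhealthy."""
    canaries, rest = hosts[:canary_count], hosts[canary_count:]
    for host in canaries:
        deploy(host)
    if not all(healthy(host) for host in canaries):
        for host in canaries:
            rollback(host)
        return False  # the rest of the fleet never sees the update
    for host in rest:
        deploy(host)
    return True


# Hypothetical fleet in which the canary fails its health check.
versions = {"web-1": "v1", "web-2": "v1", "web-3": "v1"}
ok = canary_rollout(
    list(versions),
    deploy=lambda h: versions.__setitem__(h, "v2"),
    healthy=lambda h: False,
    rollback=lambda h: versions.__setitem__(h, "v1"),
)
```

In this run the rollout returns `False` and every host stays on `v1`, which is the point made above: whether the planned update actually lands is decided at run time, not at planning time.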
Moving to a Software-Defined-Everything model means every professional will need to think like a highly capable programmer
Much of the world is moving to a Software-Defined-Everything model. DevOps probably started this movement. The move to cloud computing brought with it the concept of programmatically defining and deploying system architectures. This ranges from deploying virtualized servers on an Infrastructure as a Service (IaaS) platform and virtualized applications on a Platform as a Service (PaaS) platform, to virtualizing the actual applications as Software as a Service (SaaS) for a fully serverless architecture.
While it is possible to manage these through a web console, the complexity usually requires some form of code to deploy and manage these architectures. This is on top of the code that is deployed onto these cloud services. The result is layered: cloud provider code provides services to customer architecture code, which in turn deploys and manages the application code. On top of that, cloud providers also allow some form of software-defined networking to manage the networking between the components of the architecture. So, in the DevOps sense, cloud architectures are software-defined all the way through, allowing for rapid adaptation to new circumstances with a few lines of code. Even more advanced are systems that reorganize themselves, so that an architect’s job becomes setting up the machine learning system that manages the architecture.
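Architecture code of this kind is often declarative: the operator describes the desired state, and tooling works out the changes. The following is a toy sketch of that idea under our own assumptions; the resource names and fields are invented, and it does not use any real cloud provider's API.

```python
def plan(desired: dict[str, dict], actual: dict[str, dict]) -> list[tuple[str, str]]:
    """Compute the actions needed to move `actual` toward `desired`."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Hypothetical architecture: what we want versus what is currently running.
desired = {
    "web": {"size": "small", "count": 3},
    "db": {"size": "large", "count": 1},
}
actual = {
    "web": {"size": "small", "count": 2},
    "cache": {"size": "small", "count": 1},
}
actions = plan(desired, actual)
```

Changing one number in `desired` changes the running architecture, which shows both the speed this model offers and how small an error needs to be to have large effects.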
Software-defined networking has already been deployed across different industries, and these developments form the cornerstone of a software-defined system. The next generation of telco architecture goes so far as to mandate this and includes orchestration as well. Thus, a 5G system can also adapt rapidly down to the architecture level if demanded. In the industrial sector, as factories evolve and need to become more flexible and efficient, they will also become more software-defined. In a limited fashion, machines can already use multiple computer numerical control (CNC) programs, but the future lies in much more flexible programming that adapts to the tasks and circumstances at hand. In the future, the factory manager’s job will be much more like that of a DevOps engineer, using software to dynamically manage the workflow of physical objects.
In the very near future, every professional will need to think like a highly capable programmer. The programming ‘languages’ may be different and targeted to a particular domain, but nevertheless all the weaknesses of software development will become a part of our work-lives. These professionals will need a full understanding of their domain and software development — expertise that will need specialized training.
The consequences of so much software
We have come a long way from the simple logic that a mechanical device uses. We have reprogrammability across different levels of a complex system, from the components on a device’s board and its microprocessor to on-premise and cloud systems. This has undeniably resulted in increased efficiency and productivity that could not have otherwise been achieved. But, with the pros come the cons:
There is really no way to verify that the logic in modern systems is safe and sound. Moreover, when something goes wrong, it’s often unexplainable as the complexity of the system defies analysis.
Consumers have become used to automatic updates and now expect better functionality over time, but these can lead to disruption and interoperability issues — as software is deeply embedded in enterprise operations, a mismanaged update can cause significant problems.
The increase in software also means an increase in the attack surface that an enterprise or user has to deal with. Unfortunately, some may lack both the visibility into these components and the tools with which to monitor and protect them.
Implementing functionality in software also opens the door to users who just want to modify the functionality of the devices they own. Through published discourse, information on the inner workings of these systems becomes known, not only to the owners, but also to malicious actors.
How do we deal with these issues?
In our everyday lives, we rely on technology, and software in particular, that is not very secure or even completely mature. Security researchers are constantly finding a considerable number of vulnerabilities. Vendors will often ship the beta version of a product knowing they can update it once it’s in the field. Instead of rigorous testing and then sealing off a device against further manipulation, we have embraced the model of continuous updates; unfortunately, a device that can be updated can also be modified for malicious purposes.
With all this in mind, having a proper defense is essential for those who use and are reliant on software — managing updates and proper patching are good first steps, and multi-layered security is also a must.
The way forward is complex since software is deeply integrated in users’ lives and enterprise operations. With critical systems, sole reliance on software is not a good idea. A device should always be designed with a fail-safe mode so that a failure of the software will revert the device to a safe, if inefficient or less comfortable, mode of operation. In other words, it should be possible to operate a machine or vehicle safely, even if the software totally fails.
Although it would equate to additional costs for the manufacturer — and the end-user — redundancy measures for critical systems should be standard. When there is a serious fault, the default mode should be safe; and if it can’t be secured through mechanical means, then more software redundancies have to be put into place.
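The default-to-safe principle can be sketched as a watchdog: the signal shows ‘clear’ only while the controller keeps actively asserting it, and falls back to ‘danger’ as soon as the controller goes quiet. The class name, timeout, and injected clock are assumptions of this sketch. Note also that the watchdog here is itself software; in a real critical system it would need to be an independent hardware or mechanical mechanism.

```python
import time


class FailSafeSignal:
    """Shows 'clear' only while fresh clear requests keep arriving."""

    def __init__(self, timeout_s: float = 1.0, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock  # injectable so the behavior can be tested
        self._last_clear = None  # nothing received yet: start at danger

    def request_clear(self) -> None:
        """Called periodically by the controller while the track is clear."""
        self._last_clear = self._clock()

    def aspect(self) -> str:
        """Return the displayed aspect; silence or staleness means danger."""
        if self._last_clear is None:
            return "danger"
        if self._clock() - self._last_clear > self._timeout_s:
            return "danger"
        return "clear"
```

Like the semaphore that drops under gravity when its cable breaks, this design treats the absence of a signal as the unsafe case: a crashed or hung controller simply stops calling `request_clear`, and the aspect reverts to ‘danger’ without any further action.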
Professionals should get used to the software-defined-everything model — as industries adopt more software into their workflow, the workers' knowledge base has to grow as well. Along with keeping up with operations, having a full understanding of the software systems will help in securing and protecting them.