The SolarWinds Hack – A Much Bigger Problem

The SolarWinds hack is a big deal not just because of the hack itself, but because it is an example of a much larger problem: a lack of robustness in today’s network and security architecture. Professor C. Emre Koksal explains why this is a systemic problem that will repeat itself if we don’t change the way we think about cybersecurity.

Hello, this is the founder and CEO of DAtAnchor. I’m also a professor of electrical and computer engineering at the Ohio State University. Today I’m going to talk about SolarWinds and beyond. Rather than giving the details of the attack, my objective is to classify SolarWinds as part of a broader category of attacks, and to talk about that category, what we can do about it, and general precautions against the next SolarWinds-like attack. I’ll make some predictions and I’ll give some advice.

Whenever we have a technical discussion, it’s great to start with a model, because we can understand the nature of the problem using models. The ideal model is simple, so that we can handle it, and at the same time deep, so that we can draw insights. It shouldn’t be shallow. The problem with cybersecurity is that it’s not like nature, where we can get a good handle on things and come up with a model that holds in a robust fashion. In cybersecurity we have to talk about the attacker, and the model is dictated by the attacker. Because an attacker is typically a human or a group, the attack strategies come from a continuum, so they are very broad. What we do as a result is always focus on narrow classes. That is unfortunate, and it is perhaps the main hurdle in coming up with a general approach to solving a broad set of problems, because we have to narrow down the attacker model. So today I’ll start with a relatively broad set of attackers; I’ll keep the attacker model broad. As a result, I’ll come up with some general guidelines, some general outcomes. I’ll keep the model very non-technical and I’ll keep the discussion at as non-technical a level as I can. Let’s start with that simple and broad attacker model.

I’m going to start with the attacker’s objective. We are going to focus today on the class of attacks in which the objective of the attacker is data exfiltration. There’s an organization, or a group of organizations, that has data worth stealing, and the objective of the attacker is to take that data. This is a very important class of attacks. By far the highest financial losses are caused by data exfiltration attacks. In major organizations this leads to loss of trust and eroded integrity. In smaller organizations this is existential: losing sensitive data is an existential threat to small and mid-sized businesses. Next I’m going to talk about the attack mechanism. When I say attack mechanism, I’m referring to data exfiltration attacks, which in general happen in the following way.

Here we have an attacker; let’s call her Eve. In the rest of the presentation, I’ll refer to the attacker as Eve. There’s a network, an organization’s network. Eve does some investigation and finds a point of entry. So this is the first parameter: there needs to be a point of entry. Once the point of entry is found, the attacker is able to move laterally to obtain a credential. This credential may belong to a user or a process, and it is capable of accessing data. If the data is encrypted, the credential that Eve targets is capable of accessing the keys that encrypt the data. So that’s important. Let’s call the legitimate owner of that credential Alice. What Eve effectively does in this attack mechanism, after propagating in the network, is level herself with Alice, so Eve equates to Alice. Anything Alice can do, Eve can do.

The leveling is the major step towards a successful attack from the attacker’s point of view. The next one is access to plaintext. Once plaintext is accessed, data can be exfiltrated. These are the two really critical points of a successful network attack. SolarWinds is one such attack. How did it happen? SolarWinds has a product called Orion. Orion is an IT resource management product. It basically enables IT to do all sorts of visualization and monitoring of their network. It’s a highly capable product used by 33,000 organizations. These are really major organizations, including the State Department, the Department of Homeland Security, and global private companies such as Microsoft, and so on and so forth. So it is a fairly widely used product. Once you have access to the output of Orion, you can monitor the network and the broad set of exchanges that are happening in the network.

What happened? The update server for Orion got hacked by the attackers, and malware was injected into the update. What this malware enabled, as the first step, was to equate the attackers with the IT department. Now, there’s a lot of intelligence gathering that the attackers can do. Why? Because they equated themselves to the IT department, and they have a great tool to monitor and manage resources. They gathered intelligence for two weeks and then sent a beacon, meaning a small payload that is fairly difficult to detect, to their own servers. Using that, they achieved this equality, meaning Eve levels with Alice: they found an Alice that the attacker could equate to. Once they did that, they exfiltrated a significant amount of data. According to FireEye it’s from 50 organizations, but it may be more; we don’t know.

One of the major things to realize here is that, in terms of impact, this attack has no single point of entry. There is no one network and no one point of entry. What the attackers managed to do is find a point of entry to Orion, which led to 33,000 points of entry. This is a signal of a non-robust system. Why? What’s the definition of robustness? The broad definition is that, in a robust system, small changes in the input lead to small changes in the output, in the end impact. What happened here is that a small change in the input, namely one piece of malware injected into one system, led to a significant impact on the whole system.
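To make the robustness point concrete, here is a minimal sketch that counts how many networks a single compromise can reach. It is only an illustration: the graph, the node names (`update-server`, `org-0`, ...) and the `affected` helper are assumptions made up for this example, and the 33,000 figure is simply the customer count quoted in the talk. It measures exposure, not how many organizations actually lost data.

```python
def affected(dependents, compromised):
    """Return every node reachable from the compromised one.

    `dependents` maps a node to the nodes that receive software from it.
    """
    seen = {compromised}
    stack = [compromised]
    while stack:
        node = stack.pop()
        for nxt in dependents.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


# Hypothetical ecosystem: one update server feeding 33,000 customer networks.
customers = [f"org-{i}" for i in range(33_000)]
dependents = {"update-server": customers}

# Compromising one customer network stays local...
print(len(affected(dependents, "org-0")))
# ...while compromising the shared update server exposes all of them.
print(len(affected(dependents, "update-server")))
```

In this toy model both compromises are "one small change in the input", but the second one reaches every customer, which is the fan-out the talk describes as non-robust.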

This means the attack exploited the non-robust nature of the ecosystem. The system was not robust in the first place. Is it generalizable? Meaning, is SolarWinds unique? Is Orion unique? The answer is no. There are many different points of non-robustness in today’s networks. As networks become more sophisticated, the attack surface grows. It’s naive to assume that, once we handle this problem, there will not be another SolarWinds attack. So the question is, what now? Do we do more of the same? Should we treat this problem as before, meaning network-centric security? Come up with better detectors? Patch our servers and mitigate after the fact? The answer is no. One of the major realizations is that the ecosystem is not robust. The second major realization, which we have started to read about, is that people now accept that this is going to happen.

The philosophy that we follow should not be reactive. We should assume breaches to begin with. Breaches are there. We should act in a proactive way rather than react to the breach. This is actually a revolutionary realization. What it means is that now you have to assume that your data is a commodity. I don’t care whether that data is in your network, behind your firewall; it is a commodity. You have to assume even your sensitive data is a commodity. What this means is that Eve has access to your data. Remember, we assume breaches. So breaches are there, and Eve has access to your data. What you have to do is make sure that Eve cannot capture the content of the data. The question is, can we do that? This is a very difficult problem. For example, you can encrypt the data, but encryption by itself is not the solution.

You should encrypt the data, but by itself it’s not a solution. Why? Because, remember, in a successful network attack the major thing that Eve achieves is leveling with Alice. If you want this arrow to go one way but not the other, so that the content cannot be exfiltrated, the restriction has to apply to Alice as well. If you make sure that Eve cannot get the content, the same applies to Alice. So this is a big problem. Why? Because Alice is a legitimate user, and as long as Eve equates herself to Alice, your problem continues. In fact, this big problem is referred to as the zero-trust problem. If you assume breaches, and if you assume your attacker levels with the legitimate users, you have to assume that any request coming for your data is potentially illegitimate, whether it comes from inside the network or from outside of it. You treat it as illegitimate.

You have to ensure that a request is legitimate before giving access. This is a difficult problem because again, for the fifth time, Eve equates to Alice. How do we solve this problem? It’s not just another iteration of a big problem. It’s a fundamentally difficult problem. First off, we agree on the fact that you should encrypt data. Sensitive data especially should remain encrypted. The second thing, which is one solution and not necessarily the only solution, but a really elegant and simple one, is attribute-based access. What is attribute-based access? Eve equates to Alice virtually, but you come up with an external state for Alice that differentiates her from Eve. What can the state be? The state can be a geography. The state can be another device. The state can be connectivity.

There are numerous examples of this, but the idea is to come up with an attribute of Alice that is orthogonal to the virtual credentials of Alice, so that once Eve equates with the virtual credentials of Alice, it’s not sufficient, because the physical credentials or the attributes of Alice cannot be equated by Eve. That’s the idea behind attribute-based access control, and that’s the topic for my next tutorial. Let me just continue and give the rest of the lessons from this attack.
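As a rough illustration of the idea, the sketch below compares a credential-only check with one that also requires external attributes. Everything in it is hypothetical: the `Request` type, the token value, the country codes and the device names are invented for the example, and a real system would need those attributes attested by something Eve cannot control rather than taken from the request at face value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    credential: str  # virtual credential (token, password, ...)
    country: str     # external attribute: where the request originates
    device_id: str   # external attribute: which device issued it


# Hypothetical policy for Alice; all values are made up for illustration.
ALICE_CREDENTIAL = "alice-token"
ALLOWED_COUNTRIES = {"US"}
ENROLLED_DEVICES = {"alice-laptop"}


def credential_only(req: Request) -> bool:
    """Grant access to anyone who presents Alice's credential."""
    return req.credential == ALICE_CREDENTIAL


def attribute_based(req: Request) -> bool:
    """Also require attributes that are independent of the credential."""
    return (
        req.credential == ALICE_CREDENTIAL
        and req.country in ALLOWED_COUNTRIES
        and req.device_id in ENROLLED_DEVICES
    )


alice = Request("alice-token", "US", "alice-laptop")
eve = Request("alice-token", "XX", "unknown-host")  # stolen credential

print(credential_only(alice), credential_only(eve))  # Eve equals Alice
print(attribute_based(alice), attribute_based(eve))  # Eve no longer equals Alice
```

With the credential alone, the two requests are indistinguishable; the extra attributes are what separate Alice from Eve in this sketch.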

The other thing that you probably need is revocation. It would be nice to be able to react to an attack in such a way that access to data can be revoked, even after the fact. This can be thought of as a part of encryption and decryption, and it cannot necessarily be separated from them, but revocation is really handy and important for giving the organization full control of its data.
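One way to picture revocation is to tie it to key release: if decryption always requires asking a key holder for the key, then revoking the key cuts off access to every copy of the ciphertext, including copies that have already left the network. The sketch below assumes a hypothetical `KeyService` class and uses a toy XOR cipher built on SHA-256 purely so that the example runs with the standard library. The toy cipher is not secure and is not how any particular product works, and revocation in this model cannot help against someone who already holds a copy of the key.

```python
import hashlib
import secrets


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream from SHA-256 in counter mode. Illustration only, NOT secure."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def toy_xor(key: bytes, data: bytes) -> bytes:
    """XOR data with the toy keystream; applying it twice restores the input."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


class KeyService:
    """Hypothetical key holder that can refuse to release revoked keys."""

    def __init__(self):
        self._keys = {}
        self._revoked = set()

    def create_key(self, key_id: str) -> bytes:
        self._keys[key_id] = secrets.token_bytes(32)
        return self._keys[key_id]

    def revoke(self, key_id: str) -> None:
        self._revoked.add(key_id)

    def get_key(self, key_id: str) -> bytes:
        if key_id in self._revoked:
            raise PermissionError(f"key {key_id!r} has been revoked")
        return self._keys[key_id]


def read_document(service: KeyService, key_id: str, ciphertext: bytes) -> bytes:
    """Decrypt only if the key service is still willing to release the key."""
    return toy_xor(service.get_key(key_id), ciphertext)


service = KeyService()
ciphertext = toy_xor(service.create_key("doc-1"), b"quarterly report")
print(read_document(service, "doc-1", ciphertext))  # readable before revocation

service.revoke("doc-1")  # after this, read_document raises PermissionError
```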

In short, these are all data-centric approaches rather than network-centric approaches. These are ways of addressing the zero-trust problem. One other thing that we need is robust design, because, as I discussed, with a non-robust system seemingly small faults lead to catastrophic effects. My advice toward achieving robust design is to separate where you store the data from where you keep the keys. The data store and the key management should be disjoint. What I mean by this is that there are all these data management tools and data stores, in the cloud or local. For added security, they offer encrypted sharing; they provide encrypted access to their data. They say, hey, we have it secure. But data being encrypted, as I said, doesn’t mean it’s robust and secure, because if data and key management are handled by the same entity, there’s a fundamental single point of failure. So they should be separated. Preferably, the organizations or the foundations managing the keys and the data should be separate.

So that’s all for today’s tutorial. As I mentioned, next I’ll talk about attribute-based access control and tell you its value. Thank you very much for your attention. Until next time.