Nate Beach-Westmoreland wrote a Tweet recently that piqued my interest, as it aligned very closely with one of my major concerns in a former IR position: how does one ensure that sensitive data isn’t manipulated?
Typically, cyber defense focuses on two key impacts: the loss or theft of sensitive (or otherwise valuable) information, and the inability to access such information (via ransomware or a destructive wiper). Less often discussed – but in certain environments potentially quite significant – is the possibility of manipulation. In this case, sensitive, important data is altered by the attacker to degrade the quality or usefulness of the information, or potentially to cause a catastrophic effect later on. Furthermore, even the perception of such an alteration, given the potential consequences, may be significant enough to result in severe operational impacts as the targeted organization attempts to answer the question, “did anything happen?”
An example of the destructive scenario above – which may or may not be true – comes from the Cold War, when the US allegedly inserted malicious functionality into software designed to be stolen. On its face, the software “looked” fine, but subtle manipulations and timing resulted in an allegedly catastrophic physical impact: the explosion of a natural gas pipeline due to an overpressurization event.
This particular case – if true – is not completely applicable to the problem at hand, since the software was deliberately modified with the intent that it would later be stolen and used. But the idea is very important and relevant: introducing subtle changes in software that result in unintended, anomalous, or – in some unique cases – deliberately designed actions that negatively impact the target organization. Additionally, such attacks prey on the recognition mature organizations have of the risks such actions pose, thus making perceived or potential manipulations priority events for investigation and analysis. As information security professionals, many of us are likely familiar with wormable executable infectors such as Virut or Sality that possess a crude form of this functionality, in that they can effectively destroy important files inadvertently as part of the infection process – but these appear distinct since they are rather mindless and opportunistic in origin.
A more worrisome situation might look like the following: an adversary penetrates a secure network holding test results or manufacturing specifications for nuclear weapons. While the adversary is unable to perform a bulk extraction, and a mass corruption or deletion would advertise its presence, a few subtle changes in data – not executable code, but the actual information stored – can have potentially catastrophic effects. These could range from corrupting manufacturing specifications so that no new physics packages can be produced (or the ones produced are non-functional) to modifying test results to shape package design toward ineffective models moving into the future (since there are no physical tests). More insidiously, if the intrusion is detected (or advertised to defenders), the attacker can indicate the mere possibility of such a change and tie defenders (and other personnel) in knots for months as they strive to verify that the data at hand was not altered or manipulated in any unauthorized or malicious fashion.
This may seem an extreme example, but step down from something with spooky connotations – like nuclear weapons – to more mundane items, such as census results or voter registration data. At least from the perspective of the United States, questions of voter eligibility, population, and even citizenship are “hot-button” items within the current political atmosphere, affecting matters such as who can vote, how representative districts are assigned, and similar concerns. Introducing so much as the hint of manipulation into such datasets would more than likely set off immense partisan fights within the country, paralyzing government and decision-making to the exclusion of all other issues. Even absent obvious manipulation of such datasets, the mere perception that such stores are unreliable may have profound implications for the political process and decision-making in one of the world’s largest democracies.
Underlying the above scenarios and their resulting concerns are two principles: first, the impossibility of proving a negative (i.e., claiming with 100% certainty that nothing “bad” happened); second, the often-overlooked issue of data provenance (at least, outside of blockchain bro circles).
The first should be familiar to experienced security practitioners, because it forms one of the first “teachable moments” for non-technical management: answering the question of whether a breach occurred or data was lost. In all such instances, the honest best a security team can ever do is answer: “We find no evidence that X occurred.” This may seem like an attempt to weasel out of a direct response, but a truthful answer must take into account limited visibility and an acknowledgement that one never has all the answers, data, or artifacts needed to make a 100% determination that something did not occur. Proving that something did in fact happen is a far lower bar, but even that is seldom achievable when dealing with an amorphous, potentially long-term breach scenario. The best one can do is rely on the data available to make a judgment call on the facts at hand.
This leads to the second, and less familiar, issue: data provenance. Provenance can be distilled into the concept of integrity surrounding data, its origins, and any subsequent manipulations. While security often concerns itself with data theft – hence the increasing number of data loss prevention (DLP) solutions – and corruption (encryption via ransomware or outright destruction via “wipers”), less attention is paid to subtle changes in the underlying dataset itself. Furthermore, few tools exist to allow for the unequivocal (see the previous paragraph) determination that a particular dataset – if not exfiltrated or outright destroyed – was not in some fashion manipulated. Underlying technologies such as database transaction logs appear to offer this capability, but the ability of adversaries to manipulate such items calls their efficacy into question. At the file-system level, the ability to massage timestamps, modify logs, and apply other fairly standard adversary techniques allows for a wide variety of possibilities to essentially attack confidence that what is stored is what is true.
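To make the provenance problem slightly more concrete, consider a minimal sketch of a tamper-evident, hash-chained record store – purely illustrative, with record and chain formats of my own invention rather than any standard. Each record’s digest depends on every record before it, so a single altered value breaks every subsequent link:

```python
import hashlib
import json

# A minimal sketch of a tamper-evident, hash-chained record store.
# Record and chain formats here are illustrative, not any standard.

def chain_records(records):
    """Link each record to its predecessor via a running SHA-256 digest."""
    chained = []
    prev_digest = hashlib.sha256(b"genesis").hexdigest()
    for record in records:
        payload = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((prev_digest + payload).encode()).hexdigest()
        chained.append({"record": record, "digest": digest})
        prev_digest = digest
    return chained

def verify_chain(chained):
    """Recompute every link; any edit, insertion, or deletion breaks the chain."""
    prev_digest = hashlib.sha256(b"genesis").hexdigest()
    for entry in chained:
        payload = json.dumps(entry["record"], sort_keys=True)
        expected = hashlib.sha256((prev_digest + payload).encode()).hexdigest()
        if entry["digest"] != expected:
            return False
        prev_digest = expected
    return True
```

The catch – and it maps directly onto the transaction log problem above – is that an adversary with sufficient privileges can simply recompute the entire chain after altering a record. Such a scheme is only as trustworthy as the location of the current chain head, which must live somewhere the attacker cannot write: offline media, an independent third party, or similar.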
In this fashion, an attacker undermining data provenance achieves what I refer to as a “mission kill”: an attack that does not destroy the target, but renders it incapable of performing its desired function. Looking at the examples above, manipulated datastores (or even the perception that such stores were manipulated) result in a lack of faith in the accuracy and efficacy of the data housed therein. The subtle dangers of the provenance issue, combined with the impossibility of “proving the negative” in the case of intrusions or manipulations, present organizations with a truly tough and almost intractable problem: how to ensure that what is stored is accurate, when the ability to positively confirm an absence of malicious effects remains a distant, and likely unobtainable, possibility.
Thus, an adversary need only create the perception of data manipulation – or a lack of data provenance – in order to create a potentially crippling effect on the targeted organization. Even more important and interesting from a cyber security perspective, such an attack need not subvert or harm any aspect of the broader infrastructure in order to be effective – all an adversary need do is gain sufficient privileges at some point to create the possibility that they may have manipulated information in a fashion that cannot be effectively disproved. While such an attack still requires skill and ability – namely, drilling into a target network to sufficient “depth” to reach vital information stores – it can create a potentially debilitating effect without resorting to new wiper variants or fancy cyber-physical capabilities.
The take-away from the above discussion is that organizations need to get serious about how they track, record, and ensure the accuracy of the vital sources of information within their environments. This can range from personnel information (just ask OPM how vital keeping personnel records can be – in my opinion, the biggest risk from the OPM “breach” isn’t so much the theft of data as the potential to inject “new” data into trusted stores) to manufacturing specifications. Introducing uncertainty into vital data stores results in operational paralysis just as effective as a truly destructive cyber attack – if not more so, given the profound difficulty of recovering from such scenarios. Blockchain or other faddish technologies may or may not be the key to resolving such issues, but the sooner organizations take heed of such concerns – and investigate and hopefully apply corrective measures – the better off we all shall be.
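As one modest example of a corrective measure – a sketch only, assuming a key generated and held outside the environment an intruder can reach (offline media or an HSM, say), with function names of my own choosing – keyed digests over critical records give defenders a positive verification signal rather than a mere absence of evidence:

```python
import hmac
import hashlib

# Minimal sketch: keyed digests over critical records. Assumes the key is
# generated and held outside the monitored environment (e.g., offline or in
# an HSM), so an intruder in the data store cannot forge valid tags.

def tag_record(key: bytes, record: bytes) -> str:
    """Compute an HMAC-SHA256 tag over a serialized record."""
    return hmac.new(key, record, hashlib.sha256).hexdigest()

def verify_record(key: bytes, record: bytes, tag: str) -> bool:
    """Recompute and compare in constant time to avoid timing side channels."""
    return hmac.compare_digest(tag_record(key, record), tag)
```

Unlike an unkeyed hash or chain, an intruder who can rewrite the data store still cannot produce valid tags without the offline key, so periodic re-verification offers something closer to affirmative assurance that records are unmodified – arguably about as near to “proving the negative” as one can reasonably get.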