On December 29, 2016, the United States Department of Homeland Security (DHS) and Federal Bureau of Investigation (FBI) released a joint analysis report (JAR) detailing, in their words, “tools and infrastructure used by the Russian civilian and military intelligence Services (RIS) to compromise and exploit networks and endpoints associated with the U.S. election, as well as a range of U.S. Government, political, and private sector entities”. While the report doesn’t name the DNC outright, it is clear that the technical information in the report focuses on data associated with the DNC intrusion, which was publicly attributed to two different state-sponsored Russian hacking groups.
The report, which dubs the operation “GRIZZLY STEPPE”, has been consumed and analyzed by various members of the information security community at large. The report basically has three parts: 1) what I take as an official government statement that the tools and groups listed in the Alternate Names section are in fact RIS; 2) a set of indicators which FBI/DHS thought would help defenders detect some segment of the activity; 3) a set of guidelines to harden networks against RIS activity.
While the report has received some positive feedback, the majority of feedback has been negative. Many of the critiques are valid and focus on the indicators themselves. This is where the JAR fell short and could easily be improved. It lacked several key technical and contextual details, which left it vague and made it difficult to derive substantial value from it. To that end, and given that we are anticipating future JARs, I’ll walk through some reasons why the technical indicators were not useful and suggest small and hopefully achievable changes which would make the next public JAR more actionable for defenders.
Critiques of the Technical Indicators
While the report provides a clear statement of attribution, the technical information included was not very high-value. The most significant technical weaknesses were as follows.
Noisy IP addresses
The report included 876 unique IP addresses to search for as “indicators of compromise” for malware command and control (C2) or data exfiltration infrastructure. The problem is that a substantial number of the indicators belong to websites and online services that are used legitimately by a huge number of people around the world, such as websites and services owned by Yahoo, Google, Dropbox, Tor, and various cloud providers. This will create many false positives, and potentially panic, in organizations that look for connectivity to these IPs and jump to the conclusion that they’ve been hacked by the Russians. If the Russian-sponsored hackers are using potentially high-traffic IPs belonging to organizations like Yahoo for C2 or data exfiltration (which is completely feasible), then the report should include that as context (see below for more on that). We are left wondering whether FBI/DHS were aware that some of the indicators would be noisy and prone to false positives (FPs).
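One way a defender might pre-screen a list like this is to separate addresses that fall inside known shared infrastructure before alerting on them. The following is a minimal sketch under stated assumptions: the range labels and the `SHARED_RANGES` table are hypothetical, and the prefixes are RFC 5737 documentation ranges standing in for real provider or Tor exit-node lists, which would have to come from the providers themselves.

```python
import ipaddress

# Hypothetical shared-infrastructure ranges. These are RFC 5737 documentation
# prefixes used as placeholders, not real provider allocations.
SHARED_RANGES = {
    "example-cloud": ipaddress.ip_network("192.0.2.0/24"),
    "example-webmail": ipaddress.ip_network("198.51.100.0/24"),
}


def shared_owner(ip):
    """Return the label of the shared range containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    for label, network in SHARED_RANGES.items():
        if addr in network:
            return label
    return None


def triage(indicator_ips):
    """Split indicators into likely-noisy (shared) and remaining addresses."""
    noisy = {}
    remaining = []
    for ip in indicator_ips:
        owner = shared_owner(ip)
        if owner is None:
            remaining.append(ip)
        else:
            noisy[ip] = owner
    return noisy, remaining
```

A hit on an address in the noisy group is not necessarily benign; it simply needs more context before anyone concludes that a compromise occurred.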
Included only one Yara signature, not specific to RIS activity
The report includes a single Yara signature. Yara is widely used by security researchers to match byte sequences and strings in files. Yara signatures are often used to identify fingerprints of sorts for malware, and they are less brittle than unique hashes, filenames on disk, or other IOCs. The single Yara signature in the report appears to be for a PHP web shell (backdoor), likely used by an attacker once they’ve compromised an external-facing web server. The RIS actors certainly use a substantial number of tools, yet this signature was the only one included in the report. Additional signatures would be welcome. More problematic, the sample identified with this signature doesn’t even necessarily implicate RIS actors specifically. It is used by a variety of actors and cybercriminals, and it is publicly available for anyone to download and use. If this signature hits on a file on your web server, you should be concerned and remediate, but it wouldn’t be conclusive proof that you were hacked by the Russians.
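To illustrate why a byte-pattern signature is less brittle than a file hash, here is a simplified stand-in for what a Yara rule does. It is not Yara itself, and the marker bytes and sample contents are invented for illustration; they are not taken from the signature in the JAR.

```python
import hashlib

# Hypothetical marker bytes and file contents, invented for this example.
MARKER = b"eval(base64_decode($_POST["
original = b"<?php eval(base64_decode($_POST['x'])); ?>"
modified = b"<?php /* v2 */ eval(base64_decode($_POST['y'])); ?>"

# A hash-based IOC only knows the exact original file.
KNOWN_HASHES = {hashlib.sha256(original).hexdigest()}


def hash_match(data):
    """True only if data is byte-for-byte identical to a known sample."""
    return hashlib.sha256(data).hexdigest() in KNOWN_HASHES


def pattern_match(data):
    """True if data contains the marker byte sequence anywhere."""
    return MARKER in data
```

Here `hash_match(modified)` is false because a single changed byte alters the hash, while `pattern_match(modified)` is still true. The same property explains the attribution problem: a pattern match identifies the tool, not whoever deployed it.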
Lack of context
The majority of IP address indicators provided in the DHS report include nothing more than the sentence, “It is recommended that network administrators review traffic to/from the IP address to determine possible malicious activity.” The GRIZZLY STEPPE report does not attach any sort of time window, description, or severity to the indicators, making them less actionable for security teams around the world. This is an especially big problem since, as mentioned above, many of the IPs seem to be shared by many users or subject to changing ownership over time. A hit outside of the time window when FBI/DHS believed the IP to be in use by the Russians would be a false positive.
Areas of Improvement
The following are specific recommendations that would drastically improve the value of the technical details of the GRIZZLY STEPPE report.
Curate IOCs
The United States Intelligence Community (IC) has the resources and expertise to differentiate between high-value and useless indicators. All indicators handed to the public should be high value. The IC could prioritize the indicators, explicitly calling out those that will provide the most value and noting which high-value indicators are prone to false positives or sit on shared infrastructure. Future reports should also state which indicators may have been compromised infrastructure, that is, random servers on the Internet that were compromised and used as a means to an end towards the real target, such as the DNC. Such actor tradecraft is quite common. Conversely, reports should specify which indicators were believed to have been used solely by RIS. The list provided in the report contains noise and seems to lack curation, which undermines confidence in the entire list and has been used in attempts to refute the attribution. Confidence would increase with slightly more information.
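As a rough sketch of what a curated entry could carry, the record below adds the attributes discussed above to each indicator and sorts the list so the most useful entries come first. The field names, the 1 to 3 confidence scale, and the ordering rule are all assumptions made for illustration; nothing like this appears in the JAR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CuratedIndicator:
    value: str                     # IP address, domain, hash, etc.
    confidence: int                # hypothetical scale: 1 (low) to 3 (high)
    fp_prone: bool                 # shared or high-traffic infrastructure
    compromised_third_party: bool  # victim server used as a stepping stone
    exclusive_to_actor: bool       # believed to be used solely by the actor


def prioritize(indicators):
    """Order indicators: exclusive first, then low-noise, then by confidence."""
    return sorted(
        indicators,
        key=lambda i: (not i.exclusive_to_actor, i.fp_prone, -i.confidence),
    )


# Placeholder values using RFC 5737 documentation addresses.
shared = CuratedIndicator("192.0.2.10", 2, True, False, False)
dedicated = CuratedIndicator("203.0.113.7", 3, False, False, True)
stepping_stone = CuratedIndicator("198.51.100.99", 3, False, True, False)
ranked = prioritize([shared, dedicated, stepping_stone])
```

With these placeholder entries the dedicated address ranks first and the shared, false-positive-prone address ranks last, which is the kind of ordering a defender with limited time would want from the publisher.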
Contextualize IOCs
Future reports would be greatly improved by including the time period in which the malicious activity took place and the lifespan of each indicator. Is the activity ongoing, or should security teams dig through historical netflow logs from 2012 to find malicious activity? Did attacks take place via that infrastructure for a year or a week? Which week? Will attacks be coming from a VPS rented from a cloud provider or from a compromised website? How many indicators are proxies or Tor exit nodes? Without this context it is difficult to interpret hits, sift out FPs, and discover actual evidence of RIS activity, which was presumably the point of providing the IPs in the first place.
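If each indicator came with a validity window, checking a log hit against it would be straightforward. The sketch below assumes hypothetical `first_seen` and `last_seen` fields and invented dates; the JAR supplies no such values.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class TimedIndicator:
    ip: str
    first_seen: datetime  # hypothetical: start of believed malicious use
    last_seen: datetime   # hypothetical: end of believed malicious use


def classify_hit(indicator, hit_time):
    """Label a log hit by whether it falls inside the indicator's window."""
    if indicator.first_seen <= hit_time <= indicator.last_seen:
        return "in-window"
    return "out-of-window"


# Invented window and address (RFC 5737 documentation range).
example = TimedIndicator(
    ip="203.0.113.7",
    first_seen=datetime(2016, 3, 1, tzinfo=timezone.utc),
    last_seen=datetime(2016, 6, 30, tzinfo=timezone.utc),
)
inside = classify_hit(example, datetime(2016, 4, 15, tzinfo=timezone.utc))
outside = classify_hit(example, datetime(2012, 1, 1, tzinfo=timezone.utc))
```

An out-of-window hit is a strong candidate for a false positive, especially for addresses that change hands or serve many users, while an in-window hit deserves a closer look.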
Actionize the nomenclature table
The alternate names table on page four of the GRIZZLY STEPPE report potentially confirmed years of claims from commercial threat intelligence providers, but in its current format is completely useless and possibly even counterproductive from a technical standpoint. A few pieces of information from the IC could be a force multiplier on the value