Results and Analysis: Introduction

  • Welcome to the DBIR 15-year reunion! Please grab a name tag, find some familiar faces, and reminisce about the good ole days back in 2008. Now, let’s catch everyone up on how we’ve changed over the years.

    A picture may tell a thousand words, but so will a good figure. The charts we use in our report are the result of numerous iterative attempts to convey both the main story of the data and its main constraints, which is a tricky¹ proposition. Our dataset comes to us in a variety of formats and represents many different contributors, each of which comes complete with its own particular nuances and biases. We realize that our data is not a ‘pure sample’ of the world of breaches and incidents (because such a thing does not exist). Nevertheless, we can still extract meaningful analysis.

    You may already be familiar with charts such as Figure 10, which we used in the original DBIR. While these bar charts are an excellent means of allowing easy comparisons between a small set of things, they can also hide important information in their percentages. Therefore, in an effort to be more transparent with our readers regarding the level of ambiguity or uncertainty in our data, we have transitioned over time to slanted bar charts such as Figure 11, which capture both the comparison between the “things” and the range of values for each, based on the confidence we have in the data. We’ve also applied the same notion to our line charts: instead of representing trends as single lines based on the average, we plot a collection of lines within our confidence interval. The good news is that you can still convey the core message of “things change” but also provide an honest illustration of “these are the possible representations of those changes.”
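To make the idea of showing a range rather than a single percentage concrete, here is a minimal sketch that computes an interval around an observed proportion. It uses the Wilson score interval, which is a standard textbook choice and an assumption on our part; it is not necessarily the method behind the report's charts. The function name and the counts are hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Return the (low, high) Wilson score interval for a proportion.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must be between 0 and total")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical counts: the same 80% share observed in a small and a large sample.
small_low, small_high = wilson_interval(8, 10)
large_low, large_high = wilson_interval(800, 1000)
print(f"n=10:   {small_low:.2f} to {small_high:.2f}")
print(f"n=1000: {large_low:.2f} to {large_high:.2f}")
```

Both samples show the same headline percentage, but the smaller one yields a much wider range, which is the kind of uncertainty a plain bar chart would hide.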

    After 15 years of this data breach journey, we find ourselves reminiscing about all the deadlines, failed cover ideas, and heated arguments that we encountered along the way. However, maybe the real treasure of our journey wasn’t all the fame, mega yachts, book deals and data breach analysis, but the friends we made over the years. Initially this report was based solely on Verizon data, but since then we have been joined by 87 partners and collaborators from across the globe who make this report possible. Due in large part to them, we have collected and analyzed in total over 914,547 incidents, 234,638 breaches and 8.9 TBs of cybersecurity data, to bring our readers the best possible analysis and results. Truly, we stand on the shoulders of giants. Without further ado, let’s take a dive into the analysis.

  • Actor

R&A Actor – Friends in low places

  • Our findings indicate that data compromises are considerably more likely to result from external attacks than from any other source. Nearly three out of four cases yielded evidence pointing outside the victim organization. In keeping with other studies revealing risks inherent to the extended enterprise, business partners were involved in 39 percent of the data breaches handled by our investigators. Internal sources accounted for the fewest number of incidents (18 percent), trailing those of external origin by a ratio of four to one.

    The relative infrequency of data breaches attributed to insiders may be surprising to some. It is widely believed and commonly reported that insider incidents outnumber those caused by other sources. While certainly true for the broad range of security incidents, our caseload showed otherwise for incidents resulting in data compromise. This finding, of course, should be considered in light of the fact that insiders are adept at keeping their activities secret. (2008 DBIR)

    Some things haven’t changed since we first began publishing this report back in 2008. (For those of you who need context, the original iPhone had been released only one year prior.) The 2008 cyber² security world, with limited access to handheld wonder machines, held the belief that insider incidents outnumbered external ones, or at least felt it was “certainly true for the broad range of security incidents.” As we look back now, with the benefit (?) of 15 years of time-wasting apps, considerably more gray hair and a few chips off the collective Infosec shoulders, we can confidently state that External actors are consistently more common than Internal, with 80% of breaches being caused by those external to the organization, as seen in Figure 11.

  • The median size (as measured in the number of compromised records) for an insider breach exceeded that of an outsider by more than 10 to one. Likewise, incidents involving partners tend to be substantially larger than those caused by external sources. This supports the principle that privileged parties are able to do more damage to the organization than outsiders. (2008 DBIR)

    Bottom line: most data thieves are professional criminals deliberately trying to steal information they can turn into cash. Like we said—same ol’ story. It’s not the whole story, however, nor is it the most important one. The most significant change we saw in 2011 was the rise of “hacktivism” against larger organizations worldwide. The frequency and regularity of cases tied to activist groups that came through our doors in 2011 exceeded the number worked in all previous years combined (2015 DBIR)

    In the 2008 report, the number of records breached was the metric of choice. Now that we are a bit further into the 21st century, the currency of impact is the metric du jour. Though records are still of interest, they are typically not viewed with the same level of importance as in past years. In 2008 the median internal breach nabbed 375,000 records, as you can see in Figure 13; this year it’s only 80,000.³ While it appears the number of records is decreasing, it is important to keep in mind that a number of changes have taken place both in this report and within the industry at large. Therefore, the change in record count could be reflective of the fact that there are now more ways for attackers to monetize data.

    Motive, for the most part, was not an initial topic of analysis for the DBIR (although in 2008 we did consider it in the context of targeted vs opportunistic breaches). In 2010, we stated “Today’s cybercriminals are not hobbyists seeking knowledge or thrills; they are motivated by the illicit profits possible in online crime.” While that may seem obvious to today’s readers, it is important to remember that at that time the stereotypical “let me hack this site from my mom’s basement to impress my bros” type of activity was believed by many to account for a certain proportion of breaches. Regardless, the motive of the threat actor is important to understand in order to attempt to quantify how many of our troubles are caused by the illicit economy, personal vendettas or by accidental blunders.

    Financial has been the top motive since we began to track it in 2015. However, that same year the rise of hacktivism (particularly leaks) accounted for many attacks. Espionage-related attacks were not even on the radar, but seven years later the world is a very different place. Espionage has taken the 2nd place spot for years, and hacktivism is, for the most part, simply an afterthought. Before we move on, however, it should be noted that while espionage has almost certainly increased over the last few years, the fact that it did not appear at all in 2015 was quite likely due to our contributors and general case load at the time.

  • Actions

    The Actions section tells the story of how the security incident or breach plays out. It’s a bit like a Hollywood action movie, only with a modest budget and no explosions or car chases. Nevertheless, in spite of the dearth of pyrotechnics, the actions that lead up to these breaches have a definite impact on their victims. Actions are discussed in the DBIR by variety (the type of action) and vector (through what means the action took place). Figures 17 through 19 illustrate the varieties and vectors associated with incidents and breaches.

    The Denial of Service (DoS) action is the clear leader, representing 46% of total incidents, followed by the malware types of Backdoor or C2 at 17%. However, a much more interesting finding is the inclusion of Partner and Software update among the top vectors this year. This is a first for Software update, and is something we will discuss in greater detail in a subsequent section. Web applications is the number one vector, and, not surprisingly, is connected to the high number of DoS attacks. This pairing, along with the Use of stolen credentials (commonly targeting some form of Web application) is consistent with what we’ve seen for the past few years.

  • Action Categories

    Hacking: attempts to intentionally access or harm information assets without (or exceeding) authorization by circumventing or thwarting logical security mechanisms.

    Malware: any malicious software, script, or code run on a device that alters its state or function without the owner’s informed consent.

    Error: anything done (or left undone) incorrectly or inadvertently.

    Social: employ deception, manipulation, intimidation, etc., to exploit the human element, or users, of information assets.

    Misuse: use of entrusted organizational resources or privileges for any purpose or manner contrary to that which was intended.

    Physical: deliberate threats that involve proximity, possession, or force.

    Environmental: not only includes natural events such as earthquakes and floods, but also hazards associated with the immediate environment or infrastructure in which assets are located.

  • Turning to breaches, the top varieties are a bit more dynamic, with Use of stolen credentials, Ransomware and Phishing all in the top five. The category of “Other” has stealthily crept into one of the top three spots this year as well. This is largely due to the dataset being long-tailed and diverse. In other words, there are a lot of different things that aren’t in the top 10, but are still noteworthy. We can also flip that on its head and state that 73% of breach varieties are found in the top 10 varieties. Not too shabby considering the fact that we have more than 180 different action varieties. How’s that for the Pareto principle?⁴
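The "share covered by the top 10" figure above is a simple calculation on a long-tailed count table. The sketch below shows one way to compute it; the function name and the variety counts are hypothetical placeholders, not values from the dataset.

```python
from collections import Counter
from typing import Mapping


def top_n_share(counts: Mapping[str, int], n: int = 10) -> float:
    """Return the fraction of all observations that fall in the n most common categories."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must contain at least one observation")
    # most_common(n) returns the n largest (category, count) pairs.
    top = sum(count for _, count in Counter(counts).most_common(n))
    return top / total


# Hypothetical counts for illustration only.
example_counts = {"Variety A": 50, "Variety B": 30, "Variety C": 15, "Variety D": 5}
share = top_n_share(example_counts, n=2)
print(f"Top 2 varieties cover {share:.0%} of observations")
```

With real data, the same call with n=10 would give the proportion of breach action varieties captured by the ten most common ones, and the remainder is the long tail that lands in "Other."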

    In terms of vectors, these align well with the notion that the main ways in which your business is exposed to the internet are the main ways that your business is exposed to the bad guys. Web Applications and Email are the top two vectors for breaches. This is followed by Carelessness, which is associated with Errors such as Misdelivery and Misconfiguration. Next we have Desktop Sharing Software which captures things like Remote Desktop Protocol (RDP) and third-party software that allows users to remotely access another computer via the Internet. Unfortunately, if you can access the asset directly over the internet simply by entering the credentials, so can the criminals.

  • ‘08 Throwback

    While the DBIR has grown and evolved dramatically since its inception, we find it incredibly interesting how many of the core stats remain the same. In Figure 0a31addb from 2008, you’ll find that the numbers are eerily similar to what we see today. Hacking continues to be the main action, followed by Malcode (Malware). In the 2008 report, Error was recorded in two ways: Errors that directly caused the breach (the full bar) and Errors that contributed to the breach (light colored bar). We no longer use this breakdown (for a few reasons, one of which is that it can be argued that errors play at least a small part in almost all breaches), but Error accounts for 14% of breaches overall. From there, however, things begin to deviate slightly. This is particularly true with regard to Social and Error, but please keep in mind that our data has grown both in size and diversity of source over the years, expanding from 500 incidents that first year to over 23K incidents this year.

  • Assets

  • For those not “in the know” about VERIS (and if you are, that’s awesome!), Assets are the THING that the Action happens to. So, this is where you find WHAT was hacked via an exploit (probably a server), WHO was socially engineered by an attacker or WHAT was lost or stolen.⁵ For the staunch defenders, this should help you understand what is being targeted and also be a useful tool to start prioritizing what type of coverage your infrastructure needs.

    Figure 22 illustrates that the top varieties of Assets impacted in breaches are Servers, People and their devices.  When we start to zoom into the specific types of servers (Figure 24) we find Web application (56%) and Mail (28%) servers accounting for the top two varieties, which is rather intuitive when one considers that email servers and web applications are the Assets that are most likely to be internet-facing.  As such, they provide a useful venue for attackers to slip through the organization’s “perimeter” by using clever tricks like (spoiler alert) stolen credentials.

    Dropping down the list a bit farther, we come to the folks who are socially engineered. These are commonly the individuals who deal with company money and have the ability to do things with it (like update where it is deposited).

    While we are on the topic of assets, it is important to remember that not only are Information Technology (IT) assets important, but so are OT (Operational Technology). The topic of OT is on many people’s minds (and in the news) these days due to the current political climate. These are the computer systems that run our national infrastructure, and while we do have a smattering of cases, they only account for approximately 3% of our overall incident data. Technically, this is an increase from last year (about 1%). Please consider this a gentle reminder to protect those systems that are quietly chugging away in the background keeping our infrastructure up and running. It isn’t called critical infrastructure for nothing. 

  • Looking back

    Although how we classified assets in the 2008 DBIR (all those years ago) was different from how we do it today, the findings are relatively similar. Most incidents impact Servers (online data) with a sprinkling of user and networking devices. It seems that servers in data breaches, like JNCO jeans and spiked tipped hair in haute couture, are timeless.

  • Attribute

  • If security incidents did not have associated attributes, the life of an InfoSec professional would be a good deal easier. Unfortunately, they do: Confidentiality, Integrity and Availability (commonly referred to as the CIA triad), and they can greatly impact numerous aspects of an incident (who needs to be notified, what actions need to be taken and what explanations need to be given to senior management to name a few). Figure 25 shows CIA over time in our dataset (with regard to security incidents). 

    The DBIR defines a data breach as a compromise of the Confidentiality attribute, and anytime Confidentiality is compromised, it raises the question: what type of data was involved?

    Fifteen years ago (Figure 20 from the 2008 DBIR), Payment card data led the pack by a large margin. However, it has slowly declined over the last few years. No doubt this decline is to some degree reflective of the additional security controls that have been added in recent years to protect this type of data. Regardless, Figure 41a99c26 shows the top two data types are now Credentials and Personal data. We’ve long held that Credentials are the favorite data type of criminal actors because they are so useful for masquerading as legitimate users on the system. Much like the proverbial wolf in sheep’s clothing, their actions appear innocuous until they attack. With regard to breaches, attackers are frequently exfiltrating Personal data, including email addresses, since it is useful for financial fraud. There is also a large market for its resale, which means it is truly the “gift” that keeps on giving. Unfortunately, what it gives is mostly trouble to the data subjects (whom the data is about).

  • Once attackers are inside the victim’s network they often install malware, which violates the Integrity of a system (as does any other illicit change). The Integrity of a person can also be compromised when they alter their behavior due to the actions of the adversary. Examples include responding to a phishing email or falling victim to a pretexting scenario. These are the two main types of Integrity violations we see in our data, and while they have both been present in all reports, they were not necessarily referred to in the same terminology. In the early days of the DBIR, social actions such as Phishing were not as prevalent as they are now. However, the installation of malware was already quite common back in the day, and our data shows that this year is no exception, with over 30% of breach cases involving some type of malware, and approximately 20% of cases involving a Social action. 

    Ransomware continues its heyday and is present in almost 70% of malware breaches this year. Ransomware is an attack that straddles the first two of the CIA Triad (38% of ransomware cases have some Confidentiality compromise), bringing us to the third leg: Availability. When ransomware is triggered, the organization experiences an Availability loss since it can no longer access its data. The particular variety is Obscuration in our dataset, as shown in Figure 28.

    Another common form of availability impact is Interruption which often arises from Distributed Denial of Service (DDoS) attacks. These attacks make up a large number of incidents, but are relatively non-existent in our breach caseload. But if DoS is something you are particularly concerned about, we have an entire pattern devoted to it. 

  • Timeline

  • Discovery time is a good place to begin when viewing timelines. While Figure 29 might seem like good news (that we are more likely to detect breaches within days than months), it gets a little less comforting once you start looking at some of the drivers. The top Discovery Method for breaches (more than 50%) is now “Actor Disclosure,” normally either on the asset in the form of a ransomware note or on a criminal forum to sell the data or announce the breach. Neither is desirable. “Ignorance is bliss” doesn’t readily apply to breaches.

  • Event Chains

    Rather than simply analyze how long an attack took in time, we can also analyze how long it took in terms of the number of Actions. We can view this timeline of Actions in our Event Chain data. Event Chains capture the path an attack followed.⁶ Figure 30 shows that the vast majority of breaches include only a handful of steps. Three Actions (Phishing, Downloader, and Ransomware) are the most common, while very few breaches utilize five or more Actions. Our job as defenders is to lengthen that attack path. Attackers tend to avoid longer attack chains because every additional step is a chance for the defender to prevent, detect, respond to, and recover from the breach.
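As a rough illustration of measuring attacks by number of Actions, the sketch below treats each event chain as an ordered list of Action names and tallies chain lengths. The list-of-strings representation, the function names and the sample chains are all assumptions made for this example; they are not the report's actual data format or data.

```python
from collections import Counter
from typing import List, Sequence, Tuple

# Hypothetical event chains: each inner list is the ordered path one attack followed.
example_chains: List[List[str]] = [
    ["Phishing", "Downloader", "Ransomware"],
    ["Phishing", "Downloader", "Ransomware"],
    ["Use of stolen credentials", "Ransomware"],
    ["Misdelivery"],
    ["Phishing", "Use of stolen credentials", "Backdoor", "Downloader", "Ransomware"],
]


def chain_length_counts(chains: Sequence[Sequence[str]]) -> Counter:
    """Count how many chains have each number of Actions."""
    return Counter(len(chain) for chain in chains)


def most_common_chain(chains: Sequence[Sequence[str]]) -> Tuple[Tuple[str, ...], int]:
    """Return the most frequent exact path and how often it occurs."""
    if not chains:
        raise ValueError("chains must not be empty")
    return Counter(tuple(chain) for chain in chains).most_common(1)[0]


lengths = chain_length_counts(example_chains)
path, occurrences = most_common_chain(example_chains)
for steps in sorted(lengths):
    print(f"{steps} action(s): {lengths[steps]} chain(s)")
print("Most common path:", " -> ".join(path), f"({occurrences} times)")
```

A histogram of `lengths` is the sort of view that would show whether most chains are short, and a defender's goal of "lengthening the attack path" corresponds to shifting that distribution to the right.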

  • Value Chain

    Over the last two years the DBIR team has been collecting value chain information, defined as the capabilities and investments an attacker must acquire prior to the actions on the target, either by purchase or investment in its creation. Traditionally, defenders are largely focused on the events that occur within their boundaries, which makes sense since those are the things they control. However, an attacker ecosystem exists both before and after the breach, and it plays into and feeds off of the incident. The value chain asks the question “Where did that email address come from?” or “Where do those stolen credentials go?” It often seems that breaches beget more breaches, creating a Circle of Breach,⁷ so to speak. By understanding the transactions associated with this ecosystem, we can understand the key steps involved in attacks and work collaboratively to make those transactions more difficult, expensive or unsustainable for the attackers.

  • “It takes money to make money”

    There are several things attackers must invest in for a breach:

    Development: software or content that must be developed to accomplish the actions on the target.

    Targeting: work that identifies exploitable opportunities. These overlap heavily with the data varieties that are compromised.

    Distribution: services used to distribute actor content including email, compromised servers, and websites.

    Non-Distribution Services: services provided and used by threat actors other than those used for distribution of actor content.

    Cash-out: methods for converting something (likely the attribute compromised) into currency.

    Figure 31 provides the top variety for each part of the value chain. Email is the most common method. We can infer the email chain because it is the primary reason other things, such as malware in development or credentials in targeting, aren’t higher. Malware can be freely available and credentials may be stolen. The takeaway is not to think of breaches only in terms of starting or ending. Instead, think of them like you might think of a sports team: they are either on the field or preparing to be. 

  • 1 It’s not just rhymes that are tricky to rock.

    2 Can you find all 45 “cyber” references in the DBIR this year? I bet you can’t…

    3 And it should be noted that most internal breaches are from errors, not malice.

    4 Not to be confused with the Peter Principle, which is something else entirely.

    5 We feel a Schoolhouse Rock song coming on.

    6 Event Chains are kinda like Attack Flows in the Controls Appendix, but more basic.

    7 Like the Circle of Life, but for threat actors.
