Partnering With Verizon on the 2021 DBIR

Written by

Martin McKeay

May 13, 2021


By the time you read this post, the 2021 Verizon Data Breach Investigations Report (DBIR) will be published. Akamai has been one of the many partners contributing data to this report for more than half a decade. We greatly value the time, effort, and dedicated data science that go into providing this level of research to the security community.

On a personal level, my excitement about this report may verge on disturbing to folks who aren't data nerds, security professionals, or press. My wife has learned to nod her head at key points, but generally ignores me when I start talking about the DBIR. I find much to be inspired by in the work the Verizon team does, especially the partnerships they've built over the years. Our next State of the Internet / Security Report, Phishing for Finance, is a reflection of our own commitment to partnering with other organizations to produce the best research possible.

Unique research

There are always plots, charts, and tables that don't make it into a report. In this case, one of the plots that didn't make it into the DBIR was an examination of the Gini coefficient of the web application attacks seen by Akamai. The Gini coefficient is a complex idea, originally devised to measure how far a distribution of wealth deviates from a perfectly equal one. Not an easy concept to express succinctly.

Figure 1: It's nearly impossible to predict the number of application attacks an organization will experience on any given day.

The Verizon team dedicates several pages of the report to introducing concepts like this. However, the easiest way to read Figure 1 is this: the further to the right an organization sits, the less predictable its number of attacks on any particular day. Over 95% of the companies represented by this plot have a coefficient greater than 50, which makes prediction nearly impossible.
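
For readers who want to see the idea in numbers, here is a minimal sketch of a Gini coefficient computed over daily attack counts. It is not the DBIR's or Akamai's actual method; the function name `gini`, the sample counts, and the 0 to 100 scaling (chosen to match the "greater than 50" figure above) are all assumptions made for illustration.

```python
def gini(counts):
    """Gini coefficient of non-negative values, scaled 0-100 (assumed scale).

    0 means every day sees the same number of attacks;
    values near 100 mean nearly all attacks land on very few days.
    """
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Standard formula on sorted data:
    #   G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n,  with i starting at 1
    weighted = sum(i * x for i, x in enumerate(values, start=1))
    return 100.0 * (2.0 * weighted / (n * total) - (n + 1) / n)


# Hypothetical daily attack counts for two organizations over one week.
steady = [100, 100, 100, 100, 100, 100, 100]
bursty = [0, 0, 0, 0, 0, 0, 700]

print(gini(steady))  # perfectly even week
print(gini(bursty))  # all attacks on a single day
```

The steady week scores 0, while the bursty week scores well above 50: the same weekly total, but nothing in the first six days hints at what the seventh will bring.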

Another way to highlight this lack of predictability is the work that was done to predict DDoS attacks using a Recurrent Neural Network (RNN), based on events from Akamai and other partners. Even the best results a machine learning algorithm such as an RNN can provide aren't much better than a far simpler measure, like the median time between attacks. One of the major themes running through the DBIR is summed up as "Engineer for the expected and develop operational processes for the unexpected", a sentiment highlighted by the unpredictable nature of application and DDoS attacks.
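
To make the "much simpler measure" concrete, here is a rough sketch of a median-gap baseline: take the median time between past attacks and add it to the most recent one. The function names and timestamps are hypothetical, and this is only an illustration of the general idea, not the baseline used in the DBIR analysis.

```python
from datetime import datetime, timedelta
from statistics import median


def median_gap_days(attack_times):
    """Median number of days between consecutive attacks, or None if fewer than two."""
    times = sorted(attack_times)
    gaps = [(b - a).total_seconds() / 86400 for a, b in zip(times, times[1:])]
    return median(gaps) if gaps else None


def predict_next(attack_times):
    """Naive forecast: last observed attack plus the median gap."""
    gap = median_gap_days(attack_times)
    if gap is None:
        return None
    return max(attack_times) + timedelta(days=gap)


# Hypothetical DDoS attack dates for one organization.
history = [
    datetime(2021, 1, 1),
    datetime(2021, 1, 3),
    datetime(2021, 1, 4),
    datetime(2021, 1, 10),
]

print(median_gap_days(history))
print(predict_next(history))
```

A baseline this crude is the bar any fancier model has to clear, which is what makes the RNN result above so telling.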

Figure 2: Perhaps the time between DDoS attacks might make a better random number generator than cosmic radiation.

An example of an even more unpredictable attack vector is credential abuse, a topic near and dear to the SOTI / Security team. I'd encourage you to look at the Gini coefficient in the DBIR, but for our analysis the important plot highlights the sheer number of attacks many organizations see. While Verizon's data doesn't necessarily support this position, I would suggest that this is actually undercounting the number of credential stuffing attacks by several orders of magnitude for most organizations. Most, if not all, of the organizations in our data set count annual credential abuse in the millions of attempts.

Figure 3: The median yearly count of 841,239 credential stuffing attempts is what some companies see in a single day.

The influence of COVID-19

We've written about the effect of the pandemic on traffic and attack patterns multiple times in the last year, even releasing a SOTI/Research report on how it affected our own systems. So it shouldn't surprise anyone that we agree with Verizon that there's been a huge increase in the number of phishing-based compromises during the pandemic.

Figure 4: Correlation isn't causation, but our own research has highlighted a number of pandemic-based phishing kits. 

The level of uncertainty shown by the long tails in the violin chart in Figure 4 means that the data can't be conclusively tied to the pandemic based on just the attack data. But if you ask anyone who's actually looking at the kits being circulated by criminals or seeing more details on the attack traffic, you'll hear more certainty: Criminals have been taking advantage of the pandemic.

The rise of ransomware is another data point that's especially concerning. In the case of this type of attack, it seems less likely that the increase was related to the pandemic and more probable that criminals have found a low-risk, high-return way of making money. That said, recent events in the news may bring more attention to ransomware attacks and significantly drive up the risk for criminals as attention is focused on the issue.


Efforts like the DBIR are vital for our industry if we're going to move beyond the days of proving the need for security based on anecdotal evidence. A personal story is great, but being able to show leaders that the need to increase the security of our systems is provable (and not just the opinions of a paranoid security professional) is much better for securing budget.  

No organization, even Akamai, has been doing as much to partner and gather data as the team at Verizon. I've been reading every report since the first one was published over a decade ago. I'm not going to tell you to go back to the beginning and read every report, but there are worse ways to put yourself to sleep at night.
