Information Safety

Improving technology through lessons from safety.

Interested in applying lessons from safety to security? Learn more at security-differently.com!

Does Phishing Training Work?

I was recently talking with a couple of friends who both work in technology outside of cybersecurity, and our conversation led to one of the most common interactions with the security team: phishing training. Their general experience reflected my own: companies generate simulated phishing emails, send them out periodically, and deliver training to employees who click.

This raises an important question: does phishing training work? In the spirit of the Safety of Work podcast, let’s examine the evidence.

Academic Literature

What does the academic literature have to say? A quick search in Google Scholar found a literature review paper on the topic from 2020, “Don’t click: towards an effective anti-phishing training. A comparative literature review.” The paper reviewed 104 papers published between 2003 and 2020 that researched phishing attack success rates and/or training effectiveness.

“There is a large body of publications that confirm a decreased likelihood that users will fall victim to phishing messages after educating them with general anti-phishing material or via embedded training.”

There was a consensus that embedded training (simulated phishing emails with education for people who click the link) improves outcomes by reducing click rates, although there was no consensus on how best to educate users. Training needs to be repeated periodically to remain effective: at least every five months, and preferably quarterly.

The paper also recommends providing a mechanism to report suspicious emails, like the increasingly common “report phishing” button, and tailoring the difficulty of training to the individual, something current commercial systems don’t do.

Data

What does the data say? Outside the academic literature, a series of reports by Cyentia and Elevate Security offer additional insights.

The first report, published in 2021, had a number of interesting findings:

  • Completing training 1-3 times reduces average click rates, but performance gets progressively worse at 4 and 5 times; the average click rate after 5 training sessions was higher than with no training at all!
  • Sending more simulation emails decreases average click rates, even at high numbers of simulations, though the effect flattens out just below 5%.

Similar results are reported in the literature review paper: security fatigue is a real problem, and one cited study found two groups that were not affected by training: the “always click” group (11%) and the “never click” group (22%). In a presentation at Secure360 2015, the CEO of Cofense (then PhishMe) also noted that some users (between 5% and 10%) would always click the link, no matter how much training they’d received.

Importantly, the Cyentia/Elevate report also noted that in 100% of organizations, someone eventually clicks or is compromised - that is, no matter how much you train, someone within your organization will click the phishing link.

A second report studied the problem in greater detail, finding that:

  • Some users get many more phishing emails than others (100s per year vs. a few).
  • The more phishing emails a department receives, the better it is at blocking them.
  • Most users won’t click the emails that do make it to their inboxes.
  • But some of those who do will click a lot (as much as one click per week).
  • Subjecting all users to the same level/type of treatment is counterproductive.

The analysis showed that nearly 80% of users never click a phishing link, while 4% of users account for 80% of clicks - a small number of high-risk users are the biggest source of phishing clicks.
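
If you collect your own simulation results, this concentration is straightforward to measure. Here’s a minimal sketch, assuming a hypothetical export of (user, clicked) records from your simulation tool:

    from collections import Counter

    def click_concentration(click_log):
        """Measure how concentrated phishing clicks are among users.

        `click_log` is a hypothetical export from a simulation tool:
        a list of (user_id, clicked) tuples, one per simulated email.
        """
        clicks_per_user = Counter(user for user, clicked in click_log if clicked)
        total_clicks = sum(clicks_per_user.values())
        total_users = len({user for user, _ in click_log})

        # Walk users from most to fewest clicks until 80% of clicks are covered.
        covered, heavy_clickers = 0, 0
        for _, n_clicks in clicks_per_user.most_common():
            covered += n_clicks
            heavy_clickers += 1
            if covered >= 0.8 * total_clicks:
                break

        never_clicked = total_users - len(clicks_per_user)
        print(f"{never_clicked / total_users:.0%} of users never clicked")
        print(f"{heavy_clickers / total_users:.1%} of users account for 80% of clicks")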

(A third report studied the question of high-risk users in greater detail.)

My Experience

My own experience, reflected in my recent discussion, is that nearly everyone uses embedded training and has a “report phishing” button or a reporting email address, but does little beyond that. At worst, the program is a game played by the cybersecurity team to see how many people they can fool, using email templates built with substantial inside knowledge (not realistic examples of what employees are likely to receive). At best, it uses realistic examples, delivers periodic training, and provides feedback, with the goal of a 0% click rate. Feedback after reporting a phishing email is uncommon, whether the email is simulated or real.

Education and training typically focus on telltale signs of phishing emails, like unusual links, suspicious sender addresses, a sense of urgency, and spelling or grammatical errors. (Interestingly, the literature review paper suggests that looking for spelling errors is not an effective approach.)

Training Differently

So, does phishing training work? Yes, with significant limitations. It’s clear that the embedded training and reporting button used by most companies help improve average employee performance, but there is a small group (4-5%) of users that will always fall victim to phishing.

How can we improve? To answer that, I suggest we look at the problem through the lens of Security Differently, and shift the focus from preventing negative outcomes (clicking the link) to promoting positive capacities (reporting and blocking phishing).

If we can never get to a 0% click rate, what should the goal of phishing training be? We want to encourage employees to report suspicious emails (a measurable, positive action) so that we can proactively block those emails and links. This is an important difference, illustrated well by this article, which shares the story of “Vicky”, who correctly identifies a phishing email but doesn’t report it because “it’s not my thing to deal with.”

Awareness, the focus of most phishing training, covers only one part of what leads to the desired behavior (reporting) - the Capability. We also need to provide the Opportunity, by making reporting as easy as possible with a reporting button, and the Motivation, through feedback: thanking people for reporting, and following up when an email is not reported, even if the person didn’t click the link.

Even with high rates of reporting, we still have an organizational challenge: someone, somewhere, will click the link. To solve this, we can leverage the fact that most phishing emails are sent to multiple people within our organization (or elsewhere). If we automatically block emails and links based on early reports, we can stop the first click before it happens.

Ironically, this approach was the core of the Cofense presentation at Secure360 in 2015: they created a system where employees were assigned a “credit rating” based on the reliability of their reports, as judged by security analysts. The system was designed to automatically block copies of reported emails once they passed a threshold of reporter reliability and number of reports, without manual intervention by the security team.
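
To make the mechanism concrete, here’s a minimal sketch of that reputation-weighted blocking logic. The names, scoring scheme, and threshold are my own assumptions for illustration, not Cofense’s actual implementation:

    REPORT_THRESHOLD = 3.0  # combined reliability-weighted reports needed to block

    class PhishReportQueue:
        """Illustrative reporter-reputation model, not Cofense's system."""

        def __init__(self):
            self.reporter_score = {}  # user_id -> reliability in [0.0, 1.0]
            self.report_weight = {}   # email_fingerprint -> accumulated weight

        def record_report(self, user_id, email_fingerprint):
            """Accumulate reliability-weighted reports for an email.
            Returns True when copies of the email should be auto-blocked."""
            score = self.reporter_score.get(user_id, 0.5)  # neutral default
            weight = self.report_weight.get(email_fingerprint, 0.0) + score
            self.report_weight[email_fingerprint] = weight
            return weight >= REPORT_THRESHOLD

        def grade_report(self, user_id, was_phish):
            """Analyst feedback nudges the reporter's 'credit rating' up or
            down, so reliable reporters carry more weight over time."""
            score = self.reporter_score.get(user_id, 0.5)
            delta = 0.1 if was_phish else -0.1
            self.reporter_score[user_id] = min(1.0, max(0.0, score + delta))

With this kind of structure, three reports from highly reliable reporters trigger a block immediately, while reports from chronic false-alarmers accumulate weight much more slowly.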

Unfortunately, I’ve never come across an organization that fully adopted this model. If your organization has, please get in touch; I’d love to hear about the real-world effectiveness of this approach!


SIRAcon 2023

As I’ve mentioned in recent posts, I’m cataloging my past presentations, and this is the last one! This talk from SIRAcon 2023, “Measuring and Communicating Availability Risk”, summarizes my experiences leading Site Reliability Engineering (SRE).

The particular focus of my SRE work was on measuring availability and availability risk, and I learned quite a bit over the three years or so I did SRE. One of the key lessons was that the value of measuring availability using Service Level Objectives (SLOs) lies in decision support (SIRA helped with this framing). That is, SLOs and the associated measurements help make decisions about what to do, whether during an incident, tactically over the course of a month, or strategically over several months and into the future.

Our biggest success came from measuring availability in ways that supported all three timescales. Using an explicitly defined, customer-focused measure of “available”, we were able to construct visualizations that helped during incidents (real time), during maintenance planning (one month), and for longer-term work (many months).

A key element of this success was the business imperative: the work supported a large and important client, who had just negotiated a significant increase in availability by no longer allowing us to count scheduled downtime against our availability target. The Service Level Indicator (SLI) we created helped our incident responders understand outages better, and the SLO we created allowed our teams to schedule maintenance with confidence, or confidently defer it. A hidden benefit was that the metrics, being based on direct observations from our monitoring tools, brought the different stakeholders together around a common view of how available our systems were - the new approach we developed was even adopted by our client as an improvement.
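
As an illustration of what that SLI involves, here’s a minimal sketch of an availability calculation that excludes scheduled maintenance. The function and data shapes are illustrative, not our actual implementation:

    from datetime import timedelta

    def availability_sli(outages, maintenance_windows, period_start, period_end):
        """Availability over a period, excluding scheduled maintenance.

        `outages` and `maintenance_windows` are lists of (start, end)
        datetimes; overlap handling is simplified for illustration.
        """
        def overlap(a_start, a_end, b_start, b_end):
            start, end = max(a_start, b_start), min(a_end, b_end)
            return max(timedelta(0), end - start)

        total = period_end - period_start
        scheduled = sum((overlap(s, e, period_start, period_end)
                         for s, e in maintenance_windows), timedelta(0))

        # Only count outage time that falls outside scheduled maintenance.
        downtime = timedelta(0)
        for o_start, o_end in outages:
            in_period = overlap(o_start, o_end, period_start, period_end)
            in_maint = sum((overlap(o_start, o_end, m_start, m_end)
                            for m_start, m_end in maintenance_windows), timedelta(0))
            downtime += max(timedelta(0), in_period - in_maint)

        eligible = total - scheduled
        return 1.0 - downtime / eligible  # compare against the SLO target

Against a target like 99.9%, the gap between the measured SLI and the target is the error budget that makes it possible to schedule maintenance with confidence, or confidently defer it.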

A copy of my slides is here, and the visual notes from the talk are below! As a bonus, you’ll get to see a photo of my dog, Gertie, added at the last minute as part of an ongoing cats vs. dogs competition at the conference.

[Visual notes from the talk]


Measuring Security Effectiveness

The last “security” talk I’m cataloging this week is one that Sean Scott and I gave a few times, in various forms, based on Sean’s work to measure the effectiveness of our Application Security Team.

From the abstract:

How do you measure the effectiveness of security? How can you prove that security is a good investment?

In 2016, we established a security function within software engineering at Express Scripts. Taking a software engineering approach to security, we created testing services, hired developers to build tools and conduct secure code reviews, and established the AppSec Defenders training program.

In 2020, we challenged ourselves to evaluate the effectiveness of our application security program by analyzing the impact of our team’s services on pen-test findings. A 3-month data analysis found that development teams working with us fixed their pen-test findings faster and had significantly fewer new pen-test findings than teams we didn’t work with. We continued the analysis with a randomized controlled trial: by assigning new teams to work with us (or not), we have created an experiment to test the impact of our program.

We will share the specific application security practices that we believe led to these improved outcomes, how we adjusted our services in response to our findings, a recently published industry report that supports our conclusions, and the current status of our randomized controlled trial, which we expect to complete in the first half of 2021.

Results

This was an ambitious project, and we worked hard to create a rigorous study, grounded in evidence. There were a few key findings:

  1. There was a significant improvement in the performance of the development teams we worked with, which we expected. The magnitude of the impact was large enough that we estimated we more than paid for our entire team with just the reduction in time spent fixing bugs after the fact, before even considering the reduction in security risk. (A sketch of this comparison follows the list.)
  2. Surprisingly, the level of engagement with our team had much less impact than whether or not a development team was working with us at all. We used this insight to scale back the amount of time we spent with individual teams, so we could work with a greater number of teams and increase our overall impact.
  3. Our strategies (using DAST + SAST, scanning frequently, and integrating SAST with our CI/CD pipeline for a steady scan cadence) were found to be the top four factors in improving remediation times by the Cyentia/Veracode State of Software Security Volume 11 report. This was quite validating for us, as we had adopted this approach before the data was available.
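
For the comparison in the first finding, a permutation test on remediation times is one simple, rigorous approach. This is a sketch of the kind of analysis involved, not our actual code; the inputs are hypothetical lists of days-to-fix per pen-test finding:

    import random
    import statistics

    def median_diff(engaged, control):
        """Days-to-fix advantage of engaged teams (positive = faster)."""
        return statistics.median(control) - statistics.median(engaged)

    def permutation_test(engaged, control, n_iter=10_000, seed=0):
        """One-sided permutation test: is the median remediation-time
        difference larger than chance reshuffling would produce?"""
        rng = random.Random(seed)
        observed = median_diff(engaged, control)
        pooled = list(engaged) + list(control)
        n = len(engaged)
        hits = 0
        for _ in range(n_iter):
            rng.shuffle(pooled)
            if median_diff(pooled[:n], pooled[n:]) >= observed:
                hits += 1
        return observed, hits / n_iter  # (difference in days, p-value)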

Presentations

We presented the talk on four different occasions, all virtual due to the pandemic:

  • In August 2020, I gave a brief presentation based on our work at the SIRAcon Day 2 lightning round (Slides).
  • Later in 2020, we gave the full presentation at an internal company technical conference. One attendee suggested we present at an external security conference, so we did!
  • In May 2021, we presented together at Secure360. Slides from that event are here.
  • Finally, in October 2021, we presented again at the ISC2 Security Congress, with a few new slides, copy here.

Note: the SIRAcon video is only available to SIRA members.

A Footnote

Finally, a footnote: this work was inspired in part by an earlier study we performed, where we measured the impact of static code analysis (SAST) scanning on security bugs in software. What we found was that simply giving teams the tool or SAST reports didn’t reduce the number of security bugs. Making high-severity SAST findings break the build (a policy we encouraged teams to adopt voluntarily) or pushing teams to resolve open high findings (which was less voluntary) was necessary for improvement, as we later showed.
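
As a sketch of what “breaking the build” on high findings can look like, here’s a minimal CI gate script. The findings-file format (JSON with a severity field per finding) is a hypothetical stand-in for whatever your SAST tool emits:

    #!/usr/bin/env python3
    """CI gate: fail the build when the SAST scan reports high-severity findings."""
    import json
    import sys

    def main(findings_path="sast-findings.json"):
        with open(findings_path) as f:
            findings = json.load(f)

        highs = [x for x in findings if x.get("severity", "").lower() == "high"]
        if highs:
            for finding in highs:
                print(f"HIGH: {finding.get('rule')} at {finding.get('location')}")
            print(f"Build failed: {len(highs)} high-severity SAST finding(s).")
            sys.exit(1)  # a non-zero exit code fails the CI job

        print("SAST gate passed: no high-severity findings.")

    if __name__ == "__main__":
        main(*sys.argv[1:])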
