Analysis of PHI Breach Data Indicates Different Control Recommendations than 2020 Verizon DBIR

Don McKeown, 5/30/2020


The 2020 Verizon Data Breach Investigations Report (DBIR) Healthcare analysis suggested implementing a security awareness and training program, boundary defense, and data protection as the "Top 3" controls. However, an analysis of PHI breach data from the US Department of Health and Human Services (HHS) suggested that protecting administrative privileges and implementing secure configurations on network servers should be prioritized above these.


The Verizon Data Breach Investigations Report (DBIR) is among the most authoritative free information security reports published. It offers an analysis of security incidents and breaches across and within industries, including healthcare. I wrote previously [1] about using freely available PHI breach data from the US Department of Health and Human Services (HHS) [2] to complement the DBIR analysis. In this article, I analyze HHS breach data from the same time frame as the 2020 DBIR data and compare the results to the DBIR findings.


I downloaded all HHS breach data from November 1, 2018 through October 31, 2019, the time frame of the DBIR analysis.
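A minimal sketch of this kind of date filtering, using only the Python standard library. The column names (`Breach Submission Date`, `Individuals Affected`, `Type of Breach`), the `MM/DD/YYYY` date format, and the choice to filter on submission date are all assumptions about the export, and the sample rows are invented for illustration; they are not HHS records.

```python
import csv
import io
from datetime import date, datetime

# Illustrative rows only; NOT real HHS records. Column names are assumed.
SAMPLE_CSV = """Breach Submission Date,Individuals Affected,Type of Breach
10/15/2018,1200,Hacking/IT Incident
11/01/2018,800,Theft
06/30/2019,5000,Hacking/IT Incident
10/31/2019,650,Unauthorized Access/Disclosure
11/02/2019,900,Loss
"""

# Analysis window stated in the article (inclusive on both ends).
WINDOW_START = date(2018, 11, 1)
WINDOW_END = date(2019, 10, 31)


def in_window(row, start=WINDOW_START, end=WINDOW_END):
    """Return True if the row's submission date falls inside the window."""
    submitted = datetime.strptime(row["Breach Submission Date"], "%m/%d/%Y").date()
    return start <= submitted <= end


def filter_breaches(csv_text):
    """Parse CSV text and keep only the rows inside the analysis window."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if in_window(row)]


selected = filter_breaches(SAMPLE_CSV)
print(len(selected), "of 5 sample rows fall inside the window")
```

With a real export, the same functions could be applied to the file contents instead of `SAMPLE_CSV`, after checking the actual header names and date format.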

Most breaches of PHI were Hacking/IT Incidents. These breaches also affected disproportionately more individuals: about 70% of breaches were Hacking/IT Incidents, but they accounted for almost 90% of the individuals affected. Unauthorized Access/Disclosure remains a significant breach type, accounting for 21% of breaches and 10% of individuals affected:

Table 1. Type of Breach
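The two percentages compared in this table (share of breaches versus share of individuals affected) can be computed as in the sketch below. The records and the helper name `shares_by_category` are hypothetical and chosen only so the arithmetic is easy to follow; they do not reproduce the HHS figures.

```python
from collections import defaultdict

# Hypothetical (type of breach, individuals affected) pairs; NOT HHS data.
records = [
    ("Hacking/IT Incident", 9000),
    ("Hacking/IT Incident", 6000),
    ("Unauthorized Access/Disclosure", 3000),
    ("Theft", 2000),
]


def shares_by_category(rows):
    """Map each category to (% of breaches, % of individuals affected)."""
    counts = defaultdict(int)
    people = defaultdict(int)
    for category, individuals in rows:
        counts[category] += 1
        people[category] += individuals
    total_breaches = sum(counts.values())
    total_people = sum(people.values())
    return {
        category: (
            100 * counts[category] / total_breaches,
            100 * people[category] / total_people,
        )
        for category in counts
    }


shares = shares_by_category(records)
for category, (pct_breaches, pct_people) in shares.items():
    print(f"{category}: {pct_breaches:.0f}% of breaches, {pct_people:.0f}% of individuals")
```

Keeping both percentages side by side is what exposes a category that is frequent but low-impact, or infrequent but high-impact. The same function works for breach location if the location value is passed as the category.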

The largest share of breaches involved PHI in email (~40%), but these accounted for only about 10% of the affected individuals. About 25% of breaches involved PHI stored on network servers, yet these accounted for roughly 83% of individuals affected:

Table 2. Location of breach information

Hacking/IT Incidents involving network servers resulted in by far the most PHI records breached, followed by hacking of email. Unauthorized Access/Disclosure of PHI on network servers was third.

Figure 1. Type of breach and location of breach.
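A ranking like the one in Figure 1 can be sketched by summing individuals affected for each (type, location) pair. The records below are hypothetical and `rank_patterns` is an illustrative helper name, not part of any HHS or DBIR tooling. The sketch also assumes each record lists exactly one location; a row that lists several would need to be split or assigned first.

```python
from collections import Counter

# Hypothetical (type, location, individuals affected) records; NOT HHS data.
records = [
    ("Hacking/IT Incident", "Network Server", 50000),
    ("Hacking/IT Incident", "Email", 8000),
    ("Unauthorized Access/Disclosure", "Network Server", 3000),
    ("Hacking/IT Incident", "Network Server", 20000),
    ("Unauthorized Access/Disclosure", "Email", 1000),
]


def rank_patterns(rows):
    """Sum individuals per (type, location) pair, largest total first."""
    totals = Counter()
    for breach_type, location, individuals in rows:
        totals[(breach_type, location)] += individuals
    return totals.most_common()


ranking = rank_patterns(records)
for (breach_type, location), individuals in ranking:
    print(f"{breach_type} / {location}: {individuals}")
```

Ranking by individuals affected rather than by breach count matches how the patterns are ordered in the discussion here.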


The HHS data set is comparable to the DBIR breach data, not its incident data. The DBIR analysis of breaches within the healthcare industry suggested that the most common pattern is Miscellaneous Errors. This differs dramatically from the HHS data, in which Hacking/IT Incidents involving network servers was the most common pattern.

The second most common DBIR breach pattern within healthcare was Web Applications [3]. Because the datasets do not use the same underlying data framework, they are not directly comparable. The DBIR concluded "as more and more organizations open patient portals and create new and innovative ways of interacting with their patients, they create additional lucrative attack surfaces" (p. 56). However, the HHS data does not support this: Table 2 shows that electronic medical records were the location of breached PHI in only roughly 3% of breaches.

The third most common DBIR breach pattern is Everything Else. While this category, as the name suggests, is for incidents that do not fit into the other patterns, it contains mostly social engineering attacks such as phishing. The most comparable HHS pattern is Hacking/IT Incidents involving email (Figure 1). This is the second most common pattern in the HHS data, so on this point the two datasets do not differ significantly.

There are multiple reasons for the differences between the HHS and DBIR datasets. First, they are different samples. The DBIR draws data from a wide range of international sources, while the HHS data covers only PHI breaches of 500 or more records reported in the US.

In addition, the DBIR analysis does not explicitly cite breaches of PHI. It cites “medical data,” which likely includes PHI. So, the DBIR analysis cannot distinguish between a breach of 100 PHI records and 100 million PHI records. Moreover, the DBIR definition of breach is “an incident that results in the confirmed disclosure—not just potential exposure—of data to an unauthorized party.” The DBIR breach data is only analyzing confirmed breaches of sensitive data, not PHI specifically.

Finally, the datasets are categorized differently. The DBIR uses a detailed taxonomy called the Vocabulary for Event Recording and Incident Sharing (VERIS) to categorize its data, while the HHS data uses a much simpler set of categories. As I have written previously, there would be great benefit in a collaboration between HHS and Verizon. HHS should use VERIS to categorize its breaches, and this data should be included in the DBIR analysis.

Given that this is not likely a short-term possibility, how should one act on this analysis? Overall, security reports such as the DBIR and datasets such as the HHS breach data can only help with prioritization. At a basic level, an organization that deals with PHI must demonstrate HIPAA compliance. However, this is insufficient to protect against today’s rapidly evolving threats. An organization needs a broad security program that has strong defenses across the five NIST Cybersecurity Framework (CSF) core activities: Identify, Protect, Detect, Respond, and Recover. Moreover, an organization needs to build a risk-based security culture [4].

Once security fundamentals are in place, organizations should prioritize based on the most likely threats to PHI. The DBIR based its “top controls” recommendations on the Center for Internet Security (CIS) Critical Security Controls (version 6) (CSC). Note that prioritizing a CSC control set means implementing the non-Foundational and Advanced recommendations. For healthcare, the DBIR recommends Implement a Security Awareness and Training Program (CSC 17), Boundary Defense (CSC 12), and Data Protection (CSC 13). Note that CSC 17 must teach users how to appropriately handle PHI. CSC 13 should be used as an opportunity to perform a comprehensive search for PHI across all systems. Undoubtedly, organizations make criminals' lives easier by storing PHI, either mistakenly or intentionally, on systems not appropriately hardened to store, process, and transmit PHI.

Clearly, the DBIR recommended important controls, but the most significant incident patterns in the HHS data suggest other top controls. Assuming the other CSC Basic Controls are in place, Controlled Use of Administrative Privileges (CSC 4) and Secure Configuration for Hardware and Software on Mobile Devices, Laptops, Workstations and Servers (CSC 5) would effectively protect network servers, where the majority of PHI records were breached. Furthermore, these controls will protect email servers, where the second most PHI records were breached. Data Protection (CSC 13), a DBIR recommendation, should additionally cover email systems to protect against leakage of data. Security Awareness and Training Program (CSC 17), another DBIR recommendation, will also protect against breaches from Unauthorized Access/Disclosure, the third most impactful HHS incident pattern.

References and Notes

[1] D. McKeown, "Understanding threats to Protected Health Data (PHI) using the Verizon DBIR and HHS breach data," 12 September 2019. [Online]. Available:

[2] U.S. Department of Health and Human Services - Office for Civil Rights, "Breach Portal: Notice to the Secretary of HHS Breach of Unsecured Protected Health Information," [Online]. Available:

[3] DBIR definition of Web Applications pattern (p. 37): "Anything that has a web application as the target. This includes attacks against the code of the actual web application, such as exploiting code-based vulnerabilities (hacking-exploit vuln) to attacks against authentication, such as hacking-use of stolen creds."

[4] D. McKeown, "Building a Risk-based Information Security Program," ISSA Journal, pp. 14-21, 2019.