Understanding threats to Protected Health Information (PHI) using the Verizon DBIR and HHS breach data

Don McKeown, 9/12/2019


For the healthcare sector, the US Department of Health and Human Services (HHS) dataset of PHI breaches (HHS Dataset) and the Verizon Data Breach Investigations Report (DBIR) are critical sources for understanding threats to information assets. However, you need to understand the strengths and weaknesses of each to use them properly (see table below). Use both to gain a broad perspective on threats to the confidentiality, integrity, and availability of information assets. Both the DBIR and the HHS Dataset would improve if Verizon and HHS collaborated.


It's difficult to prioritize information security investments given limited resources. If you're in healthcare, HIPAA compliance is table stakes. HIPAA is not highly prescriptive like PCI, so there are many ways to achieve compliance. Moreover, achieving compliance with any regulation or framework does not mean you're protected against emerging threats. An organization whose culture equates compliance with security should evolve toward a risk-based security culture that can respond to emerging threats [1].

Organizations with a risk-based security culture will want to understand the current threats to their most valuable assets. I'm going to make the wild assumption that most healthcare organizations are primarily worried about a breach of Protected Health Information (PHI). Two outstanding ways to statistically understand how such breaches have occurred in other organizations are the DBIR [2] and the HHS Dataset of breaches of PHI [3]. These could be used as sources to create executive reports about the most relevant threat patterns. In addition, security teams could use these data to ensure that they can defend against the attacks that are most common within the healthcare sector [4]. However, it's critical to understand the strengths and weaknesses of each source to interpret it appropriately.

The DBIR [2] is widely considered the most authoritative, rigorous, freely available analysis of data breaches and security incidents. It draws its data from seven internal Verizon sources and 66 external, international organizations from the public and private sectors, and structures it according to a comprehensive, standard set of metrics (VERIS [5]). The DBIR analyzes breach and incident data across all industries and also within each industry sector, because sectors generally have different threat patterns.

The HITECH Act requires HHS to publicly disclose breaches of PHI that affect 500 or more individuals. It does this on its breach portal website, also known as the Wall of Shame, and offers the last 24 months for download. Each record includes fields such as covered entity type, individuals affected, date of breach, type of breach (e.g., Hacking/IT Incident, Unauthorized Access/Disclosure), and location of breached data (e.g., Network Server, Email).
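As a minimal sketch of how such a download could feed an executive summary, the Python below tallies breaches and affected individuals by type of breach. The column names and the sample rows are assumptions made up for illustration, not real HHS records; check the headers of an actual export before relying on them.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical sample rows for illustration only. The column names are
# assumed and may not match the headers of a real HHS export.
SAMPLE_CSV = """Covered Entity Type,Individuals Affected,Type of Breach,Location of Breached Information
Healthcare Provider,1200,Hacking/IT Incident,Network Server
Health Plan,800,Unauthorized Access/Disclosure,Email
Healthcare Provider,5000,Hacking/IT Incident,Email
Business Associate,650,Theft,Laptop
"""


def summarize(csv_text):
    """Count breaches and sum affected individuals per type of breach."""
    incidents = Counter()
    individuals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        breach_type = row["Type of Breach"].strip()
        incidents[breach_type] += 1
        individuals[breach_type] += int(row["Individuals Affected"])
    return incidents, dict(individuals)


incidents, individuals = summarize(SAMPLE_CSV)
# List breach types from most to least frequent.
for breach_type, count in incidents.most_common():
    print(f"{breach_type}: {count} breaches, {individuals[breach_type]} individuals")
```

Counting incidents and summing individuals separately matters because the two rankings can differ: a breach type that is rare can still account for most of the individuals affected.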

Any analysis based on aggregated data depends on the type and quality of the data sample and can only offer general guidance about the most prevalent threats. Organizations need to conduct their own risk analysis, and sources such as the DBIR and the HHS Dataset are outstanding starting points. Moreover, as an organization matures, it should consider integrating threat intelligence into its operations so it can respond rapidly to emerging threats. The table below summarizes the strengths and weaknesses of each source. If you have any feedback on these or know of any others, please reach out to me.

Comparative strengths and weaknesses of DBIR and HHS Dataset


Use both sources to understand threats to information assets holistically, keeping the strengths and weaknesses of each in mind. For example, don't assume that DBIR breach data for the healthcare sector is about PHI breaches; the report doesn't say that. PHI breaches are what the HHS Dataset is about. If you want to understand incidents that could more broadly affect the confidentiality, integrity, and availability of your information assets, turn to the DBIR.

The DBIR and HHS dataset would improve with collaboration and learning from each other. HHS should encode its data with VERIS and share it with the DBIR team. The DBIR should analyze confirmed breaches of PHI using the measures included in the HHS dataset such as number of individuals affected and covered entity type.
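To make the encoding idea concrete, here is a rough sketch of translating one HHS-style record into a VERIS-style structure. The crosswalk from HHS breach types to VERIS action categories is an assumption for illustration, the input field names are hypothetical, and the output is deliberately simplified rather than schema-valid VERIS. A real encoding would need more detail than the HHS Dataset publishes.

```python
# Assumed crosswalk from HHS "type of breach" values to VERIS action
# categories. This mapping is illustrative, not an official one.
HHS_TO_VERIS_ACTION = {
    "hacking/it incident": "hacking",
    "theft": "physical",
    "loss": "error",
    "improper disposal": "error",
}


def to_veris_style(record):
    """Build a simplified, VERIS-style dict from an HHS-style record.

    The input keys (type_of_breach, individuals_affected,
    covered_entity_type) are hypothetical names chosen for this sketch.
    """
    breach_type = record["type_of_breach"].strip().lower()
    # Types with no clear counterpart (for example Unauthorized
    # Access/Disclosure, which could be misuse or error) stay "unknown".
    action = HHS_TO_VERIS_ACTION.get(breach_type, "unknown")
    return {
        "action": {action: {}},
        "attribute": {
            "confidentiality": {
                # Assumption: individuals affected stands in for a record count.
                "data_total": int(record["individuals_affected"]),
                "data": [{"variety": "Medical"}],
            }
        },
        # Kept as an extra field; VERIS itself does not define this key.
        "hhs_covered_entity_type": record["covered_entity_type"],
    }


example = to_veris_style(
    {
        "type_of_breach": "Hacking/IT Incident",
        "individuals_affected": "1200",
        "covered_entity_type": "Healthcare Provider",
    }
)
print(example)
```

Leaving ambiguous types as "unknown" instead of guessing reflects the core difficulty: the HHS categories are coarser than VERIS, so some records cannot be encoded without additional information from the reporting entity.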

If you have any feedback or questions, please contact me.


[1] D. A. McKeown, "Building a Risk-based Information Security Program," ISSA Journal, pp. 14-21, April 2019.

[2] "Verizon Data Breach Investigations Report (DBIR)," [Online]. Available: https://enterprise.verizon.com/resources/reports/dbir/.

[3] U.S. Department of Health and Human Services - Office for Civil Rights, "Breach Portal: Notice to the Secretary of HHS Breach of Unsecured Protected Health Information," [Online]. Available: https://ocrportal.hhs.gov/ocr/breach/breach_report.jsf. [Accessed May 2019].

[4] R. Mogull, "How to Use the 2013 Verizon Data Breach Investigations Report," Securosis, 22 April 2013. [Online]. Available: https://securosis.com/blog/how-to-use-the-2013-verizon-data-breach-investigations-report. [Accessed 3 September 2019].

[5] "VERIS - The Vocabulary for Event Recording and Incident Sharing," 3 September 2019. [Online]. Available: http://veriscommunity.net/index.html.

[6] HIPAA Journal, "Healthcare Data Breach Statistics," [Online]. Available: https://www.hipaajournal.com/healthcare-data-breach-statistics/. [Accessed 3 September 2019].