Tuesday, June 17, 2008

The Breach Data Report

Today I want to talk about the breach data report. No, not the
Verizon breach report, but the other one. The one you haven't seen yet.

Over the past semester, University of Washington researchers in the iSchool Information Assurance program spent hundreds of hours analyzing breach data. This was the final project for a pretty senior group of graduate, undergraduate, and returning professional students.

Initially, the goal was to dig for nuggets of useful information in the breach data, much like those in the Verizon study. However, the analysis quickly revealed that most of the breach data out there is incomplete, inaccurate, or just plain incomprehensible.

How did Verizon get such accurate results? Well, according to them, they used data from incidents they were involved in. Specifically, they say the data comes "directly from the casebooks of our Investigative Response team." So we know that this data is at least biased towards Verizon customers, which is interesting. I'll mention that I am a Verizon Business Security customer, but I've never been involved in a breach investigation with their team. If I did have an incident, I don't know if they'd be the ones I'd call. The data they examined is probably complete for the cases involved. It's just that those cases do not represent the entire range of possibilities.

Now our project, "Defining a process for quantitative analysis of data breach information," cast a much wider net. And the results were startling. The students could verify only 30% of the reported breaches with high confidence. And many data sources had to be thrown out because they were too incomplete to be useful.

The whole report is 56 pages long and covers processes for vetting, parsing, and querying breach information sources. The report isn't available yet, but soon will be. If you're in the Pacific Northwest, we will be having a special InfraGard meeting with the researchers to go over the results in detail.
