Wednesday, March 26, 2014

7 areas outside of infosec to study

There are a lot of areas that most of us infosec people like to dabble in that are outside of our required skill set. For example, it seems like every third security person has a set of lock picks and loves to use them. Unless you're a red teamer, admit that it's just a puzzle you like to play with and stop trying to impress us. Here are some areas just outside of infosec that I like to hone:

1. SEO - Because hackers use it to sneak malware into your organization. Information warfare is older school than “cyber warfare”, and information warfare is all about managing perception. Where to start? I recommend my neighbor, Moz.

2. Effective communication. That means learning to write well in email, in long form, and to educate. It means being able to speak effectively one-on-one, in a meeting, and when giving a speech. It means being clear, concise, and consistent. It means respecting your audience and establishing rapport. Where to start? I recommend Manager Tools.

3. Project Management. Everything we do is a project. We can always be better at doing them. I’ve been managing projects for decades and I’m still not satisfied with how well things are run. I recommend Herding Cats.

4. Programming. I started in programming but rarely do it anymore. We work in technology. We give advice to developers. We work with sysadmins on scripting. We should at least have a good fundamental grasp of programming in a few major flavors: basic automation scripting, web apps, and short executables. I’d say you should at least be able to create something useful (beyond Hello World) in Perl, Bash, or PowerShell… plus something in Ruby/Python/Java.

5. Databases. Almost everything is built on a database. You should at least be able to write queries and understand how tables and indices work. It’s helpful to know a little more than how to do a SQL injection “drop tables” or “Select *”. You don’t need to become a DBA, but tinker with SQLite or MySQL (there’s a small SQLite sketch at the end of this post). As I level up on item 4, I find myself doing more and more of number 5. They kinda go together.

6. Psychology. Since we can't solve all our security problems with money (because we don't have enough), we have to use influence to get things done. And we have to anticipate how controls will live or die in the real world. A good basic understanding of people beyond treating users as passive objects (or even worse, as rational actors) is required. A good starting place is Dan Ariely's Predictably Irrational: The Hidden Forces That Shape Our Decisions.

7. Behavioral economics (more psychology). If you ever wondered why I have a CISSP, do SSAE-16 audits, and keep an office shelf of security awards, it’s because I get visited by a lot of nervous customers and auditors entrusting me with their data. Those things are signals; read up on signaling theory.

Note how almost half the things on my list are human-centric areas… because people are always the hardest part of the job.
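As promised in item 5, here's a tiny SQLite warm-up (the table, index, and data are all made up); it's just enough to see how tables, indices, and parameterized queries fit together without standing up a real database server:

```python
import sqlite3

# an in-memory database: nothing to install, nothing to clean up afterwards
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
con.execute("CREATE INDEX idx_users_dept ON users (dept)")  # what makes lookups by dept fast
con.executemany("INSERT INTO users (name, dept) VALUES (?, ?)",
                [("alice", "infosec"), ("bob", "it"), ("carol", "infosec")])

# a parameterized query: also the habit that prevents the SQL injection we test for
for (name,) in con.execute("SELECT name FROM users WHERE dept = ?", ("infosec",)):
    print(name)
```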

Wednesday, March 19, 2014

An interesting tidbit in the EU data protection regs:

The European Parliament has finally passed their big redesign of data protection regulation. Nothing too shocking in there, in light of the Snowden fallout. One little item caught my eye tho:

 Data Protection Officers: the controller and the processor shall designate a data protection officer inter alia, where the processing is carried out by a legal person and relates to more than 5000 data subjects in any consecutive 12-month period.

Data protection officers shall be bound by secrecy concerning the identity of data subjects and concerning circumstances enabling data subjects to be identified, unless they are released from that obligation by the data subject. The committee changed the criterion from the number of employees a company has (the Commission suggested at least 250), to the number of data subjects. DPOs should be appointed for at least four years in the case of employees and two in that of external contractors.

The Commission proposed two years in both cases.

Data protection officers should be in a position to perform their duties and tasks independently and enjoy special protection against dismissal. Final responsibility should stay with the management of an organisation.

The data protection officer should be consulted prior to the design, procurement, development and setting-up of systems for the automated processing of personal data, in order to ensure the principles of privacy by design and privacy by default.


Not anything new here, but reviewing it made me think about an interesting metric buried in there: the controller and the processor shall designate a data protection officer inter alia, where the processing is carried out by a legal person and relates to more than 5000 data subjects in any consecutive 12-month period ... The committee changed the criterion from the number of employees a company has (the Commission suggested at least 250), to the number of data subjects.

First, I liked the old metric of 250 employees per data protection officer. It tracked with my experience of about the right size at which to start having a dedicated security officer. But changing the criterion to the size of the pile of confidential data you're protecting is even more relevant.

When I was hired on at my current job, we were a smallish company but we were custodians of megatons of PII. And 5000 sounds about right, if nothing else, for breach numbers: if the average cost is around $136 per breached record, then 5000 x $136 = $680,000.

Okay, now we have our impact. The question is what is the probability of breach and how much does a dedicated DPO reduce that probability? Well, that probably varies from organization to organization, tho it'd be good to know some hard numbers. Something to munch on.
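For fun, here's the back-of-the-envelope version of that question. The impact figure comes from above; the DPO cost is purely an assumed number for illustration:

```python
# How much must a dedicated DPO cut annual breach probability to pay for themselves?
# The impact figure is from above; the DPO cost is an assumption for illustration.
impact = 5000 * 136      # ~$680,000: 5000 data subjects at ~$136 per breached record
dpo_cost = 120_000       # assumed fully loaded annual cost of a dedicated DPO

break_even = dpo_cost / impact
print(f"A DPO breaks even by cutting annual breach probability by {break_even:.0%}")
# roughly 18 percentage points with these numbers; a bigger pile of data lowers the bar
```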

The other thing I liked in the regs is “data protection officers should be in a position to perform their duties and tasks independently,” which continues to support my position that infosec should not report into the IT hierarchy.

Monday, March 17, 2014

Make your security tools: DLP

After spending tens of thousands of dollars on commercial security solutions that did not meet our needs, our security team opted for a DIY approach. One of the first tools we wanted was a decent DLP. We were also very disappointed in the DLP solutions available, especially when it came to tracking confidential data elements across both Linux and Windows file systems. Many were hard to use, difficult to configure, and/or dragged along an infrastructure of servers, agents, and reporting systems. We wanted something small, flexible, and dead simple. At this point, we were either looking at going back to the well for more resources to get the job done or coming up with something crafty. None of us were coders beyond some basic sysadmin scripting, but we decided to give it a shot.

The problem was that we potentially had confidential data lying around on several large file repositories. Nasty stuff like name + SSN, birthdate, credit card, etc. We tried several commercial and open source DLP scanners and they missed huge swaths of stuff. What was particularly vexing is that our in-house apps were generating some of this stuff, but it was in our own format. It was pure ASCII text, but the actual formatting of the data was making it invisible to the DLP tools. It was structured, but not in a way that any other tool could deal with. Most of the tools didn't offer much flexibility in terms of configuration. Those that did were limited to single-pass regex.

Our second problem was that we also wanted a way to cleanly scrub the data we found. Not delete it, not encrypt it, but excise it like a tumor, with the precision of a surgeon. We were tearing through log files and test data load files used by developers. Some of these files came directly from customers who did not know to scrub out their own PII. We had the blessing of management to clip the Personal out of PII and anonymize it in place. No tool on the market did that.

Luckily we knew what we were looking for, how it was structured, and what we wanted to do with it. That allowed us to do contextual analysis... when you see these indicators, look here for these kinds of files. Using Python and some hints from OpenDLP (one of the tools we had looked at), plus a little Luhn testing, we did a first pass.
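To be clear, what follows isn't our code, just a minimal sketch of what a first pass like that looks like: a deliberately loose regex to pull out candidate card numbers, plus a Luhn check to throw away digit strings that can't be card numbers.

```python
import re

# loose candidate pattern: 13-16 digits, optionally separated by spaces or dashes
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def luhn_ok(number: str) -> bool:
    """Luhn checksum: weeds out digit strings that can't be card numbers."""
    digits = [int(d) for d in re.sub(r"\D", "", number)][::-1]
    total = 0
    for i, d in enumerate(digits):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def first_pass(path):
    """Yield (lineno, line, candidate) for anything that looks like a card number."""
    with open(path, errors="ignore") as f:
        for lineno, line in enumerate(f, 1):
            for match in CARD_RE.finditer(line):
                if luhn_ok(match.group()):
                    yield lineno, line.rstrip(), match.group()
```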

We got a ton of stuff back. Almost none of it good. This was not unexpected, as this was our experience with a lot of the DLP tools.

So we then started a second pass of contextual and content analysis. We dove in, looked at these false positives, and found what made them false. This second-pass scan would weed out those cases with pattern matching and algorithms. We lathered, rinsed, and repeated with bigger and bigger data sets until we were hitting exactly what we wanted with no false positives.
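Again, a hypothetical sketch rather than our actual rules (those were specific to our in-house formats), but the flavor of a second pass is roughly this:

```python
import re

# Assumed placeholder values and context rules, purely for illustration;
# the real filters come from studying your own false positives.
OBVIOUS_JUNK = {"0000000000000000", "1234567890123456"}

def looks_like_false_positive(candidate: str, line: str) -> bool:
    digits = re.sub(r"\D", "", candidate)
    if digits in OBVIOUS_JUNK or len(set(digits)) == 1:  # one repeated digit
        return True
    # context check: order and ticket numbers often carry telltale labels nearby
    if re.search(r"(order|ticket|invoice)\s*(id|#|no)", line, re.IGNORECASE):
        return True
    return False

def second_pass(hits):
    """Keep only (lineno, line, candidate) hits that survive the context filters."""
    return [(n, l, c) for (n, l, c) in hits if not looks_like_false_positive(c, l)]
```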

Next we added a scrub routine that replaced the exact piece of PII in a file with a unique nonsense data element. For example, some of these files were being used as test loads by developers. If we just turned all credit card numbers into 9s, their code would fail. They also needed unique values for data analysis: if you turn a table of SSNs into every single entry being 99999, the test will fail. So we selectively changed digits but maintained uniqueness. I can't get into too much detail without giving away proprietary code, but you can read all about it here.
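Without getting into the proprietary bits, the general idea looks something like this sketch: hand out a unique, format-preserving stand-in for each distinct value, and log the mapping so the bell can be un-rung later.

```python
import re

replacements = {}  # original value -> stand-in, so repeated values stay consistent

def scrub(value: str) -> str:
    """Replace a PII value with a unique, same-shaped nonsense value."""
    if value not in replacements:
        serial = str(len(replacements) + 1)
        digits = re.sub(r"\D", "", value)
        fake = ("9" * len(digits))[:-len(serial)] + serial  # unique, same digit count
        it = iter(fake)
        # drop the fake digits back into the original punctuation pattern (dashes, spaces)
        replacements[value] = re.sub(r"\d", lambda m: next(it), value)
    return replacements[value]

def scrub_and_log(value: str, log) -> str:
    fake = scrub(value)
    log.write(f"{value}\t{fake}\n")  # the log itself now holds PII: protect it
    return fake
```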

We also kept a detailed log of what was changed to what, so that we could un-ring that bell if it ever misfired.  And of course, we protected those log files since they now have confidential data elements in them.

What we ended up with was a single script that, given a file path, would just go to town on the files it found. No agents, no back-end databases, no configuration. Just point and shoot.
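The driver for something like that can be as small as this (again a hypothetical stand-in, with scan_file standing in for the passes sketched above):

```python
import argparse, os, sys

def scan_file(path: str) -> int:
    """Placeholder for the first-pass / second-pass / scrub logic sketched earlier."""
    return 0  # number of confirmed hits

def main() -> None:
    parser = argparse.ArgumentParser(description="Walk a path and scan every file for PII")
    parser.add_argument("path", help="directory or file to scan")
    args = parser.parse_args()

    targets = [args.path] if os.path.isfile(args.path) else [
        os.path.join(root, name)
        for root, _dirs, names in os.walk(args.path) for name in names
    ]
    total = sum(scan_file(t) for t in targets)
    print(f"{total} confirmed hits across {len(targets)} files", file=sys.stderr)

if __name__ == "__main__":
    main()
```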

The beauty is that we knew what we were willing to trade off: speed for precision. Our goal was the reduction of manual labor and better assurance. Our code was clunky, ran in a slow interpreted language, and took hours to complete. But it was also easy to modify, easy to pass around to team members, and the logic was very clear. Adopting the "release early and often" approach, we had something usable within weeks that proved more functional than the products on the market.

The tool proved to be laser-precise in hunting down the unique PII data records in our environment, preventing costly and embarrassing data leaks.  After showing it around, we were given precious developer resources to clean up our code, add functionality, and fix a few little bugs.  It's been so successful as an in-house tool that our management will soon be releasing it as a software utility to go along with our product.