Assuming the breach: Data Driven Security, Part 3

Wednesday, May 20, 2015

Data Driven Security, Part 3 - Finally some answers!

I am continuing my exploration inspired by Data Driven Security.

In Part One, I imported some data on SSH attacks from Outlook using AWK to get it into R.

In Part Two, I converted some basic numbers into graphs, which helped visualize some strange items. The final graph was most interesting:

It was strange that there were twice as many attacks against the Dev SSH service as against the DR service. What is going on here?

Well, we've got over 37,000 entries in here over a couple of years. Let's break them out and get monthly totals based on target. First, I'm going to convert those entries into numbers I can add up.
Remember, the data looks like this:

Each entry has a date and the target location that was hit. So I'm going to use an ifelse to add a new column with a "1" in it for every matching target.
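A minimal sketch of this step, for anyone following along. The data frame name `ssh`, the column names `date` and `target`, and the sample rows are assumptions for illustration, not the actual data set:

```r
# Hypothetical sample standing in for the real alert data
ssh <- data.frame(
  date   = as.Date(c("2014-01-15", "2014-07-02", "2014-07-20", "2015-02-11")),
  target = c("Dev", "DR", "Dev", "Tst"),
  stringsAsFactors = FALSE
)

# One indicator column per target: 1 when the row hit that target, 0 otherwise
ssh$dev <- ifelse(ssh$target == "Dev", 1, 0)
ssh$dr  <- ifelse(ssh$target == "DR", 1, 0)
ssh$tst <- ifelse(ssh$target == "Tst", 1, 0)

# Summing an indicator column gives the alert count for that target
colSums(ssh[, c("dev", "dr", "tst")])
```

Because each column holds only 0s and 1s, any later sum over a group of rows is just a count of alerts for that target.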

So now I have this:

Now I add another vector to the data frame, breaking the dates down by month.
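One way this could be done, sketched under the same assumptions as before (a hypothetical data frame `ssh` with `date` and `target` columns), is to format each date as a year-month string and then sum the indicator column per month with `aggregate`:

```r
# Hypothetical sample standing in for the real alert data
ssh <- data.frame(
  date   = as.Date(c("2014-01-15", "2014-07-02", "2014-07-20", "2015-02-11")),
  target = c("Dev", "DR", "Dev", "Tst"),
  stringsAsFactors = FALSE
)
ssh$dev <- ifelse(ssh$target == "Dev", 1, 0)

# Year-month label such as "2014-07"; it sorts chronologically as text
ssh$month <- format(ssh$date, "%Y-%m")

# Monthly total of Dev alerts (months with no Dev alerts show up as 0)
monthly_dev <- aggregate(dev ~ month, data = ssh, FUN = sum)
monthly_dev
```

Keeping the year in the label matters here, since the data spans more than one year and a bare month number would merge January 2014 with January 2015.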

There might be an easier way to do this, but I'm still an R noob and this seemed the easiest way forward for me. Anyway, I can plot these and compare.

You can do this for all four and compare:
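A rough sketch of plotting all four side by side. The names here are assumptions: `ssh`, `date`, and `target` are hypothetical, and "Prod" is only a placeholder for the fourth target. Instead of one indicator column per target, this version lets `table` build the month-by-target counts in one go:

```r
# Hypothetical sample; "Prod" is a placeholder name for the fourth target
ssh <- data.frame(
  date   = as.Date(c("2014-01-15", "2014-07-02", "2014-07-20",
                     "2015-02-11", "2015-02-20")),
  target = c("Dev", "DR", "Dev", "Tst", "Prod"),
  stringsAsFactors = FALSE
)
ssh$month <- format(ssh$date, "%Y-%m")

# Rows are months, columns are targets, cells are alert counts
counts <- table(ssh$month, ssh$target)

# One bar chart per target in a 2x2 grid, sharing the same month axis
op <- par(mfrow = c(2, 2))
for (tgt in colnames(counts)) {
  barplot(counts[, tgt], main = tgt, xlab = "Month", ylab = "Alerts", las = 2)
}
par(op)
```

One caveat with this approach: a month in which no target received any alert will not appear as a row at all, so gaps in the month axis are worth checking for.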

This gives me two insights. First, different SSH systems began receiving alerts at different times. There are twice as many Dev alerts as DR alerts simply because the DR system didn't generate alarms until the middle of 2014. The same goes for the Tst SSH system. So there is a long tail of Dev alarms skewing the data. In fact, the Dev system was the first SSH system to go live. Pretty much any system on the Internet is going to receive some alarms, so a zero in the graph means that service was not live yet.
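The go-live point can also be checked directly, rather than read off the graph, by finding the earliest alert per target. This is a sketch on hypothetical sample data (the `ssh` data frame and its column names are assumptions):

```r
# Hypothetical sample standing in for the real alert data
ssh <- data.frame(
  date   = as.Date(c("2014-01-15", "2014-07-02", "2014-07-20", "2015-02-11")),
  target = c("Dev", "DR", "Dev", "Tst"),
  stringsAsFactors = FALSE
)

# Split the dates by target, then take the earliest date in each group.
# format() turns each Date into text so sapply() returns readable strings.
first_seen <- sapply(split(ssh$date, ssh$target), function(d) format(min(d)))
first_seen
```

A target whose first alert is much later than the others most likely went live later, which is the same conclusion the zeros in the graph point to.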

Second, we can confirm and graphically show what IT originally raised back in Part 1. To refresh your memory, Part 1 said: "The IT Operations team complained about the rampup of SSH attacks recently."

Here we can visually see that ramp up, beginning at the end of 2014 and spiking sometime in the first quarter.

The next step in our analysis would be to see who was in the spike: are these new attackers? Where are they from? The traffic to the targets seems to peak at different times, so there might be something worth investigating there.
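As a starting point for the "new attackers" question, here is a hedged sketch. It assumes the data also carries a source address column (called `src_ip` here, a hypothetical name) and picks an arbitrary cutoff date for the start of the spike; both are assumptions to adjust against the real data:

```r
# Hypothetical sample; addresses come from the documentation-only IP ranges
ssh <- data.frame(
  date   = as.Date(c("2014-03-10", "2014-09-05", "2015-01-20",
                     "2015-02-14", "2015-02-15")),
  src_ip = c("192.0.2.10", "192.0.2.11", "192.0.2.10",
             "198.51.100.7", "198.51.100.8"),
  stringsAsFactors = FALSE
)

# Assumed start of the ramp-up; tune this to where the graph turns upward
spike_start <- as.Date("2014-12-01")

# Distinct source addresses seen before and during the spike
before <- unique(ssh$src_ip[ssh$date < spike_start])
during <- unique(ssh$src_ip[ssh$date >= spike_start])

# Addresses that only show up once the spike has started
new_attackers <- setdiff(during, before)
new_attackers
```

If most of the spike comes from addresses in `new_attackers`, that points to a new campaign rather than existing attackers turning up the volume.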

And why did the traffic die down? Were the IPs associated with a major botnet that got taken down sometime in April 2015? A quick Googling says yes: "A criminal group whose actions have at times been responsible for one-third of the Internet’s SSH traffic—most of it in the form of SSH brute force attacks—has been cut off from a portion of the Internet."

I hope this series was as informative to you as it was to me. I was pleasantly surprised to find some tangible answers on current threats from simply graphing our intrusion data. Now it's your turn!
