Saturday, February 2, 2013

New Year, New Look, New Post: How did they find me? Part 2.

Last post we went through some of the free utilities available to attackers for reconnaissance purposes.  The utilities I talked about in that post are all things that I have seen used over and over again in successful attacks. What I did not touch on was what these attacks look like in Apache and IIS log-files.

Let's start with some basic search methodology. The idea here is to "read" through a log-file and search it for specific terms. You can use grep by itself, or sed, awk, gawk, or a dozen other commands. If you use a Linux workstation or the Windows ports of Linux utilities, it will look something like this:

grep -i "keyword" -r *

If the output doesn't look the way you want it to or you are having trouble targeting specific files with grep alone, you can refine somewhat by stacking commands like so:

strings -s *.log | grep -i "keyword"

I guess the big secret here is the keywords. They will vary slightly from case to case but, generally speaking, SQL injection can be identified by searching for "union select", "xp_cmdshell", and "concat", and also by looking for specific database table names in the logs. The last of these is especially effective if you know what type of data is at risk and where it resides. One of my favorite PCI-related searches is to look for "cvv" or "cc_number" in the logs. If you are concerned about data being snatched from a particular database, grab the table names and run a search. It's very common to see fields like "First_Name, Last_Name, Address".
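To tie those keywords together, here's a minimal sketch of a one-pass sweep. The log name and its contents are fabricated for illustration; swap in your own log files and table/field names.

```shell
# Fabricated access-log lines, just to demonstrate the sweep
cat > sample_access.log <<'EOF'
GET /index.php?id=5 HTTP/1.1 200
GET /index.php?id=5+union+select+cc_number,cvv+from+payments HTTP/1.1 200
GET /products.cfm?cat=2 HTTP/1.1 200
EOF

# One pass over the log with the SQLi keywords from above
# (URL-encoded requests often use '+' for spaces, so match both)
grep -iE 'union(\+| )select|xp_cmdshell|concat|cvv|cc_number' sample_access.log
```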


If you find the name of your database and its corresponding tables in your access logs, that's bad. Really, really bad. Call a professional.

Directory Traversal is also fairly easy to find:

grep -r '\.\.\\\.\.' * (use '\.\./\.\.' to flip those slashes for an Apache log; the dots and backslashes need escaping so the shell and grep don't eat them)

There's a fairly common attack against older versions of ColdFusion that involves a directory traversal attack to gain the admin password hash. Once you have the hash, you can drop it into the URL for the admin console and gain access without actually "knowing" the password.

OUCH again!
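If you want to hunt for that specific ColdFusion attack, the file it goes after (password.properties) is a tighter search term than the dot-dot pattern alone. A small sketch, with a fabricated log line standing in for a real IIS log:

```shell
# Fabricated IIS-style log line showing the ColdFusion locale traversal
cat > sample_cf.log <<'EOF'
2013-01-15 10:22:01 GET /CFIDE/administrator/enter.cfm locale=..%2F..%2F..%2F..%2Flib%2Fpassword.properties%00en 200
EOF

# The target file name is a stronger indicator than '..' alone
grep -i 'password\.properties' sample_cf.log
```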

RFI can be a little trickier. You are looking for a URL inside a log full of URLs. The cool thing here is that you can do inverse searches and/or omit certain strings. For example, if you are searching for RFI you can perform a regex search for URLs and then omit your own domain, so you only see hits that point somewhere other than your own site.

egrep 'http|https|ftp' ex6102012.log | grep -v 'www.example.com' (substitute your own site's domain for www.example.com)
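Here's the same idea as a self-contained sketch. The log lines are fabricated, and 'www.example.com' is a placeholder for your own site; anchoring the pattern on '=' catches URLs stuffed into request parameters, which is where RFI lives.

```shell
# Fabricated log lines: one clean include, one RFI attempt, one self-reference
cat > sample_rfi.log <<'EOF'
GET /page.php?inc=about.html HTTP/1.1 200
GET /page.php?inc=http://203.0.113.9/shell.txt? HTTP/1.1 200
GET /redirect.php?url=http://www.example.com/home HTTP/1.1 200
EOF

# URLs embedded in request parameters, minus references to our own site
grep -E '=(https?|ftp)://' sample_rfi.log | grep -v 'www.example.com'
```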

Now that you know how to run the searches, how do you tell automation apart from human activity?

The answer is "timing". Human beings can only type so fast. If you are seeing large SQLi strings once every second or two, you are looking at the residue from an automated tool like sqlmap, Havij, OWASP ZAP or any other of a dozen different tools. Many of these tools leave enticing little clues by default: Havij stamps itself in the User-Agent field, and Acunetix does the same thing.
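A quick way to shake those clues out of a log is to grep the User-Agent field for known tool names. The log lines below are fabricated, and the tool list is just a starter set you'd want to grow over time.

```shell
# Fabricated Apache combined-format lines; the User-Agent is the last field
cat > sample_ua.log <<'EOF'
192.0.2.10 - - [02/Feb/2013:10:00:01] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 6.1)"
192.0.2.66 - - [02/Feb/2013:10:00:02] "GET /index.php?id=1 HTTP/1.1" 500 312 "-" "Havij"
EOF

# Scanner fingerprints left behind in the User-Agent field
grep -iE 'havij|sqlmap|acunetix|nikto' sample_ua.log
```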

These tools also exhibit a "patterning" behavior. Regular web traffic is haphazard: the GET and POST requests vary in length, and the page being requested changes from one entry to the next. Scanning tools don't do that; they hit the same file (index.php, startpage.cfm) over and over and over again, adding a little piece at a time. This results in a visual pattern in the logs that even the most novice investigator can pick out. I zoomed wayyyy out for the screenshot below, but it illustrates my point. Sorry about the sizing; Blogger didn't cooperate very well with this image.

This is Havij at work. Nice "shark teeth" huh?
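You can pull that same pattern out non-visually by counting hits per page; a scanner hammering one file will float straight to the top of the list. A sketch (the URI field position is an assumption about your log format, so adjust $2 as needed):

```shell
# Fabricated log: three probes against index.php, one normal request
cat > sample_pattern.log <<'EOF'
GET /index.php?id=1 200
GET /index.php?id=1%27 500
GET /index.php?id=1%27%20and%201=1 200
GET /about.html 200
EOF

# Strip the query string, then count requests per page, busiest first
awk '{ split($2, p, "?"); print p[1] }' sample_pattern.log | sort | uniq -c | sort -rn
```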

Chris talks about this kind of pattern recognition here on his blog.

The next post, "The End Game," will follow. I want to talk a little bit about what attackers are doing once they find one of these vulnerabilities and actually break into these sites.

Happy New Year!


  1. Grayson,

    It's very interesting to me that here it is, 2013, and folks are *still* getting hit with SQLi, where the commands are being sent into the system in ASCII text.

    Back when we were at IBM, @cpbeefcake and I authored a 6-pg white paper for the other team members, describing SQLi in enough detail that they could discuss it confidently with the customer. This was due, in part, to my going on-site on a Saturday morning, and the following day, "discussing" SQLi with a victim organization and trying to get them to understand that based on the Cisco logs (logging had been disabled on the web server) and the artifacts on the internal infrastructure, SQLi had indeed been used, and had been successful...and therefore, there was a database server somewhere that had a connection with the web server. This went on for about 2 1/2 hrs, until someone finally came to the realization of what was going on.

    Anyway, we saw a lot of character set encoding, so keyword searches for "union" and "xp_cmdshell" weren't working in some cases. What I ended up doing was writing a script that culled through the logs, looking for vulnerable pages (.asp, not .aspx) and then checking the length of the submitted request. If a "normal" submitted request came in at around 20 characters, and we started seeing requests of 60-80 characters, we knew we had something to look at. From there, we would get the page name, and then cull through the logs looking at the IP addresses submitting longer-than-normal requests to those pages. This proved to be much more effective than keyword searches, in part because some analysts would just run a keyword search and declare their work done.
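A rough sketch of that length check in awk; the field position ($2) and the 40-character cutoff are assumptions you'd tune to your own log format and baseline.

```shell
# Fabricated log: one ~20-char baseline request, one oversized SQLi request
cat > sample_len.log <<'EOF'
GET /login.asp?user=bob HTTP/1.1 200
GET /login.asp?user=bob'%20union%20select%20name,pass%20from%20users-- HTTP/1.1 500
EOF

# Flag any request whose URI is well past the ~20-char "normal" length
awk 'length($2) > 40 { print length($2), $0 }' sample_len.log
```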

  2. Thanks for the comment. I've also seen cases where the requests were hex and/or URL encoded. You're absolutely correct about the length of the requests being key in that type of case. We have a handy little ruby script that "demuckifies" the logs for us. Running it first allows the keyword searches to work as designed.

    Unfortunately, the breach landscape hasn't changed that much. SQLi is still very prevalent and sites are left unprotected.
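For readers without a decoder handy, even a quick-and-dirty sed pass over the most common URL escapes will let the keyword searches bite. This is just an illustration of the decode-then-search idea, not the ruby script mentioned above, and it only handles a few escape sequences.

```shell
# Fabricated log line with a URL-encoded SQLi request
cat > sample_enc.log <<'EOF'
GET /page.asp?id=1%20union%20select%20cvv%20from%20cards HTTP/1.1 500
EOF

# Decode %20 (space), %2F (slash), and %27 (quote), then search as usual
sed -e 's/%20/ /g' -e 's|%2[Ff]|/|g' -e "s/%27/'/g" sample_enc.log | grep -i 'union select'
```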