Let's start with some basic search methodology. The idea here is to "read" through a log file and search it for specific terms. You can use grep by itself, or sed, awk, gawk, or a dozen other commands. If you use a Linux workstation or the Windows ports of the Linux utilities, it will look something like this:
grep -i "keyword" -r *
If the output doesn't look the way you want it to or you are having trouble targeting specific files with grep alone, you can refine somewhat by stacking commands like so:
strings *.log | grep -i "keyword"
I guess the big secret here is the keywords. They will vary slightly from case to case, but generally speaking, SQL injection can be identified by searching for union select, xp_cmdshell, and concat, and also by looking for specific database table names in the logs. The last of these is especially useful if you know what type of data is at risk and where it resides. One of my favorite PCI-related searches is to look for "cvv" or "cc_number" in the logs. If you are concerned about data being snatched from a particular database, grab the table names and run a search. It's very common to see fields like "First_Name", "Last_Name", and "Address".
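The keyword lists above can be rolled into a single pass. A minimal sketch follows; the log entry and column names are made up for illustration, so swap in your own keywords and file paths:

```shell
# Hypothetical log entry to demonstrate the search (file name and fields are made up).
printf 'GET /page.asp?id=1+UNION+SELECT+cc_number+FROM+cards HTTP/1.1\n' > sample.log

# Case-insensitive search for common SQLi markers and sensitive column names.
# The "." between union and select matches the space/plus encoding in URLs.
grep -Ei 'union.select|xp_cmdshell|concat|cvv|cc_number' sample.log
```

Stacking the patterns into one alternation means a single pass over the logs instead of one grep per keyword.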
Directory Traversal is also fairly easy to find:
grep -rF '..\..\' * (flip those slashes to ../../ for an Apache log; -F treats the dots and slashes as a fixed string rather than a regex)
There's a fairly common attack against older versions of ColdFusion that involves a directory traversal attack to gain the admin password hash. Once you have the hash, you can drop it into the URL for the admin console and gain access without actually "knowing" the password.
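That ColdFusion traversal (CVE-2010-2861) abuses the locale parameter of the admin console to read password.properties, so the hash grab shows up plainly in the logs. A minimal sketch, with a hypothetical log line for illustration:

```shell
# Hypothetical log entry showing the ColdFusion locale traversal; real entries
# will vary in depth of ../ and in URL encoding.
printf 'GET /CFIDE/administrator/enter.cfm?locale=../../../../lib/password.properties%%00en HTTP/1.1\n' > cf.log

# Look for the admin console being fed a traversal path to the password file.
grep -i 'password\.properties' cf.log
```

Searching for the target file name rather than the traversal dots also catches encoded variants the slash-pattern search would miss.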
RFI can be a little trickier. You are looking for a URL inside a log full of URLs. The cool thing here is that you can do inverse searches and omit certain strings. For example, if you are searching www.website.com for RFI, you can perform a regex search for URLs and filter out www.website.com, so the only hits you see point at some other host.
egrep 'https?://|ftp://' ex6102012.log | grep -v "www.website.com"
Now that you know how to run the searches, how do you tell automation apart from human activity?
The answer is "timing". Human beings can only type so fast. If you are seeing large SQLi strings once every second or two, you are looking at the residue of an automated tool like sqlmap, Havij, OWASP ZAP, or any of a dozen others. Many of these tools leave enticing little clues by default: Havij stamps itself in the User-Agent field, and Acunetix does the same thing.
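Those default User-Agent stamps are the easiest win, so it's worth grepping for tool names outright. A sketch with hypothetical log entries; real tool strings vary by version, so treat this list as a starting point:

```shell
# Hypothetical entries; the exact User-Agent text differs between tool versions.
printf 'GET /index.php?id=1 HTTP/1.1" 200 "Havij\n' > ua.log
printf 'GET /index.php?id=2 HTTP/1.1" 200 "sqlmap/1.0\n' >> ua.log

# Flag requests whose User-Agent field names a known attack tool.
grep -Ei 'havij|sqlmap|acunetix' ua.log
```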
These tools also exhibit a "patterning" behavior. Regular web traffic is haphazard: the GET and POST requests vary in length, and the page being requested changes from one entry to the next. Scanning tools don't do that; they hit the same file (index.php, startpage.cfm) over and over and over again, adding a little piece at a time. This produces a visual pattern in the logs that even the most novice investigator can pick out. I zoomed wayyyy out for the screenshot below, but it illustrates my point. Sorry about the sizing, blogger didn't cooperate very well with this image.
[Screenshot: This is Havij at work. Nice "shark teeth", huh?]
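The timing side of this can be checked without eyeballing a screenshot. The sketch below assumes an Apache-style log where the client IP is field 1 and the timestamp (to the second) is field 4; the sample entries and IP are made up:

```shell
# Hypothetical Apache-style entries: same client, same file, several hits per second.
cat > timing.log <<'EOF'
10.0.0.5 - - [06/Jan/2013:10:00:01 -0500] "GET /index.php?id=1 HTTP/1.1" 200 512
10.0.0.5 - - [06/Jan/2013:10:00:01 -0500] "GET /index.php?id=1+AND+1=1 HTTP/1.1" 200 512
10.0.0.5 - - [06/Jan/2013:10:00:02 -0500] "GET /index.php?id=1+AND+1=2 HTTP/1.1" 200 512
EOF

# Count requests per client IP per second; humans rarely sustain several per second,
# so the big counts at the top of the list are your automation candidates.
awk '{ print $1, $4 }' timing.log | sort | uniq -c | sort -rn
```

If your logs use a different format (IIS W3C, for example), adjust the field numbers accordingly.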
Chris talks about this kind of pattern recognition here on his blog.
Next post "The End Game" will follow. I want to talk a little bit about what attackers are doing once they find one of these vulnerabilities and actually break into these sites.
Happy New Year!