Monday, October 1, 2012

dbmgr reloaded

This has been ported over to my GitHub site and is no longer being maintained here. For any issues, comments or updates head here.


I recently had a discussion with a coworker about scenarios where you can try to determine whether something malicious is or was on a system based on mutexes.  For those unfamiliar with what a mutex/mutant is, a definition:

"Stands for Mutual Exclusion Object, a programming object that may be created by malware to signify that it is currently running in the computer. This can be used as an infection 'marker' in order to prevent multiple instances of the malware from running in the infected computer, thus possibly arousing suspicion."

Mutexes are referred to as mutants when they're in the Windows kernel, but for the purpose of this post I'm going to only refer to mutexes even when mutant might be the correct technical term (deal with it).  So in theory, and in practice, by enumerating the mutexes on a system and comparing them to a list of mutexes known to be used by malware, you would have good reason to believe something malicious is/was on the system - or at least a starting point of something to dig into if you're in the 'needle in a haystack' situation.  During our conversation I remembered a script from the Malware Analyst's Cookbook which scraped ThreatExpert reports and populated a DB (Note : This script requires the 'avsubmit.py' file from the MACB as well, since it takes the ThreatExpert class from it).  After taking another look at the script, I figured it would be less time-consuming to modify it to fit my needs than to start from scratch.  This idea can be implemented across other online sandboxes as well, but in this instance I'm just going to touch on ThreatExpert.
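As a rough sketch of that comparison (this is not code from dbmgr.py; the function name and the sample mutex names are made up for illustration), the check boils down to a set lookup.  Names are matched exactly here, since Windows compares mutex names case-sensitively:

```python
def find_known_bad(observed, known_bad):
    """Return the observed mutex names that also appear in the known-bad list."""
    # Exact match only: Windows treats mutex names as case sensitive.
    known = {name.strip() for name in known_bad}
    return sorted({name.strip() for name in observed} & known)


# Hypothetical sample data, for illustration only.
observed_mutexes = ["ExampleAppMutex", "EXAMPLE_BAD_MUTEX_1", "AnotherBenignMutex"]
known_bad_mutexes = ["EXAMPLE_BAD_MUTEX_1", "EXAMPLE_BAD_MUTEX_2"]

hits = find_known_bad(observed_mutexes, known_bad_mutexes)
print(hits)
```

A hit is only a lead to dig into, not proof, since some malware reuses common or generic mutex names.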

I grabbed the latest copy of the 'dbmgr.py' script, but when I went to verify it was functioning properly before making any modifications, I ran into a tiny hiccup.  A simple grammatical error in this version of the script caused processing to halt before it could complete ... I submitted a quick bugfix and within ~2 mins MHL acknowledged the issue, commented and fixed it.  I know it was a small fix but man, what service!

Now that there was a working copy up I took a look at the params/args which ThreatExpert made available and noticed I could use the 'find' parameter in addition to the 'page' parameter (which the script already included) and supply it with whatever I wanted to search for within the archived reports.

The addition of 'sl=1' is credited to another post MHL pointed out a little while ago, where another user noted this would filter ThreatExpert's results to only show 'known bad' ... after all, for the purposes most of us will be using this for, we don't really want to have 'good' results.  When you query ThreatExpert you receive ~20 results per page and ~200 pages max from what I've seen.  The other post mentioned above included a quick external bash script to loop the dbmgr.py script and supply it with a new value to grab different pages for bulk results.  To make things easier, I added another def to the script so you have the ability to loop through multiple result pages.  I also put in a simple check to stop processing results if there are no more left (i.e. - if you tell it to search 5 pages but only 3 are returned, instead of trying to process the last two it checks for the 'No further results to process' text which ThreatExpert produces and exits).
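To make the paging logic concrete, here is a minimal sketch of it in Python 3.  It is not the patched dbmgr.py: the base URL, the function names, the assumption that pages start at 1, and the stand-in fetcher are all mine, and only the 'find', 'page' and 'sl' parameters and the end-of-results text come from the description above.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Assumed base URL for the report listing; adjust if it differs.
BASE_URL = "http://www.threatexpert.com/reports.aspx"
END_MARKER = "No further results to process"


def build_query_url(term, page, known_bad_only=True):
    """Build a listing URL from the 'find', 'page' and 'sl' parameters."""
    params = {"find": term, "page": page}
    if known_bad_only:
        params["sl"] = 1  # restrict results to 'known bad'
    return BASE_URL + "?" + urlencode(params)


def iter_result_pages(term, max_pages, fetch):
    """Yield (page_number, html) until max_pages or the end marker is hit.

    'fetch' is any callable that takes a URL and returns the page text.
    """
    for page in range(1, max_pages + 1):
        html = fetch(build_query_url(term, page))
        if END_MARKER in html:
            break  # nothing left, so skip the remaining pages
        yield page, html


def fake_fetch(url):
    # Stand-in for a real HTTP request: pretends only pages 1-3 have results.
    page = int(parse_qs(urlparse(url).query)["page"][0])
    return "<html>results</html>" if page <= 3 else END_MARKER


pages = [number for number, _ in iter_result_pages("mutex", 5, fake_fetch)]
print(pages)
```

Passing the fetcher in as an argument keeps the loop testable offline; in real use it would be swapped for whatever function downloads the page.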



Example search terms which might be of interest:
  • mutex
    • would produce results which have a greater chance of containing mutexes, since that's a required word within the report based on what we're querying.
  • exploit.java || exploit.swf
    • either of these would produce results which involve either 'exploit.java' or 'exploit.swf' in their A/V name
  • wpbt0.dll
    • could be used to look at reports involving a commonly associated BHEK file


There were also a few other cosmetic changes that you'll notice in the patch, but those are mainly to display things the way I wanted to see them.  I also came across an instance where some funky encoding on a file name it was trying to insert caused the insert to fail, so I added a little sanity check there as well.
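The check in the patch itself isn't reproduced here, but one way to guard against that kind of failure looks something like the sketch below.  It assumes Python 3 and a hypothetical helper name, 'safe_text'; undecodable bytes get swapped for the Unicode replacement character rather than raising an exception.

```python
def safe_text(raw):
    """Return a str that can be inserted, whatever bytes the report contained."""
    if isinstance(raw, bytes):
        # Undecodable bytes become U+FFFD instead of raising UnicodeDecodeError.
        return raw.decode("utf-8", errors="replace")
    return str(raw)


# Hypothetical file name containing a byte that is not valid UTF-8.
print(safe_text(b"evil\xff.exe"))
```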



So what's the point of this all and why do you care?  One of the reasons which I mentioned above was to populate a DB with known malicious mutexes (without wasting time grabbing a bunch of other reports that aren't relevant to your needs).  



This becomes even more handy when you're analyzing a memory image and want to do a cross-reference with volatility's 'mutantscan' command.  In fact, if you read the blurb under that command's reference you'll notice the volatility folks actually mentioned a similar PoC they tested, so it's good to see others thinking the same way.  Another use would be to populate a DB and start putting together some stats on which registry keys and values are commonly associated with malware, common file names, common file locations targeted, IP addresses contacted by the malware, etc.  There's a wealth of data mining that can be done, and the great thing is (1) it can be automated and (2) you don't have to have the samples or waste the time processing them in your own sandbox, as you can just leverage this free resource.
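Here's a small sketch of both ideas using sqlite3.  The table layout, the function names and the sample rows are hypothetical (the real DB that dbmgr.py builds may look different), and the mutex names pulled from the memory image are assumed to already be in a plain Python list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per (report, mutex name) pair.
conn.execute("CREATE TABLE mutexes (report_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO mutexes VALUES (?, ?)",
    [(1, "EXAMPLE_BAD_MUTEX_1"), (2, "EXAMPLE_BAD_MUTEX_1"), (3, "EXAMPLE_BAD_MUTEX_2")],
)


def lookup_mutexes(conn, names):
    """Map each mutex name from the memory image to the number of reports containing it."""
    hits = {}
    for name in names:
        (count,) = conn.execute(
            "SELECT COUNT(DISTINCT report_id) FROM mutexes WHERE name = ?", (name,)
        ).fetchone()
        if count:
            hits[name] = count
    return hits


def most_common(conn, limit=10):
    """Return the mutex names seen in the most reports, most frequent first."""
    rows = conn.execute(
        "SELECT name, COUNT(*) AS n FROM mutexes "
        "GROUP BY name ORDER BY n DESC, name LIMIT ?",
        (limit,),
    )
    return rows.fetchall()


# Hypothetical names taken from a memory image.
from_memory_image = ["EXAMPLE_BAD_MUTEX_1", "SomeBenignMutex"]
print(lookup_mutexes(conn, from_memory_image))
print(most_common(conn))
```

The same GROUP BY pattern would apply to registry keys, file names or IP addresses if those were stored in similar tables.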

If you want to play around with the patch I put out, head over to my GitHub and follow the instructions for patching the original version.

:: Note - during recent testing I noticed I wasn't getting results but I believe this might be due to something on ThreatExpert's side, or I'm just being throttled... either way, it works but just be aware in case you aren't getting results every time (even with the original script) ::