Obfuscation and (non-)detection of malicious PDF files
The idea is that some malformations in the documents, like those described by Julia Wolf, together with features of the PDF specification itself, can be used to hide files from Antivirus engines and parsers. Bad guys can use them to create an undetectable exploit and use it as an attack vector. Some of the techniques are the following:
- Using the /Names and /AcroForm elements of the Catalog object to execute code when the document is opened, instead of the /OpenAction element.
- If the malicious content is stored in a string object, hiding it with octal encoding.
- If the content is stored in a stream object, applying less common filters, like /JBIG2Decode or /DCTDecode, instead of the most used ones, like /FlateDecode and /ASCIIHexDecode. Avast researchers recently found that cybercriminals are already using this in the wild.
- In the case of /FlateDecode and /LZWDecode filters it’s possible to define some parameters in order to make the analysis more difficult.
- Omitting the endobj keyword at the end of the objects to fool the parsers.
- Putting null bytes in the header of the document.
- Compressing the malicious objects in so-called object streams to add another level of obfuscation.
- Encrypting the document with the “default password”.
- Embedding the malicious file in a legitimate one. It’s possible to open the malicious file automatically when the legitimate document is opened.
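As a rough illustration of how an analysis tool might account for three of these tricks (octal-encoded strings, uncommon filters and missing endobj keywords), here is a minimal Python sketch. It is not how peepdf or any Antivirus engine works: the function names and the COMMON_FILTERS set are made up for this example, and the regular expressions are naive heuristics that ignore other string escapes, hex-encoded names and matches that happen to fall inside binary stream data.

```python
import re

# Assumption for this sketch: the two filters the post calls "the most used".
COMMON_FILTERS = {b"FlateDecode", b"ASCIIHexDecode"}


def decode_octal_escapes(data: bytes) -> bytes:
    """Replace PDF octal escapes (backslash plus 1-3 octal digits) with the byte they encode."""
    return re.sub(
        rb"\\([0-7]{1,3})",
        lambda m: bytes([int(m.group(1), 8) & 0xFF]),  # keep the low 8 bits only
        data,
    )


def uncommon_filters(data: bytes) -> set:
    """Return filter names ending in 'Decode' that are not in COMMON_FILTERS."""
    found = set(re.findall(rb"/([A-Za-z0-9]+Decode)\b", data))
    return found - COMMON_FILTERS


def unterminated_objects(data: bytes) -> int:
    """Return how many 'N G obj' openings lack a matching 'endobj' keyword."""
    opened = len(re.findall(rb"\b\d+\s+\d+\s+obj\b", data))
    closed = len(re.findall(rb"\bendobj\b", data))
    return opened - closed


if __name__ == "__main__":
    # Hand-written sample, not taken from a real malicious file.
    sample = (
        b"%PDF-1.4\n"
        b"1 0 obj\n(\\141\\154\\145\\162\\164)\n"      # object without endobj
        b"2 0 obj\n<< /Filter /JBIG2Decode >>\nendobj\n"
    )
    print(decode_octal_escapes(sample))
    print(uncommon_filters(sample))
    print(unterminated_objects(sample))
```

A non-empty result from uncommon_filters or a positive count from unterminated_objects does not prove that a file is malicious; it only marks the document as worth a closer look.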
In the demo I performed last Friday I modified a detected malicious PDF file (34/43) to decrease its detection rate, and after the modifications it was detected by only one Antivirus engine. The results of the tests I performed in February were even worse: the file was totally undetected. Although bad guys are not using all these techniques yet, they will, so it’s important to take them into account when developing analysis tools and Antivirus products.
The tool I’ve developed for the analysis of malicious PDF files, peepdf, was released last Friday too. It supports most of the techniques described above, so it’s a good option when a PDF file must be analysed. It’s the first version of the tool, so all comments and bug reports are welcome! 😉