
IS 873: Malware Analysis and Reverse Engineering

Basic Static Analysis


Overview
• AV scanning
• Hashing
• Finding strings
• Packed and Obfuscated malware
• Linked libraries and functions
• Static analysis in practice
• PE file format
Anti-virus Scanning
• A useful first step: run the sample through multiple AV programs
• AVs might already have identified the malware
• AVs are certainly not perfect
– They rely mainly on a database of identifiable pieces of known
suspicious code (file signatures), as well as behavioral and
pattern-matching analysis (heuristics), to identify suspect files
– Malware writers can easily modify their code thereby changing
their program’s signature and evading virus scanners.
– Also, rare malware often goes undetected by antivirus software
because it’s simply not in the database.
– Finally, heuristics, while often successful in identifying unknown
malicious code, can be bypassed by new and unique malware.
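The brittleness of signature matching can be sketched with a toy scanner. This is only an illustration under stated assumptions: the signature names and byte patterns are made up, and real AV engines use far richer signatures and heuristics.

```python
from typing import List

# Hypothetical signature database: names and byte patterns are invented
# for this example and do not come from any real AV product.
SIGNATURES = {
    "Toy.Dropper.A": b"\x90\x90\xeb\xfe",
    "Toy.Mailer.B": b"Send Mail failed",
}

def scan(data: bytes) -> List[str]:
    """Return the names of all signatures whose pattern occurs in data."""
    return [name for name, pattern in SIGNATURES.items() if pattern in data]

original = b"MZ...header..." + b"\x90\x90\xeb\xfe" + b"...payload..."
# The "author" changes a single byte inside the matched region.
variant = original.replace(b"\x90\x90\xeb\xfe", b"\x90\x91\xeb\xfe")

print(scan(original))  # ['Toy.Dropper.A']
print(scan(variant))   # []
```

A one-byte change is enough to evade this kind of exact-pattern check, which is why running several engines with different signatures and heuristics is worthwhile.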
Anti-virus Scanning
• Because the various antivirus programs use different
signatures and heuristics, it’s useful to run several
different antivirus programs against the same piece of
suspected malware.
• Websites such as virustotal.com allow you to upload a
file for scanning by multiple antivirus engines.
• VirusTotal generates a report that provides the total
number of engines that marked the file as malicious, the
malware name, and, if available, additional information
about the malware.
Hashing
• Hashing is a common method used to uniquely identify
malware; the hash serves as a fingerprint
• The file is run through a hashing program that produces
a hash which, in practice, uniquely identifies it
• The MD5 hash function is the one most commonly used
for malware analysis, though the Secure Hash Algorithm
1 (SHA-1) is also popular.
• For example, the freely available WinMD5 program can
calculate the hash of the Notepad program that comes
with Windows.
Hashing
• Once you have a unique hash for a piece of malware,
you can use it as follows:
– Search for that hash online to see if the file has already been
identified.
– Share that hash with other analysts to help them to identify
malware.
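As a rough equivalent of what a tool like WinMD5 does, the sketch below computes the MD5 and SHA-1 hashes of a file with Python's standard `hashlib` module. The function name `file_hashes` and the chunk size are choices made for this example, not part of any tool mentioned above.

```python
import hashlib

def file_hashes(path, chunk_size=65536):
    """Return the MD5 and SHA-1 hex digests of the file at path."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        # Read in chunks so large samples do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return {"md5": md5.hexdigest(), "sha1": sha1.hexdigest()}
```

Calling `file_hashes` on a sample returns hex strings that can be searched for online or shared with other analysts.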
Finding Strings

• A string is a sequence of characters such as “MyFilename”
• A program contains strings if it prints a message, connects to a
URL, or copies a file to a specific location.
• Searching through the strings can be a simple way to get hints
about the functionality of a program.
• For example, if the program accesses a URL, then you will see the
URL accessed stored as a string in the program.
• You can use the Strings program (http://bit.ly/ic4plL) to search
an executable for strings, which are typically stored in either ASCII
or Unicode format.
Finding Strings
• WannaCry ransomware appeared in May 2017
• Its early version was neutralized using a “kill switch”: a domain
name stored as a string in the binary
– http://www.iuqerfsodp9ifjaposdfjhgosurijfaewrwergwea.com/
Finding Strings
• Both ASCII and Unicode formats store characters in sequences
that end with a NULL terminator to indicate that the string is
complete.
• ASCII strings use 1 byte per character, and Unicode strings
(UTF-16, as used by Windows) use 2 bytes per character.
• For example, the ASCII string “BAD” is stored as the bytes 0x42,
0x41, 0x44, and 0x00, where 0x42 is the ASCII representation of a
capital letter B, 0x41 represents the letter A, and so on.
• The 0x00 at the end is the NULL terminator.
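The two storage layouts can be reproduced in a few lines. This sketch assumes that “Unicode” here means little-endian UTF-16, which is how Windows stores wide strings; the NULL terminators are appended by hand because Python strings do not carry them.

```python
text = "BAD"

# ASCII: one byte per character plus a one-byte NULL terminator.
ascii_bytes = text.encode("ascii") + b"\x00"

# UTF-16LE: two bytes per character plus a two-byte NULL terminator.
utf16_bytes = text.encode("utf-16-le") + b"\x00\x00"

print(ascii_bytes.hex(" "))  # 42 41 44 00
print(utf16_bytes.hex(" "))  # 42 00 41 00 44 00 00 00
```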
Finding Strings
• Stored as Unicode, the string “BAD” becomes the bytes 0x42, 0x00,
0x41, and so on.
• Strings searches for a sequence of three or more ASCII or
Unicode characters, followed by a string termination character.
• The Strings program ignores context and formatting, so it can
analyze any file type and detect strings across an entire file.
• This also means that it may report sequences of bytes as strings
when they are not.
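A minimal sketch of this kind of search, written with regular expressions over raw bytes. It is not the Strings program itself: it simplifies by accepting any run of three or more printable characters without checking for a terminator, and the names `extract_strings`, `ASCII_RE` and `UTF16_RE` are invented for this example.

```python
import re

MIN_LEN = 3  # "three or more characters", as described above

# A run of at least MIN_LEN printable ASCII bytes.
ASCII_RE = re.compile(rb"[\x20-\x7e]{%d,}" % MIN_LEN)
# The same characters, each followed by a 0x00 byte (UTF-16LE).
UTF16_RE = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % MIN_LEN)

def extract_strings(data):
    """Return ASCII strings first, then UTF-16LE strings, found in data."""
    found = [m.group().decode("ascii") for m in ASCII_RE.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in UTF16_RE.finditer(data)]
    return found

sample = b"\x00\x01BAD\x00\x7f\x80" + "GetLayout".encode("utf-16-le") + b"\x00\x00"
print(extract_strings(sample))  # ['BAD', 'GetLayout']
```

Because the search ignores file structure, it will also report byte runs that merely happen to look like text.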
Finding Strings
• Most invalid strings are obvious, because they do not represent
legitimate text.
• For example, running the Strings program against the file bp6.ex_
yields a mix of meaningless and meaningful strings, examined next.
Finding Strings
• If a string is short and doesn’t correspond to words,
it’s probably meaningless.
• On the other hand, the strings GetLayout and
SetLayout are Windows functions used by the
Windows graphics library.
• We can easily identify these as meaningful strings
because Windows function names normally begin
with a capital letter and subsequent words also begin
with a capital letter.
• GDI32.DLL is meaningful because it’s the name of a
common Windows dynamic link library (DLL) used by
graphics programs.
• DLL files contain executable code that is shared
among multiple applications.
Finding Strings
• 99.124.22.1 is an IP address—most likely one that the
malware will use in some fashion.
• The string “Mail system DLL is invalid.!Send Mail failed
to send message.” is an error message.
• Often, the most useful information obtained by running
Strings is found in error messages. This particular
message reveals two things:
– The subject malware sends messages (through email), and
– It depends on a mail system DLL.
• This information suggests that we should:
– check email logs for suspicious traffic, and
– consider that another DLL (Mail system DLL) might be associated
with this particular malware.
Finding Strings
• Note
– the missing DLL itself is not necessarily malicious
– malware often uses legitimate libraries and DLLs to further its goals.
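The triage rules from the preceding slides can be sketched as a small classifier. The patterns and category names below are assumptions made for this example; they are rough heuristics and will misclassify some strings.

```python
import re

IPV4_RE = re.compile(r"^(?:\d{1,3}\.){3}\d{1,3}$")
DLL_RE = re.compile(r"^[\w.-]+\.dll$", re.IGNORECASE)
# Two or more capitalized words run together, e.g. GetLayout.
API_RE = re.compile(r"^(?:[A-Z][a-z0-9]+){2,}[AW]?$")

def classify(s):
    """Guess what kind of information a string extracted from a binary holds."""
    if IPV4_RE.match(s) and all(int(p) <= 255 for p in s.split(".")):
        return "ip-address"
    if DLL_RE.match(s):
        return "dll-name"
    if API_RE.match(s):
        return "api-like-name"
    if " " in s and len(s) >= 12:
        return "message"
    return "unknown"

for s in ["99.124.22.1", "GDI32.DLL", "GetLayout",
          "Mail system DLL is invalid.!Send Mail failed to send message."]:
    print(classify(s), "<-", s)
```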
Packed and Obfuscated Malware
• Malware authors often use packing or obfuscation to make their
files more difficult to detect or analyze.
• Obfuscated programs are ones whose execution the author has
attempted to hide.
• Obfuscation is the deliberate act of creating source or machine
code that is difficult for humans to understand. It may use
needlessly roundabout expressions to compose statements.
• Types include simple keyword substitution, use or non-use of
whitespace, and self-generating or heavily compressed programs.
• Packed programs are a subset of obfuscated programs in which
the malicious program is compressed and cannot be analyzed
directly.
• Both of these techniques will severely limit your attempts to
statically analyze the malware.
Packed and Obfuscated Malware

• Obfuscators typically turn small fragments of readable source
code (JavaScript example):
for (i=0; i < M.length; i++){
// Adjust position of clock hands
var ML=(ns)?document.layers['nsMinutes'+i]:ieMinutes[i].style;
ML.top=y[i]+HandY+(i*HandHeight)*Math.sin(min)+scrll;
ML.left=x[i]+HandX+(i*HandWidth)*Math.cos(min);
}
• into this:
for(O79=0;O79<l6x.length;O79++){var O63=(l70)?document.layers["nsM\151\156u\164\145s"+O79]:ieMinutes[O79].style;O63.top=l61[O79]+O76+(O79*O75)*Math.sin(O51)+l73;O63.left=l75[O79]+l77+(O79*l76)*Math.cos(O51);}

Source: Semantic Designs http://www.semdesigns.com/Products/Obfuscators/


Packed and Obfuscated Malware
• Legitimate programs almost always contain many strings
• If you discover that a program has very few strings, it is
probably packed or obfuscated, which suggests that it may be
malicious
• Packed and obfuscated code will often include the functions
LoadLibrary and GetProcAddress, which are used to load and gain
access to additional functions.
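These two hints can be checked mechanically. The sketch below is a rough heuristic, not a packer detector: the threshold of 20 strings is an arbitrary assumption for illustration, and the function name `packing_hints` is invented for this example.

```python
import re

ASCII_RE = re.compile(rb"[\x20-\x7e]{4,}")
LOADER_APIS = {b"LoadLibrary", b"GetProcAddress"}

def packing_hints(data, min_strings=20):
    """Report simple indicators that a binary may be packed or obfuscated."""
    strings = ASCII_RE.findall(data)
    # Substring match, so LoadLibraryA and LoadLibraryW also count.
    loader_hits = sorted(api.decode() for api in LOADER_APIS
                         if any(api in s for s in strings))
    return {
        "string_count": len(strings),
        "few_strings": len(strings) < min_strings,  # arbitrary threshold
        "loader_apis": loader_hits,
    }
```

Neither indicator is conclusive on its own; they only suggest where to look more closely.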
Packed and Obfuscated Malware
• When a packed program is run, a small wrapper program runs first
to decompress the packed file and then runs the unpacked file.
• When a packed program is analyzed statically, only the small
wrapper program can be dissected.
• We will discuss this topic in detail in “Anti-reverse engineering”.
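The effect of packing on static analysis can be imitated with ordinary compression. This is only a stand-in under stated assumptions: `zlib` substitutes for a real packer, `unpack_stub` plays the role of the wrapper program, and the payload is built from strings seen earlier in these slides.

```python
import zlib

def pack(payload: bytes) -> bytes:
    """Compress the payload, as a packer would before embedding it."""
    return zlib.compress(payload, 9)

def unpack_stub(packed: bytes) -> bytes:
    """Stand-in for the wrapper: restore the original bytes at run time."""
    return zlib.decompress(packed)

payload = b"GetLayout\x00SetLayout\x00GDI32.DLL\x00Send Mail failed\x00" * 8
packed = pack(payload)

print(len(payload), "->", len(packed), "bytes")
print(b"GDI32.DLL" in payload, b"GDI32.DLL" in packed)
print(unpack_stub(packed) == payload)
```

A string search over `packed` typically no longer finds the readable strings, even though running the stub restores the payload unchanged.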
References
• Michael Sikorski and Andrew Honig, Practical Malware Analysis: A
Hands-On Guide to Dissecting Malicious Software
• WinMD5 program from Edwin Olson:
http://www.blisstonia.com/software/WinMD5
• Strings program from Microsoft:
https://technet.microsoft.com/en-us/sysinternals/bb897439
