Application-level Search Techniques

At the beginning of this book, I said that I would focus primarily on the volume and file system layers of the analysis model. This is one of the sections where I move into the application layer and discuss two techniques that can be used to recover deleted files and to organize allocated files for analysis. These techniques are file system-independent; therefore, they will not be discussed again in the subsequent chapters.

Both of these techniques rely on the fact that many files have structure to them, including a signature value that is unique to that type of file. The signature can be used to determine the type of an unknown file. The file command comes with many Unix systems and has a database of signatures that it uses to identify the structure of an unknown file (ftp://ftp.astron.com/pub/file/).
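The idea of identifying a file by its signature can be sketched in a few lines of Python. This is only an illustration: the SIGNATURES table and the identify function are hypothetical names, the table holds a handful of well-known header values, and the real file command uses a far larger database with more elaborate rules.

```python
# Hypothetical, deliberately tiny signature table (an assumption for
# illustration); the database used by the file command is far larger.
SIGNATURES = [
    (b"\xff\xd8", "JPEG image"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"%PDF-", "PDF document"),
    (b"\x7fELF", "ELF binary"),
]


def identify(data: bytes) -> str:
    """Return the type whose signature matches the first bytes of data."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"
```

Calling identify on the first few bytes of a file is enough here because every signature in the table sits at offset 0; signatures at other offsets would need a richer table.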

Application-based File Recovery (Data Carving)

Data carving is a process where a chunk of data is searched for signatures that correspond to the start and end of known file types. The result of this analysis process is a collection of files that contain one of the signatures. This is commonly performed on the unallocated space of a file system and allows the investigator to recover files that have no metadata structures pointing to them. For example, a JPEG picture has standard header and footer values. An investigator may want to recover deleted pictures, so she would extract the unallocated space and run a carving tool that looks for the JPEG header and footer and extracts the data in between them.

An example tool that performs this is foremost (http://foremost.sourceforge.net), which was developed by Special Agents Kris Kendall and Jesse Kornblum of the United States Air Force Office of Special Investigations. foremost analyzes a raw file system or disk image based on the contents of a configuration file, which has an entry for each signature. Each entry contains, in order, the typical extension of the file type, whether the header value is case sensitive, the maximum size of the file, the known header value, and an optional footer value. An example can be seen here for a JPEG:

jpg     y     200000     \xff\xd8     \xff\xd9

This shows that the typical extension is 'jpg,' the header and footer are case sensitive, the header is 0xffd8, and the footer is 0xffd9. The maximum size of the file is 200,000 bytes, and if the footer is not found after reading this amount of data, the carving will stop for that file. In Figure 8.21 we can see an example set of data where the JPEG header is found in the first two bytes of sector 902 and the footer value is found in the middle of sector 905. The contents of sectors 902, 903, 904, and the beginning of sector 905 would be extracted as a JPEG picture.

Figure 8.21. Blocks of raw data that can be carved to find a JPEG picture in sectors 902 to 905.
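The header and footer search described above can be sketched as follows. This is a minimal illustration, not foremost's actual implementation: the carve function and its parameters are hypothetical names, the search runs at every byte offset rather than only at sector boundaries, and a header with no footer inside the size limit is simply skipped, which is a simplification.

```python
def carve(data: bytes, header: bytes, footer: bytes, max_size: int) -> list[bytes]:
    """Return every byte range that starts at header and ends at the next footer."""
    carved = []
    pos = 0
    while True:
        start = data.find(header, pos)
        if start == -1:
            break
        # Look for the footer only within max_size bytes of the header.
        limit = min(len(data), start + max_size)
        end = data.find(footer, start + len(header), limit)
        if end == -1:
            # No footer inside the size limit: give up on this header
            # (an assumption of this sketch) and keep scanning.
            pos = start + len(header)
            continue
        end += len(footer)
        carved.append(data[start:end])
        pos = end
    return carved
```

For the JPEG entry, the call would be carve(image, b"\xff\xd8", b"\xff\xd9", 200000), where image holds the raw bytes of the unallocated space. Because the two-byte footer value can also occur inside the picture data, a sketch like this may cut a file short; real carvers have to deal with that case.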

A similar tool is lazarus (available in The Coroner's Toolkit at http://www.porcupine.org/forensics/tct.html) by Dan Farmer, which examines each sector in a raw image and executes the file command on it. Consecutive sectors that have the same type are then grouped together. The end result is a list with an entry for each sector and its type. This is basically a method of sorting the data units by using their content. This is an interesting concept, but the implementation is in Perl and can be slow.
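The sector-by-sector approach can be sketched as follows. This is an illustration of the idea only, not how lazarus is written: the classify function is a crude, hypothetical stand-in for running the file command on one sector, and the 512-byte sector size is an assumption.

```python
from itertools import groupby

SECTOR_SIZE = 512  # assumed sector size


def classify(sector: bytes) -> str:
    """Crude, hypothetical stand-in for running the file command on one sector."""
    if sector.startswith(b"\xff\xd8"):
        return "jpeg"
    if not any(sector):
        return "zero"
    if all(b in (9, 10, 13) or 32 <= b < 127 for b in sector):
        return "text"
    return "data"


def group_sectors(image: bytes) -> list[tuple[int, int, str]]:
    """Return (first_sector, last_sector, type) for each run of equal types."""
    types = [classify(image[off:off + SECTOR_SIZE])
             for off in range(0, len(image), SECTOR_SIZE)]
    runs = []
    index = 0
    for kind, members in groupby(types):
        count = len(list(members))
        runs.append((index, index + count - 1, kind))
        index += count
    return runs
```

A weakness is visible even in this small sketch: only the first sector of a picture starts with the JPEG header, so the sectors that follow it are labeled generic data unless the classifier carries some context forward.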

File Type Sorting

File type can also be used to organize the files in a file system. If the investigation is looking for a specific type of data, an investigator can sort the files based on their content structure. An example technique would be to execute the file command on each file and group similar file types together. This would group all pictures together and all executables together. Many forensic tools have this feature, but it is not always clear whether they sort by file name extension or by file signature. The sorter tool in TSK sorts files based on their file signature.
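Sorting by signature rather than by extension can be sketched as follows. This is not how sorter works internally; the sort_by_signature function and its small SIGNATURES table are hypothetical and exist only to show that the grouping depends on file content, so a renamed file still lands in the right group.

```python
from collections import defaultdict
from pathlib import Path

# Hypothetical, minimal signature table (an assumption for illustration).
SIGNATURES = [
    (b"\xff\xd8", "JPEG image"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\x7fELF", "ELF binary"),
]


def sort_by_signature(root: str) -> dict[str, list[str]]:
    """Group the files under root by the signature at the start of each file."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        with path.open("rb") as fh:
            head = fh.read(16)  # long enough for every signature in the table
        kind = next((name for magic, name in SIGNATURES
                     if head.startswith(magic)), "unknown")
        groups[kind].append(path.name)
    return dict(groups)
```

With this approach, a JPEG picture that was renamed to notes.txt is still grouped with the other pictures, whereas a tool that sorts by extension would place it with the text files.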
