The density of indirect digital forensic data is often vastly underestimated. Sometimes the best sources of information lie not within an event’s data, but in the data that is produced about that data.
What is Metadata and why is it important?
The short answer: Metadata is descriptive or statistical information about other pieces of data. Simply, it is the data about data. File names, creation dates, modified dates, file extensions, sizes, hashes and so on fit into the metadata category. Although this information does not show investigators the content of a file (like the text inside a Word document), it can present a lot of background information and context on the file itself.
Tying back to its Greek roots, “meta-” is a prefix that can mean ‘change’, ‘after’, or ‘beyond’. It is often used in today’s language to describe analyzing something at a higher level than the subject itself is normally viewed; hence “beyond”. Therefore, for digital forensic purposes, you can think of metadata as data that provides more context to the item you are viewing. On an operating system this information can include:
- Date the file was created, modified, last accessed, or changed
- File size
- File name
- Permission attributes
- Hash value
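The attributes in the list above can also be gathered programmatically. Here is a minimal Python sketch using only the standard library. The function name `describe_file` and the dictionary keys are my own choices for illustration. Note that the hash is computed from the file content rather than stored by the file system, and SHA-256 is just one reasonable algorithm to pick.

```python
import hashlib
import os
import stat
from datetime import datetime, timezone


def describe_file(path):
    """Return a dictionary of basic metadata for the file at `path`."""
    info = os.stat(path)

    def to_utc(epoch_seconds):
        # Convert a POSIX timestamp into a timezone-aware UTC datetime.
        return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)

    # Reading the whole file at once is fine for a sketch; large files
    # would be better hashed in chunks.
    with open(path, "rb") as handle:
        digest = hashlib.sha256(handle.read()).hexdigest()

    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "permissions": stat.filemode(info.st_mode),
        "modified": to_utc(info.st_mtime),
        "accessed": to_utc(info.st_atime),
        # On Linux st_ctime is the metadata-change time; on Windows it
        # has historically reported the creation time instead.
        "changed_or_created": to_utc(info.st_ctime),
        "sha256": digest,
    }
```

Calling `describe_file("Stolen_Cards.txt")` would return these values for that file, assuming it exists in the current directory.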
The ability to create an accurate and detailed timeline is often crucial in digital forensics, and file metadata supplies more than enough information to build one. Although metadata does not show you the actual content of the file, it can tell you a lot about the file. For instance, let’s examine a file on both Windows 10 and Ubuntu Linux.
Windows: Some file metadata can be easily viewed on Windows without the need for any forensic tools. To do so, right click on the target file and select “properties” at the bottom. You should see something like figure 1.
We are greeted with the name of the file “Stolen_Cards.txt”, the type of the file being “.txt”, the location where the file is stored, the logical and physical sizes of the file, when the file was created, when the file was last accessed and last modified, and any permission attributes of the file. As you can see, we now have a better understanding of this file, without even looking at its content. A lot of this information could even be valuable to a case. Besides the name of the document, the timestamp metadata is especially informative. Let’s continue with timestamps for now.
It is very important to make sure you are aware of the implications of user actions and how they alter these timestamp dates. For instance, the file could have been created on another system or volume and copied to this system/volume. The creation date will no longer show the actual creation date of the file, but rather will reflect the date the file was created on this particular system. A good example of this is shown in figure 2.
As you can see, the location of this file has changed and so have the created and accessed timestamps. The result is a modified date that is earlier than the date the file claims to have been created. Quirks like this are helpful to be aware of when reviewing file metadata. There are a lot of really good in-depth explanations of timestamp behavior, like this one: https://cyberforensicator.com/2018/03/25/windows-10-time-rules/. It provides a great outline of how and when the timestamps of a file are changed on Windows. For instance, the last accessed timestamp on Windows is fairly interesting. You may have noticed that the last accessed time in figure 1 was the same as the modified time. However, I did access this file numerous times after modifying it. The timestamp was not updated because of a Windows setting that fsutil exposes as “disablelastaccess” (stored in the registry as NtfsDisableLastAccessUpdate). It has been on by default since Windows Vista and stops the last accessed timestamp from being updated on every access, which avoids an extra disk write each time a file is read. As I stated before, metadata can be VERY helpful, as long as you can interpret it appropriately. (More on this particular feature can be found here: https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/fsutil-behavior)
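The "modified before created" pattern is easy to check for once the timestamps have been extracted. The sketch below is only an illustration: the function name `looks_copied` is hypothetical, and the dates in the usage example are made-up values, not the ones from the figures.

```python
from datetime import datetime


def looks_copied(created, modified):
    """Return True when a file's modified time predates its created time.

    On Windows this pattern commonly appears after a file has been copied
    to a new volume: the copy gets a fresh creation time but keeps the
    original modified time.
    """
    return modified < created


# Hypothetical timestamps, for illustration only.
created = datetime(2020, 3, 2, 14, 30)
modified = datetime(2020, 2, 1, 9, 15)
print(looks_copied(created, modified))
```

A match is a prompt for closer inspection and does not prove on its own that a copy took place, since other actions can produce the same pattern.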
Ubuntu: In the Linux Ubuntu Desktop you can follow the same process as Windows by right clicking the file and selecting ‘properties’. Shown in figure 3, similar timestamp information can be found using this method.
Here you will also get the file name, file type, size, location, an accessed date, and a modified date. The information is basic, but still helpful. Another way to get even more metadata in Linux is by running the ‘stat’ command. Simply type ‘stat <filename>’ into the terminal and you will get the output shown in figure 4. This new output shows us a slew of valuable information about the target file. Here we can see the file’s name, size and blocks used (this file clearly has no content), what permission access level there is, when the file was last accessed, when it was last modified, and even the change date of the file. Just like in Windows, the implications of user actions are also important to know on Linux systems. For those of you who do not know, Linux also has the “change” timestamp, which records the last time the metadata of the file was altered. So it can show the date/time when the filename was altered, when the ownership changed, or when the permissions of the file were changed. It is also updated whenever the content is modified, so a change time that is later than the modified time points to a metadata-only change.
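To see the “change” timestamp move on its own, you can alter only the permissions of a file and compare the timestamps before and after. The following is a small Python sketch meant for a Linux system. The function name and the one-second pause are my own choices, and on Windows `st_ctime` has a different meaning, so the result would not carry over.

```python
import os
import tempfile
import time


def chmod_timestamp_demo(pause=1.1):
    """Change a temp file's permissions and report which timestamps moved."""
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(b"example")
        path = handle.name
    try:
        before = os.stat(path)
        time.sleep(pause)  # wait so a new timestamp is distinguishable
        os.chmod(path, 0o640)  # metadata-only change; content is untouched
        after = os.stat(path)
        return {
            "mtime_changed": after.st_mtime_ns != before.st_mtime_ns,
            "ctime_changed": after.st_ctime_ns != before.st_ctime_ns,
        }
    finally:
        os.remove(path)  # clean up the temporary file


print(chmod_timestamp_demo())
```

On a typical Linux file system I would expect the modified time to stay put while the change time moves, which is the same behavior ‘stat’ shows after a ‘chmod’.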
Stepping away from timestamps, other file metadata is still valuable. One great example is a file’s hash. The file hash reflects the exact content of the file. The hashing process runs the content of the file through an algorithm, which then calculates a fixed-length string of characters that is, for practical purposes, unique to that content. There are numerous hashing schemes; two of the most commonly seen are ‘MD5’ and ‘SHA-1’, although both have known collision weaknesses, so SHA-256 is generally preferred where deliberate tampering is a concern. A hash value is surprisingly useful metadata for both forensic and non-forensic reasons. Non-forensically, a hash helps prove that the file you have is actually the exact file you want. It is not uncommon for people to use a legitimate-looking program or PDF file to hide a Trojan. A legitimate file has a hash value that corresponds to its exact content. When you find and download the file, you can compare its hash value to the legitimate hash value published online. If the hash values are the same, then you can be highly confident it is the legitimate version of the file. Forensically speaking, this comparison is also useful to show that a particular file was not altered in any way (by the user, by malware, or even by the investigator). The hash also becomes forensically useful when you need to find a file that has had its name changed. If a wrongdoer has a file they’re not supposed to have, they may try to hide it by changing the file name. So if you were to search for the file name, you might get no results. But if you were to search for the file hash, you would find the newly renamed file, because the hash depends only on the content of the file, not on file metadata like the name.
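The renamed-file search described above can be sketched in a few lines of Python. This is a simplified illustration rather than how any particular forensic tool works. The function names `sha256_of` and `find_by_hash` are hypothetical, and I have assumed SHA-256 as the algorithm.

```python
import hashlib
import os


def sha256_of(path, chunk_size=65536):
    """Hash a file's content in chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_by_hash(root, wanted_hash):
    """Return every file under `root` whose content matches `wanted_hash`."""
    wanted = wanted_hash.lower()
    matches = []
    for folder, _subdirs, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(folder, filename)
            try:
                if sha256_of(path) == wanted:
                    matches.append(path)
            except OSError:
                continue  # unreadable files are skipped in this sketch
    return sorted(matches)
```

Because only the content is hashed, a copy of “Stolen_Cards.txt” that has been renamed to something innocent would still be returned by `find_by_hash`.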
Then there are the more subtle benefits of file metadata such as the file size. You can compare the known size of an application or DLL against the size it actually shows on the system, and a mismatch may indicate that the file has been altered or replaced. Less dramatically, you can use this metadata simply to make sure a removable storage device has enough room before you transfer a file to it.
Conclusion: Metadata exists everywhere and allows us to understand the history and background of a particular file. It gives us more context on the file without even having to look at its content. It exists in many forms, all of which can be leveraged in forensic investigations. The immediately apparent uses are of course timestamps and hashes, but the usefulness does not end there. File names, permissions, and sizes all provide benefits as well. Beyond the typical examples, time zones, forensic acquisition metadata, and so on add even more to the pot. In short, metadata provides real benefit when performing a forensic analysis.