If you’re working in litigation, I’m sure you’ve frequently wondered how to get your vendor to conduct the most effective keyword searches and not break the bank. How can you find information that might be critical to the case? We’ve learned that keyword searching is an art as much as it is a science. Every project has some kind of resource limitation, so we have developed search strategies to make the most of real-life budgets, time and computing power.
Keyword searches for a typical e-discovery production yield more predictable results because the searches are conducted on complete documents and files that remain intact on the system. However, many of the cases we work on involve data that has been deleted, requiring computer forensic techniques to recover. Simply searching the ‘unallocated space’ of the hard drive (where deleted documents reside) can be helpful, but often retrieves far too much information to be useful. This is because the data is no longer organized as individual files. It’s like hunting through a landfill in search of a penny.
Let’s look at a sample case involving Bill Smith. Bill Smith works for LP Corporation and is suspected of embezzling funds. Counsel requests information such as Office documents, Acrobat files, emails, and web activity. An initial search for keywords such as “Smith”, “LP Corporation”, and “bill@lpcorporation” would return several hundred thousand hits when run across deleted and regular files. If we limit the search to only saved files, valuable information may never be found. However, when searching the ‘unallocated’ part of the hard drive we might see hundreds of thousands of hits, too many to review. In Bill Smith’s case, we have 678,354 hits that might represent deleted documents, fragments of documents, emails and web activity. This data is all in unallocated space and can only be retrieved using forensic techniques.
Many folks simply ‘carve’ through unallocated space to resurrect any dead files. This can result in a high number of corrupted or irrelevant hits. How do we avoid this problem? We use keywords as we’re recovering deleted files. This technique provides us with live files that contain relevant keywords. These are now much easier to search than the landfill of unallocated files.
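As a rough illustration of recovering only the deleted files that contain relevant keywords, here is a minimal Python sketch. It is not the tooling described above. It assumes the unallocated space has already been read into memory as bytes, it recognizes only one file type by a simple header and footer signature, and it assumes the keywords appear as plain text inside the recovered bytes (compressed or encoded content would need decoding first). The function names are hypothetical.

```python
PDF_HEADER = b"%PDF-"   # marks the start of a PDF file
PDF_FOOTER = b"%%EOF"   # marks the end of a PDF file


def carve_pdf_candidates(blob):
    """Yield (offset, data) for each header-to-footer span found in blob."""
    pos = 0
    while True:
        start = blob.find(PDF_HEADER, pos)
        if start == -1:
            return
        end = blob.find(PDF_FOOTER, start)
        if end == -1:
            return  # header without a footer: treat as an unrecoverable fragment
        end += len(PDF_FOOTER)
        yield start, blob[start:end]
        pos = end


def carve_with_keywords(blob, keywords):
    """Keep only the carved candidates that contain at least one keyword."""
    wanted = [k.lower().encode("utf-8") for k in keywords]
    hits = []
    for offset, data in carve_pdf_candidates(blob):
        haystack = data.lower()
        if any(k in haystack for k in wanted):
            hits.append((offset, data))
    return hits
```

Calling `carve_with_keywords(raw_bytes, ["Smith", "LP Corporation"])` would return only the recovered candidates that mention one of those terms, which is the set worth turning into live files.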
These live files can then be loaded into a forensic application. They are much easier to deal with, and we can run additional searches on them, producing information that may have been missed on the first pass. We keep searching and filtering by relevant criteria until we come closer to finding the needle in the haystack. For example, we might first search for all the documents containing Bill Smith; from that set, we may eliminate all those with no relevance to embezzlement. Narrowing the search criteria further reduces our hits, so that the original 678,354 hits become 1,200.
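The successive narrowing passes can be sketched in a few lines of Python. This is only an illustration under simple assumptions: each recovered file is represented as a dictionary with a `text` field, and each pass keeps documents matching at least one term from that pass. The function names and the data layout are hypothetical, not part of any forensic application.

```python
def filter_stage(docs, terms):
    """Keep documents whose text contains at least one term (case-insensitive)."""
    lowered = [t.lower() for t in terms]
    return [d for d in docs if any(t in d["text"].lower() for t in lowered)]


def narrow(docs, stages):
    """Apply each list of terms in turn and record the hit count after each pass."""
    counts = [len(docs)]
    for terms in stages:
        docs = filter_stage(docs, terms)
        counts.append(len(docs))
    return docs, counts
```

For example, `narrow(docs, [["Bill Smith"], ["wire", "transfer"]])` first keeps documents that mention Bill Smith, then keeps only those that also mention a wire or transfer. The returned counts show how much each pass shrinks the review set.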
By only restoring deleted information that contains relevant keywords, we dramatically reduce the amount of work performed. By turning deleted information into live files, we can then easily search and filter them, yielding a compact, highly relevant set of data. This technique allows us to work more efficiently and save valuable computing and financial resources.
Document Authentication
Signatures, faxes and paper are so 20th century. While there is still a need for handwriting analysis experts, modern document authentication takes place primarily in the digital domain. Frequently a document such as a contract or letter of intent comes into question during litigation and we are asked to verify whether it is authentic or fraudulent.
Working in the world of bits and bytes, one of the first things we check during a document evaluation is metadata. Files such as Microsoft Word documents can contain hidden information known as metadata. Metadata is “data about the data.” To use an analogy: if you were investigating a homicide in which a gun was used, the metadata would be everything about the gun, including fingerprints on the handle and trigger, the type of bullet fired, the time and date it was fired, and the number of times it was fired.
The metadata embedded in a Microsoft Word document might reveal the creator name, the company name, when the file was created, where the file was saved, the total editing time and potentially much more. This list is not exhaustive; it offers only a peek at what most document metadata contains. Any of these elements can be used to show whether or not a document is authentic.
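To make this concrete, here is a minimal Python sketch that reads a few of these fields from a Word file in the newer .docx format, using only the standard library. It rests on the assumption that the file is a .docx, which is a ZIP package that normally stores its properties in `docProps/core.xml` and `docProps/app.xml`. Legacy .doc files store metadata differently and are not covered. The function name is hypothetical, and a real examination would work on a forensic copy, not the original file.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the property parts of a .docx package
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "ep": "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties",
}


def read_docx_metadata(source):
    """Return selected embedded properties from a .docx path or file object."""
    with zipfile.ZipFile(source) as zf:
        core = ET.fromstring(zf.read("docProps/core.xml"))
        app = ET.fromstring(zf.read("docProps/app.xml"))

    def text(root, path):
        node = root.find(path, NS)
        return node.text if node is not None else None

    minutes = text(app, "ep:TotalTime")  # total editing time, in minutes
    return {
        "creator": text(core, "dc:creator"),
        "last_modified_by": text(core, "cp:lastModifiedBy"),
        "created": text(core, "dcterms:created"),
        "modified": text(core, "dcterms:modified"),
        "company": text(app, "ep:Company"),
        "total_editing_minutes": int(minutes) if minutes is not None else None,
    }
```

Any field that is absent from the file comes back as `None`, which is itself worth noting during an examination.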
Unexpected Metadata Revelations
If someone is surreptitiously trying to backdate a contract created in Microsoft Word, one thing they might do is set the clock back and then save the document with an earlier date. Taking a casual look at the computer, you might see Windows shows that the document was created or modified on the earlier date. However, a deeper inspection of the document itself might reveal that the metadata embedded in the document is inconsistent with the Windows time/date stamps.
For example, Windows might show a Last Modified Date of Jan. 23, 2005 while the metadata embedded in the document itself might show a much later date and even a different author. The document metadata can also reveal the total document editing time. When a document is intentionally backdated by setting the clock back and then resaving the document, the total editing time indicated can be unrealistically high, sometimes showing that the document was edited for years. Since typical document editing time is measured in hours or days, when we see a document that has been edited for years we become understandably suspicious.
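The two warning signs described here can be expressed as a simple consistency check. The Python sketch below is illustrative only: the function name, the one-day tolerance and the 30-day editing ceiling are assumptions chosen for the example, not forensic standards, and a flag is a reason to look closer, not proof of forgery.

```python
from datetime import datetime, timedelta


def find_red_flags(fs_modified, embedded_modified, total_editing_minutes,
                   tolerance=timedelta(days=1),
                   max_editing=timedelta(days=30)):
    """Compare file system and embedded metadata and list any inconsistencies."""
    flags = []
    # Embedded date well after the date Windows reports suggests clock tampering.
    if embedded_modified - fs_modified > tolerance:
        flags.append("embedded modified date is later than file system date")
    # Editing time is normally hours or days; months or years is suspicious.
    if timedelta(minutes=total_editing_minutes) > max_editing:
        flags.append("total editing time is implausibly long")
    return flags
```

With a file system date of Jan. 23, 2005, an embedded modified date in 2007 and an editing time of two years, both flags are raised. A document whose dates agree and whose editing time is a few hours raises none.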
Metadata used in conjunction with other elements of computer forensics such as internet activity, examination of emails and Windows time/date stamps can be used to determine if a document is the real deal or a forgery.
Is The Document Worth The Paper It’s Printed On?
Recently we have looked at a number of agreements and letters of intent that were provided to us on paper. When the authenticity of such a document is questioned, the electronic version is almost always somehow difficult to get access to. However, in those cases where we are able to examine the electronic version of the document, often a very different story emerges, illuminated by the bright light of metadata.
It’s no secret that Twitter’s star has rapidly ascended to become a beacon of information coming out of Iran during the June elections. Faced with state enforced censorship of traditional communication channels such as cell phones and text messaging, opposition bloggers have turned to new mediums such as Twitter to post photos, videos and messages from the streets of Tehran. As Western media was largely banned from leaving their offices, the flood of news, live from the protests, has likely had Iranian authorities trying to track down rogue tweets.
Finding an anonymous party on a social network
In civil litigation it might be challenging to track down the identity of someone who’s posting defamatory or libelous material. If a suspect is the focus of a criminal investigation, it’s relatively simple for law enforcement to issue a subpoena to a social network such as Facebook, MySpace or Twitter and then collect the log files of IP addresses and messages for computer forensic analysis.
However, in a civil matter, learning the true identity of a blogger will typically require serving a subpoena in connection with a John Doe suit to the social networking site, demanding the IP addresses of the poster. Once you’ve got the IP address of the blogger from Facebook or Twitter, you can then send a second subpoena to the ISP hosting the IP address, such as Time Warner, AT&T or many others. Typically, the ISP will inform their subscriber that a request has been made to reveal their identity. Generally speaking, if the subscriber does not object, the ISP will provide the requested information. If the subscriber does object they will have revealed their identity in the process.
Protecting your privacy on a social network
If you’re an attorney with a client who uses Facebook, MySpace or Twitter you might be surprised at what can easily be found out about them on the web. You don’t need a computer forensics specialist to see what’s out there. Simply Google their name and see what information is available. PC World reports that 78% of social networkers have their profiles visible in a Google search. Ironically, Facebook is currently testing some privacy controls in beta that would make “status updates” available to everyone by default. It’s clear Facebook is moving toward less privacy, not more. Facebook is in the business of selling ads and the more content they can make available to the general public the more pages they can serve up for advertisers. In order to increase privacy on Facebook from the low default settings go to Settings: Privacy. You can dramatically alter who can see or search for information.
The amount of discoverable data available has dramatically increased with the rocketing popularity of social networks. Computer forensic experts from the private and public sectors are rapidly adapting to the new landscape, and it will no doubt be a challenge for litigation to keep up with the pace of change.
Federal Forensics Group demonstrates cell phone spyware for Fox 11 News in Los Angeles.
Spyware is a type of malware that an attacker downloads onto a victim’s cell phone, allowing the attacker to monitor conversations, email and text messages.
Pirates Get a Taste of Microsoft COFEE. Microsoft’s Computer Online Forensic Evidence Extractor (COFEE) software, which helps law enforcement officials grab data from password protected or encrypted sources, has leaked.
Facebook Privacy Changes Draw Mixed Reviews. Facebook’s revamped privacy settings will push more user data onto the Internet and, in some cases, make privacy protection harder for Facebook users, digital civil liberties experts said.
Hackers Pillage Jailbroken iPhones. Hackers are plundering personal data from jailbroken iPhones using the tactic demonstrated last week by an Australian programmer’s self-described “prank,” researchers said today.
Crafting a More Effective Keyword Search. Despite the insight of Facciola, Grimm and Peck, lawyers still don’t know what to do when it comes to effective, defensible keyword search.
Police say hacker stole phone time from AT&T, others. The investigation began in May 2007 following a tip-off from the FBI that a group of hackers based in the Philippines had violated the IT security of major international phone companies.
Don’t Mess With System Metadata. Sometimes a computer holds evidence, and sometimes a computer is evidence. It’s a distinction with a difference when deciding whether to act in ways that will stomp on data essential to computer forensic examination.
How Facebook mucks up office life. Managing a workforce is already a challenging job; now Facebook and other social networks raise a host of sticky new situations.
Laid-off workers as data thieves? A growing crime wave where laid-off workers exact vengeance on their former employers by walking out the door with sensitive customer data and other proprietary information.
As Jurors Turn to Web, Mistrials Are Popping Up. The use of BlackBerrys and iPhones by jurors gathering and sending out information about cases is wreaking havoc on trials around the country, upending deliberations and infuriating judges.