Filtering and Searching in Early Data Analyzer

In the eDiscovery world, data size and time are two factors that can impact a project. With the right tool, a project doesn’t have to be as daunting as most people might think. Picture this: it’s Monday morning and your manager drops a one-terabyte hard drive on your desk. Your task is to exclude duplicates, identify relevant documents based on keyword hits, and find all e-mails linked to three different companies.

LexisNexis Early Data Analyzer (EDA) is an early data assessment tool that can search and cull through data prior to production. Filtering and Searching are two core features in EDA used to analyze and identify important documents. Search is performed only across filtered data, which means that you are searching the pool of documents you’ve already identified as relevant. EDA Filters include File Hash, Duplicate, Date Range, File Type, E-mail Sender Domain, and Language. Simply exclude all the duplicates under the Duplicate Filter to check off one task from your project list. With EDA’s filtering mechanism, you leave out the documents you don’t need and move forward with the documents you feel are important.
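To make the idea behind hash-based duplicate filtering concrete, here is a minimal Python sketch. It is not EDA's implementation: the choice of SHA-256 and the helper names `file_hash` and `split_duplicates` are assumptions made for illustration only. It treats two files as duplicates when their contents produce the same digest and keeps the first copy it encounters.

```python
import hashlib
from pathlib import Path


def file_hash(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def split_duplicates(paths):
    """Split paths into (unique, duplicates) by content hash.

    Hypothetical helper: the first file seen for each hash is kept as
    unique, and every later file with the same hash is a duplicate.
    """
    seen = set()
    unique, duplicates = [], []
    for path in paths:
        key = file_hash(path)
        if key in seen:
            duplicates.append(Path(path))
        else:
            seen.add(key)
            unique.append(Path(path))
    return unique, duplicates
```

Calling `split_duplicates` on a list of file paths returns the files to carry forward and the files to exclude, which mirrors the effect of excluding everything under a duplicate filter.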

During the Filtering process in EDA, you discover that about 10% of the documents are image-based files (mainly PDFs). EDA is equipped with two different OCR (Optical Character Recognition) engines, ExperVision and ABBYY FineReader. Documents without content are identified during the initial Analysis process, so there’s no need to go looking for them. Run the batch OCR on the image-based documents and continue with your next task.

Searching a terabyte of data can be an overwhelming task for some people. With EDA’s intuitive user interface and advanced features, searching can be a breeze. The classic way of searching is to type in a keyword, get back a few thousand hits, and then tag the results. These days, when you are searching terabytes of data, single-word, multi-word, and phrase searches may not be sufficient to produce precise results. What if you could fine-tune your search and reduce those few thousand hits significantly without having to write a long, complex search query? EDA lets you fine-tune your searches by using Search Filters. Search Filters include Date Range, Custodians, Import Sessions, Tags, Folders or Files, and other Searches. Import the list of keywords your manager gave you and apply Search Filters as needed. Run your search and step away for a cup of coffee.
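The effect of narrowing a keyword search with filters can be sketched in a few lines of Python. This is an illustration only, not how EDA works internally: the `Document` record, its fields, and the `search` function are hypothetical, and only two of the filters named above (Custodians and Date Range) are modeled.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Document:
    # Hypothetical record; the field names are assumptions for this sketch.
    doc_id: int
    custodian: str
    sent: date
    text: str


def search(docs, keywords, custodians=None, start=None, end=None):
    """Return documents that pass every filter and contain any keyword.

    Filters left as None are skipped. Keyword matching is a simple
    case-insensitive substring test.
    """
    lowered = [keyword.lower() for keyword in keywords]
    hits = []
    for doc in docs:
        if custodians is not None and doc.custodian not in custodians:
            continue
        if start is not None and doc.sent < start:
            continue
        if end is not None and doc.sent > end:
            continue
        body = doc.text.lower()
        if any(keyword in body for keyword in lowered):
            hits.append(doc)
    return hits
```

Adding a custodian or date range to the call shrinks the result set without making the keyword list itself any more complicated, which is the point of filtering a search rather than lengthening the query.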

You return to your desk, ready to look for e-mails from the three companies on your project list. Search Fields can help you accomplish this last task. In EDA, you can choose between two connecting Operators and select from the list of available fields. After you select a field, EDA displays a list of all available values to choose from. Seeing the field values up front keeps you from wasting time on a search that would come back with zero hits. It also gives you an idea of how many possible values there are per field. The FromDomain field, for example, might contain 50 total values identified across your data set, but there are only three domains you are interested in. You select the e-mail domains of the three companies and run your search. By exposing search fields and their values, EDA simplifies searching and makes for an efficient workflow.
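The two steps described here, listing the values a field holds and then selecting a few of them, can be sketched in Python as follows. This is a hedged illustration, not EDA's API: the e-mail records are plain dictionaries with a hypothetical `"from"` key holding a bare address, and `from_domain`, `domain_counts`, and `filter_by_domain` are names invented for the example.

```python
from collections import Counter


def from_domain(sender):
    """Return the lower-cased domain of a bare address like user@example.com."""
    # Assumes a bare address; display-name forms are not handled here.
    return sender.rsplit("@", 1)[-1].strip().lower()


def domain_counts(emails):
    """Count how many e-mails were sent from each sender domain."""
    return Counter(from_domain(email["from"]) for email in emails)


def filter_by_domain(emails, wanted):
    """Keep only e-mails whose sender domain is in the wanted set."""
    wanted = {domain.lower() for domain in wanted}
    return [email for email in emails if from_domain(email["from"]) in wanted]
```

Inspecting `domain_counts` first shows which domains actually occur in the data set, so the domains passed to `filter_by_domain` are ones known to return hits.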

It’s late in the afternoon and you’ve completed every task on your project list. You even made sure to tag all of your results along the way so that you can easily create your export. Your manager is impressed that you were able to search and cull through one terabyte of data in one day.