Technocat Tidbits: What is eDiscovery Processing?

August 22, 2023/ACEDS Blog, Data and Technology, Technocat Tidbits (Casey)

Hey there, digital detectives! TechnoCat, Cat Casey, at your service. Today we’re diving into the magical world of eDiscovery data processing. The journey from raw data to usable information isn’t always straightforward, but I promise you, it’s an exciting ride. Think of it as the trip from the digital babble of ones and zeros to human-readable language.

Data processing in eDiscovery refers to the systematic organization, filtering, and preparation of electronic data for review and analysis. In other words, it turns the ones and zeros of Electronically Stored Information (ESI) into something a human can organize, review, and analyze.

Imagine dealing with a labyrinth of raw data and metadata (that’s the ‘data about data’, for those new to the game). Our task is to transform this mammoth beast into something manageable, something usable. It’s not just a question of size, though. Oh no, we’re also aiming to ensure relevance, maintain integrity, and check all those compliance boxes for legal requirements. Quite a juggle, don’t you think?

But that’s what makes it so thrilling! Data processing is like solving a complex puzzle, unscrambling the digital language into a story we humans can understand, organize, and interpret. It’s not just data; it’s the heartbeat of our eDiscovery journey.

What Actually Happens During eDiscovery Data Processing?

Whether you’re a law firm, in-house discovery team, or a whiz-bang legal service provider, the eDiscovery processing dance has the same key steps:

a) Data Collection and Preservation:

It’s all about finding and protecting the goods – emails, documents, databases, and all sorts of digital gems. It’s crucial to start right, or the rest of the journey might be a wild goose chase. Defensible data collection and preservation are the name of the game here, with a sprinkle of legal holds and a solid chain of custody to maintain that precious data integrity.
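One common way to support a chain of custody is to fingerprint each item at collection time and log who collected it and when. Here is a minimal sketch of that idea, not a full forensic workflow. The function names `custody_entry` and `verify` and the log fields are hypothetical, chosen only for illustration.

```python
import hashlib
from datetime import datetime, timezone


def custody_entry(name: str, content: bytes, collector: str) -> dict:
    """Record who collected an item, when, and a SHA-256 fingerprint of its bytes.

    The field names here are illustrative assumptions, not a standard schema.
    """
    return {
        "item": name,
        "sha256": hashlib.sha256(content).hexdigest(),
        "collected_by": collector,
        "collected_at": datetime.now(timezone.utc).isoformat(),
    }


def verify(entry: dict, content: bytes) -> bool:
    """Return True if the bytes still match the fingerprint taken at collection."""
    return hashlib.sha256(content).hexdigest() == entry["sha256"]


entry = custody_entry("budget.xlsx", b"original bytes", "analyst-01")
print(verify(entry, b"original bytes"))   # unchanged content
print(verify(entry, b"tampered bytes"))   # altered content
```

If even one byte changes after collection, the recomputed hash no longer matches the logged one, which is what makes the log useful as evidence of integrity.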

b) Data Filtering and Culling:

Let’s face it, we often need to pick out the whispers in the cacophony. We hone our dataset, removing the irrelevant and focusing on the potentially important. Think of it as a digital sieve, using keyword search terms, date filters, custodian filters, and file type filters to get to the good stuff.
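To make the digital sieve concrete, here is a small sketch that applies the four filters named above (date range, custodian, file type, keyword) in sequence. The `Doc` record and the `cull` function are hypothetical, and the keyword match is a naive substring test; real platforms use indexed search with far richer syntax.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    # Hypothetical minimal record; real platforms track many more fields.
    doc_id: str
    custodian: str
    sent: date
    extension: str
    text: str


def cull(docs, keywords, start, end, custodians, extensions):
    """Keep only documents that pass every filter."""
    kept = []
    for d in docs:
        if not (start <= d.sent <= end):
            continue  # outside the date range
        if d.custodian not in custodians:
            continue  # not a custodian of interest
        if d.extension.lower() not in extensions:
            continue  # excluded file type
        body = d.text.lower()
        if not any(k.lower() in body for k in keywords):
            continue  # no keyword hit (naive substring match)
        kept.append(d)
    return kept


docs = [
    Doc("D1", "casey", date(2023, 3, 1), "msg", "Please review the merger agreement"),
    Doc("D2", "casey", date(2021, 1, 1), "msg", "Early merger talk"),
    Doc("D3", "casey", date(2023, 4, 1), "docx", "Lunch plans"),
]
kept = cull(docs, ["merger"], date(2023, 1, 1), date(2023, 12, 31), {"casey"}, {"msg", "docx"})
print([d.doc_id for d in kept])
```

Only D1 survives here: D2 falls outside the date range and D3 has no keyword hit.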

c) Data Deduplication:

It’s time to declutter, folks! Processing tools typically compute a hash value for each file or email (a digital fingerprint such as MD5 or SHA-1) and compare those fingerprints to find and remove exact duplicates. Only the unique instances get the VIP pass.
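Exact duplicates are commonly spotted by comparing hash values rather than by reading content. Here is a minimal sketch of that approach, assuming each document is already available as bytes. The function name `deduplicate` and the sample document IDs are made up for illustration, and real tools often hash emails on selected fields rather than on raw bytes.

```python
import hashlib


def deduplicate(files: dict[str, bytes]) -> tuple[dict[str, bytes], dict[str, str]]:
    """Split files into unique items and a map of duplicate -> first-seen original."""
    first_seen: dict[str, str] = {}   # hash digest -> doc id of the first copy
    unique: dict[str, bytes] = {}
    duplicates: dict[str, str] = {}
    for doc_id, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        if digest in first_seen:
            # Same bytes as an earlier item: record the relationship, drop the copy.
            duplicates[doc_id] = first_seen[digest]
        else:
            first_seen[digest] = doc_id
            unique[doc_id] = content
    return unique, duplicates


files = {
    "DOC001": b"Quarterly report, final",
    "DOC002": b"Quarterly report, final",   # byte-for-byte copy of DOC001
    "DOC003": b"Quarterly report, draft",
}
unique, duplicates = deduplicate(files)
print(sorted(unique))
print(duplicates)
```

Keeping the duplicate-to-original map matters in practice, because reviewers may still need to know which custodians held a copy even after the copy is suppressed.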

d) DeNISTing:

Sounds like a term right out of a James Bond movie, doesn’t it? But DeNISTing plays a pivotal role in our eDiscovery world. We get rid of known system and application files that have no evidentiary value, aka “NIST files”. The name comes from the National Institute of Standards and Technology, whose National Software Reference Library publishes hash values for known software files that processing tools match against. It’s a clean-up operation that helps streamline our document review process, letting us focus on the juicy bits.
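In practice, DeNISTing boils down to a hash lookup: fingerprint each file and drop it if the fingerprint appears in a list of known software files. The sketch below assumes a tiny in-memory set standing in for that list. `KNOWN_HASHES`, `denist`, and the sample file names are hypothetical; a real workflow would load the published reference data instead.

```python
import hashlib

# Hypothetical stand-in for a reference list of known system-file hashes.
KNOWN_HASHES = {
    hashlib.sha1(b"fake operating system library").hexdigest(),
}


def denist(files: dict[str, bytes], known_hashes: set[str]) -> tuple[dict[str, bytes], list[str]]:
    """Return (files kept for review, names removed as known system files)."""
    kept: dict[str, bytes] = {}
    removed: list[str] = []
    for name, content in files.items():
        digest = hashlib.sha1(content).hexdigest()
        if digest in known_hashes:
            removed.append(name)   # matches a known file: not user-created content
        else:
            kept[name] = content
    return kept, removed


files = {
    "kernel32.dll": b"fake operating system library",
    "memo.docx": b"Draft memo to the board",
}
kept, removed = denist(files, KNOWN_HASHES)
print(sorted(kept))
print(removed)
```

Because the match is on content rather than on file name, a renamed system file is still removed and a user document named like a system file is still kept.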

e) Data Normalization and Standardization:

We like to keep things tidy by converting data into a consistent format. It’s about polishing metadata, organizing file formats, and standardizing naming conventions. In short, it’s about making sense out of the digital chaos.
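As one small illustration of normalization, the sketch below converts timestamps to UTC, lowercases author addresses, and derives a consistent file extension. The input field names (`author`, `sent`, `file_name`) and the choice to treat timestamps without a zone as UTC are assumptions for this example; real projects set those conventions in their processing specifications.

```python
from datetime import datetime, timezone


def normalize(record: dict) -> dict:
    """Return a copy of the record with consistent casing, time zone, and extension."""
    sent = datetime.fromisoformat(record["sent"])
    if sent.tzinfo is None:
        # Assumption for this sketch: timestamps with no zone are treated as UTC.
        sent = sent.replace(tzinfo=timezone.utc)
    file_name = record["file_name"].strip()
    extension = file_name.rsplit(".", 1)[-1].lower() if "." in file_name else ""
    return {
        "author": record["author"].strip().lower(),
        "sent_utc": sent.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "file_name": file_name,
        "extension": extension,
    }


raw = {
    "author": "  Cat.Casey@Example.com ",
    "sent": "2023-08-22T09:30:00-04:00",
    "file_name": "Report.DOCX",
}
print(normalize(raw))
```

With every date in one zone and every address in one case, later filters and searches compare like with like.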

f) Metadata Extraction and Indexing:

We’re pulling out metadata, text, and all sorts of relevant info from our files to create an index. It’s like creating a digital map, allowing us quick access to our data when we’re knee-deep in review and analysis.
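The digital map is often an inverted index: for each word, a list of the documents that contain it. Here is a toy version, assuming the text has already been extracted. The names `build_index` and `search` and the sample documents are hypothetical, and production search engines add stemming, phrase search, and much more.

```python
import re
from collections import defaultdict


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercase token to the set of document IDs containing it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in re.findall(r"[a-z0-9]+", text.lower()):
            index[token].add(doc_id)
    return index


def search(index: dict[str, set[str]], *terms: str) -> set[str]:
    """Return IDs of documents containing every term (a simple AND query)."""
    hits = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*hits) if hits else set()


docs = {
    "DOC001": "Merger agreement draft for review",
    "DOC002": "Lunch agreement: tacos on Friday",
    "DOC003": "Final merger timeline",
}
index = build_index(docs)
print(sorted(search(index, "merger")))
print(sorted(search(index, "merger", "agreement")))
```

Building the index once up front is what lets reviewers run searches across millions of documents without rereading each file.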

g) Data Transformation and Load File Creation:

We’re transforming our data into a language our review platforms and tools can understand. It’s like translating binary code to plain English. We’re creating load files that contain the details we need, like document metadata, coding decisions, and production specifications. All to ensure we’re ready to go in our powerful review platforms like Reveal.
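A load file is essentially a delimited text table with one row per document. The sketch below writes a small one using the delimiters commonly associated with Concordance-style DAT files (þ around each value, ASCII 20 between fields). The field names, the sample values, and the `make_dat` function are illustrative assumptions; always follow the delimiter and field specification your review platform or production protocol requires.

```python
FIELD_SEP = "\x14"   # ASCII 20, a commonly used DAT field separator
QUOTE = "\u00fe"     # þ, a commonly used DAT text qualifier


def make_dat(rows: list[dict], fields: list[str]) -> str:
    """Build load-file text: a header row, then one delimited row per document."""
    def line(values) -> str:
        return FIELD_SEP.join(f"{QUOTE}{v}{QUOTE}" for v in values)

    lines = [line(fields)]
    for row in rows:
        # Missing fields become empty values so every row has the same columns.
        lines.append(line(str(row.get(f, "")) for f in fields))
    return "\n".join(lines) + "\n"


rows = [
    {"DOCID": "DOC0001", "CUSTODIAN": "Casey", "DATESENT": "2023-08-22"},
    {"DOCID": "DOC0002", "CUSTODIAN": "Smith"},
]
dat = make_dat(rows, ["DOCID", "CUSTODIAN", "DATESENT"])
print(dat)
```

Unusual delimiters are used because commas and ordinary quotation marks show up constantly inside document metadata, which would break a plain CSV.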

That’s a Wrap

And there you have it – eDiscovery data processing demystified. It’s a wild ride, but with a little know-how and some TechnoCat tips, it’s an adventure worth embarking on. Until next time, stay savvy and stay safe!

Catherine “Cat” Casey

Chief Growth Officer at Reveal

Catherine “Cat” Casey is the chief growth officer for Reveal, a leading AI-powered e-discovery technology company, and a global thought leader on the application of AI and advanced technology to the practice of law. She is a frequent keynote speaker and outspoken advocate of legal professionals embracing technology to deliver better legal outcomes. Casey has more than a decade and a half of experience assisting clients with complex e-discovery and forensic needs that arise from litigation, expansive regulation, and complex contractual relationships. Casey has an A.L.B. from Harvard University and attended Pepperdine School of Law.