Red Hots, Hot Docs, and the Ones that Got Away, Estimating Prevalence Series Part 3
Random sampling is a powerful eDiscovery tool that can provide you with reliable estimates of the prevalence of relevant materials, missed materials, and more
by Matthew Verga, JD, Xact Data Discovery
In “Finding out How Many Red Hots are in the Jellybean Jar,” we discussed a candy contest hypothetical and the importance of sampling techniques to eDiscovery. In “Key Sampling Concepts for Winning the Candy Contest,” we discussed four concepts: sampling frame, prevalence, confidence level, and confidence interval. In this final part, we apply those concepts to our hypothetical contest and to eDiscovery review.
Now that we understand the necessary sampling concepts, let’s apply them to our candy contest and estimate how many red hots are in the jellybean jar. To do so, we need to identify our sampling frame and select our desired confidence level and confidence interval.
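To make those three inputs concrete, here is a minimal Python sketch of how they can be turned into a sample size. It is an illustration under stated assumptions, not a method taken from this series: it uses the standard normal-approximation formula with a finite population correction, treats the confidence interval as a plus-or-minus margin of error, and assumes a worst-case prevalence of 50% because the true prevalence is not yet known. The function name, its parameters, and the 10,000-piece jar are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size(frame_size, confidence_level=0.95, margin_of_error=0.02,
                assumed_prevalence=0.5):
    """Estimate how many items to sample from a frame of `frame_size` items.

    Assumptions (not from the article): normal approximation to the
    binomial, finite population correction, and a worst-case prevalence
    of 0.5 unless a better guess is supplied.
    """
    # Two-sided z-score for the chosen confidence level (about 1.96 at 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)

    # Sample size needed if the sampling frame were effectively unlimited.
    p = assumed_prevalence
    n_unlimited = (z ** 2) * p * (1 - p) / (margin_of_error ** 2)

    # Finite population correction: smaller frames need fewer samples.
    n_corrected = n_unlimited / (1 + (n_unlimited - 1) / frame_size)

    # Round up, since a fraction of an item cannot be sampled.
    return math.ceil(n_corrected)


# Hypothetical jar of 10,000 candies, 95% confidence, +/-2% interval.
jar_size = 10_000
print(sample_size(jar_size, confidence_level=0.95, margin_of_error=0.02))
```

The same function applies to a document review by substituting the number of documents in the review set for the jar size. Widening the interval or lowering the confidence level shrinks the required sample, which is the trade-off to weigh when choosing those two values.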