December 4, 2022 – S.D. Cal.: Request for Production from Backup Tapes Denied
In 2015, Magistrate Judge William V. Gallo, in United States ex rel. Carter v. Bridgepoint Educ., Inc., 305 F.R.D. 225 (S.D. Cal. 2015), ruled on a dispute over the production of data from backup tapes, and the format used for email productions.
The Plaintiffs contended that the Defendants had intentionally altered data transferred to backup tapes because litigation was anticipated, and that this transfer constituted a form of intentional spoliation. The Defendants in turn asserted that the data was inaccessible, so the cost of production should shift to the Plaintiffs. The requested data consisted of matrices that the Defendants used to track the performance of enrollment advisors, which in turn was used to determine how much employees were paid. The Plaintiffs contended that this practice violated the Higher Education Act’s prohibition against incentive payments.
The backup tapes in question were used for disaster recovery. The encrypted tapes could be used to store more than 1 TB of data. The Defendants stated that it was only possible to restore one tape per day, and that the full restoration process would take several months and cost more than $2.2 million for the data for all of the relevant custodians to be converted to native format. These facts would make the production unduly burdensome. The Defendants had been transferring data to backup tapes for a long time, prior to the suit’s unsealing. (This is a qui tam action, which the Defendants only received notice of when the government chose not to intervene.)
The Plaintiffs argued that the Defendants, as a large ‘billion dollar’ public company that emphasizes its technological capabilities, should have the resources to handle the production, and noted that their suit concerned more than $2 billion in damages. They faulted the Defendants for failing to disclose how their backup tape system worked.
In his decision, Judge Gallo, citing Zubulake v. UBS Warburg LLC, 217 F.R.D. 309 (S.D.N.Y. 2003), acknowledged that a party is not entitled to cost shifting if it converts data into an inaccessible format when it’s reasonably foreseeable that it will be discoverable in anticipated litigation. But he emphasized that the litigation must be probable, not merely possible.
In rejecting the contention that the Defendants deliberately made data inaccessible, the Court noted that, “[e]ven in making this accusation, Plaintiff acknowledge that this ESI has been placed onto ‘backup tapes,’ thereby accepting Defendants’ own description of the relevant ESI as ‘inaccessible.’ Dangerously, Plaintiffs have chosen to describe this storage system as adopted ‘under the pretext or excuse of a business purpose,’ even though the use of backup tapes for non-active ESI has become standard business practice.” Bridgepoint Educ., Inc., 305 F.R.D. at 241. The opinion cites dozens of holdings that ESI stored on backup tapes is inaccessible from a technological standpoint.
The Defendants did restore to native format one backup tape, which contained all of the emails between the relevant employee custodians and their superiors. This gave the Plaintiffs “an unfettered ability to examine almost every potentially relevant quantum of ESI.” Id. at 242. The Defendants produced other, less relevant emails as TIFF images. The Plaintiffs only offered their own attorneys’ estimates of the cost of production, while the Defendants filed a declaration prepared by an expert. The Court also noted the Plaintiffs’ failure to specify the exact data they were requesting. “If a party fails to identify the form or forms in which it wishes ESI to be produced and any fields or types of metadata sought, the non-requesting party may rightly provide the ESI sought in the form in which it is regularly maintained. With Plaintiffs’ request ambiguous as to form and format, Defendants were certainly reasonable in refusing to provide reasonably inaccessible ESI.” Id. at 243. Judge Gallo rejected the claim of intentional spoliation because the Plaintiffs did not explain why the Defendants’ storage process was unusual.
The Court also rejected the Plaintiffs’ request for the production of emails from active storage in native format. It regarded a TIFF image production as a proper response to a “generic request for original documents.” Id. at 245.
December 10, 2022 – Use PowerShell Script to Count the Number of Lines in Multiple Text Files
You can use a PowerShell script to count the number of lines in multiple text files saved to a folder.
Enter the file path for the folder after the Get-ChildItem command on the first line. Then specify the extension of the files to be analyzed towards the end of the first line, after ‘$_.extension -eq’.
Get-ChildItem "c:\foofolder\test2" -recurse | where {$_.extension -eq ".txt"} | % {
    $_ | Select-Object -Property 'Name', @{
        label = 'Lines'; expression = {
            # @() forces an array, so a one-line file reports 1
            # instead of the character length of its single line
            @($_ | Get-Content).Count
        }
    }
} | out-file "C:\foofolder\test2\lines1.txt"
On the last line, provide the file path for a new text file to which PowerShell will write the results. Open Windows PowerShell ISE (x86), enter the script in a new pane, and then press the play button on the toolbar.
The text file that is generated will list each file name in the source folder and show the number of lines in each in a column to the right.
I ran this script on a set of more than 100,000 text files (which turned out to consist of more than 9 million lines) and it finished the review in less than 30 minutes.
The script can also be used to find the number of lines in other files such as .csv files.
Be sure to enter the file paths in quotes if they include blank spaces.
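If you want to cross-check the PowerShell counts with a second tool, the same tally can be sketched in Python. This is only a rough equivalent under my own assumptions: the function name count_lines is hypothetical, the extension match ignores case (as PowerShell's -eq does), and files are read as raw bytes so that unusual encodings do not stop the run.

```python
from pathlib import Path


def count_lines(folder, extension=".txt"):
    """Return {relative path: line count} for matching files under folder."""
    root = Path(folder)
    counts = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() == extension.lower():
            # Read bytes so odd encodings cannot raise; splitlines() also
            # counts a final line that has no trailing newline.
            data = path.read_bytes()
            counts[path.relative_to(root).as_posix()] = len(data.splitlines())
    return counts
```

Calling count_lines with the folder path from the script above, and optionally ".csv" as the second argument, should return one entry per file, which you can compare against the PowerShell output file.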
Thanks to Hari Parkash for posting this script here.
December 17, 2022 – Excel VBA Code to Get Page Count for Multiple PDF Files
The below Visual Basic code, posted here by skyang, can be used to generate a list of PDF files which shows how many pages are in each file.
Simply enter the code in a new module in Visual Basic. Run the macro starting from Sub Test(), and after you press play you’ll be prompted to select the folder where your PDF files are located.
This code will process any files saved to subfolders.
The code will generate a list of the file names and page counts, which will also show the file size and file path of each PDF.
Note that if any file has a folder path longer than 255 characters, the code will fail with an error message.
As always, I have tested this code tonight and confirmed that it works – although in a large data set it did give a zero count for the pages in some PDFs. However, it took less than 20 minutes to review more than 9000 files containing more than 80,000 pages.
Sub Test()
    Dim I As Long
    Dim xRg As Range
    Dim xStr As String
    Dim xFd As FileDialog
    Dim xFdItem As Variant
    Dim xFileName As String
    Dim xFileNum As Long
    Dim RegExp As Object
    ' Prompt for the folder that holds the PDF files
    Set xFd = Application.FileDialog(msoFileDialogFolderPicker)
    If xFd.Show = -1 Then
        xFdItem = xFd.SelectedItems(1) & Application.PathSeparator
        Set xRg = Range("A1")
        ' Clear and label all four output columns
        Range("A:D").ClearContents
        Range("A1:D1").Font.Bold = True
        xRg = "File Name"
        xRg.Offset(0, 1) = "Pages"
        xRg.Offset(0, 2) = "Path"
        xRg.Offset(0, 3) = "Size(b)"
        ' Results start on row 2, below the headers
        I = 2
        Call SunTest(xFdItem, I)
    End If
End Sub
Sub SunTest(xFdItem As Variant, I As Long)
    Dim xRg As Range
    Dim xStr As String
    Dim xFd As FileDialog
    Dim xFileName As String
    Dim xFileNum As Long
    Dim RegExp As Object
    Dim xF As Object
    Dim xSF As Object
    Dim xFso As Object
    ' Loop through every PDF in the current folder
    xFileName = Dir(xFdItem & "*.pdf", vbDirectory)
    xStr = ""
    Do While xFileName <> ""
        Cells(I, 1) = xFileName
        Set RegExp = CreateObject("VBscript.RegExp")
        RegExp.Global = True
        ' Match page objects ("/Type /Page") but not the "/Pages" tree node
        RegExp.Pattern = "/Type\s*/Page[^s]"
        ' Read the whole file into a string
        xFileNum = FreeFile
        Open (xFdItem & xFileName) For Binary As #xFileNum
        xStr = Space(LOF(xFileNum))
        Get #xFileNum, , xStr
        Close #xFileNum
        ' Write the page count, full path and size in bytes
        Cells(I, 2) = RegExp.Execute(xStr).Count
        Cells(I, 3) = xFdItem & xFileName
        Cells(I, 4) = FileLen(xFdItem & xFileName)
        I = I + 1
        xFileName = Dir
    Loop
    Columns("A:D").AutoFit
    ' Repeat the process for each subfolder
    Set xFso = CreateObject("Scripting.FileSystemObject")
    Set xF = xFso.GetFolder(xFdItem)
    For Each xSF In xF.SubFolders
        Call SunTest(xSF.Path & "\", I)
    Next
End Sub
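To see what the macro's regular expression is doing outside of Excel, here is a minimal Python sketch of the same counting idea. The function name count_pdf_pages and the sample bytes are my own illustration, not part of skyang's code. It also hints at one plausible reason for the zero counts mentioned above: a raw scan like this can only find page objects whose "/Type /Page" text is stored uncompressed, so PDFs that keep their page objects inside compressed object streams would not be counted.

```python
import re

# Same idea as the macro's pattern: "/Type", optional whitespace, "/Page",
# then any byte other than "s" so the "/Pages" tree node is not counted.
PAGE_OBJECT = re.compile(rb"/Type\s*/Page[^s]")


def count_pdf_pages(data: bytes) -> int:
    """Count the page objects visible in the raw bytes of a PDF."""
    return len(PAGE_OBJECT.findall(data))


# Hand-made fragment for illustration only (not a complete PDF file):
# one "/Pages" node followed by two page objects.
sample = (
    b"1 0 obj << /Type /Pages /Count 2 >> endobj\n"
    b"2 0 obj << /Type /Page /Parent 1 0 R >> endobj\n"
    b"3 0 obj << /Type/Page\n/Parent 1 0 R >> endobj\n"
)
print(count_pdf_pages(sample))  # prints 2
```

For a real file you would pass the result of reading it in binary mode, and a dedicated PDF library is likely to be more reliable where this raw scan returns zero.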
December 24, 2022 – PowerShell Faster Than XCOPY
When writing a script to copy files from one location to another, be sure to use PowerShell instead of creating a batch file with the XCOPY command. In PowerShell, the Copy-Item command followed by the source path and then the destination folder:
Copy-Item -Path "C:\foofolder\2022.07.Litigation Support Tip of the Night V2.docx" -Destination "C:\copy set\" -PassThru
Copy-Item -Path "C:\foofolder\acme.js" -Destination "C:\copy set\" -PassThru
Copy-Item -Path "C:\foofolder\AndroGel Meeting.docx" -Destination "C:\copy set\" -PassThru
Copy-Item -Path "C:\foofolder\Applications" -Destination "C:\copy set\" -PassThru
. . . will copy files to a new location faster than a .bat file.
December 31, 2022 – Searching Inside an Excel Cell for One of Multiple Strings
You can use an Excel formula to check whether any one of multiple values appears in the contents of a cell.
=SUMPRODUCT(--ISNUMBER(SEARCH($D$2:$D$7,A2)))>0
This formula will check inside cell A2 for the values listed in cells D2 to D7.
When you have a range of cells that you want to search through, enter the list of strings you’re looking for as an absolute reference using dollar signs. So in this example, we search through the addresses listed in column A for the state capitals listed in column F, which means the list range in the formula becomes $F$2:$F$51. We can pull down the formula entered in cell B2 to the cells below using CTRL + D. The formula will return ‘TRUE’ when one of the values from cells F2 to F51 appears in the corresponding cell in column A.
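The logic of the formula may be easier to follow when it is written out step by step. The Python sketch below is only an illustration of that logic under my own assumptions: the function name contains_any and the sample address are hypothetical, and it does not reproduce the wildcard characters that Excel's SEARCH function accepts. Like SEARCH, it ignores case.

```python
def contains_any(cell_text, needles):
    """True when any needle occurs in cell_text, ignoring case."""
    haystack = cell_text.lower()
    # One hit per needle found, like ISNUMBER(SEARCH(...)) coerced by "--"
    hits = sum(1 for needle in needles if needle.lower() in haystack)
    # SUMPRODUCT(...) > 0 means at least one of the strings was found
    return hits > 0


address = "100 Main St, Albany, NY"  # hypothetical cell contents
print(contains_any(address, ["Boston", "albany"]))  # prints True
print(contains_any(address, ["Boston", "Austin"]))  # prints False
```

Note that an empty string counts as a match in this sketch, and as far as I know SEARCH behaves the same way with a blank search value, so it is worth keeping blank cells out of the list range.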