
For DS9, does anyone have the following files:
EFTA00709804
EFTA00709805
EFTA00709806
EFTA00709807
EFTA00770595
EFTA00774768
EFTA00823190
EFTA00823191
EFTA00823192
EFTA00823221
EFTA00823319
EFTA00877475
EFTA00892252
EFTA00901740
EFTA00912980
EFTA00919433
EFTA00919434
EFTA00932520
EFTA00932521
EFTA00932522
EFTA00932523
EFTA00984666
EFTA00984668
EFTA01135215
EFTA01135708
If so, please DM them to me so I can include them in my master archive.
Epstein Files - Complete Dataset Audit Report
Background
The Epstein Files consist of 12 datasets of court-released documents, each containing PDF files identified by EFTA document IDs. These datasets were collected from links shared throughout this Lemmy thread, with Dataset 9 cross-referenced against a partial copy we had downloaded independently.
Each dataset includes OPT/DAT index files (standard e-discovery load files: OPT is the Opticon image load file, DAT the accompanying Concordance-style data file), which serve as the authoritative manifest of what each dataset should contain. This audit was compiled to:
Executive Summary
Dataset Overview
EPSTEIN FILES - DATASET SUMMARY
┌─────────┬──────────┬───────────┬──────────┬─────────┬─────────┬─────────┐
│ Dataset │ Volume   │ Files     │ Expected │ Missing │ Corrupt │ Size    │
├─────────┼──────────┼───────────┼──────────┼─────────┼─────────┼─────────┤
│       1 │ VOL00001 │     3,158 │    3,158 │       0 │       0 │  2.5 GB │
│       2 │ VOL00002 │       574 │      574 │       0 │       0 │  633 MB │
│       3 │ VOL00003 │        67 │       67 │       0 │       0 │  600 MB │
│       4 │ VOL00004 │       152 │      152 │       0 │       0 │  359 MB │
│       5 │ VOL00005 │       120 │      120 │       0 │       0 │   62 MB │
│       6 │ VOL00006 │        13 │       13 │       0 │       0 │   53 MB │
│       7 │ VOL00007 │        17 │       17 │       0 │       0 │   98 MB │
│       8 │ VOL00008 │    10,595 │   10,595 │       0 │       0 │   11 GB │
│       9 │ VOL00009 │   531,282 │  531,307 │      25 │       3 │   96 GB │
│      10 │ VOL00010 │   503,154 │  503,154 │       0 │       0 │   82 GB │
│      11 │ VOL00011 │   331,655 │  331,655 │       0 │       0 │   27 GB │
│      12 │ VOL00012 │       152 │      152 │       0 │       0 │  120 MB │
├─────────┼──────────┼───────────┼──────────┼─────────┼─────────┼─────────┤
│ TOTAL   │          │ 1,380,939 │1,380,964 │      25 │       3 │ ~220 GB │
└─────────┴──────────┴───────────┴──────────┴─────────┴─────────┴─────────┘
Notes
Dataset 9 — Missing Files (25)
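A missing-file list of this kind can be derived by comparing the OPT manifest against the PDFs actually present on disk. The Python sketch below is only an illustration, not the script used for this audit. It assumes the common seven-field Opticon layout (image key, volume, image path, then break flags and page count), that the stem of the image path is the EFTA document ID, and that paths contain no commas. The function names are hypothetical.

```python
from pathlib import Path, PureWindowsPath


def expected_ids_from_opt(opt_path):
    """Collect document IDs listed in an Opticon (.OPT) load file.

    Assumed layout per line (not confirmed for these datasets):
    ImageKey,Volume,ImagePath,DocBreak,FolderBreak,BoxBreak,PageCount
    """
    ids = set()
    with open(opt_path, encoding="utf-8", errors="replace") as fh:
        for line in fh:
            fields = line.rstrip("\r\n").split(",")
            if len(fields) < 3 or not fields[2]:
                continue  # skip blank or malformed lines
            # Load files usually carry Windows-style paths; PureWindowsPath
            # accepts both backslashes and forward slashes.
            ids.add(PureWindowsPath(fields[2]).stem)
    return ids


def ids_on_disk(dataset_dir):
    """Collect the stems of all .pdf files found under dataset_dir."""
    return {
        p.stem
        for p in Path(dataset_dir).rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    }


def missing_ids(opt_path, dataset_dir):
    """Return IDs that the manifest lists but the directory lacks, sorted."""
    return sorted(expected_ids_from_opt(opt_path) - ids_on_disk(dataset_dir))
```

Calling `missing_ids` with a dataset's OPT file and its extracted directory would return the IDs to go looking for. Swapping the operands of the set difference would instead reveal files on disk that the manifest does not mention.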
Dataset 9 — Corrupted Files (3)
EFTA00645624.pdf
EFTA01175426.pdf
EFTA01220934.pdf
Valid %PDF- headers but cannot be rendered due to structural corruption. Likely corrupted during original document production or transfer.
File Type Verification
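As a rough illustration of why a file can pass a header check and still be unusable, the two verification levels covered in this section can be sketched in Python. This is not the audit's own tooling. `has_pdf_header` only approximates what the file command looks for, and `passes_pdfinfo` assumes poppler's `pdfinfo` is installed and on PATH and treats any non-zero exit status as a failure. Both function names are hypothetical.

```python
import subprocess


def has_pdf_header(path):
    """Level 1: check that the file starts with the PDF magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(5) == b"%PDF-"


def passes_pdfinfo(path):
    """Level 2: ask poppler's pdfinfo to open and parse the file.

    Assumption: a non-zero exit status means the file could not be parsed.
    pdfinfo may still exit 0 after repairing a damaged file, so a stricter
    check might also capture and inspect its stderr output.
    """
    result = subprocess.run(
        ["pdfinfo", str(path)],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

A file for which `has_pdf_header` is true but `passes_pdfinfo` is false would fall into the same category as the three corrupted files listed above.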
Two levels of verification were performed on all 1,380,939 files:
Header check (file command): All files contain valid %PDF- headers. 0 mislabeled.
Structural validation (pdfinfo, poppler 26.02.0): Parsed xref tables, trailer dictionaries, and page trees. 3 structurally corrupt (Dataset 9 only).
Duplicate Analysis
Integrity Verification
SHA256 checksums were generated for every file across all 12 datasets. Individual checksum files are available per dataset:
dataset_1_SHA256SUMS.txt
dataset_2_SHA256SUMS.txt
dataset_3_SHA256SUMS.txt
dataset_4_SHA256SUMS.txt
dataset_5_SHA256SUMS.txt
dataset_6_SHA256SUMS.txt
dataset_7_SHA256SUMS.txt
dataset_8_SHA256SUMS.txt
dataset_9_SHA256SUMS.txt
dataset_10_SHA256SUMS.txt
dataset_11_SHA256SUMS.txt
dataset_12_SHA256SUMS.txt
To verify any file against its checksum:
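One way to do this, sketched in Python under the assumption that the checksum files use the usual shasum output format (hex digest, a space, a one-character mode marker, then the file name). The function names and the way files are named inside the checksum list are assumptions, so adjust `listed_name` to match whatever the real files contain.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_sums(sums_path):
    """Parse a SHA256SUMS-style file into a {name: digest} dict.

    Assumed line format: '<hex digest> <mode char><name>', where the mode
    character is a space (text) or '*' (binary).
    """
    sums = {}
    with open(sums_path, encoding="utf-8") as fh:
        for line in fh:
            line = line.rstrip("\r\n")
            if not line:
                continue
            digest, _, rest = line.partition(" ")
            sums[rest[1:]] = digest.lower()  # drop the mode character
    return sums


def verify(file_path, sums_path, listed_name):
    """Return True if file_path matches the digest recorded for listed_name."""
    expected = load_sums(sums_path).get(listed_name)
    if expected is None:
        raise KeyError(f"{listed_name} is not listed in {sums_path}")
    return sha256_of(file_path) == expected
```

If the stock tools are available, running `shasum -a 256 -c dataset_9_SHA256SUMS.txt` from the directory the listed paths are relative to should perform the same check for every file in one pass.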
If you’d like access to the SHA256 checksum files or can help host them, send me a DM.
Methodology
shasum -a 256 with 8-thread parallel processing
file command
pdfinfo (poppler 26.02.0): xref tables, trailer dictionaries, page trees
Recommendations
Report generated as part of the Epstein Files preservation and verification project.