HOUSE_OVERSIGHT_017028


known, their article will be a member of a “decade_births” category such as “1890s_births” and “1930s_births”. We treat these individuals as if born at the beginning of the decade.

For every parsed article, we append metadata relating to the importance of the article within Wikipedia, namely the size in words of the article and the number of page views which it obtains. The article word count is created by directly accessing the article using its URL. The traffic statistics for Wikipedia articles are obtained from http://stats.grok.se/.

Figure S10a displays the number of records parsed from Wikipedia and retained for the final cohort analysis. Table S7 displays specific examples from the extraction’s output, including name, year of birth, year of death, approximate word count of main article and traffic statistics for March 2010.

1) Create a database of records referring to people born 1800-1980 in Wikipedia.
   a. Using the DBPedia framework, find all articles which are members of the categories ‘1700_births’ through ‘1980_births’. Only people born in 1800-1980 are used for the purposes of fame analysis. People born in 1700-1799 are used to identify naming ambiguities as described in section III.7.A.7 of this Supplementary Material.
   b. For all these articles, create a record identified by the article URL, and append the birth year.
   c. For every record, use the URL to navigate to the online Wikipedia page. Within the main article body text, remove all HTML markup tags and perform a word count. Append this word count to the record.
   d. For every record, use the URL to determine the page’s traffic statistics for the month of March 2010. Append the number of views to the record.

III.7.A.2 - Identification of occupation for individuals appearing in Wikipedia.

Two types of structural elements within Wikipedia enable us to identify, for certain individuals, their occupation. The first, Wikipedia Categories, was previously described and used to recognize articles about
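As an illustration of steps 1a through 1d in section III.7.A.1 above, the following Python sketch builds one record from data that has already been retrieved. It is not the authors' code. The function names, the record field names, and the choice of the standard-library `html.parser` for tag stripping are all assumptions made for this example. Category lookup and the page-view lookup are assumed to happen elsewhere, so their results are passed in as plain arguments.

```python
import re
from html.parser import HTMLParser

# Matches exact-year ("1885_births") and decade ("1890s_births") categories.
BIRTH_CATEGORY = re.compile(r"^(\d{4})s?_births$")


def birth_year_from_category(category):
    """Return the birth year implied by a category name, or None.

    Decade categories map to the first year of the decade, following the
    convention described in the text.
    """
    match = BIRTH_CATEGORY.match(category)
    return int(match.group(1)) if match else None


class _TextExtractor(HTMLParser):
    """Collects text nodes and discards markup tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def word_count(html):
    """Strip HTML tags and count whitespace-separated words.

    Simplification (assumption): text nodes are joined with spaces, so a
    word split by inline markup is counted as several words.
    """
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return len(" ".join(parser.chunks).split())


def build_record(url, categories, html, views_march_2010):
    """Build one record (hypothetical field names), or None if no usable
    birth category in 1700-1980 is present."""
    years = [y for y in map(birth_year_from_category, categories) if y is not None]
    years = [y for y in years if 1700 <= y <= 1980]
    if not years:
        return None
    birth_year = years[0]
    return {
        "url": url,
        "birth_year": birth_year,
        "word_count": word_count(html),
        "views_march_2010": views_march_2010,
        # Only 1800-1980 births enter the fame analysis; 1700-1799 births
        # are kept for resolving naming ambiguities.
        "in_fame_cohort": 1800 <= birth_year <= 1980,
    }
```

A record for someone born in 1700-1799 is still returned, with `in_fame_cohort` set to `False`, so it remains available for the name-disambiguation step mentioned in 1a.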

Independent research project. Not affiliated with the U.S. Department of Justice, FBI, any government agency, or Anthropic. All analytical text on this site is AI-generated (Claude, Anthropic) and iteratively fact-checked against source documents, but may contain errors. Verify all claims against linked EFTA sources before citing.