HOUSE_OVERSIGHT_017022


a particular n-gram in year X as shown in the plots is the mean of the raw frequency values for that n-gram in years X-1, X, and X+1.

Note that for each n-gram in the corpus, we can provide three measures as a function of year of publication:
1- the number of times it appeared
2- the number of pages where it appeared
3- the number of books where it appeared

Throughout the paper, we make use only of the first measure, but the other two remain available. The three measures are generally in agreement, but can denote distinct cultural effects. These distinctions are not explored in this paper.

For example, we give in the Appendix measures for the frequency of the word ‘evolution’. In the first three columns, we give the number of times it appeared, the normalized number of times it appeared (relative to the number of words that year), the normalized number of pages it appeared in, and the normalized number of books it appeared in, as a function of the date.

III.1B. Multiple Query/Cohort Timelines

Where indicated, timeline plots may reflect the aggregates of multiple query results, such as a cohort of individuals or inventions. In these cases, the raw data for each query was used to associate each year with a set of frequencies. The plot was generated by choosing a measure of central tendency (either mean or median) to characterize the set of frequencies and associating the resulting value with the corresponding year.

Such methods can be confounded by the vast frequency differences among the constituent queries. For instance, the mean will tend to be dominated by the most frequent queries, which might be several orders of magnitude more frequent than the least frequent queries. If the absolute frequency of the various query results is not of interest, but only their relative change over time, then each query result may be normalized so that its frequencies sum to 1. This results in a probability mass function for each query describing the likelihood that a rand
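The three-year smoothing, the cohort aggregation by mean or median, and the per-query normalization described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names (`smooth3`, `normalize`, `aggregate`), the toy series, and the handling of edge years and missing years are all assumptions, since the text does not specify them.

```python
from statistics import mean, median


def smooth3(raw):
    """Centered three-year average: year X gets the mean of X-1, X, X+1.

    `raw` maps year -> raw frequency. Years with a missing neighbor are
    averaged over the years that are present (an assumption; the text
    does not say how edge years are treated).
    """
    out = {}
    for year in raw:
        window = [raw[y] for y in (year - 1, year, year + 1) if y in raw]
        out[year] = mean(window)
    return out


def normalize(series):
    """Scale one query's series so its values sum to 1."""
    total = sum(series.values())
    if total == 0:
        return dict(series)  # nothing to scale; leave unchanged
    return {year: value / total for year, value in series.items()}


def aggregate(cohort, how=mean):
    """Combine several query series into one timeline, year by year.

    `how` is the measure of central tendency (mean or median). A query
    with no value for a given year is skipped for that year (assumption).
    """
    years = sorted(set().union(*(s.keys() for s in cohort)))
    return {y: how([s[y] for s in cohort if y in s]) for y in years}


# Hypothetical toy data: one frequent, rising query and one rare, falling one.
common = {1900: 100.0, 1901: 200.0, 1902: 300.0}
rare = {1900: 3.0, 1901: 2.0, 1902: 1.0}

raw_timeline = aggregate([common, rare], how=mean)
normalized_timeline = aggregate([normalize(common), normalize(rare)], how=mean)
median_timeline = aggregate([common, rare], how=median)
```

In this toy cohort, `raw_timeline` rises steeply because the frequent query dominates the mean, while `normalized_timeline` is flat because the two opposite trends carry equal weight once each series sums to 1.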
