language lexica, we tried whenever possible to have the annotation performed by a third party with no knowledge of the analyses we were undertaking III.3. Controls To confirm the quality of our data in the English language, we sought positive controls in the form of words that should exhibit very strong peaks around a date of interest. We used three categories of such words: heads of state (‘President Truman’), treaties (‘Treaty of Versailles’), and geographical name change (‘Byelorussia’ to ‘Belarus’). We used Wikipedia as a primary source of such words, and manually curated the lists as described below. We computed the timeserie of each n-gram, centered it on the date of interest (year when the person became president, for instance), and normalized the timeserie by overall frequency. Then, we took the mean trajectory for each of the three cohorts, and plotted in Figure $5. The list of heads of states include all US presidents and British monarchs who gained power in the 19" or 20" centuries (we removed ambiguous names, such as ‘President Roosevelt’). The list of treaties is taken from the list of 198 treaties signed in the 19" or 20" centuries (S7); but we kept only the 121 names that referred to only one known treaty, and that have non zero timeseries. The list of country name changes is taken from Ref S8. The lists are given in APPENDIX. The correspondence between the expected and observed presence of peaks was excellent. 42 out of 44 heads of state had a frequency increase of over 10-fold in the decade after they took office (expected if the year of interest was random: 1). Similarly, 85 out of 92 treaties had a frequency increase of over 10- fold in the decade after they were signed (expected: 2). Last, 23 out of 28 new country names became more frequent than the country name they replaced within 3 years of the name change; exceptions include Kampuchea/Cambodia (the name Cambodia was later reinstated), Iran/Persia (Iran is still today referred to as Persia in