a particular n-gram in year X as shown in the plots is the mean of the raw frequency value for the n-gram in the year X, the year X-1, and the year X+1. Note that for each n-gram in the corpus, we can provide three measures as a function of year of publication: 1- the number of times it appeared 2- the number of pages where it appeared 3- the number of books where it appeared. Throughout the paper, we make use only of the first measure; but the two others remain available. They are generally all in agreement, but can denote distinct cultural effects. These distinctions are not explored in this paper. For example, we give in Appendix measures for the frequency of the word ‘evolution’. In the first three columns, we give the number of times it appeared, the normalized number of times it appeared (relative to #words that year), the normalized number of pages it appeared in, and the normalized number of books it appeared in, as a function of the date. III.1B. Multiple Query/Cohort Timelines Where indicated, timeline plots may reflect the aggregates of multiple query results, such as a cohort of individuals or inventions. In these cases, the raw data for each query we used to associate each year with a set of frequencies. The plot was generated by choosing a measure of central tendency to characterize the set of frequencies (either mean or median) and associating the resulting value with the corresponding year. Such methods can be confounded by the vast frequency differences among the various constituent queries. For instance, the mean will tend to be dominated by the most frequent queries, which might be several orders of magnitude more frequent than the least frequent queries. If the absolute frequency of the various query results is not of interest, but only their relative change over time, then individual query results may be normalized so that they yield a total of 1. This results in a probability mass function for each query describing the likelihood that a rand