That could be done for twenty-six different possibilities, but it couldn’t be done for ten thousand. What makes this possible today is just a matter of scaling up the whole system. There are maybe five thousand picturable common nouns in English, ten thousand if you include things like special kinds of plants and beetles which people would recognize with some frequency. What we did was train our system on 30 million images of these kinds of things. It’s a big, complicated, messy neural network. The details of the network probably don’t matter, but it takes about a quadrillion GPU operations to do the training. Our system is impressive because it pretty much matches what humans can do. It has about the same training data humans have: about the same number of images a human infant would see in the first couple of years of its life. Roughly the same number of operations have to be done in the learning process, using about the same number of neurons as in at least the first levels of our visual cortex. The details are different; the way these artificial neurons work has little to do with how the brain’s neurons work. But the concept is similar, and there’s a certain universality to what’s going on. At the mathematical level, it’s a composition of a very large number of functions, with certain continuity properties that let you use calculus methods to incrementally train the system. Given those attributes, you can end up with something that does the same job human brains do in physiological recognition.

But does this constitute AI? There are a few basic components. There’s physiological recognition, there’s voice-to-text, there’s language translation, all things humans manage to do with varying degrees of difficulty. These are essentially some of the links to how we make machines that are humanlike in what they do. For me, one of the interesting things has been incorporating those capabilities into a precise symbolic language to represent the everyday world. We now have a s