Human Cognitive Capacity

Abstract

Articles by Landauer and Cherniak are examined to compare their estimates of the cognitive capacity of the human brain. Both articles draw on previous experiments on human memory (notably those by Standing), but are largely intuitive rather than empirical. Landauer infers that we must store around 10^9 bits of information, while Cherniak estimates that we must store around 10^6 “concepts” which may average 10^6 bits each.

1. Motivation

With the rise of computational models of the mind, questions must be answered about whether the human brain has the capacity to store the amount of information required by those models. What is sought are models that are both computationally and psychologically realistic as well as neurally and genetically realistic [Cherniak, p402]. Ignoring these questions has led to models which lack realism, particularly in their “inattention to scale” [Cherniak, p403]. Computational models of the mind (whether algorithmic/rule-based or connectionist) must be tested against the physical limitations of the human brain.

Cognitive capacity can be approached from three angles. Firstly, by discovering the organisation of the brain and hence determining how much information is actually stored. Secondly, by determining the theoretic capacity of computational models of the mind. Thirdly, by determining from human behaviour how much information is likely to be stored in the mind.

With regard to the first of these, the chief limit is the size of the brain’s neural network, particularly the number of synapses in the cortex. The commonly quoted estimate of this constraint is 10^13. It is unclear what to deduce from this figure, however, since the configuration of these synapses and the method by which information is stored and manipulated are uncharted ground.

The second approach is a necessary step to be taken by AI developers once they find some computational model which is robust enough to be scaled up to the size of human cognition [1].

This essay looks at the third approach as addressed in two recent papers: one by Thomas K. Landauer in 1986 and the other by Christopher Cherniak in 1988. Both attempt to base their estimates on experimental evidence, though much of their reasoning is by intuition.

The experiments they cite are mostly ones which test human memory capacity. Typical of these is Lionel Standing’s “Learning 10,000 Pictures”. In Section 2, I describe these experiments to indicate the sort of foundation on which Landauer and Cherniak build. Then, in Sections 3, 4 and 5, I summarise and compare the papers of Landauer and Cherniak.

2. Learning 10,000 Pictures

Standing conducted four experiments to extend the results of Nickerson and Shepard.

The basic stimuli for the experiments were a pool of 11,000 “Normal” photos (e.g. dogs, aeroplanes), a pool of 1,200 more “Vivid” photos (e.g. a dog with a pipe in its mouth, a plane crash), and a pool of more than 1,000 randomly selected, common English words.

In the first experiment, groups of subjects were shown a number of items from one of the pools. The number of items shown varied from 20 to 10,000. Two days later, the subjects were presented with pairs of stimuli (one from the previously presented set and the other previously unseen) and asked to identify which they had seen before.

By recording the number of errors the subjects made, and allowing for the fact that in picking one picture from two there is a 50% guess factor, Standing calculated the number of items which must have been accurately stored in memory. (See Table 1)

Size of           Training pool
training set      Vivid     Normal    Words
20                20        19.6      16
40                40        36.4      28.8
100               96        90        70
200               190       166       116
400               381       286       234
1000              880       770       615
4000              -         2490      -
10000             -         6600      -

Table 1. Estimated number of items accurately retained, by training pool
and size of training set (Standing's first experiment).
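
The correction for guessing is simple enough to sketch. The following minimal Python snippet shows my reading of the two-alternative forced-choice arithmetic (an illustration, not Standing's exact procedure); the 1,700-error example is hypothetical, chosen to reproduce the bottom row of Table 1:

    # Correcting recognition scores for the 50% guess factor (my reading of the method).
    # If a fraction p of items is genuinely stored, the rest are guessed correctly half
    # the time, so: proportion_correct = p + (1 - p) / 2, hence p = 2 * proportion_correct - 1,
    # which is equivalent to: items_retained = items_shown - 2 * errors.

    def items_retained(items_shown, errors):
        """Estimate the number of items actually stored, allowing for guessing."""
        return items_shown - 2 * errors

    # Hypothetical example: 10,000 Normal pictures shown and 1,700 errors made
    # would imply 6,600 items retained (cf. the last row of Table 1).
    print(items_retained(10000, 1700))   # -> 6600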

The two main implications of these results are:

  1. Memory of Vivid pictures is superior to memory of Normal pictures, which in turn is superior to memory of visually presented words.
  2. Memory of all three pools follows a power-law function of the size of the training set.

The second experiment was designed to test whether the superiority of picture memory was maintained when the recognition task was more difficult. A procedure similar to the first experiment was used, except that instead of subjects having to distinguish between pairs of items, they were presented with anywhere from 2 to 32 items and had to select the one which they had previously seen.

The results showed that the subjects’ error rate was constant regardless of how many test items were shown.

In the third experiment, the testing procedure was changed from being simply a recognition task to also include a recall task. Additionally, auditory words, visual nonsense syllables and music segments were added to the training set.

Even in the recall task, memory of pictures was convincingly superior to the memory of other stimuli.

The fourth experiment was intended to compare the memory retrieval times for different types of stimuli, and for different training set sizes. According to the results, retrieval time for words is somewhat less than for pictures. But the more interesting result is that (for both types of stimuli) as the training set increases in size, the retrieval time grows according to a relatively slowly increasing power-law [2].

3. Thomas K. Landauer

Landauer discusses four different estimates of how much we store in long term memory. All are concerned with how much storage space we actually use rather than the maximum capacity, and all end up in the vicinity of 10^9 bits [3].

The first three assume that during the hours in which we are awake, the input rate to long term memory is constant. The first two also assume that nothing is ever lost from long term memory. With these assumptions, the results of Standing’s experiments (along with others based on visual recognition, reading rates, and memory of dates) can be used to estimate input and loss rates.

Landauer’s final estimate is based on how much information a mature person uses in practice and thus avoids the issue of input and loss rates.

3.1 Reading Rates

In an experiment conducted by Landauer, 204 people read at their own speed for 1.5 minutes. One group was then shown a copy of the same text with words randomly deleted; another (control) group was shown a completely new text with words randomly deleted. Both groups were asked to fill in the missing words. The number of words which they could correctly supply gives an indication of how much of the original text was remembered.

By Landauer’s calculation the results imply that 1.2 bits of information are stored in long term memory for each second of reading time.

This figure seems low, but Landauer explains this with “the fact that little of the knowledge in what is read is new to the reader, and the certainty that only a fraction of what is new is remembered” [p481].

With this input rate, a 70-year-old would have accumulated 1.8 x 10^9 bits.
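
As a quick check on this arithmetic, here is a minimal sketch in Python, assuming roughly 16 waking hours a day (my assumption, not necessarily Landauer's exact figure):

    # Lifetime accumulation implied by the reading-rate estimate.
    input_rate = 1.2                        # bits stored per waking second
    waking_seconds = 70 * 365 * 16 * 3600   # seconds awake in 70 years (~1.5e9)
    print(input_rate * waking_seconds)      # ~1.8e9 bits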

3.2 Visual Recognition

Taking the data from Standing’s first experiment (see Table 1 above), Landauer tries to work backwards to determine the number of bits which must have been stored for each picture.

By his calculation, this varies from 8.7 to 14.6 bits per picture, depending on the size of the training set. After allowing for other experimental results which test memory retention over a longer period than Standing’s experiment, Landauer settles on 14 bits per picture. Allowing six seconds per picture, this amounts to 2.3 bits per second, or 3.4 x 10^9 bits in a lifetime.
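
The same waking-hours assumption as in the previous sketch reproduces the lifetime figure:

    # Lifetime accumulation implied by the visual-recognition estimate
    # (same 16-waking-hour assumption as the previous sketch).
    bits_per_picture = 14
    seconds_per_picture = 6
    input_rate = bits_per_picture / seconds_per_picture   # ~2.3 bits per second
    waking_seconds = 70 * 365 * 16 * 3600                 # ~1.5e9 seconds
    print(input_rate * waking_seconds)                    # ~3.4e9 bits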

A 14-bit code seems ludicrously low: it is only enough to differentiate 16,000 items! There is no experimental evidence of such a limit. Taking into account all the landscapes, faces, works of art, movies, television shows, advertisements and other picture-like memories we have, I would expect we differentiate 100 or 1,000 times more than that in practice.

By comparison, a standard computer bit-map representation of a picture requires around 260,000 bits. While it is unlikely that the brain stores pictures in as raw a form as a bit-map, we would at least need to store a code to identify each of the main objects in the picture along with their colour, and to somehow represent the relationships between objects (including proportions and spatial orientation).
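
For what it is worth, the 260,000-bit figure is consistent with, say, a 512 x 512 black-and-white bit-map; this is my illustration rather than Landauer's stated derivation:

    # One picture representation consistent with the ~260,000-bit figure:
    # a 512 x 512 bit-map at one bit per pixel (an assumed resolution, not Landauer's).
    print(512 * 512)   # 262,144 bits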

A small part of this discrepancy is accounted for by noting that Landauer fails to allow for the 50% guess factor. However, I think the whole method in this estimate must be flawed.

3.3 Loss Rate

It is unclear whether the assumption of no memory loss is a valid one. Certainly we “forget” things, but this may be caused by either imperfect retention or imperfect retrieval. Regardless of the mechanism, however, we need to qualify the above input rates with some sort of loss rate.

Landauer cites two experiments which provide data on memory loss.

Firstly, Nickerson performed studies similar to Standing’s, but retested his subjects a month and a year after exposure to the training set of pictures. Although the early rate of memory loss was high, after a month the rate was very low. It is this later rate which interests us because we want to know “the rate at which the average bit in memory is changed, and, of course, the vast majority of information known at any one point in time has been known for a long time” [p487].

Some mathemagic yields a loss rate of 6.5 x 10^-10 bits per bit per second. That is, if 1.5 x 10^9 bits were stored in long term memory, then on average, one bit would be corrupted every second.
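
A one-line check makes that interpretation concrete:

    # Interpreting the loss rate: with ~1.5e9 bits in long term memory and a loss
    # rate of 6.5e-10 bits per bit per second, about one bit is lost every second.
    print(1.5e9 * 6.5e-10)   # ~0.98 bits per second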

Secondly, Thompson observed how well college students remembered the dates of their daily activities over a three month period. The same mathemagic as before indicates a memory loss rate of 5 x 10^-8 b/b/s.

Now we can combine our rough input rate (around 2 b/s) with our rough loss rate (10^-9 b/b/s, halfway between the two estimates) to give a 70-year accumulation of 1.4 x 10^9 bits.
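
Landauer's own calculation is more careful, but a crude sketch of accumulation with loss, assuming input only while awake and exponential loss around the clock, lands in the same ballpark:

    import math

    # Crude accumulation-with-loss model (my sketch, not Landauer's calculation):
    # dM/dt = input - loss_rate * M, with the 2 b/s input averaged over a 24-hour
    # day (awake ~16 of every 24 hours) and loss applying at all times.
    input_rate = 2.0 * 16 / 24          # average bits per second over a full day
    loss_rate = 1e-9                    # bits lost per stored bit per second
    t = 70 * 365 * 24 * 3600            # 70 years in seconds
    M = (input_rate / loss_rate) * (1 - math.exp(-loss_rate * t))
    print(M)                            # ~1.2e9 bits, the same order as Landauer's 1.4e9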

3.4 How much information do we use?

For the final estimate, Landauer postulates how much space our English lexicon must take up and extrapolates from this to other memory domains.

A well-educated adult can identify about 10^5 words [4]. “Identifying” implies that they are aware of spelling, pronunciation, grammatical use, and definition. One can guess that 70 bits may be required for the first three, but what is involved in capturing the definition?

Landauer supposes that a word is defined by its place in a semantic network. Hence we need to estimate how many other words are connected to this word, what sort of links they are (e.g. “isa”, “very similar to”) and how much space each link takes. Twelve links per word, with each link taking 17 bits to identify its target and six bits to specify its type, gives a total of 346 bits per word, or 3.5 x 10^7 bits for the whole lexicon.

If our memory had 15 domains of this complexity, then the storage space we require in order to live a normal life comes to about 0.5 x 10^9 bits.
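
Putting Landauer's pieces together (17 bits suffices to name a target because 2^17 exceeds the 100,000-word lexicon):

    # Landauer's lexicon estimate, step by step.
    surface_bits = 70                     # spelling, pronunciation, grammatical use
    links_per_word = 12
    bits_per_link = 17 + 6                # 17 bits to name the target word, 6 for the link type
    bits_per_word = surface_bits + links_per_word * bits_per_link   # 346 bits
    lexicon_bits = bits_per_word * 100000                           # ~3.5e7 bits
    print(bits_per_word, lexicon_bits, 15 * lexicon_bits)           # 346, ~3.5e7, ~5.2e8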

4. Christopher Cherniak

The main point of Cherniak’s paper is to show that even if we could model the brain computationally, the model would be unverifiable because of its complexity. This essay is not concerned with that claim; we will examine only the first part of Cherniak’s paper, which deals with his estimate of the size of the “mind’s program”.

Cherniak first takes a top-down approach to derive the number of concepts typically required for normal life, and then a bottom-up approach to decide how much space is available for each concept.

4.1 The number of concepts

It is difficult to summarise the set of cognitive facilities described, since they seem rather arbitrary and vaguely defined. Maybe an annotated table will serve best:

Mental system              Concepts     Comment
Dictionary                    100,000   The words we know and the internal tokens in our
                                        “language of thought” (in Fodor’s sense). (I would
                                        double this figure, given Landauer’s estimate of a
                                        100,000-word vocabulary for the words alone.)
Encyclopedia                  100,000   Commonsense knowledge and basic factual information.
                                        Based on the size of actual encyclopedias.
Episodic memory             2,000,000   Details of personal history. Based on learning
                                        something new every minute but forgetting 90% of it.
Goal structure                100,000   (Goodness knows.)
Visual font                   200,000   The number of visual patterns which can be
                                        distinguished. (As I argued for Landauer’s visual
                                        estimate, I think this is too small, maybe by a
                                        factor of 10.)
Other sensory modalities      200,000   The number of non-visual sensory patterns which can
                                        be distinguished.
Motor schemata                200,000   Cognitive patterns for the control of movement.
Total                       2,900,000   Roughly 3 x 10^6 cognitive “concepts” in all.

Table 2. Cherniak's top-down inventory of cognitive "concepts".
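
The episodic-memory entry, at least, is easy to reconstruct; a rough sketch assuming 16 waking hours a day over 70 years (my assumptions, since Cherniak does not spell his out):

    # Rough reconstruction of the episodic-memory row of Table 2:
    # learn something new every waking minute, then forget 90% of it.
    waking_minutes = 70 * 365 * 16 * 60   # ~2.5e7 waking minutes in 70 years
    print(waking_minutes * 0.10)          # ~2.5e6, the region of Cherniak's 2,000,000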

4.2 The average size of a concept

Given that the cortex contains about 5 x 10^13 synapses, how much space could the average “concept” use?

Clearly not every synapse will be directly involved in storing concepts: some will be needed for “operating system” facilities and some to provide redundancy in case of damage. But suppose 10^13 were available for concept storage. Then each cognitive concept could claim over one million synapses.

Cherniak is quite happy to equate a synapse with a bit [5], and therefore to allocate 1 Mbit to each concept.
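
The arithmetic behind the allocation is a single division:

    # Cherniak's bottom-up allocation of synapses to concepts.
    synapses_for_storage = 1e13               # out of ~5e13 cortical synapses
    concepts = 3e6                            # total from Table 2
    print(synapses_for_storage / concepts)    # ~3.3e6 synapses, i.e. over a million per concept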

5. Conclusion

Summarising the estimates of Landauer and Cherniak, we have:

Researcher     Category                                Estimate
Landauer       Reading rate                            1.8 x 10^9 bits
               Visual recognition                      3.4 x 10^9 bits
               Allowing for a 10^-9 b/b/s loss rate    1.4 x 10^9 bits
               Words in a semantic network             0.5 x 10^9 bits
Cherniak       Number of cognitive concepts            3 x 10^6 concepts
               Size of each concept                    1 x 10^6 bits

Table 3. Summary of the estimates.
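
Multiplying Cherniak’s two figures gives an implied total of about 3 x 10^12 bits, three orders of magnitude above Landauer’s estimates, though Cherniak’s figure is space available per concept rather than space actually used. The comparison below is my own arithmetic, not one either author makes:

    # Implied totals behind Table 3 (my arithmetic, not the authors' own comparison).
    landauer_total = 1.4e9                    # bits actually used (loss-adjusted estimate)
    cherniak_total = 3e6 * 1e6                # concepts x bits available per concept = 3e12
    print(cherniak_total / landauer_total)    # ~2000-fold gap between the two pictures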

It is unfortunate that Cherniak doesn’t appear to have read Landauer’s paper: a combination of their thinking might have been interesting. Landauer’s paper stands on much more empirical and well-reasoned ground than Cherniak’s. Cherniak falls back too quickly on the number of synapses in the cortex without developing a solid model of cognitive requirements.

Though some of Landauer’s estimates seem to short-change human talent (especially his analysis of visual memory), most of his reasoning is sound. However, I’m left with the suspicion that he has worked hard to gather data which will lead to the magic 10^9.

The large gap left by both authors is the space required for the internal workings of the brain: memory insertion and retrieval strategies, problem-solving techniques, mechanisms for creativity, analogical and hypothetical reasoning, allowance for redundancy, and so on. These issues are central to an understanding of the mind’s structure, and without an understanding of that structure, any estimate of capacity will necessarily be vague.

Bibliography

Cherniak, C.; “Undebuggability and cognitive science”; Communications of the ACM, v31 #4, 1988.

Landauer, T. K.; “How much do people remember? Some estimates of the quantity of learned information in long-term memory”; Cognitive Science, v10, 1986.

Standing, L.; “Learning 10,000 pictures”; Quarterly Journal of Experimental Psychology, v25, 1973.


Footnotes

[1] One attempt at this has been made in “Long term memory storage capacity of multiconnected neural networks” by P. Peretto and J.J. Niez (Biological Cybernetics, v54, 1986). In this article they find that the number of bits stored in a neural network is proportional to the number of synapses, regardless of the network’s topology. Interestingly, the constant of proportionality decreases as the synaptic order (a measure of the network’s interconnectedness) increases.

[2] This should be a useful result to computational modellers. If the retrieval time had turned out to be linearly related to the size of the training set, we could deduce that the brain used a sequential search algorithm. But what sort of search algorithm yields a power-law access time?

[3] Throughout this essay, “bits” is used in the technical sense of binary digits.

[4] This seems a reasonable estimate, though when you consider that it requires learning at a rate of about 11 words a day for 25 years, it is quite amazing.

[5] I presume this is based on the thought that a synapse is either active or inactive at any one time. But if a synapse also involves a weight and can be either excitatory or inhibitory, then more than a bit is involved. Landauer thinks that a synapse could hold from two to ten bits [p492].