David Marr (neuroscientist)

David Courtenay Marr (19 January 1945 – 17 November 1980)[1] was a British neuroscientist and physiologist. Marr integrated results from psychology, artificial intelligence, and neurophysiology into new models of visual processing. His work was very influential in computational neuroscience and led to a resurgence of interest in the discipline.

David C. Marr
Born: 19 January 1945
Died: 17 November 1980 (aged 35)
Alma mater: Trinity College, Cambridge
Awards: IJCAI Computers and Thought Award
Scientific career
Fields: Computational neuroscience, artificial intelligence, psychology
Institutions: Massachusetts Institute of Technology
Thesis: A general theory for cerebral cortex (1972)
Doctoral advisor: Giles Brindley
Doctoral students: Shimon Ullman, Eric Grimson, John M. Hollerbach

Biography

Marr was born in Woodford, Essex, and educated at Rugby School. He was admitted to Trinity College, Cambridge, on 1 October 1963, having been awarded an Open Scholarship and the Lees Knowles Rugby Exhibition.

He was awarded the Coutts Trotter Scholarship in 1966 and obtained his BA in mathematics the same year. He was elected a Research Fellow of Trinity College, Cambridge, in 1968. His doctoral dissertation, supervised by Giles Brindley, was submitted in 1969 and described his model of the function of the cerebellum, based mainly on anatomical and physiological data drawn from a book by J. C. Eccles. His interest then turned from general brain theory to visual processing. He later worked at the Massachusetts Institute of Technology, where he took up a faculty appointment in the Department of Psychology in 1977 and was made a tenured full professor in 1980. Marr proposed that understanding the brain requires an understanding of the problems it faces and the solutions it finds. He emphasised the need to avoid general theoretical debates and instead focus on understanding specific problems.

Marr died of leukemia in Cambridge, Massachusetts, at the age of 35. His findings are collected in the book Vision: A Computational Investigation into the Human Representation and Processing of Visual Information, which was finished mainly in the summer of 1979, published in 1982 after his death, and re-issued in 2010 by The MIT Press. This book played a key role in the beginning and rapid growth of the field of computational neuroscience.[2] He was married to Lucia M. Vaina of Boston University's Department of Biomedical Engineering and Neurology.

Various academic awards and prizes are named in his honour. These include the Marr Prize, one of the most prestigious awards in computer vision; the David Marr Medal, awarded every two years by the Applied Vision Association in the UK;[3] and a Marr Prize awarded by the Cognitive Science Society for the best student paper at its annual conference.

Work

Theories of cerebellum, hippocampus, and neocortex

Marr is best known for his work on vision, but before he began work on that topic he published three seminal papers proposing computational theories of the cerebellum (in 1969), neocortex (in 1970), and hippocampus (in 1971). Each of those papers presented important new ideas that continue to influence modern theoretical thinking.

The cerebellum theory[4] was motivated by two unique features of cerebellar anatomy: (1) the cerebellum contains vast numbers of tiny granule cells, each receiving only a few inputs from "mossy fibres"; (2) Purkinje cells in the cerebellar cortex each receive tens of thousands of inputs from "parallel fibres", but only one input from a single "climbing fibre", which is, however, extremely strong. Marr proposed that the granule cells encode combinations of mossy fibre inputs, and that the climbing fibres carry a "teaching" signal that instructs their Purkinje cell targets to modify the strength of synaptic connections from parallel fibres.
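The scheme described above can be illustrated with a toy sketch. It is not Marr's model: the binary units, the pairwise wiring of granule cells, the all-or-none synapses, the firing rule, and the choice to strengthen (rather than weaken) synapses are all assumptions made for illustration, and the names `GRANULE_WIRING`, `granule_activity` and `PurkinjeCell` are hypothetical.

```python
from itertools import combinations

N_MOSSY = 6  # number of mossy fibre inputs (arbitrary, for illustration)

# Assumption: each granule cell samples exactly two mossy fibres,
# and there is one granule cell for every possible pair.
GRANULE_WIRING = list(combinations(range(N_MOSSY), 2))


def granule_activity(mossy):
    """A granule cell fires only if all of its mossy fibre inputs are active."""
    return [int(all(mossy[i] for i in inputs)) for inputs in GRANULE_WIRING]


class PurkinjeCell:
    """Toy Purkinje cell with one binary synapse per parallel fibre."""

    def __init__(self):
        self.weights = [0] * len(GRANULE_WIRING)

    def train(self, mossy, climbing_fibre_active):
        """Strengthen synapses from active parallel fibres, but only
        when the climbing fibre delivers its teaching signal."""
        if not climbing_fibre_active:
            return
        for j, active in enumerate(granule_activity(mossy)):
            if active:
                self.weights[j] = 1

    def responds(self, mossy):
        """Fire only if every active parallel fibre has a strengthened synapse."""
        activity = granule_activity(mossy)
        n_active = sum(activity)
        drive = sum(w * a for w, a in zip(self.weights, activity))
        return n_active > 0 and drive == n_active


cell = PurkinjeCell()
cell.train([1, 1, 1, 0, 0, 0], climbing_fibre_active=True)
print(cell.responds([1, 1, 1, 0, 0, 0]))  # True: the taught pattern
print(cell.responds([1, 1, 0, 1, 0, 0]))  # False: overlaps, but not taught
```

In this sketch, recoding the mossy fibre input as combinations is what lets the cell reject a pattern that shares two of its three active fibres with the taught one.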

The theory of neocortex[5] was primarily motivated by the discoveries of David Hubel and Torsten Wiesel, who found several types of "feature detectors" in the primary visual area of the cortex. Marr proposed, generalising on that observation, that cells in the neocortex are flexible categorizers—that is, they learn the statistical structure of their input patterns and become sensitive to combinations that are frequently repeated.

The theory of hippocampus[6] (which Marr called "archicortex") was motivated by the discovery by William Scoville and Brenda Milner that destruction of the hippocampus produced amnesia for memories of new or recent events but left intact memories of events that had occurred years earlier. Marr called his theory "simple memory": the basic idea was that the hippocampus could rapidly form memory traces of a simple type by strengthening connections between neurons. Remarkably, Marr's paper preceded by only two years a paper by Tim Bliss and Terje Lømo that provided the first clear report of long-term potentiation in the hippocampus, a type of synaptic plasticity very similar to what Marr hypothesized.[7] (Marr's paper contains a footnote mentioning a preliminary report of that discovery.[8]) The details of Marr's theory are no longer of great value because of errors in his understanding of hippocampal anatomy, but the basic concept of the hippocampus as a temporary memory system remains in a number of modern theories.[9] At the end of his paper Marr promised a follow-up paper on the relations between the hippocampus and neocortex, but no such paper ever appeared.

Levels of analysis

Marr treated vision as an information processing system. He put forth (in concert with Tomaso Poggio) the idea that one must understand information processing systems at three distinct, complementary levels of analysis.[10] This idea is known in cognitive science as Marr's Tri-Level Hypothesis:[11]

  • computational level: what does the system do (e.g., what problems does it solve or overcome) and, similarly, why does it do these things
  • algorithmic level (sometimes representational level): how does the system do what it does, specifically, what representations does it use and what processes does it employ to build and manipulate the representations
  • implementational/physical level: how is the system physically realised (in the case of biological vision, what neural structures and neuronal activities implement the visual system)

Marr illustrated his tripartite analysis with the example of a device whose functioning is well understood: a cash register.[12]

At the computational level, the functioning of the register can be accounted for in terms of arithmetic and, in particular, the theory of addition: what matters at this level is the computed function (addition) and its abstract properties, such as commutativity and associativity. The level of representation and algorithm specifies the form of the representations and the processes that operate on them: "we might choose Arabic numerals for the representations, and for the algorithm we could follow the usual rules about adding the least significant digits first and 'carrying' if the sum exceeds 9".[12] Finally, the level of implementation concerns how such representations and processes are physically realised; for example, the digits could be represented as positions on a metal wheel or, alternatively, as binary numbers coded by the electrical states of digital circuitry. Notably, Marr pointed out that the most important level for the design of effective systems is the computational one.[12]
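The algorithmic level of the cash register example can be made concrete with a short sketch. It represents numbers as strings of Arabic numerals and adds them column by column from the least significant digit, carrying when a column sum exceeds 9, as in the quoted description. The function name `add_decimal_digits` is hypothetical, and the sketch assumes non-negative integers.

```python
from itertools import zip_longest


def add_decimal_digits(a: str, b: str) -> str:
    """Add two non-negative integers written as Arabic-numeral strings."""
    result = []
    carry = 0
    # Representation: decimal digit strings.
    # Algorithm: start from the least significant digits and carry.
    for da, db in zip_longest(reversed(a), reversed(b), fillvalue="0"):
        total = int(da) + int(db) + carry
        result.append(str(total % 10))  # digit written in this column
        carry = total // 10             # 1 if the column sum exceeds 9
    if carry:
        result.append(str(carry))
    return "".join(reversed(result))


print(add_decimal_digits("478", "56"))  # 534
print(add_decimal_digits("56", "478"))  # 534 (addition is commutative)
```

The same computational-level function could be paired with a different representation (binary digits, say) and a different algorithm. Python's own integer arithmetic stands in here for the implementation level, which the sketch does not model.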

Stages of vision

Marr described vision as proceeding from a two-dimensional visual array (on the retina) to a three-dimensional description of the world as output. His stages of vision include:[10]

  • a primal sketch of the scene, based on feature extraction of fundamental components of the scene, including edges, regions, etc. Note the similarity in concept to a pencil sketch drawn quickly by an artist as an impression.
  • a 2.5D sketch of the scene, where textures are acknowledged, etc. Note the similarity in concept to the stage in drawing where an artist highlights or shades areas of a scene, to provide depth.
  • a 3D model, where the scene is visualised in a continuous, 3-dimensional map.
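The list above does not say how edges are extracted for the primal sketch. One approach, associated with the Marr and Hildreth paper "Theory of edge detection" listed under Publications, places edges at zero-crossings of a smoothed second derivative of image intensity. The following is a minimal one-dimensional sketch of that idea, not a reconstruction of the published algorithm: the 3-tap `[1, 2, 1] / 4` kernel is a crude stand-in for Gaussian smoothing, and the function names are hypothetical.

```python
def smooth(signal):
    """Blur with a 3-tap [1, 2, 1] / 4 kernel, repeating the end samples."""
    padded = [signal[0]] + list(signal) + [signal[-1]]
    return [(padded[i - 1] + 2 * padded[i] + padded[i + 1]) / 4
            for i in range(1, len(padded) - 1)]


def second_difference(signal):
    """Discrete second derivative at the interior samples 1 .. n-2."""
    return [signal[i - 1] - 2 * signal[i] + signal[i + 1]
            for i in range(1, len(signal) - 1)]


def zero_crossings(signal):
    """Return indices i such that an edge lies between samples i and i + 1."""
    d2 = second_difference(smooth(signal))
    # d2[k] belongs to sample k + 1. Only strict sign changes are reported;
    # a crossing that lands exactly on a zero value is missed by this sketch.
    return [k + 1 for k in range(len(d2) - 1) if d2[k] * d2[k + 1] < 0]


row = [10, 10, 10, 10, 50, 50, 50, 50]  # one row of intensities with a step
print(zero_crossings(row))  # [3]: the step lies between samples 3 and 4
```

A full primal sketch would work on two-dimensional images and combine such measurements into edge segments and regions; this sketch covers only the first step on a single row.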

The 2.5D sketch is related to stereopsis, optic flow, and motion parallax. It reflects the fact that we do not see all of our surroundings directly but construct a viewer-centred three-dimensional view of our environment. In data visualisation, 2.5D sketching is a so-called paraline drawing technique, often referred to by the generic terms "axonometric" or "isometric" drawing, and is often used by modern architects and designers.[13]

Marr's three-stage framework does not adequately capture a central stage of visual processing: visual attention. A more recent alternative framework proposes that vision is instead composed of the following three stages: encoding, selection, and decoding.[14] Encoding is to sample and represent visual inputs (e.g., to represent visual inputs as neural activities in the retina).[15] Selection, or attentional selection, is to select a tiny fraction of input information for further processing, e.g., by shifting gaze to an object or visual location to better process the visual signals at that location. Decoding is to infer or recognize the selected input signals, e.g., to recognize the object at the center of gaze as somebody's face.

Publications

  • (1969) "A theory of cerebellar cortex." J. Physiol., 202:437–470.
  • (1970) "A theory for cerebral neocortex." Proceedings of the Royal Society of London B, 176:161–234.
  • (1971) "Simple memory: a theory for archicortex." Phil. Trans. Royal Soc. London, 262:23–81.
  • (1974) "The computation of lightness by the primate retina." Vision Research, 14:1377–1388.
  • (1975) "Approaches to biological information processing." Science, 190:875–876.
  • (1976) "Early processing of visual information." Phil. Trans. R. Soc. Lond. B, 275:483–524.
  • (1976) "Cooperative computation of stereo disparity." Science, 194:283–287. (with Tomaso Poggio)
  • (March 1976) "Artificial intelligence: A personal view." Technical Report AIM 355, MIT AI Laboratory, Cambridge, MA.
  • (1977) "Artificial intelligence: A personal view." Artificial Intelligence 9(1), 37–48.
  • (1977) "From understanding computation to understanding neural circuitry." Neurosciences Res. Prog. Bull., 15:470–488. (with Tomaso Poggio)
  • (1978) "Representation and recognition of the spatial organization of three dimensional shapes." Proceedings of the Royal Society of London B, 200:269–294. (with H. K. Nishihara)
  • (1979) "A computational theory of human stereo vision." Proceedings of the Royal Society of London B, 204:301–328. (with Tomaso Poggio)
  • (1980) "Theory of edge detection." Proc. R. Soc. Lond. B, 207:187–217. (with E. Hildreth)
  • (1981) "Artificial intelligence: a personal view." In Haugeland, J., ed., Mind Design, chapter 4, pages 129–142. MIT Press, Cambridge, MA.
  • (1982) "Representation and recognition of the movements of shapes." Proceedings of the Royal Society of London B, 214:501–524. (with L. M. Vaina)
  • (1982) Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. San Francisco: W. H. Freeman and Company. ISBN 0-7167-1284-9. (In 2010, MIT Press re-published the book with a foreword by Shimon Ullman and an afterword by Tomaso Poggio under ISBN 9780262514620.)

References

  1. David Marr, from the International Encyclopaedia of Social and Behavioral Sciences, by Shimon Edelman and Lucia M. Vaina; published 2001-01-08; archived at Cornell University; retrieved 2021-07-21
  2. Marr, David (2010). "Afterword (by Tomaso Poggio)" (PDF). Vision. A Computational Investigation into the Human Representation and Processing of Visual Information. The MIT Press. p. 362. ISBN 978-0262514620. Though it may not be true that this book started the field known as computational neuroscience, it is certainly true that it had a key role in its beginning and rapid growth
  3. AVA - The David Marr Medal
  4. Marr D (June 1969). "A theory of cerebellar cortex". J. Physiol. 202 (2): 437–70. doi:10.1113/jphysiol.1969.sp008820. PMC 1351491. PMID 5784296.
  5. Marr D (November 1970). "A theory for cerebral neocortex". Proc. R. Soc. Lond. B Biol. Sci. 176 (43): 161–234. Bibcode:1970RSPSB.176..161M. doi:10.1098/rspb.1970.0040. PMID 4394740. S2CID 13248803.
  6. Marr D (July 1971). "Simple memory: a theory for archicortex". Philos. Trans. R. Soc. Lond. B Biol. Sci. 262 (841): 23–81. Bibcode:1971RSPTB.262...23M. doi:10.1098/rstb.1971.0078. PMID 4399412.
  7. Bliss TV, Lømo T (July 1973). "Long-lasting potentiation of synaptic transmission in the dentate area of the anaesthetized rabbit following stimulation of the perforant path". J. Physiol. 232 (2): 331–56. doi:10.1113/jphysiol.1973.sp010273. PMC 1350458. PMID 4727084.
  8. Bliss TV, Lømo T (April 1970). "Plasticity in a monosynaptic cortical pathway". J. Physiol. 207 (2): 51–89. doi:10.1113/jphysiol.1970.sp009101. PMID 5511138. S2CID 222195297.
  9. Willshaw DJ, Buckingham JT (August 1990). "An assessment of Marr's theory of the hippocampus as a temporary memory store". Philos. Trans. R. Soc. Lond. B Biol. Sci. 329 (1253): 205–15. Bibcode:1990RSPTB.329..205W. doi:10.1098/rstb.1990.0165. PMID 1978365.
  10. Marr, D.; Poggio, T. (1976). "From Understanding Computation to Understanding Neural Circuitry". A.I. Memos. Massachusetts Institute of Technology. hdl:1721.1/5782. AIM-357.
  11. Dawson, Michael. "Understanding Cognitive Science." Blackwell Publishing, 1998.
  12. Marr, D. (1982). Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. W. H. Freeman and Company.
  13. Uddin, Saleh (1997). "Conventions and Construction of Paralines". Axonometric and Oblique Drawing: A 3-D Construction, Rendering, and Design Guide. New York: McGraw-Hill. pp. 1–14. ISBN 0-07-065755-6.
  14. Zhaoping, Li (2014). Understanding Vision: Theory, Models, and Data. Oxford University Press.
  15. Zhaoping, Li (2014). "The efficient coding principle". Understanding Vision. Oxford University Press. pp. 67–176. doi:10.1093/acprof:oso/9780199564668.003.0003. ISBN 978-0-19-956466-8.

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.