Syllabus for Phil 3395 Introduction to Cognitive Science 1998
Jim Garson x 3208 Office Hours: MWF 11-12 e-mail:garson@menudo.uh.edu
Texts: M Mind, Paul Thagard
CS Cognitive Science, Stillings et al.
Packet #56 at the UC Copy Center or the Bookstore
Week 1 Jan 21-26 What is Cognitive Science?
M Ch. 1 CS Ch. 1
Top-Down: Classical Cognitive Science
Week 2 Jan 28-Feb 2 Logic
M pp. 23-34 of Ch 2.
Week 3 Feb 4-9 Reasoning
M pp. 34-41 (rest of Ch. 2) CS pp. 168-174
Week 4 Feb 11-16 Rules
M Ch 3. CS pp. 139-151 (skim if you like) pp. 155-159, 164-168
Week 5 Feb 18-23 Concepts and Learning
M Ch. 4. CS pp. 159-164, 192-213
Week 6 Feb 25-27 Review and QUIZ Feb 27
Bottom-Up: Neurally Inspired Cognitive Science
Week 7 Mar 2-6 Neuroscience
CS Ch. 7 (skim 270-275; 282-289; 291-298; 321-325)
Weeks 8-9 Mar 9-23 Vision
M Ch. 6. CS Ch. 12 (skim 467-479; 487-490; 506-512)
Weeks 9-10 Mar 25-Apr 3 Connectionism
M Ch. 7. CS pp. 63-83; 324-325; 92-93; 114-116; 121-124
Week 11 Apr 6 - Apr 10 Review and QUIZ Apr 10
In the Middle: Cognitive Psychology, Linguistics and Philosophy
Week 12 Apr 13-17 Cognitive Psychology
CS Ch. 2.1-2.3 M Ch. 6. CS 2.7
Week 13 Apr 20-24 Linguistics
CS Ch 6.1, 6.2, 6.3
Weeks 14-15 Apr 27-May 1 Philosophical Issues
M Ch. 9 CS Ch. 8.3 Qualia
May 4 Review
Final or Project MAY 8
Quizzes 30% each. Final/Project 40%
Cognitive Science Notes Week 1
What is Cognitive Science?
A. Course Mechanics
- Introductions
- Books, There are 2:
a. Stillings et al. will be used for many reading selections: Chs. 1-4,
6.1-6.3, 7, 8.3, 12
b. Thagard is our text. We will cover all of it except (perhaps) Ch. 5
- Your Duties: Read week's assignment before first class. Be prepared
to answer oral questions about its content from me.
- Evaluation: 2 Quizzes and a Final/Project. The project may be a paper,
a website or a computer program, or you may opt to take a final exam. If
you choose to do a project, a proposal of 2 pages explaining your plans
must be provided to me on April 1. If I do not receive it on time
you will take the final. All quizzes 60%, Final/Project 40%. These notes
contain an annotated list of resources: Books, Websites etc. to help you
with projects. For more details see me.
B. Introduction: What is Cognitive Science?
- Cognition = Thinking
(So Cognitive Science = the Science of Thinking or the Science of the
Mind)
- So it is just Psychology?
True there is Cognitive Psychology.
This brand of psychology is a new movement that has to some extent displaced
the Behaviorist School.
- But there are also essential contributions from other fields:
Linguistics, Computer Science, Neuroscience, Anthropology, and Philosophy.
- So any course in Cog Sci is multidisciplinary.
a. Problem: Since each professor is schooled in a given discipline, each
teacher will have gaps and prejudices. Here is some account of mine. Strong
areas: Philosophy, (Logic), Linguistics, and Computer Science. Weak ones:
Neuroscience, Psychology and Anthropology
b. We will have a few visitors to help fill some of my gaps.
- In each of the disciplines the contributions are only from a subgroup:
a. Philosophy: Philosophy of Mind, Logic
b. Psychology: Cognitive Psychology
c. Computer Science: A wing of Artificial Intelligence
d. Linguistics: Computational Linguistics
e. Neuroscience: Functional neuroscience
f. Anthropology: Cultural or Cognitive Anthropology
- How to conceive of this interdisciplinary overlap: All are about the
same topic, but they differ markedly in their methods. (See the discussion
in CS section 1.6.)
a. Philosophy: More abstract issues about the nature of thinking.
- What are the principles of correct thinking?
- How can a brain (or a machine) represent the world?
- The Mind Body Problem. What is a Mind and how is it related to the
Body?
- Subjective Experience. Are subjective experiences just activities of
the Brain?
b. Cognitive Psychology: Controlled experiments with human subjects
(and sometimes smaller animals) to try to characterize the nature of human
thinking and to figure out and test theories of the mechanisms that might
produce those abilities.
- Perception: How is visual information processed into a picture of the
world?
- Memory: What different memory systems are there and how do they work?
- Reasoning: How do humans reason? What explains their success and failure?
- Concepts: What concepts do humans use and where do they come from?
- Emotions: How do emotions affect and/or contribute to thinking?
- Planning: How do humans formulate plans and monitor their execution?
- Movement: How can we explain human abilities at graceful motion?
c. Artificial Intelligence: Create computer models that actually simulate
human cognition.
- Many AI researchers do not care to figure out how humans actually do
it. Often their goal may be to create a system to do some task better than
humans do by doing it a better way: (medical diagnosis).
- Creating models can help reveal how hard the problems are that the
human being takes for granted. For example robotics has a long way to go
to understand and duplicate the seemingly simple feat of walking.
d. Computational Linguistics: Formulate the structures found in language
as a set of rules and try to see how such rules might be carried out in
machines or the brain.
e. Functional Neuroscience: Study the brain using PET, MRI, probes, and
the effects of brain damage etc., to try to see how various parts of the
brain contribute to human cognitive abilities.
f. Cognitive Anthropology: Cultural anthropology studies of ways of life,
including how humans in different cultures perceive, reason and act. By
examining how human cognition differs in different cultures, clues to the
nature of human thinking can be developed. For example, there are universals
in naming systems for colors: Black, White, Red, Blue/Green, etc.
C. Applications of Cognitive Science
- What could be more exciting than to cross the last major intellectual
frontier?
- Understand reading so as to treat reading problems.
- Speech therapy for stroke victims.
- Understand memory so as to predict reliability of legal witnesses
(Witness blending: At what speed did the car smash into the truck? vs.
At what speed did the car run into the truck?)
- Imagery for training athletes
- Expert systems and robotics
D. Roots of Cognitive Science
- History
a. Cog Sci (like sciences in general) has its roots in philosophy, especially
the philosophy of nature. All other sciences related to cog sci branched
out of philosophy:
Philosophy: 300 BC
Neurology: mid 1800s
Psychology: 1860s
Anthropology: 1920s
Linguistics: 1920s
Computer Science: 1960
(PhD is doctor of philosophy. In Library of Congress book codes, B is philosophy,
BF is psychology.)
b. So at the roots we can expect to see philosophers
and a key philosophical issue: Is the Mind appropriate for study by science?
This question is related to the fundamental struggle between dualists and
naturalists.
- Dualism: Mind and Body are distinct.
Plato: Reasoning is something distinct and higher than the physical. (Man
is a rational animal.) Thought is communication with the world of eternal
and perfect Forms.
Descartes: (and the Medieval Christian tradition). The Mind (Soul) is radically
distinct from (and higher than) the body. It is eternal and therefore non-physical.
It interacts with the physical world, but is essentially outside what physical
science can explain.
Today: The general public and such philosophers as David Chalmers and Thomas
Nagel believe in remnants of this view.
- Naturalism: There is nothing beyond the physical world so a Mind must
be some feature or aspect of the Body
Some historical sources:
Aristotle: There is no world of forms, though each thing has a built-in
essence that helps us understand what it is and what it strives for. Formulation
of the rules of logic.
La Mettrie: Man is a machine. Human thinking (like all else) is subject
to the laws of nature.
Hobbes: Thought is processing of non-numerical symbols.
Leibniz: Reasoning is like numerical calculation (Ars Combinatoria).
Wundt: Birth of Experimental Psychology (though he was a philosophical
dualist)
Watson, Skinner: Behaviorist Psychology. There is nothing to explain except
human behavior; the Mind/Soul is a myth.
Classical Cognitive Science: The Mind is not a myth. We can theorize about
internal (or "subjective") states in scientifically acceptable
ways. Thought is akin to the information processing carried out by symbolic
computers, and this processing can be the subject of science.
Connectionism: A better theory of how the mind works can be forged by taking
our cue from neurology. Forms of parallel computation (parallel distributed
processing, neural nets) that are essentially non-symbolic are better suited
to explaining how thinking works.
E. A Controversy: Top-Down vs. Bottom-Up
- Top-Down: The classical paradigm: CRUM (Computational-Representational
Understanding of Mind), GOFAI (Good Old-Fashioned Artificial Intelligence)
a. Characterize what the mind does using logic and computer science. Once
these abilities are simulated on a computer you can then understand the
general principles that underlie human intelligence.
b. Thinking = processing of representations.
c. Some analogies:
- Data Structures + Algorithms = Programs
- Ingredients + Instructions = Meal
- Mental Representations + Computational Procedures = Thinking
- Mind : Brain :: Software : Hardware
d. Understanding cognition does not require detailed information about
how the brain actually implements information processing.
The instruction "cream the butter" can be carried out in various ways. It doesn't matter
(much) to the end product exactly how you do it.
e. After all, you can do software science without knowing anything about
hardware.
f. The project: Define the task. Theorize about what are likely
representations and procedures to get the job done. Model the theory
as a data structure and algorithm. Program actual computer code,
and run it on a computer platform to test the ideas.
g. When the project is done we will understand the Mind as the result of
computation in the brain.
- Bottom-Up: Connectionism
a. We need to look more carefully at the actual neurology of the brain
to understand what it does and how it does it.
b. The symbolic computer is a very bad model of what the brain does and
how it does it.
c. A search for alternative mechanisms guided by what we find out about
neural structure will be more fruitful in explaining thought.
Reading
Von Eckardt, Barbara What is Cognitive Science? A good discussion
of the nature of the discipline at the turn of the 90s. Discusses worries
that cognitive science may not count as a cohesive science.
Cognitive Science Notes Weeks 2-3 (M Ch 2, CS pp. 168-173): Logic and
Reasoning
I. Logic
A. The Attraction of Logic to Cognitive Science
- Logic is a formal theory about how reasoning works.
- Thinking appears to depend centrally on reasoning ability.
- We now know how to program computers to process the rules of logic,
so this might be an excellent model for understanding (human) intelligence.
B. History
- Aristotle is the father of logic: His work on the syllogisms was refined
by logicians of the middle ages.
a. At root is the idea of a syllogism - a pattern of reasoning:
All men are mortal               All A are B
Socrates is a man                S is A
Therefore: Socrates is mortal    So: S is B
b. Fundamental idea: good reasoning depends on using the right form. The
content does not matter to whether the reasoning is correct. Any example
of reasoning that matches the right-hand pattern above is good reasoning.
c. Medieval logic taught which were correct syllogisms with a poem.
d. But it was recognized that this system had its limitations. For example
there seemed to be no way to handle the following reasoning with the Aristotelian
approach:
All horses are mammals
So every tail of a horse is a tail of a mammal
The fundamental problem was that the system couldn't handle relations:
is a tail of.
- The work of de Morgan, Frege and Russell led to the development of
predicate logic, which has much stronger expressive powers, and forms the
core of the modern approach to formal logic.
C. Predicate Logic
- The Language of Predicate Logic
a. Connectives (Symbols of Propositional Logic)
&  .  ^    And
v          Or
~  -       Not
->         If .. then
<->        If and only if
b. Symbols from Predicate Logic
a b c    Names                  R(a)             Albert is rowdy
P Q R    Predicate (Letters)    L(a,b)           Albert loves Betty
f g h    Function (Letters)     R(f(a))          Albert's father is rowdy
x y z    Variables              UxR(x)           Everything is rowdy
=        Identity               Ux(R(x)->x=a)    Only Albert is rowdy
U E      Quantifiers            UxEyL(x,y)       Everything loves something
The conventions of computer science are more long-winded but more legible:
(for-all x)(Man(x) -> Mortal(x))
Man(Socrates)
Mortal(Socrates)
- Some Rules for Predicate Logic
UxA(x) / A(c)        Universal Instantiation (Specification) (UOut)
A, B / A&B           Conjunction (&In)
A, A->B / B          Modus Ponens (->Out)
A->B, ~B / ~A        Modus Tollens
- Proofs in Predicate Logic
One may use proofs to verify that a pattern of reasoning is correct (valid).
A proof is a sequence of steps that result from applying the rules to previous
members of the sequence to obtain the conclusion from the given formulas.
For example, here is a proof for the Socrates is mortal argument:
1. (for-all x)(Man(x) -> Mortal(x))     Given
2. Man(Socrates)                         Given
3. Man(Socrates) -> Mortal(Socrates)     1 Universal Instantiation
4. Mortal(Socrates)                      2 3 Modus Ponens
- The power of Predicate Logic
a. By adding to these rules one can formulate a consistent (sound) and
complete system for predicate logic. This means that every correct reasoning
pattern can be proven by the rules (completeness), and no incorrect arguments
can be proven (consistency).
b. The process of proof finding can be mechanized with computer programs
called theorem provers. Most theorem provers have the feature that when
a reasoning pattern is correct, the theorem prover is guaranteed
to find the proof (sooner or later). So all correct patterns of reasoning
can be identified mechanically, simply by applying symbolic rules to symbolic
representations of arguments. Good reasoning can be computerized!
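To make the idea of mechanical proof concrete, here is a minimal sketch in Python (not part of the course texts) that reproduces the Socrates proof by applying Universal Instantiation and Modus Ponens until the goal turns up. It is only an illustration under strong assumptions: predicates take one argument, every rule has the form (for-all x)(P(x) -> Q(x)), and the names facts, rules and prove are invented for this example. It is nowhere near a complete theorem prover.

```python
# Facts are (predicate, name) pairs. Each rule (P, Q) stands for the
# universally quantified conditional (for-all x)(P(x) -> Q(x)).
facts = {("Man", "Socrates")}
rules = [("Man", "Mortal")]


def prove(goal, facts, rules):
    """Apply the rules forward until the goal appears or nothing new is derived."""
    known = set(facts)
    steps = []
    changed = True
    while changed and goal not in known:
        changed = False
        for premise, conclusion in rules:
            for pred, name in list(known):
                derived = (conclusion, name)
                if pred == premise and derived not in known:
                    # Universal Instantiation gives premise(name) -> conclusion(name);
                    # Modus Ponens then yields conclusion(name).
                    steps.append(f"{premise}({name}) -> {conclusion}({name})  Universal Instantiation")
                    steps.append(f"{conclusion}({name})  Modus Ponens")
                    known.add(derived)
                    changed = True
    return goal in known, steps


proved, steps = prove(("Mortal", "Socrates"), facts, rules)
for line in steps:
    print(line)
```

The two printed steps correspond to lines 3 and 4 of the proof above. Asking for something that does not follow, such as Mortal(Plato), simply returns False once no rule adds anything new; that tidy behavior depends on the tiny language used here and does not carry over to full predicate logic.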
D. Problem Solving, Logic Programming and PROLOG
- It may be hard to imagine how a theorem prover might be used to solve
problems. Isn't problem solving essentially creative? There is a branch
of artificial intelligence (AI) called logic programming that uses predicate
logic and its rules as the basis for a programming language called PROLOG.
This school of AI believes that intelligent problem solving (and so all
computer programming) can be based simply on predicate logic.
- To make the idea more plausible, we will work through an example. (For
simplicity I will not use PROLOG notation, but a variant of the notation
of Predicate Logic.)
- First some basic ideas. It can be shown that any formula of predicate
logic can be written equivalently in the form of a set of sentences with
the form:
A&..&B -> C v .. v D (that is, if A and .. and B, then C or .. or D)
PROLOG simplifies things by only allowing sentences of the form:
A&..&B -> C
You can think of this as a rule saying that if you can prove A, .., B then
you will have C. Or to put it another way, to prove C, you should prove
A, .., B. Note also that
-> C
says that C is proven. It can also be shown that
B ->
(where the right hand side of the -> is empty) amounts to saying that
not B is proven. So in PROLOG notation you express 'not' by putting
a sentence on the other side of the arrow. Finally, in PROLOG, you leave
off universal quantifiers: Ux, as they are understood.
- PROLOG is based on the resolution rule. Here are two examples
of the rule in operation:
A(x) -> C(x)
C(a) ->
So A(a) ->

A(x)&B(x) -> C(x)
-> A(a)
So B(a) -> C(a)
The idea is that two matching sentences (C(x) and C(a) in the first
example) on different sides of the -> can be cancelled, in which case
the variable x takes on the value it would get in the match in the result.
(So in the first example, we cancel the C(x) in the first line, and set
x to a, obtaining A(a) ->.) It can be shown that the resolution rule
is the only rule needed for a correct system of predicate logic!
- A simple problem solving example. We have the following paths:

This data can be expressed in PROLOG in the following set of sentences,
where 'C' abbreviates 'connects to', and where extra parentheses are omitted
for legibility:
1. -> Caf
2. -> Cab
3. -> Cbe
4. -> Cbc
5. -> Ccd
6. Cxy&Cyz -> Cxz
Note the last trivial claim about connectedness: if x is connected to y
and y to z, then x is connected to z.
a. Suppose we want to solve the problem of how to get from a to d. It turns
out that Cad can be proven in predicate logic from 1-6, and the proof provides
instructions for getting from a to d.
b. Let us see how the resolution rule turns up the proof.
c. Begin by adding NOT Cad to sentences 1-6. Our strategy is to prove Cad
by showing that NOT Cad leads to a contradiction.
7. Cad ->
8. Cay&Cyd ->       6 7 Resolution
9. Cbd ->           2 8 Resolution
10. Cby&Cyd ->      6 9 Resolution
11. Ccd ->          4 10 Resolution
12. ->              5 11 Resolution
Line 12 is an arrow with nothing on the right or the left. This indicates
a contradiction in PROLOG. So line 7 has led to the contradiction on line
12 and we know Cad follows from 1-6. Note also that the data used in the
proof: 2. -> Cab, 4. -> Cbc, and 5. -> Ccd amounts to a solution
for the problem. In proving there is a connection between a and d, PROLOG
has found a pathway from a to d, and so solved the problem.
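The same problem can be sketched in a few lines of Python. This is not PROLOG and not a resolution prover; it is a simple depth-first search that uses sentence 6 the way the proof does: to show that x connects to z, find a fact saying x connects to some y, then show that y connects to z. The names FACTS and connect are assumptions made for this illustration.

```python
# The five path facts from sentences 1-5: (x, y) means "x connects to y".
FACTS = [("a", "f"), ("a", "b"), ("b", "e"), ("b", "c"), ("c", "d")]


def connect(start, goal, visited=None):
    """Return the list of facts linking start to goal, or None if there is none.

    Mirrors sentence 6 (Cxy & Cyz -> Cxz): to show C(start, goal), pick a
    fact C(start, y) and then show C(y, goal).
    """
    if visited is None:
        visited = {start}
    for x, y in FACTS:
        if x != start or y in visited:
            continue
        if y == goal:  # a single fact finishes the job
            return [(x, y)]
        rest = connect(y, goal, visited | {y})
        if rest is not None:  # chain the two connections together
            return [(x, y)] + rest
    return None


path = connect("a", "d")
print(path)  # [('a', 'b'), ('b', 'c'), ('c', 'd')]
```

The search first tries the dead end a-f and backs up, then finds the facts Cab, Cbc and Ccd. As in the PROLOG proof, the facts used to establish the connection are themselves the route from a to d.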
II. Strengths and Weaknesses of Logic for Cognitive Science
A. Good News: Predicate Logic is sound and complete. A completely rigorous
and correct system for predicate logic can be computerized so that any
correct pattern of reasoning in the language can be discovered by a computer.
B. Good News: Turing's Thesis: There is excellent evidence that any process
that can be expressed with a finite set of rules can be processed by a digital
computer that operates on representations in the language of predicate
logic. So it would seem that any coherent reasoning is always something
that is captured by logic programming.
C. Bad News: A fully correct mechanism for logic problem solving may spend
way too much time solving problems. (There is strong evidence that logic
problem solving requires exponential processing times in a worst case.)
For this reason PROLOG has had to sacrifice correctness to obtain efficiency.
D. Bad News: Turing's Theorem: Predicate Logic has no decision procedure.
Although good reasoning can always be discovered (eventually) by a logic
problem solver, there is no guarantee that bad reasoning can be identified
as such in a finite amount of time.
E. Bad News: Godel's Theorem: There is no correct finite system of rules
for mathematics.
F. Bad News: Many sentences of English resist representation in the language
of predicate logic: John believes that all men are mortal on Sunday but
not Monday.
G. Good News: Systems to handle belief, time and other so-called intensional
concepts have been developed, and are being adopted by AI researchers.
However, many of these systems are controversial, and there are few standards
as there are with predicate logic and PROLOG.
H. Bad News: Predicate Logic doesn't let you take it back. If I say that
all tigers are striped in the data, then asserting that Tigger is an albino
tiger causes a contradiction. (I assume the data includes a rule that says
albino means not striped.) From a contradiction anything follows
in predicate logic. The problem is that predicate logic uses monotonic
reasoning, which means that the more information you have the more you
can prove from it. What is needed is a way to add data that removes previous
lines of reasoning.
I. Good News: Non-monotonic logics have been developed that are modifications
of predicate logic. In these systems you can say 'All birds fly', and then
assert 'Penguins don't fly' without causing contradictions. Arrangements
are made so that the data 'Penguins don't fly' automatically creates an
exception to the rule: 'All birds fly'.
J. Bad News: Predicate Logic doesn't (conveniently) let you handle matters
of degree. If I write 'Tall(John)' then I have said that John really is
tall. There is no way to say he is sort of tall, or somewhat tall. Similarly
you can't (easily) say that the probability that John is tall is 90%.
K. Good News: Many-Valued Logics, Logics of Probability, and Fuzzy Logics
allow expression of matters of degree. Fuzzy logics have been found to
be quite useful in AI especially in controlling machines.
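A small sketch may help show what "matters of degree" looks like in a fuzzy logic. Instead of Tall(John) being simply true or false, it gets a value between 0 and 1. The cut-off heights (160 cm and 190 cm) and the function names below are illustrative assumptions, not taken from the readings. The min, max and 1 - a connectives are the standard choices in basic fuzzy logic.

```python
def tall(height_cm, low=160.0, high=190.0):
    """Degree (0.0 to 1.0) to which a height counts as 'tall'.

    The 160 cm and 190 cm cut-offs are assumptions chosen for illustration.
    """
    if height_cm <= low:
        return 0.0
    if height_cm >= high:
        return 1.0
    return (height_cm - low) / (high - low)


def fuzzy_and(a, b):
    return min(a, b)


def fuzzy_or(a, b):
    return max(a, b)


def fuzzy_not(a):
    return 1.0 - a


john = tall(175)             # 0.5: John is "sort of tall"
print(john, fuzzy_not(john))
print(fuzzy_and(john, 0.8))  # 0.5
```

Note that a degree of tallness is not the same thing as a probability that John is tall; the probabilistic logics mentioned above handle the second idea.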
III. The Psychological Plausibility of Logic
A. Probably the most damaging complaint about the use of logic as a foundation
for cognitive science is that it is a poor model of human reasoning. Logic
may be a fine standard for good reasoning but is a bad picture of
what human beings actually do when they reason. There are two kinds of
evidence for this claim.
Further Reading for Weeks 2-3
Predicate Logic H. Pospesel A good introduction to Predicate Logic.
(If you don't already know propositional logic, read his Propositional
Logic.)
Logic for Problem Solving R. Kowalski The classic text on logic
programming.
Minimal Rationality C. Cherniak Argues that logic is too expensive
for humans to use.
Cognitive Science Notes Week 4: Rules
M Ch 3, CS 4.1 (AI), 155-159 (Semantic Nets), 164-168 (Rule Based Representations)
I. Introduction: Rules and Problem Solving in AI
A. Logic programming is too limiting. Why model the human mind as predicate
logic data plus a logic "engine" when there are so many other
perfectly good programming techniques the mind might use?
B. Take a more general approach. A mind is a data structure along with
a set of rules (a program) that operates on that data.
C. The basic idea of rule representation was implicitly in PROLOG: have
rules of form: If A&..&B then C. But instead of having these as
data about the world, think of these as cognitive processes that intelligent
agents use: the rules of cognition. Also, why restrict yourself to rules
of the if..then form? Any representations of procedures programmable in
a computer language can be used.
D. Problem solving is taken to be the application of rules (programs) to
explore a search space, i.e. a space of all the possible actions that might
lead to a solution.
(Example: What is the search space for a combination lock?)
E. Note on this week's topic: It is "Rules". This can be narrowly
construed so that we would be studying just rule-based representations
and so-called production systems. Or we can construe this more broadly
to include any system that uses programs (sets of rules) to operate on
symbolic representations.
II. Some Historical Roots
A. The Logic Theorist (Newell and Simon)
Newell and Simon interviewed subjects to determine the strategies they used to
solve problems. (Protocol analysis.) In the Logic Theorist the task was
solving proof finding problems in logic. The program was built around strategic
methods found in their protocols.
B. General Problem Solver (GPS) (Newell and Simon)
Newell and Simon felt that there are basic features common to all problem
solving: Define goal, resources (initial state), and a "search space"
with a measure of "distance" from resources to goal. Fundamental
strategy: Find subgoals (with shorter distances from initial state) that
contribute to the ultimate goal. Now apply the method over again to each of
the subgoals.
C. CHECKERS (Samuel) One of the very first programs able to learn a task.
It plays championship grade checkers and learned to do so by changing how
the board is evaluated on the basis of past successes and failures.
D. SHRDLU (Winograd)
The program converses with the user about a block world. It can describe
the world and carry out commands.
E. Expert Systems MYCIN (Shortliffe)
Expert systems are computer programs that simulate an expert's reasoning
ability. For example, MYCIN diagnoses blood diseases on the basis of symptoms
and test results. It compares well with doctors. A rank ordered list of
diagnoses is produced along with suggestions for further tests and observations.
MYCIN is strongly rule based: If the stain is gram-negative and the organism is
rod shaped and anaerobic then probability is .6 that the organism is Bacteroides.
F. SOAR (Laird, Newell and Rosenbloom)
Uses chunking or knowledge compiling to define collections of rules that
resolve smaller problems. These are created and then stored for use in
future problem solving. This provides a kind of rule based learning. SOAR
has been able to duplicate the kind of reasoning reported by subjects on
the same problems; it also models data on learning rates fairly well.
G. (Schank and Abelson) Developed a system that uses frames and scripts
to interpret newspaper clippings.
H. HYPO (Ashley and Rissland) A reasoning system designed to relate a new
case to legal precedents developed in the past. It is capable of measuring
the similarity of one case to another along various legal dimensions.
I. CYC (Lenat) A massive reasoning system designed to simulate the world
knowledge of a human being. Lenat's theory is that failures in previous
systems to model cognition are simply due to the lack of enough data. (Gravity
is everywhere. Gravity was over the river. Gravity was not supported. Therefore
Gravity fell in the river. Gravity is not a sea creature. Gravity cannot
swim. So gravity drowned.)
III. Fundamentals of AI
The fundamental issues are these:
A. Representing the Data (other than with predicate logic)
- Semantic nets Nodes and Links
isa: dog isa canine isa mammal isa animal
haspart: dog haspart leg haspart toe haspart nail
- Frames, Scripts
Labeled slots and fillers, often filled with defaults. Frames also have attached
procedures. This makes them like objects in computer science. These can
be inherited and overridden by instances. Scripts are like frames but also
allow representation of standard sequences of events. (Example: Birthday
script p. 161 CS)
- Rules
Information is stored in rule format: If situation then action. Rules may
express probabilities.
B. Operating on the Data
- Traversing a semantic net. Is a dog a mammal? Does a dog have toes?
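Here is a minimal sketch of how the two questions can be answered by following links, using the isa and haspart chains given earlier for dog. The dictionary layout and the function names are assumptions made for this illustration; real semantic net systems are richer (for example, they let a dog inherit parts that are attached to mammal).

```python
# isa links point upward to the more general concept.
ISA = {"dog": "canine", "canine": "mammal", "mammal": "animal"}
# haspart links point downward to component parts.
HASPART = {"dog": ["leg"], "leg": ["toe"], "toe": ["nail"]}


def isa(node, category):
    """Follow isa links upward from node until category is found or links run out."""
    while node in ISA:
        node = ISA[node]
        if node == category:
            return True
    return False


def has_part(node, part):
    """Follow haspart links downward (depth-first) looking for part."""
    for child in HASPART.get(node, []):
        if child == part or has_part(child, part):
            return True
    return False


print(isa("dog", "mammal"))    # True: dog -> canine -> mammal
print(has_part("dog", "toe"))  # True: dog -> leg -> toe
```

Both answers come from walking the net rather than from any stored sentence saying "a dog is a mammal" or "a dog has toes".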
- The Interpreter Cycle for a Production System
a. Forward Chaining
Examine the context to locate the rules that apply.
Decide which rules to fire
Fire the rules and calculate their effects on the context
Repeat.
b. Backward chaining:
Examine goal and locate rules that would produce it.
Decide which rules are good bets.
Set their antecedents (the if parts) as new goals.
Repeat.
c. Both
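The forward chaining cycle can be sketched as a short loop. The rules and facts below are hypothetical examples invented for illustration, and the conflict resolution strategy (fire the first applicable rule) is the simplest possible choice, not a claim about how any particular production system decides.

```python
# Each rule: (name, set of conditions, fact to add). All hypothetical.
RULES = [
    ("R1", {"has fur", "gives milk"}, "is a mammal"),
    ("R2", {"is a mammal", "barks"}, "is a dog"),
]


def forward_chain(context, rules):
    """Run the interpreter cycle until no rule adds anything new."""
    context = set(context)
    fired = []
    while True:
        # 1. Examine the context to locate the rules that apply.
        applicable = [r for r in rules
                      if r[1] <= context and r[2] not in context]
        if not applicable:
            return context, fired
        # 2. Decide which rule to fire (here: simply the first one listed).
        name, _, conclusion = applicable[0]
        # 3. Fire the rule and record its effect on the context.
        context.add(conclusion)
        fired.append(name)
        # 4. Repeat.


context, fired = forward_chain({"has fur", "gives milk", "barks"}, RULES)
print(fired)  # ['R1', 'R2']
```

R2 cannot fire on the first pass because "is a mammal" is not yet in the context; it becomes applicable only after R1 has fired. Backward chaining would run the other way, starting from the goal "is a dog" and setting the conditions of R2 as new goals.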

- Culling out Multiple solutions. Cost measures.
Example. There are lots of routes from my house to UH. What is interesting
to me is either which is fastest, or which uses least gas.
- Heuristic Search
Example. In chess, there is an immense space of possible moves, responses,
moves, responses etc. Since the search space is so large, we try to focus
attention on likely moves by introducing rules: 1) Do not get your queen
out early. 2) Do not get a knight on the side of the board. 3) Place pieces
so that they control the center squares. 4) Be willing to sacrifice a pawn
to preserve good pawn structure.
Example: What is wrong with my car? There are so many possibilities, so
explore those possibilities that can be fixed most quickly or cheaply, or
that are most likely.
- Blackboard models
In a blackboard model, we have many processes going at once. When one process
needs some information, or the solution to some goal, it posts the request
on the blackboard. Then any other process that is free may try to resolve
the problem and post the solution when it is found. You can arrange blackboards
hierarchically. You can also arrange for "demons" to look over
blackboards to see that requests do not use too many resources by considering
costs, benefits and likelihood of success.
- Learning (We will discuss this in more detail next week.)
We can model the idea that rules are acquired from experience (Samuel's
CHECKERS program) or from other rules (SOAR's chunking). Also decisions about
what rules to try first can be made on the basis of how successful a rule
has been in the past.
IV. An Assessment of AI
A. Bad News: Representational Power
As we explained in the case of PROLOG, rules do not have the full expressive
power of predicate logic.
B. Bad News: Correctness
In logic we can certify that a reasoning process will be correct. There
are no similarly straightforward ways to certify the correctness of rule-based
systems. For example the project of certifying programs (of reasonable
size) is still far beyond us.
C. Good News: Flexibility
You can program rules to be as flexible as you like, for example you can
build them with defaults. 'If x is a bird then x flies' can live in harmony
with 'If x is a penguin then x does not fly'. The secret is to build a
reasoning system that keeps track of which rules override other rules.
(Penguins are a kind of bird. So all rules about birds are overridden by
any information you have about penguins.) You can also allow the formulation
of rules about how to use rules ... rules about rules about rules etc.
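One simple way to keep track of which rules override which is to let the most specific kind win. The sketch below assumes a tiny kind hierarchy and a table of default rules; the names KIND_OF, DEFAULTS and lookup are invented for this illustration, and real systems use more elaborate bookkeeping.

```python
# Kind hierarchy: a penguin is a kind of bird, and so is a robin.
KIND_OF = {"penguin": "bird", "robin": "bird"}
# Default rules, stored with the kind they are about.
DEFAULTS = {
    "bird": {"flies": True},
    "penguin": {"flies": False},
}


def lookup(kind, prop):
    """Return the value given by the most specific kind that has a rule for prop."""
    while kind is not None:
        if prop in DEFAULTS.get(kind, {}):
            return DEFAULTS[kind][prop]
        kind = KIND_OF.get(kind)  # fall back to the more general kind
    return None                   # no rule applies


print(lookup("robin", "flies"))    # True: inherited from the bird rule
print(lookup("penguin", "flies"))  # False: the penguin rule overrides it
```

Adding the penguin rule does not produce a contradiction with the bird rule; it just shadows it for penguins, which is the behavior predicate logic could not provide.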
D. Good News: Speed through Heuristics
In logic programming, the system blindly searches through every possible
solution. In rule based systems it is a lot easier to formulate heuristics:
rules of thumb that help shorten the search.
F. Good News: Rules Seem Ideal for Explaining Language
The grammatical structure of language can be expressed as a set of rules
(although no one claims to have an account of the full grammar of English).
Chomsky has argued that there must be a set of innate rules that govern
language abilities called Universal Grammar. If this is right then rule
based approaches correctly model a central cognitive accomplishment.
G. Bad News: Combinatorial Explosion
Dennett's account of R2-D1
H. Bad News: Common Sense is Elusive. Why Computers Can't Understand Language.
(Terry Winograd's work on airline reservation systems shows how complex
language can be. When someone asks for a flight a little closer to 7, they
don't want one a little closer, they want one as close as possible to 7!)
I. Bad News: AI has no theory. What have I learned when I create a computer
model that does something? All the AI researcher has done is transfer his
intelligence to the computer in the form of a program. This may tell us
something about what computers are capable of doing, but it certainly does
not reveal the details or even the principles of human intelligence.
Further Reading
Winston, P. Artificial Intelligence A clear text on artificial intelligence
with thorough accounts of such topics as representation and search.
Winograd, T. Language as a Cognitive Process Provides a good starting
point for understanding the classical approach to natural language processing.
Forsyth, R. Expert Systems A collection describing a number of expert
systems.
McCorduck, P. Machines Who Think An excellent survey of the history
of AI.
Cognitive Science Notes Week 5
M Ch 4, CS 4.1 (AI), 159-164 (Scripts and Frames), 192-213 (Learning)
you may skim 203-213
I. Concepts in Cognitive Science (I-III after a presentation by Eric Margolis)
A. Concepts are a mystery but cognitive science can now help us resolve
some age old philosophical problems, for example: How much of the mind
is innate?
B. Common sense and cognitive science may look at innateness in different
ways. The ordinary man's questions might be what innate talent did Mozart
have that made him so good at music (and me so awful). Or what innate characteristics
differ from one race to another.
- But there are good reasons for cognitive science not to study individuals,
and races.
- One is that cognitive science is the study of features common to humanity.
- Furthermore race, for example, is a superficial and subjective concept
and not appropriate for science. Science requires the use of natural kinds,
that is concepts that pick out essential features of the world around us.
We should try to get rid of any variables that might be irrelevant or might
cloud our view.
C. What are the fundamental questions in cognitive science? They are
ones that we take for granted, but which on second thought are really quite
mysterious.
- How do people understand language?
- How do they reason?
- How do they recognize faces? etc.
- Why doesn't John appear to have grown when he walks towards you? Why
does the pen not seem to be deforming when you change its angle of view?
We seem to perceive the world in terms of objects. This is a truly
wondrous perceptual ability.
- Why do only certain sounds (not, for example, the raspberry noise)
of those that we can make with our mouths turn up in the world's languages?
- How can we recognize which words are possible words of English? ('blick'
is, but 'tliop' is not, despite the fact that 'tl' is a legal sound sequence,
as in 'battle'.) We seem to know the rule that English words don't
start with 'tl'.
- How do we know that a thing that opens is an opener and that
many such things are openers, not openser?
- How do we process structure? For example 'John climbed up the mountain'
has a different form from 'John called up his mother'. The proof of this
is that we can invert the 'up X' part in the first but not the second:
Up the mountain John climbed.
?Up his mother John called.
This shows that 'up' goes with the phrase 'up the mountain', in the
first, but 'up' goes with 'called up' in the second. We detect such structural
facts without thinking.
II. Concepts: What are They?
A. Philosophers have always analysed concepts: Plato: What is justice?
B. And wondered whether they are inborn.
- Rationalists (Descartes, Leibniz) tend to think (some) inborn concepts
are a prerequisite for knowledge. (Descartes thought the concept of God
was innate.)
- Empiricists (Locke, Hume) tend to think that concepts are the product
of learning and so needn't be inborn.
C. Most cog scientists are very interested in how concepts can be learned,
assuming that most concepts are not inborn.
D. Classical accounts of concepts:
- Semantic nets: what makes a node the concept for DOG is its position
in a complex net - where it fits in the hierarchy (or web) of other concepts.
Concepts are used in the process of exploring the semantic net where it
is found.
- Frames, Scenes, Scripts: what makes a frame the concept DOG is the
complex set of (labeled) structures related to it. Concepts are used when
slots in these structures are filled in so as to apply them to a specific
task. (CS p. 159 for a sample Frame; p. 161 for a sample Script.)
- The Theory-Theory. The ideas in the two accounts above can be generalized: A concept
is any symbolic structure defined both by its role in processing and by
its place in a hierarchy of other concepts. This is sometimes called the
theory-theory of concepts. Concepts are like the terms of a scientific
theory. (Consider F=ma. What is mass? - The thing that plays a role in
the theory of physics.)
E. Prototype theory of concepts: A concept is not fixed by a definition
but by a prototype, a typical member of the class.
F. Connectionist accounts of concepts: A concept is a region in a space
of possible features.
G. Philosophical Thoughts about Concepts (or Meanings)
- Meanings of concepts are derived from meaning of other concepts: Eg.
LOVE, UNITY (Compare: the meaning of a word is its definition, the meanings
of the words in its definition defined by their definitions, etc.)
- Meanings of concepts are derived from perceptual observations: DOG,
MOTHER
- Meanings of concepts are derived from their role: NOT, ALL
III. Puzzles about the Classical View of Concepts.
A. We never seem to be able to give the definition (rules) for any concept.
Try: A tiger is a large, striped feline. But there are well known counterexamples
to the definition: the albino tiger. It seems that what makes something
a tiger is not the presence or absence of any one feature but something
more like a cluster of features (Consider Wittgenstein's thoughts about
family resemblance.)
B. Prototypes vs. Rules.
- Some psychological evidence (though not all) suggests that prototypes
are a better model. (E.g., 'A robin is a bird' is more quickly processed
than 'A penguin is a bird'.) The definition approach does not explain our
judgments of what is and is not typical.
- Then again, concepts like 'is drunk' needn't be applied on the basis
of similarity to a prototype, but via a process of inference based on rules.
(This guy jumped in the pool with his clothes on. That's distorted judgment,
and being drunk does that. So he's drunk.)
C. If having concepts is a prerequisite for thought, and thinking is
needed for learning (for example evaluating alternative hypotheses about
which concept is usefully applied to the world), then how could we ever
acquire a new concept? The classical theory seems to have only one explanation
for concept learning. New concepts are composed from combining old ones.
But then it seems we could never learn a genuinely new atomic concept.
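To make the prototype idea in II.E and III.B concrete, here is a minimal sketch in Python. The feature list, the 0/1 feature values, and the names PROTOTYPES, distance and categorize are all made up for illustration; they are not drawn from the readings. An item is assigned to the concept whose prototype it is closest to, and the distance serves as a rough measure of how typical the item is.

```python
import math

# Hypothetical features, in order: flies, feathers, beak, wings, sings.
# Each prototype is a made-up "typical member" of its class.
PROTOTYPES = {
    "bird": (1, 1, 1, 1, 1),
    "fish": (0, 0, 0, 0, 0),
}

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def categorize(item):
    """Return (closest concept, distance to that concept's prototype)."""
    best = min(PROTOTYPES, key=lambda name: distance(item, PROTOTYPES[name]))
    return best, distance(item, PROTOTYPES[best])

robin = (1, 1, 1, 1, 1)    # matches the bird prototype exactly
penguin = (0, 1, 1, 1, 0)  # a bird, but an atypical one

print(categorize(robin))
print(categorize(penguin))
```

On these made-up features both items come out as birds, but the robin sits right on the prototype while the penguin is farther from it, which mirrors the finding that 'A robin is a bird' is processed more quickly. Nothing in the sketch handles rule-based concepts like 'is drunk'.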
IV. Learning
A. Which concepts are innate, or to what extent are they innate? And how
do we explain the acquisition of ones that are not innate?
B. Techniques for acquiring new concepts: a) combinations of old concepts
b) combination of features c) extraction of patterns in raw data (like
inductive generalizations)
C. Learning is the acid test for AI models,
for often good performance is simply due to the fact that the desired ability
is simply programmed in. To display the flexibility of human intelligence,
you need a model that can generalize from what it has already achieved
to resolve novel problems.
D. Classical learning techniques. A historical overview.
- Samuel's CHECKER learns in two ways: rote memorization of good/bad board
positions (boring), and adjusting the evaluation function for board positions
on the basis of their appearance in a sequence of moves that eventually won
a game.
- Winston's ARCH-LEARNER generates new concepts and tests their usefulness.
It uses hits and near misses to refine concept boundaries. (CS p. 158 top
a-d)
- Quinlan's ID3 learns concepts from positive and negative instances
by creating a decision tree for classifying objects. An algorithm is used to
order the questions in the tree (most broadly discriminating first).
- Fisher's COBWEB learns concepts by clustering features of examples
and calculating a classification tree. It is unsupervised, that is, it is not
told which are positive and negative instances of concepts. It just looks
at the class of all instances and clusters them.
- Newell's SOAR uses chunking of problem solving steps (knowledge compiling)
to build a toolbox of useful problem solving methods.
- Connectionist models: To be explained in a few weeks
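The idea of asking the most discriminating question first can be sketched as follows. ID3 scores a question by information gain, the drop in entropy of the class labels once the examples are split by the answer. The toy examples and the helper names (entropy, information_gain, best_question) are assumptions for illustration, and the sketch shows only the choice of the first question, not the full tree-building algorithm.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(examples, attribute, label="label"):
    """Entropy of the labels minus the weighted entropy after splitting."""
    base = entropy([e[label] for e in examples])
    remainder = 0.0
    for value in {e[attribute] for e in examples}:
        subset = [e[label] for e in examples if e[attribute] == value]
        remainder += len(subset) / len(examples) * entropy(subset)
    return base - remainder

def best_question(examples, attributes):
    """Pick the attribute that discriminates the classes best."""
    return max(attributes, key=lambda a: information_gain(examples, a))

# Made-up positive and negative instances of TIGER.
EXAMPLES = [
    {"striped": True,  "large": True,  "label": "tiger"},
    {"striped": True,  "large": False, "label": "tiger"},
    {"striped": False, "large": True,  "label": "not tiger"},
    {"striped": False, "large": False, "label": "not tiger"},
]

print(best_question(EXAMPLES, ["large", "striped"]))
```

In this toy data 'striped' separates the instances perfectly and 'large' tells us nothing, so 'striped' is asked first.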
E. Problems:
- Getting the right atoms or features
- Developing genuinely new concepts (the new term problem)
Further Reading
Schank, R. and Abelson, R. Scripts, Plans, Goals and Understanding The
classic work on AI from the frames and scripts perspective.
Week 6. Review and Quiz
Cognitive Science Study Questions for Quiz 1
Reading: M Chs. 1-4, CS 139-173 (skim case studies 142-151), 192-212 (skim
203-212)
I. Identify the following terms in a sentence or a phrase.
Artificial Intelligence Cognitive Psychology
La Mettrie Leibniz
Wundt Skinner
Behaviorism The Mind-Body Problem
CRUM GOFAI
Syllogism Predicate Logic
Universal Instantiation (Specification) Modus Ponens
Modus Tollens Logic Programming
PROLOG Resolution Rule
sound and complete Turing's Thesis
Godel's Theorem non-monotonic logic
Fuzzy Logic The Logic Theorist
General Problem Solver Newell and Simon
Minsky McCarthy
Winograd Blocks World
Search Space Semantic Net
Frames The Frame Problem
Blackboard Models CYC
ARCH-LEARNER Winston
MYCIN CHECKERS
HYPO SHRDLU
Expert system Scripts
ID3 Quinlan
selection task Wason
Tversky and Kahneman Rips
Schank and Abelson PTRANS ATRANS
Connectionism Piaget
Spelke Baillargeon
Theory-Theory Prototype
Rosch Chomsky
Lexicon Miller
II. Sample Questions (from 2-3 sentences to one paragraph)
1. What is the difference between cognitive science and cognitive psychology?
2. What parts of what disciplines make up cognitive science?
3. It has been suggested that the disciplines that make up cognitive science
all study the same question, but employ different methodologies to obtain
an answer. Discuss.
4. Discuss the contributions of each of the following disciplines to cognitive
science, citing the name of the subdiscipline that is especially relevant:
Philosophy, Psychology, Computer Science, Linguistics, Neuroscience, Anthropology.
5. Cite some of the practical applications of cognitive science mentioned
in your readings.
Describe the debate between dualists and naturalists. Explain the role
of the following figures in the debate: Plato, Descartes, Nagel, McGinn,
Aristotle, La Mettrie, Hobbes, Leibniz, Wundt
6. Briefly describe the difference between top-down and bottom-up methodologies
in cognitive science. Note only the main points of difference.
7. Mind : Brain :: Software : Hardware. Explain how this analogy might
characterize the classical approach to cognitive science.
8. Explain the formula: mental representations + computational procedures
= thinking.
9. Explain CRUM in detail.
10. What are five ways of evaluating theories of mental representation
suggested by Thagard?
11. Describe the fundamental assumptions of the classical school in cognitive
science.
12. Give reasons that a classical cognitive scientist might cite to argue
that the science of cognition is not the science of the brain.
13. What is the difference between deductive and inductive logic?
14. Express the following thoughts in Predicate Logic: Albert loves Betty.
Albert's father is rowdy. Everything is rowdy. All humans are mortal.
15. In PROLOG all data is expressed in clauses of the form A&..&B
-> C. How can we say: not
16. Explain the resolution rule in about four
sentences.
17. From clauses 1-6 derive Cad > using the rule of resolution.
- -> Caf
- -> Cbe
- -> Cbc
- -> Ccd
- Cxy&Cyz -> Cxz
18. What difficulties surface in trying to represent in Predicate Logic
such sentences as: 'John believes that Mary believes that philosophers
are right on Thursdays under a full moon.'
19. Why is 'Tigger is an albino tiger' a problem for theories of intelligence
based on predicate logic? How might the use of defaults or non-monotonic
logic help resolve these problems?
20. Describe advantages and disadvantages of using predicate logic to explain
cognition.
21. Cite psychological experiments that suggest that predicate logic is
not a good model of how humans actually reason.
22. People do not perform very well on Wason's 4-card task. What factors
tend to improve performance? What does this suggest about the nature and
origin of human reasoning abilities?
23. Tversky and Kahneman's work shows that many people (given information
that Mary was a leftist in college) will judge that it is more likely that
Mary is a feminist and a bank teller than that she is a bank teller. This
of course violates a fundamental rule of probability: A and B can never
be more probable than A. What does this show about the nature of human
reason? Does it demonstrate that probability theory is irrelevant to cognitive
science? What would a classical theorist say about this?
24. What are some good reasons for abandoning logic programming and resorting
to more general rules and representations in cognitive science?
25. Evaluate the predicate logic approach to cognitive science along the
following 5 dimensions: representational power, computational power, psychological
plausibility, neurological plausibility, and practical applicability.
26. What was Dennett's point in describing the robots R1, R1D1, R1D2?
27. Describe the abilities and limitations of the following AI programs:
CHECKERS, SHRDLU, MYCIN, SOAR, HYPO, CYC.
28. How does MYCIN (or HYPO or SOAR) work?
29. Describe semantic nets and frames, citing points of similarity and
difference. What are limitations of these forms of representation?
30. What is the difference between forward and backward chaining? Explain
with an example.
31. What is heuristic search? Explain with an example.
32. What are strengths and weaknesses of the rules approach to artificial
intelligence? Evaluate the approach along Thagard's 5 dimensions.
33. A major problem in artificial intelligence research is to provide methods
for usefully storing vast amounts of relevant knowledge about the world.
Explain this problem with an example.
34. Why would exploration of differences in innate abilities in the population
be a misguided approach to the topic of innateness in cognitive science?
35. Cite some of the things that we all know and take for granted about
language, which nevertheless correspond to very sophisticated cognitive
abilities.
36. Describe some of the basic aspects of the notion of an object that
have been explored by psychological experiments with infants.
37. Describe one of the experiments given in class which suggests that
the concept of an object is innate. How does the data support the conclusion?
Can you think of any criticisms of the experiment?
38. Piaget believed that the concept of an object was pretty much absent
in infants for the first 9 months. His evidence was that infants do not
display an ability to manually search in the right direction once an object
is occluded. What alternative explanation of this behavior can you give?
39. What is the theory-theory of concepts? Explain with reference to how
the concept of mass is defined in science by such formulas as F=ma.
40. What reasons are there for rejecting the idea that concepts are like
definitions in the dictionary?
41. What evidence supports the idea that concepts are prototypes? What
evidence undermines the idea?
42. Explain the major problem classicists face in explaining how we learn
genuinely new concepts.
43. Explain three different methods whereby cognition might acquire new
concepts.
44. Give an account of the history of the development of artificial intelligence
programs that learn, explaining the variety of methods that have been explored.
You should cite the contributions of at least three of the following programs:
CHECKER, ARCH-LEARNER, ID3, COBWEB, SOAR.
45. What are the hardest questions to resolve in developing artificial
intelligence programs that can learn? Give an account of at least four
different approaches to the problem.
46. Steven Pinker says that artificial intelligence reveals that the easy
problems are hard and the hard problems are easy. Provide examples to illustrate
his point.
Cognitive Science Notes Week 7: Neuroscience
CS Ch. 7 (skim 270-275, 282-289, 291-298, 321-325)
I. Why Study The Nervous System?
A. It is intrinsically interesting
B. To test models in cognitive science
(e.g., just about every psychological theory of memory distinguishes
short-term from long-term memory. Such a model would be
confirmed if we could locate different neural structures that support
short-term and long-term memory.)
C. Knowledge about neurology can help suggest new hypotheses about how
cognition might work. We may discover genuinely new cognitive mechanisms
we hadn't thought of yet. Later I will argue that this is true in the case
of connectionist models.
II. The Brain
A. Brain Cells
- The brain contains at least 10^11 Neurons. Each of these consists of
a Soma (cell body), Dendrites (inputs), and an Axon that terminates in Synapses
(outputs) (p. 276).
- The brain also contains 10^12 Glial Cells that cover axons with myelin
(a fatty insulator), absorb neurotransmitters, dispose of dead cells, etc.
B. Forebrain
- Cerebral cortex (or Cortex for short)
a. Approximately six layers spread over about one square yard
b. It's all crumpled up in the shape of a walnut, with two sides joined
by the corpus callosum.
c. The top surface is grey matter (cell bodies, or somas); underneath is white matter
(axons covered with myelin, which is white).
- Limbic System (Concerned with emotion and motivation)
C. Midbrain
- Thalamus, including the Lateral Geniculate Nucleus (visual way-station)
D. Hindbrain
- Cerebellum (Fine tunes motor control - provides smooth coordination)
III. How Neurons Work
A. Ion pumps (potassium, sodium) in the cell membrane maintain a difference
in charge (called a membrane potential) across the membrane, so that
the inside is negative.
B. But there are channels that can open to let positive ions back
into the cell (or let negative ions out) and cancel the negative inside
charge near the channel. This is called depolarization of the cell membrane.
C. When channels open at the root of the axon (the axon hillock),
the reduction of the charge difference (membrane potential) causes neighboring
channels to open as well. This causes a cascade of openings (depolarization)
down the axon all the way to its synapses.
D. At the synapse, the change in charge causes little sacs full
of neurotransmitters (called synaptic vesicles) to fuse with the cell membrane
and release their contents, exposing the receptor sites on the neighboring
neuron's membrane to the neurotransmitter. Depending on the neurotransmitter
and receptor site, the presence of the neurotransmitter may inhibit or
sensitize the neighbor neuron to possible future depolarization.
E. The effects at all the synapses of the neighbor neuron add together.
If there is enough overall activity at its axon hillock, the channels
there will open, the membrane will depolarize, and the neighbor cell will fire.
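Step E, where the effects at the synapses add together and the cell fires if the total is large enough, can be sketched as a simple threshold unit. This is a deliberately crude abstraction: the weights, the threshold value and the function name neuron_fires are assumptions for illustration, and real neurons integrate their inputs over time and space in far more complicated ways.

```python
def neuron_fires(synaptic_inputs, threshold=1.0):
    """Decide whether a model neuron fires.

    synaptic_inputs is a list of (activity, weight) pairs, one per synapse.
    A positive weight stands for an excitatory (sensitizing) synapse and a
    negative weight for an inhibitory one. The neuron fires when the summed
    effect reaches the (hypothetical) threshold at the axon hillock.
    """
    total = sum(activity * weight for activity, weight in synaptic_inputs)
    return total >= threshold

# Two active excitatory synapses are enough to reach threshold...
excitatory_only = [(1.0, 0.6), (1.0, 0.5)]
# ...but adding an active inhibitory synapse keeps the cell quiet.
with_inhibition = excitatory_only + [(1.0, -0.4)]

print(neuron_fires(excitatory_only))
print(neuron_fires(with_inhibition))
```

Units of roughly this kind are the building blocks of the connectionist models discussed later in the course.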
IV. Neural Plasticity
A. During early development neural structure is often formed by
the elimination of excess neurons and synapses.
B. The development of structure depends on the stimulation the brain
receives, and when it occurs. If a sighted child is blindfolded during
the critical period for creation of sight structures, the ability to see
will have great difficulty developing. The same sort of critical period
appears in the case of the recognition of phonemes (language sounds) and
the ability to process grammatical structure.
C. If a child loses cortex normally devoted to such functions before
the critical period there is a good chance another region of cortex will
take over. So the brain is plastic at an early age.
D. However, after a critical period, loss of the relevant part of
the brain means that the ability cannot be restored, or is restored with
great difficulty.
V. Brain Regions and Topographic Maps
A. In a normal brain, there are standard locations in cortex of the
basic functions (although there are some variations as well). Here is a
crude picture of a left hemisphere:
[Figure: left hemisphere; image not available]
B. Motor, sensory, auditory and visual cortex are all arranged in
topographic maps. This means that regions in cortex correspond to regions
of the body, the retina, or the cochlea. For example, parts of sensory
cortex respond to stimulation of the palm, and nearby ones to the thumb
etc. (See p. 299 for a sample map). In auditory cortex, some neurons are
devoted to low pitch, and their neighbors to slightly higher pitch and
their neighbors to pitches higher still, etc. An area such as visual cortex
may have many different topographic maps devoted to different functions,
such as general shape detection, motion, and color.
To some extent, the specific regions dedicated to a given sensory region
vary depending on how much stimulation is received there. So the brain is still
somewhat plastic at the micro-level.
VI. Neural Representation
A. Is there a grandmother neuron, a neuron that fires when I see
a grandmother? For that matter, is there a neuron that fires when I see
a pure blue visual field? Almost certainly not.
B. Brain representations are distributed across many neurons. So
the representation of my grandmother is no doubt the combination of many
many neurons coding for lots and lots of features that make up my grandmother
experience: color of hair, facial shape, gait, sound of voice, etc.
C. Neural representation often uses what is called coarse coding.
We illustrate this in the case of color vision. You might think that there
are neurons that are responsive to particular wavelengths of light, say
neurons for 500 nanometers, for 510 nanometers, etc.
D. But color vision depends on the fact that we have 3 different
kinds of cones (sensory neurons) (called S, M, L) that respond somewhat
differently to color. These cones have a very large region of wavelength
overlap so that for most colors, all 3 kinds of cones are active at least
to some degree. The representation of the color red, for example, corresponds
to a characteristic amount of activity on the S, M and L cones. (So there
really aren't any red, green or blue cones, as some popularizations would
have it.) Green has its own pattern of activity, and so on for the other
colors. This means that a color sensation is represented as a triple of
numbers indicating the activity of S, M, and L. This kind of coarse coding
is surprisingly efficient.
E. A similar representation is used to code tastes, but here there
are 4, not 3, kinds of taste neurons (roughly for salt, sweet, sour, bitter).
F. For another example, the direction of a target object is coarse
coded as a collection of activities on various neurons. How can this information
be used by the brain to control the arm to grab the object? Is it ever
averaged together in one place in the brain? Probably not. We just send
the raw collection of directions in parallel to motor output to control
the position of the arm. The slightly contradictory muscle movements will
average out in the arm, and you will get the job done.
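The coarse coding of color described in C and D can be sketched in a few lines. The peak wavelengths below are only rough figures for the three cone types, and the bell-shaped tuning curve, its width, and the names CONE_PEAKS_NM and cone_response are simplifying assumptions, not measured cone sensitivities. The point is only that three broadly tuned, overlapping detectors give each wavelength its own triple of activities.

```python
import math

# Rough peak sensitivities in nanometers (illustrative values only).
CONE_PEAKS_NM = {"S": 440.0, "M": 535.0, "L": 565.0}
# Hypothetical tuning width; a large value makes the curves overlap heavily.
TUNING_WIDTH_NM = 50.0

def cone_response(wavelength_nm):
    """Return the (S, M, L) activity triple for light of one wavelength.

    Each cone type is modeled as a bell-shaped curve around its peak,
    so most wavelengths activate all three types to some degree.
    """
    return tuple(
        math.exp(-((wavelength_nm - peak) ** 2) / (2 * TUNING_WIDTH_NM ** 2))
        for peak in CONE_PEAKS_NM.values()
    )

for wavelength in (500, 510):
    print(wavelength, cone_response(wavelength))
```

No cone is dedicated to 500 or to 510 nanometers, yet the two wavelengths produce different triples, so they can still be told apart.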
VII. Neuropsychology
A. Neuropsychology is the psychological study of how the brain carries
out specific cognitive functions. This is mostly unknown, but new methodologies
offer hope, and some older ones have already given some information.
- MRI (magnetic resonance imaging), PET (positron emission tomography),
and ERP (event-related potentials) scans can now give us an inside look
at the brain during different cognitive activities. We can see regions
light up when we ask a subject to imagine a visual scene, do math, or understand
language.
- Study of the loss of function in people with damaged brains (stroke,
wounds, tumors, etc.) has contributed a lot. In the old days all we could
do is autopsy people with damage after they died, but now we can also use
a variety of scans to see where the damage is without opening the skull.
Furthermore, in the case of operations for epileptic seizures, the skull
has to be opened anyway, and then neurons can be stimulated, and the result
reported by the patient. (It doesn't hurt because there are no pain sensors
in the brain.)
B. Some results on memory. The study of the patient HM supports
the theory that there is a distinction between long term and short term
memory. (It seems that HM lost his ability to lay down new memories in
long term memory.) Furthermore, since HM can learn new skills despite not
remembering any training, his case supports the distinction between declarative
memory (memories for facts) and procedural memory (memory for how to do
things).
C. Some results on language. The neurological study of language
has revealed that language is primarily processed in the left hemisphere
(for right handed people and many lefties as well). Two areas (at least)
are specialized for linguistic functioning. Broca's area (a specialization
of motor cortex) seems to be involved in production of language. When it
is damaged, patients can understand, but have great difficulty producing
language. Wernicke's area (a specialization of auditory cortex) seems to
be related to speech understanding. For patients with damage here, language
can be produced, but it often makes no sense.
D. Left Brain, Right Brain. Sensory inputs are reversed so information
from the left goes to the right hemisphere and vice versa. In humans the
two hemispheres seem somewhat specialized for certain functions. Left brain:
language, manual dexterity Right brain: visual and spatial understanding.
E. Split Brain Studies. If you cut the corpus callosum, the main
pathway between the two hemispheres, the patient seems normal, but under
special conditions, interesting defects become apparent. Localization of
speech in the left hemisphere is confirmed. Bizarre results can occur that
undermine our conception of these patients as having a single consciousness.
Example: a snow scene is presented to the right hemisphere (which controls
the left arm) and the patient is asked to select one of four pictures as
the one most closely related. The left arm points to a picture of a snow
shovel. The left hemisphere (which controls speech and the right arm) is
presented with a picture of a chicken and did not see the
snow scene at all. When asked why the left hand has just selected a snow
shovel, the verbal response is: to shovel out the chicken coop. The left
hemisphere considers itself to be in control of the left arm, and
manufactures an explanation for the decision which we know is a fabrication.
(Is the idea that we are in control a fabrication even for normals?)
Further Reading
Mind and Brain Scientific American Special Issue, September, 1992.
[An excellent collection of introductory articles on the brain. The papers
on vision (p. 68) language (p. 88) and neural net learning (p. 144) are
especially relevant.]
Churchland, P. The Engine of Reason, the Seat of the Soul [An enjoyable
book. See especially Ch. 2 on sensory coding, and Ch. 7 on brain defects]
Hardin, C. Color for Philosophers [This is an excellent introduction
to the surprising facts of color vision, with interesting philosophical
morals as well.]
Kosslyn, S. and Koenig, O. Wet Mind [This is an excellent and accessible
account of cognitive science with strong emphasis on neurology and neuropsychology.
Very good on brain lesions and corresponding cognitive deficits.]
Kuffler, S. et al. From Neuron to Brain [A fine textbook on neurology,
though not much on cognition]
Cognitive Science Notes Week 7: Video: Pieces of Mind
A. The Fundamental Issue
Alda, holding a brain in his hands, says that this learned a language (or
even two or more). It felt rage, love and lust, and for a brief moment
it felt death. It seems astonishing that a mere blob of glup like a brain
could be responsible for intelligence and experience. The job of the scientist
who works from the bottom up is to try to make a plausible story about
how this could be so.
B. Split Brains
We visited Gazzaniga's lab at Dartmouth. We got to see the odd results
concerning Joe, whose corpus callosum was severed to control seizures.
The result is two hemispheres that operate pretty much independently. Alan
Alda (and other normals) cannot draw two different pictures, one with each
hand, at once. But Joe can do so easily.
When 'phone' is presented to his right hemisphere, his left hand (which is
controlled by the right hemisphere) can draw a phone. When asked to explain
his performance, his left hemisphere, which controls his verbal output and
could not see the presentation of 'phone', does not know why he drew a phone.
The communication between the two hemispheres is by paper, not in the brain
itself. The only way for the left hemisphere to know what the right hemisphere
saw is to see what it drew.
When 'Bell' is presented to the right and 'Music' to the left hemisphere,
and the left hand picks out a picture of a bell, the left hemisphere has
an explanation: I picked the bell because I heard music from bells recently.
He ignores the fact that there were other pictures more appropriate for
'Music'. We know the real reason was that the right hemisphere got 'Bell'.
Oddly his left hemisphere tells itself stories (confabulates) to try to
convince itself that it is really in control.
When paintings by Arcimboldo that make images of faces out of pieces of
fruit are presented to the left and right hemispheres, Gazzaniga predicts that
the right hemisphere, where face recognition is located, will see a face,
while the left hemisphere will see pieces of fruit. That's what we saw.
C. Memory and Emotion
We visited Jim McGaugh at UC Irvine, who works with memory in rats and
humans. When a rat learns something (where a platform is in murky water)
while under the influence of adrenaline (the chemical produced during a
"fight or flight" experience), it remembers it much longer.
If beta blockers (which block the effects of adrenaline) are given to a
rat in a stressful situation, it no longer gets the memory advantage associated
with the presence of adrenaline. [The Chronicle has recently run stories
about other drugs that improve memory that may be a great help in slowing
the effects of Alzheimer's Disease.]
Similar results can be produced in people. When subjects are asked to remember
the contents of a video that is a bit gory (severed legs) their emotional
response correlates with how well they remembered details that had nothing
to do with the gory part (occupation of the father). When patients are
put under the emotion condition but given beta blockers, they still rate
the emotional content as high, but the improved memory effects are lost.
Brain scans of the amygdala (which is a center of emotional control) show
that for subjects with brighter (more active) amygdalas the memory is better.
D. False Memory
We visited with Dan Schacter (Harvard), where Alan was treated to a boring
sight of a picnic in the park. By having Alan view pictures of the scene,
Dan was able to insert two false memories of the event. [So we should be
quite careful about testimony of witnesses in court.] Dan then went on
to discuss the question: where is memory located? Probably all across cortex,
with sights, sounds, smells, memorized in roughly the areas of cortex where
these things are perceived. The hippocampus seems to be central in the
process of laying down memories. Can we tell when a person is having a
false memory with a brain scan? Maybe. For example, for subjects who are
remembering words they have heard, the normal part of auditory cortex does
not light up when they are having false memories, and it does light up
when they are having true memories.
E. The Purpose of Dreaming
We learned about some work by Carlyle Smith. Alan Alda goes to his
dream lab. It turns out that his ability to detect whether associated symbol
strings are true words or not is enhanced if he tries the task just after
REM (dreaming) sleep. Could REM be used to help us tighten up or exercise
associations between our concepts?
We also learned that people who are intellectually active (taking a lot
of exams) have lots more REM sleep. What kind of associating does REM dreaming
help us with? We saw that students learning the logic game Wff'N Proof
in a room with a loudly ticking clock, did much better if a similar sound
was played during their REM sleep. Could REM sleep facilitate logical reasoning
- where we review and connect things in our dreams? Maybe.
F. The Location and Development of Language
In adults the grammar words like 'of', 'if', 'all', 'not', 'for', 'the'
are processed in the left hemisphere in a specific area (Wernicke's area).
Is this true of children? Not at all. For children aged about 4, language
seems to be processed all over the brain. [Specialization of function in
development seems to correspond to focussing activities in separate parts
of the brain.]
There is a window of opportunity when pronunciation and grammar of a language
can be learned by the child easily. If a language is learned after this
period, the learner will almost certainly end up with an accent, and will
have a hard time learning the grammar. This is true even of people who
learn sign language. The brain is extremely plastic in youth, but becomes
"set in its ways" by adolescence.
Cognitive Science Notes Week 8-9: Vision
CS Ch. 12 (skim pp. 467-479; 487-490; 506-512)
I. Why the Problem of Vision is Hard
A. Information from the retina is an array of values, one from each
rod and cone. This representation is a far cry from what cognition needs:
a representation of a three-dimensional world filled with objects.
B. The representation cognition needs would allow us to distinguish one
object from another, to appreciate their positions, motions, sizes, shapes,
and textures, despite the fact that lighting in the environment is variable,
we ourselves may be moving, and we must recognize objects from many different
points of view.
C. The problem of vision is to explain the mechanism that transforms the
retinal array into an object-level representation that can be stored in
memory and processed by other cognitive systems.
II. The Nature of Low-Level Visual Processing
A. A fundamental question about vision is the extent to which higher cognitive
processes such as goals, expectations, attention, reasoning, and conceptual
structures, influence the transformation from retinal to cognitive level
representations. Although these factors are clearly important, a lot of
visual processing can be safely studied from the bottom up, leaving aside these
top-down considerations.
B. Common Assumptions about low-level vision
- Processing is parallel and local. Local means that processing at one
point of the image depends for the most part only on activity of nearby
points.
- Processing is modular. There are different maps of the visual system
especially designed to resolve different parts of the problem. Examples:
edge detection, motion, depth. Each of these may be further subdivided.
For example, for depth information, we have several different systems based
on different sources of depth information: stereopsis, which uses the differences
between the images in the two eyes; and motion, where the apparent "size" of an
object changes as it recedes or advances.
- Success of processing depends on certain regularities about the environment.
For example, using apparent size to tell whether objects are advancing
or receding won't work if objects can increase or decrease their sizes
like balloons. Since so few objects that humans encounter do this, the
visual system has come to depend on the assumption that objects stay pretty
much the same size.
III. David Marr and the Primal Sketch
A. Edge Detection. On Marr's theory, the first job the vision system has
to do is detect edges or object boundaries. Usually an object boundary
corresponds to a quick change in intensity. However, there can be intensity
changes within the object boundary, and sometimes boundaries do not correspond
to a noticeable intensity change (a green book on a green table). So we will
need to supplement this idea in the long run.
- For starters, intensity changes can be computed by taking differences
between adjoining pixels in the image. (For math mavens, this is the first
derivative.) If we look at changes in those changes (the second derivative),
places where the original intensity changes most sharply will correspond to
points where the new image passes through zero (zero crossings). The new image will look something
like an outline version of the old one (p. 472, image (b)). Since neurons
calculate weighted (think synapse) sums (think axon hillock), it is not
difficult to imagine arrays of neurons providing the second derivative
map.
- The Mexican Hat Trick. But how can we design clusters of neurons that
will make use of the zero crossing representations to pick out lines of
various orientations? For we need to do this if we are ever going to represent
the contours of objects.
On-Center and Off-Center cells. Two useful kinds of cells (that can
easily be created by adjusting the weights between the neurons in the right
way) are called on-center and off-center cells. On-center cells receive
signals from many adjacent cells in a roughly circular pattern from the
zero crossing map. The degree to which they fire depends on what happens
in their circular patch, as described by the Mexican hat function (p. 471).
Activity on the central portion of the patch causes the on-center cell
to fire, but activity around the edges inhibits firing. So if there is
even illumination across the whole patch, the response will be a standard
firing rate. If the center only is active, the on-center cell increases
its activity above normal, and if the ring around it only is active that
cell is active less than normal. If activity is suppressed across an area
on the brim of the hat but not the center, then the inhibition will be
weakened and the cell will fire more. However if it is suppressed over
the center and a portion of the brim, then the positive center will be
inactive leaving the remaining inhibition on the brim to cause the cell
to fire less than normal.
[Figure: On-Center Cells, three cases: a = normal, a > normal, a < normal]
There are off-center cells that have the exact opposite function. They
have the feature that if an area is suppressed across the brim but not
the center, the activity will be less than normal, and when the center
and part of the brim is suppressed, their activity is greater than normal.
[Figure: Off-Center Cells, three cases: a = normal, a < normal, a > normal]
- Detecting an oriented edge. Now imagine that off-center
and on-center cells are lined up in rows. If a zero-crossing (a line with
activity on one side and inactivity on the other) lines up just right, all the cells
in that row will fire, and no other cells will. We have a perfect oriented
line detector!

- By using the same trick for larger or smaller sized patches we can
determine how sharp the edge is that we are detecting. (Smaller patches
detect sharper edges.)
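The edge-detection steps in this section can be tried out on a single row of pixels. What follows is a minimal Python sketch, not Marr's actual algorithm: the function names, the three-point center-surround weights, the baseline firing rate and the sample intensities are all assumptions made for illustration. It takes differences twice, reports where the second difference changes sign (a zero crossing), and models an on-center cell as a baseline rate plus a weighted sum with an excitatory center and an inhibitory brim.

```python
def differences(values):
    """Differences between adjoining values (a discrete first derivative)."""
    return [b - a for a, b in zip(values, values[1:])]

def zero_crossings(intensities):
    """Positions in the second-difference map where the sign flips.

    A flip that passes through an exact zero sample is not caught by
    this simple test.
    """
    second = differences(differences(intensities))
    return [i for i in range(len(second) - 1) if second[i] * second[i + 1] < 0]

# Assumed weights: excitatory center, inhibitory brim, summing to zero.
ON_CENTER_WEIGHTS = [-0.5, 1.0, -0.5]
BASELINE = 1.0  # hypothetical "normal" firing rate

def on_center(patch):
    """Baseline rate plus center excitation minus brim inhibition."""
    return BASELINE + sum(w * x for w, x in zip(ON_CENTER_WEIGHTS, patch))

def off_center(patch):
    """Same patch, opposite weights."""
    return BASELINE - sum(w * x for w, x in zip(ON_CENTER_WEIGHTS, patch))

row = [10, 10, 10, 12, 18, 20, 20, 20]   # one blurred edge in the middle
print(zero_crossings(row))               # [2]
print(on_center([1, 1, 1]), on_center([0, 1, 0]), on_center([1, 0, 1]))
```

With even illumination the on-center cell stays at its baseline, with only the center active it rises above baseline, and with only the brim active it falls below, matching the three cases described for on-center cells.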
IV. Marr's Theory of Higher Level Representation
A. The task is to explain how the brain can go from information about
Contrast, 2-D Velocity, Disparity, 2-D Orientation to a 3-D world of objects
involving
Color, Texture, 3-D Shape, Distance, 3-D Trajectory.
B. Levels of the Theory
- Grey level (raw output of the photoreceptors)
- Zero-crossing Representation (provides edges)
- Primal Sketch (provides oriented edges, which represent the boundary
contours of any object)
- Boundary Features (provides information about such features as blobs,
bars and ends)
- 2.5 D Sketch (provides depth information)
- 3D Sketch (provides a three dimensional model of the world). According
to Marr, this is done by representing the object as a nested set of more and
more complex cylinders: Human: Head, Body; Body: Trunk, Arms, Legs; Arm:
Upper-Arm, Forearm, Hand; Hand: Palm, Fingers
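The nested description used in the 3D Sketch is easy to write down as a data structure. Here is a minimal Python sketch, assuming a plain dictionary; the names PARTS and primitive_parts are made up for this illustration, and Marr's cylinders would also carry size and orientation information that is left out here. Each part maps to its sub-parts, and a short recursive function lists the parts that are not broken down any further.

```python
# Part hierarchy as listed in the notes. A part with no entry of its own
# is treated as not decomposed further.
PARTS = {
    "Human": ["Head", "Body"],
    "Body": ["Trunk", "Arm", "Leg"],
    "Arm": ["Upper-Arm", "Forearm", "Hand"],
    "Hand": ["Palm", "Fingers"],
}

def primitive_parts(part):
    """Walk down the hierarchy and collect the undecomposed parts in order."""
    subparts = PARTS.get(part)
    if subparts is None:
        return [part]
    result = []
    for sub in subparts:
        result.extend(primitive_parts(sub))
    return result

print(primitive_parts("Human"))
# ['Head', 'Trunk', 'Upper-Arm', 'Forearm', 'Palm', 'Fingers', 'Leg']
```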
V. Treisman's Theory of Attention and Primitive Processing
A. The thesis is that there are basic visual processes that are computed
in parallel that feed information to higher level processes responsible
for binding features together. These second processes are calculated in
series by an attention mechanism.
B. We can develop evidence for this theory by presenting images with target
shapes surrounded by distractors. If we measure the reaction time for identifying
the targets and discover that it is fast and does not depend on the number
of distractors, then we assume it is a basic parallel process. If the reaction
time grows with the number of distractors then we assume the process is
serial and involves attending to one thing after another in the scene.
C. For example, the letters L and T have the same elements in the same
orientation, and differ only in how the elements are conjoined to each
other. Recognition of these targets depends on attention. The differences
between them do not just obviously "pop out". However if you
examine a field of |s and /s, where the only difference is the orientation,
the difference is immediately and easily apparent.
D. Basic features include orientation, brightness, curvature. A discrimination
that requires conjoining features (white triangles and black squares, vs.
black triangles and white squares) is extremely difficult to make
and takes tedious one-by-one inspection.
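The reasoning behind the reaction-time test can be made concrete with a toy model. The sketch below is only an illustration under stated assumptions: the function names and the millisecond values are hypothetical, not measurements from Treisman's experiments. A parallel process takes the same time however many items are shown, while a serial search that inspects items one at a time and stops at the target needs, on average, (n + 1) / 2 inspections for n items.

```python
def parallel_search_time(n_items, base_ms=400.0):
    """Parallel 'pop out': time does not depend on the number of items."""
    return base_ms

def serial_search_time(n_items, base_ms=400.0, per_item_ms=50.0):
    """Serial search that stops at the target.

    If the target is equally likely to be at any position in the
    inspection order, the average number of inspections is (n + 1) / 2.
    """
    return base_ms + per_item_ms * (n_items + 1) / 2.0

for n in (1, 9, 29):
    print(n, parallel_search_time(n), serial_search_time(n))
```

Plotting time against the number of items gives a flat line for the parallel case and a rising line for the serial case, which is the pattern the experiments look for.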
VI. Biederman's Theory of Higher Level Processing
A. Biederman's theory describes the features of something like Marr's 2.5
D level, but the primitives he postulates (called geons) are more flexible
and varied: cones, cylinders, cubical shapes, etc., and distortions of these
such as changing the length to width ratio and the shape of the center
line.
B. Biederman believes that objects are identified by a process of recovery
by components (better known in the literature as recognition by components).
Each of the components is recognized, and their identity
and arrangement allow us to tell what kind of object we have.
C. But how are the components recognized? By the boundaries between them,
which are typically concave and rather sharply sloped inwards. (Consider
the "joints" of the Michelin man, for an exaggerated version
of the idea.)
D. Some evidence for these ideas is found in how well we recognize line
drawings where parts of the scene are obscured. If the points of connection
between parts (apexes of triangles for example) are removed, the scene
is hard to recognize. If other parts are removed, but the connecting parts
left, recognition is fairly easy.
E. The fact that objects viewed from strange angles where the "joints"
are obscured are hard to recognize (p. 499) is a variant of the same line
of evidence.
VII. Top-Down vs. Bottom-Up
A. Marr and many other researchers have tried to create theories of vision
where the processing from retina to brain does not require higher-level
information to identify the object. (For example, concepts like animals
have 4 legs, or that the sky is above us and is blue or grey, etc.)
B. Clearly there are instances where higher level information is required
to resolve the ambiguities in a scene. For example the same shape (N) can
be read horizontally as an en, and vertically as a zee.
C. But to what extent does vision rely on top-down processing? Consider
the Kanizsa Triangle. Here we see an image of a triangle hovering above
the scene, but there are no luminance differences on the two sides to allow
us to pick out the boundary. Why do we perceive the edge? Perhaps conceptual
information about how some things blot out other shapes helps us. However,
there is some evidence that this phenomenon is very low level. For example
we have evidence that the "edges" are already processed in V2,
the next center down from the earliest: V1. So maybe a bottom up explanation
is going to be more likely.
D. Another piece of evidence that low level processing is doing most of
the work is the fact that a zero-crossing representation of a set of letter
Bs obscured by blobs allows us to recognize the Bs much better than a blob
representation. Perhaps the zero-crossing is really doing most of the representational
work.
VIII. Further Topics on Vision
A. The Up-down axis is used as a major default assumption about how objects
are aligned. Violations of this alignment cause errors in identification.
(A problem faced by NASA designers of space stations, where expectations
about up-down are more often violated.)
B. The alignment approach to object recognition is a competitor with theories
based on recovery by components. One basic idea is that objects have conspicuous
alignment points. These can be used to rotate and rescale the image to
see if it matches stored representations from a "standard" point
of view.
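The rotate-and-rescale step of the alignment approach can be sketched in a few lines. This is a minimal Python illustration under several assumptions: points are written as complex numbers (x + yj), two alignment points are already paired up between the image and the stored representation, and the function names and the tolerance are invented for the example. Two matched points are enough to fix the rotation, scale and shift; the remaining points are then compared with the stored ones.

```python
def alignment_transform(image_pair, stored_pair):
    """Return (a, b) so that a*z + b carries the image alignment points
    onto the stored alignment points."""
    p1, p2 = image_pair
    q1, q2 = stored_pair
    a = (q2 - q1) / (p2 - p1)   # rotation and rescaling as one complex number
    b = q1 - a * p1             # shift
    return a, b

def align(points, image_pair, stored_pair):
    """Rotate, rescale and shift every image point."""
    a, b = alignment_transform(image_pair, stored_pair)
    return [a * z + b for z in points]

def matches(image_points, stored_points, image_pair, stored_pair, tolerance=1e-6):
    """True if every aligned image point lands on its stored counterpart."""
    aligned = align(image_points, image_pair, stored_pair)
    return all(abs(z - q) <= tolerance for z, q in zip(aligned, stored_points))

stored = [0j, 1 + 0j, 1j]            # shape from the "standard" point of view
image = [3 + 4j, 3 + 6j, 1 + 4j]     # same shape, rotated, doubled and shifted
print(matches(image, stored, (image[0], image[1]), (stored[0], stored[1])))  # True
```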
C. There is a major division in visual pathways between the what system
(object recognition) and the where system (object location). These correspond
to different topographic maps on cortex. It is possible that the recognition
system has a narrower view-window than the whole visual field, so that an
attention mechanism must be used to move access to this processing from
one element to another in the scene. We typically do not recognize all
objects in a scene at once.
D. Eye motions (saccades) that flit from one spot to another in the scene
are essential to effective vision. The detection of the motion of the visual
scene across the retinal array is suppressed during a saccade. This is
one form of visual attention. There is another form that operates even
if our direction of gaze is fixed. This serial process may be involved
with binding properties together in the same object. Binding errors can
be induced by presenting scenes very quickly.
Cognitive Science Notes Week 9-10 (Connectionism)
(M. Ch. 7; CS pp. 63-83; 324-325; 92-93; 114-116; 121-124)
I. Radical vs. Implementational Connectionism
A. The fundamental connectionist idea is to build models of cognition
that are guided by the nature of neural processing, but to abstract away
from irrelevant neural features.
B. There are three different ideas about how the classical or symbolic
processing account relates to connectionist theories.
- Implementational connectionists will view their role as explaining
how symbolic processing is implemented in the brain. They pretty much accept
the classical account, and attempt to explain how the processing described
by classicists could be carried out in the brain's neural nets.
- Radical connectionists view their theories as competitors to classical
ones. The idea is that classicists have an incorrect theory about what
cognition is like, and that connectionism can replace it with a much more
adequate view. Naturally, radical connectionists and classicists have engaged
in hot debate.
- Hybrid connectionists think that connectionism best describes only
some of our cognitive abilities, notably those in perception, pattern recognition,
and motor control. Classical theories are needed to explain other abilities
such as reasoning and language. So hybrid connectionists are radical for
some abilities and implementational for others.
II. The Basics
A. Connectionist models are known by many names: (artificial) neural nets,
parallel distributed architectures, subsymbolic models.
B. Units. Connectionist models are connected networks of simple processors
called units. The units are supposed to model the basic behavior of neurons.
C. Weights. The synapses which regulate signals between neurons are modeled
by values called weights. Weights can be positive (indicating that activity
at the synapse encourages the neighbor neuron to fire) or negative (indicating
that activity at the synapse inhibits firing by the neighbor neuron).
D. Activation Function. It is assumed that all units calculate the same
very simple function. The fundamental idea is that the unit i sums the
signals it receives from each of the neurons connected to it. The signal
$a_j$ coming from each unit j connected to i is multiplied by the weight $w_{ij}$
between i and j. The sum of these values for each connected unit is calculated.
This value might be any positive or negative number. But a neuron's activity
is best modeled as a number between 0 (inactive) and 1 (maximum firing
rate). So we adjust this sum so that it lies between 0 and 1 with sig,
the logistic (or sigmoid) function. (See p. 69 (b) for its graph.) Putting
these ideas together, we get the basic activation function for units.
$$a_i = \mathrm{sig}\Big(\sum_j w_{ij}\, a_j\Big), \quad \text{where } \mathrm{sig}(n) = (1 + e^{-n})^{-1}$$
This says that the activity of unit i is the result of multiplying the
activity $a_j$ of each input neuron by the weight connecting it to i, adding up
these products, and then applying the sigmoid function to the sum. Connectionists assume that
all cognitive processing results from the behavior of many units all of
which compute this function or a minor variant of it. Note that any possible
arrangement of connections of such units can be expressed by simply setting
$w_{ij}$ to zero for any two units that are not connected. Therefore the architecture
and behavior of the neural net is defined entirely by the weights between
the units.
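The activation function for a single unit takes only a few lines of code. The following Python sketch is an illustration, with function and parameter names chosen here rather than taken from the texts: it multiplies each incoming activity by its weight, adds up the products, and squashes the sum with the sigmoid.

```python
import math

def sig(n):
    """Logistic (sigmoid) function: maps any number into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-n))

def activation(weights, activities):
    """Activity of one unit, given the weights on its incoming connections
    and the activities of the units that send to it."""
    return sig(sum(w * a for w, a in zip(weights, activities)))

print(activation([0.0, 0.0], [1.0, 1.0]))    # 0.5: zero net input gives the midpoint
print(activation([2.0, -2.0], [1.0, 1.0]))   # 0.5: excitation and inhibition cancel
print(activation([3.0, 1.0], [1.0, 1.0]))    # close to 1: strong excitation
```

A weight of zero contributes nothing to the sum, which is why setting a weight to zero is the same as having no connection.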
III. Standard Feed-Forward Architecture
A. Many connectionist models conform to a standard configuration called
feed-forward nets. There is a bank of input units which contain the signals
coming into the system, a bank of output units, recording the system's
response, and usually one or more banks of hidden units that are waystations
in the processing. In a connectionist model of a whole brain, the input
units model the sensory neurons, the hidden units the interneurons, and
the output units the motor neurons.
B. The astonishing thought behind this model is that all the brain does
is simply the result of massively many units calculating the activation
function according to the settings of the weights (the synaptic connections).
Could such a simple-minded calculation really do the job? There is a lot
of intriguing evidence that it may.
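To see how little machinery a feed-forward net needs, here is a minimal Python sketch. The layer sizes, the weight values and the function names are assumptions picked for illustration; no claim is made that any real model uses these numbers. Each bank of units is computed from the bank before it by applying the same activation function to every unit.

```python
import math

def sig(n):
    """Logistic (sigmoid) function."""
    return 1.0 / (1.0 + math.exp(-n))

def layer(weight_rows, activities):
    """Activities of one bank of units; each row holds one unit's weights."""
    return [sig(sum(w * a for w, a in zip(row, activities))) for row in weight_rows]

def feed_forward(inputs, weight_layers):
    """Pass the signal from the input bank through each later bank in turn."""
    activities = inputs
    for weight_rows in weight_layers:
        activities = layer(weight_rows, activities)
    return activities

# Hypothetical net: 2 input units, 3 hidden units, 1 output unit.
HIDDEN_WEIGHTS = [[1.0, -1.0], [0.5, 0.5], [-2.0, 1.0]]
OUTPUT_WEIGHTS = [[1.0, -1.0, 0.5]]

print(feed_forward([1.0, 0.0], [HIDDEN_WEIGHTS, OUTPUT_WEIGHTS]))
```

Everything the net does is fixed by the two weight tables; changing the behavior means changing the weights, not the code.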
IV. Recurrence
A. Feed-forward Architectures are limited in what they can do. The signal
flows directly from input to output. However we know that the brain contains
recurrent pathways, that is, pathways that loop back to earlier levels.
B. Winner Take All Arrays. One use of recurrence in connectionist models
is to provide for mutually inhibiting banks of neurons. Each bank sends
inhibiting connections to the other bank, with the result that only one
of the banks can be active. These arrays can be applied to problems that
involve parallel constraint satisfaction. Such nets can model decisions
between incompatible alternatives (M. p. 116), for example, the two ways
of viewing the Necker cube. Marr and Poggio have used the idea to model
how the brain matches up images from the two eyes to facilitate stereoscopic
vision. The same kind of models can be used to understand decision making,
planning, and explanation (M pp. 115-117).
C. Simple Recurrent Architectures.
In simple recurrent architectures, information on the hidden units is sent
back to the input level, so that information about the hidden units at
time t-1 is available at the inputs at time t. This provides for a kind
of short term memory, and is essential for processing where the net needs
to respond to the history of the inputs. Such nets have been shown to be
capable of simple grammatical processing.
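Both kinds of recurrence can be sketched briefly in Python. The code below is a simplified illustration, not a reconstruction of any published model: the update rule for the winner-take-all array, the inhibition strength, the rate, and all the names are assumptions. In the first function each unit (standing in for a bank) is excited by its own input and inhibited by the others, so the unit with the stronger input ends up fully active and the other is shut off. In the second, the hidden activity from time t-1 is fed back alongside the input at time t.

```python
import math

def sig(n):
    """Logistic (sigmoid) function."""
    return 1.0 / (1.0 + math.exp(-n))

def winner_take_all(external, inhibition=1.0, rate=0.2, steps=100):
    """Mutually inhibiting units; activities are clipped to the range 0 to 1."""
    acts = [0.0] * len(external)
    for _ in range(steps):
        total = sum(acts)
        # each unit is excited by its own input and inhibited by all the others
        nets = [e - inhibition * (total - a) for e, a in zip(external, acts)]
        acts = [min(1.0, max(0.0, a + rate * n)) for a, n in zip(acts, nets)]
    return acts

def run_srn(sequence, w_in, w_context):
    """Simple recurrent net: returns the hidden activity after the last input.

    w_in[i] holds hidden unit i's weights from the inputs; w_context[i]
    holds its weights from the copy of the previous hidden activity.
    """
    context = [0.0] * len(w_in)
    for x in sequence:
        context = [
            sig(sum(w * a for w, a in zip(w_in[i], x))
                + sum(w * c for w, c in zip(w_context[i], context)))
            for i in range(len(w_in))
        ]
    return context

print(winner_take_all([0.6, 0.5]))                    # [1.0, 0.0]
print(run_srn([[1.0], [0.0]], [[1.0]], [[2.0]]))
print(run_srn([[0.0], [0.0]], [[1.0]], [[2.0]]))
```

The last two calls end with the same input (0.0) but give different hidden activity, because the net remembers what came before. That is the short term memory the simple recurrent architecture provides.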
V. Learning
The success or failure of a neural net model depends on the selection
of the right weights. But how can we determine which weights we need to
accomplish a certain task? One solution to the problem is to let the net
figure it out. Let the presentation of the input and its response adjust
the weights. There are two basic styles of learning in connectionist models:
unsupervised, where the net simply adjusts the weights on the basis of
the inputs it receives, and supervised, where the adjustment is guided
by the correct output for each input, which is supplied to the net during
training. Descriptions of the most famous unsupervised (Hebbian) and supervised
(Backpropagation) learning methods follow.
A. Hebbian Learning. The idea goes back to Donald Hebb. Put information
at the input units, and calculate the activity of all the units. Then increase
the weights between active units, and decrease those between inactive units.
Do this for all the inputs that the net will encounter. This process will
cause the net to classify regularities found in the input. For example,
imagine that the inputs code for different features of animals: fur/feathers,
2/4 legs, forward/sideways facing eyes, sharp/blunt teeth, wings/no wings,
carnivore/herbivore. Now train the net with features found in animals at
the zoo. Weights between such features as carnivore, forward facing eyes,
sharp teeth, will get strengthened. Also those between feathers, 2 legs
and wings. The net has "discovered" the concepts "bird"
and "predator". When features for a new animal are presented
it will activate the units that represent the closest category to which
those features belong. It is almost as if the net has extracted some prototypes
from the data which it can apply to novel inputs.
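The zoo example can be tried with a few lines of Python. This is a minimal sketch under stated assumptions: the feature list is shortened, the four animals are made up, activities are coded as 1 (feature present) or 0 (absent), and only the strengthening half of the rule is implemented (a weight grows when both units are active together; no weights are decreased). The names are chosen for the illustration.

```python
FEATURES = ["feathers", "two_legs", "wings", "carnivore", "forward_eyes", "sharp_teeth"]

# Hypothetical training animals, one row of feature activities each.
ZOO = [
    [1, 1, 1, 0, 0, 0],  # a sparrow-like animal
    [1, 1, 1, 0, 0, 0],  # a parrot-like animal
    [0, 0, 0, 1, 1, 1],  # a lion-like animal
    [0, 0, 0, 1, 1, 1],  # a wolf-like animal
]

def hebbian_train(patterns, rate=0.1):
    """Increase the weight between every pair of units that are active together."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += rate * pattern[i] * pattern[j]
    return weights

def weight(weights, a, b):
    """Look up the learned weight between two named features."""
    return weights[FEATURES.index(a)][FEATURES.index(b)]

w = hebbian_train(ZOO)
print(weight(w, "feathers", "wings"))        # strengthened
print(weight(w, "feathers", "sharp_teeth"))  # never active together: stays 0.0
```

After training, the strong weights pick out two clusters of features, which is the sense in which the net has "discovered" something like "bird" and "predator".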
B. Backpropagation. Backpropagation is the most popular form of supervised
learning. We will illustrate with the example of a net trained to pronounce
English words. The spelling of a word is put on the inputs, and a code
for its correct pronunciation is to be presented on the output. This task
is hard because of the irregularities of English pronunciation: 'have'
does not rhyme with 'cave' and 'save'; 'though' does not rhyme with 'rough'
or 'tough'. The training set will consist of a list of words together
with their correct pronunciation codes. Training proceeds as follows. Start
with random weights. Now present the first word in the training set, and
calculate the activities of all the units. The output units will almost
certainly not match the desired code for that word. For each output unit,
trace the source of the error back through the network. Adjust weights
(slightly) in the direction that will correct the error. Now do the same
thing for the next item in the training set, and so on.
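The training loop just described fits in a short program. The Python sketch below is an illustration under several assumptions, not the procedure used for any published model: a toy task (logical OR) stands in for the pronunciation task, the net has two hidden units, each bank gets an extra always-on unit so that a weight can act as a threshold, and the learning rate, number of rounds and all names are arbitrary choices. It starts with random weights, presents each item, traces the output error back to the hidden units, and nudges every weight in the direction that reduces the error.

```python
import math
import random

def sig(n):
    """Logistic (sigmoid) function."""
    return 1.0 / (1.0 + math.exp(-n))

def forward(x, w_hidden, w_out):
    """Return input activities, hidden activities (each with an extra
    always-on unit appended) and the single output activity."""
    inputs = x + [1.0]
    hidden = [sig(sum(w * a for w, a in zip(row, inputs))) for row in w_hidden]
    hidden = hidden + [1.0]
    output = sig(sum(w * a for w, a in zip(w_out, hidden)))
    return inputs, hidden, output

def train_step(x, target, w_hidden, w_out, rate):
    """One presentation: compute the error and adjust the weights slightly."""
    inputs, hidden, output = forward(x, w_hidden, w_out)
    # error signal at the output unit
    delta_out = (target - output) * output * (1.0 - output)
    # trace the error back to each hidden unit through its weight to the output
    delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                    for j in range(len(w_hidden))]
    for j in range(len(w_out)):
        w_out[j] += rate * delta_out * hidden[j]
    for j, row in enumerate(w_hidden):
        for k in range(len(row)):
            row[k] += rate * delta_hidden[j] * inputs[k]

def total_error(training_set, w_hidden, w_out):
    """Sum of squared differences between desired and actual outputs."""
    return sum((t - forward(x, w_hidden, w_out)[2]) ** 2 for x, t in training_set)

def train(training_set, n_hidden=2, rate=0.5, epochs=2000, seed=0):
    """Start from random weights and cycle through the training set."""
    rng = random.Random(seed)
    n_in = len(training_set[0][0])
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    before = total_error(training_set, w_hidden, w_out)
    for _ in range(epochs):
        for x, target in training_set:
            train_step(x, target, w_hidden, w_out, rate)
    after = total_error(training_set, w_hidden, w_out)
    return before, after, w_hidden, w_out

# Toy training set (an assumption for this sketch): logical OR.
OR_SET = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

error_before, error_after, W_HIDDEN, W_OUT = train(OR_SET)
print("error before training:", error_before)
print("error after training:", error_after)
```

The interesting number is the drop in total error from the first print to the second; a real application would also check the trained net on items that were not in the training set.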
VI. Connectionist Representation
A. In local representation, single units are devoted to recording a
concept. (Think grandmother neuron.)
B. In distributed representation, the representation of an item consists
of a pattern of activity across all of the units. Nets trained with backpropagation
and Hebbian learning spontaneously generate distributed representations
of concepts they are learning. For example, a cluster analysis of the activation
patterns on the hidden units of NETtalk shows a hierarchy of clusters and
subclusters corresponding to phonetic distinctions. There is a main clustering
into two: vowel, consonant. And within the consonants there are subclusters for voiced
or unvoiced, etc. In learning the task, the network has acquired the concepts
that it needs to process the inputs correctly.
C. Distributed representations in connectionist models correspond to extremely
complex arrays of values across many units. Therefore the representation
for a concept like [cat] can code for lots of features of the concept such
as mammal, pet, furry, aloof, stalks-mice, and other features (like how
it looks) that we would be hard pressed to describe in language. This so
called subsymbolic form of representation allows the symbol to carry its
own information about what it is about. The symbol is not arbitrary and
atomic the way a word in a language is. By analysing the symbol, you can
find out what it "means".
VII. Famous Connectionist Models
A. Connectionist models have been used for such divergent tasks as
recognizing submarines, deciding bank loans, and predicting protein folding,
to name just a few. What follows are a few of the better known connectionist
models trained by backpropagation.
B. TRACE: Rumelhart and McClelland (1986) Predicting Past Tense of English
Verbs
*Input: Phonetic code of present tense verb (sing)
*Desired Output: Phonetic code of the past tense of that verb (sang)
*Architecture: Feedforward net without hidden units
*Training Set: Phonetic codes of present and past tense of 460 English
verbs
*Results: The net learned the past tenses of the 460 verbs in 200 rounds
of training, and it generalized fairly well to new verbs, with good appreciation
of "regularities" to be found among the irregular verbs (send
/ sent, build / built; blow / blew, fly / flew). During learning as the
system was acquiring more regular verbs, it overregularized: (break / broked).
This was corrected with more training. Children are known to exhibit the
same tendency to overregularize. Whether this is a good model of how humans
process verb endings is a matter of hot debate (Pinker & Prince 1988).
C. NETtalk: Sejnowski and Rosenberg (1987) Pronouncing Written Text
*Input: 7 letters of the text (including space) in a moving window
*Desired Output: Phonetic code for the center few letters, which is sent
to a speech synthesizer
*Architecture: Standard 3 layer feed-forward net. (80 hidden units)
*Learning: A large training set of text coupled with its phonetic transcription.
*Results: During learning the system goes through stages of babbling, double-talk,
and finally intelligible speech, (with some accent). Generalization to
novel text is good. Statistical analysis shows that hidden units use a
distributed representation of basic phonological features.
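The moving window used as input can be pictured with a short Python sketch. It is an illustration only: the function name is invented, the text is padded with spaces at both ends (an assumption), and each window is paired with its single center letter for simplicity. How letters and phonetic codes are turned into unit activities is not shown.

```python
def letter_windows(text, width=7):
    """Slide a window of `width` letters (an odd number) across the text.

    Returns a list of (window, center_letter) pairs, one per letter.
    """
    half = width // 2
    padded = " " * half + text + " " * half
    return [(padded[i:i + width], text[i]) for i in range(len(text))]

for window, center in letter_windows("cat"):
    print(repr(window), "->", center)
```

Each window gives the net three letters of context on either side of the letter being pronounced, which is what lets it tell apart spellings whose pronunciation depends on their neighbors.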
D. Elman (1991)
*Input: Words drawn from a small set of English words (23 words plus End-of-Sentence)
coded in 1s and 0s.
*Output: One output unit for each word in the set.
*Architecture: Simple Recurrent Net
*Training Set: Grammatical sentences from this vocabulary for a brand
of English restricted to a small subset of its grammatical rules. The grammar
did, however, provide for a hard test of grammatical awareness: subject-verb
agreement across arbitrarily long relative clauses:
Any man who hates women who hate men .. also hates feminists.
*Desired Output: When a word from the sentence is applied to inputs, the
desired output is the next word in the sentence. (Of course the net can't
possibly succeed at this task.)
*Results: Nets were trained to be extremely accurate in the following sense:
on the presentation of a sequence of words, all and only words that would
be legal continuations at that point are active beyond a certain threshold
at the output. When a word is presented that violates the rules of grammar
no words reach threshold at the output. The trained net came very close
to this desired performance.
VIII. Attractions of Connectionist Models
A. Biological Plausibility. Neural net models "look like" the
processing that we find in a brain, especially when we look at the processing
we know about: sensory input and motor output. There is evidence for Hebbian
learning at synapses. The 100 step rule (neurons are slow, so a task the brain
completes in a fraction of a second leaves time for only about 100 steps in
sequence) would suggest that the brain's
processing, unlike the usual classical models, is highly parallel.
B. Soft Constraints. Nets can learn to appreciate subtle statistical patterns
that would be very hard to express as hard and fast rules. This allows
them to avoid the brittleness displayed by classical models.
C. Fast Processing of Multiple Constraints. Nets can quickly resolve in
parallel the complex set of conflicting forces to make a decision.
D. Graceful Degradation. When units are lost, the net behaves almost as
well. In classical systems the loss of a circuit typically causes a fatal
processing error.
E. Flexible Response to Noise. When the inputs are noisy (if part of the
input is inaccurate or obscured by some other signal) nets respond appropriately
(though somewhat less accurately).
F. Vector Representation. There is evidence that the brain is deeply committed
to representations in the form of vectors (arrays of values). For example,
color and taste are coded by vectors of 3 and 4 values respectively. Neural
net architectures are perfectly designed to handle vector processing.
G. Prototypes. Connectionist classification methods remind us of the idea
of matching inputs to prototypes. So evidence for prototypes helps sway
us toward connectionism.
H. Unified Theory of Learning. Classical accounts employ a variety of different
learning techniques. Connectionists have a simple and fairly unified theory
of learning based on backpropagation and Hebbian processes.
I. No Programming. We do not need to hire programmers to make a system
do some complex task. Just build a training set for the desired behavior
and let the model learn what to do.
IX. Weaknesses of Connectionist Models
A. Biological Implausibility
Further Reading
Clark, A. (1993) Associative Engines, MIT Press. [Provides a nice
review of connectionist work with special emphasis on the problem of representation
in language.]
Elman, J. L. (1991) "Distributed Representations, Simple Recurrent
Networks, and Grammatical Structure," in Touretzky, D. (ed.) Connectionist
Approaches to Language Learning, Kluwer, Dordrecht, 91-122. [The classic
experiment on teaching nets to learn syntactic structure.]
Pinker, S. and Prince, A. (1988) "On Language and Connectionism: Analysis
of a Parallel Distributed Processing Model of Language Acquisition,"
Cognition, 28, 73-193. [A thorough criticism of Rumelhart, D. and
McClelland (1986). Undermines claims that connectionist models correctly
simulate the process of language learning, especially the acquisition
of regular verbs.]
Rumelhart, D. and McClelland, J. (1986) "On Learning the Past Tenses
of English Verbs," in Parallel Distributed Processing, vol.
2, MIT Press, Ch. 18, pp. 216-271. [Classical article describing TRACE.]
Sejnowski, T. and Rosenberg, C. (1987) "Parallel Networks that Learn to Pronounce
English Text", Complex Systems, 1, 145-68. [For more on NETtalk.]
Cognitive Science Study Questions Quiz 2
CS Ch. 7 (skim 270-275, 282-289, 291-298, 321-325)
CS Ch. 12 (skim pp 467-479; 487-490; 506-512)
M Ch. 7; CS pp. 63-83; 324-325; 92-93; 114-116; 121-124
Identify Terms
Neuron, Glial Cell, Soma, Dendrite, Axon, Synapse, (Cerebral) Cortex,
Myelin, Neurotransmitter, Forebrain, Midbrain, Hindbrain, Left/Right Hemisphere,
Limbic System, Amygdala, Hippocampus, Thalamus, Lateral Geniculate Nucleus,
Cerebellum, Corpus Callosum, Grey Matter, White Matter, axon hillock,
Membrane Potential, Ion Channels, Depolarization, Synaptic Vesicles, Receptor
Sites, (Somato-) Sensory Strip, Motor Strip, Wernicke's Area, Broca's Area,
Visual Cortex, (Pre-)Frontal Cortex, Association Cortex, Auditory Cortex,
Topographic Map, Retina, Cochlea, Coarse Coding, Distributed Representation,
Cones (S, M, L), Grandmother Neuron, Neuropsychology, MRI (Magnetic resonance
imaging), PET (positron emission tomography), ERP (event-related potentials),
Dissociations, Adrenaline, Beta-Blockers, REM, David Marr, Grey Scale Representation,
Primal Sketch, 2.5 D Sketch, 3D Sketch, Zero Crossing Map, Mexican Hat
Function, On/Off-Center Cells, Treisman, Biederman, geons, recovery by
components, illusory contours, radical connectionism, implementational
connectionism, (artificial) neural nets, parallel distributed architectures,
subsymbolic models, input units, hidden units, output units, activation
function, sigmoid function, weight, feed-forward net, winner-take-all net,
recurrence, simple recurrent net, Hebbian learning, backpropagation, training
set, parallel constraint satisfaction, unsupervised/supervised learning,
local/distributed representation, subsymbolic representation, Rumelhart
and McClelland, TRACE, Sejnowski and Rosenberg, NETtalk, Elman, Graceful
Degradation, Soft Constraints, Multiple Constraints, Systematic Processing
Questions
1. Approximately how many neurons/glial cells does the brain contain?
2. Explain how neurons fire. Explain the role of the following items in
your account: axon hillock, ion channels, depolarization, synapses, synaptic
vesicles, receptor sites, neurotransmitters.
3. In youth, the brain shows immense plasticity. What are the limits of
this plasticity as the brain matures?
4. Cite at least two examples of cognitive functioning where early plasticity
disappears.
5. To what extent does brain development depend on stimulation from the
outside world? Explain with at least two examples.
6. Diagram the cortex explaining what regions are devoted to what kind
of processing
7. Explain how topographic maps for hearing, vision and touch are laid
out in cortex.
8. Explain how colors and tastes are represented in the brain.
9. Why is it a distortion to say that the retina contains red, green and
blue cones? Explain with reference to the way in which color is calculated
in the brain.
10. What are the principal methods used in neuropsychology? Explain what
we have learned about memory from the case of patient HM.
11. The human brain's left and right sides are usually specialized for
certain functions. Give details on these asymmetries.
12. Explain the odd effects that result from a patient's loss of the corpus
callosum.
13. Joe is a split brain subject studied by Gazzaniga. Describe at least
two different experiments on Joe and explain their significance for our
understanding of the brain.
14. Describe at least two experiments that suggest that ability to remember
is related to emotion.
15. Explain how false memories of an event were inserted in Alan Alda's
brain.
16. How might we tell when a person is having a false memory with a brain
scan?
17. What did the experiment with the ticking clock show about the possible
function of dreams? How did the experiment show that?
18. What does Carlyle Smith think is a possible function of dreaming?
19. What is the problem faced by the visual processing system in the brain?
20. Distinguish top-down from bottom-up theories of visual processing.
Cite clear cases of visual processing where the processing is likely to
be (at least partly) top-down. Now cite evidence that tends to show that
visual processing is primarily bottom up. What does the phenomenon of illusory
contours have to do with the issue?
21. Describe two assumptions about neural processing that guide theories
of how visual processing works.
22. What is the zero crossing map and how do neurons compute it?
23. Given a bank of on-center and off-center cells which process the zero-crossing
map, explain how to construct an (oriented) edge detector.
24. Marr's theory of vision involves a number of levels. Name four of them
and explain what representations at each level are like.
25. Explain Marr's theory of higher level visual processing (3D sketch).
26. Explain how experiments can be used to determine which are the basic
features of the visual system that are processed in parallel, and which
are serially processed by an attention mechanism. Give examples of discriminations
of each kind.
27. Compare Biederman's theory of object recognition using recognition by
components with the alignment approach.
28. On Biederman's theory, how are the divisions between the components
that make up objects recognized? What experiments have been performed that
tend to confirm this view?
29. Describe the two main pathways in the visual system and explain what
each one does. How does the phenomenon of attention play a role in these
two systems?
30. How are attention and visual binding related?
31. What is the difference between those who believe in radical and those
who believe in implementational connectionism?
32. Explain the fundamental assumption connectionists make to simplify their
models of the brain.
33. Describe the items to be found in a generic connectionist model. Explain
the range of numerical values that is allowed for each.
34. What is the activation function? Give the equation, then explain what
the equation says about how a unit behaves.
35. What is recurrence? Give examples of two different kinds of recurrent
architectures and explain what they can do.
36. What is the difference between unsupervised and supervised learning?
Give examples of learning methods of each kind.
37. Explain how Hebbian learning works, and how nets trained with Hebbian
learning behave. Use an example to illustrate.
38. Explain how backpropagation works.
39. Explain the difference between local and distributed representation.
What kind of representations are typically generated by connectionist learning
methods?
40. What is subsymbolic representation?
41. Connectionist models have been applied to a wide variety of different
tasks. Name some of them.
42. Describe TRACE or NETtalk. Mention the task to be accomplished, the
inputs, the desired outputs, the architecture, the training set, and the
results achieved.
43. Describe Elman's work on simple recurrent nets that process simple
grammars.
44. Give reasons for and against the view that neural nets are reasonable
models of a human brain.
45. List 9 reasons for thinking that neural nets are good models of cognition.
46. Why would neural net theory be especially well adapted to prototype
theory?
47. What are the major weaknesses of connectionist models?
48. It has been suggested that multiple constraints and soft constraints
are reasons for preferring connectionist over classical models. Discuss
the issue.
49. Explain why the systematic nature of cognitive processing and the mind's
ability to generalize pose difficult problems for connectionist models.
For which kind of connectionists are the problems most serious?
Cognitive Science Notes Week 12 (Cognitive Psychology - Imagination)
(CS 2.1-2.3; M Ch. 6; CS 2.7)
I. Imagination in Cognitive Psychology
A. Imagination: the Historical Context
II. The Imagery Debate
- Imagination presents an interesting alternative to the classical approach
to cognitive science. Since the classical view takes cognition to be the
result of symbolic processing similar to what goes on in a computer, either
imagination must be understood as a form of symbolic processing, or imagination
must be considered to play no interesting cognitive role.
- The classical theory is that cognitive science's job is to discover
what is common to the human cognitive architecture. This amounts to something
like discovering the functional structure of a computer. Humans are remarkably
alike in their basic cognitive abilities, so it is a good guess that there
is a structure here that we all have in common.
- The basic picture of this architecture is to divide it in three: Sensory
systems; Motor systems; and Central systems. The central systems include
thinking, attention, memory, learning and language. The idea is that sensory
systems are modular. They do their jobs of delivering information to the
central system independently from each other and from direction by the
central system. The central system receives information from the senses
and controls motor output, but it is not driven by sensation. A
lot of thought and reasoning can proceed without any help at all from sensation.
(In fact sensation may hinder thought; why else do we stare at the ceiling
when we think hard?) A computer processing away without the help of any
new input is a good model of this feature.
- Classicists presume that the central system is highly dependent on
symbolic processing. There are several reasons for believing this.
a. Turing machines are a model of computation, and known to be capable
of carrying out any function that can be specified by a set of rules, including
presumably all possible tasks in thinking, reasoning, attention, memory,
learning and language. Turing machines are the simplest kind of symbolic
processors (computers). Although the brain almost certainly doesn't implement
such a simple and cumbersome machine, variants of the same basic design
(von Neumann architectures) have been proven to be extremely useful and
powerful information processing devices. What better model of intelligence
could we find?
b. Abilities in reasoning and language appear to require processing that
is sensitive to the structure of symbol strings. For example, our language
and thought processing abilities are productive, which means that
we are capable of recognizing and understanding an unlimited variety of
sentences. The computational model explains how we are capable of this
amazing feat. We compute language meaning much the way that a calculator
computes arithmetic - not by storing up large quantities of information
(the answers), but by following simple rules (similar to routines you learned
in grade school to add, subtract, multiply and divide). Furthermore, our
abilities in language and thought are systematic, which means that
we use the same processing methods to deal with structures that have the
same form. Humans that can understand 'John loves Mary' also understand
'Mary loves John' and every other sentence of the form: Name Verb Name.
Humans do not learn the meanings of sentences piecemeal. They compute meanings
on the basis of a sentence's structure. The symbolic processing hypothesis
would explain this fact. A calculator deals with 3x(5+7) using fundamentally
the same operations as it does to calculate 7x(5+3), for it too bases the
calculation on the same symbolic structure in each case.
III. The Advantages of Imagination
- Now let us turn to the facts of imagery. Kosslyn has championed the
view that imagery is a separate cognitive ability that involves feedback
to the visual system. The idea that imagery is important to cognition provides
an alternative to the view that central processing is essentially symbolic.
Brains might contain a special graphic processor along with a symbolic
processor. What advantages would this graphic processor bring?
a. One important idea is that visual imagery carries much more information
than symbolic representations are capable of. A picture is worth a thousand
words.
b. Consulting an image makes things obvious that we would otherwise have
to think out (e.g. the chair is close to the couch in figure 6.2 M p. 96).
c. There are a number of skills such as finding things, planning errands,
trying out ways of building things such as bridges, explaining continental
drift etc. where an ability to imagine the various objects, actions and
likely outcomes is extremely helpful. We can literally see in our mind's
eye the things we should avoid doing when we imagine a course of events.
Imagination gives us foresight. It also allows us to adapt ahead of time.
For example, just by imagining a task, the athlete can train herself to
improve.
d. There is excellent evidence that much of language understanding is based
on metaphors which are in turn founded on visual imagery. For example,
'top' and 'up' mean better or stronger ('top of his game'), while 'down' and
'bottom' mean worse or weaker ('in the pits').
e. We also know that imagery is useful in solving problems (the architect's
diagrams) and improving memory. There is even some new interest in imagery
in artificial intelligence research.
IV. Experiments on Imagery
- Kosslyn had people imagine moving attention from one point to another
on a map. The time to move attention was proportional to the distance on
the map suggesting that attention actually "moves" from point
to point in an imaging space.
- Mental rotation experiments support the same basic idea. Subjects asked
whether two images matched apparently used a mental rotation technique
to solve the task, for the time it took for solution depended on the angle
through which the image would have to be rotated to align the two. However,
work of Pylyshyn shows that angle of rotation is not the only feature that
affects speed on this task.
- Experiments with PET and rCBF scanning show that visual imagination
and other cognitive tasks differ in that the former involve activation
of visual areas of the brain. Furthermore work with brain-damaged patients
shows that brain damage can selectively impair abilities at mental rotation.
Cognitive Science Notes Weeks 13-14 (Linguistics) (CS 6.1, 6.3, 6.4)
I. What is Linguistics?
A. Linguistics is the study of language. The most important thing to explain
is how the phonetic input (which is just a sound wave) is converted into
meaning. (This is analogous to the fundamental question about vision: how
we convert the visual input into a 3-D world of objects.) Other questions
include how knowledge about language is represented, and how it is possible
for a child to learn language.
B. Traditionally linguists have taken a less cognitive view of their discipline.
The idea is that first we need to characterize the regularities in language
with a grammar: a formal theory of exactly how the meaningful units of
language are formed. Since meaning of a sentence or phrase depends on grammatical
structure, a grammatical account might serve as a framework for understanding
how sound is converted into meaning.
C. Prescriptive vs. Descriptive. Cognitive linguistics, where the concern
is with human cognitive activity, should provide a descriptive rather
than prescriptive account of human speech. A prescriptive theory
would explain how we ought to speak with such rules as: A preposition
is something you should never end a sentence with. A descriptive theory
explains how we actually do speak: ending prepositions are common in English
speech, so the grammar must accommodate that construction.
D. Competence vs. performance. Nevertheless, even cognitive linguists do
not intend to study all the gory details of actual language use, with all
its ahs and ums, false starts, sentence fragments, etc. The theory is
not intended to explain our mistakes in the use of language in actual performance.
The idea is that each of us has a basic linguistic competence
in the language we speak. This competence is reflected in the fact that
we all know that 'he loves she' is not grammatical, even though
we might actually utter this sentence by mistake on occasion. So linguistics
studies the fundamental rules of the various human languages (our competence),
but does not try to account for violations of these rules in our everyday
performance.
E. A fundamental presupposition of computational linguistics is that competence
can be represented with a set of rules called a grammar. These rules are
somewhat different from the grammatical rules you learned in English class
because their role is not to preach proper usage, but to reflect linguistic
structure. This assumption is an attractive one, for the rule based approach
seems ideally suited to explaining productivity (our unlimited ability to
understand novel sentences) and systematicity (the fact that we process
phrases with similar grammatical form in the same way) of language.
F. Building a grammar for a language is a very difficult task. We are far
from having a grammar for English, or any other natural language. To start
to appreciate what some of the obstacles might be consider the pattern
of cases where 'that' is optional, rather than required. The pain (that)
I feel is unpleasant. Optional 'that'. The dog that bit me has rabies.
Non-optional 'that'. Everybody who knows English knows this, but what is
the rule here? This is just one of thousands of regularities of English
that need to be explained by a grammar.
G. Another fundamental question is language learning. Every normal human
learns a language effortlessly. After a certain age this ability to acquire
language is lost, and learning a new language takes great effort. What
is the mechanism that makes language learning possible? This question seems
especially pressing when we consider how complicated the task of learning
language is. (A person with an average-size vocabulary has learned 5-10
words a day during childhood, and has mastered scores and scores
of complicated rules such as when 'that' is optional.) Many linguists believe
that the data that a child has to go on carries far too little information
for the child to learn language from scratch. Some special mechanism, part
of our genetic endowment, must exist that explains our spectacular performance
in language learning.
H. This suggests that there must be linguistic universals, that is, regularities
common to all languages that the child can bank on to make the right guesses
about the structure of the language she is learning.
I. The process of language understanding can be divided into 3 levels.
At the phonological level the brain extracts words (morphemes) from the
pattern of sound at the ears. At the syntactic level, the sequence of words
is analysed into a grammatical structure. At the semantical level, rules
are used to convert the syntactic structure into the meaning. Unfortunately
we won't have time to cover phonology. Since the semantical theory is not
as well understood, we will concentrate our study of linguistics on the
syntactic level.
II. Syntax
A. The same sentence can have more than one syntactic structure: Time
flies like an arrow. They talked over the noise. The differences in structure
can be revealed with a constituency test. 'They talked the noise over',
has one meaning: they talked-over (discussed) the noise. This is explained
by a rule that allows a verb-particle cluster to move the particle to the
end of the sentence: John called-up the candidate -> John called the
candidate up. But you can't do this with prepositions. John called up the
stairs -> * John called the stairs up. So this reveals that there are
two readings of 'They talked over the noise': one where 'over' is a particle
(they talked-over the noise), the other where it is a preposition (they
talked, and over that was noise).
B. How can we account for such regularities with a grammar (set of rules)?
Almost all theories of syntax appeal at some point to the notion of a phrase
structure grammar. Here is a simple phrase structure grammar for a
fragment of English:
S -> NP TENSE VP (Sentence -> NounPhrase, TENSEmarker, VerbPhrase)
NP -> DET N (NounPhrase -> DETerminer, Noun)
TENSE -> {PRES, PAST} (TENSEmarker -> either PRESent or PAST)
VP -> V AP (VerbPhrase -> Verb, AdjectivePhrase)
AP -> A (Adjective Phrase -> Adjective)
Rules such as these may be used to create Phrase Markers into which words
can be inserted to create sentences. (See CS p. 245, p. 247). Class exercise.
What else would we need to add to accommodate the facts of English structure?
Some answers: relative clauses, prepositional phrases, ...
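To see how rewrite rules of this kind produce phrase markers, here is a minimal sketch that expands the five rules mechanically. The one-word-per-category lexicon (WORDS) and the helper names expand and show are assumptions made for illustration. The bracketed output format imitates the S(NP(...) ...) notation used elsewhere in these notes.

```python
import itertools

# The five rewrite rules of the fragment. TENSE has two alternatives.
RULES = {
    "S": [["NP", "TENSE", "VP"]],
    "NP": [["DET", "N"]],
    "TENSE": [["PRES"], ["PAST"]],
    "VP": [["V", "AP"]],
    "AP": [["A"]],
}

# Hypothetical toy lexicon: one word per lexical category.
WORDS = {"DET": "the", "N": "dog", "V": "be", "A": "hungry"}

def expand(symbol):
    """Return every phrase marker (as a nested list) derivable from symbol."""
    if symbol in WORDS:                 # lexical category: insert a word
        return [[symbol, WORDS[symbol]]]
    if symbol not in RULES:             # bare marker such as PRES or PAST
        return [[symbol]]
    markers = []
    for rhs in RULES[symbol]:
        # Combine every expansion of each right-hand-side symbol.
        for children in itertools.product(*(expand(s) for s in rhs)):
            markers.append([symbol, *children])
    return markers

def show(marker):
    """Print a marker in bracket notation, e.g. NP(DET('the') N('dog'))."""
    label, *children = marker
    if not children:
        return label
    if isinstance(children[0], str):
        return f"{label}('{children[0]}')"
    return f"{label}(" + " ".join(show(c) for c in children) + ")"

for marker in expand("S"):
    print(show(marker))
```

With this toy lexicon the grammar yields exactly two markers, one for each tense. Adding a rule or a second word to a category multiplies the output, which gives a feel for how quickly a small rule set covers many sentences.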
C. Linguists presume that information on individual words is stored in
memory in what is called the lexicon. The lexicon contains information
about each word's pronunciation, meaning and grammatical features. For
example, the entry for 'eat' would say this is a verb that takes a direct
object, and the one for 'dines' says this is a verb that cannot take a
direct object. The entry for 'rice' would say that this is a mass noun, and
the one for 'steak' that this is a count noun. This information can then
be used to check whether combinations of words that form phrases and sentences
are legal or not. 'John eats steak' is legal since 'eats' takes a direct
object and 'steak' is a noun that can fill that role. But 'John dines steak'
is illegal since 'dines' takes no direct object. 'John ate two steaks'
is legal because 'steaks' is a count noun, and 'two' is a counting adjective.
'John ate two rices' is illegal since 'rice' is a mass noun and does not
take the plural or combine with counting adjectives.
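The lexicon checks just described can be sketched as a lookup table plus two tests. The dictionary layout and the function names object_allowed and counting_allowed are hypothetical. A real lexicon would also carry pronunciation and meaning, which are left out here.

```python
# Hypothetical lexicon entries holding only the grammatical features
# mentioned in the notes.
LEXICON = {
    "eats":  {"cat": "V", "takes_object": True},
    "dines": {"cat": "V", "takes_object": False},
    "steak": {"cat": "N", "count": True},    # count noun
    "rice":  {"cat": "N", "count": False},   # mass noun
}

def object_allowed(verb, obj=None):
    """A direct object is legal only if the verb's entry allows one."""
    return obj is None or LEXICON[verb]["takes_object"]

def counting_allowed(noun):
    """Plural marking and counting adjectives ('two') need a count noun."""
    return LEXICON[noun]["count"]

print(object_allowed("eats", "steak"))    # 'John eats steak'
print(object_allowed("dines", "steak"))   # 'John dines steak'
print(counting_allowed("steak"))          # 'two steaks'
print(counting_allowed("rice"))           # 'two rices'
```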
D. The theory of transformations was a central feature of linguistics in
the 60s and 70s. Ideas in the theory are still incorporated in some way
or another in all theories today. The basic idea is that phrase markers
generated from a phrase structure grammar are then modified by a set of
transformations. Examples include transformations that do the work of generating
questions or converting active voice into passive voice. For example, to
get a question from the marker S(NP(DET N) TENSE(PRES) VP(V NP)), we insert
the question pronoun 'what' into the last NP and swap it with the first
NP: S(NP('what') TENSE(PRES) VP(V NP(DET N))) as in 'What is the problem?'.
A similar transformation takes us from the marker for 'John loves Mary'
to the marker for 'Mary is loved by John'. The question transformation
also explains the form of the sentence 'I know who Bill insulted'. It is
derived from a phrase marker for the sentence 'I know Bill insulted who'
which is then transformed. Transformations also explain how sentences like
'Moe and Curly added salt' are derived by deletion from markers for the
sentence 'Moe added salt and Curly added salt.'
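As a rough sketch of what a transformation does, the two functions below map one structure onto another. Clauses are flattened to simple tuples rather than full phrase markers, and the participle table and function names are assumptions made for illustration, so this is far simpler than any real transformational grammar.

```python
# Hypothetical participle table; a real grammar would derive these forms.
PARTICIPLE = {"loves": "loved", "insulted": "insulted"}

def passive(clause):
    """Map (subject, verb, object) to the corresponding passive clause."""
    subject, verb, obj = clause
    return (obj, "is", PARTICIPLE[verb], "by", subject)

def conjunction_reduction(clause_a, clause_b):
    """Delete a repeated predicate and conjoin the two subjects.

    Returns None when the predicates differ, since the deletion
    then does not apply.
    """
    subj_a, *pred_a = clause_a
    subj_b, *pred_b = clause_b
    if pred_a != pred_b:
        return None
    return (f"{subj_a} and {subj_b}", *pred_a)

print(" ".join(passive(("John", "loves", "Mary"))))
print(" ".join(conjunction_reduction(("Moe", "added", "salt"),
                                     ("Curly", "added", "salt"))))
```

Both functions operate on the form of the input, not on the particular words, so they apply to any clause with the same structure.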
E. To illustrate how a linguistic theory might be applied let us examine
a regularity concerning case. When a noun or pronoun appears in the subject
of a sentence we say it is in the nominative case. When it is in the object
of a verb, it is in the accusative case. In English pronouns are marked
for nominative and accusative cases. We say 'He ran' (not 'Him ran') because
'he' is the nominative case pronoun. We say 'Mary loves him' (not 'Mary
loves he') because 'him' is the accusative case pronoun. We would like
to explain the following puzzling phenomena. (The star: * indicates a sentence
that is not well formed in English.)
(1) I believe she is a spy
(2) *I believe her is a spy
(3) *I believe she to be a spy
(4) I believe her to be a spy
(5) *It was believed she to be a spy
(6) *It was believed her to be a spy
(7) *It was believed Brenda to be a spy
This phenomenon may be explained with some simple ideas: nominative case
(she) is assigned by TENSE, the infinitive 'to be' is tenseless, and every
NP must be assigned some case. Why is it 'she' in (1), not 'her'? Because present
TENSE assigns nominative case. Why is it 'her' in (4) not 'she'? Well there
is no TENSE in 'to be' so nominative is not assigned. Instead, 'believe'
takes accusative case in its second argument: 'John believes her' so the
same case gets set by default in 'John believes her to be a spy'. Why do
neither (5) nor (6) work? First (5) would be wrong because there is no
TENSE to assign nominative case. For (6), note that 'It was believed' does
not take a direct object: *'It was believed Mary'. So the explanation for
why (4) worked won't work here. Sentence (7) is harder to explain because
if you substitute 'Brenda' for the pronoun in (1)-(4) they work. The reason
'Brenda' doesn't work in (7) is that 'Brenda' must be assigned some case
in every good sentence in which it appears, even though the case is not
marked explicitly in the word 'Brenda'. Since neither nominative (5) nor
accusative (6) is a possible assignment in (7) the sentence does not work.
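The reasoning about sentences (1)-(7) can be checked mechanically. The sketch below encodes only the three ideas used above: TENSE assigns nominative, a verb that takes a direct object assigns accusative, and every NP needs some case. The function names and the boolean encoding of each sentence are assumptions made for illustration.

```python
# Case marked on the word itself; None means the word shows no case marking.
MARKED_CASE = {"she": "nominative", "her": "accusative", "Brenda": None}

def assigned_case(clause_is_tensed, matrix_takes_object):
    """Case available to the subject of the embedded clause."""
    if clause_is_tensed:
        return "nominative"       # assigned by TENSE
    if matrix_takes_object:
        return "accusative"       # assigned by the verb, as with 'believe'
    return None                   # no case can be assigned

def well_formed(np, clause_is_tensed, matrix_takes_object):
    case = assigned_case(clause_is_tensed, matrix_takes_object)
    if case is None:
        return False              # every NP must receive some case
    marked = MARKED_CASE[np]
    return marked is None or marked == case

# (np, embedded clause tensed?, matrix verb takes a direct object?)
sentences = {
    1: ("she", True, True),       # I believe she is a spy
    2: ("her", True, True),       # *I believe her is a spy
    3: ("she", False, True),      # *I believe she to be a spy
    4: ("her", False, True),      # I believe her to be a spy
    5: ("she", False, False),     # *It was believed she to be a spy
    6: ("her", False, False),     # *It was believed her to be a spy
    7: ("Brenda", False, False),  # *It was believed Brenda to be a spy
}
for number, args in sentences.items():
    print(number, well_formed(*args))
```

Only (1) and (4) come out well formed, matching the starred judgments. 'Brenda' passes wherever some case is assigned, which is why it works in place of the pronoun in (1)-(4) but fails in (7).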
III. Universals
A. The Case for and Against Linguistic Universals
Cognitive Science Notes Week 14 (Linguistics) (After a talk by Justin
Leiber)
I. How Did Linguistics Get Started?
A. Language is so ubiquitous, it drops out of view. Therefore we believe
that it is somehow easy or obvious. When speakers (say, the French) encounter people
who speak another language there is an irresistible tendency to think their
failure to speak "properly" betrays an inability to reason. (Why
after all would they be so perverse to call everything by the wrong name?)
B. Study of language prior to the 1950s tended to be diachronic or historically
oriented. The concern was with how languages developed. There was also
a concern with written text. And the concern was prescriptive: this is
how language ought to be used, rather than descriptive: here is
how people actually speak.
C. So linguistics was ripe for correction. In the 1920s, there arose new
interest in non-European languages and cultures. Cultural anthropology is
born. There is new emphasis on linguistics as a synchronic study, a study
of how a language actually works at a given moment in time. There is a
move away from prescriptive linguistic study and to a descriptive account
of how people actually talk and how this integrates with the rest of their
social doings.
II. Structural Linguistics
This was the school (1920-1955) that grew out of these new trends. The
two major works in the movement were Bloomfield's Language, and
Zellig Harris' Methods in Structural Linguistics. Some features
of the school:
- Strict behaviorism. We are not supposed to ask people what they mean.
Just look at verbal behavior. (Bloomfield does talk a bit about meaning,
but he characterizes phrase meaning as the influence it has on others' behavior
when uttered.)
- Interest in languages outside Europe, especially American Indian languages.
- Spoken language is emphasized over text. Language is seen as a stream
of sound.
- To understand a language you need to understand a corpus (set of utterances)
used in a culture. Linguistics is the study of "sonic behavior".
- All higher level structure in language is determined by phonetic features
(bottom up), so for example...
- Words are merely the collecting together of sounds that often are uttered
together.
- Harris was interested in discourse analysis (the analysis of whole
groups of sentences). Another influential idea of his was the idea that
sentences could be reduced to normal or kernel form. For example passive
sentences are converted to active, and relative clauses into separate sentences:
'The visible world was created by invisible God.' ==> 'The world is
visible'. 'God is invisible'. 'God created the world'. This inspired the
theory of transformations.
III. Chomsky's Work
A. Chomsky has had an immense influence. He is the most cited living author.
Chomsky proof-read Harris' book and with the support of Nelson Goodman
went off to Harvard. There he created a massive work, The Logical Structure
of Linguistic Theory, which laid the groundwork for a whole new approach
to linguistics. A piece of this, called Syntactic Structures (1957),
had a strong influence on linguistics, and can be thought of as a major
influence on the birth of cognitive science.
B. Chomsky borrowed ideas from the theory of formal systems (logic), and
applied them to language. In such a formal theory we define what we mean
by a well-formed formula (wff) using a set of recursive rules. Like this:
p, q, r are wffs. If X is a wff, then so is ~X. If X and Y are both wffs,
then so are (X&Y), (X->Y), (XvY), and (X<->Y). These allow
the creation of a potentially infinite class of wffs: p, q, ~p, (~p&q),
~(~p&q), ((~p&q)->~p), etc. Chomsky's idea was that the same
was true of language. There is a set of rules that can be used to generate
all (and only) the well-formed sentences of English.
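The recursive definition of a wff translates almost line for line into a recursive recognizer. This is a minimal sketch; the function name is_wff is an assumption. It follows the rules literally, so a binary compound counts as a wff only with its outer parentheses, and the common shorthand that drops them is not accepted.

```python
ATOMS = {"p", "q", "r"}
# Longer connectives are listed first so that '<->' is not read as '->'.
CONNECTIVES = ["<->", "->", "&", "v"]

def is_wff(s):
    """Return True if s is a well-formed formula under the recursive rules."""
    if s in ATOMS:                      # p, q, r are wffs
        return True
    if s.startswith("~"):               # if X is a wff, so is ~X
        return is_wff(s[1:])
    if len(s) >= 2 and s[0] == "(" and s[-1] == ")":
        inner = s[1:-1]
        depth = 0
        for i, ch in enumerate(inner):
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif depth == 0:
                # The first connective outside all parentheses is the main one.
                for op in CONNECTIVES:
                    if inner.startswith(op, i):
                        return is_wff(inner[:i]) and is_wff(inner[i + len(op):])
        return False
    return False

for candidate in ["p", "~p", "(~p&q)", "~(~p&q)", "((~p&q)->~p)", "(p&", "pq"]:
    print(candidate, is_wff(candidate))
```

Three short clauses decide membership for an unlimited set of strings. That is the property the notes attribute to a grammar for English.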
C. Chomsky also borrowed the notion of a transformation from Harris, an
idea that is also inspired by the notion of the application of rules to
a set of sentences to create good reasoning. One such rule is Modus Ponens:
from X and X->Y, deduce Y. Correct logical rules have the feature that
they preserve truth. (If the premises are true then so must be the conclusion.)
This logical idea is the analogue of the Katz-Postal Hypothesis that transformations
preserve meaning.
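The claim that a correct rule preserves truth can be checked by brute force over truth values. The sketch below is an illustration only; the helper names implies and preserves_truth are assumptions. It tests Modus Ponens and, for contrast, an incorrect rule (from Y and X->Y, deduce X).

```python
from itertools import product

def implies(x, y):
    """Truth table for X->Y."""
    return (not x) or y

def preserves_truth(premises, conclusion):
    """True if the conclusion holds in every case where all premises hold."""
    return all(
        conclusion(x, y)
        for x, y in product([True, False], repeat=2)
        if all(premise(x, y) for premise in premises)
    )

# Modus Ponens: from X and X->Y, deduce Y.
modus_ponens = preserves_truth([lambda x, y: x, implies], lambda x, y: y)

# An incorrect rule: from Y and X->Y, deduce X.
affirming_consequent = preserves_truth([lambda x, y: y, implies], lambda x, y: x)

print(modus_ponens, affirming_consequent)
```

Modus Ponens passes and the incorrect rule fails, because with X false and Y true both of its premises hold while its conclusion does not.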
D. On Chomsky's view the goal of linguistics is to produce a grammar, i.e.
a set of recursive rules that generate the whole host of possible sentences
of a language. Language is productive and it is essential to capture this
fact in linguistics. The linguist's role is not to study a corpus (set)
of actual utterances. Instead the linguist studies human competence to produce an
unlimited variety of well formed sentences.
E. So methodology in linguistics may be very different from methodology
in psychology where we must do experiments on many subjects. In linguistics,
the linguist can use her own linguistic competence to make judgements
about what is or is not a sentence of her language. Since this competence
is presumably shared, there is no need to construct experiments across different
speakers. (This methodology makes psychologists very uneasy.)
F. The grammar is what the baby learns when it learns language, so the
linguist's job is to learn explicitly what is learned unconsciously by the
child.
G. By studying grammars we study the rules that children acquire when they
learn language. Since the language heard by children does not contain enough
clues to determine the grammar, there must be some innate mechanism that
helps the child learn the linguistic rules.
H. The goal is to provide formal theories of language at three possible
levels (the higher the better).
1. Observational Adequacy: (Weak generative capacity) Rules that generate
all and only the sentences of the language.
2. Descriptive Adequacy: (Strong generative capacity) Rules that assign
syntactic structure to strings of words.
3. Explanatory Adequacy: A theory of the mechanisms that explain exactly
how the hearer processes language.
IV. The Hierarchy of Grammars
A. Finite State Machines. These are devices that merely move from one box
or node to another, making a selection from each box and continuing along
any arrow from a box. Chomsky showed that no finite state machine can generate
all and only strings of the form: aaaa...bbbb... where the number of as
and bs is the same. But language has structures with similar complexity,
notably wherever there are rules of agreement (say) between noun phrase
and verb phrase: 'John loves Mary', but 'Men love Mary'.
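A small sketch can illustrate the limitation, though it does not prove Chomsky's result. A machine with a fixed number of states can match as and bs only up to the depth it was built for, while a single recursive rule (S -> a S b | a b) handles any depth. The function names and the way the bounded machine is built are assumptions made for illustration.

```python
def make_bounded_machine(max_depth):
    """Build a recognizer with finitely many states (counts 0..max_depth)."""
    def accepts(s):
        state, seen_b = 0, False
        for ch in s:
            if ch == "a":
                if seen_b or state == max_depth:
                    return False      # no state left to count another 'a'
                state += 1
            elif ch == "b":
                if state == 0:
                    return False
                seen_b = True
                state -= 1
            else:
                return False
        return seen_b and state == 0
    return accepts

def matches_recursive(s):
    """Recognize a...ab...b with equal counts via S -> a S b | a b."""
    if s == "ab":
        return True
    return (len(s) >= 4 and s[0] == "a" and s[-1] == "b"
            and matches_recursive(s[1:-1]))

machine = make_bounded_machine(3)
print(machine("aaabbb"), machine("aaaabbbb"))
print(matches_recursive("aaaabbbb"), matches_recursive("aab"))
```

Whatever bound is chosen, some longer string of the right form defeats the bounded machine. The recursive rule needs no bound because each application wraps one more matched pair around a smaller instance of the same pattern.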
B. Phrase Structure Grammars. Phrase structure grammars allow the introduction
of rewrite rules with variables referring to grammatical types. This vastly
improves the power of the grammar to account for the structure of language.
Rules of language must be expressed not by how one transitions from one
'box' to another, but by the understanding of grammatical categories: NP,
VP, Auxiliary Verb, etc. How else did Justin's daughter generalize to the sentence
'I am going, amn't I?' She "knew" the rule that tag negation
works with auxiliary verbs ('I have it, haven't I') but not regular verbs ('*I
walk, walkn't I'), so this construction makes sense. This kind of over-generalization
is a basic feature of language learning: children learn irregular plurals:
mice, but as they become more familiar with the regular plural rule, they
go back to mistakes: mouses. Click experiments also suggest that we are
aware of the phrase structure of sentences that we hear, for the time at
which clicks are heard subjectively moves towards the major grammatical
boundaries.
C. Transformational Grammars. These introduce yet another innovation: rules
that transform phrase structures into alternative forms. Transformations
provide especially economical explanations for the formation of questions
and passive voice, and also account for deletions ('John and Mary
like Jill' instead of 'John likes Jill and Mary likes Jill'). We may use such
deletions to aid memory chunking, which helps overcome the 7 plus or minus
2 constraint on short term memory.
Cognitive Science Notes Weeks 14-15 (Philosophy) (M Ch. 9; CS 8.3 but
not pp. 355-362)
I. The Mind Body Problem
A. What is the Mind? How is the Mind related to the Brain? Some possible
answers:
- Dualism. The mind is a radically different sort of substance or property
outside the physical world.
- Materialism (Physicalism, Naturalism) The Mind is a part or feature
of the physical world.
a. Reductive materialism. Mental states just are physical states of the
brain.
b. Functionalism. Mental states are computational states (or more abstract
higher level states).
c. Eliminative materialism. Mental states are myths that science will do
away with.
d. Instrumentalism. The notion of a mental state may be theoretically useful,
but like vectors and centers of mass, it does not refer to something that
actually exists in the brain.
B. If Cognitive Science seeks to provide a complete account of the mind,
then dualism is to be avoided, for it puts the mind beyond the reach of
the natural sciences. Most cognitive scientists (where they have a philosophical
view) believe in some form of functionalism. This fits rather nicely with
the classical computational approach (CRUM).
II. Some Problems for Functionalism and Nearby Brands of Materialism
A. Emotions. It does not seem that emotions could be computational
states of anything. Suppose a computer-robot were to mimic my brain's computational
states exactly. Still it wouldn't feel anything. There seems to be an important
divide between thought and rationality on one hand and the emotions. Perhaps
functionalism is a good theory of the former but it seems much less able
to account for the latter. Here are some responses to the challenge.
Further Reading on Consciousness and Qualia
Chalmers, D. The Conscious Mind (Presents a dualist theory. Has
a good bibliography)
Chalmers, D. "The Puzzle of Conscious Experience," Scientific
American 273, (1995) pp. 80-86 (A very accessible review of Chalmers'
dualism)
Churchland, Paul "The Rediscovery of Light," Journal of Philosophy,
1995
Churchland, Paul and Patricia "Could a Machine Think?" (Jan.
1990) Scientific American pp. 32-37
Crick, F. and Koch, C. "Towards a Neurobiological Theory of Consciousness,"
Seminars in Neurosciences 2 (1990) pp. 263-275 (A neurophysiological
view: consciousness is a certain synchronized activity of neural bundles
oscillating at about 40 hertz).
Damasio, A. Descartes' Error (Argues that emotion is central to
cognition)
Dennett, D. Consciousness Explained (An instrumentalist account of
consciousness. Fun reading but challenging nonetheless.)
Searle, J. "Is the Brain's Mind a Computer Program?" Scientific
American (Jan. 1990) pp. 26-31 (An easy entry to some of the arguments
against functional accounts of qualia)
Cognitive Science Study Questions Final
Part I
Cognitive Psychology - Imagination (CS 2.1-2.3; M Ch. 6; CS 2.7)
Linguistics (CS 6.1, 6.3, 6.4)
Philosophy of Mind (M Ch. 9; CS 8.3 but not pp. 355-362)
Identify Terms
John Locke, William James, Principles of Psychology, Behaviorism,
Central Systems, Motor Systems, Sensory Systems, The Imagery Debate, Turing
Machine, von Neumann architectures, Productivity, Systematicity, Symbolic
Processing, Kosslyn, Descriptive vs Prescriptive Linguistics, Performance
vs. Competence, Grammar, Finite State Machine, Phrase Structure Grammar,
Transformational Grammar, Universal Grammar, Phonology, Phoneme, Morpheme,
Syntax, Semantics, Discourse Analysis, Constituency Test, DETerminer, Lexicon,
Mass vs. Count Noun, Active vs. Passive Voice, Nominative vs. Accusative
Case, Principles and Parameters, Binding Theory, Bounding Theory, X-bar
theory, Agent vs. Patient, Structural Linguistics, Bloomfield, Zellig Harris,
Normal Form, Kernel Form, Chomsky, Syntactic Structures, Well-Formed
Formula (wff), Katz-Postal Hypothesis, Corpus, Observational Adequacy,
Weak Generative Capacity, Descriptive Adequacy, Strong Generative Capacity,
Explanatory Adequacy, The Mind Body Problem, Materialism, Physicalism,
Naturalism, Reductive Materialism, Identity Theory, Functionalism, Eliminative
materialism, Instrumentalism, Damasio, amygdala, dopamine, serotonin, qualia,
quale, Absent Qualia Objection, Chinese Nation Objection, thalamus, argument
from lack of imagination, zombies, ersatz qualia.
Questions
Cognitive Psychology
1. Give an account of the shifting attitudes towards imagination that
have developed in the last 100 years in cognitive psychology. Give the
names of major influential figures, and explain the cultural events that
affected these intellectual trends.
2. Why was research on imagination relatively unpopular in cognitive
psychology until recently? What has encouraged new interest
in imagination?
3. Give reasons for thinking that a complete cognitive science should study
imagery.
4. What evidence do we have that the ability to imagine is important to
thought and reasoning?
5. What happens to a man blind from birth due to cloudy corneas (but whose
eyes are otherwise normal) when his sight is restored? Explain your answer.
6. Cite evidence for each side in the imagery debate. A good answer will
explain what the debate is about and how the evidence is relevant.
7. "The classical theory is that cognitive science's job is to discover
what is common to the human cognitive architecture." Explain in detail.
8. According to the classical theory what are the main divisions in the
human cognitive architecture? Where in this architecture is symbolic processing
carried out?
9. Cite reasons for thinking that the brain depends on symbolic processing.
10. How do the phenomena of productivity and systematicity support the
symbolic processing approach to cognitive science?
11. What advantages do images have over symbolic forms of representation?
12. Describe two of Kosslyn's experiments on imagery and explain their
significance.
Linguistics
1. What is the fundamental question to be answered by linguistics?
2. Explain the difference between prescriptive and descriptive linguistics.
How would you classify linguistics today and why?
3. What is the distinction between competence and performance theories
in linguistics? Explain how the decision to create a competence theory
affects methodology in linguistics.
4. What is a grammar and why have grammars been important to linguistics
in the last 40 years? Why is it so hard to provide a grammar for English?
Cite examples to make your case.
5. Many linguists think that the human brain contains a special device
dedicated to language. What evidence is there to support this
view?
6. What are linguistic universals? Why are they important in theories of
language learning?
7. Use an example to explain how a constituency test may be used to reveal
the linguistic structure of a sentence.
8. Create a phrase structure grammar that is capable of generating the
following sentences (and sentences like them). Sally drank booze. The man
drank. The man who visited Sally drank booze.
9. What is the lexicon? Give an example of how it plays a role in determining
which sentences are grammatically well-formed.
10. Explain what the theory of transformations says and give examples of
some phenomena it explains.
11. Give a linguistic theory that explains the following data (* means the
sentence is ungrammatical):
(1) I believe she is a spy
(2) *I believe her is a spy
(3) *I believe she to be a spy
(4) I believe her to be a spy
(5) *It was believed her to be a spy
(6) *It was believed Brenda to be a spy
12. What are linguistic universals? Give some examples. Why do linguists
tend to believe in them?
13. What is Principles and Parameters Theory? Illustrate with an example
or two.
14. What are binding theory, theta theory and X-bar theory about?
15. Linguistics prior to 1950 tended to be diachronic rather than synchronic.
Explain with examples.
16. How did the birth of cultural anthropology in the 1920s influence the
development of linguistics?
17. Describe the structural linguistics movement. Name its most
influential authors, and explain the fundamental assumptions made by this
school.
18. How was Chomsky's work different from work done by structural linguists?
19. How did ideas drawn from logic and the theory of formal systems influence
Chomsky's ideas about how to proceed in linguistics?
20. What is the goal of linguistics according to Chomsky?
21. Chomsky felt that the linguist's role is not to study a corpus (set)
of actual utterances. What, then, is the linguist's role, and how does it affect
the linguist's methods?
22. Why is learning so important to Chomsky's theory of language?
23. Why are finite state grammars inadequate for explaining language?
How does the use of phrase structure grammars improve matters?
24. What evidence do we have that the human brain actually processes a
phrase structure grammar?
25. Name an advantage that transformational grammars might provide for
human language processing.
Philosophy of Mind
1. What is the Mind-Body Problem? Why is it so difficult?
2. Give at least 4 different positions one might take in the philosophy
of mind and explain how they differ from each other.
3. Why do you think that most cognitive scientists who have a worked-out
philosophical position tend to be functionalists?
4. Why do emotions challenge the functionalist account of the mind? Explain
a number of different strategies one might take to meet the challenge.
5. Give the basic outlines of a functionalist theory of the emotions.
6. Give the basic outline of a psychophysical account of the emotions.
7. What are qualia? Why is accounting for qualia such a difficult problem
for functionalist theories of the mind?
8. Describe at least two different strategies for providing a scientific
account of qualia.
9. What is the absent qualia objection to functionalism? Explain with reference
to zombies and the Chinese Nation. What responses to the problem of absent
qualia have been explored?
10. "I just cannot imagine how the activity of neurons could amount
to conscious experiences." Give reasons for and against the idea that
this counts as a good objection to functionalist theories of consciousness.
Part II The Whole Course
(I will select items from the study questions for quizzes 1 and 2. I will
also include a section where I will ask more general questions such as
these:)
Our course in cognitive science has been drawn from the following academic
areas: logic, artificial intelligence, neuroscience, neuropsychology (especially
vision), cognitive psychology (especially concepts and imagery), connectionism,
linguistics, and the philosophy of mind. Select as many of these areas
as are relevant and explain how research in these disciplines throws light
on the following questions:
- Are top-down or bottom-up theories of cognition more apt?
- What is the nature of reasoning?
- Is a symbolic processing model of cognition adequate?
- What are concepts?
- What is consciousness?
- How do we understand language?
- How are objects recognized?
- What are perceptions, for example, the sensation of seeing the color
red?
- What features of cognition are innate?
- How does the brain learn?
- What is human memory?