Artificial Intelligence: A Modern Approach (3rd Edition) by Stuart J. Russell and Peter Norvig is the standard textbook in the field of artificial intelligence. It appears on many university course reading lists.

In this series, “Reading Artificial Intelligence: A Modern Approach”, I am going to post what I learn from reading this book, along with some code.

This is a dense book of 1000+ pages. I have tried to finish it a few times without success, but the more I revisit it, the more familiar I become with it. Hopefully I can finish it this time.

There are bibliographies, summaries, and exercises at the end of each chapter. They are useful if you want to run through the material again after finishing the book and test your understanding.

## Chapter 1: Introduction

There are **4 ways of approaching AI**, along two dimensions: thinking/acting and humanly/rationally. They are thinking humanly (e.g. cognitive science), thinking rationally (e.g. logic), acting humanly (e.g. the Turing Test), and acting rationally (e.g. rational agents). The first dimension, thinking or acting, distinguishes thought processes from behavior, while the second, humanly or rationally, distinguishes matching human performance from doing the right thing. The book’s approach to AI is acting rationally, and it seeks to build rational agents. Agents can be robots or software.

Then, the chapter introduces the **different fields that contributed to AI** – philosophy (dualism, rationalism, materialism, empiricism, induction, logical positivism, observation sentences, confirmation theory), mathematics (logic, computation, probability, algorithms, the incompleteness theorem, tractability, NP-completeness), economics (utility, decision theory, game theory, operations research, Markov decision processes, satisficing), neuroscience (neurons, singularity), psychology (behaviorism, cognitive psychology), computer engineering (efficient computers, the computer as artifact, programmable computers), control theory and cybernetics (homeostatic systems, maximizing an objective function), and linguistics (computational linguistics, natural language processing, knowledge representation).

Next, **the history of AI**.

The first period is the gestation of artificial intelligence (1943-1955).

- First work on AI by Warren McCulloch and Walter Pitts (1943) – artificial neuron model where each neuron is either “on” or “off”.
- Donald Hebb introduced Hebbian learning, a rule for updating the connection strengths between neurons (1949).
- Marvin Minsky and Dean Edmonds built the first neural network computer (SNARC) with 3000 vacuum tubes and 40 neurons (1950).
- Marvin Minsky later proved theorems showing the limitations of neural networks.
- Alan Turing introduces the Turing Test (1950).
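
To make the first two bullets concrete, here is a minimal sketch of an on/off threshold neuron in the spirit of McCulloch and Pitts, plus a simple Hebbian-style weight update. The function names, the weights, the thresholds, and the learning rate are my own choices for illustration and are not taken from the book.

```python
def mcp_neuron(inputs, weights, threshold):
    """Return 1 ("on") if the weighted input sum reaches the threshold, else 0 ("off")."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def hebbian_update(weights, inputs, output, rate=0.1):
    """Strengthen each connection whose input and the neuron's output are active together."""
    return [w + rate * x * output for w, x in zip(weights, inputs)]


# With suitable (hand-picked) thresholds, one unit computes a logical connective.
def logical_and(a, b):
    return mcp_neuron([a, b], [1, 1], threshold=2)


def logical_or(a, b):
    return mcp_neuron([a, b], [1, 1], threshold=1)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "AND:", logical_and(a, b), "OR:", logical_or(a, b))
    # Only the first connection is strengthened, because only its input was active.
    print(hebbian_update([0.0, 0.0], inputs=[1, 0], output=1))
```

The point of the sketch is only that a single on/off unit can compute simple logical connectives, and that a Hebbian rule changes a weight only when input and output fire together.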

The birth of artificial intelligence (1956)

- Two-month AI workshop at Dartmouth in the summer of 1956, organized by John McCarthy. The workshop introduced the key players in AI to each other.
- At the workshop, Allen Newell and Herbert Simon presented a reasoning program called the Logic Theorist (LT). It could prove most of the theorems in Chapter 2 of Russell and Whitehead’s Principia Mathematica and came up with a shorter proof for one theorem.

Early enthusiasm, great expectations (1952-1969)

- Allen Newell and Herbert Simon introduced the General Problem Solver (GPS), which approaches problems by considering subgoals and possible actions in a way similar to humans, embodying the “thinking humanly” approach.
- Allen Newell and Herbert Simon formulated the physical symbol system hypothesis (1976), which states that “a physical symbol system has the necessary and sufficient means for general intelligent action.”
- Herbert Gelernter introduced the Geometry Theorem Prover, which could prove theorems that many mathematics students find tricky.
- Arthur Samuel wrote a series of programs for checkers that learned to play at a strong amateur level. They disproved the idea that computers can only do what they are told to do, since the program came to play better than its creator.
- John McCarthy defined the language LISP (1958), invented time sharing (1958), and published “Programs with Common Sense”, which introduced the Advice Taker (1958).
- J. A. Robinson discovered the resolution method (a complete theorem-proving algorithm for first-order logic) (1965).
- Cordell Green’s question-answering and planning system (1969) uses logic.
- Shakey robotics project at Stanford Research Institute (SRI) integrates logic and physical activity.
- Minsky and students work on solving limited problems called microworlds.
- James Slagle’s SAINT program (1963) solved closed-form calculus integration problems typical of first-year college courses.
- Tom Evans’s ANALOGY program (1968) solved geometric analogy problems that appear in IQ tests.
- Daniel Bobrow’s STUDENT program (1967) solved algebra story problems.
- David Huffman’s vision project (1971), David Waltz’s vision and constraint-propagation work, Patrick Winston’s learning theory (1970), Terry Winograd’s natural language understanding program (1972), Scott Fahlman’s planner (1974).
- Shmuel Winograd and Jack Cowan showed how a large number of elements could collectively represent an individual concept (1963).
- Bernie Widrow enhanced Hebb’s learning methods and called his networks adalines (1960, 1962).
- Frank Rosenblatt did the same with his perceptrons (1962).
- Perceptron convergence theorem introduced (1962).

A dose of reality (1966-1973)

- A report by an advisory committee found that “there has been no machine translation of general scientific text, and none is in immediate prospect.” All U.S. government funding for translation was cancelled.
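- The realisation that many AI problems are intractable: methods that worked on toy examples failed to scale up, e.g. early experiments in machine evolution (now called genetic algorithms).
- The British government ended support for AI research in all but two universities, following the Lighthill report (1973).
- Marvin Minsky and Seymour Papert showed (1969) that a two-input perceptron cannot be trained to recognize when its two inputs are different. Research funding for neural nets soon dwindled.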
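
The perceptron limitation in the last bullet can be tried out directly. Below is a rough sketch, under my own assumptions (integer weights starting at zero, a learning rate of 1, at most 100 passes over the data), of the classic perceptron learning rule applied to AND and to XOR ("the two inputs are different"). The helper name `train_perceptron` is hypothetical and does not come from the book.

```python
def train_perceptron(samples, epochs=100, rate=1):
    """Train a two-input perceptron; return (weights, bias, converged)."""
    w = [0, 0]
    b = 0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), target in samples:
            # Threshold unit: output 1 only if the weighted sum is positive.
            pred = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            delta = target - pred
            if delta != 0:
                # Perceptron rule: nudge weights toward the correct answer.
                w[0] += rate * delta * x1
                w[1] += rate * delta * x2
                b += rate * delta
                errors += 1
        if errors == 0:
            # A full pass with no mistakes: the data has been separated.
            return w, b, True
    return w, b, False


AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

if __name__ == "__main__":
    print("AND converged:", train_perceptron(AND_DATA)[2])
    print("XOR converged:", train_perceptron(XOR_DATA)[2])
```

AND is linearly separable, so the rule settles on a separating line, which is what the perceptron convergence theorem promises. XOR is not linearly separable, so no choice of weights for a single unit can classify all four cases and training never reaches an error-free pass.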

Knowledge-based systems: The key to power? (1969-1979)

- Domain specific systems/ expert systems.
- Ed Feigenbaum, Bruce Buchanan, and Joshua Lederberg introduced DENDRAL program (1969) that can infer molecular structure from mass spectrometer data.
- Ed Feigenbaum and others began the Heuristic Programming Project (HPP) to investigate the extent to which expert systems could be applied to other areas.
- Ed Feigenbaum, Bruce Buchanan, and Dr. Edward Shortliffe introduced MYCIN to diagnose blood infections using about 450 rules acquired from experts. It incorporated a calculus of uncertainty called certainty factors.
- Winograd’s SHRDLU to understand natural language.
- Schank and students developed programs to understand natural language.

AI becomes an industry (1980-present)

- First successful commercial expert system, R1 at Digital Equipment Corporation (1982).
- Nearly every major U.S. corporation had its own AI group and was either using or investigating expert systems.
- The Japanese announced the “Fifth Generation” project (1981), a 10-year program to build intelligent computers running Prolog.
- The United States formed the Microelectronics and Computer Technology Corporation (MCC).
- Britain’s Alvey report reinstated the funding cut by the Lighthill report.
- The AI industry boomed from a few million dollars in 1980 to billions of dollars in 1988.
- Then came the “AI Winter”, when many companies failed to deliver on extravagant promises.

The return of neural networks (1986-present)

- Back-propagation reinvented (mid-1980s)

AI adopts the scientific method (1987-present)

- Build on existing theories rather than propose new ones.
- Base claims on rigorous theorems and hard experimental evidence rather than intuition.
- Work on real-world applications rather than toy examples.
- Replicate experiments using shared repositories of test data and code.
- Evident in speech recognition, e.g. hidden Markov models (HMMs) – based on rigorous mathematical theory and trained on a large corpus of real speech data.
- Machine translation, neural networks, data mining, robotics, computer vision, and knowledge representation.
- Bayesian networks introduced.

The emergence of intelligent agents (1995-present)

- Trying to build “whole agent” or human-level AI (HLAI) or Artificial General Intelligence (AGI).
- Allen Newell, John Laird, and Paul Rosenbloom introduced SOAR (1987, 1990), a complete agent architecture.
- Web-based applications, e.g. search engines, recommender systems, and website aggregators on the Internet; the “-bot” suffix has entered everyday language.
- Realization that isolated subfields of AI need to be reorganized, sensors need to handle uncertainty.

The availability of very large data sets (2001-present)

- Increasing availability of very large data sources, e.g. trillions of words of English, billions of images from the Web, billions of base pairs of genomic sequences.
- A mediocre algorithm with a very large data set can outperform the best known algorithm with a small data set.

Lastly, the chapter gives the **state of the art of AI**. State-of-the-art AI includes robotic vehicles (STANLEY, BOSS), speech recognition (airline booking), autonomous planning and scheduling (NASA’s Remote Agent, MAPGEN, MEXAR2), game playing (IBM’s DEEP BLUE), spam fighting, logistics planning (DART), robotics (Roomba, PackBot), and machine translation.