Robin Garden / 基础知识 / Artificial intelligence

0 0


Artificial intelligence

2011-09-15  Robin Gar...

Artificial intelligence

A modern approach

Second edition

Stvart J.Russell

Peter Norvig




The main unifying theme is the idea of an intelligence agent.

Using the web site:


1 introduction

How we think.

AI: understand and build intelligent entities.

One of the newest sciences.

Start in earnest soon after world war II.

AI still has openings for several full-times Einsteins.


1.1   what is AI


human-centered approach: empirical

rationalist approach

acting humanly: the turing test approach

the computer past the test if a human interrogator cannot tell whether the written responses come from a person or not.



Natural language processing

Knowledge representation

Automated reasoning

Machine learning


Thinking humanly: the cognitive modeling approach


Cognitive science: similarity and difference


Thinking rationally: the “laws of thought” approach




Acting rationally: the rational agent approach


1.2   the foundations of AI

neuroscience: how brains process information?




2 intellligence agents

2.1 agents and environments

Agent, sensors, actuators, perceiving, acting, environment

Percept, percept sequence.

An agent’s choice of action at any given instant can depend on the entire percept sequence observed to date.

Agent function: maps any given percept sequence to an action.

The agent function is an abstract mathematical description; the agent program is a concrete implementation, running on the agent architecture.


2.2 good behavior: the concept of rationality

A rational agent is one that does the right thing.

Performance measures

A rational agent should select an action that is expected to maximize its performance measure.


Rationality is not the same as perfection. Rationality maximize the expected performance, while the perfection maximize the actual performance.

Rationality does not require omniscience.

Doing actions in order to modify future percepts, sometimes called information gathering, is an important part of rationality.

Rational agents not only gather information, but also learn as much as possible from what it perceives.

Successful agents split the task of computing the agent function is done by its designers; when it is deliberating on its next action, the agent does more computation; and as it learns from experience, it does more computation to decide how to modify its behavior.

A rational agent should be autonomous, it should learn what it can to compensate for partial or incorrect prior knowledge.

When the agent has had little or no experience, it would have to act randomly unless the designer gave some assistance.


2.3 the nature of environments

(1) Specifying the task environment

PEAS: performances, environments, actuators, sensors


(2) properties of task environments

Fully observable vs. partially observable

Deterministic vs. stochastic

Episodic vs. sequential

Static vs. dynamic

Discrete vs. continuous

Single agent vs. multiagent, competitive multiagent, cooperative multiagent

Environment class

Environment generator


2.4 the structure of agents

Agent = architecture + program

The job of AI is to design the agent program that implement the agent function.


agent program

Key challenge: find out the how to write programs that produce rational behavior from a small amount of code rather than from a large number of table entries.

For example: square roots table vs. calculator


(1) simple reflex agents

Condition-action rule: if…then…

Infinite loops are often unavoidable for simple reflex agents operating in partially observable environments. Escape from infinite loops is possible if the agent can randomize its actions.


(2) model-based reflex agents

The agent should maintain some sort of internal state that depends on the percept history and thereby reflects at least some of the unobservable aspects of the current state.

Keep track of the current state of the world using an internal model


(3) goal-based agents

Search and planning are the subfields of AI devoted to finding action sequences that achieve the agent’s goals.


(4) utility-based agents

If one world state is preferred to another, then it has higher utility for the agent.

Utility function: map a state onto a real number, which describes the associated degree of happiness.


(5) learning agents

Turing: build learning machines and then teach them.

Learning agent: learning element, performance elelment, critic, problem generator

Performance standard




3 solving problems by searching

Goal formulation

Problem formulation, the process of deciding what actions and states to consider, given a goal.


An agent with several immediate options of unknown value can decide what to do by first examining different possible sequences of actions that lead to states of known value, and then choosing the best sequence.

This process of looking for such a sequence is called search.



Open-loop system: ignoring the percepts breaks the loop between agent and environment


Well-defined problems and solutions

Four components of a problem:

(1) Initial state

(2) A description of the possible actions available to the agent.

Successor function, successor

State Space, form a graph in which the nodes are states and the arcs between nodes are actions


(3) Goal test funciton

determine whether a given state is a goal state.

(4) Path cost

path coat function, step cost


a solution to a problem is a path from the initial state to a goal state. Solution quality is measured by the path cost function, and the optimal solution has the lowest path cost among all solution.


Formulating problems

Abstraction: removing detail from a representation



3.2 example problems

Toy problem

Real-world problem


Toy problems

Sliding-block puzzles

8-queens problem

Incremental formulation vs. complete-state formulation


Real-world problems

Route-finding problem

Touring problem

Traveling salesperson problem (TSP)

VLSI layout, cell layout, channel routing

Robot navigation

Automatic assembly sequencing, protein design

Internet searching


3.3 searching for solutions

Search tree, search graph

Search node

Expanding, generating

Search strategy

Distinguish between state space and search tree, nodes and states

Node’s five components:

State, parent-node, action, path-cost, depth

Fringe is collection of leaf nodes, leaf node

The search strategy would be a function that selects the next node from fringe to e expanded.

The collection of node is implemented as a queue.


Measuring problem-solving performance

Completeness, optimality, time complexity, space complexity

Complexity is expressed in terms of three quantities:

Branching factor b, depth of the shallowest goal node d, max length of any path m

Time is measured in terms of the number of nodes generated during the search, and space in terms of the max number of nodes stored in memory.

Search cost

Total cost, search cost, path cost


3.4 Uninformed search strategies

Uninformed search, blind search

Informed search, heuristic search


(1) Breadth-first search

Fringe: FIFO queue

Shallow nodes are expanded before deeper nodes.

The memory requirements are a bigger problem for breadth-first search than is the execution time.

Exponential-complexity search problems cannot be solved by uniformed methods for any but the smallest instances.


(2) Uniform-cost search

Expands the node n with the lowest path cost


(3) Depth-first search

Expand the deepest node in the current fringe

Implemented by tree-search with a LIFO queue, eg. stack.

Modest memory requirement, bm+1

A variant of depth search: backtracking search


(4) Depth-limited search

Imposes a fixed depth limit on a depth-first search

Introduce an additional source of incompleteness

Diameter of the state space

Can be implemented as a simple modification to the general tree-search algorithm or to the recursive depth-first search algorithm.

Two kinds of failure: the standard failure value indicates no solution; the cutoff value indicates no solution within the depth limit


(5) Iterative deepening depth-first search

(6) Bidirectional search

(7) Comparing uninformed search strategies



3.5 avoiding repeated states

Detection of repeated states

Algorithm that forget their history are doomed to repeat it

Closed list, stores every expanded node

Open list, fringe of unexpanded nodes

Graph-search is more efficient than tree-search

Graph-search could miss an optimal solution.


3.6 searching with partial information

Different types of incompleteness lead to three distinct problem types:

Sensorless problem (conformant problems)

Contingency problems

Exploration problems



Sensorless problem :


Belief state

To solve sensorless problem: search in the space of belief states rather than physical states

Solution: a belief state that all of whose members are goal states


Contingency problem:

Agent can act before it has found a guaranteed plan.

Rather than considering in advance every possible contingence that might arise during execution, it is often better to start acting and see which contingencies do arise. The agent can then continue to solve the problem, taking into account the additional info…

This type of interleaving of search and execution is also useful for exploration problems and for game playing.



4 informed search and exploration

4.1 informed (heuristic) search strategies

Best-first search, a node selected for expansion based on an evaluation function, f(n)


A key component of these algorithm is a heuristic function, h(n), that is the estimated cost of the cheapest path from node n to a goal node.



Expand the node that is closest to the goal

It evaluates nodes by using just the heuristic function: f(n) = h(n)



A * SEARCH: minimizing the total estimated solution cost

F(n) = g(n)+h(n)

g(n) is the cost to reach the node, h(n) is the cost to get from the node to the goal, f(n) is the estimated cost of the cheapest solution through n

A* is complete and optimal


Analyzing the optimality using TREE-SEARCH:

A* is optimal if h(n) is an admissible heuristic, that is provided that h(n) never overestimates the cost to reach the goal.

A* using TREE-SEARCH is optimal if h(n) is admissible.


Analyzing the optimality using GRAPH-SEARCH:

Consistency (monotonicity): h(n) <= c(n,a,n’) + h(n’), which is general triangle inequality

A* using GRAPH-SEARCH is optimal if h(n) is consistent.

If h(n) is consistent, then the values of f(n) along any path are nondecreasing.

We can draw contours in the state space, just like the contours in a topographic map.


I f C* is the cost of the optimal solution path, then we can say:

A* expands all nodes with f(n) < C*.

A* might then expend some of the nodes right on the “goal contour” (where f(n) = C*) before selecting a goal node.

A* expends no nodes with f(n) > C*.


Pruning, contours


A* is optimally efficient for any given heuristic function.

The number of nodes within the goal contour search space is still exponential in the length of the solution.

Computation time is not A*’s main drawback.

A* is not practical for many large-scale problems.


(2) Memory-bounded heuristic search

Iterative-deepening A* (IDA*) algorithm


Recursive best-first search (RBFS)

Keep track of the f-value of the best alternative path available from any ancestor of the current node.

Back up the value of the forgotten node to its parent.

Somewhat more efficient than IDA*, but still suffers from excessive node regeneration.


IDA* and RBFS suffer from using too little memory.


MA*: memory-bounded A*

SMA*: simplified MA*, expands the best leaf and deletes the worst leaf.

SMA* is complete if there is any reachable solution; might well be the best general purpose algorithm for finding optimal solution particularly when the state space is a graph, step costs are not uniform, and node generation is expensive compared to the additional overhead of maintaining the open and closed lists.


Memory limitation can make a problem intractable from the point of view of computation time.

No theory to explain the tradeoff between time and memory, the only way out is drop the optimality requirement.


(3) learning to search better (???)

Metalevel state space

Object-level state space

Metalevel learning



4.2 heuristic function

(1) the effect of heuristic accuracy on performance

Effective branching factor b*

A well-designed heuristic would have a value of b* close to 1, allowing fairly large problem to be solved.

Domination: for any nodes n, h2(n)>=h1(n)

A* using heuristic function with larger value will never expand more nodes than A* using heuristic function with smaller value.

It is always better to use a heuristic function with higher values, provided it does not overestimated and that the computation for the heuristic is not too large.


(2) inventing admissible heuristic functions

Relaxed problem: a problem with fewer restrictions on the actions

The cost of an optimal solution to a relaxed problem is an admissible heuristic for the original problem.

Heuristic of relaxed problem is also admissible and consistent.

ABSOLVER: can generate heuristic automatically from problem definitions, using the relaxed problem method and various other techniques.

Composite heuristic: h(n)=max{h1(n),…,hm(n)}

Admissible heuristic can also be derived from the solution cost of a subproblem of a given problem.

Pattern databases: store exact solution costs of every possible subproblem instance.

Disjoint pattern databases: the sum of the two costs is still a lower bound on the cost of solving the entire problem

Disjoint pattern databases work for sliding-tile puzzles because the problem can divided up in such a way that each move affects only one subproblem.


(3) learning heuristic from experience

Inductive learning




4.3 local search algorithm and optimization problem

In many problems, the path to the goal is irrelevant.

Local search algorithms operate using a single current state, rather than multiple paths, and generally move only to neighbors of that state, do not worry about path at all.

Two key advantages:

They use very little memory

They often find reasonable solutions in large or infinite state spaces for which systematic algorithm are unsuitable


Useful for solving pure optimization problems


State space landscape, location, elevation

Global minimum, global maximum

Complete local search algorithm: find a goal if one exists

Optimal algorithm: find a global minimum/maximum


(1) hill-climbing search

Complete-state formulation

Greedy local search


Hill-climbing often gets stuck for the following reasons:

Local maxima


Plateau, shoulder


Stochastic hill climbing

First-choice hill climbing: generating successors randomly until one is generated that is better than the current state.


Hill-climbing are imcomplete for they can stuck on local maxima


Random-restart hill climbing: it is complete with probability approaching 1

If each hill climbing search has a probability p of success, then the expected number of restarts required is 1/q.

The success of hill climbing depends very much on the shape of the state-landscape: if there are few local maxima and plateau, random-restart hill climbing will find a good solution very quickly.


(2) simulated annealing search

A hill-climbing is imcomplete. A purely random walk is complete, but extremely inefficient.

Simulated annealing: combine hill climbing and random walk.

The simulated annealing is to start by shaking hard (i.e. at a high temperature) and then gradually reduce the intensity of the shaking (i.e. lower the temperature).

simulated annealing: pick a random move, if the move improves the situation, it is always accepted. Otherwise, the move is accepted with some probability less than 1


(3) local beam search

Keep track of k states rather than just one

Begin with k randomly generated states, all the successors of all k states are generated, if any one is a goal, halts. Otherwise, selects the k best successors from the complete list and repeats


In a local beam search, useful information is passed among the k parallel search threads.


Stochastic beam search: chooses k successors at random with some probability


(4) genetic algorithm

Variant of stochastic, generate successors by combining two parent states


individual is represented as a string over a finite alphabet

Evaluation function, fitness function

Crossover, mutation


The primary advantage, if any, of genetic algorithm comes from the crossover operation.

Schema, instance of a schema


Genetic algorithm have had a widespread impact on optimization problems, such as circuit layout and job-shop scheduling.



4.4 local search in continuous spaces (?)


Gradient, Empirical gradient

Line search: extending the current gradient direction until f starts to decrease again

Newton-raphson method

Hessian matrix


Constrained optimization

Linear programming problem


4.5 online search agents and unknown environments

Offline search:

Have to come up with an exponentially large contingency plan that considers all possible happenings

Online search: interleaving computation and action

Need only consider what actually does happen

Necessary idea for an exploration problem


Online search problems:

Can be solved only by an agent executing actions, rather than by a purely computational process.


Competitive ratio

No algorithm can avoid dead ends in all state spaces.

Adversary argument

Safely explorable


Online search agents:

Online algorithm can expand only a node that it physically occupies.

Depth-first search has exactly this property.


Iterative deepening


Online local search

Hill-climbing search is already an online search algorithm.

Using random walk to explore the environment


Learning real time A*, one important detail is that actions that have not yet been tried in a state s are always assumed to lead immediately to the goal with the least possible cost. This optimism under uncertainty encourages the agent to explore new, possibly promising paths.



Learning in online search


5 constraint satisfaction problems


Black box

Constraint satisfaction problems


5.1 constraint satisfaction problems


Variables, constraints, domain, values, assignment

Consistent or legal assignment

Complete assignment

Objective function

Constraint graph

Simplest kind of CSP: Discrete and finite domains

Linear programming problems

Unary constraint

Binary constraint

Auxiliary variables

Constraint hypergraph

Absolute constraint

Preference constraint


5.2 backtracking search for CSPs

Commutativity: the order of application of any given set of actions has no effect on the outcome


All CSP search algorithms generate successors by considering possible assignments for only a single variable at each node in the search tree.


Backtracking search

Minimum remaining values (MRV) heuristic, it also has been called the “most constrained variable” or “fail-first” heuristic

Initial selection: Degree heuristic, selecting the variable that is involved in the largest number of constraints on other unassigned variables

Least-constraining-value heuristic, prefer the value that rules out the fewest choices for the neighboring variables in the constraint graph


Propagating information through constraints

Forward checking

Looks at each unassigned variable Y that is connected to X by a constraint and deletes from Y’s domain any value that is inconsistent with the value chosen for X


Constraint propagation:

Propagating the implications of a constraint on one variable onto other variables


Arc consistency:

Arc consistency checking can be applied either as a preprocessing step before the beginning of the search process, or as a propagation step after every assignment during search.


k-consistency, node consistency, arc consistency, path consistency

strongly k-consistent


handling special constraints

special-purpose algorithm

resource constraint, or atmost constraint

bound-constraint, bounds propagation


intelligent backtracking: looking backward

chronological backtracking

go all the way back to one of the set of variables that caused the failure, the conflict set


backjumping, backtracks to the most recent variable in the conflict set



Every branch pruned by backjumping is also pruned by forward checking.


Conflict-directed backjumping


Constraint learning


5.3 local search for constraint satisfaction problems

They use a complete-state formulation: the initial state assigns a value to every variable, and the successor function usually works by changing the value of one variable at a time.


Min-conflicts heuristic

The runtime of min-conflicts is roughly independent of problem size.

Local search can be used in an online setting when the problem changes.

Repairing the airline schedule affected by bad weather: with minimum number of changes, easily done with local search algorithm starting from the current schedule


5.4 the structure of problems

Independent subproblems

Connected components

Any tree-structured CSP can be solved in time linear in the numberof variables.

Two ways to reduce graphs to trees:

First approach: assigning values to some variables so that the remaining variables form a tree

Cycle cutest

Finding the smallest cycle cutest is NP-hard.

Efficient approximation algorithm: cutest conditioning


Second approach: tree decomposition techniques

Constructing a tree decomposition of the constraint graph into a set of connected subproblems

Tree decomposition techniques transform the CSP into a tree of subproblems and are efficient if the tree width of the constraint graph is small.




6 adversarial search

6.1 games

Multiagent environments


Competitive environments , in which the agents’ goals are in conflict, give rise to adversarial search problems, often known as games.


In AI, games are usually of a rather specialized kind, what game theorists call deterministic, turn-taking, two-player, zero-sum games of perfect information.


6.2 optimal decisions games

MAX and MIN, MAX moves first.

A game can be defined as a kind of search problem with the following components:

Initial state

Successor function: returns a list of (move, state)

Terminal test, terminal states

Utility function (objective function, payoff function),


Game tree


Optimal strategies

Minimax value

Minimax decision


The minimax algorithm

Computers the minimax decision from the current state

It uses a simple recursive computation of the minimax values of each successor state, directly implementing the defining equations.


Minimax algorithm performs a complete depth-first exploration of the game tree.


Optimal decisions in multiplayer games

This is straight forward from the technical viewpoint

Replace the single value for each node with a vector of values

Multiplayer gmaes usually involve alliances.


6.3 alpha-beta pruning

Pruning some leaves

Pruning entire subtrees


Effectiveness is highly dependenton the order in which the successors are examined.

Transpositions, transposition table


6.4 imperfect, real-time decisions

Shannon suggested to alter minimax or alpha-beta in two way:

Utility function is taken place by heuristic evaluation function

Terminal test is taken place by cutoff test


Features define various categories or equivalence, classes of states

Material value

Weighted linear function

Quiescent, quiescence search

Horizon effect

Singular extensions

Forward pruning


6.5 games that include an element of chance

Expectminimax alue

Card games

“average over clairvoyancy”


6.6 state of the art game programs

Chess: deep blue, IBM





7 logical agents

7.1 knowledge-based agents

Central component: knowledge base

Sentences, knowledge representation language

Inference: deriving new sentences from old

Logical agents: when one asks a question of the KB, the answer should follow from what has been told to the KB previously

Background knowledge



    本站是提供个人知识管理的网络存储空间,所有内容均由用户发布,不代表本站观点。如发现有害或侵权内容,请点击这里 或 拨打24小时举报电话:4000070609 与我们联系。




    请遵守用户 评论公约

    喜欢该文的人也喜欢 更多