分享
Artificial intelligence A modern approach Second edition Stvart J.Russell Peter Norvig 20110822 Preface The main unifying theme is the idea of an intelligence agent. Using the web site: http://aima.cs.berkelay.edu 1 introduction How we think. AI: understand and build intelligent entities. One of the newest sciences. Start in earnest soon after world war II. AI still has openings for several fulltimes Einsteins. 1.1 what is AI rationality humancentered approach: empirical rationalist approach acting humanly: the turing test approach the computer past the test if a human interrogator cannot tell whether the written responses come from a person or not. Capabilities: Natural language processing Knowledge representation Automated reasoning Machine learning Thinking humanly: the cognitive modeling approach Cognitive science: similarity and difference Thinking rationally: the “laws of thought” approach Syllogism(演绎推理) Logic Acting rationally: the rational agent approach 1.2 the foundations of AI neuroscience: how brains process information? (20110822) 2 intellligence agents 2.1 agents and environments Agent, sensors, actuators, perceiving, acting, environment Percept, percept sequence. An agent’s choice of action at any given instant can depend on the entire percept sequence observed to date. Agent function: maps any given percept sequence to an action. The agent function is an abstract mathematical description; the agent program is a concrete implementation, running on the agent architecture. 2.2 good behavior: the concept of rationality A rational agent is one that does the right thing. Performance measures A rational agent should select an action that is expected to maximize its performance measure. Rationality is not the same as perfection. Rationality maximize the expected performance, while the perfection maximize the actual performance. Rationality does not require omniscience. Doing actions in order to modify future percepts, sometimes called information gathering, is an important part of rationality. Rational agents not only gather information, but also learn as much as possible from what it perceives. Successful agents split the task of computing the agent function is done by its designers; when it is deliberating on its next action, the agent does more computation; and as it learns from experience, it does more computation to decide how to modify its behavior. A rational agent should be autonomous, it should learn what it can to compensate for partial or incorrect prior knowledge. When the agent has had little or no experience, it would have to act randomly unless the designer gave some assistance. 2.3 the nature of environments (1) Specifying the task environment PEAS: performances, environments, actuators, sensors (2) properties of task environments Fully observable vs. partially observable Deterministic vs. stochastic Episodic vs. sequential Static vs. dynamic Discrete vs. continuous Single agent vs. multiagent, competitive multiagent, cooperative multiagent Environment class Environment generator 2.4 the structure of agents Agent = architecture + program The job of AI is to design the agent program that implement the agent function. agent program Key challenge: find out the how to write programs that produce rational behavior from a small amount of code rather than from a large number of table entries. For example: square roots table vs. calculator (1) simple reflex agents Conditionaction rule: if…then… Infinite loops are often unavoidable for simple reflex agents operating in partially observable environments. Escape from infinite loops is possible if the agent can randomize its actions. (2) modelbased reflex agents The agent should maintain some sort of internal state that depends on the percept history and thereby reflects at least some of the unobservable aspects of the current state. Keep track of the current state of the world using an internal model (3) goalbased agents Search and planning are the subfields of AI devoted to finding action sequences that achieve the agent’s goals. (4) utilitybased agents If one world state is preferred to another, then it has higher utility for the agent. Utility function: map a state onto a real number, which describes the associated degree of happiness. (5) learning agents Turing: build learning machines and then teach them. Learning agent: learning element, performance elelment, critic, problem generator Performance standard （20110901） 3 solving problems by searching Goal formulation Problem formulation, the process of deciding what actions and states to consider, given a goal. An agent with several immediate options of unknown value can decide what to do by first examining different possible sequences of actions that lead to states of known value, and then choosing the best sequence. This process of looking for such a sequence is called search. Solution Execution Openloop system: ignoring the percepts breaks the loop between agent and environment Welldefined problems and solutions Four components of a problem: (1) Initial state (2) A description of the possible actions available to the agent. Successor function, successor State Space, form a graph in which the nodes are states and the arcs between nodes are actions Path (3) Goal test funciton determine whether a given state is a goal state. (4) Path cost path coat function, step cost a solution to a problem is a path from the initial state to a goal state. Solution quality is measured by the path cost function, and the optimal solution has the lowest path cost among all solution. Formulating problems Abstraction: removing detail from a representation 3.2 example problems Toy problem Realworld problem Toy problems Slidingblock puzzles 8queens problem Incremental formulation vs. completestate formulation Realworld problems Routefinding problem Touring problem Traveling salesperson problem (TSP) VLSI layout, cell layout, channel routing Robot navigation Automatic assembly sequencing, protein design Internet searching 3.3 searching for solutions Search tree, search graph Search node Expanding, generating Search strategy Distinguish between state space and search tree, nodes and states Node’s five components: State, parentnode, action, pathcost, depth Fringe is collection of leaf nodes, leaf node The search strategy would be a function that selects the next node from fringe to e expanded. The collection of node is implemented as a queue. Measuring problemsolving performance Completeness, optimality, time complexity, space complexity Complexity is expressed in terms of three quantities: Branching factor b, depth of the shallowest goal node d, max length of any path m Time is measured in terms of the number of nodes generated during the search, and space in terms of the max number of nodes stored in memory. Search cost Total cost, search cost, path cost 3.4 Uninformed search strategies Uninformed search, blind search Informed search, heuristic search (1) Breadthfirst search Fringe: FIFO queue Shallow nodes are expanded before deeper nodes. The memory requirements are a bigger problem for breadthfirst search than is the execution time. Exponentialcomplexity search problems cannot be solved by uniformed methods for any but the smallest instances. (2) Uniformcost search Expands the node n with the lowest path cost (3) Depthfirst search Expand the deepest node in the current fringe Implemented by treesearch with a LIFO queue, eg. stack. Modest memory requirement, bm+1 A variant of depth search: backtracking search (4) Depthlimited search Imposes a fixed depth limit on a depthfirst search Introduce an additional source of incompleteness Diameter of the state space Can be implemented as a simple modification to the general treesearch algorithm or to the recursive depthfirst search algorithm. Two kinds of failure: the standard failure value indicates no solution; the cutoff value indicates no solution within the depth limit (5) Iterative deepening depthfirst search (6) Bidirectional search (7) Comparing uninformed search strategies 3.5 avoiding repeated states Detection of repeated states Algorithm that forget their history are doomed to repeat it Closed list, stores every expanded node Open list, fringe of unexpanded nodes Graphsearch is more efficient than treesearch Graphsearch could miss an optimal solution. 3.6 searching with partial information Different types of incompleteness lead to three distinct problem types: Sensorless problem (conformant problems) Contingency problems Exploration problems Sensorless problem : Coercion Belief state To solve sensorless problem: search in the space of belief states rather than physical states Solution: a belief state that all of whose members are goal states Contingency problem: Agent can act before it has found a guaranteed plan. Rather than considering in advance every possible contingence that might arise during execution, it is often better to start acting and see which contingencies do arise. The agent can then continue to solve the problem, taking into account the additional info… This type of interleaving of search and execution is also useful for exploration problems and for game playing. 4 informed search and exploration 4.1 informed (heuristic) search strategies Bestfirst search, a node selected for expansion based on an evaluation function, f(n) A key component of these algorithm is a heuristic function, h(n), that is the estimated cost of the cheapest path from node n to a goal node. (1) GREEDY BESTFIRST SEARCH Expand the node that is closest to the goal It evaluates nodes by using just the heuristic function: f(n) = h(n) A * SEARCH: minimizing the total estimated solution cost F(n) = g(n)+h(n) g(n) is the cost to reach the node, h(n) is the cost to get from the node to the goal, f(n) is the estimated cost of the cheapest solution through n A* is complete and optimal Analyzing the optimality using TREESEARCH: A* is optimal if h(n) is an admissible heuristic, that is provided that h(n) never overestimates the cost to reach the goal. A* using TREESEARCH is optimal if h(n) is admissible. Analyzing the optimality using GRAPHSEARCH: Consistency (monotonicity): h(n) <= c(n,a,n’) + h(n’), which is general triangle inequality A* using GRAPHSEARCH is optimal if h(n) is consistent. If h(n) is consistent, then the values of f(n) along any path are nondecreasing. We can draw contours in the state space, just like the contours in a topographic map. I f C* is the cost of the optimal solution path, then we can say: A* expands all nodes with f(n) < C*. A* might then expend some of the nodes right on the “goal contour” (where f(n) = C*) before selecting a goal node. A* expends no nodes with f(n) > C*. Pruning, contours A* is optimally efficient for any given heuristic function. The number of nodes within the goal contour search space is still exponential in the length of the solution. Computation time is not A*’s main drawback. A* is not practical for many largescale problems. (2) Memorybounded heuristic search Iterativedeepening A* (IDA*) algorithm Recursive bestfirst search (RBFS) Keep track of the fvalue of the best alternative path available from any ancestor of the current node. Back up the value of the forgotten node to its parent. Somewhat more efficient than IDA*, but still suffers from excessive node regeneration. IDA* and RBFS suffer from using too little memory. MA*: memorybounded A* SMA*: simplified MA*, expands the best leaf and deletes the worst leaf. SMA* is complete if there is any reachable solution; might well be the best general purpose algorithm for finding optimal solution particularly when the state space is a graph, step costs are not uniform, and node generation is expensive compared to the additional overhead of maintaining the open and closed lists. Memory limitation can make a problem intractable from the point of view of computation time. No theory to explain the tradeoff between time and memory, the only way out is drop the optimality requirement. (3) learning to search better (???) Metalevel state space Objectlevel state space Metalevel learning (20110902) 4.2 heuristic function (1) the effect of heuristic accuracy on performance Effective branching factor b* A welldesigned heuristic would have a value of b* close to 1, allowing fairly large problem to be solved. Domination: for any nodes n, h2(n)>=h1(n) A* using heuristic function with larger value will never expand more nodes than A* using heuristic function with smaller value. It is always better to use a heuristic function with higher values, provided it does not overestimated and that the computation for the heuristic is not too large. (2) inventing admissible heuristic functions Relaxed problem: a problem with fewer restrictions on the actions The cost of an optimal solution to a relaxed problem is an admissible heuristic for the original problem. Heuristic of relaxed problem is also admissible and consistent. ABSOLVER: can generate heuristic automatically from problem definitions, using the relaxed problem method and various other techniques. Composite heuristic: h(n)=max{h1(n),…,hm(n)} Admissible heuristic can also be derived from the solution cost of a subproblem of a given problem. Pattern databases: store exact solution costs of every possible subproblem instance. Disjoint pattern databases: the sum of the two costs is still a lower bound on the cost of solving the entire problem Disjoint pattern databases work for slidingtile puzzles because the problem can divided up in such a way that each move affects only one subproblem. (3) learning heuristic from experience Inductive learning Features 4.3 local search algorithm and optimization problem In many problems, the path to the goal is irrelevant. Local search algorithms operate using a single current state, rather than multiple paths, and generally move only to neighbors of that state, do not worry about path at all. Two key advantages: They use very little memory They often find reasonable solutions in large or infinite state spaces for which systematic algorithm are unsuitable Useful for solving pure optimization problems State space landscape, location, elevation Global minimum, global maximum Complete local search algorithm: find a goal if one exists Optimal algorithm: find a global minimum/maximum (1) hillclimbing search Completestate formulation Greedy local search Hillclimbing often gets stuck for the following reasons: Local maxima Ridges Plateau, shoulder Stochastic hill climbing Firstchoice hill climbing: generating successors randomly until one is generated that is better than the current state. Hillclimbing are imcomplete for they can stuck on local maxima Randomrestart hill climbing: it is complete with probability approaching 1 If each hill climbing search has a probability p of success, then the expected number of restarts required is 1/q. The success of hill climbing depends very much on the shape of the statelandscape: if there are few local maxima and plateau, randomrestart hill climbing will find a good solution very quickly. (2) simulated annealing search A hillclimbing is imcomplete. A purely random walk is complete, but extremely inefficient. Simulated annealing: combine hill climbing and random walk. The simulated annealing is to start by shaking hard (i.e. at a high temperature) and then gradually reduce the intensity of the shaking (i.e. lower the temperature). simulated annealing: pick a random move, if the move improves the situation, it is always accepted. Otherwise, the move is accepted with some probability less than 1 (3) local beam search Keep track of k states rather than just one Begin with k randomly generated states, all the successors of all k states are generated, if any one is a goal, halts. Otherwise, selects the k best successors from the complete list and repeats In a local beam search, useful information is passed among the k parallel search threads. Stochastic beam search: chooses k successors at random with some probability (4) genetic algorithm Variant of stochastic, generate successors by combining two parent states Population individual is represented as a string over a finite alphabet Evaluation function, fitness function Crossover, mutation The primary advantage, if any, of genetic algorithm comes from the crossover operation. Schema, instance of a schema Genetic algorithm have had a widespread impact on optimization problems, such as circuit layout and jobshop scheduling. 4.4 local search in continuous spaces (?) Discretize Gradient, Empirical gradient Line search: extending the current gradient direction until f starts to decrease again Newtonraphson method Hessian matrix Constrained optimization Linear programming problem (20110903) 4.5 online search agents and unknown environments Offline search: Have to come up with an exponentially large contingency plan that considers all possible happenings Online search: interleaving computation and action Need only consider what actually does happen Necessary idea for an exploration problem Online search problems: Can be solved only by an agent executing actions, rather than by a purely computational process. Competitive ratio No algorithm can avoid dead ends in all state spaces. Adversary argument Safely explorable Online search agents: Online algorithm can expand only a node that it physically occupies. Depthfirst search has exactly this property. ONLINEDFSAGENT Iterative deepening Online local search Hillclimbing search is already an online search algorithm. Using random walk to explore the environment Learning real time A*, one important detail is that actions that have not yet been tried in a state s are always assumed to lead immediately to the goal with the least possible cost. This optimism under uncertainty encourages the agent to explore new, possibly promising paths. Learning in online search (20110904) 5 constraint satisfaction problems States Black box Constraint satisfaction problems 5.1 constraint satisfaction problems CSP Variables, constraints, domain, values, assignment Consistent or legal assignment Complete assignment Objective function Constraint graph Simplest kind of CSP: Discrete and finite domains Linear programming problems Unary constraint Binary constraint Auxiliary variables Constraint hypergraph Absolute constraint Preference constraint 5.2 backtracking search for CSPs Commutativity: the order of application of any given set of actions has no effect on the outcome All CSP search algorithms generate successors by considering possible assignments for only a single variable at each node in the search tree. Backtracking search Minimum remaining values (MRV) heuristic, it also has been called the “most constrained variable” or “failfirst” heuristic Initial selection: Degree heuristic, selecting the variable that is involved in the largest number of constraints on other unassigned variables Leastconstrainingvalue heuristic, prefer the value that rules out the fewest choices for the neighboring variables in the constraint graph Propagating information through constraints Forward checking Looks at each unassigned variable Y that is connected to X by a constraint and deletes from Y’s domain any value that is inconsistent with the value chosen for X Constraint propagation: Propagating the implications of a constraint on one variable onto other variables Arc consistency: Arc consistency checking can be applied either as a preprocessing step before the beginning of the search process, or as a propagation step after every assignment during search. kconsistency, node consistency, arc consistency, path consistency strongly kconsistent handling special constraints specialpurpose algorithm resource constraint, or atmost constraint boundconstraint, bounds propagation intelligent backtracking: looking backward chronological backtracking go all the way back to one of the set of variables that caused the failure, the conflict set backjumping, backtracks to the most recent variable in the conflict set (20110915) Every branch pruned by backjumping is also pruned by forward checking. Conflictdirected backjumping Constraint learning 5.3 local search for constraint satisfaction problems They use a completestate formulation: the initial state assigns a value to every variable, and the successor function usually works by changing the value of one variable at a time. Minconflicts heuristic The runtime of minconflicts is roughly independent of problem size. Local search can be used in an online setting when the problem changes. Repairing the airline schedule affected by bad weather: with minimum number of changes, easily done with local search algorithm starting from the current schedule 5.4 the structure of problems Independent subproblems Connected components Any treestructured CSP can be solved in time linear in the numberof variables. Two ways to reduce graphs to trees: First approach: assigning values to some variables so that the remaining variables form a tree Cycle cutest Finding the smallest cycle cutest is NPhard. Efficient approximation algorithm: cutest conditioning Second approach: tree decomposition techniques Constructing a tree decomposition of the constraint graph into a set of connected subproblems Tree decomposition techniques transform the CSP into a tree of subproblems and are efficient if the tree width of the constraint graph is small. 6 adversarial search 6.1 games Multiagent environments Contingency Competitive environments , in which the agents’ goals are in conflict, give rise to adversarial search problems, often known as games. In AI, games are usually of a rather specialized kind, what game theorists call deterministic, turntaking, twoplayer, zerosum games of perfect information. 6.2 optimal decisions games MAX and MIN, MAX moves first. A game can be defined as a kind of search problem with the following components: Initial state Successor function: returns a list of (move, state) Terminal test, terminal states Utility function (objective function, payoff function), Game tree Optimal strategies Minimax value Minimax decision The minimax algorithm Computers the minimax decision from the current state It uses a simple recursive computation of the minimax values of each successor state, directly implementing the defining equations. Minimax algorithm performs a complete depthfirst exploration of the game tree. Optimal decisions in multiplayer games This is straight forward from the technical viewpoint Replace the single value for each node with a vector of values Multiplayer gmaes usually involve alliances. 6.3 alphabeta pruning Pruning some leaves Pruning entire subtrees Effectiveness is highly dependenton the order in which the successors are examined. Transpositions, transposition table 6.4 imperfect, realtime decisions Shannon suggested to alter minimax or alphabeta in two way: Utility function is taken place by heuristic evaluation function Terminal test is taken place by cutoff test Features define various categories or equivalence, classes of states Material value Weighted linear function Quiescent, quiescence search Horizon effect Singular extensions Forward pruning 6.5 games that include an element of chance Expectminimax alue Card games “average over clairvoyancy” 6.6 state of the art game programs Chess: deep blue, IBM (20110916) 20110916: 7 logical agents 7.1 knowledgebased agents Central component: knowledge base Sentences, knowledge representation language Inference: deriving new sentences from old Logical agents: when one asks a question of the KB, the answer should follow from what has been told to the KB previously Background knowledge


来自： Robin Garden > 《基础知识》