MIT Press provides another excellent book under a Creative Commons license.

Algorithms for Decision Making: free book download

I plan to buy it, and I recommend you do too. This book provides a broad introduction to algorithms for decision making under uncertainty.

The book takes an agent-based approach.

An *agent* is an entity that acts based on observations of its environment. Agents may be physical entities, like humans or robots, or they may be nonphysical entities, such as decision support systems that are implemented entirely in software.

The interaction between the agent and the environment follows an *observe-act cycle* or *loop*.

- The agent at time *t* receives an *observation* of the environment. Observations are often incomplete or noisy.
- Based on these inputs, the agent then chooses an action *a_t* through some decision process.
- This action, such as sounding an alert, may have a nondeterministic effect on the environment.
- The book focuses on agents that interact intelligently to achieve their objectives over time.
- Given the past sequence of observations and knowledge about the environment, the agent must choose an action *a_t* that best achieves its objectives in the presence of various sources of uncertainty, including:

1. *outcome uncertainty*, where the effects of our actions are uncertain,
2. *model uncertainty*, where our model of the problem is uncertain,
3. *state uncertainty*, where the true state of the environment is uncertain, and
4. *interaction uncertainty*, where the behavior of the other agents interacting in the environment is uncertain.

The book is organized around these four sources of uncertainty.
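The observe-act loop described above can be sketched in a few lines of Python. This is only an illustrative toy, not code from the book (which uses Julia): the thermostat scenario, the names `Environment`, `choose_action`, and `run`, and all noise levels are assumptions made up for this example. It shows a noisy observation, a simple decision process, and an action whose effect on the environment is nondeterministic.

```python
import random


class Environment:
    """Hypothetical toy environment: a room whose true temperature is hidden."""

    def __init__(self, temperature=20.0, seed=0):
        self.temperature = temperature
        self.rng = random.Random(seed)

    def observe(self):
        # Observations are noisy: the agent never sees the true state exactly.
        return self.temperature + self.rng.gauss(0.0, 0.5)

    def step(self, action):
        # Nondeterministic effect: the same action does not always
        # change the state by the same amount.
        effect = {"heat": 1.0, "cool": -1.0, "wait": 0.0}[action]
        self.temperature += effect + self.rng.gauss(0.0, 0.2)


def choose_action(observations, target=22.0, tolerance=0.5):
    """Toy decision process. It receives the full observation history,
    but for simplicity it only looks at the most recent observation."""
    latest = observations[-1]
    if latest < target - tolerance:
        return "heat"
    if latest > target + tolerance:
        return "cool"
    return "wait"


def run(steps=50, seed=0):
    """Run the observe-act loop and return the final state and the actions taken."""
    env = Environment(seed=seed)
    observations, actions = [], []
    for t in range(steps):
        observations.append(env.observe())    # observe the environment at time t
        action = choose_action(observations)  # choose an action a_t
        env.step(action)                      # act; the effect is uncertain
        actions.append(action)
    return env.temperature, actions


if __name__ == "__main__":
    final_temperature, actions = run()
    print(f"final temperature: {final_temperature:.2f}")
    print("first actions:", actions[:5])
```

In terms of the four sources of uncertainty, the noise in `observe` is a small instance of state uncertainty and the noise in `step` is a small instance of outcome uncertainty. Model and interaction uncertainty are not represented in this sketch.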

Making decisions in the presence of uncertainty is central to the field of *artificial intelligence*.

**Table of contents**

*Introduction*

Decision Making

Applications

Methods

History

Societal Impact

Overview

PROBABILISTIC REASONING

*Representation*

Degrees of Belief and Probability

Probability Distributions

Joint Distributions

Conditional Distributions

Bayesian Networks

Conditional Independence

Summary

Exercises

*Inference*

Inference in Bayesian Networks

Inference in Naive Bayes Models

Sum-Product Variable Elimination

Belief Propagation

Computational Complexity

Direct Sampling

Likelihood Weighted Sampling

Gibbs Sampling

Inference in Gaussian Models

Summary

Exercises

*Parameter Learning*

Maximum Likelihood Parameter Learning

Bayesian Parameter Learning

Nonparametric Learning

Learning with Missing Data

Summary

Exercises

*Structure Learning*

Bayesian Network Scoring

Directed Graph Search

Markov Equivalence Classes

Partially Directed Graph Search

Summary

Exercises

*Simple Decisions*

Constraints on Rational Preferences

Utility Functions

Utility Elicitation

Maximum Expected Utility Principle

Decision Networks

Value of Information

Irrationality

Summary

Exercises

SEQUENTIAL PROBLEMS

*Exact Solution Methods*

Markov Decision Processes

Policy Evaluation

Value Function Policies

Policy Iteration

Value Iteration

Asynchronous Value Iteration

Linear Program Formulation

Linear Systems with Quadratic Reward

Summary

Exercises

*Approximate Value Functions*

Parametric Representations

Nearest Neighbor

Kernel Smoothing

Linear Interpolation

Simplex Interpolation

Linear Regression

Neural Network Regression

Summary

Exercises

*Online Planning*

Receding Horizon Planning

Lookahead with Rollouts

Forward Search

Branch and Bound

Sparse Sampling

Monte Carlo Tree Search

Heuristic Search

Labeled Heuristic Search

Open-Loop Planning

Summary

Exercises

*Policy Search*

Approximate Policy Evaluation

Local Search

Genetic Algorithms

Cross Entropy Method

Evolution Strategies

Isotropic Evolutionary Strategies

Summary

Exercises

*Policy Gradient Estimation*

Finite Difference

Regression Gradient

Likelihood Ratio

Reward-to-Go

Baseline Subtraction

Summary

Exercises

*Policy Gradient Optimization*

Gradient Ascent Update

Restricted Gradient Update

Natural Gradient Update

Trust Region Update

Clamped Surrogate Objective

Summary

Exercises

*Actor-Critic Methods*

Actor-Critic

Generalized Advantage Estimation

Deterministic Policy Gradient

Actor-Critic with Monte Carlo Tree Search

Summary

*Policy Validation*

Performance Metric Evaluation

Rare Event Simulation

Robustness Analysis

Trade Analysis

Adversarial Analysis

Summary

Exercises

MODEL UNCERTAINTY

*Exploration and Exploitation*

Bandit Problems

Bayesian Model Estimation

Undirected Exploration Strategies

Directed Exploration Strategies

Optimal Exploration Strategies

Exploration with Multiple States

Summary

Exercises

*Model-Based Methods*

Maximum Likelihood Models

Update Schemes

Exploration

Bayesian Methods

Bayes-adaptive MDPs

Posterior Sampling

Summary

Exercises

*Model-Free Methods*

Incremental Estimation of the Mean

Q-Learning

Sarsa

Eligibility Traces

Reward Shaping

Action Value Function Approximation

Experience Replay

Summary

Exercises

*Imitation Learning*

Behavioral Cloning

Dataset Aggregation

Stochastic Mixing Iterative Learning

Maximum Margin Inverse Reinforcement Learning

Maximum Entropy Inverse Reinforcement Learning

Generative Adversarial Imitation Learning

Summary

Exercises

STATE UNCERTAINTY

*Beliefs*

Belief Initialization

Discrete State Filter

Linear Gaussian Filter

Extended Kalman Filter

Unscented Kalman Filter

Particle Filter

Particle Injection

Summary

Exercises

*Exact Belief State Planning*

Belief-State Markov Decision Processes

Conditional Plans

Alpha Vectors

Pruning

Value Iteration

Linear Policies

Summary

Exercises

*Offline Belief State Planning*

Fully Observable Value Approximation

Fast Informed Bound

Fast Lower Bounds

Point-Based Value Iteration

Randomized Point-Based Value Iteration

Sawtooth Upper Bound

Point Selection

Sawtooth Heuristic Search

Triangulated Value Functions

Summary

Exercises

*Online Belief State Planning*

Lookahead with Rollouts

Forward Search

Branch and Bound

Sparse Sampling

Monte Carlo Tree Search

Determinized Sparse Tree Search

Gap Heuristic Search

Summary

Exercises

*Controller Abstractions*

Controllers

Policy Iteration

Nonlinear Programming

Gradient Ascent

Summary

Exercises

MULTIAGENT SYSTEMS

*Multiagent Reasoning*

Simple Games

Response Models

Dominant Strategy Equilibrium

Nash Equilibrium

Correlated Equilibrium

Iterated Best Response

Hierarchical Softmax

Fictitious Play

Gradient Ascent

Summary

Exercises

*Sequential Problems*

Markov Games

Response Models

Nash Equilibrium

Fictitious Play

Gradient Ascent

Nash Q-Learning

Summary

Exercises

*State Uncertainty*

Partially Observable Markov Games

Policy Evaluation

Nash Equilibrium

Dynamic Programming

Summary

Exercises

*Collaborative Agents*

Decentralized Partially Observable Markov Decision Processes

Subclasses

Dynamic Programming

Iterated Best Response

Heuristic Search

Nonlinear Programming

Summary

Exercises

APPENDICES

*Mathematical Concepts*

Measure Spaces

Probability Spaces

Metric Spaces

Normed Vector Spaces

Positive Definiteness

Convexity

Information Content

Entropy

Cross Entropy

Relative Entropy

Gradient Ascent

Taylor Expansion

Monte Carlo Estimation

Importance Sampling

Contraction Mappings

Graphs

*Probability Distributions*

*Computational Complexity*

Asymptotic Notation

Time Complexity Classes

Space Complexity Classes

Decidability

*Neural Representations*

Neural Networks

Feedforward Networks

Parameter Regularization

Convolutional Neural Networks

Recurrent Networks

Autoencoder Networks

Adversarial Networks

*Search Algorithms*

Search Problems

Search Graphs

Forward Search

Branch and Bound

Dynamic Programming

Heuristic Search

*Problems*

Hex World

2048

Cart-Pole

Mountain Car

Simple Regulator

Aircraft Collision Avoidance

Crying Baby

Machine Replacement

Catch

Prisoner’s Dilemma

Rock-Paper-Scissors

Traveler’s Dilemma

Predator-Prey Hex World

Multi-Caregiver Crying Baby

Collaborative Predator-Prey Hex World

*Julia*

Types

Functions

Control Flow

Packages

Convenience Functions

Book link