Probability, Markov Chains, Queues, and Simulation: The Mathematical Basis of Performance Modeling
2/5
()
About this ebook
Probability, Markov Chains, Queues, and Simulation provides a modern and authoritative treatment of the mathematical processes that underlie performance modeling. The detailed explanations of mathematical derivations and numerous illustrative examples make this textbook readily accessible to graduate and advanced undergraduate students taking courses in which stochastic processes play a fundamental role. The textbook is relevant to a wide variety of fields, including computer science, engineering, operations research, statistics, and mathematics.
The textbook looks at the fundamentals of probability theory, from the basic concepts of set-based probability, through probability distributions, to bounds, limit theorems, and the laws of large numbers. Discrete and continuous-time Markov chains are analyzed from a theoretical and computational point of view. Topics include the Chapman-Kolmogorov equations; irreducibility; the potential, fundamental, and reachability matrices; random walk problems; reversibility; renewal processes; and the numerical computation of stationary and transient distributions. The M/M/1 queue and its extensions to more general birth-death processes are analyzed in detail, as are queues with phase-type arrival and service processes. The M/G/1 and G/M/1 queues are solved using embedded Markov chains; the busy period, residual service time, and priority scheduling are treated. Open and closed queueing networks are analyzed. The final part of the book addresses the mathematical basis of simulation.
Each chapter of the textbook concludes with an extensive set of exercises. An instructor's solution manual, in which all exercises are completely worked out, is also available (to professors only).
- Numerous examples illuminate the mathematical theories
- Carefully detailed explanations of mathematical derivations guarantee a valuable pedagogical approach
- Each chapter concludes with an extensive set of exercises
Related to Probability, Markov Chains, Queues, and Simulation
Related ebooks
Numerical Methods for Stochastic Computations: A Spectral Method Approach Rating: 5 out of 5 stars5/5Experiments with Mixtures: Designs, Models, and the Analysis of Mixture Data Rating: 5 out of 5 stars5/5Simulation Rating: 3 out of 5 stars3/5Introduction to Dynamic Programming: International Series in Modern Applied Mathematics and Computer Science, Volume 1 Rating: 0 out of 5 stars0 ratingsDynamic Models in Biology Rating: 3 out of 5 stars3/5Introduction to Numerical Analysis: Second Edition Rating: 4 out of 5 stars4/5Nonlinear Dynamics: Exploration Through Normal Forms Rating: 5 out of 5 stars5/5Machine Learning: A Bayesian and Optimization Perspective Rating: 3 out of 5 stars3/5Introduction to Applied Statistical Signal Analysis: Guide to Biomedical and Electrical Engineering Applications Rating: 0 out of 5 stars0 ratingsHidden Markov Processes: Theory and Applications to Biology Rating: 5 out of 5 stars5/5Mathematical Methods in Computer Aided Geometric Design II Rating: 0 out of 5 stars0 ratingsKalman Filters: Fundamentals and Applications Rating: 0 out of 5 stars0 ratingsIntroduction to Bayesian Statistics Rating: 0 out of 5 stars0 ratingsIntroduction to Computational Science: Modeling and Simulation for the Sciences - Second Edition Rating: 3 out of 5 stars3/5Experimentation for Engineers: From A/B testing to Bayesian optimization Rating: 0 out of 5 stars0 ratingsConformal Prediction for Reliable Machine Learning: Theory, Adaptations and Applications Rating: 0 out of 5 stars0 ratingsMathematical Methods of Statistics (PMS-9), Volume 9 Rating: 3 out of 5 stars3/5Nonlinear Optimization Rating: 5 out of 5 stars5/5Modern Anti-windup Synthesis: Control Augmentation for Actuator Saturation Rating: 5 out of 5 stars5/5Computational Number Theory and Modern Cryptography Rating: 3 out of 5 stars3/5Quadratic Form Theory and Differential Equations Rating: 0 out of 5 stars0 ratingsThe EM Algorithm and Extensions Rating: 0 out of 5 stars0 ratingsModeling with Data: Tools and Techniques for Scientific Computing Rating: 3 out of 5 stars3/5Introduction to Nature-Inspired Optimization Rating: 0 out of 5 stars0 ratingsEmerging Trends in Image Processing, Computer Vision and Pattern Recognition Rating: 0 out of 5 stars0 ratingsStochastic Analysis of Mixed Fractional Gaussian Processes Rating: 0 out of 5 stars0 ratingsApplied Automata Theory Rating: 0 out of 5 stars0 ratingsDiscrete Cosine Transform: Algorithms, Advantages, Applications Rating: 0 out of 5 stars0 ratingsFoundations of Data Intensive Applications: Large Scale Data Analytics under the Hood Rating: 0 out of 5 stars0 ratingsC4.5: Programs for Machine Learning Rating: 3 out of 5 stars3/5
Data Modeling & Design For You
The Secrets of ChatGPT Prompt Engineering for Non-Developers Rating: 5 out of 5 stars5/5Supercharge Power BI: Power BI is Better When You Learn To Write DAX Rating: 5 out of 5 stars5/5Thinking in Algorithms: Strategic Thinking Skills, #2 Rating: 5 out of 5 stars5/5Data Analytics for Beginners: Introduction to Data Analytics Rating: 4 out of 5 stars4/5Learn T-SQL Querying: A guide to developing efficient and elegant T-SQL code Rating: 0 out of 5 stars0 ratingsMastering Agile User Stories Rating: 4 out of 5 stars4/5Data Visualization: a successful design process Rating: 4 out of 5 stars4/5A Concise Guide to Object Orientated Programming Rating: 0 out of 5 stars0 ratingsData Analytics with Python: Data Analytics in Python Using Pandas Rating: 3 out of 5 stars3/5Living in Data: A Citizen's Guide to a Better Information Future Rating: 4 out of 5 stars4/5Think Like a Data Scientist: Tackle the data science process step-by-step Rating: 0 out of 5 stars0 ratingsDAX Patterns: Second Edition Rating: 5 out of 5 stars5/5Raspberry Pi :Raspberry Pi Guide On Python & Projects Programming In Easy Steps Rating: 3 out of 5 stars3/5The Systems Thinker - Mental Models: The Systems Thinker Series, #3 Rating: 0 out of 5 stars0 ratingsWordPress For Beginners - How To Set Up A Self Hosted WordPress Blog Rating: 0 out of 5 stars0 ratingsLogic Design: A Review Of Theory And Practice Rating: 0 out of 5 stars0 ratingsLearning Cypher Rating: 0 out of 5 stars0 ratingsNeural Networks: Neural Networks Tools and Techniques for Beginners Rating: 5 out of 5 stars5/5Bayesian Analysis with Python Rating: 5 out of 5 stars5/5Brainstorming and Beyond: A User-Centered Design Method Rating: 0 out of 5 stars0 ratingsPrinciples of Data Science Rating: 4 out of 5 stars4/5150 Most Poweful Excel Shortcuts: Secrets of Saving Time with MS Excel Rating: 3 out of 5 stars3/5Graph Databases in Action: Examples in Gremlin Rating: 0 out of 5 stars0 ratingsText as Data: A New Framework for Machine Learning and the Social Sciences Rating: 0 out of 5 stars0 ratings
Related categories
Reviews for Probability, Markov Chains, Queues, and Simulation
1 rating0 reviews
Book preview
Probability, Markov Chains, Queues, and Simulation - William J. Stewart
Index
Preface and Acknowledgments
This book has been written to provide a complete, yet elementary and pedagogic, treatment of the mathematical basis of systems performance modeling. Performance modeling is of fundamental importance to many branches of the mathematical sciences and engineering as well as to the social and economic sciences. Advances in methodology and technology have now provided the wherewithal to build and solve sophisticated models. The purpose of this book is to provide the student and teacher with a modern approach for building and solving probability based models with confidence.
The book is divided into four major parts, namely, Probability,
Markov Chains,
Queueing Models,
and Simulation.
The eight chapters of Part I provide the student with a comprehensive and thorough knowledge of probability theory. Part I is self-contained and complete and should be accessible to anyone with a basic knowledge of calculus. Newcomers to probability theory as well as those whose knowledge of probability is rusty should be equally at ease in their progress through Part I. The first chapter provides the fundamental concepts of set-based probability and the probability axioms. Conditional probability and independence are stressed as are the laws of total probability and Bayes’ rule. Chapter 2 introduces combinatorics—the art of counting—which is so important for the correct evaluation of probabilities. Chapter 3 introduces the concepts of random variables and distribution functions including functions of a random variable and conditioned random variables. This chapter prepares the ground work for Chapters 4 and 5: Chapter 4 introduces joint and conditional distributions and Chapter 5 treats expectations and higher moments. Discrete distribution functions are the subject of Chapter 6 while their continuous counterparts, continuous distribution functions, are the subject of Chapter 7. Particular attention is paid to phase-type distributions due to the important role they play in modeling scenarios and the chapter also includes a section on fitting phase-type distributions to given means and variances. The final chapter in Part I is devoted to bounds and limit theorems, including the laws of large numbers and the central limit theorem.
Part II contains two rather long chapters on the subject of Markov chains, the first on theoretical aspects of Markov chains, and the second on their numerical solution. In Chapter 9, the basic concepts of discrete and continuous-time Markov chains and their underlying equations and properties are discussed. Special attention is paid to irreducible Markov chains and to the potential, fundamental, and reachability matrices in reducible Markov chains. This chapter also contains sections on random walk problems and their applications, the property of reversibility in Markov chains, and renewal processes. Chapter 10 deals with numerical solutions, from Gaussian elimination and basic iterative-type methods for stationary solutions to ordinary differential equation solvers for transient solutions. Block methods and iterative aggregation-disaggregation methods for nearly completely decomposable Markov chains are considered. A section is devoted to matrix geometric and matrix analytic methods for structured Markov chains. Algorithms and computational considerations are stressed throughout this chapter.
Queueing models are presented in the five chapters that constitute Part III. Elementary queueing theory is presented in Chapter 11. Here an introduction to the basic terminology and definitions is followed by an analysis of the simplest of all queueing models, the M/M/1 queue. This is then generalized to birth-death processes, which are queueing systems in which the underlying Markov chain matrix is tridiagonal. Chapter 12 deals with queues in which the arrival process need no longer be Poisson and the service time need not be exponentially distributed. Instead, interarrival times and service times can be represented by phase-type distributions and the underlying Markov chain is now block tridiagonal. The following chapter, Chapter 13, explores the z-transform approach for solving similar types of queues. The M/G/1 and G/M/1 queues are the subject of Chapter 14. The approach used is that of the embedded Markov chain. The Pollaczek-Khintchine mean value and transform equations are derived and a detailed discussion of residual time and busy period follows. A thorough discussion of nonpreemptive and preempt-resume scheduling policies as well as shortest-processing-time-first scheduling is presented. An analysis is also provided for the case in which only a limited number of customers can be accommodated in both the M/G/1 and G/M/1 queues. The final chapter of Part III, Chapter 15, treats queueing networks. Open networks are introduced via Burke’s theorem and Jackson’s extensions to this theorem. Closed queueing networks are treated using both the convolution algorithm and the mean value approach. The flow-equivalent server
approach is also treated and its potential as an approximate solution procedure for more complex networks is explored. The chapter terminates with a discussion of product form in queueing networks and the BCMP theorem for open, closed, and mixed networks.
The final part of the text, Part IV, deals with simulation. Chapter 16 explores how uniformly distributed random numbers can be applied to obtain solutions to probabilistic models and other time-independent problems—the Monte Carlo
aspect of simulation. Chapter 17 describes the modern approaches for generating uniformly distributed random numbers and how to test them to ensure that they are indeed uniformly distributed and independent of each other. The topic of generating random numbers that are not uniformly distributed, but satisfy some other distribution such as Erlang or normal, is dealt with in Chapter 18. A large number of possibilities exist and not all are appropriate for every distribution. The next chapter, Chapter 19, provides guidelines for writing simulation programs and a number of examples are described in detail. Chapter 20 is the final chapter in the book. It concerns simulation measurement and accuracy and is based on sampling theory. Special attention is paid to the generation of confidence intervals and to variance reduction techniques, an important means of keeping the computational costs of simulation to a manageable level.
The text also includes two appendixes; the first is just a simple list of the letters of the Greek alphabet and their spellings; the second is a succinct, yet complete, overview of the linear algebra used throughout the book.
Genesis and Intent
This book saw its origins in two first-year graduate level courses that I teach, and have taught for quite some time now, at North Carolina State University. The first is entitled An Introduction to Performance Evaluation;
it is offered by the Computer Science Department and the Department of Electrical and Computer Engineering. This course is required for our networking degrees. The second is a course entitled Queues and Stochastic Service Systems
and is offered by the Operations Research Program and the Industrial and Systems Engineering Department. It follows then that this book has been designed for students from a variety of academic disciplines in which stochastic processes constitute a fundamental concept, disciplines that include not only computer science and engineering, industrial engineering, and operations research, but also mathematics, statistics, economics, and business, the social sciences—in fact all disciplines in which stochastic performance modeling plays a primary role. A calculus-based probability course is a prerequisite for both these courses so it is expected that students taking these classes are already familiar with probability theory. However, many of the students who sign up for these courses are returning students, and it is often the case that it has been several years and in some cases a decade or more, since they last studied probability. A quick review of probability is hardly sufficient to bring them up to the required level. Part I of the book has been designed with them in mind. It provides the prerequisite probability background needed to fully understand and appreciate the material in the remainder of the text. The presentation, with its numerous examples and exercises, is such that it facilitates an independent review so the returning student in a relatively short period of time, preferably prior to the beginning of class, will once again have mastered probability theory. Part I can then be used as a reference source as and when needed.
The entire text has been written at a level that is suitable for upper-level undergraduate students or first-year graduate students and is completely self-contained. The entirety of the text can be covered in a two-semester sequence, such as the stochastic processes sequence offered by the Industrial Engineering (IE) Department and the Operations Research (OR) Program at North Carolina State University. A two-semester sequence is appropriate for classes in which students have limited (or no) exposure to probability theory. In such cases it is recommended that the first semester be devoted to the Chapters 1–8 on probability theory, the first five sections of Chapter 9, which introduce the fundamental concepts of discrete-time Markov chains, and the first three sections of Chapter 11, which concern elementary queueing theory. With this background clearly understood, the student should have no difficulty in covering the remaining topics of the text in the second semester.
The complete content of Parts II–IV might prove to be a little too much for some one-semester classes. In this case, an instructor might wish to omit the later sections of Chapter 10 on the numerical solution of Markov chains, perhaps covering only the basic direct and iterative methods. In this case the material of Chapter 12 should also be omitted since it depends on a knowledge of the matrix geometric method of Chapter 10. Because of the importance of computing numerical solutions, it would be a mistake to omit Chapter 10 in its entirety. Some of the material in Chapter 18 could also be eliminated: for example, an instructor might include only the first three sections of this chapter. In my own case, when teaching the OR/IE course, I concentrate on covering all of the Markov chain and queueing theory chapters. These students often take simulation as an individual course later on. When teaching the computer science and engineering course, I omit some of the material on the numerical solution of Markov chains so as to leave enough time to cover simulation.
Numerous examples with detailed explanations are provided throughout the text. These examples are designed to help the student more clearly understand the theoretical and computational aspects of the material and to be in a position to apply the acquired knowledge to his/her own areas of interest. A solution manual is available for teachers who adopt this text for their courses. This manual contains detailed explanations of the solution of all the exercises.
Where appropriate, the text contains program modules written in Matlab or in the Java programming language. These programs are not meant to be robust production code, but are presented so that the student may experiment with the mathematical concepts that are discussed. To free the student from the hassle of copying these code segments from the book, a listing of all of the code used can be freely downloaded from the web page:
http://press.princeton.edu/titles/8844.html
Acknowledgments
As mentioned just a moment ago, this book arose out of two courses that I teach at North Carolina State University. It is, therefore, ineluctable that the students who took these courses contributed immeasurably to its content and form. I would like to express my gratitude to them for their patience and input. I would like to cite, in particular, Nishit Gandhi, Scott Gerard, Rong Huang, Kathryn Peding, Amirhosein Norouzi, Robert Shih, Hui Wang, Song Yang, and Shengfan Zhang for their helpful comments. My own doctoral students, Shep Barge, Tugrul Dayar, Amy Langville, Ning Liu, and Bin Peng, were subjected to different versions of the text and I owe them a particular expression of thanks. One person who deserves special recognition is my daughter Kathryn, who allowed herself to be badgered by her father into reading over selected probability chapters.
I would like to thank Vickie Kearn and the editorial and production staff at Princeton University Press for their help and guidance in producing this book. It would be irresponsible of me not to mention the influence that my teachers, colleagues, and friends have had on me. I owe them a considerable debt of gratitude for helping me understand the vital role that mathematics plays, not only in performance modeling, but in all aspects of life.
Finally, and most of all, I would like to thank my wife Kathie and our four children, Nicola, Stephanie, Kathryn, and William, for all the love they have shown me over the years.
Part I
PROBABILITY
Chapter 1
Probability
1.1 Trials, Sample Spaces, and Events
The notions of trial, sample space, and event are fundamental to the study of probability theory. Tossing a coin, rolling a die, and choosing a card from a deck of cards are examples that are frequently used to explain basic concepts of probability. Each toss of the coin, roll of the die, or choice of a card is called a trial or experiment. We shall use the words trial and experiment interchangeably. Each execution of a trial is called a realization of the probability experiment.
At the end of any trial involving the examples given above, we are left with a head or a tail, an integer from one through six, or a particular card, perhaps the queen of hearts. The result of a trial is called an outcome. The set of all possible outcomes of a probability experiment is called the sample space . The outcomes that constitute a sample space are also referred to as sample points or elements. We shall use ω to denote an element of the sample space.
Example 1.1 The sample space for coin tossing has two sample points, a head (H) and a tail (T= {H, T}, as shown in Figure 1.1.
Figure 1.1. Sample space for tossing a coin has two elements {H, T}.
Example 1.2 = {1, 2, 3, 4, 5, 6}, Figure 1.2, which represents the number of spots on the six faces of the die.
Figure 1.2. Sample space for throwing a die has six elements {1, 2, 3, 4, 5, 6}.
Example 1.3 For choosing a card, the sample space is a set consisting of 52 elements, one for each of the 52 cards in the deck, from the ace of spades through the king of hearts.
Example 1.4 If an experiment consists of three tosses of a coin, then the sample space is given by
Notice that the element HHT is considered to be different from the elements HTH and THH, even though all three tosses give two heads and one tail. The position in which the tail occurs is important.
A sample space may be finite, denumerable (i.e., infinite but countable), or infinite. Its elements depend on the experiment and how the outcome of the experiment is defined. The four illustrative examples given above all have a finite number of elements.
Example 1.5 The sample space derived from an experiment that consists of observing the number of email messages received at a government office in one day may be taken to be denumerable. The sample space is denumerable since we may tag each arriving email message with a unique integer n is the set of nonnegative integers.
Example 1.6 The sample space that arises from an experiment consisting of measuring the time one waits at a bus stop is infinite. Each outcome is a nonnegative real number x .
If a finite number of trials is performed, then, no matter how large this number may be, there is no guarantee that every element of its sample space will be realized, even if the sample space itself is finite. This is a direct result of the essential probabilistic nature of the experiment. For example, it is possible, though perhaps not very likely (i.e., not very probable) that after a very large number of throws of the die, the number 6 has yet to appear.
Notice with emphasis that the sample space is a set; the subsets {H}, {T. = {H, Tis
Events
The word event by itself conjures up the image of something having happened, and this is no different in probability theory. We toss a coin and get a head, we throw a die and get a five, we choose a card and get the ten of diamonds. Each experiment has an outcome, and in these examples, the outcome is an element of the sample space. These, the elements of the sample space, are called the elementary events of the experiment. However, we would like to give a broader meaning to the term event.
Example 1.7 Consider the event of tossing three coins and getting exactly two heads. There are three outcomes that allow for this event, namely, {HHT, HTH, THH}. The single tail appears on the third, second, or first toss, respectively.
Example 1.8 Consider the event of throwing a die and getting a prime number. Three outcomes allow for this event to occur, namely, {2, 3, 5}. This event comes to pass so long as the throw gives neither one, four, nor six spots.
In these last two examples, we have composed an event as a subset of the sample space, the subset {HHT, HTH, THH} in the first case and the subset {2, 3, 5} in the second. This is how we define an event in general. Rather than restricting our concept of an event to just another name for the elements of the sample space, we think of events as subsets of the sample space. In this case, the elementary events are the singleton subsets of the sample space, the subsets {H}, {5}, and {10 of diamonds}, for example. More complex events consist of subsets with more than one outcome. Defining an event as a subset of the sample space and not just as a subset that contains a single element provides us with much more flexibility and allows us to define much more general events.
The event is said to occur
if and only if, the outcome of the experiment is any one of the elements of the subset that constitute the event. They are assigned names to help identify and manipulate them.
Example 1.9 occurs if the outcome of the trial, the number of spots obtained when the die is thrown, is any one of the numbers 4, 5, and 6. This is illustrated in Figure 1.3.
: Throw a number greater than 3.
Example 1.10 occurs if the card chosen is one of these four.
Example 1.11 = {2 ≤ t occurs if the wait is 2.1 minutes or 3.5 minutes or 9.99 minutes, etc.
To summarize, the standard definition of an event is a subset of the sample space. It consists of a set of outcomes. The null (or empty) subset, which contains none of the sample points, and the subset containing the entire sample space are legitimate events—the first is called the null
or impossible event (it can never occur); the second is called the universal
or certain event and is sure to happen no matter what the outcome of the experiments gives. The execution of a trial, or observation of an experiment, must yield one and only one of the outcomes in the sample space. If a subset contains none of these outcomes, the event it represents cannot happen; if a subset contains all of the outcomes, then the event it represents must happen. In general, for each outcome in the sample space, either the event occurs (if that particular outcome is in the defining subset of the event) or it does not occur.
defined on the same sample space are said to be equivalent or identical .
Example 1.12 Consider an experiment that consists of simultaneously throwing two dice. The sample space consists of all pairs of the form (i, j) for i = 1, 2, . . . , 6 and j be the event that the sum of the number of spots obtained on the two dice is even, i.e., i + j be the event that both dice show an even number of spots or both dice show an odd number of spots, i.e., i and j are even or i and j .
Viewing events as subsets allows us to apply typical set operations to them, operations such as set union, set intersection, set complementation, and so on.
is an event, then the complement that are not does not occur.
2. The union . It occurs if either occurs.
3. The intersection and occurs if both occur.
4. The difference . This means that
has also occurred.
Example 1.13 occurs if a 1, 3, or 5 is thrown. Thus both events occur if a 5 is thrown (this is the event that is the intersection ) and neither event occurs if a 2 is thrown (this is the event that is the complement ). These are represented graphically in Figure 1.4. We have
Figure 1.4. Two events on the die-throwing sample space.
Example 1.14 occurs if any one of these 16 cards is chosen: i.e., if one of the 13 diamond cards is chosen or occurs only if a diamond card is chosen and occurs if any diamond card, other than the queen of diamonds, occurs.
.
The definitions of union and intersection may be extended to more than two events. For n they are denoted, respectively, by
):
Venn diagrams can be used to illustrate these results and can be helpful in establishing proofs. For example, an illustration of DeMorgan’s laws is presented in Figure 1.5.
Figure 1.5. DeMorgan’s laws:
Mutually Exclusive and Collectively Exhaustive Events
is the null set), the events are said to be mutually exclusive or incompatible.
Example 1.15 has any element in common. The four sets are mutually exclusive.
Example 1.16 are mutually exclusive.
are mutually exclusive. In general, a list of events is said to be mutually exclusive if no element in their sample space is in more than one event. This is illustrated in Figure 1.6.
When all the elements in a sample space can be found in at least one event in a list of events, then the list of events is said to be collectively exhaustive. In this case, no element of the sample space is omitted and a single element may be in more than one event. This is illustrated in Figure 1.7.
are mutually exclusive.
are collectively exhaustive.
Events that are both mutually exclusive and collectively exhaustive, such as those illustrated in Figure 1.8, are said to form a partition are both mutually exclusive and collectively exhaustive and constitute a partition of the sample space. Furthermore, since the elementary events (or outcomes) of a sample space are mutually exclusive and collectively exhaustive they too constitute a partition of the sample space. Any set of mutually exclusive and collectively exhaustive events is called an event space.
constitute a partition.
Example 1.17 Bit sequences are transmitted over a communication channel in groups of five. Each bit may be received correctly or else be modified in transit, which occasions an error. Consider an experiment that consists in observing the bit values as they arrive and identifying them with the letter c if the bit is correct and with the letter e if the bit is in error.
The sample space consists of 32 outcomes from ccccc through eeeee, consist of all outcomes in which i bits are in error. Thus
partition the sample space and therefore constitute an event space. It may be much easier to work in this small event space rather than in the larger sample space, especially if our only interest is in knowing the number of bits transmitted in error. Furthermore, when the bits are transmitted in larger groups, the difference becomes even more important. With 16 bits per group instead of five, the event space now contains 17 events, whereas the sample space contains 2¹⁶ outcomes.
1.2 Probability Axioms and Probability Space
Probability Axioms
So far our discussion has been about trials, sample spaces, and events. We now tackle the topic of probabilities. Our concern will be with assigning probabilities to events, i.e., providing some measure of the relative likelihood of the occurrence of the event. We realize that when we toss a fair coin, we have a 50−50 chance that it will give a head. When we throw a fair die, the chance of getting a 1 is the same as that of getting a 2, or indeed any of the other four possibilities. If a deck of cards is well shuffled and we pick a single card, there is a one in 52 chance that it will be the queen of hearts. What we have done in these examples is to associate probabilities with the elements of the sample space; more correctly, we have assigned probabilities to the elementary events, the events consisting of the singleton subsets of the sample space.
Probabilities are real numbers in the closed interval [0, 1]. The greater the value of the probability, the more likely the event is to happen. If an event has probability zero, that event cannot occur; if it has probability one, then it is certain to occur.
Example 1.18 In the coin-tossing example, the probability of getting a head in a single toss is 0.5, since we are equally likely to get a head as we are to get a tail. This is written as
is the event {H}.
Similarly, the probability of throwing a 6 with a die is 1/6 and the probability of choosing the queen of hearts is 1/52. In these cases, the elementary events of each sample space all have equal probability, or equal likelihood, of being the outcome on any given trial. They are said to be equiprobable events and the outcome of the experiment is said to be random, since each event has the same chance of occurring. In a sample space containing n equally likely outcomes, the probability of any particular outcome occurring is 1/n. Naturally, we can assign probabilities to events other than elementary events.
Example 1.19 is the sum of the probabilities of these three elementary events and is therefore equal to 0.5.
This holds in general: the probability of any event is simply the sum of the probabilities associated with the (elementary) elements of the sample space that constitute that event.
Example 1.20 Consider Figure 1.6 once again (reproduced here as Figure 1.9), and assume that each of the 24 points or elements of the sample space is equiprobable.
Figure 1.9. Sample space with 24 equiprobable elements.
contains eight elements, and so the probability of this event is
} = 8/24 = 1/3.
Assigning probabilities to events is an extremely important part of developing probability models. In some cases, we know in advance the probabilities to associate with elementary events, while in other cases they must be estimated. If we assume that the coin and the die are fair and the deck of cards completely shuffled, then it is easy to associate probabilities with the elements of the sample space and subsequently to the events described on these sample spaces. In other cases, the probabilities must be guessed at or estimated.
Two approaches have been developed for defining probabilities: the relative frequency approach and the axiomatic approach. The first, as its name implies, consists in performing the probability experiment a great many times, say N, and counting the number of times a certain event occurs, say n. An estimate of the probability of the event may then be obtained as the relative frequency n/N with which the event occurs, since we would hope that, in the limit (limit in a probabilistic sense) as N → ∞, the ratio n/N tends to the correct probability of the event. In mathematical terms, this is stated as follows: Given that the probability of an event is p, then
to be, the probability that the difference between n/N and p tends to zero as N → ∞. Use of relative frequencies as estimates of probability can be justified mathematically, as we shall see later.
The axiomatic approach sets up a small number of laws or axioms on which the entire theory of probability is based. Fundamental to this concept is the fact that it is possible to manipulate probabilities using the same logic algebra with which the events themselves are manipulated. The three basic axioms are as follows.
Axiom 1: } ≤ 1; i.e., probabilities are real numbers in the interval [0, 1].
Axiom 2: } = 1; The universal or certain event is assigned probability 1.
Axiom 3: For any countable collection , . . . that are mutually exclusive,
In some elementary texts, the third axiom is replaced with the simpler
Axiom 3†:
and a comment included stating that this extends in a natural sense to any finite or denumerable number of mutually exclusive events.
These three axioms are very natural; the first two are almost trivial, which essentially means that all of probability is based on unions of mutually exclusive events. To gain some insight, consider the following examples.
Example 1.21 } = p} = pare given by
Example 1.22 will be less than p1 + pwill be greater than zero; it will be the sum of the probabilities of the elementary events found in the intersection of the two subsets. It follows then that
, formed from the union of a set of mutually exclusive events, is equal to the sum of the probabilities of those mutually exclusive events, i.e.,
In particular, the probability of any event is equal to the sum of the probabilities of the outcomes in the sample space that constitute the event since outcomes are elementary events which are mutually exclusive.
A number of the most important results that follow from these definitions are presented below. The reader should make an effort to prove these independently.
} = 1.
} = 1).
are any events, not necessarily mutually exclusive,
}.
,
,
It is interesting to observe that an event having probability zero does not necessarily mean that this event cannot occur. The probability of no heads appearing in an infinite number of throws of a fair coin is zero, but this event can occur.
Probability Space
The set of subsets of a given set, which includes the empty subset and the complete set itself, is sometimes referred to as the superset or power set that is closed under countable unions and complementation is called a σ-field . The term σ-algebra is also used. Using DeMorgan’s law, it may be shown that countable intersections of subsets of a σ.
Example 1.23 } is the smallest a σ-field defined on a sample space. It is sometimes called the trivial σand is a subset of every other a σis the largest a σ.
Example 1.24 is a σ-field.
Example 1.25 In a die-rolling experiment having sample space {1, 2, 3, 4, 5, 6}, the following are all σ-field:
, {1, 2, 4, 6}, {3, 4, 5}} are not.
We may now define a probability space or probability systemis a σ.
1.3 Conditional Probability
} is the prior probability occurring must be altered.
Example 1.26 Let us return to the example in which we consider the probabilities obtained on three throws of a fair coin. The elements of the sample space are
= {HHH} = 1/8. Now, how do the probabilities change if we know the result of the first throw?
= {THH, THT, TTH, TTT} and we know that we are not going to get our three heads! Once we know that the result of the first throw is tails, the event of interest becomes impossible, i.e., has probability zero.
= {HHH, HHT, HTH, HTT= {HHH} is still possible. The question we are now faced with is to determine the probability of getting {HHH} given that we know that the first throw gives a head. Obviously the probability must now be greater than 1/8. All we need to do is to get heads on the second and third throws, each of which is obtained with probability 1/2. Thus, given that the first throw yields heads, the probability of getting the event HHH contains four equiprobable outcomes and is known to have occurred. It follows that the probability of any one of these four equiprobable outcomes, and in particular that of HHH, is 1/4.
The effect of knowing that a certain event has occurred changes the original probabilities of other events defined on the sample space. Some of these may become zero; for some others, their associated probability is increased. For yet others, there may be no change.
Example 1.27 Consider are equiprobable, after renormalization, they must each have probability 1/12.
Figure 1.10. Sample space with 24 equiprobable elements.
has occurred. Because of the need to renormalize the probabilities so that they continue to sum to 1 after this given event has taken place, we must have
} is called the conditional probability of event given the hypothesis } ≠ 0. Notice that a rearrangement of Equation (1.1) gives
Similarly,
} > 0.
Since conditional probabilities are probabilities in the strictest sense of the term, they satisfy all the properties that we have seen so far concerning ordinary probabilities. In addition, the following hold:
} = 0.
} = 1.
Example 1.28 }, the probability that a red queen is pulled given that a red card is chosen, is
} are prior probabilitiescontains 26 of the 52 possible outcomes.
Example 1.29 If we observe .
} = 1/3.
, i = 1, 2, . . . , k, be k Then
The proof is by induction. The base clause (k = 2) follows from Equation (1.2):
and assume the relation is true for k, i.e., that
That the relation is true for k + 1 follows immediately, since
Example 1.30 occur, and using the relationship
we obtain
1.4 Independent Events
We saw previously that two events are mutually exclusive if and only if the probability of the union of these two events is equal to the sum of the probabilities of the events, i.e., if and only if
Now we investigate the probability associated with the intersection of two events. We shall see that the probability of the intersection of two events is equal to the product of the probabilities of the events if and only if the outcome of one event does not influence the outcome of the other, i.e., if and only if the two events are independent of each other.
is said to be independent if
has occurred.
Example 1.31 be the event that the red die shows 3. There are a total of 36 outcomes, each represented as a pair (i, j), where i denotes the number of spots on the red die, and j must therefore be independent since
; i.e., independence is a symmetric } it must follow that, for independent events
or, rearranging terms, that
are said to be independent if and only if
Pursuing this direction, it then follows that, for two independent events
which conveniently brings us back to the starting point.
Example 1.32 independent events?
is the event of a head occurring on the first throw and must be independent, since
Example 1.33 is the event that the card pulled is the queen of hearts?
} and so the events are independent.
} = 1/52. Therefore the two events are not independent since
} we obtain
are pairwise independent, has a number of useful applications.
Example 1.34 Before being loaded onto a distribution truck, packages are subject to two independent tests, to ensure that the truck driver can safely handle them. The weight of the package must not exceed 80 lbs and the sum of the three dimensions must be less than 8 feet. It has been observed that 5% of packages exceed the weight limit and 2% exceed the dimension limit. What is the probability that a package that meets the weight requirement fails the dimension requirement?
The sample space contains four possible outcomes: (ws, ds), (wu, ds), (ws, du), and (wu, du), where w and d represent weight and dimension, respectively, and s and u } = 0.02.
care independent and hence
Multiple Independent Events
be an arbitrary class of events, i.e.,
These events are said to be mutually independent
must satisfy
must satisfy
and so on, for quadruples of events, for quintuples of events, etc.
Example 1.35 The following example shows the need for this definition. , each with probability 1/2, are defined. Also, observe that
are pairwise independent.
Figure 1.12. Sample space with 16 equiprobable elements.
However, they are not mutually independent since
Alternatively,
defined above, are not are mutually independent only if all the following conditions hold:
Example 1.36 Consider a sample space that contains four equiprobable outcomes denoted a, b, c, and d= {a, b= {a, b, c. This time
but
are not independent, nor even pairwise independent.
1.5 Law of Total Probability
constitute a partition. Thus
Substituting the second of these into the first and then applying DeMorgan’s law, we find
are mutually exclusive. This is illustrated in cannot have any outcomes in common, the intersection cannot have any outcomes in common with the intersection .
are mutually exclusive.
are mutually exclusive, and applying Axiom 3, we obtain
by some other method.
The same rule applies for any partition of the sample space and not just a partition defined by an event and its complement. Recall that a partition is a set of events that are mutually exclusive and collectively exhaustive. Let the n , i = 1, 2, . . . , n, we can write
This is the law of total probabilityi = 1, 2, . . . , n, i = 1, 2, . . . , nimplies that
Hence, using Axiom 3,
Example 1.37 As an illustration, consider .
Figure 1.14. Law of total probability.
for i = 1, 2, . . . , 6. For this particular example it can be seen that these six probabilities are given by 0, 1/24, 1/12, 0, 1/24, and 1/12 which when added together gives 1/4.
The law of total probability is frequently presented in a different context, one that explicitly involves conditional probabilities. We have
, for all i, since in many instances we are provided with information concerning conditional probabilities of an event and we need to use Equation (1.5) to remove }.
and hence
Thus
and the desired result follows.
Example 1.38 Suppose three boxes contain a mixture of white and black balls. The first box contains 12 white and three black balls; the second contains four white and 16 black balls and the third contains six white and four black balls. A box is selected and a single ball is chosen from it. The choice of box is