Designed by Engin A. Sungur
Lesson 4:
Contents
This fourth lesson covers ...
Objectives
After completing this lesson, you should be able to
- phrase an inference about a population by making the jump from sample
to population, and then give a measure of reliability for the inference
- understand basic concepts of probability
- identify random variables and find probabilities of specific
numerical outcomes
- find and interpret mean and variance of random variables
Reading Assignment
Read chapter 4 in Introduction to the Practice of Statistics.
Activities
The Detailed Learning Objectives
Chapter 4
4.1 Probability Models
- Relative frequency and subjective probability
- Determination of sample space (S)
- Rules of probability
- Finding P(Ac), P(A or B), P(A and B)
for disjoint and independent events
- Independence (verifying whether A and B are independent,
finding probabilities given independence)
4.2 Random Variables
- discrete and continuous random variables
- given a probability distribution finding event probabilities
- constructing a probability distribution
4.3 Mean and Variances of Random Variables
- given a probability distribution finding mean, variance
and standard deviation
4.4 Probability Laws
- finding P(A or B), P(A and B) in general
- conditional probabilities (finding )
- construction of a tree diagram
- Bayes's Rule
Key Terms
- probability model
- sample space
- event
- complement of an event
- disjoint events
- independent events
- complement rule
- addition rule
- multiplication rule
- random variable
- discrete random variable
- continuous random variable
- density curve
- Normal distribution
- mean
- standard deviation
- law of large numbers
- tree diagram
- Bayes's rule
Chapter 4: Study Questions
- What is the difference between sample space and event?
- What is the difference between disjoint events and
independent events?
- What are the complement, addition, and multiplication rules?
- What does a probability distribution of a random variable give us?
- What is the difference between discrete and continuous random
variables?
- What is the density curve?
- What is the Normal distribution?
- What are the mean and standard deviation of a random variable?
- What is conditional probability?
- What is the use of Bayes's rule?
Chapter 4: Study Notes
A phenomenon is random if
- individual outcomes are uncertain
- there is a regular distribution of outcomes in a large number of
repetitions
Examples: tossing a coin, choosing a random sample
The probability of any outcome of a random phenomenon is the
proportion of times the outcome would occur in a very long series of
repetitions.
Here is how this has been done historically:
Person who Tossed the Coin | Number of Tosses | Number of Heads | Relative Frequency |
Buffon (naturalist) | 4040 | 2048 | 2048/4040=0.5069 |
Pearson (statistician) | 24000 | 12012 | 12012/24000=0.5005 |
Kerrich (mathematician, in prison) | 10000 | 5067 | 5067/10000=0.5067 |
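The relative frequencies in the table can be recomputed directly, and a simulated coin shows the same settling toward 0.5 as the number of tosses grows. The Python sketch below is only an illustration: the function names and the seeded pseudo-random "fair coin" are assumptions, not part of the lesson.

```python
import random

# Historical coin-tossing results from the table: (tosses, heads).
HISTORICAL = {
    "Buffon": (4040, 2048),
    "Pearson": (24000, 12012),
    "Kerrich": (10000, 5067),
}


def relative_frequency(heads, tosses):
    """Proportion of tosses that came up heads."""
    return heads / tosses


def simulate_heads(tosses, seed=0):
    """Count heads in `tosses` flips of a simulated fair coin (an assumption)."""
    rng = random.Random(seed)
    return sum(rng.random() < 0.5 for _ in range(tosses))


if __name__ == "__main__":
    for name, (tosses, heads) in HISTORICAL.items():
        print(f"{name}: {relative_frequency(heads, tosses):.4f}")
    # Longer simulated series should sit closer to 0.5.
    for n in (10, 1000, 100000):
        print(n, relative_frequency(simulate_heads(n), n))
```

With only 10 simulated tosses the proportion can be far from 0.5; with 100000 it is typically very close, which is the idea behind the relative frequency definition.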
Note that this definition of probability requires that the experiment can
be repeated under the same conditions many times. Do you think that this
is possible in all cases? For example, suppose you need to figure out the
probability that you will pass this course. How would you find this
probability? The definition given above is the one introduced by the
relative frequency school of probability. The other approach is known as
the personal probability or subjective probability approach. This school
holds that a probability is assigned to an event based on subjective
judgement, experience, information, and belief.
Whichever approach one selects, there should be a common framework.
In section 4.2 we will develop this common framework.
4.2. and 4.5. PROBABILITY MODELS
We start with an experiment and define the following:
Sample Space is the set of all possible outcomes of an
experiment, which is represented by S.
Event is any subset of the sample space. In other words, it is any
combination of possible outcomes of the experiment. Generally, events are
represented by capital letters, such as A, B, C, ....
Here are some examples:
Experiment: Toss a coin one time
Sample Space: S={H,T}
An Event: A=getting a head={H}
Experiment: Toss a coin three times
Sample Space: S={HHH,THH,HTH,HHT,TTH,THT,HTT,TTT}
An Event: A="getting exactly two heads"={THH,HTH,HHT}
Another event could be
B="longest run of tails being 2"={TTH,HTT}
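For small experiments the sample space and its events can be listed mechanically. The following Python sketch rebuilds the three-toss example; the variable and function names are illustrative assumptions.

```python
from itertools import product

# Sample space for three tosses: every sequence of H and T of length 3.
sample_space = {"".join(seq) for seq in product("HT", repeat=3)}


def longest_tail_run(outcome):
    """Length of the longest unbroken run of T in an outcome such as 'HTT'."""
    return max(len(run) for run in outcome.split("H"))


# An event is a subset of the sample space, selected by a condition.
two_heads = {o for o in sample_space if o.count("H") == 2}
tail_run_of_two = {o for o in sample_space if longest_tail_run(o) == 2}

print(sorted(sample_space))
print(sorted(two_heads))
print(sorted(tail_run_of_two))
```

Note that TTT is not in the second event, because its longest run of tails is 3, not 2.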
Sometimes a "tree" could help us to understand the sample space better.
Experiment: Flip a coin. If a head occurs, flip it a second
time. If a tail occurs, toss a die.
Sample Space: S={HH,HT,T1,T2,T3,T4,T5,T6}
An Event: A="getting a 3 on the die"={T3}
Note that I did not put A={3}, because 3 is not an element of the
sample space.
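The tree for this experiment can be walked in code. The sketch below assumes a fair coin and a fair die (the lesson does not state this) and multiplies the probabilities along each branch, which anticipates the multiplication rule discussed later in these notes.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # assumed: fair coin

# Walk the tree: the first branch is the coin, the second depends on its result.
tree_probabilities = {}
for second in "HT":                      # head first, then a second coin flip
    tree_probabilities["H" + second] = HALF * HALF
for face in range(1, 7):                 # tail first, then a roll of a fair die
    tree_probabilities["T" + str(face)] = HALF * Fraction(1, 6)

sample_space = sorted(tree_probabilities)
print(sample_space)
print(tree_probabilities["T3"])          # probability of the event A={T3}
print(sum(tree_probabilities.values()))  # Rule 2 check: should be 1
```

Under these assumptions the eight outcomes are not equally likely: HH and HT each get 1/4, while each of T1 through T6 gets 1/12.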
Experiment: Select a number between 0 and 1
Sample Space: S={all numbers between 0 and 1}
An Event: A="selecting a number less than or equal to
the median"={all numbers from 0 to 0.5, inclusive}
BASIC PROBABILITY RULES
Let us start with the first two rules. These rules help us to end up with
legitimate values for the probabilities:
Rule 1: P(A) is always between 0 and 1.
Rule 2: P(S)=1
Here are the steps for assigning probability to an event:
- Define the experiment
- List all possible outcomes
- Assign probabilities to each outcome
- Determine the outcomes in the event, say A
- Sum the outcome probabilities that are in the event A.
Example:
Experiment: taking a course on statistics
S={A,B,C,D,F,I,W}
the sample space gives us the possible grades.
Based on the past experience the following probabilities are assigned to
the outcomes of this experiment:
OUTCOME | A | B | C | D | F | I | W |
PROBABILITY | 0.2 | 0.3 | 0.2 | 0.1 | 0.1 | 0.05 | ? |
By Rule 2, the "?" must be 0.05, because the probabilities of all outcomes must sum to 1.
Say we want to find the probability that a student will get C or better.
H="getting C or better"={A,B,C}
Therefore,
P(H)=P(A)+P(B)+P(C)=0.2+0.3+0.2=0.7.
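The grade example follows the five steps exactly, so it is easy to reproduce. This is a minimal Python sketch; the names `known`, `grade_probs`, and `event_probability` are illustrative assumptions.

```python
# Probabilities assigned to each grade; W is left out because it is unknown.
known = {"A": 0.2, "B": 0.3, "C": 0.2, "D": 0.1, "F": 0.1, "I": 0.05}

# Rule 2: the probabilities over the whole sample space must sum to 1.
# Rounding only removes floating-point noise.
grade_probs = dict(known)
grade_probs["W"] = round(1 - sum(known.values()), 10)


def event_probability(event, probs):
    """Sum the probabilities of the outcomes that make up the event."""
    return sum(probs[outcome] for outcome in event)


c_or_better = {"A", "B", "C"}
print(grade_probs["W"])
print(round(event_probability(c_or_better, grade_probs), 10))
```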
Finding the probability of an event gets easier if the sample space
includes a finite number of outcomes and these outcomes are equally likely
to occur. Please note that not all sample spaces have equally likely outcomes.
Suppose that there are k possible outcomes and the outcomes are equally
likely. Then
- Probability of each outcome=1/k
- P(A)=(Number of outcomes in A)/k
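As a quick check of the counting formula, here is a Python sketch that applies it to three tosses of a coin. It assumes the coin is fair, so that the k=8 outcomes are equally likely; the function name is an illustrative choice.

```python
from fractions import Fraction
from itertools import product


def equally_likely_probability(event, sample_space):
    """P(A) = (number of outcomes in A) / k, valid only for equally likely outcomes."""
    return Fraction(len(event), len(sample_space))


# Three tosses of a fair coin: k = 8 equally likely outcomes.
sample_space = ["".join(seq) for seq in product("HT", repeat=3)]
two_heads = [o for o in sample_space if o.count("H") == 2]

print(equally_likely_probability(two_heads, sample_space))
```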
SUMMARY OF BASIC COMBINED EVENTS
Operation | Explanation | Notation |
Complement of an event A | The event that A does not occur | Ac |
Intersection of A and B | Both A and B occur at the same time | A and B |
Union of A and B | Either A or B or both occur; that is, at least one of the events A, B occurs | A or B |
OTHER RULES OF PROBABILITY
For the other rules of probability we need to distinguish three cases: the
general case, disjoint events, and independent events.
- Disjoint Events: If the events have no common outcome, they are
called disjoint events. In other words, the events, say A and B, cannot
occur together.
- Independent Events: If whether A occurs or not does not change
the probability that B occurs, then A and B are called independent.
Note that being disjoint is a property of the events themselves, but being
independent is a property of the probabilities of the events.
Here is a summary of all the rules:
| GENERAL CASE | DISJOINT EVENTS | INDEPENDENT EVENTS |
Rule 3. Complement Rule | P(Ac)=1-P(A) | P(Ac)=1-P(A) | P(Ac)=1-P(A) |
Rule 4. Addition Rule | P(A or B)=P(A)+P(B)-P(A and B) | P(A or B)=P(A)+P(B) | P(A or B)=P(A)+P(B)-P(A)P(B) |
Rule 5. Multiplication Rule | P(A and B)=P(A|B)P(B) | P(A and B)=0 | P(A and B)=P(A)P(B) |
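The rules can be checked numerically on a small example. The sketch below uses one roll of a fair die, which is an assumed example rather than one from the lesson; the events A, B, and C are chosen so that B is independent of A and C is disjoint from A.

```python
from fractions import Fraction

S = frozenset(range(1, 7))  # one roll of a fair die (assumed example)


def prob(event):
    """Probability of an event when all outcomes in S are equally likely."""
    return Fraction(len(event & S), len(S))


A = frozenset({2, 4, 6})   # even number
B = frozenset({1, 2})      # independent of A (checked below)
C = frozenset({1, 3})      # disjoint from A

# Rule 3, complement rule
assert prob(S - A) == 1 - prob(A)
# Rule 4, addition rule: general, disjoint, and independent forms
assert prob(A | B) == prob(A) + prob(B) - prob(A & B)
assert prob(A | C) == prob(A) + prob(C)
assert prob(A | B) == prob(A) + prob(B) - prob(A) * prob(B)
# Rule 5, multiplication rule: disjoint and independent forms
assert prob(A & C) == 0
assert prob(A & B) == prob(A) * prob(B)

print(prob(A | B), prob(A | C))
```

In the code, `|` and `&` are Python's set union and intersection, corresponding to "A or B" and "A and B" in the notation of these notes.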
Note that the general multiplication rule uses conditional probability.
To understand this concept better, please work on the activity
LET US ROLL A DIE. Here is the formal definition of conditional
probability.
Definition The conditional probability of B given A is:
P(B|A)=P(A and B)/P(A)
Note that for independent events
P(B|A)=P(B)
P(A|B)=P(A)
There are two types of questions that one might face related to
independence:
- You are given that the events are independent and asked to find the
probability that all of these events will occur at the same time. In this
case you multiply the individual probabilities.
- Based on the information given, verify whether two events A and B are
independent or not. Here you need to check whether or not P(A and B) is
equal to P(A)P(B). Alternatively, you can check whether or not P(A|B) is equal to P(A).
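Both checks for independence can be carried out in a few lines. The Python sketch below again assumes one roll of a fair die as the experiment; the helper names `conditional` and `independent` are illustrative, not from the lesson.

```python
from fractions import Fraction

S = frozenset(range(1, 7))  # one roll of a fair die (assumed example)


def prob(event):
    """Probability of an event when all outcomes in S are equally likely."""
    return Fraction(len(event & S), len(S))


def conditional(b, a):
    """P(B|A) = P(A and B) / P(A); undefined when P(A) = 0."""
    if prob(a) == 0:
        raise ValueError("P(A) must be positive")
    return prob(a & b) / prob(a)


def independent(a, b):
    """Check independence through P(A and B) = P(A)P(B)."""
    return prob(a & b) == prob(a) * prob(b)


A = frozenset({2, 4, 6})   # even number
B = frozenset({1, 2})
C = frozenset({1, 3})      # disjoint from A

print(conditional(B, A), prob(B))   # equal, so A and B are independent
print(independent(A, B))
print(independent(A, C))            # disjoint events with positive probability
```

The last line illustrates why disjoint and independent are different ideas: A and C cannot occur together, so knowing that A occurred changes the probability of C to 0, and the events are not independent.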
MORE ON PROBABILITY TREES AND BAYES'S RULE
To be able to understand the importance of Bayes's rule, one should know the
difference between the events A|B and B|A, and one should be aware that
the difficulty of obtaining information on the probability of
occurrence of these two events may not be the same. For example, let
A="the person has AIDS", B="the HIV test turns out to be positive".
Suppose that you are a researcher. Which one of the two is easier to find
through experiments, P(A|B) or P(B|A)?
If you would like to have more information on "How to Solve Problems
Related with Probability?" and "Use of Probability Trees and Bayes's
Rule", please click anywhere on this sentence.
Most of the time Bayes's rule is applied to "testing" problems. Here is an
example:
In one of Marilyn vos Savant's columns in Parade magazine the following
question was asked.
Suppose we assume that 5% of the people are drug-users. The test
is 95% accurate, which we'll say means that if a person is a user, the
result is positive 95% of the time; and if s/he isn't, it's negative 95%
of the time. A randomly chosen person tests positive. Is the individual
highly likely to be a drug-user?
Marilyn's answer was:
Given your condition, once the person has tested positive, you
may as well flip a coin to determine whether she or he is a drug-user.
The chances are only 50-50.
If you are having a hard time understanding why,
please click anywhere on this sentence.
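The 50-50 answer can be verified with Bayes's rule using the numbers in the question: 5% users, a positive result 95% of the time for users, and a negative result 95% of the time for non-users. In the Python sketch below, the parameter names `prevalence`, `sensitivity`, and `specificity` are labels chosen for this illustration.

```python
from fractions import Fraction


def posterior_user(prevalence, sensitivity, specificity):
    """P(user | positive test) by Bayes's rule."""
    true_positive = prevalence * sensitivity               # P(user and positive)
    false_positive = (1 - prevalence) * (1 - specificity)  # P(non-user and positive)
    return true_positive / (true_positive + false_positive)


p = posterior_user(Fraction(5, 100), Fraction(95, 100), Fraction(95, 100))
print(p)  # 1/2
```

The two branches of the tree that end in a positive test, user and positive (0.05 x 0.95) and non-user and positive (0.95 x 0.05), have the same probability, so a positive result is equally likely to come from either group.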
If you would like to see another case where Marilyn was right again,
please try the activity
TO SWITCH OR NOT TO SWITCH.
4.3 and 4.4 RANDOM VARIABLES AND THEIR MEANS AND VARIANCES
RANDOM VARIABLE: is a variable whose value is a numerical outcome
of a random phenomenon.
There are two types of random variables
- Discrete Random Variables: These take a countable
number of possible values.
- Continuous Random Variables: These can assume any value in one or
more intervals.
Notation: X, Y, Z represent random variables; x, y, z represent
possible values of the random variables.
Here are some examples:
- X=shoe size; x=5, 5.5, 6, 6.5, ..., 12 (discrete)
- X=life length of a battery; x can be anything from 0 to infinity
(continuous)
- X=number of trials to obtain the first head; x=1,2,3,... (discrete)
- X=amount of time you need to wait to obtain the first head; x can be
anything from 0 to infinity (continuous)
Probability Distribution of a Discrete Random Variable lists
all possible values and their probabilities.
Here are some important things that you should be able to do:
- Verify whether or not you are dealing with a legitimate probability
distribution. Check all the probabilities. They should be between 0
and 1. The sum of all the probabilities should be 1.
- Construct a probability distribution. By using the knowledge
that you have gained in sections 4.2 and 4.5, list all possible values and
their associated probabilities.
- Find the probability of an event related to X. Add the
probabilities of the values of X that make up the event.
- Find the probability associated with one possible value of X, given
all of the remaining probabilities. Remember that the sum of the
probabilities should be 1.
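The tasks in this list translate into short functions. The Python sketch below is one possible arrangement; the function names and the example distribution (X = number of heads in three tosses of a fair coin) are assumptions made for illustration.

```python
import math


def is_legitimate(distribution):
    """Each probability lies in [0, 1] and the probabilities sum to 1."""
    probs = distribution.values()
    return all(0 <= p <= 1 for p in probs) and math.isclose(sum(probs), 1.0)


def missing_probability(known_probs):
    """The one unlisted probability, since all probabilities must sum to 1."""
    return 1 - sum(known_probs)


def event_probability(distribution, condition):
    """Add the probabilities of the values of X that make up the event."""
    return sum(p for x, p in distribution.items() if condition(x))


# Hypothetical distribution: X = number of heads in three fair coin tosses.
heads = {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}

print(is_legitimate(heads))
print(event_probability(heads, lambda x: x >= 2))   # P(X >= 2)
print(missing_probability([0.125, 0.375, 0.375]))   # the remaining value
```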
When finding mean, variance, and standard deviation of a discrete random
variable, constructing a table like the one below may help.
Possible values, xi | Probability, pi | xi pi | (xi-μX) | (xi-μX)^2 | (xi-μX)^2 pi |
 | | | | | |
 | | Sum of this column=μX=mean | | | Sum of this column=σX^2=variance |
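The columns of this table can be filled in by a short program. The sketch below uses a hypothetical distribution (X = number of heads in three tosses of a fair coin) that is not part of the lesson; exact fractions are used so the column sums come out without rounding error.

```python
from fractions import Fraction
from math import sqrt

# Hypothetical distribution: X = number of heads in three fair coin tosses.
distribution = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

# Column "xi pi": its sum is the mean.
mean = sum(x * p for x, p in distribution.items())

# Columns "(xi - mean)", "(xi - mean)^2", "(xi - mean)^2 pi":
# the last column sums to the variance.
rows = [(x, p, x * p, x - mean, (x - mean) ** 2, (x - mean) ** 2 * p)
        for x, p in distribution.items()]
variance = sum(row[-1] for row in rows)
std_dev = sqrt(variance)  # standard deviation is the square root of the variance

for row in rows:
    print(*row, sep=" | ")
print("mean =", mean, "variance =", variance, "sd =", round(std_dev, 4))
```

For this distribution the mean is 3/2 and the variance is 3/4.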
Written Assignment
Do the following assignment. The problems listed are from
"Introduction to the Practice of Statistics".
When you have worked on the problems and are ready to turn in
your findings, click the assignment link below. It will take you to a
template where you can fill in your answers to the questions. When you are
finished entering your answers, click the submit button; you will be
given the location of your completed web page. You may check your
assignment responses with your browser at any time, and submit a revision
at any time before the due date of the assignment. The due date is
Wednesday, August 4.
SECTION 4.1. Exercise 4.3 (page 293) |
SECTION 4.2. Exercises 4.11, 4.15, 4.21, 4.25, 4.31, 4.33 (pages 306-312) |
SECTION 4.3. Exercises 4.41, 4.45 (pages 323, 324) |
SECTION 4.4. Exercise 4.63 (page 343) |
SECTION 4.5. Exercises 4.79, 4.87, 4.89 (pages 360-363) |
Lesson Submission 4
Assignment #4.
Internet Links
Each day you go online, be sure to check out the Random
Statistical Quote for the Day