Introduction to information system
Popular Distributions (2/2)
Bowei Chen
School of Computer Science
University of Lincoln
CMP3036M/CMP9063M Data Science
• Univariate Distributions
– Discrete Distributions
• Uniform
• Bernoulli
• Binomial
• Poisson
– Continuous Distributions
• Uniform
• Exponential
• Normal/Gaussian
• Multivariate Distributions
– Multivariate Normal Distribution
Today’s Objectives
Quick Recap on Discrete Distributions!
Discrete Uniform Distribution
When to Use it?
• The experiment has a finite number of outcomes
• Each outcome is equally likely to occur
Example:
• Flipping a fair coin
• Rolling a fair die
Bernoulli and Binomial Distributions
When to Use it?
• The experiment has two outcomes
– If the experiment is performed once, the outcome follows the Bernoulli distribution
– If the experiment is repeated $n$ times independently, the number of successes follows the Binomial distribution
Example: Flipping a fair/unfair coin, where a head occurs with probability $p$
• One flip: Ber($p$)
• $n$ flips: the number of heads follows Bin($n$, $p$)
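The relationship between Ber($p$) and Bin($n$, $p$) can be sketched in a few lines. The slide does not state the Binomial PMF, so the standard formula $\binom{n}{k} p^k (1-p)^{n-k}$ is assumed here, and the function names and the values of `n` and `p` are purely illustrative.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(K = k) for K ~ Bin(n, p): probability of k heads in n flips."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def bernoulli_pmf(k: int, p: float) -> float:
    """Ber(p) is the special case of Bin(n, p) with a single flip (n = 1)."""
    return binomial_pmf(k, 1, p)


n, p = 10, 0.3  # illustrative values, not taken from the slides
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
total = sum(pmf)                               # should be 1
mean = sum(k * q for k, q in enumerate(pmf))   # should be n * p

print(f"sum of PMF = {total:.6f}, mean = {mean:.6f}")
```

The probabilities sum to 1 and the mean equals $np$, which is a quick sanity check that the PMF is a valid distribution.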
Think Deeper
A die can be fair or unfair. The outcome counts of $n$ tosses of a die with face probabilities $p_1, \dots, p_6$ follow the Multinomial distribution. It is not required for this course, but it can be an interesting advanced topic for your self-directed study.
Poisson Distribution
When to Use it?
• The experiment has two outcomes (an event either occurs or does not occur)
• The experiment is repeated infinitely many times within a period (the period is divided into ever smaller intervals)
• The average rate at which one outcome occurs is finite
Example:
• Number of text message arrivals in a period
What if we look at the seconds, milliseconds, microseconds?
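The question about seconds, milliseconds and microseconds can be explored numerically. The sketch below assumes that one hour is cut into $n$ slots, that each slot independently contains a message with probability $\lambda / n$, and that the message count is therefore Bin($n$, $\lambda/n$). The rate $\lambda = 5$, the count $k = 3$ and the function names are illustrative assumptions.

```python
from math import comb, exp, factorial


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(K = k) for K ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """P(Y = k) for Y ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


lam, k = 5.0, 3  # illustrative: 5 messages per hour, probability of exactly 3
target = poisson_pmf(k, lam)

errors = []
# Number of slots in one hour: minutes, seconds, milliseconds.
for n in (60, 3_600, 3_600_000):
    approx = binomial_pmf(k, n, lam / n)
    errors.append(abs(approx - target))
    print(f"n = {n:>9}: Bin = {approx:.8f}, Poisson = {target:.8f}")
```

As the slots get finer, the Binomial probability approaches the Poisson probability, which is why the Poisson distribution suits counts of arrivals in continuous time.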
Continuous Uniform Distribution
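• Notation
$$X \sim \mathrm{U}(a, b)$$
$$f(x; a, b) = \begin{cases} \dfrac{1}{b - a}, & \text{if } x \in [a, b], \\ 0, & \text{otherwise.} \end{cases}$$
• Expectation and variance
$$\mathbb{E}(X) = \frac{a + b}{2}, \qquad \mathbb{V}(X) = \frac{(b - a)^2}{12}.$$
$$\text{Area} = \int_{a}^{b} f(x; a, b)\,dx = 1$$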
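The area, expectation and variance above can be checked by numerical integration. This is a minimal sketch using a midpoint rule; the helper `integrate` and the endpoints `a = 2`, `b = 5` are illustrative assumptions, not part of the slides.

```python
def uniform_pdf(x: float, a: float, b: float) -> float:
    """Density of U(a, b): constant 1 / (b - a) on [a, b], zero elsewhere."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def integrate(g, lo: float, hi: float, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(g(lo + (i + 0.5) * h) for i in range(steps)) * h


a, b = 2.0, 5.0  # illustrative endpoints
area = integrate(lambda x: uniform_pdf(x, a, b), a, b)
mean = integrate(lambda x: x * uniform_pdf(x, a, b), a, b)
var = integrate(lambda x: (x - mean) ** 2 * uniform_pdf(x, a, b), a, b)

print(f"area = {area:.6f}")
print(f"mean = {mean:.6f} (formula: {(a + b) / 2:.6f})")
print(f"var  = {var:.6f} (formula: {(b - a) ** 2 / 12:.6f})")
```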
You usually receive 5 text messages per hour
• Event counts! $Y$: the number of messages received in the next hour
• Waiting time! $X$: the waiting time (in hours) for the first message
You receive 0 messages in the next hour ($Y = 0$) exactly when your waiting time for the first message is more than 1 hour ($X > 1$). Otherwise, your waiting time for the first message is less than or equal to 1 hour ($X \le 1$).
The PMF of the Poisson distribution is
$$\mathbb{P}(Y = y) = \frac{e^{-\lambda} \lambda^{y}}{y!},$$
here with $y = 0$ and $\lambda = 5$. Therefore
$$\mathbb{P}(X \le 1) = 1 - \mathbb{P}(X > 1) = 1 - \mathbb{P}(Y = 0) = 1 - \frac{e^{-5}\, 5^{0}}{0!} = 1 - e^{-5}.$$
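To put a number on this result, the calculation can be evaluated directly. The sketch below simply plugs $\lambda = 5$ and $y = 0$ into the Poisson PMF; the function and variable names are illustrative.

```python
from math import exp, factorial


def poisson_pmf(y: int, lam: float) -> float:
    """P(Y = y) for Y ~ Poisson(lam)."""
    return exp(-lam) * lam**y / factorial(y)


lam = 5.0                            # 5 messages per hour on average
p_no_message = poisson_pmf(0, lam)   # P(Y = 0) = P(X > 1)
p_within_hour = 1.0 - p_no_message   # P(X <= 1)

print(round(p_no_message, 6), round(p_within_hour, 6))
# prints: 0.006738 0.993262
```

So with an average of 5 messages per hour, the first message arrives within the hour with probability of about 99.3%.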
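General Solution
For a waiting time of $x$ hours, let $Y_x$ be the number of messages received in the next $x$ hours, so that $Y_x$ is Poisson distributed with mean $\lambda x$. Then
$$\mathbb{P}(X \le x) = 1 - \mathbb{P}(X > x) = 1 - \mathbb{P}(Y_x = 0) = 1 - \frac{e^{-\lambda x} (\lambda x)^{0}}{0!} = 1 - e^{-\lambda x}.$$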
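Since $\mathbb{P}(X \le x) = F(x)$ is the CDF of $X$, the PDF is
$$f(x) = \frac{dF(x)}{dx} = \lambda e^{-\lambda x}, \qquad x \ge 0.$$
Note that $f(x)$ is a density, not the probability $\mathbb{P}(X = x)$, which is 0 for a continuous random variable.
Then, $X$ follows the Exponential distribution, denoted by $X \sim \mathrm{Exp}(\lambda)$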
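The differentiation step can be checked numerically: a finite-difference slope of the CDF $1 - e^{-\lambda x}$ should match $\lambda e^{-\lambda x}$. The sketch below uses a central difference; the step size `h`, the test points and the function names are illustrative assumptions.

```python
from math import exp


def exp_cdf(x: float, lam: float) -> float:
    """F(x) = P(X <= x) = 1 - exp(-lam * x) for x >= 0."""
    return 1.0 - exp(-lam * x) if x >= 0 else 0.0


def exp_pdf(x: float, lam: float) -> float:
    """f(x) = lam * exp(-lam * x) for x >= 0."""
    return lam * exp(-lam * x) if x >= 0 else 0.0


lam, h = 5.0, 1e-6
points = (0.1, 0.5, 1.0, 2.0)

# Central difference (F(x + h) - F(x - h)) / (2h) approximates dF/dx.
gaps = [
    abs((exp_cdf(x + h, lam) - exp_cdf(x - h, lam)) / (2 * h) - exp_pdf(x, lam))
    for x in points
]
max_gap = max(gaps)
print(f"largest gap between numerical slope and PDF: {max_gap:.2e}")
```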
Exponential Distribution
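• Notation
$$X \sim \mathrm{Exp}(\lambda)$$
$$f(x; \lambda) = \begin{cases} \lambda e^{-\lambda x}, & \text{if } x \ge 0, \\ 0, & \text{otherwise.} \end{cases}$$
• Expectation and variance
$$\mathbb{E}(X) = \frac{1}{\lambda}, \qquad \mathbb{V}(X) = \frac{1}{\lambda^2}.$$
$$\text{Area} = \int_{0}^{\infty} f(x; \lambda)\,dx = 1$$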
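A quick Monte Carlo check of the expectation and variance is sketched below. It draws samples with `random.expovariate` from the Python standard library; the rate $\lambda = 5$, the sample size and the seed are illustrative choices.

```python
import random
from statistics import fmean, pvariance

random.seed(0)  # fixed seed so the run is reproducible

lam = 5.0        # illustrative rate: 5 messages per hour
n_samples = 200_000

# random.expovariate(lam) draws from Exp(lam), which has mean 1 / lam.
samples = [random.expovariate(lam) for _ in range(n_samples)]

sample_mean = fmean(samples)
sample_var = pvariance(samples)

print(f"sample mean = {sample_mean:.4f} (formula: {1 / lam:.4f})")
print(f"sample var  = {sample_var:.4f} (formula: {1 / lam**2:.4f})")
```

With $\lambda = 5$ messages per hour, the average wait for the first message is $1/5$ of an hour, that is, 12 minutes.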
Memoryless Property
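Let $X$ be exponentially distributed with parameter $\lambda$. Suppose we know $X > x_1$.
What is the probability that $X$ is also greater than some value $x_1 + x_2$, where $x_2 \ge 0$?
$$\mathbb{P}(X > x_1 + x_2 \mid X > x_1) = \frac{\mathbb{P}(X > x_1 + x_2 \text{ and } X > x_1)}{\mathbb{P}(X > x_1)}$$
If $X > x_1 + x_2$, then $X > x_1$. Therefore
$$\mathbb{P}(X > x_1 + x_2 \mid X > x_1) = \frac{\mathbb{P}(X > x_1 + x_2)}{\mathbb{P}(X > x_1)} = \frac{e^{-\lambda (x_1 + x_2)}}{e^{-\lambda x_1}} = e^{-\lambda x_2} = \mathbb{P}(X > x_2)$$
The memoryless property means that the future is independent of the past: the probability of waiting a further $x_2$ does not depend on the time $x_1$ already waited.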
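The memoryless property can be checked both analytically and by simulation. The sketch below uses the survival function $\mathbb{P}(X > x) = e^{-\lambda x}$; the values of `lam`, `x1`, `x2`, the sample size and the seed are illustrative assumptions.

```python
import random
from math import exp, isclose

random.seed(1)  # fixed seed so the run is reproducible

lam, x1, x2 = 5.0, 0.2, 0.3  # illustrative values


def survival(x: float) -> float:
    """P(X > x) for X ~ Exp(lam), with x >= 0."""
    return exp(-lam * x)


# Analytic check: P(X > x1 + x2 | X > x1) versus P(X > x2).
conditional = survival(x1 + x2) / survival(x1)
unconditional = survival(x2)

# Simulation check: keep only the draws that already exceeded x1.
samples = [random.expovariate(lam) for _ in range(200_000)]
survivors = [x for x in samples if x > x1]
simulated = sum(x > x1 + x2 for x in survivors) / len(survivors)

print(f"analytic conditional   = {conditional:.5f}")
print(f"analytic unconditional = {unconditional:.5f}")
print(f"simulated conditional  = {simulated:.5f}")
```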
Why is the Gambler Wrong?
If you flip a fair coin 8 times and do not observe a single head, would you bet on head or tail in your 9th flip?
[Figure: a sequence of nine coin flips, labelled 1st flipping to 9th flipping]
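One way to examine the gambler's reasoning is to enumerate every possible sequence of nine flips. The sketch below assumes the flips are fair and independent, so that all $2^9$ sequences are equally likely; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All 2**9 = 512 equally likely sequences of nine fair, independent flips.
sequences = list(product("HT", repeat=9))

# Condition on the observed event: no head in the first 8 flips.
no_head_first_8 = [s for s in sequences if "H" not in s[:8]]

# Among those sequences, count the ones with a head on the 9th flip.
head_on_9th = [s for s in no_head_first_8 if s[8] == "H"]

p_head_given_history = Fraction(len(head_on_9th), len(no_head_first_8))
print(p_head_given_history)  # prints: 1/2
```

Under these assumptions a head on the 9th flip is no more likely after eight tails than at any other time, so the run of tails gives no reason to prefer either bet.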
[Figure: histogram of the number of students against marks]
Normal/Gaussian Distribution
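• Notation
$$X \sim \mathcal{N}(\mu, \sigma^2)$$
$$f(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right),$$
where $-\infty < x < \infty$.
• Expectation and variance
$$\mathbb{E}(X) = \mu, \qquad \mathbb{V}(X) = \sigma^2.$$
$$\text{Area} = \int_{-\infty}^{\infty} f(x; \mu, \sigma^2)\,dx = 1$$
Standard Normal Distribution
$$\int_{-1}^{1} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{x^2}{2} \right) dx = 0.68269$$
$$\int_{-2}^{2} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{x^2}{2} \right) dx = 0.95450$$
$$\int_{-3}^{3} \frac{1}{\sqrt{2\pi}} \exp\left( -\frac{x^2}{2} \right) dx = 0.9973$$
Bivariate Normal Distribution
$$\text{Volume} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x_1, x_2)\,dx_1\,dx_2 = 1$$
$$f(x_1, x_2; \mu_1, \sigma_1^2, \mu_2, \sigma_2^2, \rho) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2}} \exp\left( -\frac{1}{2(1 - \rho^2)}\, Q \right),$$
where
$$Q = \left( \frac{x_1 - \mu_1}{\sigma_1} \right)^2 + \left( \frac{x_2 - \mu_2}{\sigma_2} \right)^2 - 2\rho \left( \frac{x_1 - \mu_1}{\sigma_1} \right) \left( \frac{x_2 - \mu_2}{\sigma_2} \right)$$
[Figure: surface plot of $f(\boldsymbol{x})$ for $\mu_1 = 0$, $\sigma_1^2 = 10$, $\mu_2 = 0$, $\sigma_2^2 = 10$, $\rho = 0.5$]
Multivariate Normal Distribution
• Notation
$$\boldsymbol{X} \sim \mathcal{N}_d(\boldsymbol{\mu}, \boldsymbol{\Sigma})$$
$$\boldsymbol{X} = \begin{pmatrix} X_1 \\ \vdots \\ X_d \end{pmatrix}, \qquad \boldsymbol{\mu} = \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_d \end{pmatrix}, \qquad \boldsymbol{\Sigma} = \begin{pmatrix} \sigma_{11} & \cdots & \sigma_{1d} \\ \vdots & \ddots & \vdots \\ \sigma_{d1} & \cdots & \sigma_{dd} \end{pmatrix}$$
• PDF
$$f(\boldsymbol{x}; \boldsymbol{\mu}, \boldsymbol{\Sigma}) = (2\pi)^{-\frac{d}{2}}\, |\boldsymbol{\Sigma}|^{-\frac{1}{2}} \exp\left( -\frac{1}{2} (\boldsymbol{x} - \boldsymbol{\mu})^{T} \boldsymbol{\Sigma}^{-1} (\boldsymbol{x} - \boldsymbol{\mu}) \right).$$
• Expectation and variance
$$\mathbb{E}(\boldsymbol{X}) = \boldsymbol{\mu}, \qquad \mathrm{Cov}(\boldsymbol{X}) = \boldsymbol{\Sigma}$$
References
• G. Casella, R. Berger (2002) Statistical Inference. Chapter 3
• K. Murphy (2012) Machine Learning: A Probabilistic Perspective. Chapter 2
Thank You!
bchen@Lincoln.ac.uk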