
Binomial Distribution

[Plot: the binomial distribution of n successes out of N=20 trials with p=q=1/2]

The binomial distribution gives the discrete probability distribution P_p(n|N) of obtaining exactly n successes out of N Bernoulli trials (where the result of each Bernoulli trial is true with probability p and false with probability q=1-p). The binomial distribution is therefore given by

P_p(n|N)=(N; n)p^nq^(N-n)
(1)
=(N!)/(n!(N-n)!)p^n(1-p)^(N-n),
(2)

where (N; n) is a binomial coefficient. The above plot shows the distribution of n successes out of N=20 trials with p=q=1/2.

The binomial distribution is implemented in Mathematica as BinomialDistribution[n, p].
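
For readers working outside Mathematica, the following sketch (my addition, assuming Python with scipy installed; test values arbitrary) evaluates equation (2) directly and checks it against scipy.stats.binom:

```python
# A minimal sketch of equation (2), assuming scipy is available.
from math import comb
from scipy.stats import binom

def binomial_pmf(n, N, p):
    """P_p(n|N) = (N; n) p^n (1-p)^(N-n)."""
    return comb(N, n) * p**n * (1 - p)**(N - n)

N, p = 20, 0.5  # the case plotted above
for n in range(N + 1):
    assert abs(binomial_pmf(n, N, p) - binom.pmf(n, N, p)) < 1e-12
print(binomial_pmf(10, N, p))  # peak value, about 0.1762
```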

The probability of obtaining more successes than the n observed in a binomial distribution is

 P=sum_(k=n+1)^N(N; k)p^k(1-p)^(N-k)=I_p(n+1,N-n),
(3)

where

 I_x(a,b)=(B(x;a,b))/(B(a,b)),
(4)

B(a,b) is the beta function, and B(x;a,b) is the incomplete beta function.
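
Equation (3) is easy to check numerically. The sketch below (an addition, assuming scipy; the values of N, n, and p are arbitrary) uses scipy.special.betainc, which returns the regularized incomplete beta function I_x(a,b) of equation (4):

```python
# Sketch verifying equation (3): the upper-tail sum equals I_p(n+1, N-n).
# scipy's betainc is already regularized: betainc(a, b, x) = B(x;a,b)/B(a,b).
from scipy.special import betainc
from scipy.stats import binom

N, n, p = 20, 12, 0.3
tail = sum(binom.pmf(k, N, p) for k in range(n + 1, N + 1))
assert abs(tail - betainc(n + 1, N - n, p)) < 1e-12
```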

The characteristic function for the binomial distribution is

 phi(t)=(q+pe^(it))^N
(5)

(Papoulis 1984, p. 154). The moment-generating function M for the distribution is

M(t)=" border="0" height="20" width="27">
(6)
=sum_(n=0)^(N)e^(tn)(N; n)p^nq^(N-n)
(7)
=sum_(n=0)^(N)(N; n)(pe^t)^n(1-p)^(N-n)
(8)
=[pe^t+(1-p)]^N
(9)
M^'(t)=N[pe^t+(1-p)]^(N-1)(pe^t)
(10)
M^('')(t)=N(N-1)[pe^t+(1-p)]^(N-2)(pe^t)^2+N[pe^t+(1-p)]^(N-1)(pe^t).
(11)
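
The closed form (9) and the derivatives (10)-(11) can be verified numerically. This sketch (my addition, assuming numpy; the test values are arbitrary) compares the defining sum with [pe^t+(1-p)]^N and evaluates the derivative formulas at t=0:

```python
# Sketch checking equations (6)-(11), assuming numpy is available.
from math import comb
import numpy as np

N, p = 20, 0.3
q = 1 - p

def mgf_sum(t):
    # M(t) = <e^(tn)>, computed as the defining sum (7)
    return sum(np.exp(t * n) * comb(N, n) * p**n * q**(N - n)
               for n in range(N + 1))

def mgf_closed(t):
    # equation (9)
    return (p * np.exp(t) + q)**N

for t in (-0.5, 0.0, 0.5):
    assert abs(mgf_sum(t) - mgf_closed(t)) < 1e-9

# equations (10)-(11) evaluated at t=0 give the first two raw moments
M1 = N * (p + q)**(N - 1) * p                 # M'(0) = Np
M2 = N * (N - 1) * (p + q)**(N - 2) * p**2 + M1  # M''(0)
assert abs(M1 - N * p) < 1e-12
assert abs(M2 - (N * (N - 1) * p**2 + N * p)) < 1e-12
```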

The mean is

mu=M^'(0)
(12)
=N(p+1-p)^(N-1)p
(13)
=Np.
(14)

The moments about 0 are

mu_1^'=mu=Np
(15)
mu_2^'=Np(1-p+Np)
(16)
mu_3^'=Np(1-3p+3Np+2p^2-3Np^2+N^2p^2)
(17)
mu_4^'=Np(1-7p+7Np+12p^2-18Np^2+6N^2p^2-6p^3+11Np^3-6N^2p^3+N^3p^3),
(18)

so the moments about the mean are

mu_2=Np(1-p)=Npq
(19)
mu_3=Np(1-p)(1-2p)
(20)
mu_4=Np(1-p)[3p^2(2-N)+3p(N-2)+1].
(21)

The skewness and kurtosis excess are

gamma_1=(1-2p)/(sqrt(Np(1-p)))
(22)
=(q-p)/(sqrt(Npq))
(23)
gamma_2=(6p^2-6p+1)/(Np(1-p))
(24)
=(1-6pq)/(Npq).
(25)
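
These closed forms are easy to sanity-check against a library. The sketch below (an addition, assuming scipy; scipy.stats.binom.stats returns mean, variance, skewness, and kurtosis excess) compares equations (19) and (22)-(25) with library values:

```python
# Sketch comparing the closed forms (19), (22)-(25) against scipy.stats.binom.
from math import sqrt
from scipy.stats import binom

N, p = 30, 0.2
q = 1 - p
mean, var, skew, kurt = binom.stats(N, p, moments='mvsk')

assert abs(mean - N * p) < 1e-12                          # mu = Np
assert abs(var - N * p * q) < 1e-12                       # mu_2 = Npq
assert abs(skew - (q - p) / sqrt(N * p * q)) < 1e-12      # gamma_1
assert abs(kurt - (1 - 6 * p * q) / (N * p * q)) < 1e-12  # gamma_2
```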

The first cumulant is

 kappa_1=Np,
(26)

and subsequent cumulants are given by the recurrence relation

 kappa_(r+1)=pq(dkappa_r)/(dp).
(27)
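
The recurrence (27) can be run symbolically. This sketch (an addition, assuming sympy) starts from kappa_1=Np and reproduces the known second and third cumulants Npq and Npq(q-p):

```python
# Sketch of the cumulant recurrence (27): kappa_(r+1) = p*q*d(kappa_r)/dp.
import sympy as sp

N, p = sp.symbols('N p', positive=True)
q = 1 - p

kappa1 = N * p                                  # first cumulant, eq. (26)
kappa2 = sp.expand(p * q * sp.diff(kappa1, p))  # should be Npq
kappa3 = sp.expand(p * q * sp.diff(kappa2, p))  # should be Npq(q-p)

assert sp.simplify(kappa2 - N * p * q) == 0
assert sp.simplify(kappa3 - N * p * q * (q - p)) == 0
```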

The mean deviation is given by

 MD=sum_(k=0)^N|k-Np|(N; k)p^k(1-p)^(N-k).
(28)

For the special case p=q=1/2, this is equal to

MD=2^(-N)sum_(k=0)^(N)(N; k)|k-1/2N|
(29)
={(N!!)/(2(N-1)!!) for N odd; ((N-1)!!)/(2(N-2)!!) for N even,
(30)

where N!! is a double factorial. For N=1, 2, ..., the first few values are therefore 1/2, 1/2, 3/4, 3/4, 15/16, 15/16, ... (Sloane's A086116 and A086117). The general case is given by

 MD=2(1-p)^(N-|_Np_|)p^(|_Np_|+1)(|_Np_|+1)(N; |_Np_|+1),
(31)

where |_Np_| denotes the floor of Np.
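
Both the defining sum (28) and the closed form (31) are straightforward to compute, as in this sketch (my addition, assuming scipy; the (N,p) pairs are arbitrary):

```python
# Sketch checking the mean deviation: direct sum (28) vs. closed form (31).
from math import comb, floor
from scipy.stats import binom

def md_direct(N, p):
    return sum(abs(k - N * p) * binom.pmf(k, N, p) for k in range(N + 1))

def md_closed(N, p):
    m = floor(N * p)  # |_Np_|
    return 2 * (1 - p)**(N - m) * p**(m + 1) * (m + 1) * comb(N, m + 1)

for N, p in [(10, 0.3), (20, 0.5), (15, 0.7)]:
    assert abs(md_direct(N, p) - md_closed(N, p)) < 1e-12
```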

Steinhaus (1999, pp. 25-28) considers the expected number of squares S(n,N,s) containing a given number of grains n on a board of s squares after a random distribution of N grains,

 S(n,N,s)=sP_(1/s)(n|N).
(32)

Taking N=s=64 gives the results summarized in the following table.

 n   S(n,64,64)
 0   23.3591
 1   23.7299
 2   11.8650
 3   3.89221
 4   0.942162
 5   0.179459
 6   0.0280109
 7   0.0036840
 8   4.16639×10^(-4)
 9   4.11495×10^(-5)
10   3.59242×10^(-6)
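
The table can be reproduced in a few lines (an addition, assuming scipy):

```python
# Sketch reproducing the Steinhaus table via S(n,N,s) = s * P_(1/s)(n|N).
from scipy.stats import binom

N = s = 64
for n in range(11):
    print(n, s * binom.pmf(n, N, 1 / s))  # n=0 gives about 23.3591
```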

An approximation to the binomial distribution for large N can be obtained by expanding about the value n^~ where P(n) is a maximum, i.e., where dP/dn=0. Since the logarithm function is monotonic, we can instead choose to expand the logarithm. Let n=n^~+eta, then

 ln[P(n)]=ln[P(n^~)]+B_1eta+1/2B_2eta^2+1/(3!)B_3eta^3+...,
(33)

where

 B_k=[(d^kln[P(n)])/(dn^k)]_(n=n^~).
(34)

But we are expanding about the maximum, so, by definition,

 B_1=[(dln[P(n)])/(dn)]_(n=n^~)=0.
(35)

This also means that B_2 is negative, so we can write B_2=-|B_2|. Now, taking the logarithm of (2) gives

 ln[P(n)]=lnN!-lnn!-ln(N-n)!+nlnp+(N-n)lnq.
(36)

For large n and N-n we can use Stirling's approximation

 ln(n!) approx nlnn-n,
(37)

so

(d[ln(n!)])/(dn) approx (lnn+1)-1
(38)
=lnn
(39)
(d[ln(N-n)!])/(dn) approx d/(dn)[(N-n)ln(N-n)-(N-n)]
(40)
=[-ln(N-n)+(N-n)(-1)/(N-n)+1]
(41)
=-ln(N-n),
(42)

and

 (dln[P(n)])/(dn) approx -lnn+ln(N-n)+lnp-lnq.
(43)

To find n^~, set this expression to 0 and solve for n,

 ln((N-n^~)/(n^~)p/q)=0
(44)
 (N-n^~)/(n^~)p/q=1
(45)
 (N-n^~)p=n^~q
(46)
 n^~(q+p)=n^~=Np,
(47)

since p+q=1. We can now find the terms in the expansion

B_2=[(d^2ln[P(n)])/(dn^2)]_(n=n^~)
(48)
=-1/(n^~)-1/(N-n^~)
(49)
=-1/(Npq)
(50)
=-1/(Np(1-p))
(51)
B_3=[(d^3ln[P(n)])/(dn^3)]_(n=n^~)
(52)
=1/(n^~^2)-1/((N-n^~)^2)
(53)
=(q^2-p^2)/(N^2p^2q^2)
(54)
=(1-2p)/(N^2p^2(1-p)^2)
(55)
B_4=[(d^4ln[P(n)])/(dn^4)]_(n=n^~)
(56)
=-2/(n^~^3)-2/((N-n^~)^3)
(57)
=-(2(p^2-pq+q^2))/(N^3p^3q^3)
(58)
=-(2(3p^2-3p+1))/(N^3p^3(1-p)^3).
(59)
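
The coefficients B_2, B_3, and B_4 can be checked symbolically by differentiating the Stirling-approximated log-probability (36). This sketch (an addition, assuming sympy) confirms equations (48)-(59):

```python
# Sketch verifying B_2, B_3, B_4 by symbolic differentiation of the
# Stirling-approximated ln P(n), then substituting n = n~ = Np.
import sympy as sp

n, N, p = sp.symbols('n N p', positive=True)
q = 1 - p
# ln P(n) with ln(k!) ~ k ln k - k applied to each factorial in (36)
lnP = (N * sp.log(N) - N) - (n * sp.log(n) - n) \
    - ((N - n) * sp.log(N - n) - (N - n)) \
    + n * sp.log(p) + (N - n) * sp.log(q)

at_max = {n: N * p}  # expand about the maximum n~ = Np
B2 = sp.diff(lnP, n, 2).subs(at_max)
B3 = sp.diff(lnP, n, 3).subs(at_max)
B4 = sp.diff(lnP, n, 4).subs(at_max)

assert sp.simplify(B2 + 1 / (N * p * q)) == 0
assert sp.simplify(B3 - (q**2 - p**2) / (N**2 * p**2 * q**2)) == 0
assert sp.simplify(B4 + 2 * (p**2 - p * q + q**2) / (N**3 * p**3 * q**3)) == 0
```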
[Plot: the binomial distribution compared with its Gaussian approximation]

Now, treating the distribution as continuous,

 sum_(n=0)^NP(n) approx intP(n)dn=int_(-infty)^inftyP(n^~+eta)deta=1.
(60)

Since each term is of order 1/N∼1/sigma^2 smaller than the previous, we can ignore terms higher than B_2, so

 P(n)=P(n^~)e^(-|B_2|eta^2/2).
(61)

The probability must be normalized, so

 int_(-infty)^inftyP(n^~)e^(-|B_2|eta^2/2)deta=P(n^~)sqrt((2pi)/(|B_2|))=1,
(62)

and

P(n)=sqrt((|B_2|)/(2pi))e^(-|B_2|(n-n^~)^2/2)
(63)
=1/(sqrt(2piNpq))exp[-((n-Np)^2)/(2Npq)].
(64)

Defining sigma^2=Npq,

 P(n)=1/(sigmasqrt(2pi))exp[-((n-n^~)^2)/(2sigma^2)],
(65)

which is a normal distribution. The binomial distribution is therefore approximated by a normal distribution for any fixed p (even if p is small) as N is taken to infinity.
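
The quality of this approximation is easy to see numerically. This sketch (an addition, assuming scipy; N=100 is an arbitrary choice) compares the exact PMF with the Gaussian density of equation (65):

```python
# Sketch comparing the exact binomial PMF with its normal approximation (65).
import numpy as np
from scipy.stats import binom, norm

N, p = 100, 0.5
mu, sigma = N * p, np.sqrt(N * p * (1 - p))
for n in (40, 50, 60):
    exact = binom.pmf(n, N, p)
    approx = norm.pdf(n, loc=mu, scale=sigma)
    print(n, exact, approx)  # agree to well under 1% near the mean
```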

If infty" border="0" height="14" width="40"> and 0" border="0" height="14" width="33"> in such a way that lambda" border="0" height="14" width="46">, then the binomial distribution converges to the Poisson distribution with mean lambda.

Let x and y be independent binomial random variables characterized by parameters n,p and m,p. The conditional probability of x given that x+y=k is

P(x=i|x+y=k)=(P(x=i,x+y=k))/(P(x+y=k))
=(P(x=i,y=k-i))/(P(x+y=k))
=(P(x=i)P(y=k-i))/(P(x+y=k))
=((n; i)p^i(1-p)^(n-i)(m; k-i)p^(k-i)(1-p)^(m-(k-i)))/((n+m; k)p^k(1-p)^(n+m-k))
=((n; i)(m; k-i))/((n+m; k)).
(66)

Note that this is a hypergeometric distribution.
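
The identification can be confirmed numerically. In scipy's parametrization, hypergeom(M, K, k) describes k draws from M objects of which K are marked, so the claim reads hypergeom.pmf(i, n+m, n, k); the sketch below (an addition, assuming scipy; parameter values arbitrary) checks equation (66):

```python
# Sketch of equation (66): conditioned on x+y=k, a binomial x is hypergeometric.
from scipy.stats import binom, hypergeom

n, m, p, k = 12, 8, 0.35, 9
for i in range(max(0, k - m), min(n, k) + 1):
    joint = binom.pmf(i, n, p) * binom.pmf(k - i, m, p)
    cond = joint / binom.pmf(k, n + m, p)  # x+y is Binomial(n+m, p)
    assert abs(cond - hypergeom.pmf(i, n + m, n, k)) < 1e-12
```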
