Tuesday, 21 May 2013

Areas under the Normal Curve

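The curve of any continuous probability distribution or density function is constructed so that the area under the curve bounded by the two ordinates x = x1 and x = x2 equals the probability that the random variable X assumes a value between x = x1 and x = x2. Thus, for the normal curve in Figure 6.6,

$P(x_1 < X < x_2) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{x_1}^{x_2} e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx$

is the area under the curve between x = x1 and x = x2.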

In Figures 6.3, 6.4, and 6.5 we saw how the normal curve depends on the mean and the standard deviation of the distribution under investigation. The area under the curve between any two ordinates must then also depend on the values μ and σ. This is evident in Figure 6.7, where we have shaded regions corresponding to P(x1 < X < x2) for two curves with different means and variances. P(x1 < X < x2), where X is the random variable describing distribution A, is indicated by the darker shaded area. If X is the random variable describing distribution B, then P(x1 < X < x2) is given by the entire shaded region. The two shaded regions differ in size; therefore, the probability associated with each distribution will be different for the two given values of X.

The difficulty encountered in solving integrals of normal density functions necessitates the tabulation of normal curve areas for quick reference. However, it would be impractical to set up a separate table for every possible pair of values μ and σ. Instead, every normal random variable X can be transformed into a normal random variable Z with mean 0 and variance 1 by means of the transformation Z = (X - μ)/σ.

Definition
The distribution of a normal random variable with mean 0 and variance 1 is called
a standard normal distribution.
The original and transformed distributions are illustrated in Figure 6.8. Since
all the values of X falling between x1 and x2 have corresponding z values between
z1 and z2, the area under the X-curve between the ordinates x = x1 and x = x2 in
Figure 6.8 equals the area under the Z-curve between the transformed ordinates
z = z1 and z = z2.
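The equality of the two areas can be checked numerically. The sketch below assumes the usual standardizing transformation z = (x - μ)/σ and uses made-up values for μ, σ, x1, and x2 (they do not come from the text); it relies on `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

# Hypothetical parameters and bounds, chosen only for illustration.
mu, sigma = 50.0, 10.0
x1, x2 = 45.0, 62.0

# Area under the X-curve between x1 and x2.
X = NormalDist(mu, sigma)
area_x = X.cdf(x2) - X.cdf(x1)

# Standardize the bounds with z = (x - mu) / sigma.
z1 = (x1 - mu) / sigma
z2 = (x2 - mu) / sigma

# Area under the standard normal (Z) curve between z1 and z2.
Z = NormalDist(0.0, 1.0)
area_z = Z.cdf(z2) - Z.cdf(z1)

print(f"z1 = {z1}, z2 = {z2}")
print(f"area under X-curve: {area_x:.4f}")
print(f"area under Z-curve: {area_z:.4f}")
```

Both printed areas should agree up to floating-point rounding, which is why a single table for Z is enough.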
We have now reduced the required number of tables of normal-curve areas to
one, that of the standard normal distribution. Table A.3 indicates the area under
the standard normal curve corresponding to P(Z < z) for values of z ranging from
-3.49 to 3.49. To illustrate the use of this table, let us find the probability that Z
is less than 1.74. First, we locate a value of z equal to 1.7 in the left column, then
move across the row to the column under 0.04, where we read 0.9591. Therefore,
P(Z < 1.74) = 0.9591. To find a z value corresponding to a given probability, the
process is reversed. For example, the z value leaving an area of 0.2148 under the
curve to the left of z is seen to be -0.79.
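The two table lookups can be reproduced in software. This is a minimal sketch that uses `statistics.NormalDist` in place of Table A.3; the variable names are illustrative only.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# Forward lookup: area to the left of z = 1.74.
p = Z.cdf(1.74)
print(f"P(Z < 1.74) = {p:.4f}")

# Reverse lookup: the z value that leaves an area of 0.2148 to its left.
z = Z.inv_cdf(0.2148)
print(f"z for area 0.2148 = {z:.2f}")
```

Rounded to the precision of the table, the results should match the values 0.9591 and -0.79 read off above.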

Monday, 20 May 2013

Continuous Uniform Distribution

One of the simplest continuous distributions in all of statistics is the continuous
uniform distribution. This distribution is characterized by a density function
that is "flat," and thus the probability is uniform in a closed interval, say [A, B].
Although applications of the continuous uniform distribution are not as abundant
as they are for other distributions discussed in this chapter, it is appropriate for
the novice to begin this introduction to continuous distributions with the uniform
distribution.
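Uniform Distribution
The density function of the continuous uniform random variable X on the interval [A, B] is
$f(x; A, B) = \frac{1}{B - A}$ for A ≤ x ≤ B, and $f(x; A, B) = 0$ elsewhere.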

It should be emphasized to the reader that the density function forms a rectangle
with base B - A and constant height 1/(B - A). As a result, the uniform distribution
is often called the rectangular distribution. The density function for a uniform
random variable on the interval [1, 3] is shown in Figure 6.1.
Probabilities are simple to calculate for the uniform distribution due to the
simple nature of the density function. However, note that the application of this
distribution is based on the assumption that the probability of falling in an interval
of fixed length within [A, B] is constant.
Example:
Suppose that a large conference room for a certain company can be reserved for no
more than 4 hours. However, the use of the conference room is such that both long
and short conferences occur quite often. In fact, it can be assumed that length X
of a conference has a uniform distribution on the interval [0, 4].
(a) What is the probability density function?
(b) What is the probability that any given conference lasts at least 3 hours?

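Solution: (a) The appropriate density function for the uniformly distributed random variable
X in this situation is f(x) = 1/4 for 0 ≤ x ≤ 4, and f(x) = 0 elsewhere.
(b) $P(X \ge 3) = \int_3^4 \frac{1}{4}\,dx = \frac{1}{4}$.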
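For the conference-room example, the density and the probability in part (b) can be worked out as the height and area of a rectangle. The sketch below is only an illustration; the helper names `uniform_pdf` and `uniform_prob` are made up here, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def uniform_pdf(x, a, b):
    """Density of the continuous uniform distribution on [a, b]."""
    if a <= x <= b:
        return Fraction(1, b - a)
    return Fraction(0)

def uniform_prob(lo, hi, a, b):
    """P(lo <= X <= hi) for X uniform on [a, b]: the area of a rectangle."""
    lo, hi = max(lo, a), min(hi, b)  # clip the interval to the support
    if hi <= lo:
        return Fraction(0)
    return Fraction(hi - lo, b - a)

A, B = 0, 4  # conference length is uniform on [0, 4] hours
print(uniform_pdf(2, A, B))      # height of the rectangle
print(uniform_prob(3, 4, A, B))  # P(X >= 3)
```

Both values come out as 1/4: the height is 1/(B - A), and the interval [3, 4] covers one quarter of the base.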

Sunday, 19 May 2013

Poisson Distribution

Experiments yielding numerical values of a random variable X, the number of
outcomes occurring during a given time interval or in a specified region, are called
Poisson experiments. The given time interval may be of any length, such as
a minute, a day, a week, a month, or even a year. Hence a Poisson experiment
can generate observations for the random variable X representing the number of
telephone calls per hour received by an office, the number of days school is closed
due to snow during the winter, or the number of postponed games due to rain
during a baseball season. The specified region could be a line segment, an area,
a volume, or perhaps a piece of material. In such instances X might represent
the number of field mice per acre, the number of bacteria in a given culture, or
the number of typing errors per page. A Poisson experiment is derived from the
Poisson process and possesses the following properties:
Properties of Poisson Process
1. The number of outcomes occurring in one time interval or specified region is
independent of the number that occurs in any other disjoint time interval or
region of space. In this way we say that the Poisson process has no memory.
2. The probability that a single outcome will occur during a very short time
interval or in a small region is proportional to the length of the time interval
or the size of the region and does not depend on the number of outcomes
occurring outside this time interval or region.
3. The probability that more than one outcome will occur in such a short time
interval or fall in such a small region is negligible.
The number X of outcomes occurring during a Poisson experiment is called a
Poisson random variable, and its probability distribution is called the Poisson
distribution. The mean number of outcomes is computed from μ = λt, where
t is the specific "time," "distance," "area," or "volume" of interest. Since its
probabilities depend on λ, the rate of occurrence of outcomes, we shall denote
them by the symbol p(x; λt). The derivation of the formula for p(x; λt), based on
the three properties of a Poisson process listed above, is beyond the scope of this
book. The following formula is used for computing Poisson probabilities.
Poisson Distribution
The probability distribution of the Poisson random variable X, representing
the number of outcomes occurring in a given time interval or specified region
denoted by t, is
$p(x; \lambda t) = \frac{e^{-\lambda t}(\lambda t)^x}{x!}$,    x = 0, 1, 2, ...,
where λ is the average number of outcomes per unit time, distance, area, or volume, and e = 2.71828....
Table A.2 contains Poisson probability sums
$P(r; \lambda t) = \sum_{x=0}^{r} p(x; \lambda t)$
for a few selected values of λt ranging from 0.1 to 18. We illustrate the use of this
table with the following two examples.

Example:
During a laboratory experiment the average number of radioactive particles passing
through a counter in 1 millisecond is 4. What is the probability that 6 particles
enter the counter in a given millisecond?
Solution: Using the Poisson distribution with x = 6 and λt = 4 and Table A.2, we have
$p(6; 4) = \frac{e^{-4} 4^6}{6!} = \sum_{x=0}^{6} p(x; 4) - \sum_{x=0}^{5} p(x; 4)$
= 0.8893 - 0.7851
= 0.1042.
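The same probability can be computed without a table. The sketch below evaluates the Poisson formula directly and also as a difference of cumulative sums, mirroring the table-based solution; the function names `poisson_pmf` and `poisson_cdf` are illustrative, not from the text.

```python
from math import exp, factorial

def poisson_pmf(x, mean):
    """p(x; mean) = e^(-mean) * mean^x / x!  for x = 0, 1, 2, ..."""
    return exp(-mean) * mean**x / factorial(x)

def poisson_cdf(r, mean):
    """Cumulative sum of p(x; mean) for x = 0, 1, ..., r."""
    return sum(poisson_pmf(x, mean) for x in range(r + 1))

# Radioactive particles: x = 6 particles, mean of 4 per millisecond.
direct = poisson_pmf(6, 4)
from_sums = poisson_cdf(6, 4) - poisson_cdf(5, 4)

print(f"direct formula:      {direct:.4f}")
print(f"difference of sums:  {from_sums:.4f}")
```

Both routes should give 0.1042 to four decimal places.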
Example:
Ten is the average number of oil tankers arriving each day at a certain port city.
The facilities at the port can handle at most 15 tankers per day. What is the
probability that on a given day tankers have to be turned away?
Solution: Let X be the number of tankers arriving each day. Then, using Table A.2, we have
$P(X > 15) = 1 - P(X \le 15) = 1 - \sum_{x=0}^{15} p(x; 10) = 1 - 0.9513 = 0.0487$.
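As a cross-check of the tanker example, the tail probability can be computed as one minus a cumulative Poisson sum. This is a small self-contained sketch; `poisson_cdf` is an illustrative helper, not a library function.

```python
from math import exp, factorial

def poisson_cdf(r, mean):
    """P(X <= r) for a Poisson random variable with the given mean."""
    return sum(exp(-mean) * mean**x / factorial(x) for x in range(r + 1))

mean_arrivals = 10  # average number of tankers per day
capacity = 15       # the port handles at most 15 tankers per day

# Tankers are turned away when more than `capacity` arrive.
p_turned_away = 1 - poisson_cdf(capacity, mean_arrivals)
print(f"P(X > 15) = {p_turned_away:.4f}")
```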
Like the binomial distribution, the Poisson distribution is used for quality control,
quality assurance, and acceptance sampling. In addition, certain important
continuous distributions used in reliability theory and queuing theory depend on
the Poisson process. Some of these distributions are discussed and developed in
Chapter 6.
Theorem
Both the mean and variance of the Poisson distribution p(x; λt) are λt.

The proof of this Theorem is found in Appendix A.26.
In Example 5.20, where λt = 4, we also have σ² = 4 and hence σ = 2. Using
Chebyshev's theorem, we can state that our random variable has a probability of at
least 3/4 of falling in the interval μ ± 2σ = 4 ± (2)(2), or from 0 to 8. Therefore, we
conclude that at least three-fourths of the time the number of radioactive particles
entering the counter will be anywhere from 0 to 8 during a given millisecond.
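Chebyshev's theorem only gives a lower bound, so it can be instructive to compare it with the exact Poisson probability of the same interval. The sketch below does this for the mean of 4 used above; the variable names are illustrative.

```python
from math import exp, factorial, sqrt

mean = 4.0          # lambda * t from the example
sigma = sqrt(mean)  # for a Poisson distribution the variance equals the mean
k = 2               # number of standard deviations

lo, hi = mean - k * sigma, mean + k * sigma
chebyshev_bound = 1 - 1 / k**2

# Exact Poisson probability of the integer counts lying inside [lo, hi].
exact = sum(exp(-mean) * mean**x / factorial(x)
            for x in range(0, int(hi) + 1) if x >= lo)

print(f"interval: {lo} to {hi}")
print(f"Chebyshev lower bound: {chebyshev_bound}")
print(f"exact probability:     {exact:.4f}")
```

The exact probability (roughly 0.98) is well above the guaranteed 3/4, which illustrates how conservative the Chebyshev bound can be.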

The Poisson Distribution as a Limiting Form of the Binomial
It should be evident from the three principles of the Poisson process that the
Poisson distribution relates to the binomial distribution. Although the Poisson
usually finds applications in space and time problems as illustrated by Examples
5.20 and 5.21, it can be viewed as a limiting form of the binomial distribution.
In the case of the binomial, if n is quite large and p is small, the conditions
begin to simulate the continuous space or time region implications of the Poisson
process. The independence among Bernoulli trials in the binomial case is consistent
with property 2 of the Poisson process. Allowing the parameter p to be close to
zero relates to property 3 of the Poisson process. Indeed, if n is large and p is
close to 0, the Poisson distribution can be used, with μ = np, to approximate
binomial probabilities. If p is close to 1, we can still use the Poisson distribution
to approximate binomial probabilities by interchanging what we have defined to
be a success and a failure, thereby changing p to a value close to 0.

Theorem
Let X be a binomial random variable with probability distribution b(x; n, p). When n → ∞, p → 0, and μ = np remains constant, b(x; n, p) → p(x; μ).
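The limiting behaviour can be seen numerically by comparing binomial and Poisson probabilities for a large n and small p. The values n = 1000 and p = 0.002 below are hypothetical, picked only to illustrate the approximation.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    """b(x; n, p): probability of x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, mean):
    """p(x; mean) = e^(-mean) * mean^x / x!"""
    return exp(-mean) * mean**x / factorial(x)

n, p = 1000, 0.002  # hypothetical: n large, p close to 0
mean = n * p        # Poisson mean used for the approximation

for x in range(6):
    print(x, f"{binom_pmf(x, n, p):.5f}", f"{poisson_pmf(x, mean):.5f}")

# Largest pointwise gap between the two distributions over x = 0..10.
max_gap = max(abs(binom_pmf(x, n, p) - poisson_pmf(x, mean))
              for x in range(11))
print(f"largest gap: {max_gap:.6f}")
```

For these values the two columns should agree to about three decimal places; the gap shrinks further as n grows with np held fixed.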

Hypergeometric Distribution

The simplest way to view the distinction between the binomial distribution of Section 5.3 and the hypergeometric distribution lies in the way the sampling is done. The types of applications of the hypergeometric are very similar to those of the binomial distribution. We are interested in computing probabilities for the number of observations that fall into a particular category. But in the case of the binomial, independence among trials is required. As a result, if the binomial is applied to, say, sampling from a lot of items (deck of cards, batch of production items), the sampling must be done with replacement of each item after it is observed. On the other hand, the hypergeometric distribution does not require independence and is based on sampling done without replacement.

Applications for the hypergeometric distribution are found in many areas, with heavy uses in acceptance sampling, electronic testing, and quality assurance. For many of these fields testing is done at the expense of the item being tested. That is, the item is destroyed and hence cannot be replaced in the sample. Thus sampling without replacement is necessary.

A simple example with playing cards will serve as our first illustration. If we wish to find the probability of observing 3 red cards in 5 draws from an ordinary deck of 52 playing cards, the binomial distribution of Section 5.3 does not apply unless each card is replaced and the deck reshuffled before the next drawing is made. To solve the problem of sampling without replacement, let us restate the problem. If 5 cards are drawn at random, we are interested in the probability of selecting 3 red cards from the 26 available and 2 black cards from the 26 black cards available in the deck. There are $\binom{26}{3}$ ways of selecting 3 red cards, and for each of these ways we can choose 2 black cards in $\binom{26}{2}$ ways.
Therefore, the total number of ways to select 3 red and 2 black cards in 5 draws is the product $\binom{26}{3}\binom{26}{2}$. The total number of ways to select any 5 cards from the 52 that are available is $\binom{52}{5}$. Hence the probability of selecting 5 cards without replacement of which 3 are red and 2 are black is given by

$\frac{\binom{26}{3}\binom{26}{2}}{\binom{52}{5}} = 0.3251.$
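The counting argument for the card example translates directly into binomial coefficients. A minimal sketch using `math.comb`; the variable names are illustrative.

```python
from math import comb

# Ways to pick 3 of the 26 red cards and 2 of the 26 black cards.
red_ways = comb(26, 3)
black_ways = comb(26, 2)

# Ways to pick any 5 cards from the full deck of 52.
all_ways = comb(52, 5)

prob = red_ways * black_ways / all_ways
print(red_ways, black_ways, all_ways)
print(f"P(3 red, 2 black) = {prob:.4f}")
```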

In general, we are interested in the probability of selecting x successes from
the k items labeled successes and n - x failures from the N - k items labeled
failures when a random sample of size n is selected from N items. This is known
as a hypergeometric experiment, that is, one that possesses the following two
properties:
1. A random sample of size n is selected without replacement from N items.
2. k of the N items may be classified as successes and N - k are classified as failures.

The number X of successes of a hypergeometric experiment is called a hypergeometric
random variable. Accordingly, the probability distribution of the hypergeometric variable is called the hypergeometric distribution, and its values will be denoted by h(x; N, n, k), since they depend on the number of successes k in the set N from which we select n items.

Hypergeometric Distribution In Acceptance Sampling

As in the case of the binomial distribution, the hypergeometric distribution finds applications in acceptance sampling where lots of material or parts are sampled in order to determine whether or not the entire lot is accepted.

Example 5.11:
A particular part that is used as an injection device is sold in lots of 10. The producer feels that the lot is deemed acceptable if no more than one defective is in the lot. Some lots are sampled and the sampling plan involves random sampling and testing 3 of the parts out of 10. If none of the 3 is defective, the lot is accepted. Comment on the utility of this plan.

Solution: Let us assume that the lot is truly unacceptable (i.e., that 2 out of 10 are
defective). The probability that our sampling plan finds the lot acceptable is
$P(X = 0) = \frac{\binom{2}{0}\binom{8}{3}}{\binom{10}{3}} = 0.467.$
Thus, if the lot is truly unacceptable with 2 defective parts, this sampling plan will allow acceptance roughly 47% of the time. As a result, this plan should be considered faulty.

Let us now generalize in order to find a formula for h(x; N, n, k). The total number of samples of size n chosen from N items is $\binom{N}{n}$. These samples are assumed to be equally likely. There are $\binom{k}{x}$ ways of selecting x successes from the k that are available, and for each of these ways we can choose the n - x failures in $\binom{N-k}{n-x}$ ways. Thus the total number of favorable samples among the $\binom{N}{n}$ possible samples is given by $\binom{k}{x}\binom{N-k}{n-x}$. Hence we have the following definition.

Hypergeometric Distribution
The probability distribution of the hypergeometric random variable X, the number of successes in a random sample of size n selected from N items of which k are labeled success and N - k labeled failure, is
$h(x; N, n, k) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}$,   max{0, n - (N - k)} ≤ x ≤ min{n, k}.
The range of x can be determined by the three binomial coefficients in the definition,
where x and n - x are no more than k and N - k, respectively, and neither of
them can be less than 0. Usually, when both k (the number of successes) and
N - k (the number of failures) are larger than the sample size n, the range of a
hypergeometric random variable will be x = 0, 1, ..., n.
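The definition and its range of x can be sketched as a small function. The names `hypergeom_support` and `hypergeom_pmf` are made up for this illustration; the numeric check reuses the acceptance-sampling lot of 10 parts with 2 defectives and a sample of 3.

```python
from math import comb

def hypergeom_support(N, n, k):
    """Smallest and largest possible number of successes in the sample."""
    return max(0, n - (N - k)), min(n, k)

def hypergeom_pmf(x, N, n, k):
    """h(x; N, n, k): probability of x successes in a sample of n drawn
    without replacement from N items, k of which are successes."""
    lo, hi = hypergeom_support(N, n, k)
    if not lo <= x <= hi:
        return 0.0
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

# Lot of 10 with 2 defectives, sample of 3, no defective found.
p_accept = hypergeom_pmf(0, 10, 3, 2)
print(f"P(X = 0) = {p_accept:.3f}")

# The probabilities over the support should add up to 1.
lo, hi = hypergeom_support(10, 3, 2)
total = sum(hypergeom_pmf(x, 10, 3, 2) for x in range(lo, hi + 1))
print(f"support: {lo}..{hi}, total probability: {total:.6f}")
```

Returning 0.0 outside the support is a design choice of this sketch; it keeps sums over arbitrary ranges of x well defined.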
Example 5.12: Lots of 40 components each are called unacceptable if they contain 3 or more defectives. The procedure for sampling the lot is to select 5 components at random and to reject the lot if a defective is found. What is the probability that exactly 1 defective is found in the sample if there are 3 defectives in the entire lot?
Solution: Using the hypergeometric distribution with n = 5, N = 40, k = 3, and x = 1, we
find the probability of obtaining one defective to be
$h(1; 40, 5, 3) = \frac{\binom{3}{1}\binom{37}{4}}{\binom{40}{5}} = 0.3011.$
Once again this plan is likely not desirable since it detects a bad lot (3 defectives)
only about 30% of the time.
Theorem 5.3: The mean and variance of the hypergeometric distribution h(x; N, n, k) are
$\mu = \frac{nk}{N}$ and $\sigma^2 = \frac{N-n}{N-1} \cdot n \cdot \frac{k}{N}\left(1 - \frac{k}{N}\right)$.
The proof for the mean is shown in Appendix A.25.
Example 5.13: Let us now reinvestigate Example 3.9. The purpose of this example was to illustrate
the notion of a random variable and the corresponding sample space. In
the example, we have a lot of 100 items of which 12 are defective. What is the
probability that in a sample of 10, 3 are defective?
Solution: Using the hypergeometric probability function we have
$h(3; 100, 10, 12) = \frac{\binom{12}{3}\binom{88}{7}}{\binom{100}{10}} \approx 0.08.$
Example 5.14: Find the mean and variance of the random variable of Example 5.12 and then use Chebyshev's theorem to interpret the interval μ ± 2σ.

Solution: Since Example 5.12 was a hypergeometric experiment, with N = 40, n = 5, and
k = 3, then by Theorem 5.3 we have
$\mu = \frac{(5)(3)}{40} = \frac{3}{8} = 0.375$
and
$\sigma^2 = \left(\frac{40-5}{39}\right)(5)\left(\frac{3}{40}\right)\left(1 - \frac{3}{40}\right) = 0.3113.$
Taking the square root of 0.3113, we find that σ = 0.558. Hence the required interval is 0.375 ± (2)(0.558), or from -0.741 to 1.491. Chebyshev's theorem states that the number of defectives obtained when 5 components are selected at random from a lot of 40 components of which 3 are defective has a probability of at least 3/4 of falling between -0.741 and 1.491. That is, at least three-fourths of the time, the 5 components include fewer than 2 defectives.
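The mean, variance, and Chebyshev interval for this example can be recomputed from the formulas for the hypergeometric mean nk/N and variance ((N - n)/(N - 1)) n (k/N)(1 - k/N). The helper name `hypergeom_mean_var` is illustrative only.

```python
from math import sqrt

def hypergeom_mean_var(N, n, k):
    """Mean n*k/N and variance ((N-n)/(N-1)) * n * (k/N) * (1 - k/N)."""
    p = k / N
    mean = n * p
    var = (N - n) / (N - 1) * n * p * (1 - p)
    return mean, var

# Lot of N = 40 components, k = 3 defectives, sample of n = 5.
mean, var = hypergeom_mean_var(40, 5, 3)
sigma = sqrt(var)
lo, hi = mean - 2 * sigma, mean + 2 * sigma

print(f"mean = {mean:.3f}, variance = {var:.4f}, sigma = {sigma:.3f}")
print(f"mean +/- 2 sigma: {lo:.3f} to {hi:.3f}")
```

Because the count of defectives is a non-negative integer, the only values inside this interval are 0 and 1.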

Relationship to the Binomial Distribution
In this chapter we discuss several important discrete distributions that have wide
applicability. Many of these distributions relate nicely to each other. The beginning
student should gain a clear understanding of these relationships. There is an
interesting relationship between the hypergeometric and the binomial distribution.
As one might expect, if n is small compared to N, the nature of the N items changes
very little in each draw. So a binomial distribution can be used to approximate
the hypergeometric distribution when n is small compared to N. In fact, as a rule
of thumb the approximation is good when n/N ≤ 0.05.
Thus the quantity k/N plays the role of the binomial parameter p. As a result, the binomial distribution may be viewed as a large population edition of the hypergeometric distribution. The mean and variance then come from the formulas
$\mu = np = \frac{nk}{N}$ and $\sigma^2 = npq = n \cdot \frac{k}{N}\left(1 - \frac{k}{N}\right)$.
Comparing these formulas with those of Theorem 5.3, we see that the mean is the
same whereas the variance differs by a correction factor of (N - n)/(N - 1), which
is negligible when n is small relative to N.
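A quick numerical comparison shows how close the two distributions are when n/N is small. The lot below (N = 5000 items, k = 1000 successes, sample of n = 10) is hypothetical and chosen only so that n/N is well under 0.05.

```python
from math import comb

def hypergeom_pmf(x, N, n, k):
    """h(x; N, n, k) for sampling without replacement."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    """b(x; n, p) for independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Hypothetical lot, chosen so that n/N = 0.002 is far below 0.05.
N, k, n = 5000, 1000, 10
p = k / N  # plays the role of the binomial parameter

for x in range(5):
    print(x, f"{hypergeom_pmf(x, N, n, k):.4f}", f"{binom_pmf(x, n, p):.4f}")

max_gap = max(abs(hypergeom_pmf(x, N, n, k) - binom_pmf(x, n, p))
              for x in range(n + 1))
correction = (N - n) / (N - 1)  # factor separating the two variances
print(f"largest gap: {max_gap:.5f}, correction factor: {correction:.4f}")
```

With a correction factor this close to 1, the two variances are nearly identical, which matches the rule of thumb above.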
Multivariate Hypergeometric Distribution
If N items can be partitioned into the k cells A1, A2, ..., Ak with a1, a2, ..., ak elements, respectively, then the probability distribution of the random variables X1, X2, ..., Xk, representing the number of elements selected from A1, A2, ..., Ak in a random sample of size n, is
$f(x_1, x_2, \ldots, x_k; a_1, a_2, \ldots, a_k, N, n) = \frac{\binom{a_1}{x_1}\binom{a_2}{x_2}\cdots\binom{a_k}{x_k}}{\binom{N}{n}}$,
with $\sum_{i=1}^{k} x_i = n$ and $\sum_{i=1}^{k} a_i = N$.
Example 5.16: A group of 10 individuals is used for a biological case study. The group contains 3 people with blood type O, 4 with blood type A, and 3 with blood type B. What
is the probability that a random sample of 5 will contain 1 person with blood type O, 2 people with blood type A, and 2 people with blood type B?

Solution: Using the extension of the hypergeometric distribution with x1 = 1, x2 = 2, x3 = 2,
a1 = 3, a2 = 4, a3 = 3, N = 10, and n = 5, we find that the desired probability is
$f(1, 2, 2; 3, 4, 3, 10, 5) = \frac{\binom{3}{1}\binom{4}{2}\binom{3}{2}}{\binom{10}{5}} = \frac{3}{14}.$
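The multivariate formula is a product of binomial coefficients over the cells, divided by the number of possible samples. The sketch below applies it to the blood-type example; the function name `multi_hypergeom_pmf` and its argument order are assumptions made for this illustration.

```python
from fractions import Fraction
from math import comb, prod

def multi_hypergeom_pmf(xs, sizes, n):
    """Probability of drawing xs[i] items from cell i (cell sizes given
    by `sizes`) in a sample of n taken without replacement."""
    if len(xs) != len(sizes):
        raise ValueError("xs and sizes must have the same length")
    if sum(xs) != n:
        raise ValueError("the counts in xs must add up to n")
    N = sum(sizes)
    ways = prod(comb(a, x) for a, x in zip(sizes, xs))
    return Fraction(ways, comb(N, n))

# Blood types O, A, B: cell sizes 3, 4, 3; sample counts 1, 2, 2.
prob = multi_hypergeom_pmf([1, 2, 2], [3, 4, 3], 5)
print(prob, float(prob))
```

Using `Fraction` keeps the result exact, so it can be compared directly with a hand calculation.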