Bangkitkan Semangat: May 2013

Tuesday, 21 May 2013

Areas under the Normal Curve

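The curve of any continuous probability distribution or density function is constructed so that the area under the curve bounded by the two ordinates x = x₁ and x = x₂ equals the probability that the random variable X assumes a value between x = x₁ and x = x₂. Thus, for the normal curve in Figure 6.6,

$$P(x_1 < X < x_2) = \int_{x_1}^{x_2} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2/(2\sigma^2)}\, dx$$

is represented by the area of the shaded region.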

In Figures 6.3, 6.4, and 6.5 we saw how the normal curve is dependent on the mean and the standard deviation of the distribution under investigation. The area under the curve between any two ordinates must then also depend on the values μ and σ. This is evident in Figure 6.7, where we have shaded regions corresponding to P(x₁ < X < x₂) for two curves with different means and variances. The probability P(x₁ < X < x₂), where X is the random variable describing distribution A, is indicated by the darker shaded area. If X is the random variable describing distribution B, then P(x₁ < X < x₂) is given by the entire shaded region. Obviously, the two shaded regions are different in size; therefore, the probability associated with each distribution will be different for the two given values of X.

The difficulty encountered in solving integrals of normal density functions necessitates the tabulation of normal curve areas for quick reference. However, it would be impractical to tabulate areas separately for every pair of values μ and σ. Instead, every observation x of a normal random variable X is transformed to z = (x − μ)/σ, which yields a normal random variable Z with mean 0 and variance 1.

Definition

The distribution of a normal random variable with mean 0 and variance 1 is called

a standard normal distribution.

The original and transformed distributions are illustrated in Figure 6.8. Since all the values of X falling between x₁ and x₂ have corresponding z values between z₁ and z₂, the area under the X-curve between the ordinates x = x₁ and x = x₂ in Figure 6.8 equals the area under the Z-curve between the transformed ordinates z = z₁ and z = z₂.

We have now reduced the required number of tables of normal-curve areas to

one, that of the standard normal distribution. Table A.3 indicates the area under

the standard normal curve corresponding to P(Z < z) for values of z ranging from

−3.49 to 3.49. To illustrate the use of this table, let us find the probability that Z

is less than 1.74. First, we locate a value of z equal to 1.7 in the left column, then

move across the row to the column under 0.04, where we read 0.9591. Therefore,

P(Z < 1.74) = 0.9591. To find a z value corresponding to a given probability, the

process is reversed. For example, the z value leaving an area of 0.2148 under the

curve to the left of z is seen to be −0.79.
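The two table lookups described above can be reproduced numerically. The following is a minimal sketch, assuming Python 3.8 or later, where the standard library's `statistics.NormalDist` stands in for Table A.3; the helper name `normal_area` is an illustrative choice, not something from the text.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0.0, sigma=1.0)

# Forward lookup: area to the left of z = 1.74, i.e. P(Z < 1.74).
area = Z.cdf(1.74)
print(f"P(Z < 1.74) = {area:.4f}")

# Reverse lookup: the z value that leaves an area of 0.2148 to its left.
z = Z.inv_cdf(0.2148)
print(f"z with left area 0.2148 = {z:.2f}")


def normal_area(x1, x2, mu, sigma):
    """P(x1 < X < x2) for a normal X, found by standardizing both endpoints."""
    z1 = (x1 - mu) / sigma
    z2 = (x2 - mu) / sigma
    return Z.cdf(z2) - Z.cdf(z1)
```

Because `normal_area` standardizes its endpoints first, a single standard normal distribution serves every choice of mean and standard deviation, which is the point of reducing the tables to one.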

Monday, 20 May 2013

Continuous Uniform Distribution

One of the simplest continuous distributions in all of statistics is the continuous

uniform distribution. This distribution is characterized by a density function

that is "flat," and thus the probability is uniform in a closed interval, say [A, B].

Although applications of the continuous uniform distribution are not as abundant

as they are for other distributions discussed in this chapter, it is appropriate for

the novice to begin this introduction to continuous distributions with the uniform

distribution.

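Uniform Distribution

The density function of the continuous uniform random variable X on the interval [A, B] is

$$f(x; A, B) = \begin{cases} \dfrac{1}{B-A}, & A \le x \le B, \\ 0, & \text{elsewhere.} \end{cases}$$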

It should be emphasized to the reader that the density function forms a rectangle

with base B − A and constant height 1/(B − A). As a result, the uniform distribution

is often called the rectangular distribution. The density function for a uniform

random variable on the interval [1, 3] is shown in Figure 6.1.

Probabilities are simple to calculate for the uniform distribution due to the

simple nature of the density function. However, note that the application of this

distribution is based on the assumption that the probability of falling in an interval

of fixed length within [A, B] is constant.

Example :

Suppose that a large conference room for a certain company can be reserved for no

more than 4 hours. However, the use of the conference room is such that both long

and short conferences occur quite often. In fact, it can be assumed that length X

of a conference has a uniform distribution on the interval [0, 4].

(a) What is the probability density function?

(b) What is the probability that any given conference lasts at least 3 hours?

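Solution: (a) The appropriate density function for the uniformly distributed random variable X in this situation is

$$f(x) = \begin{cases} \dfrac{1}{4}, & 0 \le x \le 4, \\ 0, & \text{elsewhere.} \end{cases}$$

(b) $P(X \ge 3) = \int_3^4 \frac{1}{4}\, dx = \frac{1}{4}$.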
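For the conference-room example, both parts reduce to the geometry of a rectangle. Here is a small sketch of that calculation; the function names `uniform_pdf` and `uniform_prob` are hypothetical helpers introduced only for illustration.

```python
def uniform_pdf(x, a, b):
    """Density of a uniform random variable on [a, b]: constant height 1/(b - a)."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_prob(lo, hi, a, b):
    """P(lo <= X <= hi) for X uniform on [a, b].

    The probability is the length of the overlap between [lo, hi] and
    [a, b], multiplied by the constant height 1/(b - a).
    """
    left = max(lo, a)
    right = min(hi, b)
    return max(0.0, right - left) / (b - a)


# Part (a): height of the density for conference lengths on [0, 4] hours.
height = uniform_pdf(2.0, 0.0, 4.0)

# Part (b): probability that a conference lasts at least 3 hours.
at_least_three = uniform_prob(3.0, 4.0, 0.0, 4.0)

print(height, at_least_three)
```

Any sub-interval of the same length inside [0, 4] gives the same probability, which is the constant-probability assumption stated above.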

Sunday, 19 May 2013

Poisson Distribution

Experiments yielding numerical values of a random variable X, the number of

outcomes occurring during a given time interval or in a specified region, are called

Poisson experiments. The given time interval may be of any length, such as

a minute, a day, a week, a month, or even a year. Hence a Poisson experiment

can generate observations for the random variable X representing the number of

telephone calls per hour received by an office, the number of days school is closed

due to snow during the winter, or the number of postponed games due to rain

during a baseball season. The specified region could be a line segment, an area,

a volume, or perhaps a piece of material. In such instances X might represent

the number of field mice per acre, the number of bacteria in a given culture, or

the number of typing errors per page. A Poisson experiment is derived from the

Poisson process and possesses the following properties:

Properties of Poisson Process

1. The number of outcomes occurring in one time interval or specified region is

independent of the number that occurs in any other disjoint time interval or

region of space. In this way we say that the Poisson process has no memory.

2. The probability that a single outcome will occur during a very short time

interval or in a small region is proportional to the length of the time interval

or the size of the region and does not depend on the number of outcomes

occurring outside this time interval or region.

3. The probability that more than one outcome will occur in such a short time

interval or fall in such a small region is negligible.

The number X of outcomes occurring during a Poisson experiment is called a Poisson random variable, and its probability distribution is called the Poisson distribution. The mean number of outcomes is computed from μ = λt, where t is the specific "time," "distance," "area," or "volume" of interest. Since its probabilities depend on λ, the rate of occurrence of outcomes, we shall denote them by the symbol p(x; λt). The derivation of the formula for p(x; λt), based on the three properties of a Poisson process listed above, is beyond the scope of this book. The following formula is used for computing Poisson probabilities.

Poisson Distribution

The probability distribution of the Poisson random variable X, representing the number of outcomes occurring in a given time interval or specified region denoted by t, is

$$p(x; \lambda t) = \frac{e^{-\lambda t}(\lambda t)^x}{x!}, \quad x = 0, 1, 2, \ldots,$$

where λ is the average number of outcomes per unit time, distance, area, or volume, and e = 2.71828....

Table A.2 contains Poisson probability sums

$$P(r; \lambda t) = \sum_{x=0}^{r} p(x; \lambda t)$$

for a few selected values of λt ranging from 0.1 to 18. We illustrate the use of this table with the following two examples.

Example 5.20:

During a laboratory experiment the average number of radioactive particles passing

through a counter in 1 millisecond is 4. What is the probability that 6 particles

enter the counter in a given millisecond?

Solution: Using the Poisson distribution with x = 6 and λt = 4 and Table A.2, we have

$$p(6; 4) = \frac{e^{-4}\,4^6}{6!} = \sum_{x=0}^{6} p(x; 4) - \sum_{x=0}^{5} p(x; 4) = 0.8893 - 0.7851 = 0.1042.$$

Example 5.21:

Ten is the average number of oil tankers arriving each day at a certain port city.

The facilities at the port can handle at most 15 tankers per day. What is the

probability that on a given day tankers have to be turned away?

Solution: Let X be the number of tankers arriving each day. Then, using Table A.2, we have

$$P(X > 15) = 1 - P(X \le 15) = 1 - \sum_{x=0}^{15} p(x; 10) = 1 - 0.9513 = 0.0487.$$
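Both examples can be checked without a table by evaluating the Poisson formula directly. The sketch below assumes only the Python standard library; `poisson_pmf` and `poisson_cdf` are illustrative names, with `poisson_cdf` playing the role of the cumulative sums tabulated in Table A.2.

```python
from math import exp, factorial


def poisson_pmf(x, mu):
    """p(x; mu) = e^(-mu) * mu^x / x!, where mu is the mean (lambda * t)."""
    return exp(-mu) * mu**x / factorial(x)


def poisson_cdf(r, mu):
    """Cumulative sum P(X <= r) = sum of p(x; mu) for x = 0, 1, ..., r."""
    return sum(poisson_pmf(x, mu) for x in range(r + 1))


# Radioactive particles: mean 4 per millisecond, probability of exactly 6.
p6 = poisson_pmf(6, 4.0)
# The same value obtained as a difference of two cumulative sums,
# which is how the table is used in the solution.
p6_from_sums = poisson_cdf(6, 4.0) - poisson_cdf(5, 4.0)

# Oil tankers: mean 10 per day, probability that more than 15 arrive.
turned_away = 1.0 - poisson_cdf(15, 10.0)

print(f"p(6; 4)   = {p6:.4f}")
print(f"P(X > 15) = {turned_away:.4f}")
```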

Like the binomial distribution, the Poisson distribution is used for quality control,

quality assurance, and acceptance sampling. In addition, certain important

continuous distributions used in reliability theory and queuing theory depend on

the Poisson process. Some of these distributions are discussed and developed in

Chapter 6.

Theorem

Both the mean and variance of the Poisson distribution p(x; λt) are λt.

The proof of this Theorem is found in Appendix A.26.

In Example 5.20, where λt = 4, we also have σ² = 4 and hence σ = 2. Using

Chebyshev's theorem, we can state that our random variable has a probability of at

least 3/4 of falling in the interval μ ± 2σ = 4 ± (2)(2), or from 0 to 8. Therefore, we

conclude that at least three-fourths of the time the number of radioactive particles

entering the counter will be anywhere from 0 to 8 during a given millisecond.
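Chebyshev's theorem gives only a lower bound, so it can be instructive to compare it with the exact Poisson probability. The following sketch makes that comparison for a mean of 4; it sums over the integer counts strictly inside μ ± 2σ (that is, 1 through 7), which is the conservative reading of the theorem, and all variable names are illustrative.

```python
from math import exp, factorial, sqrt


def poisson_pmf(x, mu):
    """p(x; mu) = e^(-mu) * mu^x / x!"""
    return exp(-mu) * mu**x / factorial(x)


mu = 4.0
sigma = sqrt(mu)  # for a Poisson distribution the variance equals the mean
k = 2             # number of standard deviations on each side of the mean

low, high = mu - k * sigma, mu + k * sigma
chebyshev_bound = 1 - 1 / k**2

# Integer counts strictly inside the interval (low, high).
inside = [x for x in range(0, int(high) + 1) if low < x < high]
exact = sum(poisson_pmf(x, mu) for x in inside)

print(f"interval: {low} to {high}")
print(f"Chebyshev lower bound: {chebyshev_bound}")
print(f"exact probability for counts {inside[0]}..{inside[-1]}: {exact:.4f}")
```

The exact probability is well above 3/4, which illustrates how conservative the Chebyshev bound can be when the distribution is known.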

The Poisson Distribution as a Limiting Form of the Binomial

It should be evident from the three principles of the Poisson process that the

Poisson distribution relates to the binomial distribution. Although the Poisson

usually finds applications in space and time problems as illustrated by Examples

5.20 and 5.21, it can be viewed as a limiting form of the binomial distribution.

In the case of the binomial, if n is quite large and p is small, the conditions

begin to simulate the continuous space or time region implications of the Poisson

process. The independence among Bernoulli trials in the binomial case is consistent

with property 2 of the Poisson process. Allowing the parameter p to be close to

zero relates to property 3 of the Poisson process. Indeed, if n is large and p is

close to 0, the Poisson distribution can be used, with μ = np, to approximate

binomial probabilities. If p is close to 1, we can still use the Poisson distribution

to approximate binomial probabilities by interchanging what we have defined to

be a success and a failure, thereby changing p to a value close to 0.

Theorem

Let X be a binomial random variable with probability distribution b(x; n, p). When n → ∞, p → 0, and μ = np remains constant, b(x; n, p) → p(x; μ).
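The limiting behavior in this theorem can be observed numerically. The sketch below holds μ = np fixed at 2 (an arbitrary value chosen for illustration, as are the three values of n) and records the largest gap between the binomial and Poisson probabilities over x = 0, ..., 10.

```python
from math import comb, exp, factorial


def binomial_pmf(x, n, p):
    """b(x; n, p) = C(n, x) * p^x * (1 - p)^(n - x)"""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, mu):
    """p(x; mu) = e^(-mu) * mu^x / x!"""
    return exp(-mu) * mu**x / factorial(x)


mu = 2.0  # illustrative value; np is held at this constant
gaps = []
for n in (20, 200, 2000):
    p = mu / n  # p shrinks as n grows so that n * p stays equal to mu
    worst = max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, mu))
                for x in range(11))
    gaps.append(worst)
    print(f"n = {n:5d}, p = {p:.4f}, largest gap = {worst:.6f}")
```

The gap shrinks as n grows and p falls, which is the sense in which the Poisson distribution is a limiting form of the binomial.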

Hypergeometric Distribution

The simplest way to view the distinction between the binomial distribution of Section 5.3 and the hypergeometric distribution lies in the way the sampling is done. The types of applications of the hypergeometric are very similar to those of the binomial distribution. We are interested in computing probabilities for the number of observations that fall into a particular category. But in the case of the binomial, independence among trials is required. As a result, if the binomial is applied to, say, sampling from a lot of items (deck of cards, batch of production items), the sampling must be done with replacement of each item after it is observed. On the other hand, the hypergeometric distribution does not require independence and is based on sampling done without replacement.

Applications for the hypergeometric distribution are found in many areas, with heavy use in acceptance sampling, electronic testing, and quality assurance. Obviously, for many of these fields testing is done at the expense of the item being tested. That is, the item is destroyed and hence cannot be replaced in the sample. Thus sampling without replacement is necessary. A simple example with playing cards will serve as our first illustration.

If we wish to find the probability of observing 3 red cards in 5 draws from an ordinary deck of 52 playing cards, the binomial distribution of Section 5.3 does not apply unless each card is replaced and the deck reshuffled before the next drawing is made. To solve the problem of sampling without replacement, let us restate the problem. If 5 cards are drawn at random, we are interested in the probability of selecting 3 red cards from the 26 available and 2 black cards from the 26 black cards available in the deck. There are $\binom{26}{3}$ ways of selecting 3 red cards, and for each of these ways we can choose 2 black cards in $\binom{26}{2}$ ways. Therefore, the total number of ways to select 3 red and 2 black cards in 5 draws is the product $\binom{26}{3}\binom{26}{2}$. The total number of ways to select any 5 cards from the 52 that are available is $\binom{52}{5}$. Hence the probability of selecting 5 cards without replacement of which 3 are red and 2 are black is given by

$$\frac{\binom{26}{3}\binom{26}{2}}{\binom{52}{5}} = \frac{(2600)(325)}{2{,}598{,}960} = 0.3251.$$
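The counting argument for the card example translates directly into code. This is a minimal sketch using `math.comb` from the standard library (Python 3.8 or later); the with-replacement figure is added only for contrast and uses the binomial formula with p = 1/2.

```python
from math import comb

# Without replacement: count the favorable hands and divide by all hands.
red_ways = comb(26, 3)     # ways to choose 3 of the 26 red cards
black_ways = comb(26, 2)   # ways to choose 2 of the 26 black cards
total_ways = comb(52, 5)   # ways to choose any 5 of the 52 cards

without_replacement = red_ways * black_ways / total_ways

# With replacement, each draw is red with probability 26/52 = 1/2,
# so the binomial distribution applies instead.
with_replacement = comb(5, 3) * 0.5**3 * 0.5**2

print(f"without replacement: {without_replacement:.4f}")
print(f"with replacement:    {with_replacement:.4f}")
```

The two answers differ because removing a card changes the composition of the deck for the next draw.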

In general, we are interested in the probability of selecting x successes from

the k items labeled successes and n − x failures from the N − k items labeled

failures when a random sample of size n is selected from N items. This is known

as a hypergeometric experiment, that is, one that possesses the following two

properties:

1. A random sample of size n is selected without replacement from N items.

2. k of the N items may be classified as successes and N − k are classified as failures.

The number X of successes of a hypergeometric experiment is called a hypergeometric random variable. Accordingly, the probability distribution of the hypergeometric variable is called the hypergeometric distribution, and its values will be denoted by h(x; N, n, k), since they depend on the number of successes k in the set N from which we select n items.

Hypergeometric Distribution In Acceptance Sampling

As in the case of the binomial distribution, the hypergeometric distribution finds applications in acceptance sampling where lots of material or parts are sampled in order to determine whether or not the entire lot is accepted.

Example 5.11:

A particular part that is used as an injection device is sold in lots of 10. The producer feels that the lot is deemed acceptable if no more than one defective is in the lot. Some lots are sampled and the sampling plan involves random sampling and testing 3 of the parts out of 10. If none of the 3 is defective, the lot is accepted. Comment on the utility of this plan.

Solution: Let us assume that the lot is truly unacceptable (i.e., that 2 out of 10 are defective). The probability that our sampling plan finds the lot acceptable is

$$P(X = 0) = \frac{\binom{2}{0}\binom{8}{3}}{\binom{10}{3}} = 0.467.$$

Thus, if the lot is truly unacceptable with 2 defective parts, this sampling plan will allow acceptance roughly 47% of the time. As a result, this plan should be considered faulty.

Let us now generalize in order to find a formula for h(x; N, n, k). The total number of samples of size n chosen from N items is $\binom{N}{n}$. These samples are assumed to be equally likely. There are $\binom{k}{x}$ ways of selecting x successes from the k that are available, and for each of these ways we can choose the n − x failures in $\binom{N-k}{n-x}$ ways. Thus the total number of favorable samples among the $\binom{N}{n}$ possible samples is given by $\binom{k}{x}\binom{N-k}{n-x}$. Hence we have the following definition.

Hypergeometric Distribution

The probability distribution of the hypergeometric random variable X, the number of successes in a random sample of size n selected from N items of which k are labeled success and N − k labeled failure, is

$$h(x; N, n, k) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}, \quad \max\{0,\, n - (N - k)\} \le x \le \min\{n,\, k\}.$$

The range of x can be determined by the three binomial coefficients in the definition, where x and n − x are no more than k and N − k, respectively, and neither can be less than 0. Usually, when both k (the number of successes) and N − k (the number of failures) are larger than the sample size n, the range of a hypergeometric random variable will be x = 0, 1, ..., n.
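The definition and its range restriction can be sketched as a short function. The name `hypergeom_pmf` is an illustrative choice; the function is then applied to the sampling plan of Example 5.11 (lots of 10, samples of 3, accept when no defective is found) for several hypothetical numbers of defectives in the lot.

```python
from math import comb


def hypergeom_pmf(x, N, n, k):
    """h(x; N, n, k): probability of x successes in a sample of size n drawn
    without replacement from N items, k of which are successes."""
    lo = max(0, n - (N - k))  # cannot draw more failures than exist
    hi = min(n, k)            # cannot draw more successes than exist
    if x < lo or x > hi:
        return 0.0
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)


# Acceptance probability P(X = 0) as the number of defectives in the lot varies.
accept = {d: hypergeom_pmf(0, 10, 3, d) for d in range(5)}
for d, prob in accept.items():
    print(f"{d} defective(s) in lot: accept with probability {prob:.3f}")
```

Tabulating the acceptance probability this way makes it easy to see how slowly the plan reacts as the lot gets worse.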

Example 5.12:

Lots of 40 components each are called unacceptable if they contain 3 or more defectives. The procedure for sampling the lot is to select 5 components at random and to reject the lot if a defective is found. What is the probability that exactly 1 defective is found in the sample if there are 3 defectives in the entire lot?

Solution: Using the hypergeometric distribution with n = 5, N = 40, k = 3, and x = 1, we find the probability of obtaining one defective to be

$$h(1; 40, 5, 3) = \frac{\binom{3}{1}\binom{37}{4}}{\binom{40}{5}} = 0.3011.$$

Once again this plan is likely not desirable since it detects a bad lot (3 defectives)

only about 30% of the time.

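Theorem 5.3: The mean and variance of the hypergeometric distribution h(x; N, n, k) are

$$\mu = \frac{nk}{N} \quad \text{and} \quad \sigma^2 = \frac{N-n}{N-1} \cdot n \cdot \frac{k}{N}\left(1 - \frac{k}{N}\right).$$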

The proof for the mean is shown in Appendix A.25.

Example 5.13:

Let us now reinvestigate Example 3.9. The purpose of this example was to illustrate

the notion of a random variable and the corresponding sample space. In

the example, we have a lot of 100 items of which 12 are defective. What is the

probability that in a sample of 10, 3 are defective?

Solution: Using the hypergeometric probability function, we have

$$h(3; 100, 10, 12) = \frac{\binom{12}{3}\binom{88}{7}}{\binom{100}{10}} \approx 0.08.$$

Example 5.14:

Find the mean and variance of the random variable of Example 5.12 and then use Chebyshev's theorem to interpret the interval μ ± 2σ.

Solution: Since Example 5.12 was a hypergeometric experiment with N = 40, n = 5, and k = 3, by Theorem 5.3 we have

$$\mu = \frac{(5)(3)}{40} = \frac{3}{8} = 0.375$$

and

$$\sigma^2 = \left(\frac{40-5}{39}\right)(5)\left(\frac{3}{40}\right)\left(1 - \frac{3}{40}\right) = 0.3113.$$

Taking the square root of 0.3113, we find that σ = 0.558. Hence the required interval is 0.375 ± (2)(0.558), or from −0.741 to 1.491. Chebyshev's theorem states that the number of defectives obtained when 5 components are selected at random from a lot of 40 components of which 3 are defective has a probability of at least 3/4 of falling between −0.741 and 1.491. That is, at least three-fourths of the time, the 5 components include fewer than 2 defectives.
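The figures in this example can be cross-checked by building the full distribution for N = 40, n = 5, k = 3 and comparing moments computed directly from it with the closed-form expressions for the hypergeometric mean and variance. This is a sketch with illustrative variable names, not a prescribed method.

```python
from math import comb, sqrt

N, n, k = 40, 5, 3

# Full probability distribution over the admissible values of x.
support = range(max(0, n - (N - k)), min(n, k) + 1)
pmf = {x: comb(k, x) * comb(N - k, n - x) / comb(N, n) for x in support}

# Moments computed directly from the distribution.
mean_direct = sum(x * p for x, p in pmf.items())
var_direct = sum((x - mean_direct) ** 2 * p for x, p in pmf.items())

# Moments from the closed-form expressions.
mean_formula = n * k / N
var_formula = (N - n) / (N - 1) * n * (k / N) * (1 - k / N)

# Chebyshev interval mu +/- 2 sigma and the exact probability inside it.
sigma = sqrt(var_formula)
low, high = mean_formula - 2 * sigma, mean_formula + 2 * sigma
inside = sum(p for x, p in pmf.items() if low < x < high)

print(f"h(1; 40, 5, 3) = {pmf[1]:.4f}")
print(f"mean = {mean_formula:.3f}, variance = {var_formula:.4f}")
print(f"interval = ({low:.3f}, {high:.3f}), exact probability = {inside:.4f}")
```

Only x = 0 and x = 1 fall inside the interval, so the exact probability is P(X = 0) + P(X = 1), which comfortably exceeds the Chebyshev bound of 3/4.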

Relationship to the Binomial Distribution

In this chapter we discuss several important discrete distributions that have wide applicability. Many of these distributions relate nicely to each other. The beginning student should gain a clear understanding of these relationships. There is an interesting relationship between the hypergeometric and the binomial distribution. As one might expect, if n is small compared to N, the nature of the N items changes very little in each draw. So a binomial distribution can be used to approximate the hypergeometric distribution when n is small compared to N. In fact, as a rule of thumb the approximation is good when n/N ≤ 0.05.

Thus the quantity k/N plays the role of the binomial parameter p. As a result, the binomial distribution may be viewed as a large-population version of the hypergeometric distribution. The mean and variance then come from the formulas

$$\mu = np = \frac{nk}{N} \quad \text{and} \quad \sigma^2 = npq = n \cdot \frac{k}{N}\left(1 - \frac{k}{N}\right).$$

Comparing these formulas with those of Theorem 5.3, we see that the mean is the

same whereas the variance differs by a correction factor of (N — n)/(N — 1), which

is negligible when n is small relative to N.
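A quick numerical comparison shows how close the two distributions are when n/N is small. The values below (N = 5000, k = 1000, n = 10, x = 3) are hypothetical numbers chosen only so that n/N falls well under 0.05; the function names are likewise illustrative.

```python
from math import comb


def hypergeom_pmf(x, N, n, k):
    """Exact probability when sampling without replacement."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)


def binomial_pmf(x, n, p):
    """Binomial probability, used here as the approximation with p = k/N."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Hypothetical population and sample sizes for illustration.
N, k, n, x = 5000, 1000, 10, 3

ratio = n / N
exact = hypergeom_pmf(x, N, n, k)
approx = binomial_pmf(x, n, k / N)
correction = (N - n) / (N - 1)  # factor by which the two variances differ

print(f"n/N = {ratio}")
print(f"hypergeometric = {exact:.4f}, binomial = {approx:.4f}")
print(f"variance correction factor = {correction:.4f}")
```

With the correction factor this close to 1, the binomial variance and the hypergeometric variance are nearly indistinguishable.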

Multivariate Hypergeometric Distribution

If N items can be partitioned into the k cells A₁, A₂, ..., A_k with a₁, a₂, ..., a_k elements, respectively, then the probability distribution of the random variables X₁, X₂, ..., X_k, representing the number of elements selected from A₁, A₂, ..., A_k in a random sample of size n, is

$$f(x_1, x_2, \ldots, x_k; a_1, a_2, \ldots, a_k, N, n) = \frac{\binom{a_1}{x_1}\binom{a_2}{x_2}\cdots\binom{a_k}{x_k}}{\binom{N}{n}},$$

with $\sum_{i=1}^{k} x_i = n$ and $\sum_{i=1}^{k} a_i = N$.

Example 5.16:

A group of 10 individuals is used for a biological case study. The group contains 3 people with blood type O, 4 with blood type A, and 3 with blood type B. What is the probability that a random sample of 5 will contain 1 person with blood type O, 2 people with blood type A, and 2 people with blood type B?

Solution: Using the extension of the hypergeometric distribution with x₁ = 1, x₂ = 2, x₃ = 2, a₁ = 3, a₂ = 4, a₃ = 3, N = 10, and n = 5, we find that the desired probability is

$$f(1, 2, 2; 3, 4, 3, 10, 5) = \frac{\binom{3}{1}\binom{4}{2}\binom{3}{2}}{\binom{10}{5}} = \frac{3}{14}.$$
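The multivariate formula is a product of binomial coefficients divided by the number of possible samples, so it can be sketched in a few lines. The function name `multivariate_hypergeom_pmf` is hypothetical; N and n are taken as the sums of the cell sizes and the requested counts, as the definition requires.

```python
from math import comb, prod


def multivariate_hypergeom_pmf(xs, sizes):
    """Probability of drawing xs[i] items from a cell of sizes[i] items,
    for every cell i, when sampling without replacement."""
    if len(xs) != len(sizes):
        raise ValueError("xs and sizes must have the same length")
    N = sum(sizes)  # total number of items
    n = sum(xs)     # sample size
    favorable = prod(comb(a, x) for a, x in zip(sizes, xs))
    return favorable / comb(N, n)


# Blood types O, A, B with 3, 4, 3 people; sample of 5 containing 1, 2, 2.
p = multivariate_hypergeom_pmf([1, 2, 2], [3, 4, 3])
print(f"f(1, 2, 2; 3, 4, 3, 10, 5) = {p:.4f}")
```

With two cells the same function reduces to the ordinary hypergeometric distribution, for example `multivariate_hypergeom_pmf([3, 2], [26, 26])` for the card illustration.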
