Mathematics Master Year 2013-2014 Royal University of Phnom Penh

EXPERIMENTS IN PROBABILITY

PIERRE ARNOUX

1. Introduction

This is NOT a lecture; it should be compared to a laboratory session, or practical work, in the physical sciences. The goal is that the students will build a large set of random sequences, and see by themselves that, while these sequences display random variations and are typically very different, some elements, depending on the nature of the sequences, seem to be asymptotically constant. These asymptotic properties can indeed be proved, under some hypotheses, and it is the aim of a probability course to give the concepts and techniques necessary to state and prove these theorems. The goal of this short practical session is much more restricted: it is to give some intuition and feeling for the behavior of random sequences, to show experimentally what kind of theorem we can expect to prove and what the practical meaning of such theorems is, and, at the end of the session, to show experimentally that some conditions (existence of the mean value and the standard deviation) must be satisfied for these theorems to hold. The present text is for the organizer of the session, not for students, since it does not contain any figure.

2. An ordered sequence of experiments

As in any practical session, we start with something that is given (after all, in a chemistry lab, if somebody tells you that the bottle contains hydrochloric acid, or silver nitrate, you believe him...). Here, what is given will be a random variable X. Up to and including Subsection 2.9, we will only use 4 types of random variables:
• a fair coin, that is, a random variable which takes only two values, 0 and 1, each with probability 1/2. We will call this random variable B (for Bernoulli).
• it will be useful to consider the related variable 2B − 1, which takes the values 1 and −1, each with probability 1/2. It is essentially the same as before, but it has expectation 0 and standard deviation 1, which makes all computations simpler. We will call this random variable R (for random walk).
• a biased coin, that is, a random variable which takes only two values, 0 with probability 1 − α and 1 with probability α, where α is a fixed number in [0, 1]. We will call this random variable Bα (for Bernoulli of parameter α). Of course, B above is just B_{1/2}.
• a random number taken uniformly in the interval [0, 1], that is, a random variable which takes any value in [0, 1] in such a way that the probability that the value falls in a given subinterval is equal to the length of this subinterval. We will call this random variable U.

Date: April 2, 2013.

Any reasonable system provides these random variables; as we will show at the end, it is in fact enough to have the last one, and any real random variable can be built from the uniform variable on [0, 1]. It is however much simpler to start working with the variables B, R and Bα.

2.1. Building a random variable.

I will be using Mathematica here, because I happen to have it on my computer, but any reasonable system can do the same. We must just check that it is possible to make at least one million random draws in a second, to get reasonable pictures. First define:

B := RandomInteger[1]
R := 2 B - 1

This gives the first two variables we need. Test them to check that we do get two values with apparently equal probabilities, and that other students do not get exactly the same values (otherwise, it means that the random variable, in fact a pseudo-random variable, always starts with the same seed, which is a bad thing). The biased coin is slightly more tricky in Mathematica; do not forget to give it a different name, like Ba, to avoid confusion in the computer program:

Ba[a_] := RandomChoice[{1 - a, a} -> {0, 1}]

The uniform variable is easy:

U := RandomReal[1]

We now have the 4 variables we need. Let us get to work. All of our work will be with only one variable X, and we will start each time by defining it as X := B or another of the 4 basic variables.

2.2. Building a list of random draws.

If we only make one draw, we can see nothing: probabilities work with large numbers. We need to make a list of draws, and this is easy in Mathematica:

X := B
Table[X, {i, 1, 10}]

We can try this a few times, and see that we come up each time with a different list, which we will denote by (X_1, . . . , X_n).

2.3. Building the sum of the list: a discrete staircase.

It is however not very convenient to examine the list in this form; an idea is to consider the sums of the first terms, that is, the sequence (S_i) defined by S_i = X_1 + X_2 + ... + X_i.
Fortunately, Mathematica has a function that does exactly that: Accumulate. Hence we define the sum function by:

S[n_] := Accumulate[Table[X, {i, 1, n}]]

We can now look at the sequence (S_i); if we started with the Bernoulli sequence, this sequence is obviously non-decreasing, but apart from that, it is difficult to get information from these raw numbers. It is better to make a graphic representation:

ListPlot[S[20]]
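For readers following along without Mathematica, the same four variables and the staircase S can be sketched in Python. This is an equivalent of our definitions, not part of the original session; the fixed seed is only there to make repeated runs identical, and should be dropped for real experiments.

```python
# A Python sketch of the four basic variables and the staircase S,
# mirroring the Mathematica definitions B, R, Ba, U and S[n] above.
import random
from itertools import accumulate

random.seed(0)  # reproducibility only; remove for genuine random draws

def B():
    """Fair coin: 0 or 1, each with probability 1/2."""
    return random.randint(0, 1)

def R():
    """Centered coin 2B - 1: values -1 and +1, mean 0, standard deviation 1."""
    return 2 * B() - 1

def Ba(a):
    """Biased coin: 1 with probability a, 0 with probability 1 - a."""
    return 1 if random.random() < a else 0

def U():
    """Uniform variable on [0, 1)."""
    return random.random()

def S(n, X=B):
    """The partial sums S_1, ..., S_n of n independent draws of X."""
    return list(accumulate(X() for _ in range(n)))

walk = S(20)
print(walk)  # a non-decreasing integer staircase
```

Plotting `walk` with any plotting library gives the same staircase picture as ListPlot.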

2.4. Representing the sum: a first remark: the drift toward the mean value.

We can repeat this operation several times, with a parameter around 20. We see that we get a different shape every time, and the last point is not always the same. It is however interesting to increase the number:

ListPlot[S[100]]
ListPlot[S[1000]]
ListPlot[S[10000]]

What can we see? For large numbers, the plot becomes very close to a line, and the endpoint is very nearly the same: there is a drift in the direction of the mean value. You can try it for another random variable:

X := Ba[0.3]
ListPlot[S[10000]]
X := U
ListPlot[S[10000]]

We see the same drift in the direction of the mean value.

2.5. Representing the sum in the case of mean value 0: a discrete brownian motion, or the drunken man's walk.

We always see this drift in the direction of the mean value. To get more detailed information, it would be nice to subtract this mean value, that is, to consider the sequence S_n − nE(X); an easier way is to consider a random variable with mean value 0, like R.

X := R
ListPlot[S[10000]]

We now get a quite wild picture, which is called a random walk, or a discrete brownian motion (or a drunken man's walk). Each time we try this, we get a completely different picture. Why is it so completely changed from the previous one, which looked like a line? The answer is in the scale of the vertical axis! In the previous plot, the scale went up to 3000; now it is usually less than 200: we have made a large zoom, which magnifies the irregularities.

2.6. Representing the sum in the case of mean value 0: a second remark: the enclosing parabola and the standard deviation.

We will get a better view if we plot not one sequence, but many of them; the following command makes a list of 100 sequences S_n of length 1000 (we recommend playing with the numbers, depending on the power of your computer: larger numbers take longer to process, but give better pictures).
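The drift just described can also be checked numerically rather than visually; a seeded Python sketch (our addition, not in the original session). The tolerances are deliberately generous: each endpoint fluctuates around n times the mean value with standard deviation of order √n, about 50 here.

```python
# Check the drift: for n draws, the endpoint S_n is close to n times the
# mean value (1/2 for B and U, 0.3 for Ba[0.3]).
import random

random.seed(1)
n = 10_000

end_B = sum(random.randint(0, 1) for _ in range(n))                # draws of B
end_Ba = sum(1 if random.random() < 0.3 else 0 for _ in range(n))  # draws of Ba[0.3]
end_U = sum(random.random() for _ in range(n))                     # draws of U

print(end_B / n, end_Ba / n, end_U / n)  # close to 0.5, 0.3 and 0.5
```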
ListPlot[Table[S[1000], {i, 1, 100}]]

We now see that, under the apparent irregularity, some shape seems to be appearing: the set of all random walks seems to be included in a kind of rough parabola, and the largest possible excursion at time n is not n (which would be the case for the constant sequence 1; but this sequence, for n = 1000, has such a small probability that it never appears), but of the order of √n. Indeed, the following commands compute the sequences ±2√n and ±3√n and display them on the same graph as all the random walks: we see that few walks overshoot 2√n, and, except at the beginning, it is very exceptional for a walk to overshoot 3√n.

L = {Table[2 Sqrt[i], {i, 1, 1000}], Table[3 Sqrt[i], {i, 1, 1000}],
     Table[-2 Sqrt[i], {i, 1, 1000}], Table[-3 Sqrt[i], {i, 1, 1000}]};
ListPlot[Join[L, Table[S[1000], {i, 1, 100}]]]
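The same envelope can be checked without a plot: simulate many centered walks and count how many endpoints escape the band ±3√n. This is our own seeded sketch; the number of walks (200) is an arbitrary choice.

```python
# Fraction of centered random walks (steps +/-1) whose endpoint S_n
# stays inside the band +/- 3*sqrt(n).
import math
import random

random.seed(2)
n, walks = 1000, 200
band = 3 * math.sqrt(n)

inside = sum(
    1 for _ in range(walks)
    if abs(sum(2 * random.randint(0, 1) - 1 for _ in range(n))) <= band
)
frac_inside = inside / walks
print(frac_inside)  # very close to 1: overshooting 3*sqrt(n) is exceptional
```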

Here again, we recommend playing with the numbers, depending on the power of the computer. The mathematical comment here is that the standard deviation can be proved to increase as √n (it is in fact an application of the Pythagorean theorem: these random variables are pairwise orthogonal and of the same length), hence we cannot have many excursions further than a multiple of √n (this is a form of the weak law of large numbers).

2.7. Taking averages: convergence to the mean value and the strong law of large numbers.

To see the drift in the direction of the expectation more exactly, one can plot the averages S_i/i. For any list L, define the operation that replaces L_i by L_i/i:

A[L_] := Table[L[[i]]/i, {i, 1, Length[L]}]

And plot the resulting sequence, using the Bernoulli variable B:

X := B
ListPlot[A[S[10000]]]

Be careful when you observe the result: look at the scale on the vertical axis! It might be better to fix it:

ListPlot[A[S[10000]], PlotRange -> {0, 1}]

And it is interesting to play with the other variables too:

X := Ba[0.3];
ListPlot[A[S[10000]], PlotRange -> {0, 1}]

In each case, we can check that the sequence always converges to the mean value. This is the content of the strong law of large numbers: although there exist sequences that do not converge (for example, the sequence of a coin always falling on heads exists in theory), the probability of the set of such sequences is 0: they never appear in practice. But of course, it only works for large numbers: if you wait long enough, you will see a coin falling 10 consecutive times on heads; the probability is 1/1024 for a fair coin, which is not so small.

2.8. The good distance to the average: dividing by √n.

How fast is the convergence? From what we saw above, we can suspect the answer: the deviation S_n − nE(X) should be of order √n.
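The convergence of the averages S_i/i can be checked numerically as well. A seeded Python sketch (our addition; the helper `averages` plays the role of the Mathematica function A above):

```python
# Running averages S_i / i of a sequence of draws: the strong law of large
# numbers says they converge to the mean value of the variable.
import random

random.seed(3)

def averages(draws):
    """Return the running averages S_i / i of an iterable of draws."""
    total, out = 0.0, []
    for i, x in enumerate(draws, start=1):
        total += x
        out.append(total / i)
    return out

n = 100_000
avg_B = averages(random.randint(0, 1) for _ in range(n))                  # mean 1/2
avg_Ba = averages((1 if random.random() < 0.3 else 0) for _ in range(n))  # mean 0.3
print(avg_B[-1], avg_Ba[-1])  # close to 0.5 and 0.3
```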
To check it more precisely, we can define a new average for a random variable S_n with mean value 0, by considering S_n/√n instead of S_n/n:

AS[L_] := Table[L[[i]]/Sqrt[i], {i, 1, Length[L]}]

We now represent this new sequence, taking care to use the random variable R, which has mean value 0:

X := R;
ListPlot[Table[AS[S[1000]], {i, 1, 100}]]

And sure enough, we see that all these sequences are typically non-zero, but they are bounded, between −3 and 3 in that case.

2.9. Distribution around the average: the histogram and the bell curve.

But what is the distribution in this interval? If we draw 1000 times, we know that we will probably get a number less than 100 in absolute value, but what is the probability that we get something between 20 and 30? Is this number well-defined? Well, an idea might be to look at the value we get at the end of a draw (we divide by 2, because it is not difficult to check that S_n has the same parity as n):

X := R;
S[1000][[1000]]/2
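Both claims about the scale √n — the normalized values S_n/√n stay bounded, and their spread is of order 1 — can be checked together in a seeded Python sketch (our addition; for the centered coin R the endpoint S_n has standard deviation exactly √n, so S_n/√n is approximately standard Gaussian).

```python
# Normalized endpoints S_n / sqrt(n) for the centered coin R: typically
# non-zero, essentially never above 6 in absolute value, and with
# standard deviation close to 1.
import math
import random
import statistics

random.seed(4)
n, trials = 1000, 500
ratios = [
    sum(2 * random.randint(0, 1) - 1 for _ in range(n)) / math.sqrt(n)
    for _ in range(trials)
]
spread = statistics.pstdev(ratios)
print(max(abs(r) for r in ratios), spread)
```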

We make a list of such values, for multiple draws (do not forget the ; at the end, or you will get a very large output!):

X := R;
L = Table[S[1000][[1000]]/2, {i, 1, 1000}];

And now, count how many times each integer occurs in the sequence, and represent the histogram:

ListPlot[Table[Count[L, i], {i, -60, 60}]]

You will see that the histogram has a rather well-defined shape: it tends to the famous bell curve, or Gaussian curve, given up to a change of coordinates by the Gaussian function e^{−x²}. The convergence gets better and better when you test S_n for a very large n, and when you make a very large number of tests. What we have seen here, in an experimental way, is a very important theorem, the famous Central Limit Theorem. Of course, purists will say that we just rediscovered in a complicated way the binomial distribution, and remarked that, for large n, it converges to the Gaussian, which is true. But the interest here is that we could replace the Bernoulli random variable by any other and make the same remark (but you have to be careful if you take real variables: it is a bit more complicated to draw the histogram for real sequences, since you have to cut the axis into intervals and count the number of points in each interval).

2.10. A counterexample: the Cauchy distribution.

Well, can we really take any random variable? Here is a very simple one with strange results. Consider the Cauchy variable, with density 1/(π(1 + x²)); it can be obtained from the uniform variable as tan(π(U − 1/2)). One can make the following:

cauchy := Tan[Pi (U - 1/2)]
X := cauchy
ListPlot[S[1000]]
ListPlot[A[S[1000]]]

Now we get something very wild: the random walk S has very large jumps, and pieces which look more continuous; but even when we take averages, we see no convergence (by symmetry, if there were a convergence, it would be to 0); the value of A can be as large as 6 for large indices, and successive draws are completely different.
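This wildness can be quantified in a seeded Python sketch (our addition): Cauchy draws contain enormous outliers, and while the mean value does not exist, the median of the Cauchy distribution is 0, and the empirical median does converge — a useful contrast with the failing averages.

```python
# The Cauchy variable tan(pi*(U - 1/2)): huge outliers appear among the
# draws, but the empirical median still settles near 0.
import math
import random
import statistics

random.seed(5)

def cauchy():
    """Standard Cauchy variable built from the uniform variable."""
    return math.tan(math.pi * (random.random() - 0.5))

draws = [cauchy() for _ in range(100_001)]
biggest = max(abs(x) for x in draws)
med = statistics.median(draws)
print(biggest, med)  # biggest is huge, med is close to 0
```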
If we plot several draws together, we get a picture with little information, where Mathematica will cut out part of the data:

ListPlot[Table[A[S[1000]], {i, 1, 100}]]

If we want to see all the data, we need to specify the plot range; in that case, we get a rather different picture each time (the hyperbolas come from the fact that, when X takes a very large value M, A_n essentially behaves as M/n until the next large value):

ListPlot[Table[A[S[1000]], {i, 1, 100}], PlotRange -> All]

If we try to compute the mean value of this Cauchy variable, we get the integral ∫_{−∞}^{+∞} x/(1 + x²) dx; by symmetry, this integral is obviously 0, except that... it does not exist! It is easy to check that there is an explicit primitive, (1/2) log(1 + x²), which has no finite limit at infinity. So the mean value is not defined, and we can see here the practical consequences of this fact.

2.11. Practical realization of a given random variable using an inverse of the repartition function.

It is in fact easy to build an arbitrary random variable with a given repartition function. If F is a bijection from R to (0, 1), as is the case for the Cauchy distribution, define G = F^{−1} and consider the variable G(U), where U is uniform on [0, 1]; one easily proves that G(U) has F as repartition function. It is just slightly more complicated when F has jumps (corresponding to atomic parts of the measure, as for any discrete random variable) and constant parts (corresponding to parts of R of measure 0). Let G be the map

G : (0, 1) → R,  x ↦ sup{y ∈ R | F(y) < x}.

One can prove that G(U) has F as repartition function. There are other ways to simulate a random variable, like the rejection method, but it seems better to reserve them for further study.

2.12. Some other examples.

To end this laboratory session, it can be interesting to test other random variables, for example random variables which have a mean value but no standard deviation; some Pareto variables are of this type, and are easy to program. These Pareto variables depend on a coefficient β > 0 (in the general presentation, there is another coefficient x₀, which can be set to 1 by a suitable change of variable). They have density β/x^{β+1} for x ≥ 1, and repartition function F(x) = 1 − x^{−β}; hence, they can be simulated as (1 − U)^{−1/β}.
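The same inverse-transform idea can be tested in Python, using the sup formula of 2.11 on a variable with jumps. A sketch (our choice of example): we take F to be the repartition function of the discrete variable Ba[0.3], for which the generalized inverse can be written down by hand.

```python
# Simulating Ba[0.3] by the generalized inverse of its repartition function.
# F(y) = 0 for y < 0, F(y) = 0.7 for 0 <= y < 1, F(y) = 1 for y >= 1,
# so G(x) = sup{y : F(y) < x} equals 0 for x <= 0.7 and 1 for x > 0.7.
import random

random.seed(6)

def G(x):
    """Generalized inverse of the repartition function of Ba[0.3]."""
    return 0 if x <= 0.7 else 1

n = 100_000
freq = sum(G(random.random()) for _ in range(n)) / n
print(freq)  # close to 0.3, the frequency of the value 1 for Ba[0.3]
```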

These variables have an infinite mean value if β ≤ 1; if 1 < β ≤ 2, they have a finite mean value but an infinite standard deviation. It can be interesting to play with these variables and redo the previous experiments in that case. One of the interesting applications of the Pareto distribution is that, for values of β around 2, it offers a good model of the income distribution in most countries, at least for high incomes.
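The Pareto simulation itself is easy to check numerically. A seeded Python sketch (our addition); we take β = 3, so that both the mean value β/(β − 1) = 1.5 and the standard deviation exist and the checks below make sense.

```python
# Pareto variable of parameter beta on [1, +infinity), simulated as
# (1 - U)^(-1/beta); we check the sample mean and the repartition function.
import random

random.seed(7)
beta = 3.0
n = 200_000
draws = [(1.0 - random.random()) ** (-1.0 / beta) for _ in range(n)]

sample_mean = sum(draws) / n                     # theoretical mean: beta/(beta-1) = 1.5
below_2 = sum(1 for x in draws if x <= 2.0) / n  # theoretical F(2) = 1 - 2**-3 = 0.875
print(sample_mean, below_2)
```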
