Mathematics for Computer Scientists By Gareth J.. - The-Eye.eu!

Gareth J. Janacek & Mark Lemmon Close

Mathematics for Computer Scientists


Mathematics for Computer Scientists © 2009 Gareth J. Janacek, Mark Lemmon Close & Ventus Publishing ApS ISBN 978-87-7681-426-7

Contents

Introduction

Chapter 1 Numbers
    1.0.1 Integers
    1.0.2 Rationals and Reals
    1.0.3 Number Systems

Chapter 2 The Statement Calculus and Logic
    2.0.4 Analyzing Arguments Using Truth Tables
    2.0.5 Contradiction and Consistency

Chapter 3 Mathematical Induction

Chapter 4 Sets
    4.0.6 Relations and Functions

Chapter 5 Counting
    5.0.7 Binomial Expansions

Chapter 6 Functions
    6.0.8 Important Functions
    6.1 Functions and Angular Measure

Chapter 7 Sequences
    7.0.1 Limits of Sequences
    7.1 Series
    7.1.1 Infinite Series

Chapter 8 Calculus
    8.0.2 Continuity and Differentiability
    8.0.3 Newton-Raphson Methods
    8.0.4 Integrals and Integrations

Chapter 9 Algebra: Matrices, Vectors etc.
    9.0.5 Equation Solving
    9.0.6 More on Matrices
    9.0.7 Addition and Subtraction
    9.1 Determinants
    9.2 Properties of the Determinant
    9.2.1 Cramer's Rule

Chapter 10 Probability
    10.0.2 Probability - the Rules
    10.0.3 Equally Likely Events
    10.0.4 Conditional Probability
    10.0.5 Bayes
    10.0.6 Random Variables and Distributions
    10.1 Expectation
    10.1.1 Moments
    10.1.2 Some Discrete Probability Distributions
    10.1.3 Continuous Variables
    10.1.4 Some Continuous Probability Distributions
    10.2 The Normal Distribution

Chapter 11 Looking at Data
    11.1 Looking at Data
    11.1.1 Summary Statistics
    11.1.2 Diagrams
    11.2 Scatter Diagrams

Introduction

The aim of this book is to present some of the basic mathematics that is needed by computer scientists. The reader is not expected to be a mathematician and we hope will find what follows useful.

Just a word of warning. Unless you are one of the irritating minority, mathematics is hard. You cannot just read a mathematics book like a novel. The combination of the compression made by the symbols used and the precision of the argument makes this impossible. It takes time and effort to decipher the mathematics and understand the meaning. It is a little like programming: it takes time to understand a lot of code and you never understand how to write code by just reading a manual - you have to do it! Mathematics is exactly the same, you need to do it.

Chapter 1 Numbers

We begin by talking about numbers. This may seem rather elementary but it does set the scene and introduce a lot of notation. In addition much of what follows is important in computing.

1.0.1 Integers

We begin by assuming you are familiar with the integers

$1, 2, 3, 4, \ldots, 101, 102, \ldots, n, \ldots, 2^{32582657} - 1, \ldots$

sometimes called the whole numbers. These are just the numbers we use for counting. To these integers we add the zero, 0, defined by the property that for any integer n

$0 + n = n + 0 = n$

Once we have the integers and zero, mathematicians create negative integers by defining $(-n)$ as the number which when added to n gives zero, so $n + (-n) = (-n) + n = 0$. Eventually we get fed up with writing $n + (-n) = 0$ and write this as $n - n = 0$. We have now got the positive and negative integers $\{\ldots, -3, -2, -1, 0, 1, 2, 3, 4, \ldots\}$.

You are probably used to arithmetic with integers, which follows simple rules. To be on the safe side we itemize them, so for integers a and b

1. $a + b = b + a$

2. $a \times b = b \times a$ or $ab = ba$

3. $-a \times b = -ab$

4. $(-a) \times (-b) = ab$

5. To save space we write $a^k$ as a shorthand for a multiplied by itself k times. So $3^4 = 3 \times 3 \times 3 \times 3$ and $2^{10} = 1024$. Note $a^n \times a^m = a^{n+m}$.

Factors and Primes

Many integers are products of smaller integers, for example $2 \times 3 \times 7 = 42$. Here 2, 3 and 7 are called the factors of 42 and the splitting of 42 into the individual components is known as factorization. This can be a difficult exercise for large integers; indeed it is so difficult that it is the basis of some methods in cryptography. Of course not all integers have factors and those that do not, such as

$3, 5, 7, 11, 13, \ldots, 2^{216091} - 1, \ldots$

are known as primes. Primes have long fascinated mathematicians and others, see http://primes.utm.edu/, and there is a considerable industry looking for primes and fast ways of factorizing integers.

To get much further we need to consider division, which for integers can be tricky since we may have a result which is not an integer. Division may give rise to a remainder, for example

$9 = 2 \times 4 + 1$

and so if we try to divide 9 by 4 we have a remainder of 1. In general for any integers a and b

$b = k \times a + r$

where r is the remainder. If r is zero then we say a divides b, written $a \mid b$. A single vertical bar is used to denote divisibility. For example $2 \mid 128$ and $7 \mid 49$, but 3 does not divide 4, symbolically $3 \nmid 4$.

Aside: To find the factors of an integer we can just attempt division by primes, i.e. 2, 3, 5, 7, 11, 13, ... If it is divisible by k then k is a factor and we try again. When we cannot divide by k we take the next prime and continue until we are left with a prime. So for example:

1. 2394/2 = 1197; we can't divide by 2 again so try 3

2. 1197/3 = 399

3. 399/3 = 133; we can't divide by 3 again, and 133 is not divisible by 5, so try 7

4. 133/7 = 19 which is prime, so $2394 = 2 \times 3 \times 3 \times 7 \times 19$
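The trial-division procedure described in the aside can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `prime_factors` is our own, and for simplicity it tries every integer divisor in turn rather than only primes (a composite divisor can never succeed, because its prime factors have already been divided out).

```python
def prime_factors(n):
    """Return the prime factors of an integer n >= 2, smallest first."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        # Keep dividing by the same divisor for as long as it divides n.
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        # Whatever is left over has no smaller factor, so it is prime.
        factors.append(n)
    return factors


print(prime_factors(2394))  # [2, 3, 3, 7, 19]
```

This is fine for small numbers; as the text notes, factorizing really large integers is hard enough to be the basis of cryptographic methods.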

Modular arithmetic

The mod operator you meet in computer languages simply gives the remainder after division. For example,

1. 25 mod 4 = 1 because 25 ÷ 4 = 6 remainder 1.

2. 19 mod 5 = 4 since 19 = 3 × 5 + 4.

3. 24 mod 5 = 4.

4. 99 mod 11 = 0.

There are some complications when negative numbers are used, but we will ignore them. We also point out that you will often see these results written in a slightly different way, i.e. 24 = 4 mod 5 or 21 = 0 mod 7, which just means 24 mod 5 = 4 and 21 mod 7 = 0.

Modular arithmetic is sometimes called clock arithmetic. Suppose we take a 24 hour clock so 9 in the morning is 09.00 and 9 in the evening is 21.00. If I start a journey at 07.00 and it takes 25 hours then I will arrive at 08.00. We can think of this as 7 + 25 = 32 and 32 mod 24 = 8. All we are doing is starting at 7 and going around the (24 hour) clock face until we get to 8.

I have always thought this is a complex example so take a simpler version. Four people sit around a table and we label their positions 1 to 4. We have a pointer pointing to position 1 which we spin. Suppose it spins 11 and three quarters turns, or 47 quarters. Then it is pointing at 47 mod 4 or 3.
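The clock example translates directly into code. The sketch below uses Python's `%` operator; the helper name `arrival_hour` is ours, chosen for illustration. The last line shows one of the complications with negative numbers: Python returns a remainder with the sign of the divisor, whereas C and Java would give -2 for the same expression.

```python
def arrival_hour(start_hour, duration_hours, clock_size=24):
    """Hour shown on the clock after travelling for duration_hours."""
    return (start_hour + duration_hours) % clock_size


print(arrival_hour(7, 25))  # 8, since 32 mod 24 = 8
print(47 % 4)               # 3, the spinner example
print(-7 % 5)               # 3 in Python; C and Java give -2
```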


The Euclidean algorithm

Algorithms are schemes for computing and we cannot resist putting one in at this point. The Euclidean algorithm for finding the gcd is one of the oldest algorithms known; it appeared in Euclid's Elements around 300 BC. It gives a way of finding the greatest common divisor (gcd) of two numbers, that is, the largest number which will divide them both. Our aim is to find the greatest common divisor, gcd(a, b), of two integers a and b. Suppose a is an integer smaller than b.

1. Then to find the greatest common factor between a and b, divide b by a. If the remainder is zero, then b is a multiple of a and we are done.

2. If not, divide the divisor a by the remainder.

Continue this process, dividing the last divisor by the last remainder, until the remainder is zero. The last non-zero remainder is then the greatest common factor of the integers a and b.

The algorithm is illustrated by the following example. Consider 72 and 246. We have the following 4 steps:

1. 246 = 3 × 72 + 30 or 246 mod 72 = 30

2. 72 = 2 × 30 + 12 or 72 mod 30 = 12

3. 30 = 2 × 12 + 6 or 30 mod 12 = 6

4. 12 = 2 × 6 + 0

so the gcd is 6. There are several websites that offer Java applications using this algorithm; we give a Python function

    def gcd(a, b):
        """ the euclidean algorithm """
        if b == 0:
            return a
        else:
            return gcd(b, (a % b))

Those of you who would like to see a direct application of some of these ideas to computing should look at the section on random numbers.

1.0.2 Rationals and Reals

Of course life would be hard if we only had integers and it is a short step to the rationals or fractions. By a rational number we mean a number that can be written as P/Q where P and Q are integers. Examples are

$\frac{1}{2} \quad \frac{3}{4} \quad \frac{7}{11} \quad \frac{7}{6}$

These numbers arise in an obvious way: you can imagine a ruler divided into equal fractional parts, so that a length can be measured as a whole number of those parts. Mathematicians, of course, have more complicated definitions based on modular arithmetic. They would argue that for every integer n, excluding zero, there is an inverse, written 1/n, which has the property that

$n \times \frac{1}{n} = \frac{1}{n} \times n = 1$

Of course multiplying 1/n by m gives a fraction m/n. These are often called rational numbers. We can manage with the simple idea of fractions.

One problem we encounter is that there are numbers which are neither integers nor rationals but something else. The Greeks were surprised and confused when it was demonstrated that $\sqrt{2}$ could not be written exactly as a fraction. Technically there are no integer values P and Q such that $P/Q = \sqrt{2}$. From our point of view we will not need to delve much further into the details, especially as we can get good enough approximations using fractions. For example 22/7 is a reasonable approximation for $\pi$ while 355/113 is better. You will find people refer to the real numbers, sometimes written $\mathbb{R}$, by which they mean all the numbers we have discussed to date.
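To see how good these fractional approximations are, here is a small sketch using Python's standard `fractions` module. The bound of 1000 on the denominator is an arbitrary choice of ours for illustration.

```python
from fractions import Fraction
import math

# Compare the two classical approximations to pi mentioned in the text.
for approx in (Fraction(22, 7), Fraction(355, 113)):
    error = abs(float(approx) - math.pi)
    print(approx, float(approx), error)

# limit_denominator finds the closest fraction whose denominator does not
# exceed the given bound; here it is applied to the float value of sqrt(2).
best = Fraction(math.sqrt(2)).limit_denominator(1000)
print(best, float(best) ** 2)
```

However large we make the bound, the square of the fraction is never exactly 2, which is the point of the Greek discovery.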

Notation

As you will have realized by now there is a good deal of notation and we list some of the symbols and functions you may meet.

• If x is less than y then we write $x < y$. If there is a possibility that they might be equal then $x \le y$. Of course we can write these the other way around, so $y > x$ or $y \ge x$. Obviously we can also say y is greater than x, or greater than or equal to x.

• The floor function of a real number x, denoted by $\lfloor x \rfloor$ or floor(x), is a function that returns the largest integer less than or equal to x. So $\lfloor 2.7 \rfloor = 2$ and $\lfloor -3.6 \rfloor = -4$. The function floor in Java and Python performs this operation.

• A less used function is the ceiling function, written $\lceil x \rceil$ or ceil(x) or ceiling(x), which returns the smallest integer not less than x. Hence $\lceil 2.7 \rceil = 3$. There is an obvious(?) connection to mod since b mod a can be written $b - \lfloor b \div a \rfloor \times a$, so $25 \bmod 4 = 25 - \lfloor 25/4 \rfloor \times 4 = 25 - 6 \times 4 = 1$.

• The modulus of x, written $|x|$, is just x when $x \ge 0$ and $-x$ when $x < 0$. So $|2| = 2$ and $|-6| = 6$. The famous result about the modulus is that for any x and y, $|x + y| \le |x| + |y|$.

...

3. Let

(a) $A = \{x \in \mathbb{Z} \mid \text{for some integer } y > 0,\ x = 2y\}$

(b) $B = \{x \in \mathbb{Z} \mid \text{for some integer } y > 0,\ x = 2y - 1\}$

(c) $C = \{x \in \mathbb{Z} \mid x < 10\}$

Describe ~A, ~(A ∪ B), ~C, A − ~C, and C − (A ∪ B).

4. Show that for all sets A, B and C

$(A \cap B) \cup C = A \cap (B \cup C)$

iff $C \subseteq A$.

5. What is the cardinality of {{1, 2}, {3}, 1}?

6. Give the domain and the range of each of the following relations. Draw the graph in each case.

(a) $\{(x, y) \in \mathbb{R} \times \mathbb{R} \mid x^2 + 4y^2 = 1\}$

(b) $\{(x, y) \in \mathbb{R} \times \mathbb{R} \mid x^2 = y^2\}$

(c) $\{(x, y) \in \mathbb{R} \times \mathbb{R} \mid 0 < y,\ y < x \text{ and } x + 2y < 1\}$

7. Define the relation ▷ on pairs (x, y) and (u, v) of integers, where (x, y) ▷ (u, v) means xv = yu. Show that ▷ is an equivalence relation.

Chapter 5 Counting

There are three types of people in this world: Those who can count, and those who can't.

Counting seems quite simple but this is quite deceptive, especially when we have complicated systems. If you do not believe me have a look at the probability section. To make life a little simpler we lay down some rules.

Sets

If we have two sets A and B the number of items in the sets (the cardinality) is written $\|A\|$ and $\|B\|$. Then we can show that

$\|A \cup B\| = \|A\| + \|B\| - \|A \cap B\|$

This is fairly easy to see if you use a Venn diagram. For 3 sets

$\|A \cup B \cup C\| = \|A\| + \|B\| + \|C\| - \|A \cap B\| - \|B \cap C\| - \|A \cap C\| + \|A \cap B \cap C\|$

Example

Let S be the set of all outcomes when two dice (one blue; one green) are thrown. Let A be the subset of outcomes in which both dice are odd, and let B be the subset of outcomes in which both dice are even. We write C for the set of outcomes when the two dice have the same number showing. How many elements are there in the following sets? It is useful to have the set S set out as below

11 12 13 14 15 16
21 22 23 24 25 26
31 32 33 34 35 36
41 42 43 44 45 46
51 52 53 54 55 56
61 62 63 64 65 66

then we have

1. $\|A\| = 9$

2. $\|B\| = 9$

3. $\|C\| = 6$

4. $\|A \cap B\| = 0$

5. $\|A \cup B\| = 18$

6. $\|A \cap C\| = \|\{(1,1), (3,3), (5,5)\}\| = 3$

7. $\|A \cup C\| = \|A\| + \|C\| - \|A \cap C\| = 9 + 6 - 3 = 12$
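The counts in the dice example can be checked by brute force. The following sketch builds the 36 outcomes with `itertools.product` and uses Python sets; the variable names mirror the sets in the text, with each outcome stored as a (blue, green) pair.

```python
from itertools import product

# All 36 outcomes (blue, green) when two dice are thrown.
S = set(product(range(1, 7), repeat=2))

A = {(b, g) for (b, g) in S if b % 2 == 1 and g % 2 == 1}  # both odd
B = {(b, g) for (b, g) in S if b % 2 == 0 and g % 2 == 0}  # both even
C = {(b, g) for (b, g) in S if b == g}                     # same number

print(len(A), len(B), len(C))   # 9 9 6
print(len(A & B), len(A | B))   # 0 18
print(sorted(A & C))            # [(1, 1), (3, 3), (5, 5)]

# The two-set counting rule: ||A u C|| = ||A|| + ||C|| - ||A n C||
print(len(A | C), len(A) + len(C) - len(A & C))  # 12 12
```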

Chains of actions

If we have to perform two actions in sequence and the first can be done in m ways while the second can be done in n ways, there will be mn possibilities in total.

• Suppose we wish to pick 2 people from 9. The first can be picked in 9 ways, the second in 8, giving 9 × 8 = 72 possibilities in total.

• If we roll a die and then toss a coin there are 6 × 2 = 12 possibilities.

This extends to several successive actions. Thus

1. If we roll a die 3 times then there are 6 × 6 × 6 = 216 possibilities.

2. If we toss a coin 7 times there are $2 \times 2 \times 2 \times 2 \times 2 \times 2 \times 2 = 2^7 = 128$ possibilities.

3. My bicycle lock has 4 rotors each with 10 digits. That gives $10 \times 10 \times 10 \times 10 = 10^4$ combinations.

4. Suppose you have to provide an 8 character password for a credit card company. They say that you can use a to z (case is ignored) and 0 to 9 but there must be at least one number and at least one letter. There are 26 letters and 10 numbers so you can make $36^8$ possible passwords. Of these there are $10^8$ which are all numbers and $26^8$ which are all letters. This gives $36^8 - 26^8 - 10^8 \approx 2.612 \times 10^{12}$ allowable passwords.
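The password count is easy to get wrong by swapping base and exponent, so it is worth checking with exact integer arithmetic. A minimal sketch (the variable names are ours):

```python
letters, digits, length = 26, 10, 8

total = (letters + digits) ** length  # any mix of letters and digits
all_letters = letters ** length       # passwords with no digit at all
all_digits = digits ** length         # passwords with no letter at all

allowed = total - all_letters - all_digits
print(allowed)  # 2612182842880, roughly 2.612 x 10^12
```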

Permutations

Suppose I have n distinct items and I want to arrange them in a line. I can do this in

$n \times (n-1) \times (n-2) \times (n-3) \times \cdots \times 3 \times 2 \times 1$

ways. We compute this product so often it has a special symbol $n!$. However to avoid problems we define $1! = 1$ and $0! = 1$.

So $3! = 3 \times 2 \times 1 = 6$ while $5! = 5 \times 4 \times 3 \times 2 \times 1 = 120$. If we look at the characters in (lD4Y) there are $4! = 24$ possible distinct arrangements.

Sometimes we do not have all distinct items. We might have n items of which r are identical; then there are $n!/r!$ different possible arrangements. So WALLY can be arranged in $5!/2! = 60$ ways. It is simpler to just state a rule in the more general case: Suppose we have n objects and

• there are $n_1$ of type 1.

• there are $n_2$ of type 2.

• ...

• there are $n_k$ of type k.

The total number of items is n, so $n = n_1 + n_2 + \cdots + n_k$; then there are

$\frac{n!}{n_1!\, n_2! \cdots n_k!}$

possible arrangements. Suppose we have 3 white, 4 red and 4 black balls. They can be arranged in a row in

$\frac{11!}{3!\,4!\,4!} = 11550$

possible ways, while the letters in WALLY can be arranged in

$\frac{5!}{2!\,1!\,1!\,1!} = 60$

ways.
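The general rule for arrangements with repeated items can be sketched as below. The function name `arrangements` is ours; it counts how often each item occurs and divides n! by the factorial of each count.

```python
from collections import Counter
from math import factorial


def arrangements(items):
    """Number of distinct orderings of items, allowing repeated items."""
    result = factorial(len(items))
    for count in Counter(items).values():
        # Each group of identical items can be permuted without
        # changing the arrangement, so divide those orderings out.
        result //= factorial(count)
    return result


print(arrangements("WALLY"))                      # 60
print(arrangements("W" * 3 + "R" * 4 + "B" * 4))  # 11550
```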

Combinations

The number of ways of picking k items from a group of size n is written $\binom{n}{k}$ or (for the traditionalists) $^nC_k$. The definition is

$\binom{n}{k} = \frac{n!}{(n-k)!\,k!}$

So the number of ways of picking 5 students from a group of 19 is

$\frac{19!}{5!\,14!} = \frac{19 \times 18 \times 17 \times 16 \times 15}{5 \times 4 \times 3 \times 2 \times 1} = 11628$

Examples

1. Suppose you want to win the lottery. There are 49 numbers and you can pick 6. This can be done in

$\frac{49!}{6!\,43!} = 13983816$

ways, so your chances of a win are 1/13983816.

2. How many ways can you pick 5 correct numbers in the lottery? There are $\binom{6}{5}$ ways to pick the 5 correct numbers and 49 − 6 = 43 ways of picking the remaining number. This gives 6 × 43 ways.

3. When we pick 3 correct numbers there are $\binom{6}{3}$ ways of picking the winning numbers and $\binom{43}{3}$ ways of picking the losing ones. This gives $\binom{6}{3} \times \binom{43}{3} = 20 \times 12341 = 246820$ ways in all.
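These counts can be reproduced with `math.comb`, which is available in the Python standard library from version 3.8. This is just a numerical check of the examples above.

```python
from math import comb

print(comb(19, 5))                # 11628 ways to pick 5 students from 19
print(comb(49, 6))                # 13983816 possible lottery tickets
print(comb(6, 5) * comb(43, 1))   # 258 ways to match exactly 5 numbers
print(comb(6, 3) * comb(43, 3))   # 246820 ways to match exactly 3 numbers
```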

5.0.7 Binomial Expansions

Now we have combinations we can examine a very useful result known as the binomial expansion. To start we can show that

$(a + b)^2 = a^2 + 2ab + b^2$

and

$(a + b)^3 = a^3 + 3a^2 b + 3ab^2 + b^3$

In general we can prove that for an integer $n > 0$

$(a + b)^n = a^n + \binom{n}{1} a^{n-1} b + \binom{n}{2} a^{n-2} b^2 + \cdots + b^n$

or

$(a + b)^n = \sum_{i=0}^{n} \binom{n}{i} a^{n-i} b^{i}$

This can be done by induction, but there is a page or so of algebra!

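Suppose you were given $(3x + 5/x^3)^8$ and you wanted the term in the expansion which did not have an x. From the above the general term is

$\binom{8}{i} (3x)^{8-i} \left(\frac{5}{x^3}\right)^{i} = \binom{8}{i}\, 3^{8-i}\, 5^{i}\, \frac{x^{8-i}}{x^{3i}}$

The x terms cancel when $8 - i = 3i$ or $i = 2$. Then the term is

$\binom{8}{2}\, 3^{6}\, 5^{2} = 510300$

We can do something similar for non-integral n as follows:

$(1 + x)^n = 1 + nx + \frac{n(n-1)}{1 \cdot 2} x^2 + \frac{n(n-1)(n-2)}{1 \cdot 2 \cdot 3} x^3 + \cdots + \frac{n(n-1)(n-2) \cdots (n-k+1)}{1 \cdot 2 \cdot 3 \cdots k} x^k + \cdots$

but this is only true when $|x| < 1$. Thus

$(1 + x)^{1/2} = 1 + \left(\tfrac{1}{2}\right) x + \frac{\left(\tfrac{1}{2}\right)\left(-\tfrac{1}{2}\right)}{1 \cdot 2} x^2 + \frac{\left(\tfrac{1}{2}\right)\left(-\tfrac{1}{2}\right)\left(-\tfrac{3}{2}\right)}{1 \cdot 2 \cdot 3} x^3 + \cdots$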
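As a rough numerical check of the series for non-integral n, the sketch below sums the first few terms and compares the result with a direct square root. The function name `binomial_series` and the test value x = 0.2 are our own choices; each coefficient is obtained from the previous one by multiplying by (n − k)/(k + 1).

```python
import math


def binomial_series(x, n, terms):
    """Sum the first `terms` terms of 1 + n x + n(n-1)/(1.2) x^2 + ..."""
    total = 0.0
    coeff = 1.0  # coefficient of x^k, starting at k = 0
    for k in range(terms):
        total += coeff * x ** k
        coeff *= (n - k) / (k + 1)  # move on to the coefficient of x^(k+1)
    return total


x = 0.2
print(binomial_series(x, 0.5, 10), math.sqrt(1 + x))

# For a positive integer n the coefficients reach zero after n + 1 terms,
# and we recover the ordinary finite expansion, e.g. (1 + 2)^3 = 27.
print(binomial_series(2, 3, 4))
```

The agreement gets worse as |x| approaches 1, and outside that range the series does not converge for non-integral n.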

Examples

1. Suppose we look at sports scholarships awarded by American universities. A total of 147,000 scholarships were earned in 2001. Out of the 5,500 scholarships for athletics, 1,500 were earned by women. Women earned 75,000 scholarships in total. How many men earned scholarships in athletics?

2. In clinical trials of the suntan lotion, Delta Sun, 100 test subjects experienced third degree burns or nausea (or both). Of these, a total of 35 people experienced third degree burns, and 25 experienced both third degree burns and nausea. How many subjects experienced nausea?

3. A total of [illegible] MSc degrees were earned in 2002. Out of the 41 MSc degrees in music and music therapy, 5 were earned by men. Men earned [illegible] MSc degrees. How many women earned MSc degrees in fields other than music and music therapy?

4. A survey of 200 credit card customers revealed that 98 of them have a Visa account, 113 of them have a Master Card, 62 of them have a Visa account and an American Express, 36 of them have a Master Card account and an American Express, 47 of them have only a Master Card account, and 32 have a Visa account and a Master Card account and an American Express. Assume that every customer has at least one of the services. How many customers have only a Visa card?

5. According to a New York Times report on the 16 top-performing restaurant chains

(a) 11 serve breakfast.
(b) 11 serve beer.
(c) 10 have full table service, i.e. they serve alcohol and all meals.

All 16 offered at least one of these services. A total of 5 were classified as "family chains," meaning that they serve breakfast, but do not serve alcohol. Further, a total of five serve breakfast and have full table service, while none serve breakfast, beer, and also have full table service. We ask

(a) How many serve beer and breakfast?
(b) How many serve beer but not breakfast?
(c) How many serve breakfast, but neither have full table service, nor serve beer?
(d) How many serve beer and have full table service?

6. When $|x| < 1$ show that

• $\frac{1}{1-x} = 1 + x + x^2 + x^3 + x^4 + \cdots + x^n + \cdots$

• $(1 + x)^{1/2} = 1 + \left(\tfrac{1}{2}\right) x + \frac{\left(\tfrac{1}{2}\right)\left(-\tfrac{1}{2}\right)}{1 \cdot 2} x^2 + \frac{\left(\tfrac{1}{2}\right)\left(-\tfrac{1}{2}\right)\left(-\tfrac{3}{2}\right)}{1 \cdot 2 \cdot 3} x^3 + \cdots$

• $\frac{1}{(1-x)^2} = 1 + 2x + 3x^2 + 4x^3 + 5x^4 + \cdots + n x^{n-1} + \cdots$

7. Expand (1

+ 2xf

8, vVhich is the coefficient of the terrn \vithout an x in (x + 2/x) 11 .

!l Find an approxirnation for (0.95) 11. 10. Find the first B ternlb of the expanbion of (1

57

+ x) 1//1,

Chapter 6 Functions

Mathematicians are like Frenchmen: whatever you say to them they translate into their own language and forthwith it is something entirely different.

Johann Wolfgang von Goethe

One of the most fundamental (and useful) ideas in mathematics is that of a function. As a preliminary definition suppose we have two sets X and Y and we also have a rule which assigns to every $x \in X$ a UNIQUE value $y \in Y$. We will call the rule f and say that for each x there is a $y = f(x)$ in the set Y. This is a very wide definition and one that is very similar to that of a relation; the critical point is that for each x there is a unique value y. A common way of writing functions is

$f : X \to Y$

which illustrates that we have two sets X and Y together with a rule f giving values in Y for values in X. We can think of the pairs $(x, y)$ or more clearly $(x, f(x))$. This set of pairs is the graph of the function.

In what follows we show how functions arise from the idea of relations and come up with some of the main definitions. You need to keep in mind the simple idea that a function is a rule that takes in x values and produces y values. It is probably enough to visualize f as a device which when given an x value produces a y.

Figure 6.1: Function f, taking an input x to the output y = f(x)

Clearly if you think of f as a machine we need to take care about what we are allowed to put in, x, and have a good idea of the range of what comes out, y. It is these technical issues we look at next.

The set X is called the domain of the function f and Y is the codomain. We are normally more interested in the set of values $\{f(x) : x \in X\}$. This is the range R, sometimes called the image of the function. See figure 6.1.
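For a finite domain the ideas of domain, codomain, graph and range can be made concrete in a few lines of Python. The particular sets and the squaring rule below are hypothetical choices of ours, used purely for illustration.

```python
# A hypothetical example: the domain X, the codomain Y and the rule f
# are our own choices, not taken from the text.
X = {0, 1, 2, 3}
Y = set(range(10))


def f(x):
    return x * x


graph = {(x, f(x)) for x in X}  # the set of pairs (x, f(x))
image = {f(x) for x in X}       # the range: values actually produced

print(sorted(graph))  # [(0, 0), (1, 1), (2, 4), (3, 9)]
print(sorted(image))  # [0, 1, 4, 9]
print(image <= Y)     # True: the range lies inside the codomain
```

Note that the range {0, 1, 4, 9} is much smaller than the codomain Y, which is why the text distinguishes the two.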

Examples

'Ve call have f:X---+Y

1. f(x) = L" where X = {x: 0

2. f(x) =

'Vx vvhere X =

... two terms. An interesting aside is that the nth Fibonacci number F(n) can be written as

$F(n) = \frac{\phi^n - (1 - \phi)^n}{\sqrt{5}} \quad \text{where} \quad \phi = \frac{1 + \sqrt{5}}{2} \approx 1.618\ldots$

which is a surprise since F(n) is an integer and the formula contains $\sqrt{5}$. For lots more on sequences see http://www.research.att.com/~njas/sequences/
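The closed formula can be compared with the usual recurrence. This sketch assumes the convention F(1) = F(2) = 1; the function names are ours. Because the formula is evaluated in floating point, the result is rounded to the nearest integer, and it stays reliable only for moderate n.

```python
import math

PHI = (1 + math.sqrt(5)) / 2


def fib_formula(n):
    """nth Fibonacci number from the closed formula, rounded to an integer."""
    return round((PHI ** n - (1 - PHI) ** n) / math.sqrt(5))


def fib_iterative(n):
    """nth Fibonacci number by adding the previous two terms."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


print([fib_formula(n) for n in range(1, 11)])
print([fib_iterative(n) for n in range(1, 11)])
```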

7.0.1 Limits of sequences

We turn our attention to the behaviour of sequences such as $\{a_n\}$ as n becomes very large.

1. A sequence may approach a finite value A. We say that it tends to a limit, so for example we write

$1, \tfrac{1}{2}, \left(\tfrac{1}{2}\right)^2, \left(\tfrac{1}{2}\right)^3, \ldots, \left(\tfrac{1}{2}\right)^n, \ldots$

or

1.0000 0.5000 0.2500 0.1250 0.0625 0.0312 0.0156 0.0078 0.0039 0.0020 ...

as $\left\{\left(\tfrac{1}{2}\right)^n\right\}$ and we shall see that $\left(\tfrac{1}{2}\right)^n \to 0$ as $n \to \infty$.

2. If a sequence does not converge it may go to $\pm\infty$, that is keep increasing or decreasing.

2 4 8 16 32 64 128 256 512 1024 ...

Informally $\{2^n\} \to \infty$ as $n \to \infty$.

3. A sequence may just oscillate

1 -1 1 -1 1 -1 1 -1 ...

Limit

We need a definition of a limit and after 2000 years of trying we use:

$\{a_n\} \to A$ as $n \to \infty$ if and only if, given any number $\epsilon > 0$, there is an N such that for $n \ge N$, $|a_n - A| < \epsilon$.

In essence I give you a guarantee that I can get as close as you wish to a limit (if it exists) for all members of the sequence with sufficiently large n: that is, after N all the values of the sequence satisfy $|a_n - A| < \epsilon$. The idea is that if there is a limit then if you give me some tolerance, here $\epsilon$, I can guarantee that from some point in the sequence onwards all the terms lie within $\epsilon$ of the limit.

Examples

• $1/n \to 0$ as $n \to \infty$. We argue as follows: Suppose you give me a (small) value for $\epsilon$. I can then choose a value N where $N > 1/\epsilon$. We can do this as N can be as large as we like. It then follows that $1/N < \epsilon$. But if $n > N$ then $1/n < 1/N$, so we can say: if we choose $N > 1/\epsilon$ then when $n > N$

$|1/n - 0| < \epsilon$

and so $1/n \to 0$.

• $x^n \to 0$ as $n \to \infty$ when $|x| < 1$. We argue as follows: Suppose you give me a (small) value for $\epsilon$. I can then choose a value N where $|x|^N < \epsilon$, or $N \log|x| < \log \epsilon$. Rearranging

$N > \frac{\log \epsilon}{\log |x|}$

but beware the signs! If $|x| < 1$ then $\log|x| < 0$, so dividing by it reverses the inequality. So we choose $N > \log \epsilon / \log|x|$; then when $n > N$

$|x^n| = |x|^n < \epsilon$

and so $x^n \to 0$.
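The two arguments above are constructive: given a tolerance they tell us how to pick N. The sketch below turns each recipe into a function (the names are ours) and spot-checks that the terms beyond N really are within the tolerance. It assumes 0 < eps < 1 and 0 < |x| < 1.

```python
import math


def n_for_reciprocal(eps):
    """An integer N with N > 1/eps, so that 1/n < eps whenever n > N."""
    return math.floor(1 / eps) + 1


def n_for_power(x, eps):
    """An integer N with N > log(eps)/log|x|, for 0 < |x| < 1, 0 < eps < 1."""
    return math.floor(math.log(eps) / math.log(abs(x))) + 1


eps = 0.01
N = n_for_reciprocal(eps)
print(N, all(1 / n < eps for n in range(N + 1, N + 1000)))

x, eps = 0.5, 1e-3
N = n_for_power(x, eps)
print(N, all(abs(x) ** n < eps for n in range(N + 1, N + 50)))
```

A finite check like this is of course not a proof; it only illustrates what the definition of a limit is promising.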

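For the series $1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots$, writing $S_N$ for the sum of the first N terms, we have

$S_4 = 1 + \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right) > 1 + \frac{1}{2} + \frac{2}{4} = \frac{4}{2}$

and

$S_8 = 1 + \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right) > 1 + \frac{1}{2} + \frac{2}{4} + \frac{4}{8} = \frac{5}{2}$

and similarly

$S_{16} > 1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} = \frac{6}{2}$

In general we can (with care) show

$S_{2^k} > 1 + \frac{k}{2}$

So we can make the partial sums of $2^k$ terms as large as we like, and they are increasing and unbounded. Thus the series must be divergent. This has an important consequence: if $u_n \to 0$ it does not mean that the sum is convergent. It may be but it may not be!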
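The slow but unbounded growth of these partial sums is easy to see numerically. The sketch below (function name ours) prints the sum of the first $2^k$ terms next to the lower bound $1 + k/2$; the inequality is strict from k = 2 onwards.

```python
def harmonic_partial_sum(terms):
    """Sum of 1 + 1/2 + 1/3 + ... + 1/terms."""
    return sum(1 / n for n in range(1, terms + 1))


for k in range(2, 11):
    s = harmonic_partial_sum(2 ** k)
    # Columns: k, number of terms, partial sum, lower bound 1 + k/2
    print(k, 2 ** k, round(s, 4), 1 + k / 2)
```

Doubling the number of terms adds roughly the same amount each time, so the sums creep upwards without ever settling down.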



• $\sum_{n=0}^{\infty} x^n$ is convergent for $|x| < 1$ and the sum is $1/(1-x)$. When $|x| \ge 1$ the series is divergent. We can argue that

$\sum_{n=0}^{N} x^n = \frac{1 - x^{N+1}}{1 - x} \to \frac{1}{1 - x}$

and since we have an explicit form for the sum the result follows.

• $\sum_{n=1}^{\infty} \frac{1}{n(n+1)}$ converges and the sum is 1 since

$\sum_{n=1}^{N} \frac{1}{n(n+1)} = \sum_{n=1}^{N} \left(\frac{1}{n} - \frac{1}{n+1}\right) = 1 - \frac{1}{N+1}$

• $\sum_{n=1}^{\infty} \frac{1}{n^{\alpha}}$ is divergent for $\alpha \le 1$ and convergent otherwise.
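Both explicit partial-sum formulas can be checked numerically. In the sketch below the function names and the sample values x = 0.5, N = 50 and N = 1000 are our own choices.

```python
def geometric_partial_sum(x, N):
    """Sum of x^n for n = 0, 1, ..., N."""
    return sum(x ** n for n in range(N + 1))


def telescoping_partial_sum(N):
    """Sum of 1/(n(n+1)) for n = 1, ..., N."""
    return sum(1 / (n * (n + 1)) for n in range(1, N + 1))


x, N = 0.5, 50
print(geometric_partial_sum(x, N), (1 - x ** (N + 1)) / (1 - x), 1 / (1 - x))

N = 1000
print(telescoping_partial_sum(N), 1 - 1 / (N + 1))
```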

Some rules for series of positive terms

• If $\sum_{n=1}^{\infty} u_n$ and $\sum_{n=1}^{\infty} v_n$ are both convergent with sums S and T then $\sum_{n=1}^{\infty} (u_n \pm v_n)$ converges to $S \pm T$.

• If $\sum_{n=1}^{\infty} u_n$ converges then adding or subtracting a finite number of terms does not affect convergence; it will however affect the sum.

• If $u_n$ does not converge to zero then $\sum_{n=1}^{\infty} u_n$ does not converge.

• The comparison test: If $\sum_{n=1}^{\infty} u_n$ and $\sum_{n=1}^{\infty} v_n$ are two series of positive terms and if $\{u_n / v_n\}$ tends to a non-zero finite limit R then the series either both converge or both diverge.

• The ratio test: If $\sum_{n=1}^{\infty} u_n$ is a series of positive terms, suppose $\{u_{n+1}/u_n\} \to L$. Then
If $L < 1$ the series converges.
If $L > 1$ the series diverges.
If $L = 1$ the question is unresolved.

• The integral test: Suppose we have $\sum_{n=1}^{\infty} u_n$ and $f(n) = u_n$ for some function f which satisfies
1. f(x) is decreasing as x increases.
2. $f(x) > 0$ for $x \ge 1$.
Then the series $\sum_{n=1}^{\infty} u_n$ and the integral $\int_1^{\infty} f(x)\,dx$ either both converge or both diverge.

At a stationary point $x_0$, where $\frac{dy}{dx} = 0$:

• $x_0$ is a maximum if $\frac{dy}{dx} > 0$ for $x < x_0$ and $\frac{dy}{dx} < 0$ for $x > x_0$, or if $\frac{d^2y}{dx^2} < 0$ at $x_0$.

• $x_0$ is a minimum if $\frac{dy}{dx} < 0$ for $x < x_0$ and $\frac{dy}{dx} > 0$ for $x > x_0$, or if $\frac{d^2y}{dx^2} > 0$ at $x_0$.

1. The function $f(x) = \frac{1}{3}x^3 + \frac{1}{2}x^2 - 6x + 8$ has derivative $\frac{dy}{dx} = x^2 + x - 6$, so at $x = 2$ we have $\frac{dy}{dx} = 0$. When x is just less than 2 the derivative is negative, while when $x > 2$ it is positive, so we have a minimum.

2. Or perhaps simpler: $\frac{d^2y}{dx^2} = 2x + 1 > 0$ at $x = 2$, so we have a minimum.

3. When $x = -3$ again $\frac{dy}{dx} = 0$. For $x < -3$, $\frac{dy}{dx} > 0$, while when x is just greater than $-3$, $\frac{dy}{dx} < 0$, implying a maximum.

4. Or again simply $\frac{d^2y}{dx^2} = 2x + 1 < 0$ at $x = -3$.
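The worked example can be checked mechanically. The sketch below hard-codes the first and second derivatives of the example function, finds the stationary points from the quadratic formula, and classifies each one by the sign of the second derivative. The function names are ours.

```python
import math


def f_prime(x):
    """First derivative of f(x) = x^3/3 + x^2/2 - 6x + 8."""
    return x ** 2 + x - 6


def f_second(x):
    """Second derivative of the same function."""
    return 2 * x + 1


def classify(x0):
    """Classify a stationary point using the sign of the second derivative."""
    if f_second(x0) > 0:
        return "minimum"
    if f_second(x0) < 0:
        return "maximum"
    return "undetermined"


# Stationary points: roots of x^2 + x - 6 = 0 via the quadratic formula.
disc = math.sqrt(1 + 24)
roots = [(-1 - disc) / 2, (-1 + disc) / 2]

for x0 in roots:
    print(x0, f_prime(x0), classify(x0))
```

When the second derivative is zero this test says nothing, and one has to fall back on the sign of the first derivative on either side.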

9.1 Determinants

Consider the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$. We can show that this has an inverse

$\frac{1}{\nabla} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$

when $\nabla = ad - bc \ne 0$, see 9.0.7. The quantity $\nabla$ is called the determinant of the matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and is written $\begin{vmatrix} a & b \\ c & d \end{vmatrix}$ or det(A).

Similarly

$\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}$

has an inverse when

$\begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} \ne 0$

The general definition of a determinant of an $n \times n$ matrix A is as follows.

1. If $n = 1$ then $\det(A) = a_{11}$.

2. If $n > 1$, let $M_{ij}$ be the determinant of the $(n-1) \times (n-1)$ matrix obtained from A by deleting row i and column j. $M_{ij}$ is called a minor. Then

$\det(A) = a_{11} M_{11} - a_{12} M_{12} + a_{13} M_{13} - a_{14} M_{14} + \cdots + (-1)^{n+1} a_{1n} M_{1n} = \sum_{j=1}^{n} (-1)^{j+1} a_{1j} M_{1j}$
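The recursive definition translates almost word for word into code. The following sketch represents a matrix as a list of rows; the function names `minor_matrix` and `det` are ours. It is meant to mirror the definition rather than to be efficient, since the work grows very quickly with n.

```python
def minor_matrix(matrix, i, j):
    """The matrix obtained by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(matrix) if r != i]


def det(matrix):
    """Determinant by expansion along the first row."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0
    for j in range(n):
        sign = (-1) ** j  # with 0-based j this equals (-1)^(j+1) for 1-based j
        total += sign * matrix[0][j] * det(minor_matrix(matrix, 0, j))
    return total


print(det([[1, 2], [3, 4]]))                   # ad - bc = -2
print(det([[2, 4, 3], [3, 6, 5], [2, 5, 2]]))  # -1
```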

Determinants are pretty nasty but we are fortunate as we really only need them for n = 1, 2 or 3.

9.2 Properties of the Determinant

1. Any matrix A and its transpose $A^T$ have the same determinant, i.e. $\det(A) = \det(A^T)$.

Note: This is useful since it implies that whenever we use rows, a similar behavior will result if we use columns. In particular we will see how row elementary operations are helpful in finding the determinant.

2. The determinant of a triangular matrix is the product of the entries on the diagonal, that is

$\begin{vmatrix} a & b & c \\ 0 & e & f \\ 0 & 0 & i \end{vmatrix} = aei$

3. If we interchange two rows, the determinant of the new matrix is the opposite sign of the old one, that is

$\begin{vmatrix} d & e & f \\ a & b & c \\ g & h & i \end{vmatrix} = - \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix}$

...

Note that whenever you want to replace a row by something (through elementary operations), do not multiply the row itself by a constant. Otherwise, it is easy to make errors: see property 4.

6. $\det(AB) = \det(A)\det(B)$

7. A is invertible if and only if $\det(A) \ne 0$. Note in that case $\det(A^{-1}) = 1/\det(A)$.

While determinants can be useful in geometry and theory they are complex and quite difficult to handle. Our last result is for completeness and links matrix inverses with determinants. Recall that the $n \times n$ matrix A does not have an inverse when det(A) = 0. However the connection between determinants and matrices is more complex. Suppose we define a new matrix, the adjoint of A, say adj(A), as

$\operatorname{adj} A = \left( (-1)^{i+j} M_{ij} \right)^T = \begin{pmatrix} M_{11} & -M_{12} & \cdots & (-1)^{n+1} M_{1n} \\ -M_{21} & M_{22} & \cdots & (-1)^{n+2} M_{2n} \\ \vdots & \vdots & & \vdots \\ (-1)^{n+1} M_{n1} & (-1)^{n+2} M_{n2} & \cdots & (-1)^{2n} M_{nn} \end{pmatrix}^T$

Her~ot::cAM~j ('lr~ iT Tl)ni~ll~::l :~~;:~I:b(OV~~ ~7 1 5 ...

1

-2

!3) T (~~ ~9 ~2) 1

2

-3

Why is anyone interested in the adjoint? The main reason is

$A^{-1} = \frac{\operatorname{adj} A}{\det(A)}$

Of course you would have to have a very special reason to compute an inverse this way.

Mathematics for Com puter Scientists

9.2.1

Olapter 9. Algebra: Matrices, Vectors stc.

Cramer's Rule

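Suppose we have the set of equations

$$\begin{aligned} a_1 x + b_1 y + c_1 z &= d_1 \\ a_2 x + b_2 y + c_2 z &= d_2 \\ a_3 x + b_3 y + c_3 z &= d_3 \end{aligned}$$

and let

$$D = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}$$

Then Cramer's rule states that

$$x = \frac{1}{D}\begin{vmatrix} d_1 & b_1 & c_1 \\ d_2 & b_2 & c_2 \\ d_3 & b_3 & c_3 \end{vmatrix}, \qquad y = \frac{1}{D}\begin{vmatrix} a_1 & d_1 & c_1 \\ a_2 & d_2 & c_2 \\ a_3 & d_3 & c_3 \end{vmatrix}, \qquad z = \frac{1}{D}\begin{vmatrix} a_1 & b_1 & d_1 \\ a_2 & b_2 & d_2 \\ a_3 & b_3 & d_3 \end{vmatrix}$$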
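Cramer's rule can be sketched in a few lines of Python. The helper names `det` and `cramer` and the sample system are assumptions made for illustration; the determinant is again a naive cofactor expansion.

```python
from fractions import Fraction

def det(a):
    """Determinant by cofactor expansion along the first row."""
    if not a:
        return 1
    return sum((-1) ** j * a[0][j] * det([r[:j] + r[j + 1:] for r in a[1:]])
               for j in range(len(a)))

def cramer(a, d):
    """Solve a x = d by replacing column k of a with d, for each k in turn."""
    big_d = det(a)
    if big_d == 0:
        raise ValueError("D = 0, Cramer's rule does not apply")
    solution = []
    for k in range(len(a)):
        a_k = [row[:k] + [d[i]] + row[k + 1:] for i, row in enumerate(a)]
        solution.append(Fraction(det(a_k), big_d))
    return solution

# Illustrative system (not from the text):
#   x +  y +  z =  6
#       2y + 5z = -4
#  2x + 5y -  z = 27
x, y, z = cramer([[1, 1, 1], [0, 2, 5], [2, 5, -1]], [6, -4, 27])
```

Each unknown costs one extra determinant, which is why the method is rarely used in practice.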

There is even a more general case. Suppose we have $Ax = d$ where $x^T = (x_1, x_2, \ldots, x_n)$ and $d^T = (d_1, d_2, \ldots, d_n)$. Let $D = \det(A)$. Then

$$x_k = \frac{1}{D}\det\begin{pmatrix} a_{11} & \cdots & a_{1(k-1)} & d_1 & a_{1(k+1)} & \cdots & a_{1n} \\ \vdots & & \vdots & \vdots & \vdots & & \vdots \\ a_{n1} & \cdots & a_{n(k-1)} & d_n & a_{n(k+1)} & \cdots & a_{nn} \end{pmatrix}$$

While this is a nice formula you would have to be mad to use it to solve equations, since the best way of evaluating big determinants is by row reduction, and this gives solutions directly.

Exercises

1. Evaluate

$$\begin{vmatrix} 2 & 4 \\ 3 & 6 \end{vmatrix}$$

2. Evaluate

$$\begin{vmatrix} 2 & 4 & 3 \\ 3 & 6 & 5 \\ 2 & 5 & 2 \end{vmatrix}$$

3. Evaluate

2 x x- 2x+ 1 x3 0 3x-2 2 )

4. If A = ( :

o

~~ ~

) show that

0 9 14 del(A)

= I

~ ~ II ~ ~


Chapter 10 Probability

Probability theory is nothing but common sense reduced to calculation.

Pierre Simon Laplace

In what follows we are going to cover the basics of probability. The ideas are reasonably straightforward, however as it involves counting it is very easy to make mistakes, as we shall see.

Suppose we perform an experiment whose outcome is not perfectly predictable e.g. roll a die or toss a coin. Imagine we make a list of all possible outcomes, call this list S the sample space. So

• If we toss a coin S consists of {Head, Tail}, we write S = {Head, Tail}.

• If a princess kisses a frog then we have two possibilities S = {we get a prince, we get an embarrassed frog}

• When we roll two dice then S is the set of pairs

(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)

An event A is a collection of outcomes of interest, for example rolling two dice and getting a double. In this case the event A is defined as

A = {(1,1), (2,2), (3,3), (4,4), (5,5), (6,6)}.

Suppose that the event B is that the sum is less than 4 when we roll two dice, then

B = {(1,1), (1,2), (2,1)}.

If two events A and B have no elements in common then we say they are mutually exclusive. For example let A be the event {At least one 6}, that is

A = {(1,6), (2,6), (3,6), (4,6), (5,6), (6,6), (6,1), (6,2), (6,3), (6,4), (6,5)}.

Since A and B have no elements in common they are mutually exclusive. Define the event C as

C = {(2,3),(25,7)}

Then A and C are also mutually exclusive. If D = {sum exceeds 10} then A and D are not mutually exclusive! Check this yourself.
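The check suggested above is easy to carry out by enumerating the sample space. The following Python sketch uses made-up variable names (`at_least_one_six`, `sum_below_4`, `sum_over_10`) for the events described in the text.

```python
from itertools import product

# Sample space for two dice: all ordered pairs (first die, second die).
S = set(product(range(1, 7), repeat=2))

doubles = {o for o in S if o[0] == o[1]}
at_least_one_six = {o for o in S if 6 in o}
sum_below_4 = {o for o in S if sum(o) < 4}
sum_over_10 = {o for o in S if sum(o) > 10}

def mutually_exclusive(e, f):
    """Two events are mutually exclusive when they share no outcomes."""
    return not (e & f)

print(mutually_exclusive(at_least_one_six, sum_below_4))
print(mutually_exclusive(at_least_one_six, sum_over_10))
print(sorted(at_least_one_six & sum_over_10))
```

The last line lists the outcomes the two events have in common, which is what stops them being mutually exclusive.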

Combining events

• It is handy to have a symbol for not A, we use ∼A but we are not very picky and not A is acceptable.

• The event A and B, often written A ∩ B, is the set of outcomes which belong both to A and to B.

• The event A or B, often written A ∪ B, is the set of outcomes which belong either to A or to B or to both.

You will recognise the notation from the earlier discussion on sets. Suppose S = {0,1,2,3,4,5,6,7,8,9} then if we define A = {1,3,5,7,9} and B = {4,5,7} we have

• A ∩ B = A and B = {5,7}

• While A ∪ B = A or B = {1,3,4,5,7,9}
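The same example can be reproduced with Python's built-in sets, where `&` is intersection, `|` is union and set difference from S gives "not A". The variable names are illustrative only.

```python
S = set(range(10))          # {0, 1, ..., 9}
A = {1, 3, 5, 7, 9}
B = {4, 5, 7}

A_and_B = A & B             # outcomes in both A and B
A_or_B = A | B              # outcomes in A or B or both
not_A = S - A               # outcomes of S that are not in A

print(sorted(A_and_B), sorted(A_or_B), sorted(not_A))
```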

10.0.2 Probability - the rules

Now to each event we are going to assign a measure (in some way) called the probability. We will write the probability of an event A as P[A]. We will set out some rules for probabilities; the main ones are as follows:

1. $0 \le P[A] \le 1$.

2. $P[S] = 1$

3. For mutually exclusive events A and B, $P[A \text{ or } B] = P[A] + P[B]$

We will add a few extra rules

(i) For mutually exclusive events $A_1$ and $A_2$ and $A_3 \ldots A_n \ldots$ then

$$P\left[\bigcup_i A_i\right] = \sum_i P[A_i]$$

or written differently

$$P[A_1 \text{ or } A_2 \text{ or } A_3 \cdots \text{ or } A_n \cdots] = P[A_1] + P[A_2] + P[A_3] + \cdots + P[A_n] + \cdots$$

(ii) For an event A, $P[\text{not } A] = 1 - P[A]$

(iii) For events A and B, $P[A \text{ or } B] = P[A] + P[B] - P[A \text{ and } B]$

All this is a bit fiddly but is not really very hard. If you were not too confused at this point you will have noticed that we do not have a way of getting the probabilities. This is a difficult point except in the case we are going to discuss.

10.0.3 Equally likely events

Suppose that every outcome of an experiment is equally likely. Then we can show from the rules above that for any event A

$$P[A] = \frac{\text{the number of outcomes in } A}{\text{the number of possible outcomes}}$$

This means we can do some calculations.

Examples

1. Suppose that the outcomes

• that a baby is a girl

• that a baby is a boy

are equally likely. Then as there are two possible outcomes we have P[girl] = 1/2 = P[boy].

2. Suppose now a family has 2 children: the possibilities are

BB BG GB GG

and so P[one boy and one girl] = 2/4 = 1/2 while P[two girls] = 1/4.

3. The famous statistician R A Fisher had seven daughters. If you count the possible sequences BBBBBBB to GGGGGGG you will find that there are $2^7 = 128$. Only one sequence is all girls so the probability of this event is 1/128.

4. A pair of dice is thrown. What is the probability of getting totals of 7 and 11? Suppose now we throw the two dice twice. What is the probability of getting a total of 11 and 7 in this case?

5. We draw 2 balls from an urn containing 6 white and 5 black. What is the probability that we get one white and one black ball?

As you can see we really need some help in counting.
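When every outcome is equally likely, a probability is just a ratio of counts, so small cases can be checked by brute-force enumeration. In this sketch the helper `prob` is a hypothetical name, and the two dice totals are computed separately rather than committing to one reading of example 4.

```python
from fractions import Fraction
from itertools import product

def prob(event, space):
    """P[A] = (outcomes in A) / (possible outcomes), all equally likely."""
    space = list(space)
    return Fraction(sum(1 for o in space if event(o)), len(space))

two_children = list(product("BG", repeat=2))        # BB, BG, GB, GG
p_one_each = prob(lambda o: set(o) == {"B", "G"}, two_children)
p_two_girls = prob(lambda o: o == ("G", "G"), two_children)

seven_children = list(product("BG", repeat=7))      # 2**7 sequences
p_all_girls = prob(lambda o: set(o) == {"G"}, seven_children)

two_dice = list(product(range(1, 7), repeat=2))
p_total_7 = prob(lambda o: sum(o) == 7, two_dice)
p_total_11 = prob(lambda o: sum(o) == 11, two_dice)
```

Enumeration stops being practical quickly, which is the point of the remark about needing help with counting.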

Exercises

1. A poker hand consists of 5 cards drawn from a pack of 52. What is the probability that a hand is a straight, that is 5 cards in numerical order, but not all of the same suit?

2. What is the probability that a poker hand is a full house, that is a triple and a pair?

3. A and B flip a coin in turn. The first to get a head wins. Find the sample space. What is the probability that A wins?

4. The game of craps is played as follows: A player rolls two dice. If the sum is a 2, 3 or 12 he loses. If the sum is a seven or an 11 he wins. Otherwise the player rolls the dice until he gets his initial score, in which case he wins, or gets a 7, in which case he loses. What is the probability of winning?

5. A man has n keys, one of which will open his door. He tries keys at random, discarding those that don't work until he opens the door. What is the probability that he is successful on the kth try?

6. The birthday problem. How many people should be in a room to make the probability of two or more having the same birthday more than 0.5? This is quite difficult and a simpler approach is to consider the probability that no two people have the same birthday. It is often a useful dodge in probability to look at P[not A] when P[A] is hard.

So

$$P[\text{no coincidences}] = \frac{365 \times 364 \times 363 \times \cdots \times (365 - n + 1)}{365 \times 365 \times \cdots \times 365} = 1 \times \frac{364}{365} \times \frac{363}{365} \times \cdots \times \frac{365 - n + 1}{365}$$

Number   Probability
15       0.74709868
16       0.71639599
17       0.68499233
18       0.65308858
19       0.62088147
20       0.58856162
21       0.55631166
22       0.52430469
23       0.49270277
24       0.46165574
25       0.43130030
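The product for P[no coincidences] is easy to evaluate directly. The sketch below assumes 365 equally likely birthdays; the function names are illustrative.

```python
def p_no_coincidence(n, days=365):
    """Probability that n people all have different birthdays."""
    p = 1.0
    for i in range(n):
        p *= (days - i) / days      # i-th person avoids the i days already taken
    return p

def smallest_group(threshold=0.5):
    """Smallest n for which P[at least one shared birthday] exceeds threshold."""
    n = 1
    while 1 - p_no_coincidence(n) <= threshold:
        n += 1
    return n

for n in range(15, 26):
    print(n, round(p_no_coincidence(n), 8))
```

Looking at the complement, as the text suggests, turns a hard count into a simple running product.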

[Figure: Prob of coincident birthdays]

be sure that if you roll a die you will never get 3.5, however if you rolled a die and kept an average of the score you will find that this will approach 3.5, see the plot below

[Figure: average score against no rolls, from 0 to 100]

For a coin we have Head and Tail. Suppose we count head as 1 and tail as zero, then $P[X = 1] = 1/2$ and $P[X = 0] = 1/2$ and so

$$E[X] = 1 \times \tfrac{1}{2} + 0 \times \tfrac{1}{2} = \tfrac{1}{2}.$$

A similar experiment gives the following

[Figure: average against no rolls, from 0 to 100]
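The experiment behind these plots can be imitated with a short simulation. This is a sketch only: the seed, the number of rolls and the helper names `expectation` and `running_average` are arbitrary choices, not taken from the text.

```python
import random
from fractions import Fraction

def expectation(dist):
    """E[X] for a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

die = {k: Fraction(1, 6) for k in range(1, 7)}
coin = {1: Fraction(1, 2), 0: Fraction(1, 2)}   # head counts 1, tail counts 0

def running_average(draw, n, seed=0):
    """Average of the first i draws, for i = 1..n."""
    rng = random.Random(seed)
    total, averages = 0, []
    for i in range(1, n + 1):
        total += draw(rng)
        averages.append(total / i)
    return averages

die_avgs = running_average(lambda rng: rng.randint(1, 6), 100000)
coin_avgs = running_average(lambda rng: rng.randint(0, 1), 100000)
print(expectation(die), die_avgs[-1])
print(expectation(coin), coin_avgs[-1])
```

Early averages wander, but the later ones settle close to the expected value.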

10.1.1 Moments

Some important expected values in statistics are the moments

$$\mu_r = E[X^r], \qquad r = 1, 2, \ldots$$

since we can usually estimate these while probabilities are much more difficult. You will have met the

• mean $\mu = E[X]$

• The variance $\sigma^2 = E[(X - \mu)^2]$.

• The parameter $\sigma$ is known as the standard deviation.

The central moments are defined as

$$E[(X - \mu)^r], \qquad r = 1, 2, \ldots$$

The third and fourth moments $E[(X - \mu)^3]$, $E[(X - \mu)^4]$ are less commonly used. We can prove an interesting link between the mean $\mu$ and the variance $\sigma^2$. The result is known as Chebyshev's inequality

$$P[|X - \mu| > \epsilon] \le \frac{\sigma^2}{\epsilon^2} \qquad (10.3)$$

This tells us that departures from the mean have small probability when $\sigma$ is small.
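As a quick sanity check, the sketch below compares the exact tail probability for a fair die with the Chebyshev bound, assuming the inequality in the form P[|X − μ| > ε] ≤ σ²/ε². The names `tail` and `bound` are illustrative.

```python
from fractions import Fraction

die = {k: Fraction(1, 6) for k in range(1, 7)}

mu = sum(x * p for x, p in die.items())                 # mean
var = sum((x - mu) ** 2 * p for x, p in die.items())    # variance

def tail(eps):
    """Exact P[|X - mu| > eps] for the die."""
    return sum(p for x, p in die.items() if abs(x - mu) > eps)

def bound(eps):
    """Chebyshev upper bound sigma^2 / eps^2."""
    return var / eps ** 2

for eps in (Fraction(1, 2), 1, 2, 3):
    print(eps, tail(eps), bound(eps))
```

The bound is often loose, but it holds for any distribution with a finite variance.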

10.1.2 Some Discrete Probability Distributions

We shall run through some of the most common and important discrete probability distributions.

The Discrete Uniform distribution

Suppose X can take one of the values 1, 2, ..., n with equal probability, that is

$$P[X = k] = \frac{1}{n}, \qquad k = 1, 2, \ldots, n \qquad (10.4)$$

• The mean is $E[X] = \frac{n+1}{2}$

• the variance is $\operatorname{var}(X) = \frac{n^2 - 1}{12}$

For example a die is thrown; the distribution of the score X is uniform on the integers 1 to 6.
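The mean and variance of the discrete uniform distribution can be confirmed by summing over the values directly. A small sketch, with `uniform_moments` as a hypothetical helper name:

```python
from fractions import Fraction

def uniform_moments(n):
    """Mean and variance of X uniform on 1..n, computed from the definition."""
    p = Fraction(1, n)
    mean = sum(k * p for k in range(1, n + 1))
    var = sum((k - mean) ** 2 * p for k in range(1, n + 1))
    return mean, var

die_mean, die_var = uniform_moments(6)   # score of a fair die
```

For the die this gives a mean of 7/2 and a variance of 35/12, in line with (n + 1)/2 and (n² − 1)/12.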

The Binomial distribution

Suppose we have a series of trials each of which has two outcomes, success S and failure F. We assume that the probability of success, p, is constant, so for every trial P[Success] = p and P[failure] = 1 − p. The probability of k successes in n trials is given by

$$P[X = k] = \binom{n}{k} p^k (1 - p)^{n-k}, \qquad k = 0, 1, 2, \ldots, n \qquad (10.5)$$

• The mean is $E[X] = np$

• the variance is $\operatorname{var}(X) = np(1 - p)$

The probability that a person will survive a serious blood disease is 0.4. If 15 people have the disease the number of survivors X has a Binomial B(15, 0.4) distribution.

• $P[X = 3] = \binom{15}{3} (0.4)^3 (0.6)^{12}$

• $P[X \le 8] = \sum_{x=0}^{8} \binom{15}{x} (0.4)^x (0.6)^{15-x}$

• $P[3 \le X \le 8] = P[X \le 8] - P[X \le 2] = \sum_{x=3}^{8} \binom{15}{x} (0.4)^x (0.6)^{15-x}$
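The three probabilities for the B(15, 0.4) example can be evaluated numerically with a short sketch. `binom_pmf` and `binom_cdf` are illustrative helper names; `math.comb` needs Python 3.8 or later.

```python
from math import comb

def binom_pmf(k, n, p):
    """P[X = k] for X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P[X <= k] for X ~ B(n, p)."""
    return sum(binom_pmf(x, n, p) for x in range(k + 1))

n, p = 15, 0.4
p_exactly_3 = binom_pmf(3, n, p)
p_at_most_8 = binom_cdf(8, n, p)
p_3_to_8 = binom_cdf(8, n, p) - binom_cdf(2, n, p)

print(round(p_exactly_3, 4), round(p_at_most_8, 4), round(p_3_to_8, 4))
```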

Applying expectation using the Binomial

A more interesting use is: Suppose we wish to test whether N people have a disease. It would seem that the only way to do this is to take a blood test from each person, which will require N blood tests. Suppose we try the following:

1. We pool the blood of k < N people.

2. If the combined sample is negative we have k people without the disease.

3. If the pooled test is positive we then test all k people individually, resulting in k + 1 tests in all.

4. Repeat until everyone is diagnosed.

What does this save us?

Assume the probability of a person having the disease is p and that we have a Binomial distribution for the number with the disease. Then for a group of k

1. $P[\text{just 1 test}] = (1 - p)^k$

2. $P[k + 1 \text{ tests}] = 1 - P[\text{just 1 test}] = 1 - (1 - p)^k$

So the expected number of tests is

$$1 \times (1 - p)^k + (k + 1)\left(1 - (1 - p)^k\right) = k + 1 - k(1 - p)^k$$

This does give a considerable saving in the number of tests, see the diagram below

[Figure: four panels, p = 0.1, p = 0.01, p = 0.001 and p = 1e-04, each plotted against k]
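The saving from pooling can be explored numerically. The sketch below takes the expected number of tests for one pool of k people to be 1·(1 − p)^k + (k + 1)(1 − (1 − p)^k), which follows from the two probabilities given above. The function names and the search limit `max_k` are assumptions made for illustration.

```python
def expected_tests_per_group(k, p):
    """Expected number of tests for one pool of k people."""
    q = (1 - p) ** k                      # probability the pooled sample is negative
    return 1 * q + (k + 1) * (1 - q)

def expected_tests_per_person(k, p):
    """Expected tests divided by pool size; below 1 means pooling saves tests."""
    return expected_tests_per_group(k, p) / k

def best_group_size(p, max_k=200):
    """Pool size between 2 and max_k with the fewest expected tests per person."""
    return min(range(2, max_k + 1), key=lambda k: expected_tests_per_person(k, p))

for p in (0.1, 0.01, 0.001, 0.0001):
    k = best_group_size(p)
    print(p, k, round(expected_tests_per_person(k, p), 4))
```

The rarer the disease, the larger the worthwhile pool and the bigger the saving.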

The Hypergeometric distribution

Suppose we have N items and D of these are defective. I take a sample of size n from these items, then the probability that this sample contains k defectives is

$$P[X = k] = \frac{\binom{D}{k}\binom{N-D}{n-k}}{\binom{N}{n}}, \qquad k = 0, 1, 2, \ldots, n \qquad (10.6)$$

• The mean is $E[X] = n\frac{D}{N}$

• the variance is $\operatorname{var}(X) = \frac{(N - n)}{(N - 1)}\, n \frac{D}{N}\left(1 - \frac{D}{N}\right)$
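The hypergeometric probabilities and the stated mean and variance can be checked exactly for a small case. The values N = 20, D = 5, n = 4 are made up for illustration, and `hypergeom_pmf` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(k, N, D, n):
    """P[X = k]: k defectives in a sample of n from N items, D of them defective."""
    return Fraction(comb(D, k) * comb(N - D, n - k), comb(N, n))

N, D, n = 20, 5, 4                      # illustrative values only
pmf = {k: hypergeom_pmf(k, N, D, n) for k in range(n + 1)}

mean = sum(k * p for k, p in pmf.items())
var = sum((k - mean) ** 2 * p for k, p in pmf.items())

# Closed forms quoted in the text.
mean_formula = Fraction(n * D, N)
var_formula = Fraction(N - n, N - 1) * n * Fraction(D, N) * (1 - Fraction(D, N))
```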

While situations involving the Hypergeometric are common it is common practice to approximate with the Binomial when N is large compared to D. We set p = D/N

16] $= \sum_{x=16}^{\infty} \frac{10^x}{x!}\exp(-10) = 1 - \sum_{x=0}^{15} \frac{10^x}{x!}\exp(-10) = 1 - 0.9513$
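The terms in the sum above have the form of Poisson probabilities with mean 10; the surrounding text is missing here, so that reading is an assumption. Under it, the value 0.9513 can be reproduced as follows (`poisson_pmf` is an illustrative name).

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """lam^x / x! * exp(-lam)."""
    return lam ** x / factorial(x) * exp(-lam)

p_at_most_15 = sum(poisson_pmf(x, 10) for x in range(16))   # x = 0, ..., 15
p_at_least_16 = 1 - p_at_most_15

print(round(p_at_most_15, 4), round(p_at_least_16, 4))
```

Working with the complement avoids summing an infinite series.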

10.1.3 Continuous variables

All the cases we have considered so far have been where X takes discrete values. This does not have to be true: we can imagine X taking a continuous set of values. Since we have thought of a probability at X = k we might think of the probability of X being in some small interval x, x + δx. This probability will be P[x

< X
14] = 0.025 or

$$P[X < 14] = P\left[\frac{X - \mu}{\sigma} < \frac{14 - \mu}{\sigma}\right] = 1 - 0.025 = 0.975$$

Hence $(14 - \mu)/\sigma = 1.96$. We have a pair of equations

1. $2 - \mu = -\sigma \times 1.645$

2. $14 - \mu = \sigma \times 1.96$

Solving gives

$$(14 - \mu) - (2 - \mu) = 12 = 3.605\sigma$$

or $\sigma = 3.32871$ and so $\mu = 7.475728$
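The pair of equations is linear in μ and σ, so it can be solved in two lines. The variable names below are illustrative; the quantile values 1.645 and 1.96 are the ones used in the text.

```python
z_low, z_high = 1.645, 1.96     # standard Normal values from the two equations
x_low, x_high = 2.0, 14.0       # the corresponding values of X

# Subtracting (2 - mu = -sigma * z_low) from (14 - mu = sigma * z_high):
sigma = (x_high - x_low) / (z_high + z_low)
mu = x_low + z_low * sigma

print(round(sigma, 5), round(mu, 6))
```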

The Normal approximation to the Binomial

A Binomial variable X which is B(n, p) can be approximated by a Normal variable Y, mean np, variance np(1 − p). This can be very useful as the Binomial tables provided are not very extensive. This is known as the Normal approximation to the Binomial. In this case $z = (Y - np)/\sqrt{np(1 - p)}$ is standard Normal.

Example

Suppose X is the number of 6's in 40 rolls of a die. Let Y be $N\left(\frac{20}{3}, \frac{50}{9}\right)$. Then

$$P[X < 5] \approx P[Y < 5] = P\left[z < \frac{5 - 20/3}{\sqrt{50/9}}\right] = P[z < -0.707]$$

You can refine this approximation but we will settle for this at the moment.
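To see how good the approximation is in this example, the sketch below evaluates the Normal probability using the error function and compares it with the exact Binomial sum. `normal_cdf` is an illustrative helper, not something defined in the text.

```python
from math import comb, erf, sqrt

def normal_cdf(z):
    """Standard Normal P[Z <= z], written in terms of the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

n, p = 40, 1 / 6
mean, var = n * p, n * p * (1 - p)      # 20/3 and 50/9

z = (5 - mean) / sqrt(var)
approx = normal_cdf(z)                  # Normal approximation to P[X < 5]

# Exact Binomial value: P[X < 5] = P[X <= 4].
exact = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(5))

print(round(z, 4), round(approx, 4), round(exact, 4))
```

The gap between the two numbers is what a refinement such as a continuity correction is meant to reduce.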

Exercises

1. A die is rolled; what is the probability that

(a) The outcome is even.

(b) The outcome is a prime.

(c) The outcome exceeds 2.

(d) The outcome is -1.

(e) The outcome is less than 12.

2. Two dice are rolled. What is the probability that

(a) The sum of the upturned faces is 7?

(b) The score on one die is exactly twice the score on the other.

(c) You throw a double, that is the dice each have the same score.

3. Suppose we toss a coin 3 times. Find the probability distribution of

(a) X = the number of tails,

(b) Y = the number of runs. Here a run is a string of heads or tails. So for HTT, Y = 2.

4. The student population in the Maths department at the University of San Diego was made up as follows

• 10% were from California

• 6% were of Spanish origin

• 2% were from California and of Spanish origin.

If a student from the class was to be drawn at random what is the probability that they are

(a) From California or of Spanish origin.

(b) Neither from California nor of Spanish origin.

(c) Of Spanish origin but not from California

5. For two events A and B the following probabilities are known

$$P[A] = 0.52 \qquad P[B] = 0.36 \qquad P[A \cup B] = 0.68$$

Determine the probabilities

(a) $P[A \cap B]$

(b) $P[\sim A]$

(c) $P[\sim B]$

6. A hospital trust classifies a group of middle aged men according to body weight and the incidence of hypertension. The results are given in the table.

                   Overweight   Normal Weight   Underweight   Total
Hypertensive          0.10          0.08           0.02        0.20
Not Hypertensive      0.15          0.45           0.20        0.80
Total                 0.25          0.53           0.22        1.00

(a) What is the probability that a person selected at random from this group will have hypertension?

(b) A person selected at random from this group is found to be overweight; what is the probability that this person is also hypertensive?

(c) Find P[Hypertensive ∪ Underweight]

(d) Find P[Hypertensive ∪ Not Underweight]

7. Two cards are drawn from an ordinary deck of 52 cards. What is the probability of drawing

(a) Two aces.

(b) The two black aces.

(c) Two cards from the court cards K, Q, J

(a) Four cards are aces

(b) Four cards are the same, i.e. 4 10's, 4 9's etc.

(c) All the cards are of the same suit.

(d) All the cards are of the same suit and are in sequence.

9. A student of statistics was told that there was a chance of 1 in a million that there was a bomb on an aircraft. He reasoned that there would be a one in $10^{12}$ chance of there being two bombs on a plane. He thus decided that he should take a bomb with him (defused - he was not stupid) to reduce the odds of an explosion. Assuming no security problems, is this a sensible strategy?

10. There are four tickets numbered 1, 2, 3, 4. A two digit number is formed by drawing a ticket at random from the four and a second from the remaining three. So if the tickets were 4 and 1 the resulting number would be 41. What is the probability that

(a) The resulting number is even.

(b) The resulting number exceeds 20

(c) The resulting number is between 22 and 30.

11. Three production lines contribute to the total pool of parts used by a company.

• Line 1 contributes 20% and 15% of items are defective.

• Line 2 contributes 50% and 5% of items are defective.

• Line 3 contributes 30% and 6% of items are defective.

(a) What percentage of items in the pool are defective?

(b) Suppose an item was selected at random and found to be defective, what is the probability that it came from line 1?

(c) Suppose an item was selected at random and found not to be defective, what is the probability that it came from line 1?

10.2 The Normal distribution

This table gives the cumulative probabilities for the standard normal distribution, that is

$$P[Z \le z] = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} \exp(-x^2/2)\, dx$$

This is the shaded area in the figure.

z

-3.'1 -::L~

-3.2 -3.1 -::~.O

-2.9 -2.8 -2.7

-2.G -2.5 -2.4 -2.:) -2.2 -2.1 -2.0 -1.9 -1.8 -1.7 -1.6 -1.5 -1.4 -1.3 -1.2 -1.1 -1.0 -0.9 -0.8 -0.7 -0.6 -0.5 -0.4 -0.3 -0.2 -0.1 0.0

0.00 0.0003 0.0005 0.0007 0.0010 0.001 ;) 0.0019 0.0026 0.00:)5 0.0047 0.0062 0.00~2

OJ)107 0.0139 0.0179 0.022~

0.0287 0.0359 0.044ti 0.0548 0.0668 O.O~O~

0.0968 0.1151 0.1357 0.15~7

0.1841 0.2119 0.2420 0.274:1 0.308.5 0.:)44ti 0.3821 0.1207 0.4002 0.5000

-0.01 0.0003 0.0005 0.0007 0.0009 0.001 ;) 0.0018 0.002.5 0.00;)4 0.0045 0.0060 O.OOSO 0.0104 0.01:16 0.017--1 0.0222 0.0281 0.03.51 0.04;)ti 0.05:17 0.06.5.5 0.079;) 0.0951 0.11:11 0.133.5 0.15ti2 0.1814 0.2090 0.2;)~9

0.2709 0.30.50 0.:)409 0.378:1 0..t168 0.45ti2

-U.U2 0.0003 0.0005 O.OOOG 0.0008 0.0013 0.0018 0.002·-'1 0.00;)3 0.0044 0.00.58 0.007S 0.0102 0.01:12 0.0170 0.0217 0.0274 0.03·1-'1 0.0427 0.0526 0.06.-13 0.077S 0.09:14 0.1112 0.13V1 0.15;)9 0.1788 0.2061 0.2;)5S 0.267G 0.3015 (};);)72 0.:1745 0.·-1128 0.4522

-U.(X~

0.0003 0.OUU4 O.OOOG 0.0008 0.0012 0.0017 0.0023 0.0032 0.004:1 0.0057 0.0075

o.omm 0.0129 0.0166 0.0212 0.0268 0.0336 0.041S 0.0516 0.0630 0.0704 0.0918 0.109:1 0.1282 0.1515 0.1762 0.2033 0.2327 0.2G4:1 0.2881 0.;)330 0.:1707 0.·-1080 0.44S3

L

-0.04 0.0003 U.0004 O.OOOG 0.0008 0.0012 O.OOlG 0.0023 0.00:)1 0.0041 0.0055 0.007:) O.OODti 0.0125 0.0162 0.0207 0.0262 0.0329 0.040D 0.0505 0.0618 0.074D 0.0901 0.1075 0.1271 (.I. 14D2 0.1736 0.2005 (.I. 22 Do 0.2G11 0.28·16 0.3300 0.:1G69 0.·-'1052 0.444:)

146

-0.05 0.0003 (.1.0004 O.OOOG 0.0008 (.1.0011 O.OOlG 0.0022 O.O();)O 0.0040 0.005.-1 (.1.0071 0.00D4 0.0122 0.0158 (.1.0202 0.0256 0.0322 0.0401 0.0495 0.0606 0.07:)5 0.0885 0.1056 0.1251 (.I. 1409 0.1711 0.1977 (.I. 220ti 0.2578 0.2912 (.1.3204 0.:1632 0.·-'1013 (.1.4404

-0.00 0.0003 0.0004 O.OOOG 0.0008 0.0011 0.0015 0.0021 0.0029 0.00:19 0.00.52 0.00ti9 0.0091 0.0119 0.01.5'1 OJ)197 0.0250 0.031·-1 0.();)92 0.0485 0.059·-1 0.0721 0.0869 0.10:18 0.1230 0.144ti 0.1685 0.19,19 0.22;)ti 0.2546 0.2877 0.:)22~

0.3594 0.397--1 0.4:)ti4

-0.7 0.0003 0.0004 0.0005 0.0008 0.0011 0.0015 0.0021 0.002S 0.00:18 0.00.51 O.OOtiS 0.00~9

0.0116 0.01.50 0.0192 0.0244 0.0307 0.0;)~4

0.0475 0.0.582 0.070S 0.085:1 0.1020 0.1210 0.1423 0.1660 0.1922 0.2200 0.2514 0.28.-13 (};)192 0.:1557 0.3936 0.4;)25

-O.O~

0.0003 0.0004 0.0005 0.0007 0.0010 0.0014 0.0020 0.0027 0.00:17 0.00·-'18 0.0000 0.00S7 0.011:1 0.01-'16 O.OlSS 0.02:19 0.0301 0.0375 0.0465 0.0571 0.0094 0.08:18 0.100:1 0.1180 0.1401 0.16:15 0.188;1 0.2177 0.248:1 0.2810 0.;)150 0.:1520 0.3887 0.42SO

-O.OD 0.0002 O.OO(J::~

0.0005 0.0007 0.0010 0.0014 0.0019 (.1.0020 0.003G 0.0018 0.00ti4 0.00~4

0.0110 0.01·13 0.01 ~:) 0.0233 0.020·1 (.1.0307 0.0455 0.0550 (.l.Oo~l

0.0823 0.0985 0.1170 0.137D 0.1611 0.1867 0.214~

0.2451 0.2776 0.3121 0.:1483 0.3859 0.4247

This table gives the cumulative probabilities for the standard normal distribution, that is

$$P[Z \le z] = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} \exp(-x^2/2)\, dx$$

This is the shaded area in the figure.

z

0.0 0.1 lL2 0.;1 0.--'1 0 ..5 lU::i 0.7 0.8 0,9 1.0 1.1 1.2 1.;1 1.,'1 1.5 1.6 1.7 1.8 1.9 2.0 2.1 2,2 2.;1 2.,'1 2,5 2.6 2"" ./ 2.8 2,9 ;1.0 3.1 ;),2 ;1.;1 3.--'1

0.00 0.5000 0.5398 0.579:) 0.6179 0.6.5.5--'1 0.691.5 0.7257 0.7580 0.7881 lU~159

0.841;1 0.86--'13

0.01 0.5040 0 ..5''138

0.02 0.5080 0 ..5,'178

lL5~32

lL5~71

0.6217 0.6.5[H 0.6950 lL 7291 0.7611 0.7910 lL~l Sti 0.84;18 0.8665

0.6255 0.6628 0.6885 lL7324 0.7642 0.7839 lL~212

0.8461 0.8686

0.S~49

lL~~ti9

lL~SS~

0.90;12 0.0192 0.D;);)2 0.9452 0.9554 0.06--'11 0.D71 ;) 0.9772 0.0821

0.9049 0.9207 0,9;)45 0.946;1 0.9564 0.96,'18 lUJ719 0.9778 0.9826

O.D~ti1

lUJ~ti4

0.989;1 0.0918

0.9896 0.9920 0,9940 0.9955 0.9966 0.9975 lUmS2 0.9987 0.9981 lUm93 0.9995 0.9987

0.9066 0.9222 lUJ357 0.9474 0.9573 0.9656 lUJ 72 ti 0.9783 0.9830 lUJStiS 0.9898 0.9822 lUW41 0.9956 0.9867 0.9876 lUWS2 0.9987 0.9881 0,9994 0.9995 0.9887

0.D9;)~

0.995;1 0.096.5 0.097-'1 0.D9~1

0.9987 0.0990 O.D~m;)

0.9995 0.9997

0.03 0.5120 0.5517 U.5910 0.6293 0.666'1 0.7019 0.7357 0.7673 0.7867 (JS2:)S 0.8485 0.8708 (JS907 0.9082 0.8236 0.9370 0.9484 0.9582 0.866,1 0.97:)2 0.9788 0.883,1 (J9S71 0.9901 0.8825 0.994:) 0.9957 0.8868 0.8877

0.0'1 0.5160 0.5557

0.0.5 0.5199 0.5.596

U.5D4~

0.59~7

0.6331 0.6700 0.705--'1

0.6;168 0.6736 0.7088 0.7422 0.77;14 0.8023

0.7:)~9

0.7704 0.799.5 lJ.S2ti4 0.8508 0.8729

(JS2~9

0.85;11 0.87-'19

0.06 0.52;19 0 ..5636 lU::i02ti 0.6406 0.6772 0.7123 lL7454 0.7764 0.80.51 lL~;) 15 0.8554 0.8770

0.07 0.5279 0 ..5675 lU::iOti4 0.644;1 0.6808 0.7157 lL74~ti

0.7794 0.8078 lL~340

0.8577 0.8780

lJ.~D25

(J~944

lL~9ti2

lL~9S0

0.9099 0.9251

0.979;1 0.9838

0.9115 0.926.5 lU);)94 0.9505 0.9599 0.9678 0.D744 0.9798 0.98-'12

0.91;11 0.9278 lUJ40ti 0.9515 0.9608 0.9686 lUJ750 0.980;1 0.98-'16

0.D~75

0.D~7~

0,9~~1

0.9904 0.9927 0.DD45 0.9959 0.9969 0.9977

0.9906 0.9929 0.D94ti 0.9960 0.9970 0.9978

0.9909 0.9931 lUm4S 0.9961 0.9971 0.9978

0.9147 0.9282 lUJ41S 0.9525 0.9616 0.9683 lUJ75ti 0.9808 0.9850 lUJSS4 0.9911 0.9832 lUW49 0.9962 0.9872 0.9878 lUWS5 0.9989 0.9882 lUW95 0.9996 0.9887

0.D:)~2

0.9495 0.9591 0.9671 0.D7:)~

0.99~:)

0.DD~4

0.D9~4

lUm~5

0.9988 0.8891 0.99D4 0.9996 0.8897

0.9988 0.9992 0.DDD4 0.9996 0.9997

0.9989 0.9992 0.D994 0.9996 0.9997

0.9989 0.9992 0,mm4 0.9996 0.9997

147

0.08 0.5;119 0.5711 0.ti10:) 0.6480 0.68'11 0.7190 0.7517 0.7823 0.8106 (JS3ti5 0.8599 0.8810 0.S9D7 0.9162 0.8306 0.942D 0.9535 0.9625 0.8600 0.97til 0.9812 0.885-1

0.09 0.5359 0.5753 0.ti141 0.6517 0.6879 0.722--'1 0.7549 0.7852 0.8133 0.S:)S9 0.8621 0.8830 0.9015 0.9177 0.8319 0.9441 0.9545 0.963;1 0.8706 0.97ti7 0.9817 0.8857

0.9S~7

0.9~DO

0.9913 0.883;1 0.9951 0.9963 0.8873 0.8880

0.9916 0.8036 lUHJ52 0.9964 0.807--'1 0.8081

0.99~ti

0.9D~ti

0.9990 0.8803 (J99D5 0.9996 0.8897

0.9990 0.8003 0.9DD5 0.9997 0.8998
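Entries of a table like this can be regenerated when a printed value is in doubt. The sketch below uses the identity P[Z ≤ z] = ½(1 + erf(z/√2)). The function names and the row layout (one row per first decimal of z, ten columns for the second decimal) are assumptions about how the table is organised.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard Normal cumulative probability P[Z <= z]."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def table_row(z0, sign=1):
    """Ten entries for z0, z0 ± 0.01, ..., z0 ± 0.09, rounded to 4 places."""
    return [round(normal_cdf(z0 + sign * h / 100), 4) for h in range(10)]

for tenth in range(0, 35):
    z0 = tenth / 10
    print(f"{z0:4.1f}", " ".join(f"{v:.4f}" for v in table_row(z0)))
```

Calling `table_row(-1.9, sign=-1)` gives the corresponding row of the negative half of the table.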

Chapter 11 Looking at Data

It is very much more difficult to handle data than to construct nice probability arguments. We begin by considering the problems of handling data. The first questions are about the provenance of the data.

• Is it reliable?

• Who collected it?

• Is it what it is said to be?

• Is it a sample and from what population?

Such questions are important because if the data is wrong no amount of statistical theory will make it better. Collecting your own data is the best as you should know what is going on. Almost all statistical theory is based on the assumption that the observations are independent and in consequence there is a large body of methodology on sampling and data collection.

11.1 Looking at data

Once you have the data what is the next step? If it is presented as a table (do read the description) it may well be worth reordering the table and normalising the entries. Simplifying and rounding can be very effective, especially in reports. After gathering data, it pays to look at the data in as many ways as possible. Any unusual or interesting patterns in the data should be flagged for further investigation.

The Histogram

Anyone who does not draw a picture of their data deserves all the problems that they will undoubtedly encounter. The basic picture is the histogram. For the histogram we split the range of the data into intervals and count the number of observations in each interval. We then construct a diagram made up of rectangles erected on each interval, the area of each rectangle being proportional to the count.

110 190 11 ;1 . t 19 6~i 150 29 22 11 7;3 84 30 27 17,) 18 17 6 . tl 61 50 7~i 8 27

5,)

65

7f) 70 18 26

2~i

4:3 28 17 60 82 82

12 21 29 44

54 8,)

')r.

5

..h)

10 20 130 47 116 55 . t;3 80 ;32 75

29 1]5 67 52 10

]5 40 :32 12 15 37 .", ,19 If) ,13 2",.) ;3() 6 21 64 16 95 29 22 ,)2 ]9 Hi Hi 20 ')'""' ,)0 ]7 IJ ( 2() 251 9 17 22 28 45 ,),.)

Table 11.1: Dorsal lengths of octopods

[Figure: Histogram of oct]
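The counting step behind a histogram can be sketched without any plotting library. The data values below are made up for illustration and are not the octopod measurements. Equal-width intervals are assumed, so that bar height, like bar area, is proportional to the count.

```python
def histogram_counts(data, width, start=0):
    """Count observations in intervals [start + i*width, start + (i+1)*width)."""
    counts = {}
    for x in data:
        left = start + ((x - start) // width) * width   # left edge of x's interval
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))

lengths = [12, 21, 29, 44, 54, 85, 110, 190, 17, 22, 28, 45]   # hypothetical data

for left, count in histogram_counts(lengths, 50).items():
    print(f"{left:>4} to {left + 50:<4} {'#' * count}")
```

Changing `width` shows how sensitive the picture is to the choice of intervals.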

11.1.1 Summary Statistics

Location

This is often called the "measure of central tendency" in our textbooks, or the "centre" of the dataset in other sources. Common measures of location are the mean and median. Less common measures are the mode and the truncated mean. Given observations $x_1, x_2, \ldots, x_n$

• The sample mean is just $\frac{1}{n}\sum_{i=1}^{n} x_i$, written $\bar{x}$. For the octopods it is ;H.67021.

• The median is the middle value. We arrange the observations in order and if n is odd pick the middle one. If n is even then we take the average of the two middle values. For the octopods it is 32.5


• A trllllcatE-ld H1P.