# ST104 STATISTICAL LABORATORY: R programming tutoring


Practical 6: Assessed coursework

ST104

Term 3, 2019

Important Information

This practical session is assessed. The deadline for submission is 11am on Thursday, 9th May.

Your reports should be submitted electronically on Moodle. Here is the link for submission:

https://moodle.warwick.ac.uk/mod/assign/view.php?id=674702

You can also find the submission link on the right hand side of the module webpage.

Please note that your report must:

- be submitted in PDF,

- be no more than 5 sides of A4 in length (excluding figures),

- be 12pt font for the main body of the report.

You will lose marks if you do not follow these requirements.

Also, please make sure you include your student ID code and lab group number on the

front sheet, and NOT your name!

Exercises

1. Pseudo-random numbers and the inversion method.

(a) For arbitrary choices of initial seeds $U_1$ and $U_2$ (in the interval (0, 1]), let

$$U_{i+2} = (U_{i+1} + U_i) \bmod 1, \quad i \in \mathbb{N}.$$

Why is this not a good pseudo-random independent U(0, 1] generator? [2 marks]
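One way to explore this recurrence empirically is to generate a sequence and check how three consecutive values relate to each other. The following is a minimal sketch only: the helper name `generate.lagged` and the seed values are illustrative assumptions, not part of the coursework. Note that `%%` in R returns values in [0, 1) rather than (0, 1].

```r
# Hypothetical helper: iterate U[i+2] = (U[i+1] + U[i]) mod 1
generate.lagged <- function(n, u1, u2) {
  u <- numeric(n)
  u[1] <- u1
  u[2] <- u2
  for (i in seq_len(n - 2)) {
    u[i + 2] <- (u[i + 1] + u[i]) %% 1
  }
  u
}

# Arbitrary illustrative seeds
u <- generate.lagged(1000, u1 = 0.3141, u2 = 0.5926)

# Residual of the defining relation for every consecutive triple;
# up to floating-point rounding it only takes the values 0 or -1
resid <- u[3:1000] - u[2:999] - u[1:998]
table(round(resid))
```

Plotting consecutive triples (for example with `pairs(cbind(u[1:998], u[2:999], u[3:1000]))`) is another way to look at the same structure.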

(b) In lectures we saw how the inversion method can be used to simulate

from a Bernoulli(p) distribution using U ~ U(0, 1]. The function

generate.bernoulli(), shown below, performs this task.

    generate.bernoulli <- function(n = 1, p = 0.5) {
      sample <- runif(n)
      bernoulli <- as.numeric(sample < p)
      return(bernoulli)
    }

Explain in words what each line of code is doing and briefly justify why this

method does what it was intended to do. [3 marks]
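A quick empirical check can help when reasoning about the function. The sketch below copies generate.bernoulli() from the handout and compares the sample proportion of ones with p; the seed, the sample size and the value p = 0.3 are arbitrary choices made for illustration.

```r
# Function as given in the handout
generate.bernoulli <- function(n = 1, p = 0.5) {
  sample <- runif(n)
  bernoulli <- as.numeric(sample < p)
  return(bernoulli)
}

set.seed(1)                         # arbitrary seed, for reproducibility only
x <- generate.bernoulli(n = 10000, p = 0.3)

table(x)   # counts of zeros and ones
mean(x)    # sample proportion of ones, to be compared with p = 0.3
```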

(c) How would you simulate from a Geometric(p) distribution:

(i) Using U ~ U(0, 1]?

Hint: Show that $F(x) = 1 - (1 - p)^{\lfloor x \rfloor}$, where $\lfloor x \rfloor$ denotes the greatest integer less than or equal to x. [2.5 marks]

(ii) Using E ~ Exp(1)? [1.5 marks]

Write a function generate.geometric() that, given a sample size n and a

probability p, returns a vector of length n which contains realisations from a

Geometric(p). Your function should use either a sample of size n from the

U(0, 1] distribution or a sample of size n from the Exp(1) distribution. Choose

only one of these two approaches but you should NOT use the built-in rgeom()

function. Investigate (via comparisons you deem appropriate) what happens if

you change the size n or the probability p. Try to experiment with the following

combinations of n and p and include your comments in the report.

| n     | p   |
|-------|-----|
| 10    | 0.1 |
| 10000 | 0.1 |
| 10    | 0.9 |
| 10000 | 0.9 |

[4 marks]

[Total: 13 marks]
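As an illustration of the inversion approach in part (c)(i), here is one possible sketch, not a model answer. It assumes the Geometric(p) distribution counts the number of trials up to and including the first success (support 1, 2, ...), which is the convention implied by the CDF in the hint; R's built-in rgeom() instead counts failures and starts at 0. It also assumes 0 < p < 1 and uses the fact that 1 - U has the same distribution as U.

```r
# One possible sketch: the smallest integer k with F(k) >= u is
# ceiling(log(1 - u) / log(1 - p)); 1 - u is replaced by u here
generate.geometric <- function(n, p) {
  stopifnot(p > 0, p < 1)           # assumption: p strictly inside (0, 1)
  u <- runif(n)
  ceiling(log(u) / log(1 - p))
}

set.seed(2019)                      # arbitrary seed, for reproducibility only
for (p in c(0.1, 0.9)) {
  for (n in c(10, 10000)) {
    x <- generate.geometric(n, p)
    cat("n =", n, " p =", p, " sample mean =", mean(x),
        " theoretical mean =", 1 / p, "\n")
  }
}
```

Comparing the sample mean with 1/p is only one of several possible comparisons; sample variances or bar plots of relative frequencies would work in the same way.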

2. Bernoulli random variables and their relatives.

(a) Given a source of Bernoulli random variables, it’s relatively easy to write a

function to generate from the Binomial distribution (remember that a Binomial

random variable with parameters n and p is the sum of n independent Bernoulli

random variables of common success probability p). Write a function which given

n and p will generate a single Binomial(n, p) random variable. Your function can

make use of generate.bernoulli() as given above (or any other user defined

function which simulates from a Bernoulli(p) distribution) but you should NOT

use the built-in rbinom() function. [1 mark]

(b) Write another function which, given m, n and p, will generate m realisations

from the Binomial(n, p) distribution. You can use any of the previously defined

functions but you should NOT use the built-in rbinom() function. Use your

function to generate 5000 realisations from a Binomial(10, 0.25) distribution. In

your report, only include the code for the function and the R command you used

to call it (NOT the 5000 realisations). [1.5 marks]

(c) Plot a histogram of those realisations (normalised like a probability density).

You may need to use the argument breaks to get a sensible histogram. In your

report, include both the histogram and the R command you used to obtain the

histogram.

[1.5 marks]

(d) Use R to compute the sample mean and sample variance for your realisations. In

your report, include both the R commands you used and your answers. How do

your answers compare to the expectation and variance of a Binomial(10, 0.25)

random variable? [2 marks]

(e) If we wish to plot the graph of a function in R, we can evaluate that function

on a grid of points and use the plotting functions to join the dots. Use seq to

generate a suitable grid of points to add the density of a normal distribution of

the same mean and variance to your histogram. Use the lines function (which

adds lines to an existing graph rather than plotting a new one) to add a blue

line showing this normal density to your histogram. Include both your code and

the corresponding graph in your report. [2 marks]
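The grid-and-join-the-dots idea described in part (e) can be sketched as follows. This is an illustration under stated assumptions: the stand-in sample is built by summing Bernoulli indicators in a matrix (one of many ways to avoid rbinom()), and the seed, titles and grid size are arbitrary.

```r
set.seed(104)                       # arbitrary seed, for reproducibility only
m <- 5000; n <- 10; p <- 0.25

# Stand-in sample: each column holds n Bernoulli(p) indicators, summed per column
x <- colSums(matrix(runif(m * n) < p, nrow = n))

# One bar per integer value, scaled so that the total area is 1
hist(x, breaks = seq(-0.5, n + 0.5, by = 1), freq = FALSE,
     main = "Stand-in Binomial(10, 0.25) sample", xlab = "x")

# Evaluate the normal density on a fine grid and join the dots in blue
grid <- seq(-0.5, n + 0.5, length.out = 200)
dens <- dnorm(grid, mean = mean(x), sd = sd(x))
lines(grid, dens, col = "blue")
```

Centring the breaks on the integers keeps each bar aligned with a single value of the discrete variable, which makes the comparison with the continuous density easier to read.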

(f) Repeat steps (b)-(e) for a Binomial(1000, 0.25) distribution. What do you observe?

In your report, only include the corresponding histogram (with the corresponding

normal density in blue) and your comments. [2 marks]

(g) Repeat steps (b)-(e) for a Binomial(10000, 0.0001) distribution. What do you

observe? Try to also add the probability mass function of a Poisson distribution

with the same mean. For the Poisson mass function use a red colour. What

is significant about what you observe here? In your report only include the

corresponding histogram (with the corresponding Normal density in blue and

Poisson mass function in red) and your comments. [3 marks]
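Before simulating, it can be instructive to compare the exact mass functions numerically. The sketch below uses dbinom() and dpois() (density functions, not the random generators excluded above); the range of k shown is an arbitrary choice.

```r
n <- 10000; p <- 0.0001
k <- 0:6                            # arbitrary range of values to inspect

binom.pmf <- dbinom(k, size = n, prob = p)
pois.pmf  <- dpois(k, lambda = n * p)   # Poisson with the same mean, n * p = 1

# Side-by-side table of the two mass functions
print(round(cbind(k, binom.pmf, pois.pmf), 5))

# Binomial masses as vertical spikes, Poisson masses as red points
plot(k, binom.pmf, type = "h", ylab = "probability",
     main = "Binomial(10000, 0.0001) vs Poisson(1) mass functions")
points(k, pois.pmf, col = "red")
```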

[Total: 13 marks]

3. Convolutions.

(a) Use rexp() to obtain a sample of 10,000 Exp(1) random variables. Plot a

histogram of your sample on the same scale as a probability density. What do

you observe and why? In your report, include your code, the histogram and

your comments. [1.5 marks]

(b) Write a function which has one argument, n, and which returns a vector of length

n each element of which is obtained as the sum of two Exp(1) random variables.

Plot a histogram of the values obtained using this function for n = 10,000. In

your report, include your code and the histogram. [1.5 marks]

(c) Compute the density of the sum E1 + E2, where E1 and E2 are independent

Exp(1) random variables. How does this density relate to

the density of a Gamma(α, β) distribution? Add the density you obtained to the

histogram you produced in part (b) using red colour. In your report, include the

computations, your comments, the code and the corresponding plot. [4 marks]

(d) Adapt the function you wrote in part (b) to accept 2 arguments, n and k, and to

return a vector of length n, each element of which comprises the sum of k independent

Exp(1) random variables. Include your code in the report. [2 marks]
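One way to organise such a function is to draw all n × k exponentials at once and sum them column by column. The sketch below is only one possible layout; the name `sum.exponentials` is a hypothetical choice, and the seed is arbitrary.

```r
# Hypothetical helper: n sums, each of k independent Exp(1) draws
sum.exponentials <- function(n, k) {
  draws <- matrix(rexp(n * k, rate = 1), nrow = k)  # one column per sum
  colSums(draws)
}

set.seed(3)                         # arbitrary seed, for reproducibility only
s <- sum.exponentials(10000, k = 10)

# An Exp(1) variable has mean 1 and variance 1, so both values should be near k
c(mean = mean(s), variance = var(s))
```

A loop or replicate() would give the same result; the matrix form simply avoids calling rexp() n separate times.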

(e) Plot a histogram of the values you obtain using the function of part (d) for

k = 10 and k = 50, when n = 10,000. In your report, include the code

you used to obtain the vectors (NOT the vectors) and the corresponding histograms.

[2 marks]

(f) Do the histograms you obtain resemble any common probability density? If so,

add the appropriate density function to the plot in red colour. In your report,

include your answer (only brief justification needed and not a proof) as well as

the corresponding histograms with the appropriate density. [3 marks]

[Total: 14 marks]

Note: For full marks, do not forget to add suitable titles to your plots.
