STATISTICAL LABORATORY
Practical 6: Assessed coursework
ST104
Term 3, 2019
Important Information
This practical session is assessed. The deadline for submission is 11am on Thursday,
9th May.
Your reports should be submitted electronically on Moodle. Here is the link for submission:
https://moodle.warwick.ac.uk/mod/assign/view.php?id=674702
You can also find the submission link on the right-hand side of the module webpage.
Please note that your report must:
- be submitted in PDF,
- be no more than 5 sides of A4 in length (excluding figures),
- be 12pt font for the main body of the report.
You will lose marks if you do not follow these requirements.
Also, please make sure you include your student ID code and lab group number on the
front sheet, and NOT your name!
Exercises
1. Pseudo-random numbers and the inversion method.
(a) For arbitrary choices of initial seeds $U_1$ and $U_2$ (in the interval (0, 1]), let
$U_{i+2} = [U_{i+1} + U_i] \pmod{1}, \quad i \in \mathbb{N}.$
Why is this not a good pseudo-random independent U(0, 1] generator? [2 marks]
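To get a feel for the recurrence in part (a), it can help to iterate it numerically. The following is a minimal sketch and not part of the original sheet: the function name lagged.sum and the seed values are hypothetical choices made for illustration. Note that R's %% operator returns values in [0, 1) rather than (0, 1], a boundary detail that differs from the statement above.

```r
# Hypothetical helper: iterate U[i+2] = (U[i+1] + U[i]) mod 1.
# Assumes n >= 2; u1 and u2 are the two initial seeds.
lagged.sum <- function(n, u1, u2) {
  u <- numeric(n)
  u[1] <- u1
  u[2] <- u2
  for (i in seq_len(n - 2)) {
    u[i + 2] <- (u[i + 1] + u[i]) %% 1
  }
  u
}

# Illustrative seeds only; any values in the unit interval could be used.
u <- lagged.sum(1000, 0.3, 0.5)
head(u)
```

The resulting vector u can then be inspected with whatever summaries or plots seem appropriate.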
(b) During lectures we have seen how the inversion method can be used to simulate
from a Bernoulli(p) distribution using U ~ U(0, 1]. To achieve this task the
function generate.bernoulli() is written and presented below.
generate.bernoulli <- function(n = 1, p = 0.5) {
  sample <- runif(n)
  bernoulli <- as.numeric(sample < p)
  return(bernoulli)
}
Explain in words what each line of code is doing and briefly justify why this
method does what it was intended to do. [3 marks]
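One way to check that the function behaves as intended is to call it with a large n and compare the proportion of ones with p. The sketch below repeats the definition from the sheet so that it runs on its own; the seed and the values n = 10000 and p = 0.3 are arbitrary assumptions chosen for illustration.

```r
# Definition copied from the sheet so this block is self-contained.
generate.bernoulli <- function(n = 1, p = 0.5) {
  sample <- runif(n)
  bernoulli <- as.numeric(sample < p)
  return(bernoulli)
}

set.seed(1)                        # arbitrary seed, for reproducibility only
x <- generate.bernoulli(n = 10000, p = 0.3)

table(x)                           # counts of 0s and 1s
mean(x)                            # proportion of 1s, to compare with p
```

With a sample this large, the proportion of ones should typically be close to p, though it will vary from run to run if the seed is changed.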
(c) How would you simulate from a Geometric(p) distribution:
(i) Using U ~ U(0, 1]?
Hint: Show that $F(x) = 1 - (1 - p)^{\lfloor x \rfloor}$, where $\lfloor x \rfloor$ denotes the greatest
integer less than or equal to x. [2.5 marks]
(ii) Using E ~ Exp(1)? [1.5 marks]
Write a function generate.geometric() that, given a sample size n and a
probability p, returns a vector of length n which contains realisations from a
Geometric(p). Your function should use either a sample of size n from the
U(0, 1] distribution or a sample of size n from the Exp(1) distribution. Choose
only one of these two approaches but you should NOT use the built-in rgeom()
function. Investigate (via comparisons you deem appropriate) what happens if
you change the size n or the probability p. Try to experiment with the following
combinations of n and p and include your comments in the report.
n        p
10       0.1
10000    0.1
10       0.9
10000    0.9
[4 marks]
[Total: 13 marks]
2. Bernoulli random variables and their relatives.
(a) Given a source of Bernoulli random variables, it’s relatively easy to write a
function to generate from the Binomial distribution (remember that a Binomial
random variable with parameters n and p is the sum of n independent Bernoulli
random variables of common success probability p). Write a function which, given
n and p, will generate a single Binomial(n, p) random variable. Your function can
make use of generate.bernoulli() as given above (or any other user-defined
function which simulates from a Bernoulli(p) distribution) but you should NOT
use the built-in rbinom() function. [1 mark]
(b) Write another function which, given m, n and p, will generate m realisations
from the Binomial(n, p) distribution. You can use any of the previously defined
functions but you should NOT use the built-in rbinom() function. Use your
function to generate 5000 realisations from a Binomial(10, 0.25) distribution. In
your report, only include the code for the function and the R command you used
to call it (NOT the 5000 realisations). [1.5 marks]
(c) Plot a histogram of those realisations (normalised like a probability density).
You may need to use the argument breaks to get a sensible histogram. In your
report, include both the histogram and the R command you used to obtain the
histogram.
[1.5 marks]
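For integer-valued data, one possible way to choose breaks is to centre each bin on an integer. The sketch below illustrates this on a made-up sample drawn with sample(), not on the Binomial realisations asked for above; the data, seed and plot labels are assumptions for illustration only.

```r
set.seed(1)                                   # arbitrary seed
x <- sample(0:5, size = 500, replace = TRUE)  # made-up integer-valued data

# Bin edges at half-integers, so each bar covers exactly one integer value.
breaks <- seq(min(x) - 0.5, max(x) + 0.5, by = 1)

# freq = FALSE puts the histogram on the density scale (total area 1).
h <- hist(x, breaks = breaks, freq = FALSE,
          main = "Histogram of integer-valued sample", xlab = "Value")
```

Because each bin has width 1 here, the bar heights equal the observed relative frequencies.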
(d) Use R to compute the sample mean and sample variance for your realisations. In
your report, include both the R commands you used and your answers. How do
your answers compare to the expectation and variance of a Binomial(10, 0.25)
random variable? [2 marks]
(e) If we wish to plot the graph of a function in R, we can evaluate that function
on a grid of points and use the plotting functions to join the dots. Use seq to
generate a suitable grid of points to add the density of a normal distribution of
the same mean and variance to your histogram. Use the lines function (which
adds lines to an existing graph rather than plotting a new one) to add a blue
line showing this normal density to your histogram. Include both your code and
the corresponding graph in your report. [2 marks]
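The grid-and-join technique described in part (e) can be sketched on unrelated data as follows. This is an illustration under stated assumptions rather than a model answer: the sample is drawn with rnorm(), and the seed, grid size of 200 points and plot labels are arbitrary choices.

```r
set.seed(42)                          # arbitrary seed
x <- rnorm(1000, mean = 5, sd = 2)    # made-up sample, for illustration only

# Histogram on the density scale, so a density curve can be overlaid.
hist(x, freq = FALSE,
     main = "Histogram with normal density overlay", xlab = "x")

# Evaluate the normal density on a grid spanning the data, then join the dots.
grid <- seq(min(x), max(x), length.out = 200)
dens <- dnorm(grid, mean = mean(x), sd = sd(x))
lines(grid, dens, col = "blue")
```

A finer grid gives a smoother curve; lines() must be called after hist() because it draws on the existing plot.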
(f) Repeat steps (b)-(e) for a Binomial(1000, 0.25) distribution. What do you observe?
In your report, only include the corresponding histogram (with the corresponding
normal density in blue) and your comments. [2 marks]
(g) Repeat steps (b)-(e) for a Binomial(10000, 0.0001) distribution. What do you
observe? Try to also add the probability mass function of a Poisson distribution
with the same mean. For the Poisson mass function use a red colour. What
is significant about what you observe here? In your report only include the
corresponding histogram (with the corresponding normal density in blue and
Poisson mass function in red) and your comments. [3 marks]
[Total: 13 marks]
3. Convolutions.
(a) Use rexp() to obtain a sample of 10,000 Exp(1) random variables. Plot a
histogram of your sample on the same scale as a probability density. What do
you observe and why? In your report, include your code, the histogram and
your comments. [1.5 marks]
(b) Write a function which has one argument, n, and which returns a vector of length
n each element of which is obtained as the sum of two Exp(1) random variables.
Plot a histogram of the values obtained using this function for n = 10,000. In
your report, include your code and the histogram. [1.5 marks]
(c) Compute the density of the sum of E1 and E2 if these random variables are independent
Exp(1) distributed random variables. How does this density relate to
the density of a Gamma(α, β) distribution? Add the density you obtained to the
histogram you produced in part (b) using red colour. In your report, include the
computations, your comments, the code and the corresponding plot. [4 marks]
(d) Adapt the function you wrote in part (b) to accept 2 arguments, n and k, and to
return a vector of length n, each element of which comprises the sum of k independent
Exp(1) random variables. Include your code in the report. [2 marks]
(e) Plot a histogram of the values you obtain using the function of part (d) for
k = 10 and k = 50, when n = 10,000. In your report, include the code
you used to obtain the vectors (NOT the vectors) and the corresponding histograms.
[2 marks]
(f) Do the histograms you obtain resemble any common probability density? If so,
add the appropriate density function to the plot in red colour. In your report,
include your answer (only brief justification needed and not a proof) as well as
the corresponding histograms with the appropriate density. [3 marks]
[Total: 14 marks]
Note: For full marks, do not forget to add suitable titles to your plots.