Auckland University of Technology: STAT600 Probability – Assignment 3 Semester 2, 2019
STAT600 Probability
Semester 2, 2019
Assignment 3
Instructions
• Due date: Submit to Blackboard by Tuesday 15th October, 11pm
• This assignment is worth 15% of your final grade and will be marked out of 100 marks.
• Assignments should be submitted as a single PDF file.
• If you use additional data (other than that provided), submit the data as a csv file.
• Do not upload your submission as a zip file.
• Your submission should contain relevant explanations, mathematical notation, R code, and workings.
Answers which do not include appropriate notation and/or workings will be penalised.
• Where you need to include some R code, a copy of the code and output should be provided in the
PDF. Note, the code should not be included as an image or screenshot. Code submitted should
be in a fixed-width font such as Courier New. It is strongly recommended that you use the
R package knitr with Rmarkdown or LaTeX to incorporate both R code and maths symbols (see
example file on Blackboard). If you use Rmarkdown, submit a PDF file and the corresponding
.Rmd file.
• Your assignment file should include the Individual Assignment Coversheet:
Blackboard/ Assessment Policies, Regulations, Guides and Forms/ Forms and Coversheets/ Individual
Assignment Coversheet
• Late Assignments: Failure to submit the assignment on time will result in a penalty in accordance
with the policy outlined in the STAT600 Study Guide.
• Special Consideration: If extenuating circumstances (e.g. illness) prevent the timely submission
of your assignment you can apply for special consideration. You may also apply for special consideration
if such circumstances result in your submission being incomplete. Applications for special consideration
should be submitted via Blackboard.
• Originality: This assignment is an individual piece of work. You are encouraged to discuss the
assignment with your lecturers and classmates, however, the work you submit must be your own.
Assignments that show similarities to work submitted by other students will be investigated for
plagiarism and treated very seriously. Plagiarism software, such as TurnItIn, may be used to electronically
compare submissions to those of other students and to documents on the internet.
Question:    1    2    3    4    Total
Marks:      10   35   40   15    100
Score:
Page 1 of 3 V: 23rd September 2019
In this assignment you will investigate the application of Markov chains to one of the following areas.
• Stock market
– The daily percentage change in the Air NZ stock price can be classified into ranges, e.g. ≤ −3%, (−3%, −1%],
etc.
– Data: STAT600_2019_AirNZ.csv
– Variable name: PercentChangeCat
• All Blacks results
– The results of All Blacks games can be classified based on whether they won, drew or lost, and
the points differential, e.g. “won by more than 14 points”, “won by 14 or fewer points”, etc.
– Data: STAT600_2019_rugby.csv
– Variable name: ResultType
• Weather
– The weather in Auckland can be classified by the predominant weather in a given hour, e.g.
"rain" or "cloudy" etc.
– Data: STAT600_2019_auckland_weather.csv
– Variable name: description
• A topic of your choice (5 marks (bonus))
– If you choose your own topic, it is recommended that you discuss your topic and data with
the lecturer.
– Data: You will need to source your own data.
Choose one of the topics listed above and answer the following questions.
1. Markov Chain Definition (Total for Question 1: 10 marks)
(a) For your chosen topic, specify the states of the Markov chain and provide a table showing the
number of times that each state occurs within the dataset. (5 marks)
Hint 1: Each of the datasets has a categorical variable matching the variable name and description
above. The states will be the unique values of this variable.
Hint 2: Commands such as unique(x) and table(x) may be useful.
(b) Estimate the transition matrix from the data provided. State the transition matrix. Ensure that
all entries are probabilities and that the rows of your transition matrix sum to 1. (5 marks)
Hint: Commands like tab <- table(1:10, 2:11), prop.table(tab) and rowSums(x) may
be useful.
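
The hints for Question 1 can be illustrated on a made-up sequence. This is only a sketch: the vector `x` and its state labels "A", "B" and "C" are hypothetical and are not taken from any of the provided datasets. It counts how often each state occurs, pairs each observation with its successor to count one-step transitions, and converts the counts to row-wise proportions.

```r
# Hypothetical toy sequence of states (NOT from the assignment data)
x <- c("A", "B", "B", "C", "A", "A", "B", "C", "C", "A", "B")

# The states are the unique values; table() counts how often each occurs
states <- sort(unique(x))
counts <- table(x)

# One-step transitions: pair each observation with the one that follows it.
# Using factors with fixed levels keeps the table square even if some
# transition never occurs.
n <- length(x)
trans_counts <- table(from = factor(x[-n], levels = states),
                      to   = factor(x[-1], levels = states))

# margin = 1 divides each row by its row total, giving estimated
# transition probabilities
P <- prop.table(trans_counts, margin = 1)

print(counts)
print(round(P, 3))
print(rowSums(P))  # every row should sum to 1
```

Note that `prop.table(tab)` without `margin = 1` divides by the grand total rather than the row totals. If a state appears only as the final observation, its row of counts is all zero and the row proportions are undefined, so that case needs separate handling.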
2. Classification of the Markov Chain (Total for Question 2: 35 marks)
For each of the following questions, refer to the relevant definition and provide an example applying
this definition in the context of your Markov chain.
(a) Does this Markov chain have any absorbing states? Justify your answer. (5 marks)
(b) Is state i accessible from state j (for all i and j)? Justify your answer. (5 marks)
(c) Is this Markov chain irreducible? Justify your answer. (5 marks)
(d) Is this Markov chain ergodic? Justify your answer. (5 marks)
(e) Use R to compute the steady state probabilities and write several sentences interpreting what
they mean in the context of your chosen scenario. (5 marks)
(f) Compute the mean first passage time for each state i to return to itself (for all i). (5 marks)
(g) What assumptions need to be made in order for this scenario to be modelled as a Markov chain?
Discuss whether or not these assumptions are reasonable for your scenario. (5 marks)
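
The classification checks and steady state computation in Question 2 can be sketched as follows. The 3-state matrix `P` below is hypothetical, not estimated from any assignment dataset, and solving the balance equations with `qr.solve` is one possible approach among several (an eigenvector of `t(P)` would also work). The mean recurrence times use the standard result that, for an irreducible positive recurrent chain, the mean time for state i to return to itself is 1/π_i.

```r
# Hypothetical 3-state transition matrix (rows sum to 1); NOT assignment data
P <- matrix(c(0.25, 0.75, 0,
              0,    1/3,  2/3,
              2/3,  0,    1/3),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))
n <- nrow(P)

# Absorbing states have p_ii = 1
absorbing <- rownames(P)[diag(P) == 1]

# Accessibility: state j is accessible from state i if entry (i, j) of
# (I + P)^(n - 1) is positive
reach <- diag(n)
for (k in seq_len(n - 1)) reach <- reach %*% (diag(n) + P)
irreducible <- all(reach > 0)

# Steady state: solve pi P = pi together with sum(pi) = 1.
# The system has n + 1 equations in n unknowns; qr.solve returns the
# least-squares solution, which is exact when the system is consistent.
A <- rbind(t(P) - diag(n), rep(1, n))
b <- c(rep(0, n), 1)
pi_hat <- qr.solve(A, b)
names(pi_hat) <- rownames(P)

# Mean recurrence time of each state (valid for an irreducible chain)
mean_return <- 1 / pi_hat

print(absorbing)
print(irreducible)
print(round(pi_hat, 4))
print(round(mean_return, 4))
```

For this toy matrix the chain is irreducible, and because at least one diagonal entry is positive it is also aperiodic, so it is ergodic. A chain estimated from real data may behave differently and each property should be justified from its definition.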
3. Simulation & Analysis (Total for Question 3: 40 marks)
Note: For this question you should write your own simulation code. Do not use an R package which
has an inbuilt simulation function (like the markovchain package).
• Write some R code to simulate this Markov chain and include your code in your assignment
file. Run your simulation for a large number of stages (e.g. at least 10000).
• Set the seed equal to your student ID number. For example, if your ID number is 12345678,
then at the start of your simulation use set.seed(12345678).
• Use the results of your simulation to answer the following questions.
(a) Compute the proportion of time that the Markov chain spends in each state. Compare this to
your answer in question 2e and write 1-2 sentences discussing what you observe. (15 marks)
(b) Construct a line graph showing how the proportion of time in each state converges to the steady
state probabilities over the first 1000 stages. Your graph should include appropriate axis labels,
axis limits, and legends. (10 marks)
Hint: If your Markov chain has 5 states, then your graph should have 5 lines.
(c) Using simulation, investigate the first passage times for each state i to itself. Compare your
result to the theoretical results in question 2f and write 1-2 sentences discussing what you
observe. (5 marks)
(d) Choose a state i with π_i > 0.1. Repeat your simulation a large number of times. For each
simulation, begin the simulation in your chosen state, run each simulation for the length of
your original dataset (e.g., if your dataset has data for 500 days, then run the simulation for
500 days) and record the number of times that the Markov chain enters your chosen state.
Investigate (using summary statistics and graphs) the distribution of the number of times that
the Markov chain enters your chosen state. Compare this to the actual number of times that
this state was observed. Write several sentences discussing your findings in the context of your
selected scenario. (10 marks)
Hint: A histogram will be useful for exploring the distribution.
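
A hand-written simulation of the kind Question 3 asks for might look like the sketch below. Everything here is illustrative: the matrix `P` is a hypothetical 3-state example, the helper `simulate_chain` is a name introduced for this sketch, the seed is the example ID from the instructions above, and the run length of 500 and the 200 repetitions are arbitrary choices. Each step draws the next state with `sample.int`, using the current state's row of `P` as the probabilities.

```r
set.seed(12345678)  # example ID from the instructions; replace with your own

# Hypothetical 3-state transition matrix; NOT assignment data
P <- matrix(c(0.25, 0.75, 0,
              0,    1/3,  2/3,
              2/3,  0,    1/3),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))
states <- rownames(P)

# Simulate n_steps transitions from state index `start`.
# Returns the visited state indices (the starting state is not recorded).
simulate_chain <- function(P, n_steps, start = 1) {
  path <- integer(n_steps)
  current <- start
  for (t in seq_len(n_steps)) {
    current <- sample.int(nrow(P), size = 1, prob = P[current, ])
    path[t] <- current
  }
  path
}

path <- simulate_chain(P, n_steps = 10000)

# Proportion of time spent in each state over the whole run
prop_time <- tabulate(path, nbins = nrow(P)) / length(path)
names(prop_time) <- states

# Running proportion in each state over the first 1000 stages
# (one column per state)
first <- path[1:1000]
running <- sapply(seq_along(states),
                  function(s) cumsum(first == s) / seq_along(first))
colnames(running) <- states

# Simulated return times for state 1: gaps between successive visits
visits <- which(path == 1)
mean_return <- mean(diff(visits))

# Repeated runs: number of visits to state 1 in runs of a hypothetical
# dataset length of 500, each run started in state 1
entries <- replicate(200, sum(simulate_chain(P, 500, start = 1) == 1))

print(round(prop_time, 3))
print(mean_return)
print(summary(entries))
```

The `running` matrix can be passed to `matplot(running, type = "l")` to draw one line per state, and `hist(entries)` gives the histogram mentioned in the hint. Axis labels, limits and a legend still need to be added by hand.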
4. Improving the Model (Total for Question 4: 15 marks)
The Markov chain, with the states specified by the categorical variable in the CSV file provided, can
be used to analyse the scenario you have selected. However, as with any stochastic model, it is a
model and thus a simplification of reality. Using some of the other information provided in the csv
file (or from elsewhere), provide a new classification, so that your Markov Chain has at least 3 more
states than it did previously.
Hint: The R function cut is useful for splitting continuous variables into categorical ones.
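
As a small illustration of the `cut` hint, the sketch below splits a continuous variable into categories. The values in `pct_change` and the break points are made up for this example and do not come from the provided CSV files.

```r
# Hypothetical continuous values (made-up numbers, NOT assignment data)
pct_change <- c(-4.2, -1.5, -0.3, 0.0, 0.8, 2.1, 3.5, -3.0, 1.0)

# Break points defining five intervals; -Inf and Inf catch the tails
breaks <- c(-Inf, -3, -1, 1, 3, Inf)

# right = TRUE makes each interval closed on the right, e.g. (-3, -1]
state <- cut(pct_change, breaks = breaks, right = TRUE)

print(levels(state))
print(table(state))
```

With `right = TRUE`, a value that falls exactly on a break point (such as -3.0 here) is placed in the interval ending at that point. Choosing break points that leave every category reasonably populated helps when the transition matrix is estimated afterwards.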
(a) Define the states for your Markov chain and briefly explain why this new classification will
provide an improved model. Use R to compute the new states for your data. If you use additional
data, include the data as a CSV with your submission. (5 marks)
(b) Compute the transition matrix for your new Markov chain. (5 marks)
(c) Does steady state exist for this Markov chain? Justify your answer. If so, compute the steady
state probabilities. If not, demonstrate that the steady state probabilities do not converge. (5 marks)
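
One way to examine convergence numerically is to raise the transition matrix to a high power and see whether its rows settle to a common vector. The sketch below uses two hypothetical 2-state matrices, and `mat_power` is a helper defined here for illustration (it is not a base R function). The first chain is periodic, so its powers alternate and never converge. The second is regular, so every row of a high power approaches the steady state probabilities.

```r
# Helper introduced for this sketch: k-th power of a square matrix
mat_power <- function(P, k) {
  out <- diag(nrow(P))
  for (i in seq_len(k)) out <- out %*% P
  out
}

# Hypothetical periodic chain: always switches state, so P^k alternates
P_periodic <- matrix(c(0, 1,
                       1, 0), nrow = 2, byrow = TRUE)
even_power <- mat_power(P_periodic, 50)
odd_power  <- mat_power(P_periodic, 51)

# Hypothetical regular chain: rows of P^k converge to the same vector
P_regular <- matrix(c(0.9, 0.1,
                      0.5, 0.5), nrow = 2, byrow = TRUE)
limit <- mat_power(P_regular, 50)

print(even_power)
print(odd_power)
print(round(limit, 4))
```

Comparing consecutive powers, as done here with the 50th and 51st, distinguishes a chain that oscillates from one that has settled.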