ST437/537 – Take Home Project
Arnab Maity
Due date: April 28, 2020, at 11:00am
1. Instructions
Please follow the instructions below when you prepare and submit your assignment.
• Include a cover-page with your project report. It should contain
(i) Full name,
(ii) Course#: ST 437/537
(iii) Assignment: Final Project Report
(iv) Submission date
• Assignments should be submitted using Moodle by the date specified (“due date”).
• Neatly typed work should be submitted.
• Submission should be in the PDF format.
• The project should be your own work – no teams/collaborations are permitted.
• Due date: 4/28/2020, Tuesday, 11:00AM
2. Introduction to the problem
Exposure to lead can produce a variety of adverse health effects in infants and children, including hyperactivity,
hearing or memory loss, learning disabilities, and damage to the nervous system. Although the use of lead
as a gasoline additive has been discontinued in the US, so that airborne lead levels have been reduced
dramatically, a small percentage of children continue to be exposed to lead at levels that can produce such
health problems. Much of this exposure is due to deteriorating lead-based paint that may be chipping and
peeling in older homes. Lead-based paint in housing was banned in the US in 1978; however, many older
homes (built pre-1978) do contain lead-based paint, and chips and dust can be ingested by young children
living in these homes during normal teething and hand-to-mouth behavior. This is especially a problem
among children in deteriorating, inner-city housing. The US Centers for Disease Control and Prevention
(CDC) has determined that children with blood levels above 10 micrograms/deciliter (µg/dL) of whole blood
are at risk of adverse health effects.
Luckily, there are so-called chelation treatments that can help a child to excrete the lead that has been
ingested. The researchers were interested in evaluating the effectiveness of one such chelating treatment,
succimer, in children who had been exposed to what the CDC views as dangerous levels of lead. They
conducted the following study. 120 children aged 12–36 months with confirmed blood lead levels of > 15 µg/dL
and < 40 µg/dL in a large, inner-city housing project were identified; these lead levels are above the at-risk
threshold determined by the CDC. A clinic was set up in the housing project staffed by personnel from the
city’s Department of Public Health. The personnel randomized the children into three groups: 40 children
were assigned at random to receive a placebo (an inactive agent with no lead-lowering properties), 40 children
were assigned at random to receive a low dose of succimer, and 40 children were assigned at random to receive
a higher dose of succimer. Blood lead levels were measured at the clinic for each child at baseline (time 0),
prior to initiation of the assigned treatments. Then, assigned treatment was started, and, ideally, each child
was to return to the clinic at weeks 2, 4, 6, and 8. At each visit, blood lead level was measured for each child.
The data are available in the file lead.full.txt included in the package. The data are presented in the form of
one data record per observation; the columns of the data set are as follows:
1 id: Child id
2 ind.age: Indicator of age (= 0 if ≤ 24 months; = 1 if > 24 months)
3 sex: Gender indicator (= 0 if female, = 1 if male)
4 week: time of visit (week 0, 2, 4, 6 or 8 )
5 blood: Blood lead level (µg/dL)
6 trt: Treatment indicator (= 1 if placebo, = 2 if low dose, = 3 if higher dose)
lead <- read.table("data/lead.full.txt", header = FALSE)
colnames(lead) <- c("id", "ind.age", "sex", "week", "blood", "trt")
head(lead)
##   id ind.age sex week blood trt
## 1  1       0   1    0  31.8   1
## 2  1       0   1    2  31.6   1
## 3  1       0   1    4  39.9   1
## 4  1       0   1    6  40.5   1
## 5  1       0   1    8  48.3   1
## 6  2       0   0    0  24.5   1
The investigators had several questions of interest. Broadly stated, the primary focus was on whether
succimer, in either low- or high-dose form, is effective over an eight week period in reducing blood lead levels
in this population of children. They were also interested in whether blood lead levels in this population are
associated with the age and/or gender of the child, and whether the effectiveness of succimer in reducing
blood lead levels is associated with either or both of these factors.
3. Your assignment
Analyze the data and write a brief report to present your findings. Include the following sections
in your report:
Introduction: This section should give a short description of the problem, a brief summary of the study
carried out, mention each of the specific scientific questions you will investigate, an overview of the conclusions
of the data analysis, and a short roadmap for the remainder of the report. You can paraphrase the materials
at the beginning of Section 2 in here, if you wish.
Methods: This section should contain a description of your statistical model, data analysis, and results. Follow
the roadmap given below.
(A) Start by providing visual or numeric summaries of the data (at the very least profile plots with
appropriate groups) and relevant discussion/interpretation and how they might impact your modeling
choices.
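As a rough illustration of the summaries described in (A), the sketch below draws profile plots coloured by treatment group and tabulates group means by week. It runs on simulated stand-in data with the same column names as lead.full.txt; the simulated values, the object names sim and means, and the plotting choices are all assumptions made for illustration, not part of the assignment.

```r
# Simulated stand-in data with the same columns as lead.full.txt.
# These values are NOT the study data; replace `sim` with the real data frame.
set.seed(1)
n <- 12                         # hypothetical number of children
weeks <- c(0, 2, 4, 6, 8)
k <- length(weeks)
sim <- data.frame(
  id   = rep(seq_len(n), each = k),
  week = rep(weeks, times = n),
  trt  = rep(rep(1:3, each = n / 3), each = k)
)
sim$blood <- 25 - 0.5 * (sim$trt - 1) * sim$week + rnorm(nrow(sim), sd = 2)

# Numeric summary: mean blood lead level by treatment and week
means <- aggregate(blood ~ trt + week, data = sim, FUN = mean)
print(means)

# Profile plot: one line per child, coloured by treatment group
plot(range(sim$week), range(sim$blood), type = "n",
     xlab = "Week", ylab = "Blood lead level")
for (i in unique(sim$id)) {
  d <- sim[sim$id == i, ]
  lines(d$week, d$blood, col = d$trt[1])
}
legend("bottomleft", legend = c("placebo", "low dose", "higher dose"),
       col = 1:3, lty = 1)
```

The same loop can be repeated within levels of ind.age or sex to obtain profile plots for other groupings.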
(B) For modeling, notice that there are three treatment groups. For each group, write a model with linear
trend in week (the time variable), age indicator (ind.age), sex, and all their interactions (up to week ×
age × sex). Assume common random effects for intercept and slope of week (with a general covariance
structure) across the three treatment groups. In your analysis consider/compare the following models
for the error covariance structure and choose one you prefer – for simplicity assume that the error
variance-covariance structure is the same for all three treatment groups.
• Independent, where error variance does not change over weeks,
• Independent, where error variance changes over weeks,
• AR(1) correlation structure, where error variance does not change over weeks
• AR(1) correlation structure, where error variance changes over weeks
• Unstructured, where error variance does not change over weeks
• Unstructured, where error variance changes over weeks
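One possible way to set up the six error covariance structures listed above with lme() from the nlme package is sketched below. It is a sketch under stated assumptions, not a prescribed solution: the data are simulated, the helper fit_lme() and the variables visit and visit_f are hypothetical names introduced here, and fits that fail to converge are simply dropped before comparison. REML is used because the candidate models share the same fixed effects.

```r
library(nlme)

# Simulated stand-in data (NOT the study data), balanced across trt x age x sex
set.seed(437)
n <- 60; weeks <- c(0, 2, 4, 6, 8); k <- length(weeks)
sim <- data.frame(
  id      = rep(seq_len(n), each = k),
  week    = rep(weeks, times = n),
  trt     = factor(rep(rep(1:3, each = n / 3), each = k),
                   labels = c("placebo", "low", "high")),
  ind.age = rep(rep(rep(0:1, each = 10), times = 3), each = k),
  sex     = rep(rep(0:1, times = n / 2), each = k)
)
b0 <- rnorm(n, sd = 3)[sim$id]      # random intercepts
b1 <- rnorm(n, sd = 0.3)[sim$id]    # random slopes
slope <- c(0, -0.4, -0.8)[as.integer(sim$trt)]
sim$blood <- 25 + b0 + (slope + b1) * sim$week + rnorm(nrow(sim), sd = 1.5)
sim$visit   <- as.integer(sim$week / 2 + 1)  # integer visit index 1..5
sim$visit_f <- factor(sim$visit)             # stratum for week-specific variances

# Hypothetical helper: same mean model and random effects, varying error structure.
# Returns NULL if the fit throws an error (for example, non-convergence).
fit_lme <- function(correlation = NULL, weights = NULL) {
  tryCatch(
    lme(blood ~ trt * week * ind.age * sex, random = ~ week | id, data = sim,
        correlation = correlation, weights = weights, method = "REML",
        control = lmeControl(opt = "optim")),
    error = function(e) NULL
  )
}

het <- varIdent(form = ~ 1 | visit_f)        # error variance differs by week
fits <- list(
  ind_const  = fit_lme(),
  ind_hetero = fit_lme(weights = het),
  ar1_const  = fit_lme(correlation = corAR1(form = ~ visit | id)),
  ar1_hetero = fit_lme(correlation = corAR1(form = ~ visit | id), weights = het),
  un_const   = fit_lme(correlation = corSymm(form = ~ visit | id)),
  un_hetero  = fit_lme(correlation = corSymm(form = ~ visit | id), weights = het)
)

# Compare the fits that converged using information criteria
ok <- Filter(Negate(is.null), fits)
ic <- data.frame(model = names(ok), AIC = sapply(ok, AIC), BIC = sapply(ok, BIC))
print(ic[order(ic$AIC), ])
```

The unstructured fits combined with random intercepts and slopes can be slow or fail to converge, which is why the helper tolerates errors. With missed visits, the visit index keeps the AR(1) and unstructured correlations aligned with the actual measurement occasions.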
(C) Based on your chosen covariance model, investigate the following questions. Write each of the questions
in terms of model parameters and describe your findings in the context of the subject matter.
(i) Does gender have any association with blood lead level? Does age have any association with blood
lead level?
(ii) Based on your findings in (i), propose a smaller model, if possible. Based on the smaller model,
are the mean trends of blood lead level the same for the three treatments?
(iii) Based on the smaller model in (ii), what is the mean trend of blood lead level of a patient who is
(a) male with age ≤ 24 months receiving placebo, (b) male with age > 24 months receiving placebo? Repeat this
for the other two treatments, and also for females.
(iv) Present some appropriate model diagnostics, and comment on the appropriateness of the model
assumptions as best as you can.
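For questions such as (i) and (iv), one common approach is a likelihood ratio test between nested mean models fitted by maximum likelihood, followed by residual and random-effect plots. The sketch below illustrates this on simulated stand-in data with the default independent, constant-variance errors; the object names (sim, full, no_sex, lrt) are hypothetical, and the choice of test is an assumption rather than a required procedure.

```r
library(nlme)

# Simulated stand-in data (NOT the study data), balanced across trt x age x sex
set.seed(537)
n <- 60; weeks <- c(0, 2, 4, 6, 8); k <- length(weeks)
sim <- data.frame(
  id      = rep(seq_len(n), each = k),
  week    = rep(weeks, times = n),
  trt     = factor(rep(rep(1:3, each = n / 3), each = k),
                   labels = c("placebo", "low", "high")),
  ind.age = rep(rep(rep(0:1, each = 10), times = 3), each = k),
  sex     = rep(rep(0:1, times = n / 2), each = k)
)
b0 <- rnorm(n, sd = 3)[sim$id]
b1 <- rnorm(n, sd = 0.3)[sim$id]
slope <- c(0, -0.4, -0.8)[as.integer(sim$trt)]
sim$blood <- 25 + b0 + (slope + b1) * sim$week + rnorm(nrow(sim), sd = 1.5)

ctrl <- lmeControl(opt = "optim")

# Models that differ in their fixed effects are compared using ML, not REML
full <- lme(blood ~ trt * week * ind.age * sex, random = ~ week | id,
            data = sim, method = "ML", control = ctrl)
no_sex <- lme(blood ~ trt * week * ind.age, random = ~ week | id,
              data = sim, method = "ML", control = ctrl)

# Likelihood ratio test of H0: every coefficient involving sex is zero
lrt <- anova(no_sex, full)
print(lrt)

# Basic diagnostics: normalized residuals and predicted random effects
r <- resid(full, type = "normalized")
plot(fitted(full), r, xlab = "Fitted values", ylab = "Normalized residuals")
abline(h = 0, lty = 2)
qqnorm(r); qqline(r)
re <- ranef(full)
qqnorm(re[["week"]], main = "Random slopes"); qqline(re[["week"]])
```

The same pattern applies to the age terms and to comparing mean trends across treatments by dropping the trt-by-week terms. The chi-square reference distribution for the likelihood ratio statistic is approximate.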
In preparing this section, provide all relevant details such as which fitting method you used, which testing
procedure you employed, the software used and so on. Do not give raw data, code, or output in this section.
You may summarize your results using tables and/or figures, but do not just print the raw output. Also,
present possible limitations of your methods, if any. If this section is too long, you can break it into smaller
subsections for each question you address.
Conclusions: This section is a brief summary of your study objectives and data analysis results in terms of
the subject matter. Succinctly present the scientific questions again, your findings and interpretation. If you
have made any additional observations during your analysis, you may present them here as well.
Appendix: This section contains technical details and other supporting information, if any. All well
commented R code and the corresponding detailed output should be included here in an organized format.
References: Put any references you may have here. You can omit this section if you do not have any
references.
4. Miscellaneous
When using the function lme() in the nlme package in R, it often throws an error (“convergence limit reached”
etc.) in fitting some of the mixed models. If you encounter such a case in your analysis, you may want to
include control = lmeControl(opt = 'optim') as an input argument in lme():
lme(..., control = lmeControl(opt='optim'))