Stats 20 Lec 2: Sample Final Exam
Version A
Introduction to Statistical Methods for Life and Health Sciences
UCLA, Summer 2019
In the Version section of your scantron, bubble in A for Version A.
Disclaimer: These 27 practice questions are meant to illustrate the format and approximate
difficulty of typical questions on the actual exam (some exam questions may be more difficult).
Not all topics covered on the exam are necessarily represented.
Instructions: You have 100 minutes to complete the following questions. This exam is closed
book. No calculators or other electronic devices are allowed. Good luck!
Academic Misconduct: Any potential violation of UCLA’s policy on academic integrity will be
reported to the Office of the Dean of Students. All work on this exam must be your own.
By signing my name below, I hereby understand and agree to abide by the instructions above and
UCLA’s policy on academic integrity. All work I submit is a product of my own honest effort.
Name:
UID:
Signature:
(Turn over when exam starts)
Mark your answers to all questions on the scantron provided. Any answers marked
on these pages will not be scored.
The following information is used in Problems 1, 2, and 3.
Consider the following commands and output:
> parks_df <- data.frame("Name"=c("Leslie","Ron","April"),
+ "Height"=c(62,71,66),"Weight"=c(115,201,119),"Income"=c(4000,NA,2000))
> parks_df
    Name Height Weight Income
1 Leslie     62    115   4000
2    Ron     71    201     NA
3  April     66    119   2000
Problem 1 Which of the following will extract the value 115 from parks_df?
(a) with(parks_df,Weight[1])
(b) parks_df[1,3]
(c) parks_df[1,"Weight"]
(d) parks_df$Weight[1]
(e) All of the above
Problem 2 What is the output of mode(parks_df)?
(a) [1] "numeric"
(b) [1] "character"
(c) [1] "factor"
(d) [1] "data.frame"
(e) [1] "list"
Problem 3 What is the output of dim(parks_df)?
(a) [1] 4
(b) [1] 12
(c) [1] 3 4
(d) [1] 4 3
(e) NULL
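The indexing and structure functions used in Problems 1 to 3 can be tried on any small data frame. The following is a minimal sketch on a hypothetical data frame, pets_df, which is an assumption for illustration and not part of the exam data:

```r
# Hypothetical data frame, not part of the exam data.
pets_df <- data.frame(Pet = c("cat", "dog"), Legs = c(4, 4), Age = c(3, NA))

# Four ways to extract the value in row 1 of the Age column.
a1 <- pets_df[1, 3]          # by row and column position
a2 <- pets_df[1, "Age"]      # by row position and column name
a3 <- pets_df$Age[1]         # extract the column, then index it
a4 <- with(pets_df, Age[1])  # evaluate Age[1] inside the data frame

# Structural queries on the whole object.
storage <- mode(pets_df)     # how the object is stored internally
shape <- dim(pets_df)        # number of rows, then number of columns
```

Printing storage and shape in an R session shows how mode() and dim() treat a data frame.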
The following information is used in Problems 4 and 5.
Consider the following command and output:
> L
[[1]]
 [1]  1  2  3  4  5  6  7  8  9 10

[[2]]
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

[[3]]
    Name Height Weight Income
1 Leslie     62    115   4000
2    Ron     71    201     NA
3  April     66    119   2000

[[4]]
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
Problem 4 What is the output of class(L)?
(a) [1] "numeric"
(b) [1] "factor"
(c) [1] "matrix"
(d) [1] "data.frame"
(e) [1] "list"
Problem 5 What is the output of length(L)?
(a) [1] 1
(b) [1] 4
(c) [1] 37
(d) [1] 10 6 12 9
(e) NULL
Problem 6 Which command can be used to simulate 30 flips of a weighted coin, where each flip
has a 30% chance of landing heads and a 70% chance of landing tails?
(a) sample(c("H","T"),size=30,replace=TRUE)
(b) sample(c("H","T"),size=30,replace=FALSE)
(c) sample(c("H","T"),size=30,replace=TRUE,prob=c(0.7,0.3))
(d) sample(c("H","T"),size=30,replace=FALSE,prob=c(0.3,0.7))
(e) sample(c("H","T"),size=30,replace=TRUE,prob=c(0.3,0.7))
Problem 7 Suppose there is a plot constructed in an existing screen device (i.e., plotting window
on the computer screen). Which of the following functions will superimpose on the existing plot?
(a) text()
(b) barplot()
(c) pairs()
(d) plot()
(e) boxplot()
Problem 8 Suppose the current working directory has a text file consisting only of R commands
called TreatYourself.R. Which of the following commands would correctly read this file into R?
(a) data("TreatYourself.R")
(b) load("TreatYourself.R")
(c) save("TreatYourself.R")
(d) source("TreatYourself.R")
(e) read.table("TreatYourself.R")
Problem 9 Suppose we want to use the Boston object from the MASS package. Consider the
following command and output:
> Boston
Error: object 'Boston' not found
Which of the following commands will fix the issue?
(a) library(Boston)
(b) library(MASS)
(c) load(Boston)
(d) data(Boston)
(e) data(MASS)
The following information is used in Problems 10, 11, and 12.
The data table below shows the first few rows from the countries.data dataset, found at
https://web.stanford.edu/~hastie/ElemStatLearn/datasets/countries.data.
The data contains an average measure of pairwise dissimilarity between 12 different countries (low
numbers mean two countries are similar, high numbers mean two countries are dissimilar).
0.00 5.58 7.00 7.08 4.83 2.17 6.42 3.42 2.50 6.08 5.25 4.75
5.58 0.00 6.50 7.00 5.08 5.75 5.00 5.50 4.92 6.67 6.83 3.00
7.00 6.50 0.00 3.83 8.17 6.67 5.58 6.42 6.25 4.25 4.50 6.08
7.08 7.00 3.83 0.00 5.83 6.92 6.00 6.42 7.33 2.67 3.75 6.67
Problem 10 Suppose this dataset has been saved to the countries.data file in an appropriate
folder on your computer. Which of the following commands would correctly read this file into R?
(a) read.table("countries.data")
(b) read.csv("countries.data")
(c) read.table("countries.data",header=TRUE)
(d) read.table("countries.data",header=FALSE,sep=",")
(e) read.table("countries.data",header=TRUE,sep=",")
Problem 11 Suppose the dataset has been saved to the countries_obj object in the workspace
using the correct command from Problem 10. What is the output of class(countries_obj)?
(a) [1] "numeric"
(b) [1] "factor"
(c) [1] "matrix"
(d) [1] "data.frame"
(e) [1] "list"
Problem 12 Using the same countries_obj object from Problem 11: If you want to compute
the standard deviation of each row in the data, which of the following functions would you use?
(a) apply()
(b) lapply()
(c) sapply()
(d) tapply()
(e) summary()
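The reading and row-wise summarizing steps in Problems 10 to 12 can be explored without downloading the file. The following is a minimal sketch, assuming a small hypothetical block of whitespace-separated numbers (raw_text) that stands in for the contents of a file:

```r
# Hypothetical whitespace-separated values with no header row.
raw_text <- "0.0 1.5 2.0
1.5 0.0 3.5
2.0 3.5 0.0"

# By default read.table() splits fields on whitespace and assumes no header;
# the text= argument is used here in place of a file path.
dissim <- read.table(text = raw_text)

# In apply(), MARGIN = 1 works over rows and MARGIN = 2 over columns.
row_sd <- apply(dissim, 1, sd)
```

Calling class(dissim) and length(row_sd) in an R session shows what kind of object read.table() returns and how many values the row-wise summary produces.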
Problem 13 Which of the following commands will superimpose a line corresponding to the
equation y = 2 − 5x on an existing plot?
(a) curve(2 - 5*x)
(b) lines(2,-5)
(c) lines(-5,2)
(d) abline(a=2,b=-5)
(e) abline(a=-5,b=2)
Problem 14 Consider the following commands:
> wait_times <- runif(1000,0,12)
> mean(wait_times)
[1] 6.130214
If you were to run these commands again, what is the output of the command
round(mean(wait_times),6) == 6.130214?
(a) [1] TRUE
(b) [1] FALSE
(c) [1] NA
(d) NULL
(e) None of the above
Problem 15 Which command will generate a random sample of size 35 from a normal distribution
with mean 12 and standard deviation 2?
(a) rnorm(35,12,2)
(b) dnorm(35,12,2)
(c) pnorm(35,12,2)
(d) qnorm(35,12,2)
The following information is used in Problems 16 and 17.
Consider the following output, which represents self-identified gender for a sample of people. A
value of F represents female, a value of M represents male, and a value of X represents non-binary.
> gender
[1] M X F F F M
Levels: F M X
Problem 16 What is the output of as.numeric(gender)?
(a) [1] 2 3 1 1 1 2
(b) [1] 1 2 3 3 3 1
(c) [1] "M" "X" "F" "F" "F" "M"
(d) [1] NA NA NA NA NA NA
Warning message:
NAs introduced by coercion
Problem 17 Consider the following command:
> gender[3] <- "Female"
After executing this command, what is the output of gender?
(a) [1] M X F F F M
Levels: F M X
(b) [1] M X Female F F M
Levels: F M X
(c) [1] M X Female F F M
Levels: F M X Female
(d) [1] M X <NA> F F M
Levels: F M X
(e) [1] "M" "X" "Female" "F" "F" "M"
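The factor behavior tested in Problems 16 and 17 can be examined on a different factor. The following is a minimal sketch using a hypothetical factor, size, which is an assumption for illustration and not the exam's gender object:

```r
# Hypothetical factor, not the exam's gender vector.
size <- factor(c("small", "large", "small", "medium"))

# Levels are sorted alphabetically by default.
lv <- levels(size)

# as.numeric() returns the position of each value within levels(size).
codes <- as.numeric(size)

# Assigning a string that is not an existing level triggers a warning;
# the warning is silenced here so the result can be inspected afterwards.
size2 <- size
suppressWarnings(size2[1] <- "tiny")
```

Printing lv, codes, and size2 in an R session shows how the integer codes relate to the level order and what is stored after the invalid assignment.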
The following information is used in Problems 18, 19, and 20.
The data table below shows a few observations from the Tuna.txt dataset, found at
http://www.math.hope.edu/isi/data/chap6/Tuna.txt.
The data contains measurements (in parts per million) on mercury levels for two types of tuna.
Tuna Mercury
albacore 0.82
albacore 0.32
albacore 0.036
yellowfin 0.11
yellowfin 0.379
Suppose the dataset has been saved to the tuna object in the workspace using the correct read.table()
command.
Problem 18 The plot below visualizes the distributions of Mercury split by Tuna in the tuna
data.
[Figure: Mercury (y-axis, 0.0 to 1.5) plotted for each Tuna type (x-axis: albacore, yellowfin).]
Which of the following commands will produce this plot?
(a) boxplot(tuna)
(b) with(tuna,boxplot(Tuna,Mercury))
(c) boxplot(Mercury ~ Tuna,data=tuna,xlab="Tuna",ylab="Mercury")
(d) boxplot(Tuna ~ Mercury,data=tuna,xlab="Tuna",ylab="Mercury")
(e) None of the above
Problem 19 The plot below visualizes the distribution of Tuna in the tuna data.
[Figure: Frequency (y-axis, 0 to 200) for each Tuna type (albacore, yellowfin).]
Which of the following commands will produce this plot?
I. plot(tuna$Tuna,ylab="Frequency")
II. plot(table(tuna$Tuna),ylab="Frequency")
III. barplot(tuna$Tuna,ylab="Frequency")
IV. barplot(table(tuna$Tuna),ylab="Frequency")
(a) I and III only
(b) I and IV only
(c) II and III only
(d) II and IV only
(e) I, II, III, and IV
Problem 20 Using a correct command from Problem 19, which argument will change the level
(or amount) of shading in the above plot?
(a) shade
(b) angle
(c) density
(d) col
(e) beside
Problem 21 Suppose the height of people from a certain population follows a normal distribution
with a mean height of 67 inches and a standard deviation of 2.2 inches. Which command will
compute the probability of observing a person with a height of 70 inches or less from this population?
(a) pnorm(70,67,2.2,lower.tail=FALSE)
(b) pnorm(70,67,2.2,lower.tail=TRUE)
(c) qnorm(70,67,2.2,lower.tail=FALSE)
(d) qnorm(70,67,2.2,lower.tail=TRUE)
The following information is used in Problems 22, 23, 24, and 25.
Suppose we are interested in answering the following question: How tall, on average, is the tallest
person in a group of 30 people? More formally, we are interested in the typical (mean) value of the
maximum height in samples of 30 people from a given population.
Assume that the population of heights follows a normal distribution with a mean of 68 inches and a
standard deviation of 2 inches. Rather than observing many real-life samples from this population,
we can simulate the sample maximum many times using a for() loop to estimate the typical value
of the sample maximum. Consider the following commands:
> M <- 10000
> max_obj <- numeric(M)
> set.seed(143)
> for(i in 1:M){
+ ______ <- max(rnorm(30,mean=68,sd=2))
+ }
Problem 22 Which of the following should be used in place of the blanks (______) in the loop?
(a) max_obj
(b) max_obj[i]
(c) max_obj[i,]
(d) max_obj[,i]
(e) max_obj[-i,]
Problem 23 Which of the following commands will estimate the typical value of the maximum
height in samples of 30 people?
(a) sum(max_obj)/30
(b) max(max_obj)
(c) length(max_obj)
(d) sum(max_obj)/M
(e) pnorm(max_obj)
Problem 24 The plot below visualizes the distribution of max_obj.
[Figure: "Histogram of max_obj"; x-axis "Maximum Value" (70 to 76), y-axis "Frequency" (0 to 2000).]
Which of the following arguments in hist() will change the above histogram from a frequency
histogram to a density (or probability) histogram?
I. density=TRUE
II. prob=TRUE
III. freq=FALSE
IV. add=TRUE
(a) I and III only
(b) I and IV only
(c) II and III only
(d) II and IV only
(e) I, II, III, and IV
Problem 25 Using a correct command from Problem 24, suppose a density (or probability)
histogram of max_obj has been constructed. Which of the following commands will superimpose a
smooth density estimate of max_obj on top of the existing histogram?
(a) lines(density(max_obj))
(b) lines(max_obj)
(c) curve(dnorm(x,mean(max_obj),sd(max_obj),add=TRUE))
(d) abline(density(max_obj))
(e) hist(max_obj,density=TRUE,add=TRUE)
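The simulation described for Problems 22 to 25 can also be written without an explicit loop. The following is a minimal sketch, assuming replicate() in place of for(), an arbitrary seed, and a smaller hypothetical number of repetitions (n_sims) than the exam's M:

```r
set.seed(1)      # arbitrary seed, chosen only so the sketch is reproducible
n_sims <- 2000   # hypothetical repetition count, smaller than the exam's M

# replicate() evaluates the expression n_sims times and collects the
# results in a numeric vector, one simulated sample maximum per repetition.
sim_max <- replicate(n_sims, max(rnorm(30, mean = 68, sd = 2)))

# Monte Carlo estimate of the typical (mean) sample maximum.
est <- mean(sim_max)
```

Because each repetition draws a fresh sample of 30 heights, sim_max plays the same role here as max_obj does in the exam's loop.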
Problem 26 Consider the following commands:
> head(trees)
  Girth Height Volume
1   8.3     70   10.3
2   8.6     65   10.3
3   8.8     63   10.2
4  10.5     72   16.4
5  10.7     81   18.8
6  10.8     83   19.7
> trees_medians <- numeric(ncol(trees))
> names(trees_medians) <- names(trees)
> for(j in seq_len(ncol(trees))){
+ trees_medians[j] <- median(trees[,j])
+ }
> trees_medians
 Girth Height Volume
  12.9   76.0   24.2
Which of the following commands will produce the same object as trees_medians?
(a) apply(trees,1,median)
(b) apply(trees,2,median)
(c) lapply(trees,median)
(d) sapply(trees,median)
(e) More than one of these commands will produce the same object as trees_medians.
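The apply-family calls in Problem 26 differ in the margin they work over and in the type of object they return. The following is a minimal sketch on a hypothetical two-column data frame, df, which is an assumption for illustration and not the trees data:

```r
# Hypothetical data frame with all-numeric columns.
df <- data.frame(a = c(1, 2, 9), b = c(4, 5, 6))

by_row_apply  <- apply(df, 1, median)  # MARGIN = 1: one value per row
by_col_apply  <- apply(df, 2, median)  # MARGIN = 2: one value per column
by_col_lapply <- lapply(df, median)    # one value per column, kept as a list
by_col_sapply <- sapply(df, median)    # like lapply(), simplified if possible
```

Running str() on each result in an R session shows which results are vectors, which are lists, and which carry the column names.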
Problem 27 The plot below visualizes the relationship between the sepal length and petal length
of iris flowers in the iris data from the datasets package.
[Figure: scatterplot with Petal.Length on the x-axis (1 to 7) and Sepal.Length on the y-axis (4.5 to 8.0).]
Which of the following commands will produce this plot?
(a) plot(Sepal.Length ~ Petal.Length,data=iris,pch=19)
(b) plot(Petal.Length ~ Sepal.Length,data=iris,pch=19)
(c) points(Sepal.Length ~ Petal.Length,data=iris,pch=19)
(d) with(iris,plot(Sepal.Length,Petal.Length,pch=19))
(e) with(iris,pairs(Sepal.Length,Petal.Length,pch=19))