Stats 20 Lec 2: Sample Final Exam

Version A

Introduction to Statistical Methods for Life and Health Sciences

UCLA, Summer 2019

In the Version section of your scantron, bubble in A for Version A.

Disclaimer: These 27 practice questions are meant to illustrate the format and approximate difficulty of typical questions on the actual exam (some exam questions may be more difficult). Not all topics covered on the exam are necessarily represented.

Instructions: You have 100 minutes to complete the following questions. This exam is closed book. No calculators or other electronic devices are allowed. Good luck!

Academic Misconduct: Any potential violation of UCLA’s policy on academic integrity will be reported to the Office of the Dean of Students. All work on this exam must be your own.

By signing my name below, I hereby understand and agree to abide by the instructions above and UCLA’s policy on academic integrity. All work I submit is a product of my own honest effort.

Name:                    UID:                    Signature:

(Turn over when exam starts)

Mark your answers to all questions on the scantron provided. Any answers marked on these pages will not be scored.

The following information is used in Problems 1, 2, and 3.

Consider the following commands and output:

> parks_df <- data.frame("Name"=c("Leslie","Ron","April"),
+ "Height"=c(62,71,66),"Weight"=c(115,201,119),"Income"=c(4000,NA,2000))
> parks_df
    Name Height Weight Income
1 Leslie     62    115   4000
2    Ron     71    201     NA
3  April     66    119   2000

Problem 1 Which of the following will extract the value 115 from parks_df?

(a) with(parks_df,Weight[1])

(b) parks_df[1,3]

(c) parks_df[1,"Weight"]

(d) parks_df$Weight[1]

(e) All of the above

Problem 2 What is the output of mode(parks_df)?

(a) [1] "numeric"

(b) [1] "character"

(c) [1] "factor"

(d) [1] "data.frame"

(e) [1] "list"

Problem 3 What is the output of dim(parks_df)?

(a) [1] 4

(b) [1] 12

(c) [1] 3 4

(d) [1] 4 3

(e) NULL

The following information is used in Problems 4 and 5.

Consider the following command and output:

> L
[[1]]
 [1]  1  2  3  4  5  6  7  8  9 10

[[2]]
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

[[3]]
    Name Height Weight Income
1 Leslie     62    115   4000
2    Ron     71    201     NA
3  April     66    119   2000

[[4]]
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

Problem 4 What is the output of class(L)?

(a) [1] "numeric"

(b) [1] "factor"

(c) [1] "matrix"

(d) [1] "data.frame"

(e) [1] "list"

Problem 5 What is the output of length(L)?

(a) [1] 1

(b) [1] 4

(c) [1] 37

(d) [1] 10 6 12 9

(e) NULL

Problem 6 Which command can be used to simulate 30 flips of a weighted coin, where each flip

has a 30% chance of landing heads and a 70% chance of landing tails?

(a) sample(c("H","T"),size=30,replace=TRUE)

(b) sample(c("H","T"),size=30,replace=FALSE)

(c) sample(c("H","T"),size=30,replace=TRUE,prob=c(0.7,0.3))

(d) sample(c("H","T"),size=30,replace=FALSE,prob=c(0.3,0.7))

(e) sample(c("H","T"),size=30,replace=TRUE,prob=c(0.3,0.7))

Problem 7 Suppose there is a plot constructed in an existing screen device (i.e., plotting window

on the computer screen). Which of the following functions will superimpose on the existing plot?

(a) text()

(b) barplot()

(c) pairs()

(d) plot()

(e) boxplot()

Problem 8 Suppose the current working directory has a text file consisting only of R commands

called TreatYourself.R. Which of the following commands would correctly read this file into R?

(a) data("TreatYourself.R")

(b) load("TreatYourself.R")

(c) save("TreatYourself.R")

(d) source("TreatYourself.R")

(e) read.table("TreatYourself.R")

Problem 9 Suppose we want to use the Boston object from the MASS package. Consider the

following command and output:

> Boston
Error: object 'Boston' not found

Which of the following commands will fix the issue?

(a) library(Boston)

(b) library(MASS)

(c) load(Boston)

(d) data(Boston)

(e) data(MASS)

The following information is used in Problems 10, 11, and 12.

The data table below shows the first few rows from the countries.data dataset, found at https://web.stanford.edu/~hastie/ElemStatLearn/datasets/countries.data.

The data contains an average measure of pairwise dissimilarity between 12 different countries (low numbers mean two countries are similar, high numbers mean two countries are dissimilar).

0.00 5.58 7.00 7.08 4.83 2.17 6.42 3.42 2.50 6.08 5.25 4.75
5.58 0.00 6.50 7.00 5.08 5.75 5.00 5.50 4.92 6.67 6.83 3.00
7.00 6.50 0.00 3.83 8.17 6.67 5.58 6.42 6.25 4.25 4.50 6.08
7.08 7.00 3.83 0.00 5.83 6.92 6.00 6.42 7.33 2.67 3.75 6.67

Problem 10 Suppose this dataset has been saved to the countries.data file in an appropriate

folder on your computer. Which of the following commands would correctly read this file into R?

(a) read.table("countries.data")

(b) read.csv("countries.data")

(c) read.table("countries.data",header=TRUE)

(d) read.table("countries.data",header=FALSE,sep=",")

(e) read.table("countries.data",header=TRUE,sep=",")

Problem 11 Suppose the dataset has been saved to the countries_obj object in the workspace using the correct command from Problem 10. What is the output of class(countries_obj)?

(a) [1] "numeric"

(b) [1] "factor"

(c) [1] "matrix"

(d) [1] "data.frame"

(e) [1] "list"

Problem 12 Using the same countries_obj object from Problem 11: If you want to compute the standard deviation of each row in the data, which of the following functions would you use?

(a) apply()

(b) lapply()

(c) sapply()

(d) tapply()

(e) summary()

Problem 13 Which of the following commands will superimpose a line corresponding to the

equation y = 2 − 5x on an existing plot?

(a) curve(2 - 5*x)

(b) lines(2,-5)

(c) lines(-5,2)

(d) abline(a=2,b=-5)

(e) abline(a=-5,b=2)

Problem 14 Consider the following commands:

> wait_times <- runif(1000,0,12)
> mean(wait_times)
[1] 6.130214

If you were to run these commands again, what is the output of the command round(mean(wait_times),6) == 6.130214?

(a) [1] TRUE

(b) [1] FALSE

(c) [1] NA

(d) NULL

(e) None of the above

Problem 15 Which command will generate a random sample of size 35 from a normal distribution

with mean 12 and standard deviation 2?

(a) rnorm(35,12,2)

(b) dnorm(35,12,2)

(c) pnorm(35,12,2)

(d) qnorm(35,12,2)

The following information is used in Problems 16 and 17.

Consider the following output, which represents self-identified gender for a sample of people. A

value of F represents female, a value of M represents male, and a value of X represents non-binary.

> gender
[1] M X F F F M
Levels: F M X

Problem 16 What is the output of as.numeric(gender)?

(a) [1] 2 3 1 1 1 2

(b) [1] 1 2 3 3 3 1

(c) [1] "M" "X" "F" "F" "F" "M"

(d) [1] NA NA NA NA NA NA

Warning message:

NAs introduced by coercion

Problem 17 Consider the following command:

> gender[3] <- "Female"

After executing this command, what is the output of gender?

(a) [1] M X F F F M

Levels: F M X

(b) [1] M X Female F F M

Levels: F M X

(c) [1] M X Female F F M

Levels: F M X Female

(d) [1] M X <NA> F F M

Levels: F M X

(e) [1] "M" "X" "Female" "F" "F" "M"

The following information is used in Problems 18, 19, and 20.

The data table below shows a few observations from the Tuna.txt dataset, found at http://www.math.hope.edu/isi/data/chap6/Tuna.txt.

The data contains measurements (in parts per million) on mercury levels for two types of tuna.

Tuna       Mercury
albacore   0.82
albacore   0.32
albacore   0.036
yellowfin  0.11
yellowfin  0.379

Suppose the dataset has been saved to the tuna object in the workspace using the correct read.table() command.
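As a minimal sketch of what loading data of this shape might look like, the snippet below reads only the five rows shown above through the `text=` argument instead of the actual file. The object name `tuna_sketch` and the inlined text are assumptions made for illustration; the real file has many more rows. `stringsAsFactors=TRUE` is set explicitly because it was the `read.table()` default before R 4.0.0 but is not the default in later versions.

```r
# The five rows shown above, inlined as text (an assumption for illustration;
# the real Tuna.txt file contains many more observations).
tuna_text <- "Tuna Mercury
albacore 0.82
albacore 0.32
albacore 0.036
yellowfin 0.11
yellowfin 0.379"

# header=TRUE treats the first line as column names.
# stringsAsFactors=TRUE stores the Tuna column as a factor.
tuna_sketch <- read.table(text = tuna_text, header = TRUE,
                          stringsAsFactors = TRUE)

# Inspect the structure: one factor column and one numeric column.
str(tuna_sketch)
```

With the file saved locally, the `text=` argument would be replaced by the file name as the first argument.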

Problem 18 The plot below visualizes the distributions of Mercury split by Tuna in the tuna data.

[Figure: Mercury on the y-axis (ticks 0.0 to 1.5) for each Tuna type on the x-axis (albacore, yellowfin).]

Which of the following commands will produce this plot?

(a) boxplot(tuna)

(b) with(tuna,boxplot(Tuna,Mercury))

(c) boxplot(Mercury ~ Tuna,data=tuna,xlab="Tuna",ylab="Mercury")

(d) boxplot(Tuna ~ Mercury,data=tuna,xlab="Tuna",ylab="Mercury")

(e) None of the above

Problem 19 The plot below visualizes the distribution of Tuna in the tuna data.

[Figure: counts for albacore and yellowfin; y-axis labeled Frequency (ticks 0 to 200).]

Which of the following commands will produce this plot?

I. plot(tuna$Tuna,ylab="Frequency")

II. plot(table(tuna$Tuna),ylab="Frequency")

III. barplot(tuna$Tuna,ylab="Frequency")

IV. barplot(table(tuna$Tuna),ylab="Frequency")

(a) I and III only

(b) I and IV only

(c) II and III only

(d) II and IV only

(e) I, II, III, and IV

Problem 20 Using a correct command from Problem 19, which argument will change the level

(or amount) of shading in the above plot?

(a) shade

(b) angle

(c) density

(d) col

(e) beside

Problem 21 Suppose the height of people from a certain population follows a normal distribution with a mean height of 67 inches and a standard deviation of 2.2 inches. Which command will compute the probability of observing a person with a height of 70 inches or less from this population?

(a) pnorm(70,67,2.2,lower.tail=FALSE)

(b) pnorm(70,67,2.2,lower.tail=TRUE)

(c) qnorm(70,67,2.2,lower.tail=FALSE)

(d) qnorm(70,67,2.2,lower.tail=TRUE)

The following information is used in Problems 22, 23, 24, and 25.

Suppose we are interested in answering the following question: How tall, on average, is the tallest person in a group of 30 people? More formally, we are interested in the typical (mean) value of the maximum height in samples of 30 people from a given population.

Assume that the population of heights follows a normal distribution with a mean of 68 inches and a standard deviation of 2 inches. Rather than observing many real-life samples from this population, we can simulate the sample maximum many times using a for() loop to estimate the typical value of the sample maximum. Consider the following commands:

> M <- 10000
> max_obj <- numeric(M)
> set.seed(143)
> for(i in 1:M){
+ ______ <- max(rnorm(30,mean=68,sd=2))
+ }

Problem 22 Which of the following should be used in place of the blanks (______) in the loop?

(a) max_obj

(b) max_obj[i]

(c) max_obj[i,]

(d) max_obj[,i]

(e) max_obj[-i,]

Problem 23 Which of the following commands will estimate the typical value of the maximum

height in samples of 30 people?

(a) sum(max_obj)/30

(b) max(max_obj)

(c) length(max_obj)

(d) sum(max_obj)/M

(e) pnorm(max_obj)

Problem 24 The plot below visualizes the distribution of max_obj.

[Figure: "Histogram of max_obj"; x-axis labeled Maximum Value (ticks 70 to 76), y-axis labeled Frequency (ticks 0 to 2000).]

Which of the following arguments in hist() will change the above histogram from a frequency

histogram to a density (or probability) histogram?

I. density=TRUE

II. prob=TRUE

III. freq=FALSE

IV. add=TRUE

(a) I and III only

(b) I and IV only

(c) II and III only

(d) II and IV only

(e) I, II, III, and IV

Problem 25 Using a correct command from Problem 24, suppose a density (or probability) histogram of max_obj has been constructed. Which of the following commands will superimpose a smooth density estimate of max_obj on top of the existing histogram?

(a) lines(density(max_obj))

(b) lines(max_obj)

(c) curve(dnorm(x,mean(max_obj),sd(max_obj),add=TRUE))

(d) abline(density(max_obj))

(e) hist(max_obj,density=TRUE,add=TRUE)

Problem 26 Consider the following commands:

> head(trees)
  Girth Height Volume
1   8.3     70   10.3
2   8.6     65   10.3
3   8.8     63   10.2
4  10.5     72   16.4
5  10.7     81   18.8
6  10.8     83   19.7
> trees_medians <- numeric(ncol(trees))
> names(trees_medians) <- names(trees)
> for(j in seq_len(ncol(trees))){
+ trees_medians[j] <- median(trees[,j])
+ }
> trees_medians
 Girth Height Volume
  12.9   76.0   24.2

Which of the following commands will produce the same object as trees_medians?

(a) apply(trees,1,median)

(b) apply(trees,2,median)

(c) lapply(trees,median)

(d) sapply(trees,median)

(e) More than one of these commands will produce the same object as trees_medians.

Problem 27 The plot below visualizes the relationship between the sepal length and petal length of iris flowers in the iris data from the datasets package.

[Figure: scatterplot with Petal.Length on the x-axis (ticks 1 to 7) and Sepal.Length on the y-axis (ticks 4.5 to 8.0).]

Which of the following commands will produce this plot?

(a) plot(Sepal.Length ~ Petal.Length,data=iris,pch=19)

(b) plot(Petal.Length ~ Sepal.Length,data=iris,pch=19)

(c) points(Sepal.Length ~ Petal.Length,data=iris,pch=19)

(d) with(iris,plot(Sepal.Length,Petal.Length,pch=19))

(e) with(iris,pairs(Sepal.Length,Petal.Length,pch=19))
