INFS5730 – Social Media and Enterprise 2.0 – 2021 T2
Individual Assignment - Social Network Analysis

Due at 5pm on Friday of Week 5 (2 July 2021). This assignment is worth 10% of your overall
course mark.
Data Collection and Cleaning (worth 25% of the available marks)
In this assignment, you are required to collect comments and replies to comments from
Sony’s PlayStation Blog. As depicted in the figure below, PlayStation users can comment and
reply to each other’s comments on Sony’s PlayStation Blog at https://blog.playstation.com/

The PlayStation Blog uses WordPress, a free and open-source content management system
that provides an API to collect data such as posts, comments and replies to comments. To
collect data from any WordPress-based website, you are required to use the R library httr.
The httr library lets you request data from the Web over HTTP.
business.unsw.edu.au
CRICOS Code 00098G
Install and load the httr library by executing the following R commands:
install.packages("httr")
library(httr)
To collect data from a WordPress blog, send a GET request to the URL of the posts API
as follows (only the first 100 posts are needed):
posts = content(GET("https://blog.playstation.com/wp-json/wp/v2/posts?per_page=100"))
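The call above does not check whether the request succeeded. As a minimal sketch, assuming you want the script to stop on an HTTP error rather than continue with an error body, the request can be wrapped in a small helper. The name fetch_json is hypothetical and not part of the assignment; GET, stop_for_status and content are httr functions.

```r
library(httr)

# Hypothetical helper (not part of the assignment): request a URL and
# return the parsed JSON body, stopping with an R error if the server
# answers with an HTTP error status (4xx or 5xx).
fetch_json <- function(url) {
  resp <- GET(url)
  stop_for_status(resp)          # converts HTTP errors into R errors
  content(resp, as = "parsed")   # parses the JSON body into nested lists
}

# Usage (performs a network request, so it is left commented out):
# posts <- fetch_json("https://blog.playstation.com/wp-json/wp/v2/posts?per_page=100")
# length(posts)   # number of posts actually returned
```

Checking length(posts) after the call is a quick way to confirm how many posts came back before looping over them.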
To collect all comments (including replies to comments) on each post, loop through
all downloaded posts and, for each post, send a GET request to the URL of the
comments API for that post as follows:
allcomments = list()
for (i in seq_along(posts)) {
  # Request up to 100 comments (including replies) for post i
  url = paste0("https://blog.playstation.com/wp-json/wp/v2/comments?post=", posts[[i]]$id, "&per_page=100")
  allcomments = append(allcomments, content(GET(url)))
}
Now that all comments and replies are downloaded, you are required to create a
clean table of all comments and replies including the following fields: id, author,
author_name, parent.
fields = c("id", "author", "author_name", "parent")
# Build one single-row data frame per comment, then stack them in one rbind call
commentsTable = do.call(rbind, lapply(allcomments, function(x) as.data.frame(x[fields])))
Note that, for replies to comments, the field "parent" is the id of the comment replied to,
otherwise it is 0.
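To illustrate the role of the "parent" field, here is a small sketch on made-up data with the same four columns as commentsTable. The values and the names toy, top_level and replies are assumptions for illustration only.

```r
# Toy data in the same shape as commentsTable (all values are made up).
toy <- data.frame(
  id          = c(11, 12, 13, 14),
  author      = c(101, 102, 101, 103),
  author_name = c("ann", "bob", "ann", "cy"),
  parent      = c(0, 11, 0, 12)
)

top_level <- toy[toy$parent == 0, ]   # comments made directly on a post
replies   <- toy[toy$parent != 0, ]   # replies; parent holds the id replied to

nrow(top_level)   # 2
nrow(replies)     # 2
```

In this toy table, comment 12 is a reply to comment 11, and comment 14 is a reply to comment 12.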
Export the comments and replies table into a CSV file to be able to open it in Excel.
write.csv(commentsTable, "commentsTable.csv", row.names = FALSE)
Open the exported CSV file using Excel and save it as an Excel file called
"yourZid.xlsx" - replace yourZid with your UNSW zID.
Add two sheets to the Excel spreadsheet as follows:
o One sheet called "authors" containing the list of all authors of either
  comments or replies to comments (without duplicates). Two columns are
  required:
    author: the author ID.
    author_name: the name of the author.
o One sheet called "reply_to_comment_relationship" based on the list of replies
  to comments. Two columns are required:
    reply_author: the ID of the author of the reply.
    comment_author: the ID of the author of the comment replied to.
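The two sheets are to be built in Excel, but their logic can be sketched in R as a cross-check. The example below uses made-up data in the shape of commentsTable; the names toy, authors, parent_row and relationship are assumptions for illustration, and this is one possible approach rather than the required method.

```r
# Toy data in the same shape as commentsTable (all values are made up).
toy <- data.frame(
  id          = c(11, 12, 13, 14),
  author      = c(101, 102, 101, 103),
  author_name = c("ann", "bob", "ann", "cy"),
  parent      = c(0, 11, 0, 12)
)

# "authors" sheet: one row per distinct (author, author_name) pair.
authors <- unique(toy[, c("author", "author_name")])

# "reply_to_comment_relationship" sheet: for each reply, look up the
# author of the comment whose id equals the reply's parent.
replies    <- toy[toy$parent != 0, ]
parent_row <- match(replies$parent, toy$id)   # NA if the parent is not in the table
relationship <- data.frame(
  reply_author   = replies$author,
  comment_author = toy$author[parent_row]
)

# Drop replies whose parent comment was not downloaded.
relationship <- relationship[!is.na(relationship$comment_author), ]
```

In Excel, the same lookup is commonly done with a lookup formula from the parent column to the id column, followed by removing duplicate rows for the authors sheet.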
Submit the Excel spreadsheet file on Moodle as part of your assignment submission.
Social Network Analysis (worth 75% of the available marks)
Using NodeXL, conduct a social network analysis of the data obtained from the Data
Collection and Cleaning task above. The objective of this task is to conduct a social network
analysis to assess the reply-to-comment relationship between users.
Submit a Word document reporting the results of the social network analysis of the dataset
using NodeXL including the following:
1. Should the network be treated as directed or undirected? What is the difference?
Explain and justify your answer with examples. (worth 25% of the available marks)
2. Provide a screenshot of the graph pane (in Document Actions) with labelled nodes.
(worth 5% of the available marks)
3. Provide the following graph metrics from the social network analysis that you
generate using NodeXL (Overall Metrics sheet). Provide an explanation of each
metric and an interpretation of the findings (worth 20% of the available marks):
a. Vertices
b. Unique Edges
c. Edges With Duplicates
d. Graph Density
4. Based on your Social Network Analysis, how would you identify the “best influencer”
among users engaged on the PlayStation Blog? Explain and justify your answer.
(worth 25% of the available marks)
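As a rough cross-check of the metrics NodeXL reports, the quantities above can be sketched on a made-up edge list. The definitions used here are the textbook ones (density as the share of possible edges that are present, and in-degree as the number of replies received). They are assumptions about what NodeXL computes, and NodeXL's own handling of duplicate edges and self-loops may differ.

```r
# Toy directed edge list (made-up author IDs): reply_author -> comment_author
edges <- data.frame(
  reply_author   = c(102, 103, 102, 104),
  comment_author = c(101, 102, 101, 101)
)

# Vertices: every author that appears at either end of an edge.
vertices <- unique(c(edges$reply_author, edges$comment_author))
n <- length(vertices)

# Count how often each ordered pair occurs.
key <- paste(edges$reply_author, edges$comment_author, sep = "->")
pair_counts <- table(key)
unique_edges          <- sum(pair_counts == 1)               # pairs seen once
edges_with_duplicates <- sum(pair_counts[pair_counts > 1])   # edges in repeated pairs

# Directed density: distinct ordered pairs / n * (n - 1) possible pairs.
density_directed <- length(pair_counts) / (n * (n - 1))

# Undirected density: ignore direction, so A->B and B->A are the same pair.
a <- pmin(edges$reply_author, edges$comment_author)
b <- pmax(edges$reply_author, edges$comment_author)
density_undirected <- 2 * length(unique(paste(a, b))) / (n * (n - 1))

# In-degree: replies received per author, one possible starting point
# when looking for influential users.
in_degree <- table(edges$comment_author)
```

The two density values differ for the same data, which is one concrete consequence of treating the network as directed or undirected.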
Submission Details
Word Limit
There is no word limit per se. The font should be no smaller than 12-point Arial, with standard margins.
Line spacing must be 1.5.
