Lab 3: A Simple MapReduce-Style Wordcount Application
CMPSC 473, SUMMER 2022
Released on July 21, 2022, due on August 04, 2022, 11:59:59pm
Raj Pandey and Bhuvan Urgaonkar
1 Purpose and Background
This project is designed to give you experience in writing multi-threaded programs by
implementing a simplified MapReduce-style wordcount application. By working on this
project:
• You will learn to write multi-threaded code that correctly deals with race conditions.
• You will carry out a simple performance evaluation to examine the performance
impact of (i) the degree of parallelism in the mapper stage and (ii) the size of the
shared buffer which the two stages of your application will use to communicate.
[Figure: Input File -> (read) -> Mappers -> (produce) -> Buffer -> (consume) -> Reducer -> (write) -> Output File]
Figure 1: Overview of our MapReduce-style multi-threaded wordcount application.
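The pipeline in Figure 1 can be sketched in a few lines of Python. This is only an illustrative sketch under several assumptions, not the required implementation: the names `BoundedBuffer`, `mapper`, `reducer`, and `wordcount` are hypothetical, input lines are split round-robin among mappers, mappers emit one word at a time, a `None` sentinel marks the end of each mapper's output, and "alphabetical" order is taken to mean Python's default string ordering. The handout prescribes none of these choices.

```python
import threading
from collections import deque

DONE = None  # assumed sentinel: each mapper sends one when it has no more words


class BoundedBuffer:
    """Fixed-capacity FIFO shared by the mapper and reducer threads."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        lock = threading.Lock()
        # Both condition variables share one lock, so every access to
        # self.items happens with that lock held (no race on the deque).
        self.not_full = threading.Condition(lock)
        self.not_empty = threading.Condition(lock)

    def put(self, item):
        with self.not_full:
            while len(self.items) >= self.capacity:
                self.not_full.wait()  # producer blocks while the buffer is full
            self.items.append(item)
            self.not_empty.notify()

    def get(self):
        with self.not_empty:
            while not self.items:
                self.not_empty.wait()  # consumer blocks while the buffer is empty
            item = self.items.popleft()
            self.not_full.notify()
            return item


def mapper(lines, buffer):
    """Producer: split the assigned lines into words and push them to the buffer."""
    for line in lines:
        for word in line.split():
            buffer.put(word)
    buffer.put(DONE)


def reducer(buffer, num_mappers, counts):
    """Consumer: count words until every mapper has sent its sentinel."""
    finished = 0
    while finished < num_mappers:
        word = buffer.get()
        if word is DONE:
            finished += 1
        else:
            counts[word] = counts.get(word, 0) + 1


def wordcount(text, num_mappers=2, buffer_size=4):
    lines = text.splitlines()
    buffer = BoundedBuffer(buffer_size)
    counts = {}  # written only by the single reducer thread
    threads = [
        threading.Thread(target=mapper, args=(lines[i::num_mappers], buffer))
        for i in range(num_mappers)
    ]
    threads.append(threading.Thread(target=reducer, args=(buffer, num_mappers, counts)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(counts.items())  # (word, count) pairs in increasing word order


if __name__ == "__main__":
    for word, count in wordcount("the cat\nthe dog\na cat"):
        print(word, count)
```

The two parameters of `wordcount` correspond to the two knobs the performance evaluation is meant to examine: `num_mappers` is the degree of parallelism in the mapper stage and `buffer_size` is the capacity of the shared buffer.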
The wordcount application takes a text file as input and produces as output the count of
each unique word in the input file, arranged in increasing alphabetical order. We will
assume that the words in our input files contain only letters of the English alphabet
and the digits 0-9 (i.e., no punctuation marks or other special characters). Our
wordcount will consist of two stages. The first stage, called "mapper,"