辅导CSC 230程序、讲解C/C++,CSS程序设计 辅导R语言编程|辅导留学生 Statistics统计、回归、迭代
- 首页 >> CS CSC 230 Project 5
Lossy Image Compression
Images have been a common subject for compression. There are a couple of reasons for this. Without compression, image data
can require a lot of storage space, and images are often tolerant of a small amount of loss. It's generally OK if the compressed
image doesn't capture the original image exactly. If it can make the compressed representation smaller, most applications are
able to accept small changes in the intensity values stored in some pixels of the image.
This is what you'll be working on in this assignment. You'll be writing a pair of programs that compress and decompress grayscale
images. We'll be using a technique that's kind of like how the JPEG image format works. We'll be breaking the image into a lot of
little square blocks, applying a transformation called the Discrete Cosine Transform (DCT) to each block, then encoding the result
using different numbers of bits for various values. We'll call our compressed format j10, since it's like a simplified JPEG format,
but it uses 10 x 10 image blocks rather than the 8 x 8 blocks normally used by JPEG.
Our image compression program will be called encode. The following shows how we can run it. Here, the program will read an
input image named "image-06.pgm" and write out a compressed image named "output.j10".
$ ./encode image-06.pgm output.j10
We won't be able to directly display images in the j10 format. It's a format we just made up for this assignment, so no standard
image viewing program will support it. To view a compressed image, we'll need to first decompress it to PGM. Then, we can view
the resulting PGM image. Our decompression program will be called decode. The following shows how we can run it, taking the
compressed image we just created and decompressing it to produce an image in the standard, pgm format.
$ ./decode output.j10 output.pgm
The figure below shows the results of this compression and decompression. The image on the left is the original, and the one on
the right is the result of compressing and decompressing this image. There are some small differences. That's what you would
normally expect from a lossy compression technique. Overall, the two images look very similar.
Original image (left) and compressed/decompressed image (right)
If we're willing to tolerate some small differences introduced by compression, a lossy technique can save a lot of storage space.
The shell commands below show that we've reduced the storage cost for this image by more than 2/3. The original image
required almost 208 kilobytes, but the compressed representation used only about 63.
$ ls -l image-06.pgm output.j10
-rw-r--r-- 1 dbsturgi staff 212816 Mar 27 10:45 image-06.pgm
-rw-r--r-- 1 dbsturgi staff 64959 Mar 28 10:48 output.j10
As with recent assignments, you'll be developing this project using git for revision control. You should be able to just unpack the
starter into the p5 directory of your cloned repo to get started. See the Getting Started section for instructions.
This Project supports a number of our course objectives. See the Learning Outcomes section for a list.
Rules for Project 5
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 2/17
You get to complete this project individually. If you're unsure what's permitted, you can have a look at the academic integrity
guidelines in the course syllabus.
In the design section, you'll see some instructions for how your implementation is expected to work. Be sure you follow these
rules. It's not enough to just turn in a working program; your program has to follow the design constraints we've asked you to
follow. For this project, we're putting some constraints on the functions you'll need to define, and on one data structure you'll
need to use. Still, you will have lots of opportunities to design parts of the project for yourself.
Requirements
This section says what your programs are supposed to be able to do, and what they should do when something goes wrong.
Encode / Decode Execution
The encode program expects two command-line arguments, the name of an uncompressed input image in PGM format and the
name of a file for storing the compressed image in our j10 format. For example, you should be able to run it as follows:
./encode input.pgm output.j10
If you give the program the wrong number of command-line arguments, it should print the following line to standard error and
then terminate with an exit status of 1.
usage: encode
If one of the given files can't be opened, the program should print an error message to standard error, then terminate with an exit
status of 1. The error message should be generated using the perror() standard library function, with the filename passed to
perror() as the argument. For example, on a common platform system, you'll get an error message like the following if you try to
open a file that doesn't exist. Since the specific message is determined by perror(), you may get different error messages on
other systems of for different kinds of errors.
input.pgm: No such file or directory
If the input image isn't valid (see the description of the raw PGM format below), the encode program should print the following
line to standard error and then terminate with an exit status of 1. An input image could be invalid if the header didn't have the
expected fields, if the maximum intensity value wasn't 255, or if the file wasn't large enough to contain a byte for each pixel. Our
compression technique also requires image width and height to both be a multiple of 10 (there's an extra credit option described
below that lifts this restriction). If you don't do the extra credit, you should also report this error for images that don't have a
width and height that meet this constraint.
Invalid image file
As you'll see below, our compression technique will only support files with a width and height that are both less than 4096 pixels.
The encode program should also print the above message and terminate with a status of 1 if the input image is too wide or too
tall.
The decode program is similar to encode; it expects two command-line arguments, the name of a compressed input file in our j10
format and the name of an output file for the uncompressed image in PGM format. For example, you should be able to run it as
follows:
./decode input.j10 output.pgm
If you give the program the wrong number of command-line arguments, it should print the following line to standard error and
then terminate with an exit status of 1.
usage: decode
If one of the given files can't be opened, decode will also use perror() to print an error message to standard error, then terminate
with an exit status of 1. So, it might print an error message like the following if it couldn't create the requested output file:
output.pgm: Permission denied
If the decode program is given an input file that doesn't contain enough bits to represent an encoded image (if the input file
reaches the end-of-file before it should), it should print the following line to standard error and then terminate with an exit status
of 1. Unless you do the extra credit, you should also print this line and terminate if the input describes an image where the width
and height aren't a multiple of 10.
Invalid encoded file
PGM Image Format
For uncompressed images, we'll be using the raw form of the PGM image format. This is similar to the image format we used on
project 2, but it represents pixel data in binary, rather than in text. This makes it much more space-efficient (but, our compressed
format will generally be able to do better). We'll place some additional constraints on our input and output images, to make them
easier to parse and to write out.
The raw PGM image format starts with a 3-line header like the following. This part of the file is in text. It starts with two
characters, P5 on the first line. This two-character code indicates that this is an image file in raw PGM format. The next line gives
the size of the image, as width then height. The example below is showing an image 60 pixels wide and 45 pixels tall. Notice that
the width, height order is backward from how we normally describe two-dimensional data. Everywhere else in our program, we'll
generally order these as height (rows) then width (columns). In this one place, though, we have to respect the order defined in
the PGM image format, so, for PGM images, this line of the header will give width then height.
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 3/17
P5
60 45
255
The third line of the header gives the maximum intensity for pixels. All the images we work with will be at least 1 pixel wide and 1
pixel tall, and they'll all have 255 as their maximum intensity. We'll also assume there's just one space between the width and
height.
The 255 at the end of the header is followed by one whitespace character (a newline for all of our images). Then, the remaining
width x height bytes in the file give the intensity values for the image pixels, one byte per pixel. These are given in row-major
order, so the first byte of pixel data gives the intensity of the pixel in the upper-left corner, then the intensity of the next pixel on
the top row, and so on, up to the last pixel on the top row. Then, immediately afterward are all the pixel intensities for the second
row, then the next row all the way to the last row of the image. Note that all the pixel data is in binary, one byte per pixel. There's
no separation between the intensity values for pixels (e.g., no spaces or newlines separating the data values).
The following shows the contents of a small pgm file using the hexdump program. This lets us see the start of the file as text
(over on the right) and the contents of the pixel data as binary. You can see this is a 10 x 10 image. The hexadecimal 0a near the
end of the first line is the newline after the 255. After this, the ad is the intensity of the pixel in the upper-left corner of the
image. This value, along with the 99 bytes after it give the intensity values of all the image's pixels, in row-major order. The file
ends right after the intensity of the last pixel (31 in hexadecimal).
$ hexdump -C image-02.pgm
00000000 50 35 0a 31 30 20 31 30 0a 32 35 35 0a ad a7 9c |P5.10 10.255....|
00000010 8c 79 65 53 43 37 31 ad a7 9c 8c 79 65 53 43 37 |.yeSC71....yeSC7|
00000020 31 ad a7 9c 8c 79 65 53 43 37 31 ad a7 9c 8c 79 |1....yeSC71....y|
00000030 65 53 43 37 31 ad a7 9c 8c 79 65 53 43 37 31 ad |eSC71....yeSC71.|
00000040 a7 9c 8c 79 65 53 43 37 31 ad a7 9c 8c 79 65 53 |...yeSC71....yeS|
00000050 43 37 31 ad a7 9c 8c 79 65 53 43 37 31 ad a7 9c |C71....yeSC71...|
00000060 8c 79 65 53 43 37 31 ad a7 9c 8c 79 65 53 43 37 |.yeSC71....yeSC7|
00000070 31 |1|
The following figure shows what the image above looks like. Here, the image is enlarged considerably to show individual pixels.
Image defined by the PGM file above.
Image Block Transformation
For the pixel data, our compression technique will work by applying a version of the two-dimensional Discrete Cosine Transform
(DCT) to blocks of the image, then writing out an encoding of each of the resulting blocks. The DCT is a common transformation
used in lossy compression tasks. In fact, if you're familiar with the show, "Silicon Valley", the DCT gets mentioned at least once in
the show. The DCT will let us look at the frequency components in each image block, making it possible to store information about
how the shade is changing within a block, rather than just storing intensity values for individual pixels.
The following figure shows what the DCT does. The left-hand side shows a 10x10 image, with intensity values for each pixel. The
right-hand side shows a DCT of this image, with values rounded to integers after the DCT is calculated. After applying the DCT, we
are left with a grid of values, but many of these values will be zero or close to zero. That will make the transformed image easier
to compress, we can represent it in a way that makes it easy to skip over the zero values.
Example DCT Computation for a small image
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 4/17
We will be applying the DCT separately to 10x10-pixel blocks of the input image. Here, we'll describe how to apply the DCT and
its inverse to an individual block. Compressing an image will require applying this transformation to each 10x10 block and
encoding the result in an output file. Decompressing will require decoding each 10x10 block and applying the inverse DCT to that
block.
For an image block, we'll use X[ i ][ j ] to stand for represent the intensities of the individual pixels. So, X[ 0 ][ 0 ] will stand for
the intensity of a pixel in the upper left corner of the block, X[ 0 ][ 9 ] will be the intensity of a pixel in the upper right, X[ 9 ][ 9 ]
will be a pixel in the lower right corner of the block and X[ 9 ][ 0 ] will be the lower left corner. Even though pixel intensities are
integers in the range 0 .. 255, the transformation will expect these integers to be converted to real values (doubles, with values
between 0 and 255).
From the X values representing pixel intensities in a block, we will compute a 10x10 array of real numbers representing the
transformed block. We will call this array, Y and we will use Y[ m][ n ] to stand for a single element of this array. Each element of
Y will be computed as a weighted sum of all the values of the 10x10 X array. So, applying the DCT will be a little bit expensive.
Each of the 100 elements of Y will require computing a different weighted sum of all 100 values in the X array. There are more
efficient ways to compute the DCT, but we won't worry about that for this assignment.
The following formula shows how to compute the value of any Y[ m ][ n ]. In this definition, B stands for the block size of 10. The
constant, S is a scaling factor that we'll set to S = 0.044997. This value for S should make sure the DCT yields values that can be
encoded with a particular number of bits, and it should help to make sure small differences in how the DCT is computed don't
result in different rounding behavior. The values Sm and Sn are also scaling factors. These will usually have a value of 1 except Sm will have a value of 1 / 2 when m = 0 and Sn will have a value of 1 / 2 when n = 0.
Rule to compute the DCT, generating elements of Y from X
For example, these are the (approximate) weights that would be used to compute the value of Y[ 7 ][ 2 ]. With m = 7 and n = 2,
the formula above would multiply each element of X by the weight shown below and add all of these weighted values up to get Y[
7 ][ 2 ].
0.00389 0.00240 0.00000 -0.00240 -0.00389 -0.00389 -0.00240 -0.00000 0.00240 0.00389
-0.00845 -0.00522 -0.00000 0.00522 0.00845 0.00845 0.00522 0.00000 -0.00522 -0.00845
0.00605 0.00374 0.00000 -0.00374 -0.00605 -0.00605 -0.00374 -0.00000 0.00374 0.00605
0.00134 0.00083 0.00000 -0.00083 -0.00134 -0.00134 -0.00083 -0.00000 0.00083 0.00134
-0.00763 -0.00471 -0.00000 0.00471 0.00763 0.00763 0.00471 0.00000 -0.00471 -0.00763
0.00763 0.00471 0.00000 -0.00471 -0.00763 -0.00763 -0.00471 -0.00000 0.00471 0.00763
-0.00134 -0.00083 -0.00000 0.00083 0.00134 0.00134 0.00083 0.00000 -0.00083 -0.00134
-0.00605 -0.00374 -0.00000 0.00374 0.00605 0.00605 0.00374 0.00000 -0.00374 -0.00605
0.00845 0.00522 0.00000 -0.00522 -0.00845 -0.00845 -0.00522 -0.00000 0.00522 0.00845
-0.00389 -0.00240 -0.00000 0.00240 0.00389 0.00389 0.00240 0.00000 -0.00240 -0.00389
The rule to compute the inverse DCT is similar. Given the DCT of a block, Y, the following will compute the value of any X[ i ][ j ].
Each element of X can be computed as a weighted sum of all the elements of Y. As above, S is the scaling factor of 0.044997
(but, notice that we use its reciprocal here), and B is the block size, 10. The scaling factors Sm and Sn are defined the same as
above. They normally have a value of 1, but Sm has a value of 1 / 2 when m = 0 and Sn has a value of 1 / 2 when n = 0. Notice
that, in this formula, variables m and n are used in the summations, so the values of Sm and Sn will depend on which term you're
adding in.
Rule to compute the inverse DCT, generating elements of X from Y
The following table shows an example. These are the (approximate) weights you would use for computing the value of X[ 9 ][ 0 ].
Each of these weights would be multiplied by an element of Y and the weighted values would be added up to get X[ 9 ][ 0 ].
2.22237 3.10421 2.98908 2.80035 2.54266 2.22237 1.84735 1.42685 0.97121 0.49166
-3.10421 -4.33597 -4.17516 -3.91154 -3.55160 -3.10421 -2.58039 -1.99303 -1.35659 -0.68675
2.98908 4.17516 4.02031 3.76646 3.41988 2.98908 2.48469 1.91911 1.30628 0.66128
-2.80035 -3.91154 -3.76646 -3.52865 -3.20394 -2.80035 -2.32780 -1.79794 -1.22380 -0.61953
2.54266 3.55160 3.41988 3.20394 2.90912 2.54266 2.11360 1.63249 1.11119 0.56252
-2.22237 -3.10421 -2.98908 -2.80035 -2.54266 -2.22237 -1.84735 -1.42685 -0.97121 -0.49166
1.84735 2.58039 2.48469 2.32780 2.11360 1.84735 1.53562 1.18607 0.80732 0.40869
-1.42685 -1.99303 -1.91911 -1.79794 -1.63249 -1.42685 -1.18607 -0.91609 -0.62356 -0.31566
0.97121 1.35659 1.30628 1.22380 1.11119 0.97121 0.80732 0.62356 0.42443 0.21486
-0.49166 -0.68675 -0.66128 -0.61953 -0.56252 -0.49166 -0.40869 -0.31566 -0.21486 -0.10877
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 5/17
Bitwise Input / Output
Our j10 format will write out values to the compressed image file using varying numbers of bits. This will let us save space in the
compressed image. If we have a small value that doesn't need an entire byte, we can store it using just a few bits. Or, if there are
some unused bits in a byte of the output we can use them to store part of a value before starting to use bits in the next byte.
We will always use bits in the output starting from the highest-order bit, then the second-highest-order bit and so on until
reaching the low-order bit of that byte. Then, the next bit we have to store will go in the high-order bit of the next byte. For
example, imagine that we started with an empty output file. If we had to write three bits, 011, to they file, they would go in the
three highest-order bits of the first output byte.
Output after writing three bits
Then, if we had to write out four more bits, say 0101, they would go in the next three bits of the same byte, leaving just the
lowest-order bit unused.
Output after writing seven bits
If we next had to write out three bits, say 111, this would use up the last remaining bit of the first byte and the next two bits
would go in the two high-order bits of the next byte.
Output after writing ten bits
Once all needed bits have been stored, there may still be some left-over bits in the last byte. We can't write just part of a byte to
an output file we'll need to decide what to put in any left-over bits in the last byte before that byte is written out to the file. We'll
always fill any remaining bits with zeros before writing out the last byte of a file.
Encoding Compressed Images
Our j10 image format will start with two 12-bit values giving the height and width of the image (notice, the order is the reverse of
the PGM format). The first 12 bits of the output file will give the height of the image (number of rows) and then 12 more bits for
the width (number of columns). With just 12 bits for each of the these values, this format can support images no larger than
4095 x 4095 pixels.
Image size encoded in the first 12 bits of the j10 format.
After the image size, the j10 file will contain an encoding of each 10x10 block of the image. Blocks will be written out in row- major order, with every block at the top of the image written out left-to-write, then every block below these left-to-right until
reaching the bottom of the image. The figure below shows how a 30x40 image would be handled. It would be encoded as 12
10x10 blocks, with three blocks on each row. Blocks would be encoded to the output one after another in the order shown below
(i.e., in row-major order).
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 6/17
Encoding order for blocks in a 30x40-pixel image.
To encode a 10x10 block, we will first transform it via the DCT. This will yield a 10x10 array of real numbers. We will round each
of these numbers to the nearest integer. This will give us a 10x10 array of integers. The following shows an example of what one
such block might look like. This one represents the 10x10 image-03.pgm given in the starter, it contains just one block. This block
illustrates what the DCT often gives us. In the transformed image block, lots of the values are zero. Most of the non-zero values
are up near the upper-left corner. Our compressed image format will save space by using an (moderately) efficient way to skip
over the zeros in the transformed image block.
DCT of the block in image-03.pgm
In the transformed image block, the value in the upper-left corner represents the overall intensity of the block. We will encode it
first as a 7-bit unsigned value (it should never be negative). In the example above, this value is 50, so we would write it out as
0110010 (that's 50 in binary). Computationally, you won't even need to do any conversion to get the binary representation of this
value. Internally, integers are stored in binary, so we can use the bitwise operators to look at their bitwise representation directly.
Encoding of a 7-bit value for the Y[0][0] value in a block
That takes care of the first value in the transformed image block. For the remaining 99, we'll expect a lot of zero values. We'll
write out 99 bits to indicate which of these values are zero and which are non-zero. For each value in the DCT image block, we'll
write out a 0 bit if that value is zero and a 1 bit if the value is non-zero. Bits for each value will be written out in row-major order,
so we'll write out 9 bits for the 9 remaining values in the first row, then 10 more bits for the 10 bits in the next row, all the way to
the last row of the image block.
The following figure shows how the block shown above would be encoded. It still shows the value of 50 encoded in the first 7 bits.
Then, the next 99 bits represent where the non-zero values are in the remaining elements. The last byte still has 6 unused bits in
it.
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 7/17
Encoding of the Y[0][0] term and a bit for each of the remaining 99 values
Next, we will encode the values for the non-zero elements of the DCT block. These will be done in row-major order, but we will
skip the zero values. These values will may be positive or negative, but they will have smaller magnitudes than the value in the
upper-left corner. We will encode a value using one bit to encode the sign, 0 for positive, 1 for negative. This will be followed by 6
bits giving the magnitude of the value (its absolute value). So, for example, a value of -10 would be encoded as a 1 (since it's
negative), followed by 001010 (the value 10 in binary).
The three non-negative values in the sample DCT block above would be encoded as 1001010 (for the -10 in the top fow), then
0001010 (for the 10 in row 2), then 1001010 (for the -10 in row 2). With these values at the end, the encoding of this entire
block would look like the following figure. There's one bit in the last byte that isn't used. If the image had another block after this
one, that bit would be used as the first bit in encoding the next block. Blocks don't need to start or end at a byte boundary.
Encoding of the entire DCT block shown above
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 8/17
The following figure shows what the test input, image-03.pgm would encode to in the j10 format. It's just a 10x10 image, so it
only contains one block. It starts with two 12-bit numbers for the size of the image, then the encoding for the one block in this
image.
Encoding of the entire image-03.pgm test input
The following shows a larger example. This is the j10 encoding of test input image-04.pgm. To make it fit better on the page, it's
broken into two columns. The image-04.pgm input is 20 pixels wide and 10 pixels tall, so it's encoded as two blocks. It starts with
24 bits for the image size, then the encoding of the first (left-hand) block, then the second (right-hand) one. For this image, there
are two unused bits in the last byte. They should be written out as two low-order zeros.
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 9/17
Encoding of the image-04.pgm test input
Decoding Images
Decoding an image requires reading a compressed image in the j10 format and then writing an image out in raw PGM format. The
first 24 bits give the size the output image should be. After this, each 10x10 block of the image can be decoded from the j10
encoding.
To decode a block, you can fill in the values in a 10x10 block of doubles, Y. The first 7 bits of the block's encoding give the value
in the upper-left corner of Y. Then, the next 99 bits tell where the non-zero values are in the rest of Y. After this, you can read
and store values for each of the non-zero elements of Y in row-major order (be sure to set the rest of the values to zero if
needed).
After filling in the Y array for a block, you can apply the inverse DCT to get a 10x10 array of intensity values, X. Each element of X
can be rounded to the nearest integer to get the pixel intensity. Because we're implementing a lossy compression technique, it's
possible you will get a value that's larger than 255 or less than 0 after rounding. If that happens, just clamp the value to the
0..255 range; if it's larger than 255, set it to 255 instead, and if it's smaller than 0 just set it to zero.
After filling in each 10x10 block of the resulting image, write it out as a PGM file to produce the output.
Extra Credit
To simplify our encode and decode programs, they only need to work with images that are a multiple of 10 in both dimensions.
However, if you'd like up to 5 points of extra credit, you can lift this restriction. If an image isn't a multiple of 10 in size, we can
just pad the image with extra pixels when we're computing the DCT. Then, these extra pixels can be discarded when decoding the
image.
For the extra credit, your encode and decode programs should work for images of any (non-zero) size. If the image isn't the right
size, you will compute the DCT as if it was padded with extra rows and columns of pixels, enough to bring its size up to a multiple
of 10 in both dimensions. Additional rows will just contain copies of the bottom row form the input image. Additional columns will
contain copies of the rightmost column. Any pixels added to the lower-right will contain copies of the pixel in the lower-right
corner of the input image.
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 10/17
Padding the image to get extra pixels needed for the DCT.
The starter includes a few files to help you test the extra credit option, if you choose to implement it. The files image-ec-01.pgm
and image-ec-02.pgm are images that aren't a multiple of 10 in size. The files encoded-ec-01.j10 and encoded-ec-02.j10 are the
compressed output you should get if you compress these images. You can test this part with commands like the following:
./encode image-ec-01.pgm output.j10
echo $?
diff output.j10 encoded-ec-01.j10
There are also a couple of decoded-ec-*.pgm files for testing your decode program support for the extra credit option. You try
these with commands like the following:
./decode encoded-ec-01.j10 output.pgm
echo $?
diff output.pgm decoded-ec-01.pgm
Design
Program Organization
Your solution will use five components. Two of these (encode.c and decode.c) will be top-level components, containing a main
functions for the encode and decode programs. The other three (bits.c/bits.h, image.c/image.h and j10.c/j10.h) will contain
functions and some type definitions to be used by the other components.
The following figure shows the dependency between these components. The top-level programs will use the bits, image and j10
components. The j10 component will use bits and image, but bits and image won't need to use j10 or each other.
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 11/17
Dependency structure among the components
Bits Component
The bits component will support reading and writing files one bit at a time. This is a little tricky, since the C library just provides
mechanisms for reading files one or more bytes at a time.
Your bits component will define two struct types, BitReader and BitWriter. These will contain fields to let you read or write a file
one bit at a time. For reading, the BitReader will contain a buffer with room for (at least) 8 bits. When you need to read the first
bit from a file, it will read a whole byte from an input file. It will return the high-order bit from this byte and save the rest in the
buffer. Then, the next 7 bits can be extracted from the buffer before another byte must be read from the file. Like it says in the in
the Bitwise Input / Output section, bits from each byte of a file will be returned in order from the high-order bit to the lowestorder
bit.
Operation of the BitReader and BitWriter
A similar struct, BitWriter, will be used to help save bits one at a time to a file. Using this struct, you can store bits one at a time
in an internal buffer. Then, once 8 bits have been saved, they can all be written out to the file as a single byte. If all the bits in the
last byte aren't needed, they should be filled with zeros and written to the output file when the BitWriter is closed.
You get to define the BitReader and BitWriter structs yourself. You'll need to store a FILE pointer for the file being read from or
written to. You'll also need other fields to keep up with bits being buffered in the struct.
The bits component will provide the following functions. Of course, you can use other functions if you need them, but be sure to
make any additional functions static if they're only used internally in the component.
BitReader *openBitReader( char const *filename )
This function will open the given file for reading and dynamically allocate a new BitReader struct for that file, initialize it and
return a pointer to it. Client code can use this reader to read bits from the file one at a time.
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 12/17
void closeBitReader( BitReader *reader )
Client code will call this function when it's done using a BitReader. It will free any resources used by the given BitReader,
closing the file it's reading from and freeing any memory the reader was using.
int getBit( BitReader *reader )
Client code will call this function to get the next bit from the file the reader is reading from. If the buffer contains any bits
that haven't been returned yet, it will return the next bit from the buffer. If the buffer is empty, it will read the next byte
from the input file and return the high-order bit from that byte. If there are no more bits to read, this function should return
the constant, EOF, to the caller.
BitWriter *openBitWriter( char const *filename )
This creates a new BitWriter struct, opening a file with the given name, allocating space for the BitWriter and initializing its
fields as needed. It returns a poi
Lossy Image Compression
Images have been a common subject for compression. There are a couple of reasons for this. Without compression, image data
can require a lot of storage space, and images are often tolerant of a small amount of loss. It's generally OK if the compressed
image doesn't capture the original image exactly. If it can make the compressed representation smaller, most applications are
able to accept small changes in the intensity values stored in some pixels of the image.
This is what you'll be working on in this assignment. You'll be writing a pair of programs that compress and decompress grayscale
images. We'll be using a technique that's kind of like how the JPEG image format works. We'll be breaking the image into a lot of
little square blocks, applying a transformation called the Discrete Cosine Transform (DCT) to each block, then encoding the result
using different numbers of bits for various values. We'll call our compressed format j10, since it's like a simplified JPEG format,
but it uses 10 x 10 image blocks rather than the 8 x 8 blocks normally used by JPEG.
Our image compression program will be called encode. The following shows how we can run it. Here, the program will read an
input image named "image-06.pgm" and write out a compressed image named "output.j10".
$ ./encode image-06.pgm output.j10
We won't be able to directly display images in the j10 format. It's a format we just made up for this assignment, so no standard
image viewing program will support it. To view a compressed image, we'll need to first decompress it to PGM. Then, we can view
the resulting PGM image. Our decompression program will be called decode. The following shows how we can run it, taking the
compressed image we just created and decompressing it to produce an image in the standard, pgm format.
$ ./decode output.j10 output.pgm
The figure below shows the results of this compression and decompression. The image on the left is the original, and the one on
the right is the result of compressing and decompressing this image. There are some small differences. That's what you would
normally expect from a lossy compression technique. Overall, the two images look very similar.
Original image (left) and compressed/decompressed image (right)
If we're willing to tolerate some small differences introduced by compression, a lossy technique can save a lot of storage space.
The shell commands below show that we've reduced the storage cost for this image by more than 2/3. The original image
required almost 208 kilobytes, but the compressed representation used only about 63.
$ ls -l image-06.pgm output.j10
-rw-r--r-- 1 dbsturgi staff 212816 Mar 27 10:45 image-06.pgm
-rw-r--r-- 1 dbsturgi staff 64959 Mar 28 10:48 output.j10
As with recent assignments, you'll be developing this project using git for revision control. You should be able to just unpack the
starter into the p5 directory of your cloned repo to get started. See the Getting Started section for instructions.
This Project supports a number of our course objectives. See the Learning Outcomes section for a list.
Rules for Project 5
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 2/17
You get to complete this project individually. If you're unsure what's permitted, you can have a look at the academic integrity
guidelines in the course syllabus.
In the design section, you'll see some instructions for how your implementation is expected to work. Be sure you follow these
rules. It's not enough to just turn in a working program; your program has to follow the design constraints we've asked you to
follow. For this project, we're putting some constraints on the functions you'll need to define, and on one data structure you'll
need to use. Still, you will have lots of opportunities to design parts of the project for yourself.
Requirements
This section says what your programs are supposed to be able to do, and what they should do when something goes wrong.
Encode / Decode Execution
The encode program expects two command-line arguments, the name of an uncompressed input image in PGM format and the
name of a file for storing the compressed image in our j10 format. For example, you should be able to run it as follows:
./encode input.pgm output.j10
If you give the program the wrong number of command-line arguments, it should print the following line to standard error and
then terminate with an exit status of 1.
usage: encode
If one of the given files can't be opened, the program should print an error message to standard error, then terminate with an exit
status of 1. The error message should be generated using the perror() standard library function, with the filename passed to
perror() as the argument. For example, on a common platform system, you'll get an error message like the following if you try to
open a file that doesn't exist. Since the specific message is determined by perror(), you may get different error messages on
other systems of for different kinds of errors.
input.pgm: No such file or directory
If the input image isn't valid (see the description of the raw PGM format below), the encode program should print the following
line to standard error and then terminate with an exit status of 1. An input image could be invalid if the header didn't have the
expected fields, if the maximum intensity value wasn't 255, or if the file wasn't large enough to contain a byte for each pixel. Our
compression technique also requires image width and height to both be a multiple of 10 (there's an extra credit option described
below that lifts this restriction). If you don't do the extra credit, you should also report this error for images that don't have a
width and height that meet this constraint.
Invalid image file
As you'll see below, our compression technique will only support files with a width and height that are both less than 4096 pixels.
The encode program should also print the above message and terminate with a status of 1 if the input image is too wide or too
tall.
The decode program is similar to encode; it expects two command-line arguments, the name of a compressed input file in our j10
format and the name of an output file for the uncompressed image in PGM format. For example, you should be able to run it as
follows:
./decode input.j10 output.pgm
If you give the program the wrong number of command-line arguments, it should print the following line to standard error and
then terminate with an exit status of 1.
usage: decode
If one of the given files can't be opened, decode will also use perror() to print an error message to standard error, then terminate
with an exit status of 1. So, it might print an error message like the following if it couldn't create the requested output file:
output.pgm: Permission denied
If the decode program is given an input file that doesn't contain enough bits to represent an encoded image (if the input file
reaches the end-of-file before it should), it should print the following line to standard error and then terminate with an exit status
of 1. Unless you do the extra credit, you should also print this line and terminate if the input describes an image where the width
and height aren't a multiple of 10.
Invalid encoded file
PGM Image Format
For uncompressed images, we'll be using the raw form of the PGM image format. This is similar to the image format we used on
project 2, but it represents pixel data in binary, rather than in text. This makes it much more space-efficient (but, our compressed
format will generally be able to do better). We'll place some additional constraints on our input and output images, to make them
easier to parse and to write out.
The raw PGM image format starts with a 3-line header like the following. This part of the file is in text. It starts with two
characters, P5 on the first line. This two-character code indicates that this is an image file in raw PGM format. The next line gives
the size of the image, as width then height. The example below is showing an image 60 pixels wide and 45 pixels tall. Notice that
the width, height order is backward from how we normally describe two-dimensional data. Everywhere else in our program, we'll
generally order these as height (rows) then width (columns). In this one place, though, we have to respect the order defined in
the PGM image format, so, for PGM images, this line of the header will give width then height.
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 3/17
P5
60 45
255
The third line of the header gives the maximum intensity for pixels. All the images we work with will be at least 1 pixel wide and 1
pixel tall, and they'll all have 255 as their maximum intensity. We'll also assume there's just one space between the width and
height.
The 255 at the end of the header is followed by one whitespace character (a newline for all of our images). Then, the remaining
width x height bytes in the file give the intensity values for the image pixels, one byte per pixel. These are given in row-major
order, so the first byte of pixel data gives the intensity of the pixel in the upper-left corner, then the intensity of the next pixel on
the top row, and so on, up to the last pixel on the top row. Then, immediately afterward are all the pixel intensities for the second
row, then the next row all the way to the last row of the image. Note that all the pixel data is in binary, one byte per pixel. There's
no separation between the intensity values for pixels (e.g., no spaces or newlines separating the data values).
The following shows the contents of a small pgm file using the hexdump program. This lets us see the start of the file as text
(over on the right) and the contents of the pixel data as binary. You can see this is a 10 x 10 image. The hexadecimal 0a near the
end of the first line is the newline after the 255. After this, the ad is the intensity of the pixel in the upper-left corner of the
image. This value, along with the 99 bytes after it give the intensity values of all the image's pixels, in row-major order. The file
ends right after the intensity of the last pixel (31 in hexadecimal).
$ hexdump -C image-02.pgm
00000000 50 35 0a 31 30 20 31 30 0a 32 35 35 0a ad a7 9c |P5.10 10.255....|
00000010 8c 79 65 53 43 37 31 ad a7 9c 8c 79 65 53 43 37 |.yeSC71....yeSC7|
00000020 31 ad a7 9c 8c 79 65 53 43 37 31 ad a7 9c 8c 79 |1....yeSC71....y|
00000030 65 53 43 37 31 ad a7 9c 8c 79 65 53 43 37 31 ad |eSC71....yeSC71.|
00000040 a7 9c 8c 79 65 53 43 37 31 ad a7 9c 8c 79 65 53 |...yeSC71....yeS|
00000050 43 37 31 ad a7 9c 8c 79 65 53 43 37 31 ad a7 9c |C71....yeSC71...|
00000060 8c 79 65 53 43 37 31 ad a7 9c 8c 79 65 53 43 37 |.yeSC71....yeSC7|
00000070 31 |1|
The following figure shows what the image above looks like. Here, the image is enlarged considerably to show individual pixels.
Image defined by the PGM file above.
Image Block Transformation
For the pixel data, our compression technique will work by applying a version of the two-dimensional Discrete Cosine Transform
(DCT) to blocks of the image, then writing out an encoding of each of the resulting blocks. The DCT is a common transformation
used in lossy compression tasks. In fact, if you're familiar with the show, "Silicon Valley", the DCT gets mentioned at least once in
the show. The DCT will let us look at the frequency components in each image block, making it possible to store information about
how the shade is changing within a block, rather than just storing intensity values for individual pixels.
The following figure shows what the DCT does. The left-hand side shows a 10x10 image, with intensity values for each pixel. The
right-hand side shows a DCT of this image, with values rounded to integers after the DCT is calculated. After applying the DCT, we
are left with a grid of values, but many of these values will be zero or close to zero. That will make the transformed image easier
to compress, we can represent it in a way that makes it easy to skip over the zero values.
Example DCT Computation for a small image
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 4/17
We will be applying the DCT separately to 10x10-pixel blocks of the input image. Here, we'll describe how to apply the DCT and
its inverse to an individual block. Compressing an image will require applying this transformation to each 10x10 block and
encoding the result in an output file. Decompressing will require decoding each 10x10 block and applying the inverse DCT to that
block.
For an image block, we'll use X[ i ][ j ] to stand for represent the intensities of the individual pixels. So, X[ 0 ][ 0 ] will stand for
the intensity of a pixel in the upper left corner of the block, X[ 0 ][ 9 ] will be the intensity of a pixel in the upper right, X[ 9 ][ 9 ]
will be a pixel in the lower right corner of the block and X[ 9 ][ 0 ] will be the lower left corner. Even though pixel intensities are
integers in the range 0 .. 255, the transformation will expect these integers to be converted to real values (doubles, with values
between 0 and 255).
From the X values representing pixel intensities in a block, we will compute a 10x10 array of real numbers representing the
transformed block. We will call this array, Y and we will use Y[ m][ n ] to stand for a single element of this array. Each element of
Y will be computed as a weighted sum of all the values of the 10x10 X array. So, applying the DCT will be a little bit expensive.
Each of the 100 elements of Y will require computing a different weighted sum of all 100 values in the X array. There are more
efficient ways to compute the DCT, but we won't worry about that for this assignment.
The following formula shows how to compute the value of any Y[ m ][ n ]. In this definition, B stands for the block size of 10. The
constant, S is a scaling factor that we'll set to S = 0.044997. This value for S should make sure the DCT yields values that can be
encoded with a particular number of bits, and it should help to make sure small differences in how the DCT is computed don't
result in different rounding behavior. The values Sm and Sn are also scaling factors. These will usually have a value of 1 except Sm will have a value of 1 / 2 when m = 0 and Sn will have a value of 1 / 2 when n = 0.
Rule to compute the DCT, generating elements of Y from X
For example, these are the (approximate) weights that would be used to compute the value of Y[ 7 ][ 2 ]. With m = 7 and n = 2,
the formula above would multiply each element of X by the weight shown below and add all of these weighted values up to get Y[
7 ][ 2 ].
0.00389 0.00240 0.00000 -0.00240 -0.00389 -0.00389 -0.00240 -0.00000 0.00240 0.00389
-0.00845 -0.00522 -0.00000 0.00522 0.00845 0.00845 0.00522 0.00000 -0.00522 -0.00845
0.00605 0.00374 0.00000 -0.00374 -0.00605 -0.00605 -0.00374 -0.00000 0.00374 0.00605
0.00134 0.00083 0.00000 -0.00083 -0.00134 -0.00134 -0.00083 -0.00000 0.00083 0.00134
-0.00763 -0.00471 -0.00000 0.00471 0.00763 0.00763 0.00471 0.00000 -0.00471 -0.00763
0.00763 0.00471 0.00000 -0.00471 -0.00763 -0.00763 -0.00471 -0.00000 0.00471 0.00763
-0.00134 -0.00083 -0.00000 0.00083 0.00134 0.00134 0.00083 0.00000 -0.00083 -0.00134
-0.00605 -0.00374 -0.00000 0.00374 0.00605 0.00605 0.00374 0.00000 -0.00374 -0.00605
0.00845 0.00522 0.00000 -0.00522 -0.00845 -0.00845 -0.00522 -0.00000 0.00522 0.00845
-0.00389 -0.00240 -0.00000 0.00240 0.00389 0.00389 0.00240 0.00000 -0.00240 -0.00389
The rule to compute the inverse DCT is similar. Given the DCT of a block, Y, the following will compute the value of any X[ i ][ j ].
Each element of X can be computed as a weighted sum of all the elements of Y. As above, S is the scaling factor of 0.044997
(but, notice that we use its reciprocal here), and B is the block size, 10. The scaling factors Sm and Sn are defined the same as
above. They normally have a value of 1, but Sm has a value of 1 / 2 when m = 0 and Sn has a value of 1 / 2 when n = 0. Notice
that, in this formula, variables m and n are used in the summations, so the values of Sm and Sn will depend on which term you're
adding in.
Rule to compute the inverse DCT, generating elements of X from Y
The following table shows an example. These are the (approximate) weights you would use for computing the value of X[ 9 ][ 0 ].
Each of these weights would be multiplied by an element of Y and the weighted values would be added up to get X[ 9 ][ 0 ].
2.22237 3.10421 2.98908 2.80035 2.54266 2.22237 1.84735 1.42685 0.97121 0.49166
-3.10421 -4.33597 -4.17516 -3.91154 -3.55160 -3.10421 -2.58039 -1.99303 -1.35659 -0.68675
2.98908 4.17516 4.02031 3.76646 3.41988 2.98908 2.48469 1.91911 1.30628 0.66128
-2.80035 -3.91154 -3.76646 -3.52865 -3.20394 -2.80035 -2.32780 -1.79794 -1.22380 -0.61953
2.54266 3.55160 3.41988 3.20394 2.90912 2.54266 2.11360 1.63249 1.11119 0.56252
-2.22237 -3.10421 -2.98908 -2.80035 -2.54266 -2.22237 -1.84735 -1.42685 -0.97121 -0.49166
1.84735 2.58039 2.48469 2.32780 2.11360 1.84735 1.53562 1.18607 0.80732 0.40869
-1.42685 -1.99303 -1.91911 -1.79794 -1.63249 -1.42685 -1.18607 -0.91609 -0.62356 -0.31566
0.97121 1.35659 1.30628 1.22380 1.11119 0.97121 0.80732 0.62356 0.42443 0.21486
-0.49166 -0.68675 -0.66128 -0.61953 -0.56252 -0.49166 -0.40869 -0.31566 -0.21486 -0.10877
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 5/17
Bitwise Input / Output
Our j10 format will write out values to the compressed image file using varying numbers of bits. This will let us save space in the
compressed image. If we have a small value that doesn't need an entire byte, we can store it using just a few bits. Or, if there are
some unused bits in a byte of the output we can use them to store part of a value before starting to use bits in the next byte.
We will always use bits in the output starting from the highest-order bit, then the second-highest-order bit and so on until
reaching the low-order bit of that byte. Then, the next bit we have to store will go in the high-order bit of the next byte. For
example, imagine that we started with an empty output file. If we had to write three bits, 011, to they file, they would go in the
three highest-order bits of the first output byte.
Output after writing three bits
Then, if we had to write out four more bits, say 0101, they would go in the next three bits of the same byte, leaving just the
lowest-order bit unused.
Output after writing seven bits
If we next had to write out three bits, say 111, this would use up the last remaining bit of the first byte and the next two bits
would go in the two high-order bits of the next byte.
Output after writing ten bits
Once all needed bits have been stored, there may still be some left-over bits in the last byte. We can't write just part of a byte to
an output file we'll need to decide what to put in any left-over bits in the last byte before that byte is written out to the file. We'll
always fill any remaining bits with zeros before writing out the last byte of a file.
Encoding Compressed Images
Our j10 image format will start with two 12-bit values giving the height and width of the image (notice, the order is the reverse of
the PGM format). The first 12 bits of the output file will give the height of the image (number of rows) and then 12 more bits for
the width (number of columns). With just 12 bits for each of the these values, this format can support images no larger than
4095 x 4095 pixels.
Image size encoded in the first 12 bits of the j10 format.
After the image size, the j10 file will contain an encoding of each 10x10 block of the image. Blocks will be written out in row- major order, with every block at the top of the image written out left-to-write, then every block below these left-to-right until
reaching the bottom of the image. The figure below shows how a 30x40 image would be handled. It would be encoded as 12
10x10 blocks, with three blocks on each row. Blocks would be encoded to the output one after another in the order shown below
(i.e., in row-major order).
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 6/17
Encoding order for blocks in a 30x40-pixel image.
To encode a 10x10 block, we will first transform it via the DCT. This will yield a 10x10 array of real numbers. We will round each
of these numbers to the nearest integer. This will give us a 10x10 array of integers. The following shows an example of what one
such block might look like. This one represents the 10x10 image-03.pgm given in the starter, it contains just one block. This block
illustrates what the DCT often gives us. In the transformed image block, lots of the values are zero. Most of the non-zero values
are up near the upper-left corner. Our compressed image format will save space by using an (moderately) efficient way to skip
over the zeros in the transformed image block.
DCT of the block in image-03.pgm
In the transformed image block, the value in the upper-left corner represents the overall intensity of the block. We will encode it
first as a 7-bit unsigned value (it should never be negative). In the example above, this value is 50, so we would write it out as
0110010 (that's 50 in binary). Computationally, you won't even need to do any conversion to get the binary representation of this
value. Internally, integers are stored in binary, so we can use the bitwise operators to look at their bitwise representation directly.
Encoding of a 7-bit value for the Y[0][0] value in a block
That takes care of the first value in the transformed image block. For the remaining 99, we'll expect a lot of zero values. We'll
write out 99 bits to indicate which of these values are zero and which are non-zero. For each value in the DCT image block, we'll
write out a 0 bit if that value is zero and a 1 bit if the value is non-zero. Bits for each value will be written out in row-major order,
so we'll write out 9 bits for the 9 remaining values in the first row, then 10 more bits for the 10 bits in the next row, all the way to
the last row of the image block.
The following figure shows how the block shown above would be encoded. It still shows the value of 50 encoded in the first 7 bits.
Then, the next 99 bits represent where the non-zero values are in the remaining elements. The last byte still has 6 unused bits in
it.
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 7/17
Encoding of the Y[0][0] term and a bit for each of the remaining 99 values
Next, we will encode the values for the non-zero elements of the DCT block. These will be done in row-major order, but we will
skip the zero values. These values will may be positive or negative, but they will have smaller magnitudes than the value in the
upper-left corner. We will encode a value using one bit to encode the sign, 0 for positive, 1 for negative. This will be followed by 6
bits giving the magnitude of the value (its absolute value). So, for example, a value of -10 would be encoded as a 1 (since it's
negative), followed by 001010 (the value 10 in binary).
The three non-negative values in the sample DCT block above would be encoded as 1001010 (for the -10 in the top fow), then
0001010 (for the 10 in row 2), then 1001010 (for the -10 in row 2). With these values at the end, the encoding of this entire
block would look like the following figure. There's one bit in the last byte that isn't used. If the image had another block after this
one, that bit would be used as the first bit in encoding the next block. Blocks don't need to start or end at a byte boundary.
Encoding of the entire DCT block shown above
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 8/17
The following figure shows what the test input, image-03.pgm would encode to in the j10 format. It's just a 10x10 image, so it
only contains one block. It starts with two 12-bit numbers for the size of the image, then the encoding for the one block in this
image.
Encoding of the entire image-03.pgm test input
The following shows a larger example. This is the j10 encoding of test input image-04.pgm. To make it fit better on the page, it's
broken into two columns. The image-04.pgm input is 20 pixels wide and 10 pixels tall, so it's encoded as two blocks. It starts with
24 bits for the image size, then the encoding of the first (left-hand) block, then the second (right-hand) one. For this image, there
are two unused bits in the last byte. They should be written out as two low-order zeros.
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 9/17
Encoding of the image-04.pgm test input
Decoding Images
Decoding an image requires reading a compressed image in the j10 format and then writing an image out in raw PGM format. The
first 24 bits give the size the output image should be. After this, each 10x10 block of the image can be decoded from the j10
encoding.
To decode a block, you can fill in the values in a 10x10 block of doubles, Y. The first 7 bits of the block's encoding give the value
in the upper-left corner of Y. Then, the next 99 bits tell where the non-zero values are in the rest of Y. After this, you can read
and store values for each of the non-zero elements of Y in row-major order (be sure to set the rest of the values to zero if
needed).
After filling in the Y array for a block, you can apply the inverse DCT to get a 10x10 array of intensity values, X. Each element of X
can be rounded to the nearest integer to get the pixel intensity. Because we're implementing a lossy compression technique, it's
possible you will get a value that's larger than 255 or less than 0 after rounding. If that happens, just clamp the value to the
0..255 range; if it's larger than 255, set it to 255 instead, and if it's smaller than 0 just set it to zero.
After filling in each 10x10 block of the resulting image, write it out as a PGM file to produce the output.
Extra Credit
To simplify our encode and decode programs, they only need to work with images that are a multiple of 10 in both dimensions.
However, if you'd like up to 5 points of extra credit, you can lift this restriction. If an image isn't a multiple of 10 in size, we can
just pad the image with extra pixels when we're computing the DCT. Then, these extra pixels can be discarded when decoding the
image.
For the extra credit, your encode and decode programs should work for images of any (non-zero) size. If the image isn't the right
size, you will compute the DCT as if it was padded with extra rows and columns of pixels, enough to bring its size up to a multiple
of 10 in both dimensions. Additional rows will just contain copies of the bottom row form the input image. Additional columns will
contain copies of the rightmost column. Any pixels added to the lower-right will contain copies of the pixel in the lower-right
corner of the input image.
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 10/17
Padding the image to get extra pixels needed for the DCT.
The starter includes a few files to help you test the extra credit option, if you choose to implement it. The files image-ec-01.pgm
and image-ec-02.pgm are images that aren't a multiple of 10 in size. The files encoded-ec-01.j10 and encoded-ec-02.j10 are the
compressed output you should get if you compress these images. You can test this part with commands like the following:
./encode image-ec-01.pgm output.j10
echo $?
diff output.j10 encoded-ec-01.j10
There are also a couple of decoded-ec-*.pgm files for testing your decode program support for the extra credit option. You try
these with commands like the following:
./decode encoded-ec-01.j10 output.pgm
echo $?
diff output.pgm decoded-ec-01.pgm
Design
Program Organization
Your solution will use five components. Two of these (encode.c and decode.c) will be top-level components, containing a main
functions for the encode and decode programs. The other three (bits.c/bits.h, image.c/image.h and j10.c/j10.h) will contain
functions and some type definitions to be used by the other components.
The following figure shows the dependency between these components. The top-level programs will use the bits, image and j10
components. The j10 component will use bits and image, but bits and image won't need to use j10 or each other.
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 11/17
Dependency structure among the components
Bits Component
The bits component will support reading and writing files one bit at a time. This is a little tricky, since the C library just provides
mechanisms for reading files one or more bytes at a time.
Your bits component will define two struct types, BitReader and BitWriter. These will contain fields to let you read or write a file
one bit at a time. For reading, the BitReader will contain a buffer with room for (at least) 8 bits. When you need to read the first
bit from a file, it will read a whole byte from an input file. It will return the high-order bit from this byte and save the rest in the
buffer. Then, the next 7 bits can be extracted from the buffer before another byte must be read from the file. Like it says in the in
the Bitwise Input / Output section, bits from each byte of a file will be returned in order from the high-order bit to the lowestorder
bit.
Operation of the BitReader and BitWriter
A similar struct, BitWriter, will be used to help save bits one at a time to a file. Using this struct, you can store bits one at a time
in an internal buffer. Then, once 8 bits have been saved, they can all be written out to the file as a single byte. If all the bits in the
last byte aren't needed, they should be filled with zeros and written to the output file when the BitWriter is closed.
You get to define the BitReader and BitWriter structs yourself. You'll need to store a FILE pointer for the file being read from or
written to. You'll also need other fields to keep up with bits being buffered in the struct.
The bits component will provide the following functions. Of course, you can use other functions if you need them, but be sure to
make any additional functions static if they're only used internally in the component.
BitReader *openBitReader( char const *filename )
This function will open the given file for reading and dynamically allocate a new BitReader struct for that file, initialize it and
return a pointer to it. Client code can use this reader to read bits from the file one at a time.
2021/4/5 CSC230 Project 5 –
https://www.csc2.ncsu.edu/courses/csc230/proj/p5/p5.html 12/17
void closeBitReader( BitReader *reader )
Client code will call this function when it's done using a BitReader. It will free any resources used by the given BitReader,
closing the file it's reading from and freeing any memory the reader was using.
int getBit( BitReader *reader )
Client code will call this function to get the next bit from the file the reader is reading from. If the buffer contains any bits
that haven't been returned yet, it will return the next bit from the buffer. If the buffer is empty, it will read the next byte
from the input file and return the high-order bit from that byte. If there are no more bits to read, this function should return
the constant, EOF, to the caller.
BitWriter *openBitWriter( char const *filename )
This creates a new BitWriter struct, opening a file with the given name, allocating space for the BitWriter and initializing its
fields as needed. It returns a poi