Coursework for ACS61012 “Machine Vision”
The purpose of the lab sessions is to give you both theoretical and practical skills in machine
vision and especially in image enhancement, image understanding and video processing.
Machine vision is essential for a number of areas - autonomous systems, including robotics,
Unmanned Aerial Vehicles (UAVs), intelligent transportation systems, medical diagnostics,
surveillance, augmented reality and virtual reality systems.
The first labs focus on performing operations on images such as reading, writing, calculating
image histograms, flipping images and extracting the important colour and edge
features. You will become familiar with how to use these features for the purposes of object
segmentation (separation of static and moving objects) and for the subsequent high-level tasks of
stereo vision, object detection, classification, tracking and behaviour analysis. These are
inherent steps of semi-supervised and unsupervised systems where the involvement of
human operators is reduced to a minimum or excluded.
Required for Each Subtask
Task 1: Introduction to Machine Vision
For the report from Task 1, you need to present results with:
From Lab Session 1 – Part I
● The Red, Green, Blue (RGB) image histogram of your own picture and an analysis of the
histogram. Several pictures are provided, if you wish to use one of them. Alternatively,
you could work with a picture of your choice. The original picture needs to be shown
as well. Please discuss the results. For instance, what are the differences between the
histograms? What do we learn from the visualised red, green and blue components
of the image histogram?
Files: Lab 1 - Part I - Introduction to Images and Videos.zip and Images.zip. You
can work with one of the provided images from Images.zip or with your own image.
From Lab Session 1 – Part II
● Results with different edge detection algorithms, e.g. Sobel and Prewitt, with comments on
their accuracy for different parameters (the threshold and, especially, different types of
noise). Include the visualisation and your conclusions about static objects
segmentation using edge detection (steps 9-11 with the Sobel, Canny and Prewitt
operators) in your report. Visualise the results and draw conclusions.
[8 marks equally distributed between part I and part II]
Task 2: Optical Flow Estimation Algorithm
For the report, you need to:
● Find corner points and apply the optical flow estimation algorithm.
(use file Lab 2.zip – image Gingerbread Man). Present results for the ‘Gingerbread
Man’ tasks and visualise the results.
[4 marks]
● Track a single point with the optical flow approach (file: Lab 2.zip – the red square
image). Visualise the estimated trajectory and the ground truth track on the last frame
for the ‘Red Square’ task.
● Compute and visualise the root mean square error of the trajectory estimated over
the video frames by the optical flow algorithm. Compare the estimates with the exact
coordinates given in the file called groundtruth. You need to include the results for
one corner only. Give the equation for the root mean square error. Analyse the
results and draw conclusions about the accuracy of the method based on the root
mean square error.
[8 marks]
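One common way to define the root mean square error between an estimated track and the ground truth over N frames is RMSE = sqrt((1/N) * sum over k of [(xe_k - x_k)^2 + (ye_k - y_k)^2]), where (xe_k, ye_k) is the estimate and (x_k, y_k) the ground truth in frame k. The sketch below assumes both tracks are stored as N-by-2 arrays of [x y] coordinates; the variable names and sample numbers are illustrative and are not taken from the groundtruth file.

```matlab
% Hypothetical tracks: one row per frame, columns are [x y] in pixels.
estTrack = [10 10; 12 11; 15 13; 19 16];   % e.g. positions from optical flow
gtTrack  = [10 10; 12 12; 15 14; 18 16];   % e.g. positions from the ground truth

% Squared Euclidean error in each frame, then the root of its mean.
sqErr = sum((estTrack - gtTrack).^2, 2);
rmse  = sqrt(mean(sqErr));

% Cumulative RMSE over the first k frames, useful for plotting error vs frame.
rmsePerFrame = sqrt(cumsum(sqErr) ./ (1:numel(sqErr))');
```

Plotting rmsePerFrame against the frame number is one way to show how the tracking error accumulates over the video.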
Task 3: Automatic Detection of Moving Objects in a Sequence of Video Frames
You are designing algorithms for automatic vehicular traffic surveillance. As part of this
task, you need to apply two types of approaches: the basic frame differencing approach
and the Gaussian mixture approach to detect moving objects.
Part I: with the frame differencing approach
● Apply the frame differencing approach (Lab 3.zip file)
For the report, you need to present:
● Image results of the accomplished tasks
● An analysis of the algorithm's performance when you vary the detection threshold.
[5 marks]
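The basic frame differencing idea can be sketched as follows: subtract the previous frame from the current one, take the absolute value and mark as moving every pixel whose difference exceeds a threshold. The frames below are synthetic and the threshold value is an assumption chosen for illustration; the lab files may organise this differently.

```matlab
% Two synthetic greyscale frames: a bright 3x3 block moves two pixels to the right.
prevFrame = zeros(8, 8, 'uint8');
currFrame = zeros(8, 8, 'uint8');
prevFrame(3:5, 2:4) = 200;
currFrame(3:5, 4:6) = 200;

threshold = 50;   % assumed detection threshold on the 0-255 scale

% Convert to double first so that the subtraction does not saturate at zero.
frameDiff  = abs(double(currFrame) - double(prevFrame));
movingMask = frameDiff > threshold;   % logical mask of "moving" pixels

numMoving = nnz(movingMask);          % number of pixels flagged as moving
```

Note that the column where the block overlaps itself in both frames is not flagged, which illustrates why frame differencing tends to detect only the leading and trailing parts of a slowly moving object.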
Part II: with the Gaussian mixture approach
For the report, you need to present:
● Results for the algorithm performance when you vary parameters such as number
of Gaussian components, initialisation parameters and the threshold for decision
making
● Detection results of the moving objects, show snapshots of images.
● Analyse all results – how does the change of the threshold and number of Gaussian
components affect the detection of the moving objects?
[5 marks]
Task 4: Robot Treasure Hunting
A robot is given the task of searching for and finding “treasures” in imagery
data. There are three tasks: easy, medium and difficult.
The starting point of the robot search is where the red arrow is. For the
medium case the blue fish is the only treasure; for the difficult case the clove and sun are the
“treasures” that need to be found. Ideally, one algorithm needs to be able to find the
“treasures” from all images, although a solution with separate algorithms is acceptable.
For Task 4, in the report, you need to present results with:
● The three different images (easy, medium and difficult) showing the path of finding
“the treasure”.
● Include the intermediate steps of your results in your report, e.g. of the binarisation
of the images and the value of the threshold that you found or any other algorithm
that you propose for the solution of the tasks.
● Explain your solution, present your algorithms and the related MATLAB code.
● Include a brief description of the main idea of your functions in your report and the
actual code of the functions in an Appendix of your report.
In the guidance for the labs, one possible solution is discussed, but others are available.
Creativity is welcome in this task and if you have different solutions, they are welcome.
Here 8 marks are given for the easy task, 10 for the medium task and 12 for the most
difficult task.
[30 marks]
Task 5: Image Classification with a Convolutional Neural Network
1. Provide your classification results with the CNN, demonstrate its accuracy and
analyse the results in your report.
[2 marks]
2. Calculate the precision, recall and F1 score to characterise the CNN performance
further.
[6 marks]
3. Improve the CNN classification results. Please explain how you have achieved the
improvements.
[12 marks]
4. Discuss ethical aspects of Computer Vision tasks such as image classification,
detection and segmentation. Consider ethics in a broad sense: what are the benefits
when ethics is considered? What challenges does ethics pose and how could they
be reduced and mitigated? In your answer you need to include aspects of Equality,
Diversity and Inclusion (EDI).
[10 marks]
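For item 2 of Task 5, the per-class precision, recall and F1 score can be derived from a confusion matrix. The sketch below uses base MATLAB only and hypothetical label vectors; in the coursework the labels would come from the CNN predictions on the test set, and the variable names here are assumptions.

```matlab
% Hypothetical true and predicted class labels for 10 test images (3 classes).
trueLabels = [1 1 1 2 2 2 3 3 3 3];
predLabels = [1 1 2 2 2 3 3 3 3 1];
numClasses = 3;

% Confusion matrix: rows are true classes, columns are predicted classes.
confMat = accumarray([trueLabels(:), predLabels(:)], 1, [numClasses numClasses]);

tp = diag(confMat);                   % correctly classified samples per class
precision = tp ./ sum(confMat, 1)';   % TP / (TP + FP): column sums are predicted counts
recall    = tp ./ sum(confMat, 2);    % TP / (TP + FN): row sums are true counts
f1        = 2 * precision .* recall ./ (precision + recall);
accuracy  = sum(tp) / sum(confMat(:));
```

If a class is never predicted, its column sum is zero and the precision becomes NaN, so such cases need separate handling.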
Finally, the quality of writing and presentation style are assessed. These include the
clarity, conciseness, structure, logical flow, figures, tables, and the use of references.
[10 marks]
Guidance on the Course Work Submission
You need to submit your report and the code that you have written to accomplish the tasks.
Report and Code Submission
There are two separate submission links on Blackboard: 1) for your coursework report in PDF
format and 2) for the requested code in a zipped file.
A Well-written Report Contains:
● A title page, including your ID number, course name, etc., followed by a contents page.
● The main part: description of the tasks and how they are performed, including results
from all subtasks. For instance: “This report presents results on reading and writing
images in MATLAB. Next, the study of different edge detection algorithms is presented
and their sensitivity to different parameters…” You are requested to present in
Appendices the MATLAB code that you have written to obtain these results. A very
important part of your report is the analysis of the results. For instance, what does the
image histogram tell you? How can you characterise the results? Are they accurate? Is
there a lot of noise?
● Conclusions describe briefly what has been done, with a summary of the main
results.
● Appendix: Present and briefly describe in an Appendix the code only for tasks 2-5.
Add comments to your code to make it understandable. Provide the full code
as one compressed file, in the separate submission link given for it.
● Cite all references and materials used. Adding references demonstrates additional
independent study. Write in your own style and words to avoid similarities.
Every student needs to write their own independent report.
● Please name the files with your report and code for the submission on Blackboard by
adding your ID card registration number, e.g. CW_Report_1101133888 and
CW_Code_1101133888.
The advisable maximum number of words is 4000.
Submission Deadline: Week 10 of the spring semester, Sunday midnight
Guidance to Accomplish the Tasks
Lab Session 1 - Part I: Introduction to Image Processing
In this lab you will learn how to perform basic operations on images of different types, e.g.
how to read them, convert them from one format to another, calculate image histograms and
analyse them.
Background Knowledge
A digital image is composed of pixels, which can be thought of as small dots on the screen.
By default, numeric calculations in MATLAB are performed using double-precision (64-bit)
floating-point numbers, so double is also a frequent data class in image
processing. Some of the most common formats used in image processing are presented in
Tables 1 and 2 given below.
Most MATLAB functions work with double arrays. To reduce memory requirements, MATLAB
supports storing image data in arrays of class uint8 and uint16. The data in these arrays is
stored as 8-bit or 16-bit unsigned integers. Such arrays require, respectively, one-eighth or
one-quarter as much memory as data in double arrays.
Table 1. Data classes and their ranges
Arithmetic on uint8 and uint16 arrays is rounded to integers and saturates at the limits of
the type, and some mathematical functions do not accept integer inputs. It is
therefore often necessary to convert to double for calculations and back to uint8/uint16 for
storage, display and printing.
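A short illustration of why the conversion matters, using scalar values for clarity (the same behaviour applies element by element to image arrays). The numbers are arbitrary examples.

```matlab
a = uint8(200);
b = a + 100;          % integer arithmetic saturates: the result is 255, not 300
c = a - 250;          % saturates at the lower limit: the result is 0, not -50

d = double(a) + 100;  % converting first keeps the true value, 300
e = uint8(d / 2);     % back to uint8 for storage or display: 150
```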
Table 2. Numeric formats used in image processing
Image Types
I. Intensity Image (Grey Scale Image)
This form represents an image as a matrix where every element has a value corresponding
to how bright/ dark the pixel at the corresponding position should be coloured. There are
two ways to represent the brightness of the pixel:
1. The double class (or data type) format. This assigns a floating-point number ("a number
with decimals") in the range -10^308 to +10^308 to each pixel. For images, values of class
double are scaled to the range [0,1]. The value 0 corresponds to black and the value 1
corresponds to white.
2. The other class uint8 assigns an integer between 0 and 255 to represent the intensity
of a pixel. The value 0 corresponds to black and 255 to white. The class uint8 only
requires roughly 1/8 of the storage compared to the class double. However, many
mathematical functions can only be applied to the double class.
II. Binary Image
The binary image format also stores an image as a matrix but can only colour a pixel black
or white (and nothing in between): 0 is for black and 1 is for white.
III. Indexed Image
This is a practical way of representing colour images. An indexed image stores an image as
two arrays. The first array has the same size as the image and holds one number for each pixel.
The second array (matrix) is called the colour map and its size may be different from the image
size. Each number in the first array is an index that selects which row of the colour map
gives the colour of that pixel.
IV. RGB Image
This format represents an image with three matrices whose sizes match the image size.
Each matrix corresponds to one of the colours red, green or blue and gives an instruction
of how much of each of these colours a certain pixel should use. Colours are always
represented with non-negative numbers.
Guidance on Performing Lab Session 1 – Part I
Demos in MATLAB
>> demo MATLAB % Opens a window from which you can select a demo for different tools
Workspace and saving results
To see the variables in the workspace: who, whos
To clear the variables in the workspace: clear
To save the variables in the workspace: save name_of_a_file.mat
To load the data/ image from a file: load name_of_a_file.mat
Examples of Reading images in MATLAB
>> clear all % Clears the workspace in MATLAB
>> I = imread('Dog.jpg'); % Reads the image file into an array
>> size(I) % Gives the size of the image
>> imshow(I); % Visualises the image
>> Ig = rgb2gray(I); % Converts a colour image into a grey level image
>> imshow(Ig)
1. The first line clears all variables from the workspace.
2. The second line reads the image file into a 3-dimensional array (rows, columns, colour
channels). MATLAB can read many image file formats, so you do not have to worry about the details.
3. The third line returns the size of the image.
4. The fourth line visualises the colour image.
5. The fifth line converts an RGB image into a grey level image. This is not necessary if the image
is already a grey level image.
6. The last line visualises the grey level image.
Writing Images in MATLAB
Images are written to disk using the function imwrite, which has the following basic syntax:
imwrite(I,'filename')
The string in filename must include a recognised file format extension (tiff, jpeg, gif, bmp,
png or xwd).
>> imwrite(I,'Dog1.jpg'); % Writes the image I to the file Dog1.jpg
Next, you can check the information about the graphics file, by using imfinfo.
Type: imfinfo Dog.jpg
Use the commands whos and ls to list the variables in the workspace and the files in the current folder.
Changing the Image Brightness
Change the brightness of your image by adding a constant value to all pixel values or by
subtracting a constant value from all pixel values. For instance:
>> I_b = I - 100;
>> figure, imshow(I_b)
>> I_s = I + 100;
>> figure, imshow(I_s)
Flipping an Image
Apply flipLtRt.m function (provided) to your image to flip an image. Visualise the results.
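The provided flipLtRt.m is not reproduced here. As a point of comparison only, base MATLAB's flip function mirrors an array along a chosen dimension; the sketch below applies it to a small synthetic RGB array so the effect is easy to check.

```matlab
% Small synthetic 2x3 RGB "image" so the effect is easy to inspect.
I = reshape(uint8(1:18), 2, 3, 3);

flippedLR = flip(I, 2);   % reverse the column order (left-right mirror)
flippedUD = flip(I, 1);   % reverse the row order (up-down mirror)
```

Comparing the output of the provided function with flip(I, 2) on your own image is a quick sanity check.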
Detection of an Area of a Predefined Colour
Change the colour of the white pixels of an image to yellow on the image
'duckMallardDrake.jpg':
% Color the duck yellow!
im = imread('duckMallardDrake.jpg');
imshow(im);
[nr, nc, np] = size(im);
newIm = zeros(nr, nc, np);
newIm = uint8(newIm);
for r = 1:nr
    for c = 1:nc
        if ( im(r,c,1)>180 && im(r,c,2)>180 && im(r,c,3)>180 )
            % white feather of the duck; now change it to yellow
            newIm(r,c,1) = 225;
            newIm(r,c,2) = 225;
            newIm(r,c,3) = 0;
        else % the rest of the picture; no change
            for p = 1:np
                newIm(r,c,p) = im(r,c,p);
            end
        end
    end
end
figure
imshow(newIm)
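The same recolouring can be written without loops by building a logical mask of the near-white pixels. This is an alternative sketch, not part of the lab files, and it uses a tiny synthetic image in place of the duck picture so that the result can be inspected by hand.

```matlab
% Synthetic 2x2 RGB image standing in for the duck picture:
% one near-white pixel and three others.
im = zeros(2, 2, 3, 'uint8');
im(1, 1, :) = [250 240 230];   % "white feather" pixel
im(1, 2, :) = [200 100  50];
im(2, 1, :) = [ 20  30  40];
im(2, 2, :) = [181 181 100];   % blue channel too low to count as white

% Logical mask of pixels whose three channels all exceed the threshold.
whiteMask = all(im > 180, 3);

% Copy each channel and overwrite the masked pixels with yellow.
r = im(:, :, 1);  g = im(:, :, 2);  b = im(:, :, 3);
r(whiteMask) = 225;
g(whiteMask) = 225;
b(whiteMask) = 0;
newIm = cat(3, r, g, b);
```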
Another example of finding an area of a predefined colour: find the indices of the yellow
pixels in the image ‘Two_colour.jpg’.
im = imread('Two_colour.jpg'); % read the image
imshow(im);
% extract the RGB channels separately
red_channel = im(:, :, 1);
green_channel = im(:, :, 2);
blue_channel = im(:, :, 3);
% label pixels of yellow colour
yellow_map = green_channel > 150 & red_channel > 150 & blue_channel < 50;
% extract the pixel indices (row and column of every yellow pixel)
[i_yellow, j_yellow] = find(yellow_map > 0);
Visualise the results. Note that the plot and scatter commands work with spatial coordinates
(x, y), so the column indices are passed first.
% visualise the results
figure;
imshow(im); % show the image
hold on;
scatter(j_yellow, i_yellow, 5, 'filled') % highlight the yellow pixels
Conversion between Different Formats
1. Select your own image.
2. Read a colour image (imread command). Convert your RGB colour image to grey and
then to HSV format (rgb2gray and rgb2hsv commands, respectively).
3. Convert your RGB image into a binary format (im2bw command) and visualise the
result. Use at least 3 more operations converting images from one format to another.
This part is not required for the report, as mentioned in the assessment criteria section.
The conversion to a binary image is called binarisation. Binarisation is based on applying
a threshold to the image intensity and the process is called thresholding. The output binary
image has the value 0 (black) for all pixels in the input image with luminance less than
the threshold level and 1 (white) for all other pixels.
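Thresholding as described above amounts to a single comparison. The sketch below does it by hand in base MATLAB on a tiny synthetic greyscale image, with an assumed threshold of 128 on the 0-255 scale; the treatment of pixels exactly equal to the threshold may differ from that of im2bw.

```matlab
grey  = uint8([ 10 100; 128 250]);  % tiny synthetic greyscale image
level = 128 / 255;                  % normalised threshold in [0, 1]

% Pixels below the threshold become 0 (black), all others 1 (white).
bw = double(grey) / 255 >= level;
```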
Understanding Image Histogram
1. Experiment with a grey scale image, calculate the histogram and visualise it. There are
various ways to plot an image histogram: imhist, bar, stem and plot.
Show results with them. What could you say about the dominating colours of the objects/
images from the histograms?
Example Code:
clear all
I = imread('image.jpg');
Im_grey = rgb2gray(I);
figure, imhist(Im_grey);
xlabel('Number of bins (256 by default for a greyscale image)')
ylabel('Histogram counts')
You can use the bar function to plot the image histogram, in the following way:
h = imhist(Im_grey);
h1 = h(1:10:256);
horz = 1:10:256;
figure, bar(horz,h1)
See the difference compared with what the plot function gives you:
figure, plot(h)
2. Calculate and visualise the histogram of an RGB image
In MATLAB you can only use the built-in hist function on one channel at a time. One way to display
the histogram of an image is to convert it into a greyscale format with rgb2gray and apply
the imhist function. Another approach is to work with the RGB image in the following way.
First, we convert the image into double format and then calculate the histogram for each channel:
r = double(I(:,:,1));
g = double(I(:,:,2));
b = double(I(:,:,3));
figure, hist(r(:),124)
title('Histogram of the red colour')
figure, hist(g(:),124)
title('Histogram of the green colour')
figure, hist(b(:),124)
title('Histogram of the blue colour')
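As an alternative to three separate figures, the channel histograms can be computed with histcounts and overlaid in one plot, which makes the differences between the channels easier to compare. This is a sketch on a synthetic image; one bin per intensity level is an assumption, and any other bin count can be used.

```matlab
% Synthetic 4x4 RGB image: strong red, medium green, no blue.
I = zeros(4, 4, 3, 'uint8');
I(:, :, 1) = 255;
I(:, :, 2) = 128;

edges  = 0:256;                      % one bin per intensity level 0..255
counts = zeros(256, 3);
for ch = 1:3
    channel = double(I(:, :, ch));
    counts(:, ch) = histcounts(channel(:), edges);
end

% Overlay the three histograms in matching colours.
figure;
plot(0:255, counts(:, 1), 'r', 0:255, counts(:, 2), 'g', 0:255, counts(:, 3), 'b');
xlabel('Intensity level'); ylabel('Pixel count');
legend('Red', 'Green', 'Blue');
```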
Now repeat the binarisation process after choosing the threshold value
appropriately, based on the histogram that you observe. This threshold value must be
normalised to the range [0, 1] to be used with the function im2bw.
Example: If we choose the midpoint 128 of the full range [0, 255] as the threshold, then
you can perform binarisation of the image I with the function im2bw:
ImBinary = im2bw(I,128/255);
Vary the threshold and comment on the results.
3. Calculate and visualise the histogram of an HSV image
For an HSV histogram you can use the same recommendation as for an RGB histogram,
given above. Another way of calculating the histogram in the HSV space is given below.
% Read the image (replace 'image.jpg' with your own file) and display it.
rgbImage = imread('image.jpg');
subplot(2, 4, 1);
imshow(rgbImage, [ ]);
title('Original RGB image');
% Convert to HSV color space
hsvimage = rgb2hsv(rgbImage);
% Extract out the individual channels.
hueImage = hsvimage(:,:,1);
satImage = hsvimage(:,:,2);
valueImage = hsvimage(:,:,3);
% Display the individual channels.
subplot(2, 4, 2);
imshow(hueImage, [ ]);
title('Hue Image');
subplot(2, 4, 3);
imshow(satImage, [ ]);
title('Saturation Image');
subplot(2, 4, 4);
imshow(valueImage, [ ]);
title('Value Image');
% Take histograms (18 bins for hue, 3 bins each for saturation and value)
[hCount, hValues] = imhist(hueImage(:), 18);
[sCount, sValues] = imhist(satImage(:), 3);
[vCount, vValues] = imhist(valueImage(:), 3);
% Plot histograms.
subplot(2, 4, 5);
bar(hValues, hCount);
title('Hue Histogram');
subplot(2, 4, 6);
bar(sValues, sCount);
title('Saturation Histogram');
subplot(2, 4, 7);
bar(vValues, vCount);
title('Value Histogram');
% Alert user that we're done.
message = sprintf('Done processing this image.\nMaximize and check out the figure window.');
msgbox(message);
Include the results of understanding the RGB image histogram in your report.
Understanding image histogram – the difference between one-colour and two-colour
images
An image histogram is a good tool for image understanding. For example, image histograms
can be used to distinguish a one-colour image (or an object in the image) from a two-colour
image (or an object in the image):
1. Read ‘One_colour.jpg’ and ‘Two_colour.jpg’ (with imread);
2. Convert both images into the greyscale format (with rgb2gray);
3. Calculate and visualise the histograms for both images (with imhist);
What are the differences between these histograms? What do we learn from the
visualised red, green and blue components of the image histogram?
Lab Session 1 - Part II: Edge Detection and Segmentation of Static Objects
In this practical session, you will continue to study basic image processing techniques. You
will enhance the contrast of images and perform different operations on them. You will learn
how to model different types of noise in images and how to remove the noise from an image.
You will also learn approaches for edge detection and static objects segmentation.
Guidance on Performing Lab Session 1 – Part II
1. Read a previously chosen image ‘Image.gif’ (with imread);
Contrast Enhancement
2. Compute an image histogram for the image (imhist). Visualise the results. Analysing
the histogram, think about the best way of enhancing the image and recall the methods
from the lectures;
3. Apply the histogram equalisation operation to the image (histeq). Visualise the results.
Compute an image histogram for the corrected image. Visualise the results. Compare it
with the original histogram. Does this method of enhancement actually enhance image
quality?
4. Apply gamma correction to the image (imadjust). Visualise the
results. Experiment with different values for gamma and find the optimal one. Compute
the image histogram of the corrected image. Visualise the results. Compare the
histogram and the image with the original ones and with the results of the histogram
equalisation. Which method of enhancement performs better?
Images with Different Types of Noise and Image Denoising
5. Synthesise two images from the image ‘Image.gif’ with two types of noise – Gaussian
and “salt and pepper” (imnoise). Visualise the results;
6. Apply the Gaussian filter to the image with Gaussian noise (imgaussfilt). Find the optimal
filter parameter values. Visualise the results;
7. Apply the Gaussian filter to the image with salt and pepper noise (imgaussfilt). Visualise
and discuss the results;
8. Apply the median filter to the image with salt and pepper noise (medfilt2). Find the
optimal filter parameter values. Visualise the results;
Static Objects Segmentation by Edge Detection
9. Find edges in the image ‘Image.gif’ with the Sobel operator (edge(…, 'sobel', …)).
Vary the threshold parameter value and draw conclusions about its influence on the
quality of the segmented image. Visualise the results with the optimal threshold value;
10. Repeat step 9 with the Canny operator (edge(…, 'canny', …));
11. Repeat step 9 with the Prewitt operator (edge(…, 'prewitt', …)).
Include the resulting images with segmented objects and add conclusions about static
objects segmentation using edge detection methods (from steps 9-11) in your report.
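Steps 5-11 above can be sketched on a synthetic image as follows. The code assumes the Image Processing Toolbox is installed; the noise levels, filter sizes and the manual Sobel threshold are arbitrary starting values chosen for illustration, not recommended settings, and a synthetic square is used instead of ‘Image.gif’.

```matlab
% Synthetic test image: a bright square on a dark background.
img = zeros(64, 64, 'uint8');
img(17:48, 17:48) = 200;

% Steps 5-8: add noise, then filter it.
noisyGauss  = imnoise(img, 'gaussian', 0, 0.01);    % zero mean, variance 0.01
noisySP     = imnoise(img, 'salt & pepper', 0.05);  % about 5% of pixels corrupted
smoothed    = imgaussfilt(noisyGauss, 1.5);         % Gaussian filter, sigma = 1.5
medFiltered = medfilt2(noisySP, [3 3]);             % 3x3 median filter

% Steps 9-11: edge maps; with no threshold argument, edge chooses one automatically.
bwSobel   = edge(img, 'sobel');
bwPrewitt = edge(img, 'prewitt');
bwCanny   = edge(img, 'canny');

% A manually chosen Sobel threshold (assumed value) on the noisy image, for comparison.
bwSobelNoisy = edge(noisyGauss, 'sobel', 0.1);

figure;
subplot(2, 2, 1); imshow(noisySP);      title('Salt and pepper noise');
subplot(2, 2, 2); imshow(medFiltered);  title('After median filter');
subplot(2, 2, 3); imshow(bwSobel);      title('Sobel, clean image');
subplot(2, 2, 4); imshow(bwSobelNoisy); title('Sobel, noisy image');
```

Repeating the last edge call with several threshold values, and with the filtered images as input, gives the comparison that the report asks for.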
Lab Session 2: Object Motion Detection & Tracking
This lab session is focused on motion detection and tracking in video sequences. You will
apply the optical flow algorithm to object tracking by using corner points. The optical flow
calculates the motion of image pixels from one frame to another.
You will apply the optical flow algorithm to the “interesting” corner points only since the
numerical stability of the algorithm is guaranteed in these points only.
You need to find first the “interesting” points, and then apply an optical flow algorithm only
to them.
Background Knowledge
Corner Points
In many applications of image and video processing it is easier to work with “features”
(“characteristic points” or “local feature points”) rather than with all pixels of a frame. These
“features” or “points” should differ from their neighbours in some area.
Corner points are an example of such features. A corner point is a point whose
neighbourhood differs from the neighbourhoods of all points adjacent to it. Figure 2.1 shows an
example of three types of points: 1) a top corner point, 2) an edge point and 3) a point
inside the object (internal point).
● The corner point is surrounded with the solid line square and its neighbour point is
surrounded by the dotted square. The corner point and its neighbour point have
different surrounding areas.
● For the edge point its surrounding is the same as the surroundings of its neighbour
point in one direction and it is different in any other direction.
● The internal point is surrounded by the same neighbourhood as all other near points
around it.
Figure 2.1. Illustration of the difference between corner, edge and internal points of an object.
Please note that the analysed points are surrounded with a square and the dotted square indicates
the area around neighbour points.
One of the most popular methods for detecting corner points is the Harris corner detector.
It is used by default in the MATLAB function corner.
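To make the idea concrete, the sketch below computes a Harris-style corner response R = det(M) - k * trace(M)^2 in base MATLAB, where M is the matrix of locally averaged products of the image gradients. It is a simplified illustration, not the implementation behind the corner function: a box window replaces the usual Gaussian weighting, and the window size and k = 0.04 are assumed values.

```matlab
% Synthetic image: a white square whose top-left corner is at row 11, column 11.
img = zeros(40, 40);
img(11:30, 11:30) = 1;

[Ix, Iy] = gradient(img);        % horizontal and vertical image gradients
w = ones(5) / 25;                % 5x5 averaging window (assumed size)

% Locally averaged entries of the structure matrix M.
Sxx = conv2(Ix.^2,    w, 'same');
Syy = conv2(Iy.^2,    w, 'same');
Sxy = conv2(Ix .* Iy, w, 'same');

k = 0.04;                        % commonly used sensitivity constant
R = (Sxx .* Syy - Sxy.^2) - k * (Sxx + Syy).^2;

cornerResponse   = R(11, 11);    % at a corner of the square
edgeResponse     = R(20, 11);    % in the middle of the left edge
interiorResponse = R(20, 20);    % inside the square
```

The response is positive at the corner, negative on the edge and zero in the flat interior, which matches the three types of points in Figure 2.1.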
The Optical Flow Approach
An optical flow can be represented as a vector field of apparent pixel motion between
frames. Optical flow estimation is one of the most widely used methods for motion detection in robotics
and computer vision. Given two images I1 and I2, optical flow estimation algorithms can find
the vector field:
$$\{(u_{i,j},\, v_{i,j})\}, \quad i = 1, \dots, N, \; j = 1, \dots, M,$$
where [N, M] is the image size. The vector field contains a displacement vector for each pixel:
pixel (i, j), located at (x, y) in the image I1, will have location $(x + u_{i,j},\, y + v_{i,j})$ in the image I2.
There are many different methods for optical flow estimation. The Lucas-Kanade algorithm
is one of the most popular algorithms. This lab considers only the Lucas-Kanade algorithm.
It has the following assumptions:
1. Brightness (colour) consistency. It means that pixels do not change their colour
between frames.
2. Spatial similarity. It means that neighbours of each pixel have similar motion
vectors.
3. Small displacement. This means that the displacement or motion vectors are small
and a Taylor series expansion can be applied.
With these assumptions in place, the calculation of the optical flow reduces to solving an
overdetermined linear system, which is done by the least squares method. The conditions
for solving this overdetermined linear system lead to the Lucas-Kanade algorithm. You will
apply the Lucas-Kanade algorithm to the “interesting” (“feature”) points only.
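The least squares step can be illustrated for a single window. Under the three assumptions, every pixel in the window contributes one equation Ix * u + Iy * v = -It, where Ix and Iy are the spatial gradients and It is the difference between the frames. The sketch below builds this system for a synthetic image pair with a known shift and solves it with the backslash operator. The image, the shift and the window are made-up values chosen so that the result can be checked; this is not the code used inside opticalFlowLK.

```matlab
% Synthetic smooth image and a copy shifted by a small, known displacement.
[x, y] = meshgrid(-10:10, -10:10);
uTrue = 0.2;  vTrue = -0.1;
I1 = x.^2 + 2 * y.^2;
I2 = (x - uTrue).^2 + 2 * (y - vTrue).^2;   % I1 moved by (uTrue, vTrue)

[Ix, Iy] = gradient(I1);   % spatial gradients of the first frame
It = I2 - I1;              % temporal difference between the frames

% One equation per pixel of the window: [Ix Iy] * [u; v] = -It.
win = 6:16;                % 11x11 window centred on the image (assumed size)
A = [reshape(Ix(win, win), [], 1), reshape(Iy(win, win), [], 1)];
b = -reshape(It(win, win), [], 1);

flowEst = A \ b;           % least squares solution [u; v]
```

For a window with almost constant gradient direction (an edge) or no gradient at all (a flat region), the matrix A is close to rank deficient, which is why the algorithm is applied at corner points.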
Tracking with the optical flow
Object tracking is the process of object localisation and association of its location on the
current frame with the previous ones, building a trajectory for each object.
Optical flow estimation algorithms provide a tool to calculate a displacement vector from one
frame to another. This information can be used for tracking purposes. Indeed, if we
determine the point of interest in the first frame, we can compute a displacement vector for
it for every successive frame, using an optical flow estimation algorithm. The combination of
the positions of the points, computed by displacement vectors constitutes the trajectory of
this point.
If we want to track a non-point object, we can find “interesting” points on the object, track
them and use a median position of the “interesting” points as a position for the object. Since
optical flow estimation algorithms are not perfect and can lose tracking points, one should
reinitialise “interesting” points from time to time. At any time instant, the introduced
“interesting” points should satisfy the following constraints:
● A point should not be far from the current median position of the object – it has to be
inside the current bounding box;
● A point should be on the object – in your task you will use colour for this constraint;
● Each pair of tracking points has to differ from each other – if two points are too close
to each other, one of them will be deleted.
As a result, we have the following algorithm:
1. Build a colour template of the object in the first frame.
2. If necessary (in your object detection task) read the next frame.
3. Detect “interesting” points of the object in the current frame. Make sure they are
satisfying all the constraints, mentioned above.
4. Initialise tracks with detected and filtered “interesting” points.
5. Compute an optical flow for every “interesting” point between successive frames
6. Compute new positions of the tracks by adding the optical flow vectors to the current
positions in the tracks.
7. Make sure the new positions of the tracks satisfy the second and third constraints,
mentioned above. If not, delete those tracks.
8. Compute the median position of the new positions of the tracks. Move the bounding
box to the new median position.
9. Make sure the new positions of the tracks are inside the bounding box. If not, delete
those tracks.
10. Repeat steps 5-9. Introduce new “interesting” points of the object every k
frames.
It is recommended to use k = 5.
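The filtering of candidate points against the first and third constraints, followed by the median position used in step 8, can be sketched as below. The points, the bounding box and the minimum spacing are hypothetical values. The colour test of the second constraint needs the image itself and is left out. The subtraction of a row vector from a matrix relies on implicit expansion, which requires MATLAB R2016b or later.

```matlab
% Hypothetical candidate points, one [x y] row per point, and a bounding box
% in the [x y width height] format with (x, y) the top-left corner.
points  = [12 15; 14 18; 14.3 18.2; 40 45; 20 22];
bbox    = [10 10 15 15];
minDist = 1;    % assumed minimum spacing between tracked points, in pixels

% First constraint: keep only points inside the bounding box.
inside = points(:, 1) >= bbox(1) & points(:, 1) <= bbox(1) + bbox(3) & ...
         points(:, 2) >= bbox(2) & points(:, 2) <= bbox(2) + bbox(4);
kept = points(inside, :);

% Third constraint: drop a point that lies too close to one already selected.
selected = zeros(0, 2);
for k = 1:size(kept, 1)
    d = sqrt(sum((selected - kept(k, :)).^2, 2));
    if all(d >= minDist)
        selected(end + 1, :) = kept(k, :); %#ok<AGROW>
    end
end

% The median of the surviving points serves as the object position.
objectPos = median(selected, 1);
```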
Optical Flow Estimation and Visualisation with MATLAB
MATLAB provides an object for optical flow estimation – opticalFlowLK
(http://uk.mathworks.com/help/vision/ref/opticalflowlk-class.html).
To estimate the optical flow you will use the command estimateFlow
(http://uk.mathworks.com/help/vision/ref/opticalflowlk.estimateflow.html).
videoReader = VideoReader('…');
frameRGB = readFrame(videoReader);
frameGrey = rgb2gray(frameRGB);
opticFlow = opticalFlowLK('NoiseThreshold',0.009);
flow = estimateFlow(opticFlow,frameGrey);
You can use the following fields of the flow object:
● flow.Vx – the horizontal component of the velocity. size(flow.Vx) ==
size(frameGrey). flow.Vx(i, j) is the horizontal component of the velocity of the pixel
(i, j).
● flow.Vy – the vertical component of the velocity. size(flow.Vy) == size(frameGrey).
flow.Vy(i, j) is the vertical component of the velocity of the pixel (i, j).
You need the Computer Vision Toolbox (formerly the Computer Vision System Toolbox) from MATLAB.
For visualisation of the optical flow there are several options:
1. with the command plot
(http://uk.mathworks.com/help/vision/ref/opticalflow.plot.html)
2. with the command quiver(u, -v, 0), where u, v are the horizontal and vertical
displacements, respectively. Note, that it may take some time to visualise the
results on your Figure.
*Moving a bounding box to a new position – help for the provided
function
In the object tracking task you could move a bounding box around an object to a new position
between frames. The function ShiftBbox could help perform this task.
The function ShiftBbox has two input arguments:
● input_bbox – the current bounding box, given as a 1 x 4 vector.
input_bbox(1:2) are the spatial coordinates of the top-left corner of the
bounding box, input_bbox(3) is the horizontal size of the bounding box and
input_bbox(4) is the vertical size of the bounding box;
● new_center – the new position of the centre of the bounding box in spatial
coordinates
The function ShiftBbox has one output:
● shifted_bbox – the updated bounding box in the same format as the input_bbox
argument. The centre of the updated bounding box is equal to the new_center input
parameter
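The provided ShiftBbox function is not reproduced here. The sketch below is a hypothetical stand-in that follows the input and output description above: it keeps the width and height and moves the top-left corner so that the box is centred on the new position. The name shiftBboxSketch and the sample numbers are assumptions.

```matlab
% Hypothetical stand-in for ShiftBbox, written as an anonymous function.
% bbox is [x y width height]; newCenter is [x y] in spatial coordinates.
shiftBboxSketch = @(bbox, newCenter) [newCenter(:).' - bbox(3:4) / 2, bbox(3:4)];

input_bbox   = [10 20 40 30];
new_center   = [100 100];
shifted_bbox = shiftBboxSketch(input_bbox, new_center);   % [80 85 40 30]
```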
Guidance for Performing Lab Session on Optical Flow
1. You can find corner points (with the corner MATLAB function) on the images
‘red_square_static.jpg’ and ‘GingerBreadMan_first.jpg’. Note that the corner
function works with greyscale images, so you first need to convert the input images to
the greyscale format. Next, you can apply the function with different values of the maximum
number of corners. Include the resulting images in your report. You need to show
the results for only one value of the maximum number of corners.
2. Find optical flow of the pixels which moved from the image
‘GingerBreadMan_first.jpg’ to the image ‘GingerBreadMan_second.jpg’
(opticalFlowLK, estimateFlow). Note that the estimateFlow functio
The purpose of the lab sessions is to give you both theoretical and practical skills in machine
vision and especially in image enhancement, image understanding and video processing.
Machine vision is essential for a number of areas - autonomous systems, including robotics,
Unmanned Aerial Vehicles (UAVs), intelligent transportation systems, medical diagnostics,
surveillance, augmented reality and virtual reality systems.
The first labs focus on performing operations on images such as reading, writing calculating
image histograms, flipping images and extracting the important colour and edges image
features. You will become familiar how to use these features for the purposes of object
segmentation (separation of static and moving objects) and for the next high-level tasks of
stereo vision, object detection, classification, tracking and behaviour analysis. These are
inherent steps of semi-supervised and unsupervised systems where the involvement of the
human operators reduces to minimum or is excluded.
Required for Each Subtask
Task 1: Introduction to Machine Vision
For the report from Task 1, you need to present results with:
From Lab Session 1 – Part I
● The Red, Green, Blue (RGB) image histogram of your own picture and an analysis of the
histogram. Several pictures are provided, if you wish to use one of them. Alternatively,
you could work with a picture of your choice. The original picture needs to be shown
as well. Please discuss the results. For instance, what are the differences between the
histograms? What do we learn from the visualised red, green and blue components
of the image histogram?
Files: Lab 1 - Part I - Introduction to Images and Videos.zip and Images.zip. You
can work with one of the provided images from Images.zip or with your own image.
From Lab Session 1 – Part II
● Results with different edge detection algorithms, e.g. Sobel and Prewitt, and comments on
their accuracy with different parameters (the threshold and, especially, different types of
noise). Include the visualisation and your conclusions about static objects
segmentation using edge detection (steps 9-11 with the Sobel, Canny and Prewitt
operators) in your report. Visualise the results and draw conclusions.
[8 marks equally distributed between part I and part II]
Task 2: Optical Flow Estimation Algorithm
For the report, you need to:
● Find corner points and apply the optical flow estimation algorithm.
(use file Lab 2.zip – image Gingerbread Man). Present the results for the ‘Gingerbread
Man’ tasks and visualise them.
[4 marks]
● Track a single point with the optical flow approach (file: Lab 2.zip – the red square
image). For the ‘Red Square’ tasks, visualise the estimated trajectory and the ground
truth track on the last frame.
● Compute and visualise the root mean square error of the trajectory estimated over
the video frames by the optical flow algorithm. Compare the estimates with the exact
coordinates given in the file called groundtruth. You need to include the results for
only one corner. Give the equation for the root mean square error. Analyse the
results and draw conclusions about the accuracy of the method based on the root
mean square error.
[8 marks]
Task 3: Automatic Detection of Moving Objects in a Sequence of Video Frames
You are designing algorithms for automatic vehicular traffic surveillance. As part of this
task, you need to apply two types of approaches: the basic frame differencing approach
and the Gaussian mixture approach to detect moving objects.
Part I: with the frame differencing approach
● Apply the frame differencing approach (Lab 3.zip file)
For the report, you need to present:
● Image results of the accomplished tasks
● An analysis of the algorithm’s performance as you vary the detection threshold.
[5 marks]
Part II: with the Gaussian mixture approach
For the report, you need to present:
● Results for the algorithm performance when you vary parameters such as number
of Gaussian components, initialisation parameters and the threshold for decision
making
● Detection results of the moving objects, show snapshots of images.
● Analyse all results – how does the change of the threshold and number of Gaussian
components affect the detection of the moving objects?
[5 marks]
Task 4: Robot Treasure Hunting
A robot is given a task to search and find “treasures” in imagery
data. There are three tasks: easy, medium and difficult. The starting point of the robot
search is where the red arrow is. For the
medium case the blue fish is the only treasure, for the difficult case the clove and sun are
“treasures” that need to be found. Ideally, one algorithm needs to be able to find the
“treasures” from all images, although a solution with separate algorithms is acceptable.
For Task 4, in the report, you need to present results with:
● The three different images (easy, medium and difficult) showing the path of finding
“the treasure”.
● Include the intermediate steps of your results in your report, e.g. of the binarisation
of the images and the value of the threshold that you found or any other algorithm
that you propose for the solution of the tasks.
● Explain your solution, present your algorithms and the related MATLAB code.
● Include a brief description of the main idea of your functions in your report and the
actual code of the functions in an Appendix of your report.
In the guidance for the labs, one possible solution is discussed, but others are available.
Creativity is welcome in this task and if you have different solutions, they are welcome.
Here 8 marks are given for the easy task, 10 for the medium task and 12 for the most
difficult task.
[30 marks]
Task 5. Image Classification with a Convolutional Neural Network
1. Provide your classification results with the CNN, demonstrate its accuracy and
analyse the results in your report.
[2 marks]
2. Calculate the Precision, Recall and F1 score to further characterise the
CNN performance.
[6 marks]
3. Improve the CNN classification results. Please explain how you have achieved the
improvements.
[12 marks]
4. Discuss ethics aspects in Computer Vision tasks such as image classification,
detection and segmentation. Consider ethics in broad terms: what are the benefits
when ethics is considered? What challenges does ethics pose and how could they
be reduced and mitigated? In your answer you need to include aspects of Equality,
Diversity and Inclusion (EDI).
[10 marks]
Finally, the quality of writing and presentation style are assessed. These include the
clarity, conciseness, structure, logical flow, figures, tables, and the use of references.
[10 marks]
Guidance on the Course Work Submission
You need to submit your report and code that you have written to accomplish the tasks.
There are two separate submission links on Blackboard.
Report and Code Submission
There are two submission links on Blackboard: 1) for your course work report in a pdf
format and 2) for the requested code in a zipped file.
A Well-written Report Contains:
● A title page, including your ID number, course name, etc., followed by a contents page.
● The main part: description of the tasks and how they are performed, including results
from all subtasks. For instance: “This report presents results on reading and writing
images in MATLAB. Next, the study of different edge detection algorithms is presented
and their sensitivity to different parameters…” You are requested to present in
Appendices the MATLAB code that you have written to obtain these results. A very
important part of your report is the analysis of the results. For instance, what does the
image histogram tell you? How can you characterise the results? Are they accurate? Is
there a lot of noise?
● Conclusions describe briefly what has been done, with a summary of the main
results.
● Appendix: Present and describe briefly in an Appendix the code only for tasks 2-5.
Add comments to your code to make it understandable. Provide the full code
as one compressed file, in the separate submission link given for it.
● Cite all references and materials used. Adding references demonstrates additional
independent study. Write in your own style and words to avoid similarities with other work.
Every student needs to write their own independent report.
● Please name the files with your report and code for the submission on Blackboard by
adding your ID card registration number, e.g. CW_Report_1101133888 and
CW_Code_1101133888.
The advisable maximum number of words is 4000.
Submission Deadline: Week 10 of the spring semester, Sunday midnight
Guidance to Accomplish the Tasks
Lab Session 1 - Part I: Introduction to Image Processing
In this lab you will learn how to perform basic operations on images of different types, e.g.
how to read them, convert them from one format to another, calculate image histograms and
analyse them.
Background Knowledge
A digital image is composed of pixels which can be thought of as small dots on the screen.
By default, MATLAB performs numeric calculations using double (64-bit)
floating-point numbers, so this is also a frequent data class encountered in image
processing. Some of the most common formats used in image processing are presented in
Tables 1 and 2 given below.
Most MATLAB functions work with double arrays. To reduce memory requirements, MATLAB
supports storing image data in arrays of class uint8 and uint16. The data in these arrays is
stored as 8-bit or 16-bit unsigned integers. Such arrays require, respectively, one eighth or
one quarter as much memory as data in double arrays.
Table 1. Data classes and their ranges
Arithmetic on arrays of types uint8 and uint16 is rounded to integers and saturates at the
limits of the type, and some mathematical functions do not accept these types. It is
therefore often necessary to convert to double for operations and back to uint8/16 for storage,
display and printing.
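The conversion round trip described above can be sketched as follows. This is a minimal illustration, assuming the Image Processing Toolbox functions im2double and im2uint8 are available; the pixel values and variable names are made up for the example.

```matlab
% Round trip between uint8 storage and double computation (illustrative values).
I8 = uint8([0 64 128 255]);      % example 8-bit pixel values
Id = im2double(I8);              % class double, rescaled to the range [0, 1]
Id_sqrt = Id .^ 0.5;             % an operation carried out in double precision
I8_out = im2uint8(Id_sqrt);      % rescaled back to [0, 255], rounded, class uint8

% Note: double(I8) changes the class only; the values stay in [0, 255].
raw = double(I8);
```

Note the difference between im2double, which rescales the values, and double, which only changes the class.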
Table 2. Numeric formats used in image processing
Image Types
I. Intensity Image (Grey Scale Image)
This form represents an image as a matrix where every element has a value corresponding
to how bright/ dark the pixel at the corresponding position should be coloured. There are
two ways to represent the brightness of the pixel:
1. The double class (or data type) format. This assigns a floating-point number ("a number
with decimals") in the range of approximately -10^308 to +10^308 to each pixel. Values of
scaled class double are in the range [0,1]. The value 0 corresponds to black and the value 1
corresponds to white.
2. The other class uint8 assigns an integer between 0 and 255 to represent the intensity
of a pixel. The value 0 corresponds to black and 255 to white. The class uint8 only
requires roughly 1/8 of the storage compared to the class double. However, many
mathematical functions can only be applied to the double class.
II. Binary Image
The binary image format also stores an image as a matrix, but a pixel can only be black
or white (and nothing in between): 0 is for black and 1 is for white.
III. Indexed Image
This is a practical way of representing colour images. An indexed image stores an image as
two arrays. The first array has the same size as the image and one number for each pixel.
The second array (matrix) is called the colour map and its size may be different from the
image size. Each number in the first array is an index into the colour map: it selects the
row of the colour map that holds the colour of that pixel.
IV. RGB Image
This format represents an image with three matrices whose sizes match the image size.
Each matrix corresponds to one of the colours red, green or blue and specifies
how much of that colour a certain pixel should use. Colours are always
represented with non-negative numbers.
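The four image types can be illustrated on a tiny made-up picture. This is only a sketch: the 2-by-2 index array, the three-row colour map and the threshold 0.2 are assumptions chosen for the example, not part of the lab material.

```matlab
% A 2-by-2 indexed image: each entry is a row index into the colour map.
X   = [1 2; 3 1];
map = [1 0 0;    % row 1: red
       0 1 0;    % row 2: green
       0 0 1];   % row 3: blue

RGB   = ind2rgb(X, map);   % RGB image: 2-by-2-by-3 array of class double
Igrey = rgb2gray(RGB);     % intensity image: one brightness value per pixel
BW    = Igrey > 0.2;       % binary image: logical array, 1 is white and 0 is black
```

Here the blue pixel is the darkest one in the intensity image, so it is the only pixel that becomes black in the binary image.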
Guidance on Performing Lab Session 1 – Part I
Demos in MATLAB
>> demo MATLAB % Opens a window from which you can select a demo for different tools
Workspace and saving results
To see the variables in the workspace: who, whos
To clear the variables in the workspace: clear
To save the variables in the workspace: save name_of_a_file.mat
To load the data/ image from a file: load name_of_a_file.mat
Examples of Reading images in MATLAB
>> clear all % Clears the workspace in MATLAB
>> I = imread('Dog.jpg'); % Reads the image file into the array I
>> size(I) % Gives the size of the image
>> imshow(I); % Visualises the image
>> Ig = rgb2gray(I); % Converts a colour image into a grey level image
>> imshow(Ig)
1. The first line clears all variables from the workspace.
2. The second line reads the image file into a 3-dimensional array (rows, columns, colour
channels). MATLAB can read many image file formats, so you do not have to worry about
the details.
3. The third line returns the size of the image.
4. The fourth line visualises the colour image.
5. The fifth line converts an RGB image into a grey image. This is not necessary if the image
is already a grey level image.
6. The last line visualises the grey image.
Writing Images in MATLAB
Images are written to disk using function imwrite, which has the following basic syntax:
imwrite(I,'filename')
The string in filename must include a recognised file format extension (tiff, jpeg, gif, bmp,
png or xwd).
>> imwrite(I,'Dog1.jpg'); % Writes the image I to the file Dog1.jpg
Next, you can check the information about the graphics file, by using imfinfo.
Type: imfinfo Dog.jpg
Use the commands whos and ls to list the variables in the workspace and the files in the
current folder, respectively.
Changing the Image Brightness
Change the brightness of your image by adding a constant value to all pixel values, or by
subtracting a constant value from all pixel values. For instance:
>> I_b = I - 100;
>> figure, imshow(I_b)
>> I_s = I + 100;
>> figure, imshow(I_s)
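When the image is of class uint8, adding or subtracting a constant saturates at 0 and 255 instead of wrapping around. A small sketch with made-up pixel values shows the effect:

```matlab
% Saturating arithmetic on uint8 pixel values (illustrative values).
p = uint8([50 150 250]);

brighter = p + 100;   % 250 + 100 would exceed 255, so it is clipped to 255
darker   = p - 100;   % 50 - 100 would be negative, so it is clipped to 0
```

Detail in very bright or very dark regions is therefore lost when a large constant is used.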
Flipping an Image
Apply the provided flipLtRt.m function to your image to flip it. Visualise the results.
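For comparison, a flip can also be done with the built-in function flip. This is a sketch on a small made-up array; the implementation of the provided flipLtRt.m is not shown in this handout and may differ.

```matlab
% A small 2-by-4 "image" with 3 colour channels (made-up values).
I = reshape(uint8(1:24), [2 4 3]);

I_lr = flip(I, 2);   % left-right flip: reverses the order of the columns
I_ud = flip(I, 1);   % up-down flip: reverses the order of the rows
```

Flipping twice along the same dimension returns the original image, which is a quick way to check a flipping function.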
Detection of an Area of a Predefined Colour
Change the colour of the white pixels of an image to yellow on the image
'duckMallardDrake.jpg':
% Color the duck yellow!
im = imread('duckMallardDrake.jpg');
imshow(im);
[nr, nc, np] = size(im);
newIm = zeros(nr, nc, np);
newIm = uint8(newIm);
for r = 1:nr
    for c = 1:nc
        if ( im(r,c,1)>180 && im(r,c,2)>180 && im(r,c,3)>180 )
            % white feather of the duck; now change it to yellow
            newIm(r,c,1) = 225;
            newIm(r,c,2) = 225;
            newIm(r,c,3) = 0;
        else % the rest of the picture; no change
            for p = 1:np
                newIm(r,c,p) = im(r,c,p);
            end
        end
    end
end
figure
imshow(newIm)
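The same recolouring can be written without loops by using a logical mask. The sketch below uses a small synthetic image as a stand-in for 'duckMallardDrake.jpg' so that it runs on its own; the thresholds and colour values are those of the loop above.

```matlab
% Synthetic stand-in for the lab image: dark background with a white patch.
im = uint8(40 * ones(4, 5, 3));
im(2:3, 2:4, :) = 230;

% Logical mask of "white" pixels: all three channels above 180.
white = im(:,:,1) > 180 & im(:,:,2) > 180 & im(:,:,3) > 180;

% Copy the channels and overwrite only the masked pixels with yellow.
R = im(:,:,1);  G = im(:,:,2);  B = im(:,:,3);
R(white) = 225;
G(white) = 225;
B(white) = 0;
newIm = cat(3, R, G, B);
```

With the lab image, the first two lines would be replaced by the imread call from the loop version.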
Another example of finding an area of a predefined colour. Find the indexes of the yellow
pixels in the image ‘Two_colour.jpg’.
im = imread('Two_colour.jpg'); % read the image
imshow(im);
% extract RGB channels separately
red_channel = im(:, :, 1);
green_channel = im(:, :, 2);
blue_channel = im(:, :, 3);
% label pixels of yellow colour
yellow_map = green_channel > 150 & red_channel > 150 & blue_channel < 50;
% extract pixel indexes (i = row, j = column)
[i_yellow, j_yellow] = find(yellow_map > 0);
Visualise the results. Note that the plot and scatter commands work with spatial coordinates
(x = column, y = row), so the column indexes are passed first.
% visualise the results
figure;
imshow(im); % plot the image
hold on;
scatter(j_yellow, i_yellow, 5, 'filled') % highlight the yellow pixels
Conversion between Different Formats
1. Select your own image.
2. Read a colour image (imread command). Convert your RGB colour image to grey and
then to HSV format (rgb2gray and rgb2hsv commands, respectively).
3. Convert your RGB image into a binary format (im2bw command) and visualise the
result. Use at least 3 more operations converting images from one format to another.
This part is not required for the report, as mentioned in the assessment criteria section.
The conversion to a binary image is called binarisation. Binarisation is based on applying
a threshold to the image intensity and the process is called thresholding. The output binary
image has the value 0 (black) for all pixels in the input image with luminance less than
the threshold level and 1 (white) for all other pixels.
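Thresholding can also be written out by hand, which makes the role of the threshold explicit. The following is a minimal sketch with made-up pixel values; it illustrates the idea of thresholding and is not claimed to reproduce im2bw exactly at pixels equal to the threshold.

```matlab
% Manual thresholding of a small greyscale image (illustrative values).
Ig = uint8([10 100; 128 200]);
In = im2double(Ig);            % intensities rescaled to [0, 1]

level  = 0.5;                  % normalised threshold
BW_mid = In > level;           % pixels brighter than the threshold become white (1)

% A higher threshold leaves fewer white pixels.
BW_high = In > 0.7;
n_mid   = nnz(BW_mid);
n_high  = nnz(BW_high);
```

Counting the white pixels for several thresholds, as done here with nnz, is one simple way to comment on the effect of the threshold.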
Understanding Image Histogram
1. Experiment with a grey scale image, calculate the histogram and visualise it. There are
various ways to plot an image histogram: imhist, bar, stem and plot.
Show results with them. What can you say about the dominating colours of the objects/
images from the histograms?
Example Code:
clear all
I = imread('image.jpg');
Im_grey = rgb2gray(I);
figure, imhist(Im_grey);
xlabel('Number of bins (256 by default for a greyscale image)')
ylabel('Histogram counts')
You can use the bar function to plot the image histogram, in the following way:
h = imhist(Im_grey);
h1 = h(1:10:256);
horz = 1:10:256;
figure, bar(horz,h1)
See the difference compared with what the plot() function gives you:
figure, plot(h)
2. Calculate and visualise the histogram of an RGB image
In MATLAB you can only use the built in ‘hist’ on one channel at a time. One way to display
the histogram of an image is to convert it into a grayscale format with rgb2gray and apply
the imhist function. Another approach is to work with the RGB image in the following way.
First, we convert the image into double format and we can calculate for each channel:
r= double(I(:,:,1));
g = double(I(:,:,2));
b = double(I(:,:,3));
figure, hist(r(:),124)
title('Histogram of the red colour')
figure, hist(g(:),124)
title('Histogram of the green colour')
figure, hist(b(:),124)
title('Histogram of the blue colour')
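To compare the three channels directly, their histograms can be drawn in one figure. The sketch below uses histcounts with one bin per 8-bit intensity level and a synthetic image in place of the lab image; the image content and variable names are assumptions for the example.

```matlab
% Synthetic stand-in for the lab image: strong red, medium green, no blue.
I = zeros(8, 8, 3, 'uint8');
I(:,:,1) = 200;
I(:,:,2) = 100;

edges  = 0:256;                 % 256 bins, one per intensity level 0..255
counts = zeros(3, 256);
for k = 1:3
    channel = double(I(:,:,k));
    counts(k, :) = histcounts(channel(:), edges);
end

% Overlay the three histograms; column n of counts holds intensity n-1.
figure; hold on;
plot(0:255, counts(1,:), 'r');
plot(0:255, counts(2,:), 'g');
plot(0:255, counts(3,:), 'b');
xlabel('Intensity level'); ylabel('Number of pixels');
legend('red', 'green', 'blue');
```

A channel whose counts are concentrated at high intensity levels contributes strongly to the colour of the image.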
Now repeat the binarisation process after you choose the threshold value
appropriately, based on the histogram that you observe. This threshold value must be
normalised to the range [0, 1] to be used with the function im2bw.
Example: If we choose the midpoint 128 of the full range [0, 255] as the threshold, then
you can perform binarisation of the image I with the function:
ImBinary=im2bw(I,128/255);
Vary the threshold and comment on the results.
3. Calculate and visualise the histogram of an HSV image
For an HSV histogram you can use the same recommendation as for an RGB histogram,
given above. Another way of calculating the histogram in the HSV space is given below.
% Read and display the original image (example file name, as in the code above).
rgbImage = imread('image.jpg');
subplot(2, 4, 1);
imshow(rgbImage, [ ]);
title('Original RGB image');
% Convert to HSV color space
hsvimage = rgb2hsv(rgbImage);
% Extract out the individual channels.
hueImage = hsvimage(:,:,1);
satImage = hsvimage(:,:,2);
valueImage = hsvimage(:,:,3);
% Display the individual channels.
subplot(2, 4, 2);
imshow(hueImage, [ ]);
title('Hue Image');
subplot(2, 4, 3);
imshow(satImage, [ ]);
title('Saturation Image');
subplot(2, 4, 4);
imshow(valueImage, [ ]);
title('Value Image');
% Take histograms
[hCount, hValues] = imhist(hueImage(:), 18);
[sCount, sValues] = imhist(satImage(:), 3);
[vCount, vValues] = imhist(valueImage(:), 3);
% Plot histograms.
subplot(2, 4, 5);
bar(hValues, hCount);
title('Hue Histogram');
subplot(2, 4, 6);
bar(sValues, sCount);
title('Saturation Histogram');
subplot(2, 4, 7);
bar(vValues, vCount);
title('Value Histogram');
% Alert user that we're done.
message = sprintf('Done processing this image.\n Maximize and check out the figure window.');
msgbox(message);
Include the results of understanding the RGB image histogram in your report.
Understanding image histogram – the difference between one-colour and two-colour
images
An image histogram is a good tool for image understanding. For example, image histograms
can be used to distinguish a one-colour image (or an object in the image) from a two-colour
image (or an object in the image):
1. Read ‘One_colour.jpg’ and ‘Two_colour.jpg’ (with imread);
2. Convert both images into the greyscale format (with rgb2gray);
3. Calculate and visualise the histograms for both images (with imhist);
What are the differences between these histograms? What do we learn from the
visualised red, green and blue components of the image histogram?
Lab Session 1 - Part II: Edge Detection and Segmentation
of Static Objects
In this practical session, you will continue to study basic image processing techniques. You
will enhance the contrast of images and perform different operations on them. You will learn
how to model different types of noise in images and how to remove the noise from an image.
You will also learn approaches for edge detection and static objects segmentation.
Guidance on Performing Lab Session 1 – Part II
1. Read a previously chosen image ‘Image.gif’ (with imread);
Contrast Enhancement
2. Compute an image histogram for the image (imhist). Visualise the results. Analysing
the histogram, think about the best way of enhancing the image and recall the methods
from the lectures;
3. Apply the histogram equalisation operation to the image (histeq). Visualise the results.
Compute an image histogram for the corrected image. Visualise the results. Compare it
with the original histogram. Does this method of enhancement actually enhance image
quality?
4. Apply the gamma correction to the image (imadjust). Visualise the
results. Experiment with different values for gamma and find the optimal one. Compute
the image histogram of the corrected image. Visualise the results. Compare the
histogram and the image with the original ones and with the results of the histogram
equalisation. Which method of enhancement performs better?
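Steps 3 and 4 can be sketched on a synthetic low-contrast image. The image, the gamma value of 0.5 and the variable names are assumptions for illustration; with the lab image, the first line would be replaced by reading ‘Image.gif’.

```matlab
% Synthetic low-contrast image: a horizontal ramp squeezed into [0.3, 0.5].
I = 0.3 + 0.2 * repmat(linspace(0, 1, 64), 64, 1);

I_eq    = histeq(I);                  % histogram equalisation
I_gamma = imadjust(I, [], [], 0.5);   % gamma correction; gamma < 1 brightens the image

% Compare the spread of intensities before and after equalisation.
range_before = max(I(:)) - min(I(:));
range_eq     = max(I_eq(:)) - min(I_eq(:));

figure;
subplot(1, 3, 1), imshow(I),       title('Original');
subplot(1, 3, 2), imshow(I_eq),    title('Histogram equalisation');
subplot(1, 3, 3), imshow(I_gamma), title('Gamma = 0.5');
```

Trying gamma values below and above 1 on the same image shows the brightening and darkening effect of the gamma parameter.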
Images with Different Types of Noise and Image Denoising
5. Synthesise two images from the image ‘Image.gif’ with two types of noise – Gaussian
and “salt and pepper” (imnoise). Visualise the results;
6. Apply the Gaussian filter to the image with Gaussian noise (imgaussfilt). Find the optimal
filter parameter values. Visualise the results;
7. Apply the Gaussian filter to the image with salt and pepper noise (imgaussfilt). Visualise
and discuss the results;
8. Apply the median filter to the image with salt and pepper noise (medfilt2). Find the
optimal filter parameter values. Visualise the results;
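Steps 5 to 8 can be sketched on a synthetic image as follows. The noise levels, the filter parameters and the error measure (mean absolute error against the clean image) are assumptions chosen for the example, not recommended values.

```matlab
rng(0);                                          % reproducible noise
I = zeros(64);  I(20:45, 20:45) = 1;             % synthetic test image: a bright square

I_gauss = imnoise(I, 'gaussian', 0, 0.01);       % Gaussian noise: zero mean, variance 0.01
I_sp    = imnoise(I, 'salt & pepper', 0.05);     % salt and pepper noise: density 0.05

G_on_gauss = imgaussfilt(I_gauss, 1);            % Gaussian filter, sigma = 1
G_on_sp    = imgaussfilt(I_sp, 1);               % Gaussian filter on impulse noise
M_on_sp    = medfilt2(I_sp, [3 3]);              % 3-by-3 median filter

% Mean absolute error against the clean image, as one possible quality measure.
mae = @(A) mean(abs(A(:) - I(:)));
fprintf('salt & pepper: noisy %.4f, Gaussian filter %.4f, median filter %.4f\n', ...
        mae(I_sp), mae(G_on_sp), mae(M_on_sp));
```

A numerical measure such as this one can support the visual comparison when the filter parameters are varied.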
Static Objects Segmentation by Edge Detection
9. Find edges in the image ‘Image.gif’ with the Sobel operator (edge(…, ‘sobel’, …)).
Vary the threshold parameter value and draw conclusions about its influence on the
quality of the segmented image. Visualise the results with the optimal threshold value;
10. Repeat step 9 with the Canny operator (edge(…, ‘canny’, …));
11. Repeat step 9 with the Prewitt operator (edge(…, ‘prewitt’, …)).
Include the resulting images with segmented objects and add conclusions about static
objects segmentation using edge detection methods (from steps 9-11) in your report.
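Steps 9 to 11 can be sketched on a synthetic image as follows. The threshold values are arbitrary choices for illustration and would need to be tuned for ‘Image.gif’.

```matlab
I = zeros(64);  I(20:45, 20:45) = 1;             % synthetic image with one bright square

BW_sobel_low  = edge(I, 'sobel', 0.05);          % low threshold: more pixels accepted as edges
BW_sobel_high = edge(I, 'sobel', 0.40);          % high threshold: only strong edges remain
BW_prewitt    = edge(I, 'prewitt', 0.05);
BW_canny      = edge(I, 'canny', [0.1 0.3]);     % Canny takes a [low high] threshold pair

figure;
subplot(2, 2, 1), imshow(BW_sobel_low),  title('Sobel, threshold 0.05');
subplot(2, 2, 2), imshow(BW_sobel_high), title('Sobel, threshold 0.40');
subplot(2, 2, 3), imshow(BW_prewitt),    title('Prewitt, threshold 0.05');
subplot(2, 2, 4), imshow(BW_canny),      title('Canny, thresholds [0.1 0.3]');
```

On a noisy image the number of detected edge pixels changes much more with the threshold than on this clean example, which is what the threshold study in step 9 explores.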
Lab Session 2: Object Motion Detection & Tracking
This lab session is focused on motion detection and tracking in video sequences. You will
apply the optical flow algorithm to object tracking by using corner points. The optical flow
calculates the motion of image pixels from one frame to another.
You will apply the optical flow algorithm to the “interesting” corner points only, since the
numerical stability of the algorithm is guaranteed only at these points.
You first need to find the “interesting” points, and then apply an optical flow algorithm only
to them.
Background Knowledge
Corner Points
In many applications of image and video processing it is easier to work with “features”
(“characteristic points” or “local feature points”) rather than with all pixels of a frame. These
“features” or “points” should differ from their neighbours in some area.
Corner points are an example of such features. A corner point is a point whose
surrounding points differ from the surroundings of its neighbours. Figure 2.1 shows an
example of three types of points: 1) a top corner point, 2) an edge point and 3) a point
inside the object (internal point).
● The corner point is surrounded with the solid line square and its neighbour point is
surrounded by the dotted square. The corner point and its neighbour point have
different surrounding areas.
● For the edge point its surrounding is the same as the surroundings of its neighbour
point in one direction and it is different in any other direction.
● The internal point is surrounded by the same neighbourhood as all other near points
around it.
Figure 2.1. Illustration of the difference between corner, edge and internal points of an object.
Please note that the analysed points are surrounded with a square and the dotted square indicates
the area around neighbour points.
One of the most popular methods for detecting corner points is the Harris corner detector.
It is used by default in the MATLAB function corner.
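A minimal sketch of corner detection with the corner function, using a synthetic greyscale image so that it runs on its own. The image and the maximum number of corners are assumptions for illustration.

```matlab
I = zeros(100);  I(30:70, 30:70) = 1;    % synthetic greyscale image: a bright square

maxCorners = 10;                         % upper limit on the number of returned corners
C = corner(I, maxCorners);               % each row of C is one corner as [x y]

figure; imshow(I); hold on;
plot(C(:,1), C(:,2), 'r*');              % mark the detected corners on the image
```

The first column of C is the horizontal (column) coordinate and the second is the vertical (row) coordinate, which matches the order expected by plot.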
The Optical Flow Approach
An optical flow can be represented as a vector field of apparent pixel motion between
frames. Optical flow estimation is one of the most widely used methods for motion detection
in robotics and computer vision. Given two images I1 and I2, optical flow estimation
algorithms can find the vector field:
$$\{(u_{i,j},\, v_{i,j})\}, \quad i = 1, \dots, N, \quad j = 1, \dots, M,$$
where [N, M] is the image size. The vector field contains a displacement vector for each pixel.
Pixel (x, y) from the image I1 will have location $(x + u_{i,j},\, y + v_{i,j})$ in the image I2.
There are many different methods for optical flow estimation. The Lucas-Kanade algorithm
is one of the most popular algorithms. This lab considers only the Lucas-Kanade algorithm.
It has the following assumptions:
1. Brightness (colour) consistency. It means that pixels do not change their colour
between frames.
2. Spatial similarity. It means that neighbours of each pixel have similar motion
vectors.
3. Small displacement. This means that the displacement or motion vectors are small
and a Taylor series expansion can be applied.
With these assumptions in place, the calculation of the optical flow reduces to solving an
overdetermined linear system. This is done by the least squares method. The conditions
for the solution of the overdetermined linear system lead to the Lucas-Kanade algorithm. You
will apply the Lucas-Kanade algorithm to the “interesting” (“feature”) points only.
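The least squares step can be sketched for a single window in plain MATLAB. This is an illustration of the idea under the three assumptions above, not the implementation used by the toolbox; the synthetic frames, the window and all variable names are assumptions for the example.

```matlab
% Two synthetic frames: a smooth blob that moves by (0.5, 0.3) pixels.
[X, Y] = meshgrid(1:64, 1:64);
blob = @(cx, cy) exp(-((X - cx).^2 + (Y - cy).^2) / (2 * 8^2));
I1 = blob(32, 32);
I2 = blob(32.5, 32.3);

[Ix, Iy] = gradient(I1);     % spatial derivatives of the first frame
It = I2 - I1;                % temporal derivative between the frames

% One brightness consistency equation Ix*u + Iy*v = -It per pixel of the window.
rows = 24:40;  cols = 24:40;
A = [reshape(Ix(rows, cols), [], 1), reshape(Iy(rows, cols), [], 1)];
b = -reshape(It(rows, cols), [], 1);

d = A \ b;                   % least squares solution [u; v] for the window
```

The window contains many more equations than the two unknowns u and v, which is why the system is overdetermined; at a corner point the two columns of A are far from parallel, so the solution is numerically stable.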
Tracking with the optical flow
Object tracking is the process of object localisation and association of its location on the
current frame with the previous ones, building a trajectory for each object.
Optical flow estimation algorithms provide a tool to calculate a displacement vector from one
frame to another. This information can be used for tracking purposes. Indeed, if we
determine the point of interest in the first frame, we can compute a displacement vector for
it for every successive frame, using an optical flow estimation algorithm. The combination of
the positions of the points, computed by displacement vectors constitutes the trajectory of
this point.
If we want to track a non-point object, we can find “interesting” points on the object, track
them and use a median position of the “interesting” points as a position for the object. Since
optical flow estimation algorithms are not perfect and can lose tracking points, one should
reinitialise “interesting” points from time to time. At any time instant, the introduced
“interesting” points should satisfy the following constraints:
● A point should not be far from the current median position of the object – it has to be
inside the current bounding box;
● A point should be on the object – in your task you will use colour for this constraint;
● Each pair of tracking points has to differ from each other – if two points are too close
to each other, one of them will be deleted.
As the result, we have the following algorithm:
1. Build a colour template of the object in the first frame.
2. If necessary (in your object detection task) read the next frame.
3. Detect “interesting” points of the object in the current frame. Make sure they
satisfy all the constraints mentioned above.
4. Initialise tracks with detected and filtered “interesting” points.
5. Compute an optical flow for every “interesting” point between successive frames
6. Compute new positions of the tracks by adding the optical flow vectors to the current
positions in the tracks.
7. Make sure the new positions of the tracks satisfy the second and third constraints,
mentioned above. If not, delete those tracks.
8. Compute the median position of the new positions of the tracks. Move the bounding
box to the new median position.
9. Make sure the new positions of the tracks are inside the bounding box. If not, delete
those tracks.
10. Repeat steps 5-9. Introduce new “interesting” points of the object every k
frames.
It is recommended to use k = 5.
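Steps 6, 8 and 9 of the algorithm above can be sketched with made-up numbers. The track positions, flow vectors and bounding box are assumptions for illustration; the colour and minimum distance checks of step 7 are omitted.

```matlab
% Current track positions [x y] and their optical flow vectors [u v] (made-up numbers).
tracks = [10 10; 12 11; 11 13; 40 40];
flowUV = [ 2  1;  2  1;  2  1;  0  0];
bbox   = [5 5 12 12];                        % [x y width height] around the object

% Step 6: move every track by its flow vector.
tracks = tracks + flowUV;

% Step 8: median position of the tracks; re-centre the bounding box on it.
centre = median(tracks, 1);
bbox(1:2) = centre - bbox(3:4) / 2;

% Step 9: keep only the tracks that lie inside the moved bounding box.
inside = tracks(:,1) >= bbox(1) & tracks(:,1) <= bbox(1) + bbox(3) & ...
         tracks(:,2) >= bbox(2) & tracks(:,2) <= bbox(2) + bbox(4);
tracks = tracks(inside, :);
```

In this example the fourth track is far from the other three, so it barely affects the median and is deleted in step 9.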
Optical Flow Estimation and Visualisation with MATLAB
MATLAB provides an optical flow object for optical flow estimation – opticalFlowLK
(http://uk.mathworks.com/help/vision/ref/opticalflowlk-class.html)
To estimate an optical flow you will use the command estimateFlow
(http://uk.mathworks.com/help/vision/ref/opticalflowlk.estimateflow.html).
videoReader = VideoReader('…');
frameRGB = readFrame(videoReader);
frameGrey = rgb2gray(frameRGB);
opticFlow = opticalFlowLK('NoiseThreshold',0.009);
flow = estimateFlow(opticFlow,frameGrey);
You can use the following fields of the flow object:
● flow.Vx – the horizontal component of the velocity. size(flow.Vx) ==
size(frameGrey). flow.Vx(i, j) is the horizontal component of the velocity of the pixel
(i, j).
● flow.Vy – the vertical component of the velocity. size(flow.Vy) == size(frameGrey).
flow.Vy(i, j) is the vertical component of the velocity of the pixel (i, j).
You need the Computer Vision System toolbox from MATLAB.
For visualisation of the optical flow there are several options:
1. with the command plot
(http://uk.mathworks.com/help/vision/ref/opticalflow.plot.html)
2. with the command quiver(u, -v, 0), where u, v are the horizontal and vertical
displacements, respectively. Note, that it may take some time to visualise the
results on your Figure.
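A self-contained sketch of estimation and visualisation with two synthetic frames instead of a video file. The frames, the decimation factor and the scale factor are assumptions for illustration.

```matlab
% Two synthetic greyscale frames: a bright square that moves 2 pixels to the right.
frame1 = zeros(64, 'single');  frame1(25:40, 20:35) = 1;
frame2 = zeros(64, 'single');  frame2(25:40, 22:37) = 1;

opticFlow = opticalFlowLK('NoiseThreshold', 0.009);
estimateFlow(opticFlow, frame1);             % first call: frame1 becomes the previous frame
flow = estimateFlow(opticFlow, frame2);      % flow estimated from frame1 to frame2

% Draw the flow vectors over the second frame.
figure; imshow(frame2); hold on;
plot(flow, 'DecimationFactor', [4 4], 'ScaleFactor', 10);
```

The optical flow object keeps the previous frame internally, so in a video loop estimateFlow is simply called once per frame.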
*Moving a bounding box to a new position – help for the provided
function
In the object tracking task you could move a bounding box around an object to a new position
between frames. The function ShiftBbox could help perform this task.
The function ShiftBbox has two input arguments:
● input_bbox – the current bounding box, given as a 1 x 4 vector:
input_bbox(1:2) are the spatial coordinates of the top-left corner of the
bounding box, input_bbox(3) is the horizontal size of the bounding box and
input_bbox(4) is the vertical size of the bounding box;
● new_center – the new position of the centre of the bounding box in spatial
coordinates
The function ShiftBbox has one output:
● shifted_bbox – the updated bounding box in the same format as the input_bbox
argument. The centre of the updated bounding box is equal to the new_center input
parameter
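The behaviour described above can be sketched with a one-line stand-in. The code of the provided ShiftBbox function is not shown in this handout, so shiftBboxSketch below is a hypothetical illustration of the described input and output formats only.

```matlab
% Hypothetical stand-in for the provided ShiftBbox function:
% keep the box size and place the top-left corner half a size away from the new centre.
shiftBboxSketch = @(input_bbox, new_center) ...
    [new_center - input_bbox(3:4) / 2, input_bbox(3:4)];

input_bbox   = [10 20 40 30];                       % top-left corner (10, 20), 40 wide, 30 high
shifted_bbox = shiftBboxSketch(input_bbox, [100 50]);

% Centre of the shifted box, for checking against new_center.
centre = shifted_bbox(1:2) + shifted_bbox(3:4) / 2;
```

The size of the box is unchanged; only its top-left corner moves.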
Guidance for Performing Lab Session on Optical Flow
1. You can find corner points (with the MATLAB function corner) in the images
‘red_square_static.jpg’ and ‘GingerBreadMan_first.jpg’. Note that the corner
function works with greyscale images, so you first need to convert the input images
to the greyscale format. Next, apply the function with different values of the maximum
number of corners. Include the resulting images in your report. You need to show
the results for only one value of the maximum number of corners.
2. Find optical flow of the pixels which moved from the image
‘GingerBreadMan_first.jpg’ to the image ‘GingerBreadMan_second.jpg’
(opticalFlowLK, estimateFlow). Note that the estimateFlow functio