
PIC 10B Study Guide

Monday, March 2, 2015

8:42 AM

Inheritance
- This will mainly show up in the multiple choice and in the finding errors part of the test
- You'll have to know how public, private, and protected work, and something about data slicing as
well

Overview
- So what is inheritance?
Inheritance - the ability to derive a new class by extending existing ones
Derived Class/Subclass - an extension of its parent class
Base Class/Superclass - a class which is extended
- Throughout the rest of this, I'm going to explain things in terms of parents, children, and siblings
and I'll refer to classes as subclasses and superclasses, simply because that's how I was taught in
Java
The parent, child, and sibling hierarchy works the same way as you'd expect it to work in a
family
Each child has a parent, where the parent is the superclass and the child is a subclass
Ex: suppose we have a Person class and a Student class. Then the Student class is a CHILD of
the Person class
The Person class is the PARENT of the Student class
Suppose we also extend Person using an Employee class
The Person is the PARENT of the Employee
The Employee is the CHILD of the Person
The Employee is the SIBLING of a Student because both of them have the same
parent, Person
- Now that that's out of the way, let's proceed
- The general idea behind inheritance is that subclasses inherit EVERYTHING from their parents
This includes all member variables and member functions
Let's say we have a class Parent with a member variable x = 5, and a subclass Child that
extends Parent
class Parent {
public:
int x = 5;
};
class Child : public Parent {
// Literally nothing here
};
Then:

int main() {
Parent p;
cout << p.x << endl; // Will print out 5
Child c;
cout << c.x << endl; // Will also print out 5
return 0;
}
The cout << p.x << endl makes sense; this is just PIC 10A material
However, notice that the class Child has NO member declarations inside
We didn't write public: int x = 5; in there anywhere
So how come we're able to print out c.x?
Simple; since Child is a subclass of Parent, it will inherit Parent's member variables and
functions
- In general, it's good to follow the is a rule
This rule helps determine what gets to inherit what
For example, let's say we have a Student class as a subclass of the Person class
A Student is a Person
This means that it will inherit all of the Person attributes
However, a Person is NOT a Student
This means that the Person class does NOT inherit all of the Student attributes
- So far so good, right? Not too hard to remember this so far
- The part that gets difficult is determining what the child can access
Just because a child can inherit everything from a parent does not mean it can access
everything from the parent
A conceptual example would be a really rich person with a spoiled 3 year old child
Let's say the really rich person dies; he's going to leave a huge inheritance to the child
Can the child access it? Well, not exactly; since he's still a minor, there will usually be
an executor of the estate that keeps the money in trust
Yeah, I just realized I'm complicating the issue by introducing estate law, but basically,
it would be weird if the 3 year old suddenly had direct access to all of his parent's
money
This is illustrated in the three types of access

Public, Private, Protected


- There are three types of access
Public
Private
Protected
- The purpose of having types of access is to determine what people can access
If something is public, then EVERYONE can access it
This is NOT restricted to subclasses
In PIC 10A, when we created classes and member functions, we often just set
everything to public
We are then able to call the functions in the main function
The main function is an example of an outsider trying to access things
If a class's member variable/function is public, then main can access it
Consider the following class, which we'll use as an example
class Example {
public:
int public_int = 5;
protected:
int protected_int = 3;
private:
int private_int = 1;
};
In our main function, suppose we do the following:
int main() {
Example e;
cout << e.public_int; // Will print out 5
return 0;
}
We can print out public_int because it's a PUBLIC member of the Example class; this
means that EVERYONE can access it
If something is private, then NO ONE can access it EXCEPT for the Example class
NO. One. This means no children, no outside functions, NO one
Any member functions of the Example class WILL be able to access it, but no one else
will
This means the following will break:
int main() {
Example e;
cout << e.private_int; // Error: private_int is inaccessible
return 0;
}
This is because private_int is inaccessible to main, since main is an outsider
Remember how subclasses inherit everything from parents?
They technically inherit private variables, but they become hidden to the child
This means that they aren't able to access them in their methods
For example:
class Parent {
public:
int getVar();
private:
int var = 5;
};
class Child : public Parent {
public:
int childGetVar();
};
int Parent::getVar() {
return var; // Works just fine because member of Parent
}
int Child::childGetVar() {
return var; // Will throw error because var is inaccessible to Child
}
The childGetVar() function above will fail because it tries to access var
However, var is private for Parent, which means it is hidden to Child
So what did I mean when I said that it still exists?
Even though Child can't directly access var in its member functions, it can
access getVar, which is a PUBLIC function of Parent
Remember, since it is a subclass of Parent, it inherits EVERYTHING from
Parent
This means it inherits both var and getVar
Even though Child is calling getVar, getVar is still a function of Parent
This means that it still has full access to var
Therefore, the following is perfectly valid:
Child c;
cout << c.getVar(); // Will print out 5
The last type of access is protected
This means that the variable will ONLY be accessible to itself and children
This is mainly used for getters and setters
In the example above, we created a public getVar function for the Parent class
However, this means that anyone can access it
What if we only wanted Child and other children to access it?
If we made getVar() protected instead of public, then this would do it!
This does restrict outside functions from accessing information, however
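- Here's a quick sketch of protected in action (my own example, not from his notes; assumes the usual #include <iostream> and using namespace std):
class Parent {
protected:
int var = 5;
};
class Child : public Parent {
public:
int childGetVar() { return var; } // fine: Child can see Parent's protected members
};
int main() {
Child c;
cout << c.childGetVar(); // prints 5
// cout << c.var; // error: main is an outsider, so protected is off-limits to it
return 0;
}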
Speaking of children

Subclasses
- To create a subclass of a parent class, we use a modified version of a class declaration
- Basically:
class Subclass : public Superclass { };
- Notice how we wrote public Superclass. This means that we're extending the Superclass as a
public class as opposed to a protected or private class.
Why does this matter?
If we extend the class as a public class, then all of the Parent's members will be inherited at
the same level of access that they are defined with
Public members of Parent will be inherited as public
Protected members of Parent will be inherited as protected
Private members will be inherited as private
- What happens if we extend as protected or private?
If we extend as private, then everything is private
This is terrible.
This means that all public and protected functions, functions which the subclass
should have access to, are now inaccessible because they are now private
If we extend as protected, then all public and protected members become protected
- Here's a table to help illustrate!
                    Public Inheritance    Protected Inheritance    Private Inheritance
Public Members      public                protected                private
Protected Members   protected             protected                private
Private Members     private               private                  private

- Notice that private members will always stay private


- Forgetting to specify the class inheritance type will default to private inheritance; this is bad and
often appears as an error on his test
Be on the lookout for class Subclass: Superclass { };
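- For example, here's roughly what that mistake looks like (my own sketch; assumes #include <iostream> and using namespace std):
class Base {
public:
int x = 5;
};
class Sub : Base { }; // no "public" here, so this defaults to PRIVATE inheritance
int main() {
Sub s;
cout << s.x; // error: x was inherited as private, so main can't touch it
return 0;
}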

Summary
- Classes can borrow code from other classes by extending them, meaning they will inherit
everything
- Inheritance follows an is a structure, where a subclass is a superclass
- Three types of access control to restrict what is inherited: public, protected, private
Public: everyone can access
Protected: only the class itself and its subclasses can access
Private: only this specific class can access

Tricks/Things to Keep in Mind


- This section covers all the tricky things that he might test you on

Item 1
- Suppose we have the following class and function:
class Parent {
private:
int variable = 1;
void print( Parent p );
};

void Parent::print( Parent p ) {
cout << p.variable;
}
Does the print function have access to p's member?
As a matter of fact, it DOES
Why?
This is because print is a member function of the Parent class
Since it is a member of the Parent class, it has full access to all private members of the
Parent class
Therefore, there will be NO error on p.variable
What about the following:
int main() {
Parent p;
p.print(p);
return 0;
}
This WILL break
This is because print is a PRIVATE function of class Parent
Since the main function is NOT a member of Parent, it does not have access to any private
functions of Parent
Therefore, it will throw an error on the p.print(p); line and say that the print function is
inaccessible

Item 2
- Suppose we have the following:
class Parent {
protected:
int x = 5;
};
class Child : public Parent {
public:
void print(int y);
};
void Child::print(int y) {
cout << y;
}

int main() {
Child c;
c.print(c.x);
return 0;
}
Print is a public function of Child, which means it should be valid in main
x is a protected variable of Parent, but Child should have access to it since it's protected
So what will this code actually do?
If you guessed error, you're right! Why?
When we try to pass c.x as an argument to the print function, we're doing so inside the main
function
Remember that the main function is an outsider; it doesn't have access to protected items

Polymorphism
- This will show up in both finding errors and reading code
You will have a snippet of code which you will have to read and write output

Overview
- Polymorphism - the ability to treat different objects related by a common base class uniformly
- Basically, we're taking advantage of the inheritance thing we just covered
Think of this as creating different permutations of a certain method
- For example, let's say we have an Employee parent class and Engineer, Janitor, and Physicist
subclasses
- Let's say the Employee class has a doWork method
- Obviously, an engineer's do work is going to be different from that of a janitor, whose job is not
going to be THAT different from a theoretical physicist (<-- jab at my theoretical physicist friend),
but still somewhat different
- Given an arbitrary engineer, janitor, or physicist, we want to be able to tell them to doWork and
have them do their special job on their own
THIS is the guiding principle behind polymorphism
- There are three ways to do this:
Using a virtual function
Creating a polymorphic collection of objects
Using dynamic method binding

Function Overriding
- When an Engineer object is created, it automatically inherits the doWork method from the
Employee
- However, this is going to be a generic doWork method
- We want to specialize the doWork method for the Engineer
- To do this, we just define the doWork method again
void Engineer::doWork() {
// Whatever an engineer does
}
- This will override the original doWork method, so when you construct a new Engineer object
and tell him to do something, he'll use the new version
Engineer e;
e.doWork(); // will do what the overriding method says to do

Data Slicing
- Data Slicing - the loss of data when converting an object from a Child type to a Parent type
- Let's consider the following code:
vector<Employee> v;
v.push_back( Engineer() );
- Engineers are technically employees, so we are allowed to push it into the vector
- However, when we created the vector, we created it specifically to hold Employee objects, not
subclasses of Employee
This means that any new variables or functions you add in the Engineer class will NOT be
preserved
- Data slicing sucks and will probably show up as an error on the exam
- To fix it, use a vector of pointers instead
vector<Employee*> v;
v.push_back( new Engineer() );
This works because we use pointers, not the actual object itself
An Employee pointer stores no data itself; it merely points to objects of Employee type
When we create the object and have the pointer point to it, there is no data slicing because
we aren't trying to force the Engineer object into a container which it isn't designed for
- When we mentioned using a polymorphic collection of objects earlier, this is what we meant

Virtual Methods
- However, this isn't enough :(
- From the previous example, let's say we constructed a vector of Employee pointers and added a
new Engineer
- What happens when we do v[0]->doWork()?
- This will use the Employee version of the function, NOT the Engineer version
- To fix this, the method that is to be overridden must be declared virtual in the base class
This is dynamic method binding
- If a method is not virtual, then the class will choose whatever version its type says it should have
For example, if we do the following:
Employee a; // will use the Employee version of doWork
Engineer b; // will use the Engineer version of doWork
Employee c = b; // will use the Employee version of doWork
Employee* d = new Engineer(); // will use the Employee version of doWork
Employee* e = &b; // will use the Employee version of doWork
Notice the problem we have here? Unless we create (and store) the object as an actual Engineer, the
Engineer version never gets used
- Note that this is an error on the exam and something you will have to be able to read; ALWAYS
check if the method is virtual
If it is virtual, use the overriding (subclass) version
If it is not virtual, then check what type the variable was declared as
Unless the variable was explicitly declared as the subclass type, use the parent version of the
method
- To fix this, make the function virtual in the Employee class
- This means that the compiler will look for an overriding version in the subclass; if found, it will use the
overriding version instead of the original version
- Note: once a function is declared virtual, it remains virtual in all derived classes, even if the
derived class's override doesn't repeat the virtual keyword
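- To tie data slicing and virtual methods together, here's a rough sketch of the Employee/Engineer setup (my own code, assuming #include <iostream>, #include <vector>, and using namespace std):
class Employee {
public:
virtual void doWork() { cout << "generic employee work" << endl; } // virtual, so subclasses can override it
};
class Engineer : public Employee {
public:
void doWork() { cout << "engineering!" << endl; } // the override we want to be used
};
int main() {
vector<Employee*> v;
v.push_back( new Engineer() ); // no slicing, because we only store a pointer
v[0]->doWork(); // prints "engineering!" since doWork is virtual
delete v[0]; // clean up the heap object
return 0;
}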

Abstract Classes and Friends


- This will show up as potentially a single error and a lot of MC questions
- This is mostly a conceptual topic more than anything else

Abstract Classes
Abstract Class - a class with at least one pure virtual function
Pure Virtual Function - a virtual function that is set equal to 0 in the class definition
Note that abstract classes CANNOT be instantiated; this was one of the errors in the midterm
How can we tell whether a class is abstract or not?
If in the class definition, you see something like:
virtual void function() = 0;
- If it's set to 0, then it is a pure virtual function, and that class will NOT be able to be created
directly
- For example, let's pretend we have a Parent class that contains a pure virtual function and we
have a Child class that extends it
We can do:
Child c;
Parent* p = &c; // pointers (and references) to an abstract class are fine
We CANNOT do:
Parent p;
This is an error on the exam; you CANNOT create objects of abstract classes
You are allowed to create pointers for abstract classes though
Why have pure virtual functions at all? Why have abstract classes?
Abstract classes are useful for organizing your code; it's not something you HAVE to have
For instance, let's say we're creating a geometry program
We might want to have a Shape abstract class to hold member variables and functions that
all shapes might have
However, it's impossible to say what getArea() of a generic shape is, since you don't know what type of
shape it even is
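- Here's a tiny sketch of that Shape idea (my own code with a made-up Circle subclass; assumes the usual includes):
class Shape {
public:
virtual double getArea() = 0; // pure virtual, which makes Shape abstract
};
class Circle : public Shape {
public:
Circle( double r ) { radius = r; }
double getArea() { return 3.14159 * radius * radius; } // now an area actually makes sense
private:
double radius;
};
// somewhere in main:
// Shape s; // error: can't create an object of an abstract class
Shape* s = new Circle(2.0); // but pointers to abstract classes are fine
cout << s->getArea();
delete s;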

Friends
- Friend - a function or class that has direct access to a class's protected and private members
- This is a one-way thing; just because you designate another function/class as your friend does not
mean that you automatically become their friend
This is a very cruel metaphor for real life :(
- This will become more useful in later sections of this course, you don't really have to worry about
that right now
- The following is an example of how to designate a function and a class as a friend:
class Example {
friend class Person;
friend void printX( Example e );
private:
int x;
};

void printX( Example e ) {
cout << e.x; // okay because printX is a friend of Example
}
- Notice the direct access to the Example class's private member x
- This is what friendship does!

Streams
- I find this topic really stupid and I don't see why he's teaching it
- Like seriously, this is all just google-able stuff
- If you're preparing for your final like I am right now :D Just write this stuff down on your formula
card and you're good to go
I'm so sorry for you if you're doing your first midterm :P
Don't worry, the questions aren't that difficult :D
- Nonetheless, on your first midterm, TWO, that's right, TWO of your short answer questions will be
about streams
Since there are only two short answer questions anyways, that means all of them are going
to be about streams
I've also been told that they will return for the final, so whatever

Overview
- You've worked with streams before in the form of cout and cin
- These are special streams for the console (that's right, the c in cout and cin stands for
console!)
- These streams come from the #include <iostream> library
- You're probably already familiar with some basic syntax as well
For output, use <<
For input, use >>
If you have trouble remembering which is which and you've ever chatted with me in
the past, you'll probably notice that I say >> a lot
You know, cause I'm shortening >.>
Anyways, just remember that the smiley I make is for input :P
- The syntax for all the other streams that I'll cover in this section is the exact same :D
Really.
That's why I don't know why this topic's that important
- Also remember that getline(cin, <string variable name>); exists
The difference between using >> and getline is that getline grabs ALL of the data up to the
next \n character
>> will skip leading whitespace and then stop reading as soon as it hits the next whitespace
- Yeah, I don't think getline will be on the test never mind, ignore me
- Conceptually, think of streams as hoses
If it's an output stream, then things will be spit out of the hose
If it's an input stream, then things will be sucked into the hose
The hose itself is called a buffer; this is kind of a queue of things waiting to be processed
- The user will not be prompted to enter additional input until the buffer is empty
- Here's a couple of methods that will probably be useful for you
ignore()
This will ignore the next input item, INCLUDING whitespace
If you're reading in characters, it will ignore the next character in the stream
clear()
This will reset the stream's failstate; probably won't be using this on the test at all
peek()
Returns the value of the next char value without removing it from the stream
This is a good way to check what you're going to be manipulating before you actually do
anything to it
fail()
Returns whether the stream is currently clogged or there was an error opening a file or
formatting error or something. Basically, triggers whenever there's an error. Clear works in
tandem with this; if fail() returns true, running clear() will reset it such that fail() returns
false. THIS DOES NOT FIX THE UNDERLYING PROBLEM, however, you will still have to clean
up the buffer manually. Again, this probably won't show up on the test so don't worry about
it.
get( char& c )
Gets the next char value from the stream and stores it in the parameter c. This DOES NOT
ignore whitespace, so it is possible that you will get whitespace.
unget()
Places the character you just read back onto the stream where you found it, so it can be
read again
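- Here's a quick sketch of a few of these in action on cin (my own example):
// somewhere in main:
char next = cin.peek(); // look at the next character without removing it
if( next == '$' )
cin.ignore(); // throw the '$' away
char c;
cin.get( c ); // grab the next character, whitespace included
cin.unget(); // changed our mind: put that character back on the stream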
- Also, there are two special types of streams you'll be working with

File Streams
- This is basically just reading to/from a file instead of to/from the console
To use a filestream, you have to #include<fstream>
Suppose you're given a string containing the filename called filename
To open the file for input or output, use the following syntax:

ofstream out( filename.c_str() ); // for output
ifstream in( filename.c_str() ); // for input
- Note that you have to add .c_str() to the filename string
- This is because ofstreams and ifstreams are old and can only read C-strings
Don't worry about what this is because you're taking PIC not CS 31 :P
- Still, remember to add it
- Once you've created your filestream object, reading stuff from it or printing stuff to it is exactly
the same as you've done before
- Before we launch into examples, there's two more commands you should learn specifically for
filestreams
eof()
Returns whether or not you've reached the end of the file; use this for reading input from
files. Let's demonstrate below.
close()
Detaches the stream from the file. Always, always ALWAYS use this command after working
with a fstream.
- For example, let's say that we'll be given a text file with a set of numbers and we want to sum all
those numbers
- The code is below:
int sumNumbers( string filename ) {
ifstream in( filename.c_str() );
int total = 0;
while( !in.eof() ) {
int num;
in >> num;
total += num;
}
in.close(); // remember to close the filestream when you're done
return total;
}
- See? Not so bad. The only new parts are the eof() for the while loop condition and the creation of
the ifstream object at the beginning of the code
- Once we've created the ifstream, however, the rest of the code is pretty familiar
We still use >> to read in values into a temporary variable
- Note: use >> if you want to ignore whitespace and use get() if you want to include whitespace. On
the test, he'll tell you whether or not you should ignore whitespace or not
- Let's do an example with output
- Let's say we're given a string and we want to print out the string backwards into the given file

void printCharacters(string input, string filename) {


ofstream out(filename.c_str());
for (int j = input.length() - 1; j >= 0; --j)
out << input[j];


out.close(); // you will lose points if you don't put this here
}

- And tada! (I originally forgot the out.close() line; oops, -2 for me :P)
- Before we finish with filestreams, here's a question straight from the test :D
- Define a function that reads in input from the console, ignoring whitespace, and prints it one
character at a time to the file with the given filename
Repeat this process until you reach a '?'
Remember to close the output file stream
- Here's my code:
void type_to_file( string filename ) {
ofstream out( filename.c_str() );
char c;
cin >> c;
while( c != '?' ) {
out << c;
cin >> c;
}
out.close(); // remember to close the filestream
}

String Streams
- The other type of stream you will be working with is a string stream
- Basically, given a string, you're going to convert it into a stream object that you can >> and << to at
your leisure
- To do this, you must #include<sstream>
- Once done, you can convert a string as such:
string source = "Hello World";
istringstream in( source );


- Or quite simply just


istringstream in("Hello World");
- To create a new output string stream, use the following:

ostringstream out;
// Insert some code where you << stuff to it
string output = out.str();
That's all there is to it!
In terms of actually working with the string stream, you will still use << and >> as normal
For example, here's another one of his test questions:
Define a function that creates an istringstream object to convert the given money string into its
double value; Assume that the string always begins with the dollar sign '$'
- And here's my code:
double convert_to_double(string money) {


istringstream in( money );
in.ignore(); // to get rid of the '$'
double result;
in >> result; // read in the value into the double
return result;
}
- Yup that's it. Yup, these are real test questions. They're not so bad, are they :P
- Just remember your syntax and you should be fine
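- If he ever flips that question around (double back into a money string), an ostringstream version would look something like this (my own guess at such a question, not one of his; assumes #include <sstream> and using namespace std):
string convert_to_string( double money ) {
ostringstream out;
out << '$' << money; // << works exactly like it does with cout
return out.str(); // pull the finished string back out of the stream
}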

Operator Overloading
- So he really hyped this topic up for the first midterm and then didn't test us on it at all
- Nonetheless, I'm sure it might show up for your midterm, and I was told that this is a big part of
the free response for the final, so why not

Overview
- The basic gist of this topic is: you know all those operators you've been using? +, -, *, ==, !=, etc.
You can override what they normally do!
This is useful when you are creating a custom class
There are three ways to overload them
As a member function of the class (so basically, what you would normally do)
Operator[]
Operator+=
Operator-=
Operator*=
Operator/=
Operator++
As a friend function of a class
Operator>>
Operator<<
As a nonmember function
Operator+
Operator-
Operator*
Operator/
Operator==
Operator!=
Operator>
Operator>=
Operator<
Operator<=
The general rule is, if it is a stream operator, make it a friend
If it needs direct access to the object itself (such as += or []), then it should be a member
function
If it doesn't fit either of the first two rules, and all of the comparison operators fall here,
then make it nonmember
What about operator=, you may be asking? We'll get to that in the next section
There are two types of operators
Unary and binary
Unary operators only take a single argument; for instance, operator+= only needs one
argument, the input that you want to add to the given object
Binary operators take two; all comparison operators fall here, because you need two things
to compare
When overloading a binary operator, note that the left argument is the item on the left, and the
right argument is the item on the right
For example, consider the Set class (from your homework!)
Set operator+ (const Set& a, const Set& b);

- Let's say we have


Set r; Set s;
r + s;
- In this situation, r is going to be fed as argument a for operator+ and s is going to be fed as
argument b
This is because r is on the left of the operator and s is on the right
- Sometimes, the left argument is going to be a member of another class
This is always true for operator>> and operator<<
Consider operator>>
When we want to cin something, our syntax looks like:
cin >> variable
The variable might be the type of our class, but cin is always going to be an istream object
Therefore, this can't be a member function of our class (the left operand is the istream, not our
class), so we make it a friend so it can still reach our private members
That's why the operator>> declaration looks like this:
friend istream& operator>> (istream& in, Set& s); // neither parameter can be const, since we modify both
- Thus, whatever stream object we use will be on the left side of the >> and the Set we want to read
in will be on the right

Member Operators
- This section encompasses all operators that should be member functions
- Namely, the operator+=, operator-=, etc etc
- Let's consider a ComplexNumber class
A ComplexNumber will have a real component and an imaginary component
Let's call the real component x and the imaginary y
Study Guide Page 13

Let's call the real component x and the imaginary y


- Our class is going to look something like this:
class ComplexNumber {
public:
ComplexNumber( double x, double y );
ComplexNumber& operator+= (const ComplexNumber& z);
private:
double x;
double y;
};
- When we add two complex numbers, we add the x components together and the y components
together
- Also note that we placed the operator+= declaration into the class
Anything that modifies the calling object is always going to return by reference
This is why operator+= is of type ComplexNumber&
- The code for operator+= follows below:
ComplexNumber& ComplexNumber::operator+= (const ComplexNumber& z) {
x += z.x; // Add the parameter's x to our x
y += z.y; // Add the parameter's y to our y
return *this; // Return a reference to the calling object
}
- Not too difficult; the only thing you have to remember is that you have to return a reference to
the calling object
This is accomplished by "return *this;"

Nonmember Operators
- What about for nonmember operators?
- Since they are nonmember, they won't show up inside the class definition (well, duh)
- Rather, they'll show up in some arbitrary .cpp or .h file
You know, like the functions you did in PIC 10A before you even learned classes existed
- When overloading nonmember functions, first check whether you created any member functions
that you can re-use
- For instance, consider operator+
- What differentiates operator+ from operator+= is that operator+ returns a new value, whereas
operator+= modifies the calling object
- Well, we can work that into our code
ComplexNumber operator+ (const ComplexNumber& z1, const ComplexNumber& z2) {
ComplexNumber answer = z1; // Create a new ComplexNumber equal to the first param
answer += z2; // Add it to z2 by shamelessly reusing operator+=
return answer; // return the temp answer
}
- You'll notice some differences
- First of all, it doesn't return by reference; unlike operator+= which returned the calling object,
operator+ (and all nonmember functions) will return a new object
- You'll also notice the lack of a scope; this is because it is nonmember
Whereas we had ComplexNumber::operator+=, we only need operator+ because this is not
part of the ComplexNumber class
- Finally, we have two parameters instead of just one
When we add two complex numbers, the one to the left of the + sign will be z1, and the one
to the right will be z2
- The logic behind a nonmember operator is pretty simple; create a new object, shamelessly reuse
the member function, and then return the new object

Comparison Operators
- These can be either member or nonmember functions
- An important thing to note: the member versions have to be const member functions, and the parameters are all const references
- Consider operator==
If it's a member function
bool ComplexNumber::operator==(const ComplexNumber& z) const
And if it's not
bool operator==(const ComplexNumber& z1, const ComplexNumber& z2)
- Two differences abound!
The member function only has one argument; this is because the calling object will be the
left side by default; for nonmember functions, you have to accept two parameters
You can tell the first is a member function because of the scope operator (the
ComplexNumber::); it's also the only one that gets the trailing const
- The actual code for the operator overloading isn't very interesting
- Really, just write customized code to check if the two are equal, so basically just check if their x's
and y's are equal to each other
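- For the member version, that boils down to something like this (a minimal sketch using the private x and y from the ComplexNumber class above):
bool ComplexNumber::operator==(const ComplexNumber& z) const {
return x == z.x && y == z.y; // equal exactly when both components match
}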
- If you had to then overload operator!=, save yourself some trouble and just write:
bool operator!=(const ComplexNumber& z1, const ComplexNumber& z2) {
return !(z1 == z2);
}
- Or for the member function
bool ComplexNumber::operator!=(const ComplexNumber& z) const {
return !(*this == z);
}
- ALWAYS re-use your code. No shame.

Friend Operators (Basically Streams)


- These are kinda tricky in that they're probably how he's going to test you on streams on the final
(oh gosh, he might actually do that; that's a good question to study for). ANYWAYS
- The left argument for streams will always be the iostream object
For >>, that would be an istream
For <<, that would be an ostream
- The return types for these will always be the iostream object by reference
Consider some code that you may have written in PIC 10A:
cout << "Hello " << "World" << endl;
- Notice how we chain the things together? The only reason we can chain them together is because
the << and >> operators return the stream object
- This is equivalent to seeing
operator<<( operator<<( operator<<( cout, "Hello" ), "World" ), endl );
- Oh god it's disgusting
- But that's how it is. The innermost operator<< will return the calling ostream object, which is cout
The second operator<< can then use that as its first argument to print out "World"
- Therefore, the return types for these will always be the iostream object by reference
And of course, the first argument must also be the same iostream object by reference, as
well
- The second argument, on the other hand, can be anything you want, but you probably want it to
be the class you're working with
- For example, for the ComplexNumber class, we'd want our second argument to be that of a
ComplexNumber
- Suppose we want to print out the ComplexNumber in the format x + yi (or x - yi if y is negative)
ostream& operator<<( ostream& out, const ComplexNumber& z ) {
out << z.x; // write to out, not cout, so it works with any stream
if( z.y >= 0 ) // switch based on whether y is positive or negative
out << " + " << z.y;
else
out << " - " << -z.y; // print the magnitude; we already printed the minus sign
out << "i";
return out; // VERY IMPORTANT, DO NOT FORGET
}
Yeah, I know in his lecture notes he used an ostringstream, but screw it, this works just as
well
You can't directly concatenate integers or floats to strings, but I'm not doing that in my
above code
Rather, I print them out separately, which is perfectly valid
- And suppose we want to read it in from that same format, x + yi / x - yi (with those spaces as well)
istream& operator>>( istream& in, ComplexNumber& z ) {
in >> z.x;
char c;
in >> c; // ignore whitespace by using >>
double y;
in >> y;
if( c == '+' )
z.y = y;
else
z.y = -y;
in.ignore(); // get rid of that i at the end
return in; // VERY IMPORTANT, DO NOT FORGET
}
- Do note that the ComplexNumber argument for operator>> cannot be const because we need to
directly modify that variable to read in the value
- While stream functions are more complex, there are only two of them yay :D Just memorize these
and you're golden

Postfix and Prefix


- So these are two very special operators in that they are both operator++
- If you aren't aware, prefix is ++x, whereas postfix is x++
- If you create operator++ with no arguments, then it is prefix; if you have one argument, then it is
postfix
- Also, the return types are different; prefix returns by reference while postfix returns a new
variable
ComplexNumber& ComplexNumber::operator++(); // this is prefix
ComplexNumber ComplexNumber::operator++(int unused); // this is postfix
- For prefix, directly modify the calling object; for postfix, return a new copy
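- Here's roughly what the two definitions look like (my own sketch; I'm arbitrarily deciding that ++ bumps the real part x by 1):
ComplexNumber& ComplexNumber::operator++() { // prefix: ++z
x += 1;
return *this; // return the modified object by reference
}
ComplexNumber ComplexNumber::operator++(int unused) { // postfix: z++
ComplexNumber old = *this; // save a copy of the current value
x += 1;
return old; // hand back the copy from before the increment
}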


Unary Operators
- To be honest, he probably won't test you on postfix or prefix or anything in this section
I'm only including it because it's in the lecture slides
For unary operators such as - or !, (as in !x or -x), they should be friend functions
They take a single parameter, the class type by const reference
friend ComplexNumber operator-(const ComplexNumber& z);

Implicit Conversion
- So notice how when we overloaded all of these operators, we set our type to the class?
ComplexNumber?
- What if we want to be able to add doubles to ComplexNumbers?
- Will we have to copy all of our operators and change the left argument to double?
- Nope! Implicit conversion!
Implicit Conversion - the process of wrapping another variable type in a constructor to
convert it to the calling type
That's a very confusing definition, sorry :(
- Also, Ouellette really likes implicit conversion; my TA really doesn't like implicit conversion, so, use
it at your own risk
- Basically what this means is, let's say we wanted to add doubles to our ComplexNumbers
- Let's say we had the following constructor
ComplexNumber::ComplexNumber(double d);
- And let's say we tried to do the following
ComplexNumber z(3, 5);
double a = 5.0;
z += a;
- Normally, operator+= takes in the argument (const ComplexNumber& z), and a clearly is not a
ComplexNumber
- However, the compiler will try its hardest to make this work
It will look for a constructor with a double as its argument and use that to convert a into a
ComplexNumber
Therefore, behind the scenes, it'll look like this:
z += ComplexNumber(a);
Which will work, because a is now wrapped as a ComplexNumber
- However, do note that the compiler still can't do this for things on the left of operators
a += z; // will throw an error because still not overloaded
You would have to overload operator+= as a nonmember function with the double on the left, like:
double& operator+=( double& d, const ComplexNumber& z );
Which I'm not even sure will work because I don't know if we can do this for doubles
Nevertheless, you will not be asked to do this on the test; just be able to identify this as an
error

Errors
- These have shown up in error questions both on purpose and on accident
- Something to note: none of the operators we just discussed here are overloaded by default
ONLY operator= is overloaded by default
In the second midterm, Ouellette accidentally assumed operator== was overloaded when it
wasn't
Therefore, bonus points! This is good to know
- For questions that have shown up on the test intentionally
Using operator> but only overloading operator< in the class definition
In the class, he overloads operator< for the given class
However, in the actual code, he uses h > k
Just because we overload one of the operators does not mean that it automatically
works in the opposite direction
You have to overload operator> manually for that to work as well
Using different argument types for overloaded operators
For example, consider the operators we overloaded for Complex Numbers
Consider the following code snippet:
ComplexNumber z( 3, 5 );
cout << ( 3 == z );
This is going to fail because our operator== function was only overloaded for
arguments of type ComplexNumber
Carefully check the arguments for operators!

Memory Management (Big 3)


- So our free response question for the first midterm was basically this
- We were asked to construct a class that overwrote the copy constructor, operator=, and the
destructor
So yeah, you should know how to do these
- Also, function pointers showed up as a single MC question, so these are probably useful to know
as well
- There are four types of memory
Code memory
Static data
Stack
Heap
- The big 3 refer to the heap and this is the most important section
- For the stack, you really only have to know it exists; basically, function calls go to the stack
- For static data, might be useful to know how it works but I highly doubt it'll show up as more than
just a MC question; didn't show up at all for us
- For code memory, yeah it'll show up in your homework, but no more than a MC question on the
test

Code Memory
- Really all you have to take away from this section is function pointers
- When you compile your C++ program, all of the functions you create will be put into a section of
memory called "code memory"
- You can create pointers to point to functions here and make them do stuff!
- The syntax for this is ugly and scary, but it's not that bad
- What's great about these pointers is you can store them in arrays and pass them to other
functions
- But I digress; let's look at what a function pointer actually looks like

Passing Function Pointers as Arguments


- Suppose we have the following two functions
int add( int a, int b );
int multiply( int a, int b );
- Both of them have the same return type and same arguments
- Let's say we want to create a function that performs this operation on two given parameters
int perform( int a, int b, int (*operation)(int, int) ) {
return operation(a, b);
}
- My gosh, that parameter looks like a monstrosity :( Let's break it down
int (*operation)(int, int)
- The first part makes sense; that first int is the return type and is nothing you haven't seen before
- The second part is the name of the function pointer. Why is it enclosed in parentheses? Consider what
happens when we don't enclose it in parens
int *operation(int, int)
- While that looks more readable, the problem is the computer is stupid and will interpret the int
*operation as int* operation
That is, it will assume it is a regular function that has a return type of int pointer
Therefore, you must use those parentheses
- Finally, we get to the arguments
The only difference is we omit the argument names
- And as such, it isn't that bad :D
- After we include it as an argument, you can use the function just as any other function
Notice how we can call operation(a, b) just fine
- But how do we actually pass a function as a parameter?
- Simple, like so:
perform( 3, 5, add );
- Just put the name of the function and you're good to go

Storing Function Pointers in Arrays


- As if passing functions as parameters wasn't enough, you can also stick them in arrays
- Let's create a function pointer array to store the pointers to our add and multiply functions
int (*op[2]) (int, int) = {add, multiply};
- Yeah, the syntax is a bit uglier than normal array syntax, but it's just something to memorize
- He's not going to give you a free response or short answer question on this, so just be able to read
it :P
- It will show up in the homework though, so be prepared
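- Using the array afterwards just looks like a normal function call (quick sketch):
// somewhere in main, after the array above is set up:
cout << op[0]( 3, 5 ); // calls add(3, 5), prints 8
cout << op[1]( 3, 5 ); // calls multiply(3, 5), prints 15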

Static Data
- Static data memory holds both global and static variables
- Global Variable - a variable declared and defined outside the scope of any function or class
- Consider the following:
int variable = 5;
void foo( int x ) {
cout << x + variable;
}
- Notice how we created variable outside of the foo function
- Also notice how it isn't a member of any class
- Yet, we're STILL able to access it inside the foo function, because it's considered a global variable
We'll be able to access it anywhere within this program!
This is probably something you did in PIC 10A without even realizing what you were doing
- Static variables are a bit different
Static Variable - a local or member variable declared with the keyword static; it persists for the
whole program and is shared across all calls to its function (or all instances of its class)
- It will exist for the entire duration of the program and is only initialized once
- For example, consider the following:
int foo() {
static int x = 0;
return ++x;
}

int main() {
cout << foo() << endl; // Prints out 1
cout << foo() << endl; // Prints out 2
cout << foo() << endl; // Prints out 3
}
We initialize x in the first call to foo() and return ++x, which means we return 1
The second time we call it, x is already initialized and has value 1; when we call return ++x, we
now return 2 because we increment it from 1
The same for the third iteration
Is this cool? Sure
Are you going to use this? Probably not; if you ever face a point where you're going to need to use
it, either make that variable a member variable or make it a global variable
I mean, I guess this is more efficient, but you could still do all the problems without ever
having learned this

Stack Memory
- Stack - a type of memory where ordinary nonstatic local variables are stored
- Also function calls. Function calls are stored here too.
- This is the stack that the term stack overflow (and the website named after it) comes from

- Whenever you create variables in the main method, they get pushed to the stack
- Whenever you make a function call in the main method or something, it gets pushed to the stack
- Just know that this exists; who knows, it might even show up as a single MC question someday

The Heap/Big Three


- Okay, this is the largest topic for this section and the most important
- Heap - a type of memory where nonstatic variables created explicitly using the new operator are
stored
These variables persist and the heap is the place to store data you want to keep for a long
time
- The variable exists until you explicitly deallocate it using delete on its pointer
If you lose the pointer, however, you won't be able to deallocate it
It'll last FOREVER. This is called a memory leak; try your hardest to avoid these
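- In code, that cycle looks like this (a tiny sketch):
// somewhere in main:
int* p = new int(42); // the 42 lives on the heap; p is our only handle to it
cout << *p; // prints 42
delete p; // give the memory back when we're done
p = new int(7);
p = new int(8); // oops: we just lost the only pointer to the 7, so that's a memory leak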
- The Big Three consists of the following:
Copy Constructor
Operator=
Destructor
You will have to be able to write code for these, plus the default constructor
If you don't explicitly create these, C++ will automatically create a default constructor, copy
constructor, operator=, and destructor for you
That's why we're able to use operator= for custom classes when we neglect to explicitly
define them
So why are these the big 3?
These are things you should always define yourself when your class stores things on the heap
For starters, we need a destructor to deallocate things from the heap
We also need to overwrite operator= to perform a deep copy
Deep Copy - creating a copy of the values instead of just creating a reference
Consider the following code:
int x = 5;
int y = x;
x = 3;
cout << y; // prints out 5
This is a deep copy. When we do y = x, we set y equal to the VALUE of x, not equal to x
So what? Well, pretend this performed a shallow copy, that is, it set y equal to x
Then, when we did x = 3, y still equals x and would equal 3
Then cout << y would print out 3 instead of 5
Wouldn't that be irritating?
To avoid that kind of stuff, we have to create deep copies using operator=
The final item is the copy constructor
Basically, given a parameter equal to our current type, we create a new object equal to that
Consider the following
ComplexNumber::ComplexNumber( const ComplexNumber& z );
Given a ComplexNumber, it will create a copy of this
In order to avoid creating a shallow copy (the same problem that operator= faces), we
should define our own copy constructor
What's great about the big 3 is it should really just be the big one, operator=
If you play your cards right, the only thing you actually have to write code for is operator=
Let's consider the following example, which coincidentally is the big free response question from
the first midterm
Given the following class, define all the functions. Store the member variable p on the heap:
using namespace std;

class Character {
public:
Character(char value);
Character(const Character& c);
~Character();
Character& operator= (const Character& c);
private:
char* p;
};
- Basically, the big three plus an additional constructor
- To approach this problem, let's start with the constructor
This should just create a new Character object and set p equal to the value; easy!
Character::Character(char value) {
p = new char(value);
}
Yeah, in order to create primitives on the heap, you have to say new type(value)
If you didn't know this syntax before, learn it now :P
That's all there is!
- Okay, we're done with that; now let's tackle the copy constructor
Character::Character(const Character& c) {
p = nullptr; // so the operator= call below can safely delete it before copying
*this = c;
}
- The purpose of a copy constructor is to create a new object and set it equal to the given value
Well, if we want to set it equal to the given value, why don't we just use operator=? That's
basically what it's for
Score one for lazily reusing code :P
- Alright, moving on; the destructor
- Basically, we just have to deallocate all the member variables placed on the heap, so basically just
p
Character::~Character() {
delete p;
}
- And that's it
Note, if your member variable is an array, you have to use delete[]
- By the way, that was 12 out of 20 points right there in like what, 9 lines of code?
- Not too much to memorize, I hope :P
- Okay, we're finally at the largest component: operator=
Character& Character::operator=(const Character& c) {
if( this != &c ) {
delete p; // free the old char first so we don't leak it
p = new char(*c.p);
}
return *this;
}
- When performing this operation, you have to remember to check if the current object is already
equal to the calling object
To do this, you have to check if the this pointer points to the same location as the given
argument
That's the
if( this != &c )
If they're already equal, great! You don't have to do anything; just return *this
By the way, *this is just itself. This is a pointer to itself, and * dereferences it, so *this
gets the value of itself
If they aren't equal, we first delete the old p (so we don't leak the char we already own), and then set p equal to c's value
Since this is a member function, we can access c's private p pointer
Therefore, c.p will give us the pointer to c's value
However, we want the VALUE, not the pointer; therefore we have to dereference it
*c.p will give us c's value
Finally, we take that value and set p equal to it
p = new char(*c.p);
- Something to note:
Character c;
Character d = c; // will call the copy constructor
Character e;
e = c; // will call operator=
- If the object is being initialized (like Character d = c; above), then the = sign actually calls the copy
constructor, not operator=. Not sure if this tidbit of information will be useful but oh well
- Also, destructors chain through the inheritance hierarchy
Let's say you defined a destructor for a subclass
After it finishes executing, it will automatically call the base class's destructor next
The object destructs in reverse order of construction, reaching the base class last
The reason for this is because, as we extend the base class, we will continue to add
new functions and member variables
If we delete the base class first, then some of those member variables might break
because they reference base class material
Therefore, we delete them in reverse order; subclass stuff gets deleted first and base
class stuff next
If your class has at least one virtual method, then you should always make your destructor
virtual
If you don't make it virtual, then subclasses of this class won't call their own
destructors
For example:
Person* p = new Student();
delete p; // will call Person destructor but not Student destructor if Person
destructor is not virtual
The problem with destructor inheritance is that the base class's destructor will always be
called, but the subclass's destructor might not be; to resolve this, make sure the destructor
is virtual
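- In code, the fix is just one keyword (a quick sketch):
class Person {
public:
virtual ~Person() { /* clean up Person stuff */ } // virtual, so deleting through a Person* also runs the subclass destructor
};
class Student : public Person {
public:
~Student() { /* clean up Student stuff */ } // this now runs first, then ~Person runs after it
};
// somewhere in main:
Person* p = new Student();
delete p; // calls ~Student, then ~Person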
- And we're done! That's it. That was a whole free response question.

Recursion and Backtracking


- Okay, I can't teach you recursion if you don't already know recursion
- That's the bad news; more bad news! One free response question on the midterm is guaranteed
to be a backtracking problem
- The good news is even if you don't know anything about recursion, you can still memorize the
format to the backtracking problem and get full score on it

Overview
- Backtracking is just a super large recursive problem solving technique
- It has the following structure
Base case: if you're at the end, then you're done; return true;
If you can legally do what you're trying to do in your current position, then do it
Recursively call the command on the next available position
If that command returned true, then return true
Else, undo what you did and return false
Return false;
- There are two formats to this problem; the first is making the function a bool, and the second is to
make it a void
- In the homework, you make it a void, which I find ridiculous but oh well
- He will either make it a bool or void on your midterm; mine was a bool


Bool Version
- Consider a maze that's a 2D array of integers


The "paths" that you can visit are 1s, whereas if you can't visit it, it'll be some other value
We're given the canVisitSquare(int maze[N][N], int x, int y) method to check if we can visit it
Define a recursive function
bool solveMaze( int maze[N][N], int x, int y, int& moves )
If we make it to the bottom right corner, then we've solved the maze
Otherwise, we try to move east and then south and repeat until we solve it
As we move through the maze, increment moves and set the current position to moves;
remember to reset it if you un-visit a spot
Let's do this:
bool solveMaze( int maze[N][N], int x, int y, int& moves ) {
// base case
if( x == (N - 1) && y == (N - 1)) {
maze[x][y]++;
return true;
}
if( canVisitSquare( maze, x, y )) {
moves++;
maze[x][y] = moves;
if( solveMaze( maze, x + 1, y, moves ) ) // visit the east
return true;
else if( solveMaze( maze, x, y + 1, moves ) ) // visit the south
return true;
else {
moves--;
maze[x][y] = 1; // reset the spot we just visited
return false;
}
}

return false;
}
Okay, let's break this down
The first step is to write the base case. Even if you have absolutely NO idea what you're doing, you
can write down the base case
if( x == (N - 1) && y == (N - 1)) {
maze[x][y]++;
return true;
}
Basically, check if you've made it to the end; if yes, return true
Also, I completely forgot you could use the N from the arguments for array size here; that might
be important information for you in the future :P
NEVER, ever ever EVER forget the base case or else you'll be stuck in infinite recursion

- All you have to do for the base case is check if you are done; if yes, return true
- Okay, let's break down the next part
if( canVisitSquare( maze, x, y )) {
// some stuff
}
- He's going to give you a method to check whether you can perform your task here
All you have to do is use that method in an if statement here
- For the stuff inside the if statement, there are three parts
Perform your move here
Recursively call the command for the next step and check if it is valid
If it is not valid, undo what you did
- Let's look at the first part
moves++;
maze[x][y] = moves;
- This is just doing what he told us in the problem; basically, we are visiting the current spot by
doing this
- The second part isn't too bad either
if( solveMaze( maze, x + 1, y, moves ) ) // visit the east
return true;
else if( solveMaze( maze, x, y + 1, moves ) ) // visit the south
return true;
- What makes this problem unique is there are two recursive calls in this problem; usually, you only
have one!
- Still, all you do is recursively call the function for the next available spot
For this problem, we move to the right to check the east and move down to check the south
- Finally, we have to include the code for what happens if none of the recursive calls return true
else {
moves--;
maze[x][y] = 1; // reset the spot we just visited
return false;
}
- All this does is undo the first part and return false
- FINALLY, there is one last important part that I forgot to put on my test
return false;
- Yes, this is ALWAYS going to be your last line of code (before the last curly brace, that is)
- If you go through the entire backtracking process and don't have any successes, return false
- And that is the entire backtracking process
It's okay if you don't entirely understand it or can't visualize it in your head; as long as you
memorize the format, you'll get full score
Of course, it HELPS to actually understand the process, but it's not necessary

Void Version
For your homework, he'll make you do a void version as well
I find this version to be much less intuitive than the boolean version
Basically, you'll pass a bool by reference as one of the arguments in the function
Here's a sample problem from his other midterm
Suppose you have a 1D array of numbers from 1 to 9 in random order
Some of the numbers will be 0 while others will already be placed there
Ex: 8 0 0 4 0 6 0 0 7
Write a function to fill in all the 0s with numbers from 1 to 9, without any duplicates
- Here's the class provided:
class Sudoku {
public:
bool number_on_square(int column) const;
void fill_square(int column, int number);
void erase_square(int column);
void next_square(int column, bool& success);
bool valid_square(int column, int number) const;
private:
static const int BOARD_SIZE = 9;
int num_squares_filled;
};
- Our mission is to define the next_square function, so let's do it
void Sudoku::next_square( int column, bool& success ) {
// base case
if( num_squares_filled == BOARD_SIZE ) {
success = true;
return;
}
for( int x = 1; x <= BOARD_SIZE; ++x ) {
if( valid_square( column, x )) {
fill_square( column, x );
next_square( column + 1, success );
if( success ) {
return;
} else {
erase_square( column );
}
}
else if ( number_on_square( column )) {
next_square( column + 1, success );
if( success )
return;
}
}
success = false;

}
- This problem is a bit different than before, but it more closely resembles the homework, so there's
that!
- Still, the format's pretty similar
- Base case
if( num_squares_filled == BOARD_SIZE ) {
success = true;
return;
}
- If we're at the end condition; that is, we've reached the end of the board, we set success equal to
true and return
- Since it's a void, we can't just return true; we have to set the argument bool to true, and then
return to quit
- Otherwise, we iterate through all of the possible values to place onto the board
Yeah, I know we didn't do this for the previous problem, but we didn't have to
You have to do something similar to this for your homework
- For every iteration, we check whether we can place the number on the given spot
By the way, there are two checker methods for this, valid_square and number_on_square
I didn't provide the documentation for them; they're provided on the test, but here's what
they do
valid_square checks whether the current square is empty AND whether the number you're trying to place has NOT already been put on the board; it only returns true when both of those hold
number_on_square only checks whether the current square is non-zero
Anyways, for every iteration, we use valid_square to check that the current spot is empty and to make sure this number doesn't already exist on the board
If so, then:
fill_square( column, x );
next_square( column + 1, success );
if( success ) {
return;
} else {
erase_square( column );
}
We fill the square using the helper method (step 1)
We make our recursive call (step 2)
We check if it was successful; if it was successful, we quit, otherwise, we undo what we did
(step 3)
- Unfortunately, this problem also has a special component
Since valid_square checks if the number already exists on the board, the code won't
progress once it reaches a number that was already placed on the board at initialization
To fix that, we need to add another block
else if ( number_on_square( column )) {
next_square( column + 1, success );
if( success )
return;
}
If valid_square is false, we know that one of two things (or both) must be true
That number was already placed on the board
There is already something on that square
We want to find the case for when there is already a number on the board placed at
initialization
This means we want the case where there is already something on the square AND
the number in question has already been placed on the board
To do this, we need to check if there is something on the square
The way valid_square works is it checks both whether the square is occupied and whether the number was already placed
We do else if( number_on_square( column )) because we know that if we reached this point, valid_square was false, so either the square is occupied or the number was already placed (or both)
If number_on_square is true on top of that, then there is already a number sitting on this square, so we want to skip over it rather than try to fill it
I feel like this is getting super complicated and I'm even confusing myself right now
Basically, if there is already a number on the square that we're visiting, then it was probably
placed at init because we move from left to right, so there's no reason why we would have
placed that
Anyways, once we get to this point, we recursively call next_square to move to the
next spot
If true, then we quit
If not true, well, we don't have to do undo anything because we didn't do
anything
- Finally, at the very end, we have to set success = false; this is the very last step
- And these are the two versions of backtracking! Just study the homework really, really hard for
this version, I guess

Searching and Algorithm Analysis


- A binary search function is probably going to show up on your second midterm and you will be
asked to follow it through
- You'll also have to know how to reconstruct a function call stack, but that's not anything difficult
- Finally, algorithm analysis questions will show up in the multiple choice section


Overview
- There are two types of searching you will be responsible for knowing: linear search and binary
search
- That's about it

Linear Search
- Linear search works by beginning at the start, iterating through until you either find the item
you're looking for or you reach the end of the list
- Here's the code:
int linear_search( const int A[], const int SIZE, int target ) {
for( int x = 0; x < SIZE; x++ ) {
if( A[x] == target )
return x;
}
return -1;
}
- Basically, just uses a for loop to find the index of the target
- If the target cannot be found, it returns -1
- This is a pretty simple algorithm, and there isn't much to say about this except for the fact that it's
O(N) (which will make no sense to you now, but will make more sense as you go into the next few
sections)
- Linear search's advantage over binary search is that it has the same level of efficiency regardless
of whether the list you're searching through is already sorted or not
Binary search can't work on unsorted lists
Speaking of which

Binary Search
- Binary search works by dividing and conquering
- You ever play that game where you ask someone to think of a number between 1 and 100 and you
try to guess it?

- Let's play a variant of that game where the person who chose the number will tell you if your
guess is higher or lower than the number they chose
- To start, a good guess will be 50
This will divide the numbers neatly in half; either you guessed it right, or you will be told
which 50 numbers you no longer have to consider (1 to 50 or 51 to 100)
Let's say I was told it was higher
I know the number falls between 51 and 100 then
My next guess will divide that range in half; I'd guess 75
Depending on what I'm told, I will then be able to eliminate another 25 numbers
Notice how we keep dividing in half
At worst, you will have to make only about 7 guesses!
That's pretty good!
Meanwhile with linear search, if my number were say, 99, then you'd have to make 99 guesses
Clearly binary search is better if you're given a sorted list
Let's examine the code:
int binary_search( const int A[], int first, int last, int target ) {
    if( first > last ) return -1;
    int middle = (first + last)/2;
    if( A[middle] == target ){ return middle; }
    if( A[middle] > target )
        return binary_search( A, first, middle - 1, target );
    else
        return binary_search( A, middle + 1, last, target );
}
Note that this is a recursive function
The base case is that we reach a point where either we've found the target or it is no longer
possible to search in the range we're given
Otherwise, we find the value that's in the middle of our range
We compare it to our target; if the middle is greater than the target, then we need
to search to the left of the middle (basically, all the values less than the middle)
Otherwise, search all the values greater than the middle
We perform this task recursively, by calling ourselves with a revised boundary
For instance, let's say we're given the following:
int A[] = {2, 5, 7, 8, 9, 11, 12, 14, 20};
binary_search(A, 0, 8, 14);
Well, we first check if first > last; nope, 0 is not greater than 8
Alright, let's calculate the middle; this is (0 + 8)/2 = 4
A[4] is equal to 9
However, we're looking for 14; since our target is greater than the middle, we want to
search the larger values
Thus, we will return binary_search( A, 5, 8, 14 );
Okay, let's see what that does for us
First is not greater than last because 5 is not greater than 8
We calculate the middle -> (5 + 8)/2 = 6
A[6] is equal to 12
12 is still smaller than our target, so we return binary_search( A, 7, 8, 14 );
Let's do this again
7 is not greater than 8
Our middle is (7 + 8)/2 = 7
A[7] is equal to 14, so we're done! We return middle, which is 7
Remember all the other binary_searches we did before? Well, they form a function call stack that
looks like this:
binary_search( A, 7, 8, 14 );
binary_search( A, 5, 8, 14 );
binary_search( A, 0, 8, 14 );
The first binary search with args A, 0, 8, 14 is still waiting for a response from A, 5, 8, 14, which is
waiting for a response from A, 7, 8, 14
Since A, 7, 8, 14 returns 7, it passes that response to A, 5, 8, 14
We told A, 5, 8, 14 to return whatever binary_search( A, 7, 8, 14 ) returned, so it will also return 7
Finally, we get to the bottom call: A, 0, 8, 14, which will return what A, 5, 8, 14 returns (which is 7)
Therefore, our original call will return 7 as well
You'll be told to construct a stack exactly like this on the midterm, so be sure you understand this

Algorithm Analysis
- Ehh, see the Big-O section. I feel that this should be merged with that and will make a lot more
sense after you read that

Sorting
- I happen to like sorting. After reading this section and his lectures, you will probably end up not
liking sorting. Then you'll end up hating everything I like because I have very poor taste in liking
things :P
- I'll take that risk. ANYWAYS, for the exam, you will be asked:
1 MC question where you're given code and need to identify what sorting algorithm it is
1 free response question where you're given a string of numbers and need to sort it
- Really not all that bad, but it's important to understand the sorting algorithms conceptually
- For the midterm, know how to do bubble sort, selection sort, and insertion sort; he is most likely
going to make you perform selection sort or insertion sort, since bubble sort takes too much time
and merge sort, shell sort, and radix sort are too complicated to write out

Overview
- Before we jump into the different sorting algorithms, I like to divide them into two categories (this
will help later on in the Big-O section)
Bad sorting algorithms
Bubble Sort
Selection Sort
Insertion Sort
Good sorting algorithms
Shell Sort
Merge Sort
Quick Sort
Radix Sort
- You do not have to know quick sort for this class at all
- However, he covers it briefly in his lecture for some reason, and this is the ultimate job interview
question; when they ask you to sort something, be all, "yo I got this" and then describe how quick
sort works
Basically what my roommate tells me
- Anyways, the difference between bad sorting algorithms and good sorting algorithms is that good
sorting algorithms are optimized for really large data sets
- You'll see what I mean in the later sections. For now, this is just an introduction to each sorting
algorithm and an example of how they work. Onwards!

Bubble Sort
- Bubble sort is named this way because the largest elements "bubble" to the top
What the heck does that mean?
- Bubble sort works by iterating through the entire list (well almost, iterating until the second last
element) and swapping the current index with the next element if the current element is bigger
than the next element
Let's say you have a list: 9 1 3 5 4
You start at 9 and check whether it's bigger than the next element, 1
Of course it is; swap them
Now we're at 1 9 3 5 4
We now compare index 1, which is 9, to index 2, which is 3
Swap!
Now we're at 1 3 9 5 4
Notice how the 9 is moving up in the list
It's "bubbling" up to the top!
- Once it finishes an iteration, it will start over again and do the swapping thing until it runs through
an entire cycle without swapping anything
This is unique about bubble sort; it is the ONLY sorting algorithm with a built-in system to
check whether the list is already sorted
- Note: Every new iteration iterates for one less item
The first iteration runs from index 0 to index N - 1
The second iteration runs from index 0 to index N - 2
The third runs from index 0 to index N - 3
The reason for this is that the largest element in the list will bubble to the end of the list
The second time we iterate, we're looking for the second largest element because we
know the largest element is at the end of the list
Therefore, we will bubble the second largest element to the second last index
Then, we will repeat again for the third largest element to get that to the third last index
- Hopefully that makes sense. Time for an example!

Sample Run
- We begin with 0 5 3 2 1 4
First iteration:
0 5 3 2 1 4 // index 0: 0, index 1: 5, don't swap because 0 < 5
0 3 5 2 1 4 // (this is after swap) index 1: 5, index 2: 3, swap because 5 > 3
0 3 2 5 1 4
0 3 2 1 5 4
0 3 2 1 4 5 // Finished first cycle; notice we iterated 5 times and there are 6 elements
Second iteration:
0 3 2 1 4 5 // Dont swap because 0 < 3
0 2 3 1 4 5 // We're at index 1 now, which is 2 after the swap (3 before the swap)
0 2 1 3 4 5 // We're at index 2 now, which is 1 after the swap
0 2 1 3 4 5 // Compare index 3 to index 4; 3 < 4 so no swap
// Finished second cycle; notice how we iterated 4 times, 1 less than last
Third iteration:
0 2 1 3 4 5 // No swaps!
0 1 2 3 4 5 // Swap!
0 1 2 3 4 5 // 3rd iteration, 1 less than last time; notice how it's sorted, but since we
made swaps this time, we still have to go through another cycle
Fourth iteration:
0 1 2 3 4 5 // No swaps; it's already sorted, but we still have to iterate
0 1 2 3 4 5 // And we're done
Since there were no swaps during that cycle, we quit!
- Notice how each cycle we perform one fewer iteration than the previous cycle
We have 6 elements, so our first cycle is 5 iterations
Then we do 4, then 3, then 2
- On the test, he will make sure you write the right number of iterations; even if there are no swaps,
you still have to write each iteration
You don't have to write out every swap though, just how the array looks after every
iteration
So you'd write:
0 5 3 2 1 4 // before any calls
0 3 2 1 4 5 // after first iteration
0 2 1 3 4 5 // after second iteration
0 1 2 3 4 5 // after third iteration
0 1 2 3 4 5 // after fourth iteration; then trigger quit because no swaps
- The important thing to remember is the number of iterations you have to do
Your first pass through the list will be the number of elements - 1
Each subsequent pass will be one less than the previous pass
So the second pass will be number of elements - 2, and so on
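- By the way, here's a rough sketch of what bubble sort could look like in code (the function name and loop bounds are my own, not necessarily his exact version); note the 'quit' flag, which is the built-in check for an already sorted list:
void bubble_sort( int A[], const int SIZE ) {
    bool quit = false;
    // each pass is one element shorter than the previous one
    for( int last = SIZE - 1; last > 0 && !quit; last-- ) {
        quit = true; // assume we're sorted until we actually swap something
        for( int x = 0; x < last; x++ ) {
            if( A[x] > A[x + 1] ) {
                int temp = A[x]; // swap the pair
                A[x] = A[x + 1];
                A[x + 1] = temp;
                quit = false; // we swapped, so we need at least one more pass
            }
        }
    }
}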

Selection Sort
- The next sort is Selection Sort
- Conceptually, it works by iterating through the list from the beginning and finding the smallest
element
Then it will swap the first element with the smallest element
- This process will repeat, but instead of starting from the beginning of the list, it will start from the
second spot
- This means it will find the second smallest item in the list and swap it with whatever the second
item in the list is

Sample Run
- We begin with 5 3 2 0 1 4
We iterate from index 1 to N to find the smallest element; once found, we'll swap it with the
first element
0 3 2 5 1 4 // 0 was the smallest, swap with 5
Now that we know the smallest element is at index 1, we repeat the process but iterate
from 2 to N
0 1 2 5 3 4 // 1 was the smallest, swap with 3
Now repeat from 3 to N
0 1 2 5 3 4 // 2 was the smallest; didn't find any smaller, so no swaps
Now repeat from 4 to N
0 1 2 3 5 4
And again, for the last time; remember, Selection Sort performs N-1 iterations
0 1 2 3 4 5
- Without the catch to check if it's sorted, even if you're given a sorted list, selection sort will still
perform N-1 iterations
- On the test, you'd write it in the format:
5 3 2 0 1 4 // before any calls
0 3 2 5 1 4 // after first iteration
0 1 2 5 3 4 // after second iteration
0 1 2 5 3 4 // after third iteration
0 1 2 3 5 4 // after fourth iteration
0 1 2 3 4 5 // after fifth iteration
- Include the comments; they will clear up any ambiguity and help you argue for points back if they
dock you
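- Here's a rough sketch of selection sort in code as well (again, my own phrasing, not necessarily his exact version); notice the N - 1 passes and the swap at the end of every pass:
void selection_sort( int A[], const int SIZE ) {
    for( int x = 0; x < SIZE - 1; x++ ) {
        int smallest = x; // index of the smallest element in the unsorted part
        for( int y = x + 1; y < SIZE; y++ ) {
            if( A[y] < A[smallest] )
                smallest = y;
        }
        int temp = A[x]; // swap the smallest element into position x
        A[x] = A[smallest];
        A[smallest] = temp;
    }
}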

Insertion Sort
- Insertion sort divides the array into two sections, a sorted side and an unsorted side
- The sorted side is generally the left side
- It begins with 1 element in the sorted side (since a list of size 1 is always sorted) and N - 1 in the
unsorted side
- It then takes the first element from the unsorted side and sorts it into the sorted side
Basically, it will keep shifting it one step over to the left until the value on the left is smaller
than it
- What this means is that if you have an array that's already sorted, it won't perform any swaps or
anything
It will still iterate through the array, but it won't actually perform any actions
This makes it comparable to bubble sort in that it won't do any real work when given a
sorted array
Of course, when you're given an unsorted array, insertion sort is superior to bubble sort in
many ways

Sample Run
- We begin with 8 3 6 4 0 (this is a problem directly from my midterm by the way)
The 8 forms the sorted part of the array and the 3 6 4 0 is the unsorted part
We get the first element from the unsorted part, 3, and we want to sort it into the sorted
part
Basically, we keep swapping it with the element to its left until it encounters something
smaller than it or it is the smallest
This will yield:
3 8 6 4 0
Our sorted array now encompasses 3 8, while the unsorted is 6 4 0
We then sort the first element of the unsorted, 6
3 6 8 4 0
And then 4 0 were left
3 4 6 8 0
0 3 4 6 8
- On the test you'd write:
8 3 6 4 0 // before any calls
3 8 6 4 0 // after first iteration
3 6 8 4 0 // after second iteration
3 4 6 8 0 // after third iteration
0 3 4 6 8 // after fourth iteration
- Remember that insertion sort will make N - 1 iterations
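- Here's a rough sketch of insertion sort in code (my own phrasing, not necessarily his); A[0] through A[x - 1] is the sorted side, and we shift elements to the right until the new value fits:
void insertion_sort( int A[], const int SIZE ) {
    for( int x = 1; x < SIZE; x++ ) {
        int value = A[x]; // first element of the unsorted side
        int y = x;
        // shift larger sorted elements one step right until value fits
        while( y > 0 && A[y - 1] > value ) {
            A[y] = A[y - 1];
            y--;
        }
        A[y] = value;
    }
}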

Shell Sort
- Shell sort is an improvement on Insertion Sort made largely because Insertion Sort makes too
many swaps
- The idea is that we create a sequence of h values and we divide the array according to that
For instance, if we have an h value of 3, then our first sub-array consists of indices 0, 3, 6, 9,
etc etc
Our second sub-array would be indices 1, 4, 7, 10, etc
We then perform insertion sort on each of these subarrays
- Once done, we then decrease our h value, repeating this process until we reach an h-value of 1
- Once we're at an h-value of 1, then our "subarray" is just our regular array
Insertion sort on that will then result in a fully sorted array!
- But wait, how is this more efficient, you might be wondering; we're performing insertion sort like
a hundred times inside shell sort
That may be true, but shell sort's real strength lies in the fact that we don't perform as many
swaps in each iteration
Sure, we may have more iterations, but we perform far fewer swaps
This is very noticeable on large lists of values
- Shell sort will provide performance almost on par with merge sort (which we'll get to next), while
insertion sort will get nowhere near mergesort on large lists of values
- On SMALL lists of values, however, shell sort is LESS efficient than insertion sort

Sample Run
- Suppose we have the following array: 8 4 1 3 7 9 2 5 0
Notice that this is larger than stuff we've done in the past
- Our h-values are going to be 3, 2, 1
For h = 3
We begin at index 3 (because h = 3)
Our first 3-array at this point is 8 3 2
Remember how in insertion sort we begin at the second element?
In this case, index 3 is the 2nd element; the first element is 3 - 3 = 0
The third element is 3 + 3 = 6
Therefore, we're at indices 0, 3, and 6, which gets us this array
Perform the first iteration of insertion sort on this
Okay, we now have 3 8 2
In the large array, that looks like 3 4 1 8 7 9 2 5 0
We're now at index 4
Our subarray is found at indices 1, 4, 7
This yields subarray 4 7 5
We perform the first iteration of insertion sort here which doesn't change
anything
We still have 4 7 5
Our large array is still 3 4 1 8 7 9 2 5 0
We're now at index 5
Subarray from indices 2 5 8
Subarray is 1 9 0
Insertion sort; doesn't change anything
Still have 1 9 0
Large array is still 3 4 1 8 7 9 2 5 0
Now at index 6
We've returned to subarray 3 8 2
We insertion sort it! Now it's 2 3 8
Our large array looks like 2 4 1 3 7 9 8 5 0
Notice how every time we perform a sort on the subarray, we're really swapping
it in the large array
We just use tunnel vision to focus on the small array to find what it is we'll
be swapping
Now at index 7
Back to subarray 4 7 5
Now 4 5 7
Now 2 4 1 3 5 9 8 7 0
Now at index 8
Back to subarray 1 9 0
Now 0 1 9
Now 2 4 0 3 5 1 8 7 9
Since index 8 is the last index, we're done with this h value
The array now looks like 2 4 0 3 5 1 8 7 9
Now h = 2
We begin at index 2 (because h = 2)
This has the subarray 2 0 5 8 9
2 is the sorted region, 0 5 8 9 is the unsorted
Insertion sort! First iteration only, which results in 0 2 5 8 9
Large array now looks like 0 4 2 3 5 1 8 7 9
Index 3
Subarray is 4 3 1 7
Now 3 4 1 7
Now 0 3 2 4 5 1 8 7 9
Index 4
Subarray is 0 2 5 8 9
Still 0 2 5 8 9 (0 2 5 is sorted, 8 9 is unsorted)
Large array doesn't change
Index 5
Subarray is 3 4 1 7
Now 1 3 4 7
Large array is now 0 1 2 3 5 4 8 7 9
Index 6
Subarray is 0 2 5 8 9
Still 0 2 5 8 9 (0 2 5 8 is sorted, 9 is unsorted)
No changes
Index 7
Subarray is 1 3 4 7
No changes
Index 8
Subarray is 0 2 5 8 9
No changes
The array now looks like 0 1 2 3 5 4 8 7 9
Now h = 1
This is just regular insertion sort at this point
Index 1 (h = 1)
0 1 2 3 5 4 8 7 9
Index 2
0 1 2 3 5 4 8 7 9
Index 3
0 1 2 3 5 4 8 7 9
Index 4
0 1 2 3 5 4 8 7 9
Index 5
0 1 2 3 4 5 8 7 9
Index 6
0 1 2 3 4 5 8 7 9
Index 7
0 1 2 3 4 5 7 8 9
Index 8
0 1 2 3 4 5 7 8 9
And ta-da, we're sorted!
- The hardest part of this is keeping track of what your subarray is
- I highly doubt that he'll make you do this on the midterm; he might on the final though, cry cry the
final's in 10 hours :(
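- If it helps, here's a rough sketch of shell sort in code, hard-coding the h-values 3, 2, 1 from the example above (a real version would usually compute the h-values from the array size):
void shell_sort( int A[], const int SIZE ) {
    int h_values[] = { 3, 2, 1 };
    for( int i = 0; i < 3; i++ ) { // one full pass per h value
        int h = h_values[i];
        for( int x = h; x < SIZE; x++ ) { // insertion sort, but in steps of h
            int value = A[x];
            int y = x;
            while( y >= h && A[y - h] > value ) {
                A[y] = A[y - h]; // shift within the subarray
                y -= h;
            }
            A[y] = value;
        }
    }
}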

Merge Sort
- This is my favorite sorting algorithm, and I use this to sort papers after grading :D
- It's a recursive sorting algorithm that divides and conquers
- Here's how I sort papers:
Divide the stack into a lot of smaller stacks, usually 2 papers each
Sort the 2-paper stacks, which takes like no time at all
Now, pick two-paper stacks at random, and "merge" them
By that, I mean I look at the top of both of the two paper stacks and I take the paper
that comes first in alphabetical order
I repeat this process until both stacks are empty
Then I'll end up with a bunch of 4-paper stacks
Repeat the process to merge the 4 paper stacks
Repeat until I finish sorting
This sounds like a convoluted way to sort, but trust me, sorting papers is a LOT faster than
manually sorting
When you manually sort papers, you basically perform insertion sort, and it takes forever to
find the right place to put in the paper
- I'm not crazy! This actually works well! :(
- Also, this is a pretty good sorting algorithm; this is the "worst" acceptable sorting algorithm to talk
about at a job interview
If they ask you to sort something, you should generally either use heapsort or quicksort, but
if you can't use either one, then mergesort is acceptable
- Anyways, here's merge sort with numbers

Sample Run
- Let's sort 8 3 4 0 1 5 7 2, shall we?
Let's divide this into two piles
8 3 4 0
Let's divide this into two piles
8 3
Let's divide this into two piles
8
3
4 0
Let's divide this into two piles
4
0
1 5 7 2
Let's divide this into two piles
1 5
Let's divide this into two piles
1
5
7 2
Let's divide this into two piles
7
2
At this point, we have 8 sorted piles
Since each pile only has one item, they have to be sorted
Now, let's merge:
Let's merge the two piles 8 and 3
Compare the top elements of both
3 is smaller, so we add 3 to our new list
Now there's only one pile, so we add 8 to our list
We now have one pile that's 3 8
Let's merge the two piles 4 and 0
Compare the top elements of both and push them to the new list in the
order of the smallest
This yields 0 4
Let's merge the two piles 1 5
This yields 1 5
Let's merge the two piles 7 2
This yields 2 7
Okay, now we have 4 sorted piles
3 8
0 4
1 5
2 7
Let's merge 3 8 and 0 4
Compare the top elements: 3 and 0
We take 0; our piles are now 3 8 and 4
We take 3; our piles are now 8 and 4
We take 4; our only pile is now 8
We take 8;
Our new list is 0 3 4 8
Let's merge 1 5 and 2 7
Compare the top elements: 1 and 2
We take 1; our piles are now 5 and 2 7
We take 2; our piles are now 5 and 7
We take 5; our pile is now just 7
We take 7
Our new list is 1 2 5 7
We now have 2 sorted piles, 0 3 4 8 and 1 2 5 7
Merge!
Compare top elements: 0 and 1
We take 0, our piles are now 3 4 8 and 1 2 5 7
We take 1, our piles are now 3 4 8 and 2 5 7
We take 2, our piles are now 3 4 8 and 5 7
We take 3, our piles are now 4 8 and 5 7
We take 4, our piles are now 8 and 5 7
We take 5, our piles are now 8 and 7
We take 7, our pile is just 8 now
We take 8
And we're done; we've produced the sorted pile 0 1 2 3 4 5 7 8
- Wasn't that fun? I sure think so
- Here's a more concise version of what I was doing:
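(This is my own sketch of that same idea in code, with the piles as vectors; it's not necessarily the exact version from lecture.)
#include <vector>
using namespace std;

// merge_piles() is the "compare the tops of two piles" step described above
vector<int> merge_piles( const vector<int>& left, const vector<int>& right ) {
    vector<int> result;
    size_t i = 0, j = 0;
    while( i < left.size() && j < right.size() ) {
        if( left[i] <= right[j] )
            result.push_back( left[i++] );
        else
            result.push_back( right[j++] );
    }
    while( i < left.size() ) result.push_back( left[i++] ); // one pile ran out,
    while( j < right.size() ) result.push_back( right[j++] ); // so dump the rest of the other
    return result;
}

// merge_sort() does the dividing: split in half, sort each half, then merge
vector<int> merge_sort( const vector<int>& A ) {
    if( A.size() <= 1 )
        return A; // a pile of one is already sorted
    vector<int> left( A.begin(), A.begin() + A.size()/2 );
    vector<int> right( A.begin() + A.size()/2, A.end() );
    return merge_piles( merge_sort( left ), merge_sort( right ) );
}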

Radix Sort
Fortunately, you don't have to know quicksort for this test
Unfortunately, you have to know radix sort!
Radix sort is incredibly efficient; unfortunately, it only works for numbers
Basically, you sort numbers by their ones place, then by their tens place, then hundreds, etc
To do this, you construct 10 vectors, one for each possible digit (0 through 9)
You then read each number into a vector based on the current digit; for example, if we are sorting
through the ones place, we'll put 48 into the vector for 8
- Once all of the numbers have been put into vectors, you empty the vectors out in order, from 0 to
9
- Then you repeat the process for the next digit (so start with ones place, then move onto the tens
place, etc)
- The advantage of radix sort is that it's incredibly fast; the disadvantage is that it takes up a lot of
memory space to store all of these vectors
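- Here's a rough sketch of what that could look like in code (my own phrasing; the 'passes' parameter is just how many digit places you want to process, so 2 is enough for two-digit numbers):
#include <vector>
using namespace std;

void radix_sort( vector<int>& A, int passes ) {
    int divisor = 1; // 1 = ones place, 10 = tens place, 100 = hundreds place, ...
    for( int p = 0; p < passes; p++ ) {
        vector<int> buckets[10]; // one vector per possible digit, 0 through 9
        for( size_t i = 0; i < A.size(); i++ )
            buckets[ (A[i] / divisor) % 10 ].push_back( A[i] );
        A.clear();
        for( int d = 0; d < 10; d++ ) // empty the vectors back out in order, 0 -> 9
            for( size_t i = 0; i < buckets[d].size(); i++ )
                A.push_back( buckets[d][i] );
        divisor *= 10;
    }
}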

Sample Run
- Suppose we want to sort 48 63 33 24 11 1 31 88
- Here's what it looks like after we put them all into vectors
0:
1: 11 1 31
2:
3: 63 33
4: 24
5:
6:
7:
8: 48 88
9:
- Then we read them out in order from the beginning of each vector, in the order of 0 -> 9
We get 11 1 31 63 33 24 48 88
- Okay, repeat for 10s place
0: 1
1: 11
2: 24
3: 31 33
4: 48
5:
6: 63
7:
8: 88
9:
- And read them out in order again
1 11 24 31 33 48 63 88
- And ta-da, they're sorted! Wasn't that quick and easy?

Last Thing
- On the exam, he will give you the code for a sorting algorithm and ask you to identify what it is
- Here's what to look for:
Characteristic -> Sorting Algorithm
A bool called 'quit' that is part of the for loop -> Bubble Sort
A swap function; also the only one that iterates from x = 0 to SIZE - 1 -> Selection Sort
Two for loops, one with a super complicated interior -> Insertion Sort
Three for loops, one iterating through h-values -> Shell Sort
The only recursive sorting algorithm -> Merge Sort
Lots and lots of vectors -> Radix Sort
Pivots (no way he's going to give this to you) -> Quick Sort

Big-O
- This is the hardest topic you will do (unless you count recursion)!
Fortunately, this will probably only show up as one or two MC questions worth like 1 point each
Unfortunately, this is one of the most important concepts in programming
Fortunately, I will try my best to make it as simple as possible :P
He likes to write a lot of crap about actually counting the number of statements and
determining Big O from that. That's why his slides are a smorgasbord of mathematical
expressions. While you COULD do it that way, understanding how Big-O works conceptually
is so much faster

Overview
- The point of Big-O analysis is to determine how efficient an algorithm is and how long it takes
given small or large inputs
- For instance, let's start with something simple
Remember linear search? The search method where you look through a list one by one until
you find the right element?
Let's say we have a list of 100 elements, and let's say the thing we're looking for is the 47th
If we use linear search, how many comparisons will we have to make to find it?
Well, we start from the first one and compare. Since our thing is the 47th and not the
first, we move on to the second
Again, ours is the 47th, not the second. Move on.
Please don't make me type this 47 times. Basically, we will have to make 47
comparisons before we actually find what we're looking for.
What if our item is the 99th? We'll have to make 99 comparisons.
What if our item is the 3rd? We'll only have to make 3 comparisons.
I think you're beginning to see a trend here. Basically, to find the n-th element, we will have
to make n comparisons until we find what we're looking for.
This is called O(N)
This means that the algorithm's running time grows directly with the size of the array
- In our example above, we made exactly n-comparisons to find the n-th element
- What if we did something where we would make exactly 2n comparisons to find the n-th
element? Is this O(2N)?
Technically yes, but this will reduce to O(N).
Big-O notation is NOT meant to be an exact count of how many statements it takes, despite
what Ouellette might write in the slides
Rather, it is a conceptual understanding of how the algorithm operates
We can think of this linear search as a single for loop
Basically, that's how we'd write it; we'd write a for loop and compare things one by
one
- In general, if there is only a single for loop, it is O(N), because it will generally run about N times
As we increase N by 1, we execute the inner code one more time, so it grows linearly
It's okay if this confuses you right now; it'll make more sense once we look at the
alternatives
- What if we had a for loop INSIDE of a for loop?
Let's say the inner for loop runs M times and the outer for loop runs N times
This means we'd be running the inner for loop N times
Since the inner for loop runs M times, we'd do a total of M * N runs
As we increase N by 1, we're going to increase the number of runs by M
Before with O(N), we'd only increase the number of commands by 1
Now, we're increasing them by M
This will grow much faster than linearly (quadratically, to be exact); this is called O(N2)
Why is it called O(N2) and not O(M*N)?
Again, Big-O notation is a simplification of the situation; even though both for loops
are not running N times, we say O(N2) to say that it grows quadratically
- In general, if there is a nested for loop, it is O(N2)
- What if we nested ANOTHER for loop?
That would be O(N3); see if you can figure out why! If not, FB message me :P
O(N3) is not on the test, but it's to make sure you understand O(N2)
- The next order of complexity is O(1)
This is constant time; this means that it will always take roughly the same number of
operations each time
For instance, let's say we have a sorted list and we want to find the largest element
I mean, you could iterate through the list and compare until you find the largest (this
is O(N))
OR, since you know the list is sorted, it'll be at the endpoint
Since you know it'll always be at the end, all you have to do is just access the element
at the end
This is O(1)
O(1) does not have to run in 1 command; it could be 5, 10, 15
My roommate came up with a brilliant sorting algorithm he calls RealWorldSort as a joke
Yes, we make computer science jokes. We're lame like that :P
Basically:
Sort the first 100 elements
Sort the last 100 elements
Hope no one checks the middle, since people usually just check the ends
While it's not really a good sorting algorithm since it doesn't actually sort anything,
this "sorting" algorithm is TECHNICALLY O(1)
This is because it makes the same number of operations EVERY single time
Even though sorting the first and last 100 elements might be an O(N) or O(N2)
operation individually, sorting 100 elements is a constant
It'll always take the same amount of time to sort 100 elements
Therefore, we can conclude that this algorithm is O(1)
If we were to vary the number of elements to sort, such as say, sort the first and
last 10% of elements instead of 100, then we would no longer be O(1)
This is because the number of elements we sort now depends on the size
of the list
In this case, it would be O(N)
O(1) is probably the easiest complexity to understand since you NEED to find a constant
If there's anything that can vary, it's probably not O(1)
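- To tie O(1), O(N), and O(N2) together, here's roughly what each one looks like in code (the function names are made up purely for illustration):
// O(1): same handful of operations no matter how big the array is
int largest_of_sorted( const int A[], const int SIZE ) {
    return A[SIZE - 1]; // the biggest element of a sorted array is at the end
}

// O(N): a single for loop over the array
int sum( const int A[], const int SIZE ) {
    int total = 0;
    for( int x = 0; x < SIZE; x++ )
        total += A[x];
    return total;
}

// O(N^2): a for loop nested inside another for loop
int count_equal_pairs( const int A[], const int SIZE ) {
    int count = 0;
    for( int x = 0; x < SIZE; x++ )
        for( int y = x + 1; y < SIZE; y++ )
            if( A[x] == A[y] )
                count++;
    return count;
}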
- The last two orders of complexity are harder to understand :(
- Fortunately, if you understand the previous 3 well and you come across an algorithm which you
know for sure isn't the previous 3, then you can probably assume it's one of the following two
- Anyways, the next two (and only other Big-O things you're responsible for) are O(log N) and O(N
log N)
Consider binary search
We start halfway through a sorted list and compare the number we're looking for with
the number we're at
If our number is less than the number we're at, then divide the left half in half
If our number is greater than the number we're at, then divide the right half in
half and search that
Let's say we have 100 elements total
Let's say we're really unlucky; then we'll divide the array into 50, then 25, then
12, then 6, then 3, then 1
This means that at most, we have to make about 7 comparisons to search a list of
100 for something
This certainly beats our linear search!
Notice the strategy of "divide and conquer"
This is O(log N)
I'm sure Ouellette has a mathematical proof to prove it's log N
Conceptually, all you have to understand is that a divide and conquer algorithm like this,
which halves the search range every step, is log N
- What about O(N log N)?
This one looks the most confusing, but it's not all that bad!
Remember how O(N2) was just doing an O(N) operation N times?
How we put a for loop inside another for loop?
Well, O(N log N) is basically just doing an O(log N) operation N times!
For instance, let's say we want to find N numbers
We'll have to perform the binary search algorithm N times then
This would be O(N log N)
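For example, here's a little sketch that reuses the binary_search function from the searching section (the wrapper function is my own, just to illustrate the idea):
// Count how many of the N targets appear in the sorted array A of length SIZE.
// Each lookup is one binary_search call (O(log N)), and we make N of them,
// so the whole thing is O(N log N).
int count_found( const int A[], const int SIZE, const int targets[], const int N ) {
    int found = 0;
    for( int x = 0; x < N; x++ ) {
        if( binary_search( A, 0, SIZE - 1, targets[x] ) != -1 )
            found++;
    }
    return found;
}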

Case Study
- To help conceptualize Big-O operations, imagine a phone book.
O(1): given a page that a person's name is on and their name, find their phone number
This is easy for you to do; you will know exactly where to look and how to find the
phone number
O(log N): given a person's name but not the page, find the phone number
Assuming you use binary search and not linear search, you'll open the book to some
page that isn't the first page
Then you'll check whether the person's name comes before or after the page you're
on
If it comes after, then flip randomly forward a bit; if it comes before, then flip
randomly backward a bit
Repeat and keep flipping forward/backwards in smaller and smaller increments until
you find the page with the name on it
Then look up the phone number
Yes, I know this isn't actually binary search since binary search will divide in half and
such, and flipping randomly forwards and backwards will probably take a bit longer,
but it's still log N in terms of complexity
It isn't mathematically log N, but it follows the same methodology as an
algorithm that is log N; therefore, we can consider it log N
O(N): find all people whose phone numbers contain the digit "5"
Yeah, I don't envy you if you ever have to do this
You're going to have to check every single phone number to see if it contains 5 in it
This means that you're going to have to iterate through the entire book
This is similar to doing a single for loop
This will therefore be O(N)
O(N): alternatively, given a phone number, find the person's name
Again, you're going to have to check every single person and see if their phone
number matches the one you have
Again, similar to a single for loop
- Now, pretend that we're in a printer's office printing phone books for customers!
O(N log N): personalize each phone book for each customer by opening the phone book for
that customer, finding their name, and putting a sticker next to it
Doing this for one person is just O(log N), as we discussed earlier
However, you have to do this O(log N) operation N times, for the number of
customers you have
N log N is just doing log N operations N times
O(N2): an extra 0 was added to every phone number by accident; open every phone book
and white out the extra 0 for every phone number in every phone book
This is the equivalent of doing a nested for-loop
For each phone book
For each number in the phone book
White out the number
Whiting out the extra 0 for every phone number in one phone book is O(N)
This is because we have to read every phone number in the book
However, we have to do this for ALL the phone books
We're basically doing an O(N) operation N times
Therefore, this is O(N2)
- There are more types of Big-O, such as O(N!), but you are not required to know these for this class
- You only have to know the ones listed above
Also, I find that O(1), O(N), and O(N2) are the easiest to understand conceptually
If you absolutely don't get O(log N) and O(N log N) (like me for the longest time), just assume
that if it isn't O(1), O(N), or O(N2), it's one of the log functions

Sorting Algorithm Analysis


- Anyways, this section is meant to correspond with Ouellette's lecture on sorting algorithm analysis
- Let's walk through each sorting algorithm CONCEPTUALLY instead of his mathematical nightmare

Bubble Sort
- Bubble sort is a fun sort; see the sorting algorithms section for a refresher on it
- Let's consider the best and worst case scenarios
Best Case:
Remember how bubble sort has a catch to tell if the list is already sorted?
You probably didn't :D Well, REMEMBER IT.
Anyways, the catch works by checking whether any swaps were made in that pass
If no swaps were made in that pass, then quit
This means that best case, it needs to make a single pass through all the stuff
How fast is a single pass?
Did you guess O(N)?
Well you should have. It's O(N) because it's directly proportional to the length of
the items
Therefore, our best case scenario is O(N)
Worst Case:
Worst case is that our catch never kicks in, and that we make swaps every single time
That means we're going to have to iterate through the entire list roughly N times
Okay, technically the amount we iterate through decreases each time, from N - 1 to N - 2 to N - 3, etc etc
For simplicity's sake, let's just say it's about the same, okay? Let's not
complicate things too much
Conceptually, we have to iterate through the list (which is already O(N)) N times,
because our catch never kicks in
Since we're performing an O(N) algorithm N times, we will have O(N2) complexity
- Again, the important thing is not an accurate count of the mathematical statements, as Ouellette
seems to emphasize, but a rough understanding
- Therefore, Bubble Sort is:
Best Case: O(N)
Worst Case: O(N2)
- What about average case? On average, the catch will probably kick in somewhere, so we don't do
the O(N) operation N times
However, on average, we will still have to perform the O(N) operation some number of
times, and if we increase the size of the list, we will have to perform the O(N) operation
more times, meaning it scales with the size of the list
Therefore, we can still conclude that Bubble Sort is O(N2) in average case

Selection Sort
- While selection sort is better than bubble sort, it's still a pretty terrible sorting algorithm
- On the bright side, it has the same efficiency for worst case, best case, and average case
The reason for this is because it has no checks for whether the array is already sorted
It will perform the same number of iterations every time
It will iterate through the entire unsorted part of the array and compare elements to
find the smallest element
Then it will do this N - 1 times
- We're performing an O(N) action about N times; what does this mean?
Yup, O(N2) for best case, worst case, and average case
- No need to do any of that math stuff he puts into his lecture slides

Insertion Sort
- What's great about insertion sort is it won't actually do anything if the array is already sorted
- This makes it ideal for almost sorted or fully sorted arrays
- The best case is if it's already sorted
In this case, you will still iterate through the entire array
Every iteration, your sorted section grows larger by one and your unsorted grows smaller by
one
You take the first element from the unsorted and try to sort it into the sorted region
However, since the array is already sorted, that element will already be where it
should be
Therefore, it will continue along in its loop
Since you're basically just going to iterate through the loop once, the best case scenario is O(N)
- Let's consider the worst and average cases
Chances are, you won't be using a sorting algorithm if your array is already sorted
Let's look at an unsorted array
The worst case would be if it were in reverse order
This means that as we move to the next element in the unsorted array, we have to move it
all the way to the front of the sorted array
This is an O(N) operation
On top of that, there are N elements, so performing an O(N) task N times is O(N2)
On average, it'll also be O(N2)

Merge Sort
- What's great about merge sort is it has no best case, worst case; they're all the same, just like
selection sort
- Unlike bubble sort and insertion sort, merge sort doesn't discriminate
No matter what, it will always break everything into the smallest piles and sort them the
exact same way
- If you want a comprehensive breakdown on why this is O(N log N) for all three cases, read his
lecture notes
He calculates it using math and stuff
- I can't really give you a rigorous answer for this, but here's the gist
Since it's a divide and conquer algorithm, the dividing produces about log N levels of piles
At each level, merging all the piles back together takes O(N) work in total
- Therefore, combining the two, we get O(N log N)
- This is a grossly oversimplified explanation; there's no way he's going to ask you to prove it on the
test
- At worst, you'll be asked a MC question on merge sort's efficiency
- In that case, just say O(N log N)
- Yes, this was very unsatisfying, but you can just read his lecture notes if you'd like to delve into the
math :D

Conclusion
- In general, use insertion sort on small arrays and arrays that are almost sorted
- Shell sort is almost as good as insertion sort on small arrays and almost as good as merge sort and
quick sort on large arrays
- Merge sort and quick sort are fastest on large arrays
Merge sort is a stable sort though, and is more reliable; quick sort is less reliable but faster
on average
- Radix sort is the fastest, but requires memory to hold all of those darn vectors

Algorithm Analysis
- A really irritating MC problem he puts on the test is that he asks you to find the Big O of something
given T(2N)/T(N)
- What does this mean? Let's say we have a linear function, that is O(N)
- Let's say it takes 10 seconds to run 100 operations
How long do you think it will take to run 200 operations?
Well, since this is linear, you'd probably expect it to take 20 seconds to run
- What about a function that's O(1)? Let's say it takes 10 seconds to run 100 operations and you
know it's O(1)
How long will it take to run 200 operations?
Since it's constant time (that means the time will never change), it'll also take 10 seconds
- Now, given times, he's going to ask you to determine what a function's big O is
- Let's consider the O(N) example first
He's going to give you a table like the following:
N       T(2N)/T(N)
100     1.999
200     2.001
Basically, what T(2N)/T(N) means is you take the time it takes to run N operations, you take
the time it takes to run 2N operations, and you divide them
Let's say our N is 100; it takes 10 seconds to run 100 so T(N) = 10
It takes 20 seconds to run 200 so T(2N) = 20
T(2N)/T(N) = 20/10 = 2
But wait, you might complain, my table has 1.999 instead of 2
Yeah, he adds in some stupid "random variation" bs to make it more "realistic"
- Anyways, instead of brute forcing this, I'm going to put a table here with values and answers
T(2N)/T(N)    Big-O
1             O(1)
2             O(N)
2 -> 1        O(log N)
4 -> 2        O(N log N)
4             O(N2)
8             O(N3)
- For O(log N) and O(N log N), he's probably going to start you at N = 100
That means you'll see numbers more like 2.3 for O(N log N) and 1.15 for O(log N)
- If you don't see any of the above, chances are it's O(2^N)
There's no predictable number for this one, so if it doesn't follow a pattern, it's this
- Just memorize this table and you'll ace that question
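- If you're curious where those ratios actually come from, here's a little experiment you can compile and run yourself (this is just my own sanity check, not something he asks for; the output will wobble a bit, which is exactly his "random variation"):
#include <chrono>
#include <iostream>
using namespace std;

// time a plain O(N) loop; for a linear algorithm T(2N)/T(N) should come out close to 2
double time_linear_loop( long long N ) {
    volatile long long total = 0; // volatile so the compiler doesn't optimize the loop away
    auto start = chrono::steady_clock::now();
    for( long long x = 0; x < N; x++ )
        total = total + x;
    auto stop = chrono::steady_clock::now();
    return chrono::duration<double>( stop - start ).count();
}

int main() {
    long long N = 50000000;
    cout << "T(2N)/T(N) = " << time_linear_loop( 2 * N ) / time_linear_loop( N ) << endl;
    return 0;
}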

Recurrence Relations
- You're probably only going to be asked a single MC question on this, if at all
- Unfortunately, this topic makes its return from Math 61
- Fortunately, it is highly unlikely that he is going to test you on it (yay!)


- Still, it's good to know how to solve one just in case?


- I mean, I can't really teach you too much because I myself don't really know how to solve one
- I can guarantee you that this will not show up as a short answer question; this is not a math
course :P
At worst, it'll show up as a single MC question
- Theoretically, this is the process for solving one:
Let's consider the recurrence relation for Linear Search: T(N) = T(N - 1) + 1
Well, given this formula, we can solve for T(N - 1): T(N - 1) = T(N - 2) + 1
And we can plug this into the original formula: T(N) = T(N - 2) + 2
Then repeat! T(N) = T(N - 3) + 3
Note that this is going to repeat N times
On our N-th repetition, we're going to have T(N) = T(0) + N
So we're basically just going to have a constant amount of work added
N times, which means T(N) is O(N)
Yeah, don't worry if you don't get this; just try to brute force your way through things on the
test only if you're truly desperate
- The following should be enough for the test:
If you see T(N - 1), then it is going to be O(N)
If you see T(N/2), or N divided by any number, it's going to be O(log N)
If you aren't given any variables to add (just a constant), then it's just that O(N) or O(log N) on its own
If you're told to add N, then it's going to be N times whatever the inside operation is
You'll see what I mean when you look at the selection sort and merge sort relations
- Also, memorize the following:

T(N) = T(N - 1) + O(1)
This is linear search
T(N) = T(N/2) + O(1)
This is binary search
T(N) = T(N - 1) + O(N)
This is selection sort
T(N) = 2T(N/2) + O(N)
This is merge sort
T(N) = 2T(N - 1) + O(1)
This is the Towers of Hanoi
- There are two types of questions you might face
One: You'll be given one of the recurrence relations above and told to identify which it is
Just memorize the list I put here and you're going to be just fine
Two: You'll be given one of the recurrence relations above and told to identify its efficiency
in Big-O notation
Just memorize the rules I put above and you're fine
- And that's all you really have to know for this topic. This topic is stupid.

LinearList
- There will absolutely be a question on the test where you will have to make use of his LinearList
class
- The class API will be provided for you, but it's good to know it ahead of time so you don't have to
look at it
- You will have to know how three classes work
Node
Iterator
LinearList

Overview
- So what is a "linear list"? It's essentially a singly linked list
- Like that helps at all. What's a singly linked list?
- Think of one of those metal chain things

- When you think about it, all a chain really is is just a bunch of metal rings connected to
each other
- A linked list is just like this
Linked List - a data structure consisting of a group of nodes which together represent a
sequence

- Notice the similarity? Yeah, I don't really, but bear with me


- Each rectangle is a Node of the LinkedList
It contains two attributes, a number and an arrow
The number is the value of the Node
In an array, you'll recall that at each index of an array, there is a value; this is the LinkedList's
version of that
The arrow is a pointer to the next Node
In arrays, you access each element by using an index
For instance, if you want to get the 5th element in an array, you'll just type array[4]
(assuming array is the name of the array)
We can't do that in LinkedLists, unfortunately; there IS no index structure
Rather, you ONLY have access to the first item in the linked list; this is called the head
We then have to look at where the head points and then go to that node; that will be
the second node
Then we look at where the second node points and go where it goes; this is the third
node
Then we look at where the third node points and go where it goes; this is the fourth
node
I would just say and so on, so forth, but this is the last step, so go where the fourth
node points
Then we FINALLY reach the fifth node
Notice how this took us like, five whole statements
In general, if we want to access any element that isn't the first Node, we have to visit that
many elements to get to it
Say we want to go to the 7th node; then we have to do 7 operations
Therefore, accessing elements in linked lists is O(n) because we have to iterate through it
For arrays, accessing elements is O(1) because you just have to use the index
- So why would we want to use a linked list? This sounds like a lot of hassle for something really
terrible
- Short answer? Job interviews. They LOVE to ask questions about linked lists at job interviews
Other than that? Vectors are better; don't ever use linked lists
- Unfortunately, linked lists are still a part of this course, so I guess we'll talk about why we might
want to use a linked list
In C++ arrays are complicated. Remember from earlier in this quarter, like Hw3 or
something where you had to build your own resizing array class or whatever
Wasn't that such a pain? Let's say our array holds 5 elements and you want to add a 6th
element
BAM. Now you have to create a new array, copy everything over, etc etc, what a pain
What if we want to insert a new element? For example, let's say we want to insert 5 at the
3rd index
Well sure, we can just do array[2] = 5
But what about the data that was already at array[2]? Well, we have to set array[3] to
array[2]
And array[3]? We have to set array[4] to array[3]
Basically, we have to move everything down one EVEN IF we don't have to resize.
STILL irritating.
Essentially, inserting elements into an array is an O(n) operation, because while we might
not have to move n elements, the number of elements we have to move is directly
dependent on the size of the array
With linked lists, however, all we have to do change the pointers around
Suppose we have two Nodes, a and b, and we want to insert a third one c between
them
Currently, a points to b
We want to end up with a pointing to c and c pointing to b
To accomplish this, we do the following
Make c point to b
Make a point to c instead of b
Ta-da! We're done
No matter how many Nodes are in our linked list, insertion will always involve a constant
number of steps
Therefore, insertion in a linked list is an O(1) operation
Consider deletion as well
Let's say we have a pointing to c pointing to b
We want to end up with a pointing to b
To do this, we just
Make a point to b
Delete c
Again, notice how the number of steps does not depend on the number of elements
in the list
Therefore, deletion in a linked list is an O(1) operation
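In code, those pointer moves look something like this (a standalone sketch with a bare-bones Node struct rather than his actual Node class, and it assumes a really does have a node after it):
struct Node {
    int data;
    Node* next;
};

// insert c between a and b, where a currently points to b: two pointer changes, O(1)
void insert_after( Node* a, Node* c ) {
    c->next = a->next; // make c point to b
    a->next = c; // make a point to c instead of b
}

// delete the node right after a (a -> c -> b becomes a -> b): O(1)
void delete_after( Node* a ) {
    Node* c = a->next;
    a->next = c->next; // make a point to b
    delete c; // delete c
}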
- TL;DR, the above text is summarized in the table below
                  Insertion   Deletion   Accessing/Indexing
Arrays/Vectors    O(N)        O(N)       O(1)
Linked Lists      O(1)        O(1)       O(N)
- These advantages will probably show up in a multiple choice question somewhere, so know
these :P

Node
- The Node class has two attributes: an int storing data called 'data' and a pointer to the next node
called 'next'
- It also has two friend classes: Iterator and LinearList
Basically, what this means is that both the Iterator and LinearList classes have full access to
it
If you're writing a method in LinearList or Iterator, you can directly manipulate the Node
pointer because they're friends
- If you have to visualize it, draw a circle and write a number in it
That number is the data
Now, draw another circle and write another number in it
That would be another Node
Draw an arrow from the first circle to the second circle
This is the pointer to the next Node

In the picture above, the green '5' is the data for the first node, and the blue arrow is the
pointer to the next Node
- Every Node HAS to have a pointer. If there is no other Node to point to, then it will point to NULL
If it points to NULL, then it is the very last Node in the list
In the picture above, the two black dashed lines represent NULL, and the red arrow
represents this Node pointing towards NULL
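- Based on that description, the class declaration probably looks something like this (the constructor is my guess at how the data gets filled in):
#include <cstddef> // for NULL

class Node {
public:
    Node( int d ) : data( d ), next( NULL ) {} // guessing it sets data and NULLs the pointer
private:
    int data; // the value stored in this node
    Node* next; // pointer to the next Node (NULL if this is the last one)
    friend class Iterator; // both friends get direct access to data and next
    friend class LinearList;
};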


Iterator
- For the professor's classes, iterator is kind of a misnomer
- The Iterator is a kind of container for a Node. It contains methods to manipulate the Node for you,
so you don't have to directly mess with pointers. It'll help to know how to do it with Iterators and
without Iterators.
DON'T assume that its only purpose in life is to iterate through the LinearList; this is false
Just treat it as a container for a Node
- An iterator contains two attributes, a Node pointer 'position', and a LinearList pointer 'container'
The 'position' attribute is a pointer to the Node that it is currently at
In the picture above, pretend the orange arrow is pointing from an Iterator
Then, the orange arrow is the Iterator's Node pointer to the first Node
- An iterator can access the data and next variables of its Node pointer by doing the following:
Iterator t = l.begin(); // l.begin will return an iterator at the beginning of the list, I'll talk more
about this later
t.position->next; // This will access the Node's next pointer
t.position->data; // This will access the Node's data
- Let's break down the syntax for the commands above
When we create a new Iterator t, we create a new object with a member variable position
The member variable position is a member of t, so we access it using the dot operator
HOWEVER, it is a pointer, which means it is NOT actually the Node object, but merely a
POINTER to it
To access the data from a pointer, we have to dereference it using the * operator
Therefore, *(t.position) will return the actual Node object itself
Now, we can access the next and data by using the dot operator
(*(t.position)).next will be the next pointer and (*(t.position)).data will be the data variable
(the extra parentheses matter, because the dot operator would otherwise bind before the *)
Remember that -> is shorthand for "dereference, then use the dot operator"
- The Iterator class contains three functions
next()
get()
operator==
- The next() function will "iterate" the Iterator to the next Node by going wherever the current
Node's next pointer points
In the picture far above, calling next() would cause the orange arrow to point to the circle
with the red '6' inside of it
It works by looking at the object it is pointing to right now (the circle with the green '5')
Then it looks at where that object's next pointer is pointing to
The blue arrow from that circle points towards the circle with the red '6'
Therefore, it will switch its position pointer to point at the new circle
The code for this is below:
void Iterator::next(){
assert(position != NULL);
position = position->next;
}
Note that there is an assert(position != NULL)
This ensures that the Iterator doesn't push past NULL
Let's say the Iterator is pointing at the circle with the red '6' already
What happens when we try to call the next() function?
Well, we look at where the red arrow is pointing (NULL), and set our position to that
Does this throw an error?
NO, it DOESN'T (gotcha if you said yes :P)
This is perfectly fine, and this is how l.end() works actually
l.end() (I'll talk more about this under the LinearList section) returns an Iterator
where the position is NULL
So why'd I bring this up?
Consider what happens if we try to call next() AGAIN
Recall that NULL doesn't actually mean anything; in CS, it's basically an object
representation of nothing
Therefore, it doesn't have a position pointer
Notice the lack of arrows coming from the dashed black lines
If we tried to follow one (which doesn't exist), we'd almost certainly get an error
Fortunately, we have a line of code saying assert(position != NULL)!
If our position IS NULL, which it is at this point, then the condition inside the assert is false
The assert function will then crash our program and report an error
Why'd we go through this convoluted process if we're going to get an error anyways?
Apparently the professor really likes the assert function for some reason, and I guess
having an intentional error is better than an unintentional error?
I guess it's more, "it's the thought that counts"; by adding that line of code, we make
sure it's working as intended and that we aren't trying to call next() on NULL
TL;DR - if the Iterator is NOT currently pointing at NULL, then it will try to point at whatever
the pointer is pointing to
Otherwise, it will crash and burn because of the assert function
- The get() function will return the data at the current position
This is a pretty simple function; it simply returns the data at the given location
The code for this is below:
int Iterator::get()const{
assert(position != NULL);
return position->data;
}
Pretty simple stuff; it accesses the data at the current position pointer and returns it
Remember that position is a POINTER, not an object with actual data
Therefore, we have to dereference it first by using the * operator
(*position).data, basically (again, note the parentheses)
Again, -> is shorthand for that, so that's why we write position->data
Also, we must check that the Iterator currently points to a Node before attempting to get its
data, hence the assert function
- Finally, we overload the operator== to compare an iterator's position
Remember that operator== is NOT overloaded by default
The code for this is below:
bool Iterator::operator==(const Iterator& it)const{
return position == it.position;
}
We compare the current Iterator's position pointer and check whether it's pointing to the
same place as the parameter's position pointer
If yes, return true; if no, return false; that's all there is to it

LinearList
- Finally, we get to the meat of the entire thing
- Thing is, on the midterm, he doesn't actually ask you about the LinearList
You're only responsible for knowing how to use his Node and Iterator classes
- Still, we'll cover his LinearList
- In his lecture notes, his LinearList class is pretty straightforward
class LinearList {
public:
LinearList();
private:
Node* head;
};

LinearList::LinearList() {
head = NULL;
}
Remember that the LinearList class is a friend of both the Iterator and the Node classes, so it has
direct access to their methods and member variables
So what's this whole LinearList business all about?
The LinearList is basically a container for all of the Node objects
Think of each Node as a link in a chain; then the LinearList is the entire chain
It stores a Node pointer called head to indicate where the list begins
After that, it's up to each individual Node to indicate where the rest of the chain is
Again, returning to our chain link example, if you know where the beginning of the chain is, you
can find everything in the chain because you can see what each individual link is chained to
Let's cover some operations!

Operations
Insertion of the First Node
- Let's say the list is empty right now and we want to insert our first node
- When we first create a LinearList, the head Node pointer points to NULL because there's nothing
there
- When we insert our first node, we want the head pointer to point to our new Node because this
will be our new head
- Therefore, if we wanted to insert a new Node with value 5:
head = new Node(5);
head->next = NULL;
- The second line isn't required because I believe the Node constructor already initializes it to NULL
by default
Nonetheless, he has it in his lecture notes, so why not; let's be redundant for the sake of
being redundant
- Here's a pictorial representation to make it easier to visualize:

Insertion at the Front of the List


- Okay, what if we already have a head pointer pointing to another Node and we want to insert a
new first?
- Well, we know that we need the head pointer to point to our new Node
- However, we also need to keep the current head chained to the LinearList
This means we have to set our new Node's next pointer to the current head
- The code to do this is below:
Node* temp = new Node(2); // creates a new Node with value 2
temp->next = head; // the new Node will now point towards the current head
head = temp; // the head pointer now points towards our Node!
- More pretty pictures

Insertion between two Nodes


- Let's say the list is not empty and we want to insert the new Node between two nodes
Let's call the left Node "current"
We want to insert our new Node right after current
- This means that we want current's next pointer to point towards our new Node instead of the
Node it's currently pointing to
- However, we also don't want to lose track of the right Node either; we need to set our new Node's
next pointer to that Node to maintain the chain
- Here's the code:
Node* temp = new Node(8); // creates a new Node with value 8
temp->next = current->next; // makes our new Node's next pointer point to what the current
Node is pointing to
current->next = temp; // makes current point to our new Node

- And that's all there is to it!


- Really, it's just keeping track of things; at every point in the process, make sure you have some
way to refer to every single Node you'll be working with

Removing the first Node


- Let's say we want to remove the first Node from our list
- Accessing this is easy; this is just our head Node
- However, we can't lose track of the rest of the list; we need to make sure that we have some way
to point to it
- When in doubt, create a lot of extraneous Node pointers :D
Node* temp = head; // create a new Node pointer called temp that points to the head Node
head = temp->next; // make the head pointer point to the next Node in the list
delete temp;

- This is what the list looks like right before deleting temp
- Again, note that we always have a backup pointer! After we delete temp, the temp pointer will be
useless because it won't be pointing to anything anymore

Removing a Node in the List


- Let's say we want to delete the Node right after a given Node called prev
- To do this, we make a pointer to that Node (the one we're going to be deleting)
- Before we delete, however, we need to make sure that prev is pointing to the next Node in the list
THEN we delete it
- In the example below, we'll be given two pointers, prev and current
We want to delete current and we're given prev
In this example, current is given for reasons unbeknownst to me
If on the exam you AREN'T given a pointer to the Node you actually want to delete, that's
not difficult!
Just add the following line:
Node* current = prev->next; // create a new node pointer that points to the Node after prev
- And here's the rest of the code for the deletion process
prev->next = current-> next; // link the prev Node to the Node after current
delete current;

- This is what it looks like right before deleting current


- Again, deleting current will render the pointer useless because it no longer points to anything
- We need to make prev point to what comes next before deleting it to avoid losing any way to
reach the 5 Node

Iterator Operations
- On the test, doing everything the way we did it above is FINE.
- You will NOT have to do it using an iterator if you don't want to
- However, it's useful to at least learn how iterators work because they'll be used a lot in the last
third of the course
- I know we already covered the iterator class, but a quick refresher
Iterators are a little wrapper class around a Node pointer
All an Iterator does is hold a pointer to the Node we currently want to access
This allows us to get data and iterate through a Linked List without directly touching any of
the pointers
This is a safer approach! Directly manipulating pointers is dangerous
Of course, the only reason to actually USE C++ is pointers; go use a safe language like
Java if you want to be safe
- Anyways, before we continue with iterators, let's add two more functions to the LinearList class
that I omitted earlier
begin()
This function returns an iterator to the head Node
Okay, what it ACTUALLY does is create a new Iterator whose position pointer is a copy of the
head pointer, so it points at the first Node
Since Iterators are compared by their position pointers (operator==), that's effectively the same thing
I'll demonstrate how to use this in a sec
end()
The opposite of begin(); it returns an iterator to the Node AFTER the last Node in the list
Basically, it creates an iterator that points to NULL
Consider the last Node in the LinearList; every Node has to point to something, but what
does the last Node point to?
Well, there's nothing for it to point to, so it points to NULL
This is why end() returns an iterator at NULL; when we iterate through, we check if our
iterator equals end
Why not point to the last node?
Consider a for loop; our loop condition is going to be iterator != end()
If end were a pointer to the last Node, then once our iterator reaches the last Node, it
will quit out of the loop before executing the code in the for loop
This means we wouldn't actually perform any operations on the last Node
This is why we have to point to NULL, because after we perform operations on the last
Node, we visit the last Node's pointer
Guess where that points? NULL!
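- He doesn't show the begin() and end() implementations in this part of the notes, but based on the
description above, a minimal sketch could look like this (assuming the Iterator has a default
constructor and that LinearList, as a friend, can set position directly):
Iterator LinearList::begin() {
    Iterator it;
    it.position = head; // points at the first Node in the chain
    return it;
}

Iterator LinearList::end() {
    Iterator it;
    it.position = NULL; // one spot PAST the last Node
    return it;
}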
- Now that we have these two functions, let's see how they're used
- Here's a for loop using his LinearList class to iterate through the entire LinearList
LinearList l;
for( Iterator itr = l.begin(); !(itr == l.end()); itr.next())
- You'll notice some strange things in this for loop, probably (or maybe not)
- The first part looks fine; we create an iterator and set it equal to the beginning of the list; no
problem here
- You may be wondering why we have to use !(itr == l.end()) instead of itr != l.end()
The reason is that Ouellette never overloaded the != operator
We only overloaded operator==, not operator!=, so saying itr != l.end() doesn't actually
mean anything
This will cause a compiler error because operator!= is not generated automatically; the function simply doesn't exist
We did overload operator==, however, and we can negate the whole thing by wrapping it in
parentheses and prefixing it with a not
- Finally, we use itr.next() to advance the iterator
- The syntax is stupid, but I don't think he'll ask you to do something like this on the test; looking at
his past midterms, he's only asked you to use Nodes and Iterators (if you so desire) to swap Nodes
around
- Also note, if you want to get the data from an Iterator, use the get() function
- Therefore, a loop to print out all the values in a LinearList would be:
for( Iterator itr = l.begin(); !(itr == l.end()); itr.next())
cout << itr.get() << endl;

STL LinkedList, Queues, and Stacks


- Unlike his homebrew LinearList class, Ouellette will NOT give you an API for the STL classes
Fortunately, for the final you can just write these on your notecard
Unfortunately, this stuff's on the midterm too, so you'd better know this stuff :D
There will be one short answer question on the midterm asking you to write some function to do
something using one of the STL classes

Linked Lists
- So we went over what a linked list is conceptually above
- Fortunately, some very wonderful people have created a standard library version of the LinkedList
for us to work with
- Note that the STL linked list is a doubly-linked list; this means that unlike Ouellette's
LinearList, every Node contains a pointer to its predecessor
This means that iterators can move BACKWARDS in addition to moving FORWARDS
We'll get to the iterators in a sec
- To use a STL linked list, simply #include<list>
- You can then create a linked list of ints with the following notation: list<int> l;
- What's great about the STL linked list is all of the methods we want are already implemented for
us!

Linked List Methods / Iterator


push_back( element );
Does whatever you think it probably does; appends the element to the end of the list.
Example:
list<int> l;
l.push_back(2); // list is now 2
l.push_back(7); // list is now 2 7

push_front( element );
Similar to push_back, but it pushes it to the front of the list!
Yeah, probably not going to be using this one
- Before I can give you the rest of the commands, I have to teach you how the STL iterator works
You're not going to like this syntax
To create an iterator for a list, it would be type list<int>::iterator
Remember the begin() and end() functions for the LinearList? They make a reappearance
Let's say we want to iterate through a Linked List; we would use the following for loop
for( list<int>::iterator itr = l.begin(); itr != l.end(); ++itr )

- Yeah, I know that looks complicated, but you'll get used to it! Let's break it down
The first part is the type; where we would normally say int i, we instead have an iterator itr
We have to specify the type of the iterator; a list<int>::iterator will not work for vectors
Incidentally, you can also use vector<int>::iterator to iterate through vectors if you so
desire!
Anyways, once we create our iterator itr, we need to set it to a starting value
It's good to set this to l.begin(), since that returns an iterator to the first element in
the list
Note: l.end() does NOT point to the last element in the list; it points to the spot that
the last element in the list points to
We then continue to iterate until our iterator == l.end(); basically, it has finished iterating
through every element in the linked list and is now pointing to a point after the last element
- See, not so bad :D
- What if we want to go backwards? He might ask you to print out a linked list backwards
Well, we can't just do
for( list<int>::iterator itr = l.end(); itr != l.begin(); --itr )
The problem with this is that l.end() does not point to the last element; rather, it points to a
point after the last element
Also, l.begin() points to the first element; however, we need it to point to a place BEFORE
the first element to know when to stop
- Seriously, it's a hassle to do this. Fortunately, reverse_iterators exist :D
for( list<int>::reverse_iterator itr = l.rbegin(); itr != l.rend(); ++itr )
This will iterate through the list in reverse order. Remember this syntax and notation,
because it will be useful in the future as well
- Anyways, now that we got the for loop structure down, how do we actually access the data at
each point?
- Simple, just dereference the iterator
- The following code will print out every single item in a Linked List
for( list<int>::iterator itr = l.begin(); itr != l.end(); ++itr )
cout << *itr << endl;
- Remember this notation because it'll be useful.
- Here's a sample problem from one of his old exams: print out every other element in a linked list
There are two ways to do this, the lazy way and the way he probably intends you to do it
Lazy way:
int count = 0;
for( list<int>::iterator itr = l.begin(); itr != l.end(); ++itr, ++count )
if( count % 2 == 0 )
cout << *itr << endl;
- Basically, iterate through the entire loop and only print it out if count % 2 is even
- I told you it was lazy :P
- Here's the way he probably intends you to do it
Also note, while we can use ++itr and itr++, we CANNOT use itr += 2 to advance forward two
spaces (list iterators can only step one element at a time)
Unless you use std::advance from the <iterator> header, which you aren't going to, the ONLY way
you can advance a list iterator is with ++ and --. With that in mind, here's another solution:
for (list<int>::iterator itr = l.begin(); itr != l.end(); ++itr)
{
cout << *itr << endl;
++itr;
if (itr == l.end())
break;
}
- Basically, we have to manually increase the iterator to skip every other element
- HOWEVER, remember that the for loop only performs the end condition check (the itr != l.end())
AFTER incrementing
- If we didn't have that if statement there, we could potentially increase itr to the end with the ++itr
inside the code block
- Suppose we had 3 elements, 1 3 5
- We begin at 1
No problem, inside our code block we print out 1 and then advance to 3
- We return to our for loop code, which advances us to 5
It checks if we're at the end, which we aren't because we're at 5
Okay, now we run the code inside the for loop
Print out 5, then increase itr
We are now pointing to the spot after 5, which is the past-the-end position
Basically, we've reached itr == l.end()
Since we have the if(itr == l.end()) check, we're safe; the code will break
IF WE DIDN'T, however, we would return to the for loop code which would attempt to
iterate forward once more
However, incrementing an iterator that is already at l.end() is undefined behavior, so the
program will most likely crash and we will be sad
Moral of the story: REMEMBER TO CHECK for the end condition whenever you manually
increment itr
- Here's another problem from another one of his tests
- Define a function to compute and return the maximum of the data values of the given STL list l;
assume it is nonempty
- Easy enough, right
- Our pseudocode will be to create a temporary int that will always hold the largest value
We will iterate through the list until we find larger values; then we will replace our int with
that value
At the end, we just return our int
- Here's the code in action
int getMaxOfValues( list<int>& l )
{
list<int>::iterator itr = l.begin();
int max = *itr;
for( ++itr; itr != l.end(); ++itr )
{
if( *itr > max )
max = *itr;
}
return max;
}
- Basically, we know our list is never empty, so there must be at least one element in the list
- Since we need to return something and that something has to come from our list, it makes sense
to make that the first element in the linked list
- Why can't we just do int max = 0?
Well, what if the list only consisted of negative integers?
Then nothing in the list would be greater than 0, which means the function would return 0
which was the default starting value
However, 0 isn't in the list, so our output would be incorrect
- To resolve this, we set our max value to the first element in the linked list
We know that there has to be at least one
If there's only one value, then great; this is our max value
If there's more than one value, then we iterate through the rest
In our for loop, you'll notice it looks pretty standard except for the first part
I used ++itr as the initialization of the counter variable
Since the prefix operator returns the variable after performing the increment, ++itr returns
an iterator that points to the spot directly after l.begin()
Basically, the second element
We already got the first element by setting max to it, so we don't need to iterate with
that again
- Now that we have iterators out of the way, here's the rest of those useful list functions!
insert( iterator, value );
This will insert a new node with the given value BEFORE the given iterator
Example:
list<int> l;
l.push_back(3);
l.push_back(5); // list consists of 3 5
list<int>::iterator itr = l.begin(); // itr points to 3
++itr; // itr now points to 5
l.insert( itr, 4 ); // list now consists of 3 4 5
To insert to the end of the list, just pass l.end() as the argument
Example:
list<int> l;
l.push_back(3); // list consists of just 3
l.insert( l.end(), 4 ); // list now consists of 3 4
Of course, I don't see why you'd do that when you could just use push_back but okayyyyy
erase( iterator );
This will delete the node at the given iterator
Example:
list<int> l;
l.push_back(3);
l.push_back(4);
l.push_back(5); // list consists of 3 4 5
l.erase( l.begin() ); // list now consists of 4 5
- Here are some more commands that he didn't include in his lecture slides, but I believe will be
very useful
front();
Returns the value of the first element in the list, so you don't have to create an iterator at
l.begin() and dereference it
back();
Returns the value of the last element in the list
pop_back();
A classic from the vector; deletes the last element

reverse();
Reverses the order of the elements in the list
empty();
Returns whether the list is currently empty
size();
Returns the number of elements in the list right now
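- None of these need an iterator; here's a quick example of my own (not from his notes) showing how
they fit together:
list<int> l;
l.push_back(1);
l.push_back(2);
l.push_back(3);            // list is 1 2 3
cout << l.front() << endl; // prints 1
cout << l.back() << endl;  // prints 3
cout << l.size() << endl;  // prints 3
l.reverse();               // list is now 3 2 1
l.pop_back();              // list is now 3 2
cout << l.empty() << endl; // prints 0 (false), since the list still has elements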
- One final test question to make sure you understand linked lists
Given a list of numbers (for example, 1 3 5), convert that into the following list (1 3 3 3 5 5 5
5 5) where each element is duplicated a number of times equal to its value
Also, you should delete any 0s that you find in the list
- This is good practice to try on your own!
- Here's the solution:
void convertList(list<int>& l) {
    for (list<int>::iterator itr = l.begin(); itr != l.end(); /* advanced inside the loop */)
    {
        if (*itr == 0)
        {
            list<int>::iterator temp = itr;
            ++itr;          // step off the node we're about to erase
            l.erase(temp);  // itr already points at the next element, so don't advance again
        }
        else {
            for (int i = 1; i < *itr; ++i)
            {
                l.insert(itr, *itr); // insert the extra copies BEFORE the current element
            }
            ++itr; // move on to the next element
        }
    }
}
- This is not the most elegant solution, but it is currently 2:40 AM and I don't really care too much
anymore
- Again, whenever you erase an element or advance an iterator by hand inside a loop, make sure you
never advance past l.end() and never skip over an element you still need to process
Why do we advance the iterator BEFORE erasing, you might ask?
Well, if we didn't, then after we erase the node the iterator points at (the 0), our iterator
wouldn't be pointing at anything valid anymore
Trying to advance it after that is undefined behavior and will usually crash, because the
node it used to point at no longer exists

Queue
- Queue - a data structure where data is organized in a linear fashion and items are added at the
end of the line and items are removed from the front of the line
- This is a FIFO data structure (first in, first out); think of it the same way you think of FIFO
inventory :D
- Probably doesn't help at all. Let's think of any arbitrary line for a bank or something
When you stand in line, you join at the end of the line
People are removed from the front of the line when they are called to the window or
whatever
You move up in the line as people before you are removed
Eventually you reach the front of the line and are called upon as well
- Queues work in exactly the same way in that items are removed in the exact order that they are
put in
- Fortunately, he doesn't have his own homemade Stack and Queue class (although you will have to
implement your own for your homework)
- You can just use the STL Stack and Queue!
- To use a STL Queue, #include<queue>
- Create a queue in the same manner that you create a vector or a list
queue<char> q;

Queue Functions
push( element );
Adds an element to the end of the queue. Note that this is push, not push_back like a
vector. This is because for lists, you can use push_back to push to the end of a list and
push_front to push to the front of a list. With queues, there is only ONE data entry point,
the end of the queue.
Example:
queue<int> q;
q.push(5); // q contains 5
q.push(7); // q contains 5 7
q.push(9); // q contains 5 7 9
front()
Returns the element from the front of the queue WITHOUT removing it.
Example:
queue<int> q;
q.push(5); // q contains 5
q.push(7); // q contains 5 7
q.push(9); // q contains 5 7 9
cout << q.front(); // will print out 5
size()
Returns the number of elements currently in the queue.
pop()
Removes the first element from the queue WITHOUT returning it. Again, note that it is pop
and not pop_back or pop_front because unlike lists, you can only pop from one
direction, the front.
Example:
queue<int> q;
q.push(5); // q contains 5
q.push(7); // q contains 5 7
q.push(9); // q contains 5 7 9
q.pop(); // q contains 7 9
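- Since front() and pop() are separate operations, the usual way to empty out a queue is a little loop
like this (my own example, not from his notes):
queue<int> q;
q.push(5);
q.push(7);
q.push(9);
while (!q.empty()) {
    cout << q.front() << endl; // look at the front element...
    q.pop();                   // ...then remove it
}
// prints 5, 7, 9: the same order they went in (FIFO)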

Stacks
- Stack - a data structure where data is organized in a pile of sorts, where new items are added to
the top of the pile and old items are removed from the top of the pile
- This is a LIFO data structure (last in, first out)
- Think of one of those stacks of plates at a buffet
You always take off plates from the top
When the restaurant waiter guy comes to add more plates, they push them onto the top of
the stack
- Again, we're given an STL stack class (yay); just #include<stack>
- To create a stack: stack<char> s;

Stack Functions
push( element );
Similar to the queue's push in that there is no push_back or push_front. However, the push
command will push things onto the TOP of the stack.
Example:
stack<int> s;
s.push(5); // s contains 5
s.push(3); // s contains 3 5
s.push(1); // s contains 1 3 5
top()
Returns the element at the top of the stack, but does not remove it from the stack.
Example:
stack<int> s;
s.push(5); // s contains 5
s.push(3); // s contains 3 5
cout << s.top(); // will print out 3
size()
Returns the number of elements currently in the stack.
pop()
Removes the top element from the stack without returning it.
Example:
stack<int> s;
s.push(5); // s contains 5
s.push(3); // s contains 3 5
s.pop(); // s just contains 5 now
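- Same idea as the queue: top() and pop() are separate, so emptying a stack looks like this (again my
own example), and the elements come back out in reverse:
stack<int> s;
s.push(5);
s.push(3);
s.push(1);
while (!s.empty()) {
    cout << s.top() << endl; // look at the top element...
    s.pop();                 // ...then remove it
}
// prints 1, 3, 5: the REVERSE of the order they were pushed in (LIFO)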

Stacks and Queues


- So now that we've covered the two of those, let's get into how you would use them
- Also, something to note: stacks and queues work by having another data structure underneath the
hood
By default, the STL stacks and queues work by using a deque as their underlying data
structure
In the homework, you will implement them using a linked list
- Anyways, here're some practice problems using stacks and queues!
- The one he has in his lecture notes is to check whether parentheses are balanced
The code below is going to be my personal code and not his because I'm practicing for the
final
This is actually easier to do without a stack or queue, but let's just do it with a stack because
why not
and it turns out that my code is almost identical to his. Great.
bool delimitersAreBalanced(char left, char right, istream& in) {
stack<char> s;
char c = in.get();
while (!in.eof())
{
if (c == left)
s.push(c);
else if (c == right)
{
if (s.empty())
return false;
else
s.pop();
}
c = in.get();
}
return s.empty();
}
- Here we have our lovely function to check if delimiters are balanced
We first get a character because something something that's how streams work
Check the streams section for a refresher (I know I'm going to have to :( )
Every time we see a left delimiter, we push it to the stack
Every time we see a right delimiter
We check if the stack is empty
If it's empty, that means there is no left parenthesis to match our right parenthesis
Return false!
If it isn't empty, that means that we're safe, so we remove one of the left parens from
the stack
At the end of the loop, we check if there's anything left in the stack
If yes, that means there exists one or several unmatched left parentheses, which means we
return false
Otherwise, if we're empty, return true!
- Another problem he has you do is to perform a change of base
Yes I'm aware this is from the homework, but I'm out of problem ideas :D
In decimal (what we normally use), we use the digits 0 through 9
This is a base 10 system because we have 10 possible digits
We combine these digits to make numbers; so 19 is just 1 * 10 + 9
253 is 2 * 100 + 5 * 10 + 3
For other bases, we do something similar
Consider binary; there are only two digits, 0 and 1
This means we multiply by some power of 2
For instance, 110101 in binary is equivalent to: 1 * 32 + 1 * 16 + 0 * 8 + 1 * 4 + 0 * 2 + 1 * 1
This equals 53
Our pseudocode for this algorithm will be to divide the number by the base; the remainder
will be the lowest (rightmost) digit of the answer
Repeat again with the quotient; the next remainder is the next digit up, and so on
The last remainder we get is the most significant digit, which is exactly why a stack helps:
we push the digits as we find them and pop them back off in reverse order
Let's begin! I'm going to use the STL stack, whereas you have to use the homemade stack for
the homework so it's not exactly the same
unsigned int changeBase(unsigned int n, unsigned int base) {
stack<int> s;
while (n > 0) {
s.push(n % base);
n /= base;
}
int result = 0;
while (!s.empty()) {
result *= 10;
result += s.top();
s.pop();
}
return result;
}
So basically, I do what I said I'd do above
My first while loop is continuously dividing the number by the base and storing the
remainders
Remember that since n is an int, dividing will truncate any remainders
If it weren't an int, then we would never quit the loop :(
Once all the remainders are stored in the stack
Then I pop them off one by one, multiplying each one by 10 to move it forwards
Alternatively, I could have used a string and concatenated them to the end
However, it's easier to just multiply by 10 here
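As a quick sanity check (my own example, not from his notes), converting the 53 from the
binary example above:
cout << changeBase(53, 2) << endl; // prints 110101
// remainders pushed: 1, 0, 1, 0, 1, 1; popping them back off in reverse builds 110101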
- Another problem he has you do is to check whether a string is a palindrome
For instance, racecar is a palindrome because it reads the same forwards and backwards
The pseudocode for this is rather simple as well; we want to read the string from the front
and the back at the same time
And I just realized that this could be accomplished much quicker using a single for loop and
a vector
Oh well, let's do this with stacks and queues!
We're basically going to push every char into a stack and a queue at the same time
Then we loop through the stack and queue, using the stack's top method and queue's
front method
If at any point the two don't equal each other, then we've got a problem and we can
return false
Otherwise, we'll return true if we can iterate through the entire string with no
problems
The code is below

bool isPalindrome(string phrase) {
stack<char> s;
queue<char> q;
for (int i = 0; i < phrase.length(); ++i) {
s.push(phrase[i]);
q.push(phrase[i]);
}
while (!s.empty()) {
if (s.top() != q.front())
return false;
else {
s.pop();
q.pop();
}
}
return true;
}
That's all there is to it! Nothing too difficult
- Do note, something he might ask on the final is to read reverse polish notation using a stack
- The algorithm for this is later in the study guide; you'll know it when you see it
- Other than that, stacks and queues really aren't all that difficult; they're very simple data
structures with a singular purpose

Templates
- The only reason you care about this for the midterm is so you can find errors
- He might ask you to template things on the final though as a free response question
Let's hope he doesn't :(

Overview
- Ever write a piece of code for integers and think to yourself, man, I really wish this worked for
doubles as well?
- Probably not.
- For the sake of argument, let's say you did

- ^obligatory XKCD
- Anyways, the purpose of templating is to make functions work in general cases
- For instance, suppose we have the following function
int sum( int a, int b ) {
int result = a + b;
return result;
}
- Not a very useful function since we could just do a + b, but roll with it
- Anyways, our function only works for ints. What if we wanted to do the exact same thing using
doubles instead?
- Well, we'd have to write a whole new function with doubles
double sum( double a, double b ) {
double result = a + b;
return result;
}
- How irritating, right? I found that irritating and I copied and pasted that snippet too :|
- Well, with templating, we can generalize code for an arbitrary variable type!

Function Templates
- Yes, there are other types of templating too which we'll get to later
- To make a function a templated function:
Use the keyword template followed by a list enclosed in <> of type parameters
Replace every occurrence of a type to parametrize with the appropriate type parameter
- For our function above:
template<typename T>
T sum( T a, T b ) {
T result = a + b;
return result;
}
- And that's all there is to it!
- Note: C++ also allows saying template<class T> instead of template<typename T>,
even if T isn't a class (class is simply the older keyword for the same thing)
This is important because he will 100% put this on the test in an attempt to trick you
template<class T> is NOT an error
- In general, templated function definitions should go in the .h file, not the .cpp file
The reason is that the compiler needs to see the whole template definition at the point
where it gets instantiated with a concrete type; hide it in a separate .cpp file and you'll
typically get linker errors
Just remember that; he'll have you do that for your homework
- Also in general, the compiler is smart enough to know what argument type you are trying to use
int m = 2; int n = 3;
cout << sum(m, n); // will properly templatize this for ints
double x = 2.0; double y = 3.2;
cout << sum(x, y); // will properly templatize this for doubles
- What if we try to mix variable types?
- The compiler is also smart enough to throw an error!
- For example:
cout << sum(m, x);
- This will fail because m is type int and x is type double
However, when we templatized the function, we said the args were (T a, T b)
This means that both a and b are of type T
Since a is type int when we call sum(m, x), then b must also be type int
However, this is not the case, because x is type double
Therefore, the compiler will get confused and throw an error
- Another thing that may confuse the compiler is inheritance
- Suppose we have an Animal class which is a parent of the Cat class
Animal a; Cat c;
sum( a, c );
- Let's put aside how ridiculous it is that we're trying to add an animal with a cat and pretend the
operator+ was overloaded for them
However, the compiler will be confused because it won't be sure which type to use
Since c is both an Animal and a Cat, both would work
To fix this, we explicitly state the type using <> notation, much like we do for vectors
sum<Animal>( a, c ); // This will work
Note: this is more of an inheritance thing, but note that the following will NOT work
sum<Cat>( a, c ); // Will throw error
Why does this throw an error? Well, a Cat is an Animal, so using type Animal works,
but an animal is not necessarily a Cat
Therefore, this will throw an error because a is not a Cat, and therefore does not have
Cat methods

Specialization
- What if we had a templatized function and decided to overload it for a specific type anyways?
template<typename T>
void sayHello( T param ) {
cout << "Hello World";
}

void sayHello( string param ) {
cout << param;
}
I mean, the templated version should work for every single parameter type, string included
What happens if we try to do this?
Well, this is called specialization!
Specialization - the process of overloading a templatized function with a specific type
Whenever the compiler is greeted with both a templatized version and a version with an explicitly
defined type, it will use the one with an explicitly defined type instead of the templatized version
For example:
sayHello(2); // will print out "Hello World"
sayHello('c'); // will print out "Hello World"

string p = "this is a string";
sayHello(p); // will print out "this is a string"
- To force the compiler to use the templatized version, just explicitly define the type
sayHello<string>(p); // will print out "Hello World"

Class Templates
- Well, we can templatize functions
- Why not take it one step further and templatize classes while we're at it?
- I mean, classes are just collections of variables and functions; we can generalize the types of these
variables and functions for the entire class
- A good example of this is the vector class
Remember how we can create a vector for any type by doing vector<type>?
Basically the same principle for templatizing your own class
- To templatize a class
Prefix the class with template<typename T>
Prefix all of the friends with template<typename T>
Place all of the function definitions in the same header file as the class
- When defining a templated class, remember that you must also place the template type in the
scope
For example, suppose we have the following:
template<typename T>
class Animal {
public:
Animal();
};
template<typename T>
Animal<T>::Animal() {
// put constructor code here
}
Note that when we define the scope of the function, that is, the thing before the double
colons, we have to put the type in <> notation as well
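- Putting the pieces together, here's a small hedged example of my own (not from his notes) of a
templated class with a member variable of type T; all of it, declarations and definitions, lives in
the header:
template<typename T>
class Box {
public:
    Box(T value);
    T get() const;
private:
    T contents; // the member variable is also of the template type
};

template<typename T>
Box<T>::Box(T value) {
    contents = value;
}

template<typename T>
T Box<T>::get() const {
    return contents;
}

// usage: Box<int> b(5); cout << b.get(); // prints 5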

Common Errors
- Since this will show up as an error problem on the test, I thought I'd highlight the errors he'll use
- Templates don't preserve inheritance
Suppose we have classes Animal and Cat; a Cat is a subclass of Animal
The following code will break:
vector<Cat> cats;
vector<Animal>* v = &cats; // Will throw an error
Just because Cat is a subclass of Animal does NOT mean that vector<Cat> is a subclass of
vector<Animal>; the two vector types are completely unrelated
- Templated functions/classes must have the template<typename T> prefix
Look at the top of the class declaration, the friends, and function declarations for this
If it's missing, then that's an error
Especially lookout for a missing template<typename T> for friends; this is the easiest one to
miss
- You must put <> notation for the scope declaration of a templated class
Basically, the Animal<T>::Animal() example above
Instead of that, he'll write something like:
template<typename T>
class Animal {
public:
Animal();
};
template<typename T>
Animal::Animal() {
// put constructor code here
}
This is a good opportunity to see if you can find the error yourself :P If you can't find it,
carefully compare this code to the code above
- Compiler getting confused on what type to use
Cat c;
Animal a = c;
areEqual( a, c ); // Will throw an error because the compiler doesn't know whether to use
type Cat or Animal
- Must also use <> notation when declaring an object of a templated class
Suppose we templatized the Container class
When creating an instance of the Container object, we have to use Container<type> instead
of just Container
Container h(2); // Will throw an error
Container<int> h(2); // Correct
- Not replacing a return type or variable or something with the template type T
For example
template<typename T>
double Quantity<T>::getValue() const {
return value;
}
See that double there? Yeah, that should be a T
- Not templatizing an argument for a template function
Consider
template<typename T>
T getFirstElement(vector& v) {
// some code here
}
Yeeeeeahhh, that vector there should be vector<T>
He's also done things for custom classes as well
For the following, note that the Quantity class is templatized
template<typename T>
class ClassName {
public:
ClassName( Quantity& q, T value );
};
Since Quantity is a templatized class, it should be Quantity<T>
- Again, note that template<class T> is NOT an error

Trees
- Thus marks the beginning of finals materials woo
- At this point, I have no idea what's going to be tested cause I haven't really taken the final yet, so
- However, I was told that you'd be given a picture and you'd have to draw some operations and
such

Overview
- Tree - a tree is a data structure with a bunch of nodes and directed edges that connect them
- If you've taken Math 61, then this should be somewhat familiar?
Basically, it'd be the same as a rooted tree, if you remember what that is
- If not

- This is a pretty good idea of what it looks like


- There is one root node, which is the 2 at the top
Each node has up to two children, a left child and a right child
Each one of those children can have up to two children as well
If a node does not have any children (the 2, 5, 11, and 4 at the bottom), then they are
considered leaves
Leaf/External Node - a node without any children
Internal Node - a node with children
- Note that all the edges point down, with none pointing up; this is because parent nodes point
towards children, but children nodes do NOT point towards parents
If you are at any arbitrary node, you CAN tell what its children are; you CANNOT tell what its
parent is
- For the most part, we're only going to deal with rooted trees in this class, since unrooted trees
aren't interesting
Rooted Tree - a tree with a designated root node that has no parent; all children can be
traced back up to it
- Remember that the height of a tree is the length of the longest path; in the example above, it is 4
- Also for the purposes of this class, we're only going to work with binary trees
Binary Tree - a tree where each node has at most 2 children

Binary Search Trees


- If we weren't specific enough, for the purposes of this class, we're mostly going to work with
binary search trees as well
Binary Search Tree - a binary tree where every node is larger than everything in its left
subtree and smaller than everything in its right subtree
Also, no duplicate values are allowed in binary search trees
- In the example above, that is NOT a binary search tree; the nodes aren't arranged so that
everything in a left subtree is smaller, and everything in a right subtree larger, than its parent
- The reason we have binary search trees is to organize data in a sorted fashion

- This is an example of a binary search tree


- Let's look at the root node 8
Notice how the left child of 8 (which is 3) and all of its children and all of their children are
smaller than 8
Also notice how the right child of 8 (which is 10) and all of its children and their children are
larger than 8

TreeNode
- So Ouellette has graciously gone and constructed his own TreeNode class, which I'm sure is going
to show up on the final probably
- This is similar to his Node class

- Each TreeNode has an int data, which is the data at that point, and two pointers to the left and
right children
- If the TreeNode does not have a leftChild or rightChild, those will equal NULL
- Uhh, I think that's all
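- His exact TreeNode code was shown as a picture, so here's a hedged reconstruction of my own based
on the description above (assuming, like his Node class, the members are private and the tree class
is a friend):
class TreeNode {
public:
    TreeNode(int value) {
        data = value;
        leftChild = NULL;  // no left child yet
        rightChild = NULL; // no right child yet
    }
private:
    int data;              // the value stored at this node
    TreeNode* leftChild;
    TreeNode* rightChild;
    friend class BinarySearchTree; // the tree manipulates these pointers directly
};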

Insertion
- Binary Search Tree operations are very important since these will probably show up on the final
- Let's start with insertion!
- When inserting, we're obviously going to be given a TreeNode parameter to insert into the tree
- If our tree is empty, then we're going to make this TreeNode the root, since there's nothing else in
the tree
- If our tree is NOT empty (which it isn't 99% of the time)
Compare the value of the parameter TreeNode to the value of the root TreeNode
If our node's value is GREATER than the value of the root, we visit the root's right child
If our node's value is SMALLER than the value of the root, then we visit the root's left child
- We will repeat this process until we reach a point where there is no existing Node at that location
Then we will set the previous Node to point to our new Node
- Suppose we have the following tree:

- Say we want to insert the TreeNode with the value 9


- We start at the root, 6
Since our Node is LARGER than 6, we move to 6's right child, which is 8
We then compare our Node with 8; since our Node is once again LARGER than 8, we move
to 8's right
However, 8 has no right child at the moment; it only has a left child
Therefore, we've found our point of insertion; we will add 9 as 8's right child as so

- And that's all there is to it!


- Here's the code:
bool BinarySearchTree::insert(TreeNode* node, TreeNode*& parent){
assert(node); // check if node exists
if(parent == NULL){
parent = node;
return true;
}
if(node->data < parent->data)
return insert(node, parent->leftChild);
else if(node->data > parent->data)
return insert(node, parent->rightChild);
else
return false;
}
- The function will return true if insertion is successful; otherwise, it will return false
The only time it would return false is if the parameter's data is equal to the node we're
comparing to
Basically, if it's a duplicate
In this case, then it will trigger the else block since it doesn't trigger any of the other blocks
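- Notice the second parameter is a TreeNode*& (a reference to a pointer): when the recursion reaches
a NULL child pointer, assigning to parent writes the new node straight into the tree. The public-facing
call would probably look like this hypothetical wrapper (assuming the class keeps a TreeNode* root
member, which isn't shown in this section):
// hypothetical wrapper; assumes a TreeNode* root member and a TreeNode(int) constructor
bool BinarySearchTree::insert(int value) {
    return insert(new TreeNode(value), root); // start the recursion at the root pointer
}

// BinarySearchTree t;
// t.insert(6); t.insert(8); t.insert(9); // 9 ends up as 8's right child, as in the walkthrough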

Searching
- Well I mean, if the data structure's called a Binary Search Tree, I'd assume there would be some
searching going on here
- Fortunately, the search process with a binary search tree is identical to the binary search that we
went over earlier
- As a refresher, in case you don't remember it, our algorithm was
Start at the center of our data structure
If our value is equal to the one we're looking for, found!
If not, compare the center to our value
If our value is greater than the center's value, recursively search the right half
If our value is less than the center's value, recursively search the left half
Gee, this sounds awfully similar to what we're doing here with our tree
The algorithm for searching a Binary Search Tree is the same! In case you didn't pick up on my
sarcasm :P
Let's try to find 7 in the tree above
We start at the root node 6
Our value 7 is larger than 6; therefore we will recursively search the right half
Basically, just visit 6's right child
Okay, now we're at 8; 7 is less than 8
Therefore, we will search the left by visiting 8's left child
Now we're at 7; oh look, that's equal to 7; we're done!
Wasn't that fast?
Guess the complexity of searching through a BST for bonus points
Yup, O(log N) (at least for a reasonably balanced tree), which is the same as the binary search algorithm
Let's look at our tree; what's the maximum number of comparisons we'll have to make?
Worst case scenario, our node is at the bottom of the tree, since we can't really go
backwards
In the tree above, the absolute worst case would be if we were looking for 3 because we'd
have to make 4 comparisons (with 6, 2, 4, and 3)
This is true for any tree of any size
From Math 61 (you don't have to remember this), the maximum number of nodes a
complete binary tree of height h can have is 2^h - 1
Suppose we have a tree of height 3; then at most, it will have 7 nodes
We will have the root node (1), its two children (2), and each of those children have 2
children (4)

Suppose we have 100 elements; then the height h of the tree satisfies 2^h - 1 >= 100
What do we do now? You guessed it, log :D
h >= log2(101), which is about 6.7, so a complete tree holding 100 nodes has height 7
This means, with a complete binary tree with 100 nodes, it will take at most about 7 comparisons
Note that we used the log operator in this thing and our algorithm is O(log N) complexity
Coincidence? I think not :P
But anyways, I digress; our algorithm is O(log N) complexity because conceptually, we're just
recursively dividing our search in half as we progress
Let's say we were to expand our tree by 1000 extra nodes
We wouldn't have to make 1000 extra comparisons because we keep subdividing our search
area
Therefore, we will grow at O(log N) speed
Here's the code:
TreeNode* BinarySearchTree::find(int value, TreeNode* subtree){
if(subtree == NULL) return NULL;
else if(value == subtree->data){
return subtree;
}
else if(value < subtree->data){
return find(value, subtree->leftChild);
}
else {
return find(value, subtree->rightChild);
}
}

Removal
- This one's a bit tricky, unfortunately
- We can't just find the node we want to delete and delete it; then all of that node's children will
become orphans :(
Note: orphan is not a CS term, I'm just making a joke here, please do not use that term on
the test :P
Actually, that's a really great term though; from now on, I'm going to refer to child nodes
with no parents as orphans :D
- Our deletion process will take two steps
Find our node to delete (this is just the stuff discussed above)
Deal with our node and its children
- What does that mean?
- We have to consider three possibilities
The node we want to delete has:
0 children
If it has no children, well, we can just remove the node no problem
That's exactly what we're going to do; just delete the Node
1 child
If it only has one child, then we can fix this easily; we'll just replace the node
we're trying to delete with our child
2 children
- Let's visualize these three cases

No Children

- Let's say we want to delete 3


- Well, just type:
delete target
- There are no children we have to take care of
- But wait! What about 4? 4 won't be too happy we just took out one of its children
parent->leftChild = NULL;
target = NULL;
- There, now 4 won't remember it has any children
- Therefore, deletion of a no-child node is a two-step process: deleting the node and setting its
parent's child pointer to NULL
- We also have to set the target pointer to NULL so we aren't left holding a dangling pointer

One Child

- Let's say we want to delete 2 now


- This will be a bit problematic since 2 has its own family beneath it, it has a child 4 which has its
own child 3
- Remember that when we constructed this tree, we made sure that every node we inserted fit the
Binary Search Tree property
That means that all of 2's children are less than 2's parent, 6
Therefore, we will have no problem just replacing 2 with 4 and its family
- Remember that we always want a pointer to something
- First, we make 6 point towards 2's children before eliminating it to avoid dangling pointers
parent->leftChild = target->rightChild;

- Okay great, that's done with


- Now we can delete 2 and no one will be any the wiser
delete target;
target = NULL;
- And that's all there is to it!
- Therefore, deletion of a one-child node is also a two-step process; make its parent point to the
target's only child, and then delete the target node
- Again, remember to set the target pointer to NULL to avoid leaving a dangling pointer around

Two Children
- Here's where things get more complicated, although it isn't that bad still!

- Suppose we want to delete 2 now


- We can't just replace it with the left child since the Binary Search Tree condition won't be satisfied
anymore
1 is less than 4, so it wouldn't make sense to have 1 as 4's new parent
- We also can't just replace it with the right child since the right child already has a bunch of
children
- In this case, what you want to do is replace the target node with the SMALLEST element of its right
subtree
In this case, we would want to replace 2 with 3
To find the smallest element of the right subtree, go right once (to visit the right subtree)
and then go as far left as you can
- Then we just copy that smallest node's data into the target node (we're moving data, not nodes)
target->data = replace->data; // where replace points at the smallest node of the right subtree

- Now that that's taken care of, all we have to do is delete 3


- This is just the leaf case, which we can take care of with the process above
- Here's the code:
bool BinarySearchTree::erase(const int& value, TreeNode*& subtree){
if(subtree == NULL) return false;
if(value == subtree->data){
if(num_children(subtree) <= 1){
TreeNode* target = subtree;
if(subtree->leftChild != NULL){
subtree = subtree->leftChild;
}
else{
subtree = subtree->rightChild;
}
delete target;
return true;
}
else {
subtree->data = find_min(subtree->rightChild);
return erase(subtree->data, subtree->rightChild);
}
}
else if(value < subtree->data){
return erase(value, subtree->leftChild);
}
else {
return erase(value, subtree->rightChild);
}
}
- Unfortunately, this one's a lot longer than the other code snippets before it, but let's walk through
it!

if(subtree == NULL) return false;
- Should be self-explanatory; if the subtree is empty (we fell off the bottom of the tree without
finding the value), quit and return false
- Otherwise:
if(value == subtree->data){
// We'll come back to this block in a sec
}
else if(value < subtree->data){
return erase(value, subtree->leftChild);
}
else {
return erase(value, subtree->rightChild);
}
- If we aren't at the node we're trying to delete, then we use binary search to find the node we
want to delete
- Once we find the node we want to delete:
if(value == subtree->data){
if(num_children(subtree) <= 1){
TreeNode* target = subtree;
if(subtree->leftChild != NULL){
subtree = subtree->leftChild;
}
else{
subtree = subtree->rightChild;
}
delete target;
return true;
}
else {
subtree->data = find_min(subtree->rightChild);
return erase(subtree->data, subtree->rightChild);
}
}
- We have two cases: the easy case and the hard case
The easy case is if the node is a leaf or only has one child
Remember how easy that was?
The hard case is when there are two children
- Let's look at the hard case first (I know, counterintuitive but still)
- This is triggered in this block:
else {
subtree->data = find_min(subtree->rightChild);
return erase(subtree->data, subtree->rightChild);
}
- All we do is we replace the current node's data with the smallest value of the right subtree, as we
described above
- Then we go and delete the smallest value of the right subtree, which will either be a leaf or a node
with one child (which, incidentally, has to be the right subchild)
The reason for this is the smallest node can't be a node with two children
If it did have two children, then it has to have a left child (because that is one of the two
children)
Remember that the left child is always smaller than the parent
Therefore, this node can't be the smallest because it has a left subchild that is smaller
than it
- Once we finally do get down to the no-children or one-children case:

if(num_children(subtree) <= 1){
TreeNode* target = subtree;
if(subtree->leftChild != NULL){
subtree = subtree->leftChild;
}
else{
subtree = subtree->rightChild;
}
delete target;
return true;
}
We create a target pointer that points to the current Node we're at, and we helpfully name it
"target"
We check if there is a left child; if there is a left child, then we set the current node to its left child
Otherwise, we set it equal to its right child
But wait! What if there ARE no children, you ask?
Well, in this case, subtree->rightChild will be equal to NULL
Since we're setting subtree = subtree->rightChild, this is equivalent to setting subtree equal
to NULL
This is effectively deleting the node
Finally, we delete the target pointer, and return true because we're done!
What about setting target equal to NULL?
Well, this is Ouellette's code, and the stuff I said earlier about setting target to NULL was in
his lecture slides
Here it isn't actually needed, though: target is a local variable that goes out of scope as soon
as the function returns, so there's no dangling pointer left lying around for anyone to misuse
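- The erase code also leans on two helpers, num_children and find_min, that aren't shown in this
section; judging from how they're called, hedged sketches of my own might look like:
// counts how many of the two child pointers are actually in use
int BinarySearchTree::num_children(TreeNode* node) {
    int count = 0;
    if (node->leftChild != NULL) ++count;
    if (node->rightChild != NULL) ++count;
    return count;
}

// smallest value in a (nonempty) subtree: keep going left as far as possible
int BinarySearchTree::find_min(TreeNode* subtree) {
    while (subtree->leftChild != NULL)
        subtree = subtree->leftChild;
    return subtree->data;
}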

Tree Traversal
- But wait, there's more! We gotta learn how to navigate a tree as well
- This is definitely going to show up on the final (as far as I've heard); you'll be given a tree and
asked to print out what it says using postfix, prefix, and in-order traversal
By the way, if you're curious, all three of the following are depth-first searches
There's one more (level-order traversal, which is breadth-first search) that will come later :P
- What are these fancy terms I'm spewing out?
Let's find out!
I feel that my excitement at this point is far too high, to the point where I'm basically just
Ouellette
It is also like 2 in the morning right now so I'm kinda on a dopamine high off of sleep
deprivation :D

Preorder Traversal
- The point of tree traversal is mainly to iterate through a tree's nodes in a certain order
- In preorder traversal, we follow the following order:
Visit the TreeNode
Recursively visit its left subtree
Recursively visit its right subtree
- Let's say we have the following tree:

- We start at the root node, 6


Our preorder traversal ordering tells us that our first step is to visit the TreeNode
Okay, let's do that; our first data point is 6
- Great! Now what
- We have to then recursively visit its left subtree
- This means that we visit 3; okay, now we're at 3
What do we do now? Again, we're told that our first step is to visit the TreeNode
Okay, so we're at 3; so far, our order is: 6 3
- Now what? Recursively visit the left subtree
- Okay, we're at 1; again, our first step is to visit the TreeNode
Our order: 6 3 1
- Now what?
Recursively visit the left subtree
But wait! There's no left subtree; therefore, we're done here
Okay, let's recursively visit the right subtree?
No subtree here either; therefore we complete our pre-order traversal for this node
- Where did we last leave off? Oh yes, 3
Remember that we left off like halfway through 3's pre-order traversal; we visited the left
subtree but not the right
Let's visit the right now
- We're now at 4, so our order is: 6 3 1 4
Visit left? Doesn't exist
Visit right? Okay, we're at 5
- Our order is now: 6 3 1 4 5
No more nodes, so we're done here
We also finished the pre-order traversal for 3 since we visited its right subtree, so we go
back up one level
- Now we gotta visit 6's right subtree
- I'm going to spare you the details since I think you have it figured out at this point, but we should
end with:
6 3 1 4 5 8 7 9
- Great! So now, why would we ever want to do this?
- The main reason you use preorder traversal is to print out the order of nodes that, when used to
create a new binary search tree, will create the exact same tree as before
- Even though two BS trees may have the same nodes, they could look completely different
- For example, suppose we tried to construct a tree using the same nodes as above but in the order:
1 3 4 5 6 7 8 9
- Well, our root would then be 1
Since 3 is greater than 1, we would move it to the right of 1
Since 4 is greater than 1 and then greater than 3, we would move it to the right of 3
See the pattern? We'll basically end up with a linked list since every additional item is
greater than the right-most element in the tree
- Therefore, ORDER MATTERS when constructing a new BS tree, and preorder traversal is a way to
preserve that order
- I'm not going to prove it to you since you could probably do that on your own; actually, it'd be
good practice to try to reconstruct a binary search tree using the order we derived just now

In-Order Traversal
- The steps for in-order traversal are:
Recursively visit its left subtree
Visit the TreeNode
Recursively visit its right subtree
- Okay, let's try it again with the same tree

We start at our root node 6


HOWEVER, since we're told to first visit the left subtree, 6 is NOT part of our order yet
We're now at 3, but again, must visit left subtree first
Okay, now we're at 1; there is no left subtree, so we're done with that
Next step: visit the TreeNode
Okay, our order begins with 1
Next step: visit the right subtree; there is none, so we can go back one level
Now, we can visit 3, so our order is now 1 3
Now we need to visit 3's right subtree
We're now at 4; try to visit the left subtree
However, it doesn't exist, so we go to the next step
We visit 4; our order is now: 1 3 4


Now we visit 4's right subtree
- 5 doesn't have a left subtree, so we're done there; next step, visit 5
- Our order is now 1 3 4 5
- Okay, now visit 5's right subtree
Doesn't exist, so we're done here; back up
- Since we finished 3, we can finally back all the way up to our root node, 6 and finally visit it
- Our order is now: 1 3 4 5 6
- Now we visit 6's right subtree
- I'm sure you can do this at this point or at least see the underlying pattern that emerges
- At the end, our ordering will be 1 3 4 5 6 7 8 9
- Why is this useful? Well we just printed the whole thing out in order! I think that's kinda useful
Also, given that our tree is already constructed, it is an O(N) operation to print out every
node in order!
Why is it O(N) and not O(log N) or something higher, like O(N²)?
Well, we have to visit every single node at least once, which means that if we were to have a
tree of 200 elements, we kinda have to visit all 200 of them to print them out in sorted
order
Therefore, this is not O(log N) since we have to do this N times
We're also pretty efficient; we visit every single node exactly once, not one time more
Therefore, it's not anything greater than O(N)
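- As code, it's the same hedged sketch as the preorder one (same assumed TreeNode layout); the only change is that the visit happens between the two recursive calls:
void inorder( ostream& os, TreeNode* subtree ) {
    if( subtree == NULL ) return;
    inorder( os, subtree->leftChild );   // 1. recursively visit its left subtree
    os << subtree->data << " ";          // 2. visit the TreeNode
    inorder( os, subtree->rightChild );  // 3. recursively visit its right subtree
}
Calling this on the root of our example tree prints 1 3 4 5 6 7 8 9, the sorted order.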

Postorder Traversal

- Postorder traversal is what produces Reverse Polish Notation (more on that below); it's kind of the opposite of prefix notation
The comic will make more sense after we finish this :P
If you can understand all the comics I insert into this study guide, then chances are you will
be well prepared for the exam :P
Also for citing XKCD references on the interwebs when conversing with the denizens
of the internet
- Anyways, our ordering for postorder traversal is
Recursively visit its left subtree
Recursively visit its right subtree
Visit the TreeNode
- Notice how all three traversals have the exact same 3 commands, just in a different order
Preorder = visit first
Inorder = visit in the middle
Postorder = visit at the end
- Let's see how this all plays out on the same binary tree as before!


- We begin at the root, but we're told to visit the left subtree before doing anything
- Now we're at 3; left subtree!
- Okay, 1; left subtree!
But wait, it doesn't exist; okay fine, right subtree
That doesn't exist either
We can finally visit 1
Order thus far: 1
- Now we back up to 3, but we have to visit the right subtree first
- We're at 4; visit left? Nothing, so visit right? 5
- Okay, at 5
Visit left, nothing, right, nada
We can finally print out 5
Our order: 1 5
- Back up to 4; since we visited the left subtree and right, we can add that to our order now: 1 5 4
- Back up to 3; since we visited its left and right subtrees, add it! 1 5 4 3
- Okay, now we're at 6
- However, we can't visit 6 yet because we haven't visited its right subtree
- We go to 8 and visit its left, which is 7
Left, nothing, right, nothing; add to order: 1 5 4 3 7
- Okay, back up to 8, now visit 8's right subtree
9 is a leaf as well, so we can print it: 1 5 4 3 7 9
- Back up to 8 again, since we've visited both left and right, we add it to the order: 1 5 4 3 7 9 8
- Finally we're at the root, and since we've done everything, we can add it to the order: 1 5 4 3 7 9 8
6
- And we're done!
- So why did we do all this? What's the point of having postorder traversal and why did I post a
comic of a sausage?
One thing at a time!
Postorder is useful for deleting all the nodes in a binary search tree without causing memory
leaks
Let's say you start at the root node and want to delete all the nodes in the tree
Well, it wouldn't really make much sense to delete the root node and then move on
because after you delete the root node, you can't access its children anymore :(
Therefore, we probably want to delete from the bottom up
We want to delete the leaves first, which will make their parents leaves; then
we want to delete those new leaves, and recurse all the way until we get back
to the root
Once we have eliminated the root node's entire family, we can finally delete the root
node
Think of it as some really twisted mafia movie or something
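As code, this bottom-up deletion is just a postorder traversal whose "visit" step is delete; here's a hedged sketch, again assuming the TreeNode layout used elsewhere in this guide:
void destroy( TreeNode* subtree ) {
    if( subtree == NULL ) return;
    destroy( subtree->leftChild );   // 1. wipe out the left subtree
    destroy( subtree->rightChild );  // 2. wipe out the right subtree
    delete subtree;                  // 3. only now is it safe to delete this node
}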

Expression Trees
- But what about the sausage? Yeah, I'm pretty sure you don't care that much, but I do need a way
to tie into the next topic somehow
- One use of binary search trees is expression trees
Basically, consider the expression 1 + 2 * 3
- For some reason, some people find it useful to convert this into a tree, where the numbers are the leaves and the operators are the parent (internal) nodes
- Like so!

- I mean, I guess this is an elegant way to write it out


- Anyways, one application of the three traversal methods we just studied is to print this out in
different ways
- Preorder traversal: + 1 * 2 3
This is called prefix / Polish notation; I don't really see how this makes it any more readable
but okay
- In-order traversal: 1 + 2 * 3
See, this makes sense; I don't get why we can't just use this but okay
Also, this is called infix notation
- Postorder traversal: 1 2 3 * +
This is postfix notation, also called Reverse Polish notation
By the way, I'm not just throwing in these vocab phrases for fun; these are from the
lecture slides, so it actually is important to know what Reverse Polish notation is :P
Notice how we write the numbers first and then the operators, instead of sandwiching the
operator between the numbers?
Yes, that is a deliberate use of the word sandwich.
Hopefully the comic makes a lot more sense now :P
The hot dog should go between the two buns, but instead it falls outside
because that's how it works in postfix notation
- Yeah, these things are kinda terrifying to read, but let's go through how to read them just in case
they do show up on the exam
I'm not going to teach you how to read infix notation, by the way; that should be pretty
straightforward

Reading Postfix
- For this, it'll help to imagine a stack because that's what we're going to work with
- The general idea is:
Every time we get to a number, we push it to the stack
Every time we get to an operator, we pop off the top two values of the stack, with the
second top being the left side of the expression, the operator in the middle, and the top
being the right
Repeat until we finish the whole thing!
- Okay, let's do this
- Let's try to read our 1 2 3 * + abomination
We get 1, push to the stack; our stack is 1
We get 2, push to the stack; our stack is 2 1 (I'm putting the top at the left)
We get 3, push to the stack; our stack is 3 2 1
We get *
Since this is an operator, pop off the two values in the stack with the top being the
right and second top being left
This gets us 2 * 3
We push this entire expression to the stack
Our stack is now (2 * 3) 1
Two items, 2*3 is one and 1 is the other item
We get +
We pop off the two items in our stack
Again, the top is the right side of the expression and second top is left
This yields 1 + (2 * 3)
And ta-da! We're done
- Here's another example, let's try with 2 3 + 4 *
We get 2, push to the stack: our stack is 2
We get 3, push to the stack: our stack is 3 2
We get +
This creates the expression 2 + 3, which we push back to our stack
Our stack is now the single item (2 + 3)
We get 4, push to the stack: our stack is now 4 (2 + 3)
We get *
This creates the expression (2 + 3) * 4
- And we're done!
- Hopefully this isn't too bad
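- If it helps, here's the same stack idea as code (my own rough sketch, not from the slides); it assumes a space-separated postfix string with integer operands and only the + and * operators:
#include <stack>
#include <sstream>
#include <string>
#include <cstdlib>
using namespace std;

int evalPostfix( const string& expr ) {
    stack<int> s;
    istringstream in( expr );
    string token;
    while( in >> token ) {
        if( token == "+" || token == "*" ) {
            int right = s.top(); s.pop();  // the top of the stack is the RIGHT operand
            int left = s.top(); s.pop();   // the second-from-top is the LEFT operand
            s.push( token == "+" ? left + right : left * right );
        }
        else
            s.push( atoi( token.c_str() ) ); // numbers just get pushed
    }
    return s.top(); // the single item left on the stack is the answer
}
For example, evalPostfix("1 2 3 * +") returns 7 and evalPostfix("2 3 + 4 *") returns 20, matching the expressions we just built by hand.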
- There is an algorithm to convert infix notation into postfix notation called the Shunting Yard
algorithm by Dijkstra
This is not covered in the class, but if he asks on the test to convert between the two
notations, this is certainly a much quicker way to do it or at least check your work
I'm not going to bore you with it here though, but it might be worth reading into

Reading Prefix
- Fun fact, this notation was invented because the inventor didn't want to use parentheses anymore
- Another fun fact, this is how LISP does its math! Which is ironic because LISP is a language that
uses a shitton of parentheses
- But I digress
- For prefix, it's easiest to read right to left; this is more annoying for computers, but what the heck,
we're humans
- Basically, we're going to do the same thing as postfix, just from right to left instead
- Let's try it with * + 2 3 4


We start with 4; push to our stack: 4
We're at 3; push to our stack: 3 4
We're at 2; push to our stack: 2 3 4
We're at +
This is an operator, so we create the expression (2 + 3) and push it to the stack
Note that unlike postfix, in prefix the TOP goes to the left and the second top goes to the right
Our stack is now (2 + 3) and 4
We're at *
This creates (2 + 3) * 4
- And tada, we're done! Just the same drill as postfix, only right to left
- One last note: Ouellette considers it important to tell you guys the difference between an iterator
and a traversal, so I might as well
An iterator is a container for a node and lets us visit certain nodes in a range (we can choose
to visit maybe every other, or maybe the first 5, or all of them), while a traversal covers the
entire tree without any interruption

Sets and Multisets


- I dunno, I just don't really see this topic being all that important
- I heard he might give you a bunch of data and ask the best way to organize it
Sets are the best way to do this if you don't want duplicates, I guess
- I think that's it. Oh well, I'll cover this topic anyways

Overview
- Set - a data structure that consists of a collection of distinct elements
Basically, it's just a bunch of things with no duplicates
- IMPORTANT NOTE: In his lecture notes he says that a set is unordered, but the STL set that we
work with is ordered. The STL set works by storing its elements in a binary search tree. Keep this in
mind when writing code involving sets
- Ex: a dictionary
- Fortunately for you, he didn't create his own homebrew Set or Multiset class, so we get to use the
STL sets!
- To use an STL set, include the STL set library
#include<set>
- That's all there is to it
- To create a set, create it the same way you would create a vector
set<int> s;
- Let's run through set functions (the following is a cheat sheet you should put on your final index
card)

Set Functions
insert( element );
Will insert an element into a set if that element does not already exist in the set
Example:
set<char> s;
s.insert('a'); // set now contains 'a' inside of it
s.insert('A'); // set now contains 'a' and 'A'
s.insert('a'); // set already contains 'a'; still just 'a' and 'A'
count( element );
Will count the number of occurrences of an element in a set


I find this a special breed of stupid especially since there can't ever be any duplicates of that
element in a set
Nonetheless, you could use this as an exists function since it either returns 0 or 1, and bools
in C++ are just glorified ints
Example:
set<char> s;
s.insert('a'); // set now contains 'a' inside of it
bool hasA = (bool)s.count('a'); // hasA set to 1, which is equal to true
// Doesn't actually need (bool) in front of it, but the grader is not going
// to be happy if they see it without the casting. I mean...
bool anotherHasA = s.count('a'); // anotherHasA set to 1, which is equal to true
// This works perfectly fine and I highly doubt you'll lose points if you do this
// because it works, but you know, for the grader's sanity, cast it I guess?
if( (bool)s.count('a') ) { cout << "Hello World"; } // Will print out Hello World; proper way to use this
erase( element );
Will remove an element from the set if it exists in the set; if not, it does nothing
Example:
set<char> s;
s.insert('a'); // set now contains 'a' inside of it
s.insert('A'); // set now contains 'a' and 'A'
s.erase('a'); // set now contains just 'A'
- These are all the methods he wants you to know from his lecture slides. Here are some more
methods that may be useful

empty();
Returns whether the given set is empty or not
Example:
set<char> s;
cout << s.empty(); // Will print out 1 because this is true
s.insert('a'); // set now contains 'a' inside of it
cout << s.empty(); // Will print out 0 because now this is false
size();
Returns the number of elements in the given set
Example:
set<char> s;
cout << s.size(); // Will print out 0 because the set is empty
s.insert('a'); // set now contains 'a' inside of it
cout << s.size(); // Will print out 1 because there's one element
find( element );
Will return an iterator at the position of that element
If it cannot find the element, it will equal the iterator returned by the end() function
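Example (mine, just to show the usual pattern):
set<char> s;
s.insert('a');
set<char>::iterator itr = s.find('a'); // itr now points at 'a'
if( itr != s.end() )
    cout << *itr;                      // prints a, since 'a' was found
// s.find('z') would equal s.end(), because 'z' isn't in the set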

erase( iterator );
Will delete the element at the given iterator
erase( iterator, iterator );
Will delete all elements between the two iterators


The following code will empty a set:
set<char> s;
// Pretend I inserted a lot of elements here
s.erase( s.begin(), s.end() );
cout << s.empty(); // Will print out 1 because the set is now empty
clear();
Or, you know, you could just use the clear function to delete everything in the set

Set Iterator
- Like a vector, sets also have iterators which work in exactly the same way. Nonetheless, here is
the syntax:
set<char> a;
a.insert('a');
a.insert('c');
a.insert('b');
for (set<char>::iterator itr = a.begin(); itr != a.end(); ++itr)
cout << *itr << " ";
This will print out:
a b c
Because remember, STL sets are ORDERED.
Oh yeah, the syntax is identical to what you normally do for vectors or other STL objects
You can also use the reverse iterator in the same way:
for (set<char>::reverse_iterator i = a.rbegin(); i != a.rend(); ++i)
cout << *i << endl;
This will print out:
c b a (each on its own line, since this one uses endl)
- I guess all you have to remember for iterators is that it's the exact same as iterators for vectors
Basically begin, end, set<type>::iterator, set<type>::reverse_iterator, rbegin, and rend
Also that *itr gets the data from that iterator
Also you can increase an iterator's position by using ++itr, but itr += 2 does not work (set iterators are bidirectional, not random-access, so they only support ++ and --)

Use Case
Why would we ever want to use a set?
The main thing a set has going for it is that it doesn't contain any duplicates
It also has a binary search tree under the hood, so lookup time is O(log N)
Ouellette's example for this is to use it as a spell checker, to check that every word in a given input
is in the dictionary
- His code is below:

void spell_check( istream& dictionary, istream& text )
{
set<string> words;
string word;
while( dictionary >> word ) words.insert(word);
while( text >> word )
if( words.count(word) == 0 )
cout << "Misspelled word " << word << endl;
}
- This takes in two input streams, a dictionary stream and a text stream
We create a set called words and we push every element from the dictionary input stream
into it
This ensures that there are no duplicates in our dictionary
I mean, if we're getting a dictionary stream with every word, why isn't the guy giving
us the dictionary smart enough to not give us duplicates so we could just use a vector?
Beats me. It is more efficient to use a set though, because a vector would have O(N) lookup time
This is because we have to use linear search to find things in a vector, whereas
since sets are built with binary search trees, they have O(log N) lookup time
Okay, so we create our dictionary set called words
While we have words in our dictionary input stream, we put them into the set
Then, we iterate through every word in the text input stream
If that word does not exist in our dictionary set, then print it out and print that it was
misspelled
Again, since words.count(word) returns either 0 or 1, we could've just used:
if( !(bool)words.count(word) )
- And that's all there is to it

Multisets
- As if a set weren't enough, someone deemed it appropriate to create a multiset to, you know,
torment us
- Multiset - a data structure similar to a set, except elements can occur multiple times
- So yeah, take the only thing that makes a set unique and get rid of it
Basically, this is just a collection of things (still sorted though, this still has a binary search
tree under the hood)
- This is also imported by using #include<set>
- To create a multiset, use the following notation:
multiset<int> bag;
- This will create a multiset of ints called bag
- The functions for multisets are the exact same as those for sets, except:
Insert will now insert duplicates
Count is no longer useless (that is, it actually returns an accurate count)

Use Case
- So why a multiset? Ouellette's example is a ballot box where each "vote" is a string of the
candidate's name
- We can quickly count the number of votes each candidate has by using .count("name")
This is probably better done as a map, but we'll get there later
- Here's his code:
void countVotes( istream& votes ) {
set<string> candidates;
multiset<string> ballotbox;
string vote;
while( votes >> vote ) {
candidates.insert(vote);
ballotbox.insert(vote);
}
for( set<string>::iterator itr = candidates.begin(); itr != candidates.end(); ++itr ) {
cout << *itr << ": " << ballotbox.count(*itr) << endl;
}
}
- For bonus points, rewrite this using a map once you learn maps :D
- This is a comprehensive exercise in sets and multisets
Basically, we accept an input stream of votes which are just candidate names
We push each vote into both a set and a multiset
Remember that sets don't allow duplicates
This means that the set will only have one of each name, whereas the multiset will
have every single input item
Once we read in everything, we iterate through all of the unique candidates by iterating
through the set
Then we print out the candidate's name as well as the total number of them
We obtain the total number of votes for that candidate by using the .count function
from the multiset
- And yeah, that's it; this is better done using maps
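- If you do want to try that bonus, here's one hedged sketch of how a map version could look (maps are covered in the next section; the countVotesWithMap name and the exact shape are mine, not his):
void countVotesWithMap( istream& votes ) {
    map<string, int> tally;
    string vote;
    while( votes >> vote )
        ++tally[vote]; // a brand-new candidate starts at 0, then gets bumped to 1
    for( map<string, int>::iterator itr = tally.begin(); itr != tally.end(); ++itr )
        cout << itr->first << ": " << itr->second << endl; // name, then vote count
}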
- That's all I have for you for sets and multisets :D They're pretty useless data structures in my
opinion, but I guess we gotta know them anyways

Maps and Multimaps


- Speaking of maps, maps!
- I find this topic to be slightly more important than sets, but again, this isn't a very significant topic

Overview
- In essence, a map is similar to an associative array
You know how arrays/vectors have indices?
If we want to get the first element in array, we'd do something like array[0]
If we wanted to get the fifth, we'd do something like array[4]
Now imagine that we weren't restricted to just numbers to put inside square brackets
What if we could put chars? Or even STRINGS?
Well you CAN
- Map - a data structure that keeps associations between elements of one type called keys and
another type called values
The stuff that goes inside the square bracket is the key, while the actual value is called a
value
For instance, let's say we have a map that is aptly named map
map['a'] = 5; // now map['a'] gives us back 5
In this case, our key is 'a' and our value is 5, because when we put 'a' in the square brackets,
we get our value
- But I'm getting ahead of myself
- To use the STL map class, simply #include<map>
- Maps are created with the following syntax: map<char, int> m;
Separate the two types with a comma; the char will be the type of the key and int will be the
type of the value
- Also, keys are going to be stored in a binary search tree, so there is no O(1) lookup time; rather, it's still O(log N)
If you want O(1) lookup time, use a hashtable
This is mostly a note to me and not anything you really have to worry about :D

Use Case
- So what do we do with maps? Well, frequency counting is the best use case I guess
- Suppose you're given a string and you want to count the number of each character (I'm sure
you've done that already using vectors)
With vectors or arrays, you will have to convert the index into the character
You might do a vector/array of size 26, where 0 corresponds with 'a' and 25 corresponds
with 'z'
Well, why don't we just use 'a' and 'z' as our keys directly!
Consider the following code snippet
map<char, unsigned int> frequency;
frequency['d'] = 1; // frequency['d'] now equals 1
frequency['a'] = 1; // frequency['a'] now equals 1
Note: in his lecture slides, he has the keys pointing to the same value. THIS IS WRONG. Each key
has a separate copy of a value
Like, I'm not nitpicking here or something, this is blatantly wrong.
In his picture, he has both 'd' and 'a' pointing at 1
This implies that if we change frequency['d'] = 2, then frequency['a'] will also equal 2
However it won't; frequency['d'] = 2 and frequency['a'] = 1
Um, yeah, so maps are just a special array that uses keys instead of indices
Pretty much just brush up on vectors
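- To make that concrete, here's a small hedged example of my own (it assumes #include<map> and using namespace std, like the other snippets here):
map<char, unsigned int> frequency;
string text = "hello";
for( size_t i = 0; i < text.size(); ++i )
    ++frequency[ text[i] ]; // a brand-new key starts at 0, then gets incremented
// frequency['l'] is now 2; frequency['h'], frequency['e'], and frequency['o'] are each 1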

Pairs and Iterators


- So maps are actually composed of a smaller data structure called a pair
- A pair is a data structure that has two values of potentially different types
- In the frequency map above, there would be two pairs
Both would be of type pair<char, unsigned int>
One pair would be ['d', 1]
The other would be ['a', 1]
- I'm using array notation because I don't actually know how they look
- To use the class, #include<utility> although you will never use these individually so I don't think
this is an important thing to note at all
- You WILL have to know how iterators work with maps
- Speaking of which, here are some functions that will be useful!

Map Functions
find( key );
Will search the map for the key; if found, it will return an iterator at the position of that pair
If it is NOT found, then it will equal the iterator returned by the end() function
Example:
map<char, int> m; // creates an empty map
if( m.find('a') == m.end() )
cout << "This character could not be found"; // will trigger
erase( key );
Will search the map for the pair with the given key, and then delete it from the map
erase( iterator );
Will erase the pair pointed to by the given iterator
- While these are the only functions he needs you to know for maps, do be advised that all set
functions work for maps as well
- For instance, you can use count(key) to see if the map contains anything for that key; it will still
return 0 or 1
- Be a little careful before just checking m['a'] instead, though: operator[] will quietly insert a default value for that key if it isn't already in the map, so count(key) or find(key) is the safer way to ask "is this key here?"

Iterators
- Okay, if you're given a choice, do NOT use iterators for maps. Seriously.
- The only reason you would want to use an iterator is to iterate through EVERY possible key.
THAT'S it
- If you want to change one value, then use the already overloaded operator[]
- But enough complaining, I guess I shall teach you how to use them
- Iterators for maps are almost identical except for one small difference
You know how we normally use *itr to get the data at an iterator's position? Can't do that
anymore
Instead, you have to use itr->first for the key and itr->second for the value
- The reason for this is because the map iterator actually points towards a pair object, so when you
dereference the iterator, you're actually getting the pair object
The pair object contains two members, first and second
First refers to the key
Second refers to the value

- The following is code to print out every key and value in the map:
map<char, int> m;
m['h'] = 5;
m['a'] = 3;
m['d'] = 4;
for (map<char, int>::iterator itr = m.begin(); itr != m.end(); ++itr)
{
cout << itr->first << ": " << itr->second << endl;
}
- Note that this will print out the following output:
a: 3
d: 4
h: 5
- The reason for this is because keys are stored in a binary search tree, so they will be printed out in
order based on however the key is ordered
Using a reverse iterator will print them out in backwards order, so h d a

Multimaps
- So turns out there are multimaps as well
- Multimap - a data structure that generalizes a map by allowing multiple values to be associated to
the same key
- Conceptually, think of it as a map where a single key can have several values attached to it, and you still aren't limited to just numbers as keys
- To create a multimap, make sure you #include<map> and use the following syntax:
multimap<string, string> m;
m.insert( make_pair( "key", "value" ));
- Yes, it's a lot more irritating to insert things, but hey, you can just write this on your formula card!

Multimap Functions/Iterators
insert( make_pair( "key", "value" ));
This is how Ouellette wants you to add things into a multimap
Example:
multimap<string, string> m;
m.insert( make_pair( "joe", "PIC 10A" ));
m.insert( make_pair( "joe", "PIC 10B" ));
m.insert( make_pair( "joe", "PIC 10C" )); // joe is now attached to PIC 10A, 10B, and
10C
erase( key );
This will delete ALL items attached to that key
Example:
multimap<string, string> m;
m.insert( make_pair( "joe", "PIC 10A" ));
m.insert( make_pair( "joe", "PIC 10B" ));
m.insert( make_pair( "joe", "PIC 10C" )); // joe is now attached to PIC 10A, 10B, and
10C
m.erase( "joe" ); // There will now be NOTHING in the multimap
erase( iterator );
This will delete the element attached to this specific iterator
Example:
multimap<string, string> m;
m.insert( make_pair( "joe", "PIC 10A" ));
m.insert( make_pair( "joe", "PIC 10B" ));
m.insert( make_pair( "joe", "PIC 10C" )); // joe is now attached to PIC 10A, 10B, and
10C
m.erase( m.find( "joe" )); // joe will still be attached to 10B and 10C
lower_bound( key );
This will return an iterator to the FIRST occurrence of the pair with the given key. First is
determined by the order the pairs were added into the multimap
Example:
multimap<string, string> m;
m.insert(make_pair("joe", "PIC 10A"));
m.insert(make_pair("joe", "PIC 10B"));
m.insert(make_pair("joe", "PIC 10C")); // joe is now attached to PIC 10A, 10B, and 10C
m.erase(m.lower_bound("joe")); // m now only contains PIC 10B and 10C
upper_bound( key );
This will return an iterator to a point AFTER the last occurrence of the pair with the given
key. Last is determined by the order the pairs were added into the multimap
Note: you CANNOT delete the last element by using m.upper_bound("joe") since this will return an iterator to the point AFTER the last pair; erasing that position is not allowed
To delete the last element, you have to erase the position right before that iterator
This can be done by using --m.upper_bound("joe") to step back onto the last pair
Example:
multimap<string, string> m;
m.insert(make_pair("joe", "PIC 10A"));
m.insert(make_pair("joe", "PIC 10B"));
m.insert(make_pair("joe", "PIC 10C")); // joe is now attached to PIC 10A, 10B, and 10C
m.erase(--m.upper_bound("joe")); // m now only contains PIC 10A and 10B
// m.erase(m.upper_bound("joe")); <-- DON'T DO THIS; erasing an iterator past the last pair is undefined behavior

To iterate through all the values of a given key, use the following for loop:
for (multimap<string, string>::iterator itr = m.lower_bound("key"); itr != m.upper_bound("key");
++itr)
cout << itr->first << ": " << itr->second << endl;
Remember that you have to use ->first and ->second to access the key and value at that iterator's
position
You could also use the equal_range function to return two iterators to use as your upper and
lower bounds
Before I show you the syntax, understand conceptually that it returns a pair of two elements
The first element is the lower_bound iterator
The second element is the upper_bound iterator
Okay, we good? Here it is
equal_range( key );
This will return a pair of two iterators, one with a lower bound for the given key and one
with the upper bound
Example:
pair<multimap<string, string>::iterator, multimap<string, string>::iterator> itr_pair = m.equal_range("joe");
for (multimap<string, string>::iterator itr = itr_pair.first; itr != itr_pair.second; ++itr)
cout << itr->first << ": " << itr->second << endl;
Let's break down that syntax, shall we
We create a pair of iterators; as such, we must specify the type of the pair
Both are going to be type multimap<string, string>::iterator; this is something we've seen before
We then add those two as types for the pair
Once we've set the type of the variable, then equal_range will fill in the rest
In order to access the first and second variables, we use the .first and .second parameters to
access them
Note the for loop; this is how you iterate with equal_range!
Of course, this is much longer than just using lower_bound and upper_bound, so I'd just
recommend that
max_size();
Returns the max number of elements you can store in the map
If you're wondering, it seems to be 268435455 divided by the size of the data type
I highly doubt you'll ever use this function ever. At all.
size();

Returns the number of pairs currently in the multimap; if this is a regular map, it will just be
the number of keys
empty();
Returns whether the map is empty or not
clear();
Erases all elements from the map
count( key );
Returns the number of pairs with the given key. More useful for multimaps than for maps,
but oh well.
And all of the old iterator functions, such as begin, end, rbegin, rend

Priority Queues
Overview
- Priority Queue - a data structure designed to quickly access/remove the element in the collection
with the highest priority
- Not really a queue; it's basically a data structure designed to remove the most important element
first
- As you'd probably expect, the generous gods of the STL have bestowed upon us the honor of
implementing their pre-created priority queue
#include<queue> // yes, it falls under the queue import for some reason
priority_queue<string> vocab_words;
- By default, it will assign the highest priority to the largest number or word based on the order N < U < L
Numbers are smaller than uppercase characters, which are smaller than lowercase characters
- For example:
priority_queue<string> words;
words.push("goodbye");
words.push("latest");
words.push("zebra");
words.push("another");
cout << words.top(); // will print out zebra
- This data structure shares a lot of the same functions that the queue class uses
For instance, pop, top, push, empty, and size are all the same
- In order to modify the priority_queue to work for custom classes, overload the operator<
Priority_queue uses operator< to compare elements
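For example, a made-up Job class (the name and the priority member are invented just for illustration) only needs an operator< to work inside a priority_queue; this is a hedged sketch, not from the slides:
class Job {
public:
    Job( int p ) : priority(p) {}
    int priority;
};

// priority_queue compares Jobs with operator<, so the "largest" Job ends up on top
bool operator<( const Job& lhs, const Job& rhs ) {
    return lhs.priority < rhs.priority;
}

priority_queue<Job> jobs;
jobs.push( Job(2) );
jobs.push( Job(9) );
jobs.push( Job(5) );
// jobs.top() is now the Job with priority 9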
- Another (more convoluted) way to accomplish the same thing is to create a comparison class
Inside this comparison class, we overload the operator() as a const member function
This will return a bool and take two arguments (const and by reference) of our element type
Example:
class ReverseComparison {
public:
    // no constructor needed; the compiler-generated default one is fine
    // (declaring one without defining it would actually cause a linker error)
    bool operator() (const int& l, const int& r) const;
};

bool ReverseComparison::operator() (const int& l, const int& r) const {
    return (l > r);
}
Great, we've created a comparison class. What do we do with this?
Well, we can create another version of the priority_queue that takes three type parameters
The type of element it will hold (same as the other priority_queue)
The type of the underlying container
The comparing function which defines the order of two elements according to priority
So let's say we want to create a priority_queue of ints sorted using our ReverseComparison class
We would create it using the following syntax:
priority_queue<int, vector<int>, ReverseComparison> pq;
Therefore:
priority_queue<int, vector<int>,ReverseComparison> pq;
pq.push(3);
pq.push(1);
pq.push(2);
cout << pq.top(); // Will print out 1
You see that container we specify in the middle? The vector?
In order to use a specific container for that slot, it MUST provide the push_back member function
This means you CANNOT use stacks, queues, etc because they use push, not push_back
By default, it will use a vector if you don't specify anything

Advantages/Disadvantages of Using a Vector


- Speaking of which, it's time for your topic! Complexity analysis! Yay.
- So, I don't actually know how it works under the hood for the STL priority queue, and at this point
I don't care enough
- However, let's consider how much time it will take THEORETICALLY (since he has these in his
lecture notes)
- Let's say we have an unordered vector
To find the element with the highest priority, we have to use linear search to iterate through
This is O(N)
To remove the largest element, we just swap it with the last element and use pop_back()
We don't have to resort everything because we don't care about order!
This is O(N) for the search for the largest element and O(1) for the actual removal
process
To insert elements, we just use push_back() which is O(1)
- Let's say we have an ordered vector
To find the element with the highest priority, we can just look at the end of the vector,
which is O(1)
To remove the largest element, we can quickly remove it with pop_back() cause it's at the
end, which is O(1)
When we want to insert elements, however, we have to insert it into the vector somehow
Ouellette suggests using push_back() and then performing insertion sort on the last
element, which is O(N)
We aren't sorting the entire vector, just the last element
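Here's a hedged sketch of that push_back-then-insertion-sort step (my code, not his); it keeps the vector in ascending order so the largest element stays at the back, and it assumes #include<vector> plus using namespace std like the other snippets:
void sortedInsert( vector<int>& v, int value ) {
    v.push_back( value ); // O(1) append to the end
    // bubble the new element left until it's in order; worst case this is O(N) swaps
    for( size_t i = v.size() - 1; i > 0 && v[i] < v[i - 1]; --i )
        swap( v[i], v[i - 1] );
}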

Heaps
- This will definitely show up on the final (or so I've heard)
- Again, you may have to pictorially represent a heap
- Or convert it to a vector or something I don't know

Overview

- Heap - a binary tree that satisfies the heap property


Heap Property - every level of the tree except for the last is completely filled, and every
node is larger than its children
- Therefore, the root is the largest element


- Heaps are NOT binary search trees
In BS trees, the right child is greater than the parent
This cannot be in heaps, however, since all nodes in a heap must be greater than BOTH of
their children

- This is a heap
- Note that in the final level, the nodes are as far left as possible
- If a new node were to be inserted, it would be inserted as the left child of the 3 node
The levels will fill from left to right, until all of the parent nodes have children
Then, it will move to the next level, again from left to right

Insertion
- The insertion process is actually not as bad as you might think
- This consists of two steps
Inserting an element into the next available spot
Re-heapifying the heap
Basically, fixing the heap structure to preserve the heap property

- Suppose we want to insert 8 into the heap above


First, we add it to the next available spot; it is now the left child of 4


Next, we need to fix the heap property


Notice how 4 is smaller than 8
However, in a heap, all nodes must be LARGER than their children
Therefore, this is NOT a heap :(
- To re-heapify, we compare the newly inserted node with its parent
If it is larger than its parent, then we swap them
Recursively repeat this until either the parent is larger than the new node or the new node
is now the root
- Since 8 is larger than all the other elements in this heap, we would expect it to become the new
root node, which it does
We first compare 8 with 4
Since 8 is greater than 4, we swap them
Now 4 is the left child of 8 and 8 is the right child of 7
- But wait! We aren't done yet
Again, repeat the process; compare 8 with 7
Since 8 is greater than 7, we swap them
Now 8 is the root node as such

- NOW we're done; the heap property is satisfied!

Deletion
- Deletion is kinda similar to insertion
- Three steps
Set the node you want to delete equal to the value of the last node
Delete the last node
Re-heapify to preserve the heap property
- Suppose we want to delete 8 from the following heap

- First things first: set 8 equal to 3; now the root of the tree is 3
- Delete the last element, so now the tree looks like this

- Now we need to fix the heap property


Unlike insertion, where we perform a bottom-up heapify, we do a top-down heapify
This means we compare to the children instead of the parent
What makes this more confusing is that every node has two children, unlike before where
there was only one parent to compare against
So which child do we use?
When comparing against children in a top-down heapify, always swap with the larger
child
This means we'll swap 3 and 6 because 6 is larger than 3
Now our heap has 6 as the root, with 3 and 5 as its children and 4 as a child of 3
But wait, we aren't done; we must repeat this process until either:
The new node is larger than both of its children
The new node is a leaf
In this case, 3 is less than 4, so we have to swap it with 4
Therefore, we will end with the following heap


- And now we're done!

Analysis of Complexity
- Of course we have to do this :P
- Let's say we're inserting a new node
There are two steps, adding the new node and performing the re-heapify process
The addition is O(1) because we just append it to the end
- What about the re-heapify process?
Well, let's consider the worst case
Worst case, our new node is the largest, which means that we will have to move it all the
way to the root
However, do we have to compare the new node to every single node in the heap?
No! We only have to compare to its immediate parent
Then, once we swap with its parent, we have to compare to its parent
We're going to repeat this process for however tall the tree is
For the insertion example above, we moved 8 all the way to the top, but we only did 2 comparisons (one with 4 and one with 7) in a tree with 7 nodes
Notice how efficient this is?
This is O(log N) complexity because we only walk up one path from the new node to the root, and the height of a complete tree is about log N
- What about deletion?
There are three steps this time
Setting the value of the target node equal to the value of the last node; this is O(1)
Removing the last node; this is O(1)
Re-heapifying again
Again, when we re-heapify, we only compare to children
Worst case, we make twice as many comparisons as the bottom-up reheapify because
we have to compare to two children instead of just one parent
This is 2 * O(log N)
However, remember how we drop coefficients for Big O notation?
Therefore, this is also O(log N)!
- Therefore, heap procedures are very efficient
- Fun fact: heapsort is one of the most efficient sorting algorithms; basically, you construct a heap
and then just print out the max elements
Since the max elements are always the parents, finding the largest is really quick
This sorting algorithm is O(N log N) because we insert N nodes and the insertion process is
O(log N)
Again, this isn't on the final, but if you understand that this kind of process is O(N log N), you
should be well prepared for the final :D


Vector as a Heap
- So how do we build one of these miraculous data structures?


Turns out that heaps are just other data structures masquerading as a heap
There are two ways to do this, using a binary tree and a vector
While building a heap using a binary tree makes sense since, well, all of our pictures thus far have
been of binary trees, building it using a vector is a bit more involved
Trust me, it's possible :D
Basically, there's a handy formula (write this on your formula card) to find the index of a node's
two children given the node's index
Do note that for this to work, index 0 has to be empty
If an element is at index n
Parent: (int)(n/2)
Left Child: 2n
Right Child: 2n + 1
I'm not going to prove this formula for you; just use it a couple times and it'll be pretty clear it
works :P
Here is a picture to help visualize it though

- What's great about using a vector is that inserting to the end of the tree is super easy!
Just use push_back to add something to the end
Want to remove something from the end? Pop_back
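- To tie the index formula to the insertion procedure from earlier, here's a hedged sketch of pushing into a vector-based max-heap; it's my code, and it assumes the vector already holds a dummy element at index 0 so that the parent formula n/2 works:
void heapInsert( vector<int>& heap, int value ) {
    heap.push_back( value );    // step 1: the new element goes in the next available spot
    size_t n = heap.size() - 1; // its index (index 0 is the unused dummy)
    // step 2: bottom-up re-heapify; swap with the parent while we're bigger than it
    while( n > 1 && heap[n] > heap[n / 2] ) {
        swap( heap[n], heap[n / 2] );
        n /= 2;
    }
}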

Level-Order Traversal
- And here's the last type of tree traversal strategy! This is also known as breadth-first search
Basically, we visit the nodes on each level from left to right, before moving to the next level
This is the same as just iterating through the vector (if we were to use a vector)
For the heap above, it would just be 9 7 4 5 3 1 2
Notice how we read level 1 first (9), then level 2 from left to right (7 4), then level 3 from left
to right (5 3 1 2)
- You may be asked to code this on the test, given a binary tree instead of a vector
- The following is actually a really interesting algorithm to do this, and perhaps the best way to do it as far as I can tell
- Here's the code:
void BinarySearchTree::levelorder(ostream& os, TreeNode* subtree) {
if (subtree != NULL) {
os << subtree->data << " ";
tree_queue.push(subtree->leftChild);
tree_queue.push(subtree->rightChild);
}
tree_queue.pop();
}
void BinarySearchTree::levelorder(ostream& os) {
tree_queue.push(root);
while (!tree_queue.empty())
{
levelorder(os, tree_queue.front());
}
}
Basically, the algorithm is to create a queue of TreeNode pointers
We push the root node into the queue
Then, while the queue is not empty
We take the first element from the queue
If it has a value (isn't NULL), then we print out its value
Then we push its left child and right child into the queue
Why does this work?
Let's consider the following tree
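Written out as text, the tree used in this walkthrough is:
      7
     / \
    5   4
   / \
  2   1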

- We start at the root node, 7, and push a pointer to 7 into the queue
- This is the initialization procedure; now we begin the while loop in earnest
While our queue is not empty (it's not, we have the root node pointer)
We take the first element (the root node pointer)
We print out its value: 7
We then push pointers to its left and right child (in that order) into the queue
Now the queue contains pointers to 5 and 4, in that order
Repeat!
Since 5 is the first element in our queue, we remove it from the queue
We print its value, so now we've printed: 7 5
We push pointers to its left and right child into the queue
Now the queue contains 4 2 1
Repeat!
Print out 4, so now we've printed: 7 5 4
Push pointers to its left and right child into the queue (both are NULL, but they get
pushed in anyways)
Now the queue contains 2 1 NULL NULL
Repeat!
Print out 2, push in its left and right children, which are both NULL
Now the queue contains 1 NULL NULL NULL NULL
Repeat!
Print out 1; at this point we've printed out everything, which is: 7 5 4 2 1
Our queue is all nulls now
Fortunately, we have a catch in the code; if it's NULL, just pop it from the queue
- And that's it! This is level-order traversal
- Notice how it follows the same rule as before; we start from the top level, print them in order
from left to right, and then proceed to the next level
- Of course, if you're using a vector as the underlying data structure, you could just, you know, print
them out in order?
Might be a tad easier :P

Binary Tree as a Heap


- Of course, with all this talk about heaps and trees and pictures of heaps as trees, we might as well
just make it a tree
- What makes this much more irritating than a vector is we have no easy way to push an element to
the end
In a vector, we can use push_back and pop_back to access the last element; no such joy for
a tree
- Rather, what we can do is this super convoluted method; I'm crying just looking at it right now
- Okay, basically, we store the size of the tree so far in a variable
The next available location will be size + 1
So far so good, right?
- Well, the next step is to convert that number into binary, which we'll use as directions down the binary tree
- Quick refresher (or tutorial) on how to convert decimal to binary
Divide the number by 2 (truncating the quotient each time) and write down the remainder to the LEFT of the digits you've written so far
Repeat until your number is 0
- For example, let's convert 18 to binary
18 divided by 2 is 9 with remainder 0, so we have 0
9 divided by 2 is 4 with remainder 1, so we have 10
4 divided by 2 is 2 remainder 0, so we have 010
2 divided by 2 is 1 remainder 0, so we have 0010
1 divided by 2 is 0 remainder 1, so we have 10010


- Which is correct; 18 in binary really is 10010 :D


- Anyways, what now?
- Well, follow the binary number as a set of instructions, with 0 meaning left child and 1 meaning
right child
- Suppose we have a tree of size 5 and want to insert at the next available spot, 6
6 converted to binary is 110
- Well

- Ignore all this other clutter, I'm lazy so I'm stealing Ouellette's pictures
- Anyways, we begin at the root node, which is 1
We look at the second digit in our binary number, which is 1
This means we visit the root node's right child, so now we're at 4
- We look at our last digit, which is 0
This means we visit the left child, so now we're at 1
This is position 6!
- Here's the entire process written in code (note that Ouellette uses a stack for binary conversion
purposes)
TreeNode* Heap::get_node( unsigned int path ) const {
if( path == 0 || path > size ) return NULL; // invalid input
TreeNode* subtree = root;
stack<unsigned int> s;
while( path > 0 ) {
s.push( path%2 );
path /= 2;
}
s.pop(); // get rid of the root node
while( s.size() > 0 ) {
int direction = s.top();
if( direction % 2 == 0 )
subtree = subtree->leftChild;
else
subtree = subtree->rightChild;
s.pop();
}
return subtree;
}
First, we check to make sure that the path is valid input
The first while loop is just converting the path into binary by pushing the elements into a stack
We then remove the first element, which is always going to be 1 (and the root node)
Then we pop each item off one by one, going left if it's 0 and right if it's 1
All other heap operations are the same, you're just going to have to use this process to access
nodes
- And that's it. That's all the material in the course!
