
Teaching Growth of Functions Using Equivalence Classes

An Alternative to Big O Notation


Constantine Roussos
Lynchburg College
1501 Lakeside Drive, Lynchburg, VA 24501
804-544-8395
roussos@lynchburg.edu

ABSTRACT
Understanding growth of functions using the standard big O definition and notation is a challenge for many undergraduate students. This paper
1. presents an approach to teaching growth of functions that utilizes equivalence classes and partial ordering,
2. identifies those mathematical concepts students should comprehend in order to understand the principles underlying growth of functions,
3. demonstrates pedagogical inadequacies in existing order of complexity notation and definitions and
4. gives a rationale for restricting functions under consideration to positive-valued, monotonic increasing functions.

Categories and Subject Descriptors
G.2 [Discrete Mathematics] and F.2 [Analysis of Algorithms and Problem Complexity]: order of complexity, equivalence classes, partial ordering, big O notation.

General Terms
Algorithms, Theory

Keywords
big O, equivalence classes, little o, order of complexity, theta, complexity analysis, growth of functions.

1 INTRODUCTION
Traditionally, growth of functions has been taught using big O notation. Although the concept and definition of big O are sound, for many students they are hard to grasp. This is due, in part, to the complexity of the definition of big O and, in part, to the fact that big O is often presented before students have been exposed to concepts necessary to understand the principles underlying complexity analysis. Big O notation was introduced by P. Bachmann in the book Analytische Zahlentheorie in 1892 [5]. I consider the notation to be archaic and present arguments to show that more modern mathematical notation better serves its function. This paper presents growth functions as members of equivalence classes and uses a modified big O definition to impose a partial ordering on these equivalence classes.

Additionally, the functions under consideration are restricted to positive-valued, non-decreasing (monotonic increasing) functions, those normally encountered in algorithm analysis. This approach is consistent with the intuitive idea of comparing two functions as being less than, greater than or equal to.

The paper addresses the topic at two levels. The first level is that reached in a Discrete Mathematics course, and the second is the beginning of a course in Design and Analysis of Algorithms. The importance of proper sequencing of course topics is addressed and note is made of the background material required for students to understand the various topics covered.

Limits and derivatives are shown to be useful tools in computing the order of complexity of a function but probably not useful for defining function relationships. Examples are presented to support this.

2 FUNCTIONS UNDER CONSIDERATION
Since the purpose in teaching growth of functions is ordinarily to give students an important tool for comparing the efficiency of algorithms, we restrict ourselves to functions associated with the execution of algorithms. These functions, normally,
a) map from Z+ to Z+ (the positive integers).
b) are monotonic increasing (non-decreasing).

Definition 2.1: We shall refer to the family of functions defined above as Discrete Growth Functions.
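
The two conditions behind Definition 2.1 can be spot-checked on a finite range. The following Python sketch is only an illustration: the names `looks_like_discrete_growth_function`, `quadratic`, `n_lg_n` and `sawtooth`, and the sample range, are assumptions introduced here and do not come from the paper. A finite check of this kind can refute membership but can never prove it.

```python
import math
from typing import Callable


def looks_like_discrete_growth_function(
    f: Callable[[int], int], limit: int = 1000
) -> bool:
    """Spot-check conditions a) and b) of Definition 2.1 for N = 1..limit."""
    previous = None
    for n in range(1, limit + 1):
        value = f(n)
        # Condition a): the value must be a positive integer.
        if not isinstance(value, int) or value < 1:
            return False
        # Condition b): the function must be non-decreasing.
        if previous is not None and value < previous:
            return False
        previous = value
    return True


def quadratic(n: int) -> int:
    # Hypothetical operation count of the form C * N^2 with C = 3.
    return 3 * n * n


def n_lg_n(n: int) -> int:
    # ceil(N * lg N), shifted by 1 so the value at N = 1 is positive.
    return 1 + math.ceil(n * math.log2(n))


def sawtooth(n: int) -> int:
    # Returns N at odd N and 1 at even N, so it repeatedly drops.
    return n if n % 2 else 1


for candidate in (quadratic, n_lg_n, sawtooth):
    print(candidate.__name__, looks_like_discrete_growth_function(candidate))
```

Under these assumptions the first two sample functions pass the check, while `sawtooth` fails condition b) because its value falls at every even N.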

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SIGCSE’04, March 3–7, 2004, Norfolk, Virginia, USA.
Copyright 2004 ACM 1-58113-798-2/04/0003…$5.00

3 THE BASICS AND BIG O
Selection Sort requires f(N) = C1*N^2 operations to sort a list of N numbers in the worst case. Similarly, Bubble Sort requires g(N) = C2*N^2 operations to sort N numbers in the worst case. (Note: All Ci are constants and all logs are base 2, denoted lg.) The difference in the number of operations executed between Selection Sort and Bubble Sort in the worst case, expressed as a ratio, is a constant (C1/C2). In practice, a constant ratio may easily be made insignificant by any number of factors including

a) the speed of the computer on which the program executes
b) coding efficiencies/inefficiencies
c) the language in which the program is written, etc.

Note that the ratio, f(N)/g(N) = C1/C2, remains constant as N increases. In complexity analysis such a constant multiple difference is not considered to be significant. Hence, we consider Selection Sort and Bubble Sort to be equally efficient even though Selection Sort typically requires fewer operations and runs faster than Bubble Sort.

Merge Sort, however, requires only h(N) = C3*N*lg(N) operations. The ratio between Selection Sort and Merge Sort is
C1*N^2/(C3*N*lg(N)) = (C1/C3)*N/lg(N).
This ratio increases without bound as N increases without bound. I.e., h grows more slowly than f and we consider h to be less than f in some meaningful sense. This holds true regardless of the relative values of C1 and C3. Hence we do consider the running times (efficiency) of these algorithms to be significantly different. Traditionally, the means for expressing these differences has been the big O comparison.

4 BIG O AND ITS SHORTCOMINGS
Definition 4.1: (Big O)
If f and g are real-valued functions then f(N) is O(g(N)) (big O of g(N)) iff there exist constants C and k such that f(N) <= C*g(N) whenever N > k. [2]

I.e., function f is less than a single constant multiple of function g for all values of N that are sufficiently large.

Most texts state that f = O(g) [2]. However, many students do not understand that the = in this context does not represent strict equality and are confused about its use and special restrictions. Consequently I avoid the use of that notation in favor of "f is O(g)" as in [7].

We then say f is big O of g. Some authors (for example [2]) consider O(g) to be a set. In this case we should say that f ∈ O(g). This is sound mathematically and pedagogically.

I see three pedagogical impediments inherent in reliance on the definition of Big O given above.

4.1 Pedagogical Impediments with Big O
a. The big O definition is too complex for many students.
b. The definition only applies to less than or equal to comparisons; not equal to or strictly less than.
c. The notation of the big O definition does not lend itself to mathematical manipulation or rigor.

Example 4.1: Let f(N) = N + 10 and let g(N) = 2*N.
Show that f is O(g) using Definition 4.1.

Solution:
Whenever N >= 10, f(N) <= g(N). Therefore we may choose C = 1 and k = 9 and use Definition 4.1 to show that f is O(g).

Also, note that f(N) <= 6*g(N) whenever N >= 1. Therefore we may also choose C = 6 and k = 0 when applying Definition 4.1. In fact, there are an infinite number of choices of C and k that will work with Definition 4.1 to show that f is O(g).

Unfortunately, many students have difficulty applying the definition even to this simple example. The reason for the difficulty seems to lie in the fact that the two constants C and k yield two degrees of freedom and this can be confusing. Some students assume that only certain, hard-to-find combinations of C and k will actually work. Explaining to them that the two degrees of freedom actually make it easier to find the required constants does not seem to help. Those readers who have worked through the definition of the limit of a sequence [4] or epsilon-delta proofs for the limit of a real-valued function [8] will detect striking similarities.

Fortunately, there is an easier way.

Definition 4.1a: If f and g are positive-valued functions defined on the positive integers (recall condition a. for Discrete Growth Functions) then f(N) is O(g(N)) iff there exists a constant C such that f(N) <= C*g(N) for all N [7].

It is often most useful to compare two functions, f and g, by examining the ratio f/g. To be consistent with this approach Definition 4.1a may be restated as follows:

Definition 4.2: (Big O modified)
If f and g are functions that map from Z+ to Z+ then f(N) is O(g(N)) iff there exists a constant C such that f(N)/g(N) <= C for all N.

This simpler definition appears to be easier for students to grasp and to use. This addresses issue a. of "Pedagogical Impediments".

5 EXTENSIONS TO BIG O
As stated above, Big O only compares functions for being less than or equal to. Often we would like to know if one function is equal to or strictly less than another. The traditional approach to addressing this problem is to introduce two additional definitions.

Definition 5.1: f is Θ(g) (Theta of g) iff f is O(g) and g is O(f).
In other words, if f <= g and g <= f then f ≡ g.
It follows directly that if f is Θ(g) then g is Θ(f).

A Caveat:
The use of Θ has led to some confusion in order of complexity discourse. A common statement is "The order of complexity of function f is g". It is not always clear if this is to mean that f is O(g) or f is Θ(g). The latter meaning seems to be more useful.

Many texts define less than in terms of limits [6].

Definition 5.2: f is o(g) (little o of g) iff lim(f(N)/g(N)) = 0 as N → ∞.
I.e., if f grows at a substantially slower rate than g then the ratio f(N)/g(N) will approach 0 as N grows.

An inherent incompatibility between the definition of big O and this definition of little o will be discussed later.

6 A DIFFERENT PARADIGM USING EQUIVALENCE CLASSES
A more intuitive approach may be used to compare discrete growth functions. We may partition functions into equivalence classes [3] based upon their rates of growth by utilizing a modified Θ definition.
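
The ratio form of big O (Definition 4.2) lends itself to a quick numerical experiment. The sketch below revisits Example 4.1 with f(N) = N + 10 and g(N) = 2*N. The helper name `max_ratio` and the sample limit of 10,000 are assumptions made for illustration only, and a finite scan suggests a bound without proving it.

```python
from fractions import Fraction
from typing import Callable


def f(n: int) -> int:
    return n + 10


def g(n: int) -> int:
    return 2 * n


def max_ratio(
    num: Callable[[int], int], den: Callable[[int], int], limit: int
) -> Fraction:
    """Largest value of num(N)/den(N) for N = 1..limit, computed exactly."""
    return max(Fraction(num(n), den(n)) for n in range(1, limit + 1))


# f(N)/g(N) is largest at N = 1, where it equals 11/2, so C = 6 suffices.
print(max_ratio(f, g, 10_000))  # 11/2

# The reverse ratio g(N)/f(N) stays below 2, so g is O(f) as well.
print(max_ratio(g, f, 10_000) < 2)  # True
```

Because a single constant bounds each ratio for every N in the sample, the experiment agrees with the choice C = 6 in Example 4.1, and it suggests that f and g bound each other to within constant factors in the sense of Definition 5.1.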

Definition 6.1: (Function Equivalence) Functions f and g are said to be equivalent (members of the same equivalence class) iff there exist constants C1 and C2 such that f(N)/g(N) <= C1 and g(N)/f(N) <= C2 for all N ∈ Z+. We may write f ≡ g.

Let [f] denote the equivalence class containing function f.

It is an easy and worthwhile exercise to show that every element of [f] generates the same equivalence class.

In order to ensure that Definition 6.1 does, in fact, define equivalence classes we must show the relation is reflexive, symmetric and transitive [3]. All are easy to show so the proofs may be given as exercises for students.

It is also worthwhile to note here that these equivalence classes are sets of functions [2]. This means that the results students have studied in set theory are relevant to the study of growth function equivalence classes.

From Definition 6.1 we may easily conclude the following results. The proofs may be given as exercises for students.

6.1 f ≡ g iff f ∈ [g].
6.2 f ∈ [g] iff g ∈ [f].
6.3 [f] = [g] iff f ≡ g.

Prior to introducing students to Definition 6.1 students should be shown what it means for one function to be less than another. I.e., present examples of two functions f and g where f(N) < g(N). This might also be a good time to introduce the hierarchical growth function chart that appears in nearly every text that covers Big O. The chart should, at least, include lg(N), N^k and C^N where C and k are constants.

Also, students should be familiar with equivalence classes. A classic example of equivalence classes is the integers modulo p where p is normally a prime [2] and [3]. Using the integers modulo 5 as an example most will recall that {…, -5, 0, 5, 10, …} forms the equivalence class [0] and {…, -4, 1, 6, 11, …} forms the equivalence class [1], etc.

The background material given above should come early in a student’s college studies. ACM Curriculum 2001 recommends this prerequisite material be covered in the first year [1].

7 ORDERING FUNCTIONS
Next, we address the concept of less than or equal to for growth function equivalence classes. Following a standard approach we define a partial ordering [3] on the equivalence classes that we have just defined.

Definition 7.1: (≤ Function Ordering)
[f] ≤ [g] iff there exists a constant C such that f(N)/g(N) <= C for all N ∈ Z+ where f ∈ [f], g ∈ [g].
This is just the (modified) definition for Big O.

It is important to note that we have defined a relation on equivalence classes by referring to a relation on function values (integers in this case). The distinction between relations and operations on equivalence classes and relations and operations on function values should be emphasized to students.

In order to show that Definition 7.1 does, in fact, define a partial ordering we must show that the relation so defined is reflexive, antisymmetric and transitive [3]. We must also show that the choice of functions from [f] and [g] is irrelevant.

The proofs that these properties hold are easy to complete and may be given as exercises for the student.

One may ask if Definition 7.1 actually defines a total ordering. I.e., are every two equivalence classes (or every two discrete growth functions) comparable?

It is not difficult to produce two functions f and g such that there is no constant C1 such that f(N) <= C1*g(N) nor is there a constant, C2, such that g(N) <= C2*f(N). In such a case f and g are not comparable and, thus, Definition 7.1 defines only a partial ordering (not a total ordering).

Ideally, to be consistent with familiar mathematical concepts, we should like to be able to say that if [f] ≤ [g] and [f] ≠ [g] then [f] < [g]. Defining [f] < [g] in this manner requires functions f and g to be monotonically increasing. Example 7.1 illustrates this.

Example 7.1: (The need for monotonic increasing)
Let f(N) = N.
Let g(N) = N^(1/2) when N is a power of 2 and N otherwise.
Clearly g(N) <= 2*f(N) and, thus, [g] ≤ [f].
However, there is an infinite and unbounded number of points for which g(N) = N^(1/2). At these points f(N)/g(N) = N^(1/2). Therefore it is impossible to find a constant C such that f(N)/g(N) <= C. Hence, [f] ≠ [g]. We probably do not want to consider [g] to be < [f] when f and g are so very nearly equal. Thus we have the uncomfortable situation where [g] ≤ [f] and [g] ≠ [f] but it is not true that [g] < [f]. The difficulty in this example is due, in large part, to the downward drops in the value of g(N) when N is a power of two. Requiring our functions to be monotonically increasing eliminates such sharp drops.

Question 7: (inadequacy of little o for <)
Should we use the traditional little o definition (Definition 5.2) to define < on growth functions?

Example 7.2: (inconsistency of little o and Big O)
Let g(N) = N and h(N) = N^(1/2).
We shall construct a function, f(N), that shares an infinite, unbounded number of points with g(N) and (within epsilon) shares an infinite, unbounded number of points with h(N).
Let ⌊x⌋ represent the floor function [3].
Let q(N) = ⌊lg(lg(N))⌋.
Let p(N) = 2^q(N).
I.e., the exponent of 2 above is ⌊lg(lg(N))⌋.
Let f(N) = 2^p(N).
The reader may verify that f(N) has the properties stated above.
Furthermore h(N) <= f(N) <= g(N) = N, for all N ∈ Z+.
Hence, [f] ≤ [g].
But since f(N) (nearly) shares an infinite, unbounded number of points with h(N) and [h] < [g] it is not true that [g] ≤ [f]. So [f] ≠ [g].

It can easily be shown that lim(f(N)/g(N)) (as N → ∞) does not exist even if a smooth (everywhere differentiable) function that approximates f(N) is substituted for f(N). So if we use little o (Definition 5.2) as the definition for < we have a case where f ≤ g and f ≠ g but it is not true that f < g. Thus, the limit definition for

little o is incompatible with the big O definition and our partial ordering, ≤.

The following definition for < is compatible with the definitions for ≤ and =.

Definition 7.2: (Less Than Function Ordering)
[f] < [g] iff [f] ≤ [g] and [f] ≠ [g].

8 THEOREMS AND PROOF TECHNIQUES
8.1 Operations on Growth Function Equivalence Classes

Definition 8.1:
a. [f] + [g] = [f + g]
b. C * [f] = [C * f]
c. [f] * [g] = [f * g]
d. [f] / [g] = [f / g]

Here we have defined operations on equivalence classes in terms of operations on functions. These definitions are very powerful for manipulating order of complexity expressions. However, it is very important to note that the definitions are meaningful only if the particular choice of an element from an equivalence class is irrelevant.

We may show, for example, that in Definition 8.1d, [f / g] yields the same equivalence class for any choice of elements from [f] and [g]. See Theorem 8.1d below.

Theorem 8.1d: If f1, f2 ∈ [f], and g1, g2 ∈ [g] then
[f1 / g1] = [f2 / g2]
Proof:
Let f1, f2 ∈ [f], and g1, g2 ∈ [g].
Then, f1 <= C1*f2 and g2 <= C2*g1 ⇒ g2 / C2 <= g1
Since f1 <= C1*f2,
f1 / g1 <= C1 * f2 / g1
Since g2 / C2 <= g1,
f1 / g1 <= C1 * f2 / g1 <= C1 * f2 / (g2 / C2)
f1 / g1 <= C1 * C2 * f2 / g2
(f1 / g1) / (f2 / g2) <= C1 * C2
∴ [f1 / g1] ≤ [f2 / g2]

Similarly, by reversing the roles of f1, f2 and g1, g2 we can show that
[f2 / g2] ≤ [f1 / g1]
thus giving the desired result that
[f1 / g1] = [f2 / g2]

Equivalence class notation simplifies proving theorems and solving order of complexity problems. Perhaps even more significant than proving theorems using this paradigm is the lack of necessity for some theorems. For example, many texts state the following theorem and follow with a proof.
If f is O(g) and g is O(h) then f is O(h).

Since ≤ forms a partial ordering on the equivalence classes we have the following more natural statement:
If [f] ≤ [g] and [g] ≤ [h] then [f] ≤ [h] by the transitive property of the partial ordering, ≤.

Theorem 8.2:
a. [C * f] = [f] where C is a constant.
b. [f + g] = max([f], [g]) if [f] and [g] are comparable.
Note: if f and g are not comparable then [f + g] is distinct from [f] and [g].

The above are not difficult to prove and may be given as exercises to advanced students.

8.2 "Bracketing" for Complexity Analysis
A common and effective method of comparing functions is sometimes referred to as "Bracketing".

Theorem 8.3: (Bracketing)
If f(N) <= g(N) <= h(N) (N > k) and [f] = [h] then g ∈ [f]

The proof is left as an exercise for advanced students.

Example 8.1:
Find a simpler expression equivalent to N^2/(N + lg(N)).

Solution 1: (Using Theorem 8.3, bracketing)
Since 0 <= lg(N) <= N,
N^2/(N + N) <= N^2/(N + lg(N)) <= N^2/(N + 0)
∴ N/2 <= N^2/(N + lg(N)) <= N
[N/2] = [N]
∴ N^2/(N + lg(N)) ∈ [N]

As straightforward as the above is, the use of Theorem 8.2 and Definition 8.1d yields an almost trivial solution.

Solution 2: (Using Theorem 8.2)
Since [lg(N)] ≤ [N], [N + lg(N)] = [N]
∴ [N^2/(N + lg(N))] = [N^2/N] = [N]

Definition 8.1 allows us to substitute any member of an equivalence class for any other member. This is a very intuitive and powerful method.

8.3 Using Limits for Complexity Analysis
Limits can play a very useful role in simplifying complexity expressions.

Theorem 8.4:
a. lim(f(N)/g(N)) = C > 0 ⇒ [f] = [g]
b. lim(f(N)/g(N)) = 0 ⇒ [f] ≤ [g].

The proofs may be assigned to advanced students with good Calculus skills.

Students with an understanding of Calculus may question why derivatives are not used for determining function growth. Note: f′(N) denotes the derivative of f(N).

Theorem 8.5:
a. lim(f′(N)/g′(N)) = C > 0 ⇒ [f] = [g]
b. lim(f′(N)/g′(N)) = 0 ⇒ [f] ≤ [g].

Theorem 8.5 follows immediately from Theorem 8.4 and l'Hôpital's rule [3], [8].
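
The bracketing in Example 8.1 can be spot-checked numerically, and the same experiment hints at the limit view of Theorem 8.4a. In the sketch below, the names `expr` and `bracket_holds` and the sample sizes are assumptions chosen for illustration. A finite check supports the algebra in Solution 1 but does not replace it.

```python
import math


def expr(n: int) -> float:
    """N^2 / (N + lg N), the expression from Example 8.1."""
    return n * n / (n + math.log2(n))


def bracket_holds(limit: int) -> bool:
    """Check N/2 <= N^2/(N + lg N) <= N for N = 1..limit."""
    return all(n / 2 <= expr(n) <= n for n in range(1, limit + 1))


print(bracket_holds(10_000))

# The ratio of the expression to N creeps toward 1 as N grows, which is
# the situation covered by Theorem 8.4a with C = 1.
for n in (2**4, 2**10, 2**20):
    print(n, expr(n) / n)
```

The printed ratios equal N/(N + lg(N)), so they stay below 1 and rise toward it, matching the conclusion that the expression belongs to [N].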

Question 8.1:
Should we use limit-based definitions for growth function comparisons?

Example 8.3
A crude version of Merge Sort (CMS) requires the number of input values to be a power of two. When the number of input values is not a power of 2 the algorithm creates enough dummy input values to raise the number up to the next power of two. A better version of Merge Sort (BMS) is able to properly handle any number of input values.

The number of comparisons required for BMS is f(N) = C1*N*lg(N) and for CMS it is g(N) = C2*2^⌈lg(N)⌉*lg(2^⌈lg(N)⌉) where ⌈x⌉ represents the ceiling function [3].

Using Definition 6.1 one can show that [f] = [g] and most would agree that f and g should be equivalent since they differ by a factor of between two and four. However, due to the nature of the step function g(N), lim(f(N)/g(N)) does not exist and thus f and g would be considered incomparable if limits were used to define equality.

9 A STILL BETTER METHOD?
All of the measures considered thus far for comparing functions measure function values at a particular point, f(i)/g(i). One might argue that we would be better served by a measure that considers the relative values at all points. The ratio of the average number of operations for function f as compared to g is lim ∑(f(i)/g(i))/N (1 <= i <= N, N → ∞).

We may define [f] = [g] iff lim ∑(f(i)/g(i))/N = C > 0. And likewise define [f] < [g] iff lim ∑(f(i)/g(i))/N = 0. Although computing ∑(f(i)/g(i))/N is often difficult, the literature [2] gives many methods to apply, including approximations involving ∫f(i)/g(i). A thorough investigation of the above definitions might be suitable as an advanced undergraduate research topic.

10 CONCLUSIONS
We have utilized equivalence classes and partial ordering to create a means of comparing function growth in order of complexity analysis. The method is similar to Big O but offers several pedagogical advantages including
a. easier to teach and learn
b. better extensibility
c. modern mathematical notation.

11 REFERENCES
[1] ACM Computing Curricula 2001, Chapter 7, 7.4. http://www.computer.org/education/cc2001/final/chapter07.htm and http://www.computer.org/education/cc2001/final/cs115.htm
[2] Cormen, Leiserson and Rivest. Introduction to Algorithms, The MIT Press, McGraw-Hill Book Co., 1990, 23-34, 41-52, 804.
[3] Gersting, J. L. Mathematical Structures for Computer Science, W.H. Freeman and Co., Fourth Edition, 1999, 239, 242, 245, 282, 309.
[4] Goldberg, R. R. Methods of Real Analysis, Blaisdell Publishing Co., 1964, 26.
[5] Knuth, Donald E. The Art of Computer Programming, Volume 1/Fundamental Algorithms, second edition, Addison-Wesley Publishing Co., 1973, 104.
[6] Manber, Udi. Introduction to Algorithms, Addison-Wesley Publishing Co., 1989, 41.
[7] Smith, Jeffrey D. Design and Analysis of Algorithms, PWS-Kent Publishing Co., 1989, 12.
[8] Thomas, George B. Jr. Calculus and Analytic Geometry, Addison-Wesley Series in Mathematics, Third Edition, June 1962, 41, 815.

Ftp Access: Obtain an on-line copy of this paper via anonymous ftp to cs-netlab-01.lynchburg.edu. Move to the Algorithms subdirectory and retrieve the file Growth.doc.

