Você está na página 1de 5

Algorithm Visualization: A Report on the State of the Field

Clifford A. Shaffer Matthew Cooper Stephen H. Edwards


Department of Computer Department of Computer Department of Computer
Science Science Science
Virginia Tech Virginia Tech Virginia Tech
Blacksburg, VA 24061 Blacksburg, VA 24061 Blacksburg, VA 24061
shaffer@cs.vt.edu macooper@cs.vt.edu edwards@cs.vt.edu

ABSTRACT Baeker was the first well-known visualization, ad hoc visu-


We present our findings on the state of the field of algo- alizations existed long before. The first recognized system
rithm visualization, based on extensive search and analysis for creating algorithm animations was BALSA [2] in 1984.
of links to hundreds of visualizations. We seek to answer Since then, hundreds of algorithm visualizations have been
questions such as how content is distributed among topics, implemented and provided freely to educators, and scores
who created algorithm visualizations and when, the overall (or hundreds) of papers have been written about them.
quality of available visualizations, and how visualizations It is widely perceived that algorithm visualizations can
are disseminated. We have built a wiki that currently cat- provide a powerful alternative to static written presentations
alogs over 350 algorithm visualizations, contains the begin- (from textbooks) or verbal descriptions supported by illus-
nings of an annotated bibliography on algorithm visualiza- trations (from lectures). There has been some debate in the
tion literature, and provides information about researchers literature as to whether algorithm visualizations are effective
and projects. Unfortunately, we found that most existing in practice. Some studies have shown the classic dismissal
algorithm visualizations are of low quality, and the content that is the downfall of most technological interventions in
coverage is skewed heavily toward easier topics. There are education: “no significant difference” [6, 9, 11]. Other stud-
no effective repositories or organized collections of algorithm ies have shown that algorithm visualizations can indeed im-
visualizations currently available. Thus, the field appears in prove understanding of the fundamental data structures and
need of improvement in dissemination of materials, inform- algorithms that are part of a traditional computer science
ing potential developers about what is needed, and propa- curriculum [13, 3, 7]. Certainly, many visualizations exist
gating known best practices for creating new visualizations. and are widely (and freely) available via the Internet. Un-
fortunately, the vast majority of those currently available
serve no useful pedagogical purpose.
Categories and Subject Descriptors So we see that (a) many algorithm visualizations exist,
E.1 [Data Structures]; E.2 [Data Storage Represen- yet relatively few are of true value, and (b) algorithm visu-
tations]; K.3.2 [Computers and Education]: Computer alizations can be demonstrated to have pedagogical value,
and Information Science Education yet it is also quite possible to use them in ways that have no
pedagogical effect. These facts seem to imply that creating
General Terms and deploying effective algorithm visualizations is difficult.
There is a small body of literature that investigates how to
Algorithms, Measurement, Design
create pedagogically useful algorithm visualizations (for ex-
ample, [10, 15]). Yet, there is still much to be done before we
Keywords are at the point where good quality visualizations on most
Data Structure and Algorithm Visualizations, Algorithm topics of interest are widely available.
Animation, Courseware The purpose of this paper is to provide a summary of
preliminary findings resulting from our efforts to survey the
1. INTRODUCTION state of the field of algorithm visualization. To bound the
content area, we focused our attention on topics commonly
Data structure and algorithm visualizations and anima-
taught in undergraduate courses on data structures and al-
tions (hereafter referred to generically as algorithm visu-
gorithms. We seek an understanding of the overall health
alizations) have a long history in computer science educa-
of the field, and present a number of open research ques-
tion. While the 1981 video “Sorting out Sorting” by Ronald
tions that we and others can work on in the future. Some
examples of the questions we seek to address are:
• What visualizations are available?
Permission to make digital or hard copies of all or part of this work for • What is their general quality?
personal or classroom use is granted without fee provided that copies are • Is there adequate coverage of the major topic areas
not made or distributed for profit or commercial advantage and that copies covered in data structures and algorithms courses?
bear this notice and the full citation on the first page. To copy otherwise, to • How do educators find effective visualizations?
republish, to post on servers or to redistribute to lists, requires prior specific • Is the field active, and improving?
permission and/or a fee.
SIGCSE’07, March 7–10, 2007, Covington, Kentucky, USA. • Is there adequate infrastructure for storing and dis-
Copyright 2007 ACM 1-59593-361-1/07/0003 ...$5.00. seminating visualizations?

150
2. WHAT’S OUT THERE? 140 134

Since Spring 2006, we have made a significant effort to 120


catalog as many existing algorithm animations as we could. 100
The results can be found at our Data Structure and Algo-
80
rithm Visualization Wiki [18]. In this time, we have de- 60
veloped the most extensive collection of links to algorithm 60
46
38
visualizations currently available. While it is by no means 40 29
complete, it does serve as a representative sample of the 20 14 13 12
total population of visualizations accessible from the Inter- 4
0
net. We are still collecting visualizations, and many of the

Compression Algs
Sorting Algorithms

Graph Algorithms

Other Algorithms

Search Algorithms

Memory Management
Search Structures

Linear Structures

Other Data Structures


following assessments are based on preliminary data.

2.1 How did we find them?


We used a number of techniques to locate algorithm visu-
alizations. We began with a list of all visualization systems
that we were aware of from our general knowledge of the Figure 1: Histogram of major categories in Table 1
field. We developed a topic list based on our experiences
teaching relevant courses. We considered what search terms collections that we find (as described in Section 2.1) to in-
would be most productive for locating visualizations via In- sure that our data collection process does not focus unfairly
ternet searches (this is discussed further in Section 3). Based on individual applets.
on our topic list, we then performed searches using Google
to find whatever we could. Once we had generated a base 2.4 How are they disseminated?
of visualization links, we then examined these pages to try Almost none of the visualizations that we have found are
to locate other visualizations, since developers of a given vi- available from large, organized repositories. We discuss the
sualization often have others available. Sometimes we could status and effect of various courseware repositories on the
find these other visualizations from direct references on the algorithm visualization community in Section 3. Many visu-
pages we already had, and other times we could deconstruct alizations are cataloged by link sites, meaning sites that (like
the URLs to find more visualizations. Whenever we stum- our wiki) attempt to link to collections of visualizations that
bled across a page that had links to collections of visualiza- the site managers have considered worthy. However, most
tions, we would follow those links to capture any new ones of these link sites are small in scale, perhaps linking to 20
not yet in our collection. favorite visualizations for some course or textbook. There
have been a small number of efforts to produce comprehen-
2.2 How many are available? sive catalogs of visualizations (for example, the Hope College
We have collected links to over 350 visualizations. Many collection [8] in 2001), but our own wiki appears to be the
of these are individual applets or programs, but a signifi- only effort to do this that is currently active.
cant fraction appear as parts of integrated visualization col-
lections (typically, individual applications that include 5-20 2.5 Who makes them?
distinct visualizations, or toolkits that distribute a collection While these numbers are quite preliminary, we have de-
of 5-20 distinct visualizations as a unit). If a given program veloped a picture of who is providing visualizations. Per-
contains multiple visualizations (for example, a single Java haps 10-20% of algorithm visualizations are essentially sin-
applet that embodies separate visualizations for both stacks gle efforts by their respective authors. Perhaps 30-40% are
and queues), then we count it multiple times, once for each provided by “small shops” that have created typically 5-15
distinct visualization. We speculate that we have so far cap- visualizations, mostly by hand as individual Java applets.
tured roughly half of what is publicly available, and most These might have each been created by the same individual
likely we have captured the vast majority of the visualiza- over some number of years (typically a faculty member who
tions that are easily found and better known. We are still is teaching relevant courses), or they might have been de-
actively collecting new links. veloped by a small number of students working for a given
faculty member. Perhaps 50% of visualizations are provided
2.3 How are they implemented? by teams that have also created visualization systems of one
Since Java’s introduction in the mid-1990s, virtually all sort or another (and the visualizations are related to the
algorithm visualizations and algorithm visualization tool- system). We catalog such systems in addition to cataloging
kits have been implemented in Java. Probably over half the individual visualizations.
of available visualizations are provided as applets directly
from web pages. However, a noticeable minority are Java 2.6 What is the content distribution?
applications that must be downloaded and opened locally. We have restricted our study to topics that are typically
These numbers are somewhat biased. There is a tendency covered in undergraduate courses in data structures and al-
for us to search for applets, since this turns out to be easier gorithms. While this concerns mostly lower division materi-
to do (as explained in Section 3). Visualizations that are als, we also considered some upper division topics like com-
directly available in web pages will typically get more at- putational geometry, algorithms for N P-complete problems,
tention from potential users, since they need not go through dynamic programming, and string matching. Our top-level
the additional step of downloading and unpacking a visual- categories for grouping visualizations, along with their cur-
ization or visualization system. We do a significant amount rent population counts, are shown in Table 1.
of cross checking by capturing links from visualization link As Figure 1 shows, there is wide variation in coverage.

151
Linear Structures 46 9
8 8
Lists 13 8
7
Stacks & Queues 32 7
Search Structures 60 6
Binary Search Trees 13 5 5 5
5
AVL Trees 7 4
4
Splay Trees 9 3
Red-Black Trees 8 3
B-Trees and variants 10 2
1
Skiplist 3 1
0 0
Other Data Structures 14 0
Heap/Priority Queue 8 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006
Search Algorithms 12
Figure 2: Last-change dates for sorting visualiza-
Linear/Binary Search 2
tions. Counts are by project.
Hashing 10
Sorting Algorithms 134 Only a small fraction (certainly less than a quarter) would
Sorting Overviews 6 we feel comfortable recommending for use, either as a lecture
Insertion Sort 18 aid, as the basis for a lab exercise, or for self-study of a topic.
Selection Sort 18 Another quarter of visualizations fall in between: they are
Quicksort 23 potentially useful but severely limited. Even the better vi-
Mergesort 28 sualizations tend to have serious deficiencies. Roughly half
Heapsort 9 of all algorithm visualizations are actually animations. For
Radix Sort 10 many of these, users are relegated to being passive observers
Graph Algorithms 38 with no control over pacing (aside from animation speed),
Traversals 10 the data being processed, or the operations being conducted.
Shortest Paths 11 A different type of deficiency often occurs with visualizations
Spanning Trees 9 of tree structures, which are popular subjects for visualiza-
Network Flow 3 tion. Typically, these will show the tree that results from a
Compression Algorithms 13 user-selected insert or delete operation. But rarely do they
Huffman Coding 10 illustrate at all, let alone effectively, how the insert or delete
Memory Management 4 operation actually works. Similar deficiencies occur in many
Other Algorithms 28 visualizations for simple structures such as lists, stacks and
Computational Geometry 6 queues. Only the outcome of an insertion or deletion is dis-
N P-complete Problems 3 played, with no explanation of how the result was obtained.
Algorithmic Techniques 7
String Matching 8 2.8 When were they made?
Mathematical Algorithms 4 A significant number of systems were developed in the
early 1990s for creating algorithm animations and visualiza-
Table 1: Major categories for visualizations and
tions. However, most of these are now no longer available,
their counts for 352 visualizations collected. Each
or so difficult to install and run due to changes in computer
major category shows its total, and significant sub-
operating systems, that they are not currently a factor in
categories are also shown.
education. If we consider the development of visualizations
since the mid 1990s (i.e., Java), we have observed a decline
Nearly 15% of all visualizations are on linear structures such over time in the creation of new visualizations, particularly
as stacks and queues, even though these probably present after 2002. Figure 2 shows a histogram of the last-change
less difficulty to students than many other topics. Over a dates for the sorting visualizations currently in our collec-
third of all visualizations that we have found are on sorting tion. These counts are “by project” rather than by individ-
algorithms. Sorting is an important topic for undergraduate ual visualization, in that if a given visualization or visual-
data structures and algorithms courses, but it is certainly ization system provided visualizations for multiple sorting
being given too much attention by creators of visualizations. algorithms, then we only count it once in the histogram.
Further, many of the sorting visualizations are variations on We can see that, while there are still active projects, overall
the classic “Sorting out Sorting” video, and just show bars activity does not appear to be as extensive today as it was
being swapped. In contrast, most specialized and advanced previously. Since we are measuring the “last change” date
topics are poorly covered. There is certainly room for new for the various visualizations, the better numbers at the end
visualizations. of the histogram merely indicate that there still exist active
projects, as opposed to a recent rise in activity.
2.7 What is their quality? As discussed above, the recent decline in the creation of
A majority of the visualizations that we have encountered new visualizations certainly cannot be explained by satura-
either cannot be made to work easily, or appear to have tion of either topic coverage or content quality, since both
no pedagogical value. When we say “have no pedagogical are sadly lacking. Ongoing projects appear no more attuned
value,” we mean that they give the user no understanding to the topical gaps in coverage. We speculate that students
whatsoever of how the data structure or algorithm being “vi- were readily available to create visualizations during the
sualized” works, and so will be of little use in the classroom. Java boom of the mid to late 1990s, when Java and applets

152
were new and the “in thing” for students to learn. But now search terms will not find them. Nor is there any standard
there are other “in things” competing for students’ atten- alternative synonym that we know of that can capture the
tion, and Java is considered old hat. The nation-wide drop typical visualization. Since it turns out that the vast major-
in computer science enrollments might also mean that there ity of visualizations written since the mid 1990s were written
are fewer students available who want to do such projects. in Java, and many of these are delivered as applets, “applet”
is often a successful choice of keyword. Unfortunately, this
2.9 Will we find them again? will tend to leave out those visualizations that exist within a
Like everything on the Internet that is maintained by in- visualization system, since they are not typically presented
dividuals, algorithm visualizations have a high turnover rate to the world labeled as “applets.” Using “applet” as a key-
in their accessibility. To get some measure of this, we have word also results in generating a self-fulfilling prophecy that
been tracking the status of the visualization links on our if you search for applets, then the only presentation mech-
wiki. Each week, we take a snapshot of how many links on anism you will find will be applets. Initially, this skewed
any of our wiki pages are still accessible. As of May 29, the balance of the materials found in our wiki away from
2006, we had 46 active (working) links in our wiki. Of those projects with integrated applications, since non-applets were
46, after three months we had one whose host machine was intrinsically harder for us to find. But we have since made
no longer on the Internet, three which returned “page not a conscious effort to catalog non-applets, and found that
found” (two of these three were at the same site), and two these make up a significant fraction of the total (they are
with permanent redirection pages (indicating that our links still written in Java, however).
need to be updated in the wiki or eventually we will lose The main alternative to keyword-based Internet search is
them). In other words, over a span of three months, 6 out to look in visualization or courseware repositories or link
of 46 links were either lost to us or moved. One was sub- collections. Unfortunately, there currently is little in the
sequently found at a new location. On July 31, 2006, we way of good repositories for data structures and algorithm
had 219 unique working links. Of these, 15 were no longer visualizations. Within the computer science community,
available on October 31, 2006. the most likely candidates are the collection of materials
submitted to JERIC [12], and the CITIDEL repository [4].
3. SEARCHING FOR VISUALIZATIONS JERIC and its contributed courseware are indexed as part
of the ACM Digital Library [1]. While the ACM DL and
There are essentially two distinct ways to go about find-
CITIDEL are both huge collections, neither appears to have
ing an algorithm visualization on a given topic. One is to
large amounts of courseware in general or visualizations in
“google for it”—that is, use your favorite Internet search
particular. Equally important, neither provides good search
engine to search for whatever keywords you believe will best
tools for courseware or visualizations. The bulk of their
find a useful visualization on the topic you need. The other
materials are papers, and they tend to organize by content
is to look in an algorithm visualization collection, a course-
topic (CITIDEL) or publication venue (ACM DL). Neither
ware repository, or a curated link collection of which you
supports browsing for courseware separate from the (over-
might be aware.
whelming) body of non-courseware content. SIGCSE main-
Let us assume that there exists on the Internet a suitable
tains a collection of courseware links which includes a small
visualization on a specific topic. In that case, we can hope
number of visualizations [16]. Broader courseware reposi-
that standard Internet search technology will allow educa-
tories include SMETE [17], and Connexions [5]. None of
tors to find it. If so, this might alleviate the need to cre-
these are well known within the CS education community,
ate and maintain specialized repositories or link collections
and none have large collections of visualizations. Further,
for courseware in general, and visualizations in particular.
none of the repositories mentioned here have much “web
Unfortunately, whether any given instructor will be able to
presence” with respect to algorithm visualization. In all the
find an existing artifact depends a lot on the instructor’s
hours that we have spent conducting google searches for var-
ability to supply the right keywords to yield successful re-
ious visualizations, not a single visualization within any of
sults. Like any Internet search, keywords need both to cap-
these repositories was discovered by that means.
ture the desired artifacts, and to avoid burying the user in
Courseware repositories have the potential to be far su-
overwhelming numbers of false-positive responses. There-
perior to a basic Internet search when looking for visualiza-
fore, keywords must identify both the type of material de-
tions, for at least four reasons.
sired (visualizations as opposed to explanations), and the
topic or content area desired. Some keywords, while techni- • Internet-based keyword searches do not help the in-
cally specific, might lead to a wealth of non-related informa- structor who is searching for good ideas. By browsing a
tion. For example, looking for “Huffman Trees” is likely to list of visualizations and courseware for lower-division
give results related to the data structure, while looking for data structures and algorithms topics, the instructor
“lists” or “queues” is likely to return information unrelated might come upon different treatments and approaches
to computer science. Unfortunately, some data structures than she might have found by listing topics and search-
have common, everyday names. ing explicitly for them. She might even discover visu-
Visualizations constitute only a small part of the course- alizations that encourage her to present new topics.
ware materials available on the web. Far more artifacts exist • Visualizations maintained by individuals are prone to
that can be described as content presentations (lecture ma- loss of access. The Internet is well known for high
terials or tutorials) or projects or exercises (assignments). turnover of material, either because URLs change, or
Thus, to find a given visualization, some sort of restrictive because the material is no longer made available. Both
keywords are necessary for searches on most topics. Un- keyword-based search of the Internet and collections
fortunately, the providers of algorithm “visualizations” and of visualization links are susceptible to this problem.
“animations” often do not label them as either, so these Only curated repositories hosted by stable providers

153
can give any assurance for long-term sustainability. 5. REFERENCES
• A large fraction of existing visualizations are unusable, [1] Association for Computing Machinery. The ACM digi-
either because they simply do not work, or because tal library. http://portal.acm.org, 2006.
they are poorly conceived and implemented. A repos- [2] M. H. Brown and R. Sedgewick. A system for algori-
itory, or a curated site of visualization links, could thm animation. In SIGGRAPH ’84: Proceedings of the
provide editorial guidance to educators regarding the 11th Annual Conference on Computer Graphics and
quality of the visualizations. Such sites might pick and Interactive Techniques, pages 177–186, New York, NY,
choose which visualizations to provide, or they might USA, 1984. ACM Press.
supply ratings information.
[3] M. D. Byrne, R. Catrambone, and J. T. Stasko. Do
• A repository or curated site that developed its web
algorithm animations aid learning? Technical Report
presence could allow educators to actively participate
GIT-GVU-96-18, Georgia Institute of Technology,
in the site, using social bookmarking schemes, like
1996.
del.icio.us, or user ratings, like Amazon.com, or fea-
[4] CITIDEL: Computing and information technology
tures of other social networking tools. By allowing
interactive digital educational library. http://
users to comment on, rate, and cross-link resources,
www.citdel.org, 2006.
educators would be able to add value to the entire col-
lection over time just by using it. MERLOT [14] is [5] Connexions scholarly content repository. http://
one repository moving in this direction. cnx.org, 2006.
[6] J. S. Gurka and W. Citrin. Testing effectiveness of
algorithm animation. In Proceedings, IEEE
Symposium on Visual Languages, pages 182–189, 1996.
4. CONCLUSIONS AND FUTURE PLANS
[7] S. R. Hansen, N. H. Narayanan, and D. Schrimpsher.
While many good algorithm visualizations are available, Helping learners visualize and comprehend algorithms.
the need for more and higher quality visualizations contin- Interactive Multimedia Electronic Journal of
ues. There are many topics for which no satisfactory visual- Computer-Enhanced Learning, 2, 2000.
izations are available. Yet, there seems to be less activity in
[8] Hope College. Complete collection of algorithm
terms of creating new visualizations now than at any time
visualizations. http://www.cs.hope.edu/
within the past ten years. On the other hand, the theoreti- ∼dershem/ccaa/ccaa, 2006.
cal foundations for creating effective visualizations appear to
[9] C. Hundhausen and S. Douglas. Using visualizations
be steadily improving. More papers are appearing regarding
to learn algorithms: should students construct their
effective use of visualization in courses, and a body of work
own, or view an expert’s? In Proceedings, IEEE
has begun to form regarding how to develop effective visu-
Symposium on Visual Languages, pages 21–28, 2000.
alizations in the first place. Collectively, the community is
learning how to improve. While more fundamental research [10] C. Hundhausen, S. A. Douglas, and J. T. Stasko. A
on how to develop and use algorithm visualizations is nec- meta-study of algorithm visualization effectiveness.
essary, we also simply need more implementors to produce Journal of Visual Languages and Computing, 2002.
more quality visualizations. And we need to encourage them [11] D. J. Jarc, M. B. Feldman, and R. S. Heller. Assessing
to provide visualizations on under-represented topics. the benefits of interactive prediction using web-based
Given the existence of effective visualizations, there is still algorithm animation courseware. In SIGCSE ’00:
a major gap in terms of a credible national or international Proceedings of the Thirty-First SIGCSE Technical
repository for visualizations and other courseware. Some Symposium on Computer Science Education, pages
repositories exist now (CITIDEL and the JERIC/ACM DL, 377–381, New York, NY, USA, 2000. ACM Press.
for example), but they need to mature before they can claim [12] JERIC: Journal on Educational Resources in
to provide an adequate solution to this problem. Thus, a Computing. http://www.acm.org/pubs/jeric, 2006.
major concern for computer science educators should be how [13] A. W. Lawrence, J. Stasko, and A. Badre. Empirically
the gains made in knowledge and artifacts can be dissemi- evaluating the use of animations to teach algorithms.
nated effectively so as to insure continued improvements. In Proceedings, IEEE Symposium on Visual Languages
Our own future plans include developing our wiki into a 1994, pages 48–54. IEEE Computer Society, 1994.
resource that can provide value to the community. We con- [14] Multimedia Educational Resource for Learning and
tinue to expand our collection of links to visualizations. We Online Teaching. http://www.merlot.org, 2006.
are developing a bibliography of the research literature, a [15] P. Saraiya, C. Shaffer, D. McCrickard, and C. North.
catalog of projects and algorithm visualization systems, and Effective features of algorithm visualizations. In
a directory of researchers. Much of the data reported here SIGCSE ’04: Proceedings of the 35th SIGCSE
was hand generated. We are creating data collection tools to Technical Symposium on Computer Science Education,
allow us and the CSE community to data mine our database. pages 382–386, Norfolk, VA, March 2004.
We have developed evaluation forms and a mechanism for [16] SIGCSE educational links. http://sigcse.org/
collecting them, and tools for capturing public comments topics, 2006.
and ratings. If we can enlist significant community help, we [17] SMETE digital library. http://www.smete.org, 2006.
plan to provide evaluations of existing visualizations, and [18] Virginia Tech Data Structures and Algorithm
thereby better pinpoint what is available and what is lack- Visualization Research Group. Data structures and
ing, both in terms of quantity and quality. In these ways, we algorithm visualization wiki.
can make progress toward the day when computer science http://web-cat.cs.vt.edu/AlgovizWiki, 2006.
educators have a rich collection of high-quality visualizations
that span all the traditional topics that we teach.

154

Você também pode gostar