Amsci Survey 2009

Macroscope
How Do Scientists Really Use Computers?

A Web-based survey offers clues

Gregory Wilson

Computers are now essential tools in every branch of science, but we know remarkably little about how, or how well, scientists use them. Do most scientists use off-the-shelf software or write their own? Do they really need state-of-the-art supercomputers to solve their problems, or can they do most of what they need to on desktop machines? And how much time do grad students really spend patching their supervisors’ crusty old Fortran programs?

To answer these questions, my colleagues and I ran a Web-based survey during the last two months of 2008. We were surprised and gratified that almost 2,000 people took the time to tell us what they were doing. We were equally surprised by what they told us.

Who Responded

First, a few facts about who answered. Thirty-one percent told us they were from the United States, 20 percent from Canada, and 8 percent from the United Kingdom. Germany and Norway came next with 7 percent and 6 percent respectively, while the rest of the world made up the remaining 28 percent. The high representation from Canada and Norway reflects the fact that my colleagues and I are based there, while the low response rate from areas such as Russia and East Asia is undoubtedly due to the fact that we only advertised the survey in English-language channels.

Thirty-three percent of respondents were 18 to 30 years old; 35 percent were 30 to 40, and 17 percent were 40 to 50. The remaining 15 percent were over 50 or, in the case of 15 respondents, didn’t answer. These figures are consistent with reports about degrees: Seventy-one percent had a Ph.D. or equivalent, with 18 percent reporting at least an M.Sc.

When asked to identify their roles, over half of our 1,972 respondents chose more than one category (below), which is probably an accurate reflection of how many jobs working scientists actually do.

Respondents’ descriptions of their disciplines were much more diverse. Roughly 150 identified themselves as physicists, but no other discipline made up more than 5 percent of the sample. These figures are necessarily imprecise, since we had to make a lot of judgment calls when coding them. For example, should astrophysics be classified as a separate discipline from astronomy and physics? If so, what about plasma physics? And how exactly do we count “theological engineering”? (In the end, we discarded that response entirely.)

Getting the Answers

So what did these people tell us? First, respondents work an average of 48 hours a week, of which 30 percent is spent developing software and 40 percent is spent using it. They also report that these proportions are going up: 45 percent of respondents say that scientists spend more or much more of their time developing scientific software than they did 5 years ago, and 70 percent say that they spend more or much more time using it. These answers are much higher than we expected, and probably signal that our (self-selected) respondents use computers more than the “average” scientist (if in fact there is such a thing).

Second, most scientists generate and archive a few gigabytes of data each year. This answer was more popular than all the others together, which were “a few megabytes,” “a few terabytes” and “more than a few terabytes.” One thing we didn’t ask (but should have) was how that data is archived: Is it stored in a Web-accessible database with searchable metadata, or on a DVD stuck in the bottom drawer of someone’s desk? Personal experience tells us the latter is far more likely….

Third, most of the software that scientists work with is widely used: Only 10 percent reported that the programs they rely on are used by three or fewer people. When we asked where that software comes from, though, they reported “commercial off-the-shelf software,” “open source” and “we build it ourselves” in almost equal numbers.

It’s interesting to compare the latter answers with those given for another question. Fifty-eight percent of scientists reported that they do development on their own; 17 percent work with one other person, and 18 percent in teams of 3 to 5 people, while only 9 percent work in larger groups. These numbers are the reverse of what would be expected for professional software developers, who usually work in teams. They also explain the relatively low uptake among scientists of collaborative tools like version control, which most professional software developers consider essential: If you expect to work alone, why invest in tools for working with others?

The prevalence of solo and small-team work is consistent with another finding. Roughly 38 percent of the programs scientists write are between 500 and 5,000 lines long; smaller programs, and programs between 5,000 and 50,000 lines long, each make up about a quarter of the total, while larger programs account for the remaining 12 to 15 percent. To look at it another way, two thirds of the programs used by these scientists are less than 5,000 lines long.

The hardware scientists use is just as interesting. Eighty-one percent primarily use desktop machines; only 13 percent use intermediate-sized machines such as departmental Linux clusters, and a mere 6 percent use supercomputers. This is consistent with their reports about how they use computers: Most said that interactive use was most common, followed by preparing and reformatting data, preparing things for batch processing, and finally systems administration.

As for what occupied the most of our respondents’ time, coding and debugging took first place. Planning and quality assurance tied for second place, reading/reviewing code came third, documenting fourth, and packaging software came last. It is ironic to compare this complaint with answers to another question: What “pain points” hurt you most? Lack of documentation was the number-one answer for more than 40 percent of respondents, and in the top three for 80 percent.

Where do scientists learn how to develop software and use computers in their research? Almost all said that informal self-study had been most important. Peer mentoring came second, with formal instruction at school or on the job trailing well behind.

To close off, we wanted to find out how good scientists are at developing and using software. However, self-assessment is notoriously unreliable, and administering a proficiency test over the Web would have been impractical. We therefore asked our respondents to rate how well they felt they understood various aspects of software development, and how important those aspects are.

The results were consistent with answers given to other questions. In most areas (requirements, design, maintenance, product management and project management), scientists reported that they knew as much as they felt they needed to know. This isn’t surprising: Scientists are usually their own customers, and as our findings about team and program size suggest, those who develop software are creating small programs for their own use. Skills relevant to large projects done for other people are therefore unlikely to loom large in their minds.

The three areas in which respondents felt they didn’t know as much as they should were, in order of increasing gap, software construction, verification and testing. Again, this isn’t surprising, since the whole point of science is to be able to prove that your answers are valid, and that requires confidence in the methods and tools used to get them. The necessity of keeping test tubes clean and calibrating equipment is drilled into students from high school onward, but most are uncomfortably aware that we know a lot less about how to ensure that software is correct. The fact that there always seems to be one more bug to fix only reinforces the feeling.

Helping Those Who Need It

Our results can be interpreted in many ways, but I think two things are clear. The first is that if funding agencies, vendors and computer science researchers really want to help working scientists do more science, they should invest more in conventional small-scale computing. Big-budget supercomputing projects and e-science grids are more likely to capture magazine covers, but improvements to mundane desktop applications, and to the ways scientists use them, will have more real impact.

My second conclusion is that we’re not doing nearly enough to teach scientists how to use computers effectively as research tools. One reason for this failure is that commercial software development tools and practices often don’t fit the needs of people doing exploratory research in domains where years of training are required to understand the problems being solved. At the same time, university science and engineering departments feel their curricula are already overfull. As a physicist said to me some years ago, “What should we take out to make room for more programming: thermodynamics or quantum mechanics?” Figuring out how to square these circles is, in my opinion, the only grand challenge in scientific computing that really matters.

Acknowledgments

This work was made possible by a grant from The MathWorks, Inc. I’d like to thank my co-investigators, as well as Jon Pipitone and Dr. Laurel Duquette, who helped with data coding and analysis.

Bibliography

Hannay, Jo Erskine, Hans Petter Langtangen, Carolyn MacLeod, Dietmar Pfahl, Janice Singer and Greg Wilson. 2009. “How Do Scientists Develop and Use Scientific Software?” Proceedings of the Second International Workshop on Software Engineering for Computational Science and Engineering. New York: IEEE Press.

Greg Wilson is an adjunct professor of computer science at the University of Toronto. His course material is available at http://www.third-bit.com/swc. Address: Room 3230, Bahen Centre for Information Technology, University of Toronto, Toronto, Ontario, M5S 2E4. Internet: gvwilson@cs.utoronto.ca

American Scientist, Volume 97
www.americanscientist.org 2009 September–October