
Software Development and Version Control Systems
- Emphasis on Distributed -

By Jari Aalto

© Jari Aalto 1
Terminology

• The terms SCM, VCS and RCS are commonly used interchangeably; they usually refer to the same thing:
– SCM = Source code management*
– VCS = Version control systems
– RCS = Revision control system
• Note that "RCS" was also the name of an early program that
provided revision control (see also SCCS).
• Glossary:
– Revision = a change identified by the system
– Change set = a set of changes in one commit
– Version = a product from the system; usually a numeric release.

(*) SCM is also used for Software Configuration Management
Collaboration: Basic Problems

• Several persons working on the same files
– Person A makes a modification; Person B makes one too – who wins?
– The code does not work. "I'm waiting for you to fix it"
– The change is too big! "We only needed a bug fix, no new features"
– Copies easily float around – which is the latest?

Collaboration: Conventional Solutions

• Two persons never work at the same time. When one person
finishes, they notify the others or send the work on to them.
– Problems: Unsafe, Doesn’t scale, copies float around
• Shared directory (Windows network share; Unix/Linux NFS)
– Problems: Access restrictions, permissions etc. See above
• File is locked during edit (Windows; Unix/Linux *.lock files)
– Problems: Someone forgets to unlock, program left open

Serialized work, time management problems
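
The lock-file convention listed above can be sketched roughly as follows. This is only an illustration, not the implementation of any particular tool: the function names `acquire_lock` and `release_lock` and the file name `report.txt` are assumptions made for the example. It also shows why a forgotten unlock blocks everyone else.

```python
import os
import tempfile


def acquire_lock(path):
    """Atomically create '<path>.lock'; return its name, or None if it is held."""
    lock_path = path + ".lock"
    try:
        # O_EXCL makes the creation fail if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))  # record who holds the lock
    return lock_path


def release_lock(lock_path):
    """Remove the lock file so that others can edit again."""
    os.remove(lock_path)


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "report.txt")  # hypothetical shared file

first = acquire_lock(target)    # person A starts editing
second = acquire_lock(target)   # person B is refused while A holds the lock
release_lock(first)             # if A forgets this step, B waits forever
third = acquire_lock(target)    # after the release, B can proceed
print(first is not None, second is None, third is not None)
```

Work is still serialized in this scheme: only one person can hold the lock at a time, which is the limitation the slide points out.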

Benefits of Version Control

• Single developer
– Possibility to revert to a previous revision (backup)
– Code review between changes (revision differences)
• Team development
– Noticing changes immediately
– Separation of development lines (stable/devel branches)
– Improves project structure (directories, naming conventions)
– Easy sharing of code in new projects

Version Control Systems Compared*

• Centralized: Accurev, ClearCase, Perforce, MS VSS/TFS; Svn, Cvs
– Client–server: admin, security, access rights, participating problems
– Disk failure / repository corruption causes whole project to halt
– All code is in one server: a star-like system needs beefy hardware
– Branching and merging issues (Cvs, Svn)

• Distributed: BitKeeper; Bazaar, Mercurial, Git, [Monotone, Darcs]
– In DVCS, only disk space (http, sftp, ssh) is needed; easy to relocate
– All developers can have a complete copy (sandboxes)
– A disk crash at some developer's host does not necessarily affect the project.
– Fast: "offline", no network lag / communication only when needed
– Branching and private modifications
An example: Linux Kernel Project

• Not the biggest FOSS project, but probably the most active
• 10 MiB of code changes a month (in the form of patches)
• ≈ 20 000 files, 280 MiB sources, approx. 5.5-7 million lines
of code.
• Many branches
– Short life: develop a feature, a fix. Merged when ready
– Long life: bigger features that need separate line of development
(ReiserFs4, Ext4 etc.)
– Test, debug: a modification goes through several phases before the feature is
accepted into mainline (*-mm trees etc.)

Centralized model difficult; need personal sandboxes

Version Control System Maturity
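
[Figure: version control systems plotted by features (vertical axis) against design age (horizontal axis: old design, mature / stable, new design). Markers distinguish commercial and open source systems, each as either star (centralized) or dvcs.]

• Open source: Git (C/sh), Bzr (P), Hg (P), Darcs (H), Mtn (C), Arch (C/sh), Svn (C), Cvs (C), GNU Rcs (C)
• Commercial: BitKeeper, ClearCase, Accurev, Perforce, MS TFS, MS VSS
• Programming languages: C, (H)askell, (P)ython, (sh)ell
• Accurev: novel ideas (streams), but not distributed
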
Version Control Software Timelines

[Figure: timeline of version control software, features against time (1980 to 2009). Markers distinguish commercial and open source systems, each as either star (centralized) or dvcs. Dates as given in the figure:]

• Cvs (1986/90), ClearCase (1992, 2003 IBM), Perforce (1995), BitK (1998)
• Arch (2001/03 1.0), Svn (2001/04 1.0), Darcs (2002/04 1.0), Mtn (2003/?)
• Git (2005 1.0), Bzr (2005/07 1.0), Hg (2005/08 1.0)
• Linux kernel: patches/tarballs (1991; cf. quilt), BitK (2002)
• Standalone year labels (2006, 2009): YYYY = the year of many projects moving to use the VCS
• Label: Legacy Systems

Free Version Control Hosting
The start of the DVCS bandwagon (see table 1, table 2)

• Sourceforge.net (2001): 160 000 / 1.7M users
– Cvs, Svn (2006), Bzr (2009), Hg (2009), Git (2009). Semi-commercial
• Savannah.gnu.org and Gna.org (2001): 30 000 / 60K users
– Cvs, Svn (2005), Arch* (2005), Git (2007), Hg (2008). GNU ideology
• Launchpad.net (2004): 5000 / 1.5M users
– Cvs, Svn, Bzr. Ubuntu Linux development, PPA (2007)
• Github.com (2008), Gitorious.org (2008): rapid growth
– Git. GitHub is semi-commercial (see also Repo.or.cz)
• Code.google.com (2006): 200M users
– Svn, Hg (2009)

(*) The first FOSS DVCS; GNU Arch, unused by 2006
DVCS Release Schedules

[Figure: release timelines of the open source DVCSs, 2005 to 2009, with month ticks along the time axis. Annotations in the figure: "usable!" and, on the Bzr line, "Speed".]

• Git (2005-04-07): 0.1 - 1.0, 1.5, 1.6, 1.6.4
• Hg (2005-05-27): 0.1 - 0.7, 0.9.3, 0.9.4, 0.9.5, 1.0, 1.3.1
• Bzr (2005-03-22): 0.1 - 0.6, 0.9, 1.0, (2.0)

Pace of Development (1/3)

[Figure: one chart each for Git, Bzr and Hg]

Source: Gmane.org

Pace of Development (2/3)

[Figure: one chart each for Git, Bzr and Hg]

Source: www.ohloh.net

Pace of Development (3/3)

[Figure: one chart each for Git, Bzr and Hg]

Source: www.ohloh.net

DVCS and FOSS projects

[Figure: FOSS projects using DVCS, by million lines of code]

Source: Ohloh.net (2009). FOSS = Free and Open Source Software


DVCS Popularity Estimates

[Figure: Darcs, Hg, Bzr and Git (Monotone also shown) placed on axes labeled predicted growth and current popularity. N = million lines of code; W2K: 20M. Prediction source: personal gut feeling.]

- Darcs: Exotic. Scaling / memory issues
- Hg: Head start. Xen 10M, OpenJDK 6M, OpenSolaris 5M, Python, Mozilla, XEmacs
- Bzr: At the fringe. Future looks bright (launchpad.net, Ubuntu). Emacs 1.7M, MySQL 1.5M
- Git: Rapid growth in user base and projects. QT 24M, Kernel 11M, Perl 4M, X.org 3M, Wine 2.5M, Android, Gnome, (Debian)

Git has technology advantages:
• merging: multiple strategies
• gateways: Cvs, Svn

Projects that will move:
- Cvs: OO 20M, FreeBSD (Hg), Eclipse
- Svn: Samba (git) 2M, Apache/TomC 1.5M, GCC 8K, Mono 8K, Kde (git)

State of DVCS: Performance

Compared to Git (average): Hg 6x, Bzr 7-8x

Operation   Hg      Bzr
init        1.6x    20x
add         –90%    60%
ci          40x     70x
clone       5x      3x

Linux 2.6.30 sources (ca. 28 000 files, 1700 dirs; 350 MiB)

Source: DVCS Benchmark results http://www.editgrid.com/user/jaalto/vc-test
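
Figures such as "Hg 5x" express one tool's time as a multiple of Git's time for the same operation. The sketch below shows one way such ratios could be computed. It is not the benchmark script behind the results above: the helper names `time_command` and `relative_to_baseline` are assumptions, and the timings are made-up numbers chosen only to illustrate the arithmetic.

```python
import subprocess
import time


def time_command(argv):
    """Return the wall-clock seconds taken by one run of the command argv."""
    start = time.perf_counter()
    subprocess.run(argv, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def relative_to_baseline(timings, baseline):
    """Divide every tool's time by the time of the baseline tool."""
    base = timings[baseline]
    return {tool: seconds / base for tool, seconds in timings.items()}


# Hypothetical seconds for one operation; these are NOT measured values.
example_timings = {"git": 2.0, "hg": 10.0, "bzr": 6.0}
ratios = relative_to_baseline(example_timings, "git")
for tool, ratio in sorted(ratios.items()):
    print(f"{tool}: {ratio:.1f}x")
```

In a real run, each entry of `example_timings` would come from `time_command` applied to the same operation in each tool, ideally averaged over several repetitions.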


DVCS Space Requirements

[Figure: repository size for each DVCS, as a percentage (%) bigger than the original sources]

Source: DVCS Benchmark results http://www.editgrid.com/user/jaalto/vc-test
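
"Percentage bigger than the original sources" can be read as the extra space a repository needs relative to a plain copy of the files. A minimal sketch of that arithmetic follows. The helper names `directory_size` and `percent_bigger` are assumptions, and the 525 MiB repository size is a made-up figure; only the 350 MiB source size comes from the benchmark description.

```python
import os


def directory_size(root):
    """Sum the sizes in bytes of all regular files below root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def percent_bigger(repository_bytes, source_bytes):
    """Extra space of the repository as a percentage of the source size."""
    return (repository_bytes - source_bytes) / source_bytes * 100.0


MIB = 1024 * 1024
# Hypothetical: a 525 MiB repository holding 350 MiB of sources.
print(percent_bigger(525 * MIB, 350 * MIB))
```

With real data, both arguments would come from `directory_size` applied to the plain source tree and to the working copy including its repository metadata.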


Scope of DVCS Projects
• Git
– Features, features, more cool features
– No usability roadmap. 80/20* rule problem
– No bug tracker. "Decentralized": go and fix it yourself if you want
something, and be prepared for harsh criticism. Good quality is achieved by
mailing list patch reviews
– High rate of development, very lively community
• Hg
– Portability, ease of use
– Small development team
• Bzr
– Extensions, UI and speed are the primary focus; emphasis on usability
– Features are simple and serve the needs of the people well (80/20 ok!)
– TDD: well planned tests, development process and bug tracker (launchpad)
(*) "Version control and the 80%" by Ben Collins-Sussman 2007-10-16
Weaknesses of DVCS
• Git
– Highly complex, non-unified UI with 150 commands: plumbing API,
porcelain. Manual care needed for repository maintenance (garbage collect).
– Cross-platform issues: tightly tied to Unix/Linux-like operating systems.
– Revision numbers are very different: SHA1 abcd24132b8e65678f… vs. Cvs
1.1 or Svn r12343.
– Migration issues: centralized-emulation is not the easiest of the pack
– No plug-in features other than hooks (Due to C/sh).
• Hg
– Although quite fast, contains fewer features. Weak collaboration support
(email/receive). Limited network protocols: http.

• Bzr
– Overall slowness; branching efficiently is difficult (special setup).
– Branches are "directories". Good cross-platform line ending control.
– Cherry picks are just "merges" that are not tracked (cf. Git).
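
To illustrate why Git identifiers look so different from Cvs or Svn revision numbers: Git names every object by the SHA-1 of its content, so the identifier is a 40-character hex digest rather than a counter. The sketch below computes the identifier of a file's content (a "blob") with Python's standard library. Commit identifiers, which are what a user sees as revision numbers, use the same scheme over a different object type. The function name `git_blob_id` is an assumption made for this example.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object name Git assigns to a blob with this content."""
    # Git hashes a small header ("blob <size>" plus a NUL byte) followed by the data.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


empty_id = git_blob_id(b"")
hello_id = git_blob_id(b"hello\n")
print(empty_id)
print(hello_id)
```

Because the identifier depends only on content, two developers who create the same object independently get the same name without a central server handing out numbers, which is what a distributed system needs.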
State of DVCS: Git

• Recent enhancements since 1.5.x
– Submodule support: super projects
– "git gui" – a graphical display of commits, merges
• TODO
– Conversion to C language continues (Windows OS; unofficial)
– UI unification needed for all commands: options naming, --long option
support for all etc.
– Directory versioning support (may never be)
– Real rename support (may never be; must be careful)
– Does not track file permissions: ACLs (may never be)
– Extremely inefficient HTTP protocol (may never improve): 12-22x slower
than Hg

State of DVCS: Hg

• Recent enhancements since 1.x
– Symbolic link support, large file handling support
– More performance
• TODO
– No directory versioning
– Can't diff by date
– EOL-handling needs more robust design

State of DVCS: Bzr

• Recent enhancements since 1.x
– Repository performance improvements
– Cherry picking (new repository format), almost git-style branch switching
– OS line ending control (cf. SVN:properties) 1.15
• 2.x (2009)
– Performance gap to Hg leveled
– Branching speed is on par with Git: very fast with shared repositories.
• TODO
– More performance tuning (repository changes)
– Network communication bottlenecks need resolving
– Network protocols rsync, WebDAV and web interfaces (like bzrweb) need
to be moved into the core

Conclusions

• After the start of DVCS development in 2005, three strong
contenders are left. Others, like Darcs, have serious
technical problems*: scaling, disk consumption etc.

– Git will dominate technically and offer "enough rope to hang oneself
multiple times". On the other hand, support for Git is easy to find. An
extremely flexible but complex system.
– Bzr will probably be the choice of corporations, because it has a clear
migration path: 1) the same command set as centralized VCSs and 2) an easy
migration plan. Users can choose a centralized or distributed model. Big
shoulders: GNU and Canonical backing. Speed is no longer an issue.
– Hg has too little development power to keep pace with the other two.

(*) DVCS Round-up: One System to Rule Them All by Robert Fendt 2009-01-19