Introduction Metrics Conclusions References

Introduction to FLOSS Data Sources


Master on Free Software
Daniel Izquierdo (dizquierdo@libresoft.es)
GSyC/Libresoft

November 18, 2011

(cc) 2011 Daniel Izquierdo. Some rights reserved. This document is distributed under the Creative Commons Attribution-ShareAlike 3.0 licence, available at http://creativecommons.org/licenses/by-sa/3.0/

Index

- Introduction
- Metrics
- Conclusions
- References

Data sources

- Source code management system
- Mailing lists
- Bug tracking system
- Source code

Data sources

What type of metrics can we retrieve from them?

SCM: main attributes

Per commit:
- Owner of the change: committer or author
- Date of commit
- Files touched
- Message left by the committer or author
- Lines involved in the changes
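
As an illustration, the per-commit attributes can be read from the text output of `git log`. This is a minimal sketch, not a complete tool: the `Commit` record, the `@@` prefix and the `|` separator are choices made for this example, and it assumes names and subjects contain no `|` in the first four fields.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    # Hypothetical record layout; the field names are illustrative.
    sha: str
    author: str
    committer: str
    date: str
    message: str
    files: dict = field(default_factory=dict)  # path -> (added, removed)


def _count(value):
    # --numstat prints '-' instead of a number for binary files.
    return int(value) if value.isdigit() else 0


def parse_git_log(text):
    """Parse the output of (assumed invocation):
    git log --numstat --date=iso-strict --pretty=format:'@@%H|%an|%cn|%ad|%s'
    """
    commits = []
    for line in text.splitlines():
        if line.startswith("@@"):
            sha, author, committer, date, message = line[2:].split("|", 4)
            commits.append(Commit(sha, author, committer, date, message))
        elif line.strip() and commits:
            added, removed, path = line.split("\t", 2)
            commits[-1].files[path] = (_count(added), _count(removed))
    return commits


# Made-up sample in the format described above.
SAMPLE = (
    "@@a1b2c3|Alice|Alice|2011-11-18T10:00:00+01:00|Fix parser crash\n"
    "10\t2\tsrc/parser.py\n"
    "-\t-\tdocs/logo.png\n"
    "\n"
    "@@d4e5f6|Bob|Alice|2011-11-19T09:30:00+01:00|Add tests\n"
    "25\t0\ttests/test_parser.py\n"
)
commits = parse_git_log(SAMPLE)
for commit in commits:
    print(commit.sha, commit.author, commit.committer, sorted(commit.files))
```

Author and committer are kept as separate fields because they can differ, as in the second sample commit.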

SCM: main metrics

Regarding the size of the project or community:
- Number of commits
- Number of committers/authors
- Number of files touched
- Number of lines touched
- Usual programming language used (based on the file path)
- Others...
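
The size metrics above reduce to simple counts once the commits are available. The sketch below assumes a hypothetical input of `(author, {path: lines_touched})` pairs, and it guesses the language from the file extension, which is only a crude heuristic.

```python
import os
from collections import Counter

# Hypothetical input: one (author, {path: lines_touched}) pair per commit.
commits = [
    ("alice", {"src/core.c": 40, "src/core.h": 5}),
    ("bob", {"src/gui.py": 12}),
    ("alice", {"src/core.c": 8, "README": 3}),
]


def size_metrics(commits):
    authors = set()
    files = set()
    lines = 0
    extensions = Counter()  # one count per file touch, not per unique file
    for author, touched in commits:
        authors.add(author)
        files.update(touched)
        lines += sum(touched.values())
        for path in touched:
            extensions[os.path.splitext(path)[1] or "(none)"] += 1
    return {
        "commits": len(commits),
        "authors": len(authors),
        "files_touched": len(files),
        "lines_touched": lines,
        "top_extension": extensions.most_common(1)[0][0] if extensions else None,
    }


metrics = size_metrics(commits)
print(metrics)
```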

SCM: main metrics

Workload adequacy:
- Average number of commits per committer/author
- Average number of files/lines touched by committer/author
- Territoriality: number of files handled by only one developer
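
A minimal sketch of two of these workload metrics, assuming a hypothetical list of `(author, [paths])` pairs, one per commit. Territoriality is taken here to mean the number of files that only a single developer has ever touched.

```python
from collections import defaultdict

# Hypothetical input: one (author, [paths touched]) pair per commit.
commits = [
    ("alice", ["a.c", "b.c"]),
    ("bob", ["b.c"]),
    ("alice", ["c.c"]),
    ("carol", ["d.c"]),
]


def commits_per_author(commits):
    authors = {author for author, _ in commits}
    return len(commits) / len(authors) if authors else 0.0


def territoriality(commits):
    """Count files that were only ever touched by one developer."""
    owners = defaultdict(set)
    for author, paths in commits:
        for path in paths:
            owners[path].add(author)
    return sum(1 for devs in owners.values() if len(devs) == 1)


print(commits_per_author(commits), territoriality(commits))
```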

SCM: main metrics

Distribution of effort:
- Distribution of commits per developer (generally following a 20%-80% distribution: roughly 20% of the developers make about 80% of the commits)
- Distribution of modules or areas of the source code by developer
- Others...
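
One way to check how concentrated the effort is: compute the share of commits made by the most active fraction of developers. This is a sketch; the function name `top_share` and the sample data are invented for the example.

```python
import math
from collections import Counter


def top_share(commit_authors, fraction=0.2):
    """Share of all commits made by the top `fraction` of developers."""
    counts = sorted(Counter(commit_authors).values(), reverse=True)
    if not counts:
        return 0.0
    top_n = max(1, math.ceil(len(counts) * fraction))
    return sum(counts[:top_n]) / sum(counts)


# Made-up data: one author name per commit, five developers in total.
commit_authors = ["a"] * 16 + ["b", "c", "d", "e"]
share = top_share(commit_authors)
print(f"top 20% of developers made {share:.0%} of the commits")
```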

SCM: main metrics

Social network analysis:
- Creation of networks based on the type of action (e.g., two developers working on the same file, people working in the same programming language)
- Betweenness: identifies interesting people with deep know-how of the community (usually found connecting two different networks of people)
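
To make the idea concrete, the sketch below links developers who touched the same file and then computes betweenness centrality with Brandes' algorithm for unweighted, undirected graphs. The input mapping and all names are hypothetical. A graph library would normally be used for this; the algorithm is spelled out here only to keep the example self-contained.

```python
from collections import defaultdict, deque
from itertools import combinations


def cofile_network(touches):
    """Link two developers whenever they touched the same file."""
    graph = defaultdict(set)
    for devs in touches.values():
        for dev in devs:
            graph[dev]  # make sure isolated developers appear as nodes
        for a, b in combinations(sorted(devs), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def betweenness(graph):
    """Unnormalized betweenness centrality (Brandes' algorithm)."""
    score = dict.fromkeys(graph, 0.0)
    for source in graph:
        stack, preds = [], defaultdict(list)
        paths = dict.fromkeys(graph, 0)  # number of shortest paths
        paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        while stack:  # accumulate dependencies, farthest nodes first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each undirected path was counted once from each endpoint.
    return {dev: value / 2 for dev, value in score.items()}


# Made-up data: two teams, joined only through carol and dave.
touches = {
    "core.c": {"alice", "bob", "carol"},
    "gui.py": {"dave", "erin", "frank"},
    "build.mk": {"carol", "dave"},
}
scores = betweenness(cofile_network(touches))
print(sorted(scores.items(), key=lambda item: -item[1]))
```

In this sample only carol and dave get a non-zero score, because every shortest path between the two teams runs through them.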

SCM: main metrics

Evolutionary studies:
- Evolution in the number of new people coming to the community (regeneration of developers)
- Evolution in the number of fixing commits (data left by developers in the log message)
- Evolution in the number of commits (is the community growing in activity?)
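
These three series can be sketched as monthly counters. The input format of `(iso_date, author, message)` tuples is hypothetical, and spotting fixing commits by looking for "fix" or "bug" in the log message is an assumed heuristic. Real projects may follow other conventions.

```python
from collections import Counter


def monthly_evolution(commits):
    """commits: iterable of (iso_date, author, message) tuples."""
    activity, newcomers, fixes = Counter(), Counter(), Counter()
    seen = set()
    for date, author, message in sorted(commits):
        month = date[:7]  # 'YYYY-MM'
        activity[month] += 1
        if author not in seen:  # first commit ever by this author
            seen.add(author)
            newcomers[month] += 1
        if any(word in message.lower() for word in ("fix", "bug")):
            fixes[month] += 1
    return activity, newcomers, fixes


# Made-up history.
history = [
    ("2011-09-03", "alice", "Initial import"),
    ("2011-09-20", "bob", "Fix typo in README"),
    ("2011-10-02", "alice", "Add parser"),
    ("2011-10-15", "carol", "Bugfix: null pointer"),
    ("2011-10-30", "bob", "Refactor"),
]
activity, newcomers, fixes = monthly_evolution(history)
for month in sorted(activity):
    print(month, activity[month], newcomers[month], fixes[month])
```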

Mailing list: main metrics

Size of the community:
- Number of unique posters in the mailing lists
- Number of users posting
- Number of developers posting
- Number of mailing lists (specific mailing lists for developers, users, per language, etc.)
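
Counting unique posters amounts to normalizing the `From` header of every message. The sketch below uses Python's standard `email` package on made-up raw messages. For a real archive, the messages could be read with the standard `mailbox` module instead, which is not shown here.

```python
from email import message_from_string
from email.utils import parseaddr


def unique_posters(raw_messages):
    """Return the set of lower-cased sender addresses."""
    posters = set()
    for raw in raw_messages:
        message = message_from_string(raw)
        _, address = parseaddr(message.get("From", ""))
        if address:
            posters.add(address.lower())
    return posters


# Made-up messages; the addresses use a reserved example domain.
raw_messages = [
    "From: Alice <alice@example.org>\nSubject: Build fails\n\nHelp!",
    "From: ALICE@example.org\nSubject: Re: Build fails\n\nSolved.",
    "From: Bob <bob@example.org>\nSubject: Release plan\n\nDraft.",
]
posters = unique_posters(raw_messages)
print(len(posters), sorted(posters))
```

Matching on the address alone undercounts nothing but can overcount people who post from several addresses, so identity merging may be needed in practice.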

Mailing list: main metrics

Workload adequacy:
- How many developers are interacting with end users?
- Number of e-mails per developer / per user
- Number of e-mails per mailing list
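
Telling developers from users requires some external criterion. The sketch below assumes one possible rule: a poster counts as a developer if the same address also appears among the SCM authors. Both the rule and the sample data are assumptions made for illustration.

```python
from collections import Counter


def split_posts(post_senders, scm_addresses):
    """Return (developer, user) dicts mapping address -> e-mails sent."""
    known = {address.lower() for address in scm_addresses}
    per_poster = Counter(sender.lower() for sender in post_senders)
    developers = {p: n for p, n in per_poster.items() if p in known}
    users = {p: n for p, n in per_poster.items() if p not in known}
    return developers, users


def mean_posts(counts):
    return sum(counts.values()) / len(counts) if counts else 0.0


# Made-up data: one sender address per e-mail, plus the SCM authors.
post_senders = [
    "alice@example.org", "alice@example.org", "bob@example.org",
    "carol@example.org", "carol@example.org", "carol@example.org",
]
scm_addresses = ["alice@example.org", "bob@example.org"]

developers, users = split_posts(post_senders, scm_addresses)
print(len(developers), mean_posts(developers), len(users), mean_posts(users))
```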

Mailing lists: main metrics

Social network analysis:
- Similar to the metrics found in the SCM
- Detection of important people not registered in the SCM (e.g., lawyers)

Mailing lists: main metrics

Evolutionary analysis:
- Evolution in the number of new people posting new e-mails
- Evolution in the general activity in the mailing lists

BTS: main metrics

Size metrics:
- Number of bugs
- Number of open bugs
- Number of closed bugs
- Number of developers fixing bugs
- Number of users reporting bugs
- Number of developers reporting bugs
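
Given bug records exported from a tracker, these size metrics are straightforward counts. The record layout (`status`, `reporter`, `fixer`) and the set of known developers are hypothetical. Real bug tracking systems expose different fields.

```python
def bts_size_metrics(bugs, developers):
    """bugs: list of dicts with 'status', 'reporter' and 'fixer' keys."""
    reporters = {bug["reporter"] for bug in bugs}
    fixers = {bug["fixer"] for bug in bugs if bug["fixer"]}
    return {
        "bugs": len(bugs),
        "open_bugs": sum(1 for bug in bugs if bug["status"] == "open"),
        "closed_bugs": sum(1 for bug in bugs if bug["status"] == "closed"),
        "developers_fixing": len(fixers),
        "users_reporting": len(reporters - developers),
        "developers_reporting": len(reporters & developers),
    }


# Made-up records.
developers = {"alice", "bob"}
bugs = [
    {"id": 1, "status": "open", "reporter": "carol", "fixer": None},
    {"id": 2, "status": "closed", "reporter": "alice", "fixer": "bob"},
    {"id": 3, "status": "closed", "reporter": "dave", "fixer": "alice"},
    {"id": 4, "status": "open", "reporter": "bob", "fixer": None},
]
bts = bts_size_metrics(bugs, developers)
print(bts)
```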

BTS: main metrics

Workload adequacy:
- Average number of bugs fixed per developer
- Average number of bugs remaining open per developer

Source code: main metrics

Size adequacy:
- Number of lines
- Number of files
- Types of programming languages
- Types of files (source code, translation, images, etc.)
- Number of lines per file
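
A rough sketch of how these numbers could be collected from a checked-out source tree, grouping files by extension as a stand-in for file type. It counts physical lines only, and it does not skip version-control directories or binary files, which a real measurement would need to handle.

```python
import os
from collections import defaultdict


def source_size(root):
    """Return {extension: (number_of_files, number_of_lines)}."""
    stats = defaultdict(lambda: [0, 0])
    for folder, _, names in os.walk(root):
        for name in names:
            extension = os.path.splitext(name)[1] or "(none)"
            with open(os.path.join(folder, name), "rb") as handle:
                lines = sum(1 for _ in handle)
            stats[extension][0] += 1
            stats[extension][1] += lines
    return {ext: (files, lines) for ext, (files, lines) in stats.items()}


def lines_per_file(stats):
    files = sum(count for count, _ in stats.values())
    lines = sum(total for _, total in stats.values())
    return lines / files if files else 0.0


if __name__ == "__main__":
    sizes = source_size(".")
    for extension, (files, lines) in sorted(sizes.items()):
        print(extension, files, lines)
    print("average lines per file:", lines_per_file(sizes))
```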

Source code: main metrics

Static metrics:
- Fan-in/Fan-out
- Length of code
- Cyclomatic complexity
- Length of identifiers
- Depth of conditional nesting
- Fog index

Dynamic metrics:
- Depth of inheritance tree
- Method fan-in/fan-out
- Weighted methods per class
- Number of overriding operations

Source code: main metrics


Static metrics:
- Fan-in/Fan-out: fan-in is the number of functions or methods that call a given function or method; fan-out is the number of functions or methods that it calls (complexity, connascence).
- Length of code: size of the program, measured in SLOC (LOC or LLOC).
- Cyclomatic complexity: metric for the control complexity of software.
- Length of identifiers: average length of the identifiers used in a program. Supposedly, the longer the identifier, the better for the readability and maintenance of the code.
- Depth of conditional nesting: deeply nested statements are harder to understand.
- Fog index: average length of words and sentences in documents.
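
Two of these static metrics can be approximated for Python code with the standard `ast` module. This is a simplified sketch: cyclomatic complexity is estimated as one plus the number of decision points, and nesting depth only considers `if`, `for` and `while` statements. Note that Python represents `elif` as a nested `if`, so it counts as an extra level here.

```python
import ast

BRANCHES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source):
    """Approximate McCabe complexity: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCHES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            complexity += len(node.values) - 1  # each and/or adds a path
    return complexity


def nesting_depth(node, depth=0):
    """Deepest level of nested if/for/while statements below `node`."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        step = 1 if isinstance(child, (ast.If, ast.For, ast.While)) else 0
        deepest = max(deepest, nesting_depth(child, depth + step))
    return deepest


# Made-up function to measure.
SAMPLE = '''
def classify(values):
    total = 0
    for v in values:
        if v > 0 and v % 2 == 0:
            total += v
    return total
'''
print(cyclomatic_complexity(SAMPLE), nesting_depth(ast.parse(SAMPLE)))
```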

Source code: main metrics


Dynamic metrics:
- Depth of inheritance tree: number of discrete levels in the tree of classes (OOP). The deeper the tree, the more complex the design.
- Method fan-in/fan-out: distinction between the origin of calls to other methods (from the object or from external methods).
- Weighted methods per class: number of methods included in a class, weighted by the complexity of each method. Overly complex classes are difficult to understand and maintain.
- Number of overriding operations: operations overridden in sub-classes. The higher this number, the less appropriate the super-class may be.
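
Two of these metrics can be illustrated on Python classes through introspection. This is a sketch under stated conventions: `object` has depth 0, so a class that inherits directly from it has depth 1, and an overriding operation is any callable defined in a class whose name is also defined as a callable in one of its ancestors. The sample classes are invented.

```python
def depth_of_inheritance(cls):
    """Longest path from `cls` up to `object`, which has depth 0."""
    if cls is object:
        return 0
    return 1 + max(depth_of_inheritance(base) for base in cls.__bases__)


def overridden_operations(cls):
    """Names of methods in `cls` that replace an inherited one."""
    inherited = set()
    for base in cls.__mro__[1:]:
        if base is not object:
            inherited.update(
                name for name, value in vars(base).items() if callable(value)
            )
    return sorted(
        name for name, value in vars(cls).items()
        if callable(value) and name in inherited
    )


# Made-up class hierarchy.
class Shape:
    def area(self): return 0
    def name(self): return "shape"


class Polygon(Shape):
    def area(self): return 1
    def sides(self): return 3


class Square(Polygon):
    def area(self): return 4
    def sides(self): return 4
    def name(self): return "square"


for cls in (Shape, Polygon, Square):
    print(cls.__name__, depth_of_inheritance(cls), overridden_operations(cls))
```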

Source code: main metrics

Evolutionary studies:
- Evolution of the number of lines
- Clone detection (are parts of the source code being moved to other areas?)
- Evolution of the architecture
- Others...

Using metrics

- Metrics provide objective results; however, general conclusions must be inferred from them, so human interpretation is needed.
- Benchmarks can be created to serve as a comparison model.
- With such a benchmark, you can compare the current situation of the assessed project with that of others.

References

- Producing OSS, by Karl Fogel
- Tools and datasets for mining libre software repositories, by Gregorio Robles, Jesús M. González-Barahona, Daniel Izquierdo-Cortázar and Israel Herraiz
- Metrics and Models in Software Quality Engineering, by Stephen H. Kan
