
Simulation Approaches in Software Engineering

Systematic Review

quasi-Systematic Review Simulation Studies in Software Engineering


Breno Bernard Nicolau de França

Guilherme Horta Travassos

January 2011

COPPE/PESC


Contents

1 Introduction
2 quasi-Systematic Review
  2.1 Research Question Formulation
    2.1.1 Question Focus
    2.1.2 Question Quality and Amplitude
  2.2 Source Selection
    2.2.1 Source Selection Criteria Definition
    2.2.2 Study Language
    2.2.3 Search String
    2.2.4 Source Identification
    2.2.5 Source Selection after Evaluation
  2.3 Study Selection
    2.3.1 Study Definition
    2.3.2 Selection Execution
  2.4 Information Extraction
    2.4.1 Definition of Information Inclusion and Exclusion Criteria
    2.4.2 Information Extraction Form
    2.4.3 Extraction Execution
    2.4.4 Resolution of Divergences among Reviewers
  2.5 Results Summarization
    2.5.1 Results Presentation in Tables
    2.5.2 Sensitivity Analysis
    2.5.3 Plotting
    2.5.4 Final Remarks
3 Results Analysis
  3.1 Simulation Approaches
  3.2 Software Engineering Domains in Simulation Studies
  3.3 Simulation Tools for Software Engineering
  3.4 Characteristics of Simulation Models
  3.5 Verification and Validation (V&V) Procedures for Simulation Models
  3.6 Simulation Output Analysis
  3.7 Study Strategies involving Simulation
4 Conclusions
  4.1 Threats to Validity
  4.2 Open Questions
  4.3 State of the Art and Future Directions
5 References


1 Introduction
The term simulation can be defined as the imitation of the operation of a real-world process or system over time. Simulation involves generating an artificial history of the system and observing that history to draw inferences about the operating characteristics of the real system being represented (BANKS, 1999).

Computer simulation¹ emerged during the Second World War with the use of continuous and Monte Carlo mathematical models. There is no precise evidence about the origin of techniques such as discrete-event simulation, because these techniques are based on older mathematical models, from a period in which such models were still proposed without a computational implementation. However, without advances in computing hardware and software, it was unfeasible for some researchers and practitioners to consider simulation as a problem-solving technique, let alone as the technique most often used for most problems (NANCE and SARGENT, 2002). Nance and Sargent (2002) state that the widespread use of discrete-event simulation models in many areas, throughout the whole period of its evolution, shows how dominant this technique is in relation to the others.

A simulation model is a useful tool for studying the behavior of systems and processes, whether for observation and understanding or for optimization of the simulated object. Simulation models are built on classical simulation approaches² (or variations of them), which represent an abstraction that can be translated into computational structures. According to Birta and Arbez (2007), the main simulation approaches are discrete-event simulation and continuous-time simulation, or simply continuous simulation.

Simulation-based studies are defined as a series of steps, such as data collection, coding and verification, model validation, experimental design, output data analysis, and implementation (ALEXOPOULOS, 2007).
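To make the discrete-event approach named above concrete, the following is a minimal Python sketch of an event-driven simulation of a single-server queue. It is not taken from any of the cited works: the function name, its parameters, and the choice of exponential interarrival and service times are assumptions made only for illustration. The list it returns plays the role of the "artificial history" from which inferences would be drawn.

```python
import heapq
import random
from collections import deque


def simulate_single_server(n_customers, mean_interarrival, mean_service, seed=0):
    """Event-driven simulation of a single-server FIFO queue (illustrative only).

    Returns the artificial history as a list of (time, event_kind, customer).
    """
    rng = random.Random(seed)
    future_events = []  # heap of (time, sequence_number, kind, customer)
    seq = 0
    t = 0.0
    # Pre-schedule every arrival using exponential interarrival times.
    for customer in range(n_customers):
        t += rng.expovariate(1.0 / mean_interarrival)
        heapq.heappush(future_events, (t, seq, "arrival", customer))
        seq += 1

    history = []
    waiting = deque()  # customers queued for the server
    server_busy = False

    while future_events:
        # Advance the clock to the next scheduled event.
        clock, _, kind, customer = heapq.heappop(future_events)
        history.append((clock, kind, customer))
        if kind == "arrival" and server_busy:
            waiting.append(customer)
            continue
        if kind == "arrival":
            next_customer = customer
        elif waiting:
            next_customer = waiting.popleft()
        else:
            server_busy = False
            continue
        # Start service and schedule the matching departure event.
        server_busy = True
        done = clock + rng.expovariate(1.0 / mean_service)
        heapq.heappush(future_events, (done, seq, "departure", next_customer))
        seq += 1
    return history


history = simulate_single_server(50, mean_interarrival=2.0, mean_service=1.5, seed=1)
```

The simulation clock jumps from one event to the next rather than advancing in fixed increments, which is the defining trait of the discrete-event approach. The sequence number in each heap entry only breaks ties between events scheduled for the same instant.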
Simulation has interesting properties from an experimentation standpoint. It allows a high degree of control over the environment, which makes it possible to observe facts and to validate hypotheses or theories. In addition, because the environment is virtual, the time and effort spent running simulations are low compared with experiments on real-world systems or processes, which makes it feasible to run all possible combinations of the variables under investigation (ÖREN, 2009).

Maria (1997) presents a scheme for a simulation-based study, shown in Figure 1. It is an iterative procedure in which the number of cycles depends on changes to the system under study, changes in the perspective from which the system is observed, or inconclusive results. After being altered, the system becomes the subject of study again, and so on. Each rectangle in the figure represents a step of the study. These steps always require decision making. The only step that does not require human intervention is the execution of the simulations, which can be performed by software packages.
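The remark about running all possible combinations of the variables under investigation corresponds to a full factorial design. The sketch below enumerates such a design in Python. The factor names, their levels, and the `toy_model` function are hypothetical placeholders for a real simulation model and are not part of this review.

```python
from itertools import product


def full_factorial(factors):
    """Yield one dict per combination of factor levels."""
    names = list(factors)
    for levels in product(*(factors[name] for name in names)):
        yield dict(zip(names, levels))


def toy_model(team_size, review_enabled):
    """Hypothetical stand-in for one simulation run (not from the report)."""
    defects = 100 / team_size
    return defects * (0.5 if review_enabled else 1.0)


# Two factors with three and two levels give 3 * 2 = 6 runs.
factors = {"team_size": [2, 4, 5], "review_enabled": [False, True]}
results = [(run, toy_model(**run)) for run in full_factorial(factors)]
```

Because each run is a cheap function call rather than a real project, the number of runs grows only with the product of the numbers of levels, which is what makes exhaustive designs practical in a virtual environment.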

¹ In this text, the terms simulation and computer simulation are used as synonyms, as they are in the technical simulation literature.
² Here we use the expression simulation approach because it has broader semantics. In the specialized literature, however, other terms may appear, such as paradigm, method, and technique.

Figure 1. Scheme for simulation-based studies (MARIA, 1997).

Maria (1997) also lists eleven steps for developing a simulation model, designing the experiment, and analyzing the results:

1. Problem identification;
2. Problem formulation;
3. Collection and processing of data from the real system;
4. Formulation and development of a model;
5. Model validation;
6. Documentation of the model for future use;
7. Selection of the appropriate experimental design;
8. Establishment of the experimental conditions for the runs;
9. Execution of the simulations;
10. Interpretation and presentation of the results;
11. Recommendation of future directions.

Simulation has been used successfully in different disciplines for a wide range of purposes. Examples of these disciplines are engineering, economics, biology, and the social sciences (MÜLLER and PFAHL, 2008). In Software Engineering (SE), model construction is also diverse in purpose and domain, mainly in the context of complex software systems and processes.


According to Travassos and Barros (2003), in virtuo and in silico studies are the classes of Software Engineering studies in which simulation is applicable. In in virtuo studies, the object of study is simulated but the participant is real. In in silico studies, both the object of study and the participants are simulated.

Simulation-based studies make it possible to reduce risk, time, and cost, given that the study runs in a virtual environment. The virtual nature of the environment also makes studies easier to repeat. Another advantage is that simulation-based studies allow hypotheses to be tested before they are implemented in real experiments, so the effects of such implementations can be predicted.

In Software Engineering, simulation models based on existing approaches, such as discrete-event simulation and system dynamics, already exist. However, a series of decisions is involved in choosing the appropriate approach for building a simulation model of the behavior to be observed. Some Software Engineering examples follow, illustrating how certain approaches have been used to solve problems in subareas of the discipline.

Luckham et al. (1995) present Rapide, an ADL (Architecture Description Language) that supports simulation and behavioral analysis of system architectures. The language describes an architecture in an executable form so that simulations can be performed in the early phases of software development, before implementation decisions are made. The model used for simulation with Rapide, also called the execution model, is based on a set of events that are generated together with their causes and the time of their occurrence, forming partially ordered sets of events that describe the dependencies among those events.
When a simulation starts, this set of events is generated and observed by a set of processes (threads of control that are part of the components of an architecture). These processes react (perform some action) to the generation of an event and in turn generate new events through trigger-like mechanisms.

Arief and Speirs (1999) present an approach for the automated generation of simulation models from UML models, in order to predict the performance of the modeled system. The input model consists of class diagrams and interaction diagrams (for example, sequence diagrams) at the design level. The mapping is performed by a parser developed to translate UML elements into discrete-event simulation elements, specifically into the object-oriented language JavaSim, a Java implementation of the C++Sim toolkit. This language has constructs related to elements of the discrete-event approach, such as processes, queues, input parameters, (pseudo)random variables, and events. The generated model executes by generating events according to theoretical distribution functions (such as uniform, exponential, normal, and Erlang), which are the inputs to the process-based discrete-event simulation. Each process represents an instance of a class in the model and exchanges messages with other processes until model execution ends. The performance results for the system represented by the input UML model make it possible to assess bottlenecks, processing time, and the scalability conditions of the system.

The DynaReP model (AL-EMRAN et al., 2008) follows the discrete-event paradigm for software process simulation and focuses on release planning and re-planning. It shows the implications of adding a feature to a release, or removing one, in terms of cost, effort, schedule, and available resources. Of the release planning process, the model addresses only Operational Release Planning, that is, the allocation of resources to the tasks in each release, and Dynamic Re-planning, that is, the revision of plans to handle unexpected changes imposed on the product or project managers responsible for implementing individual releases.

Another simulation approach widely used in Software Engineering is System Dynamics (FORRESTER, 1961). It is a continuous approach capable of representing the behavior of complex systems by means of causal diagrams based on feedback loops, stock-and-flow diagrams, and mathematical equations that determine the relationships between variables and the rates associated with the flows. According to Madachy (2008), the basic elements of system dynamics are levels (or stocks), flows, and sources/sinks. These elements can be seen in Figure 2, which shows an example of a system dynamics model representing the relationships among some software development variables, such as productivity and defect injection and detection. Levels are represented by rectangles, flows by valves, and sources/sinks by clouds.

Figure 2. Example of a system dynamics model (MADACHY, 2008).
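The stock-and-flow structure described above can be made concrete with a small numerical sketch. The Python code below integrates a two-stock defect model with the Euler method. It is not the model shown in Figure 2: the stocks, the constant injection rate, the proportional detection flow, and all parameter values are assumptions chosen only to show how levels, flows, and a source interact.

```python
def simulate_defects(steps, dt, injection_rate, detection_fraction):
    """Euler integration of a hypothetical two-stock defect model.

    Stocks (levels): undetected defects and detected defects.
    Flows: injection (source -> undetected) and detection
    (undetected -> detected), the latter proportional to the stock.
    Returns a list of (time, undetected, detected) tuples.
    """
    undetected = 0.0
    detected = 0.0
    trajectory = [(0.0, undetected, detected)]
    for step in range(1, steps + 1):
        injection = injection_rate                    # defects per time unit
        detection = detection_fraction * undetected   # feedback on the stock
        # Each stock changes by (inflow - outflow) * dt.
        undetected += (injection - detection) * dt
        detected += detection * dt
        trajectory.append((step * dt, undetected, detected))
    return trajectory


trajectory = simulate_defects(steps=100, dt=0.25,
                              injection_rate=4.0, detection_fraction=0.2)
final_time, final_undetected, final_detected = trajectory[-1]
```

With these assumed values the undetected stock approaches the equilibrium where inflow equals outflow, injection_rate / detection_fraction = 20 defects, which is the kind of feedback behavior that stock-and-flow diagrams are meant to expose.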

The model of Abdel-Hamid and Madnick (1991) is perhaps the most widely known system dynamics model in Software Engineering. It was proposed as a model for simulating software projects and is quite comprehensive in the subsystems it considers:

- Human Resource Management: handles variables such as training, personnel turnover at the organizational and project levels, and the experience level and productivity of developers, among others;
- Software Production: represents the allocation of effort to the project;
- Software Development: the largest subsystem; it treats productivity as a complex variable that represents project progress, distinguishing potential productivity from actual productivity and modeling how the latter is affected by variables such as motivation and communication;
- Quality Assurance and Rework: as the name suggests, represents defect injection and detection rates, the impact of schedule pressure on these rates, and the rework resulting from these defects;
- Testing: represents the cycles of testing activities during the project, as well as the impact in later phases of defects that were not corrected or not detected;
- Control: concerns the measurement of completed tasks, productivity, and rework, among others, as well as adjustments to effort allocation and workload;
- Planning: represents schedule stability and the on-time completion of activities.

Many other models were proposed on the basis of this one. Examples are the models of Madachy (1996) and Martin and Raffo (2000). Other examples of models that use system dynamics are the model of Barros et al. (2003) for Risk Management, which uses scenario-based simulation to model the impact of risks and the effectiveness of resolution strategies, supporting decision making throughout the development process, and the model of Araújo (2004) for observing trends in software decay, based on Lehman's Laws of Software Evolution (LEHMAN, 1980).

Although simulation-based studies have many advantages, the models must be evaluated before studies are conducted with them, especially with respect to construct validity, that is, how closely the model approximates the real system (WOHLIN et al., 2000). In addition, building these models involves considerable cost and effort.

Given the wide diversity of simulation approaches and the possibility of applying them in simulation-based studies in Software Engineering, the goal of this quasi-Systematic Review of the literature is to characterize how the simulation approaches in the literature have been applied in simulation-based studies in Software Engineering. From this characterization, we expect to point out the advantages and disadvantages of each approach, as well as the characteristics that determine its applicability to a given problem. This makes it possible to identify the most suitable approach for simulating certain characteristics of systems or processes, or of a particular system or process, reducing the risk of building a model that cannot return the desired results.


2 quasi-Systematic Review
This section presents the protocol of the systematic review. The protocol defines the objectives of the study and organizes the procedures to be followed by those conducting the review. The template proposed by Biolchini et al. (2005) was used to prepare this document. The purpose of the study is characterization; that is, there is no prior knowledge that would allow comparisons to be made. We therefore call it a quasi-Systematic Review, following Travassos et al. (2008).

2.1 Research Question Formulation


2.1.1 Question Focus

To carry out a quasi-Systematic Review with the objective of characterizing how the computer simulation approaches in the literature have been applied in simulation-based studies in Software Engineering (SE), intending to:

- Identify which SE domains use simulation approaches and thereby determine the frequency of use per domain;
- Identify the methodologies used in planning and conducting these studies.

2.1.2 Question Quality and Amplitude

Problem: Simulation-based studies offer advantages in the preliminary investigation of hypotheses. However, the appropriate simulation approach must be used to build the model of the system to be simulated. By characterizing the different simulation approaches, we expect to improve guidance for planning and conducting simulation-based studies and for building simulation models in SE, so that the approach and the experimental design can be chosen objectively.

Question:

- Question 0: How have the different computer simulation approaches in the literature been applied in simulation-based studies in Software Engineering (SE)?

Keywords and synonyms:

- Simulation-based studies: simulation studies, computer simulation, modeling and simulation, simulation and modeling, in virtuo, in silico, sampling, Monte Carlo, stochastic modeling, system dynamics, discrete-event simulation, state-based simulation, agent-based simulation;
- Software Engineering: systems engineering, application engineering, software development;
- Simulation model: system dynamics model, discrete-event model, agent model, state model.

To structure the search string, we used the PICO approach as defined by Pai et al. (2004). In this approach, the research question (search string) is split into four parts: Population of interest, Intervention or exposure being evaluated, Comparison (if applicable), and Outcome.

- Population: Articles that present simulation-based studies in SE.
- Intervention: Computer simulation models used in the studies.
- Comparison: None.
- Outcome measure: Goal, domain, and experimental aspects (experimental design) of the study, as well as characteristics of the model.

The outcome coincides with the information to be extracted from the articles (see the extraction form). However, article abstracts and titles do not use the key terms to identify such information; for example, an abstract may describe characteristics of the proposed model without using the term "characteristic" to identify them. For this reason, the outcome was omitted from the search string and will be considered only when extracting information from the articles. This decision was made because the outcome terms did not provide good coverage. Because the effort required to handle the number of articles returned when considering only population and intervention is feasible, we decided to keep all articles for the selection phase.

- Effect: Characterization of simulation approaches in the context of SE.
- Application: Software Engineering researchers and software engineers.
- Experimental design: No statistical method will be applied to the results.

Control: The following articles were used as controls in constructing the search string:

- Martin, R.; Raffo, D. Application of a hybrid process simulation model to a software development project. Journal of Systems and Software, Volume 59, Issue 3, 2001, Pages 237-246;
- Khosrovian, K.; Pfahl, D.; Garousi, V. GENSIM 2.0: A customizable process simulation model for software process evaluation. Lecture Notes in Computer Science, Volume 5007 LNCS, 2008, Pages 294-306;
- Drappa, A.; Ludewig, J. Simulation in software engineering training. Proceedings - International Conference on Software Engineering, 2000, Pages 199-208;
- Madachy, R. System dynamics modeling of an inspection-based process. Proceedings - International Conference on Software Engineering, 1995, Pages 376-386;
- Al-Emran, A.; Pfahl, D.; Ruhe, G. A method for re-planning of software releases using discrete-event simulation. Software Process Improvement and Practice, Volume 13, Issue 1, January 2008, Pages 19-33;
- Al-Emran, A.; Pfahl, D.; Ruhe, G. DynaReP: A discrete event simulation model for re-planning of software releases. Lecture Notes in Computer Science, Volume 4470 LNCS, 2007, Pages 246-258;
- Luckham, D. C.; Kenney, J. J.; Augustin, L. M.; Vera, J.; Bryan, D.; Mann, W. Specification and analysis of system architecture using Rapide. IEEE Transactions on Software Engineering, Volume 21, Issue 4, April 1995, Pages 336-355;
- Arief, L. B.; Speirs, N. A. A UML tool for an automatic generation of simulation programs. Proceedings Second International Workshop on Software and Performance WOSP, 2000, Pages 71-76;
- Choi, K.; Bae, D.-H.; Kim, T. An approach to a hybrid software process simulation using the DEVS formalism. Software Process Improvement and Practice, Volume 11, Issue 4, July 2006, Pages 373-383.

2.2 Source Selection


2.2.1 Source Selection Criteria Definition

The sources used for this quasi-Systematic Review of the literature are digital libraries available on the web whose articles are accessible. In addition, the libraries must allow online queries through a search engine that supports logical expressions for defining the search string. The search engine must also allow searching by article title, abstract, and keywords. The libraries must contain articles from several Software Engineering domains.

2.2.2 Study Language

English.

2.2.3 Search String

Question 0: How have the different computer simulation approaches in the literature been applied in simulation-based studies in Software Engineering?

P: (("simulation modeling" OR "simulation modelling" OR "in silico" OR "in virtuo" OR "simulation based study" OR "simulation study" OR "computer simulation" OR "modeling and simulation" OR "modelling and simulation" OR "simulation and modeling" OR "simulation and modelling" OR "process simulation" OR "discrete-event simulation" OR "event based simulation" OR "system dynamics" OR sampling OR "monte carlo" OR "stochastic modeling" OR "agent based simulation" OR "state based simulation") AND ("software engineering" OR "systems engineering" OR "application engineering" OR "software development" OR "application development" OR "system development"))

I: ("simulation model" OR "discrete event model" OR "event based model" OR "system dynamics model" OR "agent model" OR "state model")

C: Not applicable.

O: Omitted. Terms that represent what the output should be: ("area" OR "domain" OR "context" OR "discipline" OR "study planning" OR "study design" OR "experimental support" OR "experimental planning" OR "experimental design" OR "experimental study" OR "goal" OR "target" OR "objective" OR "purpose" OR "problem" OR "aim" OR "characteristic" OR "property" OR "feature" OR "attribute" OR "aspect" OR "factor" OR "dimension" OR "perspective" OR "advantage" OR "disadvantage" OR "benefit" OR "approach" OR "technique" OR "method" OR "paradigm" OR "mechanism" OR "instrument" OR "methodology" OR "procedure")

2.2.4 Source Identification

Source search methods: The sources will be accessed through the search engines of the digital libraries.

List of sources:

- Scopus:

TITLE-ABS-KEY((("simulation modeling" OR "simulation modelling" OR "in silico" OR "in virtuo" OR "simulation based study" OR "simulation study" OR "computer simulation" OR "modeling and simulation" OR "modelling and simulation" OR "simulation and modeling" OR "simulation and modelling" OR "process simulation" OR "discrete-event simulation" OR "event based simulation" OR "system dynamics" OR sampling OR "monte carlo" OR "stochastic modeling" OR "agent based simulation" OR "state based simulation") AND ("software engineering" OR "systems engineering" OR "application engineering" OR "software development" OR "application development" OR "system development")) AND ("simulation model" OR "discrete event model" OR "event based model" OR "system dynamics model" OR "agent model" OR "state model"))

- Web of Science (ISI Knowledge):

TS=((("simulation modeling" OR "simulation modelling" OR "in silico" OR "in virtuo" OR "simulation based study" OR "simulation study" OR "computer simulation" OR "modeling and simulation" OR "modelling and simulation" OR "simulation and modeling" OR "simulation and modelling" OR "process simulation" OR "discrete-event simulation" OR "event based simulation" OR "system dynamics" OR sampling OR "monte carlo" OR "stochastic modeling" OR "agent based simulation" OR "state based simulation") AND ("software engineering" OR "systems engineering" OR "application engineering" OR "software development" OR "application development" OR "system development")) AND ("simulation model" OR "discrete event model" OR "event based model" OR "system dynamics model" OR "agent model" OR "state model"))

- Engineering Village (Ei Compendex):

((("simulation modeling" OR "simulation modelling" OR "in silico" OR "in virtuo" OR "simulation based study" OR "simulation study" OR "computer simulation" OR "modeling and simulation" OR "modelling and simulation" OR "simulation and modeling" OR "simulation and modelling" OR "process simulation" OR "discrete-event simulation" OR "event based simulation" OR "system dynamics" OR sampling OR "monte carlo" OR "stochastic modeling" OR "agent based simulation" OR "state based simulation") AND ("software engineering" OR "systems engineering" OR "application engineering" OR "software development" OR "application development" OR "system development")) AND ("simulation model" OR "discrete event model" OR "event based model" OR "system dynamics model" OR "agent model" OR "state model")) WN KY

2.2.5 Source Selection after Evaluation

The three selected sources satisfy the criteria defined in Section 2.2.1. In addition, Scopus, Ei Compendex, and Web of Science (ISI Knowledge) cover the main digital libraries (including conference and journal articles) for research in computer simulation and software engineering, as well as other related areas. Examples of these libraries are ACM, IEEE, Elsevier, Springer, and Wiley.
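Since the three queries in Section 2.2.4 share one boolean expression and differ only in the database-specific wrapper, the string can be assembled from the term lists of Section 2.2.3. The Python sketch below shows one way to do this. The helper names and the `WRAPPERS` table are assumptions made for illustration; only the terms and the wrapper syntax (TITLE-ABS-KEY, TS=, WN KY) come from the protocol.

```python
SIMULATION_STUDY_TERMS = [
    "simulation modeling", "simulation modelling", "in silico", "in virtuo",
    "simulation based study", "simulation study", "computer simulation",
    "modeling and simulation", "modelling and simulation",
    "simulation and modeling", "simulation and modelling",
    "process simulation", "discrete-event simulation",
    "event based simulation", "system dynamics", "sampling", "monte carlo",
    "stochastic modeling", "agent based simulation", "state based simulation",
]
SE_TERMS = [
    "software engineering", "systems engineering", "application engineering",
    "software development", "application development", "system development",
]
MODEL_TERMS = [
    "simulation model", "discrete event model", "event based model",
    "system dynamics model", "agent model", "state model",
]

# Hypothetical wrapper table: one template per digital library.
WRAPPERS = {
    "Scopus": "TITLE-ABS-KEY{q}",
    "Web of Science": "TS={q}",
    "Engineering Village": "{q} WN KY",
}


def any_of(terms):
    """Join terms with OR, quoting phrases that contain a space or hyphen."""
    quoted = [f'"{t}"' if (" " in t or "-" in t) else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query():
    """Population AND Intervention; the Outcome part is omitted by design."""
    population = "(" + any_of(SIMULATION_STUDY_TERMS) + " AND " + any_of(SE_TERMS) + ")"
    intervention = any_of(MODEL_TERMS)
    return "(" + population + " AND " + intervention + ")"


query = build_query()
queries = {name: template.format(q=query) for name, template in WRAPPERS.items()}
```

Keeping the term lists in one place means that a change to a synonym is propagated to every database query, which reduces the risk of the per-source strings drifting apart.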

2.3 Study Selection


2.3.1 Study Definition

Definition of inclusion and exclusion criteria for studies:

- Inclusion:
  - The articles must be available on the web;
  - The articles must be written in English;
  - The articles must address (computer) simulation-based studies;
  - The studies must belong to the Software Engineering domain; and
  - The article must mention one or more simulation models.
- Exclusion:
  - Articles not written in English;
  - Articles published in venues that do not require peer review;
  - Articles that address non-computer simulation;
  - Articles that do not present any study; and
  - Prefaces and presentations of conference proceedings.

Definition of article types:

- Theoretical articles (with theoretical foundation) describing a simulation model;
- Quantitative primary studies; and
- Secondary studies.

Procedure for study selection:

Three researchers will apply the search strategy to identify potential articles. Study selection will be based on the title and abstract of the articles. Researcher 1 runs the search strings on the search engines, retrieves the articles, stores them in the JabRef³ reference manager together with their abstracts, adds a column to indicate the status of each article (I - Included, E - Excluded, D - Doubt), and removes duplicates. Researcher 1 then performs the first classification. Next, Researcher 2 receives the JabRef file (in BibTeX format) containing the references and additional annotations (for example, the article status) and checks the I and E entries. If an entry needs to be changed, Researcher 2 marks it as D2. Researcher 2 also reclassifies the D entries as included or excluded, marking them I2 or E2 so that the changes can be traced. Finally, Researcher 3 does the same as Researcher 2, but with I3 or E3 marks. Any remaining D entries are passed on, and a meeting is held at the end for the final decision, with a detailed analysis of the D entries. If doubts (D entries) still remain after Researcher 3, these articles are included for later analysis.

Even articles included at the selection stage, based on reading the title and abstract, may be excluded later at the extraction stage. At the extraction stage the full text is read, which can lead to a better understanding of the article, clarifying doubts and supporting a better decision on whether to keep it as included or exclude it for not actually satisfying the established criteria.

2.3.2 Selection Execution

Initial study selection:
| Search date | Scopus | Web of Science | Ei Compendex | Total | Duplicates | Articles for selection |
|---|---|---|---|---|---|---|
| 14/03/2011 | 906 | 85 | 501 | 1492 | 546 | 946 |
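The counts in this table follow from merging the three result sets and removing duplicates: 906 + 85 + 501 = 1492 records, of which 546 are duplicates, leaving 946 articles. The protocol states that duplicates were eliminated in JabRef but not how they were matched, so the Python sketch below, which matches records by normalized title, is only an assumed approach with hypothetical record fields.

```python
def normalize_title(title):
    """Lowercase and keep only alphanumerics so trivial variants match."""
    return "".join(ch for ch in title.lower() if ch.isalnum())


def merge_results(results_by_source):
    """Merge per-source record lists, dropping duplicates by normalized title.

    Returns (unique_records, n_total, n_duplicates).
    """
    seen = set()
    unique = []
    total = 0
    for source, records in results_by_source.items():
        for record in records:
            total += 1
            key = normalize_title(record["title"])
            if key in seen:
                continue  # already retrieved from an earlier source
            seen.add(key)
            unique.append({**record, "source": source})
    return unique, total, total - len(unique)


# Consistency check of the reported counts.
reported = {"Scopus": 906, "Web of Science": 85, "Ei Compendex": 501}
reported_total = sum(reported.values())
articles_for_selection = reported_total - 546

# Toy data (invented titles) showing the merge itself.
toy = {
    "Scopus": [{"title": "A Hybrid Process Simulation Model"}],
    "Ei Compendex": [{"title": "A hybrid process-simulation model."},
                     {"title": "Another study"}],
}
unique, n_total, n_duplicates = merge_results(toy)
```

Title normalization is deliberately crude. In practice a reviewer would still inspect near-matches by hand, since the same study can be indexed under slightly different titles.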

Quality Assessment of Studies: The criteria used to assess the quality of each paper are related to the items in the extraction form, as shown in Table 1.

Footnote 3: The JabRef bibliographic reference manager (http://jabref.sourceforge.net/) supports structured storage of bibliographic references, as well as annotations on these references. For this review, the information extracted from each reference was stored as annotations in the tool.

Table 1. Paper Quality Criteria

Criterion | Value
Identifies the simulation approach used? | 1 pt
States the purpose of the simulation model? | 1 pt
States the purpose of the study? | 1 pt
Is it possible to identify the domain (Software Engineering discipline) in which the study was applied? | 1 pt
Mentions the tool support used to conduct the simulations? | 0.5 pt
Describes the characteristics pointed out for the simulation model? | 1 pt
Presents a classification of the characteristics pointed out? | 0.5 pt
Presents the advantages of the simulation model? | 0.5 pt
Presents the disadvantages of the simulation model? | 0.5 pt
Verification and Validation techniques | 1 pt
Describes the analysis methodology for the simulation results? | 1 pt
Identifies the type of study in which the simulation model was used as an instrument, together with its experimental design? | 1 pt (0.5 for the type of study + 0.5 for the experimental design)
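The weights in Table 1 add up to 10 points. A minimal Python sketch of how a paper's quality score could be computed from them is shown below; the dictionary keys and the function name `quality_score` are shorthand labels assumed here for illustration, while the weights themselves are the ones listed in Table 1.

```python
# Weights from Table 1; the last criterion is split into its two halves.
QUALITY_WEIGHTS = {
    "simulation_approach": 1.0,
    "model_purpose": 1.0,
    "study_purpose": 1.0,
    "domain": 1.0,
    "tool_support": 0.5,
    "characteristics": 1.0,
    "characteristics_classification": 0.5,
    "advantages": 0.5,
    "disadvantages": 0.5,
    "vv_techniques": 1.0,
    "analysis_methodology": 1.0,
    "study_type": 0.5,
    "experimental_design": 0.5,
}

MAX_SCORE = sum(QUALITY_WEIGHTS.values())


def quality_score(satisfied):
    """Sum the weights of the criteria a paper satisfies."""
    satisfied = set(satisfied)
    unknown = satisfied - set(QUALITY_WEIGHTS)
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    return sum(QUALITY_WEIGHTS[name] for name in satisfied)


# Hypothetical paper that names its approach, domain and tool only
print(quality_score(["simulation_approach", "domain", "tool_support"]))
print(MAX_SCORE)
```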

Review of Selections: Since one of the three reviewers (R1) performed the first selection of papers based on titles and abstracts, the other two reviewers (R2 and R3) inspected the result of the selection made by R1, so that there is a tie-breaking criterion and the bias in paper selection is reduced.

Papers for Selection | Conflicts | After tie-breaking: Excluded | After tie-breaking: For Data Extraction
946 | 29 | 796 | 150

2.4 Information Extraction

2.4.1 Definition of Inclusion and Exclusion Criteria for Information
The information extracted from the papers must contain descriptions of simulation-based studies, as well as of the models used in these studies.

2.4.2 Information Extraction Form
For each paper selected after the study selection process, the researchers will extract the data listed in Table 2 and organize them using the JabRef tool.


Table 2. Information Extraction Form

Field | Extracted Information
Paper Identification | [title, authors, source, year, paper type]
Name of the simulation approach used |
Purpose of the simulation model used | [Objective for which the model was built]
Purpose of the study | [Objective for which the study was conducted]
Domain (Software Engineering discipline) in which the study was applied | [Application area of the study. If the study belongs to more than one area, state them explicitly.]
Tool support | [Does the model used in the study have any tool support? If so, which one?]
Characteristics pointed out for the simulation model | [for example, discrete, continuous, deterministic, stochastic, event-based, among others]
Classification of the characteristics pointed out | [Is there any classification of the characteristics? Describe it.]
Advantages of the simulation model |
Disadvantages of the simulation model |
Verification and Validation techniques | [V&V techniques used to evaluate the model used in the study, in terms of internal, external and construct validity]
Analysis Methodology | [methodologies for analyzing the results of the simulation-based study]
Type of study in which the simulation model was used as an instrument | [experiment, observation, ...]
Main results of the paper regarding the approach | [applicability of the approach to the problem, accuracy of the results]

2.4.3 Execution of Extraction

Papers for Extraction | Excluded | No access | Total Extracted
150 | 28 | 14 | 108
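The counts reported across the search, selection and extraction stages can be cross-checked with simple arithmetic. The short Python sketch below only re-computes the figures given in the tables of sections 2.3.2 and 2.4.3; the helper name `remaining` is an assumption introduced for illustration.

```python
def remaining(start, *removed):
    """Papers left after removing the given groups from `start`."""
    left = start - sum(removed)
    if left < 0:
        raise ValueError("more papers removed than available")
    return left


retrieved = 906 + 85 + 501                      # Scopus + Web of Science + Ei Compendex
for_selection = remaining(retrieved, 546)       # minus duplicates
for_extraction = remaining(for_selection, 796)  # minus papers excluded in selection
extracted = remaining(for_extraction, 28, 14)   # minus excluded and not accessible

print(retrieved, for_selection, for_extraction, extracted)
```

Each value matches the corresponding table cell (1492, 946, 150 and 108), so the reported stages are mutually consistent.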

2.4.4 Resolution of Divergences among Reviewers
Not applicable.


2.5 Results Summarization


2.5.1 Results Presentation in Tables
All the papers, as well as the extracted information, were stored in the JabRef tool.

2.5.2 Sensitivity Analysis
Not applicable.

2.5.3 Plotting
Content presented in the analysis section.

2.5.4 Final Remarks
Number of papers: After extracting information from each of the 108 research papers, considering title, abstract and body of text, it was possible to identify 88 simulation models, distributed among several Software Engineering research sub-areas, called domains.
Search, Selection and Extraction Bias: Publications are restricted to the sources indexed by the digital libraries and search engines used.
Publication Bias: As expected, no negative result was found. However, since this is a characterization review, we did not consider only the results of each study, but mainly how the studies were conducted and in which context, in order to understand how each simulation model or approach could be used in SE.
Inter-Reviewers Variation: All divergences were solved by a third reviewer and agreed upon by the others.
Results Application: The results of this review can be used as a starting point for future research directions to be addressed by the Software Engineering community when conducting simulation-based studies. In addition, the information can be organized as a body of knowledge to support decision making regarding simulation in Software Engineering.
Recommendations:


3 Results Analysis


Among the 108 papers selected for extraction, only two secondary studies were identified: a literature survey [Ahmed et al, 2008] and a Systematic Literature Review [Zhang et al, 2008]. Both specifically discuss Software Process Simulation.

The survey by Ahmed et al (2008) relates its results to the current practice in simulation. The results indicate that software process simulation practitioners are, in general, methodical, work with complex problems (resulting in large-scale models), and use a systematic process to develop simulation models. The survey also points out that the simulation modeling process and model evaluation are the main issues needing attention from the community.

The systematic review by Zhang et al (2008) aims at tracing the evolution of Software Process Simulation and Modeling (SPSM) research over 10 years, from 1998 to 2008. The authors analyzed about 200 relevant papers in order to answer the research questions defined in their research protocol. Among the main contributions of the systematic review, the authors highlight: (1) categories for classifying software process simulation models; (2) research on improving the efficiency of SPSM is gaining importance; and (3) hybrid process simulation models have attracted interest as a way to capture complex real-world software processes more realistically.

In the following sub-sections we present the characterization results for the studies and models found in the context of Software Engineering.

3.1 Simulation Approaches


Most papers about simulation-based studies in Software Engineering adopt a discrete-event or a continuous simulation approach (the latter mostly represented by System Dynamics). The Systematic Review presented by Zhang et al (2008) confirms this statement in the context of Software Process and Project Simulation. Some slightly different approaches appear in the technical literature, but they rely on discrete or continuous behavior. For example, Agent-Based Simulation is often mentioned as a distinct abstraction, but agents and their environment are usually characterized by continuous behavior. Figure 1 presents the simulation approaches found in this study, distributed according to their numbers of models and papers.


Figure 1. Simulation Approaches Distribution

In Figure 1, the System Dynamics and Discrete-event approaches clearly dominate. Even when authors present hybrid models, most of the combinations fall into these two approaches. Discrete-event simulation is a mature approach that has succeeded for years in a vast range of research areas. System Dynamics, however, seems to have another explanation for its predominance, specifically in Software Engineering: the influence of the Abdel-Hamid and Madnick (AHM) integrated software project model [Abdel-Hamid and Madnick, 1990]. Many works mention this model as a basis for new ones; for example, Martin and Raffo (2001), Lee and Miller (2004) and Choi et al (2006) use parts of the AHM model to conceive their own models. The AHM model encompasses a great part of what can be observed in software projects from a continuous perspective.

Some simulation approaches found in this quasi-systematic review were not clearly defined. This happens basically for two reasons: either the papers do not explicitly state that the models were based on a specific approach, or the specification of the proposed/used models is not similar to any known approach. The model presented in [Ormon et al, 2001] seems to be an analytical model rather than a simulation model, but there are not enough details to confirm it and no simulation approach is mentioned. Another example of a simulation approach that is not clearly identified is [Navarro and Hoek, 2005]: we were not able to find details about their model, and no sentence mentions the simulation approach used to execute the simulations. These approaches were grouped into the Not Specified category, since we were not able to understand the underlying approach. Something similar occurs in the categories General Discrete-time Simulation and General Continuous-time Simulation, where it was possible to perceive in the model description that such models implement discrete or continuous time-advancing mechanisms, but we were not able to recognize the specific approach used to build them. Examples of discrete-time and continuous-time simulation can be found, respectively, in [Grillinger et al, 2004], where a simulation approach is applied to design and test embedded systems, and in [Tunru et al, 2006], where a continuous model for open source development using Test-Driven Development is presented.

The remaining approaches, each appearing in one (at most two) simulation models, seem to be investigations of their suitability for simulating software engineering systems or processes, since the authors do not submit their models to systematic validation procedures. Most of the papers found present a motivation for simulation studies and model development, a brief literature review, the theoretical foundation, the proposed simulation model and an example of how the model works. Typical examples can be found in [Choi et al, 2006] and [Al-Emran and Pfahl, 2007].

For the System Dynamics approach, figure 1 shows a greater number of papers (gray bar) than models (black bar). This occurs because the same model is used as an instrument across different simulation-based studies, or even through several replications of a specific study. Houston et al (2001) use four models in their study, and most of them are captured by this quasi-Systematic Review, i.e., a model used in [Houston et al, 2001] also appears in another study, in a different paper. A replication of the original study in [Pfahl et al, 2001] can be found in [Pfahl et al, 2003], which means that the same model is present in both papers.

For the Monte Carlo simulation approach, we considered only simulation models described strictly as a function of pseudo-random variables. Models based on other approaches (such as discrete-event) that only apply this technique to input parameters for stochastic simulation were not counted here. Thelin et al (2004) present a Monte Carlo model of this kind to improve the performance of software inspections by sampling documents according to systematic criteria, in order to reduce the set of artifacts to inspect and thus the time spent on inspections.
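To make the notion of a model "described strictly as a function of pseudo-random variables" more concrete, the following Python sketch estimates an expected value by repeated random sampling. It is a generic illustration and does not reproduce the model of Thelin et al (2004): the inspection-time model, its parameters (5 documents, 1 to 3 hours each) and all function names are hypothetical.

```python
import random


def monte_carlo_mean(sample_once, runs, seed=0):
    """Average `runs` independent samples produced by `sample_once(rng)`."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    return sum(sample_once(rng) for _ in range(runs)) / runs


def inspection_hours(rng, documents=5, low=1.0, high=3.0):
    """Hypothetical model: total hours to inspect a sample of documents,
    each taking a uniformly distributed time between `low` and `high`."""
    return sum(rng.uniform(low, high) for _ in range(documents))


estimate = monte_carlo_mean(inspection_hours, runs=10_000)
print(round(estimate, 1))
```

With these assumed parameters the analytical expectation is 5 x 2 = 10 hours, so the estimate should lie close to that value; the whole model output is determined by the pseudo-random draws, which is the criterion used above to classify a model as Monte Carlo.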

3.2 Software Engineering Domains in Simulation Studies


Although the domains involved in Software Engineering simulation studies can be easily identified in model descriptions, the same is not true of the purpose for which each model was built. Authors frequently use terms like requirements engineering simulation model, but these simulators seldom encompass the whole domain and its peculiarities. It is important to establish the boundaries and scope of simulation models in order to evaluate their adoption as an instrument when conducting a simulation study or identifying a research opportunity.

Regarding the domains to which simulation studies have been applied, Software Process and Software Project are the most present in the technical literature [Stopford and Counsell, 2008]. In this quasi-Systematic Review, we could confirm this statement based on the papers found. Of course, it is almost always possible to say that simulation models rely on software process or product. In our classification, however, we use Software Process to characterize simulation models concerned with analyzing the software development process as a whole, from the structure and performance points of view. Analysis of process bottlenecks, activity dependencies, and cross-project issues is of interest to Software Process Simulation. For instance, the following quote from [Chen & Liu, 2006] exemplifies what we characterize as a Software Process Simulation concern:


So carry on software process simulation dynamically can predict defect and bottleneck in advance, help to eliminate the defect and optimize the software development process, and offer theory support for making decision.

On the other hand, Software Project Simulation is related to (human and material) resource management, allocation policies, and scheduling and cost issues, among others. Examples can be found in the series of studies conducted by Abdel-Hamid and Madnick: [Abdel-Hamid and Madnick, 1986] [Abdel-Hamid, 1988a] [Abdel-Hamid, 1988b] [Abdel-Hamid, 1989] [Abdel-Hamid, 1990] and [Abdel-Hamid, 1993]. Figure 2 clearly shows that most papers and models are related to the Software Process and Software Project domains, even when the two are analyzed separately.

The Software Architecture and Design domain groups simulation models whose purpose encompasses design issues for different classes of systems, for instance fault-tolerant systems, embedded systems and real-time systems, from the perspective of quality attributes such as reliability and performance. This group is also characterized by simulating the product (design specification) instead of the design process. Alvarez and Cristian (1997) presented a simulation tool (and model) for the design and performance evaluation of fault-tolerant systems, and Kang et al (1998) presented the ASADAL/SIM tool, a simulation and analysis tool for real-time software specifications.

Almost all other domains are related to a process-based perspective, for instance the Software Inspections (Madachy and Khoshnevis, 1997), Quality Assurance (Drappa and Ludewig, 1999) and Requirements Engineering (Ferreira et al, 2009) processes. In these cases, simulation models are used to understand the impact of the variables of these specific sub-processes on the performance of the whole software development process.


Figure 2. Software Engineering Domains Distribution

An interesting perspective on this distribution (figure 2) is presented in figure 3, which shows the coverage (footnote 4) of each simulation approach (figure 1) over the different Software Engineering domains.

Footnote 4: By coverage we mean the percentage of elements (SE domains in this case) in which a given simulation approach is observed to appear.

Figure 3. Simulation Approaches Coverage over Software Engineering Domains

The bar chart presented in figure 3 should be read considering each percentage as the relative occurrence of a specific approach in the space of software engineering domains. For example, the Discrete-Event approach appears in 47.1% of the Software Engineering domains found in this review, considering the number of papers and not only models. Among the different simulation approaches, Discrete-Event and System Dynamics present the largest coverage over these domains. Besides, Hybrid Simulation, which is mostly a combination of Discrete-Event and System Dynamics, covers 23.5% of the domains. So, in our sample, these approaches can be considered the main alternatives for building a simulation model in the Software Engineering context.

A detailed view of this coverage can be seen in table 1, which maps each SE domain (rows) to the simulation approaches (columns), according to the findings in the analyzed papers. The numbers inside the cells indicate in how many papers such a mapping was found. For instance, the Hybrid Simulation (Continuous + Discrete) approach was used to develop models in the following domains: Agile Methods (1 paper), Global Software Development (3 papers), Software Process (6 papers) and Software Project Management (2 papers). These four domains represent 23.53% of the domains observed in this quasi-systematic review; in other words, the Hybrid Simulation (Continuous + Discrete) approach has a coverage of 23.53%.
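The coverage figure in the example above can be reproduced with a few lines of Python. This is a sketch under one assumption: the total of 17 domains is inferred from the reported percentage (4 domains corresponding to 23.53%) and from the number of domain rows in the mapping table; the function name `coverage` is hypothetical.

```python
def coverage(papers_per_domain, total_domains):
    """Percentage of domains in which an approach appears at least once."""
    if total_domains <= 0:
        raise ValueError("total_domains must be positive")
    covered = sum(1 for papers in papers_per_domain.values() if papers > 0)
    return round(100.0 * covered / total_domains, 2)


TOTAL_DOMAINS = 17  # assumption: inferred from 4 domains = 23.53%

# Paper counts for Hybrid Simulation (Continuous + Discrete), as given in the text
hybrid_cont_discrete = {
    "Agile Methods": 1,
    "Global Software Development": 3,
    "Software Process": 6,
    "Software Project Management": 2,
}

print(coverage(hybrid_cont_discrete, TOTAL_DOMAINS))
```

Note that coverage counts domains, not papers: the 12 papers behind the Hybrid Simulation example contribute only four covered domains.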


Table 1. Mapping between SE Domains and Simulation Approaches


SE Domain (rows) | Coverage (%)
Agile Methods | 17.65
CBSE | 5.88
Global Sw Development | 17.65
OSS Dev | 23.53
QA | 5.88
Req Eng | 11.76
Software Acquisition | 5.88
Software Architecture/Design | 29.41
Estimation | 5.88
Sw Evolution | 17.65
Inspections | 11.76
Software Process | 47.06
Product Line | 5.88
Sw Project Management | 52.94
Release Planning | 11.76
Testing | 11.76
Technology Substitution / New Technology Adoption / Innovation | 5.88

Simulation Approach (columns) | Abbreviation | Coverage (%)
Agent-Based Simulation | ABS | 11.76
Conditional Growth | CG | 5.88
Discrete-Event | DEVS | 47.06
General Continuous Simulation | GCS | 17.65
General Discrete Simulation | GDS | 5.88
Hybrid Simulation (Continuous + Discrete) | HS (Cont + Discrete) | 23.53
Hybrid Simulation (Petri Net + Discrete-Event) | HS (PN + DEVS) | 5.88
Knowledge-Based Simulation | KBS | 5.88
Monte Carlo Simulation | MCS | 5.88
Not Specified | | 23.53
Object-Oriented Simulation | OOS | 5.88
Proxel-based | | 5.88
Qualitative Simulation | QS | 11.76
quasi-Continuous Simulation | qCS | 11.76
Semi-Quantitative Simulation | SQS | 5.88
State-Based Simulation | SBS | 5.88
Stochastic Process Algebra | SPA | 5.88
System Dynamics | SD | 76.47
Temporal Parallel Automata | TPA | 5.88

Paper counts per cell, in extraction order: 1 2 4 1 3 2 1 2 1 1 1 1 23 3 1 1 6 1 1 1 1 1 1; 1; 2 1 1 1 1 1 3 1 1 1 6 1 5 1 2 2 1 1 4 3 6 1 1; 1


3.3 Simulation Tools for Software Engineering


Simulation models are abstractions of systems or processes specified in a simulation language, representing the concepts involved in the underlying simulation approach. So, when using these models as an instrument for a simulation-based study, simulation tools are needed to make the simulation trials feasible. Most papers found in this quasi-Systematic Review report experience with general-purpose simulation tools, such as Vensim (footnote 5), Arena (footnote 6) and iThink (footnote 7). On the other hand, specific simulation tools appear as the authors' own implementation of their models; this is the case of SESAM (Software Engineering Simulation by Animated Models) [Drappa & Ludewig, 2000], where a tool with an interactive graphical user interface is used to simulate a software project organization for training purposes. Specific-purpose tools are often used in only a few studies, possibly because the tool serves only the objectives of a specific model and its respective studies. Figure 4 presents the distribution of simulation tools across the papers and models captured in this quasi-systematic review.

Figure 4. Simulation Tools Distribution

The most used simulation tools, as can be seen in figure 4, are Vensim, Extend (footnote 8) and iThink. The main reasons for this adoption may be their general purpose and the simulation approaches they support, namely System Dynamics and Discrete-event simulation. The tools and their references are presented in table 2, together with the simulation approaches they support.
Footnote 5: http://www.vensim.com
Footnote 6: http://www.arenasimulation.com
Footnote 7: http://www.iseesystems.com
Footnote 8: http://www.extendsim.com



Table 2. Simulation Tools References

Simulation Tool | Reference | Simulation Approach
AnyLogic | www.xjtek.com/anylogic/why_anylogic | Discrete-Event
ARENA | www.arenasimulation.com | Discrete-Event
ASADAL/SIM | selab.postech.ac.kr/xe/?mid=selab_link | Discrete-Event
C-Sim | www.atl.lmco.com/projects/csim | Discrete-time Simulation
CESIUM | No website found. | Object-oriented Simulation
DEVSim++ | sim.kaist.ac.kr/M5_1.htm | Hybrid Simulation (SD + DEVS); Discrete-Event; Not Specified
Extend | www.extendsim.com | Hybrid Simulation (SD + DEVS); Discrete-Event; System Dynamics
GENSIM | No website found. | System Dynamics
iThink | www.iseesystems.com/ | System Dynamics
Matlab | www.mathworks.com/products/matlab/ | Discrete-Event
NetLogo | ccl.northwestern.edu/netlogo | Agent-based Simulation
PEPA | www.dcs.ed.ac.uk/pepa/tools | Stochastic Process Algebra
PowerSim | www.powersim.com | System Dynamics
Professional Dynamo Plus (PD+) | No website found. | System Dynamics
Prolog | www.swi-prolog.org | Discrete-Event
QSIM | ii.fmph.uniba.sk/~takac/QMS/qsimHowTo.html | Qualitative Simulation; Semi-quantitative Simulation
ReliaSim | No website found. | Not Specified
RiskSim | www.treeplan.com/risksim.htm | Hybrid Simulation (PN + DEVS)
SEPS | No website found. | System Dynamics
SES | www.stackpoleengineering.com/software.aspx | Hybrid Simulation (PN + DEVS)
SESAM | www.iste.unistuttgart.de/en/se/forschung/schwerpunkte/sesam.html | quasi-Continuous Simulation
SIMNET | No website found. | Discrete-Event
SimSE | www.ics.uci.edu/~emilyo/SimSE | Not Specified
SLAMSYSTEM | research.microsoft.com/en-us/projects/slam | Discrete-Event
Statemate Magnum | No website found. | State-based Simulation
SystemC | www.systemc.org | Not Specified
Vensim | www.vensim.com | System Dynamics


3.4 Characteristics of Simulation Models


This quasi-systematic review aims at characterizing simulation-based studies from many facets. So, among other data, we looked for characteristics of simulation models. We also tried to identify any kind of taxonomy or classification for characterizing models. After extracting information from the selected papers, we found a set of characteristics explicitly mentioned by the authors along with the model descriptions (figure 5). In several papers, we perceived that the authors do not characterize their models, but just present the model or a representative part of it. There is no concern with describing models according to their essential characteristics; instead, authors tend only to mention the underlying simulation approach (and sometimes to describe it). For example, Pfahl and Lebsanft (2000) mention the simulation approach in the quote "The simulation model was implemented in a modular way using the SD (System Dynamics) tool Vensim." Another example is "We present a discrete-time simulator tailored to software projects which ..." [Padberg, 2003]. The remaining characterization is just a model specification or a brief description in terms of the model variables, rather than of how the model works.

Figure 5. Simulation Model Characteristics

The characteristics presented in figure 5 are mostly related to the simulation approaches, rather than to the simulation models. Thus, we conclude that authors assume the characteristics of a simulation model are known through its underlying approach. When authors use a new or hybrid approach, a brief description of how it works is given. For these reasons, the distribution of characteristics is biased toward System Dynamics characteristics (such as dynamic, causal relationships, feedback loops, and others), since this approach has more occurrences than the others. Table 2 presents the description of each characteristic, taken from these papers.


Table 2. Description of Simulation Models Characteristics

Characteristic | Description
Analytic | Expresses high-level quantitative relationships between input and output parameters through mathematical equations.
Asynchronous | Lack of centralized coordination of the simulation model.
Bi-directional Simulation (Forward and Reverse) | Uses both types of models together. Forward simulation: models of the system of interest as it evolves forward through time. Reverse simulation: models of the system as it moves in reverse or backward in time.
Causal Relationships | Establishes cause-effect relationships.
Continuous-time | Time advances in constant and small steps, as a continuously differentiable function.
Deterministic | Given a specific input, the output will be the same for every simulation run.
Discrete-time | Timeless steps that are interleaved with user-defined simulated durations. Timeless execution does not mean that the code in one step takes zero time to execute; it means that the model time used in discrete simulations is frozen during the step. Basically, the state space is built by observing all possible options of what can happen at the next time step.
Dynamic | Model parameters are less reliant on highly precise numerical data, and they are likely to exceed historic ranges in any case. Cause-effect relationships constantly interact while the model is being executed.
Envelop functions | Given interval bounds on the values of some landmarks and envelopes on the monotonic functions, its QDE defines a constraint-satisfaction problem (CSP).
Extensible | Provides explicit extension points through input and output ports. If any model has a compatible input or output port, the model can be extended.
Feedback loops | A continuous flow of information within a system, with a property of self-correction.
Formal | The structural relations between variables must be explicitly and precisely defined.
Fuzzy variables | Use of fuzzy logic in defining model (input and output) variables.
Hierarchical | Structured in different levels and blocks.
Interactive | Users are responsible for generating control signals or data out of external entities. Users can also change system states interactively to situate special conditions and to debug and locate the cause of unexpected behavior.
Knowledge Representation | Use of knowledge modeling techniques (for instance, Cognitive Maps).
Nonlinear interactions | Interactions between variables that do not follow a linear function.
PI-Calculus | Uses a rigorous semantics described in the pi-calculus formalism.
Process-based | A discrete model described as a workflow, different from event-driven.
Qualitative Abstraction | Qualitative Abstraction (QA) of the empirical data transforms a sequence of measurements into a pattern.
Qualitative Differential Equations | Parameters of the differential equations do not need to be specified as real numbers. It is sufficient to know whether they are positive or negative, specifying them simply as monotonically increasing or decreasing. Expresses natural types of incomplete knowledge from the real world.
Queue models | Each stage is modeled as a number of servers, where every server has its own (resource) queue.
Rule-based | The rule part of the model determines how the state is changed with every time step. Each rule consists of a condition and an action part.
Scenario-based | Enables the user to test the effects of several combinations of events on the process. It is a model that allows a user to define several different scenarios for the system or process.
State based on events | The state of a system is changed only when certain events occur and is not changed between these events.
State-based | Processes an input event based on its state and condition, generates an output event and changes its state (state transition).
Stochastic | Instead of assigning deterministic values to model parameters and variables, these values can be sampled from plausible input distributions.
Synchronous | Opposite of asynchronous.
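Two of the characteristics described above, Continuous-time (time advances in constant, small steps) and State based on events (state changes only when an event occurs), correspond to different time-advance mechanisms. The Python sketch below contrasts them on a toy defect-tracking example. It is an illustration only: the defect model, the parameter values and the function names are hypothetical and are not taken from any of the reviewed papers.

```python
import heapq


def simulate_continuous(initial_defects, fix_rate, dt, end_time):
    """Fixed-step (Euler) integration of d(defects)/dt = -fix_rate * defects.

    The level of open defects feeds back into its own outflow, a minimal
    feedback loop in the System Dynamics sense.
    """
    defects = initial_defects
    for _ in range(int(round(end_time / dt))):
        defects += dt * (-fix_rate * defects)
    return defects


def simulate_events(events):
    """Discrete-event loop: the state changes only when an event is processed.

    events: iterable of (time, delta) pairs, where delta is added to the
    number of open defects. Returns the (time, open_defects) trace.
    """
    queue = list(events)
    heapq.heapify(queue)  # process events in time order
    open_defects = 0
    trace = []
    while queue:
        time, delta = heapq.heappop(queue)
        open_defects += delta
        trace.append((time, open_defects))
    return trace


print(simulate_continuous(100.0, fix_rate=0.1, dt=0.01, end_time=10.0))
print(simulate_events([(5.0, -1), (1.0, 2), (3.0, 1)]))
```

Both functions are deterministic in the sense of the table above: the same input always yields the same output. Sampling `fix_rate` or the event times from a distribution would turn them into stochastic models.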


The approaches coverage for these characteristics is presented in figure 6.

Figure 6. Approaches coverage over characteristics

According to figure 6, the Discrete-Event and System Dynamics approaches are the ones that cover, or at least were described as covering, most of the characteristics found in the set of papers selected for this quasi-systematic review. In addition, the Hybrid Simulation approach covers 24.14%, since it is described in terms of the characteristics of the two previous approaches and how they are combined. This may be explained, in part, by the fact that the majority of papers describe models based on these two approaches, so the probability of finding a better characterization of them is substantially larger than for the others. Other approaches, in general, cover only their specific characteristics, since that is what is highlighted in the papers containing models based on them.

Table 3 illustrates this coverage by mapping each characteristic (rows) to the simulation approaches (columns), according to the findings in the analyzed papers. The numbers indicate how many times such a mapping was found. For instance, the characteristic Nonlinear Interactions is mentioned in five research papers describing or using a model based on the System Dynamics approach.


Table 3. Characteristics X Simulation Approach Mapping

Columns (simulation approaches): ABS, CG, DEVS, GCS, GDS, HS (Cont + Discrete), HS (PN + DEVS), KBS, MCS, Not Specified, OOS, Proxel-based, QS, qCS, SQS, SBS, SPA, SD, TPA

Characteristic (rows) | Coverage (%)
Analytic | 5.26
Asynchronous | 5.26
Bi-directional Simulation | 5.26
Causal Relationships | 10.53
Continuous-time | 26.32
Deterministic | 5.26
Discrete-time | 26.32
Dynamic | 31.58
Envelop functions | 5.26
Extensible | 15.79
Feedback loops | 15.79
Formal | 10.53
Fuzzy variables | 5.26
Hierarchical | 10.53
Interactive | 5.26
Knowledge Representation | 10.53
Nonlinear interactions | 5.26
PI-Calculus | 5.26
Process-based | 10.53
Qualitative Abstraction | 5.26
Qualitative Differential Equations | 10.53
Queue models | 5.26
Rule-Based | 5.26
Scenario-based | 5.26
State based on events | 15.79
State-based | 10.53
Stochastic | 52.63
Synchronous | 5.26

Approach coverage values (%), in extraction order: 7.14 7.14 42.86 10.71 7.14 25.00 3.57 7.14 3.57 14.29 0.00 7.14 3.57 10.71 14.29 7.14 7.14 7.14 39.29

Cell counts, in extraction order: 1; 1; 1 1 2 5 1 2 1 1 4 2 4 4 4 1 2 1 1 1 5 1 2 1 2 2 3 2 4 3 2 1; 11 1; 15 1 1 1 1 1 1 1 1 2 13 1 1 1 22 4 7; 1; 1 1 1 1 3; 2 1 1 1 2; 1


ABS: Agent-Based Simulation
CG: Conditional Growth
DEVS: Discrete-Event
GCS: General Continuous Simulation
GDS: General Discrete Simulation
HS (Cont + Discrete): Hybrid Simulation (Continuous + Discrete)
HS (PN + DEVS): Hybrid Simulation (Petri Net + Discrete-Event)
KBS: Knowledge-Based Simulation
MCS: Monte Carlo Simulation
OOS: Object-Oriented Simulation
QS: Qualitative Simulation
qCS: quasi-Continuous Simulation
SQS: Semi-Quantitative Simulation
SBS: State-Based Simulation
SPA: Stochastic Process Algebra
SD: System Dynamics
TPA: Temporal Parallel Automata


None of the selected papers gives a classification or taxonomy for simulation models. However, Raffo (2005) states that Software Process Simulation Models rely on three main paradigms: discrete-event, state-based and system dynamics. On that occasion, some characteristics are only mentioned. In a Systematic Review about Software Process Simulation, Zhang et al (2008) also present the simulation paradigm for each paper/model selected by their study, but no characteristics are discussed.

3.5 Verification and Validation (V&V) Procedures for Simulation Models


Any simulation model based on the observation of a real-world system or process needs to be validated in order to ensure a minimum degree of confidence in its output results and compliance with the structure and behavior of the observed system or process. In an attempt to reach such validity, the reported simulation studies describe, or mention, some of the procedures in figure 7.

Figure 7. Verification and Validation Procedures Distribution

Table 5 presents a brief description of each V&V procedure found in this review.


Table 5.

Procedure: Comparison against actual (dataset) results
Description: This procedure consists of comparing the simulation output results against the actual output of the same phenomenon. It is likely to be used for measuring model accuracy.

Procedure: Comparison against data from literature
Description: This procedure consists of comparing the simulation output results against output (performance) data from other studies presented in the technical literature. The studies should have the same goals. It is likely to be used when no complete data is at hand.

Procedure: Comparison against reference behaviors from the technical literature
Description: This procedure consists of comparing the simulation output results against trends or expected results often reported in the technical literature. It is likely to be used when no comparable data is available.

Procedure: Comparison against other models results
Description: This procedure consists of comparing the simulation output results of one simulation model against those of another model. Controlled experiments can be used to arrange such a comparison.

Procedure: Review with experts
Description: This procedure consists of getting feedback from system or process experts in order to evaluate whether the simulation results seem reasonable. This review may be performed using any method, including inspections. It is likely to be used for model validation purposes.

Procedure: Interview with experts
Description: This procedure consists of getting feedback from system or process experts through interviews in order to evaluate whether the simulation results seem reasonable. It is likely to be used for model validation purposes.

Procedure: Survey with experts
Description: This procedure consists of getting feedback from system or process experts through surveys in order to evaluate whether the simulation results seem reasonable. It is likely to be used for model validation purposes.

Procedure: Testing structure and model behavior
Description: This procedure consists of submitting the simulation model to several test cases and evaluating the responses and traces. It is likely to be used for model verification purposes.

Procedure: Based on empirical evidence from the technical literature
Description: This procedure consists of collecting evidence from the technical literature (reports of experimental studies) to develop the simulation model.

It seems that the first alternative chosen has been the comparison of simulation results against actual data, i.e., this is the most common procedure found in the technical literature. It is a good way to verify what we call model accuracy, in terms of its output results, but many other threats to the study validity should be evaluated. In order to make such comparisons, as in the early 90s, when Abdel-Hamid and Madnick applied their model in several different environment configurations (as mentioned in section 3.2), it is important to check whether the actual collected data capture the whole context (influence variables and constants) that the model requires or assumes to be real. In other words, model parameters and variables should share the same measurement context. Otherwise, it will be a naive comparison between two contexts that are too distinct to be compared.


The same problem occurs when simulation output results are compared against data from the technical literature; the latter is seldom available in enough detail to make valid comparisons. Thus, it is very difficult to assure the same configuration (input and calibration parameters) for both the simulation results and the data collected from the technical literature. Ambrosio et al (2011) use two procedures for model validation: one is a comparison against actual data from a software company, and the other is a comparison against summarized data from different sources extracted from the technical literature.

Lack of data is a known problem in Software Engineering simulation [Raffo et al, 1999], including support to experimentation [Garcia et al, 2005]. So, different efforts are needed in order to increase the validity of simulation studies. An interesting approach is the comparison of simulation results against known reference behaviors (footnote 9) in some research areas. In this case, it is possible to analyze whether the simulation model is capable of reproducing consistent results, even if it is not possible to measure accuracy using this procedure. Setamanit et al (2007) compare the output results of a global software development (GSD) simulation model against behaviors of GSD projects as described in the technical literature.

Once we have validated models, or at least know them in terms of performance and accuracy, it is possible to compare them. It would be good to have benchmark results and available datasets to perform controlled experiments aimed at comparing models, establishing distinct treatment and control groups in order to test hypotheses about the influence relationships between independent and dependent variables.

Considering that the data and reference behaviors needed for the previous V&V procedures may be unavailable, it is still useful to conduct reviews, interviews and surveys with experts in simulation and in the system under study. These kinds of procedure help to better understand the simulation model structure and to get insights to improve it. They are more like validation procedures, since model validation gets the customer involved. Choi et al (2006) mention a feedback review with experts for the verification of a simulation model based on UML for mission-critical real-time embedded system development. Setamanit and Raffo (2008) calibrated their model based on information from survey questionnaires as well as in-depth interviews with the technical director, the project manager, and software developers.

Testing model structure and behavior consists of applying several test cases to the simulation model. No paper gives details about how to plan and perform these tests.

One of the most important V&V procedures should be considered before the model is conceived, i.e., it does not work as a test, but it brings confidence to the simulation model. The model building methodology should be based on empirical evidence from the technical literature, preferably on controlled experiments, which can draw conclusions about hypotheses involving the simulation model variables. In any case, using the results of well-conducted studies (such as controlled experiments, surveys and case studies) in the conception of simulation models is better than observing the system or process to be simulated in an ad-hoc manner. Melis et al (2006) presented the results of a series of experiments and case studies about pair programming and test-driven development. These results were used to determine both the variable relationships and the equations of a system dynamics model.

Figure 8 shows how simulation approaches cover the procedures mentioned before.

9. Examples of reference behaviors in Software Engineering are Brooks' Law for software projects, Lehman's laws of software evolution, and any repeatable process/product behavior in a software organization.

Figure 8. Simulation Approaches Coverage over V&V Procedures

All procedures found in the papers reporting simulation studies seem to be applicable to System Dynamics models. This possibly means that this approach has reached a high degree of maturity, covering several ways to verify and validate a model built under it. Or, maybe, it is only a matter of the larger sample of studies using System Dynamics. On the other hand, there are approaches with no attempt at verification or validation. The evaluation settings, in terms of the studies performed in the analyzed papers, are discussed later. From another perspective, figure 9 presents the coverage of V&V procedures over simulation approaches.


Figure 9. V&V Procedures Coverage over Simulation Approaches

According to figure 9, the procedure Comparison against actual results covers the largest set of simulation approaches, with 31.58%. It is a common procedure, but in many cases it seems to be used inappropriately, since the data come from two distinct contexts and a comparison cannot be established. Testing structure and model behavior and Based on empirical evidence from the technical literature are the two other procedures with the greatest coverage, each reaching 26.32% of the simulation approaches. The former is presented as an evaluation of the simulator (tool), not of the model itself. On the other hand, evidence from the literature is directly related to the simulation model, i.e., to the relationships among variables as well as to the equations derived from collected data.


V&V procedures (rows): Comparison against actual (dataset) results; Comparison against data from technical literature; Comparison against reference behaviors from the technical literature; Comparison against other models results; Review with experts; Interview with experts; Survey with experts; Testing model structure and behavior; Based on empirical evidence from the technical literature; COVERAGE (%).

Simulation approaches (columns): ABS, CG, DEVS, GCS, GDS, HS (Cont + Discrete), HS (PN + DEVS), KBS, MCS, Not Specified, OOS, Proxel-based, QS, qCS, SQS, SBS, SPA, SD, TPA, COV.(%).

Legend:
- ABS: Agent-Based Simulation
- CG: Conditional Growth
- DEVS: Discrete-Event
- GCS: General Continuous Simulation
- GDS: General Discrete Simulation
- HS (Cont + Discrete): Hybrid Simulation (Continuous + Discrete)
- HS (PN + DEVS): Hybrid Simulation (Petri Net + Discrete-Event)
- KBS: Knowledge-Based Simulation
- MCS: Monte Carlo Simulation
- OOS: Object-Oriented Simulation
- QS: Qualitative Simulation
- qCS: quasi-Continuous Simulation
- SQS: Semi-Quantitative Simulation
- SBS: State-Based Simulation
- SPA: Stochastic Process Algebra
- SD: System Dynamics
- TPA: Temporal Parallel Automata
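The comparison against actual results discussed in this section can be sketched as follows. This is a minimal illustration, not a procedure taken from the reviewed papers: the mean relative error is one possible accuracy metric chosen here as an assumption, and the context check is a simplified stand-in for verifying that model and data share the same measurement context. All names and values are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Sequence


def relative_errors(simulated: Sequence[float],
                    actual: Sequence[float]) -> List[float]:
    """Relative error of each simulated value with respect to the actual one."""
    if len(simulated) != len(actual):
        raise ValueError("simulated and actual must have the same length")
    # Assumes no actual value is zero; otherwise the ratio is undefined.
    return [abs(s - a) / abs(a) for s, a in zip(simulated, actual)]


def mean_relative_error(simulated: Sequence[float],
                        actual: Sequence[float]) -> float:
    """Average relative error, used here as a simple accuracy indicator."""
    return mean(relative_errors(simulated, actual))


def mismatched_context(model_context: Dict[str, object],
                       data_context: Dict[str, object]) -> List[str]:
    """Context variables assumed by the model that the data set does not match."""
    return sorted(k for k, v in model_context.items() if data_context.get(k) != v)


# Hypothetical effort values (person-months) for three projects.
ACTUAL = [100.0, 200.0, 400.0]
SIMULATED = [110.0, 180.0, 400.0]

# Hypothetical context descriptions of the model and of the collected data.
MODEL_CONTEXT = {"life_cycle": "waterfall", "team_size": 10}
DATA_CONTEXT = {"life_cycle": "waterfall", "team_size": 25}

if __name__ == "__main__":
    issues = mismatched_context(MODEL_CONTEXT, DATA_CONTEXT)
    if issues:
        print("Contexts differ on:", issues)
    print(f"Mean relative error: {mean_relative_error(SIMULATED, ACTUAL):.3f}")
```

Reporting the context mismatch next to the error value makes it visible when an apparently accurate result actually comes from two contexts that should not be compared.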


3.6 Simulation Output Analysis


After simulation runs are executed, a large amount of data is generated and needs to be properly analyzed. The analysis of simulation output results is a challenging research area, and more about it can be found in [Alexopoulos, 2007] and [Law, 2007]. In this section, we are interested in presenting the statistical instruments (namely charts, tests and metrics) often used to analyze simulation output results. Output analyses are mostly based on charts; statistical procedures such as hypothesis tests are rarely used. This may be due to the lack of a rigorous and systematic approach for Simulation-Based Studies. Figure 10 shows the instruments found for simulation output analysis.

Figure 10. Simulation Output Results Analysis Instruments

By far the most frequent analysis instrument in figure 10 is the Sequence Run Chart. This chart can be used for both discrete and continuous data and represents time-sensitive data well, showing time increasing along its x-axis. These characteristics may be some of the reasons for its high adoption. Figure 11 shows almost the same chart as figure 10, but excludes the Sequence Run Chart to allow a better analysis of the other occurrences.


Figure 11. Simulation Output Results Analysis Instruments without Sequence Run Chart

A curious case happens with the Sensitivity Analysis technique: the number of models is greater than the number of papers. This happens because some papers promote the use of this technique to analyze and understand simulation models, identifying the most relevant model factors and input parameters. In these cases, authors apply sensitivity analysis to more than one simulation model, making some comparisons and presenting different situations that can be found, for example, when models make use of too many input parameters, but just a few really contribute to the results. Examples of studies involving sensitivity analysis are [Houston et al, 2001], using four models, and [Wakeland et al, 2004], using just one model.

Most of these instruments (13 out of 22) are statistical charts, which are also the instruments most often found in papers. Of course, they are a relevant way of presenting data, but the significance of the results should never be assessed only by looking at charts. Other statistical instruments are needed, namely significance tests and systematic analysis procedures (such as sensitivity analysis). These other instruments are still underused and misused. Even descriptive statistics have not received the needed attention, even though they are good sample summaries. Figure 12 presents the coverage of simulation approaches over statistical analysis instruments across the selected papers.


Figure 12. Simulation Approaches Coverage over Statistical Analysis Instruments

System Dynamics covers, as in other perspectives, the major part of the occurrences with 66.67%, followed by the Hybrid Simulation (which uses System Dynamics too) and Discrete-Event approaches with 42.86% and 38.10%, respectively. Again, it seems that the number of papers and models reporting these approaches biased the results. However, we can still perceive a lower use of analysis instruments and methods per paper or model for approaches with a small number of reported papers. A coverage of 4.76% or 9.52% can be interpreted as meaning that these papers present only one or two analysis instruments. Thus, just seven simulation approaches have applied more than two analysis instruments and, considering the complexity of simulation output results, this is likely not enough.
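To illustrate what an analysis that goes beyond charts might look like, the sketch below summarizes the outputs of replicated simulation runs with descriptive statistics and compares two configurations with an exact permutation test on the difference of means. The choice of a permutation test is an assumption made for this example (it needs only the standard library); it is not an instrument reported by the reviewed papers. The sample values and names are hypothetical.

```python
from itertools import combinations
from statistics import mean, stdev
from typing import Dict, Sequence


def summarize(sample: Sequence[float]) -> Dict[str, float]:
    """Descriptive statistics for the outputs of replicated simulation runs."""
    return {"n": len(sample), "mean": mean(sample), "stdev": stdev(sample)}


def exact_permutation_test(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sided exact permutation test on the difference of means.

    Enumerates every way of splitting the pooled outputs into groups of the
    original sizes and returns the share of splits whose absolute difference
    of means is at least as large as the observed one (the p-value).
    Only practical for small samples, since the number of splits grows fast.
    """
    pooled = list(a) + list(b)
    observed = abs(mean(a) - mean(b))
    total = 0
    extreme = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical project durations (months) from six runs per configuration.
BASELINE = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2]
ALTERNATIVE = [12.0, 11.8, 12.3, 12.1, 11.9, 12.2]

if __name__ == "__main__":
    print("baseline:", summarize(BASELINE))
    print("alternative:", summarize(ALTERNATIVE))
    print("p-value:", exact_permutation_test(BASELINE, ALTERNATIVE))
```

A test of this kind complements a run chart: the chart shows the shape of the output over time, while the test indicates whether an observed difference between configurations could plausibly be due to run-to-run variation alone.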

3.7 Study Strategies involving Simulation


A common motivation to use and develop simulation models is to provide a basis for experimentation. Simulation allows a researcher to estimate the behavior of an existing system under some conditions and to maintain much better control over experimental conditions [Wu and Yan, 2009]. However, simulation models have not been used as experimental instruments in many cases. Among the 108 papers selected in this quasi-systematic review, just 57 present primary studies. Besides, there is a misunderstanding in classifying them into study strategies such as case studies, experiments and others. Most of them are only examples of use (assertions or informal feasibility studies [Zelkowitz, 2007]), with no systematic methodology for planning, executing and analyzing the study.

As we could not find any taxonomy or classification schema specific to simulation-based studies, we tried to analyze the studies found from the perspective of the known study strategies of the experimental software engineering research area. First, we present the terminology used and, then, we apply this terminology to our findings.

To understand and classify primary studies, it is important to be aware of the level of control in experimental studies, which is also important in the context of simulation. Travassos and Barros (2003) present a four-stage classification of empirical studies in Software Engineering concerning their environment and participants:

- In vivo: studies involving subjects in their own environments;
- In vitro: studies executed in a controlled environment, such as a laboratory or a controlled population;
- In virtuo: studies involving interaction among subjects and a computerized model of reality;
- In silico: studies characterized by both subjects and the real world being described as computer models.

Figure 13. Software Engineering Primary Studies Classification

As presented in Figure 13, the setting of environment and participants of each study strategy impacts the level of control, risk and cost. However, for our scope, simulation-based studies, we are interested just in the in virtuo and in silico studies, where the threats to control, risk, and cost are claimed to be low, and the need for SE knowledge to be high.

Some confusion can be observed about what control means in simulation studies. The explicit variation of input parameters (often used in techniques like Sensitivity Analysis) does not mean that a model user has control over the object under investigation. In doing so, he/she is just establishing a parameter setting or configuration, which is different from controlling factors (independent variables) in a controlled experiment arrangement. There is a slight difference between understanding a model's behavior and arranging a valid experimental design. When the variation of the input parameters is performed in an ad-hoc manner, just intending to understand the model behavior through the corresponding impact on the model output variables, there is no control over the simulation model. In other words, the output values are not yet meaningful in a real context, since one cannot assume that confident or valid output values are being generated without previously observing the actual behavior, for instance, in in vivo or in virtuo contexts, as pointed out in Figure 13. Unless the variation of model input parameters changes the model behavior, what has been done is just to determine the value of a dependent variable on the model behavior curve/function/equation for a given value of the independent variable.


The same concept applies to a study comparing two models: one that has been built and evaluated against actual data collected from the system it was abstracted from, and a new model based on the first one but adding new modules/features without actual data support. If both models are run and compared based on the same data set, this use of a shared data set can be a threat to the construct and conclusion validity of the study. The explanation relies on the fact that the output values of the modified model do not come from the same measurement context as those of the former model. So, they are not comparable.

Applying this concept of control to the different study strategies, we took as a starting point the glossary (footnote 10) and the ontology (footnote 11), both proposed by Lopes (2008), in order to classify the simulation studies observed in the SLR. Below, we present the research strategies taken from the ontology and their respective definitions from the glossary:

- Action Research: Action research is a form of action inquiry that employs recognized research techniques to inform the action taken to improve practice.
  o Characteristics: applied in a real-life and non-controlled environment, researcher intervention, collaborative action with subjects;
  o In simulation: not applicable, if not case study.
- Survey: A comprehensive research method for collecting information to describe, compare or explain knowledge, attitudes and behavior. A survey is often an investigation performed in retrospect, when, for example, a tool or technique has been in use for a while.
  o Characteristics: information gathering, retrospective, no control;
  o In simulation: to run simulations based on quantitative historical data without changing model variables.
- Case Study: A case study is an empirical inquiry that investigates a contemporary phenomenon within its real-life context, especially when the boundaries between the phenomenon and context are not clearly evident (Yin, 2003). Data is collected for a specific purpose throughout the study. It is normally aimed at tracking a specific attribute or establishing relationships between different attributes.
  o Characteristics: real-life context/environment, unclear boundaries and scope, data gathering for a specific purpose, low level of control;
  o In simulation: to run simulations based on current quantitative real-life context data in order to investigate with a specific purpose.
- Observational Study: An observational study collects relevant qualitative, sometimes quantitative, data as a project develops. There is relatively little control over the development process.
  o Characteristics: data gathering, low level of control;
  o In simulation: not applicable.
- Controlled Study: A controlled experiment is an investigation of a testable hypothesis where one or more independent variables are manipulated to measure their effect on one or more dependent variables. Controlled experiments allow us to determine in precise terms how the variables are related and, specifically, whether a cause-effect relationship exists between them.
  o Characteristics: high level of control, hypothesis testing, cause-effect relationship among variables;
  o In simulation: comparison of output variables of two distinct models (footnote 12).
10. http://lens-ese.cos.ufrj.br/wikiese/
11. http://lens.cos.ufrj.br/esee/
12. Here, we are talking about both simulation and analytical models.


So, we have tried to apply these concepts from the external (footnote 13) technical literature to simulation-based studies, classifying them into: survey (retrospective data), case study (current real data) and controlled experiments. In addition to this classification scheme, we propose a set of information for the characterization of simulation-based studies, according to table 1.

Table 1. Simulation-based studies characterization

Perspective: Number of models
Possible values: 1, 2, ..., N

Perspective: Number of datasets
Possible values: 1, 2, ..., N

Perspective: Data set
Possible values:
- Project data [Historical or Current]
- Artificial data [Example or Systematically Generated]

Perspective: Input parameters
Possible values:
- Determined in an ad-hoc way
- Determined in a systematic way
- Constant
- Variables

Perspective: Model calibration
Possible values:
- Calibrated
- Non-calibrated

Perspective: Study procedure
Possible values:
- Comparison against other or modified models results
- Variation of input parameters to observe the impact on response variable
- Just execute simulation runs
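The characterization in table 1 can be represented as a small data structure, which may help when recording each study in a uniform way. The sketch below is only one possible encoding: the class, field and method names are hypothetical, and the `suggested_strategy` rule is a simplified reading of the classification described above (comparison between models as controlled experiment, historical project data as survey, current project data as case study), not a rule defined by the review.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DataSet(Enum):
    PROJECT_HISTORICAL = "project data (historical)"
    PROJECT_CURRENT = "project data (current)"
    ARTIFICIAL_EXAMPLE = "artificial data (example)"
    ARTIFICIAL_GENERATED = "artificial data (systematically generated)"


class Determination(Enum):
    AD_HOC = "ad-hoc"
    SYSTEMATIC = "systematic"


class Variation(Enum):
    CONSTANT = "constant"
    VARIABLE = "variable"


class Calibration(Enum):
    CALIBRATED = "calibrated"
    NON_CALIBRATED = "non-calibrated"


class Procedure(Enum):
    MODEL_COMPARISON = "comparison against other or modified models results"
    PARAMETER_VARIATION = "variation of input parameters"
    RUNS_ONLY = "just execute simulation runs"


@dataclass(frozen=True)
class StudyCharacterization:
    """One record per simulation-based study, following the perspectives of table 1."""
    n_models: int
    n_datasets: int
    data_set: DataSet
    determination: Determination
    variation: Variation
    calibration: Calibration
    procedure: Procedure

    def suggested_strategy(self) -> Optional[str]:
        """Simplified, assumed mapping to a study strategy; None if undecided."""
        if self.procedure is Procedure.MODEL_COMPARISON and self.n_models >= 2:
            return "controlled experiment"
        if self.data_set is DataSet.PROJECT_HISTORICAL:
            return "survey"
        if self.data_set is DataSet.PROJECT_CURRENT:
            return "case study"
        return None


if __name__ == "__main__":
    study = StudyCharacterization(
        n_models=1,
        n_datasets=1,
        data_set=DataSet.PROJECT_HISTORICAL,
        determination=Determination.AD_HOC,
        variation=Variation.VARIABLE,
        calibration=Calibration.CALIBRATED,
        procedure=Procedure.PARAMETER_VARIATION,
    )
    print(study.suggested_strategy())
```

Studies based only on artificial data are left unclassified in this sketch, since the text does not assign them to a strategy.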

We obtained some statistics by characterizing the studies found in our quasi-systematic review according to the terminology already presented. As shown in figure 14(A), Survey studies cover more than half of the studies. It is important to remember that these surveys were not designed as a collection of expert opinions using forms as instruments, but as a survey of past (historical) project data through simulation runs, using a simulation model as an instrument. It is more like asking the simulation model (built with retrospective data) for the values of output variables given a configuration of input parameters.

Several authors call their studies experiments. In fact, these studies are what the technical literature calls simulation experiments, which are different from controlled experiments as defined before and as meant by the term Experiment in figure 14(A). By simulation experiment we mean a test or a series of tests in which meaningful changes are made to the input variables of a simulation model so that we may observe and identify the reasons for changes in the performance measures [Maria, 1997], and this definition is closer to what we called Survey. In any case, it is difficult to identify hypotheses and experimental designs in these reports, and also difficult to identify control and treatment groups in controlled experiments.

Survey studies are proportional to the procedures shown in figure 14(B), where Variation of input parameters to observe the impact on output variables is the procedure adopted to survey the simulation model. The same interpretation can be applied to the percentage of controlled experiments and the procedure Comparison against other/modified models results. These comparisons were, for the most part, made in a particular experimental design using distinct or the same datasets, in which a treatment and a control group were established to make a fair comparison among the involved models.

13. Not captured by this review.



Figure 14. Characterization of simulation-based studies

Another interesting characteristic of these studies is related to how input parameters were determined for the study (figure 14-C). They can be determined in four ways:

- Systematic constant: there is a pre-defined procedure to generate or choose these values, which remain the same for the whole simulation study; it can also include multiple runs;
- Systematic variable: there is a pre-defined procedure to generate or choose these values, which are varied at different instants of the simulation study; it can also include multiple runs. Sensitivity Analysis, as in [Houston et al, 2001], can be included in this category;
- Ad-hoc constant: there is no pre-defined procedure to generate or choose these values, which remain the same for the whole simulation study; it can also include multiple runs;
- Ad-hoc variable: there is no pre-defined procedure to generate or choose these values, which are varied at different instants of the simulation study; it can also include multiple runs. A common example is setting a new parameter value at a time t during the simulation run.
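The distinction between constant and variable input parameters described above can be sketched with a piecewise-constant schedule: a constant parameter keeps its initial value for the whole run, while a variable one receives a new value at given simulation times. The class and parameter names below are hypothetical and the values are made up. Whether a schedule counts as systematic or ad-hoc depends on how its values were chosen, which the code itself cannot express.

```python
from bisect import bisect_right
from typing import Iterable, List, Tuple


class ParameterSchedule:
    """Piecewise-constant input parameter over simulation time."""

    def __init__(self, initial: float,
                 changes: Iterable[Tuple[float, float]] = ()) -> None:
        # Each change is a (time, new_value) pair; sort them by time.
        ordered = sorted(changes)
        self._times: List[float] = [t for t, _ in ordered]
        self._values: List[float] = [initial] + [v for _, v in ordered]

    def value_at(self, t: float) -> float:
        """Value in effect at simulation time t (a change applies from its time on)."""
        return self._values[bisect_right(self._times, t)]

    @property
    def is_constant(self) -> bool:
        return not self._times


# Hypothetical team-size parameter: constant versus changed at t = 30.
CONSTANT_TEAM = ParameterSchedule(10)
VARIABLE_TEAM = ParameterSchedule(10, [(30, 14)])

if __name__ == "__main__":
    for t in (0, 29, 30, 60):
        print(t, CONSTANT_TEAM.value_at(t), VARIABLE_TEAM.value_at(t))
```

Recording the schedule explicitly, together with the procedure used to choose its values, is one way of making the parameter setting of a study reproducible.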

Model calibration is also an important characteristic of simulation studies [Ören, 1981]. In figure 14(D), it is possible to see that many studies do not report this kind of information. Garousi et al (2009) presented an experiment using two distinct sets of calibration parameters, and also discussed the calibration, in order to understand the impact of V&V activities on project performance. Many of the simulation studies with no calibrated model are based on artificial data.

The distribution of simulation studies over simulation approaches is presented in figure 15. The vast majority of simulation-based studies use System Dynamics models. Also, we have replications of simulation studies, as already mentioned. Replications are almost always performed by the same author. One main reason for this is the fact that, currently, reported data is not enough to replicate a study. Some information missing in simulation-based study reports will be commented on later in this section.

Figure 15. Distribution of Simulation Studies over Simulation Approaches

Analogous to simulation approaches, we present the studies distributed over Software Engineering domains in figure 16.

Figure 16. Distribution of Simulation Studies over SE Domains


As occurs with the System Dynamics approach, Software Project Management is the most studied domain in the selected simulation studies. In the same way as explained for other perspectives (characteristics, V&V procedures, analysis instruments, and others), System Dynamics models for Software Project Management can be found in most of the papers selected in this quasi-systematic review. So, we may also conclude that these are the simulation-based studies with the most accumulated experience and that it is possible to learn from their maturity. Maybe the will to explain the whole software development dynamics led to this focus. However, study purposes and goals differ too much to explain it all.

In the analyzed papers, the purpose and goals of simulation models are not clearly defined, and the same happens to the goals of the performed studies. It is very common to find descriptions mentioning only the problem in which the proposed simulation model is involved, but it is very hard to find specific or structured research questions, hypotheses or a GQM approach, for example. At first impression, one may believe that models were conceived before the problems arose. This is not necessarily true, but the way simulation-based studies have been reported leads to this kind of conclusion.

Another issue found in simulation study reports is related to the experimental design. In general, it is not reported at all. It is possible, but hard, to identify factors (many times, just parameters) and response variables for the studies where the output data is presented, at least, by charts. The arrangement is rarely found, i.e., the answers to simple questions such as: what are the treatments/levels for each experimental factor? What are the other model input parameters (context variables)? Do they remain constant? What were their initial values for each simulation run? Simple information such as the number of simulation runs is seldom reported and, when it is, there is no explanation of why that number was used.

These problems should be considered of main importance because, without addressing them, replicating and auditing these studies (or even verifying the results) is an unfeasible task, and it is also difficult to compare study results and to benchmark models, since there is no comparable baseline. There are some exceptions among simulation-based studies, as in [Houston et al, 2001] and [Wakeland et al, 2004], but the vast majority of these studies are proof-of-concept-like studies, consisting of ad hoc experimental designs, not complying with any systematic experimental methodology, and often missing relevant information in their reports. Houston et al used DOE (Design of Experiments) to measure the relative contribution of each factor to the variation in the response variables in order to characterize the behavior of system dynamics simulation models. Wakeland et al proposed the use of DOE and BRSA (Broad Range Sensitivity Analysis) in order to understand the interactions and nonlinear effects at work in the model, i.e., the model logic and behavior, thus leading to a better understanding of the underlying system/process.
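As an illustration of the kind of explicit experimental arrangement discussed above, the following sketch enumerates a two-level full factorial design and estimates the main effect of each factor on a response variable. It is a generic textbook-style example, not the procedure used by Houston et al or Wakeland et al; the response function, the factor names and the coded levels (-1, +1) are hypothetical.

```python
from itertools import product
from statistics import mean
from typing import Callable, Dict, List, Tuple

Levels = Dict[str, Tuple[float, float]]  # factor -> (low level, high level)


def full_factorial(levels: Levels) -> List[Dict[str, float]]:
    """All combinations of low/high levels: 2**k runs for k factors."""
    names = list(levels)
    return [dict(zip(names, combo))
            for combo in product(*(levels[n] for n in names))]


def main_effects(levels: Levels,
                 response: Callable[..., float]) -> Dict[str, float]:
    """Mean response at the high level minus mean response at the low level."""
    results = [(run, response(**run)) for run in full_factorial(levels)]
    effects = {}
    for name, (low, high) in levels.items():
        at_high = mean(y for run, y in results if run[name] == high)
        at_low = mean(y for run, y in results if run[name] == low)
        effects[name] = at_high - at_low
    return effects


def toy_response(x1: float, x2: float, x3: float) -> float:
    """Hypothetical deterministic stand-in for a simulation model output.

    x3 is deliberately ignored, mimicking an input parameter that does not
    contribute to the result.
    """
    return 10.0 + 3.0 * x1 - 2.0 * x2 + 0.5 * x1 * x2


# Coded levels for three hypothetical factors.
LEVELS: Levels = {"x1": (-1.0, 1.0), "x2": (-1.0, 1.0), "x3": (-1.0, 1.0)}

if __name__ == "__main__":
    for run in full_factorial(LEVELS):
        print(run, toy_response(**run))
    print(main_effects(LEVELS, toy_response))
```

Listing the design in this form answers the questions raised above directly: the factors, their levels, the number of runs and the values of every input in each run are all explicit. For a stochastic model, each design point would additionally need replicated runs.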


4 Conclusions
This section presents some concluding remarks taken from the performed quasi-systematic review. First, we discuss threats to the validity of this review. Later, we present the open questions that emerged after analyzing the characterization results. Finally, we summarize the current state of the art and future perspectives.

4.1 Threats to validity


Threats and risks to the validity of this quasi-systematic review can be identified in several of the activities performed. We tried to reduce them as much as possible once they were identified. Each threat to the validity of our work is discussed below.

Keywords. Terminology problems have been reported in many research areas involving computer science and engineering, including computer simulation and software engineering. To establish the set of keywords used to compose our search string, we therefore relied on a previously performed ad-hoc review. We also submitted the keywords to two experts in computer simulation applied to Software Engineering in order to reduce the risk of missing a term unknown to us. We did not use specific terms for each software engineering domain (such as testing, design, requirements, inspections, and others); instead, we used general terms to represent the Software Engineering area. This may be a threat to the validity of our study, since some papers may mention only domain-specific terms and none of the general software engineering terms.

Sources Selection. Scopus, Ei Compendex and Web of Science (ISI Knowledge) encompass the main publication databases (including journals and conference papers) for computer simulation and software engineering research. Examples of such databases include ACM, IEEE, Elsevier, Springer and Wiley. Among these databases we can find papers from journals like SIMULATION, Simulation Modelling Practice and Theory, Journal of Systems and Software, Software Process Improvement and Practice (now incorporated in the Journal of Software Maintenance and Evolution: Research and Practice), IEEE Transactions, LNCS, and conferences like ICSSP (and its former versions), ICSE, ICST, among many others. For characterization purposes, a sample of the most important technical literature seems to be enough, since it is not feasible to review all research papers published on this subject.

Inclusion and Exclusion Criteria. We tried to filter out as many as possible of the papers that do not consider simulation applied to the Software Engineering field. Nine control papers (papers that should compose the answer to the research question) were pre-selected from the ad-hoc literature review and according to the authors' experience. The search string retrieved about 67% of these control papers. Of the relevant papers selected after applying the inclusion and exclusion criteria, 14 were unavailable for download.

Personal Understanding and Bias. Quasi-systematic reviews may impose a lot of extra manual work that is error-prone [Dyba et al, 2007]. For this reason, three reviewers were involved in order to reduce bias in the selection and in the information extraction (guided by the information extraction form).

Classification. Since there is no consensual taxonomy or classification schema, we based our classification on the information presented in each paper, which may lead to inaccurate classification. We tried to group terms (simulation approaches, characteristics, analysis and V&V procedures) that share a semantically similar definition or description.

Conclusions. Another limitation of this study is publication selection bias, since publications rarely contain negative results or present their weaknesses.

Considering all these threats and the care taken to reduce them, we believe our quasi-systematic review was systematically performed, which brings some confidence to the results.
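
The coverage figure reported for the control papers can be made concrete with a small calculation. The sketch below is only an illustration: the paper identifiers are hypothetical, and the count of six retrieved control papers is an inference from the reported "about 67%" of nine papers (6/9 is approximately 0.667), not a number stated in this report.

```python
def search_coverage(control_papers, retrieved_papers):
    """Return the fraction of control papers that the search string retrieved."""
    control = set(control_papers)
    if not control:
        raise ValueError("at least one control paper is required")
    found = control & set(retrieved_papers)
    return len(found) / len(control)


# Hypothetical identifiers: nine control papers, six of them retrieved
# (assumed count, inferred from the reported coverage of about 67%).
control = [f"control-{i}" for i in range(1, 10)]
retrieved = control[:6] + ["other-1", "other-2"]

coverage = search_coverage(control, retrieved)
print(f"coverage = {coverage:.0%}")
```

A coverage below 100% means that some control papers were not reached by the search string, which is why the keyword threat discussed above matters.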

4.2 Open Questions


Some questions remain open after running this quasi-systematic review. They are listed as follows:
Which kind of method or procedure can reduce the gap between what is observed in Software Engineering systems and processes and what has been modeled?
What could be considered a minimum set of V&V procedures that, when successfully implemented, could bring confidence to a simulation model?
What are the requirements for replicating simulation studies?

4.3 State of the Art and Future Directions


This section is guided by the results found in this review and by the open questions presented in the previous section (4.2).

Simulation approaches are usually mentioned and, in many cases, described. It is important to give an overview of the approach underlying a model, since it provides useful background on how the real system or process was abstracted and on the execution mechanism used to drive the simulations. Unfortunately, a few published simulation models use an approach that is not clearly defined (what we called Not specified), and it is difficult to conceive of a simulation model developed without considering a standard abstraction and behavior. Domains also appear to be clearly defined. In addition, simulation in software engineering tends to be biased toward the System Dynamics approach and toward Software Process and Project Management models.

It was not possible to capture the motivation for using a specific simulation approach. Such motivation may lie in the perspective from which modelers want to analyze the systems or processes, or in the fact that some kinds of problems are related to a given simulation approach. We tried to relate simulation approaches to the software engineering domains and characteristics found, but the only thing we could conclude is that model characteristics are driven by the simulation approaches. Since we were not able to clearly capture and group the problems and purposes of simulation models, this remains a hypothesis.

Simulation studies are the main concern raised by our results. There is a lack of rigor in planning studies (mainly experimental design issues), in assuring model validity before performing studies, and in the analysis procedures for simulation output data. All these issues are treated, most likely, in an ad-hoc fashion, except for a few studies that present a systematic way of doing one or another activity, but never all of them together.

Following this reasoning about simulation studies, we believe that methodologies covering simulation-based studies from planning to the reporting of results, passing through model validity assurance and output analysis, are a promising future direction and an important problem to be solved in the Software Engineering field. Such a methodology should also consider peculiarities of Software Engineering such as the lack of data and measurement issues. Improvements derived from such a methodology can highlight the requirements for replicating simulation studies, since a study conducted in a systematic way should be repeatable.
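
As an illustration of what a repeatable output analysis step could look like, the sketch below runs independent replications of a toy stochastic model with explicit seeds and reports a confidence interval for the mean output. Everything in it is an assumption made for illustration: the function `simulate_project`, its triangular task durations, the number of tasks and the 30 replications are hypothetical and are not taken from any of the reviewed studies. The interval uses a normal approximation for simplicity; a Student t quantile would be more appropriate for a small number of replications.

```python
import random
import statistics


def simulate_project(seed):
    """Toy stand-in for one simulation run: returns a project duration in days."""
    rng = random.Random(seed)   # one seeded generator per run, for repeatability
    tasks = 20                  # hypothetical number of tasks
    # random.triangular takes (low, high, mode)
    return sum(rng.triangular(2.0, 10.0, 4.0) for _ in range(tasks))


def replicate(seeds):
    """Run one independent replication per seed and summarize the output."""
    outputs = [simulate_project(seed) for seed in seeds]
    mean = statistics.mean(outputs)
    std_err = statistics.stdev(outputs) / len(outputs) ** 0.5
    z = statistics.NormalDist().inv_cdf(0.975)   # normal approximation, 95% level
    return mean, (mean - z * std_err, mean + z * std_err)


seeds = range(30)               # recorded seeds make the study repeatable
mean, (low, high) = replicate(seeds)
print(f"mean duration: {mean:.1f} days, 95% CI: [{low:.1f}, {high:.1f}]")
```

Because the seeds are part of the study description, running the same script again reproduces the same estimates, which is the kind of repeatability a systematic methodology is expected to provide.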

Another possible direction, which is in fact already under way, is the proposal of concrete methods for developing simulation models in software engineering. Some specific approaches have been proposed in the literature, for instance the IMMoS (Integrated Measurement, Modelling, and Simulation) methodology for the development of System Dynamics models for Software Process Simulation [Pfahl and Ruhe, 2002].


5 References
Abdel-Hamid, T. Understanding the "90% syndrome" in software project management: A simulation-based case study. The Journal of Systems and Software, 1988, 8, 319-33.
Abdel-Hamid, T. The economics of software quality assurance: A simulation-based case study. MIS Quarterly: Management Information Systems, 1988, 12, 395-410.
Abdel-Hamid, T. K. Dynamics of software project staffing: A system dynamics based simulation approach. IEEE Transactions on Software Engineering, 1989, 15, 109-119.
Abdel-Hamid, T. K. Investigating the cost/schedule trade-off in software development. IEEE Software, 1990, 7, 97-105.
Abdel-Hamid, T. A multiproject perspective of single-project dynamics. The Journal of Systems and Software, 1993, 22, 151-165.
Abdel-Hamid, T. K. & Madnick, S. E. Impact of schedule estimation on software project behavior. IEEE Software, 1986, 3, 70-75.
Abdel-Hamid, Tarek & Madnick, Stuart. Software Project Dynamics: An Integrated Approach. Facsimile Edition, Prentice-Hall, 1991.
Ahmed, R.; Hall, T.; Wernick, P.; Robinson, S. & Shah, M. Software process simulation modelling: A survey of practice. Journal of Simulation, 2008, 2, 91-102.
Al-Emran, A. & Pfahl, D. Operational planning, re-planning and risk analysis for software releases. Lecture Notes in Computer Science, 2007, 4589 LNCS, 315-329.
Al-Emran, Ahmed; Pfahl, Dietmar; Ruhe, Günther. A Method for Re-planning of Software Releases Using Discrete-event Simulation. Software Process Improvement and Practice, v. 13, p. 19-33, 2008.
Alexopoulos. Statistical analysis of simulation output: State of the art. Proceedings of the 2007 Winter Simulation Conference, 2007.
Alvarez, Guillermo A., C. F. Applying simulation to the design and performance evaluation of fault-tolerant systems. Proceedings of the IEEE Symposium on Reliable Distributed Systems, IEEE Comp Soc, Los Alamitos, CA, United States, 1997, 35-42.
Ambrosio, B. G.; Braga, J. L. & Resende-Filho, M. A. Modeling and scenario simulation for decision support in management of requirements activities in software projects. Journal of Software Maintenance and Evolution, 2011, 23, 35-50.
Araújo, M. A. P.; Travassos, G. H. Towards a Framework for Experimental Studies on Object-Oriented Software Decay. In: ACM/IEEE ISESE'04 - International Symposium on Empirical Software Engineering, Redondo Beach, USA, 2004.
Arief, L. B.; Speirs, N. A. A UML Tool for an Automatic Generation of Simulation Programs. In Proceedings of WOSP 2000.
Banks, J. Introduction to Simulation. In: Winter Simulation Conference (WSC'99), Phoenix, AZ, USA, 1999.
Barros, Márcio O.; Werner, Claudia M. L.; Travassos, Guilherme H. Supporting risks in software project management. The Journal of Systems and Software, v. 70, p. 21-35, 2003.


Biolchini, J.; Mian, P. G.; Natali, A. C.; Travassos, G. H. (2005). Systematic Review in Software Engineering: Relevance and Utility. Technical Report. PESC COPPE/UFRJ. Brazil. http://www.cos.ufrj.br/uploadfiles/es67905.pdf
Birta, L. G. & Arbez, G. Modelling and Simulation: Exploring Dynamic System Behaviour. Springer, 2007.
Chen, Y.-X. & Liu, Q. Hierarchy-based team software process simulation model. Wuhan University Journal of Natural Sciences, 2006, 11, 273-277.
Choi, K.; Jung, S.; Kim, H.; Bae, D. H. & Lee, D. UML-based modeling and simulation method for mission-critical real-time embedded system development. Proceedings of the IASTED International Conference on Software Engineering, as part of the 24th IASTED International Multi-Conference on Applied Informatics, 2006, 160-165.
Drappa, A. & Ludewig, J. Quantitative modeling for the interactive simulation of software projects. Journal of Systems and Software, 1999, 46, 113-122.
Drappa, A. & Ludewig, J. Simulation in software engineering training. Proceedings - International Conference on Software Engineering, 2000, 199-208.
Dybå, Tore; Dingsøyr, Torgeir; Hanssen, Geir K. Applying Systematic Reviews to Diverse Study Types: An Experience Report. First International Symposium on Empirical Software Engineering and Measurement, 2007.
Ferreira, S.; Collofello, J.; Shunk, D. & Mackulak, G. Understanding the effects of requirements volatility in software engineering by using analytical modeling and software process simulation. Journal of Systems and Software, 2009, 82, 1568-1577.
Forrester, Jay W. Industrial Dynamics. Pegasus Communications, 1961. ISBN 1883823366.
Garcia, R. E.; Oliveira, M. C. F.; Maldonado, J. C. Genetic Algorithms to Support Software Engineering Experimentation. In: IV International Symposium on Empirical Software Engineering (ISESE), 2005, v. 1, p. 488-497.
Grillinger, P.; Brada, P.; Racek, S. Simulation approach to embedded system programming and testing. Proceedings - 11th IEEE International Conference and Workshop on the Engineering of Computer-Based Systems, ECBS 2004, 2004, 248-254.
Höst, M.; Regnell, B.; Tingström, C. A framework for simulation of requirements engineering processes. EUROMICRO 2008 - Proceedings of the 34th EUROMICRO Conference on Software Engineering and Advanced Applications, SEAA 2008, 2008, 183-190.
Houston, D. X.; Ferreira, S.; Collofello, J. S.; Montgomery, D. C.; Mackulak, G. T.; Shunk, D. L. Behavioral characterization: Finding and using the influential factors in software process simulation models. Journal of Systems and Software, 2001, 59, 259-270.
Kang, K.; Lee, K.; Lee, J. & Kim, G. ASADAL/SIM: An incremental multi-level simulation and analysis tool for real-time software specifications. Software-Practice & Experience, John Wiley & Sons Ltd, 1998, 28, 445-462.
Law, Averill M. Statistical Analysis of Simulation Output Data: The Practical State of the Art. Proceedings of the 2007 Winter Simulation Conference.


Lee, B. & Miller, J. Multi-project management in Software Engineering using simulation modeling. Software Quality Journal, Kluwer Academic Publ, 2004, 12, 59-82.
Lehman, M. M. Programs, Life Cycle and the Laws of Software Evolution. Proc. IEEE Special Issue on Software Engineering, 1980, vol. 68, no. 9, pp. 1060-1076.
Lopes, V. P. Repositório de Conhecimento de um Ambiente de Apoio a Experimentação em Engenharia de Software. Master's dissertation. Programa de Pós-graduação em Engenharia de Sistemas e Computação, COPPE-UFRJ, 2010.
Luckham, D. C.; Kenney, J. J.; Augustin, L. M.; Vera, J.; Bryan, D.; Mann, W. Specification and Analysis of System Architecture Using Rapide. IEEE Transactions on Software Engineering, 1995, volume 21, issue 4 (special issue on software architecture), pages 336-355.
Madachy, Raymond J. System dynamics modeling of an inspection-based process. In: 18th International Conference on Software Engineering, ICSE 1996. Berlin, Germany: IEEE Computer Society, 1996, p. 376-386.
Madachy, R. Software Process Dynamics. Wiley-IEEE Press, 2008.
Madachy, R. & Khoshnevis, B. Dynamic simulation modeling of an inspection-based software lifecycle processes. Simulation, 1997, 69, 35-47.
Maria, A. Introduction to Modeling and Simulation. Proceedings of the 1997 Winter Simulation Conference.
Martin, Robert H.; Raffo, David. A Model of the Software Development Process Using Both Continuous and Discrete Models. Software Process: Improvement and Practice, v. 5, n. 2-3, p. 147-157, 2000.
Martin, R. & Raffo, D. Application of a hybrid process simulation model to a software development project. Journal of Systems and Software, 2001, 59, 237-246.
Melis, M.; Turnu, I.; Cau, A. & Concas, G. Evaluating the impact of test-first programming and pair programming through software process simulation. Software Process Improvement and Practice, 2006, 11, 345-360.
Müller, M.; Pfahl, D. Simulation Methods. Guide to Advanced Empirical Software Engineering, 2008, Section I, 117-152. DOI: 10.1007/978-1-84800-044-5_5
Nance, R. E.; Sargent, R. G. Perspectives on the Evolution of Simulation. Operations Research, vol. 50, no. 1, January-February 2002, pp. 161-172. DOI: 10.1287/opre.50.1.161.177902002.
Navarro, E. O. & Hoek, A. V. D. Design and evaluation of an educational software process simulation environment and associated model. Proceedings - 18th Conference on Software Engineering Education and Training, CSEE and T 2005, 2005, 25-34.
Ören, T. I. Uses of Simulation. Principles of Modeling and Simulation: A Multidisciplinary Approach / John A. Sokolowski, Catherine M. Banks. John Wiley & Sons, Inc., 2009.
Ormon, S.; Cassady, C. & Greenwood, A. A simulation-based reliability prediction model for conceptual design. Proceedings of the Annual Reliability and Maintainability Symposium, 2001, 433-436.
Padberg, F. A software process scheduling simulator. Proceedings - International Conference on Software Engineering, 2003, 816-817.
Pai, M.; McCulloch, M.; Gorman, J. D. et al. (2004). Systematic Reviews and meta-analyses: An illustrated, step-by-step guide. The National Medical Journal of India, vol. 17, n. 2.

Pfahl, D. & Lebsanft, K. Using simulation to analyze the impact of software requirement volatility on project performance. Information and Software Technology, 2000, 42, 1001-1008.
Pfahl, D.; Klemm, M. & Ruhe, G. A CBT module with integrated simulation component for software project management education and training. Journal of Systems and Software, 2001, 59, 283-298.
Pfahl, D. & Ruhe, G. IMMoS: a methodology for integrated measurement, modelling and simulation. Software Process: Improvement and Practice, 2002, volume 7, issue 3-4, Wiley, pages 189-210.
Pfahl, D.; Laitenberger, O.; Dorsch, J. & Ruhe, G. An Externally Replicated Experiment for Evaluating the Learning Effectiveness of Using Simulations in Software Project Management Education. Empirical Software Engineering, 2003, 8, 367-395.
Raffo, D.; Kaltio, T.; Partridge, D.; Phalp, K.; Ramil, J. F. (1999). Empirical Studies Applied to Software Process Models. Empirical Software Engineering, volume 4, issue 4, pages 353-369.
Setamanit, S. O.; Wakeland, W. & Raffo, D. Using simulation to evaluate global software development task allocation strategies. Software Process Improvement and Practice, 2007, 12, 491-503.
Setamanit, S. O. & Raffo, D. Identifying key success factors for globally distributed software development project using simulation: A case study. Lecture Notes in Computer Science, 2008, 5007 LNCS, 320-332.
Stopford, B. & Counsell, S. A Framework for the Simulation of Structural Software Evolution. ACM Transactions on Modeling and Computer Simulation, Assoc Computing Machinery, 2008, 18.
Thelin, T.; Petersson, H.; Runeson, P. & Wohlin, C. Applying sampling to improve software inspections. Journal of Systems and Software, 2004, 73, 257-269.
Travassos, G. H.; Santos, P. M.; Mian, P. G.; Dias Neto, A. C.; Biolchini, J. (2008). An Environment to Support Large Scale Experimentation in Software Engineering. 13th IEEE International Conference on Engineering of Complex Computer Systems (ICECCS 2008), pp. 193-202.
Travassos, G. H. & Barros, M. O. Contributions of In Virtuo and In Silico Experiments for the Future of Empirical Studies in Software Engineering. Proc. 2nd Workshop in Workshop Series on Empirical Software Engineering, The Future of Empirical Studies in Software Engineering, Rome, WSESE'03, Fraunhofer IRB Verlag, 2003.
Turnu, I.; Melis, M.; Cau, A.; Setzu, A.; Concas, G. & Mannaro, K. Modeling and simulation of open source development using an agile practice. Journal of Systems Architecture, 2006, 52, 610-618.
Wakeland, W. W.; Martin, R. H. & Raffo, D. Using Design of Experiments, sensitivity analysis, and hybrid simulation to evaluate changes to a software development process: A case study. Software Process Improvement and Practice, 2004, 9, 107-119.
Wohlin, C.; Runeson, P.; Höst, M.; Ohlsson, M. C.; Regnell, B. & Wesslén, A. Experimentation in Software Engineering - An Introduction. Kluwer Academic Publishers, 2000. ISBN 0-7923-8682-5.


Wu, M. & Yan, H. Simulation in software engineering with system dynamics: A case study. Journal of Software, 2009, 4, 1127-1135.
Zelkowitz, M. V. Techniques for Empirical Validation. In: V. Basili et al. (Eds.): Empirical Software Engineering Issues, LNCS 4336, Springer-Verlag Berlin Heidelberg, pp. 4-9, 2007.
Zhang, H.; Kitchenham, B.; Pfahl, D. Reflections on 10 years of software process simulation modeling: A systematic review. Lecture Notes in Computer Science, 2008, 5007 LNCS, 345-356.
