MSR 2014, May 31–June 1, Hyderabad, India
The 11th Working Conference on Mining Software Repositories

Keynotes

Is Mining Software Repositories Data Science?
Dr. Audris Mockus, Avaya Labs Research

Program at a glance

DAY 1
8:15-8:30 MSR Opening Message
8:30-9:30 Keynote 1 - Dr. Audris Mockus
9:30-10:30 1. Green mining [Session]
10:30-11:00 Coffee Break
11:00-12:00 2. Code Clones and Origin Analysis [Session]
12:00-12:30 Data showcase teasers
12:30-14:00 Working Lunch
14:00-15:30 3. Mining Repos and QA sites [Session]
15:30-16:00 Coffee Break
16:00-17:00 MSR Challenge
17:00-18:00 4. Bug Characterizing [Session]
   
DAY 2
8:30-9:00 Awards/announcement
9:00-10:30 5. Mining Applications [Session]
10:30-11:00 Coffee Break
11:00-11:45 6. Defect Prediction [Session]
11:45-12:35 Short papers
12:35-14:00 Working Lunch
14:00-15:30 7. Mining Mix [Session]
15:30-16:00 Coffee Break
16:00-17:00 8. Code Review and Code Search [Session]
17:00-18:00 9. Effort Estimation and Reuse [Session]
18:00-18:10 Wrap Up

MSR 2014 Potential Schedule

DAY 1
8:15-8:30 MSR Opening Message
8:30-9:30 Keynote 1 [Session Chair: Prem Devanbu]
  Is Mining Software Repositories Data Science?
Audris Mockus
(Avaya Labs Research, USA)
9:30-10:30 Green mining [Session Chair: Mei Nagappan]
 

Mining Energy-Greedy API Usage Patterns in Android Apps: an Empirical Study
Mario Linares-Vásquez, Gabriele Bavota, Carlos Eduardo Bernal Cardenas, Rocco Oliveto, Massimiliano Di Penta and Denys Poshyvanyk
(College of William and Mary, USA; University of Sannio, Italy; Universidad Nacional de Colombia, Colombia; University of Molise, Italy)
Preprint Available

GreenMiner: A Hardware Based Mining Software Repositories Software Energy Consumption Framework
Abram Hindle, Alex Wilson, Kent Rasmussen, Jed Barlow, Joshua Campbell and Stephen Romansky
(University of Alberta, Canada)
Preprint Available

Mining Questions About Software Energy Consumption
Gustavo Pinto, Fernando Castor and Yu David Liu
(Federal University of Pernambuco, Brazil; SUNY Binghamton, USA)
Preprint Available

10:30-11:00 Coffee Break
11:00-12:00 Code Clones and Origin Analysis [Session Chair: Massimiliano Di Penta]
 

Prediction and Ranking of Co-change Candidates for Clones
Manishankar Mondal, Chanchal K. Roy and Kevin Schneider
(University of Saskatchewan, Canada)
Preprint Available

Incremental Origin Analysis of Source Code Files
Daniela Steidl, Benjamin Hummel and Elmar Juergens
(CQSE, Germany)
Preprint Available

Oops! Where Did That Code Snippet Come From?
Lisong Guo, Julia Lawall and Gilles Muller
(INRIA, France; LIP6, France; Sorbonne, France; UPMC, France)
Preprint Available

12:00-12:30 Data showcase teasers [Session Chair: Emad Shihab]
 

(Best Data Showcase Award) A dataset for pull-based development research
Georgios Gousios and Andy Zaidman
(Delft University of Technology, Netherlands)
Preprint Available

The Bug Catalog of the Maven Ecosystem
Dimitris Mitropoulos, Vassilios Karakoidas, Panagiotis Louridas, Georgios Gousios and Diomidis Spinellis
(Athens University of Economics and Business, Greece; Delft University of Technology, Netherlands)
Preprint Available

A Dataset of Feature Additions and Feature Removals from the Linux Kernel
Leonardo Passos and Krzysztof Czarnecki
(University of Waterloo, Canada)

Kataribe: A Hosting Service of Historage Repositories - Fine-Grained Analysis for all
Kenji Fujiwara, Hideaki Hata, Erina Makihara, Yusuke Fujihara, Naoki Nakayama, Hajimu Iida and Ken-Ichi Matsumoto
(NAIST, Japan)
Preprint Available

Lean GHTorrent: GitHub data on demand
Georgios Gousios, Bogdan Vasilescu, Alexander Serebrenik and Andy Zaidman
(Delft University of Technology, Netherlands; Eindhoven University of Technology, Netherlands)
Preprint Available

A Code Clone Oracle
Daniel Krutz and Wei Le
(Rochester Institute of Technology, USA)

Generating Duplicate Bug Datasets
Alina Lazar, Sarah Ritchey and Bonita Sharif
(Youngstown State University, USA)

FLOSS 2013: A survey dataset about free software contributors. Challenges for curating, sharing and combining
Gregorio Robles, Laura Arjona-Reina, Bogdan Vasilescu, Alexander Serebrenik and Jesús M. González-Barahona
(Universidad Rey Juan Carlos, Spain; Universidad Politécnica de Madrid, Spain; Eindhoven University of Technology, Netherlands)

A Green Miner’s Dataset: Mining the Impact of Software Change on Energy Consumption
Chenlei Zhang and Abram Hindle
(University of Alberta, Canada)
Preprint Available

Gentoo Package Dependencies over Time
Remco Bloemen and Chintan Amrit
(University of Twente, Netherlands)
Preprint Available

Models of OSS Project Meta-Information: A Dataset of Three Forges
James Williams, Davide Di Ruscio, Nicholas Matragkas, Dimitris Kolovos and Juri Di Rocco
(University of York, UK; University of L'Aquila, Italy)

A Dataset of Clone References with Gaps
Hiroaki Murakami, Yoshiki Higo and Shinji Kusumoto
(Osaka University, Japan)

Maven Components and Bug Patterns in them
Hitesh Sajnani, Vaibhav Saini, Joel Ossher and Cristina Lopes
(University of California at Irvine, USA)

OpenHub: A scalable architecture for the analysis of software quality attributes
Gabriel Farah, Juan Tejada and Dario Correal
(Universidad de los Andes, Colombia)

Understanding software evolution: the Maisqual Ant data set
Boris Baldassari and Philippe Preux
(SQuORING Technologies, France; LIFL, France; CNRS, France; INRIA, France; University of Lille, France)

12:30-14:00 Working Lunch
14:00-15:30 Mining Repos and QA sites [Session Chair: Andrew Begel]
 

The Promises and Perils of Mining GitHub
Eirini Kalliamvakou, Georgios Gousios, Kelly Blincoe, Leif Singer, Daniel German and Daniela Damian
(University of Victoria, Canada; Delft University of Technology, Netherlands)

Mining StackOverflow to Turn the IDE into a Self-confident Programming Prompter
Luca Ponzanelli, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto and Michele Lanza
(University of Lugano, Switzerland; University of Sannio, Italy; University of Molise, Italy)

Mining Questions Asked by Web Developers
Kartik Bajaj, Karthik Pattabiraman and Ali Mesbah
(University of British Columbia, Canada)
Preprint Available

Process Mining Multiple Repositories for Software Defect Resolution from Control and Organizational Perspective
Monika Gupta, Ashish Sureka and Srinivas Padmanabhuni
(IIIT Delhi, India; Infosys, India)

15:30-16:00 Coffee Break
16:00-17:00 MSR Challenge [Session Chair: Olga Baysal]
 

(Honorable Mention) A Study of External Community Contribution to Open-Source Projects on GitHub
Rohan Padhye, Senthil Mani and Vibha Singhal Sinha
(IBM Research, India)

Understanding “Watchers” on GitHub
Jyoti Sheoran, Kelly Blincoe, Eirini Kalliamvakou, Daniela Damian and Jordan Ell
(University of Victoria, Canada)

(Honorable Mention) Do developers discuss design?
João Brunet, Gail C. Murphy, Ricardo Terra, Jorge Figueiredo and Dalton Serey
(Federal University of Campina Grande, Brazil; University of British Columbia, Canada; Federal University of Lavras, Brazil)
Preprint Available

Magnet or Sticky?: An OSS Project-by-Project Typology
Kazuhiro Yamashita, Shane McIntosh, Yasutaka Kamei and Naoyasu Ubayashi
(Kyushu University, Japan; Queen's University, Canada)
Preprint Available

Security and Emotion: Sentiment Analysis of Security Discussions on GitHub
Daniel Pletea, Bogdan Vasilescu and Alexander Serebrenik
(Eindhoven University of Technology, Netherlands)

(MSR Challenge Winner/Honorable Mention) Sentiment Analysis of Commit Messages in GitHub: An Empirical Study
Emitza Guzman, David Azócar and Yang Li
(TU München, Germany)

Analysing the 'Biodiversity' of Open Source Ecosystems: The GitHub Case
Nicholas Matragkas, James Williams, Dimitris Kolovos and Richard Paige
(University of York, UK)

Co-Evolution of Project Documentation and Popularity within GitHub
Karan Aggarwal, Abram Hindle and Eleni Stroulia
(University of Alberta, Canada)
Preprint Available

An Insight into the Pull Requests of GitHub
Mohammad Masudur Rahman and Chanchal K. Roy
(University of Saskatchewan, Canada)
Preprint Available

17:00-18:00 Bug Characterizing [Session Chair: Abram Hindle]
 

Works For Me! Characterizing Non-reproducible Bug Reports
Mona Erfani Joorabchi, Mehdi Mirzaaghaei and Ali Mesbah
(University of British Columbia, Canada)
Preprint Available

Characterizing and Predicting Blocking Bugs in Open Source Projects
Harold Valdivia Garcia and Emad Shihab
(Rochester Institute of Technology, USA)

An Empirical Study of Dormant Bugs
Tse-Hsun Chen, Meiyappan Nagappan, Emad Shihab and Ahmed E. Hassan
(Queen's University, Canada; Rochester Institute of Technology, USA)

   
DAY 2
8:30-9:00 Awards/announcement
9:00-10:30 Mining Applications [Session Chair: Chris Bird]
 

MUX: Algorithm Selection for Software Model Checkers
Varun Tulsian, Aditya Kanade, Rahul Kumar, Akash Lal and Aditya Nori
(Indian Institute of Science, India; Microsoft Research, India)
Preprint Available

Improving the Effectiveness of Test Suite Through Mining Historical Data
Jeff Anderson, Saeed Salem and Hyunsook Do
(Microsoft, USA; North Dakota State University, USA)

Finding Patterns in Static Analysis Alerts
Quinn Hanam, Lin Tan, Reid Holmes and Patrick Lam
(University of Waterloo, Canada)

Impact Analysis of Change Requests on Source Code based on Interaction and Commit Histories
Motahareh Bahrami Zanjani, George Swartzendruber and Huzefa Kagdi
(Wichita State University, USA)

10:30-11:00 Coffee Break
11:00-11:45 Defect Prediction [Session Chair: Romain Robbes]
 

An Empirical Study of Just-In-Time Defect Prediction Using Cross-Project Models
Takafumi Fukushima, Yasutaka Kamei, Shane McIntosh, Kazuhiro Yamashita and Naoyasu Ubayashi
(Kyushu University, Japan; Queen's University, Canada)
Preprint Available

(Distinguished Paper Award) Towards Building a Universal Defect Prediction Model
Feng Zhang, Audris Mockus, Iman Keivanloo and Ying Zou
(Queen's University, Canada; Avaya Labs Research, USA)
Preprint Available

11:45-12:35 Short papers [Session Chair: Girish Maskeri Rama]
 

Innovation diffusion in Open Source Software
Remco Bloemen, Chintan Amrit, Stefan Kuhlmann and Gonzalo Ordóñez-Matamoros
(University of Twente, Netherlands)
Preprint Available

Improving the Accuracy of Duplicate Bug Report Detection Using Textual Similarity Measures
Alina Lazar, Sarah Ritchey and Bonita Sharif
(Youngstown State University, USA)

Undocumented and Unchecked: Exceptions that Spell Trouble
Maria Kechagia and Diomidis Spinellis
(Athens University of Economics and Business, Greece)
Preprint Available

New Features for Duplicate Bug Detection
Nathan Klein, Christopher Corley and Nicholas Kraft
(Oberlin College, USA; University of Alabama, USA)
Preprint Available

Mining Modern Repositories with Elasticsearch
Oleksii Kononenko, Olga Baysal, Reid Holmes and Michael W. Godfrey
(University of Waterloo, Canada)

Collaboration in Open-Source Projects: Myth or Reality?
Yuriy Tymchuk, Andrea Mocci and Michele Lanza
(University of Lugano, Switzerland)

A Dictionary to Translate Change Tasks to Source Code
Katja Kevic and Thomas Fritz
(University of Zurich, Switzerland)

Tracing Dynamic Features in Python Programs
Beatrice Åkerblom and Tobias Wrigstad
(Stockholm University, Sweden; Uppsala University, Sweden)

It's Not A Bug, It's A Feature: Does Misclassification Affect Bug Localization?
Pavneet Singh Kochhar, Tien-Duy Le and David Lo
(Singapore Management University, Singapore)

Classifying Unstructured Data into Natural Language Text and Technical Information
Thorsten Merten, Bastian Mager, Simone Bürsner and Barbara Paech
(Bonn-Rhein-Sieg University of Applied Sciences, Germany; University of Heidelberg, Germany)

12:35-14:00 Working Lunch
14:00-15:30 Mining Mix [Session Chair: Vibha Sinha]
 

Syntax Errors Just Aren’t Natural: Improving Error Reporting with Language Models
Joshua Campbell, Abram Hindle and José Nelson Amaral
(University of Alberta, Canada)
Preprint Available

Do Developers Feel Emotions? An Exploratory Analysis of Emotions in Software Artifacts
Alessandro Murgia, Parastou Tourani, Bram Adams and Marco Ortu
(University of Antwerp, Belgium; Polytechnique Montréal, Canada; University of Cagliari, Italy)
Preprint Available

How Does a Typical Tutorial for Mobile Development Look Like?
Rebecca Tiarks and Walid Maalej
(University of Hamburg, Germany)
Preprint Available

Unsupervised Discovery of Intentional Process Models From Event Logs
Ghazaleh Khodabandelou, Rebecca Deneckère, Camille Salinesi and Charlotte Hug
(Sorbonne, France)

15:30-16:00 Coffee Break
16:00-17:00 Code Review and Code Search [Session Chair: Chanchal Roy]
 

(Distinguished Paper Award) The Impact of Code Review Coverage and Code Review Participation on Software Quality: A Case Study of the Qt, VTK, and ITK Projects
Shane McIntosh, Yasutaka Kamei, Bram Adams and Ahmed E. Hassan
(Queen's University, Canada; Kyushu University, Japan; Polytechnique Montréal, Canada)
Preprint Available

Modern Code Reviews in Open-Source Projects: Which Problems Do They Fix?
Moritz Beller, Alberto Bacchelli, Andy Zaidman and Elmar Jürgens
(Delft University of Technology, Netherlands; CQSE, Germany)

Thesaurus-Based Automatic Query Expansion for Interface-Driven Code Search
Otavio Lemos, Adriano de Paula, Felipe Zanichelli and Cristina Lopes
(Federal University of São Paulo, Brazil; University of California at Irvine, USA)

17:00-18:00 Effort Estimation and Reuse [Session Chair: Tim Menzies]
 

Estimating Development Effort in Free/Open Source Software Projects by Mining Software Repositories
Gregorio Robles, Jesús M. González-Barahona, Carlos Cervigón-Ávila, Andrea Capiluppi and Daniel Izquierdo-Cortázar
(Universidad Rey Juan Carlos, Spain; Brunel University, UK; Bitergia, Spain)
Preprint Available

An Industrial Case Study of Automatically Identifying Performance Regression-Causes
Thanh H. D. Nguyen, Meiyappan Nagappan, Ahmed E. Hassan, Mohamed Nasser and Parminder Flora
(Queen's University, Canada; BlackBerry, Canada)

Revisiting Android Reuse Studies in the Context of Code Obfuscation and Library Usages
Mario Linares-Vásquez, Andrew Holtzhauer, Carlos Eduardo Bernal Cardenas and Denys Poshyvanyk
(College of William and Mary, USA; Universidad Nacional de Colombia, Colombia)
Preprint Available

18:00-18:10 Wrap Up [Session Chair: Martin Pinzger]


If you would like your accepted paper's preprint made available on the MSR website, or your personal URL linked to your paper, please email the Web Chair.