MSR 2014, May 31–June 1, Hyderabad, India
The 11th Working Conference on Mining Software Repositories


    Program at a glance (TBC)

    DAY 1
    8:15-8:30 MSR Opening Message
    8:30-9:30 Keynote 1 - Dr. Audris Mockus
    9:30-10:30 1. Green mining [Session]
    10:30-11:00 Coffee Break
    11:00-12:00 2. Code Clones and Origin Analysis [Session]
    12:00-12:30 Data showcase teasers
    12:30-14:00 Working Lunch
    14:00-15:30 3. Mining Repos and QA sites [Session]
    15:30-16:00 Coffee Break
    16:00-17:00 MSR Challenge
    17:00-18:00 4. Bug Characterizing [Session]
    DAY 2
    8:30-9:00 Awards/announcement
    9:00-10:30 5. Mining Applications [Session]
    10:30-11:00 Coffee Break
    11:00-11:45 6. Defect Prediction [Session]
    11:45-12:35 Short papers
    12:35-14:00 Working Lunch
    14:00-15:30 7. Mining Mix [Session]
    15:30-16:00 Coffee Break
    16:00-17:00 8. Code Review and Code Search [Session]
    17:00-18:00 9. Effort Estimation and Reuse [Session]
    18:00-18:10 Wrap Up

    Social Network

    Follow us on Twitter & Facebook!

    Important Dates

    (All deadlines are 23:59:59 AoE, Anywhere on Earth.)

    Research and Practice papers
    Abstract due: January 31, 2014
    Papers due: February 7, 2014
    Author notification: March 10, 2014

    Papers due: February 21, 2014
    Author notification: March 10, 2014

    Abstract due: February 16, 2014
    Papers due: February 21, 2014
    Author notification: March 10, 2014

    Submission Guidelines

    Submit papers through EasyChair:

    Paper submission guidelines:

    Corporate Supporters

    Supporting Indian students to attend MSR '14

    Supporting MSR Challenge


    Welcome to the official website of MSR 2014!

    The Mining Software Repositories (MSR) field analyzes the rich data available in software repositories to uncover interesting and actionable information about software systems and projects. The goal of this two-day working conference is to advance the science and practice of MSR.

    The 11th Working Conference on Mining Software Repositories is sponsored by IEEE TCSE and ACM SIGSOFT.

    Accepted Papers

    Research/Practice Track

    The Promises and Perils of Mining GitHub
    Eirini Kalliamvakou, Georgios Gousios, Kelly Blincoe, Leif Singer, Daniel German and Daniela Damian

    An Empirical Study of Just-In-Time Defect Prediction Using Cross-Project Models
    Takafumi Fukushima, Yasutaka Kamei, Shane McIntosh, Kazuhiro Yamashita and Naoyasu Ubayashi

    The Impact of Code Review Coverage and Code Review Participation on Software Quality: A Case Study of the Qt, VTK, and ITK Projects
    Shane McIntosh, Yasutaka Kamei, Bram Adams and Ahmed E. Hassan

    Mining StackOverflow to Turn the IDE into a Self-confident Programming Prompter
    Luca Ponzanelli, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto and Michele Lanza

    Towards Building a Universal Defect Prediction Model
    Feng Zhang, Audris Mockus, Iman Keivanloo and Ying Zou

    Innovation diffusion in Open Source Software (short)
    Remco Bloemen, Chintan Amrit, Stefan Kuhlmann and Gonzalo Ordóñez-Matamoros

    MUX: Algorithm Selection for Software Model Checkers
    Varun Tulsian, Aditya Kanade, Rahul Kumar, Akash Lal and Aditya Nori

    Process Mining Multiple Repositories for Software Defect Resolution from Control and Organizational Perspective
    Monika Gupta, Ashish Sureka and Srinivas Padmanabhuni

    Modern Code Reviews in Open-Source Projects: Which Problems Do They Fix?
    Moritz Beller, Alberto Bacchelli, Andy Zaidman and Elmar Jürgens

    Mining Questions Asked by Web Developers
    Kartik Bajaj, Karthik Pattabiraman and Ali Mesbah

    Improving the Effectiveness of Test Suite Through Mining Historical Data
    Jeff Anderson, Saeed Salem and Hyunsook Do

    Mining Energy-Greedy API Usage Patterns in Android Apps: an Empirical Study
    Mario Linares-Vásquez, Gabriele Bavota, Carlos Eduardo Bernal Cardenas, Rocco Oliveto, Massimiliano Di Penta and Denys Poshyvanyk

    Do Developers Feel Emotions? An Exploratory Analysis of Emotions in Software Artifacts
    Alessandro Murgia, Parastou Tourani, Bram Adams and Marco Ortu

    Thesaurus-Based Automatic Query Expansion for Interface-Driven Code Search
    Otavio Lemos, Adriano de Paula, Felipe Zanichelli and Cristina Lopes

    Improving the Accuracy of Duplicate Bug Report Detection Using Textual Similarity Measures (short)
    Alina Lazar, Sarah Ritchey and Bonita Sharif

    Undocumented and Unchecked: Exceptions that Spell Trouble (short)
    Maria Kechagia and Diomidis Spinellis

    New Metrics for Duplicate Bug Detection (short)
    Nathan Klein, Christopher Corley and Nicholas Kraft

    Mining Modern Repositories with Elasticsearch (short)
    Oleksii Kononenko, Olga Baysal, Reid Holmes and Michael Godfrey

    An Industrial Case Study of Automatically Identifying Performance Regression-Causes
    Thanh H. D. Nguyen, Meiyappan Nagappan and Ahmed E. Hassan

    Collaboration in Open-Source Projects: Myth or Reality? (short)
    Yuriy Tymchuk, Andrea Mocci and Michele Lanza

    Oops! Where Did That Code Snippet Come From?
    Lisong Guo, Julia Lawall and Gilles Muller

    A Dictionary to Translate Change Tasks to Source Code (short)
    Katja Kevic and Thomas Fritz

    Impact Analysis of Change Requests on Source Code based on Interaction and Commit Histories
    Motahareh Bahrami Zanjani, George Swartzendruber and Huzefa Kagdi

    GreenMiner: A Hardware Based Mining Software Repositories Software Energy Consumption Framework
    Abram Hindle, Alex Wilson, Kent Rasmussen, Jed Barlow, Joshua Campbell and Stephen Romansky

    Works For Me! Characterizing Non-reproducible Bug Reports
    Mona Erfani Joorabchi, Mehdi Mirzaaghaei and Ali Mesbah

    Characterizing and Predicting Blocking Bugs in Open Source Projects
    Harold Valdivia Garcia and Emad Shihab

    Finding Patterns in Static Analysis Alerts
    Quinn Hanam, Lin Tan, Reid Holmes and Patrick Lam

    Unsupervised Discovery of Intentional Process Models From Event Logs
    Ghazaleh Khodabandelou, Rebecca Deneckère, Camille Salinesi and Charlotte Hug

    Syntax Errors Just Aren’t Natural: Improving Error Reporting with Language Models
    Joshua Campbell, Abram Hindle and José Nelson Amaral

    Use of Dynamic Features in Python Programs (short)
    Beatrice Åkerblom and Tobias Wrigstad

    Prediction and Ranking of Co-change Candidates for Clones
    Manishankar Mondal, Chanchal K. Roy and Kevin Schneider

    It's Not A Bug, It's A Feature: Does Misclassification Affect Bug Localization? (short)
    Pavneet Singh Kochhar, Tien-Duy Le and David Lo

    How Does a Typical Tutorial for Mobile Development Look Like?
    Rebecca Tiarks and Walid Maalej

    Estimating Development Effort in Free/Open Source Software Projects by Mining Software Repositories
    Gregorio Robles, Jesús M. González-Barahona, Carlos Cervigón-Ávila, Andrea Capiluppi and Daniel Izquierdo-Cortázar

    An Empirical Study of Dormant Bugs
    Tse-Hsun Chen, Meiyappan Nagappan, Emad Shihab and Ahmed E. Hassan

    Revisiting Android Reuse Studies in the Context of Code Obfuscation and Library Usages
    Mario Linares-Vásquez, Andrew Holtzhauer, Carlos Eduardo Bernal Cardenas and Denys Poshyvanyk

    Mining Questions About Software Energy Consumption
    Gustavo Pinto, Fernando Castor and Yu David Liu

    Incremental Origin Analysis of Source Code Files
    Daniela Steidl, Benjamin Hummel and Elmar Juergens

    Classifying Development Documents as Code, Stack Traces, Patches or Natural Language Text (short)
    Thorsten Merten, Bastian Mager, Simone Bürsner and Barbara Paech

    Mining Challenge Track

    A Study of External Community Contribution to Open-Source Projects on GitHub
    Rohan Padhye, Senthil Mani and Vibha Singhal Sinha

    Understanding "Watchers" on GitHub
    Jyoti Sheoran, Kelly Blincoe, Eirini Kalliamvakou, Daniela Damian and Jordan Ell

    Do developers discuss design?
    João Brunet, Gail C. Murphy, Ricardo Terra, Jorge Figueiredo and Dalton Serey

    Magnet or Sticky?: An OSS Project-by-Project Typology
    Kazuhiro Yamashita, Shane McIntosh, Yasutaka Kamei and Naoyasu Ubayashi

    Security Discussions on GitHub: Topic Mining and Sentiment Analysis
    Daniel Pletea, Bogdan Vasilescu and Alexander Serebrenik

    Sentiment Analysis of Commit Messages in GitHub: An Empirical Study
    Emitza Guzman, David Azócar and Yang Li

    Analysing the "Biodiversity" of Open Source Ecosystems: The GitHub Case
    Nicholas Matragkas, James Williams, Dimitris Kolovos and Richard Paige

    Co-Evolution of Project Documentation and Popularity within GitHub
    Karan Aggarwal, Abram Hindle and Eleni Stroulia

    An Insight into the Pull Requests of GitHub
    Mohammad Masudur Rahman and Chanchal K. Roy

    Data Showcase Track

    A dataset for pull-based development research (short)
    Georgios Gousios and Andy Zaidman

    The Bug Catalog of the Maven Ecosystem
    Dimitris Mitropoulos, Vassilios Karakoidas, Panagiotis Louridas, Georgios Gousios and Diomidis Spinellis

    A Dataset of Feature Additions and Feature Removals from the Linux Kernel
    Leonardo Passos and Krzysztof Czarnecki

    Kataribe: A Hosting Service of Historage Repositories - Fine-Grained Analysis for all -
    Kenji Fujiwara, Hideaki Hata, Erina Makihara, Yusuke Fujihara, Naoki Nakayama, Hajimu Iida and Ken-Ichi Matsumoto

    Lean GHTorrent: GitHub data on demand
    Georgios Gousios, Bogdan Vasilescu, Alexander Serebrenik and Andy Zaidman

    A Code Clone Oracle
    Daniel Krutz and Wei Le

    Generating Duplicate Bug Datasets
    Alina Lazar, Sarah Ritchey and Bonita Sharif

    FLOSS 2013: A survey dataset about free software contributors. Challenges for curating, sharing and combining
    Gregorio Robles, Laura Arjona-Reina, Bogdan Vasilescu, Alexander Serebrenik and Jesús M. González-Barahona

    A Green Miner's Dataset: Mining the Impact of Software Change on Energy Consumption
    Chenlei Zhang and Abram Hindle

    Gentoo Package Dependencies over Time
    Remco Bloemen and Chintan Amrit

    Models of OSS Project Meta-Information: A Dataset of Three Forges
    James Williams, Davide Di Ruscio, Nicholas Matragkas, Dimitris Kolovos and Juri Di Rocco

    A Dataset of Clone References with Gaps
    Hiroaki Murakami, Yoshiki Higo and Shinji Kusumoto

    Maven Components and Bug Patterns in them
    Hitesh Sajnani, Vaibhav Saini, Joel Ossher and Cristina Lopes

    OpenHub: A scalable architecture for the analysis of software quality attributes
    Gabriel Farah, Juan Tejada and Dario Correal

    Understanding software evolution: the Maisqual Ant data set
    Boris Baldassari and Philippe Preux