Introduction to the project#
Replication package for the MSR2021 “Challenges in Developing Desktop Web Apps: a Study of Stack Overflow and GitHub” paper#
Online appendix#
The online appendix with the complete discussion of all topics mentioned in the paper is available HERE.
Replication package#
The data used in the study is available in the folder data/processed.
The raw data (without cleaning and filtering) is available in the folder data/raw.
Scripts#
Selection of relevant Stack Overflow questions#
The queries used to select relevant Stack Overflow questions from SOTorrent are available in the file so_torrent_queries.txt.
The first query selects relevant questions based on their tags.
The second query collects the accepted answers to the questions returned by the first query.
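The two-step selection can be illustrated with a small, self-contained sketch. It is not the content of so_torrent_queries.txt: it uses an in-memory SQLite table in place of SOTorrent, and it assumes a Posts table with the columns Id, PostTypeId (1 for questions, 2 for answers), AcceptedAnswerId and Tags, with tags stored as `<tag1><tag2>`. The tag list, the demo rows and the function names are hypothetical.

```python
import sqlite3

# Hypothetical tag list; the tags actually used are in so_torrent_queries.txt.
RELEVANT_TAGS = ["electron", "nw.js"]


def build_demo_db():
    """Create a tiny in-memory stand-in for the Posts table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Posts (Id INTEGER PRIMARY KEY, PostTypeId INTEGER, "
        "AcceptedAnswerId INTEGER, Tags TEXT, Body TEXT)"
    )
    conn.executemany(
        "INSERT INTO Posts VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, 10, "<electron><javascript>", "How do I package my app?"),
            (2, 1, None, "<python><pandas>", "How do I merge frames?"),
            (3, 1, None, "<nw.js>", "Window does not open"),
            (10, 2, None, None, "Use a packager tool."),
            (11, 2, None, None, "Use merge()."),
        ],
    )
    return conn


def select_questions(conn, tags):
    """Step 1: questions (PostTypeId = 1) carrying at least one relevant tag."""
    clause = " OR ".join("Tags LIKE ?" for _ in tags)
    params = [f"%<{tag}>%" for tag in tags]
    sql = (
        "SELECT Id, AcceptedAnswerId FROM Posts "
        f"WHERE PostTypeId = 1 AND ({clause}) ORDER BY Id"
    )
    return conn.execute(sql, params).fetchall()


def select_accepted_answers(conn, questions):
    """Step 2: accepted answers of the questions returned by step 1."""
    answer_ids = [answer_id for _, answer_id in questions if answer_id is not None]
    if not answer_ids:
        return []
    placeholders = ", ".join("?" for _ in answer_ids)
    sql = (
        "SELECT Id, Body FROM Posts "
        f"WHERE PostTypeId = 2 AND Id IN ({placeholders}) ORDER BY Id"
    )
    return conn.execute(sql, answer_ids).fetchall()
```

Calling `select_questions(build_demo_db(), RELEVANT_TAGS)` returns the tagged questions, and passing that result to `select_accepted_answers` returns only the answers marked as accepted; questions without an accepted answer contribute nothing to the second step.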
Topic modeling#
Topic modeling was performed with the Mallet tool.
The commands used to run the tool from the command line are provided in the mallet_instructions.txt file.
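For orientation, a typical Mallet topic-modeling run consists of an import step followed by a training step. The sketch below only assembles such commands from Python; it is not a copy of mallet_instructions.txt. The file names, the number of topics and the function names are hypothetical, and it assumes the `mallet` launcher is on the PATH and that the corpus is a single file with one document per line.

```python
import subprocess
from pathlib import Path


def mallet_commands(corpus_file, work_dir, num_topics, mallet_bin="mallet"):
    """Build the import and training commands for a basic Mallet LDA run.

    All paths and the topic count are illustrative assumptions; the values
    used in the study are documented in mallet_instructions.txt.
    """
    work = Path(work_dir)
    model_input = work / "corpus.mallet"
    import_cmd = [
        mallet_bin, "import-file",
        "--input", str(corpus_file),
        "--output", str(model_input),
        "--keep-sequence",      # required for topic modeling
        "--remove-stopwords",   # drop Mallet's default English stop words
    ]
    train_cmd = [
        mallet_bin, "train-topics",
        "--input", str(model_input),
        "--num-topics", str(num_topics),
        "--output-topic-keys", str(work / "topic_keys.txt"),
        "--output-doc-topics", str(work / "doc_topics.txt"),
    ]
    return [import_cmd, train_cmd]


def run_commands(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

A run would then look like `run_commands(mallet_commands("questions.txt", "out", 20))`, where both the input file name and the value 20 are placeholders rather than the settings used in the paper.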
Statistical analysis#
Scripts used to analyze the collected data are available in the folder notebook. The Python scripts in that folder were used for data cleaning and exploratory analysis. The statistical tests performed in the study were implemented in R and are available in the file tests.r.
Stack Overflow datasets#
To analyse the Stack Overflow datasets, run the Jupyter notebook with the following command:
jupyter-notebook SO_dataset_analysis.ipynb
This notebook runs the scripts that clean the dataset, run the Mallet tool, and analyse the results.
For more instructions on how to run the scripts, see the Getting Started document.
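If an interactive session is not needed, the notebook can also be executed non-interactively through nbconvert. The following is a minimal sketch, assuming that nbconvert is installed alongside Jupyter and that the command is run from the folder containing the notebook; the output name and the function names are hypothetical.

```python
import subprocess


def nbconvert_command(notebook="SO_dataset_analysis.ipynb",
                      output="SO_dataset_analysis_executed"):
    """Build a command that executes the notebook and saves an executed copy."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute", notebook,
        "--output", output,   # hypothetical name for the executed copy
    ]


def execute_notebook(notebook="SO_dataset_analysis.ipynb",
                     output="SO_dataset_analysis_executed"):
    """Run the notebook from top to bottom; raises if any cell fails."""
    subprocess.run(nbconvert_command(notebook, output), check=True)
```

Because the notebook also invokes Mallet, a headless run assumes the same prerequisites as the interactive one.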