Software Engineering - Open Source Software
Licensing
Research Works and Datasets
findOSSLicense: A license recommender for open source
software.
A test dataset used to evaluate similarity among SourceForge
projects is available here.
Information extraction from FOSS license texts (FOSS-LTE):
License texts carry information that helps in understanding the terms
they impose, and this information can be extracted from the text
using various NLP techniques. The dataset of licenses we
use for this purpose is available here, and the output of applying
topic modeling techniques with MALLET can be found in these results.
The assignment of terms to topics and the respective results are also
available here.
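As a rough illustration of the kind of preprocessing that usually precedes topic modeling, the sketch below tokenizes a license text and counts its most frequent terms. It is not the FOSS-LTE pipeline and does not call MALLET; the stop-word list and the function names `tokenize` and `top_terms` are assumptions made for this example only.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real pipeline would
# use a much larger one (MALLET ships its own).
STOPWORDS = {"the", "of", "to", "and", "a", "in", "or", "is", "this", "for", "be", "any"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stop words and very short tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS and len(t) > 2]


def top_terms(text, n=5):
    """Return the n most frequent remaining terms as (term, count) pairs."""
    return Counter(tokenize(text)).most_common(n)


if __name__ == "__main__":
    sample = "Permission is hereby granted, free of charge, to any person obtaining a copy"
    print(top_terms(sample))
```

Term counts like these are only a starting point; topic modeling then groups co-occurring terms into topics across the whole set of license texts.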
Open source licensing and compatibility: The abundance of
open source licenses makes the compatibility among them an important
research challenge for any organization that employs open source
software. For this purpose we are studying a number of open source
projects from different repositories (e.g., SourceForge, Maven) and
with different characteristics (e.g., size, programming language).
We have tested a number of license extraction tools using the OSS-license-dataset-1.0
dataset (consisting of 100 projects).
SPDXCompatTools:
Checks Software Package Data Exchange (SPDX) files for license
violations, based on the license compatibility graph presented in this
publication.
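A check of this kind can be sketched as a reachability query on a directed graph. The example below is a minimal illustration, not the SPDXCompatTools implementation: it does not parse SPDX files, the edges in `COMPAT_EDGES` are hypothetical placeholders rather than the graph from the publication, and the function names are assumptions. An edge A -> B is taken to mean that code under license A may be included in a work distributed under license B.

```python
from collections import deque

# Hypothetical, simplified compatibility edges for illustration only.
# They do not reproduce the published graph and are not legal advice.
COMPAT_EDGES = {
    "MIT": ["BSD-3-Clause"],
    "BSD-3-Clause": ["Apache-2.0"],
    "Apache-2.0": ["GPL-3.0-only"],
    "GPL-3.0-only": [],
}


def is_compatible(source, target, edges=COMPAT_EDGES):
    """Return True if target is reachable from source (breadth-first search)."""
    if source == target:
        return True
    seen = {source}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in edges.get(current, []):
            if nxt == target:
                return True
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def find_violations(package_license, component_licenses, edges=COMPAT_EDGES):
    """List component licenses that cannot flow into the declared package license."""
    return [lic for lic in component_licenses
            if not is_compatible(lic, package_license, edges)]


if __name__ == "__main__":
    print(find_violations("Apache-2.0", ["MIT", "GPL-3.0-only"]))
```

Under these assumed edges, a package declared as Apache-2.0 that contains a GPL-3.0-only component is reported as a violation, while an MIT component is not.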
Extracting developers' expertise: Developers' social
networks, such as GitHub, and Q&A websites, such as Stack
Overflow, can be a valuable source of information for understanding
what software engineers do and which expertise they have. Datasets
from the work on expertise extraction are available here; they contain developers’
expertise in different programming languages.
Scrum-Agile global survey: This survey, conducted in 2012
(as part of the bachelor thesis of M. Christou), shows where
Scrum and Agile stood in 2012. Motivated by the global
spread of adaptive software development and our personal experience
in a Scrum industrial environment, the survey set out to quantify
agile and Scrum adoption worldwide, discover the success or failure
rate of agile- and Scrum-driven projects, compare the results of
agile or Scrum-based techniques with those of traditional
(i.e., heavyweight) development approaches, and study development
aspects relevant today (e.g., team geographical distribution). The
survey results are available here.