Introduction
cppstats is a tool for analyzing software systems regarding their preprocessor-based variability. It focuses on software systems written in C that use the capabilities of the C preprocessor (CPP) to express variability.
The goal of the project is to gather information about the usage of the CPP in C software projects. Apart from the quantity of CPP usage within software projects, we are interested in where the preprocessor is used and what extent of code the CPP annotations enclose. This leads to the question of the granularity of CPP annotations within software projects.
So far, we have used cppstats in various studies to analyze publicly available open-source projects from different domains, including compilers, database management systems, operating systems, and application software, as well as some closed-source systems from industry. We have continually extended cppstats along the way.
Furthermore, cppstats is integrated into Codeface to enable feature-based analysis of software systems.
Publications and Supplementary Sites
For a list of publications related to cppstats, see below.
Furthermore, the following supplementary sites belong to empirical studies that used cppstats as a supporting tool:
- http://www.fosd.net/oss_vs_is – Supplementary website to our EMSE'16 paper
Functionality
Basic Procedure
cppstats measures the use of CPP directives directly on normalized source code. To perform the analysis, the normalized code is translated to srcML. In essence, cppstats performs two steps for each implemented kind of analysis:
- Source-Code Preparation and Normalization
  - Code normalization using appropriate methodology
  - Conversion of C code including CPP annotations to srcML
- Analysis regarding CPP based on the srcML representation
Source Code Conversion to srcML
Consider the following source code written in C.
#ifdef USE_LIBXML
char *str;
...
#else
NO_XML_SUPPORT();
return NULL;
#endif
The conversion to srcML yields the following representation of this code.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<unit xmlns="http://www.sdml.info/srcML/src" xmlns:cpp="http://www.sdml.info/srcML/cpp" language="C" filename="source.c"><cpp:ifdef>#<cpp:directive>ifdef</cpp:directive> <name>USE_LIBXML</name></cpp:ifdef>
<decl_stmt><decl><type><name>char</name> *</type><name>str</name></decl>;</decl_stmt>
...
<cpp:else>#<cpp:directive>else</cpp:directive></cpp:else>
<expr_stmt><expr><call><name>NO_XML_SUPPORT</name><argument_list>()</argument_list></call></expr>;</expr_stmt>
<return>return <expr><name>NULL</name></expr>;</return>
<cpp:endif>#<cpp:directive>endif</cpp:directive></cpp:endif>
</unit>
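Once the code is available as srcML, the CPP directives can be located with any XML library. The following is a minimal sketch in Python, not the actual implementation in cppstats: it uses the standard-library module xml.etree.ElementTree, takes the cpp namespace URI from the example above, and the helper name count_cpp_directives is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace URI of the CPP elements, taken from the srcML example above.
CPP_NS = "http://www.sdml.info/srcML/cpp"

# The srcML example from above, including its elided "..." line.
SRCML = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<unit xmlns="http://www.sdml.info/srcML/src" xmlns:cpp="http://www.sdml.info/srcML/cpp" language="C" filename="source.c"><cpp:ifdef>#<cpp:directive>ifdef</cpp:directive> <name>USE_LIBXML</name></cpp:ifdef>
<decl_stmt><decl><type><name>char</name> *</type><name>str</name></decl>;</decl_stmt>
...
<cpp:else>#<cpp:directive>else</cpp:directive></cpp:else>
<expr_stmt><expr><call><name>NO_XML_SUPPORT</name><argument_list>()</argument_list></call></expr>;</expr_stmt>
<return>return <expr><name>NULL</name></expr>;</return>
<cpp:endif>#<cpp:directive>endif</cpp:directive></cpp:endif>
</unit>
"""


def count_cpp_directives(srcml_text):
    """Count CPP directives by name (hypothetical helper, not part of cppstats)."""
    # Parse from bytes so that the encoding declaration in the XML header is honored.
    root = ET.fromstring(srcml_text.encode("utf-8"))
    counts = Counter()
    # Every directive name is wrapped in a <cpp:directive> element.
    for elem in root.iter("{%s}directive" % CPP_NS):
        counts[elem.text] += 1
    return counts
```

Calling count_cpp_directives(SRCML) reports one occurrence each of ifdef, else, and endif for the example.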
Analysis Types
Currently, cppstats supports the following analyses. For more details, please refer to the README file in the cppstats repository.
- GENERAL: Measurement of CPP-related metrics regarding scattering, tangling, and nesting
- GENERALVALUES: Calculation of raw scattering, tangling, and nesting values
- DISCIPLINE: Analysis of the discipline of used CPP annotations
- FEATURELOCATIONS: Analysis of the locations of CPP annotation blocks in the given file or project folder
- DERIVATIVE: Analysis of all derivatives in the given project folder
- INTERACTION: Analysis of pair-wise interactions of configuration constants that have been used together in one expression (# of constants involved >= 3)
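To give an impression of what a nesting value can capture, the following sketch computes the maximum nesting depth of conditional CPP directives in a piece of C code. It is an illustration under simplifying assumptions, not the metric definition used by cppstats (see the README and the publications for that): it works on raw source lines instead of srcML, and the function name max_nesting_depth as well as the names DEBUG and log in the sample input are hypothetical.

```python
import re

# Matches a CPP directive at the start of a line, e.g. "#ifdef" or "#  endif".
DIRECTIVE = re.compile(r"^\s*#\s*(\w+)")

# Directives that open a new conditional block.
OPENING = {"if", "ifdef", "ifndef"}


def max_nesting_depth(source):
    """Return the deepest level of nested #if/#ifdef/#ifndef blocks."""
    depth = 0
    deepest = 0
    for line in source.splitlines():
        match = DIRECTIVE.match(line)
        if not match:
            continue  # ordinary C code, not a directive
        name = match.group(1)
        if name in OPENING:
            depth += 1
            deepest = max(deepest, depth)
        elif name == "endif":
            # Never drop below zero, so unbalanced input does not break the count.
            depth = max(depth - 1, 0)
    return deepest


# Hypothetical sample input with one block nested inside another.
EXAMPLE = """\
#ifdef USE_LIBXML
#if defined(DEBUG)
log();
#endif
char *str;
#else
return NULL;
#endif
"""
```

For EXAMPLE, the function returns 2, because the #if block is nested inside the #ifdef block; #else and #elif do not change the depth.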
Exemplary Result Files
Result files with metric values for a list of case studies are available on an additional site.
Source Code
License
cppstats is distributed under LGPL v3.
Dependencies
cppstats is written in Python and relies on the following libraries and tools in order to work. It has been tested with Python 2.7.x and runs on Unix-based systems.
- astyle (source code indentation and reformatting filter)
- xsltproc (command-line XSLT processor)
- cpp (the C preprocessor, as part of the GCC compiler suite)
More detailed information and instructions regarding installation can be found in the project's README and INSTALL files.
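Whether the external tools listed above can be found on the PATH is easy to check up front. The following is a minimal sketch, not part of cppstats: it uses shutil.which, which is only available in Python 3 (cppstats itself has been tested with Python 2.7.x), and the helper name find_missing_tools is hypothetical.

```python
import shutil

# External command-line tools listed as dependencies above.
REQUIRED_TOOLS = ["astyle", "xsltproc", "cpp"]


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools were found.")
```

The check only confirms that an executable with the given name exists; it says nothing about whether the installed version is suitable.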
Source and Development
- current version available on GitHub: https://github.com/clhunsen/cppstats
- earlier versions as tarballs
Contact
cppstats has been developed at the University of Passau, Germany.
If you have any questions regarding the cppstats project, please contact the developers:
- Claus Hunsen (current maintainer) (University of Passau, Germany)
- Jörg Liebig (former maintainer) (University of Passau, Germany)
Publications (copyright notice)
2016
- Claus Hunsen, Bo Zhang, Janet Siegmund, Christian Kästner, Olaf Leßenich, Martin Becker, and Sven Apel. Preprocessor-Based Variability in Open-Source and Industrial Software Systems: An Empirical Study. Empirical Software Engineering (EMSE), 21(2):449–482, April 2016.
2011
- Jörg Liebig, Christian Kästner, and Sven Apel. Analyzing the Discipline of Preprocessor Annotations in 30 Million Lines of C Code. In Proceedings of the International Conference on Aspect-Oriented Software Development (AOSD), pages 191–202. ACM, March 2011. Acceptance rate: 23% (21 / 92).
2010
- Jörg Liebig, Sven Apel, Christian Lengauer, Christian Kästner, and Michael Schulze. An Analysis of the Variability in Forty Preprocessor-Based Software Product Lines. In Proceedings of the International Conference on Software Engineering (ICSE), pages 105–114. ACM, May 2010. Acceptance rate: 14% (52 / 380).
Copyright Notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these publications may not be reposted without the explicit permission of the copyright holder.