cppstats is a tool for analyzing the preprocessor-based variability of software systems. It targets systems written in C that use the C preprocessor (CPP) to express variability.
The goal of the project is to gather information about how the CPP is used in C software projects. Beyond the sheer quantity of CPP usage, we are interested in where the preprocessor is used and how much code each CPP annotation encloses. This leads to the question of the granularity of CPP annotations within software projects.
So far, we have used cppstats in various studies to analyze publicly available open-source projects from different domains, including compilers, database management systems, operating systems, and application software, as well as some closed-source systems from industry. We have steadily extended cppstats along the way.
Furthermore, cppstats is integrated into Codeface to enable feature-based analysis of software systems.
Publications and Supplementary Sites
For a list of publications related to cppstats, see below.
Furthermore, here is a list of supplementary sites for empirical studies that used cppstats as a supporting tool:
- http://www.fosd.net/oss_vs_is – Supplementary website to our EMSE'15 paper
cppstats measures the use of CPP directives directly on normalized source code. To perform the analysis, the normalized code is translated to srcML. For each implemented kind of analysis, cppstats essentially performs two steps:
- Source-code preparation and normalization
  - Code normalization using appropriate methodology
  - Conversion of C code including CPP annotations to srcML
- Analysis regarding CPP based on the srcML representation
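To make the normalization step more concrete, here is a minimal sketch in Python 3. The specific rules it applies (joining backslash-continued lines, stripping trailing whitespace, dropping empty lines) are assumptions chosen for illustration; the actual methodology used by cppstats is described in its README and may differ. The function name `normalize` is hypothetical.

```python
def normalize(code):
    """Illustrative normalization; NOT the actual cppstats procedure."""
    # Join physical lines that end in a backslash, so that multi-line
    # macros and conditions end up on a single logical line.
    joined = code.replace("\\\r\n", "").replace("\\\n", "")
    # Strip trailing whitespace from every line.
    lines = [line.rstrip() for line in joined.splitlines()]
    # Drop lines that are empty after stripping.
    return "\n".join(line for line in lines if line.strip())


if __name__ == "__main__":
    sample = "#if defined(A) \\\n    && defined(B)\n\nint x;   \n#endif\n"
    print(normalize(sample))
```

Putting each directive on one logical line means a later analysis can treat every `#if` expression as a single unit.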
Source Code Conversion to srcML
Consider the following source code written in C.
#ifdef USE_LIBXML
char *str;
...
#else
NO_XML_SUPPORT();
return NULL;
#endif
The conversion to srcML yields the following XML representation:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<unit xmlns="http://www.sdml.info/srcML/src" xmlns:cpp="http://www.sdml.info/srcML/cpp" language="C" filename="source.c">
<cpp:ifdef>#<cpp:directive>ifdef</cpp:directive> <name>USE_LIBXML</name></cpp:ifdef>
<decl_stmt><decl><type><name>char</name> *</type><name>str</name></decl>;</decl_stmt>
...
<cpp:else>#<cpp:directive>else</cpp:directive></cpp:else>
<expr_stmt><expr><call><name>NO_XML_SUPPORT</name><argument_list>()</argument_list></call></expr>;</expr_stmt>
<return>return <expr><name>NULL</name></expr>;</return>
<cpp:endif>#<cpp:directive>endif</cpp:directive></cpp:endif>
</unit>
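Once the code is available as srcML, CPP directives can be located with ordinary XML tooling. The following Python 3 sketch uses only the standard library to count the directives in the example and to list the constants referenced by `#ifdef`. The namespace URIs are taken from the example; the helper names `count_directives` and `ifdef_constants` are hypothetical and are not part of cppstats.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace URIs as they appear in the srcML example.
SRC_NS = "http://www.sdml.info/srcML/src"
CPP_NS = "http://www.sdml.info/srcML/cpp"

# The example document, without the elided part ("...").
SRCML = """<unit xmlns="http://www.sdml.info/srcML/src" xmlns:cpp="http://www.sdml.info/srcML/cpp" language="C" filename="source.c">
<cpp:ifdef>#<cpp:directive>ifdef</cpp:directive> <name>USE_LIBXML</name></cpp:ifdef>
<decl_stmt><decl><type><name>char</name> *</type><name>str</name></decl>;</decl_stmt>
<cpp:else>#<cpp:directive>else</cpp:directive></cpp:else>
<expr_stmt><expr><call><name>NO_XML_SUPPORT</name><argument_list>()</argument_list></call></expr>;</expr_stmt>
<return>return <expr><name>NULL</name></expr>;</return>
<cpp:endif>#<cpp:directive>endif</cpp:directive></cpp:endif>
</unit>"""


def count_directives(srcml_text):
    """Count elements in the cpp namespace, keyed by directive kind."""
    root = ET.fromstring(srcml_text)
    counts = Counter()
    prefix = "{" + CPP_NS + "}"
    for elem in root.iter():
        if elem.tag.startswith(prefix):
            local = elem.tag[len(prefix):]
            # <cpp:directive> only wraps the keyword, so skip it.
            if local != "directive":
                counts[local] += 1
    return counts


def ifdef_constants(srcml_text):
    """Return the constant names referenced by #ifdef directives."""
    root = ET.fromstring(srcml_text)
    names = []
    for ifdef in root.iter("{%s}ifdef" % CPP_NS):
        name = ifdef.find("{%s}name" % SRC_NS)
        if name is not None:
            names.append(name.text)
    return names


if __name__ == "__main__":
    print(dict(count_directives(SRCML)))
    print(ifdef_constants(SRCML))
```

Working on the XML tree rather than on raw text makes it straightforward to relate a directive to the syntactic elements around it, which is what location- and discipline-oriented analyses need.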
Currently, cppstats supports the following analyses. For more details, please refer to the README file in the cppstats repository.
- Measurement of CPP-related metrics regarding scattering, tangling, and nesting
- Calculation of raw scattering, tangling, and nesting values
- Analysis of the discipline of used CPP annotations
- Analysis of the locations of CPP annotation blocks in the given file or project folder
- Analysis of all derivatives in the given project folder
- Analysis of pair-wise interactions of configuration constants that are used together in one expression (number of constants involved >= 3)
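To give an intuition for scattering, tangling, and nesting, the following Python 3 sketch computes simplified versions of these values directly from C text. The definitions are assumptions made for illustration: scattering is the number of conditional expressions that mention a constant, tangling is the number of distinct constants in one expression, and nesting is the maximum depth of nested conditionals. cppstats itself works on srcML and may define the metrics differently. The function `cpp_metrics` is hypothetical, and the regular expressions are deliberately naive (they assume normalized code without trailing comments or hexadecimal literals in conditions).

```python
import re
from collections import defaultdict

IDENT = re.compile(r"[A-Za-z_]\w*")
OPEN = re.compile(r"^\s*#\s*(ifdef|ifndef|if)\b(.*)$")
ELIF = re.compile(r"^\s*#\s*elif\b(.*)$")
CLOSE = re.compile(r"^\s*#\s*endif\b")


def constants(expr):
    """Identifiers in a conditional expression, minus the 'defined' operator."""
    return {name for name in IDENT.findall(expr) if name != "defined"}


def cpp_metrics(code):
    """Return (scattering per constant, tangling per expression, max nesting)."""
    scattering = defaultdict(int)
    tangling = []
    depth = 0
    max_depth = 0
    for line in code.splitlines():
        opened = OPEN.match(line)
        elif_ = ELIF.match(line)
        if opened:
            depth += 1
            max_depth = max(max_depth, depth)
            expr = opened.group(2)
        elif elif_:
            expr = elif_.group(1)
        elif CLOSE.match(line):
            depth -= 1
            continue
        else:
            continue  # ordinary code, #else, or other directives
        names = constants(expr)
        tangling.append(len(names))
        for name in names:
            scattering[name] += 1
    return dict(scattering), tangling, max_depth


SAMPLE = """\
#ifdef USE_LIBXML
char *str;
#if defined(DEBUG) && defined(USE_LIBXML)
log_xml();
#endif
#else
NO_XML_SUPPORT();
#endif
#ifdef DEBUG
trace();
#endif
"""

if __name__ == "__main__":
    scattering, tangling, nesting = cpp_metrics(SAMPLE)
    print(scattering)  # occurrences per constant
    print(tangling)    # constants per expression
    print(nesting)     # maximum nesting depth
```

In the sample, both constants appear in two expressions, the second expression combines two constants, and the conditionals nest two levels deep.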
Exemplary Result Files
For a list of case studies, result files with metric values are available on an additional site.
cppstats is distributed under LGPL v3.
cppstats is written in Python; it has been tested with Python 2.7.x and runs on Unix-based systems. It relies on the following libraries and tools in order to work:
- astyle (source code indentation and reformatting filter)
- xsltproc (command-line XSLT processor)
- cpp (the C preprocessor, as a part of the compiler-suite gcc)
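Before running an analysis, it can help to verify that these external tools are reachable. The sketch below checks the `PATH` for the three executables listed above using `shutil.which`. It is written for Python 3 (where `shutil.which` is available), whereas cppstats itself has been tested with Python 2.7.x; the helper names are hypothetical and the check is not part of cppstats.

```python
import shutil

# Executable names taken from the dependency list above.
REQUIRED_TOOLS = ("astyle", "xsltproc", "cpp")


def find_tools(tools=REQUIRED_TOOLS):
    """Map each tool name to its resolved path, or None if not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the sorted names of tools that could not be found."""
    return sorted(t for t, path in find_tools(tools).items() if path is None)


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools: " + ", ".join(missing))
    else:
        print("All required tools were found.")
```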
More detailed information and instructions regarding installation can be found in the project's README and INSTALL files.
Source and Development
- Current version available on GitHub
- Earlier versions available as tarballs
- Claus Hunsen, Bo Zhang, Janet Siegmund, Christian Kästner, Olaf Leßenich, Martin Becker, and Sven Apel. Preprocessor-Based Variability in Open-Source and Industrial Software Systems: An Empirical Study. Empirical Software Engineering, 21(2):449–482, April 2016.
- Jörg Liebig, Christian Kästner, and Sven Apel. Analyzing the Discipline of Preprocessor Annotations in 30 Million Lines of C Code. In Proceedings of the ACM International Conference on Aspect-Oriented Software Development (AOSD), pages 191–202. ACM Press, March 2011. Acceptance rate: 23% (21 / 92).
- Jörg Liebig, Sven Apel, Christian Lengauer, Christian Kästner, and Michael Schulze. An Analysis of the Variability in Forty Preprocessor-Based Software Product Lines. In Proceedings of the ACM/IEEE International Conference on Software Engineering (ICSE), pages 105–114. ACM Press, May 2010. Acceptance rate: 14% (52 / 380).
Copyright Notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these publications may not be reposted without the explicit permission of the copyright holder.