cppstats

a C-preprocessor analyzer


Introduction

cppstats is a tool for analyzing software systems with respect to their preprocessor-based variability. It targets software systems written in C that use the capabilities of the C preprocessor (CPP) to express variability.

The goal of the project is to gather information about the usage of the CPP in C software projects. Apart from the quantity of CPP usage within software projects, we are interested in where the preprocessor is used and what extent of code the CPP annotations enclose. This leads to the question of the granularity of CPP usage within software projects.

So far, we have used cppstats in various studies to analyze publicly available open-source projects from different domains, including compilers, database management systems, operating systems, and application software, as well as some closed-source systems from industry. We have continuously extended cppstats along the way.

Furthermore, cppstats is integrated into Codeface to enable feature-based analysis of software systems.

Publications and Supplementary Sites

For a list of publications related to cppstats, see below.

Furthermore, here is a list of supplementary sites of empirical studies that use cppstats as a supporting tool:

Functionality

Basic Procedure

cppstats measures the use of CPP directives directly on normalized source code. To perform the analysis, the normalized code is translated to srcML. Each implemented kind of analysis therefore consists of two basic steps.

  1. Source-Code Preparation and Normalization
    1. Code normalization using appropriate methodology
    2. Conversion of C code including CPP annotations to srcML
  2. Analysis regarding CPP based on the srcML representation

Source Code Conversion to srcML

Consider the following source code written in C.
#ifdef USE_LIBXML
char  *str;
   ...
#else
NO_XML_SUPPORT();
return NULL;
#endif
Using srcML, the source code is transformed to an XML derivative, while preserving all CPP annotations from the source code.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<unit xmlns="http://www.sdml.info/srcML/src" xmlns:cpp="http://www.sdml.info/srcML/cpp" language="C" filename="source.c"><cpp:ifdef>#<cpp:directive>ifdef</cpp:directive> <name>USE_LIBXML</name></cpp:ifdef>
<decl_stmt><decl><type><name>char</name>  *</type><name>str</name></decl>;</decl_stmt>
   ...
<cpp:else>#<cpp:directive>else</cpp:directive></cpp:else>
<expr_stmt><expr><call><name>NO_XML_SUPPORT</name><argument_list>()</argument_list></call></expr>;</expr_stmt>
<return>return <expr><name>NULL</name></expr>;</return>
<cpp:endif>#<cpp:directive>endif</cpp:directive></cpp:endif>

</unit>
cppstats then analyzes this srcML output, taking all CPP annotations into account.
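
To illustrate why the srcML representation is convenient for this kind of analysis, the following minimal sketch reads the listing above with Python's standard XML parser and counts the CPP directives it contains. It is not the implementation used by cppstats. The function names count_directives and ifdef_constants are hypothetical, and only the two namespace URIs are taken from the listing.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace URIs exactly as they appear in the srcML listing above.
SRC_NS = "http://www.sdml.info/srcML/src"
CPP_NS = "http://www.sdml.info/srcML/cpp"

# The srcML document from the listing above, embedded for self-containment.
SRCML = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<unit xmlns="http://www.sdml.info/srcML/src" xmlns:cpp="http://www.sdml.info/srcML/cpp" language="C" filename="source.c"><cpp:ifdef>#<cpp:directive>ifdef</cpp:directive> <name>USE_LIBXML</name></cpp:ifdef>
<decl_stmt><decl><type><name>char</name>  *</type><name>str</name></decl>;</decl_stmt>
   ...
<cpp:else>#<cpp:directive>else</cpp:directive></cpp:else>
<expr_stmt><expr><call><name>NO_XML_SUPPORT</name><argument_list>()</argument_list></call></expr>;</expr_stmt>
<return>return <expr><name>NULL</name></expr>;</return>
<cpp:endif>#<cpp:directive>endif</cpp:directive></cpp:endif>

</unit>
"""


def count_directives(srcml_text):
    """Count how often each CPP directive keyword occurs (hypothetical helper)."""
    root = ET.fromstring(srcml_text.encode("utf-8"))
    return Counter(d.text for d in root.iter("{%s}directive" % CPP_NS))


def ifdef_constants(srcml_text):
    """Collect the constants tested by #ifdef/#ifndef (hypothetical helper)."""
    root = ET.fromstring(srcml_text.encode("utf-8"))
    names = []
    for tag in ("ifdef", "ifndef"):
        for cond in root.iter("{%s}%s" % (CPP_NS, tag)):
            # In the listing, the constant is a direct <name> child of the directive.
            name = cond.find("{%s}name" % SRC_NS)
            if name is not None:
                names.append(name.text)
    return names


if __name__ == "__main__":
    print(dict(count_directives(SRCML)))
    print(ifdef_constants(SRCML))
```

Because the directives are ordinary XML elements in their own namespace, they can be located without parsing the C code itself.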

Analysis Types

Currently, cppstats supports the following analyses. For more details, please refer to the README file in the cppstats repository.

  • GENERAL
    Measurement of CPP-related metrics regarding scattering, tangling, and nesting
  • GENERALVALUES
    Calculation of raw scattering, tangling, and nesting values
  • DISCIPLINE
    Analysis of the discipline of the CPP annotations used
  • FEATURELOCATIONS
    Analysis of the locations of CPP annotation blocks in the given file or project folder
  • DERIVATIVE
    Analysis of all derivatives in the given project folder
  • INTERACTION
    Analysis of pair-wise interactions of configuration constants that are used together in one expression (# of constants involved >= 3)
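
To give an intuition for the raw scattering, tangling, and nesting values mentioned above, here is a rough sketch that derives such values from plain C lines. It rests on simplified definitions that are assumptions here, not cppstats' exact metrics: scattering counts the conditional expressions in which a constant occurs, tangling counts the distinct constants per expression, and nesting is the maximum depth of conditional blocks. cppstats itself works on the srcML representation, and the function name raw_values is hypothetical.

```python
import re
from collections import Counter

# Conditional-compilation directives; other directives (#define, #include) are ignored.
DIRECTIVE = re.compile(r"^\s*#\s*(ifdef|ifndef|if|elif|else|endif)\b(.*)$")
IDENTIFIER = re.compile(r"[A-Za-z_]\w*")


def raw_values(lines):
    """Return (scattering, tangling, max_nesting) under the simplified
    definitions stated in the lead-in (hypothetical helper)."""
    scattering = Counter()  # constant -> number of expressions that mention it
    tangling = []           # one entry per expression: number of distinct constants
    depth = 0
    max_nesting = 0
    for line in lines:
        match = DIRECTIVE.match(line)
        if match is None:
            continue
        kind, rest = match.group(1), match.group(2)
        if kind in ("ifdef", "ifndef", "if"):
            depth += 1
            max_nesting = max(max_nesting, depth)
        elif kind == "endif":
            depth -= 1
        if kind in ("ifdef", "ifndef", "if", "elif"):
            # Naive extraction: every identifier except the 'defined' operator
            # counts as a constant; trailing comments are not stripped.
            constants = set(IDENTIFIER.findall(rest)) - {"defined"}
            for constant in constants:
                scattering[constant] += 1
            tangling.append(len(constants))
    return scattering, tangling, max_nesting


if __name__ == "__main__":
    example = [
        "#ifdef A",
        "#if defined(B) && defined(A)",
        "#endif",
        "#elif defined(C)",
        "#endif",
    ]
    print(raw_values(example))
```

In the example, constant A occurs in two expressions, the second expression involves two constants, and the deepest block is nested two levels.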

Exemplary Result Files

Result files with metric values for a list of case studies are available on an additional site.

Source Code

License

cppstats is distributed under LGPL v3.

Dependencies

cppstats is written in Python and relies on certain libraries and other tools in order to work. It has been tested with Python 2.7.x and runs on Unix-based systems.

  • astyle (source code indentation and reformatting filter)
  • xsltproc (command-line XSLT processor)
  • cpp (the C preprocessor, as part of the GCC compiler suite)

More detailed information and instructions regarding installation can be found in the project's README and INSTALL files.
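
Before installing, it can help to check whether the external tools listed above are available on the PATH. The following small sketch does this and is not part of cppstats. It assumes Python 3.3 or later for shutil.which, whereas cppstats itself has been tested with Python 2.7.x, and the function name find_tools is hypothetical.

```python
import shutil

# External tools named in the dependency list above.
REQUIRED_TOOLS = ("astyle", "xsltproc", "cpp")


def find_tools(tools=REQUIRED_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


if __name__ == "__main__":
    for tool, path in find_tools().items():
        print("%-10s %s" % (tool, path or "NOT FOUND"))
```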

Source and Development

Contact

cppstats has been developed at the University of Passau, Germany.
If you have any questions regarding the cppstats project, please contact the developers:

  • Claus Hunsen (current maintainer) (University of Passau, Germany)
  • Jörg Liebig (former maintainer) (University of Passau, Germany)

Publications (copyright notice)

2016


2011


2010


Copyright Notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these publications may not be reposted without the explicit permission of the copyright holder.