cppstats: C-program analyzer


cppstats is a tool for analyzing the variability of software systems. It focuses on software systems written in C that use the C preprocessor (cpp) to express variability.

Basic work

[Figure: C source and its XML representation]

Data about the cpp

The goal of the project is to gather information about how cpp is used in C software projects. Apart from the quantity of cpp usage, we are interested in where the preprocessor is used and in the extent of the code that cpp annotations enclose. This leads to the question of the granularity of cpp annotations within software projects.
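To make the notion of "quantity of cpp usage" concrete, the following is a minimal sketch in Python that counts conditional-compilation directives in a piece of C source text. It is an illustration only and not the method cppstats itself uses; the function name count_conditional_directives and the sample input are hypothetical. It also makes a simplifying assumption: lines inside comments or string literals are not filtered out.

```python
import re
from collections import Counter

# Matches a conditional-compilation directive at the start of a line,
# allowing whitespace before and after the '#'.
DIRECTIVE_RE = re.compile(
    r"^[ \t]*#[ \t]*(ifdef|ifndef|if|elif|else|endif)\b", re.MULTILINE
)


def count_conditional_directives(c_source):
    """Return a Counter mapping each directive name to its number of occurrences.

    Hypothetical helper for illustration; comments and string literals are
    not stripped, so directives inside them would be counted as well.
    """
    return Counter(match.group(1) for match in DIRECTIVE_RE.finditer(c_source))


# Made-up sample input, not taken from any analyzed project.
SAMPLE = """\
#ifdef HAVE_SSL
int use_ssl = 1;
#else
int use_ssl = 0;
#endif

void f(void) {
  # if DEBUG > 1
    log_state();
  # endif
}
"""

counts = count_conditional_directives(SAMPLE)
for name in sorted(counts):
    print(name, counts[name])
```

A count like this only answers "how much"; the questions of where annotations occur and how much code they enclose require a structural view of the source, which is what the XML representation mentioned above is for.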

We have developed a tool called cppstats and used it to analyze 40 publicly available open-source projects from different domains, including compilers, database management systems, operating systems, and application software. The following table lists (1) the name of each analyzed software system, (2) its version (for development versions, the date of download in brackets), (3) the domain the software belongs to, and (4) the output of the analysis as a CSV file.

no.  software system  version       domain                      data
01   apache           2.2.11        Web server                  apache.csv
02   berkeley db      4.7.25        database system             berkeley.csv
03   cherokee         0.99.11       Web server                  cherokee.csv
04   clamav           0.94.2        antivirus program           clamav.csv
05   dia              0.96.1        diagramming software        dia.csv
06   emacs            22.3          text editor                 emacs.csv
07   freebsd          7.1           operating system            freebsd.csv
08   gcc              4.3.3         compiler framework          gcc.csv
09   ghostscript      8.62.0        postscript/pdf interpreter  ghostscript
10   gimp             2.6.4         graphics editor             gimp.csv
11   glibc            2.9           programming library         glibc.csv
12   gnumeric         1.9.5         spreadsheet application     gnumeric.csv
13   gnuplot          4.2.5         plotting tool               gnuplot.csv
14   irssi            0.8.13        IRC client                  irssi.csv
15   libxml2          2.7.3         XML library                 libxml2.csv
16   lighttpd         1.4.22        Web server                  lighttpd.csv
17   linux                          operating system            linux.csv
18   lynx             2.8.6         Web browser                 lynx.csv
19   minix            3.1.1         operating system            minix.csv
20   mplayer          1.0rc2        media player                mplayer.csv
21   mpsolve          2.2           mathematical software       mpsolve.csv
22   openldap         2.4.16        LDAP directory service      openldap.csv
23   opensolaris      (2009-05-08)  operating system            solaris.csv
24   openvpn          2.0.9         security application        openvpn.csv
25   parrot           0.9.1         virtual machine             parrot.csv
26   php              5.2.8         programming language        php.csv
27   pidgin           2.4.0         instant messenger           pidgin.csv
28   postgresql       (2009-05-08)  database system             postgresql.csv
29   privoxy          3.0.12        proxy server                privoxy.csv
30   python           2.6.1         programming language        python.csv
31   sendmail         8.14.2        mail transfer agent         sendmail.csv
32   sqlite           3.6.10        database system             sqlite.csv
33   subversion       1.5.1         revision control system     subversion.csv
34   sylpheed         2.6.0         e-mail client               sylpheed.csv
35   tcl              8.5.7         programming language        tcl.csv
36   vim              7.2           text editor                 vim.csv
37   xfig             3.2.5         vector graphics editor      xfig.csv
38   xine-lib                       media library               xine.csv
39   xorg-server      1.5.1         X server                    xorg.csv
40   xterm            2.4.3         terminal emulator           xterm.csv



cppstats is written in Python and relies on certain libraries and other tools in order to work. It has been tested with Python (version 2.5.2) and runs on Linux-based systems.

After installing all the programs and libraries cppstats relies on, download cppstats from here and unzip the archive. Then download a C software project of your choice, unpack its sources, and apply the cppstats analyzer. An example invocation of the tool is given below.

Typical workflow:

# Select all projects that should be analyzed by editing cppstats_input.txt.
# Note: each project listed in cppstats_input.txt should have the following layout:
# project-folder/
#             -> source
# cppstats_input.txt contains the absolute paths to all project folders.
	vim cppstats_input.txt

# For a general analysis, run the following scripts; see the ICSE 2010 paper.

# For a discipline analysis, run the following scripts; see the AOSD 2011 paper.
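
Instead of editing cppstats_input.txt by hand, the file could also be generated. The following Python sketch collects every folder below a given root that contains a source subfolder and writes the absolute paths to a file. The helper names find_project_folders and write_input_file are hypothetical and not part of cppstats, and the one-path-per-line format is an assumption. The sketch targets a current Python 3 interpreter, not the Python 2.5.2 that cppstats was tested with.

```python
import os


def find_project_folders(root):
    """Return sorted absolute paths of the subfolders of `root` that
    contain a `source` directory (the layout described above)."""
    root = os.path.abspath(root)
    projects = []
    for name in sorted(os.listdir(root)):
        candidate = os.path.join(root, name)
        if os.path.isdir(os.path.join(candidate, "source")):
            projects.append(candidate)
    return projects


def write_input_file(projects, path="cppstats_input.txt"):
    """Write one absolute project path per line (assumed file format)."""
    with open(path, "w") as handle:
        for project in projects:
            handle.write(project + "\n")


# Example usage, assuming the unpacked projects live below ./projects:
#     write_input_file(find_project_folders("projects"))
```

Folders without a source subfolder are skipped silently, so a project that was unpacked into a differently named directory will not appear in the generated file.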

After a successful analysis, the results can be found in a CSV file (cppstats.csv) in the _cppstats subfolder of the folder passed as parameter (general analysis, ICSE 2010) or in a text file (disciplined_stats.txt) in the cppstats folder (discipline analysis, AOSD 2011).
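
As a starting point for working with the results, the following Python sketch reads a CSV file into a header row and a list of data rows. It makes no claim about the actual columns of cppstats.csv: the helper name load_results is hypothetical, the delimiter is a parameter because the one cppstats uses is not stated here, and treating the first row as a header is an assumption. The sample data is made up.

```python
import csv
import io


def load_results(handle, delimiter=","):
    """Read CSV data from an open text handle.

    Returns (header, rows). Assumes the first non-empty row is a header;
    both this and the default delimiter are assumptions, not documented
    properties of cppstats.csv.
    """
    rows = [row for row in csv.reader(handle, delimiter=delimiter) if row]
    if not rows:
        return [], []
    return rows[0], rows[1:]


# Made-up placeholder data; real column names will differ.
SAMPLE_CSV = "column_a,column_b\n1,2\n3,4\n"

header, rows = load_results(io.StringIO(SAMPLE_CSV))
for row in rows:
    print(dict(zip(header, row)))

# With a real result file (path layout as described above):
#     with open("project-folder/_cppstats/cppstats.csv", newline="") as handle:
#         header, rows = load_results(handle)
```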



C grammar


cppstats has been developed at the University of Passau, Germany. For questions about the cppstats project, please contact the developer: