This website provides supplementary material for the paper "Polyhedral Search Space Exploration in the ExaStencils Code Generator". The paper presents an optimized, multi-dimensional polyhedral search space exploration that efficiently selects a good subset of all legal affine transformations. It also proposes a set of seven heuristic filters, customized for the domain of stencil codes, that restrict the search space.

The exploration technique and the filter levels were implemented in the ExaStencils code generator as part of project ExaStencils (Advanced Stencil-Code Engineering).


Exploration Results

The following plots depict the performance distribution for all twelve experiments and every filter level. The five fastest schedules along with the "isl heuristics" schedule are shown in the tables below.

Complete versions of the presented tables can be found here.
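
For readers unfamiliar with the notation, the schedules in the tables are affine maps from statement instances to multi-dimensional timestamps, and instances execute in lexicographic order of those timestamps. The following Python sketch illustrates this for the two statements listed for Jacobi 3D cc1, ID 5380. The 2x2x2 iteration domain and the helper name `execution_order` are assumptions made for illustration only; the real domains and the elided statements ("...") are not reproduced here.

```python
from itertools import product

# Two statement schedules copied from the Jacobi 3D cc1 table (ID 5380).
# The remaining statements are elided ("...") in the table and omitted here.
SCHEDULE = {
    "S0003": lambda z, y, x: (y, z + y, z + y + x, 0),
    "S0004": lambda z, y, x: (1 + y, 1 + z + y, 1 + z + y + x, 1),
}


def execution_order(extent):
    """Return all statement instances sorted by their schedule timestamp.

    `extent` is a hypothetical cube edge length: every statement is assumed
    to iterate over 0 <= z, y, x < extent.
    """
    instances = []
    for name, timestamp in SCHEDULE.items():
        for z, y, x in product(range(extent), repeat=3):
            instances.append((timestamp(z, y, x), name, (z, y, x)))
    # Python compares tuples lexicographically, which matches the order
    # in which a multi-dimensional schedule executes its instances.
    return sorted(instances)


if __name__ == "__main__":
    for time, name, point in execution_order(2):
        print(time, name, point)
```

The outermost schedule dimension depends on y rather than z for the explored schedule, whereas the "isl heuristics" schedule of the same experiment starts with z; swapping the lambdas for another table row shows how the execution order changes.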


Jacobi 3D cc1
ID MLUPs Schedule
5380 4437.87 { S0003[z, y, x] -> [y, z+y, z+y+x, 0]; S0004[z, y, x] -> [1+y, 1+z+y, 1+z+y+x, 1]; ... }
5396 4421.18 { S0003[z, y, x] -> [y, z+y, y+x, 0]; S0004[z, y, x] -> [1+y, 1+z+y, 1+y+x, 1]; ... }
5382 4406.57 { S0003[z, y, x] -> [y, z+y, -z+y+x, 0]; S0004[z, y, x] -> [1+y, 1+z+y, 1-z+y+x, 1]; ... }
5237 4386.58 { S0003[z, y, x] -> [y, -z+y, z+y+x, 0]; S0004[z, y, x] -> [1+y, 1-z+y, 1+z+y+x, 1]; ... }
5240 4383.85 { S0003[z, y, x] -> [y, -z+y, -z+y+x, 0]; S0004[z, y, x] -> [1+y, 1-z+y, 1-z+y+x, 1]; ... }
isl heur. 4332.20 { S0003[z, y, x] -> [z, z+y, z+y+x, 0]; S0004[z, y, x] -> [1+z, 1+z+y, 1+z+y+x, 1]; ... }
Jacobi 3D cc2
ID MLUPs Schedule
4287 2655.61 { S0003[z, y, x] -> [y, z+y, -z+y+x, 0]; S0004[z, y, x] -> [2+y, 2+z+y, 2-z+y+x, 1]; ... }
4299 2651.40 { S0003[z, y, x] -> [y, z+y, y+x, 0]; S0004[z, y, x] -> [2+y, 2+z+y, 2+y+x, 1]; ... }
4286 2633.94 { S0003[z, y, x] -> [y, z+y, z+y+x, 0]; S0004[z, y, x] -> [2+y, 2+z+y, 2+z+y+x, 1]; ... }
4442 2589.77 { S0003[z, y, x] -> [y, -z+y, z+y+x, 0]; S0004[z, y, x] -> [2+y, 2-z+y, 2+z+y+x, 1]; ... }
4455 2588.71 { S0003[z, y, x] -> [y, -z+y, y+x, 0]; S0004[z, y, x] -> [2+y, 2-z+y, 2+y+x, 1]; ... }
isl heur. 2561.76 { S0003[z, y, x] -> [z, z+y, z+y+x, 0]; S0004[z, y, x] -> [2+z, 2+z+y, 2+z+y+x, 1]; ... }
Jacobi 3D ccd
ID MLUPs Schedule
14642 1823.01 { S0003[z, y, x] -> [y, z, x, 0]; S0004[z, y, x] -> [3+y, 1+z, 1+x, 0]; ... }
14581 1820.68 { S0003[z, y, x] -> [y, -z, y+x, 0]; S0004[z, y, x] -> [3+y, 1-z, 2+y+x, 0]; ... }
14589 1819.71 { S0003[z, y, x] -> [y, -z, -z+y+x, 0]; S0004[z, y, x] -> [3+y, 1-z, 3-z+y+x, 0]; ... }
670 1812.98 { S0003[z, y, x] -> [y, -z, y+x, 0]; S0004[z, y, x] -> [2+y, 1-z, 2+y+x, 0]; ... }
14648 1808.14 { S0003[z, y, x] -> [y, z, y+x, 0]; S0004[z, y, x] -> [3+y, 1+z, 3+y+x, 0]; ... }
isl heur. 111.26 { S0003[z, y, x] -> [z, y, x, 0]; S0004[z, y, x] -> [1+z, 1+y, 1+x, 1]; ... }
Jacobi 3D vc1
ID MLUPs Schedule
7179 1036.91 { S0004[z, y, x] -> [ y, -z+y, y+x, 0]; S0005[z, y, x] -> [1+y, 1-z+y, 1+y+x, 1]; ... }
7173 1036.61 { S0004[z, y, x] -> [ y, -z+y, z+y+x, 0]; S0005[z, y, x] -> [1+y, 1-z+y, 1+z+y+x, 1]; ... }
7167 1035.56 { S0004[z, y, x] -> [ y, -z+y, -z+y+x, 0]; S0005[z, y, x] -> [1+y, 1-z+y, 1-z+y+x, 1]; ... }
6139 1034.15 { S0004[z, y, x] -> [-y, -z-y, -y+x, 0]; S0005[z, y, x] -> [1-y, 1-z-y, 1-y+x, 1]; ... }
6134 1032.54 { S0004[z, y, x] -> [-y, -z-y, z-y+x, 0]; S0005[z, y, x] -> [1-y, 1-z-y, 1+z-y+x, 1]; ... }
isl heur. 1039.83 { S0004[z, y, x] -> [ z, z+y, z+x, 0]; S0005[z, y, x] -> [1+z, 1+z+y, 1+z+x, 1]; ... }
RBGS 3D cc1
ID MLUPs Schedule
8576 2848.22 { S0002[z, y, x] -> [y, -z+y, -z+x, 0]; S0003[z, y, x] -> [1+y, 1-z+y, 1-z+x, 0]; ... }
8578 2841.77 { S0002[z, y, x] -> [y, -z+y, x, 0]; S0003[z, y, x] -> [1+y, 1-z+y, 1+x, 0]; ... }
8579 2835.49 { S0002[z, y, x] -> [y, -z+y, z+x, 0]; S0003[z, y, x] -> [1+y, 1-z+y, 1+z+x, 0]; ... }
8639 2822.94 { S0002[z, y, x] -> [y, z+y, z+x, 0]; S0003[z, y, x] -> [1+y, 1+z+y, 1+z+x, 0]; ... }
8636 2817.33 { S0002[z, y, x] -> [y, z+y, -z+x, 0]; S0003[z, y, x] -> [1+y, 1+z+y, 1-z+x, 0]; ... }
isl heur. 696.49 { S0002[z, y, x] -> [z, y, x, 7]; S0003[z, y, x] -> [1+z, 1+y, 1+x, 6]; ... }
RBGS 3D vc1
ID MLUPs Schedule
8534 511.87 { S0003[z, y, x] -> [-y, -z-y, z-y-x, 0]; S0004[z, y, x] -> [1-y, 1-z-y, 1+z-y-x, 1]; ... }
8738 509.68 { S0003[z, y, x] -> [-y, z-y, -y-x, 0]; S0004[z, y, x] -> [1-y, 1+z-y, 1-y-x, 1]; ... }
4927 508.05 { S0003[z, y, x] -> [ y, -z+y, z+y+x, 0]; S0004[z, y, x] -> [1+y, 1-z+y, 1+z+y+x, 1]; ... }
4932 507.54 { S0003[z, y, x] -> [ y, -z+y, -z+y+x, 0]; S0004[z, y, x] -> [1+y, 1-z+y, 1-z+y+x, 1]; ... }
4728 507.24 { S0003[z, y, x] -> [ y, z+y, -z+y+x, 0]; S0004[z, y, x] -> [1+y, 1+z+y, 1-z+y+x, 1]; ... }
isl heur. 294.24 { S0003[z, y, x] -> [ z, y, x, 0]; S0004[z, y, x] -> [1+z, 1+y, 1+x, 3]; ... }
Jacobi 2D cc1
ID MLUPs Schedule
68 4907.41 { S0003[y, x] -> [ y, y+x, 0]; S0004[y, x] -> [1+y, 1+y+x, 1]; ... }
34 4803.46 { S0003[y, x] -> [-y, -y+x, 0]; S0004[y, x] -> [1-y, 1-y+x, 1]; ... }
76 4512.45 { S0003[y, x] -> [ y, y+x, 0]; S0004[y, x] -> [3+y, 3+y+x, 0]; ... }
42 4450.92 { S0003[y, x] -> [-y, -y+x, 0]; S0004[y, x] -> [3-y, 3-y+x, 0]; ... }
40 3581.10 { S0003[y, x] -> [-y, -y-x, 0]; S0004[y, x] -> [3-y, 3-y-x, 0]; ... }
isl heur. 4907.41 { S0003[y, x] -> [ y, y+x, 0]; S0004[y, x] -> [1+y, 1+y+x, 1]; ... }
Jacobi 2D cc2
ID MLUPs Schedule
33 4332.32 { S0003[y, x] -> [ y, y+x, 0]; S0004[y, x] -> [2+y, 2+y+x, 1]; ... }
67 4260.92 { S0003[y, x] -> [-y, -y+x, 0]; S0004[y, x] -> [2-y, 2-y+x, 1]; ... }
74 2435.20 { S0003[y, x] -> [-y, -y+x, 0]; S0004[y, x] -> [6-y, 6-y+x, 0]; ... }
40 2394.38 { S0003[y, x] -> [ y, y+x, 0]; S0004[y, x] -> [6+y, 6+y+x, 0]; ... }
68 2328.94 { S0003[y, x] -> [-y, -y-x, 0]; S0004[y, x] -> [2-y, 2-y-x, 1]; ... }
isl heur. 4332.32 { S0003[y, x] -> [ y, y+x, 0]; S0004[y, x] -> [2+y, 2+y+x, 1]; ... }
Jacobi 2D ccd
ID MLUPs Schedule
96 4574.54 { S0003[y, x] -> [ y, y+x, 0]; S0004[y, x] -> [3+y, 3+y+x, 0]; ... }
108 4503.19 { S0003[y, x] -> [-y, -y+x, 0]; S0004[y, x] -> [3-y, 3-y+x, 0]; ... }
4 4450.60 { S0003[y, x] -> [ y, y+x, 0]; S0004[y, x] -> [2+y, 2+y+x, 0]; ... }
16 4398.03 { S0003[y, x] -> [-y, -y+x, 0]; S0004[y, x] -> [2-y, 2-y+x, 0]; ... }
110 2942.89 { S0003[y, x] -> [-y, -y-x, 0]; S0004[y, x] -> [3-y, 3-y-x, 0]; ... }
isl heur. 348.02 { S0003[y, x] -> [ y, x, 0]; S0004[y, x] -> [1+y, 1+x, 1]; ... }
Jacobi 2D vc1
ID MLUPs Schedule
46 1736.18 { S0004[y, x] -> [-y, -y+x, 0]; S0005[y, x] -> [1-y, 1-y+x, 1]; ... }
56 1614.92 { S0004[y, x] -> [ y, y+x, 0]; S0005[y, x] -> [1+y, 1+y+x, 1]; ... }
62 1592.34 { S0004[y, x] -> [ y, y-x, 0]; S0005[y, x] -> [3+y, 3+y-x, 0]; ... }
43 1448.90 { S0004[y, x] -> [-y, -y-x, 0]; S0005[y, x] -> [1-y, 1-y-x, 1]; ... }
57 1387.57 { S0004[y, x] -> [ y, y-x, 0]; S0005[y, x] -> [1+y, 1+y-x, 1]; ... }
isl heur. 1614.92 { S0004[y, x] -> [ y, y+x, 0]; S0005[y, x] -> [1+y, 1+y+x, 1]; ... }
RBGS 2D cc1
ID MLUPs Schedule
34 2751.77 { S0002[y, x] -> [ y, y-x, 0]; S0003[y, x] -> [1+y, 1+y-x, 1]; ... }
33 2732.99 { S0002[y, x] -> [ y, y+x, 0]; S0003[y, x] -> [1+y, 1+y+x, 1]; ... }
67 2606.05 { S0002[y, x] -> [-y, -y+x, 0]; S0003[y, x] -> [1-y, 1-y+x, 1]; ... }
68 2581.45 { S0002[y, x] -> [-y, -y-x, 0]; S0003[y, x] -> [1-y, 1-y-x, 1]; ... }
45 2497.41 { S0002[y, x] -> [ x, y+x, 0]; S0003[y, x] -> [1+x, 1+y+x, 1]; ... }
isl heur. 589.79 { S0002[y, x] -> [ y, x, 7]; S0003[y, x] -> [1+y, 1+x, 2]; ... }
RBGS 2D vc1
ID MLUPs Schedule
34 1249.66 { S0003[y, x] -> [ y, y+x, 0]; S0004[y, x] -> [1+y, 1+y+x, 1]; ... }
68 1240.09 { S0003[y, x] -> [-y, -y-x, 0]; S0004[y, x] -> [1-y, 1-y-x, 1]; ... }
33 1043.73 { S0003[y, x] -> [ y, y-x, 0]; S0004[y, x] -> [1+y, 1+y-x, 1]; ... }
67 966.13 { S0003[y, x] -> [-y, -y+x, 0]; S0004[y, x] -> [1-y, 1-y+x, 1]; ... }
42 932.44 { S0003[y, x] -> [ y, y-x, 0]; S0004[y, x] -> [3+y, 3+y-x, 0]; ... }
isl heur. 129.63 { S0003[y, x] -> [ y, x, 2]; S0004[y, x] -> [1+y, 1+x, 0]; ... }
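
To compare the experiments at a glance, the ratio between the fastest explored schedule and the "isl heuristics" schedule can be computed directly from the tables above. The following Python sketch does this; the MLUPs values are copied from the first and last row of each table, while the dictionary layout and the function name `speedups` are illustrative choices.

```python
# (fastest explored schedule, "isl heur." schedule) in MLUPs,
# copied from the tables above.
RESULTS = {
    "Jacobi 3D cc1": (4437.87, 4332.20),
    "Jacobi 3D cc2": (2655.61, 2561.76),
    "Jacobi 3D ccd": (1823.01, 111.26),
    "Jacobi 3D vc1": (1036.91, 1039.83),
    "RBGS 3D cc1": (2848.22, 696.49),
    "RBGS 3D vc1": (511.87, 294.24),
    "Jacobi 2D cc1": (4907.41, 4907.41),
    "Jacobi 2D cc2": (4332.32, 4332.32),
    "Jacobi 2D ccd": (4574.54, 348.02),
    "Jacobi 2D vc1": (1736.18, 1614.92),
    "RBGS 2D cc1": (2751.77, 589.79),
    "RBGS 2D vc1": (1249.66, 129.63),
}


def speedups(results):
    """Map each experiment to best-explored MLUPs divided by isl MLUPs."""
    return {name: best / isl for name, (best, isl) in results.items()}


if __name__ == "__main__":
    ranked = sorted(speedups(RESULTS).items(), key=lambda item: -item[1])
    for name, factor in ranked:
        print(f"{name:14s} {factor:6.2f}x")
```

A factor of 1.00 means the listed values coincide, and a factor slightly below 1 (as for Jacobi 3D vc1) means the "isl heuristics" schedule measured marginally faster than the fastest listed explored schedule.
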
Experiment Replication

A compiled version of the ExaStencils code generator, the ExaSlang 4 code, and the configuration files can be found here.

Note: The code generator is still under development and neither the ExaSlang code nor the configuration files are intended to demonstrate anything other than the polyhedral search space exploration technique. Currently, an exploration can be executed only for ExaSlang 4 code that contains no more than one static control part that is not tagged sequentially.

Contents of the archive:

  • compiler_expl.jar: version of the ExaStencils code generator used in the experiments
  • knowledge*.txt: common configuration files for all experiments; options of special interest for the exploration are documented in the corresponding files
  • platform.txt: specification of the target platform and the target compiler (see documentation in the file)
  • settings.txt: specification of some input/output paths (the provided scripts rely on these paths), as well as an option to add special compiler flags
  • *.sh: simple bash scripts to assist the exploration, code generation, compilation, and execution of all variants
  • L4_templ.exa: templated ExaSlang source file that must be preprocessed by cpp
  • jacobi_* and rbgs_*: directories for all experiments, each containing one knowledge file for the exploration and one for the isl version, as well as a file that can be preprocessed using cpp to generate the ExaSlang code; the knowledge files enable the setting and/or overriding of options per experiment (as, e.g., in jacobi_3D_ccd/knowledge.txt)

An experiment can be conducted by invoking the scripts 1*.sh to 5*.sh in sequence inside one of the twelve experiment directories (i.e., the working directory must contain the files L4_in.exa and knowledge.txt). While the first four have no special external dependencies, 5sc_runExplored.sh must be adapted to run the binaries on the desired hardware. In its current form, it submits a Slurm array job.
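
The invocation sequence described above could be automated roughly as follows. This is only a sketch under several assumptions: that the scripts reside in one directory (here called `script_dir`), that they take no arguments, and that they can be started with bash from inside the experiment directory. The function names `ordered_scripts` and `run_experiment` are hypothetical and not part of the archive.

```python
import subprocess
from pathlib import Path


def ordered_scripts(script_dir):
    """Return the *.sh scripts whose names start with 1..5, sorted by name."""
    scripts = [p for p in Path(script_dir).glob("*.sh") if p.name[0] in "12345"]
    return sorted(scripts, key=lambda p: p.name)


def run_experiment(script_dir, experiment_dir, dry_run=False):
    """Run the numbered scripts in sequence with experiment_dir as cwd.

    Returns the script names in execution order. With dry_run=True the
    scripts are only listed, not executed.
    """
    experiment_dir = Path(experiment_dir)
    # The working directory must contain these two files.
    for required in ("L4_in.exa", "knowledge.txt"):
        if not (experiment_dir / required).is_file():
            raise FileNotFoundError(f"{required} missing in {experiment_dir}")

    executed = []
    for script in ordered_scripts(script_dir):
        executed.append(script.name)
        if not dry_run:
            # check=True aborts the sequence as soon as one step fails.
            subprocess.run(
                ["bash", str(script.resolve())], cwd=experiment_dir, check=True
            )
    return executed


if __name__ == "__main__":
    # Hypothetical layout: scripts in the archive root, run for one experiment.
    print(run_experiment(".", "jacobi_3D_ccd", dry_run=True))
```

Keeping `dry_run=True` for a first pass makes it possible to check the order before the last script, which has to be adapted to the target hardware, is actually run.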

The generated C++ codes for the two exploration runs of all experiments mentioned in the paper can be downloaded:

Warning: While the former archive is only 203 MB in size, it contains almost 1.6 million files that consume more than 16 GB of disk space.

Contact