This website provides supplementary material for the paper "Polyhedral Search Space Exploration in the ExaStencils Code Generator". The paper presents an optimized, multi-dimensional polyhedral search space exploration that efficiently selects a good subset of all legal affine transformations. It also proposes a set of seven heuristic filters, customized for the domain of stencil codes, that restrict the search space.
The exploration technique and the filter levels were implemented in the ExaStencils code generator as part of the ExaStencils project (Advanced Stencil-Code Engineering).
The following plots depict the performance distribution for all twelve experiments and every filter level. The five fastest schedules, along with the "isl heuristics" schedule, are shown in the tables below; performance is given in MLUPs (million lattice updates per second).
Complete versions of the presented tables can be found here.
ID | MLUPs | Schedule |
---|---|---|
5380 | 4437.87 | { S0003[z, y, x] -> [y, z+y, z+y+x, 0]; S0004[z, y, x] -> [1+y, 1+z+y, 1+z+y+x, 1]; ... } |
5396 | 4421.18 | { S0003[z, y, x] -> [y, z+y, y+x, 0]; S0004[z, y, x] -> [1+y, 1+z+y, 1+y+x, 1]; ... } |
5382 | 4406.57 | { S0003[z, y, x] -> [y, z+y, -z+y+x, 0]; S0004[z, y, x] -> [1+y, 1+z+y, 1-z+y+x, 1]; ... } |
5237 | 4386.58 | { S0003[z, y, x] -> [y, -z+y, z+y+x, 0]; S0004[z, y, x] -> [1+y, 1-z+y, 1+z+y+x, 1]; ... } |
5240 | 4383.85 | { S0003[z, y, x] -> [y, -z+y, -z+y+x, 0]; S0004[z, y, x] -> [1+y, 1-z+y, 1-z+y+x, 1]; ... } |
isl heur. | 4332.20 | { S0003[z, y, x] -> [z, z+y, z+y+x, 0]; S0004[z, y, x] -> [1+z, 1+z+y, 1+z+y+x, 1]; ... } |
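
As a reading aid for the schedule notation, the following minimal sketch evaluates the two statement mappings listed for ID 5380 in the table above and sorts the resulting time vectors lexicographically, which is the order in which a schedule executes statement instances. The tiny cubic iteration domain (`extent`) and the helper name `execution_order` are assumptions made for illustration; the statements elided by `...` in the table are not modeled.

```python
from itertools import product

# Schedule of ID 5380, transcribed by hand from the table above:
#   S0003[z, y, x] -> [y, z+y, z+y+x, 0]
#   S0004[z, y, x] -> [1+y, 1+z+y, 1+z+y+x, 1]
# Statements hidden behind "..." in the table are not included.
SCHEDULE = {
    "S0003": lambda z, y, x: (y, z + y, z + y + x, 0),
    "S0004": lambda z, y, x: (1 + y, 1 + z + y, 1 + z + y + x, 1),
}


def execution_order(schedule, extent):
    """Return all statement instances sorted by their schedule time vector.

    The iteration domain 0 <= z, y, x < extent is a hypothetical stand-in
    for the real domains, which are not given on this page.
    """
    instances = []
    for name, time_of in schedule.items():
        for z, y, x in product(range(extent), repeat=3):
            instances.append((time_of(z, y, x), name, (z, y, x)))
    # Python compares tuples lexicographically, which matches the
    # ordering of multi-dimensional schedule time vectors.
    instances.sort()
    return instances


order = execution_order(SCHEDULE, extent=2)
for time, name, point in order[:6]:
    print(time, name, point)
```

The first schedule dimension is `y` for both statements (offset by one for S0004), so the outermost generated loop advances along `y` rather than `z`, in contrast to the "isl heur." row.
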
ID | MLUPs | Schedule |
---|---|---|
4287 | 2655.61 | { S0003[z, y, x] -> [y, z+y, -z+y+x, 0]; S0004[z, y, x] -> [2+y, 2+z+y, 2-z+y+x, 1]; ... } |
4299 | 2651.40 | { S0003[z, y, x] -> [y, z+y, y+x, 0]; S0004[z, y, x] -> [2+y, 2+z+y, 2+y+x, 1]; ... } |
4286 | 2633.94 | { S0003[z, y, x] -> [y, z+y, z+y+x, 0]; S0004[z, y, x] -> [2+y, 2+z+y, 2+z+y+x, 1]; ... } |
4442 | 2589.77 | { S0003[z, y, x] -> [y, -z+y, z+y+x, 0]; S0004[z, y, x] -> [2+y, 2-z+y, 2+z+y+x, 1]; ... } |
4455 | 2588.71 | { S0003[z, y, x] -> [y, -z+y, y+x, 0]; S0004[z, y, x] -> [2+y, 2-z+y, 2+y+x, 1]; ... } |
isl heur. | 2561.76 | { S0003[z, y, x] -> [z, z+y, z+y+x, 0]; S0004[z, y, x] -> [2+z, 2+z+y, 2+z+y+x, 1]; ... } |
ID | MLUPs | Schedule |
---|---|---|
14642 | 1823.01 | { S0003[z, y, x] -> [y, z, x, 0]; S0004[z, y, x] -> [3+y, 1+z, 1+x, 0]; ... } |
14581 | 1820.68 | { S0003[z, y, x] -> [y, -z, y+x, 0]; S0004[z, y, x] -> [3+y, 1-z, 2+y+x, 0]; ... } |
14589 | 1819.71 | { S0003[z, y, x] -> [y, -z, -z+y+x, 0]; S0004[z, y, x] -> [3+y, 1-z, 3-z+y+x, 0]; ... } |
670 | 1812.98 | { S0003[z, y, x] -> [y, -z, y+x, 0]; S0004[z, y, x] -> [2+y, 1-z, 2+y+x, 0]; ... } |
14648 | 1808.14 | { S0003[z, y, x] -> [y, z, y+x, 0]; S0004[z, y, x] -> [3+y, 1+z, 3+y+x, 0]; ... } |
isl heur. | 111.26 | { S0003[z, y, x] -> [z, y, x, 0]; S0004[z, y, x] -> [1+z, 1+y, 1+x, 1]; ... } |
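
To make the gap between explored schedules and the "isl heuristics" schedule easier to judge, the short sketch below computes relative speedups from the MLUPs values of the table above. The values are copied from that table; the function name `speedups` is a hypothetical helper, not part of the ExaStencils tooling.

```python
# MLUPs values copied from the table above (schedule ID -> MLUPs).
explored = {
    14642: 1823.01,
    14581: 1820.68,
    14589: 1819.71,
    670: 1812.98,
    14648: 1808.14,
}
isl_heuristic = 111.26  # "isl heur." row of the same table


def speedups(explored_mlups, baseline_mlups):
    """Return the speedup factor of each schedule over the baseline."""
    return {sid: mlups / baseline_mlups for sid, mlups in explored_mlups.items()}


for sid, factor in speedups(explored, isl_heuristic).items():
    print(f"schedule {sid}: {factor:.2f}x over the isl heuristics schedule")
```

For this experiment, every one of the five listed schedules is more than 16 times faster than the baseline, whereas in several other tables the difference is only a few percent.
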
ID | MLUPs | Schedule |
---|---|---|
7179 | 1036.91 | { S0004[z, y, x] -> [ y, -z+y, y+x, 0]; S0005[z, y, x] -> [1+y, 1-z+y, 1+y+x, 1]; ... } |
7173 | 1036.61 | { S0004[z, y, x] -> [ y, -z+y, z+y+x, 0]; S0005[z, y, x] -> [1+y, 1-z+y, 1+z+y+x, 1]; ... } |
7167 | 1035.56 | { S0004[z, y, x] -> [ y, -z+y, -z+y+x, 0]; S0005[z, y, x] -> [1+y, 1-z+y, 1-z+y+x, 1]; ... } |
6139 | 1034.15 | { S0004[z, y, x] -> [-y, -z-y, -y+x, 0]; S0005[z, y, x] -> [1-y, 1-z-y, 1-y+x, 1]; ... } |
6134 | 1032.54 | { S0004[z, y, x] -> [-y, -z-y, z-y+x, 0]; S0005[z, y, x] -> [1-y, 1-z-y, 1+z-y+x, 1]; ... } |
isl heur. | 1039.83 | { S0004[z, y, x] -> [ z, z+y, z+x, 0]; S0005[z, y, x] -> [1+z, 1+z+y, 1+z+x, 1]; ... } |
ID | MLUPs | Schedule |
---|---|---|
8576 | 2848.22 | { S0002[z, y, x] -> [y, -z+y, -z+x, 0]; S0003[z, y, x] -> [1+y, 1-z+y, 1-z+x, 0]; ... } |
8578 | 2841.77 | { S0002[z, y, x] -> [y, -z+y, x, 0]; S0003[z, y, x] -> [1+y, 1-z+y, 1+x, 0]; ... } |
8579 | 2835.49 | { S0002[z, y, x] -> [y, -z+y, z+x, 0]; S0003[z, y, x] -> [1+y, 1-z+y, 1+z+x, 0]; ... } |
8639 | 2822.94 | { S0002[z, y, x] -> [y, z+y, z+x, 0]; S0003[z, y, x] -> [1+y, 1+z+y, 1+z+x, 0]; ... } |
8636 | 2817.33 | { S0002[z, y, x] -> [y, z+y, -z+x, 0]; S0003[z, y, x] -> [1+y, 1+z+y, 1-z+x, 0]; ... } |
isl heur. | 696.49 | { S0002[z, y, x] -> [z, y, x, 7]; S0003[z, y, x] -> [1+z, 1+y, 1+x, 6]; ... } |
ID | MLUPs | Schedule |
---|---|---|
8534 | 511.87 | { S0003[z, y, x] -> [-y, -z-y, z-y-x, 0]; S0004[z, y, x] -> [1-y, 1-z-y, 1+z-y-x, 1]; ... } |
8738 | 509.68 | { S0003[z, y, x] -> [-y, z-y, -y-x, 0]; S0004[z, y, x] -> [1-y, 1+z-y, 1-y-x, 1]; ... } |
4927 | 508.05 | { S0003[z, y, x] -> [ y, -z+y, z+y+x, 0]; S0004[z, y, x] -> [1+y, 1-z+y, 1+z+y+x, 1]; ... } |
4932 | 507.54 | { S0003[z, y, x] -> [ y, -z+y, -z+y+x, 0]; S0004[z, y, x] -> [1+y, 1-z+y, 1-z+y+x, 1]; ... } |
4728 | 507.24 | { S0003[z, y, x] -> [ y, z+y, -z+y+x, 0]; S0004[z, y, x] -> [1+y, 1+z+y, 1-z+y+x, 1]; ... } |
isl heur. | 294.24 | { S0003[z, y, x] -> [ z, y, x, 0]; S0004[z, y, x] -> [1+z, 1+y, 1+x, 3]; ... } |
ID | MLUPs | Schedule |
---|---|---|
68 | 4907.41 | { S0003[y, x] -> [ y, y+x, 0]; S0004[y, x] -> [1+y, 1+y+x, 1]; ... } |
34 | 4803.46 | { S0003[y, x] -> [-y, -y+x, 0]; S0004[y, x] -> [1-y, 1-y+x, 1]; ... } |
76 | 4512.45 | { S0003[y, x] -> [ y, y+x, 0]; S0004[y, x] -> [3+y, 3+y+x, 0]; ... } |
42 | 4450.92 | { S0003[y, x] -> [-y, -y+x, 0]; S0004[y, x] -> [3-y, 3-y+x, 0]; ... } |
40 | 3581.10 | { S0003[y, x] -> [-y, -y-x, 0]; S0004[y, x] -> [3-y, 3-y-x, 0]; ... } |
isl heur. | 4907.41 | { S0003[y, x] -> [ y, y+x, 0]; S0004[y, x] -> [1+y, 1+y+x, 1]; ... } |
ID | MLUPs | Schedule |
---|---|---|
33 | 4332.32 | { S0003[y, x] -> [ y, y+x, 0]; S0004[y, x] -> [2+y, 2+y+x, 1]; ... } |
67 | 4260.92 | { S0003[y, x] -> [-y, -y+x, 0]; S0004[y, x] -> [2-y, 2-y+x, 1]; ... } |
74 | 2435.20 | { S0003[y, x] -> [-y, -y+x, 0]; S0004[y, x] -> [6-y, 6-y+x, 0]; ... } |
40 | 2394.38 | { S0003[y, x] -> [ y, y+x, 0]; S0004[y, x] -> [6+y, 6+y+x, 0]; ... } |
68 | 2328.94 | { S0003[y, x] -> [-y, -y-x, 0]; S0004[y, x] -> [2-y, 2-y-x, 1]; ... } |
isl heur. | 4332.32 | { S0003[y, x] -> [ y, y+x, 0]; S0004[y, x] -> [2+y, 2+y+x, 1]; ... } |
ID | MLUPs | Schedule |
---|---|---|
96 | 4574.54 | { S0003[y, x] -> [ y, y+x, 0]; S0004[y, x] -> [3+y, 3+y+x, 0]; ... } |
108 | 4503.19 | { S0003[y, x] -> [-y, -y+x, 0]; S0004[y, x] -> [3-y, 3-y+x, 0]; ... } |
4 | 4450.60 | { S0003[y, x] -> [ y, y+x, 0]; S0004[y, x] -> [2+y, 2+y+x, 0]; ... } |
16 | 4398.03 | { S0003[y, x] -> [-y, -y+x, 0]; S0004[y, x] -> [2-y, 2-y+x, 0]; ... } |
110 | 2942.89 | { S0003[y, x] -> [-y, -y-x, 0]; S0004[y, x] -> [3-y, 3-y-x, 0]; ... } |
isl heur. | 348.02 | { S0003[y, x] -> [ y, x, 0]; S0004[y, x] -> [1+y, 1+x, 1]; ... } |
ID | MLUPs | Schedule |
---|---|---|
46 | 1736.18 | { S0004[y, x] -> [-y, -y+x, 0]; S0005[y, x] -> [1-y, 1-y+x, 1]; ... } |
56 | 1614.92 | { S0004[y, x] -> [ y, y+x, 0]; S0005[y, x] -> [1+y, 1+y+x, 1]; ... } |
62 | 1592.34 | { S0004[y, x] -> [ y, y-x, 0]; S0005[y, x] -> [3+y, 3+y-x, 0]; ... } |
43 | 1448.90 | { S0004[y, x] -> [-y, -y-x, 0]; S0005[y, x] -> [1-y, 1-y-x, 1]; ... } |
57 | 1387.57 | { S0004[y, x] -> [ y, y-x, 0]; S0005[y, x] -> [1+y, 1+y-x, 1]; ... } |
isl heur. | 1614.92 | { S0004[y, x] -> [ y, y+x, 0]; S0005[y, x] -> [1+y, 1+y+x, 1]; ... } |
ID | MLUPs | Schedule |
---|---|---|
34 | 2751.77 | { S0002[y, x] -> [ y, y-x, 0]; S0003[y, x] -> [1+y, 1+y-x, 1]; ... } |
33 | 2732.99 | { S0002[y, x] -> [ y, y+x, 0]; S0003[y, x] -> [1+y, 1+y+x, 1]; ... } |
67 | 2606.05 | { S0002[y, x] -> [-y, -y+x, 0]; S0003[y, x] -> [1-y, 1-y+x, 1]; ... } |
68 | 2581.45 | { S0002[y, x] -> [-y, -y-x, 0]; S0003[y, x] -> [1-y, 1-y-x, 1]; ... } |
45 | 2497.41 | { S0002[y, x] -> [ x, y+x, 0]; S0003[y, x] -> [1+x, 1+y+x, 1]; ... } |
isl heur. | 589.79 | { S0002[y, x] -> [ y, x, 7]; S0003[y, x] -> [1+y, 1+x, 2]; ... } |
ID | MLUPs | Schedule |
---|---|---|
34 | 1249.66 | { S0003[y, x] -> [ y, y+x, 0]; S0004[y, x] -> [1+y, 1+y+x, 1]; ... } |
68 | 1240.09 | { S0003[y, x] -> [-y, -y-x, 0]; S0004[y, x] -> [1-y, 1-y-x, 1]; ... } |
33 | 1043.73 | { S0003[y, x] -> [ y, y-x, 0]; S0004[y, x] -> [1+y, 1+y-x, 1]; ... } |
67 | 966.13 | { S0003[y, x] -> [-y, -y+x, 0]; S0004[y, x] -> [1-y, 1-y+x, 1]; ... } |
42 | 932.44 | { S0003[y, x] -> [ y, y-x, 0]; S0004[y, x] -> [3+y, 3+y-x, 0]; ... } |
isl heur. | 129.63 | { S0003[y, x] -> [ y, x, 2]; S0004[y, x] -> [1+y, 1+x, 0]; ... } |
A compiled version of the ExaStencils code generator, the ExaSlang 4 code, and the configuration files can be found here.
Note: The code generator is still under development and neither the ExaSlang code nor the configuration files are intended to demonstrate anything other than the polyhedral search space exploration technique. Currently, an exploration can be executed only for ExaSlang 4 code that contains no more than one static control part that is not tagged sequentially.
Contents of the archive:
To conduct an experiment, invoke the scripts 1*.sh to 5*.sh in sequence from within one of the twelve experiment directories (i.e., the working directory must contain the files L4_in.exa and knowledge.txt). The first four scripts have no special external dependencies, but 5sc_runExplored.sh must be adapted to run the binaries on the desired hardware. In its current form, it submits a Slurm array job.
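
The sequence described above could be automated along the following lines. This is only a sketch under stated assumptions: it assumes exactly one script per stage prefix (`1*.sh` to `5*.sh`), that the scripts are run with `bash`, and that their location is passed in as `script_dir`, since the archive layout is not described here. The function names are hypothetical.

```python
import subprocess
from pathlib import Path

# Files that must be present in the working directory, per the text above.
REQUIRED_INPUTS = ("L4_in.exa", "knowledge.txt")


def stage_scripts(script_dir):
    """Collect the scripts 1*.sh to 5*.sh in stage order.

    Assumption: each stage prefix matches exactly one script.
    """
    directory = Path(script_dir)
    scripts = []
    for stage in range(1, 6):
        matches = sorted(directory.glob(f"{stage}*.sh"))
        if len(matches) != 1:
            raise FileNotFoundError(
                f"expected exactly one {stage}*.sh in {directory}, "
                f"found {len(matches)}"
            )
        scripts.append(matches[0])
    return scripts


def run_experiment(experiment_dir, script_dir, run=subprocess.run):
    """Run all five stages with the experiment directory as working directory."""
    directory = Path(experiment_dir)
    missing = [name for name in REQUIRED_INPUTS if not (directory / name).is_file()]
    if missing:
        raise FileNotFoundError(f"{directory} is missing: {', '.join(missing)}")
    for script in stage_scripts(script_dir):
        # check=True aborts the sequence as soon as one stage fails.
        run(["bash", str(script.resolve())], cwd=directory, check=True)
```

A call such as `run_experiment("path/to/experiment", "path/to/scripts")` (both paths are placeholders) would then execute the five stages in order. The last stage still requires 5sc_runExplored.sh to be adapted to the target hardware beforehand.
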
The generated C++ codes for the two exploration runs of all experiments mentioned in the paper can be downloaded:
Warning: Although the former archive is only 203 MB in size, it contains almost 1.6 million files that occupy more than 16 GB of disk space.