Documentation related to LooPo
Note that the descriptions can be out of date.
Basic Theory
The basics of loop parallelization in the polytope are described in a survey
paper by Christian
Lengauer. However, LooPo is much more flexible than the methods of
this paper. E.g., it allows affine dependences, piecewise affine,
multi-dimensional schedules and allocations for individual statements in
not necessarily perfectly nested loops.
Theses
The following theses were developed as part of the LooPo project:
- "Automatische Methoden zur Parallelisierung im
Polyedermodell" (Christian Wieninger, May 1995)
Describes and compares two automatic methods for generating space-time
mappings: Lamport's hyperplane method and Feautrier's work on
one- and multi-dimensional schedules and allocations. Both have
been implemented as part of LooPo.
- "Automatic Code Generation in the Polytope
Model" (Sabine Wetzel,
Nov. 1995)
Describes code generation algorithms for space-time mappings
generated by Lamport's or Feautrier's method. Implemented in LooPo.
- "Theoretischer und praktischer Vergleich von Algorithmen zur Lösung
von Ungleichungssystemen" (Klaus Köbler, Oct. 1995, on paper only)
Describes and compares three algorithms for solving systems of linear
inequations: Feautrier's PIP, Fourier-Motzkin elimination and
Weispfenning's algorithm. Points out which of them is suitable for
which parts of LooPo.
- "Practical Methods for Scheduling and Allocation in the Polytope Model" (Wolfgang Meisl, Sept. 1996)
Describes the scheduling method of Darte/Vivien and the allocation
method of Darte/Dion/Robert and some extensions thereof, including
some hints on our implementation.
- "Datenabhängigkeitsanalyse zur Schleifenparallelisierung: Vergleich und Erweiterung zweier Ansätze" (Harald Keimer, Januar 1997)
Describes an extended version of Banerjee's method for
determining data dependence.
- "The Mechanical Parallelization of Loop Nests Containing
while
Loops" (Martin Griebl, Oct. 1996)
Describes how the polytope model, useful for the automatic
parallelization of nested for
loops, can be adapted to
while
loops; esp. the problems of code
generation are described and solved.
- "Parallelization of Loop Nests with General Bounds in the Polyhedron Model" (Max Geigl, Mar. 1997)
One part examines the implications of different loop types for target code generation and integrates the results in a class hierarchy. Another part extends the polyhedron model to imperfect loop nests containing general for
loops, while
loops and (plain) if
statements.
- "Graphische Darstellung des Indexraumes von Anweisungen in Schleifenprogrammen" (Alexander Wüst, Mar. 1997)
Describes a visualisation tool for index spaces and dependences in loop programs.
- "Partitionierung von parallelen Schleifenprogrammen" (Martina Wieninger, Sept. 1997)
Describes an LSGP tiling method and its adaptation for parallel loop nests, and an LPGS tiling scheme for parallel loop nests.
- "Transformation von Shared-Memory-Programmen zu Distributed-Memory-Programmen" (Peter Faber, Nov. 1997)
Describes the generation of SPMD-Code for distributed-memory machines.
- "Index Set Splitting Based on the Polytope Model" (Bernhard Lehner, Feb. 1999)
Describes how the result of state-of-the-art schedulers and
allocators can be improved by partitioning the index set of the loop
nests into "homogeneous" pieces w.r.t. the existing dependences in every
piece.
- "Parallele Schleifenprogramme in Java" (Andreas Wolf, Apr. 2000)
Describes Java dialects with a focus on parallel execution and their applicability to high performance computing.
- "Analyse und Optimierung des
Kommunikationsaufwands in automatisch parallelisierten Programmen"
(Sven Anders, Mai 2000)
Describes techniques for speeding up automatically parallelized
programs by reducing the communications. Two methods are presented,
whereby one describes the tiling of time.
- "Kontrollfluß basierte Array-Abhängigkeitsanalyse in Schleifenprogrammen" (Babette Eckart, September 2000)
Describes a dependence analysis for arbitrarily nested loops with
dynamic control (if statements). The method is based on the
control flow graph and was presented in "A
Precise Fixpoint Reaching Definition Analysis for Arrays" at
LCPC'99.
- "Verfahren zur Aufzählung von Polyedern in der Schleifenparallelisierung" (Oliver Nyderle, September 2000)
Describes two methods for scanning polyhedra: one by Quillere and one
by Weispfenning, and shows their application/integration in LooPo.
- "Extending the Polyhedron Model to Inequality Systems with Non-linear Parameters using Quantifier Elimination" (Armin Größlinger, Sept. 2003)
Describes how quantifier elimination in the real numbers can be used to introduce non-linear parameters in the polyhedron model and gives some examples of algorithms for this generalized model, e.g., a generalized Fourier-Motzkin elimination.
- "Methods for Adaptive Tiling in the Polyhedron Model" (Georg Seidel, September 2004)
When the number of pysical processors is unknown at compile time,
code generation becomes more complex. The thesis describes how these
difficulties could be solved and which degrees of freedoms remain.
The described ideas are then extended to the case that the
space tile size is changed at run time in order to achieve a better
load balance.
- "Automatic Parallelization of Loop Programs for Distributed Memory Architectures" (Martin Griebl, November 2004, habilitation, 800 kByte)
This habilitation thesis touches nearly every parallelization task and
demonstrates the interactions between them, i.e., it describes the
underlaying theory of most parts of LooPo. A focus is in tiling
space and/or time dimensions.
- "Automatic Code Generation for Distributed Memory Architectures in the Polytope Model" (Michael Claßen, Sep. 2005)
This thesis presents a fully automated method for generating
efficient target code within the framework of the LooPo project. The generated communication code
enables the execution on clusters that are based on a distributed memory architecture. In order
to achieve a good message vectorization, the tiling technique is used to increase the granularity
of parallelism in the program. The implementation uses a combination of C/C++ and MPI as target language.
- "Verteilung von allgemeinen Berechnungsinstanzen in automatisch generierten Schleifenprogrammen" (Thomas Wondrack, Nov 2005)
The thesis deals with the presentation, implementation and evaluation of a placement method for computation instances. The important new aspect is that the cost-model may suggest re-computations instead of communiation values.
- "Volume Calculation and Estimation of Parameterized
Integer Polytopes" (Tilmann Rabl, January 2006)
For cost models, we need to know the number points inside polytopes.
This thesis reviews the methods of Clauss and Barvinok for exactly
counting the number of integral points, and it describes a method to
compute the real volume of a polytope as an estimate for the number of
integral points. The latter method contains also (several variants of)
Chernikova's algorithm to compute the dual representation of a
polyhedron. A novum: the method is also applicable for non-linearly
parameterized polytopes.
- "Optimierung der Speicherausnutzung automatisch
parallelisierter Programme" (Benjamin Schulte, March 2007)
This diploma thesis reviews the method of Quillere and Rajopadhye on
"Optimizing Memory Usage in the Polyhedral Model", which describes memory
reduction for functional programs. Schulte extends the theory so that it
is applicable to parallel, imperative loop programs and provides an
implementation.
- "Code Generation in the Polytope Model with Non-linear Parameters" (Philipp Claßen, April 2007)
This bachelor thesis generalizes Quillere's algorithm for
target code generation to non-linear parameters.
- "On Algorithmic and Heuristic Approaches to Integral Problems in the Polyhedron Model with Non-linear Parameters" (Stefan Schuster, June 2007)
This thesis examines possibilities to describe the integral solutions to certain problems occurring in the polyhedron model with non-linear parameters exactly. In particular, the solvability of systems of linear Diophantine equations as they appear in Banerjee's data dependence analysis is studied.
- "Code Optimization in the Polyhedron Model - Improving the Efficiency of Parallel Loop Nests" (Peter Faber, October 2007)
This doctoral thesis introduces common subexpression elimination to automatic loop paralleization with the polyhedron model. Expressions with a common value are identified. One computation of the value is being placed in time and space, and further uses of this value are facilitated via references.
- The Challenges of Non-linear Parameters and Variables in Automatic Loop Parallelisation (Armin Größlinger, December 2009)
This doctoral thesis studies approches to go beyound the restriction to
linear/affine expressions in dependence analysis, transformations and
code generation.
Internal Reports
The following internal documentation was written as part of the LooPo project:
Further Papers
Papers on loop parallelization not directly connected with the LooPo project
are listed in our general
overview
of recent publications.
Martin Griebl