The Paraguin Project is being conducted at the Department of Computer Science at the University of North Carolina at Wilmington. The purpose is to develop an open-source message-passing parallelizing compiler for distributed-memory parallel computer systems. We are using the SUIF Compiler from Stanford University to build our compiler. In fact, the Paraguin compiler is simply a SUIF compiler pass.

People

Faculty and Staff

Students

Introduction

Beowulf-class clusters of computers are becoming more popular every year. These clusters are inexpensive computer systems capable of parallel computation. The philosophy behind Beowulf clusters is that they can be built from low-cost, off-the-shelf components. This allows small companies or departments to build their own high-performance parallel computer systems. Unfortunately, there are not many parallelizing compilers that can generate message-passing code suitable for a Beowulf system.

Most parallelizing compilers produce code that assumes a shared-memory model of the system. Fortunately for users of Beowulf systems, Distributed-Shared Memory (DSM) libraries, such as TreadMarks [1] or SAM [2], exist that simulate a shared-memory model on a distributed-memory system. These DSM packages have alleviated the message-passing and scheduling concerns for both the user and the compiler. There are also compilers that provide a global addressing model to the user, such as Olden [3] or Split-C [4]. However, the abstraction created by the DSM library is a source of inefficiency [5,6]. The paper by Cox et al. [5] showed several examples of regular programs where the compiler-generated message-passing solution outperformed the DSM solution. Although DSMs create a convenient abstraction, they are not necessarily the best option. Therefore, we feel that it is necessary to continue research in compilers that are aware of the distributed-memory model and can generate message-passing code.

Research on distributed-memory systems is less advanced than that on shared-memory systems. There are several reasons for this:

  1. Parallel computation for distributed-memory systems is more complex than for shared-memory. 
  2. DSMs have given users an alternative to using a parallelizing compiler that can generate message-passing code.
  3. The lack of open-source parallelizing compilers has hindered research in automatic parallelization for distributed-memory systems.

If someone wishes to conduct research in this area, there are few open-source compilers from which they can start. In fact, we were unable to find one. As a result, much of the work done in this area, such as static scheduling heuristics, has used simulation or a small number of inputs. The SUIF Compiler [7] from Stanford University is an example of the benefit of having an open-source compiler. Much research has been conducted using SUIF as the base compiler, not only at Stanford but also around the world. This has been very beneficial for advancing compiler technology. Smaller research institutions that do not have the resources of a school like Stanford can now conduct research in compiler technology without making the enormous investment of building the infrastructure.

We are building an open-source message-passing automatic parallelizing compiler based on the SUIF compiler. It is our intent that, by making this an open-source compiler, we will stimulate interest as well as collaboration in research for automatic parallelizing compilers for distributed-memory systems.

Downloads

References

  1. C. Amza, A. L. Cox, S. Dwarkadas, P. Keleher, H. Lu, R. Rajamony, W. Yu, and W. Zwaenepoel, "TreadMarks: Shared memory computing on networks of workstations," IEEE Computer, vol. 29, no. 2, Feb. 1996, pp. 18-28.
  2. D. J. Scales and M. S. Lam, "An efficient shared memory layer for distributed memory machines," Computer Systems Laboratory Technical Report CSL-TR-94-627, Stanford University, 1994.
  3. M. C. Carlisle and A. Rogers, "Software caching and computation migration in Olden," in Proc. of the Fifth ACM SIGPLAN Symp. on Principles & Practice of Parallel Programming, Santa Barbara, Calif., July 1995, pp. 29-38.
  4. Split-C, The Computer Science Division, University of California, Berkeley, http://www.cs.berkeley.edu/projects/parallel/castle/split-c/ .
  5. A. L. Cox, S. Dwarkadas, H. Lu, and W. Zwaenepoel, "Evaluating the performance of software distributed shared memory as a target for parallelizing compilers," in Proc. of the 11th International Parallel Processing Symposium, Geneva, Switzerland, Apr. 1997, pp. 475-482.
  6. P. J. Keleher, "Update protocols and cluster-based shared memory," Computer Communications, vol. 22, no. 11, July 1999, pp. 1045-1055.
  7. M. W. Hall, J. M. Anderson, S. P. Amarasinghe, B. R. Murphy, S.-W. Liao, E. Bugnion, and M. S. Lam, "Maximizing multiprocessor performance with the SUIF compiler," IEEE Computer, vol. 29, no. 12, Dec. 1996, pp. 84-89.




This page was last updated: October 5, 2006

 
