This project uses MATLAB (and its open-source counterpart, Octave) as an example language to study optimizations that might benefit scripting languages, specifically those used for computationally intensive programming.
Perhaps the best-known work on compiling MATLAB is the FALCON project at UIUC. Unfortunately, that work has been inactive for more than 15 years. In the meantime, MATLAB has evolved tremendously, both as a language and in its runtime and library support. MATLAB now has a (rapidly evolving) Just-In-Time (JIT) compiler, which operates on the proprietary byte-code that the MATLAB parser produces. The libraries have been improved, and many have been parallelized. With the Parallel Programming and Distributed Computing Toolboxes, MATLAB now also supports writing parallel programs directly, albeit in a somewhat restricted way.
Even with a byte-code JIT and enhanced libraries, the performance of MATLAB code is no match for that of a well-written Fortran or C program, which indicates that there is room to improve MATLAB code. The ParaM effort is aimed at exploiting some of this potential.
One area that we have identified as critically important is memory optimization. Our studies indicate potential for improving memory behavior at the source level that might be extremely difficult, if not impossible, to exploit at the byte-code or library level. On modern multi-core machines, more and more applications are memory-bound, which makes traditional models of algorithmic complexity of very limited use in estimating performance on real machines. We are currently researching a theoretical model, based on defining reuse distances at the source level, to enable more accurate estimation of the effects of compiler optimizations on modern machines. We will use this research to inform ParaM, the MATLAB / Octave to C compiler that we are developing.
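To make the notion of reuse distance concrete, here is a minimal sketch in Python. It uses the classic trace-based definition (the number of distinct other locations touched between two consecutive accesses to the same location), not the static, source-level formulation that this project studies. The function name `reuse_distances` and the sample trace are hypothetical and chosen only for illustration.

```python
def reuse_distances(trace):
    """For each access in `trace`, return the number of distinct other
    locations touched since the previous access to the same location.
    The first access to a location has no reuse distance (None).

    Illustrative only: this is the dynamic, trace-based definition,
    not the static source-level analysis described above.
    """
    last_seen = {}   # location -> index of its most recent access
    distances = []
    for i, loc in enumerate(trace):
        if loc in last_seen:
            # Distinct locations accessed strictly between the two uses.
            between = set(trace[last_seen[loc] + 1:i])
            distances.append(len(between))
        else:
            distances.append(None)
        last_seen[loc] = i
    return distances


# Hypothetical access trace over three array names.
print(reuse_distances(["a", "b", "c", "a", "b", "b"]))
# prints [None, None, None, 2, 2, 0]
```

A small reuse distance suggests that the data is likely still in cache when it is touched again, whereas a large one suggests a likely cache miss. This is why such a measure can say more about performance on a memory-bound machine than an operation count alone.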
Funded by NSF, award CCF-0811703
Arun Chauhan and Chun-Yu Shei. Static Reuse Distances for Locality-based Optimizations in MATLAB. In Proceedings of the 24th ACM International Conference on Supercomputing (ICS), 2010.
Chun-Yu Shei, Arun Chauhan and Sidney Shaw. Compile-time Disambiguation of MATLAB Types through Concrete Interpretation with Automatic Run-time Fallback. In Proceedings of the 16th annual IEEE International Conference on High Performance Computing (HiPC), 2009.
Daniel McFarlin and Arun Chauhan. Library Function Selection in Compiling Octave. In Proceedings of the Workshop on Performance Optimization for High-Level Languages and Libraries (POHLL), held in conjunction with the 21st IEEE Parallel and Distributed Processing Symposium (IPDPS), March 2007.