Run-time information like the number of times each variable has been referenced can usually be obtained from the dynamic profile of a program. Instead of compiling and running the program once, the program is compiled, run and recompiled using the dynamic information collected during the first run. Obtaining the dynamic information is the job of a profiler and this process is known as dynamic recompilation. Gathering profile information during the first run of the program can potentially slow down the overall compilation of the program even though the cost of recompilation is expected to be amortized over many runs of it. [1]
Minimizing the amount of profile information required can further decrease the compile time. If we don't depend solely on the profile information to make the optimization decisions, it may be possible to considerably cut down on profiling cost by requiring less detailed information. The idea is to use profile information just to steer the compiler in the right direction instead of using the information as the only basis for its decision.
However, even in this scenario, it will not be possible to avoid detailed profiling all the time. We will require a set of programs for which detailed profiling information and the corresponding suitable optimization are known. A new program can now be compared to the programs in this database for similarity. And, if a match is found, it will be optimized in the same way. The similarity metric will be still based on program profile but not as much of it as may have been required to decide the most appropriate optimization for a program in the database. Thus we are reducing our profile information requirement for new programs not present in the database.
Such a technique can be made successful by using a machine learning algorithm to compare the programs, find a suitable match, maintain and expand the existing database. The vast search space makes it intractable to use other methods. Exhaustive search is ruled out. The problem is ideal for applying Artificial Intelligence methods.
As mentioned later, similar work has been done using Decision Tree and Neural Network learning. An important part of my research will be to zero in on a good Machine Learning algorithm for this domain. The appropriate type of training experience will be critical in this selection process. A computer program is said to learn from experience with respect to some class of tasks and a performance measure, if its performance at those tasks improves with experience. In this case, the computer program under study is the compiler, the class of tasks is that of choosing the most optimal transformations for the source code on the way to making the final executable and the performance measure is simply the execution time.