Research StatementMatthew Fluet |
I have a proven publication record, with papers regularly appearing at top-tier programming-languages conferences such as ICFP, POPL, and ESOP, as well as at smaller conferences and workshops. Three of my papers were judged to be amongst the best at their respective conferences and solicited for journal length treatment in the Journal of Functional Programming and Fundamenta Informaticae.
On the basis of my research accomplishments, I have been asked to do substantial service work. I have reviewed papers for the Journal of Functional Programming and for the ACM Transactions on Programming Languages and Systems. I have served (or am serving) on the program committees for MLWRK’06, ICFP’07, DAMP’09, and IFL’09. Finally, I am currently serving a 2-year term as co-organizer of the annual Oregon Programming Languages Summer School and a 3-year term on the ICFP Steering Committee.
The most immediate problem to be overcome when implementing programs that run fast is understanding how to utilize the commodity processors of the (near) future, which will have ten or even a hundred processor cores on a single chip. Together with John Reppy of the University of Chicago, I am leading the Manticore project, an effort to develop new programming models and implementation techniques that will help programmers easily, safely, and efficiently exploit this kind of parallelism. Unlike some other parallel-programming research that addresses only the needs of computationally intensive scientific applications, Manticore aims to address the needs of a wide range of traditional desktop applications. Successfully addressing these needs drives my current research with Manticore to explore the combination of static program analyses, compiler transformations, and dynamic runtime policies. To support an easy and safe programming model, Manticore is founded on the functional-programming paradigm. Functional programming languages have been shown to provide a good semantic base for concurrent and parallel languages and to provide competitive performance. For these reasons, I am confident that Manticore will be an important research project. This confidence is shared by others: John Reppy and I are co-PIs for a $500K/3-year NSF funded grant supporting this research.
The most common violation of safety in a program involves the misuse of memory. For example, reading or writing an object after memory has been freed can lead to arbitrary unsafe behavior. Languages today offer a choice: either fine-grained control (that is potentially unsafe) or guaranteed safety (that is potentially inefficient), but not both. Languages like Java and Scheme use garbage collection to guarantee safety, but revoke the fine-grained control available in languages like C to explicitly allocate and deallocate memory. I have explored the issue of safe and efficient memory management in the context of Cyclone, a dialect of C that provides safe manual memory management. Cyclone uses a static type system to ensure that the memory-management primitives are used safely (e.g., disallowing the reading of a freed object). With Dan Wang (Microsoft Research), I implemented, in Cyclone, a copying garbage collector that manages the memory allocated by a Scheme interpreter. In most garbage-collected languages, the safety guaranteed by the garbage collector assumes the safe execution of the garbage collector itself, which is not otherwise guaranteed. In contrast, as a Cyclone program, our garbage collector was guaranteed to use memory safely. Benchmarks showed that our safe garbage collector compared favorably to other approaches that guarantee safety, both in terms of execution speed and the amount of trusted code. The experimental results were promising, but they were somewhat ill-founded: a formal model that justified the safety of the full set of Cyclone’s memory-management features and their complex interactions had never been constructed, due to sheer complexity. I therefore undertook, as my dissertation work, to prove the safety of this set of features. I showed formally, via type-safety and translation-correctness proofs, that these features can be expressed in a simpler, more expressive, and more uniform target language. This provided a foundational account of the safety of these memory-management features. The techniques developed in my dissertation are broadly applicable to the management of a wide range of resources; indeed, others have built upon my work and implemented programming libraries (in the Haskell language) that ensure that file and database resources, amongst others, are used safely and tracked accurately.
To provide stronger safety guarantees about programs, I explore new ways of using existing language features. Riccardo Pucella (Northeastern University) led a collaboration in which we investigated the use of phantom types. We took this folk-lore programming technique and showed precisely how to use it to enforce program invariants in languages with type systems that appear to be too weak for this purpose. For example, we showed that the Hindley-Milner type system used in Standard ML can be used to statically verify that a toDNF function takes an arbitrary formula and yields a formula in disjunctive normal form.
To make it easier to write programs, I also explore new language features. I led a collaboration with Kevin Donnelly (Boston University) in which we introduced transactional events: a novel concurrency abstraction that combines first-class synchronous-message-passing events with all-or-nothing transactions. We built a proof-of-concept implementation of transactional events and we proved that the implementation correctly executes the high-level semantics. Using transactional events, a programmer can write modular solutions to problems in concurrent programming that can only be solved with existing languages by violating module and abstraction boundaries. A common and important example of such a problem is that of guarded receive: a process receives a message on a channel only if the message satisfies a certain condition. For example, a process that manages an logging mechanism may wish to only receive messages marked as high-priority. Using transactional events, a programmer can add guarded receive to a channel communication by modifying only the receiver’s code. Using existing languages, a programmer must modify the code of all possible senders and receivers in order to establish a cooperative protocol, clearly violating module and abstraction boundaries.
When evaluating the ease of programming and the speed of execution, it is essential that programming-languages research be expressed and validated in the implementation of a programming language. Implementation requires familiarity with infrastructure, and I am one of the principal developers of MLton, widely regarded as one of the best compilers for any functional programming language. MLton a is open-source whole-program optimizing compiler for Standard ML; it is actively used in industry (MathWorks (PolySpace Technologies), Intel Research, SSH Communications Security, Reactive Systems) and academia (Carnegie Mellon University, Massachusetts Institute of Technology, Georgia Institute of Technology, Northeastern University, University of Chicago, Purdue University, Toyota Technological Institute at Chicago), and has served as both a direct research vehicle as well as critical infrastructure for other research projects. For example, CMU’s Twelf Logical Framework project and Princeton’s Foundational Proof-Carrying Code project have found MLton’s performance advantage to be invaluable for tackling large problems.
I am currently leading the MLton project; in addition to general expertise with the infrastructure, I have contributed to the development of novel compiler optimizations and to the design and implementation of new compiler intermediate representations (based on SSA). With Umut Acar (TTI-C) and Ruy Ley-Wild (CMU), I developed a language-based approach to writing and compiling self-adjusting programs and implemented the approach in MLton. Validation of our approach in a full language implementation was an integral contribution of our work. MLton will continue to be a vehicle for significant research. As a mature infrastructure with numerous users, it provides ample opportunity for students of all levels to investigate research topics in language design and implementation.
One of my strengths as a researcher in programming languages is the diversity, from compiler construction to type theory, of my past accomplishments. My future research will continue to pursue the goals of ease of programming, guarantees of safety, and speed of execution, while balancing theory and implementation. For ease of programming, I will explore new language features and new uses of existing features. For guarantees of safety, I will develop expressive static techniques, especially advanced type systems. For speed of execution, I will employ a broad knowledge of compilation techniques in the development of language implementations, especially those suited to future many-core systems. Finally, my familiarity with existing infrastructure through the Manticore and MLton projects will create opportunities for students to investigate exciting problems.