Previous S-REPLS meetings

Fifth meeting

The fifth S-REPLS meeting was held on Thursday 12th January 2017 at the University of Oxford.  The invited speaker was Jan Hoffman, with volunteered talks from Benjamin Kaminski, Michael Arntzenius, John Wickerson, David Sherratt, Dan Ghica, Bradley Hardy, and Neel Krishnaswami.  Around 60 people attended with representation from most of the major universities in the South East and many local companies.

The fifth S-REPLS meeting was organised by Ohad Kammar, Sam Staton, and Jeremy Gibbons, with money kindly donated by the University of Oxford.

Jan Hoffman: Resource Aware ML

In this talk, I will describe the design and implementation of Resource Aware ML (RAML), a functional programming language that automatically derives worst-case resource bounds at compile time.

RAML is integrated in the OCaml compiler and automatically derives resource bounds for higher-order polymorphic programs with user-defined algebraic types. The analysis is parametric in the resource and can derive bounds for time, memory allocations, and energy usage. The derived bounds are multivariate resource polynomials which are functions of different size parameters of algebraic data structures. Bound inference is fully automatic and reduced to a linear optimization problem that is passed to an off-the-shelf LP solver.

I will first introduce type-based amortized resource analysis, the analysis technique that RAML is based on. Starting from simple examples and linear resource usage, I will move to polynomial and multivariate bounds, and highlight challenges such as higher-order functions and user-defined types. I will then outline the design of the resource-aware type system, the formal soundness proof with respect to an operational cost semantics, and the automatic type inference using LP solving. Finally, I will demonstrate the implementation of Resource Aware ML and the integration in the OCaml compiler using illustrative examples that highlight strengths and limitations of the technique.

Jan Hoffmann is a Tenure-Track Assistant Professor of Computer Science at Carnegie Mellon University. He received his PhD in 2011 from LMU Munich under the direction of Martin Hofmann. His research interests are in the intersection of programming languages and formal methods with a focus on quantitative properties of software. He is known for his work on automatic static cost analysis of programs and the design and implementation of Resource Aware ML. His current research is funded by NSF, DARPA-STAC, Schmidt Sciences Grant, and a Google Research Award. It includes projects such as automatic resource regression analysis and finding space/time vulnerabilities in Java bytecode. In the past, Hoffmann has been supported by scholarships of the German Research Foundation (DFG) and the German National Academic Foundation (Studienstiftung).

Benjamin Kaminski: Reasoning about expected run-times of probabilistic programs

We present a weakest-precondition-style calculus for reasoning about run-times of deterministic programs and expected run-times of probabilistic programs. Run-times of loops are defined in terms of least fixed points of continuous functions acting on an appropriate run-time domain. We present several proof rules for upper- and lower-bounding the (expected) run-time of loops. In particular, we present a very simple, yet complete proof rule for upper bounds based on just a single run-time invariant. For lower bounds, however, we will discuss how the analogon of the single-invariant-rule is sound for deterministic programs, but unsound for probabilistic ones.

Michael Arntzenius: Finding fixed-points faster

Datalog is a forward-chaining logic programming language; a Datalog program may be run by repeatedly applying its inference rules until nothing further is deducible. This computes the fixed point of a function (applying the inference rules) over finite sets (of deduced facts).

Datafun is a higher-order functional skin over a generalized version of Datalog’s semantics. It generalizes Datalog in two ways: 1. it permits functional abstraction; 2. it can take fixed-points of functions over other semilattices, not just finite sets.

Datalog has an extensive optimization literature. I and my advisor Neel Krishnaswami aim to generalize Datalog optimizations and apply them to Datafun. I will present our work in progress, which adapts Cai et al’s work on incremental lambda-calculus to provide an analogue of Datalog’s semi-naive evaluation, which avoids unnecessary re-computation when iterating towards a fixed-point.

John Wickerson: Hardware synthesis of weakly consistent C concurrency

Since 2011, the C language has provided ‘’weak atomics’’ – instructions that allow the implementation of lock-free algorithms with racy memory accesses, but which don’t guarantee that all threads have a consistent view of shared memory. Much work has been done to study the semantics of weak atomics, and how weak atomics can be implemented, both correctly and efficiently, on conventional multiprocessors (x86, PowerPC, ARM, …) and on conventional graphics processors (NVIDIA, AMD, …).

This talk reports on the first attempt to implement C’s weak atomics on an unconventional computing device: the field-programmable gate array (FPGA). FPGAs do not process instructions like conventional processors; rather, they are configured to implement a particular function in hardware. This configuring can be done using a ‘high-level synthesis’ tool, given a C program as a specification.

We implement weak atomics on FPGAs by adjusting the ‘scheduling’ algorithm that maps C instructions to hardware clock cycles. The scheduling algorithm seeks to satisfy a set of constraints on the ordering of each thread’s instructions. We show that strong atomics necessitate many scheduling constraints, and that the weaker acquire/release/relaxed atomics can be obtained by selectively removing some constraints.

We evaluate the performance of our implementation by synthesising hardware from a C implementation of a fine-grained circular buffer. We show that using strong atomics instead of conventional locks yields a 2.5x average performance improvement, and that using weak atomics instead of strong atomics yields a further 1.5x average whole-application performance improvement.

This is joint work with Nadesh Ramanathan, Shane T. Fleming, and George A. Constantinides, all from the Electrical and Electronic Engineering Dept. at Imperial College London. It has been accepted to appear as a full paper at the ACM/SIGDA International Symposium on FPGAs in February.

David Sherratt: Towards an Atomic Abstract Machine

The familiar call-by-need evaluation strategy (as used in Haskell compilers), also known as lazy evaluation shows best performance when terms are duplicated. When we have more control over the duplication of the term, such that we can choose which segments of the term to duplicate, then we can implement full laziness. Full laziness means we duplicate only what is necessary, and the maximal free subexpressions (the largest subterms such that, if they contain a bound variable x, they also contain its binder λx) remain.

Haskell compilers can perform full laziness at compile time. However, during runtime the implementation becomes too expensive. The idea behind this project is to further study the process of implementing a fully lazy evaluation strategy.

The purpose of sharing in the λ-calculus is to have better control over duplication of terms. Sharing is the use of a single representation for multiple instances of a common subterm. During evaluation, instead of duplicating a term, we can share it. This allows us to evaluate all the copies of the subterm simultaneously, by evaluating their shared representation [6].

In the atomic λ-calculus [2], we implement two types of closures. The usual notion of closure is a term with an associated sharing implemented as a let-construct. The second notion of closure introduces the distributor construct, which allows for duplication of a term atomically. The distributor uses a more restricted notion of unsharing similar to Lamping’s sharing graphs [4] which performs optimal reductions as defined in [5]. This is exactly what allows this calculus to perform fully lazy evaluation.

An abstract machine is a theoretical implementation of a strategy for a language on an abstract notion of computer architecture, made from directly implementable constructs. An abstract machine for evaluating atomic λ-terms would be a fully lazy abstract machine. For my PhD I aim to develop an atomic abstract machine based off the atomic λ-calculus. To achieve this, we make all the reduction rules in the calculus local in the sense of [3]. To achieve this, we construct a new calculus based on the atomic λ-calculus that makes the variable scopes explicit while maintaining fully lazy sharing.

This directed atomic λ-calculus uses a variant of director strings [1] to make the scopes of terms explicit. When performing substitution, instead of traversing the complete term, we instead only follow the path of the variable we want to replace which is highlighted through the annotations. An application can indicate the location of a variable with a 0 (in the left hand side) or with a 1 (in the right hand side), and similar for sharing. The scoping information of terms can then be used so that all the reduction rules can be considered local. One can then start work on an abstract machine for this calculus, that performs fully lazy λ-evaluation.

[1] Maribel Fernandez, Ian Mackie, and Francois-Regis Sinot. Lambda-calculus with director strings. Applicable Algebra in Engineering, Communication and Computing. 2005.
[2] Tom Gundersen, Willem Heijltjes, and Michel Parigot. Atomic lambda calculus: A typed lambda-calculus with explicit sharing. Proceedings of the 2013 28th Annual ACM/IEEE Sym- posium on Logic in Computer Science. IEEE Computer Society, 2013.
[3] Yves Lafont. Interaction nets. Proceedings of the 17th ACM SIGPLAN-SIGACT symposium on Principles of programming languages. 1989.
[4] John Lamping. An algorithm for optimal lambda calculus reduction.Proceedings of the 17th ACM SIGPLAN-SIGACT symposium on Principles of programming languages. 1989.
[5] Jean-Jacques Levy. Optimal reductions in the lambda-calculus. To HB Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism. 1980.
[6] Christopher Peter Wadsworth. Semantics and Pragmatics of the Lambda-Calculus. Diss. University of Oxford, 1971.

Dan Ghica: Unifying structural and nominal syntax in the language of diagrams

The correspondence between various monoidal categories and graphical languages of diagrams has been studied extensively, leading to applications in quantum computing and communication, systems theory, circuit design and more. From the categorical perspectives, diagrams can be specified using (name-free) combinators which enjoy elegant equational properties. However, established notations for diagrammatic structures, such as hardware description languages (VHDL, extsc{Verilog}) or graph languages ( extsc{Dot}), use a totally different style, which is flat, relational, and reliant on extensive use of names (labels). Such languages are not known to enjoy nice syntactic equational properties. However, since they make it relatively easy to specify (and modify) arbitrary diagrammatic structures they are much more popular than the combinator-style specifications. This parallels the situation in programming languages, where combinator-based languages (e.g. APL) proved to be far less popular than languages with variables. In this paper we show how the two approaches to diagram syntax can be reconciled and unified in a way that does not change the original categorical semantics and the existing (categorical) equational theory. We also show how the two notations are semantically complementary and we give sound and complete equational theories.

Bradley Hardy: Improving Agda’s equational reasoning with reflection

Agda is a Haskell-style programming language, whose dependent type system allows it to function as an elegant theorem proving language. But compared to similar languages like Coq, Agda lacks a built-in tactic language. Instead, users are encouraged to write libraries to ease reasoning in Agda itself. As an example of this, equational reasoning combinators are often used to lay out proofs by transitivity. These combinators allow the Agda code to resemble standard pen-and-paper equational reasoning proofs. But where a pen-and-paper proof might simply invoke, say, ‘commutativity’ between steps, in Agda we must specify the exact proof that takes us from one step to the next. This can often be very tedious.

In this talk, I will present a new library that eliminates one of the more annoying parts of this process: invoking congruence, which allows us to provide a lambda expression to locate a subterm to rewrite. However, these lambda expressions often amount to duplicating a lot of code. They are both tedious and error-prone to write, and ugly to read. My library uses Agda’s reflection API to automate this. Instead of providing a lambda expression, the user simply annotates the lefthand expression to specify where to rewrite. This improves both ease of writing Agda code, and the readability of the resulting proofs.

Neel Krishnaswami: Can ML be stopped?

Higher-order imperative programs are well-known as a source of small-but-ferociously-difficult-to-verify-programs. Languages like ML permit the subtle interplay of features like generative name creation, pointers to code, and type abstraction. Each of these features requires sophisticated mathematics to model, and their combination yields models of daunting complexity.

In recent years, the standard technique for analysing these programs have been based on step-indexed logical relations. However, while step-indexing is well-suited to proving safety properties, it is less able to establish liveness properties such as termination.

In this talk, I will describe some in-progress research on how to prove termination of higher-order imperative programs by combining ideas from process algebra, separation logic, and logical relations.


Fourth meeting

The fourth S-REPLS meeting was held on Thursday 27th September 2016 at Imperial College, London.  The invited speaker was Christoph Dubach, with volunteered talks from John Wickerson, Justyna Petke, Koko Muroya, Alastair Reid, Edd Barrett, Jonny Shipton, Ohad Kammar, and Bob Kowalski.  Around 50 people attended with representation from most of the major universities in the South East and many local companies.

The fourth S-REPLS meeting was organised by Alastair Donaldson and Luis Pina with money kindly donated by Imperial College and GCHQ.

Christophe Dubach: Lift, a functional approach to generating high-performance GPU code using rewrite rules

Graphic processors (GPUs) are the cornerstone of modern heterogeneous systems. GPUs exhibit tremendous computational power but are notoriously hard to program. High-level programming languages have been proposed to address this issue. However, they often rely on complex analysis in the compiler and device-specific implementations to achieve maximum performance. This means that compilers and software implementations need to be re-tuned for every new device.

In this talk, I will present Lift, a novel high-level data-parallel programming model. The language is based on a surprisingly small set of functional primitives which can be combined to define higher-level algorithmic patterns. A system of rewrite-rules is used to derive device-specific optimised low-level implementations of the algorithmic patterns. The rules encode both algorithmic choices and low-level optimisations in a unified system and let the compiler explore the optimisation space automatically. Preliminary results show this approach produces GPU code that matches the performance of highly tuned implementations of several computational kernels including linear algebra operations.

John Wickerson: Automatically comparing memory consistency models

A memory consistency model (MCM) is the part of a programming language or computer architecture specification that defines which values can legally be read when a thread in a concurrent program reads from a shared memory location. Because MCMs have to take into account various optimisations employed by modern architectures (such as store buffering and instruction reordering) and compilers (such as constant propagation), they often end up being complex and counterintuitive, which makes them challenging to design and to understand.

In this work, we identify four important tasks involved in designing and understanding MCMs: generating conformance tests, distinguishing two MCMs, checking compiler optimisations, and checking compiler mappings. We show that all four tasks can be cast as instances of a general constraint-satisfaction problem to which the solution is either a program or a pair of programs. We further show that although these constraints aren’t tractable for automatic solvers when phrased over programs directly, we can solve analogous constraints over program executions, and then reconstruct programs that satisfy the original constraints.

We illustrate our technique, which is implemented in the Alloy modelling framework, on a range of software- and architecture-level MCMs, both axiomatically and operationally defined. First, we automatically recreate – often in a simpler form – several known results, including: distinctions between several variants of the C/C++ MCM; a failure of the ‘SC-DRF guarantee’ in an early C++ draft; that x86 is ‘multi-copy atomic’ and Power is not; bugs in common C/C++ compiler optimisations; and bugs in compiler mappings from OpenCL to AMD-style and NVIDIA GPUs. Second, we use our technique to develop and validate a stronger MCM for NVIDIA GPUs that supports a natural mapping from OpenCL.

This is joint work with Mark Batty (U Kent), Tyler Sorensen (Imperial) and George A. Constantinides (Imperial).

Justyna Petke: Revolutionising the process of software development

Nowadays a lot of programmers’ effort is spent on testing and optimising software. If you take multiple non-functional properties into account, such as energy and memory consumption, finding the best compromise between such optimisation objectives becomes not only time consuming but also extremely difficult even for the most skilful programmers. The need for automated techniques for improvement of existing software led to the development of the new field of research called genetic improvement (GI). This paradigm has already been successfully used to optimise software efficiency, for instance, it led to a 70x speed-up of a particular bio-informatics system. Our group has also used genetic improvement techniques to reduce energy consumption and in award-winning work on software specialisation and software transplantation. In particular, we improved the efficiency of MiniSAT, a popular Boolean satisfiability solver, by specialising it for a particular set of benchmarks using GI. Furthermore, we autotransplanted a video encoding functionality from one system to the popular VLC media player within 26 hours of computation time compared to 20 days of elapsed time taken by human programmers.

Koko Muroya: Dynamic GoI machines

Girard’s Geometry of Interaction is known to give an interpretation of lambda-calculus as an abstract machine in which a token computes a value by travelling around a proof net. Danos and Regnier prove that their Interaction Abstract Machine can correspond to the Krivine abstract machine and hence implement the call-by-name evaluation strategy. Since then it has been a challenge to implement the call-by-value evaluation strategy by means of GoI-based abstract machines. Difficulties lie on both the order and cost of evaluation. A GoI-based abstract machine naturally postpones, repeats, or even neglect the evaluation of function arguments. Some previous attempts include: use of the CPS transformation by Hoshino et al. and Schoepp, and use of parallelism by Dal Lago et al. They prevent postponing and neglecting the evaluation of function arguments, and therefore achieve adequacy for the call-by-value evaluation strategy. However re-evaluation of function arguments has not been fully ruled out.

In this progress report we first introduce a variant of GoI-based abstract machine that dynamically rewrites an underlying proof net during its execution. Each rewrite is triggered by a token, i.e. a redex is detected using a path the token followed so far. This abstract machine just excludes re-evaluation of function arguments, and therefore would implement the call-by-need evaluation strategy. We also mention our attempt to implement the call-by-value evaluation strategy, in which our abstract machine is not any more allowed to postpone, repeat nor neglect the evaluation of a function argument. This attempt will lead to a GoI-based abstract machine that has a great opportunity to safely accommodate computational effects.

Alastair Reid: Trusting large specifications: the virtuous cycle

The last decade has seen impressive steps towards creating fully verified software systems with formally verified applications, crypto libraries, compilers, operating systems. The foundation on which we are building these towers of verified software is the processor specification that forms the contract between the hardware designer and the programmer.

But can we trust these processor specifications on which we build these soaring towers?

The approach we have taken in creating specifications of ARM processors is to make the specifications useful for many different purposes by different teams. This leads to a virtuous cycle where any bug fixed by one team benefits all users. I will illustrate this by talking about the diversity of uses of the specification: from formal verification of hardware to checking for security holes in new architecture extensions.

Edd Barrett: Virtual Machine warm-up blows hot and cold

Virtual Machines (VMs) with Just-In-Time (JIT) compilers are traditionally thought to execute programs in two phases: first the “warmup” phase determines which parts of a program would most benefit from dynamic compilation; after compilation has occurred the program is said to be at “peak performance”. When measuring the performance of JIT compiling VMs, data collected during the warmup phase is generally discarded, placing the focus on peak performance. In this talk we run a number of small, deterministic benchmarks on a variety of well known VMs. In our experiment, less than one quarter of the benchmark/VM pairs conform to the traditional notion of warmup, and none of the VMs we tested consistently warms up in the traditional notion. This raises a number of questions about VM benchmarking, which are of interest to both VM authors and end users.

Jonny Shipton: Using C# for high-performance network programming

Achieving high performance in network programming can be a rough task, made worse by the difficulty of debugging hardware. What if there was a way to keep all your cosy software tooling and language features without sacrificing performance? In this talk I will explore using a general purpose language like C# to target the more limited semantics of hardware programming, and present my experience of writing a high performance NAT in C# with the tools available today.

Ohad Kammar: On the expressive power of Algebraic Effects and Handlers

Monads are the de-facto mechanism for adding user-defined computational effects to programming languages. Support for monads has been developed in general-purpose statically-typed functional programming languages such as Haskell and OCaml, proof assistants such as Coq, and other verification tools such as F-star. Algebraic effects and handlers are a recently developed programming abstraction that allows programmers to define and program with custom effects using a specific delimited-control structure. This control structure has straightforward type-system, computational intuition, operational and denotational semantics, and reasoning principles.

Our goal is to compare the expressive power of programming constructs facilitating user-defined effects. We consider a calculus of effect handlers, Filinski’s calculus of monadic reflection, which encapsulate user-defined monads as a programmable abstraction, and a calculus of the standard delimited control operators shift-zero and reset. We compare the calculi using Felleisen’s notion of macro translations. When we ignore the natural type systems associated with each calculus, each pair of abstractions is macro inter-translateable, but this translation does not preserve typability. We present preliminary results regarding the non-existence of typability-preserving macro-translations, and extensions to the type system that preserve typability.

Joint work with Yannick Forster, Sam Lindley, and Matija Pretnar.

Bob Kowalski: LPS (Logic Production System), a logic and computer language for programming, databases, and AI knowledge representation

The logic and computer language LPS (Logic-based Production System) combines the characteristics of an imperative programming language with those of a declarative database and knowledge representation language. It is the result of over a decade of research led by Bob Kowalski and Fariba Sadri at Imperial College London.

The kernel of LPS consists of a database, together with logical rules of the form if antecedent then consequent. The database is a model-theoretic structure, which changes destructively as a result of external events and actions. Computation consists in making the logical rules true in the model-theoretic structure that is the history of database states, actions and events, by performing actions to make consequents true whenever antecedents becomes true. In addition, LPS also includes logic programs both to recognise when antecedents become true and to generate actions to make consequents true.

The slides at give a flavour of the language. Additional material about the kernel of LPS is available on the RuleML wiki. More technical papers can be found on Bob Kowalski’s home page, which also includes a copy of his 2011 book, Computational Logic and Human Thinking, which presents the philosophical background of LPS.

Third meeting

The third S-REPLS meeting was held on Thursday 21st April 2016 at the University of Kent at Canterbury.  The invited speaker was Derek Dreyer, with volunteered talks from Jeremy Gibbons, Stephen Dolan, Matthew Pickering, Kathryn Gray, Andrew Kennedy, Luis Pina, and Ilya Sergey.  Around 40 people attended with representation from most of the major universities in the South East and many local companies.

The third S-REPLS meeting was organised by Scott Owens and Mark Batty with money kindly donated by the University of Kent.

Abstracts of the talks are below.

Derek Dreyer: RustBelt: Logical Foundations for the Future of Safe Systems Programming

Rust is a new language developed at Mozilla Research that marries together the low-level flexibility of modern C++ with a strong “ownership-based” type system guaranteeing type safety, memory safety, and data race freedom. As such, Rust has the potential to revolutionize systems programming, making it possible to build software systems that are safe by construction, without having to give up low-level control over performance. Unfortunately, none of Rust’s safety claims have been formally investigated, and it is not at all clear that they hold. To rule out data races and other common programming errors, Rust’s core type system prohibits the aliasing of mutable state, but this is too restrictive for implementing some low-level data structures. Consequently, Rust’s standard libraries make widespread internal use of “unsafe” blocks, which enable them to opt out of the type system when necessary. The hope is that such “unsafe” code is properly encapsulated, so that Rust’s language-level safety guarantees are preserved. But due to Rust’s bleeding-edge type system, along with its reliance on a weak memory model of concurrency, verifying that Rust and its libraries are actually safe will require fundamental advances to the state of the art. In the RustBelt project, funded by an ERC Consolidator Grant, we aim to build the first formal tools for verifying safety of Rust. To achieve this goal, we will build on recent breakthrough developments in the area of concurrent separation logic, which I will explain and motivate in the talk starting from first principles.

Jeremy Gibbons: Free Delivery

Remote procedure calls are computationally expensive, because network round-trips take several orders of magnitude longer than local interactions. One common technique for amortizing this cost is to batch together multiple independent requests into one compound request. Batching requests amounts to serializing a small program in order to transmit it and run it remotely. The standard representation for serialized programs is to use free monads; we show that free applicative functors are actually a better choice of representation for the problem.


Stephen Dolan: Principal Type Inference with Subtyping

The Hindley-Milner type inference algorithm is popular and practical. However, since it is based on solving equations between types it has proven difficult to extend with subtyping, where one must deal with inequalities rather than equations. In this talk, I demonstrate a simple language with subtyping, and show how to infer principal types for it. The syntax, semantics and even typing rules of this language are exactly those of ML, augmented with the standard subtyping rule. The novelty appears in the definition of the set of types themselves: by adopting a more algebraic and less syntactic approach to constructing the lattice of types, we can build in enough structure that an analogue of unification can be used to infer types.


Matthew Pickering: Profunctor Optics

Different optics allow programmers to succinctly access and modify different parts of a data structure. The most well-known example of an optic is a lens which is characterised by a getter and a setter. Recently, programming with optics has become increasingly popular in the Haskell community due to the `lens` library. This is in large due to the so-called van Laarhoven representation which makes composing different parts of the optics hierarchy extremely easy. In this talk, I will explain the optics hierachy, why the van Laarhoven representation has proved so successful and also about profunctor optics which are a generalisation of van Laarhoven style optics.


Kathryn Gray: Sail: A DSL for Rigorous Architectural Specifications

Processor architectures are critical interfaces in computing, but they are typically defined only by prose and pseudocode documentation. This is especially problematic for the subtle concurrency behaviour of weakly consistent multiprocessors such as ARM and IBM POWER: the traditional documentation does not define precisely what programmer-observable behaviour is (and is not) allowed for concurrent code; it is not executable as a test oracle for pre-Silicon or post-Silicon hardware testing; it is not executable as an emulator for software testing; and it is not mathematically rigorous enough to serve as a foundation for software verification. In order to rigorously specify architectures, the instructions, pipelines, and memory controllers must be specified using formally defined languages and and tools. In this talk, I will present my domain specific language, Sail, for specifying architecture instructions sets. The resulting specifications form one part of rigorous architectural models for POWER and ARM. I will present the interesting aspects of Sail’s dependent type system, which is a light-weight system balancing the benefits of static bounds and effects checking with the usability of the language for engineers. I will also present the interface between the instruction sets and concurrent models of the pipeline and memory systems, whose requirements drove the development of our new language Sail.

Andrew Kennedy: The Hack programming language: Types for PHP

Facebook’s web frontend is implemented in PHP, a dynamically typed language with more strange and error-prone features even than another well-known dynamically typed language in widespread use. Hack is Facebook’s answer to the legacy problem of millions of lines of PHP: a new language, based on PHP, with rich static types, async, typed XML literals, and other features. New code is written in Hack; old PHP code is gradually becoming Hackified. My own interest is in Hack’s type system, and this will be the focus of my talk.


Luis Pina: Deploying Dynamic Analyses and Preventing Compiler Backdoors with Multi-Version Execution

Bug-finding tools based on dynamic analysis (DA), such as Valgrind or the compiler sanitizers provided by Clang and GCC, have become ubiquitous during software development. These analyses are precise but incur a large performance overhead (often several times slower than native execution), which makes them prohibitively expensive to use in production. Multi-version execution allows multiple program versions to run concurrently as long as they behave in the same way (i.e., issue the same sequence of system calls). In this talk, I will present early work on the exciting possibility of deploying expensive dynamic analyses in production code using multi-version execution. I will also describe how multi-version execution can be used to detect compiler backdoors in security sensitive programs.


Ilya Sergey: Hoare-style Specifications as Correctness Conditions for Non-linearizable Concurrent Objects

Designing efficient concurrent objects often requires abandoning the standard specification technique of linearizability in favor of more relaxed correctness conditions. However, the variety of alternatives makes it difficult to choose which condition to employ, and how to compose them when using objects specified by different conditions. In this work, we propose a uniform alternative in the form of Hoare logic, which can explicitly capture-in the auxiliary state-the interference of environment threads. We demonstrate the expressiveness of our method by verifying a number of concurrent objects and their clients, which have so far been specified only by non-standard conditions of concurrency-aware linearizability, quiescent, and quantitative quiescent consistency.


Second meeting

The second S-REPLS meeting was held on Friday 20th November at Middlesex University, London.  The invited speakers were Nick Benton, Phil Wadler, Paul Kelly and Sam Staton, with volunteered talks from Richard Bornat and John Wickerson.  Around 60 people attended with representation from most of the major universities in the South East and many local companies.

The second S-REPLS meeting was organised by Andrei Popescu, Jaap Boender and Raja Nagarajan using money kindly donated by OCaml Labs and Middlesex University.

Abstracts and videos of the talks are below.

Nick Benton: What we talk about when we talk about types

Types are a central topic in programming language research, as well as being a major trigger of heated and unproductive language advocacy arguments on the internet. Yet, even amongst expert types researchers, there is a surprising diversity of opinion about just what types are: what they mean, what they’re for, and what properties a type system should have. This talk will explore the various positions, including, of course, presenting the correct answers to these questions.

Video (part one, part two).

Phil Wadler: Everything old is new again: Quoted DSLs

Fashions come, go, return. We describe a new approach to domain specific languages, called QDSL, that resurrects two old ideas: quoted terms for domain specific languages, from McCarthy’s Lisp of 1960, and the subformula property, from Gentzen’s natural deduction of 1935. Quoted terms allow the domain specific language to share the syntax and type system of the host language. Normalising quoted terms ensures the subformula property, which provides strong guarantees, e.g., that one can use higher-order or nested code in the source while guaranteeing first-order or flat code in the target, or using types guide loop fusion. We give three examples of QDSL: QFeldspar (a variant of Feldsar), P-LINQ for F#, and some uses of Scala LMS; and we provide a comparison between QDSL and EDSL (embedded DSL).

Joint work with James Cheney, Sam Lindley, Shayan Najd, and Josef Svenningsson.

Video (part one, part two).

Richard Bornat: Lace, a progrtam logic for weak memory

Weak memory reorders instruction execution and propagation of writes to the store. It’s hard to understand and harder still to program. We have a program logic based on Hoare logic, Owicki-Gries and Jones’s rely/guarantee, which makes programming easier and proof possible.


Paul Kelly: How does domain specificity enable domain-specific performance optimisations?

My group at Imperial has worked on six or seven “domain-specific optimisation” (DSO) projects, mostly in computational science applications. This talk aims to reflect on our experiences, and in particular, to ask what enables us to deliver a DSO? Is it some special semantic property deriving from the domain? Is it because we have a domain-specific language that abstracts from implementation details – enabling the compiler to make choices that would be committed in lower-level code? Is it that the DSL captures large-scale dataflows that are obscured when coded in a conventional general-purpose language? Is it simply that we know that particular optimisations are good for a particular context? The talk will explore this question with reference to our DSO projects in finite-element methods, unstructured meshes, linear algebra and Fourier interpolation.

Video (part one, part two).

Sam Staton: Equations, effects and resources

I will discuss equational theories of programming languages. I’ll focus on first order languages that use some kind of resource. I’ll take illustrations from quantum computation, where the resource is “qubits”. The development is based around an equivalence between first-order programming languages and algebraic theories, as I’ll explain.


John Wickerson: Overhauling SC atomics in C11 and OpenCL

In this talk, I will outline our proposal for overhauling the semantics of SC atomics in C11 and OpenCL. In the C11 case, I will describe a revised model that is shorter, simpler, and demonstrably more efficient to simulate. Our revised model is a little stronger than the current one (i.e., allows fewer program behaviours) but existing compilation schemes (Power, x86) remain sound. I will then describe how we have extended our revised C11 model to obtain the first rigorous formalisation of the OpenCL memory model. (OpenCL is an extension of C for heterogeneous programming across CPUs, GPUs, FPGAs, etc.) In the OpenCL case, we refine the SC rules still further, to give a sensible semantics to SC atomics that employ a ‘memory scope’ to restrict their visibility to specific threads.

This is joint work with Mark Batty and Ally Donaldson, and will be presented at POPL 2016.


Inaugural meeting

The inaugural S-REPLS meeting was held on Thursday 30th April in the Lee Hall at Wolfson College, Cambridge. Our invited speaker was Conor McBride with volunteered talks from Jose Calderon, Alastair Donaldson, Laurence Tratt and Tony Hoare.  Around 70 people attended with representation from most of the major universities in the South East and many local companies.

The first S-REPLS meeting was organised by Ohad Kammar, Dominic Mulligan and Jeremy Yallop using money kindly donated by OCaml Labs.

Abstracts and slides for the talks are below.


Conor McBride: The Dependent Lollipop

The business of combining linear types with dependent types has seen an upsurge in activity, recently. The usual strategy is to write a paper whose title emphasizes the combination of the two about a system which carefully arranges the separation of the two: type dependency is permitted over only the cartesian fragment of the linear data. Whilst that is an effective compromise, it leaves me a little unhappy from a propositions-as types perspective, where we are able to reason about only those things upon which types may depend. In the course of my recent investigations into type systems which manage information flow between “worlds”, I stumbled upon a way to distinguish run-time usage (“consumption”) from usage-in-types (“contemplation”) in such a way that linearity of the former has no bearing on the latter. The upshot is that we gain a dependent lollipop (x : S) -o T[x], where the type of a linear function’s output can depend on what the input used to be, even though the output is constructed by consuming the input. You can contemplate your cake and eat it too.

No slides (slides were handwritten).

Jose Calderon: Improving Implicit Parallelism

Using static analysis techniques compilers for lazy functional languages can identify parts of a program that can be legitimately evaluated in parallel with the main thread of execution. These techniques can produce improvements in the runtime performance of a program, but are limited by the static analyses’ poor prediction of runtime performance. This talk outlines the development of a system that uses iterative profile-directed improvement in addition to well-studied static analysis techniques. This allows us to achieve higher performance than through static analysis alone.


Alastair Donaldson: Many-core Compiler Fuzzing

Parallel programming models for many-core systems, such as the OpenCL programming model, claim to allow the construction of portable many-core software.  Though performance portability may be an elusive goal, functional portability should not be. Functional portability depends on reliable compilers for many-core programming languages.  This presents a real challenge for industry because many-core devices, such as GPUs, are evolving rapidly, as are the associated many-core languages (e.g., a revision of the OpenCL specification appears approximately once every 18 months). Compiler-writers are thus continually playing catch-up.

I will present recent ideas on how to apply random testing (fuzzing) to many-core compilers, in the context of OpenCL.  The aim of this work is to help vendors to improve the quality of their compilers by discovering bugs quickly.  Following up on successful prior works on sequential compiler testing, we have designed two techniques for generating random OpenCL kernels for purposes of compiler testing. The first approach builds on the Csmith project from the University of Utah (PLDI’11).  Here, we generate random OpenCL kernels that are guaranteed to be free from undefined behaviour and to behave deterministically.  For such a kernel, differing results between two OpenCL implementations indicates that one of the implementations has compiled the kernel erroneously.  The second approach builds on the “equivalence modulo inputs” idea from researchers at UC Davis (PLDI’14).  Here we start with an OpenCL kernel and generate mutations from the kernel such that, for a given input, each mutant is guaranteed by construction to compute the same result as the original kernel. In this case, observable differences between mutants for the given input indicate compiler bugs.

I will report on a large testing campaign with respect to 21 OpenCL (device, compiler) configurations.  We found bugs in every configuration that we tested, including in compilers from all the major device vendors, many of which have been fixed in response to our reports.  I will show some examples of the bugs the technique has uncovered, and will discuss open problems in this area.

This is joint work with Christopher Lidbury, Andrei Lascu and Nathan Chong, and is due to appear at PLDI’15.


Laurence Tratt: Fine-grained Language Composition

Although run-time language composition is common, it normally takes the form of a Foreign Function Interface (FFI). While useful, such compositions tend to be coarse-grained and slow. In this talk I will show how rich languages can be composed in a fine-grained manner, allowing users to embed syntactic fragments of different languages inside each other, including referencing variables across languages. We show that sensible cross-language design decisions can be made in the face of different semantics, and that composed programs have a surprisingly low performance overhead.


Tony Hoare: Graphical Models of Concurrency

Familiar examples of graphical models are History Petri Nets, Message Sequence Charts, and Litmus Tests for relaxed memory.  The graphs can be composed according to the programs which generate them.  Composition can be sequential or concurrent, with interaction both by shared memory and  by communication.  Languages are defined as sets of graphs, and (like sets of strings) they obey all the Laws of Regular Algebra, plus a few  similar laws for concurrency.  My motivation is that more familiar forms of semantics (e.g., in the style of Milner or O’Hearn) can be easily deduced from these Laws. However my presentation will convey the ideas and intuitions only pictorially by examples.


2 Responses

Leave a Reply