Abstract:

Parity games are deceptively simple two-player games on directed graphs labeled with numbers. Parity games have important practical applications in formal verification and synthesis, especially for solving the model-checking problem of the modal mu-calculus. They are also interesting from a theoretical perspective, because they are widely believed to admit a polynomial-time solution, but so far no such algorithm is known. In recent years, a number of new algorithms and improvements to existing algorithms have been proposed.

In this talk, we introduce parity games in an accessible way and discuss why they are so interesting. We present various solutions that have been proposed over the years. We also present a comprehensive empirical evaluation of modern parity game algorithms and solvers, both on real-world benchmarks and randomly generated games.
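One classical solution can be sketched compactly. Below is a minimal Python implementation of Zielonka's recursive algorithm; the encoding of games as dictionaries is our own illustrative choice, not from the talk:

```python
def attractor(nodes, succ, owner, target, player):
    """Nodes from which `player` can force the play into `target`."""
    attr = set(target)
    changed = True
    while changed:
        changed = False
        for v in nodes - attr:
            successors = succ[v] & nodes
            if owner[v] == player and successors & attr:
                attr.add(v)            # player can pick an edge into attr
                changed = True
            elif owner[v] != player and successors and successors <= attr:
                attr.add(v)            # opponent cannot avoid attr
                changed = True
    return attr

def zielonka(nodes, succ, owner, priority):
    """Return (W0, W1): the sets of nodes won by player 0 and player 1."""
    if not nodes:
        return set(), set()
    d = max(priority[v] for v in nodes)
    p = d % 2                          # the player who likes priority d
    top = {v for v in nodes if priority[v] == d}
    a = attractor(nodes, succ, owner, top, p)
    w = list(zielonka(nodes - a, succ, owner, priority))
    if not w[1 - p]:
        w[p] = set(nodes)              # player p wins everywhere
        w[1 - p] = set()
    else:
        b = attractor(nodes, succ, owner, w[1 - p], 1 - p)
        w = list(zielonka(nodes - b, succ, owner, priority))
        w[1 - p] |= b
    return w[0], w[1]
```

For instance, if node u (owned by player 0, priority 1) has a single edge to a self-looping node v with priority 2, player 0 wins both nodes by forcing the play to cycle on the even priority.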

Abstract:

Symbolic execution (and its recent variant, dynamic symbolic execution) is an important technique in automated testing. Instead of analysing only concrete executions of a program, one treats executions symbolically (i.e. with some variables not bound to concrete values) and uses constraint solvers to determine the feasibility of such symbolic paths, so as to guide the path exploration of the system under test; in combination with dynamic analysis, this gives the best possible path coverage. For string-manipulating programs, solvers need to be able to handle constraints over the string domain. This gives rise to the following natural question: what is an ideal decidable logic over strings for reasoning about path feasibility in a program with strings? This question is connected to a number of difficult results in theoretical computer science (decidability of the theory of strings with concatenation, a.k.a. word equations) and long-standing open problems (e.g. decidability of word equations with length constraints). Worse yet, recent examples from cross-site scripting vulnerabilities suggest that more powerful string operations (e.g. finite-state transducers) might be required as first-class citizens in string constraints. Even though putting all these string operations in a string logic leads to undecidability, recent results show that there might be a way to recover decidability while retaining expressivity for applications in symbolic execution. In this talk, I will present one such result from my POPL’16 paper (with P. Barcelo). The string logic admits concatenation, regular constraints, finite-state transductions, and letter-counting and length constraints (which can consequently express the charAt operator and string disequality). I will provide a number of examples from the cross-site scripting literature that show how a string logic can, for the first time, be used to discover bugs in, or prove the correctness of, such programs. I will conclude by commenting on a new decision procedure for the logic that leads to an efficient implementation (POPL’18 with L. Holik, P. Janku, P. Ruemmer, and T. Vojnar) and a recent attempt to incorporate the fully-fledged replace-all operator into a string logic (POPL’18 with T. Chen, Y. Chen, M. Hague, and Z. Wu).
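To make the notion of path feasibility concrete, here is a toy sketch (ours, not the paper's decision procedure): a brute-force check of whether a path condition mixing concatenation, a regular constraint, and a length constraint has a solution over a small alphabet. A real solver reasons symbolically instead of enumerating.

```python
import re
from itertools import product

def feasible(path_condition, alphabet="ab", max_len=4):
    """Enumerate short strings and return a witness satisfying the
    path condition, or None if no short witness exists."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            x = "".join(chars)
            if path_condition(x):
                return x
    return None

# Path condition: the concatenation x . "b" must match (ab)+ and |x| = 3.
cond = lambda x: re.fullmatch(r"(ab)+", x + "b") is not None and len(x) == 3
```

Here `feasible(cond)` finds the witness "aba", showing the path is feasible.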

Abstract:

Buggy and insecure software can have serious consequences, including the loss of human lives, financial losses, and the leakage of confidential information, to name a few. Program analysis is the field concerned with analysing the behaviour of programs, especially with respect to correctness. Over the years, computational logic has played an important role in program analysis, particularly in the development of precise automatic methods for verifying the correctness and optimising the performance of programs. In this talk, I will illustrate how logic can help program analysis, drawing examples from my own research inspired by challenges in web security (e.g. how to detect/prevent cross-site scripting vulnerabilities), web performance optimisation (e.g. how to remove code redundancies in web pages), and the verification of distributed protocols. A theme that will emerge during the talk is that there is often a tight connection between logic and automata that can be exploited when designing (often theoretically optimal) algorithms.

Abstract:

The rapid growth in the size and scope of datasets in science and technology has created a need for novel foundational perspectives on data analysis that blend the inferential and computational sciences. That classical perspectives from these fields are not adequate to address emerging problems in Data Science is apparent from their sharply divergent nature at an elementary level—in computer science, the growth of the number of data points is a source of “complexity” that must be tamed via algorithms or hardware, whereas in statistics, the growth of the number of data points is a source of “simplicity” in that inferences are generally stronger and asymptotic results can be invoked. On a formal level, the gap is made evident by the lack of a role for computational concepts such as “runtime” in core statistical theory and the lack of a role for statistical concepts such as “risk” in core computational theory. I present several research vignettes aimed at bridging computation and statistics, including the problem of inference under privacy and communication constraints, and including a surprising cameo role for symplectic geometry.

Abstract:

Machine learning (ML) models, e.g., deep neural networks (DNNs), are vulnerable to adversarial examples: malicious inputs modified to yield erroneous model outputs while appearing unmodified to human observers. Potential attacks include having malicious content such as malware identified as legitimate, or controlling vehicle behavior. Yet most existing adversarial example attacks require knowledge of either the model internals or its training data. We will describe a black-box attack on ML models. The attack yields adversarial examples misclassified by models hosted by Amazon and Google at rates of 96.19% and 88.94%, respectively. We also find that this black-box attack strategy is capable of evading defense strategies previously found to make adversarial example crafting harder.
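The talk's substitute-model attack needs a full ML stack, but the black-box setting itself (query access only, no view of the model internals) can be illustrated with a toy score-based attack on a hypothetical linear classifier, estimating gradients by finite differences. This is our own minimal sketch, not the authors' method:

```python
def attack(score, x, step=0.5, eps=1e-4, max_iter=100):
    """Perturb x until score(x) flips sign, using only queries to `score`.
    The attacker never inspects the model's weights."""
    x = list(x)
    target = -1 if score(x) > 0 else 1     # push toward the other class
    for _ in range(max_iter):
        if (score(x) > 0) == (target > 0):
            return x                        # label flipped: attack succeeded
        # finite-difference estimate of the score's gradient, one query per dim
        grad = []
        for i in range(len(x)):
            bumped = x[:]
            bumped[i] += eps
            grad.append((score(bumped) - score(x)) / eps)
        x = [xi + target * step * gi for xi, gi in zip(x, grad)]
    return x

# Hypothetical black-box model: the attacker can only call `score`.
w, b = [2.0, -1.0], 0.5
score = lambda x: w[0] * x[0] + w[1] * x[1] + b
```

Starting from an input classified positive, a few steps along the estimated gradient produce a nearby input the model classifies negative.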

Abstract:

Data nets are yet another extension of Petri nets in which the relations between consumed and produced tokens are very restricted. Subclasses such as Unordered Data Petri Nets (UDPNs) are, from a theoretical perspective, natural and easy to define. Furthermore, similarly to Petri nets, they have a lot of structure to explore. During the talk, we will start from defining data nets and formulating the state equation, a generalization of one of the simplest and most important equations for Petri nets.
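As a reminder of the classical case (standard material, not specific to the talk), the state equation for an ordinary Petri net has a simple matrix form:

```latex
% A marking m' is reachable from m only if there is a non-negative
% integer firing-count vector x solving the state equation:
m' = m + C \cdot x, \qquad x \in \mathbb{N}^{T},
% where T is the set of transitions and the incidence matrix entry
% C_{p,t} records the net effect of firing transition t on place p.
```

The talk's state equation generalizes this necessary condition for reachability to the data setting.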

Next, I will present a sketch of the proof of correctness of an NP algorithm for solving the state equation in the case of Unordered Data Petri Nets. Finally, I will mention some novel results for other classes of data nets and open problems that we are working on.

The talk is based on joint work with Patrick Totzke and Jerome Leroux, “Linear Combinations of Unordered Data Vectors”, published at LICS 2017, and on unpublished results with Sławomir Lasota.

Abstract:

Reachability analysis is a useful tool for checking whether a cyber-physical system satisfies a given safety property. For instance, one could ask whether an electro-magnetic braking system brings a car to a standstill within a given time frame. In set-based reachability, one takes a given set of initial states (ranges for the position and speed of the car) and computes the image of the set of states as it evolves over time. Even for simple types of systems, this so-called reach set can only be computed approximately, and accuracy comes at an extremely steep cost. A highly scalable way to approximate the reach set is known for the special case of linear dynamics. It is based on template polyhedra, which are polyhedra (sets bounded by linear constraints) with normal vectors from a given finite set. Simple instances of template polyhedra are boxes or octagons. A template instance that tightly bounds the reach set is found by solving a set of optimization problems. The accuracy of the approximation can be improved by adding more normal vectors to the template.
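For the linear case, the optimization view can be sketched in a few lines. In a discrete-time analogue x' = A x with a box of initial states, the tight template bound in direction d at step k is the support function of the initial box in the pulled-back direction (A^T)^k d. The function names below are ours, for illustration only:

```python
def transpose_apply(A, d):
    """Compute A^T d for a square matrix A given as nested lists."""
    n = len(A)
    return [sum(A[i][j] * d[i] for i in range(n)) for j in range(n)]

def box_support(center, radius, d):
    """Maximum of d . x over the box {x : |x_i - center_i| <= radius_i}."""
    return sum(c * di + r * abs(di) for c, r, di in zip(center, radius, d))

def template_bounds(A, center, radius, directions, steps):
    """bounds[k][t]: tight bound in direction directions[t] on the reach
    set A^k X0 of the discrete-time system x' = A x from the box X0."""
    bounds, dirs = [], [list(d) for d in directions]
    for _ in range(steps + 1):
        bounds.append([box_support(center, radius, d) for d in dirs])
        dirs = [transpose_apply(A, d) for d in dirs]  # pull back one step
    return bounds
```

With the four axis-aligned directions the template instances are boxes; adding directions such as (1, 1) refines them toward octagons.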

In this talk, we propose an approach that extends this idea from linear to nonlinear dynamics. We linearize the system around a reference trajectory and solve ODEs to obtain templates that bound the reach set. The ODEs are particular in that they involve an optimization problem constrained by the template itself. We show how, similarly to the linear case, the template can be adapted over time to match the dynamics of the system. For both static and dynamic templates, we identify conditions that guarantee convergence. The potential of the approach is discussed on several benchmarks.

Abstract:

Asynchronous concurrency is a model of concurrency based on the concept of tasks. It executes tasks one at a time, choosing the next task to run from a set called the task buffer, adding tasks to the buffer using a “post” primitive, and scheduling a new task only when the currently running task gives up control. Tasks can depend on each other using explicit “wait” operations. This model and its variants are used widely in languages such as JavaScript, C# (the async and await primitives), or the monadic concurrency libraries of Haskell and OCaml. The more restricted scheduling separates it from the more commonly considered multi-threaded programming model.
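The task-buffer model can be mimicked in a few lines of Python, with generators standing in for tasks; this is a toy sketch of the model only (it omits “wait” dependencies, and is not Lwt or the talk's logic). Each `yield` is a point where the running task gives up control:

```python
from collections import deque

class Scheduler:
    """Runs posted tasks one at a time; a task runs until it yields."""
    def __init__(self):
        self.buffer = deque()             # the task buffer

    def post(self, task):
        self.buffer.append(task)          # the "post" primitive

    def run(self):
        log = []
        while self.buffer:
            task = self.buffer.popleft()  # pick the next task to run
            try:
                log.append(next(task))    # run until it gives up control
                self.buffer.append(task)  # not finished: back into the buffer
            except StopIteration:
                pass                      # task terminated
        return log

def worker(name, n):
    for i in range(n):
        yield f"{name}:{i}"               # do one step, then yield control
```

Posting `worker("A", 2)` and `worker("B", 1)` interleaves their steps as A:0, B:0, A:1, since each task re-enters the buffer after yielding.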

In this talk, I will talk about two projects. The first project deals with the question of how we can reason about asynchronous programs. We present a program logic and a corresponding type system that allow us to reason locally about programs with asynchronous concurrency and mutable state; we instantiate this model for OCaml using the Lwt library. The key innovation is to introduce the concept of a “wait permission”, which describes the resources that a given task will yield on termination. An earlier version of this work was published at ECOOP ’15. The second project deals with the question of how we can perform a kind of “asynchronous parallelization” optimization, where we are given a (more-or-less) sequential program and rewrite it to make use of asynchronous concurrency. We use a set of program rewriting rules, most notably replacing synchronous I/O operations with asynchronous counterparts in a safe way, and pushing wait statements as far back as possible. As it turns out, proving the soundness of these rewriting rules is surprisingly tricky; I will sketch a reasoning approach that allows us to show refinement, in the following sense: Let $e$ be a program, and $e’$ the result of rewriting $e$ using the given rules. For every terminating execution of $e’$, there is a corresponding terminating execution of $e$ that ends in an equivalent state.

Abstract:

Interactive program synthesizers enable a user to communicate their intent via input-output examples. Unfortunately, such synthesizers only guarantee that the synthesized program is correct on the provided examples. A user who wishes to guarantee correctness for all possible inputs has to manually inspect the synthesized program, an error-prone and challenging task.

In this talk, I will present a novel synthesis framework that communicates only through (abstract) examples and guarantees that the synthesized program is correct on all inputs.

The main idea is to use abstract examples—a new form of examples that represent a potentially unbounded set of concrete examples. An abstract example captures how part of the input space is mapped to corresponding outputs by the synthesized program. Our framework uses a generalization algorithm to compute abstract examples which are then presented to the user. The user can accept an abstract example, or provide a counterexample in which case the synthesizer will explore a different program. When the user accepts a set of abstract examples that covers the entire input space, the synthesis process is completed.
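As a toy illustration of the idea (our own sketch, not the framework's generalization algorithm), an abstract example can be rendered as an input template with a placeholder, paired with an output template, and validated against the candidate program on sample instantiations:

```python
def check_abstract_example(program, in_template, out_template, samples):
    """Check the abstract example 'in_template -> out_template' against
    `program` on sample instantiations of the placeholder 'W'.
    Returns (True, None) if all samples agree, else (False, bad_input)."""
    for w in samples:
        inp = in_template.replace("W", w)
        if program(inp) != out_template.replace("W", w):
            return False, inp             # counterexample for the user
    return True, None

# Hypothetical synthesized program: keep the part before '@'.
program = lambda s: s.split("@")[0]
```

Here the abstract example "W@example.com maps to W" covers infinitely many concrete examples at once; a real framework symbolically guarantees the mapping rather than sampling it, and completes once the accepted abstract examples cover the input space.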

I will also show some experimental results to demonstrate that our synthesizer communicates with the user effectively.

Joint work with Sharon Shoham and Eran Yahav.