Émilien Antoine will defend his PhD thesis on December 5th, 2013, at 2 pm in the “Pavillon des Jardins” meeting room at ENS de Cachan (61, avenue du Président Wilson, 94230 Cachan).
Title: “Distributed data management with a declarative rule-based language: Webdamlog”
The jury will be composed of:
Serge Abiteboul (supervisor)
Christine Collet (reviewer)
Pascal Molli (reviewer)
Nicole Bidoit (examiner)
Bogdan Cautis (examiner)
David Gross-Amblard (examiner)
Information management on the Internet relies on a wide variety of systems, each specialized for a particular task. The personal data and favorite applications of a Web user are typically distributed across many heterogeneous devices and systems, e.g., residing on a smartphone, laptop, tablet, or TV box, or managed by Facebook, Google, etc. Additional data and computational resources are also available to the user from relatives, friends, and colleagues, possibly via social network systems. Because of this distribution and heterogeneity, the management of personal data and knowledge has become a major challenge.
For instance, let us consider a typical Web user who has a blog, a Facebook account, and a Dropbox account, and who also stores data on his smartphone and laptop. This user wants to post on his blog a review of the last movie he watched. He also wishes to advertise his review to his Facebook friends and to include a link to the Dropbox folder where the movie has been uploaded. This is a cumbersome task to carry out manually, yet writing a script for it is far beyond the skills of most Web users.
Our goal is to enable a Web user to easily specify distributed data management tasks “in place”, i.e., without centralizing the data with a single provider. Our system is therefore not a replacement for Facebook or any other centralized system, but an alternative that allows users to launch their own peers on their own machines, with their own personal data, and to collaborate with Web services.
We introduce Webdamlog, a datalog-style language for managing distributed data and knowledge. The language extends datalog in a number of ways, notably with a novel feature called delegation, which allows peers to exchange not only facts but also rules. We present a user study that demonstrates the usability of the language. We describe a Webdamlog engine that extends a distributed datalog engine, namely Bud, with support for delegation and for a number of other novelties of Webdamlog, such as the ability to use variables that denote peers or relations. We outline novel optimization techniques, notably one based on the provenance of facts and rules. We report experiments showing that the rich features of Webdamlog can be supported at reasonable cost and that the engine scales to large volumes of data. Finally, we discuss the implementation of a Webdamlog peer system that provides an environment for the engine. In particular, a peer supports wrappers to exchange Webdamlog data with non-Webdamlog peers. We illustrate these peers by presenting a picture management application that we used for demonstration purposes.
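To give a rough intuition of delegation, here is a minimal Python sketch. It is a toy model written for this announcement, not Webdamlog syntax and not the Bud-based engine described in the thesis. The names `Peer`, `Rule`, `evaluate` and `run`, as well as the peers `alice` and `bob` and their relations, are hypothetical. The sketch assumes that atoms are `(relation, peer, args)` triples, that variables start with `$`, that rule bodies are read left to right, and that a peer variable is bound by an earlier atom. Relation variables are not modeled.

```python
from dataclasses import dataclass, field

def is_var(term):
    return term.startswith("$")

def subst(atom, binding):
    # Replace bound variables in the peer name and in the arguments.
    rel, peer, args = atom
    return (rel, binding.get(peer, peer), tuple(binding.get(a, a) for a in args))

def match(atom, fact, binding):
    # Return an extended copy of binding that makes atom equal to fact, else None.
    if atom[0] != fact[0] or atom[1] != fact[1] or len(atom[2]) != len(fact[2]):
        return None
    extended = dict(binding)
    for a, f in zip(atom[2], fact[2]):
        if is_var(a):
            if extended.setdefault(a, f) != f:
                return None
        elif a != f:
            return None
    return extended

@dataclass
class Rule:
    head: tuple
    body: list

@dataclass
class Peer:
    name: str
    facts: set = field(default_factory=set)
    rules: list = field(default_factory=list)

def evaluate(peer, rule, binding=None):
    # Yield (kind, destination, payload) messages, kind being "fact" or "rule".
    binding = binding or {}
    if not rule.body:
        head = subst(rule.head, binding)
        yield ("fact", head[1], head)
        return
    first = subst(rule.body[0], binding)
    if is_var(first[1]):
        raise ValueError("peer variable must be bound by an earlier atom")
    if first[1] != peer.name:
        # Delegation: the rest of the rule is handed over to the remote peer.
        rest = [first] + [subst(a, binding) for a in rule.body[1:]]
        yield ("rule", first[1], Rule(subst(rule.head, binding), rest))
        return
    for fact in list(peer.facts):
        extended = match(first, fact, binding)
        if extended is not None:
            yield from evaluate(peer, Rule(rule.head, rule.body[1:]), extended)

def run(peers):
    # Deliver messages between peers until nothing new is produced.
    changed = True
    while changed:
        changed = False
        for peer in list(peers.values()):
            for rule in list(peer.rules):
                for kind, dest, payload in list(evaluate(peer, rule)):
                    target = peers[dest]
                    if kind == "fact" and payload not in target.facts:
                        target.facts.add(payload)
                        changed = True
                    elif kind == "rule" and payload not in target.rules:
                        target.rules.append(payload)
                        changed = True

# Hypothetical scenario: alice collects the photos of each of her friends.
alice = Peer("alice", facts={("friend", "alice", ("bob",))})
bob = Peer("bob", facts={("photos", "bob", ("sunset.jpg",))})
alice.rules.append(Rule(
    ("album", "alice", ("$photo",)),
    [("friend", "alice", ("$p",)), ("photos", "$p", ("$photo",))],
))
peers = {"alice": alice, "bob": bob}
run(peers)
```

In this toy run, alice resolves `$p` to `bob` from her local `friend` relation and then delegates the remaining rule to bob, since the next atom refers to a relation stored on his peer. Bob evaluates the delegated rule against his own `photos` facts and sends the derived `album` facts back to alice. This is also a small example of a variable that denotes a peer.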
The defense will be followed by a reception in the “Pavillon des Jardins” at ENS de Cachan, to which you are also invited.
More details at