Webdam Project

WebDam-MoDaS Workshop in Eilat

September 27th, 2013

Joint workshop on Web data management and Crowd data sourcing

Eilat, October 2012

Presentation

This meeting is held jointly by Webdam (in its last year) and MoDaS (in its first). It will bring together members of the two projects and the world's leading specialists in these topics.

Meeting Topic:

We are being overwhelmed by the masses of information that are available. Typically, pieces of information are noisy: imprecise, incomplete, inconsistent. This may be the case for global information on the public Web as well as for private information in social network systems. We are concerned with combining all the techniques we can to evaluate the quality of information and to improve it. This will typically involve both reasoning in an imprecise environment (as stressed by Webdam) and relying on crowd participation (as advocated by MoDaS). The workshop will bring together the two approaches, with an emphasis on the intersection of the two topics, but will also cover each topic on its own to bring the two groups up to date on both. The workshop will serve both as an assessment for Webdam and as a brainstorming session for MoDaS.

Program chairs: Tova Milo (Tel Aviv University), Serge Abiteboul (INRIA, ENS Cachan)

Brief closing words

The participants

This was a reasonably small workshop (in terms of number of participants). However, the diversity of the talks brought up a number of opportunities for synergy between complementary approaches. The two topics are fascinating and very active. The workshop highlighted the rich interaction between them.
This is a brief conclusion that attempts to summarize some of the discussions during the brainstorming meetings at the workshop. We will ignore a large number of issues that were raised but were already nicely covered in the talks and can be found in their slides. We will focus on a few issues that we felt were more novel and striking.
This brief report is organized as follows. We briefly consider Web data management, then crowd data sourcing. Finally, we discuss issues that relate to both together.

Web data management

In the spirit of the evolution of the Webdam project over the last couple of years, the focus of the workshop on that topic was on personal and social data management. This puts more emphasis on imprecision, inconsistencies, beliefs, opinions, etc. It also brings up, in this setting, the issue of ontology alignment (since there is no reason all individuals should use the same ontology).
Notably with the work of Webdam, it is now understood that approaches to these problems require combining distributed data management, knowledge management (deductive databases), and probabilistic data management. This clearly requires more work, as well as investigation of issues that are still largely unexplored, such as privacy.

Crowd data sourcing

For crowd data sourcing, the workshop highlighted the richness of applications, notably in the sciences. Crowd data sourcing is a more recent topic, and this workshop helped clarify some of its aspects:
In essence, crowd data sourcing leads to an open-world semantics, where individuals may state facts that are not yet known to be true.
The issue was raised of the distinction between facts and opinions. It is sometimes possible to approach this issue using probabilities: a fact has near-certain probability, whereas an opinion does not. This clearly relates to beliefs (how many people think A holds) and trust (how trustworthy these people are).
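As a rough illustration of the last point, the following minimal sketch combines belief (who thinks a statement holds) with trust (how much weight each person carries) into a single score, and treats near-certain scores as fact-like. It is not a method presented at the workshop: the function names, the trust-weighted average, and the 0.95 threshold are all assumptions made for the example.

```python
from typing import Dict


def weighted_belief(votes: Dict[str, bool], trust: Dict[str, float]) -> float:
    """Trust-weighted share of contributors who say the statement holds.

    `votes` maps a (hypothetical) contributor name to their answer;
    `trust` maps the same names to a non-negative trust weight.
    """
    total = sum(trust[person] for person in votes)
    if total == 0:
        raise ValueError("no contributor with positive trust")
    support = sum(trust[person] for person, holds in votes.items() if holds)
    return support / total


def classify(belief: float, threshold: float = 0.95) -> str:
    """Label a statement fact-like when the weighted belief is near-certain
    (close to 0 or to 1), and opinion-like otherwise. The threshold is an
    arbitrary assumption."""
    if belief >= threshold or belief <= 1 - threshold:
        return "fact-like"
    return "opinion-like"


if __name__ == "__main__":
    votes = {"ann": True, "bob": True, "carol": False}
    trust = {"ann": 1.0, "bob": 1.0, "carol": 0.5}
    score = weighted_belief(votes, trust)
    print(score, classify(score))
```

With these illustrative numbers, the dissenting contributor carries little trust but is still enough to keep the statement below the near-certain threshold, so it is treated as an opinion.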

Towards a world of knowledge for machines and humans

To see a simple example, suppose we want to contact a friend. A system may try to help locate this friend using the information available on the Web (social network systems, personal agendas, etc.). This network of systems may try to reason collectively to find an answer. A system participating in this task may also target individuals with questions in a crowdsourcing style, possibly asking them to validate some beliefs. Similarly, information can be pushed to the user as a result of a collective effort by machines and people. We can thus envision a world where machines and humans collaborate to process information. This was the idea underlying the cooperation between MoDaS and Webdam.
An essential difference between the two projects is distribution. Distribution is of the essence in Web data management. The data sources are distributed (and autonomous). For now, the crowd data sourcing work in MoDaS seems to favor a centralized setting. But there is no fundamental reason for that.
On the other hand, during the workshop we kept running into issues that were found to be common to both topics:

Imprecision plays a critical role in both cases, with issues such as uncertainty, inconsistencies, trust and belief. In particular, probabilities play an important role for both.
Scaling, of course, in the number of machines or of humans.
Intentionality/Open world.
In both cases, data and knowledge exist somewhere and have to be discovered:
- By exchanging knowledge between systems (Webdam)
- By asking individuals (MoDaS)
- By integrating/interrogating knowledge/data bases (many talks)
- By deciding how to allocate tasks (both projects).

All these aspects may be seen as sketching the contours of a wide research area that encompasses both Webdam and MoDaS.

The participants

Alin Deutsch, UCSD	Amélie Marian, Rutgers	Anastasia Ailamaki, EPFL
Benny Kimelfeld, IBM	Catriel Beeri, Hebrew University	Christoph Koch, EPFL
Daniel Deutch, Ben Gurion University	Émilien Antoine, Paris Sud University	Ezra Levin, Tel Aviv University
Fabian Suchanek, Max-Planck Institute	Gerome Miklau, UMASS	H. V. Jagadish, University of Michigan
Ilia Lotosh, Tel Aviv University	Julia Stoyanovich, UPenn	Lilach Messinger, Tel Aviv U
Lior Wolf, Tel Aviv University	Marilena Oita, Telecom ParisTech	Meghyn Bienvenu, Leo-IASI team, LRI
Mike Franklin, Berkeley	Ohad Greenshpan, Tel Aviv University	Peter Buneman, University of Edinburgh
Pierre Senellart, Telecom ParisTech	Rubi Boim, Tel Aviv University	Sara Cohen, Hebrew University
Serge Abiteboul, INRIA	Yehoshua Sagiv, Hebrew University	Slava Novgorodov, Tel Aviv University
Susan Davidson, UPenn	Tova Milo, Tel Aviv University	Val Tannen, UPenn
Victor Vianu, UCSD	Yael Amsterdamer, Tel Aviv University	Yael Grossman, Tel Aviv University
	Yaron Kanza, Technion

Program

Monday

19:00-21:00:

Dinner

Tuesday

9:00-10:00:

10:00-11:00:

(Session Chair – Serge Abiteboul)

Gong Show – the 20 speakers, 2 slides: one topic you think is very important to work on (2 minutes max each!)

11:30-13:00:

(Session Chair – Tova Milo)

CrowdDB: Query Processing with People and Machines, Mike Franklin
Using the Crowd for Top-k and Group-by Queries, Susan Davidson
Crowdsourcing the reconstruction of the Cairo Genizah, Lior Wolf
Pricing Aggregate Queries in a Data Marketplace, Gerome Miklau

13:00-16:00:

Free discussion by the pool

16:00-17:30:

(Session Chair – Susan Davidson)

19:00-21:00:

Dinner

Wednesday

9:00-10:30:

(Session Chair – Daniel Deutch)

11:00-12:30:

Panel: Tentative topic: How can humans and systems collaborate in a social network to answer queries: issues and challenges. S. Abiteboul (moderator), P. Buneman, M. Franklin, H.V. Jagadish

13:00-17:00:

Snorkeling + lunch at Coral Beach

18:30-19:30:

Brainstorming (organization: Y. Amsterdamer, E. Antoine, R. Boim)

19:30-21:00:

Dinner

Thursday

9:00-10:00:

(Session Chair – H.V. Jagadish)

10:30-12:00:

(Session Chair – Benny Kimelfeld)

12:00-12:15:

Closing by Tova and Serge
