Archive for the ‘News’ Category

Alexandra Meliou: Reverse-engineering data transformations

May 6th, 2013

Alexandra Meliou will visit the LSV on Monday 13 May and will give a talk at 10:30 in the LSV library at ENS-Cachan.

Title

Reverse-engineering data transformations: A new perspective on data management

Abstract

Current trends have seen data grow larger, more intertwined, and more diverse, as more and more users contribute to and use it. This trend has given rise to the need to support richer data analysis tasks. Such tasks involve determining the causes of observations, finding and correcting the sources of error in query results, as well as modifying the data in order to make it conform to complex desirable properties.
In this talk, I will discuss data analysis tasks that require tracing the history of data and reverse-engineering the transformations that it undergoes. I will discuss two tasks in particular: (1) providing explanations through support for causal queries, and (2) modifying datasets based on high-level declarative constraints. First, I will show how to apply causal reasoning to tuple provenance in order to determine the causes of query results and to identify the source of possible errors. I will present an extensive analysis of the data complexity for the case of conjunctive queries, and focus on a complete dichotomy between NP-hard and PTIME cases for the problem of computing responsibility.
Finally, I will demonstrate the Tiresias system, the first how-to query engine, which seamlessly integrates database systems with constrained problem solving capabilities. The contributions of the system are threefold: (a) a declarative interface for defining how-to queries over a database, (b) translation rules from the declarative statements to the constrained problem specification, and (c) a suite of data-specific optimizations that allow scaling to large data sizes. Initial results of our prototype system implementation show order-of-magnitude speedups to state-of-the-art solver runtimes, which indicates that there are significant gains in pushing this functionality within the database engine. I will conclude with a discussion of the next steps with the Tiresias system, and the bigger vision of reverse data management.

Bio

Alexandra Meliou is an Assistant Professor in the Department of Computer Science at the University of Massachusetts, Amherst. She has held this position since September 2012. Prior to that, she was a Post-Doctoral Research Associate at the University of Washington, working with Dan Suciu. Alexandra received her Ph.D. and M.S. degrees from the Electrical Engineering and Computer Sciences Department at the University of California, Berkeley, in 2009 and 2005 respectively. She is a 2008 Siebel Scholar, and her research interests are in the area of data and information management, with a current emphasis on provenance, causality, and reverse data management.

News

Yanlei Diao: Scalable, Low-Latency Data Analytics and its Applications

December 6th, 2012

When: Thursday, December 20th 2012, 14.00, room 445, PCRI

Abstract

An integral part of many data-intensive applications is the need to collect and analyze enormous data sets, such as click streams, search logs, and sensor streams, to derive answers and insights with low latencies. Concurrently, new programming models and architectures have been developed for large-scale cluster computing, exemplified by recent MapReduce systems. However, these systems are designed for batch processing and require the data set to be fully loaded into the cluster before running analytical queries, which delays query answers considerably.

In this talk, I present the design of a scalable, low-latency analytics platform, called Scalla, that fundamentally transforms the existing cluster computing paradigm into an incremental parallel processing paradigm, which provides the combined benefits of massive parallelism, incremental answers, and I/O efficiency. Our technical contributions include replacing an existing popular mechanism for partitioned parallelism with a purely hash-based mechanism and using dynamic frequency analysis to offer in-memory processing for most of the data. In this talk, I will also examine two application scenarios, click stream analysis, which has been used in our evaluation, and genomic data analysis, which is a new project that leverages Scalla for massive-scale genomic data processing and analysis.
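
As a rough illustration of the hash-based, incremental style of processing described above (not Scalla's actual implementation, whose details are not given here), the following Python sketch counts clicks per user with an in-memory hash table and emits a running answer as records arrive, instead of waiting for the whole data set to load. The function name `incremental_click_counts`, the `(user, url)` record layout, and the `report_every` parameter are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def incremental_click_counts(
    clicks: Iterable[Tuple[str, str]], report_every: int = 2
) -> Iterator[Dict[str, int]]:
    """Yield running per-user click counts while the stream is still being read."""
    counts: Dict[str, int] = defaultdict(int)  # hash table keyed by user
    seen = 0
    for user, _url in clicks:
        counts[user] += 1  # one hash lookup per record, no sorting step
        seen += 1
        if seen % report_every == 0:
            yield dict(counts)  # snapshot of the answer so far
    if seen % report_every != 0:
        yield dict(counts)  # final answer when the stream ends mid-interval


# Hypothetical click stream used only for illustration.
stream = [("ann", "/home"), ("bob", "/home"), ("ann", "/cart")]
snapshots = list(incremental_click_counts(stream))
```

Each snapshot is a partial answer that is refined as more records are read, which is the sense in which the computation is incremental; a real system would additionally partition the keys across machines and spill infrequent keys to disk.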

Short bio

Yanlei Diao is an Associate Professor of Computer Science at the University of Massachusetts Amherst. Her research interests are in information architectures and data management systems, with a focus on large-scale data analysis, data streams, uncertain data management, and flash memory databases. She received her PhD in Computer Science from the University of California, Berkeley in 2005, her M.S. in Computer Science from the Hong Kong University of Science and Technology in 2000, and her B.S. in Computer Science from Fudan University in 1998.

Yanlei Diao was a recipient of the NSF Career Award and the IBM Scalable Innovation Faculty Award, and was a finalist of the Microsoft Research New Faculty Fellowship. She spoke at the Distinguished Faculty Lecture Series at the University of Texas at Austin. Her PhD dissertation “Query Processing for Large-Scale XML Message Brokering” won the 2006 ACM-SIGMOD Dissertation Award Honorable Mention. She is an associate editor of PVLDB 2013 and has served on the organizing committees of SIGMOD, CIDR, DMSN, the New Researcher Symposium, and the New England Database Summit. She has served on program committees of numerous international conferences and workshops.

Events, News

WebDam-MoDaS Workshop in Eilat 2012

November 6th, 2012

Information and the report about the Eilat workshop are available here.

You can also find the details on the website.

Events, News

Seminar Yannis Papakonstantinou – Friday September 14th – ENS Cachan

August 30th, 2012

Speaker: Prof. Yannis Papakonstantinou, Computer Science and
Engineering, Univ of California at San Diego

When: Friday September 14th, 10:30

Where: ENS Cachan http://www.ens-cachan.fr/
amphithéâtre 121, Léonard de Vinci building

Title: Declarative, optimizable data-driven specifications of web &
mobile applications

Abstract:
Developers of web and mobile applications write too much
low level “plumbing” code to efficiently access, integrate and
coordinate application state that resides on multiple sub-systems of
the architecture, and is accessed using different languages: SQL at
the database server; HTML and Javascript at the browser, which in
HTML5 includes its own database state; Java or other programming
languages at the application server.

In the spirit of Active XML, the FORWARD project replaces such low level
code with declarative specifications. Its cornerstones are
(i) the unified application state virtual database, which enables
modeling and manipulating the entire application state in an extension
of SQL, named SQL++
(ii) specification of Ajax pages as essentially rendered views over
the unified application state.

We discuss problems solved in the last three years and the system
resulting from this activity. We then discuss a cluster of issues resulting
from both mobile agents and demanding Big Data visualizations and
propose a recently-initiated effort on an asynchronous SQL.

Consequently the following three problems are resolved by appropriate
reduction to data management problems, where prior database research
literature is leveraged and extended.

1. The partial change of Ajax pages, in response to application state
changes, is reduced to an incremental view maintenance problem. IDs
that retain the provenance of the page data play an instrumental
efficiency role.

2. Efficient data access is reduced to semistructured query processing
over an integrated view that involves large database(s) and small main
memory-based sources. We connect with prior works in OQL.

3. The inherent location transparency of the specifications is
exploited in order to perform computation at the appropriate location
(browser vs server). More broadly, the talk discusses ongoing and
future work in utilizing the increased abilities of HTML5 clients
towards achieving low-latency mobile web applications,
while location transparency of the specifications is retained.
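
The first reduction above can be illustrated with a minimal Python sketch, assuming a page is a view that renders one HTML fragment per row of application state and that each row carries a stable ID. The class `PageView`, the `render_row` helper, and the row fields are hypothetical and are not part of FORWARD or SQL++; the point is only that a state change touches the fragments whose IDs changed rather than re-rendering the whole page.

```python
from typing import Dict, Iterable, List


def render_row(row: Dict[str, object]) -> str:
    """Render one row of application state as an HTML fragment (hypothetical layout)."""
    return f"<li id='item-{row['id']}'>{row['name']}: {row['qty']}</li>"


class PageView:
    """A rendered view kept up to date incrementally, keyed by row ID."""

    def __init__(self, rows: Iterable[Dict[str, object]]) -> None:
        # The full computation happens once, when the page is first built.
        self.fragments: Dict[object, str] = {row["id"]: render_row(row) for row in rows}

    def apply_delta(self, upserts=(), deleted_ids=()) -> List[object]:
        """Apply a state change and return the IDs whose fragments changed."""
        changed = []
        for row_id in deleted_ids:
            if self.fragments.pop(row_id, None) is not None:
                changed.append(row_id)
        for row in upserts:
            fragment = render_row(row)
            # Only rows whose rendering actually differs are refreshed.
            if self.fragments.get(row["id"]) != fragment:
                self.fragments[row["id"]] = fragment
                changed.append(row["id"])
        return changed


# Hypothetical application state used only for illustration.
page = PageView([
    {"id": 1, "name": "pen", "qty": 3},
    {"id": 2, "name": "ink", "qty": 1},
])
changed_ids = page.apply_delta(
    upserts=[{"id": 2, "name": "ink", "qty": 5}], deleted_ids=[1]
)
```

The returned IDs tell a client which parts of the page to patch, which is where identifiers that track the provenance of page data become useful.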

Short Bio:
Yannis Papakonstantinou (http://db.ucsd.edu/people/yannis.htm) is a
Professor of Computer Science and Engineering at the University of
California, San Diego. His research is in the intersection of data
management technologies and the web, where he has published over
eighty research articles. He has given multiple tutorials and invited
talks, has served on journal editorial boards and has chaired and
participated in program committees for many international conferences
and workshops.

Yannis enjoys commercializing his research and informing his research
accordingly. He was the CEO and Chief Scientist of Enosys Software,
which built and commercialized an early XML-based Enterprise
Information Integration platform. Enosys Software was acquired in 2003
by BEA Systems. His lab’s FORWARD platform (for the rapid development
of data-driven Ajax applications) is now in use by many commercial
applications. He is involved in data analytics in the pharmaceutical
industry and is on the technical advisory board of Brightscope Inc.

Yannis holds a Diploma of Electrical Engineering from the National
Technical University of Athens, an MS and a Ph.D. in Computer Science
from Stanford University (1997), and an NSF CAREER award for his work
on data integration.

Events, News

WebDam-MoDaS Workshop in Eilat

July 30th, 2012

This meeting will be held jointly by Webdam (in its last year) and
MoDaS (in its first). The meeting will bring together members of the
two projects and leading world specialists in these topics.

http://www.cs.tau.ac.il/workshop/modas/

Meeting Topic:

We are being overwhelmed by the masses of information that are
available. Typically pieces of information are noisy: imprecise,
incomplete, inconsistent. This may be the case for global information
on the public Web as well as for private information in social network
systems. We are concerned with combining all the techniques we can
to evaluate the quality of information and work to improve it. This
will typically involve both reasoning in an imprecise environment
(as stressed by Webdam) and relying on crowd participation (as
advocated by MoDaS). The workshop will bring together the two
approaches with an emphasis on the intersection of the two topics
but also considering their disjunction to bring the two groups up to
date with the two topics.
The workshop will serve both as an assessment for Webdam and
a brainstorming session for MoDaS.

Program chairs: Tova Milo (Tel Aviv University), Serge Abiteboul
(INRIA, ENS Cachan)

Events, News

Webdam at BDA 2012 Summer School in the Alps (France)

November 22nd, 2011

The 2012 thematic summer school of the BDA (Bases de données avancées) conference on distributed very large databases will take place in Aussois (France) from May 27th 2012 to June 1st 2012 with the support of Webdam.

The summer school will take place in the Paul-Langevin center in Aussois in the middle of the Alps mountain range.

News

Julia Stoyanovich is going to visit Webdam in December

November 8th, 2011

Webdam look forward to welcoming Julia Stoyanovich from December 2nd to 9th.

Julia Stoyanovich is currently a visiting scholar at the University of Pennsylvania, where she works with Professor Susan Davidson and her research group.

Julia Stoyanovich is motivated by the data management needs of life sciences applications and of social information processing. She is working on incorporating semantic context into search, ranking, and data exploration in large complex datasets. She is also interested in managing provenance in scientific workflows, and in the related privacy and security considerations.

News

Gerome Miklau is going to visit Webdam in December

November 3rd, 2011

Webdam look forward to welcoming Gerome Miklau from December 1st to 7th.

Gerome Miklau is an associate professor at the University of Massachusetts, Amherst. Professor Miklau’s research interests are in database research, with an emphasis on security, database theory, and semi-structured data. The objective of his research is to enable secure and trustworthy data management in both conventional database systems and distributed environments like the World Wide Web. His work focuses on classical security concerns such as confidentiality, privacy, and integrity of data.

News

PARIS: Probabilistic Alignment of Relations, Instances, and Schema (website)

November 2nd, 2011

A website for PARIS

One of the main challenges that the Semantic Web faces is the integration of a growing number of independently designed ontologies. In this work, we present PARIS, an approach for the automatic alignment of ontologies. PARIS aligns not only instances, but also relations and classes. Alignments at the instance level cross-fertilize with alignments at the schema level. Thereby, our system provides a truly holistic solution to the problem of ontology alignment. The heart of the approach is probabilistic, i.e., we measure degrees of matchings based on probability estimates. This allows PARIS to run without any parameter tuning. We demonstrate the efficiency of the algorithm and its precision through extensive experiments. In particular, we obtain a precision of around 90% in experiments with some of the world’s largest ontologies.

News

Collège de France designated Serge Abiteboul to the Annual Chair in Information Technology and Digital Sciences for 2011-2012

October 4th, 2011

The inaugural lecture will take place in the Halbwachs amphitheatre at the Collège de France on Thursday March 8th 2012 at 18h.

The lectures will start next Wednesday from 10 to 11 AM followed by a seminar given by illustrious guests. You can find the calendar with more details about the guests and the date either :

News