Archive for March, 2009

PeerSoN: a P2P social network

March 10th, 2009

Report on the presentation of Sonja Buchegger, March 9th, 2009.
See the PeerSoN web site and the slides for more details.
Warning: this report reflects the understanding of the post author (Alban Galland) and nothing more.

Context

Ubiquitous computing is a model in which devices and systems collaborate to solve tasks given by the user without the user being conscious of it. This paradigm leads to privacy problems, since you leave traces everywhere in a virtual world that is integrated with the real world. All this data can be used for data mining, from advertising to surveillance. This virtual world usually suffers from a lack of forgetting: data, once recorded, rarely disappears. These systems also tend to centralize the data of their users on one part of the system. The personal (private), public, and commercial spheres collide in this context.

Social networks are another model in which this privacy issue arises, since users store very personal data on these systems. They are usually Web 2.0 services that require an Internet connection. Their main feature is to let users keep in touch with their friends in an ambient way.

Integrating ubiquitous computing and social networks into a ubiquitous P2P social network that helps privacy is therefore especially challenging. One of the main reasons to design such a system is that social networks naturally collide with the real world, so ubiquity is especially desirable. It also settles most of the questions about data ownership and prevents the system from dictating its terms of use.

Distribution

Social networks and ubiquitous computing are naturally distributed. PeerSoN uses distributed storage for the data. To address availability, it replicates data on friends, with the key parameters chosen from a trace of users characterizing their temporal and geographical distribution. To address bootstrapping, it also uses storage on random nodes. The peers communicate directly, but they use a DHT for lookup. This DHT was built using OpenDHT and PlanetLab in a first version, but too many availability problems led to a centralized emulation of the hash table (put/get/remove operations) in the current version. The peers are identified by the hash of a globally unique identifier (such as an email address). When connecting to the DHT, a user registers his user ID, his machine's IP address, and his data.
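
As a minimal sketch of this lookup scheme (hypothetical names, with a plain dictionary standing in for the emulated hash table; this is not PeerSoN's actual code):

```python
import hashlib

# A plain dict emulating the put/get/remove interface of the DHT.
dht = {}

def user_key(identifier: str) -> str:
    # Peers are identified by the hash of a globally unique
    # identifier, such as an email address.
    return hashlib.sha1(identifier.encode("utf-8")).hexdigest()

def register(identifier: str, ip: str, data_keys: list) -> None:
    # On connection, a user registers his user ID, his machine's
    # IP address, and pointers to his data.
    dht[user_key(identifier)] = {"ip": ip, "data": data_keys}

def lookup(identifier: str):
    # Peers use the DHT only for lookup; they then talk directly.
    return dht.get(user_key(identifier))

register("alice@example.org", "192.0.2.10", ["profile", "wall"])
print(lookup("alice@example.org")["ip"])   # -> 192.0.2.10
```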

Direct exchange

In order not to depend on a network connection (and to go further on ubiquity), the design should take delay-tolerant networks into account. It is useful to carry information from friend to friend; asynchronous messaging is an example of such content, though it is not clear that distribution will work well this way. It is also useful for storage, since the system should use the storage available around it.
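
A minimal store-and-forward sketch of this friend-to-friend carrying, assuming a hypothetical Peer class (not PeerSoN's actual protocol):

```python
from collections import defaultdict

class Peer:
    def __init__(self, name):
        self.name = name
        self.buffer = defaultdict(list)   # recipient -> pending messages
        self.inbox = []

    def send(self, recipient, message):
        # No connection needed: the message waits in the local buffer.
        self.buffer[recipient].append(message)

    def meet(self, other):
        # On a physical encounter, exchange what is addressed to each
        # other, then hand over messages the other peer may carry further.
        self.inbox.extend(other.buffer.pop(self.name, []))
        other.inbox.extend(self.buffer.pop(other.name, []))
        for recipient, msgs in list(self.buffer.items()):
            other.buffer[recipient].extend(msgs)
            del self.buffer[recipient]

alice, bob, carol = Peer("alice"), Peer("bob"), Peer("carol")
alice.send("carol", "hi, no Internet needed")
alice.meet(bob)     # bob now carries the message
bob.meet(carol)     # carol receives it
print(carol.inbox)  # -> ['hi, no Internet needed']
```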

Access control

There is a trade-off between privacy and search. Users define what they want to be searchable. The system emulates fine-grained access control with keys (who can see which part of the profile). This method also protects against the storage provider. The key management emulates a standard public-key infrastructure, and keys may be exchanged by direct contact.
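
A minimal sketch of such per-part encryption, using the Fernet symmetric cipher from the Python cryptography package (the key-distribution step through the emulated PKI is elided; names are hypothetical):

```python
from cryptography.fernet import Fernet  # pip install cryptography

# Each profile part gets its own symmetric key; a friend can decrypt
# exactly the parts whose keys he was given.
profile = {"name": b"Alice", "phone": b"555-0100", "wall": b"hello"}
part_keys = {part: Fernet.generate_key() for part in profile}
encrypted = {part: Fernet(part_keys[part]).encrypt(value)
             for part, value in profile.items()}

# The storage provider only ever sees `encrypted`, which is what
# protects the data against the provider itself.
bob_keys = {part: part_keys[part] for part in ("name", "wall")}

for part, token in encrypted.items():
    if part in bob_keys:
        print(part, Fernet(bob_keys[part]).decrypt(token))
```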

Related work and issues

  • Distributed file management: usually, the assumptions are that data is stable and that interests follow a Zipf distribution. In the social-network context, data changes a lot and the distribution of interest is local.
  • Anonymity: distribution in a DHT leaves fewer traces of the query.
  • Media storage: the storage should be optimized by exploiting novelty.

On-going work

Response-time testing under different assumptions on the network.


Sonja Buchegger visiting Webdam

March 9th, 2009

Sonja Buchegger is visiting Webdam on Monday, March 9th, 2009. She is a senior research scientist at Deutsche Telekom Laboratories, Berlin.
She will present PeerSoN, a P2P social network, at 2:00pm in the meeting room of building N.

Title: PeerSoN: P2P Social Networks

Summary: Online Social Networks like Facebook, MySpace, Xing, etc. have become extremely popular. Yet they have some limitations that we want to overcome for a next generation of social networks: privacy concerns and requirements of Internet connectivity, both of which are due to web-based applications on a central site whose owner has access to all data.
To overcome these limitations, we envision a paradigm shift from client-server to a peer-to-peer infrastructure coupled with encryption so that users keep control of their data and can use the social network also locally, without Internet access. This shift gives rise to many research questions intersecting networking, security, distributed systems and social network analysis, leading to a better understanding of how technology can support social interactions.

Short Bio: Sonja Buchegger is a senior research scientist at Deutsche Telekom Laboratories, Berlin. In 2005 and 2006, she was a post-doctoral scholar at the University of California at Berkeley, School of Information. She received her Ph.D. in Communication Systems from EPFL, Switzerland, in 2004, a graduate degree in Computer Science in 1999, and undergraduate degrees in Computer Science in 1996 and in Business Administration in 1995 from the University of Klagenfurt, Austria. In 2003 and 2004 she was a research and teaching assistant at EPFL, and from 1999 to 2003 she worked at the IBM Zurich Research Laboratory in the Network Technologies Group. Her current research interests are in social, economic, and security aspects of self-organized networks.


Anonymisation in social-based P2P networks

March 2nd, 2009

Report on the presentation of Fabrice Le Fessant, February 23rd, 2009.
See the slides for more details.
Warning: this report reflects the understanding of the post author (Alban Galland) and nothing more.

Context

In the context of P2P file-sharing networks, malicious peers may try to keep a log of the queries issued on the network in order to build upload and download profiles of other peers. To avoid censorship in particular, one may want to design a network where non-trusted peers can contribute to the life of the network without being able to locate either the publisher or the querier. A social-based P2P network naturally fits this requirement: friends are not hidden but trusted, and they can anonymise the exchanges.

Previous work

There are already some social-based P2P networks, such as the Turtle network. It is close to Gnutella but built on a social network, which means that connections are chosen and trusted. The search is done by flooding, which is quite expensive in bandwidth.
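
A minimal sketch of TTL-limited flooding over trusted links, which also shows where the bandwidth cost comes from (hypothetical names, not Turtle's actual code):

```python
class Peer:
    def __init__(self, name, files=()):
        self.name, self.files, self.friends = name, set(files), []

def flood_search(peer, query, ttl, visited=None):
    # Forward the query to every trusted friend until the TTL runs
    # out; traffic grows roughly as degree**ttl, hence the cost.
    visited = visited if visited is not None else set()
    visited.add(peer)
    hits = [peer.name] if query in peer.files else []
    if ttl > 0:
        for friend in peer.friends:
            if friend not in visited:
                hits += flood_search(friend, query, ttl - 1, visited)
    return hits

a, b, c = Peer("a"), Peer("b", ["doc"]), Peer("c", ["doc"])
a.friends, b.friends, c.friends = [b], [a, c], [b]
print(flood_search(a, "doc", ttl=2))   # -> ['b', 'c']
```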

There are also some anti-censorship networks, such as Freenet. It manages small encrypted documents. The search is done by depth-first search, oriented by a notion of distance between users. Found data is replicated on the back-path of the search. Such a network could easily be limited to friends.
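
A minimal sketch of this depth-first routing with back-path replication, assuming a hypothetical hash-based distance function (not Freenet's actual algorithm):

```python
import hashlib

class Peer:
    def __init__(self, name, store=None):
        self.name, self.store, self.friends = name, store or {}, []

def distance(peer, key):
    # Hypothetical distance between a peer and a key (hash-based).
    h = lambda s: int(hashlib.sha1(s.encode()).hexdigest(), 16)
    return h(peer.name) ^ h(key)

def dfs_search(peer, key, visited=None):
    visited = visited if visited is not None else set()
    visited.add(peer)
    if key in peer.store:
        return peer.store[key]
    # Depth-first, trying the neighbours closest to the key first.
    for friend in sorted(peer.friends, key=lambda f: distance(f, key)):
        if friend not in visited:
            doc = dfs_search(friend, key, visited)
            if doc is not None:
                peer.store[key] = doc   # replication on the back-path
                return doc
    return None

a, b = Peer("a"), Peer("b", {"k": "encrypted doc"})
a.friends, b.friends = [b], [a]
print(dfs_search(a, "k"), "k" in a.store)   # -> encrypted doc True
```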

GNUnet is another example of an anti-censorship network. The search is done by a limited breadth-first search. It uses a shortcut system that randomly modifies the IDs of queries for anonymisation, and a credit system to avoid flooding. It has been shown that these two optimizations are in fact a weakness for anonymisation.
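
A minimal sketch of the query-ID rewriting alone (hypothetical names, not GNUnet's actual protocol); after the rewrite, a downstream peer cannot tell whether the forwarder is the original requester:

```python
import uuid

class Peer:
    def __init__(self):
        self.pending = {}   # fresh id -> previous id, to route replies back

def rewrite_and_forward(peer, query):
    # The forwarding peer replaces the query id with a fresh one and
    # remembers the mapping so the answer can find its way back.
    fresh = uuid.uuid4().hex
    peer.pending[fresh] = query["id"]
    return {**query, "id": fresh}

relay = Peer()
q = rewrite_and_forward(relay, {"id": "original", "keyword": "doc"})
print(q["id"] != "original", relay.pending[q["id"]])  # -> True original
```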

Some clues about Orkut

Some simulations have been done based on a trace of Orkut. They raised interesting questions about the topology of the network.

  • What is the distribution of the node degrees?
  • What happens to the connectivity when nodes are removed?

The answers to these questions depend deeply on how the crawl was made.
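
A sketch of these two measurements, using networkx on a random scale-free graph as a stand-in, since the Orkut trace is not public:

```python
import networkx as nx

# Random scale-free graph standing in for the crawled Orkut trace.
G = nx.barabasi_albert_graph(n=10_000, m=3, seed=0)

# Distribution of node degrees.
hist = nx.degree_histogram(G)   # hist[d] = number of nodes of degree d

# Connectivity when removing nodes, highest-degree first.
by_degree = sorted(G.nodes, key=G.degree, reverse=True)
G.remove_nodes_from(by_degree[:100])
largest = max(nx.connected_components(G), key=len)
print(len(hist), len(largest) / G.number_of_nodes())
```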

Problem

  • How to manage big files?
  • How to specify the level of the attacker, in order to obtain different theoretical guarantees?
  • How to restrict the real network to a sub-network?

Ideas

The load should be balanced between query time and publication time. Most P2P methods do the work at query time, but one could also think of a diffusion process when a resource is published (through subscription to feeds, replication, or materialization of local index tables). Both methods can be mixed; this is the case in structured networks such as DHTs, where a distributed index is materialized and then queried.
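
A minimal sketch of the publication-time side of this trade-off, with subscribers materializing a local index at publication time so that queries become local lookups (hypothetical names):

```python
class Feed:
    def __init__(self):
        self.subscribers = []   # local indexes of the subscribed peers

    def publish(self, item):
        # Diffusion process: the work happens once, at publication time.
        for local_index in self.subscribers:
            local_index.append(item)

feed = Feed()
alice_index, bob_index = [], []
feed.subscribers += [alice_index, bob_index]

feed.publish("new photo")
# Query time is now a local lookup, with no network traffic.
print("new photo" in alice_index)   # -> True
```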

Finally, the methods should be optimized depending on the file type and the file size.
