Report on the presentation of Sonja Buchegger, March 9th, 2009
See PeerSoN web site and slides for more details
Warning : this report outlines the understanding of the post author (Alban Galland) and nothing more.
Ubiquitous computing is a model where devices and systems collaborate to solve tasks given by the user without him being conscious of it. This paradigm leads to problems of privacy, since you leave trace everywhere in a virtual world integrated to real world. All these data could be used for data-mining, from advertising to surveillance. This virtual world usually suffer for a lack of memory loss. These systems also tends to centralize the data of the users on a part of the system. The personal (private), public and commercial spheres collide in this context.
Social networks are another model where this privacy issue is risen, since users store very personal data on these systems. They are usually web 2.0 services which need Internet connexion. The main feature is to let users keep in touch with their friend in an ambient way.
Social networks and ubiquitous computing are naturally distributed. PeerSoN use a distributed storage of data. To solve online availability problem, it uses replication on friends, the keys parameters chosen given a trace of users characterizing their temporal and geographical distribution. To solve boot-strapping, it also use storage on random nodes. The peers communicate directly but they use a DHT for lookup. This DHT was build using openDHT and Planet Labs in a first version, but too many availability problems lead to a centralized emulation of HT (put/get/remove operations) on the current version. The peers are identified by the hash of a globally unique identifier (such as email address) . When connecting to the DHT, the user register his user id, his machine IP and his data.
In order not to be dependent of a network connection (and to go further on the ubiquity), the design should take in account delay tolerant networks. It is useful to carry information from friend to friend. Asynchronous messaging is an example of such content. But it is not clear that distribution will work well this way. It is also useful for storage, since the system should use the storage available around.
There is a trade-off between privacy and search. The user defines what he want to be searchable. The system emulates a fine-grained access control with keys (whom can see which part of the profile). This method would also provide protection against storage provider. The key management emulates a standard public key infrastructure and key may be exchanged by direct contact.
Related work and issues
- Distributed file management: usually, the assumptions are that data is stable and interests follow Zipf’s distribution. In SN context, data change a lot and distribution of interest is local
- Anonymity: distribution in a DHT leaves less traces of the query
- Media storage: the storage should be optimized using novelty.
Response time testings using different assumptions on the network.