Sample Datasets
-
XML documents
- movies.xml: all the movies in one file, without references.
- zipmovies.zip: a collection with one file per movie; each movie is self-contained (no references).
- movies_refs.xml: all the movies in one file, with a list of movie elements that refer to artist elements.
- movies_alone.xml and artists_alone.xml: one file for movies, another for artists; each movie refers to its directors and actors by their ID.
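
When movies and artists live in separate files, the ID references have to be resolved by the reader. The sketch below shows one way to do that in Python. The element and attribute names (`movie`, `title`, `director`, `actor`, `artist`, `id`, `last_name`) and the inline sample data are assumptions made for illustration; they are not taken from the actual files, so adjust them to the real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature stand-ins for movies_alone.xml and artists_alone.xml.
# The real files may use different element and attribute names.
MOVIES_XML = """
<movies>
  <movie>
    <title>Example Movie</title>
    <director id="a1"/>
    <actor id="a2"/>
  </movie>
</movies>
"""

ARTISTS_XML = """
<artists>
  <artist id="a1"><last_name>Director</last_name></artist>
  <artist id="a2"><last_name>Actor</last_name></artist>
</artists>
"""


def resolve_references(movies_xml, artists_xml):
    """Replace artist IDs in each movie with the artist's last name."""
    # Build an ID -> name lookup table from the artists document.
    artists = {
        artist.get("id"): artist.findtext("last_name")
        for artist in ET.fromstring(artists_xml).iter("artist")
    }
    resolved = []
    for movie in ET.fromstring(movies_xml).iter("movie"):
        director = movie.find("director")
        resolved.append({
            "title": movie.findtext("title"),
            # An unknown or missing ID resolves to None instead of raising.
            "director": artists.get(director.get("id")) if director is not None else None,
            "actors": [artists.get(a.get("id")) for a in movie.findall("actor")],
        })
    return resolved


print(resolve_references(MOVIES_XML, ARTISTS_XML))
```

To run this against the real files, the two strings could be replaced with `ET.parse(path).getroot()` calls once the actual element names are known.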
-
JSON documents
- jsonmovies.zip: all the movies encoded in JSON, in one file, without references.
- JSON files representing the DBLP books.
-
Text documents
- Text files with one line per bibliographic entry, each beginning with the author: small.txt, medium.txt, large.txt
- Text files with one line per proceedings entry: small.txt, medium.txt,