Data

The lab gives registered participants access to a massive collection of microblogs and URLs related to cultural festivals around the world.

It allows researchers in IR and NLP to experiment with a broad variety of multilingual microblog search techniques (Wikipedia entity search, automatic summarization, language identification, text localization, etc.).

A login is required to access the data. Once registered on CLEF, each team can obtain up to 4 extra individual logins by writing to admin@talne.eu.


Articles in this section

  • The festival galleries dataset

    by Eric SanJuan

    This data set supports experiments in microblog search and stream summarization.
    Microblog collection
    The document collection is provided to registered participants by the ANR GAFES project. It consists of a pool of more than 50M unique microblogs from different sources with their meta-information (...)