3 - Time Line Illustration

1. Goal

The goal is to retrieve all relevant tweets dedicated to each event of a festival, according to the program provided. We are aiming here at a kind of "total recall" retrieval, starting from the show names, the artist names, and the dates and times of the shows.

This task focuses on 4 festivals: two French music festivals, one French theater festival and one British theater festival.

2. Topics

Topics are given in the file clef_mc2_task3_topics.xml

Each topic is related to one cultural event.
In our terminology, one event is one occurrence of a show (theater, music, ...).
Several occurrences of the same show correspond then to several events (e.g. plays can be presented several times during theater festivals).
More precisely, one topic is described by: one id, one festival name, one title, one artist (or band) name, one timeslot (date/time begin and end), and one location venue.

An excerpt from the topic list is:

<topics>
...
        <topic>
                <id>5</id>
                <title></title>
                <artist>Klangstof</artist>
                <festival>transmusicales</festival>
                <startdate>04/12/16-17:45</startdate>
                <enddate>04/12/16-18:30</enddate>
                <venue>UBU</venue>
        </topic>
...
</topics>

The id is an integer ranging from 1 to 664.
We see from the excerpt above that, for a live music show without any specific title, the title field is empty.
The artist field holds a single artist name, a list of artist names, or the name of an artistic company or orchestra, as it appears in the official programs of the festivals.
The festival labels are:

  • charrues for Vieilles Charrues 2015,
  • transmusicales for Transmusicales 2015,
  • avignon for Avignon 2016,
  • edinburgh for Edinburgh 2016.

For the fields startdate and enddate, the format is: DD/MM/YY-HH:MM.
If the start or end time is unknown, it is replaced with xx:xx, giving: DD/MM/YY-xx:xx.
If the day is unknown, the date is omitted and the format is: -HH:MM.
The venue is a string corresponding to the name of the location, given by the official programs.
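
As an illustration of the field formats described above, here is a minimal Python sketch that reads topics and splits the time slot fields. The function names parse_slot and parse_topics, and the inline SAMPLE string (copied from the excerpt), are assumptions made for this example; they are not part of the official task material.

```python
import datetime as dt
import xml.etree.ElementTree as ET

# Inline copy of the excerpt above, used only to keep the sketch self-contained.
SAMPLE = """<topics>
  <topic>
    <id>5</id>
    <title></title>
    <artist>Klangstof</artist>
    <festival>transmusicales</festival>
    <startdate>04/12/16-17:45</startdate>
    <enddate>04/12/16-18:30</enddate>
    <venue>UBU</venue>
  </topic>
</topics>"""


def parse_slot(value):
    """Split 'DD/MM/YY-HH:MM' into (date, time); unknown parts become None."""
    day_part, _, time_part = (value or "").strip().partition("-")
    day = dt.datetime.strptime(day_part, "%d/%m/%y").date() if day_part else None
    if time_part and time_part.lower() != "xx:xx":
        clock = dt.datetime.strptime(time_part, "%H:%M").time()
    else:
        clock = None  # time given as xx:xx, or missing
    return day, clock


def parse_topics(xml_text):
    """Return one dict per <topic>, with the raw fields plus parsed time slots."""
    topics = []
    for node in ET.fromstring(xml_text).iter("topic"):
        fields = {child.tag: (child.text or "").strip() for child in node}
        fields["id"] = int(fields["id"])
        fields["start"] = parse_slot(fields.get("startdate"))
        fields["end"] = parse_slot(fields.get("enddate"))
        topics.append(fields)
    return topics


topics = parse_topics(SAMPLE)
```

To process the real file, the same function could be fed the content of clef_mc2_task3_topics.xml, provided that file follows the structure of the excerpt.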

3. Dataset

A login is required to access the data. Once registered on CLEF, each registered team can obtain up to 4 extra individual logins by writing to admin@talne.eu.

Participants are required to use the full dataset to conduct their experiments.

4. Runs

Runs are expected to follow the classical TREC top-file format. Only the top 1000 results for each query must be given. Each retrieved document is identified by its tweet id.
The evaluation will be carried out on a subset of the full set of topics, selected according to the richness of the results obtained.
The official evaluation measures planned are recall values at 5, 10, 25, 50 and 100 documents.
Each registered participant should submit no more than 6 runs. The protocol to submit the runs will be described later.
The evaluation protocol is likely to change depending on the submissions received.
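
As a rough sketch of the run format mentioned above, the following Python function writes results in the usual six-column TREC layout (topic id, the literal Q0, document id, rank, score, run tag) and keeps at most 1000 tweets per topic. That this exact layout, 1-based ranks and four-decimal scores are what the organizers expect is an assumption; the function name format_run and the run tag are hypothetical.

```python
def format_run(results, run_tag, max_per_topic=1000):
    """Build TREC-style run lines.

    results maps a topic id to a list of (tweet_id, score) pairs.
    Assumed layout: topic_id Q0 tweet_id rank score run_tag
    """
    lines = []
    for topic_id in sorted(results):
        # Highest score first, then cut to the allowed number of results.
        ranked = sorted(results[topic_id], key=lambda pair: pair[1], reverse=True)
        for rank, (tweet_id, score) in enumerate(ranked[:max_per_topic], start=1):
            lines.append(f"{topic_id} Q0 {tweet_id} {rank} {score:.4f} {run_tag}")
    return lines


# Toy scores for topic 5; the tweet ids are made up for the example.
demo_lines = format_run({5: [("111", 0.5), ("222", 0.9)]}, "demo")
```

The resulting lines can be written to a file with "\n".join(demo_lines).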

5. Evaluation

As many retweets as possible will be excluded from the pools.
Tweet relevance will be based on a 3-level scale:

  • Not relevant: the tweet is not related to the topic
  • Partially relevant: the tweet is somehow related to the topic (e.g. the tweet is related to the artist, song, play but not to the event, or is related to a similar event with no possible way to check if they are the same)
  • Relevant: the tweet is related to the event
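
To make the planned measures concrete, here is a minimal Python sketch of recall at the cutoffs 5, 10, 25, 50 and 100 over graded judgments. The numeric encoding of the scale (0 = not relevant, 1 = partially relevant, 2 = relevant) and the choice of which grades count as relevant are assumptions for this example; the official evaluation may treat partially relevant tweets differently.

```python
CUTOFFS = (5, 10, 25, 50, 100)


def recall_at_k(ranked_ids, judgments, k, min_grade=2):
    """Fraction of relevant tweets found in the top k of a ranking.

    judgments maps a tweet id to a grade (assumed: 0, 1 or 2).
    A tweet counts as relevant when its grade is at least min_grade.
    """
    relevant = {doc for doc, grade in judgments.items() if grade >= min_grade}
    if not relevant:
        return 0.0  # no relevant tweet known for this topic
    retrieved = set(ranked_ids[:k])
    return len(relevant & retrieved) / len(relevant)


# Toy data: four judged tweets and one ranking for a single topic.
judgments = {"a": 2, "b": 1, "c": 2, "d": 0}
ranking = ["a", "d", "b", "c"]
scores = {k: recall_at_k(ranking, judgments, k) for k in CUTOFFS}
```

Passing min_grade=1 counts partially relevant tweets as relevant too, which is one possible lenient reading of the scale.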

News

  • CLEF 2017 Microblog Cultural Contextualization overviews in Dublin

    Labs 4, 13:45-15:45, CMC, room 5039

    Content analysis and Microblog Search:

    1. Detailed overview
    2. participant presentations
    3. discussion towards Cultural Image Queries over Social Media.

    Labs 5, 16:45-18:15, CMC, room 5039

    Time Line Illustration:

    1. Detailed overview
    2. evaluation material release
    3. discussion towards dealing with Language Dialects and Varieties in Mining and Search over Cultural Social Media posts.


  • Topics released for task 3

    Topics are given in the file clef_mc2_task3_topics.xml

    They are extracted from 4 festival programs (see readme file): Vieilles Charrues 2015,
    Transmusicales 2015, Avignon 2016, Edinburgh 2016.

  • Topics released for tasks 1 and 2

    Topics have been released for tasks 1 and 2.

    A login is required to access the data. Once registered on CLEF, each registered team can obtain up to 4 extra individual logins by writing to admin@talne.eu.

    The complete stream of 70 000 000 microblogs is available for registered participants.
    An Indri index with a web interface and an online API are available to query the whole set of microblogs.

Articles in this section

  • TimeLine Illustration based on Microblogs

    by Lorraine, Philippe

    This paper by Nayanika DOGRA, Philippe MULHEM, Nawal OULD AMER, and Lorraine GOEURIOT presents the approach used by the LIG-MRIM research group for its participation in the pilot task TimeLine illustration based on Microblogs for the 2016 CLEF Cultural Microblog (...)

  • TimeLine illustration of a festival based on Microblogs

    by Lorraine, Philippe

    Objective
    The goal of this task is to link the events of a festival program to related microblog posts. This information is very important for festival attendees, and for organizers to get feedback.
    Microblog posts will be provided with their timestamps, which are crucial as a basis for (...)