Skip to content

Repository for storing dataset related with the paper "A Three-level Classification of French Tweets in Ecological Crisis"

License

Notifications You must be signed in to change notification settings

DiegoKoz/french_ecological_crisis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

18 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Dataset description

This repository contains the first crisis-related dataset for French, developed at IRIT-Toulouse University and IJN-CNRS/ENS/EHESS/PSL. The dataset is about 13 ecological crises that occurred in metropolitan France and its overseas departments targeting the most discussed crises in the French media from 2016 to 2019: two floods that occured in the Aude and Corsica regions, ten storms (Béryl, Berguitta, Fionn, Eleanor, Bruno, Egon, Ulrika, Susanna, Fakir and Ana) and two hurricanes (Irma and Harvey).

It is composed of around 13k tweets posted before (24h before), during (48h) and after the crisis (72h after). Each message has been annotated according to three dimensions: relatedness (useful vs. non useful), urgency (non useful vs. urgent vs. non urgent), and intentions to act (6 categories among: human/material damages, warnings/advices, critics, etc.).

The distribution of the tweets in the dataset is as follows:

Repository structure

The dataset is in csv format, each line contains:

tweetID crisisname period relatedness_annotation urgency_annotation intention_annotation

where:

  • tweetID: unique Tweet ID
  • crisisname: the name of the crisis
  • period: before (24h before), during (48h) and after the crisis (72h after).
  • relatedness_annotation: useful vs. non useful
  • urgency_annotation: urgent, non urgent, non useful
  • intention_annotation: Human damages, Material damages, Warning-advice, Support, Critics, Other messages.

To obtain the individual tweets, use the Twitter API of your choice and query for the ID's provided.

Reference

More information about the dataset and the guidelines can be found in the following article:

Diego Kozlowski, Elisa Lannelongue, Frédéric Saudemont, Farah Benamara, Alda Mari, Véronique Moriceau, Abdelmoumene Boumadane (2020). A three-level classification of French tweets in ecological crises. Information Processing & Management, Volume 57, Issue 5, 2020.

If you use any of the provided material in your work, please cite us as follows

@article{Kozlowski-et-al2020,
title = "A three-level classification of French tweets in ecological crises",
journal = "Information Processing & Management",
volume = "57",
number = "5",
pages = "102284",
year = "2020",
issn = "0306-4573",
doi = "https://doi.org/10.1016/j.ipm.2020.102284",
url = "http://www.sciencedirect.com/science/article/pii/S0306457320300650",
author = "Diego Kozlowski and Elisa Lannelongue and Frédéric Saudemont and Farah Benamara and Alda Mari and Véronique Moriceau and Abdelmoumene Boumadane",
keywords = "Crisis response from social media, Machine learning, Natural language processing, Transfer learning",
}

Licence

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Contact information

If you have any question or suggestion, do not hesitate to contact us at the following addresses:

Aknowledgement

This work has been supported by the Chemi-INTACT project funded by French Ministère de l'Intérieur (Department of Home Affairs) that involved the Institut de Recherche en Informatique de Toulouse (IRIT) and Institut Jean Nicod (IJN).

About

Repository for storing dataset related with the paper "A Three-level Classification of French Tweets in Ecological Crisis"

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published