CS-lol: a Dataset for Audience Opinion Mining of E-sports Live-streaming

junj1ehx/CS-lol

CS-lol

CS-lol: a Dataset of Viewer Comment with Scene in E-sports Live-streaming

License

The CS-lol dataset is released under CC-BY-NC-SA-4.0. It is therefore freely available for academic purposes and individual research, but restricted for commercial use.

The source code YouTube-vtt-to-srt.py is inherited from the ytb-vtt-to-srt project, which is released under the MPL v2.0 license.

Citation

If you use the tools or dataset developed in this work, please kindly cite our paper:


@inproceedings{cslol,
author = {Xu, Junjie H. and Nakano, Yu and Kong, Lingrong and Iizuka, Kojiro},
title = {CS-Lol: A Dataset of Viewer Comment with Scene in E-Sports Live-Streaming},
year = {2023},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
booktitle = {Proceedings of the 2023 Conference on Human Information Interaction and Retrieval},
pages = {422–426},
location = {Austin, TX, USA},
series = {CHIIR '23}
}

About usage of CS-lol

Twitch (Comments)

According to the official terms published by Twitch:

8. User Content
Twitch allows you to distribute streaming live and pre-recorded audio-visual works;
to use services, such as chat, bulletin boards, forum postings, wiki contributions, 
and voice interactive services; and to participate in other activities 
in which you may create, post, transmit, perform, or store content,
messages, text, sound, images, applications, code, or other data 
or materials on the Twitch Services (“User Content”).

Following the aforementioned terms, we conclude that user comments (chat posted while watching a live stream) are license-free to use. Moreover, because constructing the dataset takes time and different construction processes might produce varying results, we distribute the raw dataset used in this work.

YouTube (Descriptions)

All content used in this dataset relies on subtitles that are automatically generated by the ASR system integrated into YouTube. However, YouTube itself does not seem to hold the license for those subtitles, since they are a transformation of other people's creations, namely (impromptu) transcripts originating from the commentators.

Because we are not sure about the license, we instead provide a script to download those subtitles, as well as a preprocessing script that transforms them into the data format expected by our implementation of the retrieval models.

To get the descriptions, please follow README.md.
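For readers who want a feel for what the subtitle preprocessing involves, here is a minimal sketch of a WebVTT-to-SRT conversion. It is not the repository's YouTube-vtt-to-srt.py, and the function name `vtt_to_srt` is hypothetical. It assumes well-formed cue timing lines and simply drops inline tags, cue identifiers, and header lines.

```python
import re

# Matches a WebVTT cue timing line; the hour field is optional in WebVTT.
CUE_TIMING = re.compile(
    r"^(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+"
    r"(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})"
)
# Inline markup such as <c>, </c>, or word-level timestamps like <00:00:01.500>.
TAG = re.compile(r"<[^>]+>")


def vtt_to_srt(vtt_text):
    """Convert WebVTT text to SRT text (simplified, hypothetical helper)."""
    cues = []
    current = None
    for raw in vtt_text.splitlines():
        line = raw.strip()
        match = CUE_TIMING.match(line)
        if match:
            h1, m1, s1, ms1, h2, m2, s2, ms2 = match.groups()
            # SRT always writes hours and uses a comma before milliseconds.
            start = f"{int(h1 or 0):02d}:{m1}:{s1},{ms1}"
            end = f"{int(h2 or 0):02d}:{m2}:{s2},{ms2}"
            current = {"start": start, "end": end, "text": []}
            cues.append(current)
        elif not line:
            # A blank line ends the current cue.
            current = None
        elif current is not None:
            text = TAG.sub("", line).strip()
            if text:
                current["text"].append(text)
        # Lines outside a cue (header, cue identifiers, notes) are skipped.

    # Number only the cues that still contain text after tag removal.
    kept = [cue for cue in cues if cue["text"]]
    blocks = []
    for index, cue in enumerate(kept, start=1):
        body = "\n".join(cue["text"])
        blocks.append(f"{index}\n{cue['start']} --> {cue['end']}\n{body}")
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    sample = (
        "WEBVTT\nKind: captions\n\n"
        "00:00:01.000 --> 00:00:03.500 align:start position:0%\n"
        "welcome<00:00:01.500><c> back</c>\n\n"
        "00:05.000 --> 00:07.000\n"
        "to the finals\n"
    )
    print(vtt_to_srt(sample))
```

Auto-generated captions often repeat text across overlapping cues; this sketch makes no attempt to deduplicate them, so a real pipeline would likely need an extra cleanup step.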

Need help?

If you have a request or question, feel free to open an issue here, send your request via email to jhxu dotto acm.org, or DM me on Twitter.

Thank you for your attention!
