From 547c20e4c31c736fa8f46612cb987033e9cf5099 Mon Sep 17 00:00:00 2001
From: Santiago Castro
Date: Sat, 14 Mar 2020 16:11:40 -0400
Subject: [PATCH 1/2] Fix typo in README

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index eef2835..9d2417e 100644
--- a/README.md
+++ b/README.md
@@ -19,7 +19,7 @@ Hasan, Md Kamrul, Wasifur Rahman, Amir Zadeh, Jianyuan Zhong, Md Iftekhar Tanvee
 You can find the version of the dataset that we used in the EMNLP paper in the following link: (https://github.com/ROC-HCI/UR-FUNNY/blob/master/UR-FUNNY-V1.md)
 ## UR-FUNNY-V2
-We have created second version of the dataset which removes nosiy data instances and the humor insatnces has no overlap. This new version also has more context sentences. You will also find the raw videos in here. The format of this version is simialr to previous one. Please read the followings for details about the extracted features.
+We have created second version of the dataset which removes nosiy data instances and the humor instances has no overlap. This new version also has more context sentences. You will also find the raw videos in here. The format of this version is simialr to previous one. Please read the followings for details about the extracted features.
 raw videos: (https://www.dropbox.com/s/lg7kjx0kul3ansq/urfunny2_videos.zip?dl=1)
 extracted features: (https://www.dropbox.com/sh/9h0pcqmqoplx9p2/AAC8yYikSBVYCSFjm3afFHQva?dl=1)

From 11a79b966d60d9b1f82caa38342529e02c8a1eb9 Mon Sep 17 00:00:00 2001
From: Santiago Castro
Date: Sat, 14 Mar 2020 16:12:05 -0400
Subject: [PATCH 2/2] Fix typo in README

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 9d2417e..68a4212 100644
--- a/README.md
+++ b/README.md
@@ -41,7 +41,7 @@ In the extracted features folder, it has five pkl files:
 data_folds.pkl has the ductionary that contains train, dev and test list of humor/not humor video segments **id**.
-## Langauge Features:
+## Language Features:
 word_embedding_list.pkl has the list of word embeddings of all unique words that are present in the UR-FUNNY dataset. We use the **word indexes** from this list as language feature. Later we can use these **word indexes** to retrive the glove embedding of those words. We followed this approach to reduce the space. Because same word appears multiple times.