<?xml version='1.0' encoding='UTF-8'?>
<collection id="2020.fnp">
<volume id="1" ingest-date="2020-11-29">
<meta>
<booktitle>Proceedings of the 1st Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation</booktitle>
<editor><first>Mahmoud</first><last>El-Haj</last></editor>
<editor><first>Vasiliki</first><last>Athanasakou</last></editor>
<editor><first>Sira</first><last>Ferradans</last></editor>
<editor><first>Catherine</first><last>Salzedo</last></editor>
<editor><first>Ans</first><last>Elhag</last></editor>
<editor><first>Houda</first><last>Bouamor</last></editor>
<editor><first>Marina</first><last>Litvak</last></editor>
<editor><first>Paul</first><last>Rayson</last></editor>
<editor><first>George</first><last>Giannakopoulos</last></editor>
<editor><first>Nikiforos</first><last>Pittaras</last></editor>
<publisher>COLING</publisher>
<address>Barcelona, Spain (Online)</address>
<month>December</month>
<year>2020</year>
</meta>
<frontmatter>
<url hash="e65cdc18">2020.fnp-1.0</url>
</frontmatter>
<paper id="1">
<title>The Financial Narrative Summarisation Shared Task (<fixed-case>FNS</fixed-case> 2020)</title>
<author><first>Mahmoud</first><last>El-Haj</last></author>
<author><first>Ahmed</first><last>AbuRa’ed</last></author>
<author><first>Marina</first><last>Litvak</last></author>
<author><first>Nikiforos</first><last>Pittaras</last></author>
<author><first>George</first><last>Giannakopoulos</last></author>
<pages>1–12</pages>
<abstract>This paper presents the results and findings of the Financial Narrative Summarisation shared task (FNS 2020) on summarising UK annual reports. The shared task was organised as part of the 1st Financial Narrative Processing and Financial Narrative Summarisation Workshop (FNP-FNS 2020). The shared task included one main task which is the use of either abstractive or extractive summarisation methodologies and techniques to automatically summarise UK financial annual reports. FNS summarisation shared task is the first to target financial annual reports. The data for the shared task was created and collected from publicly available UK annual reports published by firms listed on the London Stock Exchange (LSE). A total number of 24 systems from 9 different teams participated in the shared task. In addition we had 2 baseline summarisers and additional 2 topline summarisers to help evaluate and compare against the results of the participants.</abstract>
<url hash="b438a705">2020.fnp-1.1</url>
</paper>
<paper id="2">
<title>The Financial Document Structure Extraction Shared task (<fixed-case>F</fixed-case>in<fixed-case>T</fixed-case>oc 2020)</title>
<author><first>Najah-Imane</first><last>Bentabet</last></author>
<author><first>Rémi</first><last>Juge</last></author>
<author><first>Ismail</first><last>El Maarouf</last></author>
<author><first>Virginie</first><last>Mouilleron</last></author>
<author><first>Dialekti</first><last>Valsamou-Stanislawski</last></author>
<author><first>Mahmoud</first><last>El-Haj</last></author>
<pages>13–22</pages>
<abstract>This paper presents the FinTOC-2020 Shared Task on structure extraction from financial documents, its participants’ results and their findings. This shared task was organized as part of The 1st Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation (FNP-FNS 2020), held at The 28th International Conference on Computational Linguistics (COLING’2020). This shared task aimed to stimulate research in systems for extracting table-of-contents (TOC) from investment documents (such as financial prospectuses) by detecting the document titles and organizing them hierarchically into a TOC. For the second edition of this shared task, two subtasks were presented to the participants: one with English documents and the other one with French documents.</abstract>
<url hash="c5cc5e48">2020.fnp-1.2</url>
</paper>
<paper id="3">
<title>The Financial Document Causality Detection Shared Task (<fixed-case>F</fixed-case>in<fixed-case>C</fixed-case>ausal 2020)</title>
<author><first>Dominique</first><last>Mariko</last></author>
<author><first>Hanna</first><last>Abi-Akl</last></author>
<author><first>Estelle</first><last>Labidurie</last></author>
<author><first>Stephane</first><last>Durfort</last></author>
<author><first>Hugues</first><last>De Mazancourt</last></author>
<author><first>Mahmoud</first><last>El-Haj</last></author>
<pages>23–32</pages>
<abstract>We present the FinCausal 2020 Shared Task on Causality Detection in Financial Documents and the associated FinCausal dataset, and discuss the participating systems and results. Two sub-tasks are proposed: a binary classification task (Task 1) and a relation extraction task (Task 2). A total of 16 teams submitted runs across the two Tasks and 13 of them contributed a system description paper. This workshop is associated with the Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation (FNP-FNS 2020), held at The 28th International Conference on Computational Linguistics (COLING’2020), Barcelona, Spain on September 12, 2020.</abstract>
<url hash="08403b89">2020.fnp-1.3</url>
</paper>
<paper id="4">
<title><fixed-case>L</fixed-case>ang<fixed-case>R</fixed-case>esearch<fixed-case>L</fixed-case>ab_<fixed-case>NC</fixed-case> at <fixed-case>F</fixed-case>in<fixed-case>C</fixed-case>ausal 2020, Task 1: A Knowledge Induced Neural Net for Causality Detection</title>
<author><first>Raksha</first><last>Agarwal</last></author>
<author><first>Ishaan</first><last>Verma</last></author>
<author><first>Niladri</first><last>Chatterjee</last></author>
<pages>33–39</pages>
<abstract>Identifying causal relationships in a text is essential for achieving comprehensive natural language understanding. The present work proposes a combination of features derived from pre-trained BERT with linguistic features for training a supervised classifier for the task of Causality Detection. The Linguistic features help to inject knowledge about the semantic and syntactic structure of the input sentences. Experiments on the FinCausal Shared Task1 datasets indicate that the combination of Linguistic features with BERT improves overall performance for causality detection. The proposed system achieves a weighted average F1 score of 0.952 on the post-evaluation dataset.</abstract>
<url hash="aa7535c1">2020.fnp-1.4</url>
</paper>
<paper id="5">
<title><fixed-case>GB</fixed-case>e at <fixed-case>F</fixed-case>in<fixed-case>C</fixed-case>ausal 2020, Task 2: Span-based Causality Extraction for Financial Documents</title>
<author><first>Guillaume</first><last>Becquin</last></author>
<pages>40–44</pages>
<abstract>This document describes a system for causality extraction from financial documents submitted as part of the FinCausal 2020 Workshop. The main contribution of this paper is a description of the robust post-processing used to detect the number of cause and effect clauses in a document and extract them. The proposed system achieved a weighted-average F1 score of more than 95% for the official blind test set during the post-evaluation phase and exact clauses match for 83% of the documents.</abstract>
<url hash="9fe275df">2020.fnp-1.5</url>
</paper>
<paper id="6">
<title><fixed-case>LIORI</fixed-case> at the <fixed-case>F</fixed-case>in<fixed-case>C</fixed-case>ausal 2020 Shared task</title>
<author><first>Denis</first><last>Gordeev</last></author>
<author><first>Adis</first><last>Davletov</last></author>
<author><first>Alexey</first><last>Rey</last></author>
<author><first>Nikolay</first><last>Arefiev</last></author>
<pages>45–49</pages>
<abstract>In this paper, we describe the results of team LIORI at the FinCausal 2020 Shared task held as a part of the 1st Joint Workshop on Financial Narrative Processing and MultiLingual Financial Summarisation. The shared task consisted of two subtasks: classifying whether a sentence contains any causality and labelling phrases that indicate causes and consequences. Our team ranked 1st in the first subtask and 4th in the second one. We used Transformer-based models with joint-task learning and their ensembles.</abstract>
<url hash="9c0046b2">2020.fnp-1.6</url>
</paper>
<paper id="7">
<title><fixed-case>JDD</fixed-case> @ <fixed-case>F</fixed-case>in<fixed-case>C</fixed-case>ausal 2020, Task 2: Financial Document Causality Detection</title>
<author><first>Toshiya</first><last>Imoto</last></author>
<author><first>Tomoki</first><last>Ito</last></author>
<pages>50–54</pages>
<abstract>This paper describes the approach we built for the Financial Document Causality Detection Shared Task (FinCausal-2020) Task 2: Cause and Effect Detection. Our approach is based on a multi-class classifier using BiLSTM with Graph Convolutional Neural Network (GCN) trained by minimizing the binary cross entropy loss. In our approach, we have not used any extra data source apart from combining the trial and practice dataset. We achieve a weighted F1 score of 75.61 percent and are ranked in 7th place.</abstract>
<url hash="fec1db74">2020.fnp-1.7</url>
</paper>
<paper id="8">
<title><fixed-case>UPB</fixed-case> at <fixed-case>F</fixed-case>in<fixed-case>C</fixed-case>ausal-2020, Tasks 1 &amp; 2: Causality Analysis in Financial Documents using Pretrained Language Models</title>
<author><first>Marius</first><last>Ionescu</last></author>
<author><first>Andrei-Marius</first><last>Avram</last></author>
<author><first>George-Andrei</first><last>Dima</last></author>
<author><first>Dumitru-Clementin</first><last>Cercel</last></author>
<author><first>Mihai</first><last>Dascalu</last></author>
<pages>55–59</pages>
<abstract>Financial causality detection is centered on identifying connections between different assets from financial news in order to improve trading strategies. FinCausal 2020 (Causality Identification in Financial Documents) is a competition that aims to boost results in financial causality by obtaining an explanation of how different individual events or chains of events interact and generate subsequent events in a financial environment. The competition is divided into two tasks: (a) a binary classification task for determining whether sentences are causal or not, and (b) a sequence labeling task aimed at identifying elements related to cause and effect. Various Transformer-based language models were fine-tuned for the first task and we obtained second place in the competition with an F1-score of 97.55% using an ensemble of five such language models. Subsequently, a BERT model was fine-tuned for the second task and a Conditional Random Field model was used on top of the generated language features; the system managed to identify the cause and effect relationships with an F1-score of 73.10%. We open-sourced the code and made it available at: https://github.com/avramandrei/FinCausal2020.</abstract>
<url hash="fb9b3635">2020.fnp-1.8</url>
</paper>
<paper id="9">
<title><fixed-case>NITK</fixed-case> <fixed-case>NLP</fixed-case> at <fixed-case>F</fixed-case>in<fixed-case>C</fixed-case>ausal-2020 Task 1 Using <fixed-case>BERT</fixed-case> and Linear models.</title>
<author><first>Hariharan</first><last>R L</last></author>
<author><first>Anand Kumar</first><last>M</last></author>
<pages>60–63</pages>
<abstract>FinCausal-2020 is a shared task that focuses on causality detection in factual data for financial analysis. Financial data facts do not provide much explanation of the variability of these data. This paper proposes an efficient method to classify whether or not the data has a financial cause. Of the many models used to classify the data, an SVM model gave an F-Score of 0.9435, and BERT with specific fine-tuning achieved the best results with an F-Score of 0.9677.</abstract>
<url hash="d621f784">2020.fnp-1.9</url>
</paper>
<paper id="10">
<title>Fraunhofer <fixed-case>IAIS</fixed-case> at <fixed-case>F</fixed-case>in<fixed-case>C</fixed-case>ausal 2020, Tasks 1 &amp; 2: Using Ensemble Methods and Sequence Tagging to Detect Causality in Financial Documents</title>
<author><first>Maren</first><last>Pielka</last></author>
<author><first>Rajkumar</first><last>Ramamurthy</last></author>
<author><first>Anna</first><last>Ladi</last></author>
<author><first>Eduardo</first><last>Brito</last></author>
<author><first>Clayton</first><last>Chapman</last></author>
<author><first>Paul</first><last>Mayer</last></author>
<author><first>Rafet</first><last>Sifa</last></author>
<pages>64–68</pages>
<abstract>The FinCausal 2020 shared task aims to detect causality on financial news and identify those parts of the causal sentences related to the underlying cause and effect. We apply ensemble-based and sequence tagging methods for identifying causality, and extracting causal subsequences. Our models yield promising results on both sub-tasks, with the prospect of further improvement given more time and computing resources. With respect to task 1, we achieved an F1 score of 0.9429 on the evaluation data, and a corresponding ranking of 12/14. For task 2, we were ranked 6/10, with an F1 score of 0.76 and an ExactMatch score of 0.1912.</abstract>
<url hash="b612206a">2020.fnp-1.10</url>
</paper>
<paper id="11">
<title><fixed-case>NTUNLPL</fixed-case> at <fixed-case>F</fixed-case>in<fixed-case>C</fixed-case>ausal 2020, Task 2: Improving Causality Detection Using <fixed-case>V</fixed-case>iterbi Decoder</title>
<author><first>Pei-Wei</first><last>Kao</last></author>
<author><first>Chung-Chi</first><last>Chen</last></author>
<author><first>Hen-Hsen</first><last>Huang</last></author>
<author><first>Hsin-Hsi</first><last>Chen</last></author>
<pages>69–73</pages>
<abstract>In order to provide an explanation of machine learning models, causality detection attracts a lot of attention in the artificial intelligence research community. In this paper, we explore cause-effect detection in financial news and propose an approach that combines the BIO scheme with the Viterbi decoder to address this challenge. Our approach is ranked first in the official run of cause-effect detection (Task 2) of the FinCausal-2020 shared task. We not only report the implementation details and ablation analysis in this paper, but also publish our code for academic usage.</abstract>
<url hash="8e862016">2020.fnp-1.11</url>
</paper>
<paper id="12">
<title><fixed-case>F</fixed-case>i<fixed-case>NLP</fixed-case> at <fixed-case>F</fixed-case>in<fixed-case>C</fixed-case>ausal 2020 Task 1: Mixture of <fixed-case>BERT</fixed-case>s for Causal Sentence Identification in Financial Texts</title>
<author><first>Sarthak</first><last>Gupta</last></author>
<pages>74–79</pages>
<abstract>This paper describes our system developed for the sub-task 1 of the FinCausal shared task in the FNP-FNS workshop held in conjunction with COLING-2020. The system classifies whether a financial news text segment contains causality or not. To address this task, we fine-tune and ensemble the generic and domain-specific BERT language models pre-trained on financial text corpora. The task data is highly imbalanced with the majority non-causal class; therefore, we train the models using strategies such as under-sampling, cost-sensitive learning, and data augmentation. Our best system achieves a weighted F1-score of 96.98 securing 4th position on the evaluation leaderboard. The code is available at https://github.com/sarthakTUM/fincausal</abstract>
<url hash="60bbeee2">2020.fnp-1.12</url>
</paper>
<paper id="13">
<title><fixed-case>P</fixed-case>rosper<fixed-case>AM</fixed-case>net at <fixed-case>F</fixed-case>in<fixed-case>C</fixed-case>ausal 2020, Task 1 &amp; 2: Modeling causality in financial texts using multi-headed transformers</title>
<author><first>Zsolt</first><last>Szántó</last></author>
<author><first>Gábor</first><last>Berend</last></author>
<pages>80–84</pages>
<abstract>This paper introduces our efforts at the FinCausal shared task for modeling causality in financial utterances. Our approach uses the commonly and successfully applied strategy of fine-tuning a transformer-based language model with a twist, i.e. we modified the training and inference mechanism such that our model produces multiple predictions for the same instance. By designing such a model that returns k&gt;1 predictions at the same time, we not only obtain a more resource efficient training (as opposed to fine-tuning some pre-trained language model k independent times), but our results indicate that we are also capable of obtaining comparable or even better evaluation scores that way. We compare multiple strategies for combining the k predictions of our model. Our submissions got ranked third on both subtasks of the shared task.</abstract>
<url hash="1b3a265f">2020.fnp-1.13</url>
</paper>
<paper id="14">
<title><fixed-case>ISIKUN</fixed-case> at the <fixed-case>F</fixed-case>in<fixed-case>C</fixed-case>ausal 2020: Linguistically informed Machine-learning Approach for Causality Identification in Financial Documents</title>
<author><first>Gökberk</first><last>Özenir</last></author>
<author><first>İlknur</first><last>Karadeniz</last></author>
<pages>85–89</pages>
<abstract>This paper presents our participation in the FinCausal-2020 Shared Task, whose ultimate aim is to extract cause-effect relations from a given financial text. Our participation includes two systems for the two sub-tasks of the FinCausal-2020 Shared Task. The first sub-task (Task-1) consists of the binary classification of the given sentences as causal meaningful (1) or causal meaningless (0). Our approach for Task-1 includes applying linear support vector machines after transforming the input sentences into vector representations using the term frequency-inverse document frequency scheme with 3-grams. The second sub-task (Task-2) consists of the identification of the cause-effect relations in the sentences that are detected as causal meaningful. Our approach for Task-2 is a CRF-based model which uses linguistically informed features. For Task-1, the obtained results show that there is only a small difference between the proposed approach based on linear support vector machines (F-score 94%), which requires less time, and the BERT-based baseline (F-score 95%). For Task-2, although only minor modifications, such as the learning algorithm type and the feature representations, are made to the conditional random fields based baseline (F-score 52%), we have obtained better results (F-score 60%). The source code for both tasks is available online (https://github.com/ozenirgokberk/FinCausal2020.git/).</abstract>
<url hash="5c235c68">2020.fnp-1.14</url>
</paper>
<paper id="15">
<title>Domino at <fixed-case>F</fixed-case>in<fixed-case>C</fixed-case>ausal 2020, Task 1 and 2: Causal Extraction System</title>
<author><first>Sharanya</first><last>Chakravarthy</last></author>
<author><first>Tushar</first><last>Kanakagiri</last></author>
<author><first>Karthik</first><last>Radhakrishnan</last></author>
<author><first>Anjana</first><last>Umapathy</last></author>
<pages>90–94</pages>
<abstract>Automatic identification of cause-effect relationships from data is a challenging but important problem in artificial intelligence. Identifying semantic relationships has become increasingly important for multiple downstream applications like Question Answering, Information Retrieval and Event Prediction. In this work, we tackle the problem of causal relationship extraction from financial news using the FinCausal 2020 dataset. We tackle two tasks - 1) Detecting the presence of causal relationships and 2) Extracting segments corresponding to cause and effect from news snippets. We propose Transformer based sequence and token classification models with post-processing rules which achieve an F1 score of 96.12 and 79.60 on Tasks 1 and 2 respectively.</abstract>
<url hash="5960491b">2020.fnp-1.15</url>
</paper>
<paper id="16">
<title><fixed-case>IIT</fixed-case>kgp at <fixed-case>F</fixed-case>in<fixed-case>C</fixed-case>ausal 2020, Shared Task 1: Causality Detection using Sentence Embeddings in Financial Reports</title>
<author><first>Arka</first><last>Mitra</last></author>
<author><first>Harshvardhan</first><last>Srivastava</last></author>
<author><first>Yugam</first><last>Tiwari</last></author>
<pages>95–99</pages>
<abstract>The paper describes the work that the team submitted to the FinCausal 2020 Shared Task. This work is associated with the first sub-task of identifying causality in sentences. The various models used in the experiments tried to obtain a latent space representation for each of the sentences. Linear regression was performed on these representations to classify whether the sentence is causal or not. The experiments have shown that BERT (Large) performed the best, giving an F1 score of 0.958, in the task of detecting the causality of sentences in financial texts and reports. The class imbalance was dealt with using a modified loss function to give a better metric score for the evaluation.</abstract>
<url hash="320606fe">2020.fnp-1.16</url>
</paper>
<paper id="17">
<title>Extractive Financial Narrative Summarisation based on <fixed-case>DPP</fixed-case>s</title>
<author><first>Lei</first><last>Li</last></author>
<author><first>Yafei</first><last>Jiang</last></author>
<author><first>Yinan</first><last>Liu</last></author>
<pages>100–104</pages>
<abstract>We participate in the FNS-Summarisation 2020 shared task to be held at FNP 2020 workshop at COLING 2020. Based on Determinantal Point Processes (DPPs), we build an extractive automatic financial summarisation system for the specific task. In this system, we first analyze the long report data to select the important narrative parts and generate an intermediate document. Next, we build the kernel Matrix L for the intermediate document, which represents the quality of its sentences. On the basis of L, we then can use the DPPs sampling algorithm to choose those sentences with high quality and diversity as the final summary sentences.</abstract>
<url hash="30d861d5">2020.fnp-1.17</url>
</paper>
<paper id="18">
<title><fixed-case>P</fixed-case>oin<fixed-case>T</fixed-case>-5: Pointer Network and <fixed-case>T</fixed-case>-5 based Financial Narrative Summarisation</title>
<author><first>Abhishek</first><last>Singh</last></author>
<pages>105–111</pages>
<abstract>Companies provide annual reports to their shareholders at the end of the financial year that describe their operations and financial conditions. The average length of these reports is 80 pages, and it may extend up to 250 pages. In this paper, we propose our methodology PoinT-5 (the combination of Pointer Network and T-5 (Text-to-Text Transfer Transformer) algorithms) that we used in the Financial Narrative Summarisation (FNS) 2020 task. The proposed method uses Pointer networks to extract important narrative sentences from the report, and then T-5 is used to paraphrase extracted sentences into a concise yet informative sentence. We evaluate our method using Rouge-N (1,2), L, and SU4. The proposed method achieves the highest precision scores in all the metrics and the highest F1 scores in three out of four evaluation metrics, namely Rouge 1, 2, and LCS, and is the only solution to surpass the MUSE baseline solution in the Rouge-LCS metric.</abstract>
<url hash="1c7a8b0f">2020.fnp-1.18</url>
</paper>
<paper id="19">
<title>Combining financial word embeddings and knowledge-based features for financial text summarization <fixed-case>UC</fixed-case>3<fixed-case>M</fixed-case>-<fixed-case>MC</fixed-case> System at <fixed-case>FNS</fixed-case>-2020</title>
<author><first>Jaime</first><last>Baldeon Suarez</last></author>
<author><first>Paloma</first><last>Martínez</last></author>
<author><first>Jose Luis</first><last>Martínez</last></author>
<pages>112–117</pages>
<abstract>This paper describes the systems proposed by HULAT research group from Universidad Carlos III de Madrid (UC3M) and MeaningCloud (MC) company to solve the FNS 2020 Shared Task on summarizing financial reports. We present a narrative extractive approach that implements a statistical model comprised of different features that measure the relevance of the sentences using a combination of statistical and machine learning methods. The key to the model’s performance is its accurate representation of the text, since the word embeddings used by the model have been trained with the summaries of the training dataset and therefore capture the most salient information from the reports. The systems’ code can be found at https://github.com/jaimebaldeon/FNS-2020.</abstract>
<url hash="30935e72">2020.fnp-1.19</url>
</paper>
<paper id="20">
<title>End-to-end Training For Financial Report Summarization</title>
<author><first>Moreno</first><last>La Quatra</last></author>
<author><first>Luca</first><last>Cagliero</last></author>
<pages>118–123</pages>
<abstract>Quoted companies are requested to periodically publish financial reports in textual form. The annual financial reports typically include detailed financial and business information, thus giving relevant insights into company outlooks. However, a manual exploration of these financial reports could be very time consuming since most of the available information can be deemed as non-informative or redundant by expert readers. Hence, an increasing research interest has been devoted to automatically extracting domain-specific summaries, which include only the most relevant information. This paper describes the SumTO system architecture, which addresses the Shared Task of the Financial Narrative Summarisation (FNS) 2020 contest. The main task objective is to automatically extract the most informative, domain-specific textual content from financial, English-written documents. The aim is to create a summary of each company report covering all the business-relevant key points. To address the above-mentioned goal, we propose an end-to-end training method relying on Deep NLP techniques. The idea behind the system is to exploit the syntactic overlap between input sentences and ground-truth summaries to fine-tune pre-trained BERT embedding models, thus making such models tailored to the specific context. The achieved results confirm the effectiveness of the proposed method, especially when the goal is to select relatively long text snippets.</abstract>
<url hash="628f542b">2020.fnp-1.20</url>
</paper>
<paper id="21">
<title><fixed-case>SCE</fixed-case>-<fixed-case>SUMMARY</fixed-case> at the <fixed-case>FNS</fixed-case> 2020 shared task</title>
<author><first>Marina</first><last>Litvak</last></author>
<author><first>Natalia</first><last>Vanetik</last></author>
<author><first>Zvi</first><last>Puchinsky</last></author>
<pages>124–129</pages>
<abstract>With the constantly growing amount of information, the need arises to automatically summarize this written information. One of the challenges in summarization is that it is difficult to generalize. For example, summarizing a news article is very different from summarizing a financial earnings report. This paper reports an approach for summarizing financial texts, which differ from documents in other domains in at least three parameters: length, structure, and format. Our approach considers these parameters: it is adapted to the hierarchical structure of sections, document length, and special “language”. The approach builds a hierarchical summary, visualized as a tree with summaries under different discourse topics. The approach was evaluated using extrinsic and intrinsic automated evaluations, which are reported in this paper. Like all participants of the Financial Narrative Summarisation (FNS 2020) shared task, we used the FNS2020 dataset for evaluations.</abstract>
<url hash="74521073">2020.fnp-1.21</url>
</paper>
<paper id="22">
<title>Knowledge Graph and Deep Neural Network for Extractive Text Summarization by Utilizing Triples</title>
<author><first>Amit</first><last>Vhatkar</last></author>
<author><first>Pushpak</first><last>Bhattacharyya</last></author>
<author><first>Kavi</first><last>Arya</last></author>
<pages>130–136</pages>
<abstract>In our research work, we represent the content of the sentence in graphical form after extracting triples from the sentences. In this paper, we will discuss novel methods to generate an extractive summary by scoring the triples. Our work has also touched upon sequence-to-sequence encoding of the content of the sentence, to classify it as a summary or a non-summary sentence. Our findings help to decide the nature of the sentences forming the summary and the length of the system generated summary as compared to the length of the reference summary.</abstract>
<url hash="4a1d877d">2020.fnp-1.22</url>
</paper>
<paper id="23">
<title><fixed-case>AMEX</fixed-case> <fixed-case>AI</fixed-case>-Labs: An Investigative Study on Extractive Summarization of Financial Documents</title>
<author><first>Piyush</first><last>Arora</last></author>
<author><first>Priya</first><last>Radhakrishnan</last></author>
<pages>137–142</pages>
<abstract>We describe the work carried out by AMEX AI-LABS on an extractive summarization benchmark task focused on Financial Narratives Summarization (FNS). This task focuses on summarizing annual financial reports, which poses two main challenges as compared to typical news document summarization tasks: i) annual reports are lengthier (average length about 80 pages) than typical news documents, and ii) annual reports are more loosely structured, e.g. comprising tables, charts, textual data and images, which makes them challenging to summarize effectively. To address this summarization task we investigate a range of unsupervised, supervised and ensemble based techniques. We find that ensemble based techniques perform relatively better than using only the unsupervised or supervised techniques. Our ensemble based model achieved the highest rank of 9 out of 31 systems submitted for the benchmark task based on the Rouge-L evaluation metric.</abstract>
<url hash="75e35e28">2020.fnp-1.23</url>
</paper>
<paper id="24">
<title>Extractive Summarization System for Annual Reports</title>
<author><first>Abderrahim</first><last>Ait Azzi</last></author>
<author><first>Juyeon</first><last>Kang</last></author>
<pages>143–147</pages>
<abstract>In this paper, we report on our experiments in building a summarization system for generating summaries from annual reports. We adopt an “extractive” summarization approach in our hybrid system combining neural networks and rules-based algorithms with the expectation that such a system may capture key sentences or paragraphs from the data. A rules-based TOC (Table Of Contents) extraction and a binary classifier of narrative section titles are the main components of our system, allowing it to identify narrative sections and the best candidates for extracting final summaries. As a result, we propose one to three summaries per document according to the classification score of narrative section titles.</abstract>
<url hash="f61b35da">2020.fnp-1.24</url>
</paper>
<paper id="25">
<title><fixed-case>SUMSUM</fixed-case>@<fixed-case>FNS</fixed-case>-2020 Shared Task</title>
<author><first>Siyan</first><last>Zheng</last></author>
<author><first>Anneliese</first><last>Lu</last></author>
<author><first>Claire</first><last>Cardie</last></author>
<pages>148–152</pages>
<abstract>This paper describes the SUMSUM systems submitted to the Financial Narrative Summarization Shared Task (FNS-2020). We explore a section-based extractive summarization method tailored to the structure of financial reports: our best system parses the report Table of Contents (ToC), splits the report into narrative sections based on the ToC, and applies a BERT-based classifier to each section to determine whether it should be included in the summary. Our best system ranks 4<sup>th</sup>, 1<sup>st</sup>, 2<sup>nd</sup> and 17<sup>th</sup> on the Rouge-1, Rouge-2, Rouge-SU4, and Rouge-L official metrics, respectively. We also report results on the validation set using an alternative set of Rouge-based metrics that measure performance with respect to the best-matching of the available gold summaries.</abstract>
<url hash="51a62fce">2020.fnp-1.25</url>
</paper>
<paper id="26">
<title><fixed-case>AMEX</fixed-case>-<fixed-case>AI</fixed-case>-<fixed-case>LABS</fixed-case>: Investigating Transfer Learning for Title Detection in Table of Contents Generation</title>
<author><first>Dhruv</first><last>Premi</last></author>
<author><first>Amogh</first><last>Badugu</last></author>
<author><first>Himanshu</first><last>Sharad Bhatt</last></author>
<pages>153–157</pages>
<abstract>We present a transfer learning approach for Title Detection in the FinToC 2020 challenge. Our proposed approach relies on the premise that the geometric layout and character features of the titles and non-titles can be learnt separately from a large corpus, and their learning can then be transferred to a domain-specific dataset. On a domain-specific dataset, we train a Deep Neural Net on the text of the document along with a pre-trained model for geometric and character features. We achieved an F-Score of 83.25 on the test set and secured the top rank in the title detection task in FinToC 2020.</abstract>
<url hash="9f300557">2020.fnp-1.26</url>
</paper>
<paper id="27">
<title><fixed-case>UWB</fixed-case>@<fixed-case>F</fixed-case>in<fixed-case>TOC</fixed-case>-2020 Shared Task: Financial Document Title Detection</title>
<author><first>Tomáš</first><last>Hercig</last></author>
<author><first>Pavel</first><last>Kral</last></author>
<pages>158–162</pages>
<abstract>This paper describes our system created for the Financial Document Structure Extraction Shared Task (FinTOC-2020): Title Detection. We rely on the Apache PDFBox library to extract text and additional information, e.g. font type and font size, from the financial prospectuses. Our constrained system uses only the provided training data without any additional external resources. Our system is based on the Maximum Entropy classifier and various features including font type and font size. Our system achieves an F1 score of 81% and first place in the French track, and an F1 score of 77% and second place among 5 participating teams in the English track.</abstract>
<url hash="c9e62cf1">2020.fnp-1.27</url>
</paper>
<paper id="28">
<title>Taxy.io@<fixed-case>F</fixed-case>in<fixed-case>TOC</fixed-case>-2020: Multilingual Document Structure Extraction using Transfer Learning</title>
<author><first>Frederic</first><last>Haase</last></author>
<author><first>Steffen</first><last>Kirchhoff</last></author>
<pages>163–168</pages>
<abstract>In this paper we describe our system submitted to the FinTOC-2020 shared task on financial document structure extraction. We propose a two-step approach to identify titles in financial documents and to extract their table of contents (TOC). First, we identify text blocks as candidates for titles using unsupervised learning based on character-level information of each document. Then, we apply supervised learning on a self-constructed regression task to predict the depth of each text block in the document structure hierarchy using transfer learning combined with document features and layout features. It is noteworthy that our single multilingual model performs well on both tasks and on different languages, which indicates the usefulness of transfer learning for title detection and TOC generation. Moreover, our approach is independent of the presence of actual TOC pages in the documents. It is also one of the few submissions to the FinTOC-2020 shared task addressing both subtasks in both languages, English and French, with one single model.</abstract>
<url hash="65681eb8">2020.fnp-1.28</url>
</paper>
<paper id="29">
<title><fixed-case>DNLP</fixed-case>@<fixed-case>F</fixed-case>in<fixed-case>TOC</fixed-case>’20: Table of Contents Detection in Financial Documents</title>
<author><first>Dijana</first><last>Kosmajac</last></author>
<author><first>Stacey</first><last>Taylor</last></author>
<author><first>Mozhgan</first><last>Saeidi</last></author>
<pages>169–173</pages>
<abstract>Title Detection and Table of Contents Generation are important components in detecting document structure. In particular, these two elements serve to provide the skeleton of the document, providing users with an understanding of organization, as well as the relevance of information, and where to find information within the document. Here, we show that using Tesseract with Levenshtein distance, a feature set inspired by Alk et al., we were able to correctly classify the title with an F1 measure of 0.73 and 0.87, and the table-of-contents with a harmonic mean of 0.36 and 0.39, in English and French respectively. Our methodology works with both PDF and scanned documents, giving it a wide range of applicability within the document engineering and storage domains.</abstract>
<url hash="0a28a8a2">2020.fnp-1.29</url>
</paper>
<paper id="30">
<title>Daniel@<fixed-case>F</fixed-case>in<fixed-case>TOC</fixed-case>’2 Shared Task: Title Detection and Structure Extraction</title>
<author><first>Emmanuel</first><last>Giguet</last></author>
<author><first>Gaël</first><last>Lejeune</last></author>
<author><first>Jean-Baptiste</first><last>Tanguy</last></author>
<pages>174–180</pages>
<abstract>We present our contributions for the 2020 FinTOC Shared Tasks: Title Detection and Table of Contents Extraction. For the Structure Extraction task, we propose an approach that combines information from multiple sources: the table of contents, the wording of the document, and lexical domain knowledge. For the title detection task, we compare surface features to character-based features on various training configurations. We show that title detection results are very sensitive to the kind of training dataset used.</abstract>
<url hash="1cc225b0">2020.fnp-1.30</url>
</paper>
<paper id="31">
<title>A Computational Analysis of Financial and Environmental Narratives within Financial Reports and its Value for Investors</title>
<author><first>Felix</first><last>Armbrust</last></author>
<author><first>Henry</first><last>Schäfer</last></author>
<author><first>Roman</first><last>Klinger</last></author>
<pages>181–194</pages>
<abstract>Public companies are obliged to include financial and non-financial information within their corporate filings under Regulation S-K, in the United States (SEC, 2010). However, the requirements still allow for manager’s discretion. This raises the question to what extent the information is actually included and whether this information is at all relevant for investors. We answer this question by training and evaluating an end-to-end deep learning approach (based on BERT and GloVe embeddings) to predict the financial and environmental performance of the company from the “Management’s Discussion and Analysis of Financial Conditions and Results of Operations” (MD&amp;A) section of 10-K (yearly) and 10-Q (quarterly) filings. We further analyse the mediating effect of the environmental performance on the relationship between the company’s disclosures and financial performance. Hereby, we address the results of previous studies regarding environmental performance. We find that the textual information contained within the MD&amp;A section does not allow for conclusions about the future (corporate) financial performance. However, there is evidence that the environmental performance can be extracted by natural language processing methods.</abstract>
<url hash="93580757">2020.fnp-1.31</url>
</paper>
<paper id="32">
<title>Information Extraction from Federal Open Market Committee Statements</title>
<author><first>Oana</first><last>Frunza</last></author>
<pages>195–203</pages>
<abstract>We present a novel approach to unsupervised information extraction by identifying and extracting relevant concept-value pairs from textual data. The system’s building blocks are domain agnostic, making it universally applicable. In this paper, we describe each component of the system and how it extracts relevant economic information from U.S. Federal Open Market Committee (FOMC) statements. Our methodology achieves an impressive 96% accuracy for identifying relevant information for a set of seven economic indicators: household spending, inflation, unemployment, economic activity, fixed investment, federal funds rate, and labor market.</abstract>
<url hash="63b90458">2020.fnp-1.32</url>
</paper>
<paper id="33">
<title>Mitigating Silence in Compliance Terminology during Parsing of Utterances</title>
<author><first>Esme</first><last>Manandise</last></author>
<author><first>Conrad</first><last>de Peuter</last></author>
<pages>204–212</pages>
<abstract>This paper reports on an approach to increase multi-token-term recall in a parsing task. We use a compliance-domain parser to extract, during the process of parsing raw text, terms that are unlisted in the terminology. The parser uses a similarity measure (Generalized Dice Coefficient) between listed terms and unlisted term candidates to (i) determine term status, (ii) serve putative terms to the parser, (iii) decrease parsing complexity by glomming multi-tokens as lexical singletons, and (iv) automatically augment the terminology after parsing of an utterance completes. We illustrate a small experiment with examples from the tax-and-regulations domain. Bootstrapping the parsing process to detect out-of-vocabulary terms at runtime increases parsing accuracy in addition to producing other benefits to a natural-language-processing pipeline, which translates arithmetic calculations written in English into computer-executable operations.</abstract>
<url hash="fb22bb11">2020.fnp-1.33</url>
</paper>
<paper id="34">
<title>Hierarchical summarization of financial reports with <fixed-case>RUNNER</fixed-case></title>
<author><first>Marina</first><last>Litvak</last></author>
<author><first>Natalia</first><last>Vanetik</last></author>
<author><first>Zvi</first><last>Puchinsky</last></author>
<pages>213–225</pages>
<abstract>With the constantly growing amount of information, the need arises to automatically summarize this written information. One of the challenges in summarization is that it is difficult to generalize. For example, summarizing a news article is very different from summarizing a financial earnings report. This paper reports an approach for summarizing financial texts, which differ from documents in other domains in at least three parameters: length, structure, and format. Our approach considers these parameters: it is adapted to the hierarchical structure of sections, document length, and special “language”. The approach builds a hierarchical summary, visualized as a tree with summaries under different discourse topics. The approach was evaluated using extrinsic and intrinsic automated evaluations, which are reported in this paper. Like all participants in the Financial Narrative Summarisation (FNS 2020) shared task, we used the FNS2020 dataset for evaluations.</abstract>
<url hash="79d4efc0">2020.fnp-1.34</url>
</paper>
<paper id="35">
<title>Predicting Modality in Financial Dialogue</title>
<author><first>Kilian</first><last>Theil</last></author>
<author><first>Heiner</first><last>Stuckenschmidt</last></author>
<pages>226–234</pages>
<abstract>In this paper, we perform modality prediction in financial dialogue. To this end, we introduce a new dataset and develop a binary classifier to detect strong or weak modal answers depending on surface, lexical, and semantic representations of the preceding question and financial features. To do so, we contrast different algorithms, feature categories, and fusion methods. Perhaps counter-intuitively, our results indicate that the strongest features for the given task are financial uncertainty measures such as market and individual firm risk.</abstract>
<url hash="7f24f5c5">2020.fnp-1.35</url>
</paper>
<paper id="36">
<title>Extracting Fine-Grained Economic Events from Business News</title>
<author><first>Gilles</first><last>Jacobs</last></author>
<author><first>Veronique</first><last>Hoste</last></author>
<pages>235–245</pages>
<abstract>Based on a recently developed fine-grained event extraction dataset for the economic domain, we present a pilot study for supervised economic event extraction. We investigate how a state-of-the-art model for event extraction performs on the trigger and argument identification and classification. While F1-scores of above 50% are obtained on the task of trigger identification, we observe a large gap in performance compared to results on the benchmark ACE05 dataset. We show that single-token triggers do not provide sufficient discriminative information for a fine-grained event detection setup in a closed domain such as economics, since many classes have a large degree of lexico-semantic and contextual overlap.</abstract>
<url hash="e44a86fd">2020.fnp-1.36</url>
</paper>
</volume>
</collection>