problem while extracting the proper noun #2

nandanii · 2018-08-04T11:18:45Z

File not found at line 24: "wordsEn.txt"

dereckson · 2018-08-04T19:21:43Z

The README provides the following instructions:

Source text
-----------
You need a copy of the text you want to extract from as plain text.

Source English word list
------------------------
The expected format is a list in lowercase, each line a substantive word.
Filename should be wordsEn.txt or modified in eliminate-common-nouns script.

Such file is available at http://www-01.sil.org/linguistics/wordlists/english/

Usage
-----
./extract-proper-nouns source.txt > nouns.txt

To sort them and eliminate duplicates:
./extract-proper-nouns source.txt | sort | uniq > nouns.txt

To discard known English words:
./eliminate-common-nouns nouns.txt

I guess there are two things to solve:

- offer a clear message when the file hasn't been found, explaining how to obtain it
- clarify the README to indicate that downloading a list of common nouns is mandatory
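
The first point could look roughly like the sketch below. It is only an illustration in POSIX shell: the function name `require_word_list` and the `WORD_LIST` variable are hypothetical and are not taken from the repository's scripts, which may be written differently. The download URL is the one given in this thread.

```shell
#!/bin/sh
# Hypothetical pre-flight check (names are assumptions, not the project's code).
# Default word list location, overridable through the environment.
WORD_LIST="${WORD_LIST:-wordsEn.txt}"

require_word_list() {
    # $1: path to the word list; falls back to $WORD_LIST when omitted.
    list="${1:-$WORD_LIST}"
    if [ ! -r "$list" ]; then
        # Explain on stderr what is missing and how to get it.
        {
            echo "Word list not found: $list"
            echo "Download one before discarding common nouns, for example:"
            echo "  wget http://www-01.sil.org/linguistics/wordlists/english/wordlist/wordsEn.txt"
        } >&2
        return 1
    fi
    return 0
}
```

A script would call `require_word_list || exit 1` before reading the list, so the user sees the instructions instead of a bare "file not found" error.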

dereckson · 2018-08-04T19:25:48Z

This works:

wget http://www-01.sil.org/linguistics/wordlists/english/wordlist/wordsEn.txt
./extract-proper-nouns somebook.txt
