Grouping the proper names #1

Open
ardanalbant opened this issue Apr 18, 2016 · 2 comments

Comments

ardanalbant commented Apr 18, 2016

The code in your GitHub repository can find proper names, but I would like it to group them.
For example:

"Abraham Lincoln Hotel is very beautiful place and i want to go there with
Barbara Palvin. Also there are stores like Adidas ,Nike , Reebok."

The output should be:

['Abraham Lincoln Hotel'] is very beautiful place and i want to go there with ['Barbara Palvin']. ['Also'] there are stores like ['Adidas'], ['Nike'], ['Reebok'].

As you mentioned, words like "Also" aren't a problem for me, because I have a large dataset to compare these proper names against.

See Also:
http://stackoverflow.com/questions/36688176/python-group-sequential-array-members
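
The grouping requested above can be sketched roughly as follows. This is a minimal illustration, not the repository's script: `group_proper_names` is a hypothetical helper, the tokenizer is a plain regular expression, and "proper name" is assumed to mean any run of consecutive tokens that start with an uppercase letter. Sentence-initial words such as "Also" are therefore still returned, which matches the expected output in this comment.

```python
import re
from itertools import groupby


def group_proper_names(text):
    """Return each run of consecutive capitalized words as one string.

    Hypothetical helper for illustration; capitalization is the only
    signal used, so sentence-initial words are included too.
    """
    # Split into words and single punctuation marks, so that a comma or
    # a period ends the current run of capitalized words.
    tokens = re.findall(r"\w+|[^\w\s]", text)
    return [
        " ".join(run)
        for is_capitalized, run in groupby(tokens, key=lambda t: t[0].isupper())
        if is_capitalized
    ]


sentence = (
    "Abraham Lincoln Hotel is very beautiful place and i want to go there "
    "with Barbara Palvin. Also there are stores like Adidas ,Nike , Reebok."
)
print(group_proper_names(sentence))
# ['Abraham Lincoln Hotel', 'Barbara Palvin', 'Also', 'Adidas', 'Nike', 'Reebok']
```

`itertools.groupby` only merges adjacent items with the same key, which is why the commas keep "Adidas", "Nike" and "Reebok" as separate groups.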

Owner

dereckson commented Apr 18, 2016

What output do you currently get for this sentence?

To disable the logic that ignores the first part of a sentence, try commenting out lines 47 to 49; that should work.

Author

ardanalbant commented Apr 18, 2016

Hi Sebastien,

First of all, I can't use the script, because I think PunktWordTokenizer no longer exists in the latest version. But I thought your output is an array of proper names that are not chunked, right?

Also, I don't know what to download with nltk.download() for PunktWordTokenizer.
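
One possible workaround for the missing tokenizer, sketched under assumptions: `PunktWordTokenizer` appears to have been dropped from NLTK 3.x, while `WordPunctTokenizer` in `nltk.tokenize` is regex-based and should not need anything from `nltk.download()`. Whether it is a suitable replacement inside the script is untested here. The fallback branch uses the same pattern that `WordPunctTokenizer` is understood to use, so the sketch also runs without NLTK installed; `tokenize` is a hypothetical wrapper name.

```python
import re

try:
    # Regex-based tokenizer shipped with NLTK; no model download required.
    from nltk.tokenize import WordPunctTokenizer

    _tokenize = WordPunctTokenizer().tokenize
except ImportError:
    # Assumed equivalent pattern, used only when NLTK is not installed.
    _tokenize = re.compile(r"\w+|[^\w\s]+").findall


def tokenize(text):
    """Split text into word tokens and punctuation tokens."""
    return _tokenize(text)


tokens = tokenize("Barbara Palvin. Also there are stores like Adidas ,Nike , Reebok.")
print(tokens)
```

If the Punkt models are wanted instead, `nltk.word_tokenize` is the usual entry point; it relies on a resource fetched with `nltk.download('punkt')`, though the exact resource name may differ between NLTK releases.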
