Fixed emoji to text conversion for emoji not surrounded by whitespace #57
Fixes this issue: Not predicting sentiment of emoticons correctly #56
The current method splits the text into tokens on whitespace, so it doesn't recognize multiple emoji in a row without whitespace between them. For example, "😀😀😀" isn't given any meaning because the exact string "😀😀😀" isn't in the emoji lexicon, when it should probably have the same meaning as "😀 😀 😀". Checking for emoji on a character-by-character basis should fix this. Example output after the fix:
The compound score goes up as expected for three emoji in a row
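For illustration, the character-by-character check described above could look roughly like the sketch below. This is not the code from this PR: `EMOJI_LEXICON` and `expand_emoji` are hypothetical names, and the two-entry lexicon is a stand-in for the real emoji lexicon. It also assumes every emoji is a single code point, so multi-code-point sequences (skin tone modifiers, ZWJ sequences) would need extra handling.

```python
# Hypothetical stand-in for the real emoji lexicon (emoji -> text description).
EMOJI_LEXICON = {
    "😀": "grinning face",
    "😢": "crying face",
}


def expand_emoji(text, lexicon=EMOJI_LEXICON):
    """Replace each emoji with its description, one character at a time.

    Every description is padded with spaces so adjacent emoji, or an emoji
    glued to a word, become separate whitespace-delimited tokens.
    """
    pieces = []
    for ch in text:  # iterating a str yields one code point at a time
        if ch in lexicon:
            pieces.append(" " + lexicon[ch] + " ")
        else:
            pieces.append(ch)
    # Collapse the extra padding; note this also normalizes existing whitespace.
    return " ".join("".join(pieces).split())


if __name__ == "__main__":
    print(expand_emoji("😀😀😀"))
    print(expand_emoji("so happy😀"))
```

With this approach "😀😀😀" and "😀 😀 😀" expand to the same token sequence, so a whitespace tokenizer applied afterwards would treat them identically.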