author last name "de Lhoneux" incorrectly uppercased to "De Lhoneux"? #171
It looks like the heuristic implemented there is that each word of the name that is all-lowercase or all-uppercase is converted to initial capitalization. It might be better to tweak the capitalization of the words of the name only if none of its words mixes uppercase and lowercase, i.e.:

    if len(last_name)>2:
        if all(n.isupper() or n.islower() for n in last_name.split(" ")):
            # name does not contain any words with both uppercase and lowercase characters; impose initial-only capitalization for each word
            last_name = " ".join([n[0].upper() + n[1:].lower() if (n==n.upper() or n==n.lower()) else n for n in last_name.split(" ")])

UPDATE: realized the inline conditional is redundant once the `all(...)` check has passed:

    if len(last_name)>2:
        if all(n.isupper() or n.islower() for n in last_name.split(" ")):
            # name does not contain any words with both uppercase and lowercase characters; impose initial-only capitalization for each word
            last_name = " ".join([n[0].upper() + n[1:].lower() for n in last_name.split(" ")])
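To make the difference concrete, here is a minimal, self-contained sketch contrasting the per-word heuristic as described in this comment with the whole-name variant proposed above. The function names `per_word_capitalize` and `whole_name_capitalize` are hypothetical, and the first function is a reconstruction from the description in this thread, not a copy of the code in `util.py`.

```python
def per_word_capitalize(last_name: str) -> str:
    """Heuristic as described in this thread (a reconstruction, not the
    actual util.py code): fix up each single-case word on its own."""
    if len(last_name) <= 2:
        return last_name
    return " ".join(
        # Only words that are entirely upper- or lowercase are rewritten.
        n[0].upper() + n[1:].lower() if (n.isupper() or n.islower()) else n
        for n in last_name.split(" ")
    )


def whole_name_capitalize(last_name: str) -> str:
    """Proposed variant: touch the name only if no word mixes cases."""
    if len(last_name) <= 2:
        return last_name
    words = last_name.split(" ")
    if all(n.isupper() or n.islower() for n in words):
        # Every word is single-case, so the author probably did not
        # choose the capitalization deliberately.
        return " ".join(n[0].upper() + n[1:].lower() for n in words)
    # At least one word is mixed-case: trust the name as entered.
    return last_name


print(per_word_capitalize("de Lhoneux"))    # De Lhoneux
print(whole_name_capitalize("de Lhoneux"))  # de Lhoneux
print(whole_name_capitalize("DE LHONEUX"))  # De Lhoneux
```

One limitation of this sketch: a name entered entirely in lowercase, such as "de lhoneux", would still come out as "De Lhoneux", because nothing in the input signals that the particle was lowercased on purpose.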
I would love to see a list of names as exported from OpenReview alongside the output of this function. We should really have a unit or regression test for this function since it is very important and getting it wrong causes a lot of corrections and headaches downstream.
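A regression check along these lines could be as small as a table of input names and expected outputs. The sketch below is only an illustration: `find_mismatches` and the `CASES` table are hypothetical, the cases are made up rather than taken from a real OpenReview export, and `str.title` is a deliberately naive stand-in for the function under test.

```python
from typing import Callable, Iterable, List, Tuple


def find_mismatches(
    normalize: Callable[[str], str],
    cases: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str, str]]:
    """Return (raw, expected, actual) for every case the normalizer gets wrong."""
    mismatches = []
    for raw, expected in cases:
        actual = normalize(raw)
        if actual != expected:
            mismatches.append((raw, expected, actual))
    return mismatches


# Hypothetical cases; a real table would come from exported author names.
CASES = [
    ("de Lhoneux", "de Lhoneux"),  # deliberate lowercase particle
    ("SMITH", "Smith"),            # all caps, safe to normalize
    ("smith", "Smith"),            # all lowercase, safe to normalize
]

# str.title is only a stand-in here; swap in the real name-normalizing function.
for raw, expected, actual in find_mismatches(str.title, CASES):
    print(f"{raw!r}: expected {expected!r}, got {actual!r}")
```

Keeping the cases in a plain list makes it easy to append every name that gets reported as wrong, so that a later change to the heuristic cannot silently reintroduce the same error.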
There was a similar issue for hyphenated names, which was addressed in #159. Maybe we can take a similar approach here? Basically, it uses what people provide as ground truth and always capitalizes the first and last. The middle is only capitalized if the person capitalized it in their softconf profile (so this would need to be addressed for OpenReview). It is potentially worth having the shared functionality in a file of its own, so that we avoid duplicates :)
As reported by @mdelhoneux in acl-org/acl-anthology#3208
I wonder if lines 62 to 63 of aclpub2/openreview/util.py (at commit 47dc3d2) might be the culprit.