We want to add 3 new ASCII-specific Str functions to our standard library. They were proposed here. Copied from that gist, they are listed here with their type signatures and proposed docs:
Str.with_ascii_lowercased : Str -> Str
Returns a version of the string with all ASCII characters lowercased. Non-ASCII characters are left unmodified. For example:
expect "CAFÉ".with_ascii_lowercased() == "cafÉ"
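As a rough illustration of the semantics described above, here is a minimal sketch in Python rather than Roc. The helper name `with_ascii_lowercased` is borrowed from the proposal, but this Python function is hypothetical and only models the behavior; it is not the proposed implementation.

```python
import string

# Translation table that maps only A-Z to a-z; every other code point is left untouched.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def with_ascii_lowercased(s: str) -> str:
    """Hypothetical Python model of the proposed Str.with_ascii_lowercased."""
    return s.translate(_ASCII_LOWER)


print(with_ascii_lowercased("CAFÉ"))  # cafÉ  (É is not ASCII, so it is unchanged)
print("CAFÉ".lower())                 # café  (Python's Unicode-aware lowercasing)
```

The contrast with `str.lower()` shows the difference between an ASCII-only operation and a Unicode-aware one on the same input.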
This function is useful for things like command-line options and environment variables where you know in advance that you're dealing with a hardcoded string containing only ASCII characters. It has better performance than lowercasing operations which take Unicode into account.
That said, strings received from user input can always contain non-ASCII Unicode characters, and lowercasing Unicode works differently in different languages. For example, the string "I" lowercases to "i" in English and to "ı" (a dotless i) in Turkish. These rules can also change in each Unicode release, so we have a separate unicode package for Unicode capitalization that can be upgraded independently from the language's builtins.
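To see why language-specific rules do not fit in a single builtin, it may help to look at a runtime that ships only the default, locale-independent Unicode mappings. The sketch below uses Python's built-in `str.lower()` purely as an example of such a runtime; it says nothing about how the separate unicode package would behave.

```python
# Python's str.lower() applies the default Unicode case mapping, with no
# language tailoring, so it cannot produce the Turkish result for "I".
print("I".lower())             # i  (never the Turkish dotless "ı")

# Lowercasing the dotted capital "İ" (U+0130) yields two code points:
# "i" followed by U+0307 COMBINING DOT ABOVE.
dotted_lower = "İ".lower()
print(len(dotted_lower))       # 2
print(dotted_lower == "i\u0307")  # True
```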
To do a case-insensitive comparison of the ASCII characters in a string, use caseless_ascii_equals.
Str.with_ascii_uppercased : Str -> Str
Returns a version of the string with all ASCII characters uppercased. Non-ASCII characters are left unmodified. For example:
expect "café".with_ascii_uppercased() == "CAFé"
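The same behavior can be sketched in Python as a model of the semantics. The function below is hypothetical (only its name comes from the proposal) and assumes that "ASCII characters" here means the code points `a` through `z`.

```python
def with_ascii_uppercased(s: str) -> str:
    """Hypothetical Python model of the proposed Str.with_ascii_uppercased."""
    # Only a-z are shifted to A-Z (the two ranges are 32 code points apart);
    # every other character passes through unchanged.
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in s)


print(with_ascii_uppercased("café"))    # CAFé
print(with_ascii_uppercased("straße"))  # STRAßE  (ß is not ASCII)
print("straße".upper())                 # STRASSE (Unicode-aware: ß expands to SS)
```

The last line is one reason Unicode-aware uppercasing is more expensive: it can change the length of the string, whereas the ASCII-only version never does.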
This function is useful for things like command-line options and environment variables where you know in advance that you're dealing with a hardcoded string containing only ASCII characters. It has better performance than uppercasing operations which take Unicode into account.
That said, strings received from user input can always contain non-ASCII Unicode characters, and uppercasing Unicode works differently in different languages. For example, the string "i" uppercases to "I" in English and to "İ" (a dotted I) in Turkish. These rules can also change in each Unicode release, so we have a separate unicode package for Unicode capitalization that can be upgraded independently from the language's builtins.
To do a case-insensitive comparison of the ASCII characters in a string, use caseless_ascii_equals.
Str.caseless_ascii_equals : Str, Str -> Bool
Returns True if all the ASCII characters in the two strings are the same when ignoring differences in capitalization. Non-ASCII characters must all be exactly the same, including capitalization. For example:
The first call returns True because all the ASCII characters are the same when ignoring differences in capitalization, and the only non-ASCII character (é) is the same in both strings. The second call returns False because é and É are not ASCII characters, and they are different.
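The comparison rule can be modeled with a short Python sketch. The function is hypothetical and only mirrors the documented behavior; the two inputs are illustrative strings chosen to match the description above (same é in the first pair, é versus É in the second), not necessarily the ones from the original proposal.

```python
def _fold_ascii(c: str) -> str:
    # Map A-Z to a-z; leave every other character, including non-ASCII, as is.
    return chr(ord(c) + 32) if "A" <= c <= "Z" else c


def caseless_ascii_equals(a: str, b: str) -> bool:
    """Hypothetical Python model of the proposed Str.caseless_ascii_equals."""
    # ASCII folding never changes length, so unequal lengths cannot match.
    if len(a) != len(b):
        return False
    return all(_fold_ascii(x) == _fold_ascii(y) for x, y in zip(a, b))


print(caseless_ascii_equals("café", "CAFé"))  # True
print(caseless_ascii_equals("café", "CAFÉ"))  # False
```

Note that this model compares code points exactly, so two strings that differ only in Unicode normalization form would also compare as different.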
This function is useful for things like command-line options and environment variables where you know in advance that you're dealing with a hardcoded string containing only ASCII characters. It has better performance than case-insensitive comparisons which take Unicode into account.
That said, strings received from user input can always contain non-ASCII Unicode characters, and case-insensitive Unicode comparisons work differently in different languages. For example, the strings "i" and "I" are the same in English when ignoring capitalization. In Turkish, the case-insensitive equivalent of "i" is not "I" but rather "İ" (a dotted I), and the case-insensitive equivalent of "I" is not "i" but rather "ı" (a dotless i). These rules can also change in each Unicode release, so we have a separate unicode package for Unicode capitalization that can be upgraded independently from the language's builtins.
To convert a string's ASCII characters to uppercase or lowercase, use with_ascii_uppercased and with_ascii_lowercased.