Add ASCII-specific Str functions #7473

smores56 · 2025-01-06T19:40:52Z

We want to add three new ASCII-specific Str functions to our standard library. They were proposed in a gist and are copied here from it, with their type signatures and proposed docs:

Str.with_ascii_lowercased : Str -> Str

Returns a version of the string with all ASCII characters lowercased. Non-ASCII characters are left unmodified. For example:

expect "CAFÉ".with_ascii_lowercased() == "cafÉ"

This function is useful for things like command-line options and environment variables where you know in advance that you're dealing with a hardcoded string containing only ASCII characters. It has better performance than lowercasing operations which take Unicode into account.

That said, strings received from user input can always contain non-ASCII Unicode characters, and lowercasing Unicode works differently in different languages. For example, the string "I" lowercases to "i" in English and to "ı" (a dotless i) in Turkish. These rules can also change in each Unicode release, so we have a separate unicode package for Unicode capitalization that can be upgraded independently from the language's builtins.

To do a case-insensitive comparison of the ASCII characters in a string, use caseless_ascii_equals.
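For anyone picking this up, the intended behavior can be modeled in a few lines. This is only a Python sketch of the semantics described above, not the Roc builtin: it assumes the string is UTF-8 (where every byte of a non-ASCII character is 0x80 or higher), and the Python function simply borrows the proposed name.

```python
def with_ascii_lowercased(s: str) -> str:
    """Lowercase only the ASCII letters A-Z; leave every other character as is."""
    data = bytearray(s.encode("utf-8"))
    for i, b in enumerate(data):
        # In UTF-8, bytes 0x41-0x5A can only be the ASCII letters A-Z,
        # because every byte of a multi-byte sequence is >= 0x80.
        if 0x41 <= b <= 0x5A:
            data[i] = b + 0x20  # 'A' (0x41) + 0x20 == 'a' (0x61)
    return data.decode("utf-8")


# Matches the example in the proposed docs: É is not ASCII, so it is untouched.
assert with_ascii_lowercased("CAFÉ") == "cafÉ"
# A Unicode-aware lowercase, by contrast, also changes the É.
assert "CAFÉ".lower() == "café"
```

Because the loop never looks at anything except single bytes in the A-Z range, it needs no Unicode tables and never changes the length of the string.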

Str.with_ascii_uppercased : Str -> Str

Returns a version of the string with all ASCII characters uppercased. Non-ASCII characters are left unmodified. For example:

expect "café".with_ascii_uppercased() == "CAFé"

This function is useful for things like command-line options and environment variables where you know in advance that you're dealing with a hardcoded string containing only ASCII characters. It has better performance than uppercasing operations which take Unicode into account.

That said, strings received from user input can always contain non-ASCII Unicode characters, and uppercasing Unicode works differently in different languages. For example, the string "i" uppercases to "I" in English and to "İ" (a dotted I) in Turkish. These rules can also change in each Unicode release, so we have a separate unicode package for Unicode capitalization that can be upgraded independently from the language's builtins.

To do a case-insensitive comparison of the ASCII characters in a string, use caseless_ascii_equals.
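The uppercase direction can be sketched the same way. Again, this is a Python model of the documented behavior under the assumption of UTF-8 input, not the actual Roc implementation, and the function name is just a mirror of the proposed one.

```python
def with_ascii_uppercased(s: str) -> str:
    """Uppercase only the ASCII letters a-z; leave every other character as is."""
    return bytes(
        # Bytes 0x61-0x7A are the ASCII letters a-z; subtracting 0x20 gives A-Z.
        b - 0x20 if 0x61 <= b <= 0x7A else b
        for b in s.encode("utf-8")
    ).decode("utf-8")


# Matches the example in the proposed docs: é is not ASCII, so it is untouched.
assert with_ascii_uppercased("café") == "CAFé"
# A Unicode-aware uppercase also changes the é.
assert "café".upper() == "CAFÉ"
```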

Str.caseless_ascii_equals : Str, Str -> Bool

Returns True if all the ASCII characters in the two strings are the same when ignoring differences in capitalization. Non-ASCII characters must all be exactly the same, including capitalization. For example:

expect "café".caseless_ascii_equals("CAFé")

expect !"café".caseless_ascii_equals("CAFÉ")

The first call returns True because all the ASCII characters are the same when ignoring differences in capitalization, and the only non-ASCII character (é) is the same in both strings. The second call returns False because é and É are not ASCII characters, and they are different.

This function is useful for things like command-line options and environment variables where you know in advance that you're dealing with a hardcoded string containing only ASCII characters. It has better performance than case-insensitive comparisons which take Unicode into account.

That said, strings received from user input can always contain non-ASCII Unicode characters, and case-insensitive Unicode comparisons work differently in different languages. For example, the strings "i" and "I" are the same in English when ignoring capitalization. In Turkish, the case-insensitive equivalent of "i" is not "I" but rather "İ" (a dotted I), and the case-insensitive equivalent of "I" is not "i" but rather "ı" (a dotless i). These rules can also change in each Unicode release, so we have a separate unicode package for Unicode capitalization that can be upgraded independently from the language's builtins.

To convert a string's ASCII characters to uppercase or lowercase, use with_ascii_uppercased and with_ascii_lowercased.
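The comparison can be modeled without allocating a lowercased copy of either string. The following is a Python sketch of the documented behavior, assuming UTF-8 input; `_fold_ascii` is a hypothetical helper introduced only for this illustration, and none of this is the actual Roc implementation.

```python
def _fold_ascii(b: int) -> int:
    """Map the bytes for A-Z onto a-z; return every other byte unchanged."""
    return b + 0x20 if 0x41 <= b <= 0x5A else b


def caseless_ascii_equals(a: str, b: str) -> bool:
    """Compare two strings byte by byte, ignoring case for ASCII letters only."""
    x, y = a.encode("utf-8"), b.encode("utf-8")
    # ASCII case folding never changes the byte length, so strings of
    # different lengths can be rejected before comparing any bytes.
    if len(x) != len(y):
        return False
    return all(_fold_ascii(p) == _fold_ascii(q) for p, q in zip(x, y))


# Both examples from the proposed docs:
assert caseless_ascii_equals("café", "CAFé")
assert not caseless_ascii_equals("café", "CAFÉ")
```

Non-ASCII bytes pass through `_fold_ascii` unchanged, which is why é and É compare as different here.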


HajagosNorbert · 2025-01-06T19:57:04Z

Hi! I want to do this, can you assign me?

smores56 · 2025-01-06T20:21:03Z

Good luck, and may the odds be ever in your favor

smores56 added the "good first issue" (Good for newcomers) and "builtins" (Relates to roc builtins like Bool, List, Str ...) labels Jan 6, 2025

smores56 assigned HajagosNorbert Jan 6, 2025
