String.next_codepoint
Specs
next_codepoint(t()) :: {codepoint(), t()} | nil
Returns the next code point in a string.
The result is a tuple with the code point and the remainder of the string, or nil if the string has reached its end.
As with other functions in the String module, next_codepoint/1 works with binaries that are invalid UTF-8. If the string starts with a sequence of bytes that is not valid in UTF-8 encoding, the first element of the returned tuple is a binary with the first byte.
Examples
iex> String.next_codepoint("olá")
{"o", "lá"}
iex> invalid = "\x80\x80OK" # first two bytes are invalid in UTF-8
iex> {_, rest} = String.next_codepoint(invalid)
{<<128>>, <<128, 79, 75>>}
iex> String.next_codepoint(rest)
{<<128>>, "OK"}
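Because the function returns nil once the string is exhausted, it can drive a simple recursive walk over a whole string. The following is a minimal sketch, not part of the String module: the module name CodepointWalk and its to_list/1 function are hypothetical and exist only for illustration.

```elixir
defmodule CodepointWalk do
  # Hypothetical helper for illustration; not part of the String module.
  # Collects every code point of a string by calling
  # String.next_codepoint/1 until it returns nil.
  def to_list(string) when is_binary(string), do: walk(string, [])

  defp walk(string, acc) do
    case String.next_codepoint(string) do
      # End of string: the accumulator was built in reverse order.
      nil -> Enum.reverse(acc)
      # Otherwise keep the code point and continue with the remainder.
      {codepoint, rest} -> walk(rest, [codepoint | acc])
    end
  end
end

CodepointWalk.to_list("ol\u00E1")
# ["o", "l", "á"]
CodepointWalk.to_list("")
# []
```

Invalid bytes show up in the list as single-byte binaries, following the behaviour described above.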
Comparison with binary pattern matching
Binary pattern matching provides a similar way to decompose a string:
iex> <<codepoint::utf8, rest::binary>> = "Elixir"
"Elixir"
iex> codepoint
69
iex> rest
"lixir"
This is not entirely equivalent, though: codepoint comes as an integer rather than a string, and the pattern won't match invalid UTF-8.
Binary pattern matching, however, is simpler and more efficient, so pick the option that better suits your use case.
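If you choose pattern matching but still need to cope with an empty string or invalid UTF-8, extra function clauses can cover those cases explicitly. The sketch below is one possible shape under that assumption: the module FirstCodepoint, its decode/1 function and the :ok, :empty and :invalid tags are all hypothetical names chosen for illustration.

```elixir
defmodule FirstCodepoint do
  # Hypothetical helper for illustration only.

  # Valid UTF-8 prefix: the code point is returned as an integer.
  def decode(<<codepoint::utf8, rest::binary>>), do: {:ok, codepoint, rest}

  # Empty binary: nothing left to decode.
  def decode(<<>>), do: :empty

  # Anything else starts with a byte that is not valid UTF-8 here.
  def decode(<<byte, rest::binary>>), do: {:invalid, byte, rest}
end

FirstCodepoint.decode("Elixir")
# {:ok, 69, "lixir"}
FirstCodepoint.decode(<<128, 79, 75>>)
# {:invalid, 128, "OK"}
FirstCodepoint.decode("")
# :empty

# An integer code point can be turned back into a string with the
# utf8 modifier:
<<69::utf8>>
# "E"
```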