Regex character classes fail with single scalar, non-NFC range bound elements #750

Open
kasei opened this issue Jul 10, 2024 · 1 comment

kasei commented Jul 10, 2024

Using a non-NFC, single-scalar code point like U+F900 as the start of a character class range causes an error:

1 | let r = #/[\u{F900}-\u{FDCF}]/#
  |           `- error: cannot parse regular expression: invalid bound for character class range

Tested with Swift 5.10 and 6.0 (Xcode 16b2 16A5171r):

swift-driver version: 1.90.11.1 Apple Swift version 5.10 (swiftlang-5.10.0.13 clang-1500.3.9.4)
Target: arm64-apple-macosx14.0

swift-driver version: 1.110 Apple Swift version 6.0 (swiftlang-6.0.0.4.52 clang-1600.0.21.1.3)
Target: arm64-apple-macosx14.0

This seems to be because U+F900 is not in NFC: it normalizes to U+8C48. I find this surprising, because although this code point is not in NFC, this character class range isn't ambiguous the way other non-NFC cases might be (e.g. using a decomposed combination, or U+F900 as a literal instead of with the \u escape).
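For reference, the normalization behavior described above can be checked independently of Regex. This is a minimal sketch using Foundation's `precomposedStringWithCanonicalMapping` (NFC); the variable names are just for illustration, and it says nothing about how the regex parser itself normalizes range bounds.

```swift
import Foundation

// U+F900 is a single scalar whose canonical decomposition is another
// single scalar, so converting to NFC replaces it outright.
let scalar: Unicode.Scalar = "\u{F900}"
let original = String(scalar)
let nfc = original.precomposedStringWithCanonicalMapping

// Render each scalar value as uppercase hex for comparison.
let originalScalars = original.unicodeScalars.map { String($0.value, radix: 16, uppercase: true) }
let nfcScalars = nfc.unicodeScalars.map { String($0.value, radix: 16, uppercase: true) }

print("original:", originalScalars)
print("NFC:", nfcScalars)
```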

I am trying to port older code that uses NSRegularExpression, and this seems to be a blocker to moving away from the old APIs (short of expanding ranges like this into non-range classes of thousands of individual scalars).
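For comparison, here is a rough sketch of the kind of NSRegularExpression usage being ported. The helper name `containsScalarInRange` is hypothetical and not taken from the actual code; the pattern uses ICU's four-digit `\uhhhh` escape for the same range bounds.

```swift
import Foundation

// Hypothetical helper, for illustration only: reports whether `text`
// contains any scalar in the range U+F900...U+FDCF.
func containsScalarInRange(_ text: String) throws -> Bool {
    // The same range expressed with ICU escapes; the backslashes are
    // doubled because this is an ordinary Swift string literal.
    let regex = try NSRegularExpression(pattern: "[\\uF900-\\uFDCF]")
    // Cover the whole string, converted to UTF-16 offsets.
    let range = NSRange(text.startIndex..<text.endIndex, in: text)
    return regex.firstMatch(in: text, options: [], range: range) != nil
}
```

Calling `try containsScalarInRange("\u{F900}")` exercises the lower bound of the range.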

@kasei kasei changed the title Regex character classes fail with single scalar, non-NFC bound elements Regex character classes fail with single scalar, non-NFC range bound elements Jul 10, 2024
kasei commented Oct 29, 2024

Still failing in Xcode 16.1 (16B40):

swift-driver version: 1.115 Apple Swift version 6.0.2 (swiftlang-6.0.2.1.2 clang-1600.0.26.4)
