I believe I have found a missing capability in CustomConsumingRegexComponent and I would like to confirm that I'm not simply using it wrong.
To implement CustomConsumingRegexComponent you must implement consuming(_:startingAt:in:) which returns (upperBound: String.Index, output: Self.RegexOutput)?. Notice how the tuple only contains an upperBound. There is no lowerBound and there is no Range. As far as I know there is no way for a CustomConsumingRegexComponent to declare where the start of the match is.
In the short term this missing capability should be documented.
In the long term, it would be beneficial to pitch an implementation of this to Swift Evolution.
Reproduction
Consider the following test data:
let dateStringHaystacks = [
    "2023-04-15 some other text",
    "some other text 2023-04-15",
    "some 2023-04-15 other text",
]
All three of these strings pass the following test:
@Test(arguments: dateStringHaystacks)
func dateRegex(string: String) throws {
    // Create a Regex using RegexBuilder for YYYY-MM-DD
    let dateRegex = Regex {
        Repeat(OneOrMore(.digit), count: 4) // Year (YYYY)
        "-"
        Repeat(OneOrMore(.digit), count: 2) // Month (MM)
        "-"
        Repeat(OneOrMore(.digit), count: 2) // Day (DD)
    }
    let ranges = string.ranges(of: dateRegex)
    let range = try #require(ranges.first)
    let foundString = String(string[range])
    #expect(foundString == "2023-04-15")
}
which yields the following results:
// input: "2023-04-15 some other text" ✅ test passes
// input: "some other text 2023-04-15" ✅ test passes
// input: "some 2023-04-15 other text" ✅ test passes
I then implemented a CustomConsumingRegexComponent wrapper around NSDataDetector:
public struct DateDataDetector: CustomConsumingRegexComponent {
    public typealias RegexOutput = Date

    public init() {}

    public func consuming(
        _ input: String,
        startingAt index: String.Index,
        in bounds: Range<String.Index>
    ) throws -> (upperBound: String.Index, output: Date)? {
        let detector = try NSDataDetector(types: NSTextCheckingResult.CheckingType.date.rawValue)
        let range = NSRange(index..<bounds.upperBound, in: input)
        guard let match = detector.firstMatch(in: input, options: [], range: range),
              let date = match.date
        else { return nil }
        let upperBound = input.index(
            input.startIndex,
            offsetBy: match.range.upperBound
        )
        return (upperBound: upperBound, output: date)
    }
}
This custom regex component correctly identifies dates; however, the ranges are wrong. The start of the range is always the beginning of the string, even when the match begins elsewhere. Furthermore, because there is no lowerBound in the tuple, I can see no way to declare the beginning of the range when a match is found.
The following tests fail, but only fail when the date is not found at the beginning of the string. (They pass even if there is text after the match.)
// input: "2023-04-15 some other text" ✅ test passes
// input: "some other text 2023-04-15" ❌ match.output = "some other text 2023-04-15"
// input: "some 2023-04-15 other text" ❌ match.output = "some 2023-04-15"
As you can see, the match behavior of a RegexBuilder regex differs from that of a CustomConsumingRegexComponent. The RegexBuilder regex reports a range that starts where the date starts, but the CustomConsumingRegexComponent range always starts at the beginning of the string. It's highly likely that I'm "using it wrong". However, if that is the case, I simply do not know how to tell the Regex system where the beginning of the match is. I can reliably tell it where the end of the match is (the upperBound), but there is no way to return the beginning of the match (a lowerBound).
Expected behavior
See reproduction.
Environment
swift-driver version: 1.115 Apple Swift version 6.0 (swiftlang-6.0.0.9.10 clang-1600.0.26.2)
Target: arm64-apple-macosx15.0
Additional information
I also posted about this in the Swift forums here to try to figure out if this is missing or not.
wes1 helpfully pointed to evidence suggesting that CustomConsumingRegexComponent was only meant to implement prefixMatch, by design. If CustomConsumingRegexComponent does not have feature parity with other RegexComponent types, then this should be documented.
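To make the prefix-match reading concrete, here is a minimal sketch of a component that only accepts a match beginning exactly at the `index` it is handed. It assumes (this is my understanding, not something confirmed in this thread) that the regex engine calls `consuming(_:startingAt:in:)` again at later start positions when a call returns nil, so that `index` effectively serves as the lower bound. `SimpleDate` and `AnchoredISODate` are hypothetical names invented for this illustration, and the hand-written parser stands in for NSDataDetector so that the example stays self-contained.

```swift
// Hypothetical output type, for illustration only.
struct SimpleDate: Equatable {
    var year: Int
    var month: Int
    var day: Int
}

// Hypothetical component: matches "YYYY-MM-DD" only when it begins
// exactly at `index`. It never skips ahead to look for a later date.
struct AnchoredISODate: CustomConsumingRegexComponent {
    typealias RegexOutput = SimpleDate

    func consuming(
        _ input: String,
        startingAt index: String.Index,
        in bounds: Range<String.Index>
    ) throws -> (upperBound: String.Index, output: SimpleDate)? {
        var cursor = index

        // Reads exactly `count` ASCII digits at the cursor, or returns nil.
        func digits(_ count: Int) -> Int? {
            var value = 0
            for _ in 0..<count {
                guard cursor < bounds.upperBound,
                      input[cursor].isASCII,
                      let digit = input[cursor].wholeNumberValue
                else { return nil }
                value = value * 10 + digit
                cursor = input.index(after: cursor)
            }
            return value
        }

        // Consumes a single "-" at the cursor, or returns false.
        func dash() -> Bool {
            guard cursor < bounds.upperBound, input[cursor] == "-" else { return false }
            cursor = input.index(after: cursor)
            return true
        }

        guard let year = digits(4), dash(),
              let month = digits(2), dash(),
              let day = digits(2)
        else { return nil } // No date at `index`: report "no match here".

        return (upperBound: cursor, output: SimpleDate(year: year, month: month, day: day))
    }
}

// Same haystacks as in the reproduction above.
let haystacks = [
    "2023-04-15 some other text",
    "some other text 2023-04-15",
    "some 2023-04-15 other text",
]

// For each haystack, the text covered by the first range found (nil if none).
let found = haystacks.map { haystack in
    haystack.ranges(of: AnchoredISODate()).first.map { String(haystack[$0]) }
}
```

If that assumption about the engine holds, the same idea might carry over to the NSDataDetector wrapper: return nil unless `match.range.location` equals the location of the NSRange built from `index`. That would run the detector once per candidate start position, which is likely slow, so treat it as a workaround to experiment with and not as a recommended design.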