Does it supply line&column numbers for the parsed tokens? #492
Changes in line numbers are available to client code in the tree builder (html5ever/markup5ever/interface/tree_builder.rs, lines 237 to 238 in 98d3c0c).
Similarly, the tokenizer receives a line number with each token (html5ever/html5ever/src/tokenizer/interface.rs, lines 97 to 98 in 98d3c0c).
thank you @jdm ! :-)
Yeah, the line number on its own is kind of useless for certain applications. For my own project I'm having to resort to https://github.com/y21 just to get the exact byte positions of each DOM node. Positions for DOM nodes were also recently added to JSoup and also seem to be available in HTML parsers in other major languages, so I think it would make sense if we could figure out a way for html5ever to provide the same. There have also been several issues over the years asking for similar features.

One thing I tried to make work, but couldn't quite yet, is to provide a byte stream that I can read the offset from as tokens are emitted from html5ever. However, since tokens are actually consumed ahead of time, it doesn't quite give the right positions. This could maybe be fixed by providing something that's Peekable, but tbh I didn't really like that direction anyway.

Are there any better ideas for how this could be added in such a way that the performance penalty is opt-in?
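To illustrate why byte positions are the more useful primitive: once a byte offset is known, line and column can be derived from it, whereas a line number alone cannot be turned back into an offset. Below is a minimal sketch of that mapping. `LineIndex` and `line_col` are hypothetical names made up for this example and are not part of html5ever; columns are counted in bytes, which is an assumption a real implementation might want to change.

```rust
/// Maps byte offsets in a source string to 1-based (line, column) pairs.
/// Hypothetical helper for illustration; not an html5ever API.
pub struct LineIndex {
    /// Byte offset at which each line starts; the first entry is always 0.
    line_starts: Vec<usize>,
    /// Total length of the source in bytes.
    len: usize,
}

impl LineIndex {
    /// Scans the source once and records where every line begins.
    pub fn new(src: &str) -> Self {
        let mut line_starts = vec![0];
        for (i, b) in src.bytes().enumerate() {
            if b == b'\n' {
                line_starts.push(i + 1);
            }
        }
        LineIndex { line_starts, len: src.len() }
    }

    /// Returns (line, column), both 1-based, for a byte offset.
    /// The column counts bytes, not characters (an assumption of this sketch).
    /// Returns None if the offset lies past the end of the input.
    pub fn line_col(&self, offset: usize) -> Option<(usize, usize)> {
        if offset > self.len {
            return None;
        }
        // Number of line starts that are <= offset, i.e. the 1-based line.
        let line = self.line_starts.partition_point(|&start| start <= offset);
        let col = offset - self.line_starts[line - 1] + 1;
        Some((line, col))
    }
}
```

Building the index is a single pass over the input and each lookup is a binary search, so a cost like this could be paid only by callers who actually ask for positions.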
hey @RXminuS :-)
It's not actively maintained, and you need to do some hacky things such as replacing script/style/noscript content; otherwise the ranges will be off, since it still matches on those tokens inside (i.e. no state switching).
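One way the "replace script/style/noscript content" workaround could look is sketched below. It blanks the content of those elements with spaces of the same byte length, so any byte range computed on the masked text still lines up with the original input. The function name `mask_raw_text` is hypothetical, the same-length masking is an assumption of this sketch rather than something the parser mentioned above requires, and the scanning is deliberately naive: it ignores HTML comments and `>` inside attribute values.

```rust
/// Blanks out the content of the given elements (e.g. script, style, noscript)
/// while keeping the byte length and line breaks of the input unchanged.
/// Hypothetical helper; `tags` must be lowercase ASCII element names.
pub fn mask_raw_text(src: &str, tags: &[&str]) -> String {
    // ASCII lowercasing keeps byte offsets identical to the original.
    let lower = src.to_ascii_lowercase();
    let bytes = lower.as_bytes();
    let mut out = src.as_bytes().to_vec();
    let mut pos = 0;

    while let Some(rel) = lower[pos..].find('<') {
        let lt = pos + rel;
        pos = lt + 1;
        for tag in tags {
            if !lower[lt + 1..].starts_with(*tag) {
                continue;
            }
            let name_end = lt + 1 + tag.len();
            // Require a delimiter after the name so "style" does not match "styled-x".
            let delimited = bytes
                .get(name_end)
                .map_or(false, |b| b">/ \t\n\r".contains(b));
            if !delimited {
                continue;
            }
            // End of the opening tag (naive: first '>' after the tag name).
            let content_start = match lower[name_end..].find('>') {
                Some(i) => name_end + i + 1,
                None => break,
            };
            // Content runs up to the matching close tag, or to the end of input.
            let close = format!("</{}", tag);
            let content_end = lower[content_start..]
                .find(close.as_str())
                .map_or(lower.len(), |i| content_start + i);
            // Overwrite everything except line breaks with spaces.
            for b in &mut out[content_start..content_end] {
                if *b != b'\n' && *b != b'\r' {
                    *b = b' ';
                }
            }
            pos = content_end;
            break;
        }
    }
    // The masked range starts and ends at ASCII bytes, so the result is valid UTF-8.
    String::from_utf8(out).expect("masking whole characters with spaces keeps UTF-8 valid")
}
```

Calling `mask_raw_text(html, &["script", "style", "noscript"])` before handing the text to a position-reporting parser keeps a stray `<p>` inside a script from being matched as a tag, while the offsets of everything outside the masked regions are unchanged.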
For anyone else running into this problem, in whatwg/html-build#291 I'm creating a […]
Column numbers, of course, are not so easy.
I searched the sources and found `_line_numer` in a few places, but overall I had the impression that this info is not available to client code. Am I wrong?