-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Does default_para_level correspond to TR9#HL1? #129
Comments
Test case (extra line-breaks inserted to allow correct rendering in the browser):
Level runs:
|
Yes, it does. The naming comes from the standard naming for HL1 from other bidi implementations, including ICU4C. While typically HL1 is viewed as a higher level algorithm, in practice it really should just be an input to the main algorithm. This is a bit of a discrepancy between the spec and how real world users tend to conceptualize the algorithm. It's "default" because further embedding will change the level: it is not the level of the paragraph, it is the level the paragraph starts with. "override" carries a stronger connotation here. I would prefer to just document this better. |
Personally I'd still like to see something like this: pub fn BidiInfo::new(text: &str, base_para_level: Option<Level>, default_base_para_level: Level) -> Self Doc:
Or maybe just: pub fn BidiInfo::new(text: &str, infer: bool, base_para_level: Level) -> Self Doc:
Then |
This doesn't really match typical usage patterns of the bidi algorithm. Basically, either you have a signal from the embedding environment (e.g. your html dir attribute, or surrounding bidi text in the case of embedded inline content) as to what level to use, or you don't. I don't want to overrotate on how the spec conceptualizes these things. The spec probably should have HL1 clarified. |
In this case, the current behaviour makes sense. For CSS's I see you already have this in your code-base: /// TODO: Support auto-RTL base direction
Personally, I've decided to use the following: /// Directionality of text
///
/// Texts may be bi-directional as specified by Unicode Technical Report #9.
#[derive(Copy, Clone, Debug, Default, PartialEq, Eq, Ord, PartialOrd, Hash)]
#[cfg_attr(feature = "serde", derive(serde::Serialize, serde::Deserialize))]
#[repr(u8)]
pub enum Direction {
/// Auto-detect direction
///
/// The text direction is inferred from the first strongly-directional
/// character. In case no such character is found, the text will be
/// left-to-right.
#[default]
Auto = 2,
/// Auto-detect, default right-to-left
///
/// The text direction is inferred from the first strongly-directional
/// character. In case no such character is found, the text will be
/// right-to-left.
AutoRtl = 3,
/// The base text direction is left-to-right
///
/// If the text contains right-to-left content, this will be considered an
/// embedded right-to-left run. Non-directional leading and trailing
/// characters (e.g. a full stop) will normally not be included within this
/// right-to-left section.
///
/// This uses Unicode TR9 HL1 to set an explicit paragraph embedding level of 0.
Ltr = 0,
/// The base text direction is right-to-left
///
/// If the text contains left-to-right content, this will be considered an
/// embedded left-to-right run. Non-directional leading and trailing
/// characters (e.g. a full stop) will normally not be included within this
/// left-to-right section.
///
/// This uses Unicode TR9 HL1 to set an explicit paragraph embedding level of 1.
Rtl = 1,
} IIUC |
because I believe calling it the base direction in the API would be fine though if we're doing that I would want documentation that uses both names. It may be worth documenting how to get AutoRtl though really I think this is a thing that would benefit from a new API. Unfortunately it's not easy to change the current one without it being a breaking change. Note that the basic enum is insufficient for this crate: there are valid use cases for setting a base level that is greater than 1, typically for doing incremental updates to embedded content. |
Presumably it does, but it is not clear how (in particular, the name may be wrong).
HL1 states:
... then gives some suggestions for how this may do so, but no specific requirement. One of those suggestions is to follow P2 and P3, but default to level 1 (RTL).
What I find from testing is that providing
Some
value viadefault_para_level
toBidiInfo::new
overrides the paragraph level. Given the name, I would expect that instead it only acts as a default (i.e. when P2 does not find a strongly directional character).Suggestions:
default_para_level
tooverride_para_level
default_para_level
act only when no strongly-directional (L/AL/R) character is found. (In this case, it does not need to be optional.)pub fn new(text: &str, default_para_level: Level, override_para_level: Option<Level>) -> BidiInfo<'_>
From a usage point-of-view, I can see value in being able to specify a default line direction (i.e. suggestion 2) for e.g. an input field when it may be empty or contain only formatting characters. The current behaviour (override) achieves the same thing, but also messes up formatting when the directionality is detectable (e.g. a sentence with a full stop as in #52), for little gain.
The text was updated successfully, but these errors were encountered: