doc: properly handle preformatted blocks #8242
Lowdown requires a blank line before all preformatted blocks, or it doesn't recognize them. `tools/md2man.sh` contained some ad-hoc efforts at fixing up some locations where these required blank lines are absent from the output of `tools/fromschema.py`, but it missed some. Instead of playing Whack-a-Mole, use a blanket sed expression to ensure that a blank line precedes _every_ opening ```.

`esc_underscores(…)` in `tools/fromschema.py` did not work correctly on strings containing an odd number of backticks, notably the ``` delimiters surrounding preformatted text blocks. Specifically, it was dropping the last backtick since none of the alternatives in the regex matched it. Add a new alternative that matches a whole preformatted block as a single unit.

`output_member(…)` in `tools/fromschema.py` was passing each line of a member's description through `esc_underscores(…)` individually, but that breaks preformatted text blocks that are naturally multi-line and leads to mistakenly escaping underscores inside such blocks. Rewrite the code to make use of the `outputs(…)` utility function that joins all the provided lines together before passing the whole text through `esc_underscores(…)`.

Drive-by fix a couple of flubbed preformatted blocks in schemas.

Changelog-None
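To make the two `esc_underscores(…)` problems concrete, here is a minimal sketch that runs the old and new patterns from this PR's diff on a small description. It is an illustration under assumptions, not the code in `tools/fromschema.py`: the `pattern` parameter and the sample `lines` are hypothetical, the fence delimiter is built as `FENCE` only to keep literal triple backticks out of the listing, and the new pattern uses `+` where the diff uses `++` so that the sketch also runs on Python versions older than 3.11.

```python
import re

FENCE = '`' * 3  # the preformatted-block delimiter

# Pattern before this change: no alternative can match a lone backtick.
OLD_PATTERN = r'[^`_\\]+|`(?:[^`\\]|\\.)*`|\\.|_'

# Pattern after this change, with `+` instead of `++` (assumption made here
# only for portability). The first alternative matches a whole fenced block.
NEW_PATTERN = (
    r'(?ms:^[ \t]*' + FENCE + r'.*?^[ \t]*' + FENCE + r')'
    r'|[^`_\\\n]+|`(?:[^`\\]|\\.)*`|\\.|[_\n]'
)


def esc_underscores(s, pattern):
    """Backslash-escape underscores outside of backtick-enclosed spans."""
    return ''.join('\\_' if x == '_' else x for x in re.findall(pattern, s))


# Hypothetical member description containing a preformatted block.
lines = ['see a_b:', FENCE, 'foo_bar', FENCE]

# Old behaviour: every line is escaped on its own with the old pattern.
per_line = '\n'.join(esc_underscores(line, OLD_PATTERN) for line in lines)

# New behaviour: join the lines first, then escape the whole text once.
joined = esc_underscores('\n'.join(lines), NEW_PATTERN)

print(repr(per_line))
print(repr(joined))
```

In the per-line result each fence loses its third backtick and `foo_bar` is escaped inside the block; in the joined result the block passes through untouched and only `a_b` in the prose is escaped.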
Minor fix, generally looks good!
@@ -21,7 +21,7 @@ def output_title(title, underline='-', num_leading_newlines=1, num_trailing_newl


 def esc_underscores(s):
     """Backslash-escape underscores outside of backtick-enclosed spans"""
-    return ''.join(['\\_' if x == '_' else x for x in re.findall(r'[^`_\\]+|`(?:[^`\\]|\\.)*`|\\.|_', s)])
+    return ''.join(['\\_' if x == '_' else x for x in re.findall(r'(?ms:^[ \t]*```.*?^[ \t]*```)|[^`_\\\n]++|`(?:[^`\\]|\\.)*`|\\.|[_\n]', s)])
"++" here is wrong:
re.error: multiple repeat at position 40
Hmm. Are you using a very old version of Python? The `++` (possessive one-or-more) quantifier works on Python 3.11.12, 3.12.10, and 3.13.3, but it gives the error you quoted on Python 3.10.17. Do you really need to maintain compatibility with ancient versions of Python?

Possessive quantifiers avoid needlessly backtracking when we know that backtracking will not find any new matches, although, now that I am looking at this again, I don't think it's going to make any difference in this case since there are no assertions after that repeat, so no backtracking would ever be attempted even if the quantifier were non-possessive.
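The point about the possessive quantifier can be checked with a small sketch. It compiles the pattern from the diff once with `+` and once with `++`, and compares the tokens when the running interpreter accepts both (possessive quantifiers were added to `re` in Python 3.11). The helper `try_compile` and the `sample` string are hypothetical names introduced for illustration, and `FENCE` is built from single backticks only to keep literal triple backticks out of the listing.

```python
import re
import sys

FENCE = '`' * 3  # the preformatted-block delimiter

# Shared pieces of the pattern from the diff, split around the repeat.
HEAD = r'(?ms:^[ \t]*' + FENCE + r'.*?^[ \t]*' + FENCE + r')|'
TAIL = r'|`(?:[^`\\]|\\.)*`|\\.|[_\n]'

GREEDY = HEAD + r'[^`_\\\n]+' + TAIL       # accepted by older Pythons too
POSSESSIVE = HEAD + r'[^`_\\\n]++' + TAIL  # needs Python >= 3.11


def try_compile(pattern):
    """Return a compiled pattern, or None if this Python rejects it."""
    try:
        return re.compile(pattern)
    except re.error:
        return None


greedy = try_compile(GREEDY)
possessive = try_compile(POSSESSIVE)  # None where `++` is unsupported

# Hypothetical input mixing prose, an inline code span, and a fenced block.
sample = 'plain a_b `code_span` tail\n' + FENCE + '\nx_y\n' + FENCE + '\n'

print(sys.version_info[:2], 'possessive supported:', possessive is not None)
if possessive is not None:
    print('same tokens:', greedy.findall(sample) == possessive.findall(sample))
```

Because nothing follows the repeat inside its alternative, the greedy form succeeds at its longest match without ever backtracking, so a plain `+` would be a drop-in replacement that also works on Python 3.10.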
Checklist
Before submitting the PR, ensure the following tasks are completed. If an item is not applicable to your PR, please mark it as checked: