doc: properly handle preformatted blocks #8242

Open: whitslack wants to merge 1 commit into master

whitslack
Collaborator

Lowdown requires a blank line before all preformatted blocks, or it doesn't recognize them. tools/md2man.sh contained some ad-hoc efforts at fixing up some locations where these required blank lines are absent from the output of tools/fromschema.py, but it missed some. Instead of playing Whack-a-Mole, use a blanket sed expression to ensure that a blank line precedes every opening ```.
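The actual fix is a sed expression in tools/md2man.sh, which is not reproduced on this page. As a rough Python sketch of the same idea, the hypothetical helper `ensure_blank_before_fences` below (the name and the toggle-based approach are assumptions, not code from this PR) inserts a blank line before every opening fence that lacks one:

```python
FENCE = '`' * 3  # the three-backtick delimiter, spelled this way to keep the example fence-safe


def ensure_blank_before_fences(text):
    """Insert a blank line before every opening fence that is not already preceded by one."""
    out = []
    in_block = False
    for line in text.split('\n'):
        is_fence = line.lstrip(' \t').startswith(FENCE)
        if is_fence and not in_block and out and out[-1].strip():
            out.append('')  # opening fence directly after text: add the missing blank line
        out.append(line)
        if is_fence:
            in_block = not in_block  # alternate between opening and closing fences
    return '\n'.join(out)


sample = 'Example:\n' + FENCE + '\ncode\n' + FENCE + '\nmore'
print(ensure_blank_before_fences(sample))
```

Only opening fences are touched, and running the helper twice gives the same result as running it once.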

esc_underscores(…) in tools/fromschema.py did not work correctly on strings containing an odd number of backticks, notably the ``` delimiters surrounding preformatted text blocks. Specifically, it was dropping the last backtick since none of the alternatives in the regex matched it. Add a new alternative that matches a whole preformatted block as a single unit.
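The dropped backtick can be reproduced with the pre-change pattern, copied here from the removed line in the review diff. The wrapper name `esc_underscores_old` is only for illustration:

```python
import re

# Pre-change pattern, as shown on the removed line of the diff.
OLD_PATTERN = r'[^`_\\]+|`(?:[^`\\]|\\.)*`|\\.|_'


def esc_underscores_old(s):
    """Backslash-escape underscores outside of backtick-enclosed spans (old behaviour)."""
    return ''.join('\\_' if x == '_' else x for x in re.findall(OLD_PATTERN, s))


lone_fence = '`' * 3  # a fence delimiter on its own: an odd number of backticks

# The first two backticks match as an empty inline span. No alternative matches
# the third one, so re.findall silently skips it.
print(repr(esc_underscores_old(lone_fence)))   # '``'
print(repr(esc_underscores_old('`a_b` c_d')))  # inline spans are still handled correctly
```

Because `re.findall` skips characters that no alternative matches, the loss is silent rather than an error.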

output_member(…) in tools/fromschema.py was passing each line of a member's description through esc_underscores(…) individually, but that breaks preformatted text blocks that are naturally multi-line and leads to mistakenly escaping underscores inside such blocks. Rewrite the code to make use of the outputs(…) utility function that joins all the provided lines together before passing the whole text through esc_underscores(…).
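A small sketch of why per-line escaping breaks preformatted blocks. It uses the new pattern from the review diff with two deliberate deviations, both assumptions made for this example: the possessive `++` is written as `+` so it runs on Python versions before 3.11, and the three backticks are written as `` `{3} `` to keep the example fence-safe. The sample description is invented, and the real `outputs(…)` helper is not shown here, so a plain `'\n'.join` stands in for it:

```python
import re

# New pattern from the diff, with '+' instead of '++' and `{3} instead of three
# literal backticks (both deviations are for illustration only).
NEW_PATTERN = r'(?ms:^[ \t]*`{3}.*?^[ \t]*`{3})|[^`_\\\n]+|`(?:[^`\\]|\\.)*`|\\.|[_\n]'

FENCE = '`' * 3


def esc_underscores(s):
    """Backslash-escape underscores outside backtick spans and preformatted blocks."""
    return ''.join('\\_' if x == '_' else x for x in re.findall(NEW_PATTERN, s))


# Hypothetical member description containing a preformatted block.
lines = ['Set my_field like so:', '', FENCE, 'foo_bar = 1', FENCE]

# Escaping line by line: each fence is seen in isolation, so it loses a backtick
# and the underscore inside the block is escaped by mistake.
per_line = '\n'.join(esc_underscores(line) for line in lines)

# Joining first: the whole block matches as one unit and passes through untouched.
joined = esc_underscores('\n'.join(lines))

print(per_line)
print(joined)
```

Only the joined variant keeps both fences intact and leaves `foo_bar` unescaped inside the block.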

Drive-by fix a couple of flubbed preformatted blocks in schemas.

Checklist

Before submitting the PR, ensure the following tasks are completed. If an item is not applicable to your PR, please mark it as checked:

  • The changelog has been updated in the relevant commit(s) according to the guidelines.
  • Tests have been added or modified to reflect the changes. (Not applicable.)
  • Documentation has been reviewed and updated as needed. (That's what this PR does.)
  • Related issues have been listed and linked, including any that this PR closes. (I didn't find any.)

Changelog-None
@whitslack changed the title from "doc: properly handle \\\preformatted blocks\\\" to "doc: properly handle preformatted blocks" on Apr 19, 2025
@cdecker requested a review from ShahanaFarooqui on April 21, 2025 15:26
Contributor

@rustyrussell left a comment

Minor fix, generally looks good!

@@ -21,7 +21,7 @@ def output_title(title, underline='-', num_leading_newlines=1, num_trailing_newl

 def esc_underscores(s):
     """Backslash-escape underscores outside of backtick-enclosed spans"""
-    return ''.join(['\\_' if x == '_' else x for x in re.findall(r'[^`_\\]+|`(?:[^`\\]|\\.)*`|\\.|_', s)])
+    return ''.join(['\\_' if x == '_' else x for x in re.findall(r'(?ms:^[ \t]*```.*?^[ \t]*```)|[^`_\\\n]++|`(?:[^`\\]|\\.)*`|\\.|[_\n]', s)])
Contributor

"++" here is wrong:

re.error: multiple repeat at position 40

Collaborator Author

Hmm. Are you using a very old version of Python? The ++ (possessive one-or-more) quantifier works on Python 3.11.12, 3.12.10, and 3.13.3, but it gives the error you quoted on Python 3.10.17. Do you really need to maintain compatibility with ancient versions of Python? Possessive quantifiers avoid needlessly backtracking when we know that backtracking will not find any new matches, although, now that I am looking at this again, I don't think it's going to make any difference in this case since there are no assertions after that repeat, so no backtracking would ever be attempted even if the quantifier were non-possessive.
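If compatibility with Python 3.10 matters, one option is to fall back to the ordinary greedy quantifier when the possessive form fails to compile. The sketch below is not part of the PR, and the helper name `compile_run_pattern` is hypothetical. It isolates the one alternative that uses `++`:

```python
import re

# The alternative from the diff that uses the possessive quantifier.
POSSESSIVE = r'[^`_\\\n]++'
# The same alternative with an ordinary greedy quantifier.
PORTABLE = r'[^`_\\\n]+'


def compile_run_pattern():
    """Compile the possessive form where supported, else fall back to the greedy form."""
    try:
        return re.compile(POSSESSIVE)  # possessive quantifiers were added in Python 3.11
    except re.error:
        return re.compile(PORTABLE)    # older versions report "multiple repeat"


pattern = compile_run_pattern()
print(pattern.match('abc_def').group(0))  # abc
```

Since nothing follows the repeat inside that alternative, both forms match the same text, so simply writing `+` in the original expression would be the smaller change.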

@rustyrussell added this to the v25.05 milestone on Apr 28, 2025