This repository was archived by the owner on Feb 18, 2025. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 34
Address feedback from plenary #77
Merged
Merged
Changes from all commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -31,9 +31,11 @@ contributors: | |
1. Let _escaped_ be the empty String. | ||
1. Let _cpList_ be StringToCodePoints(_S_). | ||
1. For each code point _c_ in _cpList_, do | ||
1. If _escaped_ is the empty String and _c_ is matched by |DecimalDigit|, then | ||
1. NOTE: Escaping a leading digit ensures that output corresponds with pattern text which may be used after a `\0` character escape or a |DecimalEscape| such as `\1` and still match _S_ rather than be interpreted as an extension of the preceding escape sequence. | ||
1. Set _escaped_ to the string-concatenation of _escaped_, the code unit 0x005C (REVERSE SOLIDUS), *"x3"*, and the code unit whose numeric value is the numeric value of _c_. | ||
1. If _escaped_ is the empty String, and _c_ is matched by |DecimalDigit| or |AsciiLetter|, then | ||
1. NOTE: Escaping a leading digit ensures that output corresponds with pattern text which may be used after a `\0` character escape or a |DecimalEscape| such as `\1` and still match _S_ rather than be interpreted as an extension of the preceding escape sequence. Escaping a leading ASCII letter does the same for the context after `\c`. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also, TIL |
||
1. Let _hex_ be Number::toString(𝔽(_c_), 16). | ||
1. Assert: The length of _hex_ is 2. | ||
1. Set _escaped_ to the string-concatenation of the code unit 0x005C (REVERSE SOLIDUS), *"x"*, and _hex_. | ||
1. Else, | ||
1. Set _escaped_ to the string-concatenation of _escaped_ and EncodeForRegExpEscape(_c_). | ||
1. Return _escaped_. | ||
|
@@ -52,13 +54,17 @@ contributors: | |
</h1> | ||
<dl class="header"> | ||
<dt>description</dt> | ||
<dd>It returns a string representing a |Pattern| for matching _c_. If _c_ is white space or an ASCII punctuator, the returned value is an escape sequence (corresponding with |HexEscapeSequence| if possible, or otherwise with |RegExpUnicodeEscapeSequence|). Otherwise, the returned value is a string representation of _c_ itself.</dd> | ||
<dd>It returns a string representing a |Pattern| for matching _c_. If _c_ is white space or an ASCII punctuator, the returned value is an escape sequence. Otherwise, the returned value is a string representation of _c_ itself.</dd> | ||
</dl> | ||
|
||
<emu-alg> | ||
1. Let _punctuators_ be the string-concatenation of *"(){}[]|,.?\*+-^$=<>/#&!%:;@~'`"*, the code unit 0x0022 (QUOTATION MARK), and the code unit 0x005C (REVERSE SOLIDUS). | ||
1. Let _toEscape_ be StringToCodePoints(_punctuators_). | ||
1. If _toEscape_ contains _c_ or _c_ is matched by |WhiteSpace|, then | ||
1. If _c_ is matched by |SyntaxCharacter| or _c_ is U+002F (SOLIDUS), then | ||
1. Return the string-concatenation of 0x005C (REVERSE SOLIDUS) and UTF16EncodeCodePoint(_c_). | ||
Comment on lines
+61
to
+62
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This strikes me as a point-in-time snapshot, and I'm not a fan because it will be weird if e.g. |
||
1. Else if _c_ is the code point listed in some cell of the “Code Point” column of <emu-xref href="#table-controlescape-code-point-values"></emu-xref>, then | ||
1. Return the string-concatenation of 0x005C (REVERSE SOLIDUS) and the string in the “ControlEscape” column of the row whose “Code Point” column contains _c_. | ||
bakkot marked this conversation as resolved.
Show resolved
Hide resolved
|
||
1. Let _otherPunctuators_ be the string-concatenation of *",-=<>#&!%:;@~'`"* and the code unit 0x0022 (QUOTATION MARK). | ||
ljharb marked this conversation as resolved.
Show resolved
Hide resolved
|
||
1. Let _toEscape_ be StringToCodePoints(_otherPunctuators_). | ||
1. If _toEscape_ contains _c_, _c_ is matched by |WhiteSpace| or |LineTerminator|, or _c_ has the same numeric value as a leading surrogate or trailing surrogate, then | ||
1. If _c_ ≤ 0xFF, then | ||
1. Let _hex_ be Number::toString(𝔽(_c_), 16). | ||
1. Return the string-concatenation of the code unit 0x005C (REVERSE SOLIDUS), *"x"*, and StringPad(_hex_, 2, *"0"*, ~start~). | ||
|
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: this sentence is too long to not use a parenthetical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those commas read as ungrammatical to me. In particular the second comma cuts a clause in the middle - the pattern text "maybe used after \0 [...] and still match S". Open to other rephrasing here but I don't like this particular suggestion.