Commit 807963f

leobalter authored and Ms2ger committed
Add tests for RegExp.escape
1 parent d09ecdb commit 807963f

21 files changed: +767 -0 lines changed

features.txt

+4
@@ -68,6 +68,10 @@ Array.fromAsync
 # https://github.com/tc39/proposal-json-parse-with-source
 json-parse-with-source
 
+# RegExp.escape
+# https://github.com/tc39/proposal-regex-escaping
+RegExp.escape
+
 # Regular expression modifiers
 # https://github.com/tc39/proposal-regexp-modifiers
 regexp-modifiers
@@ -0,0 +1,17 @@
+// Copyright 2024 Leo Balter. All rights reserved.
+// This code is governed by the BSD license found in the LICENSE file.
+
+/*---
+esid: sec-regexp.escape
+description: escape called with a RegExp object from another realm
+features: [RegExp.escape, cross-realm]
+---*/
+
+const str = "oi+hello";
+const other = $262.createRealm().global;
+
+assert.sameValue(typeof other.RegExp.escape, "function", "other.RegExp.escape is a function");
+
+const res = other.RegExp.escape.call(RegExp, str);
+
+assert.sameValue(res, RegExp.escape(str), "cross-realm escape works correctly");
@@ -0,0 +1,25 @@
+// Copyright (C) 2024 Leo Balter. All rights reserved.
+// This code is governed by the BSD license found in the LICENSE file.
+
+/*---
+esid: sec-encodeforregexescape
+description: Encodes control characters with their ControlEscape sequences
+info: |
+  EncodeForRegExpEscape ( c )
+
+  2. If c is the code point listed in some cell of the “Code Point” column of Table 64, then
+    a. Return the string-concatenation of 0x005C (REVERSE SOLIDUS) and the string in the “ControlEscape” column of the row whose “Code Point” column contains c.
+
+  ControlEscape, Numeric Value, Code Point, Unicode Name, Symbol
+  t 9 U+0009 CHARACTER TABULATION <HT>
+  n 10 U+000A LINE FEED (LF) <LF>
+  v 11 U+000B LINE TABULATION <VT>
+  f 12 U+000C FORM FEED (FF) <FF>
+  r 13 U+000D CARRIAGE RETURN (CR) <CR>
+features: [RegExp.escape]
+---*/
+
+const controlCharacters = '\t\n\v\f\r';
+const expectedEscapedCharacters = '\\t\\n\\v\\f\\r';
+
+assert.sameValue(RegExp.escape(controlCharacters), expectedEscapedCharacters, 'Control characters are correctly escaped');
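
Step 2 of EncodeForRegExpEscape, as quoted in the test's info block, amounts to a table lookup. The following is a minimal sketch of that single step, not the commit's code or an engine implementation; `CONTROL_ESCAPES` and `escapeControlCharacters` are hypothetical names introduced only for illustration, and every other escaping rule is deliberately left out.

```javascript
// Hypothetical helper (not part of the commit): models only the ControlEscape
// lookup from step 2 of EncodeForRegExpEscape.
const CONTROL_ESCAPES = new Map([
  ['\t', 't'], // U+0009 CHARACTER TABULATION
  ['\n', 'n'], // U+000A LINE FEED
  ['\v', 'v'], // U+000B LINE TABULATION
  ['\f', 'f'], // U+000C FORM FEED
  ['\r', 'r'], // U+000D CARRIAGE RETURN
]);

function escapeControlCharacters(input) {
  let out = '';
  for (const ch of input) {
    const letter = CONTROL_ESCAPES.get(ch);
    // Characters outside the table pass through unchanged in this sketch.
    out += letter === undefined ? ch : '\\' + letter;
  }
  return out;
}
```

Called with the test's `'\t\n\v\f\r'` input, the sketch produces the same string the test stores in `expectedEscapedCharacters`.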
@@ -0,0 +1,48 @@
+// Copyright (C) 2024 Leo Balter. All rights reserved.
+// This code is governed by the BSD license found in the LICENSE file.
+
+/*---
+esid: sec-regexp.escape
+description: Escaped lineterminator characters (simple assertions)
+info: |
+  EncodeForRegExpEscape ( c )
+
+  ...
+  3. Let otherPunctuators be the string-concatenation of ",-=<>#&!%:;@~'`" and the code unit 0x0022 (QUOTATION MARK).
+  4. Let toEscape be StringToCodePoints(otherPunctuators).
+  5. If toEscape ..., c is matched by WhiteSpace or LineTerminator, ..., then
+    a. If c ≤ 0xFF, then
+      i. Let hex be Number::toString(𝔽(c), 16).
+      ii. Return the string-concatenation of the code unit 0x005C (REVERSE SOLIDUS), "x", and StringPad(hex, 2, "0", START).
+    b. Let escaped be the empty String.
+    c. Let codeUnits be UTF16EncodeCodePoint(c).
+    d. For each code unit cu of codeUnits, do
+      i. Set escaped to the string-concatenation of escaped and UnicodeEscape(cu).
+    e. Return escaped.
+  6. Return UTF16EncodeCodePoint(c).
+
+  LineTerminator ::
+    <LF>
+    <CR>
+    <LS>
+    <PS>
+
+  Exceptions:
+
+  2. If c is the code point listed in some cell of the “Code Point” column of Table 64, then
+    a. Return the string-concatenation of 0x005C (REVERSE SOLIDUS) and the string in the “ControlEscape” column of the row whose “Code Point” column contains c.
+
+  ControlEscape, Numeric Value, Code Point, Unicode Name, Symbol
+  t 9 U+0009 CHARACTER TABULATION <HT>
+  n 10 U+000A LINE FEED (LF) <LF>
+  v 11 U+000B LINE TABULATION <VT>
+  f 12 U+000C FORM FEED (FF) <FF>
+  r 13 U+000D CARRIAGE RETURN (CR) <CR>
+features: [RegExp.escape]
+---*/
+
+assert.sameValue(RegExp.escape('\u2028'), '\\u2028', 'line terminator \\u2028 is escaped correctly to \\u2028');
+assert.sameValue(RegExp.escape('\u2029'), '\\u2029', 'line terminator \\u2029 is escaped correctly to \\u2029');
+
+assert.sameValue(RegExp.escape('\u2028\u2029'), '\\u2028\\u2029', 'line terminators are escaped correctly');
+assert.sameValue(RegExp.escape('\u2028a\u2029a'), '\\u2028a\\u2029a', 'mixed line terminators are escaped correctly');
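
For code points above 0xFF, steps 5.b to 5.e quoted above build the result one UTF-16 code unit at a time. The sketch below illustrates only that branch, under the assumption that UnicodeEscape emits `\u` followed by four lowercase hexadecimal digits; `unicodeEscapeCodePoint` is a hypothetical name, and the helper does not decide whether a code point should be escaped in the first place.

```javascript
// Hypothetical helper (not part of the commit): models steps 5.b-5.e only.
// Assumes the caller has already decided that the code point must be escaped.
function unicodeEscapeCodePoint(codePoint) {
  // String.fromCodePoint yields one or two UTF-16 code units.
  const codeUnits = String.fromCodePoint(codePoint);
  let escaped = '';
  for (let i = 0; i < codeUnits.length; i++) {
    const hex = codeUnits.charCodeAt(i).toString(16);
    // Assumption: four lowercase hex digits per code unit.
    escaped += '\\u' + hex.padStart(4, '0');
  }
  return escaped;
}
```

With U+2028 and U+2029 this yields the `\u2028` and `\u2029` strings the assertions expect. The loop only matters for code points that encode as two code units.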
@@ -0,0 +1,58 @@
+// Copyright (C) 2024 Leo Balter. All rights reserved.
+// This code is governed by the BSD license found in the LICENSE file.
+
+/*---
+esid: sec-regexp.escape
+description: Escaped other punctuators characters
+info: |
+  EncodeForRegExpEscape ( c )
+
+  ...
+  3. Let otherPunctuators be the string-concatenation of ",-=<>#&!%:;@~'`" and the code unit 0x0022 (QUOTATION MARK).
+  4. Let toEscape be StringToCodePoints(otherPunctuators).
+  5. If toEscape contains c, (...), then
+    a. If c ≤ 0xFF, then
+      i. Let hex be Number::toString(𝔽(c), 16).
+      ii. Return the string-concatenation of the code unit 0x005C (REVERSE SOLIDUS), "x", and StringPad(hex, 2, "0", START).
+    b. Let escaped be the empty String.
+    c. Let codeUnits be UTF16EncodeCodePoint(c).
+    d. For each code unit cu of codeUnits, do
+      i. Set escaped to the string-concatenation of escaped and UnicodeEscape(cu).
+    e. Return escaped.
+  6. Return UTF16EncodeCodePoint(c).
+
+  codePoints
+    0x002c ,
+    0x002d -
+    0x003d =
+    0x003c <
+    0x003e >
+    0x0023 #
+    0x0026 &
+    0x0021 !
+    0x0025 %
+    0x003a :
+    0x003b ;
+    0x0040 @
+    0x007e ~
+    0x0027 '
+    0x0060 `
+    0x0022 "
+features: [RegExp.escape]
+---*/
+
+const otherPunctuators = ",-=<>#&!%:;@~'`\"";
+
+// Return the string-concatenation of the code unit 0x005C (REVERSE SOLIDUS), "x", and StringPad(hex, 2, "0", START).
+for (const c of otherPunctuators) {
+  const expected = `\\x${c.codePointAt(0).toString(16)}`;
+  assert.sameValue(RegExp.escape(c), expected, `${c} is escaped correctly`);
+}
+
+const otherPunctuatorsExpected = "\\x2c\\x2d\\x3d\\x3c\\x3e\\x23\\x26\\x21\\x25\\x3a\\x3b\\x40\\x7e\\x27\\x60\\x22";
+
+assert.sameValue(
+  RegExp.escape(otherPunctuators),
+  otherPunctuatorsExpected,
+  'all other punctuators are escaped correctly'
+);
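
The loop in this test builds its expected value without padding, which is safe here because every other punctuator has a code point of 0x21 or higher and therefore already prints as two hex digits. Step 5.a.ii still calls for StringPad, so a sketch of that step would include it. `hexEscape` below is a hypothetical name, and the helper covers only the `c ≤ 0xFF` branch.

```javascript
// Hypothetical helper (not part of the commit): models step 5.a only.
function hexEscape(ch) {
  const codePoint = ch.codePointAt(0);
  if (codePoint > 0xff) {
    // Larger code points belong to the \u branch, which this sketch omits.
    throw new RangeError('hexEscape only handles code points up to 0xFF');
  }
  // padStart plays the role of StringPad(hex, 2, "0", START).
  return '\\x' + codePoint.toString(16).padStart(2, '0');
}

// Rebuild the test's expected string from the same punctuator list.
const punctuators = ",-=<>#&!%:;@~'`\"";
const rebuilt = Array.from(punctuators, hexEscape).join('');
```

Under these assumptions `rebuilt` equals the test's `otherPunctuatorsExpected` literal, which gives a quick cross-check that the hand-written constant and the spec step agree.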
@@ -0,0 +1,17 @@
+// Copyright (C) 2024 Leo Balter. All rights reserved.
+// This code is governed by the BSD license found in the LICENSE file.
+
+/*---
+esid: sec-regexp.escape
+description: Escaped U+002F (SOLIDUS) characters (mixed assertions)
+info: |
+  EncodeForRegExpEscape ( c )
+
+  1. If c is matched by SyntaxCharacter or c is U+002F (SOLIDUS), then
+    a. Return the string-concatenation of 0x005C (REVERSE SOLIDUS) and UTF16EncodeCodePoint(c).
+features: [RegExp.escape]
+---*/
+
+assert.sameValue(RegExp.escape('.a/b'), '\\.a\\/b', 'mixed string with solidus character is escaped correctly');
+assert.sameValue(RegExp.escape('/./'), '\\/\\.\\/', 'solidus character is escaped correctly - regexp similar');
+assert.sameValue(RegExp.escape('./a\\/*b+c?d^e$f|g{2}h[i]j\\k'), '\\.\\/a\\\\\\/\\*b\\+c\\?d\\^e\\$f\\|g\\{2\\}h\\[i\\]j\\\\k', 'complex string with multiple special characters is escaped correctly');
@@ -0,0 +1,18 @@
+// Copyright (C) 2024 Leo Balter. All rights reserved.
+// This code is governed by the BSD license found in the LICENSE file.
+
+/*---
+esid: sec-regexp.escape
+description: Escaped U+002F (SOLIDUS) character (simple assertions)
+info: |
+  EncodeForRegExpEscape ( c )
+
+  1. If c is matched by SyntaxCharacter or c is U+002F (SOLIDUS), then
+    a. Return the string-concatenation of 0x005C (REVERSE SOLIDUS) and UTF16EncodeCodePoint(c).
+features: [RegExp.escape]
+---*/
+
+assert.sameValue(RegExp.escape('/'), '\\/', 'solidus character is escaped correctly');
+assert.sameValue(RegExp.escape('//'), '\\/\\/', 'solidus character is escaped correctly - multiple occurrences 1');
+assert.sameValue(RegExp.escape('///'), '\\/\\/\\/', 'solidus character is escaped correctly - multiple occurrences 2');
+assert.sameValue(RegExp.escape('////'), '\\/\\/\\/\\/', 'solidus character is escaped correctly - multiple occurrences 3');
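
Step 1, quoted in both solidus tests, prefixes a backslash to any SyntaxCharacter and to U+002F. A minimal sketch of that step alone follows; `SYNTAX_CHARACTERS` and `escapeSyntaxCharacters` are hypothetical names, and the set lists the SyntaxCharacter production as I understand it (`^ $ \ . * + ? ( ) [ ] { } |`). The sketch ignores every other rule of RegExp.escape, including the special handling of a leading ASCII letter or digit, so it matches the real function only for inputs such as the ones in these two files.

```javascript
// Hypothetical helper (not part of the commit): models step 1 only.
const SYNTAX_CHARACTERS = new Set('^$\\.*+?()[]{}|');

function escapeSyntaxCharacters(input) {
  let out = '';
  for (const ch of input) {
    // A SyntaxCharacter or the solidus gets a backslash prefix.
    const needsEscape = SYNTAX_CHARACTERS.has(ch) || ch === '/';
    out += needsEscape ? '\\' + ch : ch;
  }
  return out;
}
```

For example, `escapeSyntaxCharacters('/./')` gives the same `\/\.\/` string that the mixed-assertions test expects from `RegExp.escape`.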

test/built-ins/RegExp/escape/escaped-surrogates.js

+97