You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Add skip list based implementation for smaller encoding
This arranges for the sparser sets (everything except lower and uppercase) to be
encoded in a significantly smaller context. However, it is also a performance
trade-off (roughly 3x slower than the bitset encoding). The 40% size reduction
is deemed to be sufficiently important to merit this performance loss,
particularly as it is unlikely that this code is hot anywhere (and if it is,
paying the memory cost for a bitset that directly represents the data seems
worthwhile).
Alphabetic : 1599 bytes (- 937 bytes)
Case_Ignorable : 949 bytes (- 822 bytes)
Cased : 359 bytes (- 429 bytes)
Cc : 9 bytes (- 15 bytes)
Grapheme_Extend: 813 bytes (- 675 bytes)
Lowercase : 863 bytes
N : 419 bytes (- 619 bytes)
Uppercase : 776 bytes
White_Space : 37 bytes (- 46 bytes)
Total table sizes: 5824 bytes (-3543 bytes)
0 commit comments