mirror of
https://github.com/Perl/perl5.git
synced 2026-01-26 08:38:23 +00:00
l1_char_class_tab.h categorizes characters in the Latin1 range into various classes, mostly the POSIX classes like [:word:]. Each character has a bit set for every class it is a member of. These values are placed in a 256-element array, and the ordinal value of a character is used as an index into it to determine quickly whether the character is a member of a given class.

Besides the POSIX classes, there are some classes that make our code more convenient and/or faster. For example, there is a class that lets us quickly know whether a given character is one that quotemeta() needs to precede with a backslash.

This commit adds a class for the single character underscore '_', and a macro that tests whether a character is either an underscore or a member of any other class, using a single conditional. Code that checks whether character X is either an underscore or a member of class Y can therefore eliminate one conditional. The reason to do this is efficiency. Currently, the only places that make this check explicitly are in non-hot code, but I have work in progress with hot code that could benefit from it.

The only downside is that this uses up one of the 32 bits available (without shenanigans) for such classes, leaving 4 spare. But before this release, the last time a new bit was used up was 5.32, so the rate at which the spares are being consumed is quite low.

This bit could be reclaimed, because the IDFIRST class in the Latin1 range is identical to ALPHA plus the underscore, so it could be rewritten as that combination and its bit freed up. However, that would require adding some macros that take two class parameters instead of one. I briefly thought about doing that now, but since we have spare bits and the rate of using them up is low, I didn't think it was worth it at this time.

\w in this range is ALPHANUMERIC plus underscore, but its use is more embedded than that of IDFIRST, so reclaiming its bit would require more effort.
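
To illustrate the single-conditional idea described above, here is a minimal sketch. It is not the code in this commit: the names (SK_ALPHA, SK_DIGIT, SK_UNDERSCORE, sk_class_tab, sk_init_tab, sk_is_class_or_underscore) and the bit positions are hypothetical, the table is filled at run time rather than generated, only ASCII letters and digits are classified, and an ASCII platform is assumed.

```c
#include <stdint.h>

/* Hypothetical class bits; the real names and bit positions in
 * l1_char_class_tab.h differ. */
enum {
    SK_ALPHA      = 1u << 0,
    SK_DIGIT      = 1u << 1,
    SK_UNDERSCORE = 1u << 2
};

/* One entry per Latin1 code point; each entry is a bit mask of classes. */
static uint32_t sk_class_tab[256];

/* Fill the table. Simplification: only ASCII letters, digits, and '_'
 * are classified, and contiguous 'a'..'z' (ASCII) is assumed. */
static void sk_init_tab(void)
{
    int c;
    for (c = 0; c < 256; c++)    sk_class_tab[c] = 0;
    for (c = 'a'; c <= 'z'; c++) sk_class_tab[c] |= SK_ALPHA;
    for (c = 'A'; c <= 'Z'; c++) sk_class_tab[c] |= SK_ALPHA;
    for (c = '0'; c <= '9'; c++) sk_class_tab[c] |= SK_DIGIT;
    sk_class_tab['_'] |= SK_UNDERSCORE;
}

/* Without an underscore class: a comparison plus a table test. */
static int sk_is_alpha_or_underscore_slow(unsigned char c)
{
    return c == '_' || (sk_class_tab[c] & SK_ALPHA) != 0;
}

/* With an underscore class: one lookup, one test against a combined mask. */
static int sk_is_class_or_underscore(unsigned char c, uint32_t class_bits)
{
    return (sk_class_tab[c] & (class_bits | SK_UNDERSCORE)) != 0;
}
```

In this sketch, sk_is_class_or_underscore(c, SK_ALPHA) gives the same answer as the two-conditional version for every byte, which is also why a class defined as ALPHA plus underscore would not need a bit of its own.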