mirror of
https://github.com/ruby/ruby.git
synced 2026-01-29 13:34:17 +00:00
Invalid escapes are handled at multiple levels. The first level is in parse.y, so skip invalid unicode escape checks for regexps in parse.y. Make rb_reg_preprocess and unescape_nonascii accept the regexp options. In unescape_nonascii, if the regexp is an extended regexp, when "#" is encountered, ignore all characters until the end of line or end of regexp. Unfortunately, in extended regexps, you can use "#" as a non-comment character inside a character class, so also parse "[" and "]" specially for extended regexps, and only skip comments if "#" is not inside a character class. Handle nested character classes as well. This issue doesn't just affect extended regexps, it also affects "(#?" comments inside all regexps. So for those comments, scan until trailing ")" and ignore content inside. I'm not sure if there are other corner cases not handled. A better fix would be to redesign the regexp parser so that it unescaped during parsing instead of before parsing, so you already know the current parsing state. Fixes [Bug #18294] Co-authored-by: Nobuyoshi Nakada <nobu@ruby-lang.org>