mirror of
https://github.com/Perl/perl5.git
synced 2026-01-26 08:38:23 +00:00
GH #16197 The main purpose of this commit is to stop premature scope unwinding within eval code in regexes. Aside from backtracking on failure, the scopes of every eval, such as (?{ local $x }), are supposed to accumulate, and are only unwound en-masse at the very end of the pattern match. However, a combination of a sub-pattern call, such as (??{...}) or (?&...), combined with a cut, (?>...), can trigger earlier savestack popping. The direct fix for this, as explained below, is to remove this single line from the EVAL_postponed_A/B: case: REGCP_UNWIND(ST.cp); /* LEAVE in disguise */ However, that line is entwined with code which attempts to preserve the final value of $^R during scope unwinding. Since that code was kind of working around the misplaced REGCP_UNWIND(), it needs ripping out and re-implementing. This has to be done at the same time, so the bulk of this commit is actually concerned with $^R, even though it isn't the subject of this ticket. So this commit doesn't change the behaviour of $^R, but just changes its implementation somewhat. The $^R issue is that every /(?{...})/ causes the equivalent of local $^R = ...; to be executed. During final exit, the savestack gets unwound, and all those local's get undone, leaving $^R with the value it had before the match started. But we promise that after the match, $^R will hold the value of the most recent (?{...}). The code which this commit rips out restored that value in one way; the new code in this commit does it a different way. Basically, almost the last thing S_regmatch() does is a LEAVE_SCOPE(orig_savestack_ix); This commit makes it so that the current value of $^R is copied just before the LEAVE_SCOPE(), and that value is copied back to $^R just after the LEAVE_SCOPE(). For efficiency, we only do the copy if we've actually set $^R. A mechanism is also needed to ensure that the temporary copy doesn't leak if we die during the savestack unwind. This is achieved by holding a pointer to the copy in the aux_eval struct, which gets processed even if we die. Now back to the main purpose of this commit, the premature stack unwind in the presence of a cut with a sub-pattern. This bug has been there since these features were added. It is instructive to look at a somewhat idealised overview of the S_regmatch() function from around 5.6.0 (with some bug fixes from later releases added). This was while the function was still recursive. It looks approximately like: pp_match(...) { I32 ix = PL_savestack_ix; ... do a match ... LEAVE_SCOPE(ix); } S_regmatch(...) { while (scan) { switch (OP(scan)) { case FOO: if (! there's a FOO) return 0; I32 ix = PL_savestack_ix; if (regmatch(...)) /* recursively match rest of pattern */ return 1; LEAVE_SCOPE(ix); return 0; case END: if (doing a (??{...}) ) { I32 ix = PL_savestack_ix; if (regmatch(...) { /* recursively match rest of pattern */ LEAVE_SCOPE(ix); return 1; } LEAVE_SCOPE(ix); return 0; } return 1; case EVAL: ... run the code, then, if its a (?{...}) ... I32 ix = PL_savestack_ix; if (regmatch(...) { /* recursively run subpattern */ LEAVE_SCOPE(ix) return 1; } LEAVE_SCOPE(ix); return 0; } } Here, the FOO: case represents all the various ops which recurse. In general, they match the next item and then recurse to match the rest of the pattern. Note that they all do a LEAVE_SCOPE() only in the *failure* branch. At the end of a successful match, there is potentially much recursion, and much stuff on the savestack. When the END op is reached, the series of 'return 1's causes all the recursion to unwind, while leaving the savestack untouched. Finally, the caller - such as pp_match() - clears the savestack. In more recent perls the recursion has been removed and the final LEAVE_SCOPE() is done within S_regmatch() itself, but the principle remains the same: no stack freeing is done *during* matching, and instead there's a single big clean up at the end. Once (??{...}) enters the picture, that changes a bit. When the END op associated with the '...' sub-pattern is reached, regmatch() is called recursively to process any pattern after the (??{..}); then on success, while working its way back through the nested regmatch() calls, both the END and the EVAL code do a LEAVE_SCOPE() in the *success* branch. This is anomalous, and those two LEAVE_SCOPE()'s are what this commit removes (although in the current non-recursive regex engine, they are shared by the same piece of code, so only one had to be removed). By removing them, this regularises the behaviour of sub-patterns. I can't think why those LEAVE_SCOPE()s were originally added, and assume it was a thinko. Normally it makes no difference whether the savestack is popped near the end, interleaved with popping all recursive regmatch calls (or equivalently on non-recursive engines popping the regmatch_state stack), or whether the savestack is popped only after all the recursion is exited from. However, it makes a difference in the presence of a cut, (?>...). Here, the final op of the sub-pattern '...' is SUCCEED, which rather than recursing to match anything following the cut block, just returns. The recursion pops back to the SUSPEND op which started the cut, which then continues with the op loop as normal. Thus when about to match the first op following a (?>...), the recursion *within* the cut has been blown away as if it never happened, but the accumulated savestack entries (e.g. from evals within the cut block) are preserved and continue to accumulate. Now, if there is a (??{...}) sub-pattern, or similarly a (?&FOO), within the cut, then at the end of the cut, the recursion is unwound, which includes the stacked EVAL and END recursions, which at this time call LEAVE_SCOPE(), which frees part of the savestack, even though the pattern match hasn't ended yet. That's the bug which this commit fixes. The tests added to pat_re_eval.t check that the scope bug has been fixed. The test added to svleak.t checks that the new $^R copying code doesn't leak.
…
…
…
…
…
…
…
…
…
…
…
Perl is Copyright (C) 1993 - 2025 by Larry Wall and others.
All rights reserved.
# ABOUT PERL
Perl is a general-purpose programming language originally developed for
text manipulation and now used for a wide range of tasks including
system administration, web development, network programming, GUI
development, and more.
The language is intended to be practical (easy to use, efficient,
complete) rather than beautiful (tiny, elegant, minimal). Its major
features are that it's easy to use, supports both procedural and
object-oriented (OO) programming, has powerful built-in support for text
processing, and has one of the world's most impressive collections of
third-party modules.
For an introduction to the language's features, see pod/perlintro.pod.
For a discussion of the important changes in this release, see
pod/perldelta.pod.
There are also many Perl books available, covering a wide variety of topics,
from various publishers. See pod/perlbook.pod for more information.
# INSTALLATION
If you're using a relatively modern operating system and want to
install this version of Perl locally, run the following commands:
./Configure -des -Dprefix=$HOME/localperl
make test
make install
This will configure and compile perl for your platform, run the regression
tests, and install perl in a subdirectory "localperl" of your home directory.
If you run into any trouble whatsoever or you need to install a customized
version of Perl, you should read the detailed instructions in the "INSTALL"
file that came with this distribution. Additionally, there are a number of
"README" files with hints and tips about building and using Perl on a wide
variety of platforms, some more common than others.
Once you have Perl installed, a wealth of documentation is available to you
through the 'perldoc' tool. To get started, run this command:
perldoc perl
# IF YOU RUN INTO TROUBLE
Perl is a large and complex system that's used for everything from
knitting to rocket science. If you run into trouble, it's quite
likely that someone else has already solved the problem you're
facing. Once you've exhausted the documentation, please report bugs to us
at the GitHub issue tracker at https://github.com/Perl/perl5/issues
While it was current when we made it available, Perl is constantly evolving
and there may be a more recent version that fixes bugs you've run into or
adds new features that you might find useful.
You can always find the latest version of perl on a CPAN (Comprehensive Perl
Archive Network) site near you at https://www.cpan.org/src/
If you want to submit a simple patch to the perl source, see the "SUPER
QUICK PATCH GUIDE" in pod/perlhack.pod.
Just a personal note: I want you to know that I create nice things like this
because it pleases the Author of my story. If this bothers you, then your
notion of Authorship needs some revision. But you can use perl anyway. :-)
The author.
# LICENSING
This program is free software; you can redistribute it and/or modify
it under the terms of either:
a. the GNU General Public License as published by the Free
Software Foundation; either version 1, or (at your option) any
later version, or
b. the "Artistic License" which comes with this Kit.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See either
the GNU General Public License or the Artistic License for more details.
You should have received a copy of the Artistic License with this
Kit, in the file named "Artistic". If not, I'll be glad to provide one.
You should also have received a copy of the GNU General Public License
along with this program in the file named "Copying". If not, see
<https://www.gnu.org/licenses/>.
For those of you that choose to use the GNU General Public License,
my interpretation of the GNU General Public License is that no Perl
script falls under the terms of the GPL unless you explicitly put
said script under the terms of the GPL yourself. Furthermore, any
object code linked with perl does not automatically fall under the
terms of the GPL, provided such object code only adds definitions
of subroutines and variables, and does not otherwise impair the
resulting interpreter from executing any standard Perl script. I
consider linking in C subroutines in this manner to be the moral
equivalent of defining subroutines in the Perl language itself. You
may sell such an object file as proprietary provided that you provide
or offer to provide the Perl source, as specified by the GNU General
Public License. (This is merely an alternate way of specifying input
to the program.) You may also sell a binary produced by the dumping of
a running Perl script that belongs to you, provided that you provide or
offer to provide the Perl source as specified by the GPL. (The
fact that a Perl interpreter and your code are in the same binary file
is, in this case, a form of mere aggregation.) This is my interpretation
of the GPL. If you still have concerns or difficulties understanding
my intent, feel free to contact me. Of course, the Artistic License
spells all this out for your protection, so you may prefer to use that.
Description
Languages
Perl
61.7%
C
31.7%
Shell
2.6%
XS
2.3%
Text
0.8%
Other
0.7%