HvNAME_HEK((HV*)CopSTASH(cx->blk_oldcop)) makes some implicit assumptions, which should
better be asserted. Also record CX pp_caller reads besides PUSH and POP.
I was trying to write a JAPH, but did not get what I expected:
$ ./perl -Ilib -e '@UNIVERSAL::ISA = CORE; print "just another "->ucfirst, "perl hacker,\n"->ucfirst'
Perl hacker,
Perl hacker,
This happened because coresubs use leavesublv, to avoid copying the
return value wastefully.
But since this is exactly the same ucfirst op being called each time
(the one in &CORE::ucfirst’s op tree), and since ucfirst uses TARG, we
end up with the same scalar.
We have the same problem with lvalue subs:
$ ./perl -Ilib -e 'sub UNIVERSAL::ucfirst :lvalue { ucfirst $_[0] } print "just another "->ucfirst, "perl hacker,\n"->ucfirst'
Perl hacker,
Perl hacker,
(This is not a regression, as 5.14 gave ‘Can't modify ucfirst in
lvalue subroutine return’.)
So ‘fixing’ coresubs would not be a solution, but a workaround.
The solution therefore is for leavesublv to copy PADTMPs in
rvalue context.
Commit 80422e24c fixed this for potential lvalue list context (i.e.,
for(lvsub()) {...}), but it wasn’t sufficient.
Since we have support for embedded nulls in strings, we shouldn’t
be using if(*label) to see whether label has a non-zero length.
It’s probably not possible to get a null into a label, but we should
still say ‘can’t find’ rather than ‘must have’ in that case.
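As a minimal sketch of why the first-byte test is unreliable (the function names here are hypothetical and do not appear in the Perl core), compare a check on the first byte with a check on the stored length:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helpers for illustration only. */

/* First-byte test, in the style of if(*label): a label that merely
 * begins with a NUL byte is treated as if it were empty. */
bool sketch_label_empty_by_first_byte(const char *label)
{
    return *label == '\0';
}

/* Length test: only a label of length zero counts as empty. */
bool sketch_label_empty_by_length(size_t len)
{
    return len == 0;
}
```

For the four-byte string "\0foo" the first function reports an empty label, while the second does not.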
The logic was written in such a way that goto "" just happened to slip
past all the checks and cause pp_goto to return NULL for the next op,
which means the end of the program.
goto ${\""} dies with ‘goto must have label’, so goto ""
should as well.
This also adds other tests for that error, which was apparently
untested till now.
The code that compared non-UTF-8 labels neglected to check that
the labels' lengths were equal before comparing them with memEQ,
which led to failures in code that used labels sharing a prefix:
./perl -Ilib -E 'CATCH: { CATCHLOOP: { last CATCH; } die }'
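The difference can be sketched in plain C as follows. This is an illustration under assumptions, not the code in the core: the function names are hypothetical, memcmp stands in for memEQ, and the buggy form is assumed to compare only as many bytes as the wanted label has.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers for illustration only. */

/* Buggy form: compares the first want_len bytes and never looks at
 * have_len, so wanting "CATCH" also matches the label "CATCHLOOP".
 * (It assumes `have` is at least want_len bytes long.) */
bool sketch_label_eq_buggy(const char *have, size_t have_len,
                           const char *want, size_t want_len)
{
    (void)have_len;
    return memcmp(have, want, want_len) == 0;
}

/* Fixed form: the lengths must agree before the bytes are compared. */
bool sketch_label_eq(const char *have, size_t have_len,
                     const char *want, size_t want_len)
{
    return have_len == want_len && memcmp(have, want, want_len) == 0;
}
```

With the buggy form, last CATCH in the one-liner above would stop at CATCHLOOP, leave only the inner block, and run into the die.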
This changes the code in pp_regcomp to use the underlying REGEXP
instead of the reference to it, when concatenating pieces to make a
larger regular expression. This makes /foo$qr/ work even under ‘no
overloading’. It stopped working with commit a75c6ed6b.
I completely forgot about do-file when, in commit f45b078d2, I stopped
eval from localising hints at run time. The result was that warning
hints were propagating into do-file.
The convention is that when the interpreter dies with an internal error, the
message starts "panic: ". Historically, many panic messages had been terse
fixed strings, which means that the out-of-range values that triggered the
panic are lost. Now we try to report these values, as such panics may not be
repeatable, and the original error message may be the only diagnostic we get
when we try to find the cause.
We can't report diagnostics when the panic message is generated by something
other than croak(), as we don't have *printf-style format strings. Don't
attempt to report values in panics related to *printf buffer overflows, as
attempting to format the values to strings may repeat or compound the
original error.
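As a rough sketch of the idea (the helper name, its parameters and the message wording are all hypothetical, and this is not how croak() is implemented), a panic message that records the offending values could be built like this:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper for illustration only: formats a "panic: "
 * message that includes the out-of-range value and the limit it
 * violated. snprintf bounds the write to `size` bytes. */
int sketch_format_panic(char *buf, size_t size,
                        const char *what, long value, long limit)
{
    return snprintf(buf, size, "panic: %s out of range (%ld, limit %ld)",
                    what, value, limit);
}
```

A fixed string such as "panic: index out of range" would lose the 12 and the 8 that a call like sketch_format_panic(buf, sizeof buf, "index", 12L, 8L) preserves.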
The current definition of SAVE_DEFSV doesn’t take reference counting
into account. Every instance of it in the perl core is buggy
as a result.
Most are also followed by DEFSV_set, which is likewise buggy.
This commit implements SAVE_DEFSV in terms of save_gp and
SAVEGENERICSV if PERL_CORE is defined. save_gp and SAVEGENERICSV are
what local(*_) = \$foo uses. Changing the definition for XS code is
probably too risky this close to 5.16. It should probably be changed
later, though.
DEFSV_set is now changed to do reference counting too.
return and leavesub, for speed, were not copying temp variables with a
refcount of 1, which is fine as long as the fact that the variable was
not copied is not observable.
With magical variables, that *can* be observed, so we have to forgo
the optimisation and copy the variable if it’s magical.
This obviously applies only to rvalue subs.
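The decision described above can be modelled as a pair of predicates. This is only a sketch: the function names and the boolean parameters are hypothetical stand-ins for the checks made on the real SV.

```c
#include <stdbool.h>

/* Hypothetical model for illustration only. */

/* Old rule: a temp with a refcount of 1 is returned without copying. */
bool sketch_skip_copy_old(bool is_temp, unsigned refcnt)
{
    return is_temp && refcnt == 1;
}

/* New rule: the same, except that a magical value is always copied,
 * because skipping the copy would be observable. */
bool sketch_skip_copy_new(bool is_temp, unsigned refcnt, bool is_magical)
{
    return is_temp && refcnt == 1 && !is_magical;
}
```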
This commit not only mentions default (as opposed to when)
in the error message about it being outside a topicalizer, but
also normalises those error messages, making them consistent with
continue and other loop controls. It also makes the perldiag
entry for when actually match the error message.
In S_doeval, if yystatus == 0 but there have been parser errors, then
there will be an extra scope on the scope stack inside the evalcomp
scope, causing an assertion failure with LEAVE_with_name("evalcomp").
This can happen with eval(q|""!=!~//|), which is a reduced version of
an eval in SNMP::Trapinfo’s test suite.
Under non-debugging builds, everything would have worked anyway,
as the LEAVE_with_name("evalcomp") would have left the scope
inside evalcomp.
Since POPBLOCK pops away the savestack markers on the scope stack, it
is not actually necessary to do LEAVE_with_name("evalcomp") at all
when there is a syntax error.
This used to cause an assertion failure, or sometimes ‘Attempt to free
nonexistent shared string’.
All that was required to fix it was the deletion of two cpp lines.
If goto &sub triggers a destructor that undefines &sub, a
crash ensues.
This commit adds an extra check in pp_goto after the unwinding of the
previous sub’s scope.
PL_compcv used to be localised around the entire string eval process,
and hence at runtime of the evaled code would refer to the evaled code
rather than code of a surrounding compilation. This interfered with the
ability of string-evaled code in a BEGIN block to affect the surrounding
compilation, in a similar way to the localisation of $^H and %^H that
was fixed in f45b078d20.
Similar to the fix there, this change moves the localisation of PL_compcv
inside the new evalcomp scope. A couple of things were relying on
PL_compcv to find the running code when in a string-eval scope; they now
need to find it from cx->blk_eval.cv, which was already being populated.
It doesn’t any more.
Now the hints are localised in a separate inner scope surrounding the
call to yyparse. This meant moving hint-handling code from pp_require
and pp_entereval into S_doeval.
Some tests in t/comp/hints.t were testing for the buggy behaviour, so
they have been adjusted.
Basically, this fixes
sub import {
    eval "strict->import"
}
which should work the same way as
sub import {
    strict->import
}
but was not working because %^H and $^H were being localised to
the eval at its run time, not just its compilation. So the values
assigned to %^H and $^H at the eval’s run time would simply be lost.
As of commit e4a21daa4e9 which removed the intervening code, there
have been two adjacent if blocks in S_doeval with the same condition.
They can be merged into one, to avoid confusing people (like me)
trying to understand the code.
Commit f9bddea7d2 divorced it from the code it was describing (deletion
of the FILEGV on eval exit). That code was subsequently repeated
in various places by commit 78da7625. The comment is now above the
first instance.
Perl_lex_start copies the string passed to it unconditionally.
Sometimes pp_entereval makes a copy before passing the string
to lex_start. So in those cases we can pass a flag to avoid a
redundant copy.
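The shape of that optimisation can be sketched as follows. The flag name and the function are hypothetical, and plain malloc stands in for whatever allocation the real lexer uses:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag name, for illustration only. */
#define SKETCH_ALREADY_COPIED 0x1u

/* Returns the buffer the lexer will own, or NULL if allocation fails.
 * Without the flag the source is duplicated; with it, the caller's
 * private copy is taken over as-is and no second copy is made. */
char *sketch_lex_source(char *src, size_t len, unsigned flags)
{
    char *copy;

    if (flags & SKETCH_ALREADY_COPIED)
        return src;

    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, src, len);
    copy[len] = '\0';
    return copy;
}
```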
This is scary:
#use sort 'stable';
require re; re->import('/x');
eval '
print "a b" =~ /a b/ ? "ok\n" : "nokay\n";
use re "/m";
print "a b" =~ /a b/ ? "ok\n" : "nokay\n";
';
It prints:
ok
nokay
The re->import statement is supposed to apply to the caller that
is currently being compiled, but it makes ‘use re "/m"’ enable
/x as well.
Uncomment the ‘use sort’ line, and you get:
ok
ok
which is even scarier.
eval"" is supposed to compile its argument with the hints under which
the eval itself was compiled.
Whenever %^H is modified, a flag (HINT_LOCALIZE_HH; LHH hereinafter)
is set in $^H.
When eval is called, it checks the LHH flag in the hints from the time
it was compiled, to determine whether to reset %^H. If LHH is set,
it creates a new %^H based on the hints under which it was compiled.
Otherwise, it just leaves %^H alone.
The problem is that %^H and LHH may be set some time later
(re->import), so when the eval runs there is junk in %^H that
does not apply to the contents of the eval.
There are two layers at which the hints hash is stored. There is the
Perl-level hash, %^H, and then there is a faster cop-hints-hash
structure underneath. It’s the latter that is actually used during
compilation. %^H is just a Perl front-end to it.
When eval does not reset %^H and %^H has junk in it, the two get
out of sync, because eval always sets the cop-hints-hash correctly.
Hence the first print in the first example above compiles without
‘use re "/x"’. The ‘use re’ statement after it modifies the %^H-with-
junk-in-it, which then gets synchronised with the cop-hints-hash,
turning on /x for the next print statement.
Adding ‘use sort’ to the top of the program makes the problem go
away, because, since sort.pm uses %^H, LHH is set when eval() itself
is compiled.
This commit fixes this by having pp_entereval check not only the LHH
flag from the hints under which it was compiled, but also the hints of
the currently compiling code ($^H / PL_hints).
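The change in the check can be modelled like this. It is a sketch only: the macro name, its value and the function names are illustrative, and the real code works on the eval op's saved hints and PL_hints rather than on plain integers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for the LHH bit; name and value are assumptions. */
#define SKETCH_LOCALIZE_HH 0x00020000u

/* Old check: only the hints in effect when the eval was compiled. */
bool sketch_reset_hh_old(uint32_t eval_hints)
{
    return (eval_hints & SKETCH_LOCALIZE_HH) != 0;
}

/* New check: also the hints of the code currently being compiled,
 * so a later %^H modification (such as re->import) is noticed. */
bool sketch_reset_hh_new(uint32_t eval_hints, uint32_t current_hints)
{
    return ((eval_hints | current_hints) & SKETCH_LOCALIZE_HH) != 0;
}
```

In the first example above, the eval was compiled without LHH but re->import set it afterwards: the old check leaves the junk in %^H alone, while the new check resets it.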
This function evaluates its argument as a byte string, regardless of
the internal encoding. It croaks if the string contains characters
outside the byte range. Hence evalbytes(" use utf8; '\xc4\x80' ")
will return "\x{100}", even if the original string had the UTF8 flag
on, and evalbytes(" '\xc4\x80' ") will return "\xc4\x80".
This has the side effect of fixing the deparsing of CORE::break under
‘use feature’ when there is an override.
In 5.8.x, this code:
use overload '""'=>sub { warn "stringify"; --$| ? "gonzo" : chr 256 };
my $obj = bless\do{my $x};
warn "$obj";
print "match\n" if chr(256) =~ $obj;
prints
stringify at - line 1.
gonzo at - line 3.
stringify at - line 1.
match
which is to be expected.
In 5.10+, the stringification happens one extra time, causing a failed match:
stringify at - line 1.
gonzo at - line 3.
stringify at - line 1.
stringify at - line 1.
This logic in pp_regcomp is faulty:
if (DO_UTF8(tmpstr)) {
    assert (SvUTF8(tmpstr));
} else if (SvUTF8(tmpstr)) {
    ... copy under ‘use bytes’...
}
else if (SvAMAGIC(tmpstr)) {
    /* make a copy to avoid extra stringifies */
    tmpstr = newSVpvn_flags(t, len, SVs_TEMP | SvUTF8(tmpstr));
}
The SvAMAGIC check never happens when the UTF8 flag is on.
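To see why, the chain can be modelled with three booleans. The names below are hypothetical, and the model only mirrors the branch structure quoted above, with DO_UTF8 taken to mean "UTF8 flag on and not under use bytes":

```c
#include <stdbool.h>

/* Hypothetical model of the branch structure, for illustration only. */
enum sketch_action {
    SKETCH_NO_COPY,
    SKETCH_COPY_FOR_BYTES,
    SKETCH_COPY_FOR_OVERLOAD
};

enum sketch_action sketch_choose(bool utf8_flag, bool in_bytes,
                                 bool overloaded)
{
    bool do_utf8 = utf8_flag && !in_bytes;

    if (do_utf8)
        return SKETCH_NO_COPY;            /* first branch wins */
    else if (utf8_flag)
        return SKETCH_COPY_FOR_BYTES;     /* flag on, under 'use bytes' */
    else if (overloaded)
        return SKETCH_COPY_FOR_OVERLOAD;  /* reached only with flag off */
    return SKETCH_NO_COPY;
}
```

Every combination with utf8_flag set is consumed by one of the first two branches, so an overloaded object that stringifies to chr 256 never gets the protective copy.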
This stops PL_curstash from pointing to a freed-and-reused scalar in
cases like ‘package Foo; BEGIN {*Foo:: = *Bar::}’.
In such cases, another BEGIN block, or any subroutine definition,
would cause a crash. Now it just happily proceeds. newATTRSUB and
newXS have been modified not to call mro_method_changed_in in such
cases, as it doesn’t make sense.
Commit c774086b8 made this:
$ ./miniperl -lwe '()=my $undef1..my $undef2'
Use of uninitialized value in range (or flop) at -e line 1.
Use of uninitialized value in range (or flop) at -e line 1.
become this:
$ ./miniperl -lwe '()=my $undef1..my $undef2'
Use of uninitialized value $undef2 in range (or flop) at -e line 1.
Use of uninitialized value in range (or flop) at -e line 1.
which was slightly better. This commit finishes the job:
$ ./miniperl -lwe '()=my $undef1..my $undef2'
Use of uninitialized value $undef1 in range (or flop) at -e line 1.
Use of uninitialized value $undef2 in range (or flop) at -e line 1.
In addition to using _nomg calls in pp_flop, I had to modify
looks_like_number, which was clearly buggy: it was ignoring get-magic
completely, *except* in the case of SvPOKp. But checking SvPOKp
before calling magic does not make sense, as it may change during the
magic call.
This copy, which occurs with "a".."z" in list context, has been there
since alphabetic ranges were added in commit b1248f16c (perl 3.0 patch
#17 patch #16, continued).
As a side effect, this:
$ ./miniperl -lwe '()=my $undef1..my $undef2'
Use of uninitialized value in range (or flop) at -e line 1.
Use of uninitialized value in range (or flop) at -e line 1.
becomes this:
$ ./miniperl -lwe '()=my $undef1..my $undef2'
Use of uninitialized value $undef2 in range (or flop) at -e line 1.
Use of uninitialized value in range (or flop) at -e line 1.
which is slightly better. :-)
SVs_PADSTALE is only meaningful with SVs_PADMY, while
SVs_PADTMP is only meaningful with !SVs_PADMY,
so let them share the same flag bit.
Note that this doesn't yet free a bit in SvFLAGS, as the two
bits are also used for SVpad_STATE, SVpad_TYPED.
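A sketch of how two flags can share one bit when a third flag disambiguates them follows. The macro names, bit values and predicates are all hypothetical and are not the definitions in sv.h:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag layout, for illustration only. */
#define SKETCH_PADMY    0x1u
#define SKETCH_PADTMP   0x2u
#define SKETCH_PADSTALE SKETCH_PADTMP   /* same bit, different meaning */

/* The shared bit means "pad temporary" only when PADMY is clear... */
bool sketch_is_padtmp(uint32_t flags)
{
    return (flags & (SKETCH_PADMY | SKETCH_PADTMP)) == SKETCH_PADTMP;
}

/* ...and "stale" only when PADMY is set. */
bool sketch_is_padstale(uint32_t flags)
{
    return (flags & (SKETCH_PADMY | SKETCH_PADSTALE))
           == (SKETCH_PADMY | SKETCH_PADSTALE);
}
```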
(This is a follow-on to 62bb6514085e5eddc42b4fdaf3713ccdb7f1da85.)
% perl -e 'print "package F;\n # \xF1\n;1;"' > x.pl
% perl '-Mopen=:encoding(utf8)' -e 'require "x.pl"'
utf8 "\xF1" does not map to Unicode at x.pl line 1.
Bit of a surprising discovery: it turns out that passing a single ":" for
the layers skips the fetch from the context layers:
perl -wE 'use open qw( :encoding(UTF-8) ); open my $fh, "<:", "etc"; say PerlIO::get_layers($fh);'
That will only get the relevant default layers, while removing the
colon makes it work as usual, so we can abuse this (mis)feature to
fix the bug.
When smartmatch is about to start, to avoid calling get-magic (e.g.,
FETCH methods) more than once, it copies any argument that has
get-magic.
Tainting uses get-magic to taint the expression. Calling mg_get(sv)
on a tainted scalar causes PL_tainted to be set, causing any scalars
modified by sv_setsv_flags to be tainted. That means that tainting
magic gets copied from one scalar to another.
So when smartmatch tries to copy the variable to avoid repeated calls
to magic, it still copies taint magic to the new variable.
For $scalar ~~ @array (or ~~ [...]), S_do_smartmatch calls itself
recursively for each element of @array, with $scalar (or rather the
supposedly non-magical copy of $scalar) on the left and the element on
the right.
In that recursive call, it again does the get-magic check and copies
the argument. Since the copy of a tainted variable on the LHS is
magical, it gets copied again. Since the first copy is a mortal
(marked TEMP) with a refcount of one, the second copy steals its
string buffer.
The outer call to S_do_smartmatch then proceeds with the second element
of @array, without realising that its copy of $scalar has lost
its string buffer and is now undefined.
So these produce incorrect results under -T (where $^X is ‘perl’):
$^X ~~ ["whatever", undef] # matches
$^X ~~ ["whatever", "perl"] # fails
This problem did not start occurring until this commit:
commit 8985fe98dcc5c0af2fadeac15dfbc13f553ee7fc
Author: David Mitchell <davem@iabyn.com>
Date: Thu Dec 30 10:32:44 2010 +0000
Better handling of magic methods freeing the SV
mg_get used to increase the refcount unconditionally, pushing it on to
the mortals stack. So the magical copy would have had a refcount of
2, preventing its string buffer from being stolen. Now it has a
reference count of 1.
This commit solves it by adding a new parameter to S_do_smartmatch
telling it that the variable has already been copied and does not even
need to be checked. The $scalar~~@array case sets that parameter for
the recursive calls. That avoids the whole string-stealing problem
*and* avoids extra unnecessary SVs.
Commit 309aab3a made goto &foo make the lexical hints of the caller of
the sub containing the goto visible when foo is called. CORE subs
need this to function properly when ‘goneto’. But in that commit I
put the PL_curcop assignment before the recursion check, causing the
warning settings of the caller to be used, instead of those at the
goto. This commit moves the PL_curcop assignment further down in pp_goto.