Besides using the just-introduced faster path for SV copying, this
allows the check for SV_GMAGIC to be pushed into the called function
without having to worry about SV leaks.
Two additional micro-optimizations are also in this commit:
* A pointer to xav_fill is cached for use in the loop. This can
be used directly to update AvFILLp(av), rather than having to
get there from av's SV* each time.
* The value of the loop iterator, i, is directly written into
xav_fill, rather than getting the value in that slot,
incrementing it (to get the same value as i), and writing it back.
Generally shifting is done by adjusting the start position of the
array, so that AvALLOC() and AvARRAY() no longer align. Previously if
the array was AvREAL(), perl NULL-ed out the old pointer at the same
time, i.e. doing the equivalent of AvARRAY(av)[-1] = NULL.
This commit changes it so that on PERL_RC_STACK builds, perl instead
keeps the old pointer there on AvREAL() arrays too, even if it now
points to a freed SV; and instead makes av_clear() and av_unshift()
responsible for NULL-ing out the slot only when actually reclaiming any
unused slots between AvALLOC() and AvARRAY().
The reason for this is because of the mechanism whereby @DB::args is
sometimes populated by caller(), as used by Carp.pm to display function
arguments in stack traces etc. For the common idiom of
sub f {
my $self = shift;
....;
}
then previously, since @_ wasn't AvREAL(), the old pointer in @_ to
argument 0 was kept around and thus the hidden pointer value was still
available for caller() to add to @DB::args.
But since the next commit will change @_ to be AvREAL(), this will no
longer work. Thus to keep the ability to debug stack traces etc (via
the awful @DB::args hack), we make shift keep the pointer even for the
AvREAL() case.
This function abstracts away removing any AvARRAY(av) - AvALLOC(av)
offset from an array (previously added by av_shift()).
Also use Zero() rather than a loop to NULL out any elements in the
offset area.
Since 399fef93c9,
trailing elements in an array that has been unshifted and resized
might not be properly Zero() initialized. This is because of faulty
arithmetic when calculating `to_null`, the number of elements to
initialize, when the array was only partially shifted.
This commit corrects the arithmetic, adds comments arount the
calculation of `to_null`, and adds a test based upon the case
provided in GH #21235.
The test added segfaults more reliably for me - almost every time -
than the originally supplied case. However, since it relies upon
uninitialized memory, it's probably still not deterministic and
somewhat dependent upon the choice of memory allocator.
Closes#21235
The api for av_store() says it is the callers responsibility to call
SvREFCNT_inc() on the stored item after the store is successful.
However inside of av_store() we store the item in the C level array before we
trigger magic. To a certain extent this is required because mg_set(av) needs
to be able to see the newly stored item.
But if the mg_set() or other magic associated with the av_store() operation
fails, we would end up with a double free situation, as we will long jump up
the stack above and out of the call to av_store(), freeing the mortal as we go
(via Perl_croak()), but leaving the reference to the now freed pointer in the
array. When the next SV is allocated the reference will be reused, and then we
are in a double free scenario.
I see comments in pp_aassign talking about defusing the temps stack for the
parameters it is passing in, and things like this, which at first looked
related. But that commentary doesn't seem that relevant to me, as this bug
could happen any time a scalar owned by one data structure was copied into an
array with set magic which could die. Eg, I can easily imagine XS code that
expected code like this (assume it handles magic properly) to work:
SV **svp = av_fetch(av1,0,1);
if (av_store(av2,0,*svp))
SvREFCNT_inc(*svp);
but if av2 has set magic and it dies the end result will be that both av1 and
av2 contain a visible reference to *svp, but its refcount will be 1. So I think
this is a bug regardless of what pp_aassign does.
This fixes https://github.com/Perl/perl5/issues/20675
In GH 20435 many typos in our C code were corrected. However, this pull
request was not applied to blead and developed merge conflicts. I
extracted diffs for the individual modified files and applied them with
'git apply', excepting four files where patch conflicts were reported.
Those files were:
handy.h
locale.c
regcomp.c
toke.c
We can handle these in a subsequent commit. Also, had to run these two
programs to keep 'make test_porting' happy:
$ ./perl -Ilib regen/uconfig_h.pl
$ ./perl -Ilib regen/regcomp.pl regnodes.h
In a nutshell, for a long time the minimum PV length (hardcoded
in Perl_sv_grow) has been 10 bytes and the minimum AV array size
(hardcoded in av_extend_guts) has been 4 elements.
These numbers have been used elsewhere for consistency (e.g.
Perl_sv_grow_fresh) in the past couple of development cycles.
Having a standard definition, rather than hardcoding in multiple
places, is more maintainable. This commit therefore introduces
into perl.h:
PERL_ARRAY_NEW_MIN_KEY
PERL_STRLEN_NEW_MIN
(Note: Subsequent commit(s) will actually change the values.)
There was less benefit in doing this prior to newSV_type becoming an inline
function. Now that it is, inlining av_new_alloc can help compilers eliminate
redundant setting of AvMAX/AvFILLp and unnecessary branches of any following
inlined calls to av_store_simple.
The consolidation of newAV with newAV_alloc_xz? included nonsensical
juxtapositions. This cleans that up, adds a bit of markup, moves the
function implementing the newAV_alloc ones to internal, as there are no
CPAN uses, and we prefer the macro interface.
There's no efficient way to create a mortal SV of any type other than
SVt_NULL (via sv_newmortal). The options are either to do:
* SV* sv = sv_newmortal; sv_upgrade(sv, SVt_SOMETYPE);
but sv_upgrade is needlessly inefficient on new SVs.
* SV* sv = sv_2mortal(newSV_type(SVt_SOMETYPE)
but this will perform runtime checks to see if (sv) and if (SvIMMORTAL(sv),
and for a new SV we know that those answers will always be yes and no.
This commit adds a new inline function which is basically a mortalizing
wrapper around the now-inlined newSV_type.
When a function outside of sv.c creates a SV via newSV(0):
* There is a call to Perl_newSV
* A SV head is uprooted and its flags set
* A runtime check is made to effectively see if 0 > 0
* The new SV* is returned
Replacing newSV(0) with newSV_type(SVt_NULL) should be more efficient,
because (assuming there are SV heads to uproot), the only step is:
* A SV head is uprooted and its flags set
This just detabifies to get rid of the mixed tab/space indentation.
Applying consistent indentation and dealing with other tabs are another issue.
Done with `expand -i`.
* vutil.* left alone, it's part of version.
* Left regen managed files alone for now.
Many of the files in perl are for one thing only, and hence their
embedded documentation will be for that one thing. By creating a hash
here of them, those files don't have to worry about what section that
documentation goes under, and so it can be completely changed without
affecting them.
av_make sets `AvFILLp(av)= -1;`, but will already have been set by `newAV()`,
via `newSV_type`'s call to `sv_upgrade`.
This PR also includes the minor cosmetic change of swapping in `newAV()`
in place of its expansion, seemingly the only occurrence of that in core.
I created this in 87306e0674dfe3af29804b4641347cd5ac9b0521, thinking it
was needed to preserve backward compatibility if someone were using this
instead of the macro. But it turned out that there never was such a
function, it was inlined, and the name was S_av_top_index, so there is
no reason to create a new function that no one has ever been able to
call. So just remove it, and let all accesses go through the macro
This returns the number of elements in an array in a clearly named
function.
av_top_index(), av_tindex() are clearly named, but are less than ideal,
and came about because no one back then thought of this one, until now
Paul Evans did.
In [rt.cpan.org #39196] issue #17496 there is a report
that Tie::File produced spurious blank lines in the file
after
@tied= sort @tied;
it turns out that this is because Tie::File treats
EXTEND similarly to STORESIZE (which is arguably not
entirely correct, but also not that weird) coupled
with an off by one error in the calls to av_extend()
in pp_sort.
This patch fixes the fencepost error, adds some comments
to av_extend() to make it clear what it is doing, and
adds a test that EXTEND is called by this code with
correct argument.
This reverts commit bc62bf8519f9005df2fb29dbd3d330202b258b6b.
As Dave Mitchell said in <Perl/perl5/issues/17265/570253376@github.com>
"The docs for av_store() say that the caller is responsible for first
incrementing the ref count of the new SV before passing it to
av_store(). In the normal case of the two elements being different,
av_store() will decrement the old SV before storing the new one, and
everything is good. When the two elements are the same SV, av_store()
*still* needs to decrement the ref count, to undo the increment the
caller just did."
This should resolve GH #17265
Don't decrement the reference count of the element about to be stored
into.
Likely, this is an error in the caller, but doing this action blindly is
like shooting yourself in the foot. The branch prediction also added
ensures this shouldn't slow things down.
See http://nntp.perl.org/group/perl.perl5.porters/254974
MSVC due to a bug doesn't merge identicals between .o'es or discard these
vars and their contents.
MEM_WRAP_CHECK_2 has never been used outside of core according to cpan grep
MEM_WRAP_CHECK_2 was removed on the "have PERL_MALLOC_WRAP" branch in
commit fabdb6c0879 "pre-likely cleanup" without explination, probably bc
it was unused. But MEM_WRAP_CHECK_2 was still left on the "no
PERL_MALLOC_WRAP" branch, so remove it from the "no" side for tidyness
since it was a mistake to leave it there if it was removed from the "yes"
side of the #ifdef.
Add MEM_WRAP_CHECK_s API, letter "s" means argument is string or static.
This lets us get rid of the "%s" argument passed to Perl_croak_nocontext at
a couple call sites since we fully control the next and only argument and
its guaranteed to be a string literal. This allows merging of 2
"Out of memory during array extend" c strings by linker now.
Also change the 2 op.h messages into macros which become string literals
at their call sites instead of "read char * from a global char **" which
was going on before.
VC 2003 32b perl527.dll section size before
.text name
DE503 virtual size
.rdata name
4B621 virtual size
after
.text name
DE503 virtual size
.rdata name
4B5D1 virtual size
To avoid having to create deferred elements every time a sparse array
is pushed on to the stack, store a magic scalar in the array itself,
which av_exists and refto recognise as not existing.
This means there is only a one-time cost for putting such arrays on
the stack.
It also means that deferred elements that live long enough don’t
start pointing to the wrong array entry if the array gets shifted (or
unshifted/spliced) in the mean time. Instead, the scalar is already
in the array, so it cannot lose its place. This fix only applies
when the array as a whole is pushed on to the stack, but it could be
extended in future commits to apply to other places where we currently
use deferred elements.
The iterator of an AV is an IV value attached to the AV via magic.
It may be stored in the space used by mg_len, or it may be stored in
separately allocated space referenced by mg_ptr. The former is more
efficient, so should be preferred. The original code for AV iterators
would use mg_len if IV was (the same size as) I32, because mg_len was of
type I32. Since then mg_len has been increased to type SSize_t, but the
conditional about how AV iterators are stored wasn't updated to match.
As a result, on the now very common 64-bit builds we were missing out on
the opportunity to use the more efficient form of storage. This commit
updates the condition about how AV iterators are stored, to take account
of the new type.
In principle AV iterators ought to be of type SSize_t, and thus *always*
storable as mg_len. But Perl_av_iter_p() is in the public API with
its IV* return type, so there is a compatibility issue to address in
changing that.
av.c: In function ‘Perl_av_undef’:
av.c:577:35: warning: ‘orig_ix’ may be used uninitialized in this function [-Wmaybe-uninitialized]
PL_tmps_stack[orig_ix] = &PL_sv_undef;
The warning is bogus, as we only use the orig_ix if real is true,
and if real is true we will have set orig_ix. But it doesnt cost
much to initialize it always and shut up the compiler.
C++11 requires space between the end of a string literal and a macro, so
that a feature can unambiguously be added to the language. Starting in
g++ 6.2, the compiler emits a warning when there isn't a space
(presumably so that future versions can support C++11). Unfortunately
there are many such instances in the perl core. This commit fixes
those, including those in ext/, but individual commits will be used for
the other modules, those in dist/ and cpan/.
This commit also inserts space at the end of a macro before a string
literal, even though that is not deprecated, and removes useless ""
literals following a macro (instead of inserting a blank). The result
is easier to read, making the macro stand out, and be clearer as to the
intention.
Code and modules included with the Perl core need to be compilable using
C++. This is so that perl can be embedded in C++ programs. (Actually,
only the hdr files need to be so compilable, but it would be hard to
test that just the hdrs are compilable.) So we need to accommodate
changes to the C++ language.
av_clear(), av_undef(), hv_clear(), hv_undef() and av_make()
all have similar guards along the lines of:
ENTER;
SAVEFREESV(SvREFCNT_inc_simple_NN(av));
... do stuff ...;
LEAVE;
to stop the AV or HV leaking or being prematurely freed while processing
its elements (e.g. FETCH() or DESTROY() might do something to it).
Introducing an extra scope and calling leave_scope() is expensive.
Instead, use a trick I introduced in my recent pp_assign() recoding:
add the AV/HV to the temps stack, then at the end of the function,
just PL_tmpx_ix-- if nothing else has been pushed on the tmps stack in the
meantime, or replace the tmps stack slot with &PL_sv_undef otherwise
(which doesn't care how many times its ref count gets decremented).
This is efficient, and doesn't artificially extend the life of the SV
like sv_2mortal() would.
This commit makes this code around 5% faster:
my @a;
for my $i (1..3_000_000) {
@a = (1,2,3);
@a = ();
}
and this code around 3% faster:
my %h;
for my $i (1..3_000_000) {
%h = qw(a 1 b 2);
%h = ();
}