mirror/perl - perl - Maple Linux Source

mirror of https://github.com/Perl/perl5.git synced 2026-01-26 08:38:23 +00:00

Author	SHA1	Message	Date
Richard Leach	5eb6a00307	Add a test for GH #16943 assertion failure The asserting fuzzed test case was: eval q!s,,$0[sub{m[]]],;s,,$0[sub{m[]]],}}! The assertion triggered was: pad.c:614: Perl_pad_add_anon: Assertion `!CvWEAKOUTSIDE((const CV *)sv)' failed. This behaviour was long standing, present in v5.8.8 if not earlier, then was addressed by: ``` commit eb54d46 Author: Yves Orton <demerphq@gmail.com> Date: Fri Aug 26 18:26:14 2022 +0200 Stop parsing on first syntax error. We try to keep parsing after many types of errors, up to a (current) maximum of 10 errors. Continuing after a semantic error (like undeclared variables) can be helpful, for instance showing a set of common errors, but continuing after a syntax error isn't helpful most of the time as the internal state of the parser can get confused and is not reliably restored in between attempts. This can produce sometimes completely bizarre errors which just obscure the true error, and has resulted in security tickets being filed in the past. This patch makes the parser stop after the first syntax error, while preserving the current behavior for other errors. An error is considered a syntax error if the error message from our internals is the literal text "syntax error". This may not be a complete list of true syntax errors, we can iterate on that in the future. This fixes the segfaults reported in Issue #17397, and #16944 and likely fixes other "segfault due to compiler continuation after syntax error" bugs that we have on record, which has been a recurring issue over the years. ```	2026-01-24 23:01:10 +00:00
Richard Leach	f747bb18d7	t/op/cond.t - add a specific test for GH#18576 GH #18576 was concerned with the value returned from `if/elsif` statements that both have a false conditional, such as: my $y=do { if (0) { 5 } elsif(0) { 6 } }; where `$y` should contain an IV with value 0, the value of the last expression to be evaluated, but it did not. This problem was fixed as a side-effect of following commit: 4176abf7a8e425113debe55679c99b59bb9d299a Author: David Mitchell <davem@iabyn.com> Date: Wed Sep 18 12:28:18 2019 +0100 set VOID on OP_ENTER The OP_ENTER planted at the start of a program (and possibly elsewhere) gets left as UNKNOWN context rather than VOID context, due to op_scope() not honouring the current context. Fixing this makes things infinitesimally faster. This commit adds the `if/else` example mentioned above as a specific test for GH #18576, to add assurance that a future regression would result in a test failure.	2026-01-24 15:07:41 +00:00
Richard Leach	03e3af819d	Add test for GH#16938 assertion failure The asserting fuzzed case was: eval"${sub{sub{//]]]"}} The assertion triggered was: perl: op.c:7346: Perl_newSVOP: Assertion `sv' failed. The bug appeared following: ``` commit: 9ffcdca1f504cb09088413c074b35af4b7f247e3 Author: Father Chrysostomos <sprout@cpan.org> Date: Mon Nov 12 23:04:16 2012 -0800 Don’t leak subs containing syntax errors I fixed this for BEGIN blocks earlier, but missed the fact that all subs are affected. When called without an o argument (from newANONATTRSUB), newATTRSUB is expected to return a CV with an unowned reference count of which the caller will take ownership. We cannot have newATTRSUB returning a freed CV, so we have it return null instead. But that means ck_anoncode and pm_runtime have to account for that. ``` The bug disappeared following: ``` commit eb54d46f7264ff7af62c409d8a6ab984a5a34f57 Author: Yves Orton <demerphq@gmail.com> Date: Fri Aug 26 18:26:14 2022 +0200 Stop parsing on first syntax error. We try to keep parsing after many types of errors, up to a (current) maximum of 10 errors. Continuing after a semantic error (like undeclared variables) can be helpful, for instance showing a set of common errors, but continuing after a syntax error isn't helpful most of the time as the internal state of the parser can get confused and is not reliably restored in between attempts. This can produce sometimes completely bizarre errors which just obscure the true error, and has resulted in security tickets being filed in the past. This patch makes the parser stop after the first syntax error, while preserving the current behavior for other errors. An error is considered a syntax error if the error message from our internals is the literal text "syntax error". This may not be a complete list of true syntax errors, we can iterate on that in the future. 
This fixes the segfaults reported in Issue #17397, and #16944 and likely fixes other "segfault due to compiler continuation after syntax error" bugs that we have on record, which has been a recurring issue over the years. ```	2026-01-23 23:31:16 +00:00
Karl Williamson	274208291b	embed.fnc: Change EPTR assert for sv_pos_u2b_foo to gt These internal functions can handle empty strings, but they aren't called with those so far, and it is better practice to not call them with an empty string, so guard against it now.	2026-01-23 12:09:29 -07:00
Karl Williamson	ac3b9fce9b	perlapi: Add small detail to hv_iterval This now parallels the entry for hv_iterkey	2026-01-23 12:07:44 -07:00
Scott Baker	3e7d794e79	Use `seed` instead of `u` for readability in Perl_seed()	2026-01-23 11:03:16 -07:00
Scott Baker	8d1c2aec21	Use `getentropy()` for seeding PRNG in Perl_seed() On libc (*nix) systems we call `getentropy()` to get the seed needed to start the PRNG. If that call fails, we fall back to reading the filesystem via `/dev/urandom`. If that fails we fall back to hashing some state variables instead. This should be faster, less risky, and generally better than trying to read from `/dev/urandom` Foo	2026-01-23 11:03:16 -07:00
Karl Williamson	d4247bb256	sbox32_hash.h: Add #undef's These case statements need not be visible outside this header. Putting these here avoids cluttering up embed.h, where the same #undef lines would otherwise be generated	2026-01-22 09:55:58 -07:00
Karl Williamson	682cd222bd	embed.pl: Remove no longer used sub	2026-01-22 09:55:58 -07:00
Karl Williamson	4b67bbf7f6	embed.pl: Also consider #undef's This code looks to see what conditions must apply before a #define happens. This commit extends that to also look for #undef commands. The end result is that for symbols that are visible to XS code, but aren't supposed to be, embed.h contains an #undef so it isn't visible. But if it already has been #undef'ed, there is no need to do this. But a symbol can be defined and undefined many times, and the conditions for doing an #undef may be different than what the symbol was #defined under. The consequences of not realizing that a symbol gets undefined are simply that we generate an unnecessary #undef. The consequences of failing to generate one when the symbol is defined are that it is visible when not intended to be so. So, there are various restrictions to try to make sure that we don't err in the latter direction.	2026-01-22 09:55:58 -07:00
Karl Williamson	46bf0cf5e8	embed.pl: Consider symbols visible only to extensions Two commits ago, the code was extended to compare the C preprocessor visibility of a symbol with what the desired visibility of a symbol is. It assumed that everything not constrained to core was visible everywhere. This commit extends that to look for being visible only to extensions. As a result, the large number of symbols added to the override list in that commit are now removed.	2026-01-22 09:55:58 -07:00
Karl Williamson	5ed1b2fcbe	embed.pl: Add cpp constraints for .h files Many of the header files in our source have guards that keep them from being recursively called, with a convention as to how their name is derived from the file name. This commit changes to now consider these when computing what a cpp conditional evaluates to. It follows the convention, except in those few places where it is violated, and sets up the infrastructure so that this mechanism could be applied for other cases. Since this commit was originally written, all but one header file has been changed to follow the convention, so after rebasing, only one line is now being added.	2026-01-22 09:55:58 -07:00
Karl Williamson	6731bcd053	embed.pl: Compare cpp visibility with desired This commit creates a function that calculates what C preprocessor constraints there are on the visibility of a #define'd symbol. It then compares that with what the desired visibility is, based as prior commits have determined, and reconciles any discrepancies. It warns if the symbol is supposed to be visible, but cpp makes it not so. It adds it to the list of symbols to undefine if it is visible, but is not supposed to be so. In order to make this commit somewhat smaller with respect to code changes, it assumes anything that is visible to extensions is visible everywhere. This entailed adding a large number of symbols to the list of symbols to not #undef, in order to not change embed.h. The commit after the next one will fix this, and those symbols will be removed from the list in that commit.	2026-01-22 09:55:58 -07:00
Karl Williamson	c0955b6422	regen/HeaderParser: _reduce_conds: Return more than a bool This changes this function to stringify the result into a preprocessor conditional expression, instead of just a bool 0 or 1. This gives the caller more information. This doesn't change the outcome of callers who are expecting a boolean, as any string now returned evaluates to true.	2026-01-22 09:55:58 -07:00
Karl Williamson	d3cd2a8764	embed.pl: Assume Perl reserved symbols are visible Unless there is an indication otherwise.	2026-01-22 09:55:58 -07:00
Karl Williamson	38df3f3835	embed.pl: Formalize reserved Perl symbols This creates a regular expression pattern of names that we feel free to expose to XS code's namespace. Hence they are names reserved for our use, and should any conflicts arise, the module needs to change, not us. Naturally, the pattern is pretty restrictive. It is: any symbol beginning with "PL_"; any symbol containing /perl/i, with both sides delimited; any symbol containing "PERL". Any other spelling that we expose could be considered to pollute the XS code space. We feel free to do that all the time. Any new function's short name will do that. And we generally feel free to create macros with arbitrary names which could conflict with an existing XS name. Some important potential conflicts are: New keywords: We create an exposed KEY_foo macro. Some existing modules use some of these. My grep of CPAN shows maybe a dozen of these get used; mostly KEY_END. config.h is full of symbols like HAS_foo, I_bar, and others that are all exposed. I don't imagine we can claim to reserve any symbol beginning with either of those. Informally, myself and others have used a trailing underscore to indicate a private symbol. There are a few distributions that use some of these anyway. And there has been pushback when new short symbols that use this convention have been added. I would like to get a formal rule about use of this convention. There are 200+ of these currently. We could reserve any names with trailing underscores, or if that is too much, any ending in, say, 'pl_' or 'PL_'. We have 3000+ undocumented macro names that don't end in underscores and which are currently visible to XS code. This number includes the KEY_foo ones, but not the ones in config.h. To deal with namespace pollution, we have had the -DNO_SHORT_NAMES Configure option for use just with embedded perls. This hasn't worked at least since we added inline functions, and it always applied to only functions.
I have a WIP to get this to work again, and to extend it to work with documented macros. It just occurred to me how to make this be customizable, so that downstream someone could add a list of symbols that should only exist as 'Perl_foo', and then recompile	2026-01-22 09:55:58 -07:00
Karl Williamson	2166122145	embed.pl: Add some comments	2026-01-22 09:55:58 -07:00
Karl Williamson	8178bdb9ed	embed.pl: Use 'next' to remove an else Then we outdent the contents of that else, and reflow, which makes this long-ish section of code a bit shorter.	2026-01-22 09:55:58 -07:00
Karl Williamson	e4cec3d716	embed.pl: Swap order of conditionals It is easier to understand when the nearly trivial case is gotten out of the way first.	2026-01-22 09:55:58 -07:00
Karl Williamson	46c37430a0	embed.pl: Add constraints I did an inspection of the source, and found these few symbols that will not be defined for any XS code.	2026-01-22 09:55:58 -07:00
Karl Williamson	1fd7f0165b	embed.pl: Handle case of multiple flags for an element Some symbols are #defined in multiple places in the input; based typically on different preprocessor conditionals. We want to use the definition which has the widest visibility. This change showed that USE_STDIO had wrongly been undefined for the past few commits in blead. It is no longer actually ever defined by perl.	2026-01-22 09:55:58 -07:00
Karl Williamson	b0655cabd0	embed.pl: Add sub-hash This is in preparation for having a different sub-hash at the same level.	2026-01-22 09:55:58 -07:00
Karl Williamson	6355afcfa2	embed.pl: Extract common code into a function These are just a few lines now, but future commits will make this function bigger	2026-01-22 09:55:58 -07:00
Karl Williamson	ac8278ef3f	embed.pl: Pass file name to subroutine This will be used in future commits for warnings and errors	2026-01-22 09:55:58 -07:00
Karl Williamson	404a19b06b	embed.pl: Remove some commented out code I no longer think this might ever be useful	2026-01-22 09:55:58 -07:00
Karl Williamson	9ef0d33ac6	embed.pl: Use a separate loop for a hash This is in preparation for it to do things differently than the other hashes in the loop it previously was in.	2026-01-22 09:55:58 -07:00
Karl Williamson	ada54ac219	embed.pl: Move declaration This is an internal value. The declarations at the top of the program are for data that someone might want to change.	2026-01-22 09:55:58 -07:00
Karl Williamson	13b09f53da	embed.pl: Save hash elem into $var to make more readable	2026-01-22 09:55:58 -07:00
Karl Williamson	f3e59cad18	embed.pl: Rename variable The new name reflects the source of the data being examined.	2026-01-22 09:55:58 -07:00
Karl Williamson	c33265be37	embed.pl: Move pattern definition to only use The pattern doesn't get recompiled each time through the loop; it's easier to understand if the definition and use are near each other	2026-01-22 09:55:58 -07:00
Karl Williamson	943b938012	embed.pl: Improve detection of system symbols The heuristic previously used had many false positives, so it thought symbols were for the system that really weren't. This tightens it up, and to avoid breaking any existing code that might be relying on those miscategorized symbols, adds them to the list of unresolved visibility ones, so that they remain visible.	2026-01-22 09:55:58 -07:00
Karl Williamson	ada434ed89	embed.pl: Split an 'if' into two And reorder them. This paves the way for the next commit which will change them into having different actions	2026-01-22 09:55:58 -07:00
Karl Williamson	2784d222e2	embed.pl: Recategorize libc symbols These few symbols had been marked as unresolved as to their visibility. But in fact they are symbols in libc that do need to always be visible, and there is already a hash for this type. Move them to the proper place. The net effect is no external changes.	2026-01-22 09:55:58 -07:00
Karl Williamson	6399e4abe5	embed.pl: Add comments	2026-01-22 09:55:58 -07:00
Karl Williamson	37f79c58e9	embed.pl: Move more code to earlier in the file Future commits will need the results of this code earlier than it is now calculated.	2026-01-22 09:55:58 -07:00
Karl Williamson	f210eba1a5	embed.pl: Rename variable The new name adds the detail as to what it constrains	2026-01-22 09:55:58 -07:00
Karl Williamson	aa3eb59ab9	embed.pl: Move code to earlier in file This now creates lists much earlier so that future commits that will need them earlier in the process can do so.	2026-01-22 09:55:58 -07:00
Karl Williamson	b00a943889	embed.pl: Add some comments	2026-01-22 09:55:58 -07:00
Karl Williamson	086ad9905f	embed.pl: Move comments A block of comments got pushed down in the file by previous commits to where it no longer made sense. Move it back to the top.	2026-01-22 09:55:58 -07:00
Karl Williamson	eb5764fa6c	embed.pl: Remove config.h from file skip list This file doesn't get parsed anyway because it isn't in the MANIFEST, nor would it work out to parse it in spite of that, if only because it isn't under source control, and the outputs of this are.	2026-01-22 09:55:58 -07:00
Karl Williamson	ff31f47a54	embed.fnc: Add string assertions for S_intuit-more This makes sure the terminating character is a NUL. This internal function isn't documented as having that requirement, but that's always the case in our test suite. And functions it calls assume there is at least one character in the input, so the assertion shouldn't be EPTRge, and the test suite fails if it is EPTRgt.	2026-01-22 07:24:00 -07:00
Karl Williamson	31d13f8ab7	embed.fnc: Add string arg assertions for S_packlist This function takes a string argument with beginning and ending positions. It appears to me that those positions are overwritten without being examined, but the function does get called with an apparently empty string, but it actually contains a NUL.	2026-01-22 07:24:00 -07:00
Karl Williamson	c6c2ced1e9	embed.fnc: Add string arg assertions for S_incline This function is expecting a NUL-terminated C string	2026-01-22 07:24:00 -07:00
Karl Williamson	6a39094910	embed.fnc: Add string args assertions for S_wildcard The end pointer for this function should always point to the terminating NUL character of the string.	2026-01-22 07:24:00 -07:00
Karl Williamson	ea15ca3786	embed.fnc: Add EPTRtermNUL Some functions take arguments that point to the terminating NUL character of a string. This commit adds a way to declare in embed.fnc that a given argument is of that kind.	2026-01-22 07:24:00 -07:00
Karl Williamson	e3f7e781f3	embed.pl: Save some hash references in variables This shortens later references	2026-01-22 07:24:00 -07:00
Karl Williamson	4e751d51ff	toke.c: Add comment to S_parse-ident	2026-01-22 07:24:00 -07:00
Karl Williamson	6773f0fb0a	embed.fnc: Add comment	2026-01-22 07:24:00 -07:00
Karl Williamson	bbc7f9cd1e	Standardize .h recursive #include guard names Some header files in the Perl core have guards to keep a recursive #include from compiling them again. This is standard practice in C coding, and I was surprised at how many headers don't have it. These seem to rely on only being included from perl.h, which does have its own guard. Most of the guards use a common naming convention. If the file is named foo.h, the guard is named 'PERL_FOO_H'. Often, though, a trailing underscore is added, 'PERL_FOO_H_', making the convention slightly fuzzy. The 'PERL_' is not added if the file 'foo' already includes 'perl' in its name. Those rules are enough to describe all the guards in the core, except for the outliers in this commit, plus perl.h. There are occasions in various Perl scripts that examine our source that we want to create a pattern that matches the header guard name for a particular file. In spite of the slight fuzziness, that's easy using the above rules, except for the ones in this commit, and perl.h. It would be better for that code to not have to worry about the outliers, and since these are arbitrary names, we can just change them to follow the rules that already apply to most. This commit changes the names of the outliers so that the only file the rules don't apply to is perl.h. Its guard is named H_PERL. That spelling is used in Encode, so it's not so easy to change it seamlessly. I'm willing to live with it continuing to be an outlier for the code I write.	2026-01-22 07:21:21 -07:00
Karl Williamson	ed5117fa8b	grok_infnan: Handle empty input This public function dereferences its pointer parameter before checking its validity.	2026-01-22 06:59:32 -07:00
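
The header guard naming rules described in commit bbc7f9cd1e above can be sketched in C as follows. This is a minimal illustration only, not code from the Perl source: the names `contains_perl` and `guard_name` are hypothetical, the mapping of every non-alphanumeric character to an underscore is an assumption, and the sketch emits only the form without the optional trailing underscore. The perl.h outlier (H_PERL) is deliberately not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if s contains "perl" in any letter case, else 0. */
static int contains_perl(const char *s)
{
    static const char needle[] = "perl";
    const size_t n = sizeof needle - 1;

    for (; *s; s++) {
        size_t i = 0;
        while (i < n && tolower((unsigned char)s[i]) == needle[i])
            i++;
        if (i == n)
            return 1;
    }
    return 0;
}

/* Write the conventional guard name for header `file` into `out`.
 * Rule from the commit message: foo.h gives PERL_FOO_H, and the PERL_
 * prefix is omitted when the file name already contains "perl".
 * Returns 0 on success, -1 if `out` is too small. */
static int guard_name(const char *file, char *out, size_t outsz)
{
    size_t len = 0;

    assert(file != NULL && out != NULL);
    if (outsz == 0)
        return -1;

    if (!contains_perl(file)) {
        if (outsz < 6)              /* "PERL_" plus the terminating NUL */
            return -1;
        memcpy(out, "PERL_", 5);
        len = 5;
    }

    for (; *file; file++) {
        unsigned char c = (unsigned char)*file;
        if (len + 1 >= outsz)       /* keep room for the terminating NUL */
            return -1;
        /* Assumption: anything that is not a letter or digit becomes '_' */
        out[len++] = isalnum(c) ? (char)toupper(c) : '_';
    }
    out[len] = '\0';
    return 0;
}
```

Under these assumptions, `guard_name("foo.h", buf, sizeof buf)` fills `buf` with `PERL_FOO_H`, while a name that already contains "perl", such as `perly.h`, yields `PERLY_H` with no added prefix.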


83777 Commits