mirror/perl - perl - Maple Linux Source

mirror of https://github.com/Perl/perl5.git synced 2026-01-26 08:38:23 +00:00

Author	SHA1	Message	Date
Karl Williamson	24c7fb4c21	Convert Perl utf16 to utf8 functions to macros These functions are hereby removed in favor of calling the plain macros that already exist	2025-12-27 21:24:47 -07:00
Karl Williamson	7e1ae0c850	Remove SBOX case statements from external visibility I'm pretty sure there is no use case for these, and very unlikely to have any actual uses.	2025-12-10 08:50:19 -07:00
Karl Williamson	ebbe6ac0f7	Remove a few more macros from being visible to XS code These are a few macros dealing with inversion lists that were never intended to be visible to general XS code, and they actually can't be in use in cpan because the mechanisms to create inversion lists are private to perl.	2025-12-10 08:50:19 -07:00
Karl Williamson	92dcf59a90	Gain control of macro namespace visibility This commit adds the capability to undefine macros that are visible to XS code but shouldn't be. This can be used to stop macro namespace pollution by perl. It works by changing embed.h to have two modes, controlled by a #ifdef that is set by perl.h. perl.h now #includes embed.h twice. The first time works as it always has. The second sets the #ifdef, and causes embed.h to #undef the macros that shouldn't be visible. This call is just before perl.h returns to its includer, so that these macros have come and gone before the file that #included perl.h is affected by them. It comes after the inline headers get included, so they have access to all the symbols that are defined. The list of macros is determined by the visibility given by the apidoc lines documenting them, plus several exception lists that allow a symbol to be visible even though it is not documented as such. In this commit, the main exception list contains everything that is currently visible outside the Perl core, so this should not break any code. But it means that the visibility control is established for future changes to our code base. New macros will not be visible except when documented as needing to be such. We can no longer inadvertently add new names to pollute the user's. I expect that over time, the exception list will become smaller, as we go through it and remove the items that really shouldn't be visible. We can then see via smoking if someone is actually using them, and either decide that these should be visible, or work with the module author for another way to accomplish their needs. (I would hope this would lead to proper documentation of the ones that need to be visible.) There are currently four lists of symbols. One list is for symbols that are used by libc functions, and that Perl may redefine (usually so that code doesn't have to know if it is running on a platform that is lacking the given feature.) 
The algorithm added here catches most of these and keeps them visible, but there are a few items that currently must be manually listed. A second list is of symbols that the re extension to Perl requires, but no one else needs to. This list is currently empty, as everything initially is in the main exception list. A third list is for items that other Perl extensions require, but no one else needs to. This list is currently empty, as everything initially is in the main exception list. The final list is for items that currently are visible to the whole world. It contains thousands of items. This list should be examined for: 1) Names that shouldn't be so visible; and 2) Names that need to remain visible but should be changed so they are less likely to clash with anything the user might come up with. I have wanted this ability to happen for a long time; and now things have come together to enable it. This allows us to have a clear-cut boundary with CPAN. It means you can add macros that have internal-only use without having to worry about making them likely not to clash with user names. It shows precisely in one place what our names are that are visible to CPAN.	2025-12-10 08:50:19 -07:00
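The two-pass header idea described in 92dcf59a90 can be sketched in a single C file. This is only an illustration under stated assumptions: every `demo_` name is hypothetical, the two passes are written inline rather than as a real second `#include`, and nothing here is taken from the actual embed.h.

```c
/* Pass 1: the header defines every macro, public or not.
 * All demo_ names are hypothetical stand-ins, not real Perl symbols. */
#define demo_public_add(a, b)   ((a) + (b))   /* documented as public */
#define demo_internal_twice(x)  ((x) * 2)     /* meant for core use only */

/* Inline code that is pulled in before pass 2 can still use every macro,
 * because the macros are expanded right here. */
static inline int demo_core_helper(int x)
{
    return demo_public_add(demo_internal_twice(x), 1);
}

/* Pass 2: the same header is read again with a mode switch set, and this
 * time it undefines whatever is not on a visibility list. */
#define DEMO_UNDEF_PASS
#ifdef DEMO_UNDEF_PASS
#  undef demo_internal_twice
#endif

/* The includer's code starts here. The internal name is gone, so the
 * includer may reuse it without a clash. */
static int demo_internal_twice(int x)
{
    return x + 1000;
}
```

In this sketch `demo_core_helper(5)` still evaluates the macro form (giving 11), while a later call to `demo_internal_twice(5)` reaches the includer's own function.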
Karl Williamson	838b774823	Move hv_stores() declaration from embed.fnc to hv.h This is required for the next few commits that start automatically creating long Perl_name functions for the elements in embed.fnc that are macros and don't already have them in the source. Only macros can take a parameter that has to be a literal string, so don't fit with the next few commits. This is the only case in embed.fnc like that, so I'm deferring dealing with it for now.	2025-12-10 08:50:19 -07:00
Karl Williamson	32aaa22eec	embed.fnc: Drop Perl_ on do_aexec my_stat my_lstat These macros are not for external use, so don't need a Perl_ prefix	2025-12-10 08:50:19 -07:00
Karl Williamson	4092daf53e	Remove some special EBCDIC code The 'variant_byte_number' function was written to find the byte number in a word of the first byte whose meaning varies depending on if the string it is part of is encoded in UTF-8 or not. On ASCII machines, that is simply when the upper bit is set. On EBCDIC machines, there is no similar pattern, so this function hasn't been compiled on those. A long time ago, I realized that this function could also handle binary data by coercing that binary data into having the form of having that bit set or not depending on the pattern being looked for, and then calling that function. But I actually hadn't realized until now that it was binary data not tied to a character set that was being worked on. This commit rectifies that. A new alias is added for that function that emphasizes that it works on binary data, the function is now compiled for EBCDIC, and the EBCDIC-only code that avoided using it is now removed.	2025-11-01 21:02:37 -06:00
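The idea in 4092daf53e, that a "find the first byte with the upper bit set" routine also serves for plain binary searches once the data is coerced into that form, can be sketched as follows. This is a minimal illustration, not the code in the Perl core: the `demo_` names are hypothetical, the word is assembled byte by byte so the sketch does not depend on host endianness, and the coercion step uses the well-known zero-byte bit trick.

```c
#include <stdint.h>

#define DEMO_HIGH_BITS UINT64_C(0x8080808080808080)
#define DEMO_LOW_BITS  UINT64_C(0x0101010101010101)

/* Pack 8 bytes so that byte 0 lands in the least significant position. */
static uint64_t demo_load_word(const unsigned char *p)
{
    uint64_t w = 0;
    for (int i = 0; i < 8; i++)
        w |= (uint64_t)p[i] << (8 * i);
    return w;
}

/* Index of the first byte whose upper bit is set, or -1 if there is none. */
static int demo_first_marked_byte(uint64_t w)
{
    int n = 0;
    w &= DEMO_HIGH_BITS;
    if (w == 0)
        return -1;
    while ((w & 0xFF) == 0) {
        w >>= 8;
        n++;
    }
    return n;
}

/* Character-set view: first byte that varies under UTF-8 on ASCII machines. */
static int demo_first_high_bit_byte(const unsigned char *p)
{
    return demo_first_marked_byte(demo_load_word(p));
}

/* Binary-data view: coerce "byte equals c" into "upper bit set", then reuse
 * the same search. XOR turns matching bytes into zero; the subtract-and-mask
 * step marks zero bytes. Bytes after the first match may be marked falsely,
 * but the first marked byte is always a true match. */
static int demo_first_byte_equal(const unsigned char *p, unsigned char c)
{
    uint64_t x = demo_load_word(p) ^ (DEMO_LOW_BITS * c);
    return demo_first_marked_byte((x - DEMO_LOW_BITS) & ~x);
}
```

Both entry points read exactly 8 bytes from `p`; a real implementation would also have to deal with alignment and with buffers that are not a multiple of the word size.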
Paul "LeoNerd" Evans	f1a8d7d883	Implement named parameters in signatures (PPC0024) This adds a major new ability to subroutine signatures, allowing callers to pass parameters by name/value pairs rather than by position. sub f ($x, $y, :$alpha, :$beta = undef) { ... } f( 123, 456, alpha => 789 ); Originally specified in https://github.com/Perl/PPCs/blob/main/ppcs/ppc0024-signature-named-parameters.md This feature is currently considered experimental.	2025-10-31 11:31:29 +00:00
Branislav Zahradník	147d5f1b9e	[parser] new_block_statement - deduplicate "a block is a loop that happens once"	2025-10-22 17:23:56 +01:00
Branislav Zahradník	e6a443b294	[parser] package - deduplicate coupled call sequence The new function combines the calls to the original `package` and `package_version` that are made when a new namespace statement is detected. Instead of the three statements previously required, usage now consists of a single function call.	2025-10-22 17:23:56 +01:00
Karl Williamson	935cdb76e8	embed.fnc: mv definition of more_sv This was in a #ifdef of being in sv.c, which it is, but since it is public, it needs to be moved out of this. This removes the need for a copy of its prototype to be in sv_inline.h	2025-10-21 18:58:48 -06:00
Karl Williamson	2e142e0d27	regen/embed.pl: Avoid use of hard-coded list The list consists of exactly the functions that have the O flag set in embed.fnc. No need to keep this data twice. The entries are trivially generatable from existing entries as we go along. And those generated entries have the added advantage of not using the short name, so potentially less name space pollution.	2025-10-21 18:58:48 -06:00
Karl Williamson	bd4c0d1fc2	Add S_parse_ident_no_copy() This new function is for callers that are merely checking if the string being parsed is a legal identifier or not, and aren't interested in the normalized version of the identifier that parse_ident() generates. This new function allows callers to not have to think about this buffer; it just wraps plain parse_ident() using a throw-away buffer to hold the returned normalized text. This avoids introducing a bunch of conditionals inside parse_ident.	2025-10-17 12:26:04 -06:00
Karl Williamson	735e7cc211	toke.c: Change parse_ident to take any string Prior to this commit, the string passed to this function had to be pointing to somewhere in PL_bufptr. But this is only because it assumed that the initial position is less than PL_bufend. By passing the upper bound in, that assumption is automatically removed.	2025-10-17 12:26:00 -06:00
Karl Williamson	e4be402477	toke.c: Use flags parameter to S_parse_ident This makes it clearer at each call point what is happening, and prepares for future commits where more flags will be passed to this function.	2025-10-17 12:26:00 -06:00
Karl Williamson	bfbd5f7e35	toke.c: Use flags parameter for S_force_word This makes it clear at each call point what is happening, instead of having to jump to the S_force_word definition to know what 'false, true' vs 'true, false' actually means. And this prepares for future commits.	2025-10-17 12:25:59 -06:00
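The readability point made in bfbd5f7e35 and e4be402477 (named flags instead of positional booleans) can be shown with a small stand-alone sketch. The scanner and its flag names below are hypothetical and far simpler than S_force_word or S_parse_ident; they only illustrate how a call site documents itself once the options have names.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical option bits; more can be added later without touching
 * existing call sites. */
#define DEMO_WORD_ALLOW_PACKAGE      (1u << 0)  /* accept "::" separators */
#define DEMO_WORD_ALLOW_DIGIT_START  (1u << 1)  /* accept a leading digit */

/* Returns the length of the word at the start of s, or 0 if none. */
static size_t demo_scan_word(const char *s, unsigned flags)
{
    size_t i = 0;

    if (!(flags & DEMO_WORD_ALLOW_DIGIT_START) && isdigit((unsigned char)s[0]))
        return 0;

    for (;;) {
        if (isalnum((unsigned char)s[i]) || s[i] == '_')
            i++;
        else if ((flags & DEMO_WORD_ALLOW_PACKAGE)
                 && s[i] == ':' && s[i + 1] == ':')
            i += 2;
        else
            break;
    }
    return i;
}
```

A call such as `demo_scan_word(s, DEMO_WORD_ALLOW_PACKAGE)` says what it permits, whereas a boolean form like `demo_scan_word(s, true, false)` would send the reader to the definition.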
Karl Williamson	3450d19250	intuit_more: 'use strict' allows much better handling Most code these days runs under 'use strict'. That allows us to resolve ambiguity without resorting to heuristics in far more cases than before. This commit adds a parameter to intuit_more() that gives the context it is being called from. And when that call is to resolve what $foo[...] is supposed to mean, we can look up foo to see if it is an array or a scalar. If the former, the "..." must be a subscript; if a scalar, it must be a charclass. Only if there is both a $foo and an @foo is there ambiguity. If so, we drop down to using the heuristics	2025-10-17 12:09:03 -06:00
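The decision rule described in 3450d19250 for `$foo[...]` under 'use strict' reduces to a small truth table. The sketch below uses hypothetical names and takes the two lookups as already-computed booleans; treating the "neither variable is known" case as a fallback to heuristics is an assumption, since the commit message does not say what happens there.

```c
#include <stdbool.h>

typedef enum {
    DEMO_SUBSCRIPT,       /* "[...]" indexes the array @foo */
    DEMO_CHARCLASS,       /* "[...]" is a bracketed character class after $foo */
    DEMO_USE_HEURISTICS   /* still ambiguous: fall back to guessing */
} demo_bracket_kind;

static demo_bracket_kind demo_classify_bracket(bool have_scalar, bool have_array)
{
    if (have_array && !have_scalar)
        return DEMO_SUBSCRIPT;
    if (have_scalar && !have_array)
        return DEMO_CHARCLASS;
    /* Both $foo and @foo exist, or (assumption) neither is known. */
    return DEMO_USE_HEURISTICS;
}
```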
Karl Williamson	aa93969e9c	toke.c: Create function to see if an identifier is known This checks first if there is a lexical variable in scope with the given name, and if not, if there is a global	2025-10-17 12:09:03 -06:00
Karl Williamson	9fc9ec2818	Change invlist function names to be legal This continues the process started in #23592 to change names with leading underscores to be legal C. See that p.r. or 4bb3572f7a1c1f3944b7f58b22b6e7a9ef5faba6 for extensive discussion. This commit simply moves the leading underscore to be trailing	2025-10-12 16:56:21 -06:00
Karl Williamson	59bca40fd0	S_scan_ident: Convert parameter to bool All calls to it set it to TRUE or FALSE	2025-10-07 11:48:47 -06:00
Paul "LeoNerd" Evans	215e36f380	Add `cop_*_warning()` API This adds three new API functions: a pair to modify a COP by enabling or disabling a single warning bit within it, and a query function to ask if a given warning is already enabled. This API is provided for CPAN modules to use to modify the set of warnings present in a COP during compile-time. Currently modules need to use the `new_warnings_bitfield()` function, which was recently hidden by 09a0707. That change broke the `Syntax::Keyword::Try` module, as reported in https://github.com/Perl/perl5/issues/23609.	2025-09-23 13:43:47 +01:00
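The shape of the API in 215e36f380 (enable one warning, disable one warning, query one warning) can be sketched with a plain bit array. This is a simplified stand-in, not the Perl implementation: the `demo_` names, the `demo_cop` struct, and the one-bit-per-warning layout are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_WARN_BYTES 16

/* Hypothetical stand-in for the warning state attached to a COP. */
typedef struct {
    unsigned char bits[DEMO_WARN_BYTES];
} demo_cop;

static void demo_enable_warning(demo_cop *cop, unsigned warn)
{
    assert(warn < DEMO_WARN_BYTES * 8);
    cop->bits[warn / 8] |= (unsigned char)(1u << (warn % 8));
}

static void demo_disable_warning(demo_cop *cop, unsigned warn)
{
    assert(warn < DEMO_WARN_BYTES * 8);
    cop->bits[warn / 8] &= (unsigned char)~(1u << (warn % 8));
}

static bool demo_warning_is_enabled(const demo_cop *cop, unsigned warn)
{
    assert(warn < DEMO_WARN_BYTES * 8);
    return ((cop->bits[warn / 8] >> (warn % 8)) & 1u) != 0;
}
```

Keeping the bit layout behind three small functions is what lets a module toggle a single warning without building a whole replacement bitfield itself.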
Karl Williamson	c14d142701	Make die() always expand to Perl_die_nocontext() See 03f24b8a082948e5b437394fa33d0af08d7b80b6 for the motivation. This commit changes plain die() to not use a thread context parameter. It and die_nocontext() now behave identically.	2025-09-21 06:55:45 -06:00
Karl Williamson	2cb0034ef5	Unroll valid_utf8_to_uv loop This gives a bit of performance boost in this function that can be called during pattern matching. Here are some cachegrind comparisons with blead:

Key:
    Ir    Instruction read
    Dr    Data read
    Dw    Data write
    COND  conditional branches
    IND   indirect branches

The numbers represent relative counts per loop iteration, compared to blead at 100.0%. Higher is better: for example, using half as many instructions gives 200%, while using twice as many gives 50%.

                GCC                CLANG
          blead   hacked      blead   hacked
          ------  ------      ------  ------
valid_utf8_to_uv(0x007f), length is 1
    Ir    100.00  100.69      100.00   99.11
    Dr    100.00  101.47      100.00   99.74
    Dw    100.00  100.00      100.00   99.57
    COND  100.00  101.20      100.00  100.00
    IND   100.00  100.00      100.00   94.12
valid_utf8_to_uv(0x07ff), length is 2
    Ir    100.00  100.68      100.00   99.04
    Dr    100.00  101.47      100.00   99.74
    Dw    100.00  100.00      100.00   99.57
    COND  100.00  102.40      100.00  101.23
    IND   100.00  100.00      100.00   94.12
valid_utf8_to_uv(0xfffd), length is 3
    Ir    100.00  100.83      100.00   99.04
    Dr    100.00  101.47      100.00   99.75
    Dw    100.00  100.00      100.00   99.57
    COND  100.00  102.99      100.00  101.84
    IND   100.00  100.00      100.00   94.12
valid_utf8_to_uv(0xffffd), length is 4
    Ir    100.00  100.91      100.00   99.13
    Dr    100.00  101.46      100.00   99.75
    Dw    100.00  100.00      100.00   99.57
    COND  100.00  103.59      100.00  102.45
    IND   100.00  100.00      100.00   94.12
valid_utf8_to_uv(0x3ffffff), length is 5
    Ir    100.00  101.28      100.00   99.29
    Dr    100.00  101.46      100.00   99.75
    Dw    100.00  100.00      100.00   99.57
    COND  100.00  104.19      100.00  103.07
    IND   100.00  100.00      100.00   94.12
valid_utf8_to_uv(0x7fffffff), length is 6
    Ir    100.00   89.83      100.00   88.83
    Dr    100.00   95.22      100.00   92.94
    Dw    100.00   92.44      100.00   91.63
    COND  100.00   86.21      100.00   87.11
    IND   100.00  100.00      100.00   88.89

Clang gives slightly worse results than gcc. But there is an improvement in both cases for conditionals for two-byte and longer characters. This shows that the performance is significantly worse for code points that take 6 bytes (or more, which I didn't include) to represent. These are all well outside the Unicode range; hence are very rarely encountered. Performance is improved a bit for the typical cases. The algorithm used could handle 6 and 7 byte characters, but that increases memory usage, and can lead to the compiler choosing to not inline this function. In blead, experiments with clang gave these results:

    Max bytes inlined    Instances in the code where not inlined
            3                       14
            4                       19
            5                       19
            6                       19
            7                       57

We really need to accommodate any Unicode code point, which is 4 bytes (5 on EBCDIC). But the others we don't care about. Even though 6 bytes doesn't show as being worse than 4, I chose to not include it, because we don't care about performance for these rare non-Unicode code points, and it just might cause non-inlining for different compilers or clang versions.	2025-09-20 10:21:33 -06:00
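The trade-off discussed in 2cb0034ef5 (straight-line code for the lengths that matter, a loop for the rare long forms) can be sketched as below. This is not the core's implementation: `demo_valid_utf8_to_cp` is a hypothetical name, the sketch assumes an ASCII platform and well-formed input, and it stops at 6-byte sequences.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one code point from well-formed (already validated) UTF-8.
 * Lengths 1 to 4 cover all of Unicode and are unrolled; the longer,
 * non-Unicode forms take the slower generic loop. */
static uint32_t demo_valid_utf8_to_cp(const unsigned char *s, size_t *len)
{
    const unsigned char b0 = s[0];

    if (b0 < 0x80) {
        *len = 1;
        return b0;
    }
    if (b0 < 0xE0) {
        *len = 2;
        return ((uint32_t)(b0 & 0x1F) << 6)
             |  (uint32_t)(s[1] & 0x3F);
    }
    if (b0 < 0xF0) {
        *len = 3;
        return ((uint32_t)(b0 & 0x0F) << 12)
             | ((uint32_t)(s[1] & 0x3F) << 6)
             |  (uint32_t)(s[2] & 0x3F);
    }
    if (b0 < 0xF8) {
        *len = 4;
        return ((uint32_t)(b0 & 0x07) << 18)
             | ((uint32_t)(s[1] & 0x3F) << 12)
             | ((uint32_t)(s[2] & 0x3F) << 6)
             |  (uint32_t)(s[3] & 0x3F);
    }

    /* Rare path: 5- and 6-byte forms only (a limit of this sketch). */
    assert(b0 < 0xFE);
    {
        const size_t n = (b0 < 0xFC) ? 5 : 6;
        uint32_t cp = b0 & (0xFFu >> (n + 1));
        for (size_t i = 1; i < n; i++)
            cp = (cp << 6) | (uint32_t)(s[i] & 0x3F);
        *len = n;
        return cp;
    }
}
```

Keeping the unrolled part short is consistent with the commit's observation that a larger body can push the compiler toward not inlining the function.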
Karl Williamson	03f24b8a08	Make croak() always expand to Perl_croak_nocontext() Perl almost always opts for saving time over saving space. Hence, we have croak() that saves time at the expense of space, but needs thread context available; and croak_nocontext() that doesn't need that, but takes extra time. But, when we are about to die, time isn't that important. Even if we are doing eval after eval in a tight loop, the potential time savings of passing the thread context to Perl_croak is insignificant compared to the tear-down that follows. My claim then is that croak() never needed a thread context parameter to save a bit of time just before death. It is an optimization that isn't worth it. And having it do so required the invention of croak_nocontext(), and the extra cognitive load associated with two methods for the same task. This commit changes plain croak() to not use a thread context parameter. It and croak_nocontext() now behave identically. That means that going forward, people will likely choose croak() which requires less typing and occupies fewer columns on the screen, and they won't have to remember which form to use when.	2025-09-12 14:47:53 -06:00
Karl Williamson	8444d54d4b	Move prototype definition of SvPV_helper to embed.fnc It's usually a bad idea to try to work around a limitation in common code by copy-pasting and then modifying to taste. Fixes/improvements to the common code rarely get propagated to the outlier. I wrote code in 1ef9039bccb that did just this for the prototype definition of SvPV_helper, because the place where it really belongs, embed.fnc, couldn't (and still doesn't) handle function pointers as arguments (patches welcome). I should have at least added a comment to the common code noting the existence of this outlier. It turns out that that limitation can be worked around by declaring a typedef of the pointer, and then using that in embed.fnc. That's what this commit does. This commit removes the final instance of duplicating the work of embed.fnc in the core, except for some in the regex system whose comments say the reason is to avoid making a typedef public. I haven't investigated these further.	2025-09-01 10:50:08 -06:00
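The workaround named in 8444d54d4b, a typedef standing in for a function-pointer parameter, looks like this in plain C. The names are hypothetical and unrelated to SvPV_helper; the point is only that the parameter list becomes a flat sequence of "type name" pairs that a simple prototype generator can handle.

```c
#include <ctype.h>
#include <stddef.h>

/* Spelled out, the last parameter would be written
 *     char (*fn)(char)
 * which a tool expecting "type name" pairs cannot split. With a typedef
 * it becomes an ordinary-looking type. */
typedef char (*demo_char_fn)(char);

/* Apply fn to each of the first n characters of s, in place. */
static char *demo_apply(char *s, size_t n, demo_char_fn fn)
{
    for (size_t i = 0; i < n; i++)
        s[i] = fn(s[i]);
    return s;
}

static char demo_upcase(char c)
{
    return (char)toupper((unsigned char)c);
}
```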
Karl Williamson	d8012228a9	Convert _is_utf8_FOO to legal name	2025-09-01 08:12:24 -06:00
Karl Williamson	8de60a95d1	Convert _is_uni_FOO to legal name	2025-09-01 08:12:23 -06:00
Karl Williamson	8b91a7e5f4	Convert _is_utf8_perl_idcont to legal name	2025-09-01 08:12:23 -06:00
Karl Williamson	ffc38ee761	Convert _is_uni_perl_idcont to legal name	2025-09-01 08:12:22 -06:00
Karl Williamson	9f11f6a038	Convert _is_utf8_perl_idstart to legal name	2025-09-01 08:12:21 -06:00
Karl Williamson	eb3ee9300b	Convert _is_uni_perl_idstart to legal name	2025-09-01 08:12:21 -06:00
Karl Williamson	81e1cbe370	Convert _to_utf8_case to legal name	2025-09-01 08:12:20 -06:00
Karl Williamson	8efe6a1425	Convert _to_utf8_upper_flags to legal name	2025-09-01 08:12:20 -06:00
Karl Williamson	a2f5678d13	Convert _to_utf8_title_flags to legal name	2025-09-01 08:12:19 -06:00
Karl Williamson	f79fa08ae1	Convert _to_upper_title_latin1 to legal name	2025-09-01 08:12:18 -06:00
Karl Williamson	f5f6a1be9e	Convert _to_utf8_lower_flags to legal name	2025-09-01 08:12:18 -06:00
Karl Williamson	309431c01c	Convert _to_utf8_fold_flags to legal name	2025-09-01 08:12:17 -06:00
Karl Williamson	d6909d9413	Convert _tofold_latin1 to legal name	2025-09-01 08:12:17 -06:00
Karl Williamson	5bda2037de	Convert _inverse_folds to legal name	2025-09-01 08:12:15 -06:00
Karl Williamson	8bdb0ad55c	Convert _to_uni_fold_flags to legal name	2025-09-01 08:12:15 -06:00
Karl Williamson	8decb8ab1a	Convert _byte_dump_string() to legal name	2025-09-01 08:12:10 -06:00
Karl Williamson	6a9f2d68fa	Expose some short form macros unconditionally Until C99 we couldn't use the type of macro we have that hides the need for thread context to call a function that needed both a thread context parameter and a format with varying numbers of parameters. Therefore you had to call the function directly with aTHX_. For some such functions, there were parallel functions created that omitted the thread context parameter (re-deriving it themselves). And there were compatibility macros created that called these. So, for example warn() would call Perl_warn_nocontext(). That changed in C99, and the calls in core to such functions were changed to use the macro that now expanded to Perl_warn(). Not all functions with this problem had '_nocontext()' versions. It turns out that the way the macros were #defined in embed.h, a definition existed for core, and non-threaded builds, but not threaded ones. This meant that, likely unknown to you, if you wrote an XS module, and used one of those macros, such as ck_warner(), it would compile and run on a non-threaded system, but would not compile on a threaded build. Commits 13e5ba49b2cfe0add44db552ecbebb2f785aecbc and d933027ef0a56c99aee8cc3c88ff4f9981ac9fc2 did not affect the '_nocontext()' versions. This commit exposes their macros to the public. There is no need to worry about breaking existing code, as these macros existed only on non-threaded builds, and they still work there. They now work on threaded builds as well, as long as you have an aTHX variable available. This is no different than any newly created macro for which we are also requiring aTHX availability.	2025-08-27 07:30:04 -06:00
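The C99 mechanism that 6a9f2d68fa relies on, a variadic macro that supplies a context argument to a printf-style function, can be sketched as follows. Everything here is hypothetical (`demo_interp`, `demo_warn_ctx`, `demo_warn`, and the variable name `demo_ctx`); it mirrors the requirement that a context variable be in scope but is not the definition used in embed.h.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for per-thread interpreter state. */
typedef struct {
    int  warnings_emitted;
    char last[128];
} demo_interp;

/* The "long" function: explicit context plus a format and varargs. */
static void demo_warn_ctx(demo_interp *ctx, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    vsnprintf(ctx->last, sizeof ctx->last, fmt, ap);
    va_end(ap);
    ctx->warnings_emitted++;
}

/* The "short" form: a C99 variadic macro passes the context for the
 * caller. It needs a variable named demo_ctx in scope at the call site. */
#define demo_warn(...) demo_warn_ctx(demo_ctx, __VA_ARGS__)

static int demo_check_digit(demo_interp *demo_ctx, int v)
{
    if (v < 0 || v > 9) {
        demo_warn("value %d out of range", v);
        return 0;
    }
    return 1;
}
```

Before variadic macros existed, the short form could not forward a varying number of arguments, which is why separate context-free functions were needed at all.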
Richard Leach	79b32d926e	sv.c: Add Perl_newSVsv_flags_NN and static helpers Perl_newSVsv_flags_NN creates a fresh SV that contains the values of its source SV argument. It's like calling `new_SV(dsv)` followed by `sv_setsv_flags(dsv, ssv, flags)`, but is optimized for a brand new destination SV and the most common code paths. The intended initial users for this new function were: * Perl_sv_mortalcopy_flags (still in sv.c) * Perl_newSVsv_flags (now a simple function in sv_inline.h) Perl_newSVsv_flags_NN prioritises the following hot cases: * SVt_IV containing an IV * SVt_IV containing an RV * SVt_NV containing an NV * SVt_PV containing a PV It will then check for: * SVt_NULL * SVt_IV containing a UV * SVt_LAST The helper function S_newSVsv_flags_NN_PVxx is called for everything else. It will use Perl_sv_setsv_flags as a fallback for rare or tricky cases. S_newSVsv_flags_NN_POK is a dedicated helper for string swipe/COW/copy logic and is called from both Perl_newSVsv_flags_NN and S_newSVsv_flags_NN_PVxx. With these changes compared with the previous commit: * `perl -e 'for (1..100_000_0) { my $x = { (1) x 1000 }; }'` runs about 20% faster * `perl -e 'for (1..100_000_0) { my $x = { ("Perl") x 250 }; }'` runs about 40% faster * `perl -e 'for (1..100_000_0) { my $x = { a => 1, b => 2, c => 3, d => 4, e => 5 }; }'` is a touch faster, but within the margin for error * `perl -e 'for (1..100_000_0) { my $x = { a => "Perl", b => "Perl", c => "Perl", d => "Perl", e => "Perl" } ; }'` runs about 17% faster	2025-08-23 17:44:29 +01:00
Karl Williamson	e9d09605b8	Add detail to -Dy debugging Commit 6ceb4087860c6ef8e86e0c252feb738d635e9e3f added a way to cleanly output UTF-8 tr/// values. This commit uses that to improve the debug output of compiling and running tr///. For a simple tr/// transliterating Greek capital letters to lowercase, the output of 'perl -Dy' has these added lines: > op.c: 6553: Compiling tr/t/r/; /c=0; /d=0; /s=0 > t is '\x{391}-\x{3a9}' > r is '\x{3b1}-\x{3c9}' Before the aforementioned commit the minus sign indicating a range would not have rendered properly; so things like that were omitted from the debug output. The output also now includes special mention of the special casing where the input is complemented, and/or some characters are left untranslated or get deleted.	2025-08-23 07:54:00 -06:00
Karl Williamson	ef2c06ab92	Create embed.fnc entry for pv_display_flags_ This creates an ARGS_ASSERT for this function. Previously, the code was using the one for plain pv_display(), which is kind of ugly. Now there is a macro for each function	2025-08-23 07:54:00 -06:00
Karl Williamson	8543a7ac33	Add valid_utf8_to_uv() This is identical to valid_utf8_to_uvchr(). They are both internal functions designed for when you are certain that the utf8 string to be translated is well formed; generally you created it yourself earlier. The only reason for this new synonym is to lessen the cognitive load on programmers who should be using the "_uv" suffix functions, and not the "_uvchr" suffix ones for these sorts of tasks. By having this synonym, one doesn't have to learn that there are two.	2025-08-21 13:52:26 -06:00
Karl Williamson	738383d65e	Revert wrongly named "Hide function prototypes from ... " This reverts commit ba4fa056e4e86ad40aee006b0ddd37951f723787 due to a completely wrong commit title and message. The next commit will reapply it with the correct information.	2025-08-21 13:52:26 -06:00
Karl Williamson	ba4fa056e4	Hide function prototypes from unauthorized callers 0351a629e71de127cbfd1b142e9eaa6069deabf5 extended hiding private functions from callers into the gcc world. Some functions are allowed only in extensions; so can not be marked as hidden; this commit discourages their use however, by hiding their prototypes to all but the core and extensions. It turns out that four functions were being used in modules we ship with that were marked as extensions-only; so they had to be made globally accessible.	2025-08-21 13:31:23 -06:00
Paul "LeoNerd" Evans	4b060dfa97	Add a subsignature_append_fence_op() A "fence op" is a miscellaneous op fragment that performs some work for side-effects during processing of a subroutine signature. In terms of timing, it will run at some time after any previously-defined arguments have been assigned from argument values passed in by the caller, but before any defaulting expressions for parameters that come after it are run. We specifically make no guarantees about whether parameters defined after this op have had their values assigned, nor whether defaulting expressions of earlier parameters have already been invoked. This is intentional because upcoming changes will change the order of these. The intention here is that method subroutines will use a fence op for the `OP_METHSTART` behaviour, ensuring that subsequent defaulting expressions can see the values of field bindings established by processing the `$self` parameter.	2025-08-14 17:06:03 +01:00
Karl Williamson	211c07b6fd	Convert Perl_uvoffuni_to_utf8_flags to a macro The function is hereby removed in favor of calling the plain uvoffuni_to_utf8_flags macro that already exists	2025-08-07 08:05:34 -06:00


2253 Commits