Add extensive guidance to perlclib

This consolidates much of the pod about interfacing with the standard C
library into this pod, while adding extensive documentation.
Karl Williamson 2024-05-22 13:03:33 -06:00 committed by Graham Knop
parent c729e47cb9
commit 670935c948
11 changed files with 1098 additions and 359 deletions

@@ -5738,7 +5738,7 @@ pod/perlcall.pod Perl calling conventions from C
pod/perlcheat.pod Perl cheat sheet
pod/perlclass.pod Perl class syntax
pod/perlclassguts.pod Internals of class syntax
pod/perlclib.pod Internal replacements for standard C library functions
pod/perlclib.pod Interacting with standard C library functions
pod/perlcommunity.pod Perl community information
pod/perldata.pod Perl data structures
pod/perldbmfilter.pod Perl DBM filters

@@ -14,7 +14,13 @@ or statically linked into perl. The XS interface description is
written in the XS language and is the core component of the Perl
extension interface.
Before writing XS, read the L</CAVEATS> section below.
This documents the XS language, but it's important to first note that XS
code has full access to system calls including C library functions. It
thus has the capability of interfering with things that the Perl core or
other modules have set up, such as signal handlers or file handles. It
could mess with the memory, or any number of harmful things. Don't.
Further detail is in L<perlclib>, which you should read before actually
writing any production XS.
An B<XSUB> forms the basic unit of the XS interface. After compilation
by the B<xsubpp> compiler, each XSUB amounts to a C function definition
@@ -2110,30 +2116,6 @@ Note that these macros will only work together within the I<same> source
file; that is, a dMY_CTX in one source file will access a different structure
than a dMY_CTX in another source file.
=head2 Thread-aware system interfaces
Starting from Perl 5.8, in C/C++ level Perl knows how to wrap
system/library interfaces that have thread-aware versions
(e.g. getpwent_r()) into frontend macros (e.g. getpwent()) that
correctly handle the multithreaded interaction with the Perl
interpreter. This will happen transparently, the only thing
you need to do is to instantiate a Perl interpreter.
This wrapping happens always when compiling Perl core source
(PERL_CORE is defined) or the Perl core extensions (PERL_EXT is
defined). When compiling XS code outside of the Perl core, the wrapping
does not take place before Perl 5.28. Starting in that release you can
#define PERL_REENTRANT
in your code to enable the wrapping. It is advisable to do so if you
are using such functions, as intermixing the C<_r>-forms (as Perl compiled
for multithreaded operation will do) and the C<_r>-less forms is neither
well-defined (inconsistent results, data corruption, or even crashes
become more likely), nor is it very portable. Unfortunately, not all
systems have all the C<_r> forms, but using this C<#define> gives you
whatever protection that Perl is aware is available on each system.
=head1 EXAMPLES
File C<RPC.xs>: Interface to some ONC+ RPC bind library functions.
@@ -2208,10 +2190,11 @@ In Makefile.PL add -ltirpc and -I/usr/include/tirpc.
=head1 CAVEATS
XS code has full access to system calls including C library functions.
It thus has the capability of interfering with things that the Perl core
or other modules have set up, such as signal handlers or file handles.
It could mess with the memory, or any number of harmful things. Don't.
=head2 Use of standard C library functions
See L<perlclib>.
=head2 Event loops and control flow
Some modules have an event loop, waiting for user-input. It is highly
unlikely that two such modules would work adequately together in a
@@ -2223,189 +2206,6 @@ help-mate, to accomplish things that perl doesn't do, or doesn't do fast
enough, but always subservient to perl. The closer XS code adheres to
this model, the less likely conflicts will occur.
One area where there has been conflict is in regards to C locales. (See
L<perllocale>.) perl, with one exception and unless told otherwise,
sets up the underlying locale the program is running in to the locale
passed into it from the environment. This is an important difference from a
generic C language program, where the underlying locale is the "C"
locale unless the program changes it. As of v5.20, this underlying
locale is completely hidden from pure Perl code outside the lexical
scope of C<S<use locale>> except for a couple of function calls in the
POSIX module which of necessity use it. But the underlying locale, with
that one exception, is exposed to XS code, affecting all C library routines
whose behavior is locale-dependent. Your XS code better not assume that
the underlying locale is "C". The exception is the
L<C<LC_NUMERIC>|perllocale/Category LC_NUMERIC: Numeric Formatting>
locale category, and the reason it is an exception is that experience
has shown that it can be problematic for XS code, whereas we have not
had reports of problems with the
L<other locale categories|perllocale/WHAT IS A LOCALE>. And the reason
for this one category being problematic is that the character used as a
decimal point can vary. Many European languages use a comma, whereas
English, and hence Perl are expecting a dot (U+002E: FULL STOP). Many
modules can handle only the radix character being a dot, and so perl
attempts to make it so. Up through Perl v5.20, the attempt was merely
to set C<LC_NUMERIC> upon startup to the C<"C"> locale. Any
L<setlocale()|perllocale/The setlocale function> otherwise would change
it; this caused some failures. Therefore, starting in v5.22, perl tries
to keep C<LC_NUMERIC> always set to C<"C"> for XS code.
To summarize, here's what to expect and how to handle locales in XS code:
=over
=item Non-locale-aware XS code
Keep in mind that even if you think your code is not locale-aware, it
may call a library function that is. Hopefully the man page for such
a function will indicate that dependency, but the documentation is
imperfect.
The current locale is exposed to XS code except possibly C<LC_NUMERIC>
(explained in the next paragraph).
There have not been reports of problems with the other categories.
Perl initializes things on start-up so that the current locale is the
one which is indicated by the user's environment in effect at that time.
See L<perllocale/ENVIRONMENT>.
However, up through v5.20, Perl initialized things on start-up so that
C<LC_NUMERIC> was set to the "C" locale. But if any code anywhere
changed it, it would stay changed. This means that your module can't
count on C<LC_NUMERIC> being something in particular, and you can't
expect floating point numbers (including version strings) to have dots
in them. If you don't allow for a non-dot, your code could break if
anyone anywhere changed the locale. For this reason, v5.22 changed
the behavior so that Perl tries to keep C<LC_NUMERIC> in the "C" locale
except around the operations internally where it should be something
else. Misbehaving XS code will always be able to change the locale
anyway, but the most common instance of this is checked for and
handled.
=item Locale-aware XS code
If the locale from the user's environment is desired, there should be no
need for XS code to set the locale except for C<LC_NUMERIC>, as perl has
already set the others up. XS code should avoid changing the locale, as
it can adversely affect other, unrelated, code and may not be
thread-safe. To minimize problems, the macros
L<perlapi/STORE_LC_NUMERIC_SET_TO_NEEDED>,
L<perlapi/STORE_LC_NUMERIC_FORCE_TO_UNDERLYING>, and
L<perlapi/RESTORE_LC_NUMERIC> should be used to affect any needed
change.
But, starting with Perl v5.28, locales are thread-safe on platforms that
support this functionality. Windows has this starting with Visual
Studio 2005. Many other modern platforms support the thread-safe POSIX
2008 functions. The C C<#define> C<USE_THREAD_SAFE_LOCALE> will be
defined iff this build is using these. From Perl-space, the read-only
variable C<${^SAFE_LOCALES}> is 1 if either the build is not threaded, or
if C<USE_THREAD_SAFE_LOCALE> is defined; otherwise it is 0.
The way this works under-the-hood is that every thread has a choice of
using a locale specific to it (this is the Windows and POSIX 2008
functionality), or the global locale that is accessible to all threads
(this is the functionality that has always been there). The
implementations for Windows and POSIX are completely different. On
Windows, the runtime can be set up so that the standard
L<C<setlocale(3)>> function either only knows about the global locale or
the locale for this thread. On POSIX, C<setlocale> always deals with
the global locale, and other functions have been created to handle
per-thread locales. Perl makes this transparent to perl-space code. It
continues to use C<POSIX::setlocale()>, and the interpreter translates
that into the per-thread functions.
All other locale-sensitive functions automatically use the per-thread
locale, if that is turned on, and failing that, the global locale. Thus
calls to C<setlocale> are ineffective on POSIX systems for the current
thread if that thread is using a per-thread locale. If perl is compiled
for single-thread operation, it does not use the per-thread functions,
so C<setlocale> does work as expected.
If you have loaded the L<C<POSIX>> module you can use the methods given
in L<perlcall> to call L<C<POSIX::setlocale>|POSIX/setlocale> to safely
change or query the locale (on systems where it is safe to do so), or
you can use the new 5.28 function L<perlapi/Perl_setlocale> instead,
which is a drop-in replacement for the system L<C<setlocale(3)>>, and
handles single-threaded and multi-threaded applications transparently.
There are some locale-related library calls that still aren't
thread-safe because they return data in a buffer global to all threads.
In the past, these didn't matter as locales weren't thread-safe at all.
But now you have to be aware of them in case your module is called in a
multi-threaded application. The known ones are
asctime()
ctime()
gcvt() [POSIX.1-2001 only (function removed in POSIX.1-2008)]
getdate()
wcrtomb() if its final argument is NULL
wcsrtombs() if its final argument is NULL
wcstombs()
wctomb()
Some of these shouldn't really be called in a Perl application, and for
others there are thread-safe versions of these already implemented:
asctime_r()
ctime_r()
Perl_langinfo()
The C<_r> forms are automatically used, starting in Perl 5.28, if you
compile your code with
#define PERL_REENTRANT
See also L<perlapi/Perl_langinfo>.
You can use the methods given in L<perlcall>, to get the best available
locale-safe versions of these
POSIX::localeconv()
POSIX::wcstombs()
POSIX::wctomb()
And note that some items returned by C<localeconv> are available
through L<perlapi/Perl_langinfo>.
The others shouldn't be used in a threaded application.
Some modules may call a non-perl library that is locale-aware. This is
fine as long as it doesn't try to query or change the locale using the
system C<setlocale>. But if these do call the system C<setlocale>,
those calls may be ineffective. Instead,
L<C<Perl_setlocale>|perlapi/Perl_setlocale> works in all circumstances.
Plain setlocale is ineffective on multi-threaded POSIX 2008 systems. It
operates only on the global locale, whereas each thread has its own
locale, paying no attention to the global one. Since converting
these non-Perl libraries to C<Perl_setlocale> is out of the question,
there is a new function in v5.28
L<C<switch_to_global_locale>|perlapi/switch_to_global_locale> that will
switch the thread it is called from so that any system C<setlocale>
calls will have their desired effect. The function
L<C<sync_locale>|perlapi/sync_locale> must be called before returning to
perl.
This thread can change the locale all it wants and it won't affect any
other thread, except any that also have been switched to the global
locale. This means that a multi-threaded application can have a single
thread using an alien library without a problem, but if more than one
thread is so occupied at a time, bad results are likely.
In perls without multi-thread locale support, some alien libraries,
such as C<Gtk>, change locales. This can cause problems for the Perl
core and other modules. For these, before control is returned to
perl, starting in v5.20.1, calling the function
L<sync_locale()|perlapi/sync_locale> from XS should be sufficient to
avoid most of these problems. Prior to this, you need a pure Perl
statement that does this:
POSIX::setlocale(LC_ALL, POSIX::setlocale(LC_ALL));
or use the methods given in L<perlcall>.
=back
=head1 XS VERSION
This document covers features supported by C<ExtUtils::ParseXS>

@@ -6,7 +6,7 @@ perlxstut - Tutorial for writing XSUBs
This tutorial will educate the reader on the steps involved in creating
a Perl extension. The reader is assumed to have access to L<perlguts>,
L<perlapi> and L<perlxs>.
L<perlclib>, L<perlapi>, and L<perlxs>.
This tutorial starts with very simple examples and becomes more complex,
with each new example adding new features. Certain concepts may not be
@@ -1403,7 +1403,8 @@ Some systems may have installed Perl version 5 as "perl5".
=head1 See also
For more information, consult L<perlguts>, L<perlapi>, L<perlxs>, L<perlmod>,
For more information, consult L<perlguts>, L<perlapi>, L<perlclib>,
L<perlxs>, L<perlmod>,
L<perlapio>, and L<perlpod>
=head1 Author

@@ -155,7 +155,7 @@ aux h2ph h2xs perlbug pl2pm pod2html pod2man splain xsubpp
perlxstut Perl XS tutorial
perlxs Perl XS application programming interface
perlxstypemap Perl XS C/Perl type conversion tools
perlclib Internal replacements for standard C library functions
perlclib Interacting with standard C library functions
perlguts Perl internal functions for those doing extensions
perlcall Perl calling conventions from C
perlmroapi Perl method resolution plugin interface

File diff suppressed because it is too large

@@ -736,6 +736,18 @@ C<rpp_replace_at_norc_NN(sp, sv)>
=back
=head3 L<perlclib>
=over 4
=item *
Extensive guidance has been added for interfacing with the standard C
library, including many more functions to avoid, and how to cope with
locales and threads.
=back
=head3 L<perlhacktips>
=over 4

@@ -12,7 +12,7 @@ Do you want to:
=item B<Use C from Perl?>
Read L<perlxstut>, L<perlxs>, L<h2xs>, L<perlguts>, and L<perlapi>.
Read L<perlxstut>, L<perlxs>, L<perlclib>, L<h2xs>, L<perlguts>, and L<perlapi>.
=item B<Use a Unix program from Perl?>
@@ -938,7 +938,7 @@ C<-Dusemultiplicity> option otherwise some interpreter variables may
not be initialized correctly between consecutive runs and your
application may crash.
See also L<perlxs/Thread-aware system interfaces>.
See also L<perlclib/Dealing with embedded perls and threads>.
Using C<-Dusethreads -Duseithreads> rather than C<-Dusemultiplicity>
is more appropriate if you intend to run multiple interpreters
@@ -1091,7 +1091,7 @@ B<ExtUtils::Embed> can also automate writing the I<xs_init> glue code.
% cc -c interp.c `perl -MExtUtils::Embed -e ccopts`
% cc -o interp perlxsi.o interp.o `perl -MExtUtils::Embed -e ldopts`
Consult L<perlxs>, L<perlguts>, and L<perlapi> for more details.
Consult L<perlxs>, L<perlclib>, L<perlguts>, and L<perlapi> for more details.
=head2 Using embedded Perl with POSIX locales

@@ -2909,6 +2909,8 @@ to be a no-op.
=head2 How do I use all this in extensions?
See also L<perlclib/Dealing with embedded perls and threads>.
When Perl is built with MULTIPLICITY, extensions that call
any functions in the Perl API will need to pass the initial context
argument somehow. The kicker is that you will need to write it in

@@ -251,7 +251,8 @@ This applies as well to L<I18N::Langinfo>.
XS modules for all categories but C<LC_NUMERIC> get the underlying
locale, and hence any C library functions they call will use that
underlying locale. For more discussion, see L<perlxs/CAVEATS>.
underlying locale. For more discussion, see
L<perlclib/Dealing with locales>.
=back
@@ -577,7 +578,7 @@ automatically use their thread's locale.
This should be completely transparent to any applications written
entirely in Perl (minus a few rarely encountered caveats given in the
L</Multi-threaded> section). Information for XS module writers is given
in L<perlxs/Locale-aware XS code>.
in L<perlclib/Dealing with locales>.
=head2 Finding locales
@@ -1747,7 +1748,7 @@ You should not change the locale after startup on a platform where
C<${^SAFE_LOCALES}> is 0. It will always be 1 on an unthreaded
platform.
XS writers should refer to L<perlxs/Thread-aware system interfaces>.
XS writers should refer to L<perlclib/Dealing with embedded perls and threads>.
=head2 Broken systems

@@ -1046,27 +1046,10 @@ threads. See L<threads/THREAD SIGNALLING> for more details.)
=head1 Thread-Safety of System Libraries
Whether various library calls are thread-safe is outside the control
of Perl. Calls often suffering from not being thread-safe include:
C<localtime()>, C<gmtime()>, functions fetching user, group and
network information (such as C<getgrent()>, C<gethostent()>,
C<getnetent()> and so on), C<readdir()>, C<rand()>, and C<srand()>. In
general, calls that depend on some global external state.
If the system Perl is compiled in has thread-safe variants of such
calls, they will be used. Beyond that, Perl is at the mercy of
the thread-safety or -unsafety of the calls. Please consult your
C library call documentation.
On some platforms the thread-safe library interfaces may fail if the
result buffer is too small (for example the user group databases may
be rather large, and the reentrant interfaces may have to carry around
a full snapshot of those databases). Perl will start with a small
buffer, but keep retrying and growing the result buffer
until the result fits. If this limitless growing sounds bad for
security or memory consumption reasons you can recompile Perl with
C<PERL_REENTRANT_MAXSIZE> defined to the maximum number of bytes you will
allow.
Whether C library calls are thread-safe is outside the control of Perl.
Undefined behavior will happen if unsafe ones are used during
multi-thread operation. See
L<perlclib/Dealing with embedded perls and threads>.
=head1 Conclusion

@@ -288,6 +288,7 @@ printf(3)
provide
ptar(1)
ptargrep(1)
pthreads(7)
pwd_mkdb(8)
querylocale(3)
RDF::Trine
@@ -343,6 +344,7 @@ String::Base
String::Scanf
String::Util
strstr(3)
strtok(3)
strtol(3)
Switch
tar(1)
@@ -410,7 +412,7 @@ ext/pod-html/corpus/perlvar-copy.pod Verbatim line length including indents exceeds 78 by 1
ext/vms-filespec/lib/vms/filespec.pm Verbatim line length including indents exceeds 78 by 1
install ? Should you be using F<...> or maybe L<...> instead of 1
install Verbatim line length including indents exceeds 78 by 2
pod/perl.pod Verbatim line length including indents exceeds 78 by 6
pod/perl.pod Verbatim line length including indents exceeds 78 by 5
pod/perlandroid.pod Verbatim line length including indents exceeds 78 by 3
pod/perldebguts.pod Verbatim line length including indents exceeds 78 by -1
pod/perldebtut.pod Verbatim line length including indents exceeds 78 by 2