From 76ddbb913e94cfc06fc93b5c9f48bf8a3a75f5f6 Mon Sep 17 00:00:00 2001
From: James Youngman
Date: Mon, 3 Jun 2024 12:33:49 +0100
Subject: [PATCH] doc: State that find -print0 and xargs -0 are in POSIX from Issue 8.

The forthcoming Issue 8 of the POSIX standard includes find -print0 and xargs -0.

* doc/find.texi: find -print0 is no longer GNU-specific. Similarly for xargs -0.
* xargs/xargs.1: Likewise.
* NEWS: mention these changes.
---
 NEWS          |  4 ++++
 doc/find.texi | 31 +++++++++++++++++++------------
 xargs/xargs.1 |  8 +++++++-
 3 files changed, 30 insertions(+), 13 deletions(-)

diff --git a/NEWS b/NEWS
index e2a8e067..ce1946d0 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,10 @@ GNU findutils NEWS - User visible changes. -*- outline -*- (allout)

 * Noteworthy changes in release ?.? (????-??-??) [?]

+** Documentation Changes
+
+ The forthcoming Issue 8 of the POSIX standard will standardise "find
+ -print0" and "xargs -0". Our documentation now points this out.

 * Noteworthy changes in release 4.10.0 (2024-06-01) [stable]

diff --git a/doc/find.texi b/doc/find.texi
index b3eff2d0..d1ad8d68 100644
--- a/doc/find.texi
+++ b/doc/find.texi
@@ -2453,7 +2453,7 @@ should consider the following line to be part of this one.

 Instead of blank-delimited names, it is safer to use @samp{find -print0}
 or @samp{find -fprint0} and process the output by giving the
-@samp{-0} or @samp{--null} option to GNU @code{xargs}, GNU @code{tar},
+@samp{-0} or @samp{--null} option to @code{xargs}, GNU @code{tar},
 GNU @code{cpio}, or @code{perl}. The @code{locate} command also has a
 @samp{-0} or @samp{--null} option which does the same thing.

@@ -2566,6 +2566,9 @@ can process file names generated this way by giving the @samp{-0} or
 @samp{--null} option to GNU @code{xargs}, GNU @code{tar}, GNU
 @code{cpio}, or @code{perl}.

+Both @code{find . -print0} and @code{xargs -0} will be
+POSIX-conforming, starting from the currently-expected Issue 8.
+
 @deffn Action -print0
 True; print the entire file name on the standard output, followed by a
 null character.
@@ -3908,8 +3911,8 @@ when commands are run. Otherwise, stdin is redirected from
 @itemx -0
 Input file names are terminated by a null character instead of by
 whitespace, and any quotes and backslash characters are not considered
-special (every character is taken literally). Disables the end of
-file string, which is treated like any other argument.
+special (every character is taken literally). Disables the end of file
+string, which is treated like any other argument.

 @item --delimiter @var{delim}
 @itemx -d @var{delim}
@@ -4719,13 +4722,16 @@ this command:
 find /var/tmp/stuff -mtime +90 -print0 | xargs -0 /bin/rm
 @end smallexample

-The result is an efficient way of proceeding that
-correctly handles all the possible characters that could appear in the
-list of files to delete. This is good news. However, there is, as
-I'm sure you're expecting, also more bad news. The problem is that
-this is not a portable construct; although other versions of Unix
-(notably BSD-derived ones) support @samp{-print0}, it's not
-universal. So, is there a more universal mechanism?
+The result is an efficient way of proceeding that correctly handles
+all the possible characters that could appear in the list of files to
+delete. This is good news. However, there is, as I'm sure you're
+expecting, also more bad news. The problem is that this is not a
+portable construct. Support for @samp{-print0} is not universal.
+
+Although some other versions of Unix (notably BSD-derived ones)
+support @samp{-print0}, this is only required in POSIX from Issue 8
+(which as of 2024-06-03 has not yet been published). So, is there a
+more universal mechanism?

 @subsection Going back to @code{-exec}

@@ -5600,9 +5606,10 @@ The only ways to avoid this problem are either to avoid all use of
 available) @samp{find -execdir}, or to use the @samp{-0} option, which
 ensures that @code{xargs} considers file names to be separated by
 ASCII NUL characters rather than whitespace. However, useful as this
-option is, the POSIX standard does not make it mandatory.
+option is, the POSIX standard did not make it mandatory prior to Issue
+8.

-POSIX also specifies that @code{xargs} interprets quoting and trailing
+POSIX also specifies that @code{xargs} without @code{-0} interprets quoting and trailing
 whitespace specially in filenames, too. This means that using
 @code{find ... -print | xargs ...} can cause the commands run by
 @code{xargs} to receive a list of file names which is not the same as
diff --git a/xargs/xargs.1 b/xargs/xargs.1
index e6cae3d7..4e506391 100644
--- a/xargs/xargs.1
+++ b/xargs/xargs.1
@@ -73,7 +73,7 @@ whitespace, and the quotes and backslash are not special (every
 character is taken literally). Disables the end-of-file string, which
 is treated like any other argument. Useful when input items might
 contain white space, quote marks, or backslashes.
-The GNU find \-print0 option produces input suitable for this mode.
+The GNU find (and from Issue 8, POSIX) \-print0 option produces input suitable for this mode.

 .TP
 .BI "\-a " file ", \-\-arg\-file=" file
@@ -448,6 +448,12 @@ Exit codes greater than 128 are used by the shell to indicate that
 a program died due to a fatal signal.
 .
 .SH "STANDARDS CONFORMANCE"
+The long-standing
+.B \-0
+option of
+.B xargs
+will be included in Issue 8 of the POSIX standard.
+
 As of GNU xargs version 4.2.9, the default behaviour of
 .B xargs
 is not to have a logical end-of-file marker.