This is so we can improve sdiff signal handling.
* bootstrap.conf: Add stdcountof-h.
* src/Makefile.am (diff_SOURCES): Add syncsig.c.
(noinst_HEADERS): Add syncsig.h.
* src/diff.c, src/util.c: Include syncsig.h.
* src/util.c: Move signal-related stuff from here ...
* src/syncsig.c: ... to here.
(syncsig_install, syncsig_cleanup):
Rename from install_signal_handlers, cleanup_signal_handlers.
All uses changed. Handle some more signals.
Add an option to not handle stop-related signals.
(syncsig_poll, syncsig_deliver):
New functions, which are like the old process_signals
but in two pieces not one. All uses changed.
(syncsig_install): New args FUN and ARG.
Return int on failure. All callers changed.
* src/syncsig.h: New file.
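As a rough illustration of splitting signal processing into a poll step and a deliver step, here is a minimal sketch. All demo_* names are hypothetical and this is not the syncsig.h interface: the handler only records the signal, ordinary code polls for it at a safe point, and delivery then cleans up and re-raises the signal with its default action.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the syncsig.h API.  */
static volatile sig_atomic_t demo_pending;

/* Async-signal-safe handler: merely record which signal arrived.  */
static void
demo_handler (int sig)
{
  demo_pending = sig;
}

/* Arrange for SIG to be recorded rather than acted on immediately.
   Return 0 on success, -1 on failure.  */
int
demo_install (int sig)
{
  struct sigaction act;
  act.sa_handler = demo_handler;
  act.sa_flags = 0;
  sigemptyset (&act.sa_mask);
  return sigaction (sig, &act, NULL);
}

/* Return the number of the pending signal, or 0 if none is pending.  */
int
demo_poll (void)
{
  return demo_pending;
}

/* Act on SIG from ordinary (non-handler) context: run CLEANUP if it
   is nonnull, restore the default action, and re-raise SIG.  */
void
demo_deliver (int sig, void (*cleanup) (void))
{
  assert (sig != 0);
  if (cleanup)
    cleanup ();
  demo_pending = 0;
  signal (sig, SIG_DFL);
  raise (sig);
}
```

A caller would typically invoke demo_poll between output operations and call demo_deliver only when it returns nonzero.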
Recentish changes to Gnulib have pulled in more dependencies
on multithreading, locking, and whatnot. Revamp to remove
these unwanted dependencies.
* bootstrap.conf: Also avoid hard-locale, localcharset,
localename-unsafe, localename-unsafe-limited.
Stop avoiding localename.
(avoided_gnulib_tests): New var. Avoid these tests too.
(gnulib_modules): Remove hard-locale, nstrftime.
Add nstrftime-limited.
* configure.ac (gl_cv_func_mbrtowc_C_locale_sans_EILSEQ)
(gl_cv_func_mbrtoc32_C_locale_sans_EILSEQ):
New vars, so that we do not worry about multibyte C locales.
(gl_THREADLIB_DEFAULT_NO): New macro.
Not sure how much it helps, but it can’t hurt.
(SUPPORT_NON_GREG_CALENDARS_IN_STRFTIME): New macro.
* src/cmp.c: Do not include hard-locale.h.
(hard_locale_LC_MESSAGES): Assume that LC_MESSAGES is hard
if and only if "(C)" gets translated. This drags in fewer
dependencies than calling hard_locale.
* src/diff.c: Include strftime.h instead of hard-locale.h.
(hard_locale_LC_TIME): New function, that uses nstrftime
to infer whether the time locale is hard.
(main): Use it instead of hard_locale.
maint: default Gnulib to no multithreading
* configure.ac: Define gl_THREADLIB_DEFAULT_NO
so that Gnulib defaults to no multithreading.
* bootstrap.conf: Some gnulib modules are now deprecated, in
favor of new names with a "-h" suffix (and stdbool->bool).
Induce this change with the following:
re='inttypes|signal|stdckdint|stdint|sys_types|sys_wait|unistd'
perl -pi -e 's{^('"$re"')$}{$1-h};s{^stdbool$}{bool}' bootstrap.conf
Then, sort the module names.
stdbit.h is standardized in C23, so use that instead of
the GNU-specific count-leading-zeros module.
* bootstrap.conf (gnulib_modules): Remove count-leading-zeros.
Add stdbit.
* src/system.h: Include stdbit.h instead of count-leading-zeros.h.
(floor_log2): Implement via stdc_bit_width instead of via
count_leading_zeros_ll.
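The relation between floor_log2 and bit width can be sketched as follows. The helper bit_width_ull is a hypothetical portable stand-in for C23 stdc_bit_width, used here so that the sketch also compiles where <stdbit.h> is unavailable; the argument type is likewise an assumption.

```c
#include <assert.h>

/* Hypothetical stand-in for C23 stdc_bit_width on unsigned long long:
   the number of bits needed to represent N, which is 0 when N is 0.  */
static unsigned int
bit_width_ull (unsigned long long n)
{
  unsigned int width = 0;
  for (; n != 0; n >>= 1)
    width++;
  return width;
}

/* Return floor (log2 (N)).  N must be positive.  */
int
floor_log2_sketch (unsigned long long n)
{
  assert (n != 0);
  return (int) bit_width_ull (n) - 1;
}
```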
* bootstrap.conf (gnulib_modules): Add c32isprint.
* src/util.c: Include mcel.h.
(output_1_line): Return immediately on output error.
Scan multi-byte characters and count their widths.
(analyze_hunk): Ignore multi-byte white space too.
Go back to a single mcel module, instead of trying to break it up
into ucore and mcel pieces, as breaking it up hurt performance.
Use gnulib-tool’s --local-dir to create diffutils-specific modules
for mcel; the idea is that this will eventually migrate into Gnulib.
* bootstrap.conf (avoided_gnulib_modules): Add mbuiterf.
(gnulib_modules): Add mbscasecmp, mcel-prefer.
(gnulib_tool_option_extras): Add --local-dir=gl to pick up new files.
* cfg.mk (exclude_file_name_regexp--sc_prohibit_doubled_word):
Do not exclude now-removed files lib/ucore.c, lib/ucore.h.
* lib/Makefile.am: Adjust to use of modules.
(noinst_HEADERS): Remove mcel.h, ucore.h.
(libdiffutils_a_SOURCES): Remove mcel.c, mcel-casecmp.c, ucore.c.
* lib/mcel-casecmp.c, lib/ucore.c, lib/ucore.h: Remove.
* lib/mcel.h: Switch to LGPLv2.1+. Do not include ucore.h.
All uses of ucore_t changed back to using char32_t.
Do what ucore.h used to do: include verify.h, limits.h, stddef.h,
uchar.h; require config.h, define _GL_LIKELY, _GL_UNLIKELY.
(MCEL_CHAR_MAX, MCEL_ERR_MIN, MCEL_ERR_MAX): New constants.
(mcel_t): Switch from single ucore_t c to a char32_t ch and
unsigned char err. This has significantly better performance on
Fedora 38 x86-64. All uses changed. Check that unsigned char
promotes to int.
(mcel_ch, mcel_err, mcel_cmp, mcel_tocmp): New functions.
(MCEL_ERR_SHIFT): Rename from MCEL_ENCODING_ERROR_SHIFT.
All uses changed.
(mcel_isbasic): Add a _GL_LIKELY to help compilers. All uses changed.
(mcel_scan, mcel_scant): Simplify by using mcel_ch, mcel_err.
(mcel_casecmp): Remove decl. Callers changed to use mbscasecmp.
* gl/lib/mcel.c, gl/lib/mcel.h: Rename from lib/mcel.c, lib/mcel.h.
* gl/lib/mbscasecmp.c: New file.
* gl/modules/mcel, gl/modules/mcel-prefer, gl/modules/mcel-tests:
* gl/tests/test-mcel.c:
New files.
* src/io.c: Revert use of ucore API. Use plain c32isspace etc.
instead of ucore_is. Use .err instead of ucore_iserr.
(same_ch_err): Bring back, and use it instead of ucore_cmp.
* src/side.c (print_half_line): Use .err instead of ucore_iserr.
* bootstrap.conf (gnulib_modules): Add c-file-type
and remove file-type.
* po/POTFILES.in: Add lib/c-file-type.c, remove lib/file-type.c.
* src/diff.c (O_PATH_DEFINED): New constant.
(detype_from_mode): Remove; no longer used.
(dir_p): Go back to the old way of using S_ISDIR.
(compare_prepped_files): Use filetype and stat macros, not detype.
Pass symlink fd and "" to careadlinkat if available, as that
avoids a race. Test for dir vs file earlier, so that a missing
file is treated consistently with dir/file vs file.
(compare_files): New arg DETYPE replacing the old DETYPE0 and DETYPE1.
All uses changed. st_size for nonexistent files is 0, not -1.
Set up .filetype, not .detype, as .filetype is finer-grained.
Open symlinks with O_PATH on GNU/Linux, since we can then
use readlinkat on the resulting file descriptor and this
avoids a race.
* src/diff.h (struct file_data): Remove detype member.
Add filetype member; it’s finer-grained. All uses changed.
* tests/no-dereference: Add test that the previous commit failed.
* bootstrap.conf (gnulib_modules): Add sys_types,
for MAJOR_IN_MKDEV and MAJOR_IN_SYSMACROS.
* src/diff.c (major, minor): New macros or functions.
Include <sys/mkdev.h> or <sys/sysmacros.h> for them.
(compare_files): Output major and minor device numbers
for special files that differ.
* bootstrap.conf (gnulib_modules): Add quote.
* src/diff.c: Include quote.h.
(compare_files): Print contents of symlinks that differ,
and quote their names and contents.
* src/system.h (symlink_size_ok): Remove.
(stat_size): Don’t worry about symlink sizes.
* tests/no-dereference: Adjust tests to match new behavior.
This closes some more races, by using openat+fstat instead
of fstatat+openat which can get confused by some other process
renaming files in the meantime. Not all races are closed of course.
* bootstrap.conf (gnulib_modules): Add d-type.
* src/diff.c (errno_encode, errno_decode): Remove, as file
descriptors are no longer portmanteau variables. All uses removed.
(detype_from_mode): New function.
(dir_p): Use detype, not stat.st_mode.
(compare_files): New args DETYPE0 and DETYPE1. All uses changed.
Update detype and err as new info arrives.
Adjust to desc's new use (no longer encodes errno).
Do not ignore lseek failures on regular files.
Prefer openat+fstat to fstatat+openat when detype shows that it's safe,
and avoid both fstat and fstatat if detype suffices.
Use ‘error’ with errno value rather than setting error
and then calling perror_with_name. Coalesce two of these
error diagnostics into one by moving an error check before
the diagnostic is output. Coalesce two calls to diff_dirs.
Print file type based on detype if available,
in case neither fstat nor fstatat was called.
* src/diff.h (enum detype): New type.
(struct file_data): New slots err, detype.
(NONEXISTENT, UNOPENED): Renumber so that -1 stands for open failed.
* src/dir.c (HAVE_STRUCT_DIRENT_D_TYPE): Default to false.
(dir_read): Return to caller the d_type, if available.
(diff_dirs): Pass detype to compare_files.
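The openat+fstat ordering can be illustrated with a minimal sketch. The function name and flags are assumptions, not the code in src/diff.c; in particular the sketch opens unconditionally, whereas the change above does so only when detype shows that it is safe.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Open NAME relative to directory descriptor DIRFD, then fill *ST
   from the open descriptor.  Return the descriptor, or -1 with errno
   set.  Because fstat examines the file that was actually opened,
   a rename by another process between the two calls cannot make the
   status and the descriptor refer to different files.  */
int
open_then_stat (int dirfd, char const *name, struct stat *st)
{
  assert (name && st);
  int fd = openat (dirfd, name, O_RDONLY | O_CLOEXEC);
  if (fd < 0)
    return -1;
  if (fstat (fd, st) != 0)
    {
      int saved_errno = errno;
      close (fd);
      errno = saved_errno;
      return -1;
    }
  return fd;
}
```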
* bootstrap.conf (gnulib_modules): Add popen, pclose, readdir,
readlinkat, sigaction.
* configure.ac: Don't enable _FORTIFY_SOURCE on mingw.
* src/util.c (process_signals): If SIGTSTP is not defined,
stop_signal_count is zero, so disable stop-signal processing.
(sig): If SIGHUP is not defined, don't list it. If SIGPIPE is not
defined, don't list it.
Also, remove dependence on xreadlink.
* bootstrap.conf (gnulib_modules): Add careadlinkat.
Remove xreadlink (which depends on careadlinkat).
* src/diff.c: Include careadlinkat.h, not xreadlink.h.
(compare_files): Don’t bother to read links if their lengths differ.
Use careadlinkat instead of xreadlink so that normally malloc need
not be called.
* bootstrap.conf (gnulib_modules): Remove inttostr.
* src/cmp.c: Do not include inttostr.h.
(cmp): Use C99-style PRIdMAX rather than Gnulib inttostr,
as PRIdMAX is simpler and (thanks to Gnulib) is portable.
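For illustration, formatting an intmax_t with PRIdMAX needs only <inttypes.h> and a printf-family call. The function name below is hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Format N in decimal into BUF, which has room for BUFSIZE bytes.
   Return what snprintf returns: the length of the full decimal
   representation, or a negative value on error.  */
int
format_intmax (char *buf, size_t bufsize, intmax_t n)
{
  assert (buf && bufsize > 0);
  return snprintf (buf, bufsize, "%" PRIdMAX, n);
}
```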
* bootstrap.conf (gnulib_modules): Remove gettime; add timespec_get.
* src/context.c (print_context_label): Get current time lazily.
Use C11 timespec_get rather than older Gnulib gettime function.
* src/diff.c: Do not include timespec.h.
(set_mtime_to_now): Remove. All uses removed.
* src/system.h: Include stat-time.h, timespec.h.
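One way to get the current time lazily with C11 timespec_get is sketched below. The helper name and the caching scheme are assumptions for illustration, not the code in src/context.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Return the current time, querying the clock only on the first call
   and reusing the cached value afterwards.  Hypothetical helper.  */
struct timespec
current_time_lazy (void)
{
  static struct timespec cached;
  static bool have_cached;
  if (!have_cached)
    {
      /* timespec_get returns its BASE argument on success, 0 on failure.  */
      int base = timespec_get (&cached, TIME_UTC);
      assert (base == TIME_UTC);
      (void) base;
      have_cached = true;
    }
  return cached;
}
```

A program that never needs the current time never calls the helper, so it never queries the clock.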
* bootstrap.conf (gnulib_modules): Add timespec, for timespec_cmp.
(same_file_attributes): Check birthtime and ns components too.
Check attributes earlier if they are more likely to differ.
* bootstrap.conf (gnulib_modules): Add builtin-expect.
* lib/mbcel-strcasecmp.c: New file.
* lib/Makefile.am (libdiffutils_a_SOURCES): Add it.
* lib/mbcel.h (MBCEL_LEN_MAX, MBCEL_ENCODING_ERROR_SHIFT)
(MBCEL_UCHAR_FITS, MBCEL_UCHAR_EASILY_FITS): New constants.
(_GL_LIKELY): New macro.
(mbcel_scan): Use it. Simplify NetBSD code.
(mbcel_scant, mbcel_scanz, mbcel_cmp, mbcel_casecmp): New functions.
* src/dir.c (strcasecoll): Move defn here from system.h,
since only dir.c needs it. Use mbcel_strcasecmp instead
of strcasecmp.
This should improve performance when doing recursive comparisons.
Currently there is no attempt to avoid file descriptor exhaustion,
just as previously there was no attempt to avoid file names
that provoke ENAMETOOLONG. Because of this change, ‘diff - A/B’
now works correctly when standard input is a directory.
* .gitignore: Add lib/dirent.h.
* bootstrap.conf (gnulib_modules): Add fdopendir.
* src/diff.c (main): Initialize noparent’s desc to AT_FDCWD.
(compare_files): Use fstatat with parent directory’s file
descriptor and relative name, instead of lstat or stat.
Likewise for openat and open.
* src/diff.h (struct file_data): New member ‘dirstream’.
(struct comparison): The ‘parent’ member is now &noparent (instead
of null) if there is no parent. All uses changed.
(curr): New toplevel variable, replacing ‘files’. All uses changed.
* src/dir.c: Include dirname.h, for last_component.
(dir_read): New arg PARENTDIRFD. Arg DIR is no longer
pointer-to-const since DIR->desc and DIR->dirstream are now
updated. Use PARENTDIRFD to open the directory via
openat+fdopendir instead of via opendir. Update new dirstream
component instead of closing the directory, since it’s now the
caller’s responsibility to close the directory because callers now
want the file descriptor. All callers changed.
(diff_dirs): First arg CMP is no longer pointer-to-const since
CMP->file is updated by dir_read. All callers changed.
(find_dir_file_pathname): First arg is now struct file_data *,
not merely a file name. All callers changed.
* tests/stdin: Test new behavior when stdin is a directory.
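Opening a directory via openat+fdopendir relative to a parent descriptor can be sketched as follows. The function name and signature are hypothetical and much simpler than dir_read.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Open directory NAME relative to PARENTDIRFD and return a stream for
   it, or NULL with errno set.  On success store the stream's file
   descriptor in *DESC so that the caller can pass it to later fstatat
   and openat calls.  The caller must closedir the stream, which also
   closes the descriptor.  */
DIR *
open_dir_at (int parentdirfd, char const *name, int *desc)
{
  assert (name && desc);
  int fd = openat (parentdirfd, name, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
  if (fd < 0)
    return NULL;
  DIR *stream = fdopendir (fd);
  if (!stream)
    {
      int saved_errno = errno;
      close (fd);
      errno = saved_errno;
      return NULL;
    }
  *desc = fd;
  return stream;
}
```

Passing AT_FDCWD as PARENTDIRFD gives the behavior of a plain opendir for relative names, which matches initializing a top-level descriptor to AT_FDCWD.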
* bootstrap.conf (gnulib_modules): Add c32isspace, c32tolower.
* lib/Makefile.am (noinst_HEADERS): Add mbcel.h.
(libdiffutils_a_SOURCES): Add mbcel.c.
* lib/mbcel.c, lib/mbcel.h: New files.
* src/io.c: Include mbcel.h, uchar.h.
(hash): 2nd arg is now hash_value, not merely unsigned char,
since the caller might pass a char32_t now.
(find_and_hash_each_line): Support multi-byte input.
* src/util.c: Include mbcel.h, uchar.h.
(lines_differ): New args S1LEN, S2LEN, needed for mbcel_scan.
Caller changed. Support multi-byte input.
* tests/ignore-case: New file.
* tests/Makefile.am (TESTS): Add it.
* tests/ignore-tab-expansion: Add UTF-8 test.
* tests/init.cfg (require_utf8_locale_): New function.
* tests/side-by-side: Use it. Add a column-counting test.
c_isdigit is a function supplied by Gnulib, which should
be a bit better than our own macro.
* bootstrap.conf (gnulib_modules): Add c-ctype.
* src/system.h (ISDIGIT): Remove. All calls replaced by c_isdigit.
Include <c-ctype.h>, for c_isdigit.
Prefer C11-style char32_t to wchar_t, as char32_t works better on
platforms where wchar_t is only 16 bits.
* .gitignore: Add lib/uchar.h.
* bootstrap.conf (gnulib_modules): Add c32width, mbrtoc32.
Remove mbrtowc. Sort.
* src/side.c: Include uchar.h instead of wchar.h.
(print_half_line): Use c32width and mbrtoc32 instead of
wcwidth and mbrtowc.
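A minimal sketch of scanning a multibyte string with mbrtoc32 follows. The function name and the error policy (count each byte of an invalid or incomplete sequence as one character) are assumptions; the sketch counts characters only and omits the width computation that print_half_line does with c32width.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/* Count the characters in the LEN bytes starting at S, decoding
   according to the current locale.  Each byte of an invalid or
   incomplete sequence counts as one character.  */
size_t
count_chars (char const *s, size_t len)
{
  assert (s);
  mbstate_t state;
  memset (&state, 0, sizeof state);
  size_t count = 0;
  while (len > 0)
    {
      char32_t ch;
      size_t n = mbrtoc32 (&ch, s, len, &state);
      if (n == (size_t) -3)
        n = 0;  /* Character came from earlier input; no bytes consumed.  */
      else if (n == (size_t) -1 || n == (size_t) -2)
        {
          /* Encoding error: skip one byte and reset the shift state.  */
          n = 1;
          memset (&state, 0, sizeof state);
        }
      else if (n == 0)
        n = 1;  /* A null character occupies one byte.  */
      s += n;
      len -= n;
      count++;
    }
  return count;
}
```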
* bootstrap.conf (buildreq): Require Automake 1.14 instead of 1.12.2,
since AM_PROG_CC_C_O is obsolete as of 1.14.
* configure.ac: Don’t use obsolescent AM_PROG_CC_C_O.
* bootstrap.conf (gnulib_modules): Add count-leading-zeros.
* src/analyze.c (discard_confusing_lines, diff_2_files):
* src/io.c (read_files):
Prefer floor_log2 to doing it by hand.
* src/cmp.c, src/diff.c, src/diff3.c, src/sdiff.c:
Define SYSTEM_INLINE, for system.h.
* src/system.h: Include count-leading-zeros.h.
(SYSTEM_INLINE): New macro.
(LIN_MAX): Verify that it does not exceed IDX_MAX, so that
floor_log2 is safe to use for lin values too.
(floor_log2): New inline function.
* bootstrap.conf (gnulib_modules): Add ialloc, to document
the now-direct dependency.
* src/diff.c (add_regexp):
* src/diff3.c (read_diff):
* src/dir.c (dir_read):
* src/io.c (slurp, find_and_hash_each_line, find_identical_ends):
* src/sdiff.c (diffarg):
Prefer xpalloc to doing it by hand.
* src/io.c: Include ialloc.h, for irealloc.
(equivs_alloc): Now idx_t, not lin, for xpalloc.
(sip): Don’t bother subtracting 2 * sizeof (word) from the
buffer_lcm upper bound, as later code works anyway now.
(slurp): Simplify buffer allocation so that xpalloc can be used.
Use irealloc for speculative reallocation, since the code could
work anyway if the irealloc fails. Use current->eof to check
for EOF, rather than the less-intuitive buffer size checks.
Prefer idx_t to size_t in lib/cmpbuf.c and related buffer-size code.
Because POSIX says blksize_t can be wider than idx_t,
check for overflow when copying the former to the latter.
* bootstrap.conf (gnulib_modules): Add idx.
* lib/cmpbuf.c (block_read, buffer_lcm):
Prefer idx_t to size_t. All uses changed.
* lib/cmpbuf.c (block_read): Return ptrdiff_t instead of size_t.
All uses changed.
(buffer_lcm): Help the compiler by checking for negative args,
even though they are not allowed.
* lib/cmpbuf.h: Include idx.h and stddef.h, for idx_t and ptrdiff_t,
so that this include file is self-contained.
* src/analyze.c (diff_2_files):
* src/cmp.c (main):
* src/diff.c, src/io.c: Do not include stdckdint.h here,
since system.h now does that.
* src/diff3.c (read_diff):
* src/io.c (sip):
Protect against negative STAT_BLOCKSIZE, or STAT_BLOCKSIZE
outside idx_t range.
* src/system.h: Include stdckdint.h, idx.h.
* bootstrap.conf (gnulib_modules): Add stdckdint.
* lib/cmpbuf.c: Use ckd_mul rather than INT_MULTIPLY_WRAPV.
Include stdckdint.h, not "intprops.h".
* src/diff.c: Similar, but for both ckd_add and ckd_mul.
* src/io.c: Likewise for ckd_add.
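The ckd_add and ckd_mul style of overflow checking can be illustrated with a small sketch. The function name is hypothetical, and the fallback macros assume a GCC-compatible compiler when <stdckdint.h> is not available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __has_include
# if __has_include (<stdckdint.h>)
#  include <stdckdint.h>
# endif
#endif
#ifndef ckd_mul
/* Fallback for compilers lacking <stdckdint.h>; this assumes that the
   GCC/Clang overflow builtins are available.  */
# define ckd_add(r, a, b) __builtin_add_overflow (a, b, r)
# define ckd_mul(r, a, b) __builtin_mul_overflow (a, b, r)
#endif

/* Set *RESULT to NMEMB * SIZE + EXTRA and return true, or return
   false if the computation would overflow.  */
bool
checked_buffer_size (ptrdiff_t nmemb, ptrdiff_t size, ptrdiff_t extra,
                     ptrdiff_t *result)
{
  assert (result);
  ptrdiff_t product;
  if (ckd_mul (&product, nmemb, size))
    return false;
  return !ckd_add (result, product, extra);
}
```

The ckd_* macros return true on overflow, so each call doubles as the overflow test and no separate range check is needed.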
Following Pádraig Brady's example from coreutils, ...
* bootstrap.conf: Add an explicit requirement on m4.
Add an explicit requirement on texi2pdf, which is often packaged
separately from makeinfo and whose absence causes a failure late in
the distribution phase.
Replace the rsync dependency with wget,
which gnulib changed to in 2018.
Also, add an xz requirement and a version for autopoint.
* bootstrap.conf (gnulib_modules): Add extern-inline.
* src/diff.h: Use _GL_INLINE_HEADER_BEGIN and _GL_INLINE_HEADER_END.
(DIFF_INLINE): New macro.
(robust_output_style): Now an inline function, not a macro
ROBUST_OUTPUT_STYLE. All uses changed.
The meaning of AC_SYS_LARGEFILE has changed, so AC_SYS_YEAR2038
is needed as well.
* NEWS: Mention this.
* tests: Add test.
* bootstrap.conf: Add year2038.
Copyright-paperwork-exempt: yes