But reported by Nick Smallbone, with one-line fix by
Collin Funk <https://bugs.gnu.org/76613>.
* src/io.c (find_and_hash_each_line): Fix size computation.
Gleb Fotengauer-Malinovskiy reported <https://bugs/gnu/org/66095>
that the recent change to quoting style broke GNU patch.
* src/util.c: Include quotearg.h.
(current_name): New static var, replacing the the old
current_name0 and current_name1. All uses changed.
(begin_output): Go back to quoting file names for C,
not for the shell, when they contain troublesome characters.
This is not a simple revert, as the revised code handles
multi-byte characters even in non-UTF-8 locales.
* tests/filename-quoting: Revert previous change to this file.
* bootstrap.conf (gnulib_modules): Add c32isprint.
* src/util.c: Include mcel.h.
(output_1_line): Return immediately on output error.
Scan multi-byte characters and count their widths.
(analyze_hunk): Ignore multi-byte white space too.
* bootstrap.conf (gnulib_modules): Add sys_types,
for MAJOR_IN_MKDEV and MAJOR_IN_SYSMACROS.
* src/diff.c (major, minor): New macros or functions.
Include <sys/mkdev.h> or <sys/sysmacros.h> for them.
(compare_files): Output major and minor device numbers
for special files that differ.
* bootstrap.conf (gnulib_modules): Add quote.
* src/diff.c: Include quote.h.
(compare_files): Print contents of symlinks that differ,
and quote their names and contents.
* src/system.h (symlink_size_ok): Remove.
(stat_size): Don’t worry about symlink sizes.
* tests/no-dereference: Adjust tests to match new behavior.
This closes some more races, by using openat+fstat instead
of fstatat+openat which can get confused by some other process
renaming files in the meantime. Not all races are closed of course.
* bootstrap.conf (gnulib_modules): Add d-type.
* src/diff.c (errno_encode, errno_decode): Remove, as file
descriptors are no longer portmanteau variables. All uses removed.
(detype_from_mode): New function.
(dir_p): Use detype, not stat.st_mode.
(compare_files): New args DETYPE0 and DETYPE1. All uses changed.
Update detype and err as new info arrives.
Adjust to desc's new use (no longer encodes errno).
Do not ignore lseek failures on regular files.
Prefer openat+fstat to fstatat+openat when detype shows that it's safe,
and avoid both fstat and fstatat if detype suffices.
Use ‘error’ with errno value rather than setting error
and then calling perror_with_name. Coalesce two of these
error diagnostics into one by moving an error check before
the diagnostic is output. Coalesce two calls to diff_dirs.
Print file type based on detype if available,
in case neither fstat nor fstatat was called.
* src/diff.h (enum detype): New type.
(struct file_data): New slots err, detype.
(NONEXISTENT, UNOPENED): Renumber so that -1 stands for open failed.
* src/dir.c (HAVE_STRUCT_DIRENT_D_TYPE): Default to false.
(dir_read): Return to caller the d_type, if available.
(diff_dirs): Pass detype to compare_files.
* bootstrap.conf (gnulib_modules): Add builtin-expect.
* lib/mbcel-strcasecmp.c: New file.
* lib/Makefile.am (libdiffutils_a_SOURCES): Add it.
* lib/mbcel.h (MBCEL_LEN_MAX, MBCEL_ENCODING_ERROR_SHIFT)
(MBCEL_UCHAR_FITS, MBCEL_UCHAR_EASILY_FITS): New constants.
(_GL_LIKELY): New macro.
(mbcel_scan): Use it. Simplify NetBSD code.
(mbcel_scant, mbcel_scanz, mbcel_cmp, mbcel_casecmp): New functions.
* src/dir.c (strcasecoll): Move defn here from system.h,
since only dir.c needs it. Use mbcel_strcasecmp instead
of strcasecmp.
This should improve performance when doing recursive comparisons.
Currently there is no attempt to avoid file descriptor exhaustion,
just as previously there is no attempt to avoid file names
that provoke ENAMETOOLONG. Because of this change, ‘diff - A/B’
now works correctly when standard input is a directory.
* .gitignore: Add lib/dirent.h.
* bootstrap.conf (gnulib_modules): Add fdopendir.
* src/diff.c (main): Initialize noparent’s desc to AT_FDCWD.
(compare_files): Use fstatat with parent directory’s file
descriptor and relative name, instead of lstat or stat.
Likewise for openat and open.
* src/diff.h (struct file_data): New member ‘dirstream’.
(struct comparison): The ‘parent’ member is now &noparent (instead
of null) if there is no parent. All uses changed.
(curr): New toplevel variable, replacing ‘files’. All uses changed.
* src/dir.c: Include dirname.h, for last_component.
(dir_read): New arg PARENTDIRFD. Arg DIR is no longer
pointer-to-const since DIR->desc and DIR->dirstream are now
updated. Use PARENTDIRFD to open the directory via
opendat+fdopendir instead of via opendir. Update new dirstream
component instead of closing the directory, since it’s now the
caller’s responsibility to close the directory because callers now
want the file descriptor. All callers changed.
(diff_dirs): First arg CMP is no longer pointer-to-const since
CMP->file is updated by dir_read. All callers changed.
(find_dir_file_pathname): First arg is now struct file_data *,
not merely a file name. All callers changed.
* tests/stdin: Test new behavior when stdin is a directory.
* bootstrap.conf (gnulib_modules): Add c32isspace, c32tolower.
* lib/Makefile.am (noinst_HEADERS): Add mbcel.h.
(libdiffutils_a_SOURCES): Add mbcel.c
* lib/mbcel.c, lib/mbcel.h: New files.
* src/io.c: Include mbcel.h, uchar.h.
(hash): 2nd arg is now hash_value, not merely unsigned char,
since the caller might pass a char32_t now.
(find_and_hash_each_line): Support multi-byte input.
* src/util.c: Include mbcel.h, uchar.h.
(lines_differ): New args S1LEN, S2LEN, needed for mbcel_scan.
Caller changed. Support multi-byte input.
* tests/ignore-case: New file.
* tests/Makefile.am (TESTS): Add it.
* tests/ignore-tab-expansion: Add UTF-8 test.
* tests/init.cfg (require_utf8_locale_): New function.
* tests/side-by-side: Use it. Add a column-counting test.
* src/io.c (find_and_hash_each_line):
* src/util.c (lines_differ):
Treat '\0', '\a', '\b', '\f', '\r', '\v' consistently with how
side.c treats them when expanding them, e.g., backspacing from
column 1 is a no-op when counting tab columns.
* tests/ignore-tab-expansion: New test.
* tests/Makefile.am (TESTS): Add it.
* src/cmp.c (bytes): Use INTMAX_MAX, not -1, for “infinity”.
This simplifies the code and is not a value that can be exhausted
these days.
(main, cmp): Treat very large -n values as “infinity”.
* tests/cmp: Test this.
* NEWS: Document this.
* src/cmp.c (specify_ignore_initial):
If the value exceeds TYPE_MAXIMUM (off_t), set the correspnding
ignore_initial value to -1 instead of reporting an error.
(main, cmp, file_position): All uses changed. Hence a huge value
will always do the right thing with regular files, which cannot
contain more than TYPE_MAXIMUM (off_t) bytes. There still may be
EOVERFLOW failures reported for non-regular files, though, as
these can be larger.
* tests/cmp: Test cmp -i N when N cannot fit into 64 bits.
* NEWS: Mention this.
* src/cmp.c (main, cmp): Do not trust st_size == 0, as it may
be a /proc file.
* tests/brief-vs-stat-zero-kernel-lies: Also test cmp -s.
AC_SYS_LARGEFILE meaning has changed, need AC_SYS_YEAR2038 as well
* NEWS: mention this
* tests: add test
* bootstrap.conf: add year2038
Copyright-paperwork-exempt: yes
Problem reported by Robert Webb (bug#61193).
* NEWS: Mention this.
* src/diff.c (main): Omit stray ‘sizeof’.
* tests/ifdef: New test.
* tests/Makefile.am (TESTS): Add it.
Run "make update-copyright" and then...
* gnulib: Update to latest with copyright year adjusted.
* tests/init.sh: Sync with gnulib to pick up copyright year.
* bootstrap: Likewise.
Run "make update-copyright" and then...
* gnulib: Update to latest with copyright year adjusted.
* tests/init.sh: Sync with gnulib to pick up copyright year.
* bootstrap: Likewise.
Problem reported by alec (Bug#35256).
* NEWS: Mention the fix.
* bootstrap.conf (gnulib_modules): Use strtoimax and xstrtoimax,
not strtoumax and strtoumax.
* src/cmp.c (bytes): Now signed, with -1 representing no limit.
All uses changed.
* src/cmp.c (specify_ignore_initial, main):
* src/diff.c (main):
* src/ifdef.c (format_group):
* src/sdiff.c (interact):
Use strtoimax, not strtoumax.
Using this file,
cat > leading-blank.exempt <<\EOF
(\.gitmodules|help2man|pre-commit)$
(?:^|\/)ChangeLog[^/]*$
(?:^|\/)(?:GNU)?[Mm]akefile[^/]*$
\.(?:am|mk)$
EOF
run the following command to convert all non-conforming leading white
space to be all spaces:
git ls-files \
| pcregrep -vf leading-blank.exempt \
| xargs pcregrep -l '^ *\t' \
| xargs perl -MText::Tabs -ni -le \
'$m=/^( *\t[ \t]*)(.*)/; print $m ? expand($1) . $2 : $_'
Since that changed old NEWS, I also ran "make update-NEWS-hash"
to update the old_NEWS_hash value in cfg.mk.
* NEWS: Mention this.
* src/cmp.c, src/diff3.c, src/sdiff.c: Include stdopen.h.
(main): Call stdopen early.
* src/cmp.c (main): Simplify now that we need not worry about
stdin being closed.
* src/diff.c (main): Translate stdopen diagnostic.
* NEWS: Mention this.
* bootstrap.conf (gnulib_modules): Add stdopen.
* doc/diffutils.texi (Comparing Directories):
Do not document behavior if stdin is closed.
* src/diff.c: Include stdopen.h.
(main): Call stdopen early.
(compare_files) [__hpux]: Remove recently-introduced
special case for HP-UX exec with stdin closed.
* tests/new-file: Remove tests of the removed feature.
GNU less can display ANSI-colored text with the -R flag, but this
support has some limitations. One of them is that if an escape
sequence starts on one line and ends on a different line, only the
first line will be colored in less.
As a result, when diff creates colored output with multi-line deletes
or adds, less will only color the first line.
This change resets ANSI color to the default at the end of
each line and restarts it at the beginning of the next. It patches
normal and context mode. Side-by-side already worked in my testing.
* src/context.c (print_context_label, pr_context_hunk): As above.
(pr_unidiff_hunk, print_context_header): Likewise.
* src/normal.c (print_normal_hunk): Likewise.
* tests/colors: Adjust existing tests to accommodate this.
* NEWS (Improvements): Mention it.
Proposed in http://bugs.gnu.org/31105
Problem reported by Hongxu Chen (Bug#31935).
* src/io.c (prepare_text): Strip trailing CR before
doing the rest of the analysis.
* NEWS: Mention the fix.
Co-authored-by: Jim Meyering <jim@meyering.net>