diff: fix bug with Asian file names

Problem reported by Errembault Philippe in:
http://lists.gnu.org/archive/html/bug-diffutils/2013-03/msg00012.html
* NEWS: Document this.
* src/dir.c (compare_names): Fall back on file_name_cmp if
compare_collated returns 0, unless ignoring file name case.
(diff_dirs): Don't bother with the O(N**2) stuff unless ignoring
file name case.
* tests/Makefile.am (TESTS): Add strcoll-0-names.
* tests/strcoll-0-names: New file.
Paul Eggert 2013-04-03 08:20:31 -07:00
parent 885dfcec00
commit 4825b8d70c
4 changed files with 39 additions and 2 deletions

NEWS

@@ -2,6 +2,13 @@ GNU diffutils NEWS -*- outline -*-
* Noteworthy changes in release ?.? (????-??-??) [?]
** Bug fixes
Unless the --ignore-file-name-case option is used, diff now
considers file names to be equal only if they are byte-for-byte
equivalent. This fixes a bug where diff in an English locale might
consider two Asian file names to be the same merely because they
contain no English characters.
* Noteworthy changes in release 3.3 (2013-03-24) [stable]

src/dir.c

@@ -166,7 +166,11 @@ static int
compare_names (char const *name1, char const *name2)
{
if (locale_specific_sorting)
-return compare_collated (name1, name2);
+{
+int diff = compare_collated (name1, name2);
+if (diff || ignore_file_name_case)
+return diff;
+}
return file_name_cmp (name1, name2);
}
@@ -271,7 +275,7 @@ diff_dirs (struct comparison const *cmp,
O(N**2), where N is the number of names in a directory
that compare_names says are all equal, but in practice N
is so small it's not worth tuning. */
-if (nameorder == 0)
+if (nameorder == 0 && ignore_file_name_case)
{
int raw_order = file_name_cmp (*names[0], *names[1]);
if (raw_order != 0)

tests/Makefile.am

@@ -12,6 +12,7 @@ TESTS = \
no-dereference \
no-newline-at-eof \
stdin \
strcoll-0-names \
filename-quoting
EXTRA_DIST = \

tests/strcoll-0-names Executable file

@@ -0,0 +1,25 @@
#!/bin/sh
# Check that diff responds well with two different file names
# that compare equal with strcoll. See:
# http://lists.gnu.org/archive/html/bug-diffutils/2013-03/msg00012.html
. "${srcdir=.}/init.sh"; path_prepend_ ../src
# These two names compare equal in the en_US.UTF-8 locale
# in current (2013) versions of glibc.
# On systems where the names do not compare equal,
# this diff test should still do the right thing.
LC_ALL=en_US.UTF-8
export LC_ALL
name1='エンドカード1'
name2='ブックレット1'
mkdir d1 d2 || fail=1
echo x >d1/"$name1" || fail=1
echo x >d2/"$name2" || fail=1
# This should report a difference, but on the affected systems
# diffutils 3.3 does not.
diff d1 d2 && fail=1
Exit $fail