updatedb: Remove support for the old pre-4.0 database format.

* locate/testsuite/Makefile.am (EXTRA_DIST_EXP): Remove
locate.gnu/old_prefix.exp and locate.gnu/oldformat.exp.
(EXTRA_DIST_XO): Remove locate.gnu/old_prefix.xo and
locate.gnu/oldformat.xo.
* doc/find.texi (Database Formats): Remove the warning about old
versions of locate failing to read the LOCATE02 database format.
Mention that the slocate database format is also supported.
(Old Database Format): Point out that updatedb will no longer
produce the old format.
(Invoking updatedb): Remove mention of the --old-format option.
Remove mention of --dbformat=old.
(Long File Name Bugs with Old-Format Databases): Remove this
section.
* locate/updatedb.sh: Remove support for --dbformat=old and
--old-format.
(checkbinary): Don't look for the bigram and code binaries.
* locate/updatedb.1: Explain that support for the old database
format has been removed from updatedb and will shortly be removed
from locate also.  Remove the documentation for the removed
option --old-format and mention of --dbformat=old.
* locate/code.c: Remove, since this program was only used to
generate old-format databases.
* locate/bigram.c: Remove, since this program was only used to
generate old-format databases.
* po/POTFILES.in: Remove bigram.c and code.c.
* locate/word_io.c (putword): Remove this function, since it was
only needed for making old-format databases.
* find/find.1 (NON-BUGS): Don't mention bigram.c and code.c in the
example.
* locate/locatedb.h: Remove declaration of putword, which has been
deleted.
* locate/Makefile.am (libexec_PROGRAMS): Remove bigram and code
(since they were only used to generate old-format databases).
(updatedb): Don't substitute @bigram@ and @code@.
(code_SOURCES): Delete.
* locate/testsuite/locate.gnu/old_prefix.exp: Delete test case for
the old database format.
* locate/testsuite/locate.gnu/old_prefix.xo: Likewise.
* locate/testsuite/locate.gnu/oldformat.exp: Likewise.
* locate/testsuite/locate.gnu/oldformat.xo: Likewise.
* TODO: manpages for bigram and code are no longer needed.
* NEWS: Mention these changes.
James Youngman 2016-01-09 22:24:59 +00:00
parent 69e308b286
commit 89ec0211ce
17 changed files with 69 additions and 736 deletions

NEWS

@ -4,6 +4,12 @@ GNU findutils NEWS - User visible changes. -*- outline -*- (allout)
** Changes to locate / updatedb
Support for generating old-format databases (with updatedb
--old-format or updatedb --dbformat=old) has been removed. The old
database format was deprecated in 2007 (and updatedb has warned about
this since that time). The locate program will read old-format
databases, though this support also will be removed.
The updatedb script now operates in the C locale only. This means
that character encoding issues are now not likely to cause sort to
fail. It also honours the TMPDIR environment variable if that was

TODO

@ -2,7 +2,7 @@
* Internationalization
** updatedb.sh should be internationalized
* man pages for frcode, bigram, and code
* man page for frcode
Perhaps a better description in texi pages as well.
* Add option for find to sort output in lexical order for use for updatedb

doc/find.texi

@ -2912,18 +2912,17 @@ directory trees when the databases were last updated. The file name
database format changed starting with GNU @code{locate} version 4.0 to
allow machines with different byte orderings to share the databases.
GNU @code{locate} can read both the old and new database formats.
However, old versions of @code{locate} (on other Unix systems, or GNU
@code{locate} before version 4.0) produce incorrect results if run
against a database in something other than the old format.
Support for the old database format will eventually be discontinued,
first in @code{updatedb} and later in @code{locate}.
GNU @code{locate} can read both the old pre-findutils-4.0 database
format and the @samp{LOCATE02} database format. Support for the old
database format will shortly be removed from @code{locate}. It has
already been removed from @code{updatedb}.
If you run @samp{locate --statistics}, the resulting summary indicates
the type of each @code{locate} database. You select which database
format @code{updatedb} will use with the @samp{--dbformat} option.
The @samp{slocate} database format is very similar to @samp{LOCATE02}
and is also supported (in both @code{updatedb} and @code{locate}).
@menu
* LOCATE02 Database Format::
@ -3024,21 +3023,20 @@ interpreted as for the GNU LOCATE02 format.
@subsection Old Database Format
The old database format is used by Unix @code{locate} and @code{find}
programs and earlier releases of the GNU ones. @code{updatedb}
produces this format if given the @samp{--old-format} option.
programs and pre-4.0 releases of GNU findutils. @code{locate}
understands this format, though @code{updatedb} will no longer produce
it.
@code{updatedb} runs programs called @code{bigram} and @code{code} to
produce old-format databases. The old format differs from the new one
in the following ways. Instead of each entry starting with an
offset-differential count byte and ending with a null, byte values
from 0 through 28 indicate offset-differential counts from -14 through
14. The byte value indicating that a long offset-differential count
follows is 0x1e (30), not 0x80. The long counts are stored in host
byte order, which is not necessarily network byte order, and host
integer word size, which is usually 4 bytes. They also represent a
count 14 less than their value. The database lines have no
termination byte; the start of the next line is indicated by its first
byte having a value <= 30.
The old format differs from @samp{LOCATE02} in the following ways.
Instead of each entry starting with an offset-differential count byte
and ending with a null, byte values from 0 through 28 indicate
offset-differential counts from -14 through 14. The byte value
indicating that a long offset-differential count follows is 0x1e (30),
not 0x80. The long counts are stored in host byte order, which is not
necessarily network byte order, and host integer word size, which is
usually 4 bytes. They also represent a count 14 less than their
value. The database lines have no termination byte; the start of the
next line is indicated by its first byte having a value <= 30.
In addition, instead of starting with a dummy entry, the old database
format starts with a 256 byte table containing the 128 most common
@ -3049,17 +3047,13 @@ offset-differential count coding makes these databases 20-25% smaller
than the new format, but makes them not 8-bit clean. Any byte in a
file name that is in the ranges used for the special codes is replaced
in the database by a question mark, which not coincidentally is the
shell wildcard to match a single character.
shell wildcard to match a single character. The old format therefore
cannot faithfully store entries with non-ASCII characters.
The old format therefore cannot faithfully store entries with
non-ASCII characters. It therefore should not be used in
internationalised environments. That is, most installations should
not use it.
Because the long counts are stored by the @code{code} program as
Because the long counts are stored as
native-order machine words, the database format is not easily used in
environments which differ in terms of byte order. If locate databases
are to be shared between machines, the LOCATE02 database format should
are to be shared between machines, the @samp{LOCATE02} database format should
be used. This has other benefits as discussed above. However, the
length of the filename currently being processed can normally be used
to place reasonable limits on the long counts and so this information
@ -3098,16 +3092,6 @@ the newline character, meaning that parts of file names containing
newlines will be incorrectly sorted. This can result in both
incorrect matches and incorrect failures to match.
On the other hand, if you are using the old database format, file
names with embedded newlines are not correctly handled. There is no
technical limitation which enforces this, it's just that the
@code{bigram} program has not been updated to support lists of file
names separated by nulls.
So, if you are using the new database format (this is the default) and
your system uses GNU @code{sort}, newlines will be correctly handled
at all times. Otherwise, newlines may not be correctly handled.
@node File Permissions
@chapter File Permissions
@ -3631,24 +3615,12 @@ The user to search network directories as, using @code{su}. Default
@code{user} is @code{daemon}. You can also use the environment variable
@code{NETUSER} to set this user.
@item --old-format
Generate a @code{locate} database in the old format, for compatibility
with versions of @code{locate} other than GNU @code{locate}. Using
this option means that @code{locate} will not be able to properly
handle non-ASCII characters in file names (that is, file names
containing characters which have the eighth bit set, such as many of
the characters from the ISO-8859-1 character set). @xref{Database
Formats}, for a detailed description of the supported database
formats.
@item --dbformat=@var{FORMAT}
Generate the locate database in format @code{FORMAT}. Supported
database formats include @code{LOCATE02} (which is the default),
@code{old} and @code{slocate}. The @code{old} format exists for
compatibility with implementations of @code{locate} on other Unix
systems. The @code{slocate} format exists for compatibility with
@code{slocate}. @xref{Database Formats}, for a detailed description
of each format.
database formats include @code{LOCATE02} (which is the default) and
@code{slocate}. The @code{slocate} format exists for compatibility
with @code{slocate}. @xref{Database Formats}, for a detailed
description of each format.
@item --help
Print a summary of the command line usage and exit.
@ -5377,47 +5349,6 @@ resolved by using @code{locate}'s @samp{-0} option, this still leaves
the race condition problems associated with @samp{find @dots{} -print0}.
There is no way to avoid these problems in the case of @code{locate}.
@subsection Long File Name Bugs with Old-Format Databases
Old versions of @code{locate} have a bug in the way that old-format
databases are read. This bug affects the following versions of
@code{locate}:
@enumerate
@item All releases prior to 4.2.31
@item All 4.3.x releases prior to 4.3.7
@end enumerate
The affected versions of @code{locate} read file names into a
fixed-length 1026 byte buffer, allocated on the heap. This buffer is
not extended if file names are too long to fit into the buffer. No
range checking on the length of the filename is performed. This could
in theory lead to a privilege escalation attack. Findutils versions
4.3.0 to 4.3.6 are also affected.
On systems using the old database format and affected versions of
@code{locate}, carefully-chosen long file names could in theory allow
malicious users to run code of their choice as any user invoking
locate.
If remote users can choose the names of files stored on your system,
and these files are indexed by @code{updatedb}, this may be a remote
security vulnerability. Findutils version 4.2.31 and findutils
version 4.3.7 include fixes for this problem. The @code{updatedb},
@code{bigram} and @code{code} programs do not appear to be affected.
If you are also using GNU coreutils, you can use the following command
to determine the length of the longest file name on a given system:
@example
find / -print0 | tr -c '\0' 'x' | tr '\0' '\n' | wc -L
@end example
Although this problem is significant, the old database format is not
the default, and use of the old database format is not common. Most
installations and most users will not be affected by this problem.
@node Security Summary
@section Summary
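
The byte ranges that the Old Database Format text and the code.c header comment describe can be summarised in a small shell sketch. This is only an illustration of the documented ranges, not code from findutils: the function name classify_old_byte is hypothetical, it takes the byte value as a decimal integer in 0..255, and the "unused" label for values 29 and 31 is an assumption, since the documentation does not say what they mean.

```shell
# Hypothetical helper: classify one byte of an old-format locate database.
# $1 is the byte value as a decimal integer (assumed to be in 0..255).
classify_old_byte() {
    b=$1
    if [ "$b" -le 28 ]; then
        # 0..28 encode differential counts -14..14 (stored with offset 14).
        echo "count $((b - 14))"
    elif [ "$b" -eq 30 ]; then
        # 0x1e: a long count follows, in host byte order and word size.
        echo "escape"
    elif [ "$b" -ge 128 ]; then
        # 128..255: index into the table of the 128 most common bigrams.
        echo "bigram $((b - 128))"
    elif [ "$b" -ge 32 ]; then
        # 32..127: a single ASCII character of the remainder.
        echo "literal"
    else
        # 29 and 31: not described in the documentation (assumption).
        echo "unused"
    fi
}
```

For example, classify_old_byte 14 reports a differential count of 0, meaning the entry shares exactly as many leading characters with its predecessor as the predecessor did with the entry before it.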

find/find.1

@ -2212,7 +2212,7 @@ resulting in
actually receiving a command line like this:
.nf
.
.B find . \-name bigram.c code.c frcode.c locate.c \-print
.B find . \-name frcode.c locate.c word_io.c \-print
.
.fi
That command is of course not going to work. Instead of doing things

locate/Makefile.am

@ -4,12 +4,9 @@ AM_CFLAGS = $(WARN_CFLAGS)
LOCATE_DB = $(localstatedir)/locatedb
localedir = $(datadir)/locale
AM_INSTALLCHECK_STD_OPTIONS_EXEMPT = \
frcode$(EXEEXT) \
code$(EXEEXT) \
bigram$(EXEEXT)
AM_INSTALLCHECK_STD_OPTIONS_EXEMPT = frcode$(EXEEXT)
bin_PROGRAMS = locate
libexec_PROGRAMS = frcode code bigram
libexec_PROGRAMS = frcode
bin_SCRIPTS = updatedb
man_MANS = locate.1 updatedb.1 locatedb.5
BUILT_SOURCES = dblocation.texi
@ -18,7 +15,6 @@ CLEANFILES = updatedb
DISTCLEANFILES = dblocation.texi
locate_SOURCES = locate.c word_io.c
code_SOURCES = code.c word_io.c
nodist_locate_TEXINFOS = dblocation.texi
AM_CPPFLAGS = -I$(top_srcdir)/lib -I../gl/lib -I$(top_srcdir)/gl/lib -DLOCATE_DB=\"$(LOCATE_DB)\" -DLOCALEDIR=\"$(localedir)\"
@ -34,8 +30,6 @@ updatedb: updatedb.sh Makefile
rm -f $@
find=`echo find|sed '$(transform)'`; \
frcode=`echo frcode|sed '$(transform)'`; \
bigram=`echo bigram|sed '$(transform)'`; \
code=`echo code|sed '$(transform)'`; \
sed \
-e "s,@""bindir""@,$(bindir)," \
-e "s,@""libexecdir""@,$(libexecdir)," \
@ -44,8 +38,6 @@ updatedb: updatedb.sh Makefile
-e "s,@""PACKAGE_NAME""@,$(PACKAGE_NAME)," \
-e "s,@""find""@,$${find}," \
-e "s,@""frcode""@,$${frcode}," \
-e "s,@""bigram""@,$${bigram}," \
-e "s,@""code""@,$${code}," \
-e "s,@""SORT""@,$(SORT)," \
-e "s,@""SORT_SUPPORTS_Z""@,$(SORT_SUPPORTS_Z)," \
$(srcdir)/updatedb.sh > $@

locate/bigram.c

@ -1,140 +0,0 @@
/* bigram -- list bigrams for locate
Copyright (C) 1994, 2007, 2009-2011, 2016 Free Software Foundation,
Inc.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
/* Usage: bigram < text > bigrams
Use `code' to encode a file using this output.
Read a file from stdin and write out the bigrams (pairs of
adjacent characters), one bigram per line, to stdout. To reduce
needless duplication in the output, it starts finding the
bigrams on each input line at the character where that line
first differs from the previous line (i.e., in the ASCII
remainder). Therefore, the input should be sorted in order to
get the least redundant output.
Written by James A. Woods <jwoods@adobe.com>.
Modified by David MacKenzie <djm@gnu.ai.mit.edu>. */
/* config.h must always be included first. */
#include <config.h>
/* system headers. */
#include <errno.h>
#include <stdio.h>
#include <locale.h>
#include <string.h>
#include <stdlib.h>
#include <sys/types.h>
/* gnulib headers. */
#include "closeout.h"
#include "gettext.h"
#include "progname.h"
#include "xalloc.h"
#include "error.h"
/* find headers would go here but we don't need any. */
/* We use gettext because for example xmalloc may issue an error message. */
#if ENABLE_NLS
# include <libintl.h>
# define _(Text) gettext (Text)
#else
# define _(Text) Text
#define textdomain(Domain)
#define bindtextdomain(Package, Directory)
#endif
/* Return the length of the longest common prefix of strings S1 and S2. */
static int
prefix_length (char *s1, char *s2)
{
register char *start;
for (start = s1; *s1 == *s2 && *s1 != '\0'; s1++, s2++)
;
return s1 - start;
}
int
main (int argc, char **argv)
{
char *path; /* The current input entry. */
char *oldpath; /* The previous input entry. */
size_t pathsize, oldpathsize; /* Amounts allocated for them. */
int line_len; /* Length of input line. */
if (argv[0])
set_program_name (argv[0]);
else
set_program_name ("bigram");
#ifdef HAVE_SETLOCALE
setlocale (LC_ALL, "");
#endif
bindtextdomain (PACKAGE, LOCALEDIR);
textdomain (PACKAGE);
(void) argc;
if (atexit (close_stdout))
{
error (EXIT_FAILURE, errno, _("The atexit library function failed"));
}
pathsize = oldpathsize = 1026; /* Increased as necessary by getline. */
path = xmalloc (pathsize);
oldpath = xmalloc (oldpathsize);
/* Set to empty string, to force the first prefix count to 0. */
oldpath[0] = '\0';
while ((line_len = getline (&path, &pathsize, stdin)) > 0)
{
register int count; /* The prefix length. */
register int j; /* Index into input line. */
path[line_len - 1] = '\0'; /* Remove the newline. */
/* Output bigrams in the remainder only. */
count = prefix_length (oldpath, path);
for (j = count; path[j] != '\0' && path[j + 1] != '\0'; j += 2)
{
putchar (path[j]);
putchar (path[j + 1]);
putchar ('\n');
}
{
/* Swap path and oldpath and their sizes. */
char *tmppath = oldpath;
size_t tmppathsize = oldpathsize;
oldpath = path;
oldpathsize = pathsize;
path = tmppath;
pathsize = tmppathsize;
}
}
free (path);
free (oldpath);
return 0;
}

locate/code.c

@ -1,285 +0,0 @@
/* code -- bigram- and front-encode filenames for locate
Copyright (C) 1994, 2005, 2007-2008, 2010-2011, 2016 Free Software
Foundation, Inc.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
*/
/* Compress a sorted list.
Works with `find' to encode a filename database to save space
and search time.
Usage:
bigram < file_list > bigrams
process-bigrams > most_common_bigrams
code most_common_bigrams < file_list > squeezed_list
Uses `front compression' (see ";login:", March 1983, p. 8).
The output begins with the 128 most common bigrams.
After that, the output format is, for each line,
an offset (from the previous line) differential count byte
followed by a (partially bigram-encoded) ASCII remainder.
The output lines have no terminating byte; the start of the next line
is indicated by its first byte having a value <= 30.
The encoding of the output bytes is:
0-28 likeliest differential counts + offset (14) to make nonnegative
30 escape code for out-of-range count to follow in next halfword
128-255 bigram codes (the 128 most common, as determined by `updatedb')
32-127 single character (printable) ASCII remainder
Written by James A. Woods <jwoods@adobe.com>.
Modified by David MacKenzie <djm@gnu.org>. */
/* config.h should always be included first. */
#include <config.h>
/* system headers. */
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
/* gnulib headers. */
#include "closeout.h"
#include "error.h"
#include "gettext.h"
#include "progname.h"
#include "xalloc.h"
/* find headers. */
#include "findutils-version.h"
#include "locatedb.h"
#if ENABLE_NLS
# include <libintl.h>
# define _(Text) gettext (Text)
#else
# define _(Text) Text
#define textdomain(Domain)
#define bindtextdomain(Package, Directory)
#endif
#ifndef ATTRIBUTE_NORETURN
# define ATTRIBUTE_NORETURN __attribute__ ((__noreturn__))
#endif
/* The 128 most common bigrams in the file list, padded with NULs
if there are fewer. */
static char bigrams[257] = {0};
/* Return the offset of PATTERN in STRING, or -1 if not found. */
static int
strindex (char *string, char *pattern)
{
register char *s;
for (s = string; *s != '\0'; s++)
/* Fast first char check. */
if (*s == *pattern)
{
register char *p2 = pattern + 1, *s2 = s + 1;
while (*p2 != '\0' && *p2 == *s2)
p2++, s2++;
if (*p2 == '\0')
return s2 - strlen (pattern) - string;
}
return -1;
}
/* Return the length of the longest common prefix of strings S1 and S2. */
static int
prefix_length (char *s1, char *s2)
{
register char *start;
for (start = s1; *s1 == *s2 && *s1 != '\0'; s1++, s2++)
;
return s1 - start;
}
extern char *version_string;
static void
usage (FILE *stream)
{
fprintf (stream, _("\
Usage: %s [--version | --help]\n\
or %s most_common_bigrams < file-list > locate-database\n"),
program_name, program_name);
fputs (_("\nReport bugs to <bug-findutils@gnu.org>.\n"), stream);
}
static void inerr (const char *filename) ATTRIBUTE_NORETURN;
static void outerr (void) ATTRIBUTE_NORETURN;
static void
inerr (const char *filename)
{
error (EXIT_FAILURE, errno, "%s", filename);
/*NOTREACHED*/
abort ();
}
static void
outerr (void)
{
error (EXIT_FAILURE, errno, _("write error"));
/*NOTREACHED*/
abort ();
}
int
main (int argc, char **argv)
{
char *path; /* The current input entry. */
char *oldpath; /* The previous input entry. */
size_t pathsize, oldpathsize; /* Amounts allocated for them. */
int count, oldcount, diffcount; /* Their prefix lengths & the difference. */
char bigram[3]; /* Bigram to search for in table. */
int code; /* Index of `bigram' in bigrams table. */
FILE *fp; /* Most common bigrams file. */
int line_len; /* Length of input line. */
set_program_name (argv[0]);
if (atexit (close_stdout))
{
error (EXIT_FAILURE, errno, _("The atexit library function failed"));
}
bigram[2] = '\0';
if (argc != 2)
{
usage (stderr);
return 2;
}
if (0 == strcmp (argv[1], "--help"))
{
usage (stdout);
return 0;
}
else if (0 == strcmp (argv[1], "--version"))
{
display_findutils_version ("code");
return 0;
}
fp = fopen (argv[1], "r");
if (fp == NULL)
{
fprintf (stderr, "%s: ", argv[0]);
perror (argv[1]);
return 1;
}
pathsize = oldpathsize = 1026; /* Increased as necessary by getline. */
path = xmalloc (pathsize);
oldpath = xmalloc (oldpathsize);
/* Set to empty string, to force the first prefix count to 0. */
oldpath[0] = '\0';
oldcount = 0;
/* Copy the list of most common bigrams to the output,
padding with NULs if there are <128 of them. */
if (NULL == fgets (bigrams, 257, fp))
inerr (argv[1]);
if (256 != fwrite (bigrams, 1, 256, stdout))
outerr ();
if (EOF == fclose (fp))
inerr (argv[1]);
while ((line_len = getline (&path, &pathsize, stdin)) > 0)
{
char *pp;
path[line_len - 1] = '\0'; /* Remove newline. */
/* Squelch unprintable chars in path so as not to botch decoding. */
for (pp = path; *pp != '\0'; pp++)
{
if (!(*pp >= 040 && *pp < 0177))
*pp = '?';
}
count = prefix_length (oldpath, path);
diffcount = count - oldcount;
oldcount = count;
/* If the difference is small, it fits in one byte;
otherwise, two bytes plus a marker noting that fact. */
if (diffcount < -LOCATEDB_OLD_OFFSET || diffcount > LOCATEDB_OLD_OFFSET)
{
if (EOF == putc (LOCATEDB_OLD_ESCAPE, stdout))
outerr ();
if (!putword (stdout,
diffcount+LOCATEDB_OLD_OFFSET,
GetwordEndianStateNative))
outerr ();
}
else
{
if (EOF == putc (diffcount + LOCATEDB_OLD_OFFSET, stdout))
outerr ();
}
/* Look for bigrams in the remainder of the path. */
for (pp = path + count; *pp != '\0'; pp += 2)
{
if (pp[1] == '\0')
{
/* No bigram is possible; only one char is left. */
putchar (*pp);
break;
}
bigram[0] = *pp;
bigram[1] = pp[1];
/* Linear search for specific bigram in string table. */
code = strindex (bigrams, bigram);
if (code % 2 == 0)
putchar ((code / 2) | 0200); /* It's a common bigram. */
else
fputs (bigram, stdout); /* Write the text as printable ASCII. */
}
{
/* Swap path and oldpath and their sizes. */
char *tmppath = oldpath;
size_t tmppathsize = oldpathsize;
oldpath = path;
oldpathsize = pathsize;
path = tmppath;
pathsize = tmppathsize;
}
}
free (path);
free (oldpath);
return 0;
}
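
The front-compression step that the removed code.c performed (a shared-prefix length per entry, stored as a difference from the previous entry's prefix length) can be sketched in POSIX shell. This is a rough illustration under stated assumptions, not the removed implementation: the function names prefix_length and encode_count are hypothetical, values are printed as decimal text rather than written as raw bytes, and the "word:" notation simply stands in for the host-order machine word that followed the escape byte.

```shell
# Length of the longest common prefix of two strings (POSIX sh).
prefix_length() {
    a=$1 b=$2 n=0
    while [ -n "$a" ] && [ -n "$b" ]; do
        ca=${a%"${a#?}"}    # first character of a
        cb=${b%"${b#?}"}    # first character of b
        [ "$ca" = "$cb" ] || break
        a=${a#?} b=${b#?} n=$((n + 1))
    done
    echo "$n"
}

# Print the count field for one entry.
# $1 = previous entry's prefix length, $2 = this entry's prefix length.
encode_count() {
    diff=$(( $2 - $1 ))
    if [ "$diff" -lt -14 ] || [ "$diff" -gt 14 ]; then
        # Out of range: escape byte 30, then the offset count as a word.
        echo "30 word:$((diff + 14))"
    else
        # In range: a single byte holding the count plus the offset 14.
        echo "$((diff + 14))"
    fi
}
```

Because input is sorted, consecutive entries usually share long prefixes and the difference between successive prefix lengths is small, which is why the single-byte case covers most entries.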

locate/locatedb.h

@ -63,10 +63,6 @@ int getword (FILE *fp, const char *filename,
size_t maxvalue,
GetwordEndianState *endian_state_flag);
bool putword (FILE *fp, int word,
GetwordEndianState endian_state_flag);
#define SLOCATE_DB_MAGIC_LEN 2
#endif /* !INC_LOCATEDB_H */

locate/testsuite/Makefile.am

@ -41,8 +41,6 @@ locate.gnu/slocate.exp \
locate.gnu/notexists1.exp \
locate.gnu/notexists2.exp \
locate.gnu/notexists3.exp \
locate.gnu/old_prefix.exp \
locate.gnu/oldformat.exp \
locate.gnu/space1st.exp \
locate.gnu/sv-bug-14535.exp \
locate.gnu/exceedshort.exp
@ -63,9 +61,7 @@ locate.gnu/exists3.xo \
locate.gnu/slocate.xo \
locate.gnu/notexists1.xo \
locate.gnu/notexists2.xo \
locate.gnu/notexists3.xo \
locate.gnu/old_prefix.xo \
locate.gnu/oldformat.xo
locate.gnu/notexists3.xo
EXTRA_DIST = $(EXTRA_DIST_EXP) $(EXTRA_DIST_XO) $(EXTRA_DIST_XI)

locate/testsuite/locate.gnu/old_prefix.exp

@ -1,13 +0,0 @@
set tmp "tmp"
exec rm -rf $tmp
exec mkdir $tmp
exec mkdir $tmp/subdir
exec touch $tmp/subdir/________________________________________________________________________________fred1
exec touch $tmp/subdir/________________________________________________________________________________fred2
exec touch $tmp/subdir/________________________________________________________________________________fred3
exec touch $tmp/subdir/________________________________________________________________________________fred4
locate_start p "--changecwd=. --output=$tmp/locatedb --old-format --localpaths=tmp/subdir 2>/dev/null" "--database=$tmp/locatedb tmp" {}
exec rm -rf $tmp

locate/testsuite/locate.gnu/old_prefix.xo

@ -1,5 +0,0 @@
tmp/subdir
tmp/subdir/________________________________________________________________________________fred1
tmp/subdir/________________________________________________________________________________fred2
tmp/subdir/________________________________________________________________________________fred3
tmp/subdir/________________________________________________________________________________fred4

locate/testsuite/locate.gnu/oldformat.exp

@ -1,12 +0,0 @@
# A basic test for the old database format. We need this test because (among
# other reasons) the updatedb script only uses our mktemp replacement when
# it needs to run bigram/code.
set tmp "tmp"
exec rm -rf $tmp
exec mkdir $tmp
exec mkdir $tmp/subdir
exec touch $tmp/subdir/fred
# Redirect stderr to /dev/null to throw away the warning message about using
# the old format, because otherwise the presence of the error message would
# cause locate_start to signal a test case failure.
locate_start p "--changecwd=. --output=$tmp/locatedb --old-format --localpaths=tmp/subdir/ 2>/dev/null" "--database=$tmp/locatedb -e fred" {}

locate/testsuite/locate.gnu/oldformat.xo

@ -1 +0,0 @@
tmp/subdir/fred

locate/updatedb.1

@ -26,19 +26,13 @@ Users can select which databases \fBlocate\fP searches using an
environment variable or command line option; see \fBlocate\fP(1).
Databases cannot be concatenated together.
.P
The file name database format changed starting with GNU
.B find
and
The @samp{LOCATE02} database format was introduced in GNU findutils
version 4.0 in order to allow machines with different byte orderings
to share the databases. GNU
.B locate
version 4.0 to allow machines with different byte orderings to share
the databases. The new GNU
.B locate
can read both the old and new database formats.
However, old versions of
.B locate
and
.B find
produce incorrect results if given a new-format database.
can read both the old and @samp{LOCATE02} database formats, though
support for the old pre-4.0 database format will be removed shortly.
.SH OPTIONS
.TP
.B \-\-findoptions='\fI\-option1 \-option2...\fP'
@ -88,16 +82,8 @@ The user to search network directories as, using \fBsu\fP(1).
Default is \fBdaemon\fP.
You can also use the environment variable \fBNETUSER\fP to set this user.
.TP
.B \-\-old\-format
Create the database in the old format. This is a synonym for
.BR \-\-dbformat=old .
.TP
.B \-\-dbformat=F
Create the database in format F. The default format is called LOCATE02.
F can be
.B old
to select the old database format (this is the same as specifying
.BR \-\-old\-format ).
Alternatively the
.B slocate
format is also supported. When the

locate/updatedb.sh

@ -50,11 +50,11 @@ Usage: $0 [--findoptions='-option1 -option2...']
[--localpaths='dir1 dir2...'] [--netpaths='dir1 dir2...']
[--prunepaths='dir1 dir2...'] [--prunefs='fs1 fs2...']
[--output=dbfile] [--netuser=user] [--localuser=user]
[--old-format] [--dbformat] [--version] [--help]
[--dbformat] [--version] [--help]
Report bugs to <bug-findutils@gnu.org>."
changeto=/
old=no
for arg
do
# If we are unable to fork, the back-tick operator will
@ -72,7 +72,6 @@ do
--output) LOCATE_DB="$val" ;;
--netuser) NETUSER="$val" ;;
--localuser) LOCALUSER="$val" ;;
--old-format) old=yes ;;
--changecwd) changeto="$val" ;;
--dbformat) dbformat="$val" ;;
--version) fail=0; echo "$version" || fail=1; exit $fail ;;
@ -83,51 +82,32 @@ $usage" >&2
esac
done
case "${dbformat:+yes}_${old}" in
yes_yes)
echo "The --dbformat and --old-format cannot both be specified." >&2
exit 1
;;
*)
;;
frcode_options=""
case "$dbformat" in
"")
# Default, use LOCATE02
;;
LOCATE02)
;;
slocate)
frcode_options="$frcode_options -S 1"
;;
*)
# The "old" database format is no longer supported.
echo "Unsupported locate database format ${dbformat}: Supported formats are:" >&2
echo "LOCATE02, slocate" >&2
exit 1
esac
if test "$old" = yes || test "$dbformat" = "old" ; then
echo "Warning: future versions of findutils will shortly discontinue support for the old locate database format." >&2
old=yes
if @SORT_SUPPORTS_Z@
then
sort="@SORT@ -z"
print_option="-print0"
frcode_options="$frcode_options -0"
else
sort="@SORT@"
print_option="-print"
frcode_options=""
else
frcode_options=""
case "$dbformat" in
"")
# Default, use LOCATE02
;;
LOCATE02)
;;
slocate)
frcode_options="$frcode_options -S 1"
;;
*)
echo "Unsupported locate database format ${dbformat}: Supported formats are:" >&2
echo "LOCATE02, slocate, old" >&2
exit 1
esac
if @SORT_SUPPORTS_Z@
then
sort="@SORT@ -z"
print_option="-print0"
frcode_options="$frcode_options -0"
else
sort="@SORT@"
print_option="-print"
fi
fi
getuid() {
@ -230,8 +210,6 @@ fi
# The names of the utilities to run to build the database.
: ${find:=${BINDIR}/@find@}
: ${frcode:=${LIBEXECDIR}/@frcode@}
: ${bigram:=${LIBEXECDIR}/@bigram@}
: ${code:=${LIBEXECDIR}/@code@}
make_tempdir () {
# This implementation is adapted from the GNU Autoconf manual.
@ -263,7 +241,7 @@ checkbinary () {
fi
}
for binary in $find $frcode $bigram $code
for binary in $find $frcode
do
checkbinary $binary
done
@ -303,8 +281,6 @@ fi
rm -f $LOCATE_DB.n
trap 'rm -f $LOCATE_DB.n; exit' HUP TERM
if test $old = no; then
# LOCATE02 or slocate format
if {
cd "$changeto"
if test -n "$SEARCHPATHS"; then
@ -356,73 +332,4 @@ else
rm -f $LOCATE_DB.n
fi
else # old
if temp_directory="`make_tempdir`"; then
bigrams="${temp_directory}"/bigrams
filelist="${temp_directory}"/filelist
else
echo "failed to create temporary directory" >&2
exit 1
fi
rm -f $LOCATE_DB.n
trap 'rm -f $LOCATE_DB.n; rm -rf "${temp_directory}"; exit' HUP TERM
# Alphabetize subdirectories before file entries using tr. James Woods says:
# "to get everything in monotonic collating sequence, to avoid some
# breakage i'll have to think about."
{
cd "$changeto"
if test -n "$SEARCHPATHS"; then
if [ "$LOCALUSER" != "" ]; then
# : A5
su $LOCALUSER `select_shell $LOCALUSER` -c \
"$find $SEARCHPATHS $FINDOPTIONS \
\( $prunefs_exp \
-type d -regex '$PRUNEREGEX' \) -prune -o $print_option" || exit $?
else
# : A6
$find $SEARCHPATHS $FINDOPTIONS \
\( $prunefs_exp \
-type d -regex "$PRUNEREGEX" \) -prune -o $print_option || exit $?
fi
fi
if test -n "$NETPATHS"; then
myuid=`getuid`
if [ "$myuid" = 0 ]; then
# : A7
su $NETUSER `select_shell $NETUSER` -c \
"$find $NETPATHS $FINDOPTIONS \\( -type d -regex '$PRUNEREGEX' -prune \\) -o $print_option" ||
exit $?
else
# : A8
$find $NETPATHS $FINDOPTIONS \( -type d -regex "$PRUNEREGEX" -prune \) -o $print_option ||
exit $?
fi
fi
} | tr / '\001' | $sort | tr '\001' / > "$filelist"
# Compute the (at most 128) most common bigrams in the file list.
$bigram $bigram_opts < $filelist | sort | uniq -c | sort -nr |
awk '{ if (NR <= 128) print $2 }' | tr -d '\012' > "$bigrams"
# Code the file list.
$code "$bigrams" < "$filelist" > $LOCATE_DB.n
rm -rf "${temp_directory}"
# To reduce the chances of breaking locate while this script is running,
# put the results in a temp file, then rename it atomically.
if test -s $LOCATE_DB.n; then
chmod 644 ${LOCATE_DB}.n
mv ${LOCATE_DB}.n $LOCATE_DB
else
echo "updatedb: new database would be empty" >&2
rm -f $LOCATE_DB.n
fi
fi
exit 0
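
After this change, the --dbformat handling in updatedb.sh reduces to a single case statement in which "old" falls through to the error branch. A minimal standalone sketch of that logic follows, assuming a hypothetical function name select_frcode_options; unlike the script, it returns a status instead of calling exit, and it prints the frcode options instead of setting a variable.

```shell
# Hypothetical helper (not part of updatedb.sh): map a --dbformat value
# to the extra frcode options, rejecting anything unsupported.
select_frcode_options() {
    case "$1" in
        "" | LOCATE02)
            # Default format: no extra frcode options are needed.
            echo ""
            ;;
        slocate)
            echo "-S 1"
            ;;
        *)
            # Includes "old", which updatedb no longer produces.
            echo "Unsupported locate database format $1: Supported formats are:" >&2
            echo "LOCATE02, slocate" >&2
            return 1
            ;;
    esac
}
```

A caller could use it as `opts=$(select_frcode_options "$dbformat") || exit 1`, which keeps the script's behaviour of exiting with status 1 on an unsupported format.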

locate/word_io.c

@ -140,26 +140,3 @@ getword (FILE *fp,
return decode_value (data, maxvalue, endian_state_flag, filename);
}
}
bool
putword (FILE *fp, int word,
GetwordEndianState endian_state_flag)
{
size_t items_written;
/* You must decide before calling this function which
* endianness you want to use.
*/
assert (endian_state_flag != GetwordEndianStateInitial);
if (GetwordEndianStateSwab == endian_state_flag)
{
word = bswap_32(word);
}
items_written = fwrite (&word, sizeof (word), 1, fp);
if (1 == items_written)
return true;
else
return false;
}

po/POTFILES.in

@ -22,8 +22,6 @@ lib/findutils-version.c
lib/listfile.c
lib/regextype.c
lib/safe-atoi.c
locate/bigram.c
locate/code.c
locate/frcode.c
locate/locate.c
locate/word_io.c