find: by default, disable the cost-based optimiser.

The cost-based optimiser re-orders predicates based on their expected
cost.  This re-ordering (as currently implemented) results in
user-visible changes to the order of operations.  An optimiser should
not do that.  For example, "-empty -readable" and "-readable -empty"
don't actually have the same effect since "-empty" fails on an
unreadable directory.  This fixes savannah bug #58427 (unless the user
specifies -O2).

* find/util.c(set_option_defaults): set default optimisation level to
1 instead of 2.
* find/tree.c(build_expression_tree): call do_arm_swaps (i.e. apply
cost-based optimisations) only at optimisation level 2 and above.
* find/find.1(-O): explain this change.
* doc/find.texi(Optimisation Options): explain this change.
* NEWS: mention this change.
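
The ordering hazard described above can be illustrated with a small
model.  This is only a sketch, not findutils code: struct model_file,
model_readable, model_empty and errors_for_order are hypothetical
names, and the model assumes (as the message above states) that
"-empty" fails with a diagnostic on an unreadable directory while
"-readable" has no side effects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a directory; not a findutils type. */
struct model_file
{
  bool readable;   /* can the directory be opened for reading? */
  bool empty;      /* has no entries (only knowable if readable) */
};

/* Counts the diagnostics the model predicates would have issued. */
static int error_count = 0;

/* Models -readable: a pure test with no side effects. */
static bool
model_readable (const struct model_file *f)
{
  return f->readable;
}

/* Models -empty: the directory has to be opened to inspect it, so an
   unreadable directory yields a diagnostic and the test fails. */
static bool
model_empty (const struct model_file *f)
{
  if (!f->readable)
    {
      ++error_count;
      return false;
    }
  return f->empty;
}

typedef bool (*model_pred) (const struct model_file *);

/* Evaluates "first -a second" with short-circuiting and returns the
   number of diagnostics issued during that evaluation. */
static int
errors_for_order (model_pred first, model_pred second,
                  const struct model_file *f)
{
  assert (f != NULL);
  error_count = 0;
  (void) (first (f) && second (f));
  return error_count;
}
```

For an unreadable directory, errors_for_order (model_empty,
model_readable, ...) reports one diagnostic while errors_for_order
(model_readable, model_empty, ...) reports none, because the
short-circuit never reaches model_empty.  Both orders yield the same
truth value, but only one is silent, which is why swapping the two
tests is visible to the user.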
James Youngman 2024-05-27 19:27:02 +01:00
parent b12fb8c216
commit 81c1ec3dca
5 changed files with 37 additions and 8 deletions

NEWS

@@ -3,6 +3,11 @@ GNU findutils NEWS - User visible changes. -*- outline -*- (allout)
 * Noteworthy changes in release ?.? (????-??-??) [?]
 ** Bug Fixes
+Find now defaults to optimisation level 1 rather than 2 and the
+cost-based optimiser will only run at level 2 and above. This
+should prevent changes of operation order which result in
+user-visible differences in behaviour.
 If the -P option to xargs is not used, xargs will not change the way
 in which the SIGUSR1 and SIGUSR2 signals are handled. This means that
 they will cause the program to terminate if the signals were not

doc/find.texi

@@ -3526,12 +3526,13 @@ Use of an unrecognised formatting directive with @samp{-fprintf}
 The @samp{-O@var{level}} option sets @code{find}'s optimisation level
 to @var{level}. The default optimisation level is 1.
-At certain optimisation levels, @code{find} reorders tests to speed up
-execution while preserving the overall effect; that is, predicates
-with side effects are not reordered relative to each other. The
-optimisations performed at each optimisation level are as follows.
+At certain optimisation levels (but not by default), @code{find}
+reorders tests to speed up execution while preserving the overall
+effect; that is, predicates with side effects are not reordered
+relative to each other. The optimisations performed at each
+optimisation level are as follows.
-@table @samp
+@table @asis
 @item 0
 Currently equivalent to optimisation level 1.
@@ -3553,7 +3554,6 @@ type @samp{FOO} which is not known (that is, present in
 @file{/etc/mtab}) at the time @code{find} starts, that predicate is
 equivalent to @samp{-false}.
 @item 3
 At this optimisation level, the full cost-based query optimiser is
 enabled. The order of tests is modified so that cheap (i.e., fast)
@@ -3565,6 +3565,14 @@ earlier, and for @samp{-a}, predicates which are likely to fail are
 evaluated earlier.
 @end table
+The re-ordering of operations performed by the cost-based optimiser
+can result in user-visible behaviour change. For example, the
+@samp{-readable} and @samp{-empty} predicates are sensitive to
+re-ordering. If they are run in the order @samp{-empty -readable}, an
+error message will be issued for unreadable directories. If they are
+run in the order @samp{-readable -empty}, no error message will be
+issued. This is the reason why such operation re-ordering is not
+performed at the default optimisation level.
 @node Debug Options
 @subsection Debug Options

find/find.1

@@ -325,6 +325,19 @@ level 1) will not be changed in the 4.3.x release series. The
 findutils test suite runs all the tests on
 .B find
 at each optimisation level and ensures that the result is the same.
+The re-ordering of operations performed by the cost-based optimiser
+can result in user-visible behaviour change. For example, the
+.B \-readable
+and
+.B \-empty
+predicates are sensitive to re-ordering. If they are run in the order
+.BR "\-empty \-readable" ,
+an error message will be issued for unreadable directories. If they
+are run in the order
+.B \-readable \-empty
+no error message will be issued. This is the reason why such operation
+re-ordering is not performed at the default optimisation level.
 .
 .SH EXPRESSION
 The part of the command line after the list of starting points is the

find/tree.c

@@ -1430,7 +1430,10 @@ build_expression_tree (int argc, char *argv[], int end_of_leading_options)
   /* Check that the tree is in normalised order (opt_expr does this) */
   check_normalization (eval_tree, true);
-  do_arm_swaps (eval_tree);
+  if (options.optimisation_level > 1)
+    {
+      do_arm_swaps (eval_tree);
+    }
   /* Check that the tree is still in normalised order */
   check_normalization (eval_tree, true);

find/util.c

@@ -1035,7 +1035,7 @@ set_option_defaults (struct options *p)
   p->output_block_size = 1024;
   p->debug_options = 0uL;
-  p->optimisation_level = 2;
+  p->optimisation_level = 1;
   if (getenv ("FIND_BLOCK_SIZE"))
     {