From 81c1ec3dca0cc5439575e27b4a85e740bba70630 Mon Sep 17 00:00:00 2001
From: James Youngman
Date: Mon, 27 May 2024 19:27:02 +0100
Subject: [PATCH] find: by default, disable the cost-based optimiser.

The cost-based optimiser re-orders predicates based on their expected
cost. This re-ordering (as currently implemented) results in
user-visible changes to the order of operations. An optimiser should
not do that.

For example, "-empty -readable" and "-readable -empty" don't actually
have the same effect since "-empty" fails on an unreadable directory.

This fixes savannah bug #58427 (unless the user specifies -O2).

* find/util.c(set_option_defaults): set default optimisation level
to 1 instead of 2.
* find/tree.c(build_expression_tree): call do_arm_swaps (i.e. apply
cost-based optimisations) only at optimisation level 2 and above.
* find/find.1(-O): explain this change.
* doc/find.texi(Optimisation Options): explain this change.
* NEWS: mention this change.
---
 NEWS          |  5 +++++
 doc/find.texi | 20 ++++++++++++++------
 find/find.1   | 13 +++++++++++++
 find/tree.c   |  5 ++++-
 find/util.c   |  2 +-
 5 files changed, 37 insertions(+), 8 deletions(-)

diff --git a/NEWS b/NEWS
index 1ef56c5a..2594c916 100644
--- a/NEWS
+++ b/NEWS
@@ -3,6 +3,11 @@ GNU findutils NEWS - User visible changes. -*- outline -*- (allout)
 * Noteworthy changes in release ?.? (????-??-??) [?]
 
 ** Bug Fixes
+  Find now defaults to optimisation level 1 rather than 2 and the
+  cost-based optimiser will only run at level 2 and above. This
+  should prevent changes of operation order which result in
+  user-visible differences in behaviour.
+
   If the -P option to xargs is not used, xargs will not change the way
   in which the SIGUSR1 and SIGUSR2 signals are handled. This means that
   they will cause the program to terminate if the signals were not
diff --git a/doc/find.texi b/doc/find.texi
index b3896958..82295180 100644
--- a/doc/find.texi
+++ b/doc/find.texi
@@ -3526,12 +3526,13 @@ Use of an unrecognised formatting directive with @samp{-fprintf}
 The @samp{-O@var{level}} option sets @code{find}'s optimisation level
 to @var{level}. The default optimisation level is 1.
 
-At certain optimisation levels, @code{find} reorders tests to speed up
-execution while preserving the overall effect; that is, predicates
-with side effects are not reordered relative to each other. The
-optimisations performed at each optimisation level are as follows.
+At certain optimisation levels (but not by default), @code{find}
+reorders tests to speed up execution while preserving the overall
+effect; that is, predicates with side effects are not reordered
+relative to each other. The optimisations performed at each
+optimisation level are as follows.
 
-@table @samp
+@table @asis
 @item 0
 Currently equivalent to optimisation level 1.
 
@@ -3553,7 +3554,6 @@ type @samp{FOO} which is not known (that is, present in
 @file{/etc/mtab}) at the time @code{find} starts, that predicate is
 equivalent to @samp{-false}.
 
-
 @item 3
 At this optimisation level, the full cost-based query optimiser is
 enabled. The order of tests is modified so that cheap (i.e., fast)
@@ -3565,6 +3565,14 @@ earlier, and for @samp{-a}, predicates which are likely to fail are
 evaluated earlier.
 @end table
 
+The re-ordering of operations performed by the cost-based optimiser
+can result in user-visible behaviour change. For example, the
+@samp{-readable} and @samp{-empty} predicates are sensitive to
+re-ordering. If they are run in the order @samp{-empty -readable}, an
+error message will be issued for unreadable directories. If they are
+run in the order @samp{-readable -empty}, no error message will be
+issued. This is the reason why such operation re-ordering is not
+performed at the default optimisation level.
 
 @node Debug Options
 @subsection Debug Options
diff --git a/find/find.1 b/find/find.1
index 3b15909d..e488bbe0 100644
--- a/find/find.1
+++ b/find/find.1
@@ -325,6 +325,19 @@ level 1) will not be changed in the 4.3.x release series.
 The findutils test suite runs all the tests on
 .B find
 at each optimisation level and ensures that the result is the same.
+
+The re-ordering of operations performed by the cost-based optimiser
+can result in user-visible behaviour change. For example, the
+.B \-readable
+and
+.B \-empty
+predicates are sensitive to re-ordering. If they are run in the order
+.BR "\-empty \-readable" ,
+an error message will be issued for unreadable directories. If they
+are run in the order
+.B \-readable \-empty
+no error message will be issued. This is the reason why such operation
+re-ordering is not performed at the default optimisation level.
 .
 .SH EXPRESSION
 The part of the command line after the list of starting points is the
diff --git a/find/tree.c b/find/tree.c
index ba85107b..7f105bd9 100644
--- a/find/tree.c
+++ b/find/tree.c
@@ -1430,7 +1430,10 @@ build_expression_tree (int argc, char *argv[], int end_of_leading_options)
   /* Check that the tree is in normalised order (opt_expr does this) */
   check_normalization (eval_tree, true);
 
-  do_arm_swaps (eval_tree);
+  if (options.optimisation_level > 1)
+    {
+      do_arm_swaps (eval_tree);
+    }
 
   /* Check that the tree is still in normalised order */
   check_normalization (eval_tree, true);
diff --git a/find/util.c b/find/util.c
index 3445db14..88dafd7b 100644
--- a/find/util.c
+++ b/find/util.c
@@ -1035,7 +1035,7 @@ set_option_defaults (struct options *p)
   p->output_block_size = 1024;
 
   p->debug_options = 0uL;
-  p->optimisation_level = 2;
+  p->optimisation_level = 1;
 
   if (getenv ("FIND_BLOCK_SIZE"))
     {