also update bootstrap from gnulib

maint: update copyright dates
build: update gnulib to latest
2026-01-28 02:14:44 +00:00 · 2026-01-02 16:50:40 -08:00 · 2026-01-02 16:42:12 -08:00 · 2026-01-02 16:42:12 -08:00 · 2025-11-12 14:11:09 -08:00 · 2025-11-12 14:11:09 -08:00
154 changed files with 5766 additions and 4206 deletions
--- a/.gitignore
+++ b/.gitignore
@ -52,6 +52,7 @@
 /tests/cspatfile
 /tests/ere.script
 /tests/get-mb-cur-max
+/tests/init.sh
 /tests/khadafy.out
 /tests/patfile
 /tests/spencer1.script
--- a/.gitmodules
+++ b/.gitmodules
@ -1,3 +1,3 @@
 [submodule "gnulib"]
        path = gnulib
-        url = git://git.sv.gnu.org/gnulib.git
+        url = https://git.savannah.gnu.org/git/gnulib
--- a/.prev-version
+++ b/.prev-version
@ -1 +1 @@
-3.2
+3.12
--- a/41
+++ b/41
@ -1,4 +1,4 @@
-  Copyright (C) 1992, 1997-2002, 2004-2018 Free Software Foundation, Inc.
+  Copyright (C) 1992, 1997-2002, 2004-2026 Free Software Foundation, Inc.

  Copying and distribution of this file, with or without modification,
  are permitted in any medium without royalty provided the copyright
@ -6,16 +6,20 @@

 Mike Haertel wrote the main program and the dfa and kwset matchers.

+Isamu Hasegawa wrote the POSIX regular expression matcher, which is
+part of the GNU C Library and is distributed as part of GNU grep for
+use on non-GNU systems.  Ulrich Drepper, Paul Eggert, Paolo Bonzini,
+Stanislav Brabec, Assaf Gordon, Jakub Jelinek, Jim Meyering, Arnold
+Robbins, Andreas Schwab and Florian Weimer also contributed to this
+matcher.
+
 Arthur David Olson contributed the heuristics for finding fixed substrings
 at the end of dfa.c.

-Richard Stallman and Karl Berry wrote the regex backtracking matcher.
-
 Henry Spencer wrote the original test suite from which grep's was derived.
-
 Scott Anderson invented the Khadafy test.

-David MacKenzie wrote the automatic configuration software use to
+David MacKenzie wrote the automatic configuration software used to
 produce the configure script.

 Authors of the replacements for standard library routines are identified
@ -26,23 +30,26 @@ non-matching text before calling the regexp matcher was originally due
 to James Woods.  He also contributed some code to early versions of
 GNU grep.

-Mike Haertel would like to thank Andrew Hume for many fascinating discussions
-of string searching issues over the years.  Hume & Sunday's excellent
-paper on fast string searching (AT&T Bell Laboratories CSTR #156)
-describes some of the history of the subject, as well as providing
-exhaustive performance analysis of various implementation alternatives.
+Mike Haertel would like to thank Andrew Hume for many fascinating
+discussions of string searching issues over the years.  Hume and
+Sunday's excellent paper on fast string searching describes some of
+the history of the subject, as well as providing exhaustive
+performance analysis of various implementation alternatives.
 The inner loop of GNU grep is similar to Hume & Sunday's recommended
-"Tuned Boyer Moore" inner loop.
-
-More work was done on regex.[ch] by Ulrich Drepper and Arnold
-Robbins. Regex is now part of GNU C library, see this package
-for complete details and credits.
+"Tuned Boyer Moore" inner loop (see the Hume & Sunday citation in
+the grep manual's "Performance" chapter).

 Arnold Robbins contributed to improve dfa.[ch]. In fact
 it came straight from gawk-3.0.3 with small editing and fixes.

-Many folks contributed.  See THANKS; if I omitted someone please
-send me email.
+Norihiro Tanaka contributed many performance improvements and other
+fixes, particularly to multi-byte matchers.
+
+Paul Eggert contributed support for recursive grep, as well as several
+performance improvements such as searching file holes efficiently.
+
+Many other folks contributed.  See THANKS; if someone is omitted
+please file a bug report.

 Alain Magloire maintained GNU grep until version 2.5e.

--- a/38
+++ b/38
@ -1407,7 +1407,7 @@
        is put in different compiled structure patterns[]. The patterns
        are given to dfacomp() and kwsmusts() as is.
        (Ecompile): Likewised.
-        (Fcompile): Reverse to the old behaviour of compiling the enire
+        (Fcompile): Reverse to the old behaviour of compiling the entire
        patterns in one shot.
        (EGexecute): If falling to GNU regex for the matching, loop in the
        array of compile patterns[] to find a match.
@ -1457,7 +1457,7 @@
        (xrealloc): Removed using lib/xmalloc.c.
        (xmalloc): Removed using lib/xmalloc.c
        (main): Register with atexit() to check for error on stdout.
-        * configure.in: Check for atexit(), call jm_MALLOC, jm_RELLOC and
+        * configure.in: Check for atexit(), call jm_MALLOC, jm_REALLOC and
        jm_PREREQ_ERROR.
        * tests/bre.awk: Removed the hack to drain the buffer since we
        always fclose(stdout) atexit.
@ -1541,7 +1541,7 @@
        * src/exclude.h: New file.
        * src/grep.c (main): Took the GNU tar code to handle
        the option --include, --exclude, --exclude-from.
-        Files are check for a match, with exlude_filename ().
+        Files are checked for a match, with exclude_filename ().
        New option --exclude-from.
        * src/savedir.c: Call exclude_filename() to check for
        file pattern exclusion or inclusion.
@ -1592,7 +1592,7 @@

        * m4/dosfile.m4 (AC_DOSFILE): Move AC_DEFINEs out of AC_CACHE_CHECK.

-2001-02-17  Alain Malgoire
+2001-02-17  Alain Magloire

        * doc/grep.texi: Document the new options and the new behaviour
        back-references are local.  Use excerpt from Karl Berry regex
@ -1699,8 +1699,8 @@
        (color): Rename color variable to color_option.
        Removed 'always|never|auto' arguments, not necessary for grep.
        (exclude_pattern): new variable, holder for the file pattern.
-        (include_pattern): new variable, hoder for the file pattern.
-        * src/savedir.c: Signature change, take two new argmuments.
+        (include_pattern): new variable, holder for the file pattern.
+        * src/savedir.c: Signature change, take two new arguments.
        * doc/grep.texi: Document, new options.
        * doc/grep.man: Document, new options.

@ -1712,7 +1712,7 @@

 2001-02-09  Alain Magloire

-        Patch from Ulrich Drepper to provide hilighting.
+        Patch from Ulrich Drepper to provide highlighting.

        * src/grep.c: New option --color.
        (color): New static var.
@ -1722,7 +1722,7 @@
        to find the offset of the matching string.
        * src/savedir.c: Take advantage of _DIRENT_HAVE_TYPE if supported.
        * src/search.c (EGexecute, Fexecute, Pexecute): Take a new argument
-        when doing exact match for the color hiligting.
+        when doing exact match for the color highlighting.

 2000-09-01  Brian Youmans

@ -1792,7 +1792,7 @@

 2000-06-02  Paul Eggert

-        Problen noted by Gerald Stoller <gerald_stoller@hotmail.com>
+        Problem noted by Gerald Stoller <gerald_stoller@hotmail.com>

        * src/grep.c (main): POSIX says that -q overrides -l, which
        in turn overrides the other output options.  Fix grep to
@ -2208,7 +2208,7 @@
        on pre-OpenVMS 7.x systems; general overhaul.
        * src/getpagesize.h: Reinstate support for different pagesizes on
        VAX and Alpha. Work around problem with DEC C compiler.
-        * src/vms_fab.c: Cast to some assigments; fixed typo argcp vs. argp.
+        * src/vms_fab.c: Cast to some assignments; fixed typo argcp vs. argp.
        * src/vms_fab.h: Added new include files to avoid warnings about
        undefined function prototypes.
        Those patches were provided by Martin P.J. Zinser (zinser@decus.de).
@ -2670,7 +2670,7 @@

 1999-03-16 Volker Borchert

-        * configure.in: Use case case ... esac for  checking Visual C++.
+        * configure.in: Use case ... esac for  checking Visual C++.
        When ${CC} contains options it was not recognize.

 1999-03-07 Paul Eggert
@ -2764,7 +2764,7 @@

 1999-02-10 Alain Magloire

-        * bootstrap/{Makefile{try,am},REAMDE} : skeleton
+        * bootstrap/{Makefile{try,am},README} : skeleton
        provided for system lacking the tools to autoconfigure.

        * src/{e,f,}grepmat.c: added guard [HAVE_CONFIG_H]
@ -2858,7 +2858,7 @@
        * doc/Makefile.am djgpp/Makefile.am m4/Makefile.am vms/Makefile.am:
        New files.

-        * m4/progtest.m4: proctect '[]' from m4.
+        * m4/progtest.m4: protect '[]' from m4.
        Noted by Eli Z.

        * PATCHES-AC: New file, add the patch for autoconf in the dist.
@ -3333,7 +3333,7 @@
        Suggested by Harald Hanche-Olsen.

        * src/grep.c (main): '-f /dev/null' now specifies no patterns
-        and therfore matches nothing.
+        and therefore matches nothing.
        Reported by Jorge Stolfi.
        Patched by Paul Eggert.

@ -3368,7 +3368,7 @@
        * src/grep.c: reverse back to greping directories,
        One could skip the error message by defining
        SKIP_DIR_ERROR. There is no clear way of doing
-        things, I hope to setle this on the next majore release
+        things, I hope to settle this on the next major release
        Thanks Paul Eggert, Eli Zaretskii and gnits for the
        exchange.

@ -3427,7 +3427,7 @@
        (setmatcher) [HAVE_SETRLIMIT]: Set re_max_failures so that the
        matcher won't ever overflow the stack.
        (main) [__MSDOS__, _WIN32]: Handle backslashes and drive letters
-        in argv[0], remove the .exe suffix, and downcase the prgram name.
+        in argv[0], remove the .exe suffix, and downcase the program name.
        [O_BINARY]: Pass additional DOS-specific options to getopt_long
        and handle them.  Call stat before attempting to open the file, in
        case it is a directory (DOS will fail the open call for
@ -3497,7 +3497,7 @@
        regex package. Change the way the tests were done to be more
        conformant to automake.

-        * configure.in: added --disable-regex for folks with their own fuctions.
+        * configure.in: added --disable-regex for folks with their own functions.

        * grep-20d : available for testing

@ -3551,7 +3551,7 @@

        * check.sh, scriptgen.awk: fix grep paths.

-        * change the directory strucure: grep is now in src to comply with
+        * change the directory structure: grep is now in src to comply with
        gettext.m4.

        * grep.c version.c [VERSION]: got rid of version.c,
@ -3648,6 +3648,6 @@

        * Version 2.0 released.

-Copyright (C) 1998-2018 Free Software Foundation, Inc.
+Copyright (C) 1998-2026 Free Software Foundation, Inc.
 Copying and distribution of this file, with or without modification,
  are permitted provided the copyright notice and this notice are preserved.
--- a/8
+++ b/8
@ -12,7 +12,7 @@ Use the latest upstream sources
 Base any changes you make on the latest upstream sources.
 You can get a copy of the latest with this command:

-    git clone git://git.sv.gnu.org/grep
+    git clone https://git.savannah.gnu.org/git/grep

 That downloads the entire repository, including revision control history.
 Once downloaded, you can get incremental updates by running one of
@ -83,7 +83,7 @@ Make your changes on a private "topic" branch
 =============================================
 So you checked out grep like this:

-  git clone git://git.sv.gnu.org/grep
+  git clone https://git.savannah.gnu.org/git/grep

 Now, cd into the grep/ directory and run:

@ -468,7 +468,7 @@ you'd use doc/Copyright/request-assign.future:
    https://www.gnu.org/software/gnulib/Copyright/request-assign.future

 You may make assignments for up to four projects at a time.
-[
+
 In case you're wondering why we bother with all of this, read this:

    https://www.gnu.org/licenses/why-assign.html
@ -597,7 +597,7 @@ Then just open the index.html file (in the generated lcov-html directory)
 in your favorite web browser.

 ========================================================================
-Copyright (C) 2009-2018 Free Software Foundation, Inc.
+Copyright (C) 2009-2026 Free Software Foundation, Inc.

 Permission is granted to copy, distribute and/or modify this document
 under the terms of the GNU Free Documentation License, Version 1.3 or
--- a/Makefile.am
+++ b/Makefile.am
@ -1,6 +1,6 @@
 # Process this file with automake to create Makefile.in
 #
-# Copyright 1997-1998, 2005-2018 Free Software Foundation, Inc.
+# Copyright 1997-1998, 2005-2026 Free Software Foundation, Inc.
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@ -66,13 +66,10 @@ gen-ChangeLog:
 # current locale considers to be equal.
 ASSORT = LC_ALL=C sort

-# Extract all lines up to the first one starting with "##".
-prologue = perl -ne '/^\#\#/ and exit; print' $(srcdir)/THANKS.in
-
 THANKS: THANKS.in Makefile.am .mailmap thanks-gen
 	$(AM_V_GEN)rm -f $@-t $@;					\
 	{								\
-	  $(prologue); echo;						\
+	  perl -ne '/^\#\#/ and exit; print' $(srcdir)/THANKS.in; echo;	\
 	  { perl -ne '/^$$/.../^$$/ and !/^$$/ and s/  +/\0/ and print'	\
 	      $(srcdir)/THANKS.in;					\
 	    git log --pretty=format:'%aN%x00%aE'			\
--- a/283
+++ b/283
@ -1,5 +1,274 @@
 GNU grep NEWS                                    -*- outline -*-

+* Noteworthy changes in release ?.? (????-??-??) [?]
+
+
+* Noteworthy changes in release 3.12 (2025-04-10) [stable]
+
+** Bug fixes
+
+  Searching a directory with at least 100,000 entries no longer fails
+  with "Operation not supported" and exit status 2. Now, this prints 1
+  and no diagnostic, as expected:
+    $ mkdir t && cd t && seq 100000|xargs touch && grep -r x .; echo $?
+    1
+  [bug introduced in grep 3.11]
+
+  -mN where 1 < N no longer mistakenly lseeks to end of input merely
+  because standard output is /dev/null.
+
+** Changes in behavior
+
+  The --unix-byte-offsets (-u) option is gone. In grep-3.7 (2021-08-14)
+  it became a warning-only no-op. Before then, it was a Windows-only no-op.
+
+  On Windows platforms and on AIX in 32-bit mode, grep in some cases
+  now supports Unicode characters outside the Basic Multilingual Plane.
+
+
+* Noteworthy changes in release 3.11 (2023-05-13) [stable]
+
+** Bug fixes
+
+  With -P, patterns like [\d] now work again.  Fixing this has caused
+  grep to revert to the behavior of grep 3.8, in that patterns like \w
+  and \b go back to using ASCII rather than Unicode interpretations.
+  However, future versions of GNU grep and/or PCRE2 are likely to fix
+  this and change the behavior of \w and \b back to Unicode again,
+  without breaking [\d] as 3.10 did.
+  [bug introduced in grep 3.10]
+
+  grep no longer fails on files dated after the year 2038,
+  when running on 32-bit x86 and ARM hosts using glibc 2.34+.
+  [bug introduced in grep 3.9]
+
+  grep -P no longer fails to match patterns using negated classes
+  like \D or \W when linked with PCRE2 10.34 or newer.
+  [bug introduced in grep 3.8]
+
+
+** Changes in behavior
+
+  grep --version now prints a line describing the version of PCRE2 it uses.
+  For example, it prints this when built with the very latest from git:
+    grep -P uses PCRE2 10.43-DEV 2023-04-14
+  or this with what's currently available in Fedora 37:
+    grep -P uses PCRE2 10.40 2022-04-14
+
+  Previous versions of grep wouldn't respect the user-provided settings for
+  PCRE_CFLAGS and PCRE_LIBS when building if a libpcre2-8 pkg-config module
+  was found.
+
+
+* Noteworthy changes in release 3.10 (2023-03-22) [stable]
+
+** Bug fixes
+
+  With -P, \d now matches only ASCII digits, regardless of PCRE
+  options/modes. The changes in grep-3.9 to make \b and \w work
+  properly had the undesirable side effect of making \d also match
+  e.g., the Arabic digits: ٠١٢٣٤٥٦٧٨٩.  With grep-3.9, -P '\d+'
+  would match that ten-digit (20-byte) string. Now, to match such
+  a digit, you would use \p{Nd}. Similarly, \D is now mapped to [^0-9].
+  [bug introduced in grep 3.9]
+
+
+* Noteworthy changes in release 3.9 (2023-03-05) [stable]
+
+** Bug fixes
+
+  With -P, some non-ASCII UTF-8 characters were not recognized as
+  word-constituent due to our omission of the PCRE2_UCP flag. E.g.,
+  given f(){ echo Perú|LC_ALL=en_US.UTF-8 grep -Po "$1"; } and
+  this command, echo $(f 'r\w'):$(f '.\b'), before it would print ":r".
+  After the fix, it prints the correct results: "rú:ú".
+
+  When given multiple patterns the last of which has a back-reference,
+  grep no longer mistakenly matches lines in some cases.
+  [Bug#36148#13 introduced in grep 3.4]
+
+
+* Noteworthy changes in release 3.8 (2022-09-02) [stable]
+
+** Changes in behavior
+
+  The -P option is now based on PCRE2 instead of the older PCRE,
+  thanks to code contributed by Carlo Arenas.
+
+  The egrep and fgrep commands, which have been deprecated since
+  release 2.5.3 (2007), now warn that they are obsolescent and should
+  be replaced by grep -E and grep -F.
+
+  The confusing GREP_COLOR environment variable is now obsolescent.
+  Instead of GREP_COLOR='xxx', use GREP_COLORS='mt=xxx'.  grep now
+  warns if GREP_COLOR is used and is not overridden by GREP_COLORS.
+  Also, grep now treats GREP_COLOR like GREP_COLORS by silently
+  ignoring it if it attempts to inject ANSI terminal escapes.
+
+  Regular expressions with stray backslashes now cause warnings, as
+  their unspecified behavior can lead to unexpected results.
+  For example, '\a' and 'a' are not always equivalent
+  <https://bugs.gnu.org/39678>.  Similarly, regular expressions or
+  subexpressions that start with a repetition operator now also cause
+  warnings due to their unspecified behavior; for example, *a(+b|{1}c)
+  now has three reasons to warn.  The warnings are intended as a
+  transition aid; they are likely to be errors in future releases.
+
+  Regular expressions like [:space:] are now errors even if
+  POSIXLY_CORRECT is set, since POSIX now allows the GNU behavior.
+
+** Bug fixes
+
+  In locales using UTF-8 encoding, the regular expression '.' no
+  longer sometimes fails to match Unicode characters U+D400 through
+  U+D7FF (some Hangul Syllables, and Hangul Jamo Extended-B) and
+  Unicode characters U+108000 through U+10FFFF (half of Supplemental
+  Private Use Area plane B).
+  [bug introduced in grep 3.4]
+
+  The -s option no longer suppresses "binary file matches" messages.
+  [Bug#51860 introduced in grep 3.5]
+
+** Documentation improvements
+
+  The manual now covers unspecified behavior in patterns like \x, (+),
+  and range expressions outside the POSIX locale.
+
+
+* Noteworthy changes in release 3.7 (2021-08-14) [stable]
+
+** Changes in behavior
+
+  Use of the --unix-byte-offsets (-u) option now evokes a warning.
+  Since 3.1, this Windows-only option has had no effect.
+
+** Bug fixes
+
+  Preprocessing N patterns would take at least O(N^2) time when too many
+  patterns hashed to too few buckets. This now takes seconds, not days:
+  : | grep -Ff <(seq 6400000 | tr 0-9 A-J)
+  [Bug#44754 introduced in grep 3.5]
+
+
+* Noteworthy changes in release 3.6 (2020-11-08) [stable]
+
+** Changes in behavior
+
+  The GREP_OPTIONS environment variable no longer affects grep's behavior.
+  The variable was declared obsolescent in grep 2.21 (2014), and since
+  then any use had caused grep to issue a diagnostic.
+
+** Bug fixes
+
+  grep's DFA matcher performed an invalid regex transformation
+  that would convert an ERE like a+a+a+ to a+a+, which would make
+  grep a+a+a+ mistakenly match "aa".
+  [Bug#44351 introduced in grep 3.2]
+
+  grep -P now reports the troublesome input filename upon PCRE execution
+  failure.  Before, searching many files for something rare might fail with
+  just "exceeded PCRE's backtracking limit".  Now, it also reports which file
+  triggered the failure.
+
+
+* Noteworthy changes in release 3.5 (2020-09-27) [stable]
+
+** Changes in behavior
+
+  The message that a binary file matches is now sent to standard error
+  and the message has been reworded from "Binary file FOO matches" to
+  "grep: FOO: binary file matches", to avoid confusion with ordinary
+  output or when file names contain spaces and the like, and to be
+  more consistent with other diagnostics.  For example, commands
+  like 'grep PATTERN FILE | wc' no longer add 1 to the count of
+  matching text lines due to the presence of the message.  Like other
+  stderr messages, the message is now omitted if the --no-messages
+  (-s) option is given.
+
+  Two other stderr messages now use the typical form too.  They are
+  now "grep: FOO: warning: recursive directory loop" and "grep: FOO:
+  input file is also the output".
+
+  The --files-without-match (-L) option has reverted to its behavior
+  in grep 3.1 and earlier.  That is, grep -L again succeeds when a
+  line is selected, not when a file is listed.  The behavior in grep
+  3.2 through 3.4 was causing compatibility problems.
+
+** Bug fixes
+
+  grep -I no longer issues a spurious "Binary file FOO matches" line.
+  [Bug#33552 introduced in grep 2.23]
+
+  In UTF-8 locales, grep -w no longer ignores a multibyte word
+  constituent just before what would otherwise be a word match.
+  [Bug#43225 introduced in grep 2.28]
+
+  grep -i no longer mishandles ASCII characters that match multibyte
+  characters.  For example, 'LC_ALL=tr_TR.utf8 grep -i i' no longer
+  dumps core merely because 'i' matches 'İ' (U+0130 LATIN CAPITAL
+  LETTER I WITH DOT ABOVE) in Turkish when ignoring case.
+  [Bug#43577 introduced partly in grep 2.28 and partly in grep 3.4]
+
+  A performance regression with -E and many patterns has been mostly fixed.
+  "Mostly" as there is a performance tradeoff between Bug#22357 and Bug#40634.
+  [Bug#40634 introduced in grep 2.28]
+
+  A performance regression with many duplicate patterns has been fixed.
+  [Bug#43040 introduced in grep 3.4]
+
+  An N^2 RSS performance regression with many patterns has been fixed
+  in common cases (no backref, and no use of -o or --color).
+  With only 80,000 lines of /usr/share/dict/linux.words, the following
+  would use 100GB of RSS and take 3 minutes. With the fix, it used less
+  than 400MB and took less than one second:
+    head -80000 /usr/share/dict/linux.words > w; grep -vf w w
+  [Bug#43527 introduced in grep 3.4]
+
+** Build-related
+
+  "make dist" builds .tar.gz files again, as they are still used in
+  some barebones builds.
+
+
+* Noteworthy changes in release 3.4 (2020-01-02) [stable]
+
+** New features
+
+  The new --no-ignore-case option causes grep to observe case
+  distinctions, overriding any previous -i (--ignore-case) option.
+
+** Bug fixes
+
+  '.' no longer matches some invalid byte sequences in UTF-8 locales.
+  [bug introduced in grep 2.7]
+
+  grep -Fw no longer yields false matches in non-UTF-8 multibyte locales.
+  For example, this command would erroneously print its input line:
+    echo ab | LC_CTYPE=ja_JP.eucjp grep -Fw b
+  [Bug#38223 introduced in grep 2.28]
+
+  The exit status of 'grep -L' is no longer incorrect when standard
+  output is /dev/null.
+  [Bug#37716 introduced in grep 3.2]
+
+  A performance bug has been fixed when grep is given many patterns,
+  each with no back-reference.
+  [Bug#33249 introduced in grep 2.5]
+
+  A performance bug has been fixed for patterns like '01.2' that
+  cause grep to reorder tokens internally.
+  [Bug#34951 introduced in grep 3.2]
+
+** Build-related
+
+  The build procedure no longer relies on any already-built src/grep
+  that might be absent or broken.  Instead, it uses the system 'grep'
+  to bootstrap, and uses src/grep only to test the build.  On Solaris
+  /usr/bin/grep is broken, but you can install GNU or XPG4 'grep' from
+  the standard Solaris distribution before building GNU Grep yourself.
+  [bug introduced in grep 2.8]
+
+
 * Noteworthy changes in release 3.3 (2018-12-20) [stable]

 ** Bug fixes
@ -8,9 +277,9 @@ GNU grep NEWS                                    -*- outline -*-
  the following would print nothing (it should print the input line):
    echo 123-x|LC_ALL=C grep '.\bx'
  Using a multibyte locale, using certain regexp constructs (some ranges,
-  backreferences), or forcing use of the PCRE matcher via --perl-regexp (-P)
+  back-references), or forcing use of the PCRE matcher via --perl-regexp (-P)
  would avoid the bug.
-  [bug introduced in grep 2.3]
+  [bug introduced in grep 3.2]


 * Noteworthy changes in release 3.2 (2018-12-20) [stable]
@ -201,7 +470,7 @@ GNU grep NEWS                                    -*- outline -*-

  grep -z would match strings it should not.  To trigger the bug, you'd
  have to use a regular expression including an anchor (^ or $) and a
-  feature like a range or a backreference, causing grep to forego its DFA
+  feature like a range or a back-reference, causing grep to forego its DFA
  matcher and resort to using re_search.  With a multibyte locale, that
  matcher could mistakenly match a string containing a newline.
  For example, this command:
@ -434,7 +703,7 @@ GNU grep NEWS                                    -*- outline -*-
  Previously it was unreliable, and sometimes crashed or looped.
  [bug introduced in grep-2.16]

-  grep -P now works with -w and -x and backreferences. Before,
+  grep -P now works with -w and -x and back-references. Before,
  echo aa|grep -Pw '(.)\1' would fail to match, yet
  echo aa|grep -Pw '(.)\2' would match.

@ -770,7 +1039,7 @@ GNU grep NEWS                                    -*- outline -*-
  X{0,0} is implemented correctly.  It used to be a synonym of X{0,1}.
  [bug present since "the beginning"]

-  In multibyte locales, regular expressions including backreferences
+  In multibyte locales, regular expressions including back-references
  no longer exhibit quadratic complexity (i.e., they are orders
  of magnitude faster). [bug present since multi-byte character set
  support was introduced in 2.5.2]
@ -928,7 +1197,7 @@ Version 2.5
  - The new option --line-buffered fflush on everyline.  There is a noticeable
    slow down when forcing line buffering.

-  - Back references  are now local to the regex.
+  - Back-references are now local to the regex.
    grep -e '\(a\)\1' -e '\(b\)\1'
    The last backref \1 in the second expression refer to \(b\)

@ -1142,7 +1411,7 @@ necessary to track the evolution of the regex package, and since
 I was changing it anyway I decided to do a general cleanup.

 ========================================================================
-Copyright (C) 1992, 1997-2002, 2004-2018 Free Software Foundation, Inc.
+Copyright (C) 1992, 1997-2002, 2004-2026 Free Software Foundation, Inc.

  Copying and distribution of this file, with or without modification,
  are permitted in any medium without royalty provided the copyright
--- a/16
+++ b/16
@ -1,4 +1,4 @@
-  Copyright (C) 1992, 1997-2002, 2004-2018 Free Software Foundation, Inc.
+  Copyright (C) 1992, 1997-2002, 2004-2026 Free Software Foundation, Inc.

  Copying and distribution of this file, with or without modification,
  are permitted in any medium without royalty provided the copyright
@ -12,13 +12,13 @@ GNU grep is provided "as is" with no warranty.  The exact terms
 under which you may use and (re)distribute this program are detailed
 in the GNU General Public License, in the file COPYING.

-GNU grep is based on a fast lazy-state deterministic matcher (about
-twice as fast as stock Unix egrep) hybridized with a Boyer-Moore-Gosper
-search for a fixed string that eliminates impossible text from being
-considered by the full regexp matcher without necessarily having to
-look at every character.  The result is typically many times faster
-than Unix grep or egrep.  (Regular expressions containing backreferencing
-will run more slowly, however.)
+GNU grep is based on a fast lazy-state deterministic matcher
+hybridized with Boyer-Moore and Aho-Corasick searches for fixed
+strings that eliminate impossible text from being considered by the
+full regexp matcher without necessarily having to look at every
+character.  The result is typically many times faster than traditional
+implementations.  (Regular expressions containing back-references will
+run more slowly, however.)

 See the files AUTHORS and THANKS for a list of authors and other contributors.

--- a/2
+++ b/2
@ -1,4 +1,4 @@
-  Copyright (C) 1992, 1997-2002, 2004-2018 Free Software Foundation, Inc.
+  Copyright (C) 1992, 1997-2002, 2004-2026 Free Software Foundation, Inc.

  Copying and distribution of this file, with or without modification,
  are permitted in any medium without royalty provided the copyright
--- a/90
+++ b/90
@ -1,35 +1,47 @@
-*- outline -*-
+Building from a Git repository				-*- outline -*-

 These notes intend to help people working on the checked-out sources.
 These requirements do not apply when building from a distribution tarball.
+If this package has a file HACKING, please also read that file for
+more detailed contribution guidelines.

 * Requirements

-We've opted to keep only the highest-level sources in the GIT repository.
-This eases our maintenance burden, (fewer merges etc.), but imposes more
+We've opted to keep only the highest-level sources in the Git repository.
+This eases our maintenance burden (fewer merges etc.), but imposes more
 requirements on anyone wishing to build from the just-checked-out sources.
-Note the requirements to build the released archive are much less and
-are just the requirements of the standard ./configure && make procedure.
+(The requirements to build from a release are much less and are just
+the requirements of the standard './configure && make' procedure.)
 Specific development tools and versions will be checked for and listed by
 the bootstrap script.  See README-prereq for specific notes on obtaining
 these prerequisite tools.

 Valgrind <http://valgrind.org/> is also highly recommended, if
-Valgrind supports your architecture. See also README-valgrind.
+Valgrind supports your architecture.  See also README-valgrind
+(if present).

 While building from a just-cloned source tree may require installing a
-few prerequisites, later, a plain 'git pull && make' should be sufficient.
+few prerequisites, later, a plain 'git pull && make' typically suffices.

-* First GIT checkout
+* First Git checkout

 You can get a copy of the source repository like this:

-        $ git clone git://git.sv.gnu.org/grep
-        $ cd grep
+        $ git clone https://git.savannah.gnu.org/git/<packagename>
+        $ cd <packagename>

-As an optional step, if you already have a copy of the gnulib git
-repository on your hard drive, then you can use it as a reference to
-reduce download time and disk space requirements:
+where '<packagename>' stands for 'coreutils' or whatever other package
+you are building.
+
+To use the most-recent Gnulib (as opposed to the Gnulib version that
+the package last synchronized to), do this next:
+
+        $ git submodule foreach git pull origin master
+        $ git commit -m 'build: update gnulib submodule to latest' gnulib
+
+As an optional step, if you already have a copy of the Gnulib Git
+repository, then you can use it as a reference to reduce download
+time and file system space requirements:

        $ export GNULIB_SRCDIR=/path/to/gnulib

@ -38,20 +50,14 @@ which are extracted from other source packages:

        $ ./bootstrap

-To use the most-recent gnulib (as opposed to the gnulib version that
-the package last synchronized to), do this next:
-
-        $ git submodule foreach git pull origin master
-        $ git commit -m 'build: update gnulib submodule to latest' gnulib
-
 And there you are!  Just

-        $ ./configure --quiet #[--enable-gcc-warnings] [*]
+        $ ./configure --quiet #[--disable-gcc-warnings] [*]
        $ make
        $ make check

 At this point, there should be no difference between your local copy,
-and the GIT master copy:
+and the Git master copy:

        $ git diff

@ -59,15 +65,43 @@ should output no difference.

 Enjoy!

-[*] The --enable-gcc-warnings option is useful only with glibc
-and with a very recent version of gcc.  You'll probably also have
-to use recent system headers.  If you configure with this option,
-and spot a problem, please be sure to send the report to the bug
-reporting address of this package, and not to that of gnulib, even
-if the problem seems to originate in a gnulib-provided file.
+[*] By default GCC warnings are enabled when building from Git.
+If you get warnings with recent GCC and Glibc with default
+configure-time options, please report the warnings to the bug
+reporting address of this package instead of to bug-gnulib,
+even if the problem seems to originate in a Gnulib-provided file.
+If you get warnings with other configurations, you can run
+'./configure --disable-gcc-warnings' or 'make WERROR_CFLAGS='
+to build quietly or verbosely, respectively.
 -----

-Copyright (C) 2002-2018 Free Software Foundation, Inc.
+* Submitting patches
+
+If you develop a fix or a new feature, please send it to the
+appropriate bug-reporting address as reported by the --help option of
+each program.  One way to do this is to use vc-dwim
+(<https://www.gnu.org/software/vc-dwim/>), as follows.
+
+  Run the command "vc-dwim --initialize" from the top-level directory
+  of this package's git-cloned hierarchy.
+
+  Edit the (empty) ChangeLog file that this command creates, creating a
+  properly-formatted entry according to the GNU coding standards
+  <https://www.gnu.org/prep/standards/html_node/Change-Logs.html>.
+
+  Make your changes.
+
+  Run the command "vc-dwim" and make sure its output (the diff of all
+  your changes) looks good.
+
+  Run "vc-dwim --commit".
+
+  Run the command "git format-patch --stdout -1", and email its output
+  in, using the output's subject line.
+
+-----
+
+Copyright (C) 2002-2026 Free Software Foundation, Inc.

 This program is free software: you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
--- a/65
+++ b/65
@ -1,62 +1,41 @@
 This gives some notes on obtaining the tools required for development.
-I.E. the tools checked for by the bootstrap script and include:
+These tools can be used by the 'bootstrap' and 'configure' scripts,
+as well as by 'make'.  They include:

 - Autoconf   <https://www.gnu.org/software/autoconf/>
 - Automake   <https://www.gnu.org/software/automake/>
- Bison      <https://www.gnu.org/software/bison/>
 - Gettext    <https://www.gnu.org/software/gettext/>
 - Git        <https://git-scm.com/>
 - Gperf      <https://www.gnu.org/software/gperf/>
 - Gzip       <https://www.gnu.org/software/gzip/>
+- Help2man   <https://www.gnu.org/software/help2man/>
+- M4         <https://www.gnu.org/software/m4/>
+- Make       <https://www.gnu.org/software/make/>
 - Perl       <https://www.cpan.org/>
 - Pkg-config <https://www.freedesktop.org/wiki/Software/pkg-config/>
- Rsync      <https://rsync.samba.org/>
 - Tar        <https://www.gnu.org/software/tar/>
 - Texinfo    <https://www.gnu.org/software/texinfo/>
+- Wget       <https://www.gnu.org/software/wget/>
+- XZ Utils   <https://tukaani.org/xz/>

-Note please try to install/build official packages for your system.
-If these programs are not available use the following instructions
-to build them and install the results into a directory that you will
-then use when building this package.
+It is generally better to use official packages for your system.
+If a package is not officially available you can build it from source
+and install it into a directory that you can then use to build this
+package.  If some packages are available but are too old, install the
+too-old versions first as they may be needed to build newer versions.

-Even if the official version of a package for your system is too old,
-please install it, as it may be required to build the newer versions.
-The examples below install into $HOME/grep/deps/, so if you are
-going to follow these instructions, first ensure that your $PATH is
-set correctly by running this command:
+Here is an example of how to build a program from source.  This
+example is for Autoconf; a similar approach should work for the other
+developer prerequisites.  This example assumes Autoconf 2.71; it
+should be OK to use a later version of Autoconf, if available.

-  prefix=$HOME/grep/deps
+  prefix=$HOME/prefix   # (or wherever else you choose)
  export PATH=$prefix/bin:$PATH
-
-* autoconf *
-
-  # Note Autoconf 2.62 or newer is needed to build automake-1.11.1
-  git clone --depth=1 git://git.sv.gnu.org/autoconf.git
-  git checkout v2.62
-  autoreconf -vi
+  wget https://ftp.gnu.org/pub/gnu/autoconf/autoconf-2.71.tar.gz
+  gzip -d <autoconf-2.71.tar.gz | tar xf -
+  cd autoconf-2.71
  ./configure --prefix=$prefix
  make install

-* automake *
-
-  # Note help2man is required to build automake fully
-  git clone git://git.sv.gnu.org/automake.git
-  cd automake
-  git checkout v1.11.1
-  ./bootstrap
-  ./configure --prefix=$prefix
-  make install
-
-This package uses XZ utils (successor to LZMA) to create
-a compressed distribution tarball.  Using this feature of Automake
-requires version 1.10a or newer, as well as the xz program itself.
-
-* xz *
-
-  git clone git://ctrl.tukaani.org/xz.git
-  cd xz
-  ./autogen.sh
-  ./configure --prefix=$prefix
-  make install
-
-Now you can build this package as described in README-hacking.
+Once the prerequisites are installed, you can build this package as
+described in README-hacking.
--- a/THANKS.in
+++ b/THANKS.in
@ -13,6 +13,7 @@ end of e.g., grep --help).
 Akim Demaille                       akim@epita.fr
 Andreas Schwab                      schwab@suse.de
 Andreas Ley                         andy@rz.uni-karlsruhe.de
+Anton Samokat                       samokat700@gmail.com
 Bastiaan "Darquan" Stougie          darquan@zonnet.nl
 Ben Elliston                        bje@cygnus.com
 Bernd Strieder                      strieder@student.uni-kl.de
@ -28,6 +29,7 @@ David J MacKenzie                   djm@catapult.va.pubnix.com
 David O'Brien                       obrien@freebsd.org
 'Drake' Daham Wang                  drakewang@gmail.com
 Egmont Koblinger                    egmont@gmail.com
+Emanuele Torre                      torreemanuele6@gmail.com
 Fernando Basso                      fernandobasso.br@gmail.com
 Florian La Roche                    laroche@redhat.com
 François Pinard                     pinard@iro.umontreal.ca
@ -35,6 +37,7 @@ Gerald Stoller                      gerald_stoller@hotmail.com
 Grant McDorman                      grant@isgtec.com
 Greg Boyd                           gboyd.ccsf@gmail.com
 Greg Louis                          glouis@dynamicro.on.ca
+Gro-Tsen                            https://twitter.com/gro_tsen
 Guglielmo 'bond' Bondioni           g.bondioni@libero.it
 H. Merijn Brand                     h.m.brand@hccnet.nl
 Harald Hanche-Olsen                 hanche@math.ntnu.no
@ -50,9 +53,11 @@ Joel N. Weber II                    devnull@gnu.org
 John Hughes                         john@nitelite.calvacom.fr
 Jorge Stolfi                        stolfi@dcc.unicamp.br
 Karl Heuer                          kwzh@gnu.org
+Karl Pettersson                     karl.pettersson@klpn.se
 Kaveh R. Ghazi                      ghazi@caip.rutgers.edu
 Kazuro Furukawa                     furukawa@apricot.kek.jp
 Keith Bostic                        bostic@bsdi.com
+Koen Claessen                       koen@chalmers.se
 Krishna Sethuraman                  krishna@sgihub.corp.sgi.com
 Kurt D Schwehr                      kdschweh@insci14.ucsd.edu
 Ludovic Courtès                     ludo@gnu.org
@ -77,6 +82,7 @@ Rainer Orth                         ro@cebitec.uni-bielefeld.de
 Roland Roberts                      rroberts@muller.com
 Ruslan Ermilov                      ru@freebsd.org
 Santiago Vila                       sanvila@unex.es
+Sebastian Carlos                    sebaaa1754@gmail.com
 Shannon Hill                        hill@synnet.com
 Sotiris Vassilopoulos               Sotiris.Vassilopoulos@betatech.gr
 Standish Parsley                    adsspamtrap01@yahoo.com
--- a/6
+++ b/6
@ -1,6 +1,6 @@
 Things to do for GNU grep

-  Copyright (C) 1992, 1997-2002, 2004-2018 Free Software Foundation, Inc.
+  Copyright (C) 1992, 1997-2002, 2004-2026 Free Software Foundation, Inc.

  Copying and distribution of this file, with or without modification,
  are permitted in any medium without royalty provided the copyright
@ -31,13 +31,13 @@ GNU grep originally did 32-bit arithmetic.  Although it has moved to
 64-bit on 64-bit platforms by using types like ptrdiff_t and size_t,
 this conversion has not been entirely systematic and should be checked.

-Lazy dynamic linking of libpcre.  See Debian’s 03-397262-dlopen-pcre.patch.
+Lazy dynamic linking of the PCRE library.

 Check FreeBSD’s integration of zgrep (-Z) and bzgrep (-J) in one
 binary.  Is there a possibility of doing even better by automatically
 checking the magic of binary files ourselves (0x1F 0x8B for gzip, 0x1F
 0x9D for compress, and 0x42 0x5A 0x68 for bzip2)?  Once what to do with
-libpcre is decided, do the same for libz and libbz2.
+the PCRE library is decided, do the same for libz and libbz2.


 ===================
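The magic-number check that the TODO entry above proposes could be sketched in shell roughly as follows.  This is only an illustration under stated assumptions, not code from grep: the function name 'sniff_compression' is hypothetical, and the sketch assumes a POSIX 'od' and 'tr'.

```shell
# Hypothetical helper (not part of grep): report the compression format
# of a file by inspecting its leading magic bytes.
sniff_compression () {
  test -r "$1" || return 2
  # Dump the first three bytes as lowercase hex and strip all whitespace.
  magic=$(od -An -tx1 -N3 < "$1" | tr -d ' \n\t')
  case $magic in
    1f8b*)  echo gzip ;;      # 0x1F 0x8B
    1f9d*)  echo compress ;;  # 0x1F 0x9D
    425a68) echo bzip2 ;;     # 0x42 0x5A 0x68
    *)      echo unknown ;;
  esac
}
```

A caller might run 'sniff_compression FILE' and dispatch on the printed word; how grep itself would integrate such a check is left open by the TODO entry.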
--- a/1730
+++ b/1730
--- a/bootstrap.conf
+++ b/bootstrap.conf
@ -1,6 +1,6 @@
 # Bootstrap configuration.

-# Copyright (C) 2006-2018 Free Software Foundation, Inc.
+# Copyright (C) 2006-2026 Free Software Foundation, Inc.

 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@ -17,25 +17,31 @@

 avoided_gnulib_modules='
  --avoid=lock-tests
+  --avoid=mbuiter
+  --avoid=mbuiterf
+  --avoid=mbrlen-tests
  --avoid=mbrtowc-tests
  --avoid=update-copyright-tests
 '

 # gnulib modules used by this package.
 gnulib_modules='
-alloca
 announce-gen
 argmatch
+assert-h
 c-ctype
 c-stack
+c-strcasecmp
+c32isalnum
+c32rtomb
 closeout
 configmake
 dfa
+dirname-lgpl
 do-release-commit-and-tag
 error
 exclude
 fcntl-h
-fdl
 fnmatch
 fstatat
 fts
@ -47,58 +53,59 @@ git-version-gen
 gitlog-to-changelog
 gnu-web-doc-update
 gnupload
+hash
+idx
 ignore-value
 intprops
-inttypes
+inttypes-h
 isatty
 isblank
-iswctype
+kwset
 largefile
-locale
+locale-h
 lseek
 maintainer-makefile
 malloc-gnu
 manywarnings
 mbrlen
-mbrtowc
+mbrtoc32-regular
+mbszero
+mcel-prefer
 memchr
 memchr2
 mempcpy
 minmax
+nullptr
 obstack
 openat-safer
 perl
-propername
-quote
+rawmemchr
 readme-release
-realloc-gnu
+realloc-posix
 regex
 safe-read
 same-inode
 ssize_t
-stddef
-stdlib
+stdckdint-h
+stddef-h
+stdlib-h
 stpcpy
 strerror
-string
+string-h
 strstr
-strtoull
-strtoumax
-sys_stat
-unistd
+sys_stat-h
+unistd-h
 unlocked-io
 update-copyright
 useless-if-before-free
 verify
 version-etc-fsf
-wchar
-wcrtomb
-wctob
-wctype-h
+wchar-single
 windows-stat-inodes
 xalloc
 xbinary-io
 xstrtoimax
+year2038
 '
 gnulib_name=libgreputils

@ -126,13 +133,16 @@ gnulib_tool_option_extras="--tests-base=gnulib-tests --with-tests --symlink\
 buildreq="\
 autoconf   2.62
 automake   1.11.1
-autopoint  -
+autopoint  0.19.2
 gettext    -
 git        1.4.4
 gzip       -
+m4         -
 makeinfo   -
-rsync      -
 tar        -
+texi2pdf   6.1
+wget       -
+xz         -
 "

 bootstrap_post_import_hook ()
@ -140,22 +150,27 @@ bootstrap_post_import_hook ()
  # Automake requires that ChangeLog exist.
  touch ChangeLog || return 1

+  # Copy tests/init.sh from Gnulib.
+  $gnulib_tool --copy-file tests/init.sh
+
  # Copy pkg-config's pkg.m4 so that our downstream users don't need to.
  local ac_dir=`aclocal --print-ac-dir`
  test -s "$ac_dir/dirlist" && ac_dir=$ac_dir:`tr '\n' : < "$ac_dir/dirlist"`
  oIFS=$IFS
  IFS=:
+  local found=false
  for dir in \
    $ACLOCAL_PATH $ac_dir /usr/share/aclocal ''
  do
    IFS=$oIFS
    if test -n "$dir" && test -r "$dir/pkg.m4"; then
      cp "$dir/pkg.m4" m4/pkg.m4
-      return
+      found=:
+      break
    fi
  done
  IFS=$oIFS
-  die 'Cannot find pkg.m4; perhaps you need to install pkg-config'
+  $found || die 'Cannot find pkg.m4; perhaps you need to install pkg-config'
 }

 bootstrap_epilogue()
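The flag-and-break search used in bootstrap_post_import_hook above can be tried in isolation with a sketch like the one below.  The function name 'find_in_path_list' and its arguments are hypothetical and are not part of bootstrap; the sketch only mirrors the loop's shape (split a colon-separated list, skip empty entries, stop at the first readable match, report failure through a flag).

```shell
# Hypothetical stand-alone version of the search loop; not a bootstrap
# function.  Usage: find_in_path_list FILE DIR1:DIR2:...
# Prints the first readable DIR/FILE and succeeds, or fails if none exists.
find_in_path_list () {
  file=$1 list=$2
  found=false
  oIFS=$IFS
  IFS=:
  # $list is deliberately unquoted so that it is split on colons.
  for dir in $list; do
    IFS=$oIFS
    if test -n "$dir" && test -r "$dir/$file"; then
      printf '%s\n' "$dir/$file"
      found=:
      break
    fi
  done
  IFS=$oIFS
  # 'found' holds the command name 'false' or ':', so running it
  # yields the function's exit status.
  $found
}
```

For example, 'find_in_path_list pkg.m4 "$ACLOCAL_PATH:/usr/share/aclocal" || echo "pkg.m4 not found" >&2'.  Using a flag with 'break' rather than an early 'return' lets control reach whatever follows the loop on success as well as on failure.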
--- a/cfg.mk
+++ b/cfg.mk
@ -1,5 +1,5 @@
 # Customize maint.mk                           -*- makefile -*-
-# Copyright (C) 2009-2018 Free Software Foundation, Inc.
+# Copyright (C) 2009-2026 Free Software Foundation, Inc.

 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@ -30,7 +30,9 @@ url_dir_list = https://ftp.gnu.org/gnu/$(PACKAGE)

 # Tests not to run as part of "make distcheck".
 local-checks-to-skip =			\
-  sc_texinfo_acronym
+  sc_indent				\
+  sc_texinfo_acronym			\
+  sc_unportable_grep_q

 # Tools used to bootstrap this package, used for "announcement".
 bootstrap-tools = autoconf,automake,gnulib
@ -40,7 +42,14 @@ announcement_Cc_ = $(translation_project_), $(PACKAGE)-devel@gnu.org

 # The tight_scope test gets confused about inline functions
 # like 'to_uchar'.
-_gl_TS_unmarked_extern_functions = main usage mb_clen to_uchar dfaerror dfawarn
+_gl_TS_unmarked_extern_functions = \
+  main usage mb_clen to_uchar dfaerror dfawarn imbrlen
+
+# Write base64-encoded (not hex) checksums into the announcement.
+announce_gen_args = --cksum-checksums
+
+# Add an exemption for sc_makefile_at_at_check.
+_makefile_at_at_check_exceptions = ' && !/MAKEINFO/'

 # Now that we have better tests, make this the default.
 export VERBOSE = yes
@ -65,7 +74,13 @@ export VERBOSE = yes
 # 1127556 9e
 export XZ_OPT = -6e

-old_NEWS_hash = 7623f45d6e457629257ff9a9f8237673
+old_NEWS_hash = 3713245f672c3a9d1b455d6cc410c9ec
+
+# We prefer to spell it back-reference, as POSIX does.
+sc_prohibit_backref:
+	@prohibit=back''reference					\
+	halt='spell it "back-reference"'				\
+	  $(_sc_search_regexp)

 # Many m4 macro names once began with 'jm_'.
 # Make sure that none are inadvertently reintroduced.
@ -89,6 +104,7 @@ LINE_LEN_MAX = 80
 FILTER_LONG_LINES =							\
  /^[^:]*\.diff:[^:]*:@@ / d;						\
  \|^[^:]*TODO:| d;							\
+  \|^[^:]*doc/fdl.texi:| d;						\
  \|^[^:]*man/help2man:| d;						\
  \|^[^:]*tests/misc/sha[0-9]*sum.*\.pl[-:]| d;				\
  \|^[^:]*tests/pr/|{ \|^[^:]*tests/pr/pr-tests:| !d; };
@ -159,3 +175,18 @@ exclude_file_name_regexp--sc_prohibit_tab_based_indentation = \
 exclude_file_name_regexp--sc_prohibit_doubled_word = ^tests/count-newline$$

 exclude_file_name_regexp--sc_long_lines = ^tests/.*$$
+
+# If a test uses timeout, it must also use require_timeout_.
+# Grandfather-exempt the fedora test, since it ensures timeout works
+# as expected before using it.
+sc_timeout_prereq:
+	@$(VC_LIST_EXCEPT)						\
+	  | grep '^tests/'						\
+	  | grep -v '^tests/fedora$$'					\
+	  | xargs grep -lw timeout					\
+	  | xargs grep -FLw require_timeout_				\
+	  | $(GREP) .							\
+	  && { echo '$(ME): timeout without use of require_timeout_'	\
+	    1>&2; exit 1; } || :
+
+codespell_ignore_words_list = clen,allo,Nd,abd,alph,debbugs,wee,UE,ois,creche
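The sc_timeout_prereq rule above flags test scripts that use 'timeout' without also using 'require_timeout_'.  A rough stand-alone equivalent, for experimenting outside of make, might look like this.  The function name 'timeout_without_prereq' is hypothetical, the sketch scans a plain directory instead of the version-controlled file list, and it assumes a grep that supports -w (GNU and BSD grep do).

```shell
# Hypothetical stand-alone check; not part of cfg.mk.
# Prints each regular file in directory $1 that uses the word 'timeout'
# but never mentions 'require_timeout_'; fails if any such file exists.
timeout_without_prereq () {
  status=0
  for f in "$1"/*; do
    test -f "$f" || continue
    # -w keeps 'timeout' from matching inside 'require_timeout_',
    # because '_' counts as a word character.
    if grep -qw timeout "$f" && ! grep -qFw require_timeout_ "$f"; then
      printf '%s\n' "$f"
      status=1
    fi
  done
  return $status
}
```

Running 'timeout_without_prereq tests' from the top of a source tree would then list offending scripts; unlike the make rule, this sketch does not exempt tests/fedora.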
--- a/configure.ac
+++ b/configure.ac
@ -1,7 +1,7 @@
 dnl
 dnl autoconf input file for GNU grep
 dnl
-dnl Copyright (C) 1997-2006, 2009-2018 Free Software Foundation, Inc.
+dnl Copyright (C) 1997-2006, 2009-2026 Free Software Foundation, Inc.
 dnl
 dnl This file is part of GNU grep.
 dnl
@ -22,54 +22,23 @@ AC_INIT([GNU grep],
        m4_esyscmd([build-aux/git-version-gen .tarball-version]),
        [bug-grep@gnu.org])

-# Set the GREP and EGREP variables to a dummy replacement for the 'grep'
-# command, so that AC_PROG_GREP and AC_PROG_EGREP don't fail when no good
-# 'grep' program is found. This makes it possible to build GNU grep on a
-# Solaris machine that has only /usr/bin/grep and no /usr/xpg4/bin/grep.
-# This function supports only restricted arguments:
-#   - No file names as arguments, process only standard input.
-#   - Only literal strings without backslashes, no regular expressions.
-#   - The only options are -e and -E (and -Ee).
-# This function also does not support long lines beyond what the shell
-# supports), and backslash-processes the input.
-fn_grep () {
-  test "$1" = -E && shift
-  case $@%:@:$1 in
-    0:*) AC_MSG_ERROR([fn_grep: expected pattern]) ;;
-    1:-*) AC_MSG_ERROR([fn_grep: invalid command line]) ;;
-    1:*) pattern=$1 ;;
-    2:--|2:-e|2:-Ee) pattern=$2 ;;
-    *) AC_MSG_ERROR([fn_grep: invalid command line]) ;;
-  esac
+if test -n "$GREP" || test -n "$EGREP"; then
+  AC_MSG_ERROR(
+    [no working 'grep' found
+  A working 'grep' command is needed to build GNU Grep.
+  This 'grep' should support -e and long lines.
+  On Solaris 10, install the package SUNWggrp or SUNWxcu4.
+  On Solaris 11, install the package text/gnu-grep or system/xopen/xcu4.])
+fi

-  case $pattern in
-    [*['].^$\*[']*]) dnl The outer brackets are for M4.
-      AC_MSG_ERROR([fn_grep: regular expressions not supported])  ;;
-  esac
-
-  rc=1
-  while read line; do
-    case $line in
-      *$pattern*)
-        rc=0
-        AS_ECHO([$line]) ;;
-    esac
-  done
-  return $rc
-}
-
-test -n "$GREP" || GREP=fn_grep
-test -n "$EGREP" || EGREP=fn_grep
-ac_cv_path_EGREP=$EGREP
-
-AC_CONFIG_AUX_DIR(build-aux)
-AC_CONFIG_SRCDIR(src/grep.c)
+AC_CONFIG_AUX_DIR([build-aux])
+AC_CONFIG_SRCDIR([src/grep.c])
 AC_DEFINE([GREP], 1, [We are building grep])
-AC_PREREQ([2.63])
+AC_PREREQ([2.64])
 AC_CONFIG_MACRO_DIRS([m4])

 dnl Automake stuff.
-AM_INIT_AUTOMAKE([1.11 no-dist-gzip dist-xz color-tests parallel-tests
+AM_INIT_AUTOMAKE([1.11 dist-xz color-tests parallel-tests
                  subdir-objects])
 AM_SILENT_RULES([yes]) # make --enable-silent-rules the default.

@ -82,74 +51,94 @@ AC_PROG_INSTALL
 AC_PROG_CC
 gl_EARLY
 AC_PROG_RANLIB
-PKG_PROG_PKG_CONFIG([0.9.0])
+PKG_PROG_PKG_CONFIG([0.9.0], [PKG_CONFIG=false])

 # grep never invokes mbrtowc or mbrlen on empty input,
 # so don't worry about this common bug,
 # as working around it would merely slow grep down.
 gl_cv_func_mbrtowc_empty_input='assume yes'
+gl_cv_func_mbrlen_empty_input='assume yes'

 dnl Checks for typedefs, structures, and compiler characteristics.
-AC_TYPE_SIZE_T
-AC_C_CONST
 gl_INIT

+# Ensure VLAs are not used.
+# Note -Wvla is implicitly added by gl_MANYWARN_ALL_GCC
+AC_DEFINE([GNULIB_NO_VLA], [1], [Define to 1 to disable use of VLAs])
+
 # The test suite needs to know if we have a working perl.
-# FIXME: this is suboptimal.  Ideally, we would be able to call gl_PERL
-# with an ACTION-IF-NOT-FOUND argument ...
-cu_have_perl=yes
-case $PERL in *"/missing "*) cu_have_perl=no;; esac
-AM_CONDITIONAL([HAVE_PERL], [test $cu_have_perl = yes])
+AM_CONDITIONAL([HAVE_PERL], [test "$gl_cv_prog_perl" != no])
+
+# gl_GCC_VERSION_IFELSE([major], [minor], [run-if-found], [run-if-not-found])
+# ------------------------------------------------
+# If $CPP is gcc-MAJOR.MINOR or newer, then run RUN-IF-FOUND.
+# Otherwise, run RUN-IF-NOT-FOUND.
+AC_DEFUN([gl_GCC_VERSION_IFELSE],
+  [AC_PREPROC_IFELSE(
+    [AC_LANG_PROGRAM(
+      [[
+#if ($1) < __GNUC__ || (($1) == __GNUC__ && ($2) <= __GNUC_MINOR__)
+/* ok */
+#else
+# error "your version of gcc is older than $1.$2"
+#endif
+      ]]),
+    ], [$3], [$4])
+  ]
+)

 AC_ARG_ENABLE([gcc-warnings],
-  [AS_HELP_STRING([--enable-gcc-warnings],
-                  [turn on lots of GCC warnings (for developers)])],
+  [AS_HELP_STRING([--enable-gcc-warnings@<:@=TYPE@:>@],
+    [control generation of GCC warnings.  The TYPE 'no' disables
+     warnings (default for non-developer builds); 'yes' generates
+     cheap warnings if available (default for developer builds);
+     'expensive' in addition generates expensive-to-compute warnings
+     if available.])],
  [case $enableval in
-     yes|no) ;;
+     no|yes|expensive) ;;
     *)      AC_MSG_ERROR([bad value $enableval for gcc-warnings option]) ;;
   esac
   gl_gcc_warnings=$enableval],
-  [gl_gcc_warnings=no
-   if test "$GCC" = yes && test -d "$srcdir"/.git; then
-     AC_COMPILE_IFELSE(
-       [AC_LANG_PROGRAM([[
-          #if ! (6 < __GNUC__ + (2 <= __GNUC_MINOR__))
-            #error "--enable-gcc-warnings defaults to 'no' on older GCC"
-          #endif
-          ]])],
-       [gl_gcc_warnings=yes])
-   fi]
+  [
+   # GCC provides fine-grained control over diagnostics which
+   # is used in gnulib for example to suppress warnings from
+   # certain sections of code.  So if this is available and
+   # we're running from a git repo, then auto enable the warnings.
+   gl_gcc_warnings=no
+   gl_GCC_VERSION_IFELSE([4], [6],
+                         [test -d "$srcdir"/.git \
+                          && ! test -f "$srcdir"/.tarball-version \
+                          && gl_gcc_warnings=yes])]
 )

-if test "$gl_gcc_warnings" = yes; then
+if test $gl_gcc_warnings != no; then
  gl_WARN_ADD([-Werror], [WERROR_CFLAGS])
  AC_SUBST([WERROR_CFLAGS])

-  nw=
+  ew=
+  AS_IF([test $gl_gcc_warnings != expensive],
+    [# -fanalyzer and related options slow GCC considerably.
+     ew="$ew -fanalyzer -Wno-analyzer-double-free -Wno-analyzer-malloc-leak"
+     ew="$ew -Wno-analyzer-null-dereference -Wno-analyzer-use-after-free"])
+
+  nw=$ew
  # This, $nw, is the list of warnings we disable.
-  nw="$nw -Wdeclaration-after-statement" # too useful to forbid
-  nw="$nw -Waggregate-return"       # anachronistic
-  nw="$nw -Wlong-long"              # C90 is anachronistic (lib/gethrxtime.h)
-  nw="$nw -Wc++-compat"             # We don't care about C++ compilers
-  nw="$nw -Wundef"                  # Warns on '#if GNULIB_FOO' etc in gnulib
+  nw="$nw -Wvla"                    # suppress a warning in regexec.h
+  nw="$nw -Winline"                 # suppress warnings from streq.h's streq5
  nw="$nw -Wsystem-headers"         # Don't let system headers trigger warnings
-  nw="$nw -Wpadded"                 # Our structs are not padded
-  nw="$nw -Wvla"                    # warnings in gettext.h
  nw="$nw -Wstack-protector"        # generates false alarms for useful code
-  nw="$nw -Wswitch-default"         # Too many warnings for now
-  nw="$nw -Wunsafe-loop-optimizations" # OK to suppress unsafe optimizations
-  nw="$nw -Winline"                 # streq.h's streq4, streq6 and strcaseeq6
-  nw="$nw -Wstrict-overflow"        # regexec.c

  gl_MANYWARN_ALL_GCC([ws])
  gl_MANYWARN_COMPLEMENT([ws], [$ws], [$nw])
  for w in $ws; do
    gl_WARN_ADD([$w])
  done
+  gl_WARN_ADD([-Wtrailing-whitespace]) # This project's coding style
  gl_WARN_ADD([-Wno-missing-field-initializers]) # We need this one
  gl_WARN_ADD([-Wno-sign-compare])     # Too many warnings for now
  gl_WARN_ADD([-Wno-unused-parameter]) # Too many warnings for now
  gl_WARN_ADD([-Wno-cast-function-type]) # sig-handler.h's sa_handler_t cast
+  gl_WARN_ADD([-Wno-deprecated-declarations]) # clang complains about sprintf

  # In spite of excluding -Wlogical-op above, it is enabled, as of
  # gcc 4.5.0 20090517, and it provokes warnings in cat.c, dd.c, truncate.c
@ -175,6 +164,32 @@ if test "$gl_gcc_warnings" = yes; then
  gl_WARN_ADD([-Wno-format-nonliteral])
  gl_MANYWARN_COMPLEMENT([GNULIB_WARN_CFLAGS], [$WARN_CFLAGS], [$nw])
  AC_SUBST([GNULIB_WARN_CFLAGS])
+
+  # For gnulib-tests, the set is slightly smaller still.
+  # It's not worth being this picky about test programs.
+  nw=
+  nw="$nw -Wformat-truncation=2"    # False alarm in strerror_r.c
+  nw="$nw -Wmissing-declarations"
+  nw="$nw -Wmissing-prototypes"
+  nw="$nw -Wmissing-variable-declarations"
+  nw="$nw -Wnull-dereference"
+  nw="$nw -Wold-style-definition"
+  nw="$nw -Wstrict-prototypes"
+  nw="$nw -Wsuggest-attribute=cold"
+  nw="$nw -Wsuggest-attribute=const"
+  nw="$nw -Wsuggest-attribute=format"
+  nw="$nw -Wsuggest-attribute=pure"
+
+  # Disable to avoid warnings in e.g., test-intprops.c and test-limits-h.c
+  # due to overlong expansions like this:
+  # test-intprops.c:147:5: error: string literal of length 9531 exceeds \
+  # maximum length 4095 that ISO C99 compilers are required to support
+  nw="$nw -Woverlength-strings"
+
+  gl_MANYWARN_COMPLEMENT([GNULIB_TEST_WARN_CFLAGS],
+                         [$GNULIB_WARN_CFLAGS], [$nw])
+  gl_WARN_ADD([-Wno-return-type], [GNULIB_TEST_WARN_CFLAGS])
+  AC_SUBST([GNULIB_TEST_WARN_CFLAGS])
 fi

 # By default, argmatch should fail calling usage (EXIT_FAILURE).
@ -183,14 +198,7 @@ AC_DEFINE([ARGMATCH_DIE], [usage (EXIT_FAILURE)],
 AC_DEFINE([ARGMATCH_DIE_DECL], [void usage (int _e)],
          [Define to the declaration of the xargmatch failure function.])

-dnl Checks for header files.
-AC_HEADER_STDC
-AC_HEADER_DIRENT
-
-dnl Checks for functions.
-AC_FUNC_CLOSEDIR_VOID
-
-AC_CHECK_FUNCS_ONCE(isascii setlocale)
+AC_CHECK_FUNCS_ONCE([setlocale])

 dnl I18N feature
 AM_GNU_GETTEXT_VERSION([0.18.2])
@ -203,9 +211,12 @@ dnl then the installer should configure --with-included-regex.
 AM_CONDITIONAL([USE_INCLUDED_REGEX], [test "$ac_use_included_regex" = yes])
 if test "$ac_use_included_regex" = no; then
  AC_MSG_WARN([Included lib/regex.c not used])
+else
+  AC_DEFINE([USE_INCLUDED_REGEX], 1, [building with included regex code])
 fi

 gl_FUNC_PCRE
+AM_CONDITIONAL([USE_PCRE], [test $use_pcre = yes])

 case $host_os in
  mingw*) suffix=w32 ;;
@ -223,6 +234,4 @@ AC_CONFIG_FILES([
  doc/Makefile
  gnulib-tests/Makefile
 ])
-GREP="$ac_abs_top_builddir/src/grep"
-EGREP="$ac_abs_top_builddir/src/grep -E"
 AC_OUTPUT
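The preprocessor condition in gl_GCC_VERSION_IFELSE above accepts a compiler whose major version exceeds the requested one, or whose major version is equal and whose minor version is at least the requested one.  The same comparison can be written in shell as a sketch; 'gcc_at_least' is a hypothetical name, and obtaining the compiler's version numbers is deliberately left out, since the macro gets them from __GNUC__ and __GNUC_MINOR__.

```shell
# Hypothetical helper, not part of configure.ac.
# Usage: gcc_at_least WANT_MAJOR WANT_MINOR HAVE_MAJOR HAVE_MINOR
# Succeeds if HAVE_MAJOR.HAVE_MINOR is WANT_MAJOR.WANT_MINOR or newer.
gcc_at_least () {
  want_major=$1 want_minor=$2 have_major=$3 have_minor=$4
  test "$want_major" -lt "$have_major" ||
    { test "$want_major" -eq "$have_major" &&
      test "$want_minor" -le "$have_minor"; }
}
```

With the values used in this diff, 'gcc_at_least 4 6 4 6' and 'gcc_at_least 4 6 5 0' succeed, while 'gcc_at_least 4 6 4 5' fails.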
--- a/doc/.gitignore
+++ b/doc/.gitignore
@ -1,6 +1,3 @@
-/egrep.1
-/fdl.texi
-/fgrep.1
 /gendocs_template
 /gendocs_template_min
 /grep.info*
--- a/doc/Makefile.am
+++ b/doc/Makefile.am
@ -1,7 +1,7 @@
 # Process this file with automake to create Makefile.in
 # Makefile.am for grep/doc.
 #
-# Copyright 2008-2018 Free Software Foundation, Inc.
+# Copyright 2008-2026 Free Software Foundation, Inc.
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@ -16,23 +16,20 @@
 # You should have received a copy of the GNU General Public License
 # along with this program.  If not, see <https://www.gnu.org/licenses/>.

+# The customization variable CHECK_NORMAL_MENU_STRUCTURE is necessary with
+# makeinfo versions ≥ 6.8.
+MAKEINFO = @MAKEINFO@ -c CHECK_NORMAL_MENU_STRUCTURE=1
+
 info_TEXINFOS = grep.texi
 grep_TEXINFOS = fdl.texi

-man_MANS = grep.1 fgrep.1 egrep.1
+man_MANS = grep.1

 EXTRA_DIST = grep.in.1
-CLEANFILES = grep.1 egrep.1 fgrep.1
+CLEANFILES = grep.1

 grep.1: grep.in.1
 	$(AM_V_GEN)rm -f $@-t $@
 	$(AM_V_at)sed 's/@''VERSION@/$(VERSION)/' $(srcdir)/grep.in.1 > $@-t
 	$(AM_V_at)chmod a=r $@-t
 	$(AM_V_at)mv -f $@-t $@
-
-egrep.1 fgrep.1: Makefile.am
-	$(AM_V_GEN)rm -f $@-t $@
-	$(AM_V_at)inst=`echo grep | sed '$(transform)'`.1 \
-	  && echo ".so man1/$$inst" > $@-t
-	$(AM_V_at)chmod a=r $@-t
-	$(AM_V_at)mv -f $@-t $@
--- a/doc/fdl.texi
+++ b/doc/fdl.texi
@ -0,0 +1,506 @@
+@c The GNU Free Documentation License.
+@center Version 1.3, 3 November 2008
+
+@c This file is intended to be included within another document,
+@c hence no sectioning command or @node.
+
+@display
+Copyright @copyright{} 2000--2002, 2007--2008, 2023--2026 Free Software
+Foundation, Inc.
+@uref{https://fsf.org/}
+
+Everyone is permitted to copy and distribute verbatim copies
+of this license document, but changing it is not allowed.
+@end display
+
+@enumerate 0
+@item
+PREAMBLE
+
+The purpose of this License is to make a manual, textbook, or other
+functional and useful document @dfn{free} in the sense of freedom: to
+assure everyone the effective freedom to copy and redistribute it,
+with or without modifying it, either commercially or noncommercially.
+Secondarily, this License preserves for the author and publisher a way
+to get credit for their work, while not being considered responsible
+for modifications made by others.
+
+This License is a kind of ``copyleft'', which means that derivative
+works of the document must themselves be free in the same sense.  It
+complements the GNU General Public License, which is a copyleft
+license designed for free software.
+
+We have designed this License in order to use it for manuals for free
+software, because free software needs free documentation: a free
+program should come with manuals providing the same freedoms that the
+software does.  But this License is not limited to software manuals;
+it can be used for any textual work, regardless of subject matter or
+whether it is published as a printed book.  We recommend this License
+principally for works whose purpose is instruction or reference.
+
+@item
+APPLICABILITY AND DEFINITIONS
+
+This License applies to any manual or other work, in any medium, that
+contains a notice placed by the copyright holder saying it can be
+distributed under the terms of this License.  Such a notice grants a
+world-wide, royalty-free license, unlimited in duration, to use that
+work under the conditions stated herein.  The ``Document'', below,
+refers to any such manual or work.  Any member of the public is a
+licensee, and is addressed as ``you''.  You accept the license if you
+copy, modify or distribute the work in a way requiring permission
+under copyright law.
+
+A ``Modified Version'' of the Document means any work containing the
+Document or a portion of it, either copied verbatim, or with
+modifications and/or translated into another language.
+
+A ``Secondary Section'' is a named appendix or a front-matter section
+of the Document that deals exclusively with the relationship of the
+publishers or authors of the Document to the Document's overall
+subject (or to related matters) and contains nothing that could fall
+directly within that overall subject.  (Thus, if the Document is in
+part a textbook of mathematics, a Secondary Section may not explain
+any mathematics.)  The relationship could be a matter of historical
+connection with the subject or with related matters, or of legal,
+commercial, philosophical, ethical or political position regarding
+them.
+
+The ``Invariant Sections'' are certain Secondary Sections whose titles
+are designated, as being those of Invariant Sections, in the notice
+that says that the Document is released under this License.  If a
+section does not fit the above definition of Secondary then it is not
+allowed to be designated as Invariant.  The Document may contain zero
+Invariant Sections.  If the Document does not identify any Invariant
+Sections then there are none.
+
+The ``Cover Texts'' are certain short passages of text that are listed,
+as Front-Cover Texts or Back-Cover Texts, in the notice that says that
+the Document is released under this License.  A Front-Cover Text may
+be at most 5 words, and a Back-Cover Text may be at most 25 words.
+
+A ``Transparent'' copy of the Document means a machine-readable copy,
+represented in a format whose specification is available to the
+general public, that is suitable for revising the document
+straightforwardly with generic text editors or (for images composed of
+pixels) generic paint programs or (for drawings) some widely available
+drawing editor, and that is suitable for input to text formatters or
+for automatic translation to a variety of formats suitable for input
+to text formatters.  A copy made in an otherwise Transparent file
+format whose markup, or absence of markup, has been arranged to thwart
+or discourage subsequent modification by readers is not Transparent.
+An image format is not Transparent if used for any substantial amount
+of text.  A copy that is not ``Transparent'' is called ``Opaque''.
+
+Examples of suitable formats for Transparent copies include plain
+ASCII without markup, Texinfo input format, La@TeX{} input
+format, SGML or XML using a publicly available
+DTD, and standard-conforming simple HTML,
+PostScript or PDF designed for human modification.  Examples
+of transparent image formats include PNG, XCF and
+JPG@.  Opaque formats include proprietary formats that can be
+read and edited only by proprietary word processors, SGML or
+XML for which the DTD and/or processing tools are
+not generally available, and the machine-generated HTML,
+PostScript or PDF produced by some word processors for
+output purposes only.
+
+The ``Title Page'' means, for a printed book, the title page itself,
+plus such following pages as are needed to hold, legibly, the material
+this License requires to appear in the title page.  For works in
+formats which do not have any title page as such, ``Title Page'' means
+the text near the most prominent appearance of the work's title,
+preceding the beginning of the body of the text.
+
+The ``publisher'' means any person or entity that distributes copies
+of the Document to the public.
+
+A section ``Entitled XYZ'' means a named subunit of the Document whose
+title either is precisely XYZ or contains XYZ in parentheses following
+text that translates XYZ in another language.  (Here XYZ stands for a
+specific section name mentioned below, such as ``Acknowledgements'',
+``Dedications'', ``Endorsements'', or ``History''.)  To ``Preserve the Title''
+of such a section when you modify the Document means that it remains a
+section ``Entitled XYZ'' according to this definition.
+
+The Document may include Warranty Disclaimers next to the notice which
+states that this License applies to the Document.  These Warranty
+Disclaimers are considered to be included by reference in this
+License, but only as regards disclaiming warranties: any other
+implication that these Warranty Disclaimers may have is void and has
+no effect on the meaning of this License.
+
+@item
+VERBATIM COPYING
+
+You may copy and distribute the Document in any medium, either
+commercially or noncommercially, provided that this License, the
+copyright notices, and the license notice saying this License applies
+to the Document are reproduced in all copies, and that you add no other
+conditions whatsoever to those of this License.  You may not use
+technical measures to obstruct or control the reading or further
+copying of the copies you make or distribute.  However, you may accept
+compensation in exchange for copies.  If you distribute a large enough
+number of copies you must also follow the conditions in section 3.
+
+You may also lend copies, under the same conditions stated above, and
+you may publicly display copies.
+
+@item
+COPYING IN QUANTITY
+
+If you publish printed copies (or copies in media that commonly have
+printed covers) of the Document, numbering more than 100, and the
+Document's license notice requires Cover Texts, you must enclose the
+copies in covers that carry, clearly and legibly, all these Cover
+Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on
+the back cover.  Both covers must also clearly and legibly identify
+you as the publisher of these copies.  The front cover must present
+the full title with all words of the title equally prominent and
+visible.  You may add other material on the covers in addition.
+Copying with changes limited to the covers, as long as they preserve
+the title of the Document and satisfy these conditions, can be treated
+as verbatim copying in other respects.
+
+If the required texts for either cover are too voluminous to fit
+legibly, you should put the first ones listed (as many as fit
+reasonably) on the actual cover, and continue the rest onto adjacent
+pages.
+
+If you publish or distribute Opaque copies of the Document numbering
+more than 100, you must either include a machine-readable Transparent
+copy along with each Opaque copy, or state in or with each Opaque copy
+a computer-network location from which the general network-using
+public has access to download using public-standard network protocols
+a complete Transparent copy of the Document, free of added material.
+If you use the latter option, you must take reasonably prudent steps,
+when you begin distribution of Opaque copies in quantity, to ensure
+that this Transparent copy will remain thus accessible at the stated
+location until at least one year after the last time you distribute an
+Opaque copy (directly or through your agents or retailers) of that
+edition to the public.
+
+It is requested, but not required, that you contact the authors of the
+Document well before redistributing any large number of copies, to give
+them a chance to provide you with an updated version of the Document.
+
+@item
+MODIFICATIONS
+
+You may copy and distribute a Modified Version of the Document under
+the conditions of sections 2 and 3 above, provided that you release
+the Modified Version under precisely this License, with the Modified
+Version filling the role of the Document, thus licensing distribution
+and modification of the Modified Version to whoever possesses a copy
+of it.  In addition, you must do these things in the Modified Version:
+
+@enumerate A
+@item
+Use in the Title Page (and on the covers, if any) a title distinct
+from that of the Document, and from those of previous versions
+(which should, if there were any, be listed in the History section
+of the Document).  You may use the same title as a previous version
+if the original publisher of that version gives permission.
+
+@item
+List on the Title Page, as authors, one or more persons or entities
+responsible for authorship of the modifications in the Modified
+Version, together with at least five of the principal authors of the
+Document (all of its principal authors, if it has fewer than five),
+unless they release you from this requirement.
+
+@item
+State on the Title page the name of the publisher of the
+Modified Version, as the publisher.
+
+@item
+Preserve all the copyright notices of the Document.
+
+@item
+Add an appropriate copyright notice for your modifications
+adjacent to the other copyright notices.
+
+@item
+Include, immediately after the copyright notices, a license notice
+giving the public permission to use the Modified Version under the
+terms of this License, in the form shown in the Addendum below.
+
+@item
+Preserve in that license notice the full lists of Invariant Sections
+and required Cover Texts given in the Document's license notice.
+
+@item
+Include an unaltered copy of this License.
+
+@item
+Preserve the section Entitled ``History'', Preserve its Title, and add
+to it an item stating at least the title, year, new authors, and
+publisher of the Modified Version as given on the Title Page.  If
+there is no section Entitled ``History'' in the Document, create one
+stating the title, year, authors, and publisher of the Document as
+given on its Title Page, then add an item describing the Modified
+Version as stated in the previous sentence.
+
+@item
+Preserve the network location, if any, given in the Document for
+public access to a Transparent copy of the Document, and likewise
+the network locations given in the Document for previous versions
+it was based on.  These may be placed in the ``History'' section.
+You may omit a network location for a work that was published at
+least four years before the Document itself, or if the original
+publisher of the version it refers to gives permission.
+
+@item
+For any section Entitled ``Acknowledgements'' or ``Dedications'', Preserve
+the Title of the section, and preserve in the section all the
+substance and tone of each of the contributor acknowledgements and/or
+dedications given therein.
+
+@item
+Preserve all the Invariant Sections of the Document,
+unaltered in their text and in their titles.  Section numbers
+or the equivalent are not considered part of the section titles.
+
+@item
+Delete any section Entitled ``Endorsements''.  Such a section
+may not be included in the Modified Version.
+
+@item
+Do not retitle any existing section to be Entitled ``Endorsements'' or
+to conflict in title with any Invariant Section.
+
+@item
+Preserve any Warranty Disclaimers.
+@end enumerate
+
+If the Modified Version includes new front-matter sections or
+appendices that qualify as Secondary Sections and contain no material
+copied from the Document, you may at your option designate some or all
+of these sections as invariant.  To do this, add their titles to the
+list of Invariant Sections in the Modified Version's license notice.
+These titles must be distinct from any other section titles.
+
+You may add a section Entitled ``Endorsements'', provided it contains
+nothing but endorsements of your Modified Version by various
+parties---for example, statements of peer review or that the text has
+been approved by an organization as the authoritative definition of a
+standard.
+
+You may add a passage of up to five words as a Front-Cover Text, and a
+passage of up to 25 words as a Back-Cover Text, to the end of the list
+of Cover Texts in the Modified Version.  Only one passage of
+Front-Cover Text and one of Back-Cover Text may be added by (or
+through arrangements made by) any one entity.  If the Document already
+includes a cover text for the same cover, previously added by you or
+by arrangement made by the same entity you are acting on behalf of,
+you may not add another; but you may replace the old one, on explicit
+permission from the previous publisher that added the old one.
+
+The author(s) and publisher(s) of the Document do not by this License
+give permission to use their names for publicity for or to assert or
+imply endorsement of any Modified Version.
+
+@item
+COMBINING DOCUMENTS
+
+You may combine the Document with other documents released under this
+License, under the terms defined in section 4 above for modified
+versions, provided that you include in the combination all of the
+Invariant Sections of all of the original documents, unmodified, and
+list them all as Invariant Sections of your combined work in its
+license notice, and that you preserve all their Warranty Disclaimers.
+
+The combined work need only contain one copy of this License, and
+multiple identical Invariant Sections may be replaced with a single
+copy.  If there are multiple Invariant Sections with the same name but
+different contents, make the title of each such section unique by
+adding at the end of it, in parentheses, the name of the original
+author or publisher of that section if known, or else a unique number.
+Make the same adjustment to the section titles in the list of
+Invariant Sections in the license notice of the combined work.
+
+In the combination, you must combine any sections Entitled ``History''
+in the various original documents, forming one section Entitled
+``History''; likewise combine any sections Entitled ``Acknowledgements'',
+and any sections Entitled ``Dedications''.  You must delete all
+sections Entitled ``Endorsements.''
+
+@item
+COLLECTIONS OF DOCUMENTS
+
+You may make a collection consisting of the Document and other documents
+released under this License, and replace the individual copies of this
+License in the various documents with a single copy that is included in
+the collection, provided that you follow the rules of this License for
+verbatim copying of each of the documents in all other respects.
+
+You may extract a single document from such a collection, and distribute
+it individually under this License, provided you insert a copy of this
+License into the extracted document, and follow this License in all
+other respects regarding verbatim copying of that document.
+
+@item
+AGGREGATION WITH INDEPENDENT WORKS
+
+A compilation of the Document or its derivatives with other separate
+and independent documents or works, in or on a volume of a storage or
+distribution medium, is called an ``aggregate'' if the copyright
+resulting from the compilation is not used to limit the legal rights
+of the compilation's users beyond what the individual works permit.
+When the Document is included in an aggregate, this License does not
+apply to the other works in the aggregate which are not themselves
+derivative works of the Document.
+
+If the Cover Text requirement of section 3 is applicable to these
+copies of the Document, then if the Document is less than one half of
+the entire aggregate, the Document's Cover Texts may be placed on
+covers that bracket the Document within the aggregate, or the
+electronic equivalent of covers if the Document is in electronic form.
+Otherwise they must appear on printed covers that bracket the whole
+aggregate.
+
+@item
+TRANSLATION
+
+Translation is considered a kind of modification, so you may
+distribute translations of the Document under the terms of section 4.
+Replacing Invariant Sections with translations requires special
+permission from their copyright holders, but you may include
+translations of some or all Invariant Sections in addition to the
+original versions of these Invariant Sections.  You may include a
+translation of this License, and all the license notices in the
+Document, and any Warranty Disclaimers, provided that you also include
+the original English version of this License and the original versions
+of those notices and disclaimers.  In case of a disagreement between
+the translation and the original version of this License or a notice
+or disclaimer, the original version will prevail.
+
+If a section in the Document is Entitled ``Acknowledgements'',
+``Dedications'', or ``History'', the requirement (section 4) to Preserve
+its Title (section 1) will typically require changing the actual
+title.
+
+@item
+TERMINATION
+
+You may not copy, modify, sublicense, or distribute the Document
+except as expressly provided under this License.  Any attempt
+otherwise to copy, modify, sublicense, or distribute it is void, and
+will automatically terminate your rights under this License.
+
+However, if you cease all violation of this License, then your license
+from a particular copyright holder is reinstated (a) provisionally,
+unless and until the copyright holder explicitly and finally
+terminates your license, and (b) permanently, if the copyright holder
+fails to notify you of the violation by some reasonable means prior to
+60 days after the cessation.
+
+Moreover, your license from a particular copyright holder is
+reinstated permanently if the copyright holder notifies you of the
+violation by some reasonable means, this is the first time you have
+received notice of violation of this License (for any work) from that
+copyright holder, and you cure the violation prior to 30 days after
+your receipt of the notice.
+
+Termination of your rights under this section does not terminate the
+licenses of parties who have received copies or rights from you under
+this License.  If your rights have been terminated and not permanently
+reinstated, receipt of a copy of some or all of the same material does
+not give you any rights to use it.
+
+@item
+FUTURE REVISIONS OF THIS LICENSE
+
+The Free Software Foundation may publish new, revised versions
+of the GNU Free Documentation License from time to time.  Such new
+versions will be similar in spirit to the present version, but may
+differ in detail to address new problems or concerns.  See
+@uref{https://www.gnu.org/licenses/}.
+
+Each version of the License is given a distinguishing version number.
+If the Document specifies that a particular numbered version of this
+License ``or any later version'' applies to it, you have the option of
+following the terms and conditions either of that specified version or
+of any later version that has been published (not as a draft) by the
+Free Software Foundation.  If the Document does not specify a version
+number of this License, you may choose any version ever published (not
+as a draft) by the Free Software Foundation.  If the Document
+specifies that a proxy can decide which future versions of this
+License can be used, that proxy's public statement of acceptance of a
+version permanently authorizes you to choose that version for the
+Document.
+
+@item
+RELICENSING
+
+``Massive Multiauthor Collaboration Site'' (or ``MMC Site'') means any
+World Wide Web server that publishes copyrightable works and also
+provides prominent facilities for anybody to edit those works.  A
+public wiki that anybody can edit is an example of such a server.  A
+``Massive Multiauthor Collaboration'' (or ``MMC'') contained in the
+site means any set of copyrightable works thus published on the MMC
+site.
+
+``CC-BY-SA'' means the Creative Commons Attribution-Share Alike 3.0
+license published by Creative Commons Corporation, a not-for-profit
+corporation with a principal place of business in San Francisco,
+California, as well as future copyleft versions of that license
+published by that same organization.
+
+``Incorporate'' means to publish or republish a Document, in whole or
+in part, as part of another Document.
+
+An MMC is ``eligible for relicensing'' if it is licensed under this
+License, and if all works that were first published under this License
+somewhere other than this MMC, and subsequently incorporated in whole
+or in part into the MMC, (1) had no cover texts or invariant sections,
+and (2) were thus incorporated prior to November 1, 2008.
+
+The operator of an MMC Site may republish an MMC contained in the site
+under CC-BY-SA on the same site at any time before August 1, 2009,
+provided the MMC is eligible for relicensing.
+
+@end enumerate
+
+@page
+@heading ADDENDUM: How to use this License for your documents
+
+To use this License in a document you have written, include a copy of
+the License in the document and put the following copyright and
+license notices just after the title page:
+
+@smallexample
+@group
+  Copyright (C)  @var{year}  @var{your name}.
+  Permission is granted to copy, distribute and/or modify this document
+  under the terms of the GNU Free Documentation License, Version 1.3
+  or any later version published by the Free Software Foundation;
+  with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
+  Texts.  A copy of the license is included in the section entitled ``GNU
+  Free Documentation License''.
+@end group
+@end smallexample
+
+If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts,
+replace the ``with@dots{}Texts.''@: line with this:
+
+@smallexample
+@group
+    with the Invariant Sections being @var{list their titles}, with
+    the Front-Cover Texts being @var{list}, and with the Back-Cover Texts
+    being @var{list}.
+@end group
+@end smallexample
+
+If you have Invariant Sections without Cover Texts, or some other
+combination of the three, merge those two alternatives to suit the
+situation.
+
+If your document contains nontrivial examples of program code, we
+recommend releasing these examples in parallel under your choice of
+free software license, such as the GNU General Public License,
+to permit their use in free software.
+
+@c Local Variables:
+@c ispell-local-pdict: "ispell-dict"
+@c End:
--- a/doc/grep.in.1
+++ b/doc/grep.in.1
@ -2,7 +2,7 @@
 .de dT
 .ds Dt \\$2
 ..
-.dT Time-stamp: "2018-05-11"
+.dT Time-stamp: "2025-03-21"
 .\" Update the above date whenever a change to either this file or
 .\" grep.c's 'usage' function results in a nontrivial change to the man page.
 .\" In Emacs, you can update the date by running 'M-x time-stamp'
@ -11,8 +11,10 @@
 .
 .TH GREP 1 \*(Dt "GNU grep @VERSION@" "User Commands"
 .
-.if !\w|\*(lq| \{\
-.\" groff an-old.tmac does not seem to be in use, so define lq and rq.
+.ie \n(.g .ds ' \(aq
+.el .ds ' '
+.if !\w@\*(lq@ \{\
+.\" The implementation lacks \*(lq and presumably \*(rq.
 .	ie \n(.g \{\
 .		ds lq \(lq\"
 .		ds rq \(rq\"
@ -23,152 +25,156 @@
 .	\}
 .\}
 .
-.if !\w|\*(la| \{\
+.as mC
+.if !\w@\*(mC@ \{\
 .\" groff an-ext.tmac does not seem to be in use, so define the parts of
-.\" it that are used below.  For a copy of groff an-ext.tmac, please see:
-.\" https://git.savannah.gnu.org/cgit/groff.git/plain/tmac/an-ext.tmac
-.\" --- Start of lines taken from groff an-ext.tmac
+.\" it that are used below, taken from groff 1.23.0.  For a copy, please see:
+.\" https://git.savannah.gnu.org/cgit/groff.git/plain/tmac/an-ext.tmac?id=1.23.0
+.nr mG \n(.g-1
+.\" --- Start of lines taken from groff an-ext.tmac,
+.\" except with "nr mH 14" replaced by "nr mH 0"
+.\" and with mS, SY, YS definitions omitted.
 .
-.\" Check whether we are using grohtml.
-.nr mH 0
-.if \n(.g \
-.  if '\*(.T'html' \
-.    nr mH 1
+.\" Define this to your implementation's constant-width typeface.
+.ds mC CW
+.if n .ds mC R
+.
+.\" Save the automatic hyphenation mode.
+.\"
+.\" In AT&T troff, there was no register exposing the hyphenation mode,
+.\" and no way to save and restore it.  Set `mH` to a reasonable value
+.\" for your implementation and preference.
+.de mY
+.  ie !\\n(.g \
+.    nr mH 0
+.  el \
+.    do nr mH \\n[.hy] \" groff extension register
+..
+.
+.nr mE 0 \" in an example (EX/EE)?
+.
+.\" Prepare link text for mail/web hyperlinks.  `MT` and `UR` call this.
+.de mV
+.  ds m1 \\$1\"
+..
 .
 .
-.\" Map mono-width fonts to standard fonts for groff's TTY device.
-.if n \{\
-.  do ftr CR R
-.  do ftr CI I
-.  do ftr CB B
-.\}
+.\" Emit hyperlink.  The optional argument supplies trailing punctuation
+.\" after link text.  `ME` and `UE` call this.
+.de mQ
+.  mY
+.  nh
+<\\*(m1>\\$1
+.  hy \\n(mH
+..
 .
-.\" groff has glyph entities for angle brackets.
-.ie \n(.g \{\
-.  ds la \(la\"
-.  ds ra \(ra\"
-.\}
-.el \{\
-.  ds la <\"
-.  ds ra >\"
-.  \" groff's man macros control hyphenation with this register.
-.  nr HY 1
-.\}
 .
 .\" Start URL.
+.if \n(.g-\n(mG \{\
 .de UR
-.  ds m1 \\$1\"
-.  nh
-.  if \\n(mH \{\
-.    \" Start diversion in a new environment.
-.    do ev URL-div
-.    do di URL-div
-.  \}
+.  mV \\$1
 ..
+.\}
 .
 .
 .\" End URL.
+.if \n(.g-\n(mG \{\
 .de UE
-.  ie \\n(mH \{\
-.    br
-.    di
-.    ev
-.
-.    \" Has there been one or more input lines for the link text?
-.    ie \\n(dn \{\
-.      do HTML-NS "<a href=""\\*(m1"">"
-.      \" Yes, strip off final newline of diversion and emit it.
-.      do chop URL-div
-.      do URL-div
-\c
-.      do HTML-NS </a>
-.    \}
-.    el \
-.      do HTML-NS "<a href=""\\*(m1"">\\*(m1</a>"
-\&\\$*\"
-.  \}
-.  el \
-\\*(la\\*(m1\\*(ra\\$*\"
-.
-.  hy \\n(HY
+.  mQ \\$1
 ..
+.\}
 .
 .
 .\" Start email address.
+.if \n(.g-\n(mG \{\
 .de MT
-.  ds m1 \\$1\"
-.  nh
-.  if \\n(mH \{\
-.    \" Start diversion in a new environment.
-.    do ev URL-div
-.    do di URL-div
-.  \}
+.  mV \\$1
 ..
+.\}
 .
 .
 .\" End email address.
+.if \n(.g-\n(mG \{\
 .de ME
-.  ie \\n(mH \{\
-.    br
-.    di
-.    ev
-.
-.    \" Has there been one or more input lines for the link text?
-.    ie \\n(dn \{\
-.      do HTML-NS "<a href=""mailto:\\*(m1"">"
-.      \" Yes, strip off final newline of diversion and emit it.
-.      do chop URL-div
-.      do URL-div
-\c
-.      do HTML-NS </a>
-.    \}
-.    el \
-.      do HTML-NS "<a href=""mailto:\\*(m1"">\\*(m1</a>"
-\&\\$*\"
-.  \}
-.  el \
-\\*(la\\*(m1\\*(ra\\$*\"
-.
-.  hy \\n(HY
+.  mQ \\$1
 ..
+.\}
+.
+.
+.\" Start example.
+.if \n(.g-\n(mG \{\
+.de EX
+.  br
+.  if !\\n(mE \{\
+.    nr mF \\n(.f
+.    nr mP \\n(PD
+.    nr PD 1v
+.    nf
+.    ft \\*(mC
+.    nr mE 1
+.  \}
+..
+.\}
+.
+.
+.\" End example.
+.if \n(.g-\n(mG \{\
+.de EE
+.  br
+.  if \\n(mE \{\
+.    ft \\n(mF
+.    nr PD \\n(mP
+.    fi
+.    nr mE 0
+.  \}
+..
+.\}
 .\" --- End of lines taken from groff an-ext.tmac
 .\}
 .
 .hy 0
 .
 .SH NAME
-grep, egrep, fgrep \- print lines that match patterns
+grep \- print lines that match patterns
 .
 .SH SYNOPSIS
 .B grep
-.RI [ OPTION .\|.\|.]\&
+.RI [ OPTION ].\|.\|.\&
 .I PATTERNS
-.RI [ FILE .\|.\|.]
+.RI [ FILE ].\|.\|.
 .br
 .B grep
-.RI [ OPTION .\|.\|.]\&
+.RI [ OPTION ].\|.\|.\&
 .B \-e
 .I PATTERNS
 \&.\|.\|.\&
-.RI [ FILE .\|.\|.]
+.RI [ FILE ].\|.\|.
 .br
 .B grep
-.RI [ OPTION .\|.\|.]\&
+.RI [ OPTION ].\|.\|.\&
 .B \-f
 .I PATTERN_FILE
 \&.\|.\|.\&
-.RI [ FILE .\|.\|.]
+.RI [ FILE ].\|.\|.
 .
 .SH DESCRIPTION
 .B grep
-searches for
-.I PATTERNS
-in each
+searches for patterns in each
 .IR FILE .
+In the synopsis's first form, which is used if no
+.B \-e
+or
+.B \-f
+options are present, the first operand
 .I PATTERNS
-is one or patterns separated by newline characters, and
+is one or more patterns separated by newline characters, and
 .B grep
 prints each line that matches a pattern.
+Typically
+.I PATTERNS
+should be quoted when
+.B grep
+is used in a shell command.
 .PP
 A
 .I FILE
@ -179,17 +185,6 @@ If no
 .I FILE
 is given, recursive searches examine the working directory,
 and nonrecursive searches read standard input.
-.PP
-In addition, the variant programs
-.B egrep
-and
-.B fgrep
-are the same as
-.B "grep\ \-E"
-and
-.BR "grep\ \-F" ,
-respectively.
-These variants are deprecated, but are provided for backward compatibility.
 .
 .SH OPTIONS
 .SS "Generic Program Information"
@ -201,7 +196,7 @@ Output a usage message and exit.
 Output the version number of
 .B grep
 and exit.
-.SS "Matcher Selection"
+.SS "Pattern Syntax"
 .TP
 .BR \-E ", " \-\^\-extended\-regexp
 Interpret
@ -220,7 +215,9 @@ as basic regular expressions (BREs, see below).
 This is the default.
 .TP
 .BR \-P ", " \-\^\-perl\-regexp
-Interpret PATTERNS as Perl-compatible regular expressions (PCREs).
+Interpret
+.I PATTERNS
+as Perl-compatible regular expressions (PCREs).
 This option is experimental when combined with the
 .B \-z
 .RB ( \-\^\-null\-data )
@ -248,11 +245,24 @@ If this option is used multiple times or is combined with the
 .RB ( \-\^\-regexp )
 option, search for all patterns given.
 The empty file contains zero patterns, and therefore matches nothing.
+If
+.I FILE
+is
+.BR \- ,
+read patterns from standard input.
 .TP
 .BR \-i ", " \-\^\-ignore\-case
-Ignore case distinctions, so that characters that differ only in case
+Ignore case distinctions in patterns and input data,
+so that characters that differ only in case
 match each other.
 .TP
+.B \-\^\-no\-ignore\-case
+Do not ignore case distinctions in patterns and input data.
+This is the default.
+This option is useful for passing to shell scripts that already use
+.BR \-i ,
+to cancel its effects because the two options override each other.
+.TP
 .BR \-v ", " \-\^\-invert\-match
 Invert the sense of matching, to select non-matching lines.
 .TP
@ -275,10 +285,6 @@ pattern and then surrounding it with
 .B ^
 and
 .BR $ .
-.TP
-.B \-y
-Obsolete synonym for
-.BR \-i .
 .SS "General Output Control"
 .TP
 .BR \-c ", " \-\^\-count
@ -286,7 +292,7 @@ Suppress normal output; instead print a count of
 matching lines for each input file.
 With the
 .BR \-v ", " \-\^\-invert\-match
-option (see below), count non-matching lines.
+option (see above), count non-matching lines.
 .TP
 .BR \-\^\-color [ =\fIWHEN\fP "], " \-\^\-colour [ =\fIWHEN\fP ]
 Surround the matched (non-empty) strings, matching lines, context lines,
@ -295,9 +301,6 @@ groups of context lines) with escape sequences to display them in color
 on the terminal.
 The colors are defined by the environment variable
 .BR GREP_COLORS .
-The deprecated environment variable
-.B GREP_COLOR
-is still supported, but its setting does not have priority.
 .I WHEN
 is
 .BR never ", " always ", or " auto .
@ -306,18 +309,27 @@ is
 Suppress normal output; instead print the name
 of each input file from which no output would
 normally have been printed.
-The scanning will stop on the first match.
 .TP
 .BR \-l ", " \-\^\-files\-with\-matches
 Suppress normal output; instead print
 the name of each input file from which output
 would normally have been printed.
-The scanning will stop on the first match.
+Scanning each input file stops upon first match.
 .TP
 .BI \-m " NUM" "\fR,\fP \-\^\-max\-count=" NUM
 Stop reading a file after
 .I NUM
 matching lines.
+If
+.I NUM
+is zero,
+.B grep
+stops right away without reading input.
+A
+.I NUM
+of \-1 is treated as infinity and
+.B grep
+does not stop; this is the default.
 If the input is standard input from a regular file,
 and
 .I NUM
@ -380,6 +392,7 @@ print the offset of the matching part itself.
 .BR \-H ", " \-\^\-with\-filename
 Print the file name for each match.
 This is the default when there is more than one file to search.
+This is a GNU extension.
 .TP
 .BR \-h ", " \-\^\-no\-filename
 Suppress the prefixing of file names on output.
@ -389,10 +402,10 @@ This is the default when there is only one file
 .BI \-\^\-label= LABEL
 Display input actually coming from standard input as input coming from file
 .IR LABEL .
-This is especially useful when implementing tools like
-.BR zgrep ,
+This can be useful for commands that transform a file's contents
+before searching,
 e.g.,
-.BR "gzip \-cd foo.gz | grep \-\^\-label=foo \-H something" .
+.BR "gzip \-cd foo.gz | grep \-\^\-label=foo \-H \*'some pattern\*'" .
 See also the
 .B \-H
 option.
@ -413,20 +426,6 @@ from a single file will all start at the same column,
 this also causes the line number and byte offset (if present)
 to be printed in a minimum size field width.
 .TP
-.BR \-u ", " \-\^\-unix\-byte\-offsets
-Report Unix-style byte offsets.
-This switch causes
-.B grep
-to report byte offsets as if the file were a Unix-style text file,
-i.e., with CR characters stripped off.
-This will produce results identical to running
-.B grep
-on a Unix machine.
-This option has no effect unless
-.B \-b
-option is also used;
-it has no effect on platforms other than MS-DOS and MS-Windows.
-.TP
 .BR \-Z ", " \-\^\-null
 Output a zero byte (the ASCII
 .B NUL
@ -484,6 +483,26 @@ With the
 or
 .B \-\^\-only\-matching
 option, this has no effect and a warning is given.
+.TP
+.BI \-\^\-group\-separator= SEP
+When
+.BR \-A ,
+.BR \-B ,
+or
+.B \-C
+are in use, print
+.I SEP
+instead of
+.B \-\^\-
+between groups of lines.
+.TP
+.B \-\^\-no\-group\-separator
+When
+.BR \-A ,
+.BR \-B ,
+or
+.B \-C
+are in use, do not print a separator between groups of lines.
 .SS "File and Directory Selection"
 .TP
 .BR \-a ", " \-\^\-text
@ -505,11 +524,14 @@ By default,
 .I TYPE
 is
 .BR binary ,
-and when
+and
 .B grep
-discovers that a file is binary it suppresses any further output, and
-instead outputs either a one-line message saying that a binary file
-matches, or no message if there is no match.
+suppresses output after null input binary data is discovered,
+and suppresses output lines that contain improperly encoded data.
+When some output is suppressed,
+.B grep
+follows any output
+with a message to standard error saying that a binary file matches.
 .IP
 If
 .I TYPE
@ -517,7 +539,7 @@ is
 .BR without\-match ,
 when
 .B grep
-discovers that a file is binary it assumes that the rest of the file
+discovers null input binary data it assumes that the rest of the file
 does not match; this is equivalent to the
 .B \-I
 option.
@ -574,7 +596,7 @@ On the other hand, when reading files whose text encodings are
 unknown, it can be helpful to use
 .B \-a
 or to set
-.B LC_ALL='C'
+.B LC_ALL=\*'C\*'
 in the environment, in order to find more matches even if the matches
 are unsafe for direct display.
 .TP
@ -621,14 +643,13 @@ option.
 Skip any command-line file with a name suffix that matches the pattern
 .IR GLOB ,
 using wildcard matching; a name suffix is either the whole
-name, or any suffix starting after a
-.B /
-and before a
-.RB non- / .
+name, or a trailing part that starts with a non-slash character
+immediately after a slash
+.RB ( / )
+in the name.
 When searching recursively, skip any subfile whose base name matches
 .IR GLOB ;
-the base name is the part after the last
-.BR / .
+the base name is the part after the last slash.
 A pattern can use
 .BR * ,
 .BR ? ,
@ -654,7 +675,7 @@ whose base name matches
 Ignore any redundant trailing slashes in
 .IR GLOB .
 .TP
-.BR \-I
+.B \-I
 Process a binary file as if it did not contain matching data; this is
 equivalent to the
 .B \-\^\-binary\-files=without\-match
@ -665,11 +686,24 @@ Search only files whose base name matches
 .I GLOB
 (using wildcard matching as described under
 .BR \-\^\-exclude ).
+If contradictory
+.B \-\^\-include
+and
+.B \-\^\-exclude
+options are given, the last matching one wins.
+If no
+.B \-\^\-include
+or
+.B \-\^\-exclude
+options match, a file is included unless the first such option is
+.BR \-\^\-include .
 .TP
 .BR \-r ", " \-\^\-recursive
 Read all files under each directory, recursively,
 following symbolic links only if they are on the command line.
-Note that if no file operand is given, grep searches the working directory.
+Note that if no file operand is given,
+.B grep
+searches the working directory.
 This is equivalent to the
 .B "\-d recurse"
 option.
@ -680,19 +714,19 @@ Follow all symbolic links, unlike
 .BR \-r .
 .SS "Other Options"
 .TP
-.BR \-\^\-line\-buffered
+.B \-\^\-line\-buffered
 Use line buffering on output.
 This can cause a performance penalty.
 .TP
 .BR \-U ", " \-\^\-binary
 Treat the file(s) as binary.
 By default, under MS-DOS and MS-Windows,
-.BR grep
+.B grep
 guesses whether a file is text or binary as described for the
 .B \-\^\-binary\-files
 option.
 If
-.BR grep
+.B grep
 decides the file is a text file, it strips the CR characters from the
 original file contents (to make regular expressions with
 .B ^
@ -716,7 +750,7 @@ Like the
 or
 .B \-\^\-null
 option, this option can be used with commands like
-.B sort -z
+.B "sort \-z"
 to process arbitrary file names.
 .
 .SH "REGULAR EXPRESSIONS"
@ -728,15 +762,19 @@ expressions, by using various operators to combine smaller expressions.
 understands three different versions of regular expression syntax:
 \*(lqbasic\*(rq (BRE), \*(lqextended\*(rq (ERE) and \*(lqperl\*(rq (PCRE).
 In GNU
-.B grep
-there is no difference in available functionality between basic and
-extended syntaxes.
-In other implementations, basic regular expressions are less powerful.
+.BR grep ,
+basic and extended regular expressions are merely different notations
+for the same pattern-matching functionality.
+In other implementations, basic regular expressions are ordinarily
+less powerful than extended, though occasionally it is the other way around.
 The following description applies to extended regular expressions;
 differences for basic regular expressions are summarized afterwards.
-Perl-compatible regular expressions give additional functionality, and are
-documented in pcresyntax(3) and pcrepattern(3), but work only if
-PCRE is available in the system.
+Perl-compatible regular expressions have different functionality, and are
+documented in
+.BR pcre2syntax (3)
+and
+.BR pcre2pattern (3),
+but work only if PCRE support is enabled.
 .PP
 The fundamental building blocks are the regular expressions
 that match a single character.
@ -771,19 +809,21 @@ matches any single digit.
 Within a bracket expression, a
 .I "range expression"
 consists of two characters separated by a hyphen.
-It matches any single character that sorts between the two characters,
-inclusive, using the locale's collating sequence and character set.
-For example, in the default C locale,
+In the default C locale, it matches any single character that appears
+between the two characters in ASCII order, inclusive.
+For example,
 .B [a\-d]
 is equivalent to
 .BR [abcd] .
-Many locales sort characters in dictionary order, and in these locales
+In other locales the behavior is unspecified:
 .B [a\-d]
-is typically not equivalent to
-.BR [abcd] ;
-it might be equivalent to
-.BR [aBbCcDd] ,
-for example.
+might be equivalent to
+.B [abcd]
+or
+.B [aBbCcDd]
+or some other bracket expression,
+or it might fail to match any character, or the set of
+characters that it matches might be erratic, or it might be invalid.
 To obtain the traditional interpretation of bracket expressions,
 you can use the C locale by setting the
 .B LC_ALL
@ -795,6 +835,7 @@ bracket expressions, as follows.
 Their names are self explanatory, and they are
 .BR [:alnum:] ,
 .BR [:alpha:] ,
+.BR [:blank:] ,
 .BR [:cntrl:] ,
 .BR [:digit:] ,
 .BR [:graph:] ,
@ -905,7 +946,7 @@ Repetition takes precedence over concatenation, which in turn
 takes precedence over alternation.
 A whole expression may be enclosed in parentheses
 to override these precedence rules and form a subexpression.
-.SS "Back References and Subexpressions"
+.SS "Back-references and Subexpressions"
 The back-reference
 .BI \e n\c
 \&, where
@ -922,7 +963,7 @@ In basic regular expressions the meta-characters
 .BR | ,
 .BR ( ,
 and
-.BR )
+.B )
 lose their special meaning; instead use the backslashed
 versions
 .BR \e? ,
@ -933,7 +974,18 @@ versions
 and
 .BR \e) .
 .
-.SH "ENVIRONMENT VARIABLES"
+.SH "EXIT STATUS"
+Normally the exit status is 0 if a line is selected, 1 if no lines
+were selected, and 2 if an error occurred.  However, if the
+.B \-q
+or
+.B \-\^\-quiet
+or
+.B \-\^\-silent
+is used and a line is selected, the exit status is 0 even if an error
+occurred.
+.
+.SH ENVIRONMENT
 The behavior of
 .B grep
 is affected by the following environment variables.
@ -963,45 +1015,10 @@ The shell command
 .B "locale \-a"
 lists locales that are currently available.
 .TP
-.B GREP_OPTIONS
-This variable specifies default options
-to be placed in front of any explicit options.
-As this causes problems when writing portable scripts,
-this feature will be removed in a future release of
-.BR grep ,
-and
-.B grep
-warns if it is used.
-Please use an alias or script instead.
-.TP
-.B GREP_COLOR
-This variable specifies the color used to highlight matched (non-empty) text.
-It is deprecated in favor of
-.BR GREP_COLORS ,
-but still supported.
-The
-.BR mt ,
-.BR ms ,
-and
-.B mc
-capabilities of
 .B GREP_COLORS
-have priority over it.
-It can only specify the color used to highlight
-the matching non-empty text in any matching line
-(a selected line when the
-.B \-v
-command-line option is omitted,
-or a context line when
-.B \-v
-is specified).
-The default is
-.BR 01;31 ,
-which means a bold red foreground text on the terminal's default background.
-.TP
-.B GREP_COLORS
-Specifies the colors and other attributes
-used to highlight various parts of the output.
+Controls how the
+.B \-\^\-color
+option highlights output.
 Its value is a colon-separated list of capabilities
 that defaults to
 .B ms=01;31:mc=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36
@ -1235,45 +1252,13 @@ front of the operand list and are treated as options.
 Also, POSIX requires that unrecognized options be diagnosed as
 \*(lqillegal\*(rq, but since they are not really against the law the default
 is to diagnose them as \*(lqinvalid\*(rq.
-.B POSIXLY_CORRECT
-also disables \fB_\fP\fIN\fP\fB_GNU_nonoption_argv_flags_\fP,
-described below.
-.TP
-\fB_\fP\fIN\fP\fB_GNU_nonoption_argv_flags_\fP
-(Here
-.I N
-is
-.BR grep 's
-numeric process ID.)  If the
-.IR i th
-character of this environment variable's value is
-.BR 1 ,
-do not consider the
-.IR i th
-operand of
-.B grep
-to be an option, even if it appears to be one.
-A shell can put this variable in the environment for each command it runs,
-specifying which operands are the results of file name wildcard
-expansion and therefore should not be treated as options.
-This behavior is available only with the GNU C library, and only
-when
-.B POSIXLY_CORRECT
-is not set.
 .
-.SH "EXIT STATUS"
-Normally the exit status is 0 if a line is selected, 1 if no lines
-were selected, and 2 if an error occurred.  However, if the
-.B \-q
-or
-.B \-\^\-quiet
-or
-.B \-\^\-silent
-is used and a line is selected, the exit status is 0 even if an error
-occurred.
+.SH NOTES
+This man page is maintained only fitfully;
+the full documentation is often more up-to-date.
 .
 .SH COPYRIGHT
-Copyright 1998\(en2000, 2002, 2005\(en2018 Free Software Foundation, Inc.
+Copyright 1998\(en2000, 2002, 2005\(en2026 Free Software Foundation, Inc.
 .PP
 This is free software;
 see the source for copying conditions.
@ -1309,16 +1294,48 @@ to run out of memory.
 .PP
 Back-references are very slow, and may require exponential time.
 .
+.SH EXAMPLE
+The following example outputs the location and contents of any line
+containing \*(lqf\*(rq and ending in \*(lq.c\*(rq,
+within all files in the current directory whose names
+contain \*(lqg\*(rq and end in \*(lq.h\*(rq.
+The
+.B \-n
+option outputs line numbers, the
+.B \-\^\-
+argument treats expansions of \*(lq*g*.h\*(rq starting with \*(lq\-\*(rq
+as file names not options,
+and the empty file /dev/null causes file names to be output
+even if only one file name happens to be of the form \*(lq*g*.h\*(rq.
+.PP
+.in +2n
+.EX
+$ \fBgrep\fP \-n \-\^\- \*'f.*\e.c$\*' *g*.h /dev/null
+argmatch.h:1:/* definitions and prototypes for argmatch.c
+.EE
+.in
+.PP
+The only line that matches is line 1 of argmatch.h.
+Note that the regular expression syntax used in the pattern differs
+from the globbing syntax that the shell uses to match file names.
+.
 .SH "SEE ALSO"
 .SS "Regular Manual Pages"
-awk(1), cmp(1), diff(1), find(1), gzip(1),
-perl(1), sed(1), sort(1), xargs(1), zgrep(1),
-read(2),
-pcre(3), pcresyntax(3), pcrepattern(3),
-terminfo(5),
-glob(7), regex(7).
-.SS "POSIX Programmer's Manual Page"
-grep(1p).
+.BR awk (1),
+.BR cmp (1),
+.BR diff (1),
+.BR find (1),
+.BR perl (1),
+.BR sed (1),
+.BR sort (1),
+.BR xargs (1),
+.BR read (2),
+.BR pcre2 (3),
+.BR pcre2syntax (3),
+.BR pcre2pattern (3),
+.BR terminfo (5),
+.BR glob (7),
+.BR regex (7)
 .SS "Full Documentation"
 A
 .UR https://www.gnu.org/software/grep/manual/
@ -1335,9 +1352,6 @@ programs are properly installed at your site, the command
 .PP
 should give you access to the complete manual.
 .
-.SH NOTES
-This man page is maintained only fitfully;
-the full documentation is often more up-to-date.
 .\" Work around problems with some troff -man implementations.
 .br
 .
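Note on the doc/grep.1 changes above: the relocated EXIT STATUS section and the reworded range-expression text can be checked with a short shell session. This is only a sketch. It assumes a POSIX-like shell and some grep on PATH, and the sample input and the missing file name are hypothetical.

```shell
# Exit status: 0 if a line is selected, 1 if none is, 2 on error.
selected=0
printf 'alpha\nbeta\n' | grep -q 'beta' || selected=$?

none=0
printf 'alpha\nbeta\n' | grep -q 'gamma' || none=$?

# Hypothetical file name, assumed not to exist in the working directory.
err=0
grep 'beta' ./hypothetical-missing-file 2>/dev/null || err=$?

# In the C locale, [a-d] is equivalent to [abcd], so "B" is not counted.
range_count=$(printf 'b\nB\n' | LC_ALL=C grep -c '[a-d]')

printf '%s %s %s %s\n' "$selected" "$none" "$err" "$range_count"
```

The `|| var=$?` form records the status without aborting a script that runs under `set -e`.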
--- a/doc/grep.texi
+++ b/doc/grep.texi
--- a/2
+++ b/2
@ -1 +1 @@
-Subproject commit 5d6a3cdd5c312e77a6d0f0848e3cb79a52e08658
+Subproject commit 4f6ac2c3c689cd7312b5f9da97791b14bbc2ee53
--- a/gnulib-tests/Makefile.am
+++ b/gnulib-tests/Makefile.am
@ -1 +1,3 @@
+AM_CFLAGS = $(GNULIB_TEST_WARN_CFLAGS) $(WERROR_CFLAGS)
+
 include gnulib.mk
--- a/lib/Makefile.am
+++ b/lib/Makefile.am
@ -1,4 +1,4 @@
-# Copyright 1997-1998, 2005-2018 Free Software Foundation, Inc.
+# Copyright 1997-1998, 2005-2026 Free Software Foundation, Inc.
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
--- a/lib/colorize-posix.c
+++ b/lib/colorize-posix.c
@ -1,5 +1,5 @@
 /* Output colorization.
-   Copyright 2011-2018 Free Software Foundation, Inc.
+   Copyright 2011-2026 Free Software Foundation, Inc.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
@ -12,9 +12,7 @@
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
-   along with this program; if not, write to the Free Software
-   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
-   02110-1301, USA.  */
+   along with this program.  If not, see <https://www.gnu.org/licenses/>.  */

 /* Without this pragma, gcc 4.7.0 20120102 suggests that the
   init_colorize function might be candidate for attribute 'const'  */
--- a/lib/colorize-w32.c
+++ b/lib/colorize-w32.c
@ -1,5 +1,5 @@
 /* Output colorization on MS-Windows.
-   Copyright 2011-2018 Free Software Foundation, Inc.
+   Copyright 2011-2026 Free Software Foundation, Inc.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
@ -12,9 +12,7 @@
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
-   along with this program; if not, write to the Free Software
-   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
-   02110-1301, USA.  */
+   along with this program.  If not, see <https://www.gnu.org/licenses/>.  */

 /* Written by Eli Zaretskii.  */

@ -96,7 +94,7 @@ w32_sgr2attr (const char *sgr_seq)
    {
      if (*p == ';' || *p == '\0')
        {
-          code = strtol (s, NULL, 10);
+          code = strtol (s, nullptr, 10);
          s = p + (*p != '\0');

          switch (code)
--- a/lib/colorize.h
+++ b/lib/colorize.h
@ -1,6 +1,6 @@
 /* Output colorization.

-   Copyright 2011-2018 Free Software Foundation, Inc.
+   Copyright 2011-2026 Free Software Foundation, Inc.
   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
   the Free Software Foundation; either version 3, or (at your option)
@ -12,9 +12,7 @@
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
-   along with this program; if not, write to the Free Software
-   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
-   02110-1301, USA.  */
+   along with this program.  If not, see <https://www.gnu.org/licenses/>.  */

 extern int should_colorize (void);
 extern void init_colorize (void);
--- a/m4/pcre.m4
+++ b/m4/pcre.m4
@ -1,6 +1,6 @@
-# pcre.m4 - check for libpcre support
+# pcre.m4 - check for PCRE library support

-# Copyright (C) 2010-2018 Free Software Foundation, Inc.
+# Copyright (C) 2010-2026 Free Software Foundation, Inc.
 # This file is free software; the Free Software Foundation
 # gives unlimited permission to copy and/or distribute it,
 # with or without modifications, as long as this notice is preserved.
@ -8,8 +8,8 @@
 AC_DEFUN([gl_FUNC_PCRE],
 [
  AC_ARG_ENABLE([perl-regexp],
-    AC_HELP_STRING([--disable-perl-regexp],
-                   [disable perl-regexp (pcre) support]),
+    AS_HELP_STRING([--disable-perl-regexp],
+                   [disable perl-regexp (PCRE) support]),
    [case $enableval in
       yes|no) test_pcre=$enableval;;
       *) AC_MSG_ERROR([invalid value $enableval for --disable-perl-regexp]);;
@ -21,36 +21,54 @@ AC_DEFUN([gl_FUNC_PCRE],
  use_pcre=no

  if test $test_pcre != no; then
-    PKG_CHECK_MODULES([PCRE], [libpcre], [], [: ${PCRE_LIBS=-lpcre}])

-    AC_CACHE_CHECK([for pcre_compile], [pcre_cv_have_pcre_compile],
+    AS_CASE([${PCRE_CFLAGS+set}@${PCRE_LIBS+set}@$PKG_CONFIG],
+      [@@false], [],
+      [@@*], [PKG_CHECK_MODULES([PCRE], [libpcre2-8], [], [:])])
+
+    AC_CACHE_CHECK([for pcre2_compile], [pcre_cv_have_pcre2_compile],
      [pcre_saved_CFLAGS=$CFLAGS
       pcre_saved_LIBS=$LIBS
-       CFLAGS="$CFLAGS $PCRE_CFLAGS"
-       LIBS="$PCRE_LIBS $LIBS"
-       AC_LINK_IFELSE(
-         [AC_LANG_PROGRAM([[#include <pcre.h>
-                          ]],
-            [[pcre *p = pcre_compile (0, 0, 0, 0, 0);
-              return !p;]])],
-         [pcre_cv_have_pcre_compile=yes],
-         [pcre_cv_have_pcre_compile=no])
+       pcre_cv_have_pcre2_compile=no
+
+       while
+         CFLAGS="$pcre_saved_CFLAGS $PCRE_CFLAGS"
+         LIBS="$pcre_saved_LIBS $PCRE_LIBS"
+         AC_LINK_IFELSE(
+           [AC_LANG_PROGRAM([[#define PCRE2_CODE_UNIT_WIDTH 8
+                              #include <pcre2.h>
+                            ]],
+              [[pcre2_code *p = pcre2_compile (0, 0, 0, 0, 0, 0);
+                return !p;]])],
+           [pcre_cv_have_pcre2_compile=yes])
+         test $pcre_cv_have_pcre2_compile = no
+       do
+         AS_CASE([$PCRE_CFLAGS@$PCRE_LIBS],
+           [@-lpcre2-8],
+             [# Even the fallback setting fails; give up.
+              PCRE_LIBS=
+              break])
+         # Fallback setting.
+         PCRE_CFLAGS=
+         PCRE_LIBS=-lpcre2-8
+       done
+
       CFLAGS=$pcre_saved_CFLAGS
       LIBS=$pcre_saved_LIBS])

-    if test "$pcre_cv_have_pcre_compile" = yes; then
+    if test "$pcre_cv_have_pcre2_compile" = yes; then
      use_pcre=yes
    elif test $test_pcre = maybe; then
-      AC_MSG_WARN([AC_PACKAGE_NAME will be built without pcre support.])
+      AC_MSG_WARN([AC_PACKAGE_NAME will be built without PCRE support.])
    else
-      AC_MSG_ERROR([pcre support not available])
+      AC_MSG_ERROR([PCRE support not available])
    fi
  fi

  if test $use_pcre = yes; then
    AC_DEFINE([HAVE_LIBPCRE], [1],
      [Define to 1 if you have the Perl Compatible Regular Expressions
-       library (-lpcre).])
+       library.])
  else
    PCRE_CFLAGS=
    PCRE_LIBS=
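Note on the m4/pcre.m4 change above: the new `while ... do ... done` loop puts the link test in the loop condition and retries once with a fallback setting. The control flow can be sketched in plain shell. Here `try_link` is a hypothetical stand-in for `AC_LINK_IFELSE`, and the initial flag values are invented for illustration.

```shell
# Hypothetical stand-in for the link test: it "succeeds" only with the
# fallback library setting, so the loop body runs exactly once.
try_link() { test "$PCRE_LIBS" = -lpcre2-8; }

PCRE_CFLAGS=-I/hypothetical/include   # pretend pkg-config reported these
PCRE_LIBS=-L/hypothetical/lib
have_pcre2=no
attempts=0

while
  attempts=$((attempts + 1))
  if try_link; then have_pcre2=yes; fi
  test "$have_pcre2" = no             # last command decides whether to loop
do
  case "$PCRE_CFLAGS@$PCRE_LIBS" in
    @-lpcre2-8)
      # Even the fallback setting failed; give up.
      PCRE_LIBS=
      break ;;
  esac
  # Fallback setting.
  PCRE_CFLAGS=
  PCRE_LIBS=-lpcre2-8
done

printf '%s %s %s\n' "$have_pcre2" "$attempts" "$PCRE_LIBS"
```

With this stub the first attempt fails, the fallback is installed, and the second attempt succeeds. If `try_link` always failed, the `case` arm would clear `PCRE_LIBS` and break out after the second attempt.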
--- a/po/POTFILES.in
+++ b/po/POTFILES.in
@ -1,6 +1,6 @@
 # List of files which containing translatable strings.
 #
-# Copyright 1997-1998, 2005-2018 Free Software Foundation, Inc.
+# Copyright 1997-1998, 2005-2026 Free Software Foundation, Inc.
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@ -16,6 +16,7 @@
 # along with this program.  If not, see <https://www.gnu.org/licenses/>.

 lib/argmatch.c
+lib/argmatch.h
 lib/c-stack.c
 lib/closeout.c
 lib/dfa.c
@ -28,6 +29,6 @@ lib/quotearg.c
 lib/regcomp.c
 lib/version-etc.c
 lib/xalloc-die.c
-lib/xstrtol-error.c
+src/dfasearch.c
 src/grep.c
 src/pcresearch.c
--- a/src/Makefile.am
+++ b/src/Makefile.am
@ -1,5 +1,5 @@
 ## Process this file with automake to create Makefile.in
-# Copyright 1997-1998, 2005-2018 Free Software Foundation, Inc.
+# Copyright 1997-1998, 2005-2026 Free Software Foundation, Inc.
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@ -28,11 +28,12 @@ grep_SOURCES =					\
  die.h						\
  grep.c					\
  kwsearch.c					\
-  kwset.c					\
-  pcresearch.c					\
  searchutils.c
+if USE_PCRE
+grep_SOURCES += pcresearch.c
+endif

-noinst_HEADERS = grep.h kwset.h search.h system.h
+noinst_HEADERS = grep.h search.h system.h

 # Sometimes, the expansion of $(LIBINTL) includes -lc which may
 # include modules defining variables like 'optind', so libgreputils.a
@ -40,7 +41,9 @@ noinst_HEADERS = grep.h kwset.h search.h system.h
 # But libgreputils.a must also follow $(LIBINTL), since libintl uses
 # replacement functions defined in libgreputils.a.
 LDADD = \
-  ../lib/libgreputils.a $(LIBINTL) ../lib/libgreputils.a $(LIBICONV) \
+  ../lib/libgreputils.a $(LIBINTL) ../lib/libgreputils.a \
+  $(HARD_LOCALE_LIB) $(LIBC32CONV) \
+  $(LIBSIGSEGV) $(LIBUNISTRING) $(MBRTOWC_LIB) $(SETLOCALE_NULL_LIB) \
  $(LIBTHREAD)

 grep_LDADD = $(LDADD) $(PCRE_LIBS) $(LIBCSTACK)
@ -52,11 +55,11 @@ EXTRA_DIST = egrep.sh
 egrep fgrep: egrep.sh Makefile
 	$(AM_V_GEN)grep=`echo grep | sed -e '$(transform)'` &&		\
 	case $@ in egrep) option=-E;; fgrep) option=-F;; esac &&	\
-	shell_does_substrings='set x/y && d=$${1%/*} && test "$$d" = x' && \
+	shell_does_substrings='set x/y && d=$${1##*/} && test "$$d" = y' && \
 	if $(SHELL) -c "$$shell_does_substrings" 2>/dev/null; then	\
 	  edit_substring='s,X,X,';					\
 	else								\
-	  edit_substring='s,\$${0%/\*},`expr "X$$0" : '\''X\\(.*\\)/'\''`,g'; \
+	  edit_substring='s,\$${0##\*/},`expr "X$$0" : '\''X\\(.*\\)/'\''`,g'; \
 	fi &&								\
 	sed -e 's|[@]SHELL@|$(SHELL)|g'					\
 	    -e "$$edit_substring"					\
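Note on the src/Makefile.am change above: the probe switches from the `%/*` suffix-removal expansion to the `##*/` prefix-removal expansion. The sketch below contrasts the two on the same sample value. The variable names are hypothetical, and the probe is written without make's `$$` escaping.

```shell
# Sample value matching the "set x/y" used by the probe.
path=x/y

dir=${path%/*}     # remove the shortest suffix matching "/*": directory part
base=${path##*/}   # remove the longest prefix matching "*/": base name

# The updated probe, as the shell sees it after make expands "$$" to "$".
if sh -c 'set x/y && d=${1##*/} && test "$d" = y' 2>/dev/null; then
  probe=supported
else
  probe=unsupported
fi

printf '%s %s %s\n' "$dir" "$base" "$probe"
```

The old probe tested for the directory part (`x`), while the new one tests for the base name (`y`).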
--- a/src/dfasearch.c
+++ b/src/dfasearch.c
@ -1,5 +1,5 @@
 /* dfasearch.c - searching subroutines using dfa and regex for grep.
-   Copyright 1992, 1998, 2000, 2007, 2009-2018 Free Software Foundation, Inc.
+   Copyright 1992, 1998, 2000, 2007, 2009-2026 Free Software Foundation, Inc.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
@ -12,21 +12,19 @@
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
-   along with this program; if not, write to the Free Software
-   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
-   02110-1301, USA.  */
+   along with this program.  If not, see <https://www.gnu.org/licenses/>.  */

 /* Written August 1992 by Mike Haertel. */

 #include <config.h>
 #include "intprops.h"
-#include "search.h"
+#include <search.h>
 #include "die.h"
 #include <error.h>

 struct dfa_comp
 {
-  /* KWset compiled pattern.  For Ecompile and Gcompile, we compile
+  /* KWset compiled pattern.  For GEAcompile, we compile
     a list of strings, at least one of which is known to occur in
     any string matching the regexp. */
  kwset_t kwset;
@ -35,14 +33,14 @@ struct dfa_comp
  struct dfa *dfa;

  /* Regex compiled regexps. */
-  struct re_pattern_buffer* patterns;
-  size_t pcount;
+  struct re_pattern_buffer *patterns;
+  idx_t pcount;
  struct re_registers regs;

  /* Number of compiled fixed strings known to exactly match the regexp.
     If kwsexec returns < kwset_exact_matches, then we don't need to
     call the regexp matcher at all. */
-  ptrdiff_t kwset_exact_matches;
+  idx_t kwset_exact_matches;

  bool begline;
 };
@ -53,14 +51,10 @@ dfaerror (char const *mesg)
  die (EXIT_TROUBLE, 0, "%s", mesg);
 }

-/* For now, the sole dfawarn-eliciting condition (use of a regexp
-   like '[:lower:]') is unequivocally an error, so treat it as such,
-   when possible.  */
 void
 dfawarn (char const *mesg)
 {
-  if (!getenv ("POSIXLY_CORRECT"))
-    dfaerror (mesg);
+  error (0, 0, _("warning: %s"), mesg);
 }

 /* If the DFA turns out to have some set of fixed strings one of
@ -80,9 +74,9 @@ kwsmusts (struct dfa_comp *dc)
         The kwset matcher will return the index of the matching
         string that it chooses. */
      ++dc->kwset_exact_matches;
-      ptrdiff_t old_len = strlen (dm->must);
-      ptrdiff_t new_len = old_len + dm->begline + dm->endline;
-      char *must = xmalloc (new_len);
+      idx_t old_len = strlen (dm->must);
+      idx_t new_len = old_len + dm->begline + dm->endline;
+      char *must = ximalloc (new_len);
      char *mp = must;
      *mp = eolbyte;
      mp += dm->begline;
@ -103,8 +97,103 @@ kwsmusts (struct dfa_comp *dc)
  dfamustfree (dm);
 }

+/* Return true if KEYS, of length LEN, might contain a back-reference.
+   Return false if KEYS cannot contain a back-reference.
+   BS_SAFE is true of encodings where a backslash cannot appear as the
+   last byte of a multibyte character.  */
+static bool _GL_ATTRIBUTE_PURE
+possible_backrefs_in_pattern (char const *keys, idx_t len, bool bs_safe)
+{
+  /* Normally a backslash, but in an unsafe encoding this is a non-char
+     value so that the comparison below always fails, because if there
+     are two adjacent '\' bytes, the first might be the last byte of a
+     multibyte character.  */
+  int second_backslash = bs_safe ? '\\' : CHAR_MAX + 1;
+
+  /* This code can return true even if KEYS lacks a back-reference, for
+     patterns like [\2], or for encodings where '\' appears as the last
+     byte of a multibyte character.  However, false alarms should be
+     rare and do not affect correctness.  */
+
+  /* Do not look for a backslash in the pattern's last byte, since it
+     can't be part of a back-reference and this streamlines the code.  */
+  len--;
+
+  if (0 <= len)
+    {
+      char const *lim = keys + len;
+      for (char const *p = keys; (p = memchr (p, '\\', lim - p)); p++)
+        {
+          if ('1' <= p[1] && p[1] <= '9')
+            return true;
+          if (p[1] == second_backslash)
+            {
+              p++;
+              if (p == lim)
+                break;
+            }
+        }
+    }
+  return false;
+}
+
+static bool
+regex_compile (struct dfa_comp *dc, char const *p, idx_t len,
+               idx_t pcount, idx_t lineno, reg_syntax_t syntax_bits,
+               bool syntax_only)
+{
+  struct re_pattern_buffer pat;
+  pat.buffer = nullptr;
+  pat.allocated = 0;
+
+  /* Do not use a fastmap with -i, to work around glibc Bug#20381.  */
+  static_assert (UCHAR_MAX < IDX_MAX);
+  idx_t uchar_max = UCHAR_MAX;
+  pat.fastmap = syntax_only | match_icase ? nullptr : ximalloc (uchar_max + 1);
+
+  pat.translate = nullptr;
+
+  if (syntax_only)
+    re_set_syntax (syntax_bits | RE_NO_SUB);
+  else
+    re_set_syntax (syntax_bits);
+
+  char const *err = re_compile_pattern (p, len, &pat);
+  if (!err)
+    {
+      if (syntax_only)
+        regfree (&pat);
+      else
+        dc->patterns[pcount] = pat;
+
+      return true;
+    }
+
+  free (pat.fastmap);
+
+  /* Emit a filename:lineno: prefix for patterns taken from files.  */
+  idx_t pat_lineno;
+  char const *pat_filename
+    = lineno < 0 ? "" : pattern_file_name (lineno, &pat_lineno);
+
+  if (*pat_filename == '\0')
+    error (0, 0, "%s", err);
+  else
+    {
+      ptrdiff_t n = pat_lineno;
+      error (0, 0, "%s:%td: %s", pat_filename, n, err);
+    }
+
+  return false;
+}
+
+/* Compile PATTERN, containing SIZE bytes that are followed by '\n'.
+   SYNTAX_BITS specifies whether PATTERN uses style -G, -E, or -A.
+   Return a description of the compiled pattern.  */
+
 void *
-GEAcompile (char *pattern, size_t size, reg_syntax_t syntax_bits)
+GEAcompile (char *pattern, idx_t size, reg_syntax_t syntax_bits,
+            bool exact)
 {
  char *motif;
  struct dfa_comp *dc = xcalloc (1, sizeof (*dc));
@ -113,9 +202,12 @@ GEAcompile (char *pattern, size_t size, reg_syntax_t syntax_bits)

  if (match_icase)
    syntax_bits |= RE_ICASE;
-  re_set_syntax (syntax_bits);
-  int dfaopts = eolbyte ? 0 : DFA_EOL_NUL;
+  int dfaopts = (DFA_CONFUSING_BRACKETS_ERROR | DFA_STRAY_BACKSLASH_WARN
+                 | DFA_PLUS_WARN
+                 | (syntax_bits & RE_CONTEXT_INDEP_OPS ? DFA_STAR_WARN : 0)
+                 | (eolbyte ? 0 : DFA_EOL_NUL));
  dfasyntax (dc->dfa, &localeinfo, syntax_bits, dfaopts);
+  bool bs_safe = !localeinfo.multibyte | localeinfo.using_utf8;

  /* For GNU regex, pass the patterns separately to detect errors like
     "[\nallo\n]\n", where the patterns are "[", "allo" and "]", and
@ -124,53 +216,82 @@ GEAcompile (char *pattern, size_t size, reg_syntax_t syntax_bits)
  char const *p = pattern;
  char const *patlim = pattern + size;
  bool compilation_failed = false;
-  size_t palloc = 0;
+
+  dc->patterns = xmalloc (sizeof *dc->patterns);
+  dc->patterns++;
+  dc->pcount = 0;
+  idx_t palloc = 1;
+
+  char const *prev = pattern;
+
+  /* Buffer containing back-reference-free patterns.  */
+  char *buf = nullptr;
+  idx_t buflen = 0;
+  idx_t bufalloc = 0;
+
+  idx_t lineno = 0;

  do
    {
-      size_t len;
-      char const *sep = memchr (p, '\n', patlim - p);
-      if (sep)
+      char const *sep = rawmemchr (p, '\n');
+      idx_t len = sep - p;
+
+      bool backref = possible_backrefs_in_pattern (p, len, bs_safe);
+
+      if (backref && prev < p)
        {
-          len = sep - p;
-          sep++;
+          idx_t prevlen = p - prev;
+          ptrdiff_t bufshortage = buflen - bufalloc + prevlen;
+          if (0 < bufshortage)
+            buf = xpalloc (buf, &bufalloc, bufshortage, -1, 1);
+          memcpy (buf + buflen, prev, prevlen);
+          buflen += prevlen;
        }
-      else
-        len = patlim - p;

-      if (palloc <= dc->pcount)
-        dc->patterns = x2nrealloc (dc->patterns, &palloc, sizeof *dc->patterns);
-      struct re_pattern_buffer *pat = &dc->patterns[dc->pcount];
-      pat->buffer = NULL;
-      pat->allocated = 0;
-
-      /* Do not use a fastmap with -i, to work around glibc Bug#20381.  */
-      pat->fastmap = match_icase ? NULL : xmalloc (UCHAR_MAX + 1);
-
-      pat->translate = NULL;
-
-      char const *err = re_compile_pattern (p, len, pat);
-      if (err)
+      /* Ensure room for at least two more patterns.  The extra one is
+         for the regex_compile that may be executed after this loop
+         exits, and its (unused) slot is patterns[-1] until then.  */
+      ptrdiff_t shortage = dc->pcount - palloc + 2;
+      if (0 < shortage)
        {
-          /* With patterns specified only on the command line, emit the bare
-             diagnostic.  Otherwise, include a filename:lineno: prefix.  */
-          size_t lineno;
-          char const *pat_filename = pattern_file_name (dc->pcount + 1,
-                                                        &lineno);
-          if (*pat_filename == '\0')
-            error (0, 0, "%s", err);
-          else
-            error (0, 0, "%s:%zu: %s", pat_filename, lineno, err);
-          compilation_failed = true;
+          dc->patterns = xpalloc (dc->patterns - 1, &palloc, shortage, -1,
+                                  sizeof *dc->patterns);
+          dc->patterns++;
+        }
+
+      if (!regex_compile (dc, p, len, dc->pcount, lineno, syntax_bits,
+                          !backref))
+        compilation_failed = true;
+
+      p = sep + 1;
+      lineno++;
+
+      if (backref)
+        {
+          dc->pcount++;
+          prev = p;
        }
-      dc->pcount++;
-      p = sep;
    }
-  while (p);
+  while (p <= patlim);

  if (compilation_failed)
    exit (EXIT_TROUBLE);

+  if (patlim < prev)
+    buflen--;
+  else if (pattern < prev)
+    {
+      idx_t prevlen = patlim - prev;
+      buf = xirealloc (buf, buflen + prevlen);
+      memcpy (buf + buflen, prev, prevlen);
+      buflen += prevlen;
+    }
+  else
+    {
+      buf = pattern;
+      buflen = size;
+    }
+
  /* In the match_words and match_lines cases, we use a different pattern
     for the DFA matcher that will quickly throw out cases that won't work.
     Then if DFA succeeds we do some hairy stuff using the regex matcher
@ -186,11 +307,12 @@ GEAcompile (char *pattern, size_t size, reg_syntax_t syntax_bits)
      static char const word_beg_bk[] = "\\(^\\|[^[:alnum:]_]\\)\\(";
      static char const word_end_bk[] = "\\)\\([^[:alnum:]_]\\|$\\)";
      int bk = !(syntax_bits & RE_NO_BK_PARENS);
-      char *n = xmalloc (sizeof word_beg_bk - 1 + size + sizeof word_end_bk);
+      idx_t bracket_bytes = sizeof word_beg_bk - 1 + sizeof word_end_bk;
+      char *n = ximalloc (size + bracket_bytes);

      strcpy (n, match_lines ? (bk ? line_beg_bk : line_beg_no_bk)
                             : (bk ? word_beg_bk : word_beg_no_bk));
-      size_t total = strlen (n);
+      idx_t total = strlen (n);
      memcpy (n + total, pattern, size);
      total += size;
      strcpy (n + total, match_lines ? (bk ? line_end_bk : line_end_no_bk)
@ -200,26 +322,42 @@ GEAcompile (char *pattern, size_t size, reg_syntax_t syntax_bits)
      size = total;
    }
  else
-    motif = NULL;
+    motif = nullptr;

-  dfacomp (pattern, size, dc->dfa, 1);
+  dfaparse (pattern, size, dc->dfa);
  kwsmusts (dc);
+  dfacomp (nullptr, 0, dc->dfa, 1);
+
+  if (buf)
+    {
+      if (exact || !dfasupported (dc->dfa))
+        {
+          dc->patterns--;
+          dc->pcount++;
+
+          if (!regex_compile (dc, buf, buflen, 0, -1, syntax_bits, false))
+            abort ();
+        }
+
+      if (buf != pattern)
+        free (buf);
+    }

  free (motif);

  return dc;
 }

-size_t
-EGexecute (void *vdc, char const *buf, size_t size, size_t *match_size,
+ptrdiff_t
+EGexecute (void *vdc, char const *buf, idx_t size, idx_t *match_size,
           char const *start_ptr)
 {
  char const *buflim, *beg, *end, *ptr, *match, *best_match, *mb_start;
  char eol = eolbyte;
  regoff_t start;
-  size_t len, best_len;
+  idx_t len, best_len;
  struct kwsmatch kwsm;
-  size_t i;
+  idx_t i;
  struct dfa_comp *dc = vdc;
  struct dfa *superset = dfasuperset (dc->dfa);
  bool dfafast = dfaisfast (dc->dfa);
@ -234,7 +372,7 @@ EGexecute (void *vdc, char const *buf, size_t size, size_t *match_size,
      if (!start_ptr)
        {
          char const *next_beg, *dfa_beg = beg;
-          size_t count = 0;
+          idx_t count = 0;
          bool exact_kwset_match = false;
          bool backref = false;

@ -248,7 +386,7 @@ EGexecute (void *vdc, char const *buf, size_t size, size_t *match_size,
                                          buflim - beg + dc->begline,
                                          &kwsm, true);
              if (offset < 0)
-                goto failure;
+                return offset;
              match = beg + offset;
              prev_beg = beg;

@ -264,14 +402,19 @@ EGexecute (void *vdc, char const *buf, size_t size, size_t *match_size,
                 greater of the latter two values; this temporarily prefers
                 the DFA to KWset.  */
              exact_kwset_match = kwsm.index < dc->kwset_exact_matches;
-              end = ((exact_kwset_match || !dfafast
-                      || MAX (16, match - beg) < (match - prev_beg) >> 2)
-                     ? match
-                     : MAX (16, match - beg) < (buflim - prev_beg) >> 2
-                     ? prev_beg + 4 * MAX (16, match - beg)
-                     : buflim);
-              end = memchr (end, eol, buflim - end);
-              end = end ? end + 1 : buflim;
+              if (exact_kwset_match || !dfafast
+                  || MAX (16, match - beg) < (match - prev_beg) >> 2)
+                {
+                  end = rawmemchr (match, eol);
+                  end++;
+                }
+              else if (MAX (16, match - beg) < (buflim - prev_beg) >> 2)
+                {
+                  end = rawmemchr (prev_beg + 4 * MAX (16, match - beg), eol);
+                  end++;
+                }
+              else
+                end = buflim;

              if (exact_kwset_match)
                {
@ -279,7 +422,7 @@ EGexecute (void *vdc, char const *buf, size_t size, size_t *match_size,
                    goto success;
                  if (mb_start < beg)
                    mb_start = beg;
-                  if (mb_goback (&mb_start, match, buflim) == 0)
+                  if (mb_goback (&mb_start, nullptr, match, buflim) == 0)
                    goto success;
                  /* The matched line starts in the middle of a multibyte
                     character.  Perform the DFA search starting from the
@ -295,8 +438,8 @@ EGexecute (void *vdc, char const *buf, size_t size, size_t *match_size,
                 potential matches; this is more likely to be fast
                 than falling back to KWset would be.  */
              next_beg = dfaexec (superset, dfa_beg, (char *) end, 0,
-                                  &count, NULL);
-              if (next_beg == NULL || next_beg == end)
+                                  &count, nullptr);
+              if (!next_beg || next_beg == end)
                continue;

              /* Narrow down to the line we've found.  */
@ -306,8 +449,8 @@ EGexecute (void *vdc, char const *buf, size_t size, size_t *match_size,
                  beg++;
                  dfa_beg = beg;
                }
-              end = memchr (next_beg, eol, buflim - next_beg);
-              end = end ? end + 1 : buflim;
+              end = rawmemchr (next_beg, eol);
+              end++;

              count = 0;
            }
@ -318,7 +461,7 @@ EGexecute (void *vdc, char const *buf, size_t size, size_t *match_size,

          /* If there's no match, or if we've matched the sentinel,
             we're done.  */
-          if (next_beg == NULL || next_beg == end)
+          if (!next_beg || next_beg == end)
            continue;

          /* Narrow down to the line we've found.  */
@ -327,10 +470,10 @@ EGexecute (void *vdc, char const *buf, size_t size, size_t *match_size,
              beg = memrchr (buf, eol, next_beg - buf);
              beg++;
            }
-          end = memchr (next_beg, eol, buflim - next_beg);
-          end = end ? end + 1 : buflim;
+          end = rawmemchr (next_beg, eol);
+          end++;

-          /* Successful, no backreferences encountered! */
+          /* Successful, no back-references encountered! */
          if (!backref)
            goto success;
          ptr = beg;
@ -446,13 +589,11 @@ EGexecute (void *vdc, char const *buf, size_t size, size_t *match_size,
          }
    } /* for (beg = end ..) */

- failure:
  return -1;

 success:
  len = end - beg;
 success_in_len:;
-  size_t off = beg - buf;
  *match_size = len;
-  return off;
+  return beg - buf;
 }
--- a/src/die.h
+++ b/src/die.h
@ -1,5 +1,5 @@
 /* Report an error and exit.
-   Copyright 2016-2018 Free Software Foundation, Inc.
+   Copyright 2016-2026 Free Software Foundation, Inc.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
@ -12,15 +12,12 @@
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
-   along with this program; if not, write to the Free Software
-   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
-   02110-1301, USA.  */
+   along with this program.  If not, see <https://www.gnu.org/licenses/>.  */

 #ifndef DIE_H
 #define DIE_H

 #include <error.h>
-#include <stdbool.h>
 #include <verify.h>

 /* Like 'error (STATUS, ...)', except STATUS must be a nonzero constant.
--- a/src/egrep.sh
+++ b/src/egrep.sh
@ -1,2 +1,4 @@
 #!@SHELL@
+cmd=${0##*/}
+echo "$cmd: warning: $cmd is obsolescent; using @grep@ @option@" >&2
 exec @grep@ @option@ "$@"
--- a/src/grep.c
+++ b/src/grep.c
--- a/src/grep.h
+++ b/src/grep.h
@ -1,5 +1,5 @@
 /* grep.h - interface to grep driver for searching subroutines.
-   Copyright (C) 1992, 1998, 2001, 2007, 2009-2018 Free Software Foundation,
+   Copyright (C) 1992, 1998, 2001, 2007, 2009-2026 Free Software Foundation,
   Inc.

   This program is free software; you can redistribute it and/or modify
@ -13,14 +13,12 @@
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
-   along with this program; if not, write to the Free Software
-   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
-   02110-1301, USA.  */
+   along with this program.  If not, see <https://www.gnu.org/licenses/>.  */

 #ifndef GREP_GREP_H
 #define GREP_GREP_H 1

-#include <stdbool.h>
+#include <idx.h>

 /* The following flags are exported from grep for the matchers
   to look at. */
@ -29,6 +27,6 @@ extern bool match_words;	/* -w */
 extern bool match_lines;	/* -x */
 extern char eolbyte;		/* -z */

-extern char const *pattern_file_name (size_t, size_t *);
+extern char const *pattern_file_name (idx_t, idx_t *);

 #endif
--- a/src/kwsearch.c
+++ b/src/kwsearch.c
@ -1,5 +1,5 @@
 /* kwsearch.c - searching subroutines using kwset for grep.
-   Copyright 1992, 1998, 2000, 2007, 2009-2018 Free Software Foundation, Inc.
+   Copyright 1992, 1998, 2000, 2007, 2009-2026 Free Software Foundation, Inc.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
@ -12,14 +12,12 @@
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
-   along with this program; if not, write to the Free Software
-   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
-   02110-1301, USA.  */
+   along with this program.  If not, see <https://www.gnu.org/licenses/>.  */

 /* Written August 1992 by Mike Haertel. */

 #include <config.h>
-#include "search.h"
+#include <search.h>

 /* A compiled -F pattern list.  */

@ -32,58 +30,46 @@ struct kwsearch
     'kwswords (kwset)' when some extra one-character words have been
     appended, one for each troublesome character that will require a
     DFA search.  */
-  ptrdiff_t words;
+  idx_t words;

  /* The user's pattern and its size in bytes.  */
  char *pattern;
-  size_t size;
+  idx_t size;

  /* The user's pattern compiled as a regular expression,
     or null if it has not been compiled.  */
  void *re;
 };

-/* Compile the -F style PATTERN, containing SIZE bytes.  Return a
-   description of the compiled pattern.  */
+/* Compile the -F style PATTERN, containing SIZE bytes that are
+   followed by '\n'.  Return a description of the compiled pattern.  */

 void *
-Fcompile (char *pattern, size_t size, reg_syntax_t ignored)
+Fcompile (char *pattern, idx_t size, reg_syntax_t ignored, bool exact)
 {
  kwset_t kwset;
-  ptrdiff_t total = size;
-  char *buf = NULL;
-  size_t bufalloc = 0;
+  char *buf = nullptr;
+  idx_t bufalloc = 0;

  kwset = kwsinit (true);

  char const *p = pattern;
  do
    {
-      ptrdiff_t len;
-      char const *sep = memchr (p, '\n', total);
-      if (sep)
-        {
-          len = sep - p;
-          sep++;
-          total -= (len + 1);
-        }
-      else
-        {
-          len = total;
-          total = 0;
-        }
+      char const *sep = rawmemchr (p, '\n');
+      idx_t len = sep - p;

      if (match_lines)
        {
-          if (eolbyte == '\n' && pattern < p && sep)
+          if (eolbyte == '\n' && pattern < p)
            p--;
          else
            {
              if (bufalloc < len + 2)
                {
                  free (buf);
-                  bufalloc = len + 2;
-                  buf = x2realloc (NULL, &bufalloc);
+                  bufalloc = len;
+                  buf = xpalloc (nullptr, &bufalloc, 2, -1, 1);
                  buf[0] = eolbyte;
                }
              memcpy (buf + 1, p, len);
@ -94,45 +80,13 @@ Fcompile (char *pattern, size_t size, reg_syntax_t ignored)
        }
      kwsincr (kwset, p, len);

-      p = sep;
+      p = sep + 1;
    }
-  while (p);
+  while (p <= pattern + size);

  free (buf);
-  ptrdiff_t words = kwswords (kwset);
-
-  if (match_icase)
-    {
-      /* For each pattern character C that has a case folded
-         counterpart F that is multibyte and so cannot easily be
-         implemented via translating a single byte, append a pattern
-         containing just F.  That way, if the data contains F, the
-         matcher can fall back on DFA.  For example, if C is 'i' and
-         the locale is en_US.utf8, append a pattern containing just
-         the character U+0131 (LATIN SMALL LETTER DOTLESS I), so that
-         Fexecute will use a DFA if the data contain U+0131.  */
-      mbstate_t mbs = { 0 };
-      char checked[NCHAR] = {0,};
-      for (p = pattern; p < pattern + size; p++)
-        {
-          unsigned char c = *p;
-          if (checked[c])
-            continue;
-          checked[c] = true;
-
-          wint_t wc = localeinfo.sbctowc[c];
-          wchar_t folded[CASE_FOLDED_BUFSIZE];
-
-          for (int i = case_folded_counterparts (wc, folded); 0 <= --i; )
-            {
-              char s[MB_LEN_MAX];
-              int nbytes = wcrtomb (s, folded[i], &mbs);
-              if (1 < nbytes)
-                kwsincr (kwset, s, nbytes);
-            }
-        }
-    }

+  idx_t words = kwswords (kwset);
  kwsprep (kwset);

  struct kwsearch *kwsearch = xmalloc (sizeof *kwsearch);
@ -140,61 +94,39 @@ Fcompile (char *pattern, size_t size, reg_syntax_t ignored)
  kwsearch->words = words;
  kwsearch->pattern = pattern;
  kwsearch->size = size;
-  kwsearch->re = NULL;
+  kwsearch->re = nullptr;
  return kwsearch;
 }

 /* Use the compiled pattern VCP to search the buffer BUF of size SIZE.
   If found, return the offset of the first match and store its
-   size into *MATCH_SIZE.  If not found, return SIZE_MAX.
+   size into *MATCH_SIZE.  If not found, return -1.
   If START_PTR is nonnull, start searching there.  */
-size_t
-Fexecute (void *vcp, char const *buf, size_t size, size_t *match_size,
+ptrdiff_t
+Fexecute (void *vcp, char const *buf, idx_t size, idx_t *match_size,
          char const *start_ptr)
 {
  char const *beg, *end, *mb_start;
-  ptrdiff_t len;
+  idx_t len;
  char eol = eolbyte;
-  struct kwsmatch kwsmatch;
-  size_t ret_val;
-  bool mb_check;
-  bool longest;
  struct kwsearch *kwsearch = vcp;
  kwset_t kwset = kwsearch->kwset;
-
-  if (match_lines)
-    mb_check = longest = false;
-  else
-    {
-      mb_check = localeinfo.multibyte & !localeinfo.using_utf8;
-      longest = mb_check | !!start_ptr | match_words;
-    }
+  bool mb_check = localeinfo.multibyte & !localeinfo.using_utf8 & !match_lines;
+  bool longest = (mb_check | !!start_ptr | match_words) & !match_lines;

  for (mb_start = beg = start_ptr ? start_ptr : buf; beg <= buf + size; beg++)
    {
+      struct kwsmatch kwsmatch;
      ptrdiff_t offset = kwsexec (kwset, beg - match_lines,
                                  buf + size - beg + match_lines, &kwsmatch,
                                  longest);
      if (offset < 0)
        break;
-      len = kwsmatch.size[0] - 2 * match_lines;
+      len = kwsmatch.size - 2 * match_lines;

-      if (kwsearch->words <= kwsmatch.index)
-        {
-          /* The data contain a multibyte character that matches
-             some pattern character that is a case folded counterpart.
-             Since the kwset code cannot handle this case, fall back
-             on the DFA code, which can.  */
-          if (! kwsearch->re)
-            {
-              fgrep_to_grep_pattern (&kwsearch->pattern, &kwsearch->size);
-              kwsearch->re = GEAcompile (kwsearch->pattern, kwsearch->size,
-                                         RE_SYNTAX_GREP);
-            }
-          return EGexecute (kwsearch->re, buf, size, match_size, start_ptr);
-        }
-
-      if (mb_check && mb_goback (&mb_start, beg + offset, buf + size) != 0)
+      idx_t mbclen = 0;
+      if (mb_check
+          && mb_goback (&mb_start, &mbclen, beg + offset, buf + size) != 0)
        {
          /* We have matched a single byte that is not at the beginning of a
             multibyte character.  mb_goback has advanced MB_START past that
@ -217,19 +149,27 @@ Fexecute (void *vcp, char const *buf, size_t size, size_t *match_size,
        goto success_in_beg_and_len;
      if (match_lines)
        {
-          len += start_ptr == NULL;
+          len += !start_ptr;
          goto success_in_beg_and_len;
        }
      if (! match_words)
        goto success;

-      /* Succeed if the preceding and following characters are word
-         constituents.  If the following character is not a word
-         constituent, keep trying with shorter matches.  */
-      char const *bol = memrchr (mb_start, eol, beg - mb_start);
-      if (bol)
-        mb_start = bol + 1;
-      if (! wordchar_prev (mb_start, beg, buf + size))
+      /* We need a preceding mb_start pointer.  Use the beginning of line
+         if there is a preceding newline.  */
+      if (mbclen == 0)
+        {
+          char const *nl = memrchr (mb_start, eol, beg - mb_start);
+          if (nl)
+            mb_start = nl + 1;
+        }
+
+      /* Succeed if neither the preceding nor the following character is a
+         word constituent.  If the preceding is not, yet the following
+         character IS a word constituent, keep trying with shorter matches.  */
+      if (mbclen > 0
+          ? ! wordchar_next (beg - mbclen, buf + size)
+          : ! wordchar_prev (mb_start, beg, buf + size))
        for (;;)
          {
            if (! wordchar_next (beg + len, buf + size))
@ -239,12 +179,36 @@ Fexecute (void *vcp, char const *buf, size_t size, size_t *match_size,
                else
                  goto success;
              }
+            if (!start_ptr && !localeinfo.multibyte)
+              {
+                if (! kwsearch->re)
+                  {
+                    fgrep_to_grep_pattern (&kwsearch->pattern, &kwsearch->size);
+                    kwsearch->re = GEAcompile (kwsearch->pattern,
+                                               kwsearch->size,
+                                               RE_SYNTAX_GREP, !!start_ptr);
+                  }
+                if (beg + len < buf + size)
+                  {
+                    end = rawmemchr (beg + len, eol);
+                    end++;
+                  }
+                else
+                  end = buf + size;
+
+                if (0 <= EGexecute (kwsearch->re, beg, end - beg,
+                                    match_size, nullptr))
+                  goto success_match_words;
+                beg = end - 1;
+                break;
+              }
            if (!len)
              break;
-            offset = kwsexec (kwset, beg, --len, &kwsmatch, true);
-            if (offset != 0)
+
+            struct kwsmatch shorter_match;
+            if (kwsexec (kwset, beg, --len, &shorter_match, true) != 0)
              break;
-            len = kwsmatch.size[0];
+            len = shorter_match.size;
          }

      /* No word match was found at BEG.  Skip past word constituents,
@ -252,20 +216,23 @@ Fexecute (void *vcp, char const *buf, size_t size, size_t *match_size,
         them could make things much slower.  */
      beg += wordchars_size (beg, buf + size);
      mb_start = beg;
-    } /* for (beg in buf) */
+    }

  return -1;

 success:
-  end = memchr (beg + len, eol, (buf + size) - (beg + len));
-  end = end ? end + 1 : buf + size;
+  if (beg + len < buf + size)
+    {
+      end = rawmemchr (beg + len, eol);
+      end++;
+    }
+  else
+    end = buf + size;
+ success_match_words:
  beg = memrchr (buf, eol, beg - buf);
  beg = beg ? beg + 1 : buf;
  len = end - beg;
 success_in_beg_and_len:;
-  size_t off = beg - buf;
-
  *match_size = len;
-  ret_val = off;
-  return ret_val;
+  return beg - buf;
 }
--- a/src/kwset.c
+++ b/src/kwset.c
@ -1,933 +0,0 @@
-/* kwset.c - search for any of a set of keywords.
-   Copyright (C) 1989, 1998, 2000, 2005, 2007, 2009-2018 Free Software
-   Foundation, Inc.
-
-   This program is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 3, or (at your option)
-   any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program; if not, write to the Free Software
-   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
-   02110-1301, USA.  */
-
-/* Written August 1989 by Mike Haertel.  */
-
-/* For the Aho-Corasick algorithm, see:
-   Aho AV, Corasick MJ. Efficient string matching: an aid to
-   bibliographic search. CACM 18, 6 (1975), 333-40
-   <https://dx.doi.org/10.1145/360825.360855>, which describes the
-   failure function used below.
-
-   For the Boyer-Moore algorithm, see: Boyer RS, Moore JS.
-   A fast string searching algorithm. CACM 20, 10 (1977), 762-72
-   <https://dx.doi.org/10.1145/359842.359859>.
-
-   For a survey of more-recent string matching algorithms that might
-   help improve performance, see: Faro S, Lecroq T. The exact online
-   string matching problem: a review of the most recent results.
-   ACM Computing Surveys 45, 2 (2013), 13
-   <https://dx.doi.org/10.1145/2431211.2431212>.  */
-
-#include <config.h>
-
-#include "kwset.h"
-
-#include <stdint.h>
-#include <sys/types.h>
-#include "system.h"
-#include "intprops.h"
-#include "memchr2.h"
-#include "obstack.h"
-#include "xalloc.h"
-#include "verify.h"
-
-#define obstack_chunk_alloc xmalloc
-#define obstack_chunk_free free
-
-static unsigned char
-U (char ch)
-{
-  return to_uchar (ch);
-}
-
-/* Balanced tree of edges and labels leaving a given trie node.  */
-struct tree
-{
-  struct tree *llink;		/* Left link; MUST be first field.  */
-  struct tree *rlink;		/* Right link (to larger labels).  */
-  struct trie *trie;		/* Trie node pointed to by this edge.  */
-  unsigned char label;		/* Label on this edge.  */
-  char balance;			/* Difference in depths of subtrees.  */
-};
-
-/* Node of a trie representing a set of keywords.  */
-struct trie
-{
-  /* If an accepting node, this is either 2*W + 1 where W is the word
-     index, or is SIZE_MAX if Aho-Corasick is in use and FAIL
-     specifies where to look for more info.  If not an accepting node,
-     this is zero.  */
-  size_t accepting;
-
-  struct tree *links;		/* Tree of edges leaving this node.  */
-  struct trie *parent;		/* Parent of this node.  */
-  struct trie *next;		/* List of all trie nodes in level order.  */
-  struct trie *fail;		/* Aho-Corasick failure function.  */
-  ptrdiff_t depth;		/* Depth of this node from the root.  */
-  ptrdiff_t shift;		/* Shift function for search failures.  */
-  ptrdiff_t maxshift;		/* Max shift of self and descendants.  */
-};
-
-/* Structure returned opaquely to the caller, containing everything.  */
-struct kwset
-{
-  struct obstack obstack;	/* Obstack for node allocation.  */
-  ptrdiff_t words;		/* Number of words in the trie.  */
-  struct trie *trie;		/* The trie itself.  */
-  ptrdiff_t mind;		/* Minimum depth of an accepting node.  */
-  ptrdiff_t maxd;		/* Maximum depth of any node.  */
-  unsigned char delta[NCHAR];	/* Delta table for rapid search.  */
-  struct trie *next[NCHAR];	/* Table of children of the root.  */
-  char *target;			/* Target string if there's only one.  */
-  ptrdiff_t *shift;		/* Used in Boyer-Moore search for one
-                                   string.  */
-  char const *trans;		/* Character translation table.  */
-
-  /* This helps to match a terminal byte, which is the first byte
-     for Aho-Corasick, and the last byte for Boyer-More.  If all the
-     patterns have the same terminal byte (after translation via TRANS
-     if TRANS is nonnull), then this is that byte as an unsigned char.
-     Otherwise this is -1 if there is disagreement among the strings
-     about terminal bytes, and -2 if there are no terminal bytes and
-     no disagreement because all the patterns are empty.  */
-  int gc1;
-
-  /* This helps to match a terminal byte.  If 0 <= GC1HELP, B is
-     terminal when B == GC1 || B == GC1HELP (note that GC1 == GCHELP
-     is common here).  This is typically faster than evaluating
-     to_uchar (TRANS[B]) == GC1.  */
-  int gc1help;
-
-  /* If the string has two or more bytes, this is the penultimate byte,
-     after translation via TRANS if TRANS is nonnull.  This variable
-     is used only by Boyer-Moore.  */
-  char gc2;
-
-  /* kwsexec implementation.  */
-  ptrdiff_t (*kwsexec) (kwset_t, char const *, ptrdiff_t,
-                        struct kwsmatch *, bool);
-};
-
-/* Use TRANS to transliterate C.  A null TRANS does no transliteration.  */
-static inline char
-tr (char const *trans, char c)
-{
-  return trans ? trans[U(c)] : c;
-}
-
-static ptrdiff_t acexec (kwset_t, char const *, ptrdiff_t,
-                         struct kwsmatch *, bool);
-static ptrdiff_t bmexec (kwset_t, char const *, ptrdiff_t,
-                         struct kwsmatch *, bool);
-
-/* Return a newly allocated keyword set.  A nonnull TRANS specifies a
-   table of character translations to be applied to all pattern and
-   search text.  */
-kwset_t
-kwsalloc (char const *trans)
-{
-  struct kwset *kwset = xmalloc (sizeof *kwset);
-
-  obstack_init (&kwset->obstack);
-  kwset->words = 0;
-  kwset->trie = obstack_alloc (&kwset->obstack, sizeof *kwset->trie);
-  kwset->trie->accepting = 0;
-  kwset->trie->links = NULL;
-  kwset->trie->parent = NULL;
-  kwset->trie->next = NULL;
-  kwset->trie->fail = NULL;
-  kwset->trie->depth = 0;
-  kwset->trie->shift = 0;
-  kwset->mind = PTRDIFF_MAX;
-  kwset->maxd = -1;
-  kwset->target = NULL;
-  kwset->trans = trans;
-  kwset->kwsexec = acexec;
-
-  return kwset;
-}
-
-/* This upper bound is valid for CHAR_BIT >= 4 and
-   exact for CHAR_BIT in { 4..11, 13, 15, 17, 19 }.  */
-enum { DEPTH_SIZE = CHAR_BIT + CHAR_BIT / 2 };
-
-/* Add the given string to the contents of the keyword set.  */
-void
-kwsincr (kwset_t kwset, char const *text, ptrdiff_t len)
-{
-  assume (0 <= len);
-  struct trie *trie = kwset->trie;
-  char const *trans = kwset->trans;
-  bool reverse = kwset->kwsexec == bmexec;
-
-  if (reverse)
-    text += len;
-
-  /* Descend the trie (built of keywords) character-by-character,
-     installing new nodes when necessary.  */
-  while (len--)
-    {
-      unsigned char uc = reverse ? *--text : *text++;
-      unsigned char label = trans ? trans[uc] : uc;
-
-      /* Descend the tree of outgoing links for this trie node,
-         looking for the current character and keeping track
-         of the path followed.  */
-      struct tree *cur = trie->links;
-      struct tree *links[DEPTH_SIZE];
-      enum { L, R } dirs[DEPTH_SIZE];
-      links[0] = (struct tree *) &trie->links;
-      dirs[0] = L;
-      ptrdiff_t depth = 1;
-
-      while (cur && label != cur->label)
-        {
-          links[depth] = cur;
-          if (label < cur->label)
-            dirs[depth++] = L, cur = cur->llink;
-          else
-            dirs[depth++] = R, cur = cur->rlink;
-        }
-
-      /* The current character doesn't have an outgoing link at
-         this trie node, so build a new trie node and install
-         a link in the current trie node's tree.  */
-      if (!cur)
-        {
-          cur = obstack_alloc (&kwset->obstack, sizeof *cur);
-          cur->llink = NULL;
-          cur->rlink = NULL;
-          cur->trie = obstack_alloc (&kwset->obstack, sizeof *cur->trie);
-          cur->trie->accepting = 0;
-          cur->trie->links = NULL;
-          cur->trie->parent = trie;
-          cur->trie->next = NULL;
-          cur->trie->fail = NULL;
-          cur->trie->depth = trie->depth + 1;
-          cur->trie->shift = 0;
-          cur->label = label;
-          cur->balance = 0;
-
-          /* Install the new tree node in its parent.  */
-          if (dirs[--depth] == L)
-            links[depth]->llink = cur;
-          else
-            links[depth]->rlink = cur;
-
-          /* Back up the tree fixing the balance flags.  */
-          while (depth && !links[depth]->balance)
-            {
-              if (dirs[depth] == L)
-                --links[depth]->balance;
-              else
-                ++links[depth]->balance;
-              --depth;
-            }
-
-          /* Rebalance the tree by pointer rotations if necessary.  */
-          if (depth && ((dirs[depth] == L && --links[depth]->balance)
-                        || (dirs[depth] == R && ++links[depth]->balance)))
-            {
-              struct tree *t, *r, *l, *rl, *lr;
-
-              switch (links[depth]->balance)
-                {
-                case (char) -2:
-                  switch (dirs[depth + 1])
-                    {
-                    case L:
-                      r = links[depth], t = r->llink, rl = t->rlink;
-                      t->rlink = r, r->llink = rl;
-                      t->balance = r->balance = 0;
-                      break;
-                    case R:
-                      r = links[depth], l = r->llink, t = l->rlink;
-                      rl = t->rlink, lr = t->llink;
-                      t->llink = l, l->rlink = lr, t->rlink = r, r->llink = rl;
-                      l->balance = t->balance != 1 ? 0 : -1;
-                      r->balance = t->balance != (char) -1 ? 0 : 1;
-                      t->balance = 0;
-                      break;
-                    default:
-                      abort ();
-                    }
-                  break;
-                case 2:
-                  switch (dirs[depth + 1])
-                    {
-                    case R:
-                      l = links[depth], t = l->rlink, lr = t->llink;
-                      t->llink = l, l->rlink = lr;
-                      t->balance = l->balance = 0;
-                      break;
-                    case L:
-                      l = links[depth], r = l->rlink, t = r->llink;
-                      lr = t->llink, rl = t->rlink;
-                      t->llink = l, l->rlink = lr, t->rlink = r, r->llink = rl;
-                      l->balance = t->balance != 1 ? 0 : -1;
-                      r->balance = t->balance != (char) -1 ? 0 : 1;
-                      t->balance = 0;
-                      break;
-                    default:
-                      abort ();
-                    }
-                  break;
-                default:
-                  abort ();
-                }
-
-              if (dirs[depth - 1] == L)
-                links[depth - 1]->llink = t;
-              else
-                links[depth - 1]->rlink = t;
-            }
-        }
-
-      trie = cur->trie;
-    }
-
-  /* Mark the node finally reached as accepting, encoding the
-     index number of this word in the keyword set so far.  */
-  if (!trie->accepting)
-    {
-      size_t words = kwset->words;
-      trie->accepting = 2 * words + 1;
-    }
-  ++kwset->words;
-
-  /* Keep track of the longest and shortest string of the keyword set.  */
-  if (trie->depth < kwset->mind)
-    kwset->mind = trie->depth;
-  if (trie->depth > kwset->maxd)
-    kwset->maxd = trie->depth;
-}
-
-ptrdiff_t
-kwswords (kwset_t kwset)
-{
-  return kwset->words;
-}
-
-/* Enqueue the trie nodes referenced from the given tree in the
-   given queue.  */
-static void
-enqueue (struct tree *tree, struct trie **last)
-{
-  if (!tree)
-    return;
-  enqueue (tree->llink, last);
-  enqueue (tree->rlink, last);
-  (*last) = (*last)->next = tree->trie;
-}
-
-/* Compute the Aho-Corasick failure function for the trie nodes referenced
-   from the given tree, given the failure function for their parent as
-   well as a last resort failure node.  */
-static void
-treefails (struct tree const *tree, struct trie const *fail,
-           struct trie *recourse, bool reverse)
-{
-  struct tree *cur;
-
-  if (!tree)
-    return;
-
-  treefails (tree->llink, fail, recourse, reverse);
-  treefails (tree->rlink, fail, recourse, reverse);
-
-  /* Find, in the chain of fails going back to the root, the first
-     node that has a descendant on the current label.  */
-  while (fail)
-    {
-      cur = fail->links;
-      while (cur && tree->label != cur->label)
-        if (tree->label < cur->label)
-          cur = cur->llink;
-        else
-          cur = cur->rlink;
-      if (cur)
-        {
-          tree->trie->fail = cur->trie;
-          if (!reverse && cur->trie->accepting && !tree->trie->accepting)
-            tree->trie->accepting = SIZE_MAX;
-          return;
-        }
-      fail = fail->fail;
-    }
-
-  tree->trie->fail = recourse;
-}
-
-/* Set delta entries for the links of the given tree such that
-   the preexisting delta value is larger than the current depth.  */
-static void
-treedelta (struct tree const *tree, ptrdiff_t depth, unsigned char delta[])
-{
-  if (!tree)
-    return;
-  treedelta (tree->llink, depth, delta);
-  treedelta (tree->rlink, depth, delta);
-  if (depth < delta[tree->label])
-    delta[tree->label] = depth;
-}
-
-/* Return true if A has every label in B.  */
-static bool _GL_ATTRIBUTE_PURE
-hasevery (struct tree const *a, struct tree const *b)
-{
-  if (!b)
-    return true;
-  if (!hasevery (a, b->llink))
-    return false;
-  if (!hasevery (a, b->rlink))
-    return false;
-  while (a && b->label != a->label)
-    if (b->label < a->label)
-      a = a->llink;
-    else
-      a = a->rlink;
-  return !!a;
-}
-
-/* Compute a vector, indexed by character code, of the trie nodes
-   referenced from the given tree.  */
-static void
-treenext (struct tree const *tree, struct trie *next[])
-{
-  if (!tree)
-    return;
-  treenext (tree->llink, next);
-  treenext (tree->rlink, next);
-  next[tree->label] = tree->trie;
-}
-
-/* Prepare a built keyword set for use.  */
-void
-kwsprep (kwset_t kwset)
-{
-  char const *trans = kwset->trans;
-  ptrdiff_t i;
-  unsigned char deltabuf[NCHAR];
-  unsigned char *delta = trans ? deltabuf : kwset->delta;
-  struct trie *curr, *last;
-
-  /* Use Boyer-Moore if just one pattern, Aho-Corasick otherwise.  */
-  bool reverse = kwset->words == 1;
-
-  if (reverse)
-    {
-      kwset_t new_kwset;
-
-      /* Enqueue the immediate descendants in the level order queue.  */
-      for (curr = last = kwset->trie; curr; curr = curr->next)
-        enqueue (curr->links, &last);
-
-      /* Looking for just one string.  Extract it from the trie.  */
-      kwset->target = obstack_alloc (&kwset->obstack, kwset->mind);
-      for (i = 0, curr = kwset->trie; i < kwset->mind; ++i)
-        {
-          kwset->target[i] = curr->links->label;
-          curr = curr->next;
-        }
-
-      new_kwset = kwsalloc (kwset->trans);
-      new_kwset->kwsexec = bmexec;
-      kwsincr (new_kwset, kwset->target, kwset->mind);
-      obstack_free (&kwset->obstack, NULL);
-      *kwset = *new_kwset;
-      free (new_kwset);
-    }
-
-  /* Initial values for the delta table; will be changed later.  The
-     delta entry for a given character is the smallest depth of any
-     node at which an outgoing edge is labeled by that character.  */
-  memset (delta, MIN (kwset->mind, UCHAR_MAX), sizeof deltabuf);
-
-  /* Traverse the nodes of the trie in level order, simultaneously
-     computing the delta table, failure function, and shift function.  */
-  for (curr = last = kwset->trie; curr; curr = curr->next)
-    {
-      /* Enqueue the immediate descendants in the level order queue.  */
-      enqueue (curr->links, &last);
-
-      /* Update the delta table for the descendants of this node.  */
-      treedelta (curr->links, curr->depth, delta);
-
-      /* Compute the failure function for the descendants of this node.  */
-      treefails (curr->links, curr->fail, kwset->trie, reverse);
-
-      if (reverse)
-        {
-          curr->shift = kwset->mind;
-          curr->maxshift = kwset->mind;
-
-          /* Update the shifts at each node in the current node's chain
-             of fails back to the root.  */
-          struct trie *fail;
-          for (fail = curr->fail; fail; fail = fail->fail)
-            {
-              /* If the current node has some outgoing edge that the fail
-                 doesn't, then the shift at the fail should be no larger
-                 than the difference of their depths.  */
-              if (!hasevery (fail->links, curr->links))
-                if (curr->depth - fail->depth < fail->shift)
-                  fail->shift = curr->depth - fail->depth;
-
-              /* If the current node is accepting then the shift at the
-                 fail and its descendants should be no larger than the
-                 difference of their depths.  */
-              if (curr->accepting && fail->maxshift > curr->depth - fail->depth)
-                fail->maxshift = curr->depth - fail->depth;
-            }
-        }
-    }
-
-  if (reverse)
-    {
-      /* Traverse the trie in level order again, fixing up all nodes whose
-         shift exceeds their inherited maxshift.  */
-      for (curr = kwset->trie->next; curr; curr = curr->next)
-        {
-          if (curr->maxshift > curr->parent->maxshift)
-            curr->maxshift = curr->parent->maxshift;
-          if (curr->shift > curr->maxshift)
-            curr->shift = curr->maxshift;
-        }
-    }
-
-  /* Create a vector, indexed by character code, of the outgoing links
-     from the root node.  Accumulate GC1 and GC1HELP.  */
-  struct trie *nextbuf[NCHAR];
-  struct trie **next = trans ? nextbuf : kwset->next;
-  memset (next, 0, sizeof nextbuf);
-  treenext (kwset->trie->links, next);
-  int gc1 = -2;
-  int gc1help = -1;
-  for (i = 0; i < NCHAR; i++)
-    {
-      int ti = i;
-      if (trans)
-        {
-          ti = U(trans[i]);
-          kwset->next[i] = next[ti];
-        }
-      if (kwset->next[i])
-        {
-          if (gc1 < -1)
-            {
-              gc1 = ti;
-              gc1help = i;
-            }
-          else if (gc1 == ti)
-            gc1help = gc1help == ti ? i : -1;
-          else if (i == ti && gc1 == gc1help)
-            gc1help = i;
-          else
-            gc1 = -1;
-        }
-    }
-  kwset->gc1 = gc1;
-  kwset->gc1help = gc1help;
-
-  if (reverse)
-    {
-      /* Looking for just one string.  Extract it from the trie.  */
-      kwset->target = obstack_alloc (&kwset->obstack, kwset->mind);
-      for (i = kwset->mind - 1, curr = kwset->trie; i >= 0; --i)
-        {
-          kwset->target[i] = curr->links->label;
-          curr = curr->next;
-        }
-
-      if (kwset->mind > 1)
-        {
-          /* Looking for the delta2 shift that might be made after a
-             backwards match has failed.  Extract it from the trie.  */
-          kwset->shift
-            = obstack_alloc (&kwset->obstack,
-                             sizeof *kwset->shift * (kwset->mind - 1));
-          for (i = 0, curr = kwset->trie->next; i < kwset->mind - 1; ++i)
-            {
-              kwset->shift[i] = curr->shift;
-              curr = curr->next;
-            }
-
-          /* The penultimate byte.  */
-          kwset->gc2 = tr (trans, kwset->target[kwset->mind - 2]);
-        }
-    }
-
-  /* Fix things up for any translation table.  */
-  if (trans)
-    for (i = 0; i < NCHAR; ++i)
-      kwset->delta[i] = delta[U(trans[i])];
-}
-
-/* Delta2 portion of a Boyer-Moore search.  *TP is the string text
-   pointer; it is updated in place.  EP is the end of the string text,
-   and SP the end of the pattern.  LEN is the pattern length; it must
-   be at least 2.  TRANS, if nonnull, is the input translation table.
-   GC1 and GC2 are the last and second-from last bytes of the pattern,
-   transliterated by TRANS; the caller precomputes them for
-   efficiency.  If D1 is nonnull, it is a delta1 table for shifting *TP
-   when failing.  KWSET->shift says how much to shift.  */
-static inline bool
-bm_delta2_search (char const **tpp, char const *ep, char const *sp,
-                  ptrdiff_t len,
-                  char const *trans, char gc1, char gc2,
-                  unsigned char const *d1, kwset_t kwset)
-{
-  char const *tp = *tpp;
-  ptrdiff_t d = len, skip = 0;
-
-  while (true)
-    {
-      ptrdiff_t i = 2;
-      if (tr (trans, tp[-2]) == gc2)
-        {
-          while (++i <= d)
-            if (tr (trans, tp[-i]) != tr (trans, sp[-i]))
-              break;
-          if (i > d)
-            {
-              for (i = d + skip + 1; i <= len; ++i)
-                if (tr (trans, tp[-i]) != tr (trans, sp[-i]))
-                  break;
-              if (i > len)
-                {
-                  *tpp = tp - len;
-                  return true;
-                }
-            }
-        }
-
-      tp += d = kwset->shift[i - 2];
-      if (tp > ep)
-        break;
-      if (tr (trans, tp[-1]) != gc1)
-        {
-          if (d1)
-            tp += d1[U(tp[-1])];
-          break;
-        }
-      skip = i - 1;
-    }
-
-  *tpp = tp;
-  return false;
-}
-
-/* Return the address of the first byte in the buffer S (of size N)
-   that matches the terminal byte specified by KWSET, or NULL if there
-   is no match.  KWSET->gc1 should be nonnegative.  */
-static char const *
-memchr_kwset (char const *s, ptrdiff_t n, kwset_t kwset)
-{
-  char const *slim = s + n;
-  if (kwset->gc1help < 0)
-    {
-      for (; s < slim; s++)
-        if (kwset->next[U(*s)])
-          return s;
-    }
-  else
-    {
-      int small_heuristic = 2;
-      size_t small_bytes = small_heuristic * sizeof (unsigned long int);
-      while (s < slim)
-        {
-          if (kwset->next[U(*s)])
-            return s;
-          s++;
-          if ((uintptr_t) s % small_bytes == 0)
-            return memchr2 (s, kwset->gc1, kwset->gc1help, slim - s);
-        }
-    }
-  return NULL;
-}
-
-/* Fast Boyer-Moore search (inlinable version).  */
-static inline ptrdiff_t
-bmexec_trans (kwset_t kwset, char const *text, ptrdiff_t size)
-{
-  assume (0 <= size);
-  unsigned char const *d1;
-  char const *ep, *sp, *tp;
-  int d;
-  ptrdiff_t len = kwset->mind;
-  char const *trans = kwset->trans;
-
-  if (len == 0)
-    return 0;
-  if (len > size)
-    return -1;
-  if (len == 1)
-    {
-      tp = memchr_kwset (text, size, kwset);
-      return tp ? tp - text : -1;
-    }
-
-  d1 = kwset->delta;
-  sp = kwset->target + len;
-  tp = text + len;
-  char gc1 = kwset->gc1;
-  char gc2 = kwset->gc2;
-
-  /* Significance of 12: 1 (initial offset) + 10 (skip loop) + 1 (md2).  */
-  ptrdiff_t len12;
-  if (!INT_MULTIPLY_WRAPV (len, 12, &len12) && len12 < size)
-    /* 11 is not a bug, the initial offset happens only once.  */
-    for (ep = text + size - 11 * len; tp <= ep; )
-      {
-        char const *tp0 = tp;
-        d = d1[U(tp[-1])], tp += d;
-        d = d1[U(tp[-1])], tp += d;
-        if (d != 0)
-          {
-            d = d1[U(tp[-1])], tp += d;
-            d = d1[U(tp[-1])], tp += d;
-            d = d1[U(tp[-1])], tp += d;
-            if (d != 0)
-              {
-                d = d1[U(tp[-1])], tp += d;
-                d = d1[U(tp[-1])], tp += d;
-                d = d1[U(tp[-1])], tp += d;
-                if (d != 0)
-                  {
-                    d = d1[U(tp[-1])], tp += d;
-                    d = d1[U(tp[-1])], tp += d;
-
-                    /* As a heuristic, prefer memchr to seeking by
-                       delta1 when the latter doesn't advance much.  */
-                    int advance_heuristic = 16 * sizeof (long);
-                    if (advance_heuristic <= tp - tp0)
-                      continue;
-                    tp--;
-                    tp = memchr_kwset (tp, text + size - tp, kwset);
-                    if (! tp)
-                      return -1;
-                    tp++;
-                    if (ep <= tp)
-                      break;
-                  }
-              }
-          }
-        if (bm_delta2_search (&tp, ep, sp, len, trans, gc1, gc2, d1, kwset))
-          return tp - text;
-      }
-
-  /* Now only a few characters are left to search.  Carefully avoid
-     ever producing an out-of-bounds pointer.  */
-  ep = text + size;
-  d = d1[U(tp[-1])];
-  while (d <= ep - tp)
-    {
-      d = d1[U((tp += d)[-1])];
-      if (d != 0)
-        continue;
-      if (bm_delta2_search (&tp, ep, sp, len, trans, gc1, gc2, NULL, kwset))
-        return tp - text;
-    }
-
-  return -1;
-}
-
-/* Fast Boyer-Moore search.  */
-static ptrdiff_t
-bmexec (kwset_t kwset, char const *text, ptrdiff_t size,
-        struct kwsmatch *kwsmatch, bool longest)
-{
-  /* Help the compiler inline in two ways, depending on whether
-     kwset->trans is null.  */
-  ptrdiff_t ret = (IGNORE_DUPLICATE_BRANCH_WARNING
-                   (kwset->trans
-                    ? bmexec_trans (kwset, text, size)
-                    : bmexec_trans (kwset, text, size)));
-  if (0 <= ret)
-    {
-       kwsmatch->index = 0;
-       kwsmatch->offset[0] = ret;
-       kwsmatch->size[0] = kwset->mind;
-    }
-
-  return ret;
-}
-
-/* Hairy multiple string search with the Aho-Corasick algorithm.
-   (inlinable version)  */
-static inline ptrdiff_t
-acexec_trans (kwset_t kwset, char const *text, ptrdiff_t len,
-              struct kwsmatch *kwsmatch, bool longest)
-{
-  struct trie const *trie, *accept;
-  char const *tp, *left, *lim;
-  struct tree const *tree;
-  char const *trans;
-
-  /* Initialize register copies and look for easy ways out.  */
-  if (len < kwset->mind)
-    return -1;
-  trans = kwset->trans;
-  trie = kwset->trie;
-  lim = text + len;
-  tp = text;
-
-  if (!trie->accepting)
-    {
-      unsigned char c;
-      int gc1 = kwset->gc1;
-
-      while (true)
-        {
-          if (gc1 < 0)
-            {
-              while (! (trie = kwset->next[c = tr (trans, *tp++)]))
-                if (tp >= lim)
-                  return -1;
-            }
-          else
-            {
-              tp = memchr_kwset (tp, lim - tp, kwset);
-              if (!tp)
-                return -1;
-              c = tr (trans, *tp++);
-              trie = kwset->next[c];
-            }
-
-          while (true)
-            {
-              if (trie->accepting)
-                goto match;
-              if (tp >= lim)
-                return -1;
-              c = tr (trans, *tp++);
-
-              for (tree = trie->links; c != tree->label; )
-                {
-                  tree = c < tree->label ? tree->llink : tree->rlink;
-                  if (! tree)
-                    {
-                      trie = trie->fail;
-                      if (!trie)
-                        {
-                          trie = kwset->next[c];
-                          if (trie)
-                            goto have_trie;
-                          if (tp >= lim)
-                            return -1;
-                          goto next_c;
-                        }
-                      if (trie->accepting)
-                        {
-                          --tp;
-                          goto match;
-                        }
-                      tree = trie->links;
-                    }
-                }
-              trie = tree->trie;
-            have_trie:;
-            }
-        next_c:;
-        }
-    }
-
- match:
-  accept = trie;
-  while (accept->accepting == SIZE_MAX)
-    accept = accept->fail;
-  left = tp - accept->depth;
-
-  /* Try left-most longest match.  */
-  if (longest)
-    {
-      while (tp < lim)
-        {
-          struct trie const *accept1;
-          char const *left1;
-          unsigned char c = tr (trans, *tp++);
-
-          do
-            {
-              tree = trie->links;
-              while (tree && c != tree->label)
-                tree = c < tree->label ? tree->llink : tree->rlink;
-            }
-          while (!tree && (trie = trie->fail) && accept->depth <= trie->depth);
-
-          if (!tree)
-            break;
-          trie = tree->trie;
-          if (trie->accepting)
-            {
-              accept1 = trie;
-              while (accept1->accepting == SIZE_MAX)
-                accept1 = accept1->fail;
-              left1 = tp - accept1->depth;
-              if (left1 <= left)
-                {
-                  left = left1;
-                  accept = accept1;
-                }
-            }
-        }
-    }
-
-  kwsmatch->index = accept->accepting / 2;
-  kwsmatch->offset[0] = left - text;
-  kwsmatch->size[0] = accept->depth;
-
-  return left - text;
-}
-
-/* Hairy multiple string search with Aho-Corasick algorithm.  */
-static ptrdiff_t
-acexec (kwset_t kwset, char const *text, ptrdiff_t size,
-        struct kwsmatch *kwsmatch, bool longest)
-{
-  assume (0 <= size);
-  /* Help the compiler inline in two ways, depending on whether
-     kwset->trans is null.  */
-  return (IGNORE_DUPLICATE_BRANCH_WARNING
-          (kwset->trans
-           ? acexec_trans (kwset, text, size, kwsmatch, longest)
-           : acexec_trans (kwset, text, size, kwsmatch, longest)));
-}
-
-/* Find the first instance of a KWSET member in TEXT, which has SIZE bytes.
-   Return the offset (into TEXT) of the first byte of the matching substring,
-   or -1 if no match is found.  Upon a match, store details in
-   *KWSMATCH: index of matched keyword, start offset (same as the return
-   value), and length.  If LONGEST, find the longest match; otherwise
-   any match will do.  */
-ptrdiff_t
-kwsexec (kwset_t kwset, char const *text, ptrdiff_t size,
-         struct kwsmatch *kwsmatch, bool longest)
-{
-  return kwset->kwsexec (kwset, text, size, kwsmatch, longest);
-}
-
-/* Free the components of the given keyword set.  */
-void
-kwsfree (kwset_t kwset)
-{
-  obstack_free (&kwset->obstack, NULL);
-  free (kwset);
-}
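Reviewer note on the removed single-pattern path: the delta table built in kwsprep and consumed by bmexec_trans is the classic Boyer-Moore "bad character" (delta1) table. The block below is a simplified, self-contained sketch of that idea only, assuming a plain byte pattern and no translation table. The names sketch_delta1, sketch_search and NCHAR_SKETCH are hypothetical and do not appear in the removed file. The sketch also replaces the delta2 shift with a one-byte step, so it is not the removed algorithm.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

enum { NCHAR_SKETCH = UCHAR_MAX + 1 };   /* hypothetical stand-in for NCHAR */

/* Fill DELTA so that DELTA[c] is the distance from the last occurrence
   of byte c in PAT to the end of PAT, or LEN if c does not occur.
   Values are capped at UCHAR_MAX, as in the removed code.  */
static void
sketch_delta1 (unsigned char delta[NCHAR_SKETCH], char const *pat, size_t len)
{
  assert (len > 0);
  unsigned char cap = len < UCHAR_MAX ? len : UCHAR_MAX;
  memset (delta, cap, NCHAR_SKETCH);
  for (size_t i = 0; i < len; i++)
    {
      size_t dist = len - 1 - i;
      if (dist < delta[(unsigned char) pat[i]])
        delta[(unsigned char) pat[i]] = dist;
    }
}

/* Return the offset of the first occurrence of PAT (LEN bytes) in TEXT
   (SIZE bytes), or -1 if there is none.  */
static ptrdiff_t
sketch_search (char const *text, size_t size, char const *pat, size_t len)
{
  unsigned char delta[NCHAR_SKETCH];
  sketch_delta1 (delta, pat, len);

  /* END is one past the text byte aligned with the pattern's last byte.  */
  size_t end = len;
  while (end <= size)
    {
      unsigned char d = delta[(unsigned char) text[end - 1]];
      if (d != 0)
        {
          /* The aligned byte is not the pattern's last byte: skip ahead.  */
          end += d;
          continue;
        }
      if (memcmp (text + end - len, pat, len) == 0)
        return end - len;
      end++;   /* Simplification: the removed code used a delta2 shift here.  */
    }
  return -1;
}
```

A caller would use it as sketch_search (buf, buflen, "world", 5); a zero entry in the table marks the pattern's final byte, which is why the removed skip loop tests d != 0 before attempting a full comparison.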
--- a/src/kwset.h
+++ b/src/kwset.h
@@ -1,44 +0,0 @@
-/* kwset.h - header declaring the keyword set library.
-   Copyright (C) 1989, 1998, 2005, 2007, 2009-2018 Free Software Foundation,
-   Inc.
-
-   This program is free software; you can redistribute it and/or modify
-   it under the terms of the GNU General Public License as published by
-   the Free Software Foundation; either version 3, or (at your option)
-   any later version.
-
-   This program is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-   GNU General Public License for more details.
-
-   You should have received a copy of the GNU General Public License
-   along with this program; if not, write to the Free Software
-   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
-   02110-1301, USA.  */
-
-/* Written August 1989 by Mike Haertel.  */
-
-#include <stddef.h>
-#include <stdbool.h>
-
-struct kwsmatch
-{
-  ptrdiff_t index;			/* Index number of matching keyword.  */
-  ptrdiff_t offset[1];		/* Offset of match.  */
-  ptrdiff_t size[1];		/* Length of match.  */
-};
-
-#include "arg-nonnull.h"
-
-struct kwset;
-typedef struct kwset *kwset_t;
-
-extern kwset_t kwsalloc (char const *);
-extern void kwsincr (kwset_t, char const *, ptrdiff_t);
-extern ptrdiff_t kwswords (kwset_t) _GL_ATTRIBUTE_PURE;
-extern void kwsprep (kwset_t);
-extern ptrdiff_t kwsexec (kwset_t, char const *, ptrdiff_t,
-                          struct kwsmatch *, bool)
-  _GL_ARG_NONNULL ((4));
-extern void kwsfree (kwset_t);
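Reviewer note on the removed interface: the kwsexec contract (return the match offset or -1, and fill in the keyword index, offset and length) can be illustrated with a naive reference search. This is only a sketch of the contract under stated assumptions: keywords are NUL-terminated strings in an array, there is no translation table, and the match reported is the leftmost one. The names struct sketch_match and sketch_kwsexec are hypothetical. It runs in O(text * keywords) time and is not the Aho-Corasick or Boyer-Moore code that was removed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flattened counterpart of struct kwsmatch.  */
struct sketch_match
{
  ptrdiff_t index;    /* Index number of matching keyword.  */
  ptrdiff_t offset;   /* Offset of match.  */
  ptrdiff_t size;     /* Length of match.  */
};

/* Find the leftmost occurrence of any of the NWORDS keywords in TEXT,
   which has SIZE bytes.  Return its offset, or -1 if there is none.
   On success fill in *M.  If LONGEST, prefer the longest keyword that
   matches at that offset; otherwise take the first one that matches.  */
static ptrdiff_t
sketch_kwsexec (char const *const *words, ptrdiff_t nwords,
                char const *text, ptrdiff_t size,
                struct sketch_match *m, bool longest)
{
  assert (0 <= size && m);
  for (ptrdiff_t off = 0; off <= size; off++)
    {
      ptrdiff_t best = -1, best_len = -1;
      for (ptrdiff_t w = 0; w < nwords; w++)
        {
          ptrdiff_t len = strlen (words[w]);
          if (len <= size - off && best_len < len
              && memcmp (text + off, words[w], len) == 0)
            {
              best = w;
              best_len = len;
              if (!longest)
                break;
            }
        }
      if (0 <= best)
        {
          m->index = best;
          m->offset = off;
          m->size = best_len;
          return off;
        }
    }
  return -1;
}
```

As with the removed kwsexec, the return value duplicates the stored offset, so callers that only need to know whether a line matches can ignore the match structure's contents.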
--- a/src/pcresearch.c
+++ b/src/pcresearch.c
@@ -1,5 +1,5 @@
 /* pcresearch.c - searching subroutines using PCRE for grep.
-   Copyright 2000, 2007, 2009-2018 Free Software Foundation, Inc.
+   Copyright 2000, 2007, 2009-2026 Free Software Foundation, Inc.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
@@ -12,235 +12,286 @@
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
-   along with this program; if not, write to the Free Software
-   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
-   02110-1301, USA.  */
-
-/* Written August 1992 by Mike Haertel. */
+   along with this program.  If not, see <https://www.gnu.org/licenses/>.  */

 #include <config.h>
-#include "search.h"
+
+#include <search.h>
 #include "die.h"

-#if HAVE_LIBPCRE
-# include <pcre.h>
+#include <stdckdint.h>

-/* This must be at least 2; everything after that is for performance
-   in pcre_exec.  */
-enum { NSUB = 300 };
+#define PCRE2_CODE_UNIT_WIDTH 8
+#include <pcre2.h>

-# ifndef PCRE_EXTRA_MATCH_LIMIT_RECURSION
-#  define PCRE_EXTRA_MATCH_LIMIT_RECURSION 0
-# endif
-# ifndef PCRE_STUDY_JIT_COMPILE
-#  define PCRE_STUDY_JIT_COMPILE 0
-# endif
-# ifndef PCRE_STUDY_EXTRA_NEEDED
-#  define PCRE_STUDY_EXTRA_NEEDED 0
-# endif
+/* For older PCRE2.  */
+#ifndef PCRE2_SIZE_MAX
+# define PCRE2_SIZE_MAX SIZE_MAX
+#endif
+#ifndef PCRE2_CONFIG_DEPTHLIMIT
+# define PCRE2_CONFIG_DEPTHLIMIT PCRE2_CONFIG_RECURSIONLIMIT
+# define PCRE2_ERROR_DEPTHLIMIT PCRE2_ERROR_RECURSIONLIMIT
+# define pcre2_set_depth_limit pcre2_set_recursion_limit
+#endif
+#ifndef PCRE2_EXTRA_ASCII_BSD
+# define PCRE2_EXTRA_ASCII_BSD 0
+#endif
+
+/* Use PCRE2_MATCH_INVALID_UTF if supported and not buggy;
+   see <https://github.com/PCRE2Project/pcre2/issues/224>.
+   Assume the bug will be fixed after PCRE2 10.42.  */
+#if defined PCRE2_MATCH_INVALID_UTF && 10 < PCRE2_MAJOR + (42 < PCRE2_MINOR)
+enum { MATCH_INVALID_UTF = PCRE2_MATCH_INVALID_UTF };
+#else
+enum { MATCH_INVALID_UTF = 0 };
+#endif

 struct pcre_comp
 {
+  /* General context for PCRE operations.  */
+  pcre2_general_context *gcontext;
+
  /* Compiled internal form of a Perl regular expression.  */
-  pcre *cre;
+  pcre2_code *cre;

-  /* Additional information about the pattern.  */
-  pcre_extra *extra;
+  /* Match context and data block.  */
+  pcre2_match_context *mcontext;
+  pcre2_match_data *data;

-# if PCRE_STUDY_JIT_COMPILE
  /* The JIT stack and its maximum size.  */
-  pcre_jit_stack *jit_stack;
-  int jit_stack_size;
-# endif
+  pcre2_jit_stack *jit_stack;
+  idx_t jit_stack_size;

-  /* Table, indexed by ! (flag & PCRE_NOTBOL), of whether the empty
+  /* Table, indexed by ! (flag & PCRE2_NOTBOL), of whether the empty
     string matches when that flag is used.  */
  int empty_match[2];
 };

+/* Memory allocation functions for PCRE.  */
+static void *
+private_malloc (PCRE2_SIZE size, _GL_UNUSED void *unused)
+{
+  if (IDX_MAX < size)
+    xalloc_die ();
+  return ximalloc (size);
+}
+static void
+private_free (void *ptr, _GL_UNUSED void *unused)
+{
+  free (ptr);
+}
+
+void
+Pprint_version (void)
+{
+  char *buf = ximalloc (pcre2_config (PCRE2_CONFIG_VERSION, nullptr));
+  pcre2_config (PCRE2_CONFIG_VERSION, buf);
+  printf (_("\ngrep -P uses PCRE2 %s\n"), buf);
+  free (buf);
+}

 /* Match the already-compiled PCRE pattern against the data in SUBJECT,
   of size SEARCH_BYTES and starting with offset SEARCH_OFFSET, with
-   options OPTIONS, and storing resulting matches into SUB.  Return
-   the (nonnegative) match location or a (negative) error number.  */
+   options OPTIONS.
+   Return the (nonnegative) match count or a (negative) error number.  */
 static int
-jit_exec (struct pcre_comp *pc, char const *subject, int search_bytes,
-          int search_offset, int options, int *sub)
+jit_exec (struct pcre_comp *pc, char const *subject, idx_t search_bytes,
+          idx_t search_offset, int options)
 {
  while (true)
    {
-      int e = pcre_exec (pc->cre, pc->extra, subject, search_bytes,
-                         search_offset, options, sub, NSUB);
+      /* STACK_GROWTH_RATE is taken from PCRE's src/pcre2_jit_compile.c.
+         Going over the jitstack_max limit could trigger an int
+         overflow bug.  */
+      int STACK_GROWTH_RATE = 8192;
+      idx_t jitstack_max = MIN (IDX_MAX, SIZE_MAX - (STACK_GROWTH_RATE - 1));

-# if PCRE_STUDY_JIT_COMPILE
-      if (e == PCRE_ERROR_JIT_STACKLIMIT
-          && 0 < pc->jit_stack_size && pc->jit_stack_size <= INT_MAX / 2)
+      int e = pcre2_match (pc->cre, (PCRE2_SPTR) subject, search_bytes,
+                           search_offset, options, pc->data, pc->mcontext);
+      if (e == PCRE2_ERROR_JIT_STACKLIMIT
+          && pc->jit_stack_size <= jitstack_max / 2)
        {
-          int old_size = pc->jit_stack_size;
-          int new_size = pc->jit_stack_size = old_size * 2;
-          if (pc->jit_stack)
-            pcre_jit_stack_free (pc->jit_stack);
-          pc->jit_stack = pcre_jit_stack_alloc (old_size, new_size);
+          idx_t old_size = pc->jit_stack_size;
+          idx_t new_size = pc->jit_stack_size = old_size * 2;
+          pcre2_jit_stack_free (pc->jit_stack);
+          pc->jit_stack = pcre2_jit_stack_create (old_size, new_size,
+                                                  pc->gcontext);
          if (!pc->jit_stack)
-            die (EXIT_TROUBLE, 0,
-                 _("failed to allocate memory for the PCRE JIT stack"));
-          pcre_assign_jit_stack (pc->extra, NULL, pc->jit_stack);
-          continue;
+            xalloc_die ();
+          if (!pc->mcontext)
+            pc->mcontext = pcre2_match_context_create (pc->gcontext);
+          pcre2_jit_stack_assign (pc->mcontext, nullptr, pc->jit_stack);
        }
-# endif
-
-# if PCRE_EXTRA_MATCH_LIMIT_RECURSION
-      if (e == PCRE_ERROR_RECURSIONLIMIT
-          && (PCRE_STUDY_EXTRA_NEEDED || pc->extra)
-          && pc->extra->match_limit_recursion <= ULONG_MAX / 2)
+      else if (e == PCRE2_ERROR_DEPTHLIMIT)
        {
-          pc->extra->match_limit_recursion *= 2;
-          if (pc->extra->match_limit_recursion == 0)
-            {
-              pc->extra->match_limit_recursion = (1 << 24) - 1;
-              pc->extra->flags |= PCRE_EXTRA_MATCH_LIMIT_RECURSION;
-            }
-          continue;
+          uint32_t lim;
+          pcre2_config (PCRE2_CONFIG_DEPTHLIMIT, &lim);
+          if (ckd_mul (&lim, lim, 2))
+            return e;
+          if (!pc->mcontext)
+            pc->mcontext = pcre2_match_context_create (pc->gcontext);
+          pcre2_set_depth_limit (pc->mcontext, lim);
        }
-# endif
-
-      return e;
+      else
+        return e;
    }
 }

-#endif
+/* Return true if E is an error code for bad UTF-8.  */
+static bool
+bad_utf8_from_pcre2 (int e)
+{
+  return PCRE2_ERROR_UTF8_ERR21 <= e && e <= PCRE2_ERROR_UTF8_ERR1;
+}
+
+/* Compile the -P style PATTERN, containing SIZE bytes that are
+   followed by '\n'.  Return a description of the compiled pattern.  */

 void *
-Pcompile (char *pattern, size_t size, reg_syntax_t ignored)
+Pcompile (char *pattern, idx_t size, reg_syntax_t ignored, bool exact)
 {
-#if !HAVE_LIBPCRE
-  die (EXIT_TROUBLE, 0,
-       _("support for the -P option is not compiled into "
-         "this --disable-perl-regexp binary"));
-#else
-  int e;
-  char const *ep;
-  static char const wprefix[] = "(?<!\\w)(?:";
-  static char const wsuffix[] = ")(?!\\w)";
-  static char const xprefix[] = "^(?:";
-  static char const xsuffix[] = ")$";
-  int fix_len_max = MAX (sizeof wprefix - 1 + sizeof wsuffix - 1,
-                         sizeof xprefix - 1 + sizeof xsuffix - 1);
-  char *re = xnmalloc (4, size + (fix_len_max + 4 - 1) / 4);
-  int flags = PCRE_DOLLAR_ENDONLY | (match_icase ? PCRE_CASELESS : 0);
-  char const *patlim = pattern + size;
-  char *n = re;
-  char const *p;
-  char const *pnul;
-  struct pcre_comp *pc = xcalloc (1, sizeof (*pc));
+  PCRE2_SIZE e;
+  int ec;
+  int flags = PCRE2_DOLLAR_ENDONLY | (match_icase ? PCRE2_CASELESS : 0);
+  char *patlim = pattern + size;
+  struct pcre_comp *pc = ximalloc (sizeof *pc);
+  pcre2_general_context *gcontext = pc->gcontext
+    = pcre2_general_context_create (private_malloc, private_free, nullptr);
+  pcre2_compile_context *ccontext = pcre2_compile_context_create (gcontext);

  if (localeinfo.multibyte)
    {
+      uint32_t unicode;
+      if (pcre2_config (PCRE2_CONFIG_UNICODE, &unicode) < 0 || !unicode)
+        die (EXIT_TROUBLE, 0,
+             _("-P supports only unibyte locales on this platform"));
      if (! localeinfo.using_utf8)
        die (EXIT_TROUBLE, 0, _("-P supports only unibyte and UTF-8 locales"));
-      flags |= PCRE_UTF8;
+
+      flags |= PCRE2_UTF;
+
+      /* If supported, consider invalid UTF-8 as a barrier not an error.  */
+      flags |= MATCH_INVALID_UTF;
+
+      /* If PCRE2_EXTRA_ASCII_BSD is available, use PCRE2_UCP
+         so that \d does not have the undesirable effect of matching
+         non-ASCII digits.  Otherwise (i.e., with PCRE2 10.42 and earlier),
+         escapes like \w have only their ASCII interpretations,
+         but that's better than the confusion that would ensue if \d
+         matched non-ASCII digits.  */
+      flags |= PCRE2_EXTRA_ASCII_BSD ? PCRE2_UCP : 0;
+
+#if 0
+      /* Do not match individual code units but only UTF-8.  */
+      flags |= PCRE2_NEVER_BACKSLASH_C;
+#endif
    }

  /* FIXME: Remove this restriction.  */
-  if (memchr (pattern, '\n', size))
+  if (rawmemchr (pattern, '\n') != patlim)
    die (EXIT_TROUBLE, 0, _("the -P option only supports a single pattern"));

-  *n = '\0';
-  if (match_words)
-    strcpy (n, wprefix);
-  if (match_lines)
-    strcpy (n, xprefix);
-  n += strlen (n);
+#ifdef PCRE2_EXTRA_MATCH_LINE
+  uint32_t extra_options = (PCRE2_EXTRA_ASCII_BSD
+                            | (match_lines ? PCRE2_EXTRA_MATCH_LINE : 0));
+  pcre2_set_compile_extra_options (ccontext, extra_options);
+#endif

-  /* The PCRE interface doesn't allow NUL bytes in the pattern, so
-     replace each NUL byte in the pattern with the four characters
-     "\000", removing a preceding backslash if there are an odd
-     number of backslashes before the NUL.  */
-  for (p = pattern; (pnul = memchr (p, '\0', patlim - p)); p = pnul + 1)
+  void *re_storage = nullptr;
+  if (match_lines)
    {
-      memcpy (n, p, pnul - p);
-      n += pnul - p;
-      for (p = pnul; pattern < p && p[-1] == '\\'; p--)
-        continue;
-      n -= (pnul - p) & 1;
-      strcpy (n, "\\000");
-      n += 4;
+#ifndef PCRE2_EXTRA_MATCH_LINE
+      static char const *const xprefix = "^(?:";
+      static char const *const xsuffix = ")$";
+      idx_t re_size = size + strlen (xprefix) + strlen (xsuffix);
+      char *re = re_storage = ximalloc (re_size);
+      char *rez = mempcpy (re, xprefix, strlen (xprefix));
+      rez = mempcpy (rez, pattern, size);
+      memcpy (rez, xsuffix, strlen (xsuffix));
+      pattern = re;
+      size = re_size;
+#endif
+    }
+  else if (match_words)
+    {
+      /* PCRE2_EXTRA_MATCH_WORD is incompatible with grep -w;
+         do things the grep way.  */
+      static char const *const wprefix = "(?<!\\w)(?:";
+      static char const *const wsuffix = ")(?!\\w)";
+      idx_t re_size = size + strlen (wprefix) + strlen (wsuffix);
+      char *re = re_storage = ximalloc (re_size);
+      char *rez = mempcpy (re, wprefix, strlen (wprefix));
+      rez = mempcpy (rez, pattern, size);
+      memcpy (rez, wsuffix, strlen (wsuffix));
+      pattern = re;
+      size = re_size;
    }

-  memcpy (n, p, patlim - p);
-  n += patlim - p;
-  *n = '\0';
-  if (match_words)
-    strcpy (n, wsuffix);
-  if (match_lines)
-    strcpy (n, xsuffix);
+  if (!localeinfo.multibyte)
+    pcre2_set_character_tables (ccontext, pcre2_maketables (gcontext));

-  pc->cre = pcre_compile (re, flags, &ep, &e, pcre_maketables ());
+  pc->cre = pcre2_compile ((PCRE2_SPTR) pattern, size, flags,
+                           &ec, &e, ccontext);
  if (!pc->cre)
-    die (EXIT_TROUBLE, 0, "%s", ep);
+    {
+      enum { ERRBUFSIZ = 256 }; /* Taken from pcre2grep.c ERRBUFSIZ.  */
+      PCRE2_UCHAR8 ep[ERRBUFSIZ];
+      pcre2_get_error_message (ec, ep, sizeof ep);
+      die (EXIT_TROUBLE, 0, "%s", ep);
+    }

-  int pcre_study_flags = PCRE_STUDY_EXTRA_NEEDED | PCRE_STUDY_JIT_COMPILE;
-  pc->extra = pcre_study (pc->cre, pcre_study_flags, &ep);
-  if (ep)
-    die (EXIT_TROUBLE, 0, "%s", ep);
+  free (re_storage);
+  pcre2_compile_context_free (ccontext);

-# if PCRE_STUDY_JIT_COMPILE
-  if (pcre_fullinfo (pc->cre, pc->extra, PCRE_INFO_JIT, &e))
-    die (EXIT_TROUBLE, 0, _("internal error (should never happen)"));
+  pc->mcontext = nullptr;
+  pc->data = pcre2_match_data_create_from_pattern (pc->cre, gcontext);
+
+  /* Ignore any failure return from pcre2_jit_compile, as that merely
+     means JIT won't be used during matching.  */
+  pcre2_jit_compile (pc->cre, PCRE2_JIT_COMPLETE);

  /* The PCRE documentation says that a 32 KiB stack is the default.  */
-  if (e)
-    pc->jit_stack_size = 32 << 10;
-# endif
+  pc->jit_stack = nullptr;
+  pc->jit_stack_size = 32 << 10;

-  free (re);
-
-  int sub[NSUB];
-  pc->empty_match[false] = pcre_exec (pc->cre, pc->extra, "", 0, 0,
-                                      PCRE_NOTBOL, sub, NSUB);
-  pc->empty_match[true] = pcre_exec (pc->cre, pc->extra, "", 0, 0, 0, sub,
-                                     NSUB);
+  pc->empty_match[false] = jit_exec (pc, "", 0, 0, PCRE2_NOTBOL);
+  pc->empty_match[true] = jit_exec (pc, "", 0, 0, 0);

  return pc;
-#endif /* HAVE_LIBPCRE */
 }

-size_t
-Pexecute (void *vcp, char const *buf, size_t size, size_t *match_size,
+ptrdiff_t
+Pexecute (void *vcp, char const *buf, idx_t size, idx_t *match_size,
          char const *start_ptr)
 {
-#if !HAVE_LIBPCRE
-  /* We can't get here, because Pcompile would have been called earlier.  */
-  die (EXIT_TROUBLE, 0, _("internal error"));
-#else
-  int sub[NSUB];
  char const *p = start_ptr ? start_ptr : buf;
  bool bol = p[-1] == eolbyte;
  char const *line_start = buf;
-  int e = PCRE_ERROR_NOMATCH;
+  int e = PCRE2_ERROR_NOMATCH;
  char const *line_end;
  struct pcre_comp *pc = vcp;
+  PCRE2_SIZE *sub = pcre2_get_ovector_pointer (pc->data);

-  /* The search address to pass to pcre_exec.  This is the start of
+  /* The search address to pass to PCRE.  This is the start of
     the buffer, or just past the most-recently discovered encoding
     error or line end.  */
  char const *subject = buf;

  do
    {
-      /* Search line by line.  Although this code formerly used
-         PCRE_MULTILINE for performance, the performance wasn't always
+      /* Search line by line.  Although this formerly used something like
+         PCRE2_MULTILINE for performance, the performance wasn't always
         better and the correctness issues were too puzzling.  See
         Bug#22655.  */
-      line_end = memchr (p, eolbyte, buf + size - p);
-      if (INT_MAX < line_end - p)
+      line_end = rawmemchr (p, eolbyte);
+      if (PCRE2_SIZE_MAX < line_end - p)
        die (EXIT_TROUBLE, 0, _("exceeded PCRE's line length limit"));

      for (;;)
        {
          /* Skip past bytes that are easily determined to be encoding
             errors, treating them as data that cannot match.  This is
-             faster than having pcre_exec check them.  */
+             faster than having PCRE check them.  */
          while (localeinfo.sbclen[to_uchar (*p)] == -1)
            {
              p++;
@ -248,10 +299,10 @@ Pexecute (void *vcp, char const *buf, size_t size, size_t *match_size,
              bol = false;
            }

-          int search_offset = p - subject;
+          idx_t search_offset = p - subject;

          /* Check for an empty match; this is faster than letting
-             pcre_exec do it.  */
+             PCRE do it.  */
          if (p == line_end)
            {
              sub[0] = sub[1] = search_offset;
@ -261,13 +312,14 @@ Pexecute (void *vcp, char const *buf, size_t size, size_t *match_size,

          int options = 0;
          if (!bol)
-            options |= PCRE_NOTBOL;
+            options |= PCRE2_NOTBOL;

-          e = jit_exec (pc, subject, line_end - subject, search_offset,
-                        options, sub);
-          if (e != PCRE_ERROR_BADUTF8)
+          e = jit_exec (pc, subject, line_end - subject,
+                        search_offset, options);
+          if (MATCH_INVALID_UTF || !bad_utf8_from_pcre2 (e))
            break;
-          int valid_bytes = sub[0];
+
+          idx_t valid_bytes = pcre2_get_startchar (pc->data);

          if (search_offset <= valid_bytes)
            {
@ -277,14 +329,15 @@ Pexecute (void *vcp, char const *buf, size_t size, size_t *match_size,
                  /* Handle the empty-match case specially, for speed.
                     This optimization is valid if VALID_BYTES is zero,
                     which means SEARCH_OFFSET is also zero.  */
+                  sub[0] = valid_bytes;
                  sub[1] = 0;
                  e = pc->empty_match[bol];
                }
              else
                e = jit_exec (pc, subject, valid_bytes, search_offset,
-                              options | PCRE_NO_UTF8_CHECK | PCRE_NOTEOL, sub);
+                              options | PCRE2_NO_UTF_CHECK | PCRE2_NOTEOL);

-              if (e != PCRE_ERROR_NOMATCH)
+              if (e != PCRE2_ERROR_NOMATCH)
                break;

              /* Treat the encoding error as data that cannot match.  */
@ -295,7 +348,7 @@ Pexecute (void *vcp, char const *buf, size_t size, size_t *match_size,
          subject += valid_bytes + 1;
        }

-      if (e != PCRE_ERROR_NOMATCH)
+      if (e != PCRE2_ERROR_NOMATCH)
        break;
      bol = true;
      p = subject = line_start = line_end + 1;
@ -306,26 +359,42 @@ Pexecute (void *vcp, char const *buf, size_t size, size_t *match_size,
    {
      switch (e)
        {
-        case PCRE_ERROR_NOMATCH:
+        case PCRE2_ERROR_NOMATCH:
          break;

-        case PCRE_ERROR_NOMEMORY:
-          die (EXIT_TROUBLE, 0, _("memory exhausted"));
+        case PCRE2_ERROR_NOMEMORY:
+          die (EXIT_TROUBLE, 0, _("%s: memory exhausted"), input_filename ());

-# if PCRE_STUDY_JIT_COMPILE
-        case PCRE_ERROR_JIT_STACKLIMIT:
-          die (EXIT_TROUBLE, 0, _("exhausted PCRE JIT stack"));
-# endif
+        case PCRE2_ERROR_JIT_STACKLIMIT:
+          die (EXIT_TROUBLE, 0, _("%s: exhausted PCRE JIT stack"),
+               input_filename ());

-        case PCRE_ERROR_MATCHLIMIT:
-          die (EXIT_TROUBLE, 0, _("exceeded PCRE's backtracking limit"));
+        case PCRE2_ERROR_MATCHLIMIT:
+          die (EXIT_TROUBLE, 0, _("%s: exceeded PCRE's backtracking limit"),
+               input_filename ());
+
+        case PCRE2_ERROR_DEPTHLIMIT:
+          die (EXIT_TROUBLE, 0,
+               _("%s: exceeded PCRE's nested backtracking limit"),
+               input_filename ());
+
+        case PCRE2_ERROR_RECURSELOOP:
+          die (EXIT_TROUBLE, 0, _("%s: PCRE detected recurse loop"),
+               input_filename ());
+
+#ifdef PCRE2_ERROR_HEAPLIMIT
+        case PCRE2_ERROR_HEAPLIMIT:
+          die (EXIT_TROUBLE, 0, _("%s: exceeded PCRE's heap limit"),
+               input_filename ());
+#endif

        default:
          /* For now, we lump all remaining PCRE failures into this basket.
             If anyone cares to provide sample grep usage that can trigger
             particular PCRE errors, we can add to the list (above) of more
             detailed diagnostics.  */
-          die (EXIT_TROUBLE, 0, _("internal PCRE error: %d"), e);
+          die (EXIT_TROUBLE, 0, _("%s: internal PCRE error: %d"),
+               input_filename (), e);
        }

      return -1;
@ -349,5 +418,4 @@ Pexecute (void *vcp, char const *buf, size_t size, size_t *match_size,
      *match_size = end - beg;
      return beg - buf;
    }
-#endif
 }
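
Review note on the Pexecute hunks above: the inner loop skips bytes that can be classified as encoding errors from a per-byte table before handing the subject to PCRE. The following is a minimal sketch of that idea only. `sbclen_utf8` and `skip_encoding_errors` are hypothetical names, the table is a hand-written approximation of what `localeinfo.sbclen` might hold in a UTF-8 locale, and the explicit `line_end` bound is an assumption of the sketch (the patched code relies on the end-of-line byte stopping the scan instead).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for localeinfo.sbclen in a UTF-8 locale:
   1 for an ASCII byte, -2 for a byte that can start a multibyte
   sequence, -1 for a byte that can never start a character.  */
static signed char
sbclen_utf8 (unsigned char b)
{
  if (b < 0x80)
    return 1;
  if (0xc2 <= b && b <= 0xf4)
    return -2;
  return -1;
}

/* Return the first position in [P, LINE_END) whose byte is not
   immediately classifiable as an encoding error, or LINE_END if
   every remaining byte is one.  */
static char const *
skip_encoding_errors (char const *p, char const *line_end)
{
  assert (p <= line_end);
  while (p < line_end && sbclen_utf8 ((unsigned char) *p) == -1)
    p++;
  return p;
}
```

In this sketch a stray continuation byte such as 0x80 is stepped over without calling the matcher, while a plausible lead byte such as 0xC3 stops the scan so that the matcher can validate the full sequence.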
--- a/src/search.h
+++ b/src/search.h
@ -1,5 +1,5 @@
 /* search.c - searching subroutines using dfa, kwset and regex for grep.
-   Copyright 1992, 1998, 2000, 2007, 2009-2018 Free Software Foundation, Inc.
+   Copyright 1992, 1998, 2000, 2007, 2009-2026 Free Software Foundation, Inc.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
@ -12,9 +12,7 @@
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
-   along with this program; if not, write to the Free Software
-   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
-   02110-1301, USA.  */
+   along with this program.  If not, see <https://www.gnu.org/licenses/>.  */

 #ifndef GREP_SEARCH_H
 #define GREP_SEARCH_H 1
@ -24,7 +22,6 @@
 #include <sys/types.h>
 #include <stdint.h>
 #include <wchar.h>
-#include <wctype.h>
 #include <regex.h>

 #include "system.h"
@ -48,39 +45,60 @@ typedef signed char mb_len_map_t;
 /* searchutils.c */
 extern void wordinit (void);
 extern kwset_t kwsinit (bool);
-extern size_t wordchars_size (char const *, char const *) _GL_ATTRIBUTE_PURE;
-extern size_t wordchar_next (char const *, char const *) _GL_ATTRIBUTE_PURE;
-extern size_t wordchar_prev (char const *, char const *, char const *)
+extern idx_t wordchars_size (char const *, char const *) _GL_ATTRIBUTE_PURE;
+extern idx_t wordchar_next (char const *, char const *) _GL_ATTRIBUTE_PURE;
+extern idx_t wordchar_prev (char const *, char const *, char const *)
  _GL_ATTRIBUTE_PURE;
-extern ptrdiff_t mb_goback (char const **, char const *, char const *);
+extern ptrdiff_t mb_goback (char const **, idx_t *, char const *, char const *);

 /* dfasearch.c */
-extern void *GEAcompile (char *, size_t, reg_syntax_t);
-extern size_t EGexecute (void *, char const *, size_t, size_t *, char const *);
+extern void *GEAcompile (char *, idx_t, reg_syntax_t, bool);
+extern ptrdiff_t EGexecute (void *, char const *, idx_t, idx_t *, char const *);

 /* kwsearch.c */
-extern void *Fcompile (char *, size_t, reg_syntax_t);
-extern size_t Fexecute (void *, char const *, size_t, size_t *, char const *);
+extern void *Fcompile (char *, idx_t, reg_syntax_t, bool);
+extern ptrdiff_t Fexecute (void *, char const *, idx_t, idx_t *, char const *);

 /* pcresearch.c */
-extern void *Pcompile (char *, size_t, reg_syntax_t);
-extern size_t Pexecute (void *, char const *, size_t, size_t *, char const *);
+extern void *Pcompile (char *, idx_t, reg_syntax_t, bool);
+extern ptrdiff_t Pexecute (void *, char const *, idx_t, idx_t *, char const *);
+extern void Pprint_version (void);

 /* grep.c */
 extern struct localeinfo localeinfo;
-extern void fgrep_to_grep_pattern (char **, size_t *);
+extern void fgrep_to_grep_pattern (char **, idx_t *);
+
+/* Return the number of bytes in the character at the start of S, which
+   is of size N.  N must be positive.  MBS is the conversion state.
+   This acts like mbrlen, except it returns -1 and -2 instead of
+   (size_t) -1 and (size_t) -2.  */
+SEARCH_INLINE ptrdiff_t
+imbrlen (char const *s, idx_t n, mbstate_t *mbs)
+{
+  size_t len = mbrlen (s, n, mbs);
+
+  /* Convert result to ptrdiff_t portably, even on oddball platforms.
+     When optimizing, this typically uses no machine instructions.  */
+  if (len <= MB_LEN_MAX)
+    return len;
+  ptrdiff_t neglen = -len;
+  return -neglen;
+}

 /* Return the number of bytes in the character at the start of S, which
   is of size N.  N must be positive.  MBS is the conversion state.
   This acts like mbrlen, except it returns 1 when mbrlen would return 0,
+   it returns -1 and -2 instead of (size_t) -1 and (size_t) -2,
   and it is typically faster because of the cache.  */
-SEARCH_INLINE size_t
-mb_clen (char const *s, size_t n, mbstate_t *mbs)
+SEARCH_INLINE ptrdiff_t
+mb_clen (char const *s, idx_t n, mbstate_t *mbs)
 {
-  size_t len = localeinfo.sbclen[to_uchar (*s)];
-  return len == (size_t) -2 ? mbrlen (s, n, mbs) : len;
+  signed char len = localeinfo.sbclen[to_uchar (*s)];
+  return len == -2 ? imbrlen (s, n, mbs) : len;
 }

+extern char const *input_filename (void);
+
 _GL_INLINE_HEADER_END

 #endif /* GREP_SEARCH_H */
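
Review note on imbrlen above: the conversion from mbrlen's `size_t` result to `ptrdiff_t` can be tried in isolation. The sketch below copies only the arithmetic from the hunk; `mbrlen_result_to_signed` is a hypothetical name, and the assertion about the accepted input range is an assumption added for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Convert an mbrlen-style result to ptrdiff_t.  Lengths up to
   MB_LEN_MAX pass through unchanged; (size_t) -1 and (size_t) -2
   become -1 and -2.  */
static ptrdiff_t
mbrlen_result_to_signed (size_t len)
{
  /* Assumption of this sketch: LEN is a value mbrlen can return.  */
  assert (len <= MB_LEN_MAX || (size_t) -2 <= len);

  if (len <= MB_LEN_MAX)
    return len;

  /* Unsigned negation wraps, so -LEN is 1 or 2 here; negating the
     signed copy then yields -1 or -2 without relying on how an
     out-of-range unsigned value converts to a signed type.  */
  ptrdiff_t neglen = -len;
  return -neglen;
}
```

With signed results, callers can test `clen < 0` instead of comparing against `(size_t) -2`, which is the pattern the mb_goback hunks in this change adopt.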
--- a/src/searchutils.c
+++ b/src/searchutils.c
@ -1,5 +1,5 @@
 /* searchutils.c - helper subroutines for grep's matchers.
-   Copyright 1992, 1998, 2000, 2007, 2009-2018 Free Software Foundation, Inc.
+   Copyright 1992, 1998, 2000, 2007, 2009-2026 Free Software Foundation, Inc.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
@ -12,15 +12,15 @@
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
-   along with this program; if not, write to the Free Software
-   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
-   02110-1301, USA.  */
+   along with this program.  If not, see <https://www.gnu.org/licenses/>.  */

 #include <config.h>

 #define SEARCH_INLINE _GL_EXTERN_INLINE
 #define SYSTEM_INLINE _GL_EXTERN_INLINE
-#include "search.h"
+#include <search.h>
+
+#include <uchar.h>

 /* For each byte B, sbwordchar[B] is true if B is a single-byte
   character that is a word constituent, and is false otherwise.  */
@ -30,7 +30,7 @@ static bool sbwordchar[NCHAR];
 static bool
 wordchar (wint_t wc)
 {
-  return wc == L'_' || iswalnum (wc);
+  return wc == L'_' || c32isalnum (wc);
 }

 void
@ -43,47 +43,53 @@ wordinit (void)
 kwset_t
 kwsinit (bool mb_trans)
 {
-  char *trans = NULL;
+  char *trans = nullptr;

  if (match_icase && (MB_CUR_MAX == 1 || mb_trans))
    {
-      trans = xmalloc (NCHAR);
-      if (MB_CUR_MAX == 1)
-        for (int i = 0; i < NCHAR; i++)
-          trans[i] = toupper (i);
-      else
-        for (int i = 0; i < NCHAR; i++)
-          {
-            wint_t wc = localeinfo.sbctowc[i];
-            wint_t uwc = towupper (wc);
-            if (uwc != wc)
-              {
-                mbstate_t mbs = { 0 };
-                size_t len = wcrtomb (&trans[i], uwc, &mbs);
-                if (len != 1)
-                  abort ();
-              }
-            else
-              trans[i] = i;
-          }
+      trans = ximalloc (NCHAR);
+      /* If I is a single-byte character that becomes a different
+         single-byte character when uppercased, set trans[I]
+         to that character.  Otherwise, set trans[I] to I.  */
+      for (int i = 0; i < NCHAR; i++)
+        trans[i] = toupper (i);
    }

  return kwsalloc (trans);
 }

-/* In the buffer *MB_START, return the number of bytes needed to go
-   back from CUR to the previous boundary, where a "boundary" is the
-   start of a multibyte character or is an error-encoding byte.  The
-   buffer ends at END (i.e., one past the address of the buffer's last
-   byte).  If CUR is already at a boundary, return 0.  If *MB_START is
-   greater than CUR, return the negative value CUR - *MB_START.
+/* Return the number of bytes needed to go back to the start of a
+   multibyte character in a buffer.  The buffer starts at *MB_START.
+   (See below for MBCLEN's role.)  The multibyte character contains
+   the byte addressed by CUR.  The buffer ends just before END, which
+   must not be less than CUR.

-   When returning zero, set *MB_START to CUR.  When returning a
-   positive value, set *MB_START to the next boundary after CUR, or to
-   END if there is no such boundary.  When returning a negative value,
-   leave *MB_START alone.  */
+   If CUR is no larger than *MB_START, return CUR - *MB_START without
+   modifying *MB_START or dealing with MBCLEN.  Otherwise, update
+   *MB_START to point to the first multibyte character starting on or
+   after CUR, and if MBCLEN is nonnull then deal with MBCLEN as follows:
+
+     - If this function returns 0 and the locale is multibyte and is
+       not UTF-8, set *MBCLEN to the number of bytes in the multibyte
+       character containing the byte addressed by (CUR - 1).
+
+     - Otherwise, possibly set *MBCLEN to an unspecified value.
+
+   *MB_START should point to the start of a multibyte character, or to
+   an encoding-error byte.
+
+   *END should be a sentinel byte - one of '\0', '\r', '\n', '.', '/',
+   which POSIX says cannot be part of any other character.  Also,
+   there should be a byte string immediately before *MB_START that
+   contains a sentinel byte.  This means it is OK to scan backwards
+   before *MB_START as long as the scan stops at a sentinel byte, and
+   similarly it is OK to scan forwards from CUR (without checking END)
+   so long as the scan stops at a sentinel byte.
+
+   Treat encoding errors as if they were single-byte characters.  */
 ptrdiff_t
-mb_goback (char const **mb_start, char const *cur, char const *end)
+mb_goback (char const **mb_start, idx_t *mbclen, char const *cur,
+           char const *end)
 {
  const char *p = *mb_start;
  const char *p0 = p;
@ -93,30 +99,44 @@ mb_goback (char const **mb_start, char const *cur, char const *end)

  if (localeinfo.using_utf8)
    {
+      /* UTF-8 permits scanning backward to the previous character.
+         Start by assuming CUR is at a character boundary.  */
      p = cur;

-      if (cur < end && (*cur & 0xc0) == 0x80)
+      if ((*cur & 0xc0) == 0x80)
        for (int i = 1; i <= 3; i++)
          if ((cur[-i] & 0xc0) != 0x80)
            {
-              mbstate_t mbs = { 0 };
-              size_t clen = mb_clen (cur - i, end - (cur - i), &mbs);
-              if (i < clen && clen < (size_t) -2)
+              /* True if the length implied by the putative byte 1 at
+                 CUR[-I] extends at least through *CUR.  */
+              bool long_enough = (~cur[-i] & 0xff) >> (7 - i) == 0;
+
+              if (long_enough)
                {
-                  p0 = cur - i;
-                  p = p0 + clen;
+                  mbstate_t mbs; mbszero (&mbs);
+                  ptrdiff_t clen = imbrlen (cur - i, end - (cur - i), &mbs);
+                  if (0 <= clen)
+                    {
+                      /* This multibyte character contains *CUR.  */
+                      p0 = cur - i;
+                      p = p0 + clen;
+                    }
                }
              break;
            }
    }
  else
    {
-      mbstate_t mbs = { 0 };
+      /* In non-UTF-8 encodings, to find character boundaries one must
+         in general scan forward from the start of the buffer.  */
+      mbstate_t mbs; mbszero (&mbs);
+      ptrdiff_t clen;
+
      do
        {
-          size_t clen = mb_clen (p, end - p, &mbs);
+          clen = mb_clen (p, end - p, &mbs);

-          if ((size_t) -2 <= clen)
+          if (clen < 0)
            {
              /* An invalid sequence, or a truncated multibyte character.
                 Treat it as a single byte character.  */
@ -127,6 +147,9 @@ mb_goback (char const **mb_start, char const *cur, char const *end)
          p += clen;
        }
      while (p < cur);
+
+      if (mbclen)
+        *mbclen = clen;
    }

  *mb_start = p;
@ -136,36 +159,36 @@ mb_goback (char const **mb_start, char const *cur, char const *end)
 /* Examine the start of BUF (which goes to END) for word constituents.
   If COUNTALL, examine as many as possible; otherwise, examine at most one.
   Return the total number of bytes in the examined characters.  */
-static size_t
+static idx_t
 wordchars_count (char const *buf, char const *end, bool countall)
 {
-  size_t n = 0;
-  mbstate_t mbs = { 0 };
-  while (n < end - buf)
+  mbstate_t mbs; mbszero (&mbs);
+  char const *p = buf;
+  while (p < end)
    {
-      unsigned char b = buf[n];
+      unsigned char b = *p;
      if (sbwordchar[b])
-        n++;
+        p++;
      else if (localeinfo.sbclen[b] != -2)
        break;
      else
        {
-          wchar_t wc = 0;
-          size_t wcbytes = mbrtowc (&wc, buf + n, end - buf - n, &mbs);
+          char32_t wc = 0;
+          size_t wcbytes = mbrtoc32 (&wc, p, end - p, &mbs);
          if (!wordchar (wc))
            break;
-          n += wcbytes + !wcbytes;
+          p += wcbytes + !wcbytes;
        }
      if (!countall)
        break;
    }
-  return n;
+  return p - buf;
 }

 /* Examine the start of BUF for the longest prefix containing just
   word constituents.  Return the total number of bytes in the prefix.
   The buffer ends at END.  */
-size_t
+idx_t
 wordchars_size (char const *buf, char const *end)
 {
  return wordchars_count (buf, end, true);
@ -173,7 +196,7 @@ wordchars_size (char const *buf, char const *end)

 /* If BUF starts with a word constituent, return the number of bytes
   used to represent it; otherwise, return zero.  The buffer ends at END.  */
-size_t
+idx_t
 wordchar_next (char const *buf, char const *end)
 {
  return wordchars_count (buf, end, false);
@ -182,16 +205,15 @@ wordchar_next (char const *buf, char const *end)
 /* In the buffer BUF, return nonzero if the character whose encoding
   contains the byte before CUR is a word constituent.  The buffer
   ends at END.  */
-size_t
+idx_t
 wordchar_prev (char const *buf, char const *cur, char const *end)
 {
  if (buf == cur)
    return 0;
  unsigned char b = *--cur;
-  if (! localeinfo.multibyte
-      || (localeinfo.using_utf8 && localeinfo.sbclen[b] != -2))
+  if (! localeinfo.multibyte || localeinfo.using_utf8 & ~(b >> 7))
    return sbwordchar[b];
  char const *p = buf;
-  cur -= mb_goback (&p, cur, end);
+  cur -= mb_goback (&p, nullptr, cur, end);
  return wordchar_next (cur, end);
 }
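
Review note on the mb_goback UTF-8 branch above: the `long_enough` expression checks whether the top I+1 bits of the candidate lead byte are all set, which is what a UTF-8 lead byte needs for its sequence to span at least I+1 bytes. A minimal sketch of just that predicate follows; `lead_reaches` is a hypothetical name, and taking the byte as `unsigned char` is a choice made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true if LEAD, taken as the first byte of a UTF-8 sequence,
   announces a length of at least I + 1 bytes, so that the sequence
   would extend through the byte I positions later.  I is 1, 2 or 3.  */
static bool
lead_reaches (unsigned char lead, int i)
{
  assert (1 <= i && i <= 3);

  /* Inverting LEAD turns its leading 1 bits into 0 bits.  Shifting
     right by 7 - I keeps only the top I + 1 bits of the inverted
     byte, so the result is zero exactly when those bits of LEAD
     were all 1.  */
  return ((~lead & 0xff) >> (7 - i)) == 0;
}
```

For example, 0xC3 (a two-byte lead) reaches one byte ahead but not two, and an ASCII byte reaches nothing. The predicate looks only at the announced length; the patched code still calls imbrlen afterwards to check that the sequence is actually valid.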
--- a/src/system.h
+++ b/src/system.h
@ -1,5 +1,5 @@
 /* Portability cruft.  Include after config.h and sys/types.h.
-   Copyright 1996, 1998-2000, 2007, 2009-2018 Free Software Foundation, Inc.
+   Copyright 1996, 1998-2000, 2007, 2009-2026 Free Software Foundation, Inc.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
@ -12,9 +12,7 @@
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
-   along with this program; if not, write to the Free Software
-   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
-   02110-1301, USA.  */
+   along with this program.  If not, see <https://www.gnu.org/licenses/>.  */

 #ifndef GREP_SYSTEM_H
 #define GREP_SYSTEM_H 1
@ -101,9 +99,9 @@ void __asan_unpoison_memory_region (void const volatile *addr, size_t size);

 #else

-static _GL_UNUSED void
+_GL_UNUSED static void
 __asan_poison_memory_region (void const volatile *addr, size_t size) { }
-static _GL_UNUSED void
+_GL_UNUSED static void
 __asan_unpoison_memory_region (void const volatile *addr, size_t size) { }
 #endif

--- a/tests/100k-entries
+++ b/tests/100k-entries
@ -0,0 +1,15 @@
+#!/bin/sh
+# This would make grep-3.11 fail with ENOTSUP and exit 2.
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+expensive_
+
+fail=0
+
+mkdir t || framework_failure_
+(cd t && seq 100000|xargs touch) || framework_failure_
+
+returns_ 1 grep -r x t > out 2> err
+compare /dev/null out || fail=1
+compare /dev/null err || fail=1
+
+Exit $fail
--- a/tests/Coreutils.pm
+++ b/tests/Coreutils.pm
@ -1,7 +1,7 @@
 package Coreutils;
 # This is a testing framework.

-# Copyright (C) 1998-2015, 2017-2018 Free Software Foundation, Inc.
+# Copyright (C) 1998-2015, 2017-2026 Free Software Foundation, Inc.

 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
--- a/tests/CuSkip.pm
+++ b/tests/CuSkip.pm
@ -1,7 +1,7 @@
 package CuSkip;
 # Skip a test: emit diag to log and to stderr, and exit 77

-# Copyright (C) 2011-2015, 2017-2018 Free Software Foundation, Inc.
+# Copyright (C) 2011-2015, 2017-2026 Free Software Foundation, Inc.

 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
--- a/tests/CuTmpdir.pm
+++ b/tests/CuTmpdir.pm
@ -1,7 +1,7 @@
 package CuTmpdir;
 # create, then chdir into a temporary sub-directory

-# Copyright (C) 2007-2015, 2017-2018 Free Software Foundation, Inc.
+# Copyright (C) 2007-2015, 2017-2026 Free Software Foundation, Inc.

 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@ -1,5 +1,5 @@
 ## Process this file with automake to create Makefile.in
-# Copyright 1997-1998, 2005-2018 Free Software Foundation, Inc.
+# Copyright 1997-1998, 2005-2026 Free Software Foundation, Inc.
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@ -41,18 +41,27 @@ AM_CFLAGS = $(WARN_CFLAGS) $(WERROR_CFLAGS)

 # Tell the linker to omit references to unused shared libraries.
 AM_LDFLAGS = $(IGNORE_UNUSED_LIBRARIES_CFLAGS)
-LDADD = ../lib/libgreputils.a $(LIBINTL) ../lib/libgreputils.a
+LDADD = ../lib/libgreputils.a $(LIBINTL) ../lib/libgreputils.a \
+  $(HARD_LOCALE_LIB) $(LIBC32CONV) $(LIBCSTACK) \
+  $(LIBSIGSEGV) $(LIBUNISTRING) $(MBRTOWC_LIB) $(SETLOCALE_NULL_LIB) \
+  $(LIBTHREAD)

 # The triple-backref test is expected to fail with both the system
 # matcher (i.e., with glibc) and with the included matcher.
 # Both matchers need to be fixed.
-# FIXME-2015: Remove this once the glibc and gnulib bugs are fixed.
+# FIXME-2025: Remove this once the glibc and gnulib bugs are fixed.
 XFAIL_TESTS = triple-backref

+# The glibc-infloop test is expected to fail with both the system
+# matcher (i.e., with glibc) and with the included matcher.
+# Both matchers need to be fixed.
+# FIXME-2025: Remove this once the glibc and gnulib bugs are fixed.
+XFAIL_TESTS += glibc-infloop
+
 # Equivalence classes are only supported when using the system
 # matcher (which means only with glibc).
 # The included matcher needs to be fixed.
-# FIXME-2015: Remove this once the gnulib bug is fixed.
+# FIXME-2025: Remove this once the gnulib bug is fixed.
 if USE_INCLUDED_REGEX
 XFAIL_TESTS += equiv-classes
 else
@ -62,14 +71,17 @@ else
 endif

 TESTS =						\
+  100k-entries					\
  backref					\
  backref-alt					\
  backref-multibyte-slow			\
  backref-word					\
+  backslash-dot					\
  backslash-s-and-repetition-operators		\
-  backslash-s-vs-invalid-multitype		\
+  backslash-s-vs-invalid-multibyte		\
  big-hole					\
  big-match					\
+  binary-file-matches				\
  bogus-wctob					\
  bre						\
  c-locale					\
@ -81,11 +93,13 @@ TESTS =						\
  case-fold-titlecase				\
  char-class-multibyte				\
  char-class-multibyte2				\
+  color-colors					\
  context-0					\
  count-newline					\
  dfa-coverage					\
  dfa-heap-overrun				\
  dfa-infloop					\
+  dfa-invalid-utf8				\
  dfaexec-multibyte				\
  empty						\
  empty-line					\
@ -101,11 +115,15 @@ TESTS =						\
  fgrep-longest					\
  file						\
  filename-lineno.pl				\
+  fillbuf-long-line				\
  fmbtest					\
  foad1						\
+  glibc-infloop					\
  grep-dev-null					\
  grep-dev-null-out				\
  grep-dir					\
+  hangul-syllable				\
+  hash-collision-perf				\
  help-version					\
  high-bit-range				\
  in-eq-out-infloop				\
@ -117,18 +135,22 @@ TESTS =						\
  kwset-abuse					\
  long-line-vs-2GiB-read			\
  long-pattern-perf				\
+  many-regex-performance			\
  match-lines					\
  max-count-overread				\
  max-count-vs-context				\
  mb-dot-newline				\
  mb-non-UTF8-overrun				\
+  mb-non-UTF8-perf-Fw				\
  mb-non-UTF8-performance			\
+  mb-non-UTF8-word-boundary			\
  multibyte-white-space				\
  multiple-begin-or-end-line			\
  null-byte					\
  options					\
  pcre						\
  pcre-abort					\
+  pcre-ascii-digits				\
  pcre-context					\
  pcre-count					\
  pcre-infloop					\
@ -137,6 +159,8 @@ TESTS =						\
  pcre-jitstack					\
  pcre-o					\
  pcre-utf8					\
+  pcre-utf8-bug224				\
+  pcre-utf8-w					\
  pcre-w					\
  pcre-wx-backref				\
  pcre-z					\
@ -154,6 +178,7 @@ TESTS =						\
  stack-overflow				\
  status					\
  surrogate-pair				\
+  surrogate-search				\
  symlink					\
  triple-backref				\
  turkish-I					\
@ -165,11 +190,13 @@ TESTS =						\
  unibyte-bracket-expr				\
  unibyte-negated-circumflex			\
  utf8-bracket					\
+  version-pcre					\
  warn-char-classes				\
  word-delim-multibyte				\
  word-multi-file				\
  word-multibyte				\
  write-error-msg				\
+  y2038-vs-32-bit				\
  yesno						\
  z-anchor-newline

@ -231,15 +258,16 @@ TESTS_ENVIRONMENT =				\
  LOCALE_FR='$(LOCALE_FR)'			\
  LOCALE_FR_UTF8='$(LOCALE_FR_UTF8)'		\
  AWK=$(AWK)					\
-  GREP_OPTIONS=''				\
  LC_ALL=C					\
  abs_top_builddir='$(abs_top_builddir)'	\
  abs_top_srcdir='$(abs_top_srcdir)'		\
  abs_srcdir='$(abs_srcdir)'			\
  built_programs="$$built_programs"		\
+  host_triplet='$(host_triplet)'		\
  srcdir='$(srcdir)'				\
  top_srcdir='$(top_srcdir)'			\
  CC='$(CC)'					\
+  CONFIG_HEADER='$(abs_top_builddir)/$(CONFIG_INCLUDE)' \
  GREP_TEST_NAME=`echo $$tst|sed 's,^\./,,;s,/,-,g'` \
  MAKE=$(MAKE)					\
  MALLOC_PERTURB_=$(MALLOC_PERTURB_)		\
@ -248,7 +276,13 @@ TESTS_ENVIRONMENT =				\
  PERL='$(PERL)'				\
  SHELL='$(SHELL)'				\
  PATH='$(abs_top_builddir)/src$(PATH_SEPARATOR)'"$$PATH" \
-  ; 9>&2
+  ;						\
+						\
+  : 'set this envvar to indicate whether -P works';	\
+  m=0; if err=`echo .|grep -Pq . 2>&1`; then		\
+    test -z "$$err" && m=1; fi;				\
+  export PCRE_WORKS=$$m;				\
+  9>&2

 LOG_COMPILER = $(SHELL)

--- a/tests/backref
+++ b/tests/backref
@ -1,7 +1,7 @@
 #! /bin/sh
-# Test for backreferences and other things.
+# Test for back-references and other things.
 #
-# Copyright (C) 2001, 2006, 2009-2018 Free Software Foundation, Inc.
+# Copyright (C) 2001, 2006, 2009-2026 Free Software Foundation, Inc.
 #
 # Copying and distribution of this file, with or without modification,
 # are permitted in any medium without royalty provided the copyright
@ -43,4 +43,12 @@ if test $? -ne 2 ; then
        failures=1
 fi

+# https://bugs.gnu.org/36148#13
+echo 'Total failed: 2 (1 ignored)' |
+    grep -e '^Total failed: 0$' -e '^Total failed: \([0-9]*\) (\1 ignored)$'
+if test $? -ne 1 ; then
+        echo "Backref: Multiple -e test, test #5 failed"
+        failures=1
+fi
+
 Exit $failures
--- a/tests/backref-alt
+++ b/tests/backref-alt
@ -1,7 +1,7 @@
 #! /bin/sh
 # Test for a bug in glibc's regex code as of 2015-09-19.
 #
-# Copyright 2015-2018 Free Software Foundation, Inc.
+# Copyright 2015-2026 Free Software Foundation, Inc.
 #
 # Copying and distribution of this file, with or without modification,
 # are permitted in any medium without royalty provided the copyright
--- a/tests/backslash-dot
+++ b/tests/backslash-dot
@ -0,0 +1,20 @@
+#! /bin/sh
+# This once failed to match: echo . | grep '\.'
+#
+# Copyright (C) 2020-2026 Free Software Foundation, Inc.
+#
+# Copying and distribution of this file, with or without modification,
+# are permitted in any medium without royalty provided the copyright
+# notice and this notice are preserved.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+fail=0
+
+echo . > in || framework_failure_
+
+grep '\.' in > out 2> err || fail=1
+compare in out || fail=1
+compare /dev/null err || fail=1
+
+Exit $fail
--- a/tests/backslash-s-and-repetition-operators
+++ b/tests/backslash-s-and-repetition-operators
@ -1,7 +1,7 @@
 #! /bin/sh
 # Ensure that \s and \S work with repetition operators.
 #
-# Copyright (C) 2013-2018 Free Software Foundation, Inc.
+# Copyright (C) 2013-2026 Free Software Foundation, Inc.
 #
 # Copying and distribution of this file, with or without modification,
 # are permitted in any medium without royalty provided the copyright
--- a/tests/backslash-s-vs-invalid-multibyte
+++ b/tests/backslash-s-vs-invalid-multibyte
@ -1,7 +1,7 @@
 #! /bin/sh
 # Ensure that neither \s nor \S matches an invalid multibyte character.
 #
-# Copyright (C) 2013-2018 Free Software Foundation, Inc.
+# Copyright (C) 2013-2026 Free Software Foundation, Inc.
 #
 # Copying and distribution of this file, with or without modification,
 # are permitted in any medium without royalty provided the copyright
@ -11,11 +11,11 @@

 require_en_utf8_locale_

+printf '\202\n' > in || framework_failure_
+
 LC_ALL=en_US.UTF-8
 export LC_ALL

-printf '\202\n' > in || framework_failure_
-
 fail=0
 grep '^\S$' in > out-S && fail=1
 compare /dev/null out-S || fail=1
--- a/tests/big-hole
+++ b/tests/big-hole
@ -4,6 +4,7 @@
 . "${srcdir=.}/init.sh"; path_prepend_ ../src

 expensive_
+require_perl_

 # Skip this test if there is no usable SEEK_HOLE support,
 # as is the case with linux-3.5.0 on ext4 and tmpfs file systems.
--- a/tests/binary-file-matches
+++ b/tests/binary-file-matches
@ -0,0 +1,23 @@
+#! /bin/sh
+# Test for the "binary file ... matches" diagnostic.
+#
+# Copyright (C) 2020-2026 Free Software Foundation, Inc.
+#
+# Copying and distribution of this file, with or without modification,
+# are permitted in any medium without royalty provided the copyright
+# notice and this notice are preserved.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+fail=0
+
+echo "grep: (standard input): binary file matches" > exp \
+  || framework_failure_
+
+for option in '' -s; do
+  printf 'a\0' | grep $option a > out 2> err || fail=1
+  compare /dev/null out || fail=1
+  compare exp err || fail=1
+done
+
+Exit $fail
--- a/tests/bre
+++ b/tests/bre
@ -1,7 +1,7 @@
 #! /bin/sh
 # Regression test for GNU grep.
 #
-# Copyright (C) 2001, 2006, 2009-2018 Free Software Foundation, Inc.
+# Copyright (C) 2001, 2006, 2009-2026 Free Software Foundation, Inc.
 #
 # Copying and distribution of this file, with or without modification,
 # are permitted in any medium without royalty provided the copyright
--- a/tests/bre.awk
+++ b/tests/bre.awk
@ -1,4 +1,4 @@
-# Copyright (C) 2001, 2006, 2009-2018 Free Software Foundation, Inc.
+# Copyright (C) 2001, 2006, 2009-2026 Free Software Foundation, Inc.
 #
 # Copying and distribution of this file, with or without modification,
 # are permitted in any medium without royalty provided the copyright
--- a/tests/c-locale
+++ b/tests/c-locale
@ -1,7 +1,7 @@
 #! /bin/sh
 # Regression test for GNU grep.
 #
-# Copyright 2016-2018 Free Software Foundation, Inc.
+# Copyright 2016-2026 Free Software Foundation, Inc.
 #
 # Copying and distribution of this file, with or without modification,
 # are permitted in any medium without royalty provided the copyright
--- a/tests/case-fold-titlecase
+++ b/tests/case-fold-titlecase
@ -1,7 +1,7 @@
 #!/bin/sh
 # Check that case folding works even with titlecase and similarly odd chars.

-# Copyright 2014-2018 Free Software Foundation, Inc.
+# Copyright 2014-2026 Free Software Foundation, Inc.

 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@ -168,7 +168,7 @@ do
 done

 # Try a unibyte test with ISO 8859-7, if available.
-if test "$(get-mb-cur-max el_GR.iso88597)" -eq 1; then
+if test "$(get-mb-cur-max el_GR.iso88597)" = 1; then
  LC_ALL=el_GR.iso88597
  export LC_ALL

--- a/tests/color-colors
+++ b/tests/color-colors
@ -0,0 +1,48 @@
+#!/bin/sh
+# Check that GREP_COLOR elicits a warning.
+
+# Copyright 2022-2026 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <https://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+fail=0
+unset GREP_COLORS
+unset GREP_COLOR
+LC_ALL=C
+export LC_ALL
+
+printf 'x\n\n' >in || framework_failure_
+printf '%s\n' \
+  "grep: warning: GREP_COLOR='36' is deprecated; use GREP_COLORS='mt=36'" \
+  >exp.err || framework_failure_
+
+GREP_COLORS='mt=36:ln=35' grep --color=always . in >exp 2>err || fail=1
+compare /dev/null err || fail=1
+GREP_COLOR='36' GREP_COLORS='ln=35' grep --color=always . in >out 2>err \
+  || fail=1
+compare exp out || fail=1
+compare exp.err err || fail=1
+
+GREP_COLORS='mt=36' grep --color=always . in >exp 2>err || fail=1
+compare /dev/null err || fail=1
+GREP_COLOR='36' grep --color=always . in >out 2>err || fail=1
+compare exp out || fail=1
+compare exp.err err || fail=1
+
+GREP_COLORS='ln=35' grep --color=always . in >out 2>err || fail=1
+compare /dev/null err || fail=1
+
+Exit $fail
--- a/tests/count-newline
+++ b/tests/count-newline
@ -2,7 +2,7 @@
 # Test that newline is counted correctly even when the transition
 # table is rebuilt.

-# Copyright 2014-2018 Free Software Foundation, Inc.
+# Copyright 2014-2026 Free Software Foundation, Inc.

 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
--- a/tests/dfa-coverage
+++ b/tests/dfa-coverage
@ -1,7 +1,7 @@
 #!/bin/sh
 # Exercise the final reachable code in dfa.c's match_mb_charset.

-# Copyright (C) 2012-2018 Free Software Foundation, Inc.
+# Copyright (C) 2012-2026 Free Software Foundation, Inc.

 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
--- a/tests/dfa-heap-overrun
+++ b/tests/dfa-heap-overrun
@ -1,7 +1,7 @@
 #!/bin/sh
 # Trigger a heap overrun in grep-2.6..grep-2.8.

-# Copyright (C) 2011-2018 Free Software Foundation, Inc.
+# Copyright (C) 2011-2026 Free Software Foundation, Inc.

 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
--- a/tests/dfa-invalid-utf8
+++ b/tests/dfa-invalid-utf8
@ -0,0 +1,29 @@
+#! /bin/sh
+# Test whether "grep '.'" matches invalid UTF-8 byte sequences.
+#
+# Copyright 2019-2026 Free Software Foundation, Inc.
+#
+# Copying and distribution of this file, with or without modification,
+# are permitted in any medium without royalty provided the copyright
+# notice and this notice are preserved.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+require_en_utf8_locale_
+require_compiled_in_MB_support
+
+fail=0
+
+printf 'a\360\202\202\254b\n' >in1 || framework_failure_
+LC_ALL=en_US.UTF-8 grep 'a.b' in1 > out1 2> err1
+test $? -eq 1 || fail=1
+compare /dev/null out1 || fail=1
+compare /dev/null err1 || fail=1
+
+printf 'a\360\202\202\254ba\360\202\202\254b\n' >in2 ||
+  framework_failure_
+LC_ALL=en_US.UTF-8 grep -E '(a.b)\1' in2 > out2 2> err2
+test $? -eq 1 || fail=1
+compare /dev/null out2 || fail=1
+compare /dev/null err2 || fail=1
+
+Exit $fail
--- a/tests/empty
+++ b/tests/empty
@ -2,7 +2,7 @@
 # test that the empty file means no pattern
 # and an empty pattern means match all.
 #
-# Copyright (C) 2001, 2006, 2009-2018 Free Software Foundation, Inc.
+# Copyright (C) 2001, 2006, 2009-2026 Free Software Foundation, Inc.
 #
 # Copying and distribution of this file, with or without modification,
 # are permitted in any medium without royalty provided the copyright
@ -39,17 +39,10 @@ for locale in C en_US.UTF-8; do
          failures=1
        fi

-        # should return 0 found a match
-        echo "" | LC_ALL=$locale timeout 10s grep $options -e ''
-        if test $? -ne 0 ; then
-          echo "Status: Wrong status code, test \#4 failed ($options $locale)"
-          failures=1
-        fi
-
        # should return 0 found a match
        echo abcd | LC_ALL=$locale timeout 10s grep $options -e ''
        if test $? -ne 0 ; then
-          echo "Status: Wrong status code, test \#5 failed ($options $locale)"
+          echo "Status: Wrong status code, test \#4 failed ($options $locale)"
          failures=1
        fi
    done
--- a/tests/empty-line-mb
+++ b/tests/empty-line-mb
@ -1,7 +1,7 @@
 #! /bin/sh
 # Exercise bugs in grep-2.13 with -i, -n and an RE of ^$ in a multi-byte locale.
 #
-# Copyright (C) 2012-2018 Free Software Foundation, Inc.
+# Copyright (C) 2012-2026 Free Software Foundation, Inc.
 #
 # Copying and distribution of this file, with or without modification,
 # are permitted in any medium without royalty provided the copyright
--- a/tests/encoding-error
+++ b/tests/encoding-error
@ -1,7 +1,7 @@
 #! /bin/sh
 # Test grep's behavior on encoding errors.
 #
-# Copyright 2015-2018 Free Software Foundation, Inc.
+# Copyright 2015-2026 Free Software Foundation, Inc.
 #
 # Copying and distribution of this file, with or without modification,
 # are permitted in any medium without royalty provided the copyright
@ -11,22 +11,25 @@

 require_en_utf8_locale_

-LC_ALL=en_US.UTF-8
-export LC_ALL
-
 printf 'Alfred Jones\n' > a || framework_failure_
 printf 'John Smith\n' >j || framework_failure_
 printf 'Pedro P\351rez\n' >p || framework_failure_
 cat a p j >in || framework_failure_

+LC_ALL=en_US.UTF-8
+export LC_ALL
+
 fail=0

 grep '^A' in >out || fail=1
 compare a out || fail=1

 grep '^P' in >out || fail=1
-printf 'Binary file in matches\n' >exp || framework_failure_
-compare exp out || fail=1
+compare /dev/null out || fail=1
+
+grep -I '^P' in >out 2>err || fail=1
+compare /dev/null out || fail=1
+compare /dev/null err || fail=1

 grep '^J' in >out || fail=1
 compare j out || fail=1
@ -35,9 +38,14 @@ returns_ 1 grep '^X' in >out || fail=1
 compare /dev/null out || fail=1

 grep . in >out || fail=1
-(cat a j && printf 'Binary file in matches\n') >exp || framework_failure_
+cat a j >exp || framework_failure_
 compare exp out || fail=1

+grep -I . in >out 2>err || fail=1
+cat a j >exp || framework_failure_
+compare exp out || fail=1
+compare /dev/null err || fail=1
+
 grep -a . in >out || fail=1
 compare in out

--- a/tests/envvar-check
+++ b/tests/envvar-check
@ -1,7 +1,7 @@
 # -*- sh -*-
 # Check environment variables for sane values while testing.

-# Copyright (C) 2000-2018 Free Software Foundation, Inc.
+# Copyright (C) 2000-2026 Free Software Foundation, Inc.

 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
--- a/tests/ere
+++ b/tests/ere
@ -1,7 +1,7 @@
 #! /bin/sh
 # Regression test for GNU grep.
 #
-# Copyright (C) 2001, 2006, 2009-2018 Free Software Foundation, Inc.
+# Copyright (C) 2001, 2006, 2009-2026 Free Software Foundation, Inc.
 #
 # Copying and distribution of this file, with or without modification,
 # are permitted in any medium without royalty provided the copyright
--- a/tests/ere.awk
+++ b/tests/ere.awk
@ -1,4 +1,4 @@
-# Copyright (C) 2001, 2006, 2009-2018 Free Software Foundation, Inc.
+# Copyright (C) 2001, 2006, 2009-2026 Free Software Foundation, Inc.
 #
 # Copying and distribution of this file, with or without modification,
 # are permitted in any medium without royalty provided the copyright
--- a/tests/ere.tests
+++ b/tests/ere.tests
@ -218,3 +218,5 @@
 0@)@)
 1@)@x
 0@\()\((a\())(b))@()(a()b)
+# This would erroneously match from grep-3.2 to grep-3.5
+1@a+a+a@aa
--- a/tests/false-match-mb-non-utf8
+++ b/tests/false-match-mb-non-utf8
@ -1,7 +1,7 @@
 #! /bin/sh
 # Test for false matches in grep 2.19..2.26 in multibyte, non-UTF8 locales
 #
-# Copyright (C) 2016-2018 Free Software Foundation, Inc.
+# Copyright (C) 2016-2026 Free Software Foundation, Inc.
 #
 # Copying and distribution of this file, with or without modification,
 # are permitted in any medium without royalty provided the copyright
--- a/tests/fedora
+++ b/tests/fedora
@ -18,7 +18,7 @@ ok ()	{ printf "${G}OK${D}"; }
 fail () { printf "${R}FAIL${D} (See ${U})"; failures=1; }

 U=https://bugzilla.redhat.com/show_bug.cgi?id=116909
-printf "fgrep false negatives: "
+printf "grep -F false negatives: "
 cat > 116909.list <<EOF
 a
 b
@ -59,7 +59,7 @@ if ( timeout --version ) > /dev/null 2>&1; then
  echo foobar | returns_ 124 timeout 10 grep -Fw "" && fail || ok

  U=https://bugzilla.redhat.com/show_bug.cgi?id=140781
-  printf 'fgrep hangs on binary files: '
+  printf 'grep -F hangs on binary files: '
  returns_ 124 timeout 10 grep -F grep "$abs_top_builddir/src/grep" \
    > /dev/null && fail || ok

--- a/tests/fgrep-longest
+++ b/tests/fgrep-longest
@ -2,7 +2,7 @@
 # With multiple matches, grep -Fo could print a shorter one.
 # This bug affected grep versions 2.26 through 2.27.
 #
-# Copyright (C) 2017-2018 Free Software Foundation, Inc.
+# Copyright (C) 2017-2026 Free Software Foundation, Inc.
 #
 # Copying and distribution of this file, with or without modification,
 # are permitted in any medium without royalty provided the copyright
--- a/tests/file
+++ b/tests/file
@ -4,7 +4,7 @@
 # grep -F -f pattern_file file
 # grep -G -f pattern_file file
 #
-# Copyright (C) 2001, 2006, 2009-2018 Free Software Foundation, Inc.
+# Copyright (C) 2001, 2006, 2009-2026 Free Software Foundation, Inc.
 #
 # Copying and distribution of this file, with or without modification,
 # are permitted in any medium without royalty provided the copyright
--- a/tests/filename-lineno.pl
+++ b/tests/filename-lineno.pl
@ -4,7 +4,7 @@
 # file or line number from which the offending regular expression came.
 # With 2.26, now, each such diagnostic has a "FILENAME:LINENO: " prefix.

-# Copyright (C) 2016-2018 Free Software Foundation, Inc.
+# Copyright (C) 2016-2026 Free Software Foundation, Inc.

 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@ -37,6 +37,8 @@ $prog = $full_prog_name if $full_prog_name;
 # Transform each to this: "Unmatched [..."
 my $err_subst = {ERR_SUBST => 's/(: Unmatched \[).*/$1.../'};

+my $no_pcre = "$prog: Perl matching not supported in a --disable-perl-regexp build\n";
+
 my @Tests =
  (
   # Show that grep now includes filename:lineno in the diagnostic:
@ -48,7 +50,7 @@ my @Tests =
   # Show that with two or more errors, grep now prints all diagnostics:
   ['invalid-re-2-files', '-f g -f h', {EXIT=>2},
    {AUX=>{g=>"1\n2[[\n3\n4[[\n"}},
-    {AUX=>{h=>"\n\n[[\n"}},
+    {AUX=>{h=>"5\n6\n7[[\n"}},
    $err_subst,
    {ERR => "$prog: g:2: Unmatched [...\n"
         . "$prog: g:4: Unmatched [...\n"
@ -59,7 +61,7 @@ my @Tests =
   # Like the above, but on the other lines.
   ['invalid-re-2-files2', '-f g -f h', {EXIT=>2},
    {AUX=>{g=>"1[[\n2\n3[[\n4\n"}},
-    {AUX=>{h=>"[[\n[[\n\n"}},
+    {AUX=>{h=>"5[[\n6[[\n7\n"}},
    $err_subst,
    {ERR => "$prog: g:1: Unmatched [...\n"
         . "$prog: g:3: Unmatched [...\n"
@ -68,12 +70,57 @@ my @Tests =
    },
   ],

+   # Make sure the line numbers are right when some regexps are duplicates.
+   ['invalid-re-line-numbers', '-f g -f h', {EXIT=>2},
+    {AUX=>{g=>"1[[\n\n3[[\n\n5[[\n"}},
+    {AUX=>{h=>"1[[\n\n\n4[[\n\n6[[\n"}},
+    $err_subst,
+    {ERR => "$prog: g:1: Unmatched [...\n"
+         . "$prog: g:3: Unmatched [...\n"
+         . "$prog: g:5: Unmatched [...\n"
+         . "$prog: h:4: Unmatched [...\n"
+         . "$prog: h:6: Unmatched [...\n"
+    },
+   ],
+
   # Show that with two '-e'-specified erroneous regexps,
   # there is no file name or line number.
-   ['invalid-re-2e', '-e "[[" -e "[["', {EXIT=>2},
+   ['invalid-re-2e', '-e "1[[" -e "2[["', {EXIT=>2},
    $err_subst,
    {ERR => "$prog: Unmatched [...\n" x 2},
   ],
+
+   # Test unmatched ) as well.  It is OK with -E and an error with -G and -P.
+   ['invalid-re-E-paren', '-E ")"', {IN=>''}, {EXIT=>1}],
+   ['invalid-re-E-star-paren', '-E ".*)"', {IN=>''}, {EXIT=>1}],
+   ['invalid-re-G-paren', '-G "\\)"', {EXIT=>2},
+    {ERR => "$prog: Unmatched ) or \\)\n"},
+   ],
+   ['invalid-re-G-star-paren', '-G "a.*\\)"', {EXIT=>2},
+    {ERR => "$prog: Unmatched ) or \\)\n"},
+   ],
+   ['invalid-re-P-paren', '-P ")"', {EXIT=>2},
+    {ERR => $ENV{PCRE_WORKS} == 1
+       ? "$prog: unmatched closing parenthesis\n"
+       : $no_pcre
+    },
+   ],
+   ['invalid-re-P-star-paren', '-P "a.*)"', {EXIT=>2},
+    {ERR => $ENV{PCRE_WORKS} == 1
+       ? "$prog: unmatched closing parenthesis\n"
+       : $no_pcre
+    },
+   ],
+
+   # Prior to grep-3.6, the name of the offending file was not printed.
+   ['backtracking-with-file', '-P "((a+)*)+$"', {EXIT=>2},
+    {IN=>{f=>"a"x20 ."b"}},
+    {ERR => $ENV{PCRE_WORKS} == 1
+       ? "$prog: f: exceeded PCRE's backtracking limit\n"
+       : $no_pcre
+    },
+   ],
+
  );

 my $save_temps = $ENV{DEBUG};
--- a/tests/fillbuf-long-line
+++ b/tests/fillbuf-long-line
@ -0,0 +1,11 @@
+#!/bin/sh
+# This would fail for v3.7-15-ge3694e9 .. grep-v3.7-48-g5c3c427
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+printf %0104681d 0 > in || framework_failure_
+
+fail=0
+
+returns_ 1 grep xx in || fail=1
+
+Exit $fail
--- a/tests/fmbtest
+++ b/tests/fmbtest
@ -1,5 +1,5 @@
 #! /bin/sh
-# Copyright (C) 2001, 2006, 2009-2018 Free Software Foundation, Inc.
+# Copyright (C) 2001, 2006, 2009-2026 Free Software Foundation, Inc.
 #
 # Copying and distribution of this file, with or without modification,
 # are permitted in any medium without royalty provided the copyright
@ -10,7 +10,7 @@
 cz=cs_CZ.UTF-8

 # If cs_CZ.UTF-8 locale doesn't work, skip this test.
-LC_ALL=$cz locale -k LC_CTYPE 2>/dev/null | grep -q charmap.*UTF-8 \
+test "`LC_ALL=$cz locale charmap 2>/dev/null`" = UTF-8 \
  || skip_ this system lacks the $cz locale

 # If matching is done in single-byte mode, skip this test too
@ -53,21 +53,21 @@ EOF
 for mode in F G E; do

 test1=$(echo $(LC_ALL=$cz grep -${mode} -f cspatfile csinput |
-               tr -cs '0-9' '[ *]'))
+               tr '\n' ' ' | tr -cd '0-9 '))
 if test "$test1" != "11 12 13 14 15 16 17 18"; then
  echo "Test #1 ${mode} failed: $test1"
  failures=1
 fi

 test2=$(echo $(LC_ALL=$cz grep -${mode}i -f cspatfile csinput |
-               tr -cs '0-9' '[ *]'))
+               tr '\n' ' ' | tr -cd '0-9 '))
 if test "$test2" != "01 02 07 08 10 11 12 13 14 15 16 17 18 19 20"; then
  echo "Test #2 ${mode} failed: $test2"
  failures=1
 fi

 test3=$(echo $(LC_ALL=$cz grep -${mode}i -e 'ČÍšE' -e 'Čas' csinput |
-               tr -cs '0-9' '[ *]'))
+               tr '\n' ' ' | tr -cd '0-9 '))
 if test "$test3" != "01 02 07 08 10 11 12 13 14 15 16 17 18 19 20"; then
  echo "Test #3 ${mode} failed: $test3"
  failures=1
@ -115,7 +115,7 @@ done
 for mode in G E; do

 test8=$(echo $(LC_ALL=$cz grep -${mode}i -e 'Č.šE' -e 'Č[a-f]s' csinput |
-               tr -cs '0-9' '[ *]'))
+               tr '\n' ' ' | tr -cd '0-9 '))
 if test "$test8" != "01 02 07 08 10 11 12 13 14 15 16 17 18 19 20"; then
  echo "Test #8 ${mode} failed: $test8"
  failures=1
--- a/tests/foad1
+++ b/tests/foad1
@ -1,7 +1,7 @@
 #! /bin/sh
 # Test various combinations of command-line options.
 #
-# Copyright (C) 2001, 2006, 2009-2018 Free Software Foundation, Inc.
+# Copyright (C) 2001, 2006, 2009-2026 Free Software Foundation, Inc.
 #
 # Copying and distribution of this file, with or without modification,
 # are permitted in any medium without royalty provided the copyright
@ -150,7 +150,7 @@ Exit $failures
 # The rest of this file is meant to be executed under this locale.
 LC_ALL=cs_CZ.UTF-8; export LC_ALL
 # If the UTF-8 locale doesn't work, skip these tests silently.
-locale -k LC_CTYPE 2>/dev/null | grep -q "charmap.*UTF-8" || Exit $failures
+test "`locale charmap 2>/dev/null`" = UTF-8 || Exit $failures

 # Test character class erroneously matching a '[' character.
 grep_test "[/" "" "[[:alpha:]]" -E
--- a/tests/get-mb-cur-max.c
+++ b/tests/get-mb-cur-max.c
@ -1,5 +1,5 @@
 /* Auxiliary program to detect support for a locale.
-   Copyright 2010-2018 Free Software Foundation, Inc.
+   Copyright 2010-2026 Free Software Foundation, Inc.

   This program is free software; you can redistribute it and/or modify
   it under the terms of the GNU General Public License as published by
@ -12,17 +12,13 @@
   GNU General Public License for more details.

   You should have received a copy of the GNU General Public License
-   along with this program; if not, write to the Free Software
-   Foundation, Inc., 51 Franklin Street - Fifth Floor, Boston, MA
-   02110-1301, USA.  */
+   along with this program.  If not, see <https://www.gnu.org/licenses/>.  */

 #include <config.h>
 #include <locale.h>
 #include <stdio.h>
 #include <stdlib.h>

-#include "getprogname.h"
-
 int
 main (int argc, char **argv)
 {
--- a/tests/glibc-infloop
+++ b/tests/glibc-infloop
@ -0,0 +1,30 @@
+#!/bin/sh
+# This would infloop when using glibc's regex at least until glibc-2.36.
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+require_timeout_
+require_en_utf8_locale_
+
+fail=0
+
+cat <<\EOF > glibc-check.c
+#include <features.h>
+#ifdef __GLIBC__
+int ok;
+#else
+# error "not glibc"
+#endif
+EOF
+$CC -c glibc-check.c && glibc=1 || glibc=0
+
+grep '^#define USE_INCLUDED_REGEX 1' "$CONFIG_HEADER" \
+  && included_regex=1 || included_regex=0
+
+case $glibc:$included_regex in
+  0:0) skip_ 'runs only with glibc or when built with the included regex'
+esac
+
+echo a > in || framework_failure_
+timeout 2 env LC_ALL=en_US.UTF-8 grep -E -w '((()|a)|())*' in || fail=1
+
+Exit $fail
--- a/tests/grep-dev-null-out
+++ b/tests/grep-dev-null-out
@ -6,7 +6,7 @@
 require_timeout_

 ${AWK-awk} 'BEGIN {while (1) print "x"}' </dev/null |
-  returns_ 124 timeout 1 grep x >/dev/null || fail=1
+  returns_ 124 timeout 10 grep x >/dev/null || fail=1

 echo abc | grep b >>/dev/null || fail=1

--- a/tests/hangul-syllable
+++ b/tests/hangul-syllable
@ -0,0 +1,184 @@
+#!/bin/sh
+# grep 3.4 through 3.7 mishandled matching '.' against the valid UTF-8
+# sequences (ED)(90-9F)(80-BF) corresponding to U+D400 through U+D7FF,
+# which are some Hangul Syllables and Hangul Jamo Extended-B.  They
+# also mishandled (F4)(88-8F)(80-BF)(80-BF) which correspond to
+# U+108000 through U+10FFFF (part of Supplementary Private Use Area-B).
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+require_en_utf8_locale_
+
+LC_ALL=en_US.UTF-8
+export LC_ALL
+
+# Check that '.' completely matches $1, i.e., that $1 is a single UTF-8 char.
+check_char ()
+{
+  printf "$1\\n" >in || framework_failure_
+
+  grep $2 '^.$' in >out || fail=1
+  cmp in out || fail=1
+}
+
+# Check that '.*' does not completely match $1, i.e., that
+# $1 contains an encoding error.
+check_nonchar ()
+{
+  printf "$1\\n" >in || framework_failure_
+
+  grep -a -v '^.*$' in >out || fail=1
+  cmp in out || fail=1
+}
+
+fail=0
+
+# "." should match U+D45C HANGUL SYLLABLE PYO.
+check_char '\355\221\234'
+
+# Check boundary-condition characters, and non-characters,
+# while we are at it.
+
+check_char '\0' -a
+check_char '\177'
+check_nonchar '\200'
+check_nonchar '\277'
+check_nonchar '\300\200'
+check_nonchar '\301\277'
+
+for i in 302 337; do
+  for j in 200 277; do
+    check_char "\\$i\\$j"
+  done
+  for j in 177 300; do
+    check_nonchar "\\$i\\$j"
+  done
+done
+for i in 340; do
+  for j in 240 277; do
+    for k in 200 277; do
+      check_char "\\$i\\$j\\$k"
+    done
+    for k in 177 300; do
+      check_nonchar "\\$i\\$j\\$k"
+    done
+  done
+  for j in 237 300; do
+    for k in 177 200 277 300; do
+      check_nonchar "\\$i\\$j\\$k"
+    done
+  done
+done
+for i in 341 354 356 357; do
+  for j in 200 277; do
+    for k in 200 277; do
+      check_char "\\$i\\$j\\$k"
+    done
+    for k in 177 300; do
+      check_nonchar "\\$i\\$j\\$k"
+    done
+  done
+  for j in 177 300; do
+    for k in 177 200 277 300; do
+      check_nonchar "\\$i\\$j\\$k"
+    done
+  done
+done
+for i in 355; do
+  for j in 200 237; do
+    for k in 200 277; do
+      check_char "\\$i\\$j\\$k"
+    done
+    for k in 177 300; do
+      check_nonchar "\\$i\\$j\\$k"
+    done
+  done
+  for j in 177 240; do
+    for k in 177 200 277 300; do
+      check_nonchar "\\$i\\$j\\$k"
+    done
+  done
+done
+
+# On platforms like 32-bit AIX where WCHAR_MAX == 0xFFFF, skip checks
+# where the corresponding Unicode characters are not supported.
+if test $fail -eq 0; then
+  printf '\360\220\200\200\n' >in || framework_failure_
+  grep '^.$' in >out 2>&1 || fail=1
+  cmp in out || skip_ 'platform does not support U+10000'
+fi
+
+for i in 360; do
+  for j in 220 277; do
+    for k in 200 277; do
+      for l in 200 277; do
+        check_char "\\$i\\$j\\$k\\$l"
+      done
+      for l in 177 300; do
+        check_nonchar "\\$i\\$j\\$k\\$l"
+      done
+    done
+    for k in 177 300; do
+      for l in 177 200 277 300; do
+        check_nonchar "\\$i\\$j\\$k\\$l"
+      done
+    done
+  done
+  for j in 217 300; do
+    for k in 177 200 277 300; do
+      for l in 177 200 277 300; do
+        check_nonchar "\\$i\\$j\\$k\\$l"
+      done
+    done
+  done
+done
+for i in 361 363; do
+  for j in 200 277; do
+    for k in 200 277; do
+      for l in 200 277; do
+        check_char "\\$i\\$j\\$k\\$l"
+      done
+      for l in 177 300; do
+        check_nonchar "\\$i\\$j\\$k\\$l"
+      done
+    done
+    for k in 177 300; do
+      for l in 177 200 277 300; do
+        check_nonchar "\\$i\\$j\\$k\\$l"
+      done
+    done
+  done
+  for j in 177 300; do
+    for k in 177 200 277 300; do
+      for l in 177 200 277 300; do
+        check_nonchar "\\$i\\$j\\$k\\$l"
+      done
+    done
+  done
+done
+for i in 364; do
+  for j in 200 217; do
+    for k in 200 277; do
+      for l in 200 277; do
+        check_char "\\$i\\$j\\$k\\$l"
+      done
+      for l in 177 300; do
+        check_nonchar "\\$i\\$j\\$k\\$l"
+      done
+    done
+    for k in 177 300; do
+      for l in 177 200 277 300; do
+        check_nonchar "\\$i\\$j\\$k\\$l"
+      done
+    done
+  done
+  for j in 177 220; do
+    for k in 177 200 277 300; do
+      for l in 177 200 277 300; do
+        check_nonchar "\\$i\\$j\\$k\\$l"
+      done
+    done
+  done
+done
+
+Exit $fail
--- a/tests/hash-collision-perf
+++ b/tests/hash-collision-perf
@ -0,0 +1,57 @@
+#!/bin/sh
+# Test for this performance regression:
+# grep-3.5 and 3.6 would take O(N^2) time for some sets of input regexps.
+
+# Copyright 2020-2026 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <https://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+fail=0
+
+require_perl_
+
+: > empty || framework_failure_
+
+# Construct a test case that consumes enough CPU time that we don't
+# have to worry about measurement noise. This first case is searching
+# for digits, which never exhibited a problem with hash collisions.
+n_pat=40000
+while :; do
+  seq $n_pat > in || framework_failure_
+  small_ms=$(LC_ALL=C user_time_ 1 grep --file=in empty) || fail=1
+  test $small_ms -ge 200 && break
+  n_pat=$(expr $n_pat '*' 2)
+  case $n_pat:$small_ms in
+    640000:0) skip_ 'user_time_ appears always to report 0 elapsed ms';;
+  esac
+done
+
+# Now, search for those same digits mapped to A-J.
+# With the PJW-based hash function, this became O(N^2).
+seq $n_pat | tr 0-9 A-J > in || framework_failure_
+large_ms=$(LC_ALL=C user_time_ 1 grep --file=in empty) || fail=1
+
+# Deliberately recording in an unused variable so it
+# shows up in set -x output, in case this test fails.
+ratio=$(expr "$large_ms" / "$small_ms")
+
+# The duration of the latter run must be no more than 10 times
+# that of the former.  Using recent versions prior to this fix,
+# this test would fail due to ratios > 800.  Using the fixed version,
+# it's common to see a ratio less than 1.
+returns_ 1 expr $small_ms '<' $large_ms / 10 || fail=1
+
+Exit $fail
--- a/tests/help-version
+++ b/tests/help-version
@ -2,7 +2,7 @@
 # Make sure all of these programs work properly
 # when invoked with --help or --version.

-# Copyright (C) 2000-2018 Free Software Foundation, Inc.
+# Copyright (C) 2000-2026 Free Software Foundation, Inc.

 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
--- a/tests/high-bit-range
+++ b/tests/high-bit-range
@ -1,7 +1,7 @@
 #!/bin/sh
 # Exercise high-bit-set unibyte-in-[...]-range bug.

-# Copyright (C) 2011-2018 Free Software Foundation, Inc.
+# Copyright (C) 2011-2026 Free Software Foundation, Inc.

 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
--- a/tests/in-eq-out-infloop
+++ b/tests/in-eq-out-infloop
@ -17,13 +17,13 @@ echo "$v" > out || framework_failure_
 for arg in out - ''; do
  # Accommodate both 'out' and '(standard input)', as well as
  # the multi-byte quoting we see on OS/X-based systems.
-  echo grep: input file ... is also the output > err.exp || framework_failure_
+  echo grep: ...: input file is also the output > err.exp || framework_failure_

  # Require an exit status of 2.
  # grep-2.8 and earlier would infloop with $arg = out.
  # grep-2.10 and earlier would infloop with $arg = - or $arg = ''.
  timeout 10 grep 0 $arg < out >> out 2> err; st=$?; test $st = 2 || fail=1
-  sed 's/file .* is/file ... is/' err > k && mv k err
+  sed 's/grep: .*: /grep: ...: /' err > k && mv k err
  # Normalize the diagnostic prefix from e.g., "/mnt/dir/grep: " to "grep: "
  sed 's/^[^:]*: /grep: /' err > k && mv k err
  compare err.exp err || fail=1
--- a/tests/init.cfg
+++ b/tests/init.cfg
@ -21,7 +21,6 @@ fi
 vars_='
 GREP_COLOR
 GREP_COLORS
-GREP_OPTIONS
 TERM
 '
 envvar_check_fail=0
@ -43,14 +42,24 @@ require_timeout_()
    || skip_ your system lacks the timeout program
  returns_ 1 timeout 10s false \
    || skip_ your system has a non-GNU timeout program
+  returns_ 124 timeout 0.01 sleep 0.02 \
+    || skip_ "'timeout 0.01 sleep 0.02' did not time out"
 }

 require_pcre_()
 {
-  echo . | grep -P . 2>err || {
-    test $? -eq 1 && fail_ PCRE available, but does not work.
-    skip_ no PCRE support
-  }
+  case $LC_ALL in
+    *.UTF-8)
+      printf '\303\241\n' | grep -P '^.$' 2>err || {
+        test $? -eq 1 && fail_ PCRE available, but does not work
+        skip_ no PCRE Unicode support
+      };;
+    *)
+      echo . | grep -P '^.$' 2>err || {
+        test $? -eq 1 && fail_ PCRE available, but does not work.
+        skip_ no PCRE support
+      };;
+  esac
  compare /dev/null err || fail_ PCRE available, but stderr not empty.
 }

@ -137,6 +146,13 @@ require_JP_EUC_locale_()
  skip_ "$locale locale not found"
 }

+# Skip the current test if we lack Perl.
+require_perl_()
+{
+  test "$PERL" && $PERL -e 'use warnings' > /dev/null 2>&1 \
+    || skip_ 'configure did not find a usable version of Perl'
+}
+
 expensive_()
 {
  if test "$RUN_EXPENSIVE_TESTS" != yes; then
@ -203,3 +219,17 @@ user_time_()

 # yes is not portable, fake it with $AWK
 yes() { line=${*-y} ${AWK-awk} 'BEGIN{for (;;) print ENVIRON["line"]}'; }
+
+# Some systems lack seq.
+# A limited replacement for seq: handle 1 or 2 args; increment must be 1
+if ! type seq > /dev/null 2>&1; then
+  seq()
+  {
+    case $# in
+      1) start=1  final=$1;;
+      2) start=$1 final=$2;;
+      *) echo you lose 1>&2; exit 1;;
+    esac
+    awk 'BEGIN{for(i='$start';i<='$final';i++) print i}' < /dev/null
+  }
+fi
--- a/tests/init.sh
+++ b/tests/init.sh
@ -1,618 +0,0 @@
-# source this file; set up for tests
-
-# Copyright (C) 2009-2018 Free Software Foundation, Inc.
-
-# This program is free software: you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation, either version 3 of the License, or
-# (at your option) any later version.
-
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-# GNU General Public License for more details.
-
-# You should have received a copy of the GNU General Public License
-# along with this program.  If not, see <https://www.gnu.org/licenses/>.
-
-# Using this file in a test
-# =========================
-#
-# The typical skeleton of a test looks like this:
-#
-#   #!/bin/sh
-#   . "${srcdir=.}/init.sh"; path_prepend_ .
-#   Execute some commands.
-#   Note that these commands are executed in a subdirectory, therefore you
-#   need to prepend "../" to relative filenames in the build directory.
-#   Note that the "path_prepend_ ." is useful only if the body of your
-#   test invokes programs residing in the initial directory.
-#   For example, if the programs you want to test are in src/, and this test
-#   script is named tests/test-1, then you would use "path_prepend_ ../src",
-#   or perhaps export PATH='$(abs_top_builddir)/src$(PATH_SEPARATOR)'"$$PATH"
-#   to all tests via automake's TESTS_ENVIRONMENT.
-#   Set the exit code 0 for success, 77 for skipped, or 1 or other for failure.
-#   Use the skip_ and fail_ functions to print a diagnostic and then exit
-#   with the corresponding exit code.
-#   Exit $?
-
-# Executing a test that uses this file
-# ====================================
-#
-# Running a single test:
-#   $ make check TESTS=test-foo.sh
-#
-# Running a single test, with verbose output:
-#   $ make check TESTS=test-foo.sh VERBOSE=yes
-#
-# Running a single test, keeping the temporary directory:
-#   $ make check TESTS=test-foo.sh KEEP=yes
-#
-# Running a single test, with single-stepping:
-#   1. Go into a sub-shell:
-#   $ bash
-#   2. Set relevant environment variables from TESTS_ENVIRONMENT in the
-#      Makefile:
-#   $ export srcdir=../../tests # this is an example
-#   3. Execute the commands from the test, copy&pasting them one by one:
-#   $ . "$srcdir/init.sh"; path_prepend_ .
-#   ...
-#   4. Finally
-#   $ exit
-
-ME_=`expr "./$0" : '.*/\(.*\)$'`
-
-# Prepare PATH_SEPARATOR.
-# The user is always right.
-if test "${PATH_SEPARATOR+set}" != set; then
-  # Determine PATH_SEPARATOR by trying to find /bin/sh in a PATH which
-  # contains only /bin. Note that ksh looks also at the FPATH variable,
-  # so we have to set that as well for the test.
-  PATH_SEPARATOR=:
-  (PATH='/bin;/bin'; FPATH=$PATH; sh -c :) >/dev/null 2>&1 \
-    && { (PATH='/bin:/bin'; FPATH=$PATH; sh -c :) >/dev/null 2>&1 \
-           || PATH_SEPARATOR=';'
-       }
-fi
-
-# We use a trap below for cleanup.  This requires us to go through
-# hoops to get the right exit status transported through the handler.
-# So use 'Exit STATUS' instead of 'exit STATUS' inside of the tests.
-# Turn off errexit here so that we don't trip the bug with OSF1/Tru64
-# sh inside this function.
-Exit () { set +e; (exit $1); exit $1; }
-
-# Print warnings (e.g., about skipped and failed tests) to this file number.
-# Override by defining to say, 9, in init.cfg, and putting say,
-#   export ...ENVVAR_SETTINGS...; $(SHELL) 9>&2
-# in the definition of TESTS_ENVIRONMENT in your tests/Makefile.am file.
-# This is useful when using automake's parallel tests mode, to print
-# the reason for skip/failure to console, rather than to the .log files.
-: ${stderr_fileno_=2}
-
-# Note that correct expansion of "$*" depends on IFS starting with ' '.
-# Always write the full diagnostic to stderr.
-# When stderr_fileno_ is not 2, also emit the first line of the
-# diagnostic to that file descriptor.
-warn_ ()
-{
-  # If IFS does not start with ' ', set it and emit the warning in a subshell.
-  case $IFS in
-    ' '*) printf '%s\n' "$*" >&2
-          test $stderr_fileno_ = 2 \
-            || { printf '%s\n' "$*" | sed 1q >&$stderr_fileno_ ; } ;;
-    *) (IFS=' '; warn_ "$@");;
-  esac
-}
-fail_ () { warn_ "$ME_: failed test: $@"; Exit 1; }
-skip_ () { warn_ "$ME_: skipped test: $@"; Exit 77; }
-fatal_ () { warn_ "$ME_: hard error: $@"; Exit 99; }
-framework_failure_ () { warn_ "$ME_: set-up failure: $@"; Exit 99; }
-
-# This is used to simplify checking of the return value
-# which is useful when ensuring a command fails as desired.
-# I.e., just doing `command ... &&fail=1` will not catch
-# a segfault in command for example.  With this helper you
-# instead check an explicit exit code like
-#   returns_ 1 command ... || fail
-returns_ () {
-  # Disable tracing so it doesn't interfere with stderr of the wrapped command
-  { set +x; } 2>/dev/null
-
-  local exp_exit="$1"
-  shift
-  "$@"
-  test $? -eq $exp_exit && ret_=0 || ret_=1
-
-  if test "$VERBOSE" = yes && test "$gl_set_x_corrupts_stderr_" = false; then
-    set -x
-  fi
-  { return $ret_; } 2>/dev/null
-}
-
-# Sanitize this shell to POSIX mode, if possible.
-DUALCASE=1; export DUALCASE
-if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then
-  emulate sh
-  NULLCMD=:
-  alias -g '${1+"$@"}'='"$@"'
-  setopt NO_GLOB_SUBST
-else
-  case `(set -o) 2>/dev/null` in
-    *posix*) set -o posix ;;
-  esac
-fi
-
-# We require $(...) support unconditionally.
-# We require non-surprising "local" semantics (this eliminates dash).
-# This takes the admittedly draconian step of eliminating dash, because the
-# assignment tab=$(printf '\t') works fine, yet preceding it with "local "
-# transforms it into an assignment that sets the variable to the empty string.
-# That is too counter-intuitive, and can lead to subtle run-time malfunction.
-# The example below is less subtle in that with dash, it evokes the run-time
-# exception "dash: 1: local: 1: bad variable name".
-# We require a few additional shell features only when $EXEEXT is nonempty,
-# in order to support automatic $EXEEXT emulation:
-# - hyphen-containing alias names
-# - we prefer to use ${var#...} substitution, rather than having
-#   to work around lack of support for that feature.
-# The following code attempts to find a shell with support for these features.
-# If the current shell passes the test, we're done.  Otherwise, test other
-# shells until we find one that passes.  If one is found, re-exec it.
-# If no acceptable shell is found, skip the current test.
-#
-# The "...set -x; P=1 true 2>err..." test is to disqualify any shell that
-# emits "P=1" into err, as /bin/sh from SunOS 5.11 and OpenBSD 4.7 do.
-#
-# Use "9" to indicate success (rather than 0), in case some shell acts
-# like Solaris 10's /bin/sh but exits successfully instead of with status 2.
-
-# Eval this code in a subshell to determine a shell's suitability.
-# 10 - passes all tests; ok to use
-#  9 - ok, but enabling "set -x" corrupts app stderr; prefer higher score
-#  ? - not ok
-gl_shell_test_script_='
-test $(echo y) = y || exit 1
-f_local_() { local v=1; }; f_local_ || exit 1
-f_dash_local_fail_() { local t=$(printf " 1"); }; f_dash_local_fail_
-score_=10
-if test "$VERBOSE" = yes; then
-  test -n "$( (exec 3>&1; set -x; P=1 true 2>&3) 2> /dev/null)" && score_=9
-fi
-test -z "$EXEEXT" && exit $score_
-shopt -s expand_aliases
-alias a-b="echo zoo"
-v=abx
-     test ${v%x} = ab \
-  && test ${v#a} = bx \
-  && test $(a-b) = zoo \
-  && exit $score_
-'
-
-if test "x$1" = "x--no-reexec"; then
-  shift
-else
-  # Assume a working shell.  Export to subshells (setup_ needs this).
-  gl_set_x_corrupts_stderr_=false
-  export gl_set_x_corrupts_stderr_
-
-  # Record the first marginally acceptable shell.
-  marginal_=
-
-  # Search for a shell that meets our requirements.
-  for re_shell_ in __current__ "${CONFIG_SHELL:-no_shell}" \
-      /bin/sh bash dash zsh pdksh fail
-  do
-    test "$re_shell_" = no_shell && continue
-
-    # If we've made it all the way to the sentinel, "fail" without
-    # finding even a marginal shell, skip this test.
-    if test "$re_shell_" = fail; then
-      test -z "$marginal_" && skip_ failed to find an adequate shell
-      re_shell_=$marginal_
-      break
-    fi
-
-    # When testing the current shell, simply "eval" the test code.
-    # Otherwise, run it via $re_shell_ -c ...
-    if test "$re_shell_" = __current__; then
-      # 'eval'ing this code makes Solaris 10's /bin/sh exit with
-      # $? set to 2.  It does not evaluate any of the code after the
-      # "unexpected" first '('.  Thus, we must run it in a subshell.
-      ( eval "$gl_shell_test_script_" ) > /dev/null 2>&1
-    else
-      "$re_shell_" -c "$gl_shell_test_script_" 2>/dev/null
-    fi
-
-    st_=$?
-
-    # $re_shell_ works just fine.  Use it.
-    if test $st_ = 10; then
-      gl_set_x_corrupts_stderr_=false
-      break
-    fi
-
-    # If this is our first marginally acceptable shell, remember it.
-    if test "$st_:$marginal_" = 9: ; then
-      marginal_="$re_shell_"
-      gl_set_x_corrupts_stderr_=true
-    fi
-  done
-
-  if test "$re_shell_" != __current__; then
-    # Found a usable shell.  Preserve -v and -x.
-    case $- in
-      *v*x* | *x*v*) opts_=-vx ;;
-      *v*) opts_=-v ;;
-      *x*) opts_=-x ;;
-      *) opts_= ;;
-    esac
-    re_shell=$re_shell_
-    export re_shell
-    exec "$re_shell_" $opts_ "$0" --no-reexec "$@"
-    echo "$ME_: exec failed" 1>&2
-    exit 127
-  fi
-fi
-
-# If this is bash, turn off all aliases.
-test -n "$BASH_VERSION" && unalias -a
-
-# Note that when supporting $EXEEXT (transparently mapping from PROG_NAME to
-# PROG_NAME.exe), we want to support hyphen-containing names like test-acos.
-# That is part of the shell-selection test above.  Why use aliases rather
-# than functions?  Because support for hyphen-containing aliases is more
-# widespread than that for hyphen-containing function names.
-test -n "$EXEEXT" && test -n "$BASH_VERSION" && shopt -s expand_aliases
-
-# Enable glibc's malloc-perturbing option.
-# This is useful for exposing code that depends on the fact that
-# malloc-related functions often return memory that is mostly zeroed.
-# If you have the time and cycles, use valgrind to do an even better job.
-: ${MALLOC_PERTURB_=87}
-export MALLOC_PERTURB_
-
-# This is a stub function that is run upon trap (upon regular exit and
-# interrupt).  Override it with a per-test function, e.g., to unmount
-# a partition, or to undo any other global state changes.
-cleanup_ () { :; }
-
-# Emit a header similar to that from diff -u;  Print the simulated "diff"
-# command so that the order of arguments is clear.  Don't bother with @@ lines.
-emit_diff_u_header_ ()
-{
-  printf '%s\n' "diff -u $*" \
-    "--- $1	1970-01-01" \
-    "+++ $2	1970-01-01"
-}
-
-# Arrange not to let diff or cmp operate on /dev/null,
-# since on some systems (at least OSF/1 5.1), that doesn't work.
-# When there are not two arguments, or no argument is /dev/null, return 2.
-# When one argument is /dev/null and the other is not empty,
-# cat the nonempty file to stderr and return 1.
-# Otherwise, return 0.
-compare_dev_null_ ()
-{
-  test $# = 2 || return 2
-
-  if test "x$1" = x/dev/null; then
-    test -s "$2" || return 0
-    emit_diff_u_header_ "$@"; sed 's/^/+/' "$2"
-    return 1
-  fi
-
-  if test "x$2" = x/dev/null; then
-    test -s "$1" || return 0
-    emit_diff_u_header_ "$@"; sed 's/^/-/' "$1"
-    return 1
-  fi
-
-  return 2
-}
-
-for diff_opt_ in -u -U3 -c '' no; do
-  test "$diff_opt_" != no &&
-    diff_out_=`exec 2>/dev/null; diff $diff_opt_ "$0" "$0" < /dev/null` &&
-    break
-done
-if test "$diff_opt_" != no; then
-  if test -z "$diff_out_"; then
-    compare_ () { diff $diff_opt_ "$@"; }
-  else
-    compare_ ()
-    {
-      # If no differences were found, AIX and HP-UX 'diff' produce output
-      # like "No differences encountered".  Hide this output.
-      diff $diff_opt_ "$@" > diff.out
-      diff_status_=$?
-      test $diff_status_ -eq 0 || cat diff.out || diff_status_=2
-      rm -f diff.out || diff_status_=2
-      return $diff_status_
-    }
-  fi
-elif cmp -s /dev/null /dev/null 2>/dev/null; then
-  compare_ () { cmp -s "$@"; }
-else
-  compare_ () { cmp "$@"; }
-fi
-
-# Usage: compare EXPECTED ACTUAL
-#
-# Given compare_dev_null_'s preprocessing, defer to compare_ if 2 or more.
-# Otherwise, propagate $? to caller: any diffs have already been printed.
-compare ()
-{
-  # This looks like it can be factored to use a simple "case $?"
-  # after unchecked compare_dev_null_ invocation, but that would
-  # fail in a "set -e" environment.
-  if compare_dev_null_ "$@"; then
-    return 0
-  else
-    case $? in
-      1) return 1;;
-      *) compare_ "$@";;
-    esac
-  fi
-}
-
-# An arbitrary prefix to help distinguish test directories.
-testdir_prefix_ () { printf gt; }
-
-# Run the user-overridable cleanup_ function, remove the temporary
-# directory and exit with the incoming value of $?.
-remove_tmp_ ()
-{
-  __st=$?
-  cleanup_
-  if test "$KEEP" = yes; then
-    echo "Not removing temporary directory $test_dir_"
-  else
-    # cd out of the directory we're about to remove
-    cd "$initial_cwd_" || cd / || cd /tmp
-    chmod -R u+rwx "$test_dir_"
-    # If removal fails and exit status was to be 0, then change it to 1.
-    rm -rf "$test_dir_" || { test $__st = 0 && __st=1; }
-  fi
-  exit $__st
-}
-
-# Given a directory name, DIR, if every entry in it that matches *.exe
-# contains only the specified bytes (see the case stmt below), then print
-# a space-separated list of those names and return 0.  Otherwise, don't
-# print anything and return 1.  Naming constraints apply also to DIR.
-find_exe_basenames_ ()
-{
-  feb_dir_=$1
-  feb_fail_=0
-  feb_result_=
-  feb_sp_=
-  for feb_file_ in $feb_dir_/*.exe; do
-    # If there was no *.exe file, or there existed a file named "*.exe" that
-    # was deleted between the above glob expansion and the existence test
-    # below, just skip it.
-    test "x$feb_file_" = "x$feb_dir_/*.exe" && test ! -f "$feb_file_" \
-      && continue
-    # Exempt [.exe, since we can't create a function by that name, yet
-    # we can't invoke [ by PATH search anyways due to shell builtins.
-    test "x$feb_file_" = "x$feb_dir_/[.exe" && continue
-    case $feb_file_ in
-      *[!-a-zA-Z/0-9_.+]*) feb_fail_=1; break;;
-      *) # Remove leading file name components as well as the .exe suffix.
-         feb_file_=${feb_file_##*/}
-         feb_file_=${feb_file_%.exe}
-         feb_result_="$feb_result_$feb_sp_$feb_file_";;
-    esac
-    feb_sp_=' '
-  done
-  test $feb_fail_ = 0 && printf %s "$feb_result_"
-  return $feb_fail_
-}
-
-# Consider the files in directory, $1.
-# For each file name of the form PROG.exe, create an alias named
-# PROG that simply invokes PROG.exe, then return 0.  If any selected
-# file name or the directory name, $1, contains an unexpected character,
-# define no alias and return 1.
-create_exe_shims_ ()
-{
-  case $EXEEXT in
-    '') return 0 ;;
-    .exe) ;;
-    *) echo "$0: unexpected \$EXEEXT value: $EXEEXT" 1>&2; return 1 ;;
-  esac
-
-  base_names_=`find_exe_basenames_ $1` \
-    || { echo "$0 (exe_shim): skipping directory: $1" 1>&2; return 0; }
-
-  if test -n "$base_names_"; then
-    for base_ in $base_names_; do
-      alias "$base_"="$base_$EXEEXT"
-    done
-  fi
-
-  return 0
-}
-
-# Use this function to prepend to PATH an absolute name for each
-# specified, possibly-$initial_cwd_-relative, directory.
-path_prepend_ ()
-{
-  while test $# != 0; do
-    path_dir_=$1
-    case $path_dir_ in
-      '') fail_ "invalid path dir: '$1'";;
-      /* | ?:*) abs_path_dir_=$path_dir_;;
-      *) abs_path_dir_=$initial_cwd_/$path_dir_;;
-    esac
-    case $abs_path_dir_ in
-      *$PATH_SEPARATOR*) fail_ "invalid path dir: '$abs_path_dir_'";;
-    esac
-    PATH="$abs_path_dir_$PATH_SEPARATOR$PATH"
-
-    # Create an alias, FOO, for each FOO.exe in this directory.
-    create_exe_shims_ "$abs_path_dir_" \
-      || fail_ "something failed (above): $abs_path_dir_"
-    shift
-  done
-  export PATH
-}
-
-setup_ ()
-{
-  if test "$VERBOSE" = yes; then
-    # Test whether set -x may cause the selected shell to corrupt an
-    # application's stderr.  Many do, including zsh-4.3.10 and the /bin/sh
-    # from SunOS 5.11, OpenBSD 4.7 and Irix 5.x and 6.5.
-    # If enabling verbose output this way would cause trouble, simply
-    # issue a warning and refrain.
-    if $gl_set_x_corrupts_stderr_; then
-      warn_ "using SHELL=$SHELL with 'set -x' corrupts stderr"
-    else
-      set -x
-    fi
-  fi
-
-  initial_cwd_=$PWD
-
-  pfx_=`testdir_prefix_`
-  test_dir_=`mktempd_ "$initial_cwd_" "$pfx_-$ME_.XXXX"` \
-    || fail_ "failed to create temporary directory in $initial_cwd_"
-  cd "$test_dir_" || fail_ "failed to cd to temporary directory"
-
-  # As autoconf-generated configure scripts do, ensure that IFS
-  # is defined initially, so that saving and restoring $IFS works.
-  gl_init_sh_nl_='
-'
-  IFS=" ""	$gl_init_sh_nl_"
-
-  # This trap statement, along with a trap on 0 below, ensure that the
-  # temporary directory, $test_dir_, is removed upon exit as well as
-  # upon receipt of any of the listed signals.
-  for sig_ in 1 2 3 13 15; do
-    eval "trap 'Exit $(expr $sig_ + 128)' $sig_"
-  done
-}
-
-# Create a temporary directory, much like mktemp -d does.
-# Written by Jim Meyering.
-#
-# Usage: mktempd_ /tmp phoey.XXXXXXXXXX
-#
-# First, try to use the mktemp program.
-# Failing that, we'll roll our own mktemp-like function:
-#  - try to get random bytes from /dev/urandom
-#  - failing that, generate output from a combination of quickly-varying
-#      sources and gzip.  Ignore non-varying gzip header, and extract
-#      "random" bits from there.
-#  - given those bits, map to file-name bytes using tr, and try to create
-#      the desired directory.
-#  - make only $MAX_TRIES_ attempts
-
-# Helper function.  Print $N pseudo-random bytes from a-zA-Z0-9.
-rand_bytes_ ()
-{
-  n_=$1
-
-  # Maybe try openssl rand -base64 $n_prime_|tr '+/=\012' abcd first?
-  # But if they have openssl, they probably have mktemp, too.
-
-  chars_=abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
-  dev_rand_=/dev/urandom
-  if test -r "$dev_rand_"; then
-    # Note: 256-length($chars_) == 194; 3 copies of $chars_ is 186 + 8 = 194.
-    dd ibs=$n_ count=1 if=$dev_rand_ 2>/dev/null \
-      | LC_ALL=C tr -c $chars_ 01234567$chars_$chars_$chars_
-    return
-  fi
-
-  n_plus_50_=`expr $n_ + 50`
-  cmds_='date; date +%N; free; who -a; w; ps auxww; ps -ef'
-  data_=` (eval "$cmds_") 2>&1 | gzip `
-
-  # Ensure that $data_ has length at least 50+$n_
-  while :; do
-    len_=`echo "$data_"|wc -c`
-    test $n_plus_50_ -le $len_ && break;
-    data_=` (echo "$data_"; eval "$cmds_") 2>&1 | gzip `
-  done
-
-  echo "$data_" \
-    | dd bs=1 skip=50 count=$n_ 2>/dev/null \
-    | LC_ALL=C tr -c $chars_ 01234567$chars_$chars_$chars_
-}
-
-mktempd_ ()
-{
-  case $# in
-  2);;
-  *) fail_ "Usage: mktempd_ DIR TEMPLATE";;
-  esac
-
-  destdir_=$1
-  template_=$2
-
-  MAX_TRIES_=4
-
-  # Disallow any trailing slash on specified destdir:
-  # it would subvert the post-mktemp "case"-based destdir test.
-  case $destdir_ in
-  / | //) destdir_slash_=$destdir;;
-  */) fail_ "invalid destination dir: remove trailing slash(es)";;
-  *) destdir_slash_=$destdir_/;;
-  esac
-
-  case $template_ in
-  *XXXX) ;;
-  *) fail_ \
-       "invalid template: $template_ (must have a suffix of at least 4 X's)";;
-  esac
-
-  # First, try to use mktemp.
-  d=`unset TMPDIR; { mktemp -d -t -p "$destdir_" "$template_"; } 2>/dev/null` &&
-
-  # The resulting name must be in the specified directory.
-  case $d in "$destdir_slash_"*) :;; *) false;; esac &&
-
-  # It must have created the directory.
-  test -d "$d" &&
-
-  # It must have 0700 permissions.  Handle sticky "S" bits.
-  perms=`ls -dgo "$d" 2>/dev/null` &&
-  case $perms in drwx--[-S]---*) :;; *) false;; esac && {
-    echo "$d"
-    return
-  }
-
-  # If we reach this point, we'll have to create a directory manually.
-
-  # Get a copy of the template without its suffix of X's.
-  base_template_=`echo "$template_"|sed 's/XX*$//'`
-
-  # Calculate how many X's we've just removed.
-  template_length_=`echo "$template_" | wc -c`
-  nx_=`echo "$base_template_" | wc -c`
-  nx_=`expr $template_length_ - $nx_`
-
-  err_=
-  i_=1
-  while :; do
-    X_=`rand_bytes_ $nx_`
-    candidate_dir_="$destdir_slash_$base_template_$X_"
-    err_=`mkdir -m 0700 "$candidate_dir_" 2>&1` \
-      && { echo "$candidate_dir_"; return; }
-    test $MAX_TRIES_ -le $i_ && break;
-    i_=`expr $i_ + 1`
-  done
-  fail_ "$err_"
-}
-
-# If you want to override the testdir_prefix_ function,
-# or to add more utility functions, use this file.
-test -f "$srcdir/init.cfg" \
-  && . "$srcdir/init.cfg"
-
-setup_ "$@"
-# This trap is here, rather than in the setup_ function, because some
-# shells run the exit trap at shell function exit, rather than script exit.
-trap remove_tmp_ 0
--- a/tests/initial-tab
+++ b/tests/initial-tab
@ -1,7 +1,7 @@
 #!/bin/sh
 # Exercise -T.

-# Copyright 2016-2018 Free Software Foundation, Inc.
+# Copyright 2016-2026 Free Software Foundation, Inc.

 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
--- a/tests/invalid-multibyte-infloop
+++ b/tests/invalid-multibyte-infloop
@ -24,12 +24,10 @@ else
  test $status -eq 2
 fi || fail=1

-echo 'Binary file input matches' >binary-file-matches
-
 LC_ALL=en_US.UTF-8 timeout 10 grep -F $(encode A) input > out
 status=$?
 if test $status -eq 0; then
-  compare binary-file-matches out
+  compare /dev/null out
 elif test $status -eq 1; then
  compare_dev_null_ /dev/null out
 else
--- a/tests/khadafy
+++ b/tests/khadafy
@ -1,7 +1,7 @@
 #! /bin/sh
 # Regression test for GNU grep.
 #
-# Copyright (C) 2001, 2006, 2009-2018 Free Software Foundation, Inc.
+# Copyright (C) 2001, 2006, 2009-2026 Free Software Foundation, Inc.
 #
 # Copying and distribution of this file, with or without modification,
 # are permitted in any medium without royalty provided the copyright
--- a/tests/kwset-abuse
+++ b/tests/kwset-abuse
@ -2,7 +2,7 @@
 # Evoke a segfault in a hard-to-reach code path of kwset.c.
 # This bug affected grep versions 2.19 through 2.21.
 #
-# Copyright (C) 2015-2018 Free Software Foundation, Inc.
+# Copyright (C) 2015-2026 Free Software Foundation, Inc.
 #
 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
--- a/tests/long-pattern-perf
+++ b/tests/long-pattern-perf
@ -1,7 +1,7 @@
 #!/bin/sh
 # grep-2.21 would incur a 100x penalty for 10x increase in regexp length

-# Copyright 2015-2018 Free Software Foundation, Inc.
+# Copyright 2015-2026 Free Software Foundation, Inc.

 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@ -24,17 +24,30 @@ fail=0
 # system load during the two test runs, so we'll mark it as
 # "expensive", making it less likely to be run by regular users.
 expensive_
+require_perl_

 echo x > in || framework_failure_
-# We could use seq -s '' (avoiding the tr filter), but I
-# suspect some version of seq does not honor that option.
 # Note that we want 10x the byte count (not line count) in the larger file.
 seq 10000 50000 | tr -d '\012' > r || framework_failure_
 cat r r r r r r r r r r > re-10x || framework_failure_
 mv r re || framework_failure_

-base_ms=$(user_time_ 1 grep -f re in    ) || fail=1
-b10x_ms=$(user_time_ 1 grep -f re-10x in) || fail=1
+returns_ 0 user_time_ 1 grep -f re in > base-ms \
+    || framework_failure_ 'failed to compute baseline timing'
+base_ms=$(cat base-ms)
+
+# This test caused trouble on at least two types of fringe hosts: those
+# with very little memory (a 1.5GB RAM Solaris host) and a Linux/s390x
+# (emulated with qemu-system-s390x). The former became unusable due to
+# mem requirements of the 2nd test, and the latter ended up taking >35x
+# more time than the base case. Skipping this test for any system using
+# more than this many milliseconds for the first case should avoid those
+# false-positive failures while skipping the test on few other systems.
+test 800 -lt "$base_ms" && skip_ "this base-case test took too long"
+
+returns_ 0 user_time_ 1 grep -f re-10x in > b10x-ms \
+    || framework_failure_ 'failed to compute 10x timing'
+b10x_ms=$(cat b10x-ms)

 # Increasing the length of the regular expression by a factor
 # of 10 should cause no more than a 10x increase in duration.
--- a/tests/many-regex-performance
+++ b/tests/many-regex-performance
@ -0,0 +1,80 @@
+#!/bin/sh
+# Test for this performance regression:
+# grep-3.4 would require O(N^2) RSS for N regexps
+# grep-3.5 requires O(N) in the most common cases.
+
+# Copyright 2020-2026 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <https://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+fail=0
+
+# This test is susceptible to failure due to differences in
+# system load during the two test runs, so we'll mark it as
+# "expensive", making it less likely to be run by regular users.
+expensive_
+require_perl_
+
+# Make the quick/small input large enough so that even on high-end
+# systems this first invocation takes at least 10ms of user time.
+word_list=/usr/share/dict/linux.words
+
+# If $word_list does not exist, generate an input that exhibits
+# similar performance characteristics.
+if ! test -f $word_list; then
+  # Generate data comparable to that word list.
+  # Note how all "words" start with "a", and that there is
+  # a small percentage of lines with at least one "." metachar.
+  # This requires /dev/urandom, so if it's not present, skip
+  # this test. If desperate, we could fall back to using
+  # tar+compressed lib/*.c as the data source.
+  test -r /dev/urandom \
+    || skip_ 'this system has neither word list nor working /dev/urandom'
+  word_list=word_list
+  ( echo a; cat /dev/urandom		\
+    | LC_ALL=C tr -dc 'a-zA-Z0-9_'	\
+    | head -c500000			\
+    | sed 's/\(........\)/\1\n/g'	\
+    | sed s/rs/./			\
+    | sed s/./a/			\
+    | sort				\
+  ) > $word_list
+fi
+
+n_lines=2000
+while :; do
+  sed ${n_lines}q < $word_list > in || framework_failure_
+  small_ms=$(LC_ALL=C user_time_ 1 grep --file=in -v in) || fail=1
+  test $small_ms -ge 10 && break
+  n_lines=$(expr $n_lines + 2000)
+done
+
+# Now, run it again, but with 20 times as many lines.
+n_lines=$(expr $n_lines \* 20)
+sed ${n_lines}q < $word_list > in || framework_failure_
+large_ms=$(LC_ALL=C user_time_ 1 grep --file=in -v in) || fail=1
+
+# Deliberately recording in an unused variable so it
+# shows up in set -x output, in case this test fails.
+ratio=$(expr "$large_ms" / "$small_ms")
+
+# The duration of the larger run must be no more than 60 times
+# that of the small one.  Using recent versions prior to this fix,
+# this test would fail due to ratios larger than 300.  Using the
+# fixed version, it's common to see a ratio of 20-30.
+returns_ 1 expr $small_ms '<' $large_ms / 60 || fail=1
+
+Exit $fail
--- a/tests/match-lines
+++ b/tests/match-lines
@ -3,7 +3,7 @@
 # grep -F -x -o PAT print an extra newline for each match.
 # This would fail for grep-2.19 and grep-2.20.

-# Copyright 2014-2018 Free Software Foundation, Inc.
+# Copyright 2014-2026 Free Software Foundation, Inc.

 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
--- a/tests/max-count-overread
+++ b/tests/max-count-overread
@ -12,4 +12,20 @@ echo x > exp || framework_failure_
 yes x | timeout 10 grep -m1 x > out || fail=1
 compare exp out || fail=1

+# Make sure -m2 stops reading even when output is /dev/null.
+# In grep 3.11, it would continue reading.
+printf 'x\nx\nx\n' >in || framework_failure_
+(grep -m2 x >/dev/null && head -n1) <in >out || fail=1
+compare exp out || fail=1
+
+# The following two tests would fail before v3.11-70
+echo x > in || framework_failure_
+echo in > exp || framework_failure_
+grep -l -m1 . in > out || fail=1
+compare exp out || fail=1
+
+# Ensure that this prints nothing and exits successfully.
+grep -q -m1 . in > out || fail=1
+compare /dev/null out || fail=1
+
 Exit $fail
--- a/tests/mb-dot-newline
+++ b/tests/mb-dot-newline
@ -2,7 +2,7 @@
 # Trigger a bug in the DFA matcher.
 # This would fail for grep-2.20.

-# Copyright 2014-2018 Free Software Foundation, Inc.
+# Copyright 2014-2026 Free Software Foundation, Inc.

 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
--- a/tests/mb-non-UTF8-overrun
+++ b/tests/mb-non-UTF8-overrun
@ -2,7 +2,7 @@
 # grep would sometimes read beyond end of input, when using a non-UTF8
 # multibyte locale.

-# Copyright 2014-2018 Free Software Foundation, Inc.
+# Copyright 2014-2026 Free Software Foundation, Inc.

 # This program is free software: you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by