diff --git a/doc/regexprops.texi b/doc/regexprops.texi index 62cc7c86..17cfc55e 100644 --- a/doc/regexprops.texi +++ b/doc/regexprops.texi @@ -11,14 +11,14 @@ @menu * findutils-default regular expression syntax:: +* awk regular expression syntax:: +* egrep regular expression syntax:: * emacs regular expression syntax:: * gnu-awk regular expression syntax:: * grep regular expression syntax:: * posix-awk regular expression syntax:: -* awk regular expression syntax:: * posix-basic regular expression syntax:: * posix-egrep regular expression syntax:: -* egrep regular expression syntax:: * posix-extended regular expression syntax:: @end menu @@ -42,7 +42,7 @@ matches a @samp{?}. @end table -Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are not supported, so for example you would need to use @samp{[0-9]} instead of @samp{[[:digit:]]}. +Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. GNU extensions are supported: @@ -108,11 +108,124 @@ The character @samp{$} only represents the end of a string when it appears: @end enumerate +Intervals are specified by @samp{\@{} and @samp{\@}}. +Invalid intervals such as @samp{a\@{1z} are not accepted. The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups. +@node awk regular expression syntax +@subsection @samp{awk} regular expression syntax + + +The character @samp{.} matches any single character except the null character. + + +@table @samp + +@item + +indicates that the regular expression should match one or more occurrences of the previous atom or regexp. +@item ? +indicates that the regular expression should match zero or one occurrence of the previous atom or regexp. +@item \+ +matches a @samp{+} +@item \? +matches a @samp{?}. +@end table + + +Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. + + +GNU extensions are not supported and so @samp{\w}, @samp{\W}, @samp{\<}, @samp{\>}, @samp{\b}, @samp{\B}, @samp{\`}, and @samp{\'} match @samp{w}, @samp{W}, @samp{<}, @samp{>}, @samp{b}, @samp{B}, @samp{`}, and @samp{'} respectively. + + +Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit matches that digit. + +The alternation operator is @samp{|}. + +The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified. + + +@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except: +@enumerate + +@item At the beginning of a regular expression + +@item After an open-group, signified by @samp{(} + +@item After the alternation operator @samp{|} + +@end enumerate + + + + +The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups. + + +@node egrep regular expression syntax +@subsection @samp{egrep} regular expression syntax + + +The character @samp{.} matches any single character. + + +@table @samp + +@item + +indicates that the regular expression should match one or more occurrences of the previous atom or regexp. +@item ? +indicates that the regular expression should match zero or one occurrence of the previous atom or regexp. +@item \+ +matches a @samp{+} +@item \? +matches a @samp{?}. +@end table + + +Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. + + +GNU extensions are supported: +@enumerate + +@item @samp{\w} matches a character within a word + +@item @samp{\W} matches a character which is not within a word + +@item @samp{\<} matches the beginning of a word + +@item @samp{\>} matches the end of a word + +@item @samp{\b} matches a word boundary + +@item @samp{\B} matches characters which are not a word boundary + +@item @samp{\`} matches the beginning of the whole input + +@item @samp{\'} matches the end of the whole input + +@end enumerate + + +Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}. + +The alternation operator is @samp{|}. + +The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified. + + +The characters @samp{*}, @samp{+} and @samp{?} are special anywhere in a regular expression. + + +Intervals are specified by @samp{@{} and @samp{@}}. +Invalid intervals are treated as literals, for example @samp{a@{1} is treated as @samp{a\@{1} + +The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups. + + @node emacs regular expression syntax @subsection @samp{emacs} regular expression syntax @@ -133,7 +246,7 @@ matches a @samp{?}. @end table -Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are not supported, so for example you would need to use @samp{[0-9]} instead of @samp{[[:digit:]]}. +Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. GNU extensions are supported: @@ -199,6 +312,8 @@ The character @samp{$} only represents the end of a string when it appears: @end enumerate +Intervals are specified by @samp{\@{} and @samp{\@}}. +Invalid intervals such as @samp{a\@{1z} are not accepted. The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups. @@ -420,56 +535,6 @@ The characters @samp{^} and @samp{$} always represent the beginning and end of a Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals are treated as literals, for example @samp{a@{1} is treated as @samp{a\@{1} -The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups. - - -@node awk regular expression syntax -@subsection @samp{awk} regular expression syntax - - -The character @samp{.} matches any single character except the null character. - - -@table @samp - -@item + -indicates that the regular expression should match one or more occurrences of the previous atom or regexp. -@item ? -indicates that the regular expression should match zero or one occurrence of the previous atom or regexp. -@item \+ -matches a @samp{+} -@item \? -matches a @samp{?}. -@end table - - -Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. - - -GNU extensions are not supported and so @samp{\w}, @samp{\W}, @samp{\<}, @samp{\>}, @samp{\b}, @samp{\B}, @samp{\`}, and @samp{\'} match @samp{w}, @samp{W}, @samp{<}, @samp{>}, @samp{b}, @samp{B}, @samp{`}, and @samp{'} respectively. - - -Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit matches that digit. - -The alternation operator is @samp{|}. - -The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified. - - -@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except: -@enumerate - -@item At the beginning of a regular expression - -@item After an open-group, signified by @samp{(} - -@item After the alternation operator @samp{|} - -@end enumerate - - - - The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups. @@ -567,68 +632,7 @@ The longest possible match is returned; this applies to the regular expression a @node posix-egrep regular expression syntax @subsection @samp{posix-egrep} regular expression syntax - - -The character @samp{.} matches any single character. - - -@table @samp - -@item + -indicates that the regular expression should match one or more occurrences of the previous atom or regexp. -@item ? -indicates that the regular expression should match zero or one occurrence of the previous atom or regexp. -@item \+ -matches a @samp{+} -@item \? -matches a @samp{?}. -@end table - - -Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. - - -GNU extensions are supported: -@enumerate - -@item @samp{\w} matches a character within a word - -@item @samp{\W} matches a character which is not within a word - -@item @samp{\<} matches the beginning of a word - -@item @samp{\>} matches the end of a word - -@item @samp{\b} matches a word boundary - -@item @samp{\B} matches characters which are not a word boundary - -@item @samp{\`} matches the beginning of the whole input - -@item @samp{\'} matches the end of the whole input - -@end enumerate - - -Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}. - -The alternation operator is @samp{|}. - -The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified. - - -The characters @samp{*}, @samp{+} and @samp{?} are special anywhere in a regular expression. - - -Intervals are specified by @samp{@{} and @samp{@}}. -Invalid intervals are treated as literals, for example @samp{a@{1} is treated as @samp{a\@{1} - -The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups. - - -@node egrep regular expression syntax -@subsection @samp{egrep} regular expression syntax -This is a synonym for posix-egrep. +This is a synonym for egrep. @node posix-extended regular expression syntax @subsection @samp{posix-extended} regular expression syntax