Add support for Shell printf format strings.

* gettext-tools/src/message.h (enum format_type): Add format_sh_printf.
(NFORMATS): Increment.
* gettext-tools/src/message.c (format_language, format_language_pretty): Add an
entry for format_sh_printf.
* gettext-tools/src/format.h (formatstring_sh_printf): New declaration.
* gettext-tools/src/format.c (formatstring_parsers): Add an entry for
format_sh_printf.
* gettext-tools/src/format-sh-printf.c: New file, based on
gettext-tools/src/format-awk.c.
* gettext-tools/src/FILES: Mention it.
* gettext-tools/src/x-sh.h (SCANNERS_SH): Use formatstring_sh_printf as
secondary format string type.
* gettext-tools/src/xgettext.c (xgettext_record_flag): Update accordingly.
* gettext-tools/src/x-sh.c (init_flag_table_sh): Register gettext, ngettext with
flag 'pass-sh-printf-format'. Register 'printf' with flag 'sh-printf-format'.
* gettext-tools/src/Makefile.am (FORMAT_SOURCE): Add format-sh-printf.c.
* gettext-tools/libgettextpo/Makefile.am (libgettextpo_la_AUXSOURCES): Likewise.
* gettext-tools/doc/gettext.texi (PO Files): Mention sh-printf-format.
(sh-format): Document also the sh-printf-format strings.
* gettext-tools/doc/lang-sh.texi (sh): Mention the coreutils 'printf' command.
* gettext-tools/tests/xgettext-sh-1: Add a test case with a printf invocation.
* gettext-tools/tests/format-sh-printf-1: New file, based on
gettext-tools/tests/format-awk-1.
* gettext-tools/tests/format-sh-printf-2: New file, based on
gettext-tools/tests/format-awk-2.
* gettext-tools/tests/Makefile.am (TESTS): Add them.
* NEWS: Mention the change.
Bruno Haible 2025-06-22 10:56:33 +02:00
parent 6d6241d0fe
commit c93b9f3976
18 changed files with 1007 additions and 10 deletions
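
For orientation, here is a minimal sketch of the kind of script this commit targets, modelled on the test case added to gettext-tools/tests/xgettext-sh-1. The show_user function and the gettext fallback are assumptions added so that the sketch runs standalone; they are not part of the commit.

```shell
#!/bin/sh
# Assumption for this sketch only: if gettext(1) is not installed, fall back
# to echoing the msgid so the script still runs. Real scripts would simply
# require gettext.
if ! command -v gettext >/dev/null 2>&1; then
  gettext () { printf '%s' "$1"; }
fi

# The gettext result is used as the first argument of printf, which is the
# context in which xgettext marks the msgid as 'sh-printf-format'.
show_user () {
  printf "`gettext 'User name: %s\nUser ID: %u'`"'\n' "$1" "$2"
}

show_user "`id -un`" "`id -u`"
```

Running xgettext -L Shell on such a script should produce a POT entry flagged "#, sh-printf-format", as the expected output in the xgettext-sh-1 test shows.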

NEWS

@@ -12,6 +12,8 @@ Version 0.26 - July 2025
in a context that requires a format string. You can override this
heuristic by using a comment of the form /* xgettext: c-format */.
* Shell:
- xgettext now recognizes format strings in the 'printf' command syntax.
They are marked as 'sh-printf-format' in POT and PO files.
- xgettext now recognizes the \c, \u, and \U escape sequences in dollar-
single-quoted strings $'...'.

gettext-tools/doc/gettext.texi

@@ -1733,7 +1733,13 @@ Likewise for Ruby, see @ref{ruby-format}.
@kwindex sh-format@r{ flag}
@itemx no-sh-format
@kwindex no-sh-format@r{ flag}
Likewise for Shell, see @ref{sh-format}.
Likewise for Shell format strings, see @ref{sh-format}.
@item sh-printf-format
@kwindex sh-printf-format@r{ flag}
@itemx no-sh-printf-format
@kwindex no-sh-printf-format@r{ flag}
Likewise for Shell @code{printf} format strings, see @ref{sh-format}.
@item awk-format
@kwindex awk-format@r{ flag}
@@ -10227,6 +10233,14 @@ equivalent to @code{%<@var{name}>s}.
@node sh-format
@subsection Shell Format Strings
There are two kinds of format strings in shell scripts:
those with dollar notation for placeholders,
called @emph{Shell format strings}
and labelled as @samp{sh-format},
and those acceptable to the @samp{printf} command (or shell built-in command),
called @emph{Shell @code{printf} format strings}
and labelled as @samp{sh-printf-format}.
Shell format strings, as supported by GNU gettext and the @samp{envsubst}
program, are strings with references to shell variables in the form
@code{$@var{variable}} or @code{$@{@var{variable}@}}. References of the form
@@ -10243,6 +10257,28 @@ that would be valid inside shell scripts, are not supported. The
ASCII characters, not start with a digit and be nonempty; otherwise such
a variable reference is ignored.
Shell @code{printf} format strings are the format strings supported
by the POSIX @samp{printf} command
(@url{https://pubs.opengroup.org/onlinepubs/9799919799/utilities/printf.html}),
including the floating-point conversion specifiers
@code{a}, @code{A}, @code{e}, @code{E}, @code{f}, @code{F}, @code{g}, @code{G},
but without the obsolescent @code{b} conversion specifier.
Extensions by the GNU coreutils @samp{printf} command
(@url{https://www.gnu.org/software/coreutils/manual/html_node/printf-invocation.html})
are not supported:
use of @samp{*} or @samp{*@var{m}$} as width or precision;
use of size specifiers @code{h}, @code{l}, @code{j}, @code{z}, @code{t} (ignored);
and the escape sequences @code{\c},
@code{\x@var{nn}}, @code{\u@var{nnnn}}, @code{\U@var{nnnnnnnn}}.
Extensions by the GNU bash @samp{printf} built-in
(@url{https://www.gnu.org/software/bash/manual/html_node/Bash-Builtins.html#index-printf})
are not supported either:
use of @samp{*} as width or precision;
use of size specifiers @code{h}, @code{l}, @code{j}, @code{z}, @code{t} (ignored);
the @code{%b}, @code{%q}, @code{%Q}, @code{%T}, @code{%n} directives;
and the escape sequences
@code{\x@var{nn}}, @code{\u@var{nnnn}}, @code{\U@var{nnnnnnnn}}.
@node awk-format
@subsection awk Format Strings
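
To illustrate the POSIX subset described in the sh-format node above, a small sketch follows. The format_row function and its sample values are hypothetical; the point is only that flags, a constant width and a constant precision stay within what is documented as sh-printf-format, whereas a '*' width or precision would not.

```shell
#!/bin/sh
# Flags ('-', '#'), a constant width and a constant precision are all POSIX,
# so this format string stays within the documented sh-printf-format subset.
format_row () {
  printf '%-10s|%5.3d|%#x\n' "$1" "$2" "$3"
}

# A '*' width such as '%*d' is an extension and is documented as unsupported;
# a constant width like the '%5.3d' above is the portable alternative.
format_row alice 42 255
```

With the sample values, the row is left-padded and zero-filled as in C printf.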

gettext-tools/doc/lang-sh.texi

@@ -1,5 +1,5 @@
@c This file is part of the GNU gettext manual.
@c Copyright (C) 1995-2024 Free Software Foundation, Inc.
@c Copyright (C) 1995-2025 Free Software Foundation, Inc.
@c See the file gettext.texi for copying conditions.
@node sh
@@ -50,10 +50,11 @@ use
@code{xgettext}
@item Formatting with positions
---
@c Not yet: It requires support in GNU coreutils, GNU bash, dash, etc.
@c @url{https://pubs.opengroup.org/onlinepubs/9799919799/utilities/printf.html,
@c @code{printf}}
A POSIX-compliant
@url{https://pubs.opengroup.org/onlinepubs/9799919799/utilities/printf.html,
@code{printf}}
command, such as the one from GNU coreutils 9.6 or newer.
@c GNU Bash built-in?
@item Portability
fully portable
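
The "Formatting with positions" entry above depends on a printf command that understands numbered directives such as %2$s. The following sketch probes for that at run time and falls back to a fixed argument order otherwise. The probe, the positional variable and the greet function are all illustrative assumptions, not something the manual prescribes.

```shell
#!/bin/sh
# Probe the external printf (not the shell built-in) for '%m$' support.
if probe=$(env printf '%2$s %1$s' a b 2>/dev/null) && [ "$probe" = "b a" ]; then
  positional=yes
else
  positional=no
fi

# Print "<greeting>, <name>!" using numbered directives when available.
greet () {
  if [ "$positional" = yes ]; then
    env printf '%2$s, %1$s!\n' "$1" "$2"
  else
    # Fallback: reorder the arguments instead of the directives.
    printf '%s, %s!\n' "$2" "$1"
  fi
}

greet world Hello
```

Only the first branch gives translators the freedom to reorder arguments inside the format string; the fallback merely keeps the script working with an older printf.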

gettext-tools/libgettextpo/Makefile.am

@@ -83,6 +83,7 @@ libgettextpo_la_AUXSOURCES = \
../src/format-go.c \
../src/format-ruby.c \
../src/format-sh.c \
../src/format-sh-printf.c \
../src/format-awk.c \
../src/format-lua.c \
../src/format-pascal.c \

gettext-tools/src/FILES

@@ -240,6 +240,7 @@ format-rust.c Format string handling for Rust.
format-go.c Format string handling for Go.
format-ruby.c Format string handling for Ruby.
format-sh.c Format string handling for Shell.
format-sh-printf.c Format string handling for Shell, printf syntax.
format-awk.c Format string handling for awk.
format-lua.c Format string handling for Lua.
format-pascal.c Format string handling for Object Pascal.

gettext-tools/src/Makefile.am

@@ -203,6 +203,7 @@ FORMAT_SOURCE += \
format-go.c \
format-ruby.c \
format-sh.c \
format-sh-printf.c \
format-awk.c \
format-lua.c \
format-pascal.c \

gettext-tools/src/format-sh-printf.c

@@ -0,0 +1,608 @@
/* Shell printf format strings.
Copyright (C) 2001-2025 Free Software Foundation, Inc.
Written by Bruno Haible <bruno@clisp.org>, 2025.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>. */
#ifdef HAVE_CONFIG_H
# include <config.h>
#endif
#include <stdbool.h>
#include <stdlib.h>
#include "format.h"
#include "c-ctype.h"
#include "xalloc.h"
#include "xvasprintf.h"
#include "format-invalid.h"
#include "gettext.h"
#define _(str) gettext (str)
/* Shell printf format strings are described in
* POSIX:
<https://pubs.opengroup.org/onlinepubs/9799919799/utilities/printf.html>
<https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap05.html#tag_05>
* The GNU coreutils documentation:
<https://www.gnu.org/software/coreutils/manual/html_node/printf-invocation.html>
* The GNU bash documentation:
<https://www.gnu.org/software/bash/manual/html_node/Bash-Builtins.html#index-printf>
The format string consists of
- plain text,
- directives, that start with '%',
- escape sequences, that start with a backslash and don't contain '%'.
The set of supported directives and escape sequences is documented in gettext.texi.
A directive
- starts with '%' or '%m$' where m is a positive integer,
- is optionally followed by any of the characters '#', '0', '-', ' ', '+',
each of which acts as a flag,
- is optionally followed by a width specification: a nonempty digit sequence,
[not in POSIX: '*' (reads an argument) or '*m$']
- is optionally followed by '.' and a precision specification: a possibly
empty digit sequence,
[not in POSIX: '*' (reads an argument) or '*m$']
- [not in POSIX: is optionally followed by a size specifier, one of
'hh' 'h' 'l' 'll' 'L' 'q' 'j' 'z' 't']
- is finished by a specifier
- 'c', that needs a character argument,
- 's', that needs a string argument,
- 'i', 'd', that need a signed integer argument,
- 'u', 'o', 'x', 'X', that need an unsigned integer argument,
- [optional in POSIX, but supported here:] 'e', 'E', 'f', 'F', 'g', 'G',
'a', 'A', that need a floating-point argument.
Additionally there is the directive '%%', which takes no argument.
Numbered ('%m$' or '*m$') and unnumbered argument specifications cannot
be used in the same string.
The valid escape sequences are:
\\ \a \b \f \n \r \t \v
\nnn with 1 to 3 octal digits n
[not in POSIX: \c \xnn \unnnn \Unnnnnnnn]
*/
enum format_arg_type
{
FAT_NONE,
FAT_CHARACTER,
FAT_STRING,
FAT_INTEGER,
FAT_UNSIGNED_INTEGER,
FAT_FLOAT
};
struct numbered_arg
{
unsigned int number;
enum format_arg_type type;
};
struct spec
{
unsigned int directives;
/* We consider a directive as "likely intentional" if it does not contain a
space. This prevents xgettext from flagging strings like "100% complete"
as 'sh-printf-format' if they don't occur in a context that requires a
format string. */
unsigned int likely_intentional_directives;
unsigned int numbered_arg_count;
struct numbered_arg *numbered;
};
static int
numbered_arg_compare (const void *p1, const void *p2)
{
unsigned int n1 = ((const struct numbered_arg *) p1)->number;
unsigned int n2 = ((const struct numbered_arg *) p2)->number;
return (n1 > n2 ? 1 : n1 < n2 ? -1 : 0);
}
static void *
format_parse (const char *format, bool translated, char *fdi,
char **invalid_reason)
{
const char *const format_start = format;
struct spec spec;
unsigned int numbered_allocated;
unsigned int unnumbered_arg_count;
struct spec *result;
spec.directives = 0;
spec.likely_intentional_directives = 0;
spec.numbered_arg_count = 0;
spec.numbered = NULL;
numbered_allocated = 0;
unnumbered_arg_count = 0;
for (; *format != '\0';)
/* Invariant: spec.numbered_arg_count == 0 || unnumbered_arg_count == 0. */
if (*format == '%')
{
/* A directive. */
bool likely_intentional = true;
FDI_SET (format, FMTDIR_START);
format++;
spec.directives++;
if (*format != '%')
{
unsigned int number = 0;
enum format_arg_type type;
if (c_isdigit (*format))
{
const char *f = format;
unsigned int m = 0;
do
{
m = 10 * m + (*f - '0');
f++;
}
while (c_isdigit (*f));
if (*f == '$')
{
if (m == 0)
{
*invalid_reason = INVALID_ARGNO_0 (spec.directives);
FDI_SET (f, FMTDIR_ERROR);
goto bad_format;
}
number = m;
format = ++f;
}
}
/* Parse flags. */
while (*format == ' ' || *format == '+' || *format == '-'
|| *format == '#' || *format == '0')
{
if (*format == ' ')
likely_intentional = false;
format++;
}
/* Parse width. */
if (c_isdigit (*format))
{
do format++; while (c_isdigit (*format));
}
/* Parse precision. */
if (*format == '.')
{
format++;
while (c_isdigit (*format))
format++;
}
switch (*format)
{
case 'c':
type = FAT_CHARACTER;
break;
case 's':
type = FAT_STRING;
break;
case 'i': case 'd':
type = FAT_INTEGER;
break;
case 'u': case 'o': case 'x': case 'X':
type = FAT_UNSIGNED_INTEGER;
break;
case 'e': case 'E': case 'f': case 'F': case 'g': case 'G':
case 'a': case 'A':
type = FAT_FLOAT;
break;
default:
if (*format == '\0')
{
*invalid_reason = INVALID_UNTERMINATED_DIRECTIVE ();
FDI_SET (format - 1, FMTDIR_ERROR);
}
else
{
*invalid_reason =
INVALID_CONVERSION_SPECIFIER (spec.directives, *format);
FDI_SET (format, FMTDIR_ERROR);
}
goto bad_format;
}
if (number)
{
/* Numbered argument. */
/* Numbered and unnumbered specifications are exclusive. */
if (unnumbered_arg_count > 0)
{
*invalid_reason = INVALID_MIXES_NUMBERED_UNNUMBERED ();
FDI_SET (format, FMTDIR_ERROR);
goto bad_format;
}
if (numbered_allocated == spec.numbered_arg_count)
{
numbered_allocated = 2 * numbered_allocated + 1;
spec.numbered = (struct numbered_arg *) xrealloc (spec.numbered, numbered_allocated * sizeof (struct numbered_arg));
}
spec.numbered[spec.numbered_arg_count].number = number;
spec.numbered[spec.numbered_arg_count].type = type;
spec.numbered_arg_count++;
}
else
{
/* Unnumbered argument. */
/* Numbered and unnumbered specifications are exclusive. */
if (spec.numbered_arg_count > 0)
{
*invalid_reason = INVALID_MIXES_NUMBERED_UNNUMBERED ();
FDI_SET (format, FMTDIR_ERROR);
goto bad_format;
}
if (numbered_allocated == unnumbered_arg_count)
{
numbered_allocated = 2 * numbered_allocated + 1;
spec.numbered = (struct numbered_arg *) xrealloc (spec.numbered, numbered_allocated * sizeof (struct numbered_arg));
}
spec.numbered[unnumbered_arg_count].number = unnumbered_arg_count + 1;
spec.numbered[unnumbered_arg_count].type = type;
unnumbered_arg_count++;
}
}
if (likely_intentional)
spec.likely_intentional_directives++;
FDI_SET (format, FMTDIR_END);
format++;
}
else if (*format == '\\')
{
/* An escape sequence. */
FDI_SET (format, FMTDIR_START);
format++;
switch (*format)
{
case '\\':
case 'a':
case 'b':
case 'f':
case 'n':
case 'r':
case 't':
case 'v':
format++;
break;
case '0': case '1': case '2': case '3': case '4': case '5': case '6':
case '7':
format++;
if (*format >= '0' && *format <= '7')
{
format++;
if (*format >= '0' && *format <= '7')
format++;
}
break;
default:
if (*format == '\0')
{
*invalid_reason =
xstrdup (_("The string ends in the middle of an escape sequence."));
FDI_SET (format - 1, FMTDIR_ERROR);
}
else
{
*invalid_reason =
(c_isprint (*format)
? ((*format == 'c'
|| *format == 'x'
|| *format == 'u' || *format == 'U')
? xasprintf (_("The escape sequence '%c%c' is unsupported (not in POSIX)."), '\\', *format)
: xasprintf (_("The escape sequence '%c%c' is invalid."), '\\', *format))
: xstrdup (_("This escape sequence is invalid.")));
FDI_SET (format, FMTDIR_ERROR);
}
goto bad_format;
}
FDI_SET (format - 1, FMTDIR_END);
}
else
format++;
/* Convert the unnumbered argument array to numbered arguments. */
if (unnumbered_arg_count > 0)
spec.numbered_arg_count = unnumbered_arg_count;
/* Sort the numbered argument array, and eliminate duplicates. */
else if (spec.numbered_arg_count > 1)
{
unsigned int i, j;
bool err;
qsort (spec.numbered, spec.numbered_arg_count,
sizeof (struct numbered_arg), numbered_arg_compare);
/* Remove duplicates: Copy from i to j, keeping 0 <= j <= i. */
err = false;
for (i = j = 0; i < spec.numbered_arg_count; i++)
if (j > 0 && spec.numbered[i].number == spec.numbered[j-1].number)
{
enum format_arg_type type1 = spec.numbered[i].type;
enum format_arg_type type2 = spec.numbered[j-1].type;
enum format_arg_type type_both;
if (type1 == type2)
type_both = type1;
else
{
/* Incompatible types. */
type_both = FAT_NONE;
if (!err)
*invalid_reason =
INVALID_INCOMPATIBLE_ARG_TYPES (spec.numbered[i].number);
err = true;
}
spec.numbered[j-1].type = type_both;
}
else
{
if (j < i)
{
spec.numbered[j].number = spec.numbered[i].number;
spec.numbered[j].type = spec.numbered[i].type;
}
j++;
}
spec.numbered_arg_count = j;
if (err)
/* *invalid_reason has already been set above. */
goto bad_format;
}
result = XMALLOC (struct spec);
*result = spec;
return result;
bad_format:
if (spec.numbered != NULL)
free (spec.numbered);
return NULL;
}
static void
format_free (void *descr)
{
struct spec *spec = (struct spec *) descr;
if (spec->numbered != NULL)
free (spec->numbered);
free (spec);
}
static int
format_get_number_of_directives (void *descr)
{
struct spec *spec = (struct spec *) descr;
return spec->directives;
}
static bool
format_is_unlikely_intentional (void *descr)
{
struct spec *spec = (struct spec *) descr;
return spec->likely_intentional_directives == 0;
}
static bool
format_check (void *msgid_descr, void *msgstr_descr, bool equality,
formatstring_error_logger_t error_logger, void *error_logger_data,
const char *pretty_msgid, const char *pretty_msgstr)
{
struct spec *spec1 = (struct spec *) msgid_descr;
struct spec *spec2 = (struct spec *) msgstr_descr;
bool err = false;
if (spec1->numbered_arg_count + spec2->numbered_arg_count > 0)
{
unsigned int i, j;
unsigned int n1 = spec1->numbered_arg_count;
unsigned int n2 = spec2->numbered_arg_count;
/* Check that the argument numbers are the same.
Both arrays are sorted. We search for the first difference. */
for (i = 0, j = 0; i < n1 || j < n2; )
{
int cmp = (i >= n1 ? 1 :
j >= n2 ? -1 :
spec1->numbered[i].number > spec2->numbered[j].number ? 1 :
spec1->numbered[i].number < spec2->numbered[j].number ? -1 :
0);
if (cmp > 0)
{
if (error_logger)
error_logger (error_logger_data,
_("a format specification for argument %u, as in '%s', doesn't exist in '%s'"),
spec2->numbered[j].number, pretty_msgstr,
pretty_msgid);
err = true;
break;
}
else if (cmp < 0)
{
if (equality)
{
if (error_logger)
error_logger (error_logger_data,
_("a format specification for argument %u doesn't exist in '%s'"),
spec1->numbered[i].number, pretty_msgstr);
err = true;
break;
}
else
i++;
}
else
j++, i++;
}
/* Check the argument types are the same. */
if (!err)
for (i = 0, j = 0; j < n2; )
{
if (spec1->numbered[i].number == spec2->numbered[j].number)
{
if (spec1->numbered[i].type != spec2->numbered[j].type)
{
if (error_logger)
error_logger (error_logger_data,
_("format specifications in '%s' and '%s' for argument %u are not the same"),
pretty_msgid, pretty_msgstr,
spec2->numbered[j].number);
err = true;
break;
}
j++, i++;
}
else
i++;
}
}
return err;
}
struct formatstring_parser formatstring_sh_printf =
{
format_parse,
format_free,
format_get_number_of_directives,
format_is_unlikely_intentional,
format_check
};
#ifdef TEST
/* Test program: Print the argument list specification returned by
format_parse for strings read from standard input. */
#include <stdio.h>
static void
format_print (void *descr)
{
struct spec *spec = (struct spec *) descr;
unsigned int last;
unsigned int i;
if (spec == NULL)
{
printf ("INVALID");
return;
}
printf ("(");
last = 1;
for (i = 0; i < spec->numbered_arg_count; i++)
{
unsigned int number = spec->numbered[i].number;
if (i > 0)
printf (" ");
if (number < last)
abort ();
for (; last < number; last++)
printf ("_ ");
switch (spec->numbered[i].type)
{
case FAT_CHARACTER:
printf ("c");
break;
case FAT_STRING:
printf ("s");
break;
case FAT_INTEGER:
printf ("i");
break;
case FAT_UNSIGNED_INTEGER:
printf ("[unsigned]i");
break;
case FAT_FLOAT:
printf ("f");
break;
default:
abort ();
}
last = number + 1;
}
printf (")");
}
int
main ()
{
for (;;)
{
char *line = NULL;
size_t line_size = 0;
int line_len;
char *invalid_reason;
void *descr;
line_len = getline (&line, &line_size, stdin);
if (line_len < 0)
break;
if (line_len > 0 && line[line_len - 1] == '\n')
line[--line_len] = '\0';
invalid_reason = NULL;
descr = format_parse (line, false, NULL, &invalid_reason);
format_print (descr);
printf ("\n");
if (descr == NULL)
printf ("%s\n", invalid_reason);
free (invalid_reason);
free (line);
}
return 0;
}
/*
* For Emacs M-x compile
* Local Variables:
* compile-command: "/bin/sh ../libtool --tag=CC --mode=link gcc -o a.out -static -O -g -Wall -I.. -I../gnulib-lib -I../../gettext-runtime/intl -DHAVE_CONFIG_H -DTEST format-sh-printf.c ../gnulib-lib/libgettextlib.la"
* End:
*/
#endif /* TEST */
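
The "likely intentional" heuristic in format-sh-printf.c (a directive counts only if its flags contain no space, and '%%' counts as well) can be approximated outside the parser. The sketch below is a rough regular-expression stand-in under that assumption. It does not validate the string the way format_parse does, and the function name is hypothetical.

```shell
#!/bin/sh
# Rough approximation, not the parser itself: succeed if the string contains
# '%%' or a directive (optional 'm$', flags without ' ', optional width and
# precision, POSIX conversion specifier).
has_likely_intentional_directive () {
  printf '%s\n' "$1" |
    LC_ALL=C grep -Eq '%(%|([0-9]+\$)?[-+#0]*[0-9]*(\.[0-9]*)?[csiduoxXeEfFgGaA])'
}

for s in '100% complete' '50%% done' '%d files'; do
  if has_likely_intentional_directive "$s"; then
    echo "likely a format string: $s"
  else
    echo "probably plain text:    $s"
  fi
done
```

In "100% complete" the only candidate directive is '% c', whose space flag makes it count as unintentional, so the string is treated as plain text unless it appears in a context that requires a format string.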

gettext-tools/src/format.c

@@ -51,6 +51,7 @@ struct formatstring_parser *formatstring_parsers[NFORMATS] =
/* format_go */ &formatstring_go,
/* format_ruby */ &formatstring_ruby,
/* format_sh */ &formatstring_sh,
/* format_sh_printf */ &formatstring_sh_printf,
/* format_awk */ &formatstring_awk,
/* format_lua */ &formatstring_lua,
/* format_pascal */ &formatstring_pascal,

gettext-tools/src/format.h

@@ -117,6 +117,7 @@ extern LIBGETTEXTSRC_DLL_VARIABLE struct formatstring_parser formatstring_rust;
extern LIBGETTEXTSRC_DLL_VARIABLE struct formatstring_parser formatstring_go;
extern LIBGETTEXTSRC_DLL_VARIABLE struct formatstring_parser formatstring_ruby;
extern LIBGETTEXTSRC_DLL_VARIABLE struct formatstring_parser formatstring_sh;
extern LIBGETTEXTSRC_DLL_VARIABLE struct formatstring_parser formatstring_sh_printf;
extern LIBGETTEXTSRC_DLL_VARIABLE struct formatstring_parser formatstring_awk;
extern LIBGETTEXTSRC_DLL_VARIABLE struct formatstring_parser formatstring_lua;
extern LIBGETTEXTSRC_DLL_VARIABLE struct formatstring_parser formatstring_pascal;

gettext-tools/src/message.c

@@ -51,6 +51,7 @@ const char *const format_language[NFORMATS] =
/* format_go */ "go",
/* format_ruby */ "ruby",
/* format_sh */ "sh",
/* format_sh_printf */ "sh-printf",
/* format_awk */ "awk",
/* format_lua */ "lua",
/* format_pascal */ "object-pascal",
@@ -90,6 +91,7 @@ const char *const format_language_pretty[NFORMATS] =
/* format_go */ "Go",
/* format_ruby */ "Ruby",
/* format_sh */ "Shell",
/* format_sh_printf */ "Shell printf",
/* format_awk */ "awk",
/* format_lua */ "Lua",
/* format_pascal */ "Object Pascal",

gettext-tools/src/message.h

@@ -60,6 +60,7 @@ enum format_type
format_go,
format_ruby,
format_sh,
format_sh_printf,
format_awk,
format_lua,
format_pascal,
@@ -79,7 +80,7 @@ enum format_type
format_gfc_internal,
format_ycp
};
#define NFORMATS 35 /* Number of format_type enum values. */
#define NFORMATS 36 /* Number of format_type enum values. */
extern LIBGETTEXTSRC_DLL_VARIABLE const char *const format_language[NFORMATS];
extern LIBGETTEXTSRC_DLL_VARIABLE const char *const format_language_pretty[NFORMATS];

gettext-tools/src/x-sh.c

@@ -138,14 +138,18 @@ void
init_flag_table_sh ()
{
xgettext_record_flag ("gettext:1:pass-sh-format");
xgettext_record_flag ("gettext:1:pass-sh-printf-format");
xgettext_record_flag ("ngettext:1:pass-sh-format");
xgettext_record_flag ("ngettext:1:pass-sh-printf-format");
xgettext_record_flag ("ngettext:2:pass-sh-format");
xgettext_record_flag ("ngettext:2:pass-sh-printf-format");
xgettext_record_flag ("eval_gettext:1:sh-format");
xgettext_record_flag ("eval_ngettext:1:sh-format");
xgettext_record_flag ("eval_ngettext:2:sh-format");
xgettext_record_flag ("eval_pgettext:2:sh-format");
xgettext_record_flag ("eval_npgettext:2:sh-format");
xgettext_record_flag ("eval_npgettext:3:sh-format");
xgettext_record_flag ("printf:1:sh-printf-format");
}

gettext-tools/src/x-sh.h

@@ -1,5 +1,5 @@
/* xgettext sh backend.
Copyright (C) 2003, 2006, 2014, 2018, 2020 Free Software Foundation, Inc.
Copyright (C) 2003-2025 Free Software Foundation, Inc.
Written by Bruno Haible <bruno@clisp.org>, 2003.
This program is free software: you can redistribute it and/or modify
@@ -33,7 +33,8 @@ extern "C" {
#define SCANNERS_SH \
{ "Shell", extract_sh, NULL, \
&flag_table_sh, &formatstring_sh, NULL }, \
&flag_table_sh, \
&formatstring_sh, &formatstring_sh_printf }, \
/* Scan a shell script file and add its translatable strings to mdlp. */
extern void extract_sh (FILE *fp, const char *real_filename,

gettext-tools/src/xgettext.c

@@ -1753,6 +1753,11 @@ xgettext_record_flag (const char *optionstring)
name_start, name_end,
argnum, value, pass);
break;
case format_sh_printf:
flag_context_list_table_insert (&flag_table_sh, XFORMAT_SECONDARY,
name_start, name_end,
argnum, value, pass);
break;
case format_awk:
flag_context_list_table_insert (&flag_table_awk, XFORMAT_PRIMARY,
name_start, name_end,

gettext-tools/tests/Makefile.am

@@ -227,6 +227,7 @@ TESTS = gettext-1 gettext-2 \
format-rust-1 format-rust-2 \
format-scheme-1 format-scheme-2 \
format-sh-1 format-sh-2 \
format-sh-printf-1 format-sh-printf-2 \
format-tcl-1 format-tcl-2 format-tcl-3 \
format-ycp-1 format-ycp-2 \
plural-1 plural-2 plural-3 plural-4 \

gettext-tools/tests/format-sh-printf-1

@@ -0,0 +1,178 @@
#! /bin/sh
. "${srcdir=.}/init.sh"; path_prepend_ . ../src
# Test recognition of Shell printf format strings.
escape_backslashes='s/\\/\\\\/g'
LC_ALL=C sed -e "$escape_backslashes" <<\EOF > f-sp-1.data
# Valid: no argument
"abc%%"
# Valid: one character argument
"abc%c"
# Valid: one string argument
"abc%s"
# Valid: one integer argument
"abc%i"
# Valid: one integer argument
"abc%d"
# Valid: one integer argument
"abc%o"
# Valid: one integer argument
"abc%u"
# Valid: one integer argument
"abc%x"
# Valid: one integer argument
"abc%X"
# Valid: one floating-point argument
"abc%e"
# Valid: one floating-point argument
"abc%E"
# Valid: one floating-point argument
"abc%f"
# Valid: one floating-point argument
"abc%F"
# Valid: one floating-point argument
"abc%g"
# Valid: one floating-point argument
"abc%G"
# Valid: one floating-point argument
"abc%a"
# Valid: one floating-point argument
"abc%A"
# Valid: one argument with flags
"abc%0#g"
# Valid: one argument with width
"abc%2g"
# Invalid: one argument with width
"abc%*g"
# Valid: one argument with precision
"abc%.4g"
# Invalid: one argument with precision
"abc%.*g"
# Valid: one argument with width and precision
"abc%14.4g"
# Invalid: one argument with width and precision
"abc%14.*g"
# Invalid: one argument with width and precision
"abc%*.4g"
# Invalid: one argument with width and precision
"abc%*.*g"
# Invalid: unterminated
"abc%"
# Invalid: unknown format specifier
"abc%y"
# Invalid: flags after width
"abc%*0g"
# Valid: null precision
"abc%.f"
# Invalid: twice precision
"abc%.4.2g"
# Valid: three arguments
"abc%d%u%u"
# Valid: a numbered argument
"abc%1$d"
# Invalid: zero
"abc%0$d"
# Valid: two-digit numbered arguments
"abc%11$def%10$dgh%9$dij%8$dkl%7$dmn%6$dop%5$dqr%4$dst%3$duv%2$dwx%1$dyz"
# Invalid: unterminated number
"abc%1"
# Invalid: flags before number
"abc%+1$d"
# Valid: three arguments, two with same number
"abc%1$4x,%2$c,%1$u"
# Invalid: argument with conflicting types
"abc%1$4x,%2$c,%1$s"
# Valid: no conflict
"abc%1$4x,%2$c,%1$u"
# Invalid: mixing of numbered and unnumbered arguments
"abc%d%2$x"
# Valid: numbered argument with constant precision
"abc%1$.9x"
# Invalid: numbered argument with '*' precision (not supported)
"abc%1$.*x"
# Valid: missing non-final argument
"abc%2$x%3$s"
# Valid: permutation
"abc%2$ddef%1$d"
# Valid: multiple uses of same argument
"abc%2$xdef%1$sghi%2$x"
# Invalid: one argument with width
"abc%2$#*1$g"
# Invalid: one argument with width and precision
"abc%3$*2$.*1$g"
# Invalid: zero
"abc%2$*0$.*1$g"
# Valid: escape sequence
"abc%%def\\"
# Valid: escape sequence
"abc%%def\a"
# Valid: escape sequence
"abc%%def\b"
# Valid: escape sequence
"abc%%def\f"
# Valid: escape sequence
"abc%%def\n"
# Valid: escape sequence
"abc%%def\r"
# Valid: escape sequence
"abc%%def\t"
# Valid: escape sequence
"abc%%def\v"
# Valid: escape sequence
"abc%%def\066"
# Invalid: escape sequence
"abc%%def\"
# Invalid: escape sequence
"abc%%def\""
# Invalid: escape sequence
"abc%%def\c"
# Invalid: escape sequence
"abc%%def\x32"
# Invalid: escape sequence
"abc%%def\u20ac"
# Invalid: escape sequence
"abc%%def\U0001F41C"
# Invalid: escape sequence
"abc%%def\%d"
EOF
: ${XGETTEXT=xgettext}
n=0
while read comment; do
# Note: The 'read' command processes backslashes. ('read -r' is not portable.)
read string
n=`expr $n + 1`
escape_backslashes='s/\\/\\\\/g'
escape_dollars='s/\$/\\\$/g'
string=`echo "$string" | LC_ALL=C sed -e "$escape_backslashes" -e "$escape_dollars"`
cat <<EOF > f-sp-1-$n.in
gettext ${string};
EOF
${XGETTEXT} -L Shell -o f-sp-1-$n.po f-sp-1-$n.in || Exit 1
test -f f-sp-1-$n.po || Exit 1
fail=
if echo "$comment" | grep 'Valid:' > /dev/null; then
if grep sh-printf-format f-sp-1-$n.po > /dev/null; then
:
else
fail=yes
fi
else
if grep sh-printf-format f-sp-1-$n.po > /dev/null; then
fail=yes
else
:
fi
fi
if test -n "$fail"; then
echo "Format string recognition error:" 1>&2
cat f-sp-1-$n.in 1>&2
echo "Got:" 1>&2
cat f-sp-1-$n.po 1>&2
Exit 1
fi
rm -f f-sp-1-$n.in f-sp-1-$n.po
done < f-sp-1.data
Exit 0

gettext-tools/tests/format-sh-printf-2

@@ -0,0 +1,145 @@
#! /bin/sh
. "${srcdir=.}/init.sh"; path_prepend_ . ../src
# Test checking of Shell printf format strings.
cat <<\EOF > f-sp-2.data
# Valid: %% doesn't count
msgid "abc%%def"
msgstr "xyz"
# Invalid: invalid msgstr
msgid "abc%%def"
msgstr "xyz%"
# Valid: same arguments
msgid "abc%s%gdef"
msgstr "xyz%s%g"
# Valid: same arguments, with different widths
msgid "abc%2sdef"
msgstr "xyz%3s"
# Valid: same arguments but in numbered syntax
msgid "abc%s%gdef"
msgstr "xyz%1$s%2$g"
# Valid: permutation
msgid "abc%s%g%cdef"
msgstr "xyz%3$c%2$g%1$s"
# Invalid: too few arguments
msgid "abc%2$udef%1$s"
msgstr "xyz%1$s"
# Invalid: too few arguments
msgid "abc%sdef%u"
msgstr "xyz%s"
# Invalid: too many arguments
msgid "abc%udef"
msgstr "xyz%uvw%c"
# Valid: same numbered arguments, with different widths
msgid "abc%2$5s%1$4s"
msgstr "xyz%2$4s%1$5s"
# Invalid: missing argument
msgid "abc%2$sdef%1$u"
msgstr "xyz%1$u"
# Invalid: missing argument
msgid "abc%1$sdef%2$u"
msgstr "xyz%2$u"
# Invalid: added argument
msgid "abc%1$udef"
msgstr "xyz%1$uvw%2$c"
# Valid: type compatibility
msgid "abc%i"
msgstr "xyz%d"
# Valid: type compatibility
msgid "abc%o"
msgstr "xyz%u"
# Valid: type compatibility
msgid "abc%u"
msgstr "xyz%x"
# Valid: type compatibility
msgid "abc%u"
msgstr "xyz%X"
# Valid: type compatibility
msgid "abc%e"
msgstr "xyz%E"
# Valid: type compatibility
msgid "abc%e"
msgstr "xyz%f"
# Valid: type compatibility
msgid "abc%e"
msgstr "xyz%F"
# Valid: type compatibility
msgid "abc%e"
msgstr "xyz%g"
# Valid: type compatibility
msgid "abc%e"
msgstr "xyz%G"
# Valid: type compatibility
msgid "abc%e"
msgstr "xyz%a"
# Valid: type compatibility
msgid "abc%e"
msgstr "xyz%A"
# Invalid: type incompatibility
msgid "abc%c"
msgstr "xyz%s"
# Invalid: type incompatibility
msgid "abc%c"
msgstr "xyz%i"
# Invalid: type incompatibility
msgid "abc%c"
msgstr "xyz%o"
# Invalid: type incompatibility
msgid "abc%c"
msgstr "xyz%e"
# Invalid: type incompatibility
msgid "abc%s"
msgstr "xyz%i"
# Invalid: type incompatibility
msgid "abc%s"
msgstr "xyz%o"
# Invalid: type incompatibility
msgid "abc%s"
msgstr "xyz%e"
# Invalid: type incompatibility
msgid "abc%i"
msgstr "xyz%o"
# Invalid: type incompatibility
msgid "abc%i"
msgstr "xyz%e"
# Invalid: type incompatibility
msgid "abc%u"
msgstr "xyz%e"
EOF
: ${MSGFMT=msgfmt}
n=0
while read comment; do
read msgid_line
read msgstr_line
n=`expr $n + 1`
cat <<EOF > f-sp-2-$n.po
#, sh-printf-format
${msgid_line}
${msgstr_line}
EOF
fail=
if echo "$comment" | grep 'Valid:' > /dev/null; then
if ${MSGFMT} --check-format -o f-sp-2-$n.mo f-sp-2-$n.po; then
:
else
fail=yes
fi
else
${MSGFMT} --check-format -o f-sp-2-$n.mo f-sp-2-$n.po 2> /dev/null
if test $? = 1; then
:
else
fail=yes
fi
fi
if test -n "$fail"; then
echo "Format string checking error:" 1>&2
cat f-sp-2-$n.po 1>&2
Exit 1
fi
rm -f f-sp-2-$n.po f-sp-2-$n.mo
done < f-sp-2.data
Exit 0

gettext-tools/tests/xgettext-sh-1

@@ -1,7 +1,7 @@
#!/bin/sh
. "${srcdir=.}/init.sh"; path_prepend_ . ../src
# Test of Shell support: escape sequences, string concatenation,
# Test of Shell support: escape sequences, format strings, string concatenation,
# strings with embedded expressions.
# Note! This file contains unescaped ASCII control characters. Edit carefully!
@ -495,6 +495,10 @@ echo `echo \`gettext $'depth_2_dollar_posix_0_"ab\"cd\'ef\\gh\eij\fkl\nmn\rop\tq
echo `echo \`gettext $'depth_2_dollar_posix_1_\cvab\cVcd\c[ef\c\\gh\c]ij\c?kl'\``
echo `echo \`gettext $'depth_2_dollar_bash_0_\Eab'\``
# Test format strings.
printf "`gettext 'User name: %s\nUser ID: %u'`"'\n' "$USER" `id -u`
# Test string concatenation.
gettext "concat_0_""part2"
@@ -1919,6 +1923,10 @@ msgstr ""
msgid "depth_2_dollar_bash_0_ab"
msgstr ""
#, sh-printf-format
msgid "User name: %s\\nUser ID: %u"
msgstr ""
msgid "concat_0_part2"
msgstr ""