New program 'spit'.

* gettext-tools/machine-translation/README: New file.
* gettext-tools/machine-translation/prototype/README: New file.
* gettext-tools/machine-translation/prototype/OllamaSpit.java: New file.
* gettext-tools/machine-translation/prototype/ollama-spit.c: New file.
* gettext-tools/machine-translation/prototype/ollama-spit.go: New file.
* gettext-tools/machine-translation/prototype/ollama-spit.py: New file.
* gettext-tools/machine-translation/prototype/ollama-spit.sh: New file.
* gettext-tools/configure.ac (INCJSON_C, LIBJSON_C, INCCURL, LIBCURL): New
variables.
(BUILD_SPIT_IN_C): New conditional.
(AC_CONFIG_FILES): Add src/spit.py.
* gettext-tools/src/country-table.h: New file.
* gettext-tools/src/country-table.c: New file.
* gettext-tools/src/spit.c: New file.
* gettext-tools/src/spit.py.in: New file.
* gettext-tools/src/FILES: Mention the new files.
* gettext-tools/src/Makefile.am (bin_PROGRAMS): Conditionally add 'spit'.
(noinst_SCRIPTS): New variable.
(noinst_HEADERS): Add country-table.h.
(spit_SOURCES, spit_CFLAGS, spit_LDADD, spit_DEPENDENCIES, spit_CPPFLAGS,
spit_LDFLAGS): New variables.
(install-exec-local): Conditionally install spit.py.
(installdirs-local, uninstall-local): Update accordingly.
(DISTCLEANFILES): Add spit.py.
* gettext-tools/po/POTFILES.in: Add spit.c.
* gettext-tools/man/spit.x: New file.
* gettext-tools/man/Makefile.am (man_aux): Add spit.x.
(man_MAN1SRC): Add spit.1.
(man_HTML): Add spit.1.html.
(spit.1, spit.1.html): Add dependencies.
* gettext-tools/doc/gettext.texi (Pretranslating): New chapter.
* gettext-tools/doc/spit.texi: New file.
* gettext-tools/doc/Makefile.am (gettext_TEXINFOS): Add it.
* DEPENDENCIES: Update URLs for libxml2. Add libjson-c, libcurl, Python, the
Python module 'requests'.
* PACKAGING: Mention the 'spit' program and its manual page.
* NEWS: Mention the change.
This commit is contained in:
Bruno Haible 2025-12-27 23:53:53 +01:00
parent 7b2f1337c6
commit f12d6634be
25 changed files with 3445 additions and 22 deletions

6
.gitignore vendored
View File

@ -15,6 +15,7 @@
/gettext-runtime/doc/Admin/jdom-1.0.jar
/gettext-runtime/doc/Admin/Matrix*.class
/gettext-runtime/doc/Admin/matrix.xml
/gettext-tools/machine-translation/prototype/OllamaSpit.class
# Files brought in by autopull.sh:
/gettext-tools/tree-sitter-*
@ -550,6 +551,8 @@
/gettext-tools/man/msguniq.1.html
/gettext-tools/man/recode-sr-latin.1
/gettext-tools/man/recode-sr-latin.1.html
/gettext-tools/man/spit.1
/gettext-tools/man/spit.1.html
/gettext-tools/man/xgettext.1
/gettext-tools/man/xgettext.1.html
/gettext-tools/po/gettext-tools.pot
@ -680,6 +683,7 @@ autom4te.cache/
/gettext-tools/po/Makefile
/gettext-tools/projects/Makefile
/gettext-tools/src/Makefile
/gettext-tools/src/spit.py
/gettext-tools/src/user-email
/gettext-tools/styles/Makefile
/gettext-tools/system-tests/Makefile
@ -792,6 +796,8 @@ autom4te.cache/
/gettext-tools/src/msguniq.exe
/gettext-tools/src/recode-sr-latin
/gettext-tools/src/recode-sr-latin.exe
/gettext-tools/src/spit
/gettext-tools/src/spit.exe
/gettext-tools/src/urlget
/gettext-tools/src/urlget.exe
/gettext-tools/src/xgettext

View File

@ -143,6 +143,8 @@ We assume that the following environment variables are set:
gettext-tools/src/msgunfmt.c
gettext-tools/src/msguniq.c
gettext-tools/src/recode-sr-latin.c
gettext-tools/src/spit.c
gettext-tools/src/spit.py.in
gettext-tools/src/urlget.c
gettext-tools/src/xgettext.c

View File

@ -37,15 +37,14 @@ The following packages should be installed before GNU gettext is installed
* libxml2
+ Recommended.
Needed for 'xgettext' and 'msgfmt', so that it can parse XML
files. Also needed for the --color option of the various
programs.
Needed for 'xgettext' and 'msgfmt', so that it can parse XML files.
Also needed for the --color option of the various programs.
If not present, a subset of libxml2 (included in this package) will be
compiled into libgettextlib.
+ Homepage:
http://xmlsoft.org/
https://gitlab.gnome.org/GNOME/libxml2/-/wikis/home
+ Download:
ftp://xmlsoft.org/libxml2/
https://download.gnome.org/sources/libxml2/
+ Pre-built package name:
- On Debian and Debian-based systems: libxml2-dev,
- On Red Hat distributions: libxml2-devel.
@ -53,6 +52,32 @@ The following packages should be installed before GNU gettext is installed
+ If it is installed in a nonstandard directory, pass the option
--with-libxml2-prefix=DIR to 'configure'.
* libjson-c
+ Recommended.
Needed for machine translation.
If not present, 'spit' will be a Python script instead of an executable.
+ Homepage:
https://github.com/json-c/json-c/wiki
+ Download:
https://s3.amazonaws.com/json-c_releases/releases/index.html
+ Pre-built package name:
- On Debian and Debian-based systems: libjson-c-dev,
- On Red Hat distributions: json-c-devel.
- Other: https://repology.org/project/json-c/versions
* libcurl
+ Recommended.
Needed for machine translation.
If not present, 'spit' will be a Python script instead of an executable.
+ Homepage:
https://curl.se/libcurl/
+ Download:
https://curl.se/download.html
+ Pre-built package name:
- On Debian and Debian-based systems: libcurl4-gnutls-dev or libcurl4-openssl-dev,
- On Red Hat distributions: libcurl-devel.
- Other: https://repology.org/project/curl/versions
* libacl
+ Recommended on Linux systems.
Needed so that the creation of backup files respects the access control
@ -66,18 +91,18 @@ The following packages should be installed before GNU gettext is installed
- On Red Hat distributions: acl, libacl-devel.
- Other: https://repology.org/project/acl/versions
* libattr
+ Recommended on Linux systems.
Needed so that the creation of backup files respects the access control
lists (ACLs) set on the original files, with fewer system calls.
+ Homepage:
https://savannah.nongnu.org/projects/attr/
+ Download:
https://download.savannah.nongnu.org/releases/attr/
+ Pre-built package name:
- On Debian and Debian-based systems: libattr1-dev,
- On Red Hat distributions: libattr-devel.
- Other: https://repology.org/project/attr/versions
* libattr
+ Recommended on Linux systems.
Needed so that the creation of backup files respects the access control
lists (ACLs) set on the original files, with fewer system calls.
+ Homepage:
https://savannah.nongnu.org/projects/attr/
+ Download:
https://download.savannah.nongnu.org/releases/attr/
+ Pre-built package name:
- On Debian and Debian-based systems: libattr1-dev,
- On Red Hat distributions: libattr-devel.
- Other: https://repology.org/project/attr/versions
* A Java runtime and compiler (e.g. OpenJDK, AdoptOpenJDK, or kaffe).
+ Recommended.
@ -266,6 +291,29 @@ The following packages should be installed when GNU gettext is installed
+ Download:
https://ftp.gnu.org/gnu/gnulib/gnulib-l10n-*
* Python 3.7 or newer.
+ Recommended if GNU Gettext was built without libjson-c or without libcurl.
Needed for machine translation.
+ Homepage:
https://www.python.org/
+ Download:
https://www.python.org/downloads/
+ Pre-built package name:
- On Debian and Debian-based systems: python3,
- On Red Hat distributions: python3.
- Other: https://repology.org/project/python/versions
* The Python module 'requests'.
+ Recommended if GNU Gettext was built without libjson-c or without libcurl.
Needed for machine translation.
+ Homepage:
https://pypi.org/project/requests/
+ Download:
https://pypi.org/project/requests/#files
+ Pre-built package name:
- On Debian and Debian-based systems: python3-requests,
- On Red Hat distributions: python-requests.
- Other: https://repology.org/project/python%3Arequests/versions
The following should be installed when GNU gettext is built, but are not
needed later, once it is installed (build dependencies, but not runtime

7
NEWS
View File

@ -1,4 +1,4 @@
Version 1.0 - October 2025
Version 1.0 - December 2025
# Improvements for maintainers and distributors:
* In a po/ directory, the PO files are now exactly those that the
@ -36,6 +36,11 @@ Version 1.0 - October 2025
POT file, like 'msgmerge' would do. Previously, 'msginit' failed with
an error message in this situation.
* Pretranslation:
- A new program 'spit' is provided, that implements machine translation
through a locally installed Large Language Model (LLM).
- The documentation has a new chapter "Pretranslation".
# Programming languages support:
* OCaml:
- xgettext now supports OCaml.

View File

@ -139,16 +139,19 @@ the following file list.
$prefix/bin/autopoint
$prefix/bin/po-fetch
$prefix/bin/recode*
$prefix/bin/spit
$prefix/share/man/man1/msg*.1
$prefix/share/man/man1/xgettext.1
$prefix/share/man/man1/gettextize.1
$prefix/share/man/man1/autopoint.1
$prefix/share/man/man1/recode*.1
$prefix/share/man/man1/spit.1
$prefix/share/doc/gettext/msg*.1.html
$prefix/share/doc/gettext/xgettext.1.html
$prefix/share/doc/gettext/gettextize.1.html
$prefix/share/doc/gettext/autopoint.1.html
$prefix/share/doc/gettext/recode*.1.html
$prefix/share/doc/gettext/spit.1.html
$prefix/share/doc/gettext/gettext_*.html
$prefix/share/doc/gettext/FAQ.html
$prefix/share/doc/gettext/tutorial.html

View File

@ -477,6 +477,90 @@ AS_IF([test "$MODULA2_CHOICE" != no],
])
AC_SUBST([BUILDMODULA2])
dnl Check for the libjson-c library.
dnl Set INCJSON_C to the -I option for the include files.
dnl Set LIBJSON_C to the -L and -l options for the library.
AC_MSG_CHECKING([for the libjson-c library])
LIBJSON_C=?
AC_COMPILE_IFELSE(
[AC_LANG_PROGRAM([[#include <json-c/json_c_version.h>]], [[]])],
[],
[dnl The include files are not present.
LIBJSON_C=
])
if test "$LIBJSON_C" = "?"; then
save_LIBS="$LIBS"
LIBS="$LIBS -ljson-c"
AC_LINK_IFELSE(
[AC_LANG_PROGRAM(
[[#include <json-c/json_c_version.h>]],
[[return json_c_version () [0] != '0';]])
],
[LIBJSON_C="-ljson-c"],
[dnl The library is not present.
LIBJSON_C=
])
LIBS="$save_LIBS"
fi
dnl Find the location of the include files <json-c/> subdirectory.
INCJSON_C=
if test -n "$LIBJSON_C"; then
gl_ABSOLUTE_HEADER_ONE([json-c/json_c_version.h])
if test -n "$gl_cv_absolute_json_c_json_c_version_h"; then
INCJSON_C="-I"`echo "$gl_cv_absolute_json_c_json_c_version_h" | sed -e 's|.json_c_version[.]h$||'`
else
dnl Could not find the include files <json-c/> subdirectory.
LIBJSON_C=
fi
fi
AC_SUBST([INCJSON_C])
AC_SUBST([LIBJSON_C])
if test -n "$LIBJSON_C"; then
gt_libjson_c_found=yes
else
gt_libjson_c_found=no
fi
AC_MSG_RESULT([$gt_libjson_c_found])
dnl Check for the libcurl library.
dnl Set INCCURL to the -I option for the include files.
dnl Set LIBCURL to the -L and -l options for the library.
dnl TODO: Support linking with libcurl.a and its many dependencies.
AC_MSG_CHECKING([for the libcurl library])
INCCURL=
LIBCURL=?
AC_COMPILE_IFELSE(
[AC_LANG_PROGRAM([[#include <curl/curlver.h>]], [[]])],
[],
[dnl The include files are not present.
LIBCURL=
])
if test "$LIBCURL" = "?"; then
save_LIBS="$LIBS"
LIBS="$LIBS -lcurl"
AC_LINK_IFELSE(
[AC_LANG_PROGRAM(
[[#include <curl/curl.h>]],
[[curl_global_init (CURL_GLOBAL_DEFAULT);]])
],
[LIBCURL="-lcurl"],
[dnl The library is not present.
LIBCURL=
])
LIBS="$save_LIBS"
fi
AC_SUBST([INCCURL])
AC_SUBST([LIBCURL])
if test -n "$LIBCURL"; then
gt_libcurl_found=yes
else
gt_libcurl_found=no
fi
AC_MSG_RESULT([$gt_libcurl_found])
dnl Check in which form to install the 'spit' program.
AM_CONDITIONAL([BUILD_SPIT_IN_C], [test -n "$LIBJSON_C" && test -n "$LIBCURL"])
dnl Check for Emacs and where to install .elc files.
dnl Sometimes Emacs is badly installed. Allow the user to work around it.
AC_ARG_WITH([emacs],
@ -737,6 +821,7 @@ AC_CONFIG_FILES([libgrep/gnulib-lib/Makefile])
AC_CONFIG_FILES([src/Makefile])
AC_CONFIG_FILES([src/user-email:src/user-email.sh.in])
AC_CONFIG_FILES([src/spit.py], [chmod a+x src/spit.py])
AC_CONFIG_FILES([libgettextpo/Makefile])
AC_CONFIG_FILES([libgettextpo/exported.sh])

View File

@ -45,6 +45,7 @@ gettext_TEXINFOS = \
xgettext.texi \
msginit.texi \
msgmerge.texi \
spit.texi \
msgcat.texi \
msgconv.texi \
msggrep.texi \

View File

@ -169,6 +169,7 @@ version @value{VERSION}.
* Template:: Making the PO Template File
* Creating:: Creating a New PO File
* Updating:: Updating Existing PO Files
* Pretranslating:: Pretranslating PO Files
* Editing:: Editing PO Files
* Manipulating:: Manipulating PO Files
* Binaries:: Producing Binary MO Files
@ -254,6 +255,11 @@ Updating Existing PO Files
* msgmerge Invocation:: Invoking the @code{msgmerge} Program
Pretranslating PO Files
* Installing an LLM:: Installing a Large Language Model
* spit Invocation:: Invoking the @code{spit} Program
Editing PO Files
* Web based localization:: Web-based PO editing
@ -4035,6 +4041,83 @@ format of the plural forms field is described in @ref{Plural forms} and
@include msgmerge.texi
@node Pretranslating
@chapter Pretranslating PO Files
@cindex Pretranslating PO Files
@cindex Machine translation
As a translator,
you may save yourself some work
by starting from a reasonably good translation produced by a machine,
and modify that translation, to make it perfect.
Thus, before you start working on a translation,
you might have the PO file @emph{pretranslated}.
This process, also called @emph{machine translation},
is nowadays best performed through a Large Language Model (LLM).
(See @url{https://en.wikipedia.org/wiki/Machine_translation#Neural_MT},
@url{https://en.wikipedia.org/wiki/Neural_machine_translation#Generative_LLMs}.)
@node Installing an LLM
@section Installing a Large Language Model
We don't recommend to use machine translation
through a web service in the cloud, controlled by someone else than yourself.
Such a machine translation service would be have major drawbacks
(it could go away any time,
it could be used to spy on you or manipulate you,
or the costs could go up beyond your control);
see @url{https://www.gnu.org/philosophy/who-does-that-server-really-serve.en.html}.
Additionally, such a service typically has some cost
(between $10 and $25 per megabyte, as of 2025).
Instead, we recommend a Large Language Model execution engine
that runs on hardware under your control.
This can be a desktop computer,
or for instance a single-board computer in your local network.
@cindex ollama
At this point (in 2025),
a Large Language Model execution engine that is Free Software is
@samp{ollama}, that can be downloaded from @url{https://ollama.com/}.
Together with an LLM of reasonable quality,
such as the model @code{ministral-3:14b},
the system requirements are as follows:
@itemize @bullet
@item
RAM: 16 GB.
@item
Disk space: 10 GB (1 GB for @code{ollama}, 9 GB for the model).
@item
GPU or TPU: A GPU (Graphics Processing Unit) or TPU (Tensor Processing Unit)
is not needed.
As of 2025, a high-end GPU from certain vendors
can be used by @code{ollama} to provide an optional speedup.
@end itemize
Additional configuration:
@itemize @bullet
@item
If you are running @code{ollama} on your computer directly,
no further configuration is needed.
@item
If you are running @code{ollama} on a separate machine,
and want to make it accessible from all machines in the LAN:
Edit the file @file{/etc/systemd/system/ollama.services},
adding a line: @code{Environment="OLLAMA_HOST=0.0.0.0"}.
See @url{https://github.com/ollama/ollama/issues/703}.
@item
If you are running @code{ollama} in a virtual machine,
make the port 11434 accessible through port forwarding.
@end itemize
@node spit Invocation
@section Invoking the @code{spit} Program
@include spit.texi
@node Editing
@chapter Editing PO Files
@cindex Editing PO Files

108
gettext-tools/doc/spit.texi Normal file
View File

@ -0,0 +1,108 @@
@c This file is part of the GNU gettext manual.
@c Copyright (C) 2025 Free Software Foundation, Inc.
@c See the file gettext.texi for copying conditions.
@pindex spit
@cindex @code{spit} program, usage
@example
spit [@var{option}...]
@end example
@cindex query a Large Language Model
@cindex translate through a Large Language Model
The @code{spit} program
passes its input to a Large Language Model (LLM) instance
and prints the response.
With the @code{--to} option,
it translates its input to the specified language
through a Large Language Model (LLM) and prints the translation.
@strong{Warning:} The output might not be what you expect.
It might be of the wrong form, be of poor quality, or reflect some biases.
@subsection Large Language Model (LLM) options
@table @samp
@item --species=@var{type}
@opindex --species@r{, @code{spit} option}
Specifies the type of Large Language Model execution engine.
The default and only valid value is @code{ollama}.
@item --url=@var{url}
@opindex --url@r{, @code{spit} option}
Specifies the URL of the server that runs Large Language Model execution engine.
For @code{ollama}, the default is @code{http://localhost:11434}.
@item -m @var{model}
@itemx --model=@var{model}
@opindex -m@r{, @code{spit} option}
@opindex --model@r{, @code{spit} option}
Specifies the model to use.
This option is mandatory; no default exists.
The specified model must
already be installed in the Large Language Model execution engine.
@item --to=@var{language}
@opindex --to@r{, @code{spit} option}
Specifies the target language.
@var{language} may be specified
as an ISO 639 language code (such as @code{fr} for French),
as a combination of an ISO 639 language code and an ISO 3166 country code
(such as @code{fr_CA} for French in Canada,
or @code{zh_TW} for traditional Chinese),
or as the English name of a language (such as @code{French}).
The effect of this option is to
add a prompt similar to "Translate to @var{language}:".
@item --prompt=@var{text}
@opindex --prompt@r{, @code{spit} option}
Specifies the prompt to use before the input that comes from standard input.
It allows you to specify extra instructions for the LLM.
This option overrides the @code{--to} option.
@item --postprocess=@var{command}
@opindex --postprocess@r{, @code{spit} option}
Specifies a command to post-process the output.
This should be a Bourne shell command
that reads from standard input and writes to standard output.
For instance, the @code{ministral-3:14b} model
often emphasizes part of the output with @samp{**} characters.
To eliminate these markers,
you could use the command @samp{sed -e 's/[*][*]//g'}.
@end table
@subsection Informative output
@table @samp
@item -h
@itemx --help
@opindex -h@r{, @code{spit} option}
@opindex --help@r{, @code{spit} option}
Display this help and exit.
@item -V
@itemx --version
@opindex -V@r{, @code{spit} option}
@opindex --version@r{, @code{spit} option}
Output version information and exit.
@end table
@subsection Examples
Machine translation of a single sentence:
@smallexample
$ echo 'Translate into German: "Welcome to the GNU project!"' \
| spit --model=ministral-3:14b \
--postprocess="sed -e 's/[*][*]//g'"
"Willkommen zum GNU-Projekt!"
@end smallexample
@noindent
The perfect translation would be @code{"Willkommen beim GNU-Projekt!"}.
You can see: some manual adjustment after the machine translation is needed.

View File

@ -0,0 +1,62 @@
This directory contains programs for interfacing to machine translation tools.
Which types of machine translation tools to support?
* We don't support machine translation through web services in the cloud,
because
- They are not under the control of the user (the famous SaaS problem:
<https://www.gnu.org/philosophy/who-does-that-server-really-serve.en.html>).
- They typically have some cost (between $10 and $25 per megabyte,
as of 2025).
* We therefore only support locally running machine translation engines.
Which kinds of machine translation tools to support?
* Large Language Models (LLMs) produce good quality translations nowadays
(December 2025), at least regarding French and German.
With 'ollama' there exists an execution engine (so-called "inference engine")
that does not require a graphics card; it can run directly on a CPU.
'llama.cpp' and 'KoboldCpp' seem to be roughly equivalent to 'ollama',
therefore here we focus on 'ollama'.
* The Argos Translate tools, which are based on neural networks but not LLMs,
https://github.com/argosopentech/argos-translate
https://www.argosopentech.com/
work:
$ ./argos-translate --from-lang en --to-lang de "You are lucky"
Du hast Glück
and would be theoretically acceptable, but they have two problems:
- Unclear copyright of some of the language models
<https://github.com/argosopentech/argos-translate/issues/507>,
- Translation quality not as good as the one from LLMs.
Prerequisite:
* ollama
Homepage: https://ollama.com/
Source code: https://github.com/ollama/ollama
Binaries exist for various platforms:
- for GNU/Linux, macOS, Windows: https://ollama.com/download
- for various distributions: https://repology.org/project/ollama/versions
- for GNU Guix: https://codeberg.org/tusharhero/ollama-guix
- for Ubuntu: https://snapcraft.io/ollama
System requirements (for ollama with model 'ministral-3:14b'):
- 16 GB RAM,
- 10 GB disk space (1 GB for ollama, 9 GB for the model).
Configuration:
- If you have it on a separate machine, and want to make it accessible
from all machines in the LAN:
Edit /etc/systemd/system/ollama.services :
Add a line: Environment="OLLAMA_HOST=0.0.0.0"
Cf. <https://github.com/ollama/ollama/issues/703>.
- If you have it running in a virtual machine, make the port 11434
accessible through port forwarding.
Programs:
* spit
is an extension of 'ollama-spit', with an option --to=LANGUAGE for
machine translation of a single input.
* msgpre
applies machine translation to the untranslated (and optionally, fuzzy)
messages of a PO file.

View File

@ -0,0 +1,228 @@
/*
* Copyright (C) 2025 Free Software Foundation, Inc.
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <https://www.gnu.org/licenses/>.
*
* Written by Bruno Haible <bruno@clisp.org>, 2025.
*/
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.net.URI;
// Documentation:
// https://docs.oracle.com/en/java/javase/11/docs/api/java.net.http/java/net/http/package-summary.html
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
// Documentation:
// https://google.github.io/gson/UserGuide.html
// https://www.javadoc.io/doc/com.google.code.gson/gson/2.8.0/com/google/gson/Gson.html
import com.google.gson.JsonElement;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
/*
* This program passes an input to an ollama instance and prints the response.
*/
public class OllamaSpit {
private static void usage () {
System.out.println("Usage: spit [OPTION...]");
System.out.println();
System.out.println("Passes standard input to an ollama instance and prints the response.");
System.out.println();
System.out.println("Options:");
System.out.println(" --url Specifies the ollama server's URL.");
System.out.println(" --model Specifies the model to use.");
System.out.println();
System.out.println("Informative output:");
System.out.println();
System.out.println(" --help Show this help text.");
}
public static void main (String[] args) throws IOException, InterruptedException {
// Command-line option processing.
String url = "http://localhost:11434";
String model = null;
{
boolean do_help = false;
int i;
for (i = 0; i < args.length; i++) {
String arg = args[i];
if (arg.equals("--url") || arg.equals("--ur") || arg.equals("--u")) {
i++;
if (i == args.length) {
System.err.println("Spit: missing argument for --url");
System.exit(1);
}
url = args[i];
} else if (arg.startsWith("--url=")) {
url = arg.substring(6);
} else if (arg.equals("--model") || arg.equals("--mode") || arg.equals("--mod") || arg.equals("--mo") || arg.equals("--m")) {
i++;
if (i == args.length) {
System.err.println("Spit: missing argument for --model");
System.exit(1);
}
model = args[i];
} else if (arg.startsWith("--model=")) {
model = arg.substring(8);
} else if (arg.equals("--help")) {
do_help = true;
} else if (arg.equals("--")) {
// Stop option processing
i++;
break;
} else if (arg.startsWith("-")) {
System.err.println("Spit: unknown option " + arg);
System.err.println("Try 'Spit --help' for more information.");
System.exit(1);
} else
break;
}
if (do_help) {
usage();
System.exit(0);
}
if (i < args.length) {
System.err.println("Spit: too many arguments");
System.err.println("Try 'Spit --help' for more information.");
System.exit(1);
}
}
if (model == null) {
System.err.println("Spit: missing --model option");
System.exit(1);
}
// Sanitize URL.
if (!url.endsWith("/"))
url = url + "/";
// Read the contents of standard input.
String input = new String(System.in.readAllBytes(), StandardCharsets.UTF_8);
// Documentation of the ollama API:
// <https://docs.ollama.com/api/generate>
// Compose the payload.
JsonObject payload = new JsonObject();
payload.addProperty("model", model);
payload.addProperty("prompt", input);
String payloadAsString = payload.toString();
// Make the request to the ollama server.
HttpClient client = HttpClient.newHttpClient();
HttpRequest request =
HttpRequest.newBuilder(URI.create(url+"api/generate"))
.POST(HttpRequest.BodyPublishers.ofString(payloadAsString))
.build();
HttpResponse<InputStream> response =
client.send(request, HttpResponse.BodyHandlers.ofInputStream());
InputStream responseStream = response.body();
if (response.statusCode() != 200) {
System.err.println("Status: "+response.statusCode());
}
if (response.statusCode() >= 400) {
System.err.print("Body: ");
responseStream.transferTo(System.err);
System.err.println();
System.exit(1);
}
int bufSize = 4096;
byte[] buffer = new byte[bufSize];
int bufStart = 0;
int bufEnd = 0;
for (;;) {
// Read a line when available.
// But don't force reading more than one line.
String line;
{
// Boy, this is complicated code. There should really be a helper
// method on InputStream for this purpose.
int i;
for (i = bufStart; i < bufEnd; i++) {
if (buffer[i] == (byte)'\n')
break;
}
for (;;) {
if (i < bufEnd) {
// An entire line in buffer.
i++;
line = new String(buffer, bufStart, i-bufStart, StandardCharsets.UTF_8);
bufStart = i;
break;
}
// We need to read something from the stream.
int avail = responseStream.available();
if (avail == 0)
// If nothing is available, read 1 byte, even if that needs to block.
avail = 1;
if ((bufEnd-bufStart) + avail > bufSize) {
// Grow the buffer.
int newBufSize = (bufEnd-bufStart) + avail;
if (newBufSize < 2*bufSize)
newBufSize = 2*bufSize;
byte[] newBuffer = new byte[newBufSize];
System.arraycopy(buffer, bufStart, newBuffer, 0, bufEnd-bufStart);
bufSize = newBufSize;
buffer = newBuffer;
bufEnd = bufEnd-bufStart;
bufStart = 0;
} else {
// We can keep the buffer, but may need to move the contents to the
// front.
if (bufEnd + avail > bufSize) {
System.arraycopy(buffer, bufStart, buffer, 0, bufEnd-bufStart);
bufEnd = bufEnd-bufStart;
bufStart = 0;
}
}
// Now bufEnd + avail <= bufSize.
int nBytesRead = responseStream.readNBytes(buffer, bufEnd, avail);
if (nBytesRead == 0) {
// We're at EOF.
if (bufEnd == bufStart)
return;
// Convert the last line (without terminating newline).
line = new String(buffer, bufStart, bufEnd-bufStart, StandardCharsets.UTF_8);
break;
}
// We have read at least one byte. Look if we have a complete line now.
int oldBufEnd = bufEnd;
bufEnd += nBytesRead;
for (i = oldBufEnd; i < bufEnd; i++) {
if (buffer[i] == (byte)'\n')
break;
}
}
}
// We have a line now. It should contain a single JSON object.
JsonElement part = JsonParser.parseString(line);
if (part.isJsonObject()) {
System.out.print(((JsonObject)part).get("response").getAsString());
System.out.flush();
}
}
}
}
/*
* Local Variables:
* compile-command: "javac -d . -cp /usr/share/java/gson.jar OllamaSpit.java"
* run-command: "echo 'Translate into German: "Welcome to the GNU project!"' | java -cp .:/usr/share/java/gson.jar OllamaSpit --model=ministral-3:14b"
* End:
*/

View File

@ -0,0 +1,89 @@
This directory contains a prototype program 'ollama-spit'
that passes an input to an LLM and prints the output.
Since it implements an HTTP client with some JSON processing, it
needs the following dependencies, depending on the programming language:
Language HTTP client JSON library
C libcurl libjson-c
C++ httplib.h json.hpp
Python request json (built-in)
Java java.net.http.HttpClient Gson
(built-in)
Go net/http encoding/json
(built-in) (built-in)
Shell curl jq
Example:
$ echo 'Translate into German: "Welcome to the GNU project!"' | ollama-spit --model=ministral-3:14b
"Willkommen zum GNU-Projekt!"
More elaborate code of the same kind, in various programming languages,
can be found at
<https://github.com/ollama/ollama?tab=readme-ov-file#libraries-1>.
The programs behave identically, except for the Shell implementation that
has bad error handling.
Discussion of the other implementations:
* Deployment considerations:
The C program depends on libcurl and libjson-c. libcurl is big (> 700 KB
binary code size, with lots of further dependencies shown by 'ldd').
But libcurl and libjson-c are the de-facto standards in the C ecosystem:
in Debian, more than 250 packages depend on libcurl4t64,
and more than 100 packages depend on libjson-c5.
(libjansson, which is sometimes considered as an alternative to libjson-c,
is too buggy for serious use:
<https://github.com/akheron/jansson/issues/642>,
<https://github.com/akheron/jansson/issues/475>.)
The C++ program depends on
https://github.com/yhirose/cpp-httplib
https://github.com/nlohmann/json
which are header-only C++ libraries, not widely used in the C++ ecosystem.
The compiled executable has > 500 KB binary code size, but no dependency
to shared libraries other than libstdc++ and libc.
The Python program depends on the 'requests' module, that is widely used
in the Python ecosystem. The user may need to install it via
$ pip install requests
The Java program depends on the Gson library. It is widely used in the Java
ecosystem. The user may need to fetch it from
https://mvnrepository.com/artifact/com.google.code.gson/gson
The Go program would need to depend on the 'pflag' module, that is widely
used on the Go ecosystem. <https://pkg.go.dev/github.com/spf13/pflag>
In Debian, it would also
- either depend on libgo, the Go runtime library, which is 60 MB large,
- or be statically linked, and be more than 25 MB large.
* Development considerations:
The C program is a bit clumsy, because libcurl does not provide an easy
way to have separate code paths for the success case and for the error case.
The C++ program takes 13 seconds to compile with g++.
The Python program has no issues.
The Java program has no issues.
The Go program is a pain to develop and maintain, because the Go authors put
language minimality above developer needs
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123293>.
As a consequence, we will use
- primarily a C implementation and
- as fallback for users who don't like to install many packages before
building GNU gettext: a Python implementation.
(Why Python, not Java?
1. Because it does not need a Java compiler during "make && make install",
2. Because Python is more popular and thus more likely to be maintained.)

View File

@ -0,0 +1,440 @@
/*
* Copyright (C) 2025 Free Software Foundation, Inc.
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <https://www.gnu.org/licenses/>.
*
* Written by Bruno Haible <bruno@clisp.org>, 2025.
*/
/*
* This program passes an input to an ollama instance and prints the response.
*/
#include <getopt.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* We use JSON-C.
It is multithread-safe, as long as we don't use any of the json_global_*
API functions, nor the json_util_get_last_err function.
Documentation:
<http://json-c.github.io/json-c/json-c-0.18/doc/html/index.html#using> */
#include <json.h>
/* We use libcurl.
Documentation:
<https://curl.se/libcurl/c/> */
#include <curl/curl.h>
static void
xalloc_die ()
{
fprintf (stderr, "spit: out of memory\n");
exit (EXIT_FAILURE);
}
static void
curl_die ()
{
fprintf (stderr, "spit: curl error\n");
exit (EXIT_FAILURE);
}
static void
process_response_line (const char *line)
{
/* Note: As of json-c version 0.15, the jerrno value is unreliable.
See <https://github.com/json-c/json-c/issues/853>
and <https://github.com/json-c/json-c/issues/854>.
json-c version 0.17 introduces json_tokener_error_memory, but its value
changes in version 0.18. */
struct json_object *j;
#if JSON_C_MAJOR_VERSION > 0 || (JSON_C_MAJOR_VERSION == 0 && JSON_C_MINOR_VERSION >= 18)
enum json_tokener_error jerrno = json_tokener_error_memory;
j = json_tokener_parse_verbose (line, &jerrno);
if (j == NULL && jerrno == json_tokener_error_memory)
xalloc_die ();
#else
j = json_tokener_parse (line);
#endif
/* Ignore an empty line. */
if (j != NULL)
{
/* We expect a JSON object. */
if (json_object_is_type (j, json_type_object))
{
/* Output its "response" property. */
const char *prop =
json_object_get_string(json_object_object_get (j, "response"));
if (prop != NULL)
{
fputs (prop, stdout);
fflush (stdout);
}
}
}
}
/* A libcurl header callback that determines, during curl_easy_perform,
whether the HTTP request returned an error. */
static size_t
my_header_callback (char *buffer, size_t one, size_t n, void *userdata)
{
bool *is_error_p = (bool *) userdata;
#if DEBUG
fprintf (stderr, "in my_header_callback: buffer = %.*s\n", (int) n, buffer);
#endif
if (n >= 5 && memcmp (buffer, "HTTP/", 5) == 0)
{
/* buffer contains a line of the form "HTTP/1.1 code description".
Extract the code. */
char old = buffer[n - 1];
buffer[n - 1] = '\0';
int code;
if (sscanf (buffer, "%*s %d\n", &code) == 1)
{
if (code >= 400)
*is_error_p = true;
}
buffer[n - 1] = old;
}
return n;
}
struct my_write_locals
{
bool is_error;
char *body;
size_t body_allocated;
size_t body_start;
size_t body_end;
};
/* Makes room for n more bytes in l->body. */
static void
my_write_grow (struct my_write_locals *l, size_t n)
{
if ((l->body_end - l->body_start) + n > l->body_allocated)
{
/* Grow the buffer. */
size_t new_allocated = (l->body_end - l->body_start) + n;
if (new_allocated < 2 * l->body_allocated)
new_allocated = 2 * l->body_allocated;
if (new_allocated < 1024)
new_allocated = 1024;
char *new_body;
if (l->body_start == 0 && l->body_end > 0)
{
new_body = (char *) realloc (l->body, new_allocated);
if (new_body == NULL)
xalloc_die ();
}
else
{
new_body = (char *) malloc (new_allocated);
if (new_body == NULL)
xalloc_die ();
memcpy (new_body, l->body + l->body_start, l->body_end - l->body_start);
free (l->body);
l->body_end = l->body_end - l->body_start;
l->body_start = 0;
}
l->body = new_body;
l->body_allocated = new_allocated;
}
else
{
/* We can keep the buffer, but may need to move the contents to the
front. */
if (l->body_end + n > l->body_allocated)
{
memmove (l->body, l->body + l->body_start, l->body_end - l->body_start);
l->body_end = l->body_end - l->body_start;
l->body_start = 0;
}
}
/* Here l->body_end + n <= l->body_allocated. */
}
/* A libcurl write callback that processes a piece of response body,
depending on whether the HTTP request returned an error. */
static size_t
my_write_callback (char *buffer, size_t one, size_t n, void *userdata)
{
struct my_write_locals *l = (struct my_write_locals *) userdata;
#if DEBUG
fprintf (stderr, "in my_write_callback: buffer = %.*s\n", (int) n, buffer);
#endif
/* Append the buffer's contents to the body. */
my_write_grow (l, n);
memcpy (l->body + l->body_end, buffer, n);
l->body_end += n;
if (!l->is_error)
{
/* Process entire lines that are in the buffer. */
char *newline = (char *) memchr (l->body + l->body_end - n, '\n', n);
while (newline != NULL)
{
/* We have an entire line. */
*newline = '\0';
process_response_line (l->body + l->body_start);
l->body_start = (newline + 1) - l->body;
newline = (char *) memchr (l->body + l->body_start, '\n',
l->body_end - l->body_start);
}
}
return n;
}
static void
usage ()
{
printf ("%s", "Usage: spit [OPTION...]\n\
\n\
Passes standard input to an ollama instance and prints the response.\n\
\n\
Options:\n\
--url Specifies the ollama server's URL.\n\
--model Specifies the model to use.\n\
\n\
Informative output:\n\
\n\
--help Show this help text.\n");
}
int
main (int argc, char *argv[])
{
/* Command-line option processing. */
const char *url = "http://localhost:11434";
const char *model = NULL;
bool do_help = false;
static const struct option long_options[] =
{
{ "url", required_argument, NULL, CHAR_MAX + 2 },
{ "model", required_argument, NULL, CHAR_MAX + 3 },
{ "help", no_argument, NULL, CHAR_MAX + 1 },
{ NULL, 0, NULL, 0 }
};
{
int optc;
while ((optc = getopt_long (argc, argv, "", long_options, NULL)) != EOF)
switch (optc)
{
case '\0': /* Long option. */
break;
case CHAR_MAX + 1: /* --help */
do_help = true;
break;
case CHAR_MAX + 2: /* --url */
url = optarg;
break;
case CHAR_MAX + 3: /* --model */
model = optarg;
break;
default:
fprintf (stderr, "Try 'spit --help' for more information.\n");
exit (EXIT_FAILURE);
}
}
if (do_help)
{
usage ();
exit (EXIT_SUCCESS);
}
if (argc > optind)
{
fprintf (stderr, "spit: too many arguments\n");
fprintf (stderr, "Try 'spit --help' for more information.\n");
exit (EXIT_FAILURE);
}
if (model == NULL)
{
fprintf (stderr, "spit: missing --model option\n");
exit (EXIT_FAILURE);
}
/* Sanitize URL. */
if (!(strlen (url) > 0 && url[strlen (url) - 1] == '/'))
{
char *new_url = malloc (strlen (url) + 1 + 1);
if (new_url == NULL)
xalloc_die ();
sprintf (new_url, "%s/", url);
url = new_url;
}
/* Read the contents of standard input. */
char *input = NULL;
size_t input_allocated = 0;
size_t input_length = 0;
for (;;)
{
int c = fgetc (stdin);
if (c == EOF)
break;
if (input_length >= input_allocated)
{
size_t new_allocated = 2 * input_allocated + 1;
char *new_input = (char *) realloc (input, new_allocated);
if (new_input == NULL)
xalloc_die ();
input = new_input;
input_allocated = new_allocated;
}
input[input_length++] = c;
}
/* Documentation of the ollama API:
<https://docs.ollama.com/api/generate> */
/* Compose the payload. */
struct json_object *payload = json_object_new_object ();
if (payload == NULL)
xalloc_die ();
{
struct json_object *value = json_object_new_string (model);
if (value == NULL)
xalloc_die ();
if (json_object_object_add (payload, "model", value))
xalloc_die ();
}
{
struct json_object *value = json_object_new_string (input);
if (value == NULL)
xalloc_die ();
if (json_object_object_add (payload, "prompt", value))
xalloc_die ();
}
const char *payload_as_string =
json_object_to_json_string_ext (payload, JSON_C_TO_STRING_PLAIN
| JSON_C_TO_STRING_NOSLASHESCAPE);
if (payload_as_string == NULL)
xalloc_die ();
/* Make the request to the ollama server. */
if (curl_global_init (CURL_GLOBAL_DEFAULT))
curl_die ();
CURL *curl = curl_easy_init ();
if (!curl)
curl_die ();
{
char *target_url = malloc (strlen (url) + 12 + 1);
if (target_url == NULL)
xalloc_die ();
sprintf (target_url, "%sapi/generate", url);
/* Documentation: <https://curl.se/libcurl/c/CURLOPT_URL.html> */
curl_easy_setopt (curl, CURLOPT_URL, target_url);
}
/* Documentation: <https://curl.se/libcurl/c/CURLOPT_POST.html> */
curl_easy_setopt (curl, CURLOPT_POST, 1L);
{
struct curl_slist *headers = NULL;
/* Override the Content-Type header set by CURLOPT_POST. */
headers = curl_slist_append(headers, "Content-Type: " "application/json");
/* Documentation: <https://curl.se/libcurl/c/CURLOPT_HTTPHEADER.html> */
curl_easy_setopt (curl, CURLOPT_HTTPHEADER, headers);
}
/* Set the payload.
Documentation: <https://curl.se/libcurl/c/CURLOPT_POSTFIELDS.html> */
curl_easy_setopt(curl, CURLOPT_POSTFIELDS, payload_as_string);
/* Documentation: <https://curl.se/libcurl/c/CURLOPT_NOPROGRESS.html> */
curl_easy_setopt (curl, CURLOPT_NOPROGRESS, 1L);
#if DEBUG > 1
/* For debugging: */
/* Documentation: <https://curl.se/libcurl/c/CURLOPT_VERBOSE.html> */
curl_easy_setopt (curl, CURLOPT_VERBOSE, 1L);
#endif
#if 0
/* Not reliable, see <https://curl.se/libcurl/c/CURLOPT_FAILONERROR.html>. */
curl_easy_setopt (curl, CURLOPT_FAILONERROR, 1L);
#endif
struct my_write_locals locals;
locals.is_error = false;
locals.body = NULL;
locals.body_allocated = 0;
locals.body_start = 0;
locals.body_end = 0;
/* Documentation:
<https://curl.se/libcurl/c/CURLOPT_HEADERFUNCTION.html>
<https://curl.se/libcurl/c/CURLOPT_HEADERDATA.html> */
curl_easy_setopt (curl, CURLOPT_HEADERFUNCTION, my_header_callback);
curl_easy_setopt (curl, CURLOPT_HEADERDATA, &locals.is_error);
/* Documentation:
<https://curl.se/libcurl/c/CURLOPT_WRITEFUNCTION.html>
<https://curl.se/libcurl/c/CURLOPT_WRITEDATA.html> */
curl_easy_setopt (curl, CURLOPT_WRITEFUNCTION, my_write_callback);
curl_easy_setopt (curl, CURLOPT_WRITEDATA, &locals);
/* Documentation: <https://curl.se/libcurl/c/curl_easy_perform.html> */
CURLcode ret = curl_easy_perform (curl);
if (ret != CURLE_OK)
{
fprintf (stderr, "spit: curl error %u: %s\n", ret,
/* Documentation: <https://curl.se/libcurl/c/curl_easy_strerror.html> */
curl_easy_strerror (ret));
exit (EXIT_FAILURE);
}
/* Documentation: <https://curl.se/libcurl/c/CURLINFO_RESPONSE_CODE.html> */
long status_code;
curl_easy_getinfo (curl, CURLINFO_RESPONSE_CODE, &status_code);
if (status_code != 200)
fprintf (stderr, "Status: %ld\n", status_code);
if (locals.is_error != (status_code >= 400))
/* The my_header_callback did not work right. */
abort ();
if (status_code >= 400)
{
fprintf (stderr, "Body: ");
fwrite (locals.body + locals.body_start,
1, locals.body_end - locals.body_start,
stderr);
fprintf (stderr, "\n");
exit (1);
}
/* Most lines have already been processed through my_write_callback.
Now process the last line (without terminating newline). */
if (locals.body_end > locals.body_start)
{
my_write_grow (&locals, 1);
*(locals.body + locals.body_end) = '\0';
process_response_line (locals.body + locals.body_start);
}
return 0;
}
/*
* Local Variables:
* compile-command: "gcc -Wall -I/usr/include/json-c -O2 -o ollama-spit ollama-spit.c -lcurl -ljson-c"
* run-command: "echo 'Translate into German: "Welcome to the GNU project!"' | ./ollama-spit --model=ministral-3:14b"
* End:
*/

View File

@ -0,0 +1,164 @@
//
// Copyright (C) 2025 Free Software Foundation, Inc.
//
// This program is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with this program. If not, see <https://www.gnu.org/licenses/>.
//
// Written by Bruno Haible <bruno@clisp.org>, 2025.
// This program passes an input to an ollama instance and prints the response.
package main
import (
"bufio"
"bytes"
"encoding/json"
"flag"
"fmt"
"io"
"net/http"
"os"
"strings"
)
func main() {
// Command-line option processing.
// Note: This is incompatible with GNU conventions.
// <https://pkg.go.dev/flag#hdr-Command_line_flag_syntax> says
// "Flag parsing stops just before the first non-flag argument
// ("-" is a non-flag argument) or after the terminator "--"."
// which means that options after non-options are not reordered
// like they are by GNU getopt.
// The common workaround is to use https://github.com/spf13/pflag
// instead.
url_option := flag.String("url", "http://localhost:11434", "the ollama server's URL")
// Clumsy code is needed when we want an option of type string
// that has no default. The 'flag' package's String and StringVar
// functions reject a default value of nil, pushing developers
// into confusing an option with an empty string value with an
// omitted option.
var model_option *string = nil
flag.Func( "model", "the model to use",
func (s string) error {
model_option = &s
return nil
})
do_help_option := flag.Bool ("help", false, "this help text")
flag.Parse()
if *do_help_option {
fmt.Println("Usage: spit [OPTION...]")
fmt.Println()
fmt.Println("Passes standard input to an ollama instance and prints the response.")
fmt.Println()
fmt.Println("Options:")
fmt.Println(" --url Specifies the ollama server's URL.")
fmt.Println(" --model Specifies the model to use.")
fmt.Println()
fmt.Println("Informative output:")
fmt.Println()
fmt.Println(" --help Show this help text.")
os.Exit(0)
}
if model_option == nil {
fmt.Fprintln(os.Stderr, "spit: missing --model option")
os.Exit(1)
}
if len(flag.Args()) > 0 {
fmt.Fprintln(os.Stderr, "spit: too many arguments")
fmt.Fprintln(os.Stderr, "'Try 'spit --help' for more information.")
os.Exit(1)
}
// Sanitize URL.
url := *url_option
if !strings.HasSuffix(url, "/") {
url = url + "/"
}
model := *model_option
// Read the contents of standard input.
allBytes, err := io.ReadAll(os.Stdin)
if err != nil {
fmt.Fprintln(os.Stderr, err)
os.Exit(1)
}
input := string(allBytes)
// Documentation of the ollama API:
// <https://docs.ollama.com/api/generate>
// JSON in Go is a pain:
// 1) There is no way to just create a JSON object and add properties
// to it. We are forced to either use a map[string]any (and lose
// the advantages of type checking) or create a struct that reflects
// the desired shape of the JSON object.
// 2) In this struct, fields whose name starts with a lowercase letter
// are ignored by json.Marshal! Here's the workaround syntax:
type GeneratePayload struct {
Model string `json:"model"`
Prompt string `json:"prompt"`
}
payload := GeneratePayload {
Model: model,
Prompt: input,
}
payloadAsBytes, err := json.Marshal(payload)
if err != nil {
fmt.Fprintln(os.Stderr, err)
os.Exit(1)
}
response, err := http.Post(url + "api/generate",
"application/json", bytes.NewReader(payloadAsBytes))
if err != nil {
fmt.Fprintln(os.Stderr, err)
os.Exit(1)
}
if response.StatusCode != 200 {
fmt.Fprintln(os.Stderr, "Status:", response.StatusCode)
}
if response.StatusCode >= 400 {
responseBodyBytes, _ := io.ReadAll(response.Body)
fmt.Fprintln(os.Stderr, "Body:", string(responseBodyBytes))
os.Exit(1)
}
body := response.Body
reader := bufio.NewReader(body)
for {
line, err := reader.ReadBytes('\n')
if len(line) == 0 && err != nil {
break
}
var part map[string]any
if json.Unmarshal(line, &part) == nil {
fmt.Print(part["response"])
}
}
}
/*
* Local Variables:
* compile-command: "gccgo -Wall -O2 -o ollama-spit ollama-spit.go"
* run-command: "echo 'Translate into German: "Welcome to the GNU project!"' | ./ollama-spit --model=ministral-3:14b"
* End:
*/

View File

@ -0,0 +1,130 @@
#! /usr/bin/env python3
#
# Copyright (C) 2025 Free Software Foundation, Inc.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
# Written by Bruno Haible <bruno@clisp.org>, 2025.
# This program passes an input to an ollama instance and prints the response.
#
# Dependencies: request.
import sys
# Documentation: https://docs.python.org/3/library/argparse.html
import argparse
# Documentation: https://requests.readthedocs.io/en/latest/
import requests
# Documentation: https://docs.python.org/3/library/json.html
import json
def main():
parser = argparse.ArgumentParser(
prog='spit',
usage='spit --help',
add_help=False)
parser.add_argument('--url',
dest='url',
default='http://localhost:11434')
parser.add_argument('--model',
dest='model',
default=None,
nargs=1)
parser.add_argument('--help', '--hel', '--he', '--h',
dest='help',
default=None,
action='store_true')
# All other arguments are collected.
parser.add_argument('non_option_arguments',
nargs='*')
# Parse the given arguments. Don't signal an error if non-option arguments
# occur between or after options.
cmdargs, unhandled = parser.parse_known_args()
# Handle --help, ignoring all other options.
if cmdargs.help != None:
print('''
Usage: spit [OPTION...]
Passes standard input to an ollama instance and prints the response.
Options:
--url Specifies the ollama server's URL.
--model Specifies the model to use.
Informative output:
--help Show this help text.
''')
sys.exit(0)
# Report unhandled arguments.
for arg in unhandled:
if arg.startswith('-'):
message = '%s: Unrecognized option \'%s\'.\n' % ('spit', arg)
message += 'Try \'spit --help\' for more information.\n'
sys.stderr.write(message)
sys.exit(1)
# By now, all unhandled arguments were non-options.
cmdargs.non_option_arguments += unhandled
if cmdargs.model == None:
sys.stderr.write('%s: missing --model option\n' % 'spit')
sys.exit(1)
if len(cmdargs.non_option_arguments) > 0:
message = '%s: too many arguments\n' % 'spit'
message += 'Try \'spit --help\' for more information.\n'
sys.stderr.write(message)
sys.exit(1)
# Sanitize URL.
url = cmdargs.url
if not url.endswith('/'):
url += '/'
model = cmdargs.model[0]
# Read the contents of standard input.
input = sys.stdin.read()
# Documentation of the ollama API:
# <https://docs.ollama.com/api/generate>
payload = { 'model': model, 'prompt': input }
# We need the payload in JSON syntax (with double-quotes around the strings),
# not in Python syntax (with single-quotes around the strings):
payload = json.dumps(payload)
response = requests.post(url + 'api/generate', data=payload, stream=True)
if response.status_code != 200:
print('Status:', response.status_code, file=sys.stderr)
if response.status_code >= 400:
print('Body:', response.text, file=sys.stderr)
sys.exit(1)
# Not needed any more:
#response.raise_for_status()
for line in response.iter_lines():
part = json.loads(line.decode('utf-8'))
print(part.get('response', ''), end='', flush=True)
if __name__ == '__main__':
main()

View File

@ -0,0 +1,130 @@
#! /bin/sh
#
# Copyright (C) 2025 Free Software Foundation, Inc.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
# Written by Bruno Haible <bruno@clisp.org>, 2025.
# This program passes an input to an ollama instance and prints the response.
#
# Dependencies: curl, jq.
progname=$0
# func_exit STATUS
# exits with a given status.
func_exit ()
{
exit $1
}
# func_usage
# outputs to stdout the --help usage message.
func_usage ()
{
echo "\
Usage: spit [OPTION...]
Passes standard input to an ollama instance and prints the response.
Options:
--url Specifies the ollama server's URL.
--model Specifies the model to use.
Informative output:
--help Show this help text."
}
# Command-line option processing.
# Removes the OPTIONS from the arguments. Sets the variables:
# - url
# - model
{
url='http://localhost:11434'
model=
while test $# -gt 0; do
case "$1" in
--url | --ur | --u )
shift
if test $# = 0; then
echo "$progname: missing argument for --url" 1>&2
func_exit 1
fi
url="$1"
shift
;;
--url=* )
url=`echo "X$1" | sed -e 's/^X--url=//'`
shift
;;
--model | --mode | --mod | --mo | --m )
shift
if test $# = 0; then
echo "$progname: missing argument for --model" 1>&2
func_exit 1
fi
model="$1"
shift
;;
--model=* )
model=`echo "X$1" | sed -e 's/^X--model=//'`
shift
;;
--help | --hel | --he | --h )
func_usage
func_exit $? ;;
-- )
# Stop option processing
shift
break ;;
-* )
echo "$progname: unknown option $1" 1>&2
echo "Try '$progname --help' for more information." 1>&2
func_exit 1 ;;
* )
break ;;
esac
done
if test -z "$model"; then
echo "$progname: missing --model option" 1>&2
func_exit 1
fi
# Sanitize URL.
case "$url" in
*/) ;;
*) url="$url/" ;;
esac
# Read the contents of standard input.
input=`cat`
input_quoted=`echo "$input" | sed -e 's|"|\\\"|g'`
# Documentation of the ollama API:
# <https://docs.ollama.com/api/generate>
curl --no-progress-meter --no-buffer \
-X POST "$url"api/generate \
-d '{ "model": "'"$model"'", "prompt": "'"$input_quoted"'" }' \
| jq --join-output --unbuffered '.response'
# Note: Error handling is not good here. For example, when passing a
# nonexistent model name, the server answers with HTTP status 404
# and a response body {"error":"<error message>"}, that gets output
# to stdout (not stderr!) and then swallowed by 'jq'.
}

View File

@ -29,6 +29,7 @@ msgcmp.x msgfmt.x msgmerge.x msgunfmt.x xgettext.x \
msgattrib.x msgcat.x msgcomm.x msgconv.x msgen.x msgexec.x msgfilter.x \
msggrep.x msginit.x msguniq.x \
recode-sr-latin.x \
spit.x \
gettextize.x autopoint.x
# Likewise.
@ -37,7 +38,8 @@ man_MAN1SRC = \
msgcmp.1 msgfmt.1 msgmerge.1 msgunfmt.1 xgettext.1 \
msgattrib.1 msgcat.1 msgcomm.1 msgconv.1 msgen.1 msgexec.1 msgfilter.1 \
msggrep.1 msginit.1 msguniq.1 \
recode-sr-latin.1
recode-sr-latin.1 \
spit.1
man_MAN1WIZARD = \
gettextize.1
man_MAN1AUTOTOOLS = \
@ -50,6 +52,7 @@ msgcmp.1.html msgfmt.1.html msgmerge.1.html msgunfmt.1.html xgettext.1.html \
msgattrib.1.html msgcat.1.html msgcomm.1.html msgconv.1.html msgen.1.html \
msgexec.1.html msgfilter.1.html msggrep.1.html msginit.1.html msguniq.1.html \
recode-sr-latin.1.html \
spit.1.html \
gettextize.1.html autopoint.1.html
EXTRA_DIST += help2man $(man_aux) $(man_MANS) $(man_HTML)
@ -119,6 +122,7 @@ msggrep.1: msggrep.x ../src/msggrep.c
msginit.1: msginit.x ../src/msginit.c
msguniq.1: msguniq.x ../src/msguniq.c
recode-sr-latin.1: recode-sr-latin.x ../src/recode-sr-latin.c
spit.1: spit.x ../src/spit.c
$(man_MAN1WIZARD): help2man $(top_srcdir)/../.version
progname=`echo $@ | sed -e 's/\.in$$//' -e 's/\.1$$//'`; \
@ -162,6 +166,7 @@ msggrep.1.html: msggrep.1
msginit.1.html: msginit.1
msguniq.1.html: msguniq.1
recode-sr-latin.1.html: recode-sr-latin.1
spit.1.html: spit.1
gettextize.1.html: gettextize.1
autopoint.1.html: autopoint.1

4
gettext-tools/man/spit.x Normal file
View File

@ -0,0 +1,4 @@
[NAME]
spit \- translate some text through a Large Language Model
[DESCRIPTION]
.\" Add any additional description here

View File

@ -83,6 +83,7 @@ src/read-resources.c
src/read-stringtable.c
src/read-tcl.c
src/recode-sr-latin.c
src/spit.c
src/urlget.c
src/write-catalog.c
src/write-csharp.c

View File

@ -189,6 +189,16 @@ msggrep.c Main source for the 'msggrep' program.
|
+-------------- The 'msgen' program
+-------------- The 'spit' program
| country-table.h
| country-table.c
| Territory names according to ISO 3166.
| spit.c
| Main source for the 'spit' program.
| spit.py.in
| The same program, as a Python script.
+-------------- The 'spit' program
po-time.h
po-time.c
Create time stamps for use in PO/POT files.

View File

@ -30,9 +30,14 @@ bin_PROGRAMS = \
msgcmp msgfmt msgmerge msgunfmt xgettext \
msgattrib msgcat msgcomm msgconv msgen msgexec msgfilter msggrep msginit msguniq \
recode-sr-latin
if BUILD_SPIT_IN_C
bin_PROGRAMS += spit
endif
noinst_PROGRAMS = hostname urlget cldr-plurals
noinst_SCRIPTS = spit.py
if INSTALL_PRIVATE_LIBRARIES
# Specify that libgettextsrc should be installed in $(libdir).
lib_LTLIBRARIES = libgettextsrc.la
@ -65,7 +70,7 @@ noinst_HEADERS = \
write-qt.h \
read-desktop.h write-desktop.h \
write-xml.h \
po-time.h plural-table.h lang-table.h format.h filters.h \
po-time.h plural-table.h lang-table.h country-table.h format.h filters.h \
xgettext.h \
if-error.h \
rc-str-list.h xg-pos.h xg-encoding.h xg-mixed-string.h xg-formatstring.h \
@ -409,6 +414,10 @@ else
msguniq_SOURCES = ../woe32dll/c++msguniq.cc
endif
recode_sr_latin_SOURCES = recode-sr-latin.c filter-sr-latin.c
if BUILD_SPIT_IN_C
spit_SOURCES = spit.c country-table.c
spit_CFLAGS = $(AM_CFLAGS) $(INCJSON_C) $(INCCURL)
endif
hostname_SOURCES = hostname.c
urlget_SOURCES = urlget.c
cldr_plurals_SOURCES = cldr-plural.y cldr-plural-exp.c cldr-plurals.c
@ -527,6 +536,9 @@ msgfilter_LDADD = libgettextsrc.la @INTL_MACOSX_LIBS@ $(WOE32_LDADD)
msggrep_LDADD = $(LIBGREP) libgettextsrc.la @INTL_MACOSX_LIBS@ $(WOE32_LDADD)
msginit_LDADD = libgettextsrc.la @INTL_MACOSX_LIBS@ $(WOE32_LDADD)
msguniq_LDADD = libgettextsrc.la @INTL_MACOSX_LIBS@ $(WOE32_LDADD)
if BUILD_SPIT_IN_C
spit_LDADD = libgettextsrc.la @INTL_MACOSX_LIBS@ $(LIBJSON_C) $(LIBCURL) $(WOE32_LDADD)
endif
hostname_LDADD = $(LDADD) $(GETADDRINFO_LIB)
# Specify when to relink the programs.
@ -546,6 +558,9 @@ msggrep_DEPENDENCIES = $(LIBGREP) libgettextsrc.la ../gnulib-lib/libgettextlib.l
msginit_DEPENDENCIES = libgettextsrc.la ../gnulib-lib/libgettextlib.la $(WOE32_LDADD)
msguniq_DEPENDENCIES = libgettextsrc.la ../gnulib-lib/libgettextlib.la $(WOE32_LDADD)
recode_sr_latin_DEPENDENCIES = $(OTHERPROGDEPENDENCIES)
if BUILD_SPIT_IN_C
spit_DEPENDENCIES = libgettextsrc.la ../gnulib-lib/libgettextlib.la $(WOE32_LDADD)
endif
hostname_DEPENDENCIES = $(OTHERPROGDEPENDENCIES)
urlget_DEPENDENCIES = $(OTHERPROGDEPENDENCIES)
@ -566,6 +581,9 @@ msggrep_CPPFLAGS = $(AM_CPPFLAGS) -DINSTALLDIR=$(bindir_c_make)
msginit_CPPFLAGS = $(AM_CPPFLAGS) -DINSTALLDIR=$(bindir_c_make)
msguniq_CPPFLAGS = $(AM_CPPFLAGS) -DINSTALLDIR=$(bindir_c_make)
recode_sr_latin_CPPFLAGS = $(AM_CPPFLAGS) -DINSTALLDIR=$(bindir_c_make)
if BUILD_SPIT_IN_C
spit_CPPFLAGS = $(AM_CPPFLAGS) -DINSTALLDIR=$(bindir_c_make)
endif
hostname_CPPFLAGS = $(AM_CPPFLAGS) -DINSTALLDIR=$(pkglibexecdir_c_make)
urlget_CPPFLAGS = $(AM_CPPFLAGS) -DINSTALLDIR=$(pkglibexecdir_c_make)
cldr_plurals_CPPFLAGS = $(AM_CPPFLAGS) -DINSTALLDIR=$(pkglibexecdir_c_make)
@ -586,6 +604,9 @@ msggrep_LDFLAGS = `$(RELOCATABLE_LDFLAGS) $(bindir)`
msginit_LDFLAGS = `$(RELOCATABLE_LDFLAGS) $(bindir)`
msguniq_LDFLAGS = `$(RELOCATABLE_LDFLAGS) $(bindir)`
recode_sr_latin_LDFLAGS = `$(RELOCATABLE_LDFLAGS) $(bindir)`
if BUILD_SPIT_IN_C
spit_LDFLAGS = `$(RELOCATABLE_LDFLAGS) $(bindir)`
endif
hostname_LDFLAGS = `$(RELOCATABLE_LDFLAGS) $(pkglibexecdir)`
urlget_LDFLAGS = `$(RELOCATABLE_LDFLAGS) $(pkglibexecdir)`
cldr_plurals_LDFLAGS = `$(RELOCATABLE_LDFLAGS) $(pkglibexecdir)`
@ -771,9 +792,13 @@ EXTRA_DIST += cldr-plural.c cldr-plural.h
built-sources: $(BUILT_SOURCES)
# Special rules for installation of auxiliary programs.
# Special rules for installation of 'spit' and auxiliary programs.
install-exec-local:
if !BUILD_SPIT_IN_C
$(MKDIR_P) $(DESTDIR)$(bindir)
$(INSTALL_SCRIPT) spit.py $(DESTDIR)$(bindir)/spit
endif
$(MKDIR_P) $(DESTDIR)$(pkglibexecdir)
$(INSTALL_PROGRAM_ENV) $(LIBTOOL) --mode=install $(INSTALL_PROGRAM) hostname$(EXEEXT) $(DESTDIR)$(pkglibexecdir)/hostname$(EXEEXT)
$(INSTALL_PROGRAM_ENV) $(LIBTOOL) --mode=install $(INSTALL_PROGRAM) urlget$(EXEEXT) $(DESTDIR)$(pkglibexecdir)/urlget$(EXEEXT)
@ -782,16 +807,22 @@ install-exec-local:
$(INSTALL_SCRIPT) $(srcdir)/project-id $(DESTDIR)$(pkglibexecdir)/project-id
installdirs-local:
if !BUILD_SPIT_IN_C
$(MKDIR_P) $(DESTDIR)$(bindir)
endif
$(MKDIR_P) $(DESTDIR)$(pkglibexecdir)
uninstall-local:
if !BUILD_SPIT_IN_C
$(RM) $(DESTDIR)$(bindir)/spit
endif
$(RM) $(DESTDIR)$(pkglibexecdir)/hostname$(EXEEXT)
$(RM) $(DESTDIR)$(pkglibexecdir)/urlget$(EXEEXT)
$(RM) $(DESTDIR)$(pkglibexecdir)/cldr-plurals$(EXEEXT)
$(RM) $(DESTDIR)$(pkglibexecdir)/user-email
$(RM) $(DESTDIR)$(pkglibexecdir)/project-id
DISTCLEANFILES += user-email
DISTCLEANFILES += spit.py user-email
# Special rules for Java compilation.

View File

@ -0,0 +1,277 @@
/* Table of territories.
Copyright (C) 2025 Free Software Foundation, Inc.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>. */
/* Written by Bruno Haible <bruno@clisp.org>, 2025. */
#include <config.h>
/* Specification. */
#include "country-table.h"
/* Derived from ISO 3166. */
struct country_table_entry country_table[] =
{
{ "AD", "Andorra" },
{ "AE", "United Arab Emirates" },
{ "AF", "Afghanistan" },
{ "AG", "Antigua and Barbuda" },
{ "AI", "Anguilla" },
{ "AL", "Albania" },
{ "AM", "Armenia" },
{ "AO", "Angola" },
{ "AQ", "Antarctica" },
{ "AR", "Argentina" },
{ "AS", "American Samoa" },
{ "AT", "Austria" },
{ "AU", "Australia" },
{ "AW", "Aruba" },
{ "AX", "Åland Islands" },
{ "AZ", "Azerbaijan" },
{ "BA", "Bosnia and Herzegovina" },
{ "BB", "Barbados" },
{ "BD", "Bangladesh" },
{ "BE", "Belgium" },
{ "BF", "Burkina Faso" },
{ "BG", "Bulgaria" },
{ "BH", "Bahrain" },
{ "BI", "Burundi" },
{ "BJ", "Benin" },
{ "BL", "Saint Barthélemy" },
{ "BM", "Bermuda" },
{ "BN", "Brunei Darussalam" },
{ "BO", "Bolivia, Plurinational State of" },
{ "BQ", "Bonaire, Sint Eustatius and Saba" },
{ "BR", "Brazil" },
{ "BS", "Bahamas" },
{ "BT", "Bhutan" },
{ "BV", "Bouvet Island" },
{ "BW", "Botswana" },
{ "BY", "Belarus" },
{ "BZ", "Belize" },
{ "CA", "Canada" },
{ "CC", "Cocos (Keeling) Islands" },
{ "CD", "Congo, Democratic Republic of the" },
{ "CF", "Central African Republic" },
{ "CG", "Congo" },
{ "CH", "Switzerland" },
{ "CI", "Côte d'Ivoire" },
{ "CK", "Cook Islands" },
{ "CL", "Chile" },
{ "CM", "Cameroon" },
{ "CN", "China" },
{ "CO", "Colombia" },
{ "CR", "Costa Rica" },
{ "CU", "Cuba" },
{ "CV", "Cabo Verde" },
{ "CW", "Curaçao" },
{ "CX", "Christmas Island" },
{ "CY", "Cyprus" },
{ "CZ", "Czechia" },
{ "DE", "Germany" },
{ "DJ", "Djibouti" },
{ "DK", "Denmark" },
{ "DM", "Dominica" },
{ "DO", "Dominican Republic" },
{ "DZ", "Algeria" },
{ "EC", "Ecuador" },
{ "EE", "Estonia" },
{ "EG", "Egypt" },
{ "EH", "Western Sahara" },
{ "ER", "Eritrea" },
{ "ES", "Spain" },
{ "ET", "Ethiopia" },
{ "FI", "Finland" },
{ "FJ", "Fiji" },
{ "FK", "Falkland Islands (Malvinas)" },
{ "FM", "Micronesia, Federated States of" },
{ "FO", "Faroe Islands" },
{ "FR", "France" },
{ "GA", "Gabon" },
{ "GB", "United Kingdom of Great Britain and Northern Ireland" },
{ "GD", "Grenada" },
{ "GE", "Georgia" },
{ "GF", "French Guiana" },
{ "GG", "Guernsey" },
{ "GH", "Ghana" },
{ "GI", "Gibraltar" },
{ "GL", "Greenland" },
{ "GM", "Gambia" },
{ "GN", "Guinea" },
{ "GP", "Guadeloupe" },
{ "GQ", "Equatorial Guinea" },
{ "GR", "Greece" },
{ "GS", "South Georgia and the South Sandwich Islands" },
{ "GT", "Guatemala" },
{ "GU", "Guam" },
{ "GW", "Guinea-Bissau" },
{ "GY", "Guyana" },
{ "HK", "Hong Kong" },
{ "HM", "Heard Island and McDonald Islands" },
{ "HN", "Honduras" },
{ "HR", "Croatia" },
{ "HT", "Haiti" },
{ "HU", "Hungary" },
{ "ID", "Indonesia" },
{ "IE", "Ireland" },
{ "IL", "Israel" },
{ "IM", "Isle of Man" },
{ "IN", "India" },
{ "IO", "British Indian Ocean Territory" },
{ "IQ", "Iraq" },
{ "IR", "Iran, Islamic Republic of" },
{ "IS", "Iceland" },
{ "IT", "Italy" },
{ "JE", "Jersey" },
{ "JM", "Jamaica" },
{ "JO", "Jordan" },
{ "JP", "Japan" },
{ "KE", "Kenya" },
{ "KG", "Kyrgyzstan" },
{ "KH", "Cambodia" },
{ "KI", "Kiribati" },
{ "KM", "Comoros" },
{ "KN", "Saint Kitts and Nevis" },
{ "KP", "Korea, Democratic People's Republic of" },
{ "KR", "Korea, Republic of" },
{ "KW", "Kuwait" },
{ "KY", "Cayman Islands" },
{ "KZ", "Kazakhstan" },
{ "LA", "Lao People's Democratic Republic" },
{ "LB", "Lebanon" },
{ "LC", "Saint Lucia" },
{ "LI", "Liechtenstein" },
{ "LK", "Sri Lanka" },
{ "LR", "Liberia" },
{ "LS", "Lesotho" },
{ "LT", "Lithuania" },
{ "LU", "Luxembourg" },
{ "LV", "Latvia" },
{ "LY", "Libya" },
{ "MA", "Morocco" },
{ "MC", "Monaco" },
{ "MD", "Moldova, Republic of" },
{ "ME", "Montenegro" },
{ "MF", "Saint Martin (French part)" },
{ "MG", "Madagascar" },
{ "MH", "Marshall Islands" },
{ "MK", "North Macedonia" },
{ "ML", "Mali" },
{ "MM", "Myanmar" },
{ "MN", "Mongolia" },
{ "MO", "Macao" },
{ "MP", "Northern Mariana Islands" },
{ "MQ", "Martinique" },
{ "MR", "Mauritania" },
{ "MS", "Montserrat" },
{ "MT", "Malta" },
{ "MU", "Mauritius" },
{ "MV", "Maldives" },
{ "MW", "Malawi" },
{ "MX", "Mexico" },
{ "MY", "Malaysia" },
{ "MZ", "Mozambique" },
{ "NA", "Namibia" },
{ "NC", "New Caledonia" },
{ "NE", "Niger" },
{ "NF", "Norfolk Island" },
{ "NG", "Nigeria" },
{ "NI", "Nicaragua" },
{ "NL", "Netherlands, Kingdom of the" },
{ "NO", "Norway" },
{ "NP", "Nepal" },
{ "NR", "Nauru" },
{ "NU", "Niue" },
{ "NZ", "New Zealand" },
{ "OM", "Oman" },
{ "PA", "Panama" },
{ "PE", "Peru" },
{ "PF", "French Polynesia" },
{ "PG", "Papua New Guinea" },
{ "PH", "Philippines" },
{ "PK", "Pakistan" },
{ "PL", "Poland" },
{ "PM", "Saint Pierre and Miquelon" },
{ "PN", "Pitcairn" },
{ "PR", "Puerto Rico" },
{ "PS", "Palestine, State of" },
{ "PT", "Portugal" },
{ "PW", "Palau" },
{ "PY", "Paraguay" },
{ "QA", "Qatar" },
{ "RE", "Réunion" },
{ "RO", "Romania" },
{ "RS", "Serbia" },
{ "RU", "Russian Federation" },
{ "RW", "Rwanda" },
{ "SA", "Saudi Arabia" },
{ "SB", "Solomon Islands" },
{ "SC", "Seychelles" },
{ "SD", "Sudan" },
{ "SE", "Sweden" },
{ "SG", "Singapore" },
{ "SH", "Saint Helena, Ascension and Tristan da Cunha" },
{ "SI", "Slovenia" },
{ "SJ", "Svalbard and Jan Mayen" },
{ "SK", "Slovakia" },
{ "SL", "Sierra Leone" },
{ "SM", "San Marino" },
{ "SN", "Senegal" },
{ "SO", "Somalia" },
{ "SR", "Suriname" },
{ "SS", "South Sudan" },
{ "ST", "Sao Tome and Principe" },
{ "SV", "El Salvador" },
{ "SX", "Sint Maarten (Dutch part)" },
{ "SY", "Syrian Arab Republic" },
{ "SZ", "Eswatini" },
{ "TC", "Turks and Caicos Islands" },
{ "TD", "Chad" },
{ "TF", "French Southern Territories" },
{ "TG", "Togo" },
{ "TH", "Thailand" },
{ "TJ", "Tajikistan" },
{ "TK", "Tokelau" },
{ "TL", "Timor-Leste" },
{ "TM", "Turkmenistan" },
{ "TN", "Tunisia" },
{ "TO", "Tonga" },
{ "TR", "Türkiye" },
{ "TT", "Trinidad and Tobago" },
{ "TV", "Tuvalu" },
{ "TW", "Taiwan, Province of China" },
{ "TZ", "Tanzania, United Republic of" },
{ "UA", "Ukraine" },
{ "UG", "Uganda" },
{ "UM", "United States Minor Outlying Islands" },
{ "US", "United States of America" },
{ "UY", "Uruguay" },
{ "UZ", "Uzbekistan" },
{ "VA", "Holy See" },
{ "VC", "Saint Vincent and the Grenadines" },
{ "VE", "Venezuela, Bolivarian Republic of" },
{ "VG", "Virgin Islands (British)" },
{ "VI", "Virgin Islands (U.S.)" },
{ "VN", "Viet Nam" },
{ "VU", "Vanuatu" },
{ "WF", "Wallis and Futuna" },
{ "WS", "Samoa" },
{ "YE", "Yemen" },
{ "YT", "Mayotte" },
{ "ZA", "South Africa" },
{ "ZM", "Zambia" },
{ "ZW", "Zimbabwe" }
};
const size_t country_table_size = sizeof (country_table) / sizeof (country_table[0]);

View File

@ -0,0 +1,45 @@
/* Table of territories.
Copyright (C) 2025 Free Software Foundation, Inc.
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>. */
/* Written by Bruno Haible <bruno@clisp.org>, 2025. */
#ifndef _COUNTRY_TABLE_H
#define _COUNTRY_TABLE_H
#include <stddef.h>
#ifdef __cplusplus
extern "C" {
#endif
struct country_table_entry
{
const char *code;
const char *english;
};
extern LIBGETTEXTSRC_DLL_VARIABLE struct country_table_entry country_table[];
extern LIBGETTEXTSRC_DLL_VARIABLE const size_t country_table_size;
#ifdef __cplusplus
}
#endif
#endif /* _COUNTRY_TABLE_H */

666
gettext-tools/src/spit.c Normal file
View File

@ -0,0 +1,666 @@
/*
* Copyright (C) 2025 Free Software Foundation, Inc.
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation, either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program. If not, see <https://www.gnu.org/licenses/>.
*
* Written by Bruno Haible <bruno@clisp.org>, 2025.
*/
/*
* This program passes an input to an ollama instance and prints the response.
*/
#include <config.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <locale.h>
/* We use JSON-C.
It is multithread-safe, as long as we don't use any of the json_global_*
API functions, nor the json_util_get_last_err function.
Documentation:
<http://json-c.github.io/json-c/json-c-0.18/doc/html/index.html#using> */
#include <json.h>
/* We use libcurl.
Documentation:
<https://curl.se/libcurl/c/> */
#include <curl/curl.h>
#include <error.h>
#include "options.h"
#include "closeout.h"
#include "progname.h"
#include "relocatable.h"
#include "basename-lgpl.h"
#include "xalloc.h"
#include "xvasprintf.h"
#include "read-file.h"
#include "full-write.h"
#include "spawn-pipe.h"
#include "wait-process.h"
#include "lang-table.h"
#include "country-table.h"
#include "propername.h"
#include "gettext.h"
#define _(str) gettext (str)
/* Returns the English name of a language (lowercase ISO 639 code),
or NULL if unknown. */
static const char *
englishname_of_language (const char *language)
{
for (size_t i = 0; i < language_table_size; i++)
if (strcmp (language_table[i].code, language) == 0)
return language_table[i].english;
return NULL;
}
/* Returns the English name of a country (uppercase ISO 3166 code),
or NULL if unknown. */
static const char *
englishname_of_country (const char *country)
{
for (size_t i = 0; i < country_table_size; i++)
if (strcmp (country_table[i].code, country) == 0)
return country_table[i].english;
return NULL;
}
/* Returns a name or description of a catalog name. */
static const char *
language_in_english (const char *catalogname)
{
const char *underscore = strchr (catalogname, '_');
if (underscore != NULL)
{
/* Treat a few cases specially. */
for (size_t i = 0; i < language_variant_table_size; i++)
if (strcmp (language_variant_table[i].code, catalogname) == 0)
return language_variant_table[i].english;
/* Decompose "ll_CC" into "ll" and "CC". */
char *language = xstrdup (catalogname);
language[underscore - catalogname] = '\0';
const char *country = underscore + 1;
const char *english_language = englishname_of_language (language);
if (english_language != NULL)
{
const char *english_country = englishname_of_country (country);
if (english_country != NULL)
return xasprintf ("%s (as spoken in %s)", english_language, english_country);
else
return english_language;
}
else
return catalogname;
}
else
{
/* It's a simple language name. */
const char *english_language = englishname_of_language (catalogname);
if (english_language != NULL)
return english_language;
else
return catalogname;
}
}
static void
curl_die ()
{
error (EXIT_FAILURE, 0, "%s", _("curl error"));
}
static void
process_response_line (const char *line, int out_fd)
{
/* Note: As of json-c version 0.15, the jerrno value is unreliable.
See <https://github.com/json-c/json-c/issues/853>
and <https://github.com/json-c/json-c/issues/854>.
json-c version 0.17 introduces json_tokener_error_memory, but its value
changes in version 0.18. */
struct json_object *j;
#if JSON_C_MAJOR_VERSION > 0 || (JSON_C_MAJOR_VERSION == 0 && JSON_C_MINOR_VERSION >= 18)
enum json_tokener_error jerrno = json_tokener_error_memory;
j = json_tokener_parse_verbose (line, &jerrno);
if (j == NULL && jerrno == json_tokener_error_memory)
xalloc_die ();
#else
j = json_tokener_parse (line);
#endif
/* Ignore an empty line. */
if (j != NULL)
{
/* We expect a JSON object. */
if (json_object_is_type (j, json_type_object))
{
/* Output its "response" property. */
const char *prop =
json_object_get_string(json_object_object_get (j, "response"));
if (prop != NULL)
{
size_t prop_length = strlen (prop);
if (full_write (out_fd, prop, prop_length) < prop_length)
if (errno != EPIPE)
error (EXIT_FAILURE, errno, _("write to subprocess failed"));
}
}
}
}
/* A libcurl header callback that determines, during curl_easy_perform,
whether the HTTP request returned an error. */
static size_t
my_header_callback (char *buffer, size_t one, size_t n, void *userdata)
{
bool *is_error_p = (bool *) userdata;
#if DEBUG
fprintf (stderr, "in my_header_callback: buffer = %.*s\n", (int) n, buffer);
#endif
if (n >= 5 && memcmp (buffer, "HTTP/", 5) == 0)
{
/* buffer contains a line of the form "HTTP/1.1 code description".
Extract the code. */
char old = buffer[n - 1];
buffer[n - 1] = '\0';
int code;
if (sscanf (buffer, "%*s %d\n", &code) == 1)
{
if (code >= 400)
*is_error_p = true;
}
buffer[n - 1] = old;
}
return n;
}
struct my_write_locals
{
int out_fd;
bool is_error;
char *body;
size_t body_allocated;
size_t body_start;
size_t body_end;
};
/* Makes room for n more bytes in l->body. */
static void
my_write_grow (struct my_write_locals *l, size_t n)
{
if ((l->body_end - l->body_start) + n > l->body_allocated)
{
/* Grow the buffer. */
size_t new_allocated = (l->body_end - l->body_start) + n;
if (new_allocated < 2 * l->body_allocated)
new_allocated = 2 * l->body_allocated;
if (new_allocated < 1024)
new_allocated = 1024;
char *new_body;
if (l->body_start == 0 && l->body_end > 0)
{
new_body = (char *) realloc (l->body, new_allocated);
if (new_body == NULL)
xalloc_die ();
}
else
{
new_body = (char *) malloc (new_allocated);
if (new_body == NULL)
xalloc_die ();
memcpy (new_body, l->body + l->body_start, l->body_end - l->body_start);
free (l->body);
l->body_end = l->body_end - l->body_start;
l->body_start = 0;
}
l->body = new_body;
l->body_allocated = new_allocated;
}
else
{
/* We can keep the buffer, but may need to move the contents to the
front. */
if (l->body_end + n > l->body_allocated)
{
memmove (l->body, l->body + l->body_start, l->body_end - l->body_start);
l->body_end = l->body_end - l->body_start;
l->body_start = 0;
}
}
/* Here l->body_end + n <= l->body_allocated. */
}
/* A libcurl write callback that processes a piece of response body,
depending on whether the HTTP request returned an error. */
static size_t
my_write_callback (char *buffer, size_t one, size_t n, void *userdata)
{
struct my_write_locals *l = (struct my_write_locals *) userdata;
#if DEBUG
fprintf (stderr, "in my_write_callback: buffer = %.*s\n", (int) n, buffer);
#endif
/* Append the buffer's contents to the body. */
my_write_grow (l, n);
memcpy (l->body + l->body_end, buffer, n);
l->body_end += n;
if (!l->is_error)
{
/* Process entire lines that are in the buffer. */
char *newline = (char *) memchr (l->body + l->body_end - n, '\n', n);
while (newline != NULL)
{
/* We have an entire line. */
*newline = '\0';
process_response_line (l->body + l->body_start, l->out_fd);
l->body_start = (newline + 1) - l->body;
newline = (char *) memchr (l->body + l->body_start, '\n',
l->body_end - l->body_start);
}
}
return n;
}
/* Make the HTTP POST request to the given URL, sending its output
to the file descriptor FD. */
static void
do_request (const char *url, const char *payload_as_string, int fd)
{
if (curl_global_init (CURL_GLOBAL_DEFAULT))
curl_die ();
CURL *curl = curl_easy_init ();
if (!curl)
curl_die ();
/* Documentation: <https://curl.se/libcurl/c/CURLOPT_URL.html> */
curl_easy_setopt (curl, CURLOPT_URL, url);
/* Documentation: <https://curl.se/libcurl/c/CURLOPT_POST.html> */
curl_easy_setopt (curl, CURLOPT_POST, 1L);
{
struct curl_slist *headers = NULL;
/* Override the Content-Type header set by CURLOPT_POST. */
headers = curl_slist_append(headers, "Content-Type: " "application/json");
/* Documentation: <https://curl.se/libcurl/c/CURLOPT_HTTPHEADER.html> */
curl_easy_setopt (curl, CURLOPT_HTTPHEADER, headers);
}
/* Set the payload.
Documentation: <https://curl.se/libcurl/c/CURLOPT_POSTFIELDS.html> */
curl_easy_setopt(curl, CURLOPT_POSTFIELDS, payload_as_string);
/* Documentation: <https://curl.se/libcurl/c/CURLOPT_NOPROGRESS.html> */
curl_easy_setopt (curl, CURLOPT_NOPROGRESS, 1L);
#if DEBUG > 1
/* For debugging: */
/* Documentation: <https://curl.se/libcurl/c/CURLOPT_VERBOSE.html> */
curl_easy_setopt (curl, CURLOPT_VERBOSE, 1L);
#endif
#if 0
/* Not reliable, see <https://curl.se/libcurl/c/CURLOPT_FAILONERROR.html>. */
curl_easy_setopt (curl, CURLOPT_FAILONERROR, 1L);
#endif
struct my_write_locals locals;
locals.out_fd = fd;
locals.is_error = false;
locals.body = NULL;
locals.body_allocated = 0;
locals.body_start = 0;
locals.body_end = 0;
/* Documentation:
<https://curl.se/libcurl/c/CURLOPT_HEADERFUNCTION.html>
<https://curl.se/libcurl/c/CURLOPT_HEADERDATA.html> */
curl_easy_setopt (curl, CURLOPT_HEADERFUNCTION, my_header_callback);
curl_easy_setopt (curl, CURLOPT_HEADERDATA, &locals.is_error);
/* Documentation:
<https://curl.se/libcurl/c/CURLOPT_WRITEFUNCTION.html>
<https://curl.se/libcurl/c/CURLOPT_WRITEDATA.html> */
curl_easy_setopt (curl, CURLOPT_WRITEFUNCTION, my_write_callback);
curl_easy_setopt (curl, CURLOPT_WRITEDATA, &locals);
/* Documentation: <https://curl.se/libcurl/c/curl_easy_perform.html> */
CURLcode ret = curl_easy_perform (curl);
if (ret != CURLE_OK)
error (EXIT_FAILURE, 0, _("curl error %u: %s"), ret,
/* Documentation: <https://curl.se/libcurl/c/curl_easy_strerror.html> */
curl_easy_strerror (ret));
/* Documentation: <https://curl.se/libcurl/c/CURLINFO_RESPONSE_CODE.html> */
long status_code;
curl_easy_getinfo (curl, CURLINFO_RESPONSE_CODE, &status_code);
if (status_code != 200)
fprintf (stderr, "Status: %ld\n", status_code);
if (locals.is_error != (status_code >= 400))
/* The my_header_callback did not work right. */
abort ();
if (status_code >= 400)
{
/* In this case, print the response body to stderr, not to fd. */
fprintf (stderr, "Body: ");
fwrite (locals.body + locals.body_start,
1, locals.body_end - locals.body_start,
stderr);
fprintf (stderr, "\n");
exit (EXIT_FAILURE);
}
/* Most lines have already been processed through my_write_callback.
Now process the last line (without terminating newline). */
if (locals.body_end > locals.body_start)
{
my_write_grow (&locals, 1);
*(locals.body + locals.body_end) = '\0';
process_response_line (locals.body + locals.body_start, locals.out_fd);
}
}
/* Display usage information and exit. */
static void
usage (int status)
{
if (status != EXIT_SUCCESS)
fprintf (stderr, _("Try '%s --help' for more information.\n"),
program_name);
else
{
printf (_("\
Usage: %s [OPTION...]\n"),
program_name);
printf ("\n");
printf (_("\
Passes standard input to a Large Language Model (LLM) instance and prints\n\
the response.\n\
With the %s option, it translates standard input to the specified language\n\
through a Large Language Model (LLM) and prints the translation.\n"),
"--to");
printf ("\n");
printf (_("\
Warning: The output might not be what you expect.\n\
It might be of the wrong form, be of poor quality, or reflect some biases.\n"));
printf ("\n");
printf (_("\
Options:\n"));
printf (_("\
--species=TYPE Specifies the type of LLM. The default and only\n\
valid value is '%s'.\n"),
"ollama");
printf (_("\
--url=URL Specifies the URL of the server that runs the LLM.\n"));
printf (_("\
-m, --model=MODEL Specifies the model to use.\n"));
printf (_("\
--to=LANGUAGE Specifies the target language.\n"));
printf (_("\
--prompt=TEXT Specifies the prompt to use before standard input.\n\
This option overrides the --to option.\n"));
printf (_("\
--postprocess=COMMAND Specifies a command to post-process the output.\n"));
printf ("\n");
printf (_("\
Informative output:\n"));
printf ("\n");
printf (_("\
-h, --help Display this help and exit.\n"));
printf (_("\
-V, --version Output version information and exit.\n"));
printf ("\n");
/* TRANSLATORS: The first placeholder is the web address of the Savannah
project of this package. The second placeholder is the bug-reporting
email address for this package. Please add _another line_ saying
"Report translation bugs to <...>\n" with the address for translation
bugs (typically your translation team's web or email address). */
printf (_("\
Report bugs in the bug tracker at <%s>\n\
or by email to <%s>.\n"),
"https://savannah.gnu.org/projects/gettext",
"bug-gettext@gnu.org");
}
exit (status);
}
int
main (int argc, char **argv)
{
/* Set program name for messages. */
set_program_name (argv[0]);
/* Set locale via LC_ALL. */
setlocale (LC_ALL, "");
/* Set the text message domain. */
bindtextdomain (PACKAGE, relocate (LOCALEDIR));
bindtextdomain ("gnulib", relocate (GNULIB_LOCALEDIR));
textdomain (PACKAGE);
/* Ensure that write errors on stdout are detected. */
atexit (close_stdout);
/* Default values for command line options. */
bool do_help = false;
bool do_version = false;
const char *species = "ollama";
const char *url = "http://localhost:11434";
const char *model = NULL;
const char *to_language = NULL;
const char *prompt = NULL;
const char *postprocess = NULL;
/* Parse command line options. */
BEGIN_ALLOW_OMITTING_FIELD_INITIALIZERS
static const struct program_option options[] =
{
{ "help", 'h', no_argument },
{ "model", 'm', required_argument },
{ "postprocess", CHAR_MAX + 5, required_argument },
{ "prompt", CHAR_MAX + 4, required_argument },
{ "species", CHAR_MAX + 1, required_argument },
{ "to", CHAR_MAX + 3, required_argument },
{ "url", CHAR_MAX + 2, required_argument },
{ "version", 'V', no_argument },
};
END_ALLOW_OMITTING_FIELD_INITIALIZERS
start_options (argc, argv, options, MOVE_OPTIONS_FIRST, 0);
{
int opt;
while ((opt = get_next_option ()) != -1)
switch (opt)
{
case '\0': /* Long option with key == 0. */
break;
case 'h': /* --help */
do_help = true;
break;
case 'V': /* --version */
do_version = true;
break;
case CHAR_MAX + 1: /* --species */
species = optarg;
break;
case CHAR_MAX + 2: /* --url */
url = optarg;
break;
case 'm': /* --model */
model = optarg;
break;
case CHAR_MAX + 3: /* --to */
to_language = optarg;
break;
case CHAR_MAX + 4: /* --prompt */
prompt = optarg;
break;
case CHAR_MAX + 5: /* --postprocess */
postprocess = optarg;
break;
default:
usage (EXIT_FAILURE);
break;
}
}
/* Version information is requested. */
if (do_version)
{
printf ("%s (GNU %s) %s\n", last_component (program_name),
PACKAGE, VERSION);
/* xgettext: no-wrap */
printf (_("Copyright (C) %s Free Software Foundation, Inc.\n\
License GPLv3+: GNU GPL version 3 or later <%s>\n\
This is free software: you are free to change and redistribute it.\n\
There is NO WARRANTY, to the extent permitted by law.\n\
"),
"2025", "https://gnu.org/licenses/gpl.html");
printf (_("Written by %s.\n"), proper_name ("Bruno Haible"));
exit (EXIT_SUCCESS);
}
/* Help is requested. */
if (do_help)
usage (EXIT_SUCCESS);
/* Test for extraneous arguments. */
if (optind != argc)
error (EXIT_FAILURE, 0, _("too many arguments"));
/* Check --species option. */
if (strcmp (species, "ollama") != 0)
error (EXIT_FAILURE, 0, _("invalid value for %s option: %s"),
"--species", species);
/* Check --model option. */
if (model == NULL)
error (EXIT_FAILURE, 0, _("missing %s option"),
"--model");
/* Sanitize URL. */
if (!(strlen (url) > 0 && url[strlen (url) - 1] == '/'))
url = xasprintf ("%s/", url);
/* Read the contents of standard input. */
errno = 0;
size_t input_length;
char *input = fread_file (stdin, 0, &input_length);
if (input == NULL)
error (EXIT_FAILURE, errno, _("error reading standard input"));
/* Compute a default prompt. */
if (prompt == NULL && to_language != NULL)
prompt = xasprintf ("Translate into %s:", language_in_english (to_language));
/* Prepend the prompt. */
if (prompt != NULL)
input = xasprintf ("%s\n%s", prompt, input);
/* Documentation of the ollama API:
<https://docs.ollama.com/api/generate> */
url = xasprintf ("%sapi/generate", url);
/* Compose the payload. */
struct json_object *payload = json_object_new_object ();
if (payload == NULL)
xalloc_die ();
{
struct json_object *value = json_object_new_string (model);
if (value == NULL)
xalloc_die ();
if (json_object_object_add (payload, "model", value))
xalloc_die ();
}
{
struct json_object *value = json_object_new_string (input);
if (value == NULL)
xalloc_die ();
if (json_object_object_add (payload, "prompt", value))
xalloc_die ();
}
const char *payload_as_string =
json_object_to_json_string_ext (payload, JSON_C_TO_STRING_PLAIN
| JSON_C_TO_STRING_NOSLASHESCAPE);
if (payload_as_string == NULL)
xalloc_die ();
/* Make the request to the ollama server. */
if (postprocess != NULL)
{
/* Open a pipe to a subprocess. */
const char *sub_argv[4];
sub_argv[0] = BOURNE_SHELL;
sub_argv[1] = "-c";
sub_argv[2] = postprocess;
sub_argv[3] = NULL;
int fd[1];
pid_t child = create_pipe_out (BOURNE_SHELL, BOURNE_SHELL, sub_argv, NULL,
NULL, NULL, false, true, true, fd);
/* Ignore SIGPIPE here. We don't care if the subprocesses terminates
successfully without having read all of the input that we feed it. */
void (*orig_sigpipe_handler)(int);
orig_sigpipe_handler = signal (SIGPIPE, SIG_IGN);
do_request (url, payload_as_string, fd[0]);
close (fd[0]);
signal (SIGPIPE, orig_sigpipe_handler);
/* Remove zombie process from process list, and retrieve exit status. */
int exitstatus =
wait_subprocess (child, BOURNE_SHELL, true, false, true, true, NULL);
return exitstatus;
}
else
{
do_request (url, payload_as_string, STDOUT_FILENO);
return EXIT_SUCCESS;
}
}
/*
* Local Variables:
* run-command: "echo 'Translate into German: "Welcome to the GNU project!"' | ./spit --model=ministral-3:14b"
* End:
*/

View File

@ -0,0 +1,800 @@
#! /usr/bin/env python3
#
# Copyright (C) 2001-2025 Free Software Foundation, Inc.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
#
# Written by Bruno Haible <bruno@clisp.org>, 2025.
# This program passes an input to an ollama instance and prints the response.
#
# Dependencies: request.
import sys
# Documentation: https://docs.python.org/3/library/argparse.html
import argparse
# Documentation: https://requests.readthedocs.io/en/latest/
import requests
# Documentation: https://docs.python.org/3/library/json.html
import json
# Documentation: https://docs.python.org/3/library/subprocess.html
import subprocess
# Converted from lang-table.c:
language_table = {
"aa": "Afar",
"ab": "Abkhazian",
"ace": "Achinese",
"ae": "Avestan",
"af": "Afrikaans",
"ak": "Akan",
"am": "Amharic",
"an": "Aragonese",
"ang": "Old English",
"ar": "Arabic",
"arn": "Mapudungun",
"as": "Assamese",
"ast": "Asturian",
"av": "Avaric",
"awa": "Awadhi",
"ay": "Aymara",
"az": "Azerbaijani",
"ba": "Bashkir",
"bal": "Baluchi",
"ban": "Balinese",
"be": "Belarusian",
"bej": "Beja",
"bem": "Bemba",
"bg": "Bulgarian",
"bh": "Bihari",
"bho": "Bhojpuri",
"bi": "Bislama",
"bik": "Bikol",
"bin": "Bini",
"bm": "Bambara",
"bn": "Bengali",
"bo": "Tibetan",
"br": "Breton",
"bs": "Bosnian",
"bug": "Buginese",
"ca": "Catalan",
"ce": "Chechen",
"ceb": "Cebuano",
"ch": "Chamorro",
"co": "Corsican",
"cr": "Cree",
"crh": "Crimean Tatar",
"cs": "Czech",
"csb": "Kashubian",
"cu": "Church Slavic",
"cv": "Chuvash",
"cy": "Welsh",
"da": "Danish",
"de": "German",
"din": "Dinka",
"doi": "Dogri",
"dsb": "Lower Sorbian",
"dv": "Divehi",
"dz": "Dzongkha",
"ee": "Ewe",
"el": "Greek",
"en": "English",
"eo": "Esperanto",
"es": "Spanish",
"et": "Estonian",
"eu": "Basque",
"fa": "Persian",
"ff": "Fulah",
"fi": "Finnish",
"fil": "Filipino",
"fj": "Fijian",
"fo": "Faroese",
"fon": "Fon",
"fr": "French",
"fur": "Friulian",
"fy": "Western Frisian",
"ga": "Irish",
"gd": "Scottish Gaelic",
"gl": "Galician",
"gn": "Guarani",
"gon": "Gondi",
"gsw": "Swiss German", # can also be "Alsatian"
"gu": "Gujarati",
"gv": "Manx",
"ha": "Hausa",
"he": "Hebrew",
"hi": "Hindi",
"hil": "Hiligaynon",
"hmn": "Hmong",
"ho": "Hiri Motu",
"hr": "Croatian",
"hsb": "Upper Sorbian",
"ht": "Haitian",
"hu": "Hungarian",
"hy": "Armenian",
"hz": "Herero",
"ia": "Interlingua",
"id": "Indonesian",
"ie": "Interlingue",
"ig": "Igbo",
"ii": "Sichuan Yi",
"ik": "Inupiak",
"ilo": "Iloko",
"is": "Icelandic",
"it": "Italian",
"iu": "Inuktitut",
"ja": "Japanese",
"jab": "Hyam",
"jv": "Javanese",
"ka": "Georgian",
"kab": "Kabyle",
"kaj": "Jju",
"kam": "Kamba",
"kbd": "Kabardian",
"kcg": "Tyap",
"kdm": "Kagoma",
"kg": "Kongo",
"ki": "Kikuyu",
"kj": "Kuanyama",
"kk": "Kazakh",
"kl": "Kalaallisut",
"km": "Central Khmer",
"kmb": "Kimbundu",
"kn": "Kannada",
"ko": "Korean",
"kr": "Kanuri",
"kru": "Kurukh",
"ks": "Kashmiri",
"ku": "Kurdish",
"kv": "Komi",
"kw": "Cornish",
"ky": "Kirghiz",
"kok": "Konkani",
"la": "Latin",
"lb": "Letzeburgesch",
"lg": "Ganda",
"li": "Limburgish",
"ln": "Lingala",
"lo": "Laotian",
"lt": "Lithuanian",
"lu": "Luba-Katanga",
"lua": "Luba-Lulua",
"luo": "Luo",
"lv": "Latvian",
"mad": "Madurese",
"mag": "Magahi",
"mai": "Maithili",
"mak": "Makasar",
"man": "Mandingo",
"men": "Mende",
"mg": "Malagasy",
"mh": "Marshallese",
"mi": "Maori",
"min": "Minangkabau",
"mk": "Macedonian",
"ml": "Malayalam",
"mn": "Mongolian",
"mni": "Manipuri",
"mo": "Moldavian",
"moh": "Mohawk",
"mos": "Mossi",
"mr": "Marathi",
"ms": "Malay",
"mt": "Maltese",
"mwr": "Marwari",
"my": "Burmese",
"myn": "Mayan",
"na": "Nauru",
"nap": "Neapolitan",
"nah": "Nahuatl",
"nb": "Norwegian Bokmal",
"nd": "North Ndebele",
"nds": "Low Saxon",
"ne": "Nepali",
"ng": "Ndonga",
"nl": "Dutch",
"nn": "Norwegian Nynorsk",
"no": "Norwegian",
"nr": "South Ndebele",
"nso": "Northern Sotho",
"nv": "Navajo",
"ny": "Nyanja",
"nym": "Nyamwezi",
"nyn": "Nyankole",
"oc": "Occitan",
"oj": "Ojibwa",
"om": "(Afan) Oromo",
"or": "Oriya",
"os": "Ossetian",
"pa": "Punjabi",
"pag": "Pangasinan",
"pam": "Pampanga",
"pap": "Papiamento",
"pbb": "Páez",
"pi": "Pali",
"pl": "Polish",
"ps": "Pashto",
"pt": "Portuguese",
"qu": "Quechua",
"raj": "Rajasthani",
"rm": "Romansh",
"rn": "Kirundi",
"ro": "Romanian",
"ru": "Russian",
"rw": "Kinyarwanda",
"sa": "Sanskrit",
"sah": "Yakut",
"sas": "Sasak",
"sat": "Santali",
"sc": "Sardinian",
"scn": "Sicilian",
"sd": "Sindhi",
"se": "Northern Sami",
"sg": "Sango",
"shn": "Shan",
"si": "Sinhala",
"sid": "Sidamo",
"sk": "Slovak",
"sl": "Slovenian",
"sm": "Samoan",
"sma": "Southern Sami",
"smj": "Lule Sami",
"smn": "Inari Sami",
"sms": "Skolt Sami",
"sn": "Shona",
"so": "Somali",
"sq": "Albanian",
"sr": "Serbian",
"srr": "Serer",
"ss": "Siswati",
"st": "Sesotho",
"su": "Sundanese",
"suk": "Sukuma",
"sus": "Susu",
"sv": "Swedish",
"sw": "Swahili",
"ta": "Tamil",
"te": "Telugu",
"tem": "Timne",
"tet": "Tetum",
"tg": "Tajik",
"th": "Thai",
"ti": "Tigrinya",
"tiv": "Tiv",
"tk": "Turkmen",
"tl": "Tagalog",
"tn": "Setswana",
"to": "Tonga",
"tr": "Turkish",
"ts": "Tsonga",
"tt": "Tatar",
"tum": "Tumbuka",
"tw": "Twi",
"ty": "Tahitian",
"ug": "Uighur",
"uk": "Ukrainian",
"umb": "Umbundu",
"ur": "Urdu",
"uz": "Uzbek",
"ve": "Venda",
"vi": "Vietnamese",
"vo": "Volapuk",
"wal": "Walamo",
"war": "Waray",
"wen": "Sorbian",
"wo": "Wolof",
"xh": "Xhosa",
"yao": "Yao",
"yi": "Yiddish",
"yo": "Yoruba",
"za": "Zhuang",
"zh": "Chinese",
"zu": "Zulu",
"zap": "Zapotec",
}
language_variant_table = {
"de_AT": "Austrian",
"en_GB": "English (British)",
"es_AR": "Argentinian",
"es_IC": "Spanish (Canary Islands)",
"pt_BR": "Brazilian Portuguese",
"zh_CN": "Chinese (simplified)",
"zh_HK": "Chinese (Hong Kong)",
"zh_TW": "Chinese (traditional)",
}
# Converted from country-table.c:
country_table = {
"AD": "Andorra",
"AE": "United Arab Emirates",
"AF": "Afghanistan",
"AG": "Antigua and Barbuda",
"AI": "Anguilla",
"AL": "Albania",
"AM": "Armenia",
"AO": "Angola",
"AQ": "Antarctica",
"AR": "Argentina",
"AS": "American Samoa",
"AT": "Austria",
"AU": "Australia",
"AW": "Aruba",
"AX": "Åland Islands",
"AZ": "Azerbaijan",
"BA": "Bosnia and Herzegovina",
"BB": "Barbados",
"BD": "Bangladesh",
"BE": "Belgium",
"BF": "Burkina Faso",
"BG": "Bulgaria",
"BH": "Bahrain",
"BI": "Burundi",
"BJ": "Benin",
"BL": "Saint Barthélemy",
"BM": "Bermuda",
"BN": "Brunei Darussalam",
"BO": "Bolivia, Plurinational State of",
"BQ": "Bonaire, Sint Eustatius and Saba",
"BR": "Brazil",
"BS": "Bahamas",
"BT": "Bhutan",
"BV": "Bouvet Island",
"BW": "Botswana",
"BY": "Belarus",
"BZ": "Belize",
"CA": "Canada",
"CC": "Cocos (Keeling) Islands",
"CD": "Congo, Democratic Republic of the",
"CF": "Central African Republic",
"CG": "Congo",
"CH": "Switzerland",
"CI": "Côte d'Ivoire",
"CK": "Cook Islands",
"CL": "Chile",
"CM": "Cameroon",
"CN": "China",
"CO": "Colombia",
"CR": "Costa Rica",
"CU": "Cuba",
"CV": "Cabo Verde",
"CW": "Curaçao",
"CX": "Christmas Island",
"CY": "Cyprus",
"CZ": "Czechia",
"DE": "Germany",
"DJ": "Djibouti",
"DK": "Denmark",
"DM": "Dominica",
"DO": "Dominican Republic",
"DZ": "Algeria",
"EC": "Ecuador",
"EE": "Estonia",
"EG": "Egypt",
"EH": "Western Sahara",
"ER": "Eritrea",
"ES": "Spain",
"ET": "Ethiopia",
"FI": "Finland",
"FJ": "Fiji",
"FK": "Falkland Islands (Malvinas)",
"FM": "Micronesia, Federated States of",
"FO": "Faroe Islands",
"FR": "France",
"GA": "Gabon",
"GB": "United Kingdom of Great Britain and Northern Ireland",
"GD": "Grenada",
"GE": "Georgia",
"GF": "French Guiana",
"GG": "Guernsey",
"GH": "Ghana",
"GI": "Gibraltar",
"GL": "Greenland",
"GM": "Gambia",
"GN": "Guinea",
"GP": "Guadeloupe",
"GQ": "Equatorial Guinea",
"GR": "Greece",
"GS": "South Georgia and the South Sandwich Islands",
"GT": "Guatemala",
"GU": "Guam",
"GW": "Guinea-Bissau",
"GY": "Guyana",
"HK": "Hong Kong",
"HM": "Heard Island and McDonald Islands",
"HN": "Honduras",
"HR": "Croatia",
"HT": "Haiti",
"HU": "Hungary",
"ID": "Indonesia",
"IE": "Ireland",
"IL": "Israel",
"IM": "Isle of Man",
"IN": "India",
"IO": "British Indian Ocean Territory",
"IQ": "Iraq",
"IR": "Iran, Islamic Republic of",
"IS": "Iceland",
"IT": "Italy",
"JE": "Jersey",
"JM": "Jamaica",
"JO": "Jordan",
"JP": "Japan",
"KE": "Kenya",
"KG": "Kyrgyzstan",
"KH": "Cambodia",
"KI": "Kiribati",
"KM": "Comoros",
"KN": "Saint Kitts and Nevis",
"KP": "Korea, Democratic People's Republic of",
"KR": "Korea, Republic of",
"KW": "Kuwait",
"KY": "Cayman Islands",
"KZ": "Kazakhstan",
"LA": "Lao People's Democratic Republic",
"LB": "Lebanon",
"LC": "Saint Lucia",
"LI": "Liechtenstein",
"LK": "Sri Lanka",
"LR": "Liberia",
"LS": "Lesotho",
"LT": "Lithuania",
"LU": "Luxembourg",
"LV": "Latvia",
"LY": "Libya",
"MA": "Morocco",
"MC": "Monaco",
"MD": "Moldova, Republic of",
"ME": "Montenegro",
"MF": "Saint Martin (French part)",
"MG": "Madagascar",
"MH": "Marshall Islands",
"MK": "North Macedonia",
"ML": "Mali",
"MM": "Myanmar",
"MN": "Mongolia",
"MO": "Macao",
"MP": "Northern Mariana Islands",
"MQ": "Martinique",
"MR": "Mauritania",
"MS": "Montserrat",
"MT": "Malta",
"MU": "Mauritius",
"MV": "Maldives",
"MW": "Malawi",
"MX": "Mexico",
"MY": "Malaysia",
"MZ": "Mozambique",
"NA": "Namibia",
"NC": "New Caledonia",
"NE": "Niger",
"NF": "Norfolk Island",
"NG": "Nigeria",
"NI": "Nicaragua",
"NL": "Netherlands, Kingdom of the",
"NO": "Norway",
"NP": "Nepal",
"NR": "Nauru",
"NU": "Niue",
"NZ": "New Zealand",
"OM": "Oman",
"PA": "Panama",
"PE": "Peru",
"PF": "French Polynesia",
"PG": "Papua New Guinea",
"PH": "Philippines",
"PK": "Pakistan",
"PL": "Poland",
"PM": "Saint Pierre and Miquelon",
"PN": "Pitcairn",
"PR": "Puerto Rico",
"PS": "Palestine, State of",
"PT": "Portugal",
"PW": "Palau",
"PY": "Paraguay",
"QA": "Qatar",
"RE": "Réunion",
"RO": "Romania",
"RS": "Serbia",
"RU": "Russian Federation",
"RW": "Rwanda",
"SA": "Saudi Arabia",
"SB": "Solomon Islands",
"SC": "Seychelles",
"SD": "Sudan",
"SE": "Sweden",
"SG": "Singapore",
"SH": "Saint Helena, Ascension and Tristan da Cunha",
"SI": "Slovenia",
"SJ": "Svalbard and Jan Mayen",
"SK": "Slovakia",
"SL": "Sierra Leone",
"SM": "San Marino",
"SN": "Senegal",
"SO": "Somalia",
"SR": "Suriname",
"SS": "South Sudan",
"ST": "Sao Tome and Principe",
"SV": "El Salvador",
"SX": "Sint Maarten (Dutch part)",
"SY": "Syrian Arab Republic",
"SZ": "Eswatini",
"TC": "Turks and Caicos Islands",
"TD": "Chad",
"TF": "French Southern Territories",
"TG": "Togo",
"TH": "Thailand",
"TJ": "Tajikistan",
"TK": "Tokelau",
"TL": "Timor-Leste",
"TM": "Turkmenistan",
"TN": "Tunisia",
"TO": "Tonga",
"TR": "Türkiye",
"TT": "Trinidad and Tobago",
"TV": "Tuvalu",
"TW": "Taiwan, Province of China",
"TZ": "Tanzania, United Republic of",
"UA": "Ukraine",
"UG": "Uganda",
"UM": "United States Minor Outlying Islands",
"US": "United States of America",
"UY": "Uruguay",
"UZ": "Uzbekistan",
"VA": "Holy See",
"VC": "Saint Vincent and the Grenadines",
"VE": "Venezuela, Bolivarian Republic of",
"VG": "Virgin Islands (British)",
"VI": "Virgin Islands (U.S.)",
"VN": "Viet Nam",
"VU": "Vanuatu",
"WF": "Wallis and Futuna",
"WS": "Samoa",
"YE": "Yemen",
"YT": "Mayotte",
"ZA": "South Africa",
"ZM": "Zambia",
"ZW": "Zimbabwe",
}
def englishname_of_language(language):
'''Returns the English name of a language (lowercase ISO 639 code),
or None if unknown.'''
return language_table.get(language, None)
def englishname_of_country(country):
'''Returns the English name of a country (uppercase ISO 3166 code),
or None if unknown.'''
return country_table.get(country, None)
def language_in_english(catalogname):
'''Returns a name or description of a catalog name.'''
underscore = catalogname.find('_')
if underscore >= 0:
# Treat a few cases specially.
english = language_variant_table.get(catalogname, None)
if english != None:
return english
# Decompose "ll_CC" into "ll" and "CC".
language = catalogname[:underscore]
country = catalogname[underscore+1:]
english_language = englishname_of_language(language)
if english_language != None:
english_country = englishname_of_country(country)
if english_country != None:
return "%s (as spoken in %s)" % (english_language, english_country)
else:
return english_language
else:
return catalogname
else:
# It's a simple language name.
english_language = englishname_of_language(catalogname)
if english_language != None:
return english_language
else:
return catalogname
def do_request(url, payload, stream):
'''Make the HTTP POST request to the given URL, sending its output
to the stream STREAM.'''
response = requests.post(url, data=payload, stream=True)
if response.status_code != 200:
print('Status:', response.status_code, file=sys.stderr)
if response.status_code >= 400:
print('Body:', response.text, file=sys.stderr)
sys.exit(1)
# Not needed any more:
#response.raise_for_status()
for line in response.iter_lines():
part = json.loads(line.decode('utf-8'))
print(part.get('response', ''), end='', flush=True, file=stream)
def main():
parser = argparse.ArgumentParser(
prog='spit',
usage='spit --help',
add_help=False)
parser.add_argument('--species',
dest='species',
default='ollama')
parser.add_argument('--url',
dest='url',
default='http://localhost:11434')
parser.add_argument('--model', '-m',
dest='model',
default=None,
nargs=1)
parser.add_argument('--to',
dest='to_language',
default=None,
nargs=1)
parser.add_argument('--prompt',
dest='prompt',
default=None,
nargs=1)
parser.add_argument('--postprocess',
dest='postprocess',
default=None,
nargs=1)
parser.add_argument('--help', '--hel', '--he', '--h', '-h',
dest='help',
default=None,
action='store_true')
parser.add_argument('--version', '--versio', '--versi', '--vers', '--ver', '--ve', '--v', '-V',
dest='version',
default=None,
action='store_true')
# All other arguments are collected.
parser.add_argument('non_option_arguments',
nargs='*')
# Parse the given arguments. Don't signal an error if non-option arguments
# occur between or after options.
cmdargs, unhandled = parser.parse_known_args()
# Handle --version, ignoring all other options.
if cmdargs.version != None:
print('''
spit (GNU gettext-tools) @VERSION@
Copyright (C) 2025 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Written by Bruno Haible.
''')
sys.exit(0)
# Handle --help, ignoring all other options.
if cmdargs.help != None:
print('''
Usage: spit [OPTION...]
Passes standard input to a Large Language Model (LLM) instance and prints
the response.
With the --to option, it translates standard input to the specified language
through a Large Language Model (LLM) and prints the translation.
Warning: The output might not be what you expect.
It might be of the wrong form, be of poor quality, or reflect some biases.
Options:
--species=TYPE Specifies the type of LLM. The default and only
valid value is 'ollama'.
--url=URL Specifies the URL of the server that runs the LLM.
-m, --model=MODEL Specifies the model to use.
--to=LANGUAGE Specifies the target language.
--prompt=TEXT Specifies the prompt to use before standard input.
This option overrides the --to option.
--postprocess=COMMAND Specifies a command to post-process the output.
Informative output:
-h, --help Display this help and exit.
-V, --version Output version information and exit.
Report bugs in the bug tracker at <https://savannah.gnu.org/projects/gettext>
or by email to <bug-gettext@gnu.org>.
''')
sys.exit(0)
# Report unhandled arguments.
for arg in unhandled:
if arg.startswith('-'):
message = '%s: Unrecognized option \'%s\'.\n' % ('spit', arg)
message += 'Try \'spit --help\' for more information.\n'
sys.stderr.write(message)
sys.exit(1)
# By now, all unhandled arguments were non-options.
cmdargs.non_option_arguments += unhandled
# Test for extraneous arguments.
if len(cmdargs.non_option_arguments) > 0:
message = '%s: too many arguments\n' % 'spit'
message += 'Try \'spit --help\' for more information.\n'
sys.stderr.write(message)
sys.exit(1)
# Check --species option.
if cmdargs.species != 'ollama':
sys.stderr.write('%s: invalid value for --species option: %s' % ('spit', cmdargs.species))
sys.exit(1)
# Check --model option.
if cmdargs.model == None:
sys.stderr.write('%s: missing --model option\n' % 'spit')
sys.exit(1)
model = cmdargs.model[0]
to_language = None
if cmdargs.to_language != None:
to_language = cmdargs.to_language[0]
prompt = None
if cmdargs.prompt != None:
prompt = cmdargs.prompt[0]
postprocess = None
if cmdargs.postprocess != None:
postprocess = cmdargs.postprocess[0]
# Sanitize URL.
url = cmdargs.url
if not url.endswith('/'):
url += '/'
# Read the contents of standard input.
input = sys.stdin.read()
# Compute a default prompt.
if prompt == None and to_language != None:
prompt = 'Translate into ' + language_in_english(to_language) + ':'
# Prepend the prompt.
if prompt != None:
input = prompt + '\n' + input
# For debugging.
#print(input)
# Documentation of the ollama API:
# <https://docs.ollama.com/api/generate>
url = url + 'api/generate'
payload = { 'model': model, 'prompt': input }
# We need the payload in JSON syntax (with double-quotes around the strings),
# not in Python syntax (with single-quotes around the strings):
payload = json.dumps(payload)
# Make the request to the ollama server.
if postprocess != None:
pipe = subprocess.Popen(["sh", "-c", postprocess],
stdin=subprocess.PIPE, text=True)
try:
do_request(url, payload, pipe.stdin)
finally:
pipe.stdin.close()
pipe.wait()
else:
do_request(url, payload, sys.stdout)
if __name__ == '__main__':
main()