If configuring the network times out, there will be no /run/net-*.conf
files present, and the attempt to source any interface config file will
fail. In this failure mode, dash (or 'bash --posix') exits immediately,
whether or not 'set -e' is in effect.
This precludes a caller from (cleanly) handling network bring-up
failure, particularly if the caller cares about the variables set from
sourcing the ipconfig config file.
```
paul@haley ~ % cat repro.sh
#!/bin/sh
. /nonexistent
echo hello
paul@haley ~ % dash ./repro.sh
./repro.sh: 3: .: cannot open /nonexistent: No such file
paul@haley ~ % sh ./repro.sh
./repro.sh: 3: .: cannot open /nonexistent: No such file
paul@haley ~ % bash ./repro.sh
./repro.sh: line 3: /nonexistent: No such file or directory
hello
paul@haley ~ % bash --posix ./repro.sh
./repro.sh: line 3: /nonexistent: No such file or directory
paul@haley ~ %
```
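
One way a caller can avoid this exit is to test for the file before sourcing it. The following is a minimal sketch only; `source_if_readable` is a hypothetical helper name, not a function in initramfs-tools.

```sh
#!/bin/sh
# Hypothetical helper: source a file only if it is readable, so that a
# missing file yields a non-zero return instead of terminating the shell.
source_if_readable() {
	[ -r "$1" ] || return 1
	# shellcheck disable=SC1090
	. "$1"
}

if source_if_readable /nonexistent; then
	echo "config loaded"
else
	echo "config missing, handling the failure"
fi
echo hello
```

With this guard the final `echo hello` is reached in dash as well as in bash.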
Co-authored-by: Pierre Neyron <pierre.neyron@imag.fr>
Closes: #1025730
To support using `set -eu` in initramfs-tools scripts (e.g. in
kdump-tools), the functions in `scripts/functions` need to support
`set -u` as well.
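
As an illustration of what `set -u` support usually involves, the sketch below uses a default-value expansion so that an unset variable does not abort the script. The helper name `is_quiet` and the choice of the `quiet` variable are assumptions made for the example.

```sh
#!/bin/sh
set -u

# Hypothetical helper: report whether "quiet" was requested.
# ${quiet:-n} falls back to "n" when the variable is unset, so the
# expansion does not abort the script under 'set -u'.
is_quiet() {
	[ "${quiet:-n}" = "y" ]
}

unset quiet
if is_quiet; then
	echo "quiet"
else
	echo "verbose"
fi
```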
Signed-off-by: Benjamin Drung <bdrung@debian.org>
live-boot generates `/etc/hostname`, `/etc/hosts`, and
`/etc/resolv.conf` in `do_netsetup` from `9990-networking.sh`.
kdump-tools needs similar code for generating a self-contained initrd
for dumping a crashed kernel via network.
Ubuntu carries a patch for initramfs-tools that adds
`netinfo_to_resolv_conf` to configure `/etc/resolv.conf` which is more
complex than the code from live-boot.
To prevent code duplication, implement `netinfo_to_resolv_conf` and
`persist_hostname` as helper functions and let `configure_networking`
call these functions.
Let `netinfo_to_resolv_conf` support multiple `/run/net-<device>` style
files to allow Ubuntu to reuse this code.
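
The idea of merging several per-device files can be sketched roughly as follows. This is not the code added by this commit: `resolv_conf_sketch` is a hypothetical name, and the variable names `IPV4DNS0`, `IPV4DNS1` and `DNSDOMAIN`, as well as the treatment of `0.0.0.0` as "unset", are assumptions about the ipconfig output format.

```sh
#!/bin/sh
# Illustrative sketch only, not the initramfs-tools implementation.
# Usage: resolv_conf_sketch OUTPUT CONF...
resolv_conf_sketch() {
	out="$1"
	shift
	: > "$out"
	for conf in "$@"; do
		[ -r "$conf" ] || continue
		(
			# The subshell keeps variables from one file out of the next.
			IPV4DNS0="" IPV4DNS1="" DNSDOMAIN=""
			# shellcheck disable=SC1090
			. "$conf"
			[ -z "$DNSDOMAIN" ] || printf 'search %s\n' "$DNSDOMAIN"
			for ns in "$IPV4DNS0" "$IPV4DNS1"; do
				# Assumption: 0.0.0.0 marks an unset server.
				case "$ns" in
				"" | 0.0.0.0) ;;
				*) printf 'nameserver %s\n' "$ns" ;;
				esac
			done
		) >> "$out"
	done
}
```

A call such as `resolv_conf_sketch /etc/resolv.conf /run/net-*.conf` would cover every configured device. A fuller version would also need to merge duplicate entries.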
Signed-off-by: Benjamin Drung <bdrung@debian.org>
configure_networking can now wait for a named net device, but the net
device may be specified by hardware (MAC) address or not at all.
* Factor out the hardware-address-to-device lookup into a function
* Add a function to check whether any suitable device exists
* Change the wait loop to use the appropriate check for device
existence
* Update the IP variable after the wait loop, so it follows the
hardware-address-to-device lookup
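
A hardware-address-to-device lookup of the kind listed above could look roughly like this. The function name, the optional second argument and the assumption that the address is already lower-case (as sysfs reports it) are all illustrative.

```sh
#!/bin/sh
# Hypothetical helper: print the name of the interface whose hardware
# address equals $1. The optional $2 overrides the sysfs directory,
# which makes the function easy to test.
hwaddr_to_dev_sketch() {
	want="$1"
	sysnet="${2:-/sys/class/net}"
	for dir in "$sysnet"/*; do
		[ -r "$dir/address" ] || continue
		read -r addr < "$dir/address" || true
		if [ "$addr" = "$want" ]; then
			echo "${dir##*/}"
			return 0
		fi
	done
	return 1
}

# Example call (the address is made up):
# dev=$(hwaddr_to_dev_sketch 00:11:22:33:44:55) || echo "no such device yet"
```

Because the function returns non-zero until a matching device exists, it can double as the existence check inside a wait loop.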
Closes: #911727
Signed-off-by: Ben Hutchings <benh@debian.org>
Make the order of precedence for setting device name explicit.
Split _handle_device_vs_ip into two functions, one to update DEVICE
and one to update IP. This allows setting the device name in a way
that obviously follows the order of precedence, and also enables
improvements to the device wait loop.
Signed-off-by: Ben Hutchings <benh@debian.org>
If the busybox-static package is installed, the modprobe implementation
used will be the one from busybox, which behaves slightly differently.
Specifically, the busybox implementation does not support `install`
commands from modprobe.d conf files:
https://git.busybox.net/busybox/tree/modutils/modprobe.c?h=1_31_stable#n279
Since mkinitramfs already ensures that /sbin/modprobe is copied into
/sbin for the initrd, it is safe to fully-qualify the modprobe call and
never invoke the busybox version.
In some old shell versions, string comparisons in [ ... ] could go
wrong if the first argument began with certain characters. It has
been common practice to avoid this problem by prefixing both sides
with 'x'.
bash and dash have not had this problem for well over a decade, so
clean this up.
Further details are at <https://www.shellcheck.net/wiki/SC2268>.
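
For illustration, the two styles side by side (the variable and its value are arbitrary):

```sh
#!/bin/sh
value="-n"

# Old defensive style: prefix both sides with 'x'.
if [ "x$value" = "x-n" ]; then
	echo "old style matched"
fi

# Modern style: quoting alone is sufficient in current bash and dash.
if [ "$value" = "-n" ]; then
	echo "plain comparison matched"
fi
```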
Signed-off-by: Ben Hutchings <benh@debian.org>
configure_networking() will issue a `udevadm settle` before trying to
configure an interface. However that's just a "best effort" mechanism
for waiting until all NICs have been discovered. There is no way to
*actually* know that all NICs have been discovered. The USB protocol sets
no time limit on enumeration, for example, and I have a USB NIC that
is consistently discovered after this point.
However, in the case that the user has told us which interface they expect
to be used in the initramfs (via ip=), we can just wait for it specifically.
Bail only if it hasn't appeared within 3 minutes. We can perhaps allow that
timeout to be overridden from the command line in the future.
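
The wait can be sketched as a simple polling loop. This is an illustration under assumptions, not the code in this commit: `wait_for_path_sketch` is a hypothetical helper, and polling `/sys/class/net/<device>` once per second is just one possible way to detect the interface.

```sh
#!/bin/sh
# Hypothetical helper: wait until path $1 exists, polling once per
# second, and give up after $2 seconds.
wait_for_path_sketch() {
	path="$1"
	remaining="$2"
	while [ ! -e "$path" ]; do
		[ "$remaining" -gt 0 ] || return 1
		sleep 1
		remaining=$((remaining - 1))
	done
	return 0
}

# Example: wait up to 180 seconds for a device named in ip=.
# wait_for_path_sketch /sys/class/net/eth0 180 || echo "eth0 never appeared"
```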
Closes: #965935
Signed-off-by: dann frazier <dannf@debian.org>
Generalize the elapsed time tracking in local-top so that it can be used
elsewhere. This requires some additional quoting in local_device_setup()
comparisons to pass shellcheck.
Now that the reference time is recorded earlier (in init vs. local-top),
the rootdev wait time will now be reduced by however long it takes to
process init-premount. The belief is that our wait time is sufficiently
long for that to be negligible. Also, this could potentially break any
local-top scripts that use $local_top_time directly. A survey of
the current packages in sid shows no package containing a file under
/usr/share/initramfs-tools/scripts/local-top/ that contains "local_top_time".
Signed-off-by: dann frazier <dannf@debian.org>
Negative timeout values are treated by the kernel as "reboot
immediately" and 0 is treated as "wait forever". Emulate this
behaviour in the panic() function.
Treat invalid (non-numeric) values the same as 0, which seems to match
what the kernel does. Previously we would ignore them completely and
open a shell as normal.
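
The mapping from timeout value to behaviour can be sketched as below. `panic_action_sketch` is a hypothetical helper that only prints a label for the action; it does not reboot anything and is not the code in panic().

```sh
#!/bin/sh
# Hypothetical helper: classify a panic= value and print the action a
# panic handler would take.
panic_action_sketch() {
	case "$1" in
	"" | - | *[!0-9-]* | ?*-*)
		# Not an integer: treated the same as 0.
		echo "wait-forever"
		;;
	*)
		if [ "$1" -lt 0 ]; then
			echo "reboot-now"
		elif [ "$1" -eq 0 ]; then
			echo "wait-forever"
		else
			echo "reboot-after-$1s"
		fi
		;;
	esac
}

panic_action_sketch -1
panic_action_sketch 0
panic_action_sketch bogus
```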
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently the _log_msg() function is "void" typed, with no explicit return,
which in terms of shell means it returns whatever its last command
returns. This function is the basic building block for all error/warning
messages in initramfs-tools.
It was noticed [0] that if a bad console is passed to the kernel on the
command line, printf (and apparently all write()-related functions) returns
an error, and this error is carried over by _log_msg(). It happens that the
checkfs() function has a loop that runs forever in this scenario (*if* fsck
is not present in the initramfs, and obviously if "quiet" is not given on the
command line). The situation is easily reproducible and various reports
dating back some years can be found. The reports are usually of the form
"machine can't boot if wrong console is provided" or slight variations
of that, almost always relating serial consoles to boot issues.
This patch proposes a pretty simple fix: return zero on _log_msg().
We should definitely not break the boot because of error logging functions;
one could argue we could fix checkfs() and that's true, until eventually
we find another subtle corner case of "misuse" of the _log_msg() return
value (after some debugging), and fix that too, and so on...
We could also argue that printf shouldn't return an error in this case,
and although that is a valid discussion, it's not worth making users wait
on it while boot is so easy to break, just by passing a wrong
kernel parameter (or having the underlying serial console device changed
to output to a different port than the one previously set on the kernel cmdline).
[0] bugs.launchpad.net/cloud-images/+bug/1573095/comments/46
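
The effect of the fix can be illustrated with a small stand-in. `log_msg_sketch` is a hypothetical simplification, not the exact _log_msg() code, and closing stdout is used here merely to simulate an unwritable console.

```sh
#!/bin/sh
# Illustrative sketch, not the exact initramfs-tools code.
log_msg_sketch() {
	if [ "${quiet:-n}" = "y" ]; then
		return 0
	fi
	# shellcheck disable=SC2059
	printf "$@"
	# Ignore printf's status so an unwritable console cannot make
	# callers believe that logging "failed".
	return 0
}

# Closing stdout makes printf fail, yet the function still returns 0.
if log_msg_sketch 'message\n' >&- 2>/dev/null; then
	echo "log_msg_sketch returned 0"
fi
```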
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
These all seem to be harmless in practice, as the parameter values
should not contain metacharacters.
In _checkfs_once() *do not* quote $spinner or $force; if these
are empty then we do not want to add arguments for them. Add a
comment to suppress the warning.
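
A short demonstration of why the quoting matters here; `count_args` is a throwaway helper and `-a` is an arbitrary extra argument:

```sh
#!/bin/sh
count_args() { echo "$#"; }

spinner=""

# Unquoted: an empty variable expands to nothing, so no argument is added.
# shellcheck disable=SC2086
count_args $spinner -a

# Quoted: an empty variable becomes an empty argument.
count_args "$spinner" -a
```

The first call prints 1 and the second prints 2; fsck would receive the extra empty argument in the quoted case.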
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
klibc's reboot implementation always calls reboot(2), whereas
busybox's reboot implementation defaults to signalling an init daemon
which doesn't exist in the initramfs. The solution is simply to use
the -f option, which both implementations accept.
This seems to have been broken since commit c04a9db5b70b "hooks/klibc:
Make us play more nicely with busybox and static bin/sh" which caused
busybox's reboot implementation to be preferred over klibc's.
The failure to reboot was previously reported as #751488, but only
worked around by forcing a kernel panic if it failed.
Closes: #923165
Related-to: #751488
Thanks: Michael Niewöhner <linux@mniewoehner.de>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
We always need to write the resume device name to the kernel, either
(a) to resume, if a suspend image is found, or (b) to enable suspend
to that device later. We let the kernel distinguish the two cases
itself.
Since we started reporting the attempt to resume through plymouth, if
the text front-end is used (currently the default in Debian) the
message remains on-screen and is confusing in case (b). (If the
frame-buffer front-end is used, we clear the message.)
I can't think of a message that usefully covers both cases, and in
case (b) it's not really necessary to report anything. Use fstype to
check for a suspend image before reporting that we're resuming.
Closes: #928736
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
There is no reason to avoid using a shell built-in implementation of
sleep here, not even the reason we had elsewhere of needing support
for fractional seconds.
Closes: #677049
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Resuming can be quite slow on a hard drive, and if plymouth is used
then the output of /bin/resume won't be seen. Show a message in
plymouth, then hide it if /bin/resume returns. Roughly based on a
patch by Mario Limonciello.
Thanks: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
_log_msg used to pass its arguments straight through to printf (but
only if "quiet" is not used), but this was changed by commits
f277309e0b6b and 3650731f3332. We still need it to do that, so that
when callers pass "\\n" it is replaced by a newline.
Change the callers to separate the format and argument strings, fixing
the format-safety problem that the earlier commits were intended to
address.
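
The calling convention can be sketched like this. `emit_sketch` and `warn_sketch` are hypothetical stand-ins for _log_msg and one of its callers.

```sh
#!/bin/sh
# Stand-in for a logger that passes its arguments straight to printf.
emit_sketch() {
	# shellcheck disable=SC2059
	printf "$@"
}

# The format string is fixed and the caller's text is passed as an
# argument, so a '%' in the text is printed literally.
warn_sketch() {
	emit_sketch 'Warning: %s\n' "$*"
}

warn_sketch "disk is 100% full"
```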
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
setupcon gained a --setup-dir option to support initramfs builders,
documented since console-setup 1.111. Use this to simplify our
own scripts.
Related-to: #620041
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
The kernel will automatically request filesystem modules as needed.
Also, the kernel now requests filesystem modules through an alias of
the form "fs-<type>" while we were still using the bare type name.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Panic if the root parameter is missing or empty, or if the mount
commands fail.
This is loosely based on a patch by G.raud.
Thanks: "G.raud" <graud@gmx.com>
Closes: #848906
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently get_fstype prints "unknown" on failure, and some callers
then have to check for this special string. Change it to return the
empty string on failure.
Most callers check whether it succeeded or not, so this doesn't
make any difference to them.
local_mount_root and hooks/fsck do check for "unknown", so update
them accordingly.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
In POSIX sh, echo flags are undefined. Busybox and dash support
'echo -n', but it is safer to use printf instead.
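
For example (the message text is arbitrary):

```sh
#!/bin/sh
# printf adds no trailing newline unless the format asks for one.
printf '%s' "working ... "
printf '%s\n' "done."
```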
Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>
Sometimes globbing and word splitting are wanted. Therefore explicitly
disable the check for these lines.
Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>
shellcheck found more issues than SC1074. Address most of these issues.
You can check the shell code by running:
```
shellcheck -e SC1090,SC1091 -s dash hook-functions $(find * -type f \
\( -executable ! -name rules -o -regex '.*\.\(post\|pre\).*' \
-o -regex "^\(docs\|scripts\)/.*" ! -name '*.md' \))
```
Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>
Commit bf238f6aceb206e25969d64f5b496eb5d051f481 removed the code that
was using the render function. The render function is not used any more
and can be removed.
Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>
When the IFS is set (e.g. to ",") the reading of /proc/consoles might
not correctly split the line into `console` and `rest`. Running
```
IFS=","
panic "error message"
```
will show:
```
Spawning shell within the initramfs as requested
sh: syntax error: unexpected "("
```
Therefore explicitly unset IFS in the panic function.
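
The splitting problem and the fix can be reproduced in isolation. In this sketch `first_field_sketch` is a hypothetical helper and the input line merely imitates the layout of /proc/consoles.

```sh
#!/bin/sh
# Hypothetical helper: print the first field of the first line on stdin.
first_field_sketch() {
	# Restore default splitting on spaces, tabs and newlines, whatever
	# the caller did to IFS.
	unset IFS
	# shellcheck disable=SC2034
	read -r console rest
	echo "$console"
}

IFS=","
# Sample line in the style of /proc/consoles (illustrative).
printf '%s\n' 'tty0 -WU (EC p ) 4:1' | first_field_sketch
unset IFS
```

Without the `unset IFS` inside the function, `console` would receive the whole line, parentheses included.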
Signed-off-by: Benjamin Drung <benjamin.drung@cloud.ionos.com>
If the BOOTIF parameter or the DEVICE configuration parameter is set,
and the ip parameter is also set but does not specify a device name,
we need to convey the device name to ipconfig as well as the ip
parameter value.
Previously we didn't do this if the value was a colon-separated list.
But now that we can parse the list correctly, it's easy to substitute
the device name into it.
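
A rough sketch of such a substitution, assuming the colon-separated form of ip= in which the device is the sixth field. `ip_fill_device_sketch` is a hypothetical name, and `local` is a widely supported extension rather than POSIX.

```sh
#!/bin/sh
# Hypothetical helper: print ip= value $1 with device $2 filled into
# the sixth field when that field is empty.
ip_fill_device_sketch() {
	local dev="$2" IFS=:
	set -f
	# shellcheck disable=SC2086
	set -- $1
	set +f
	[ -z "${6:-}" ] || dev="$6"
	out="${1:-}:${2:-}:${3:-}:${4:-}:${5:-}:${dev}"
	if [ "$#" -gt 6 ]; then
		shift 6
		# With IFS=":" the remaining fields are joined by colons.
		out="$out:$*"
	fi
	echo "$out"
}

ip_fill_device_sketch "192.168.0.2::192.168.0.1:255.255.255.0:host::off" eth0
```

A device that is already present in the list is left untouched.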
Closes: #721088
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Parameter expansion is not useful for parsing delimited lists of
arbitrary length. Instead, set IFS=:, disable wildcard expansion, and
use "set". Use a separate function, as this makes it easy to change
$IFS temporarily.
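
A minimal sketch of the technique follows. `parse_ip_sketch` is a hypothetical function, the field positions follow the kernel's documented ip= layout, and `local` is assumed to be available (it is in dash, bash and busybox sh, but not in POSIX).

```sh
#!/bin/sh
# Hypothetical helper: split an ip= value on colons and print a few
# of the resulting fields.
parse_ip_sketch() {
	# 'local' confines the IFS change to this function.
	local IFS=:
	# Disable wildcard expansion so a '*' in the value stays literal.
	set -f
	# shellcheck disable=SC2086
	set -- $1
	set +f
	echo "client=${1:-} device=${6:-} autoconf=${7:-}"
}

parse_ip_sketch "192.168.0.2::192.168.0.1:255.255.255.0::eth0:off"
```

Empty fields between adjacent colons are preserved, so the field positions stay stable.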
Also do the parsing earlier, so we can avoid a redundant invocation of
ipconfig.
This was mentioned in #721088, but it's not the main bug reported.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently we don't wait for the resume device to appear, and will boot
without resuming if it is too slow to appear (e.g. USB storage or, in
the reported case, an NVMe device!).
Use local_device_setup to wait for the device, the same as we do for
the root and /usr devices. This also takes care of resolving UUID=
and LABEL= syntax, and adds support for PARTUUID= and PARTLABEL=.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Closes: #854791
There is no point in looking for the resume device if the kernel
doesn't support resuming from disk.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
If the root or /usr filesystem is missing, we cannot continue
booting and must panic. This is not true for the resume device,
and a missing resume device has not been a fatal error up until
now.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Instead of counting how many times we wait and poll for each critical
device (root or /usr) to appear, use /proc/uptime to tell how long we
have waited in total (starting from when local_top runs).
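
Reading the elapsed time could be sketched as follows. `uptime_secs_sketch` is a hypothetical helper; the optional argument exists only so the logic can be exercised without a real /proc.

```sh
#!/bin/sh
# Hypothetical helper: print the whole seconds from an uptime file.
uptime_secs_sketch() {
	# shellcheck disable=SC2034
	read -r up rest < "${1:-/proc/uptime}"
	# Drop the fractional part: "12.34" becomes "12".
	echo "${up%%.*}"
}

# Usage sketch: record a start time, then compare later.
# start=$(uptime_secs_sketch)
# ... poll for the device ...
# elapsed=$(( $(uptime_secs_sketch) - start ))
```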
This is complicated by the hack mdadm's local-block uses to decide
when 2/3 of the time limit has expired. Use an even worse hack to
keep it working.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
LILO allows specifying an md-RAID device as root, e.g. root=/dev/md0,
but as usual translates it into a device number, e.g. root=900.
parse_numeric translates device numbers back into e.g. /dev/block/9:0
but then tries to canonicalise them using readlink -f. This works OK
for simple disk devices whose drivers will have initialised by the
point it runs, but not for stacked disk devices which will be created
later.
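
The translation from a hexadecimal device number to a /dev/block path can be sketched like this. `devnum_to_block_sketch` is a hypothetical helper that assumes valid hexadecimal input and the kernel's usual major/minor bit layout; it is not the parse_numeric code itself.

```sh
#!/bin/sh
# Hypothetical helper: turn a hexadecimal device number such as "900"
# into /dev/block/<major>:<minor>.
devnum_to_block_sketch() {
	value=$(( 0x$1 ))
	major=$(( (value >> 8) & 0xfff ))
	minor=$(( (value & 0xff) | ((value >> 12) & 0xfff00) ))
	echo "/dev/block/$major:$minor"
}

devnum_to_block_sketch 900
```

For root=900 this prints /dev/block/9:0, which only becomes resolvable once the md device has been assembled.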
We could move parse_numeric later, but I think it's preferable to
handle this weird special case before running external scripts that
use $ROOT.
Closes: #815555
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Currently the panic shell's controlling tty is /dev/console which is
not fully functional - the shell can't provide job control and 'more'
can't work out the screen size for paging.
Fix this by reading /proc/consoles to find out the underlying tty
device and then connecting the shell to it directly with the aid of
setsid.
Closes: #512679
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
If we open a shell when udev is already running, any necessary drivers
should already have been loaded.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Unlike root, the types of all other filesystems in /etc/fstab have
historically been honoured and we should continue to do so.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
- Enable spinner (-C) when not debugging
- Test ${quiet}, not ${VERBOSE} as used in initscripts
- Suppress fsck title message (-T) when quiet
Closes: #781239
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Now that we use blkid directly, there is no need to resolve /dev/disk
symlinks. We shouldn't resolve any other symlinks we're given either,
as this causes /proc/mounts to be inconsistent with /etc/fstab.
Closes: #791754
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Since we now invoke blkid to resolve block device IDs rather than
relying on symlinks under /dev/disk, resolve_device just doesn't work
until the specified device exists. So we need to use it in the
multiple existence checks in local_device_setup, and nowhere else.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>