This change adds support for the following test scenarios:
- 15.0-RELEASE - ZFS
- 15.0-STABLE - UFS
- 15.0-STABLE - ZFS
This additional testing aims to catch issues with 15.x, as well as
ensure libarchive use doesn't regress when run on ZFS-based hosts.
Signed-off-by: Enji Cooper <yaneurabeya@gmail.com>
Building on illumos currently fails with:
```
ld: fatal: unrecognized option '--gc-sections'
```
This happens because `--gc-sections` isn't supported on illumos `ld`.
This patch updates CMakeLists.txt to skip unsupported linker options on
illumos. The flags used on other operating systems are optimizations
that don't affect correctness, so this change is safe.
Complete the refactoring of all 25 fuzzers:
- Remove duplicate Buffer struct definitions from 15 format fuzzers
- Remove duplicate DataConsumer class from 7 API fuzzers
- Update consume_bytes() calls to match new signature
- All fuzzers now use shared helpers from fuzz_helpers.h
This eliminates ~1000 lines of duplicated code.
The test runner already lists available tests if it fails to parse the
command line, but add a -l option to explicitly do this without also
printing an error message and a summary of options.
Adds a 'bsdtar-recovery' Android build target for use in Android recoveries
as a static binary, and fixes some build failures from the get-go.
Tested on halium-7.1, halium-9.0 & halium-13.0.
Change-Id: I9b656e7016d4bf21517e2edb18f2a7733edc6982
This reverts commit d8aaf88c9feab047139df4cae60d845764a2480a, reversing
changes made to ee49ac81068f93754f004368f2cc72c95a8bf056.
tree_reopen() and tree_dup() return NULL only if they
are unable to allocate memory. Otherwise libarchive would enter
ARCHIVE_FATAL when trying to walk an enterable but unreadable
directory.
__archive_ensure_cloexec_flag() operates only on fd >= 0,
so there is no need to skip it.
I have reimplemented the check around fdopendir().
Reported by: Christian Weisgerber from OpenBSD
Uses the sed-like way (also seen in Java, .NET, JavaScript, …) of fixing this issue: advance the string to be processed by one if the match is zero-length.
Fixes libarchive/libarchive#2725 and solves libarchive/libarchive#2438.
Since LZ4 and zstd share the same format for skippable frames, we need
to skip over these frames when trying to detect the format of compressed
data. Let's read up to something like 64kb of data when performing this
scanning.
Note that the LZ4 specification advises against starting with a skippable
frame, but doesn't forbid it:
> For the purpose of facilitating identification, it is discouraged to
> start a flow of concatenated frames with a skippable frame. If there
> is a need to start such a flow with some user data encapsulated into
> a skippable frame, it's recommended to start with a zero-byte LZ4
> frame followed by a skippable frame. This will make it easier for
> file type identifiers.
Resolves #2692.
This is expected to fail until a followup commit, because lz4 and zstd
skippable frames are the same format and we don't skip over those when
performing format detection (yet).
Relates to #2692.
Whenever we need to create a temporary file while writing to disk on a
POSIX system, try to create it in the same directory as the final file
instead of the current working directory. The target directory can
reasonably be expected to be writable (and if it isn't, creating the
file will fail anyway), but the current working directory may not be.
While here, consistently use __archive_mkstemp(), and increase the
template from six to eight random characters.
Fixes: 2e73ea3a7db1 ("Fix max path-length metadata writing (#2243)")
Fixes: e12c955dca63 ("Unify temporary directory handling")
Raise the maximum size of Mac metadata from 4 MiB to 10 MiB, as that is
the value used by Apple themselves in the version of libarchive included
in Darwin.
Provide preprocessor macros for two recurring magic numbers in the zip
support code: the length of the local file header (30 bytes) and the
maximum allowable size for Mac metadata (4 MiB).
In archive_util.c, we have a private function named get_tempdir() which
is used by __archive_mktemp() to get the temporary directory if the
caller did not pass one.
In archive_read_disk_entry_from_file.c, we use the same logic with a
slight twist (don't trust the environment if setugid) to create a
temporary file for metadata.
Merge the two by renaming get_tempdir() to __archive_get_tempdir() and
unstaticizing it (with a prototype in archive_private.h).
When compiling libarchive using clang in module mode, a special
module.modulemap file describes the structure of the header files
so that they can be imported modularly. Shipping this file makes
modular use of the library work out of the box, so clients don't
need to write their own modulemap and potentially make errors in
doing so.
Add a module.modulemap in the public header file location so that
clang and related tools can find it easily.
Our tar header parsing tracks a count of bytes that need to be
consumed from the input. After each header, we skip this many bytes,
discard them, and reset the count to zero. The `V` header parsing
added the size of the `V` entry body to this count, but failed to
check whether that size was negative. A negative size (from
overflowing the 64-bit signed number parsing) would decrement this
count, potentially leading us to consume zero bytes and leading to an
infinite loop parsing the same header over and over.
There are two fixes here:
* Check for a negative size for the `V` body
* Check for errors when skipping the bytes that
need to be consumed
Thanks to Zhang Tianyi from Wuhan University for finding
and reporting this issue.
Depending on header search path ordering, we can easily
confuse libarchive_fe/err.h with the system header.
Rename ours to lafe_err.h to avoid the confusion.
Rename libarchive_fe/err.c to match.
We reuse the compression buffer to format the gzip header,
but didn't check for an overlong gzip original_filename.
This adds that check. If the original_filename is
over 32k (or bigger than the buffer in case someone shrinks
the buffer someday), we WARN and ignore the filename.
In archive_write_header(), if the format method or a filter flush method
fails, we set the archive state to fatal, but we did not do this in
archive_write_data() or archive_write_finish_entry(). There is no good
reason for this discrepancy. Not setting the archive state to fatal
means a subsequent archive_write_free() will invoke archive_write_close()
which may retry the operation and cause archive_write_free() to return
an unexpected ARCHIVE_FATAL.
If a fatal error occurs, the closer will not be called, so neither will
BZ2_bzCompressEnd(), and we will leak memory. Fix this by calling it a
second time from the freer. This is harmless in the non-error case as
it will see that the compression state has already been cleared and
immediately return BZ_PARAM_ERROR, which we simply ignore.
The closer will not be called if a fatal error occurs, so the current
arrangement results in a memory leak. The downside is that the freer
may be called even if we were not fully constructed, so it needs to
perform additional checks. On the other hand, knowing that the freer
always gets called and will free the client state simplifies error
handling in the opener.
Close all the file descriptors in the range [3 ..
sysconf(_SC_OPEN_MAX)-1] before executing a filter program to avoid
leaking file descriptors into subprocesses.
Bug: https://github.com/libarchive/libarchive/issues/2520
This fixes an unhelpful "Couldn't visit directory: Unknown error: -1" message.
Fixes: 3311bb52cbe4 ("Bring the code supporting directory traversals from bsdtar/tree.[ch] into archive_read_disk.c and modify it. Introduce new APIs archive_read_disk_open and archive_read_disk_descend.")
RAR5 reader had inconsistent sanity checks for directory entries that
declare data. On one hand such declaration was accepted during the
header parsing process, but at the same time this was disallowed during
the data reading process. The disallow logic returned the
ARCHIVE_FAILED error code, which allowed the client to retry, while in
reality the error was non-retryable.
This commit adds another sanity check during the header parsing logic
that disallows directory entries from declaring any data. This will make
clients fail early when such an entry is detected.
Also, the commit changes the ARCHIVE_FAILED error code to ARCHIVE_FATAL
when trying to read data for the directory entry that declares data.
This makes sure that tools like bsdtar won't attempt to retry unpacking
such data.
Fixes issue #2714.
This commit fixes multiple issues found in the function that parses
extra fields found in the "file"/"service" blocks.
1. In case the file declares just one extra field, which is an
unsupported field, the function returns ARCHIVE_FATAL.
The commit fixes this so this case is allowed, and the unsupported
extra field is skipped. The commit also introduces a test for this
case.
2. The current method of parsing extra fields can report parsing errors
   in case the file is malformed. The problem is that the next iteration
   of parsing, which is meant to process the next extra field (if any),
   overwrites the result of the previous iteration, even if the previous
   iteration reported a parsing error. A successful parse can be
   returned in this case, leading to undefined behavior.
This commit changes the behavior to fail the parsing function early.
Also a test file is introduced for this case.
3. In case the file declares only the EX_CRYPT extra field, the current
   function simply returns ARCHIVE_FATAL, preventing the caller from
   setting the proper error string. This results in libarchive returning
   ARCHIVE_FATAL without any error message set. PR #2096 (commit
   adee36b00) was specifically created to provide error strings in case
   the EX_CRYPT attribute was encountered, but the current behavior
   contradicts this.
   The commit changes the behavior so that the extra field parsing
   function returns ARCHIVE_OK if only EX_CRYPT is encountered, so that
   the calling header reading function can properly return ARCHIVE_FATAL
   to the caller while setting a proper error string. A test file is
   also provided for this case.
This PR should fix issue #2711.
Use posix_spawn() with POSIX_SPAWN_CLOEXEC_DEFAULT on systems that
define this constant, in order to avoid leaking file descriptors into
subprocesses.
Bug: https://github.com/libarchive/libarchive/issues/2520
to avoid use of undefined content of buf, in case when custom
locale makes the result string longer than buf length.
Signed-off-by: Marcin Mikula <marcin@helix.pl>
Some experiments showed strange things happen if you
provide an invalid type value when appending a new ACL entry.
Guard against that, and while we're here be a little more
paranoid elsewhere against bad types in case there is another
way to get them in.
When testing the feature with `bsdtar -acf test.zip --options
zip:compression=zstd …` on a tree of ~100MB, the execution would
appear to "hang" while writing a multi-gigabytes ZIP file.
setup_mac_metadata currently concatenates the template onto TMPDIR without
adding a path separator, which causes mkstemp to create a temporary file
next to TMPDIR instead of in TMPDIR. Add a path separator to the
template to ensure that the temporary file is created under TMPDIR.
I hit this while rebuilding libarchive in nixpkgs. Lix recently started
using a dedicated build directory (under /nix/var/nix/builds) instead of
using a directory under /tmp [1]. nixpkgs & Lix support (optional)
sandboxing on macOS. The default sandbox profile allows builds to access
paths under the build directory and any path under /tmp. Because the
build directory is no longer under /tmp, some of libarchive's tests
started to fail as they accessed paths next to (but not under) the build
directory:
cpio/test/test_basic.c:65: Contents don't match
Description: Expected: 2 blocks
, options=
file="pack.err"
0000_62_73_64_63_70_69_6f_3a_20_43_6f_75_6c_64_20_6e_bsdcpio: Could n
0010_6f_74_20_6f_70_65_6e_20_65_78_74_65_6e_64_65_64_ot open extended
0020_20_61_74_74_72_69_62_75_74_65_20_66_69_6c_65_0a_ attribute file.
Sandbox: bsdcpio(11215) deny(1) file-write-create /nix/var/nix/builds/nix-build-libarchive-3.8.0.drv-7tar.md.5EUrQu
[1]: https://lix.systems/blog/2025-06-24-lix-cves/
If variables are already known to be used in calculations with
size_t, use size_t for them directly instead of int, even if their
values would fit.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
The off_t datatype in Windows is 32 bit, which leads to issues when
handling files larger than 2 GB.
Add a wrapper around fstat/stat calls to return a struct which has a
properly sized st_size variable. On systems with an off_t representing
the actual system limits, use the native system calls.
This also fixes mtree's checkfs option with large files on Windows.
Fixes https://github.com/libarchive/libarchive/issues/2685
Fixes 89b8c35ff4b5addc08a85bf5df02b407f8af1f6c
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
It is possible to handle entries and files with sizes which do not fit
into off_t of the current system (Windows always has 32 bit off_t and
32 bit systems without large file support also have 32 bit off_t).
Set sizes to 0 in such cases. The fstat system call would return -1 and
set errno to EOVERFLOW, but that's not how archive_entry_set_size acts.
It would simply ignore negative values and set the size to 0.
Actual callers of archive_entry_stat from foreign projects seem to not
even check for NULL return values, so let's try to handle such cases as
nicely as possible.
Affects mtree's checkfs option as well (Windows only, 32 bit systems
would simply fail in fstat/stat).
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
The new `__la_wopen` wrapper is a copy of `__la_open` that
expects--rather than converts--a wcs parameter.
The `sopen` variants are offered as "more secure" variants of `open` and
`wopen`; I cannot vouch for their security, but some build systems are
strict about the use of "banned insecure APIs".
I've confirmed that `_wsopen_s` and `_sopen_s` are present in the Windows
Vista SDK.
I did not confirm that they are available in the Windows XP Platform
SDK, in part because in e61afbd463d1 (2016!) Tim says:
> I'd like to completely remove support for WinXP and earlier.
For write, a return of 0 may not indicate an error at all; we need to
instead check whether the number of bytes written matches the requested
length. The same goes for fwrite: 0 could mean an error, but not always,
so we must check that we wrote the entire buffer.
Note that unlike write, fwrite's description in POSIX does not mention a
negative return value at all, nor does it say the call can be retried.
Finally, with write, we need to check for less than 0, not 0, since 0 is
a valid return value and does not mean an error.
If opening a filename fails, make sure that allocated memory which is
not inserted into any remaining structure is freed.
Fixes https://github.com/libarchive/libarchive/issues/1949
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Make sure that the string table size is not smaller than 6 (and also
not larger than SIZE_MAX for better 32 bit support).
Such small values would lead to a large loop limit which either leads to
a crash or wrong detection of a ".data" string in possibly uninitialized
memory.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
The 64 bit format requires at least 63 bytes, so increase this limit.
Such small binaries most likely don't contain 7zip data anyway.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
If a sparse hole is located at the end of an entry, then the tar
parser returns ARCHIVE_EOF while updating the offset where 0 bytes of
data will follow.
If archive_read_data encounters such an ARCHIVE_EOF return value, it
has to recheck if the offsets (data offset and output offset) still
match. If they do not match, it has to keep filling 0 bytes.
This change assumes that it's okay to call archive_read_data_block
again after an EOF. As far as I understood the parsers so far, this
should be okay, since it's always ARCHIVE_EOF afterwards.
Fixes https://github.com/libarchive/libarchive/issues/1194
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
If an entry reaches its end of file, the offset is not necessarily
the same as unp_size. This is especially true for links which have
a "0 size body" even though the unpacked size is not 0.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
When _warc_read encounters end of entry, it adds 4 bytes to the last
offset for \r\n\r\n separator, which is never written. Ignore these
bytes since they are not part of the returned entry.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
If xar_read_data has no further data, set offset to end of entry,
not to total size of parsed archive so far.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
The string constants can be used directly for comparison, which makes
this code robust against future changes which could lead to names being
longer than str could hold on the stack.
Also removes around 100 bytes from compiled library (with gcc 15).
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
If zlib is not supported, do not run tests to avoid false positives.
Also adjust tests to support latest gzip versions (1.10+) which store
less information for improved reproducibility. The gzip binary is
used as a fallback if zlib is not available.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
If a unix system has no iconv support, the best effort function will
be unable to convert KOI8 to UTF-8. Skip the test if such support is
missing.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
If no encryption support exists, the -P option will always fail.
"Skip" the test by making sure that there really is no encryption
support according to libarchive functions.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Some functions might return -1 in case of a library error. Use a
dedicated return value if a stub function was used, for better error
messages.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
After importing the latest libarchive into FreeBSD, Shawn Webb @
HardenedBSD noted that the test build is broken when FORTIFY_SOURCE=2
while building the base system. Braced initializer lists are a special
case that need some extra fun parentheses when we're dealing with the
preprocessor.
While it's not a particularly common setup, the extra parentheses don't
really hurt readability all that much so it's worth fixing for wider
compatibility.
Fixes: libarchive/libarchive#2657
An ignored SIGCHLD disposition gets passed on to child processes.
This influences waitpid, namely that zombie processes won't be
created, which means that an exit status can never be read.
We can't enforce this in the library, but libarchive's tools can be
protected against it by enforcing the default handling.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Use pid_t since waitpid returns a pid_t. Also check for a negative
return value in the writer as well to avoid reading the possibly
uninitialized status value.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Calling CloseHandle multiple times for the same handle can lead to
exceptions while debugging according to documentation.
Mimic the waitpid handling for success cases to behave more like the
Unix version which would "reap the zombie".
Doing this for an unsuccessful call is a bit odd, but the loop is never
entered again, so I guess it's okay and worth it to reduce the amount
of Windows-specific definitions in source files.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
If the waitpid version for Windows fails, preserve the error code and
avoid overwriting it with a possible CloseHandle error.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
The archive_utility_string_sort function won't be part of the 4.0.0 API
anymore. No users were found and such a task should be done outside of
the library.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
The utility function "archive_utility_string_sort" is a custom qsort
implementation. Since qsort is specified in C11 and POSIX.1-2008
which libarchive is based on, use system's qsort directly.
The function is not used directly in libarchive, so this is a good
way to save around 500 bytes in resulting library without breaking
compatibility for any user of this function (none found).
Also allows more than UINT_MAX entries which previously were limited
by data type and (way earlier) due to recursion.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Test cases already get a C locale, which is sufficient for this test.
If LC_TIME was not previously set, the en_US.UTF-8 used would stay
as an environment variable, possibly affecting other test cases.
Since en_US.UTF-8 is not guaranteed to be available, C is a better
choice.
Fixes https://github.com/libarchive/libarchive/issues/2560
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Reset the current locale settings through setlocale, and also all
environment variables which might affect test cases that spawn
children through systemf, since those children (e.g. bsdtar) call
setlocale on their own.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Explicitly use goto to turn a recursive call into an iterative one.
Most compilers do this on their own with default settings, but MSVC
with default settings would create a binary which actually performs
recursive calls.
Fixes call stack overflow in binaries compiled with low optimization.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
The sparse 1.0 parser skips lines with comments. The number of skipped
bytes is stored in a ssize_t variable, although common 32 bit systems
allow files larger than 4 GB.
Gracefully handle files with more than 2 GB of comments to
prevent integer truncations.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
If a pax global header specifies a negative size, it is possible to
reduce variable `unconsumed` by 512 bytes, leading to a re-reading
of the pax global header. Fortunately the loop verifies that only one
global header per entry is allowed, leading to a later ARCHIVE_FATAL.
Avoid any form of negative size handling and fail early.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Skip all entry bytes after sparse entries were encountered. This matches
GNU tar behavior.
I have adjusted (and fixed) the existing test case for this. The test
case test_read_format_gtar_sparse_skip_entry did not work with GNU tar.
In #2558 it was explained that the pax size always overrides the header
size (correct). Since the pax size in the test case was way larger than
the actual entry bytes in the archive, GNU tar chokes on the test file.
The libarchive parser did not skip any bytes not already read due to
references by sparse entries, so the huge pax size was not detected.
By adjusting the test case to have a leftover byte (only 3 bytes are
referenced through sparse entry now, leaving one extra byte) with a
correct pax size and an invalid header size (after all it is overridden
by pax size), GNU tar works and libarchive gets off its 512 byte
alignment, not being able to read the next entry.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
The fseek* family of functions returns 0 on success, not the new
offset; only lseek returns the new offset.
Fixes https://github.com/libarchive/libarchive/issues/2641
Fixes dcbf1e0ededa95849f098d154a25876ed5754bcf
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Follow-up to 16fd043f51d911b106f2a7834ad8f08f65051977
IID_ISequentialStream is required by the code.
This GUID is defined in uuid.lib or libuuid.a in mingw-w64. It is required
to link with that library to get the definition of the GUID. Some toolchains
add it by default but not all.
Some error messages already use the ll length modifier, which on
platforms that don't support it results in raw formatter output,
i.e. "%lld" printed instead of a number.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
If a pax attribute has a 0 length value and no newline, the tar reader
gets out of sync with block alignment.
This happens because the pax parser assumes that variable value_length
(which includes the terminating newline) is at least 1. To get the
real value length, 1 is subtracted. This result is subtracted from
extsize, which in this case would lead to `extsize -= -1`, i.e.
the remaining byte count is increased.
Such an unexpected calculation leads to an off-by-one when skipping
to the next block. In the supplied test case, bsdtar complains that the
checksum of the next block is wrong. Since the tar parser was no longer
properly 512-byte aligned, this is no surprise.
Gracefully handle such a case like GNU tar does and warn the user that
an invalid attribute has been encountered.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
AppleDouble extension entries are present as separate files immediately
preceding the corresponding real files. In libarchive, we process the
entire metadata file (headers + data) as if it were a header in the real
file. However, the code forgets to reset the accumulated header state
before parsing the real file's headers. In one code path, this causes
the metadata file's name to be used as the real file's name.
Specifically, this can be triggered with a tar containing two files:
1. A file named `._badname` with pax header containing the `path` attribute
2. A file named `goodname` _with_ a pax header but _without_ the `path` attribute
libarchive will list one file, `._badname` containing the data of `goodname`.
This code is pretty brittle and we really should let the client deal with
it :(
Fixes #2510.
Signed-off-by: Zhaofeng Li <hello@zhaofeng.li>
Pax extended headers may specify negative time values for files older
than the epoch.
Adjust the code to clear values to 0.0 more often and set ps to
INT64_MIN to have a proper error specifier, because the parser does
not allow anything below -INT64_MAX.
Fixes https://github.com/libarchive/libarchive/issues/2562
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Iterating over a size_t bound with an unsigned int could lead to an
endless loop while adding uid/gid entries to a list which already holds
4 billion entries.
I doubt that this can ever happen, given that the routines become
very slow with insertions, but better be safe than sorry.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Turn unmatched_count into a size_t to support as many entries as
possible on the machine.
If more than INT_MAX entries are not matched, truncate the result
of archive_match_path_unmatched_inclusions for external callers.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
The count fields are merely used to check if a list is empty or not.
A check for first being not NULL is sufficient and is already in
place while iterating over the linked elements (count is not used).
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
The operations for key and node comparison depend on the platform
libarchive is compiled for. Since these values do not change
during runtime, set them only once during initialisation.
Further simplify the code by declaring only one "rb_ops" with
required functions based on platform.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
The cygwin FAQ states that __CYGWIN__ is defined when building for a
Cygwin environment. Only a few test files check (inconsistently) for
CYGWIN, so adjust them to the recommended __CYGWIN__ definition.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Cast address of "version" to BYTE pointer for CryptGetProvParam.
Fix "major" variable assignment for picky compilers like MSVC.
The "length" variable is an in/out variable. It must be set to the size
of available memory within "version". Right now it is undefined behavior
and 0 would crash during runtime.
Fixes https://github.com/libarchive/libarchive/issues/2628
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Bumps the all-actions group with 4 updates:
`actions/checkout` from 4.2.1 to 4.2.2
`actions/upload-artifact` from 4.4.3 to 4.6.2
`github/codeql-action` from 3.26.12 to 3.28.18
`ossf/scorecard-action` from 2.4.0 to 2.4.1
We should not get here, but given that the check exists, handle the NULL case properly; otherwise we would just dereference the pointer later on.
When looping over section header entries (e_shnum)
we need to increment sec_tbl_offset by e_shentsize
and not by fixed values.
Fixes OSS-Fuzz issue 418349489
Copy ae digests to mtree_entry. This simplifies porting non-archive
formats to archive formats while preserving supported message
digests specifically in cases where recomputing digests is not
viable.
Signed-off-by: Nicholas Vinson <nvinson234@gmail.com>
The size_t to int conversion is especially required on Windows systems
to support their int-based functions. These variables should be properly
checked before casts. This avoids integer truncations with large
strings.
I prefer size_t over int for sizes and adjusted variables to size_t
where possible to avoid casts.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
If vsnprintf fails with errno EOVERFLOW, the results are very platform
dependent but never useful. The implementation in glibc fills bytes with
blanks, FreeBSD fills them with zeros, OpenBSD and Windows set first
byte to '\0'.
Just stop processing and don't print anything, which makes it follow
the OpenBSD and Windows approach.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
The vsnprintf calls might return INT_MAX with very long strings.
Prevent a signed integer overflow when taking an additional nul
byte into account.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
If the format buffer shall not be further increased in size, the
length value mistakenly takes the terminating nul byte into account.
This is in contrast to a successful vsnprintf call.
Also use the correct string length if fallback to stack buffer is
required.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
The stack buffer is never cleared, which can become an issue depending
on vsnprintf implementation's behavior if -1 is returned. The code
would eventually fall back to stack buffer which might be not
nul terminated.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
This PR adds support for setting a forced mtime on all written files
(`--mtime` and `--clamp-mtime`) in bsdtar.
The end goal will be to support all functionalities in
<https://reproducible-builds.org/docs/archives/#full-example>, namely
`--sort` and disabling other attributes (atime, ctime, etc.).
Fixes #971.
## History
- [v1](https://github.com/zhaofengli/libarchive/tree/forced-mtime-v1):
Added `archive_read_disk_set_forced_mtime` in libarchive. As a result,
it was only applied when reading from the filesystem and not from other
archives.
- [v2](https://github.com/zhaofengli/libarchive/tree/forced-mtime-v2):
Refactored to apply the forced mtime in `archive_write`.
- v3 (current): Reduced libarchive change to exposing
`archive_parse_date`, moved clamping logic into bsdtar.
---------
Signed-off-by: Zhaofeng Li <hello@zhaofeng.li>
Co-authored-by: Dustin L. Howett <dustin@howett.net>
A filter block size must not be larger than the lzss window, which is
defined
by dictionary size, which in turn can be derived from unpacked file
size.
While at it, improve error messages and fix lzss window wrap around
logic.
Fixes https://github.com/libarchive/libarchive/issues/2565
---------
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Co-authored-by: Tim Kientzle <kientzle@acm.org>
If a system is capable of handling 4 billion nodes in memory, a double
free could occur because of an unsigned integer overflow leading to a
realloc call with size argument of 0. Eventually, the client will
release that memory again, triggering a double free.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
To detect 7z SFX files, libarchive currently searches for the 7z header
in a hard-coded address range of the PE/ELF file
(specified via the macros SFX_MIN_ADDR and SFX_MAX_ADDR). This causes it to
miss SFX files that may stray outside these values (libarchive fails to
extract 7z SFX ELF files created by recent versions of 7z tool because
of this issue). This patch fixes the issue by finding a more robust
starting point for the 7z header search: overlay in PE or the .data
section in ELF. This patch also adds 3 new test cases for 7z SFX to
libarchive.
Fixes https://github.com/libarchive/libarchive/issues/2075
---------
Co-authored-by: Masoud Mehrabi Koushki <masoud.mehrabi.koushki1@huawei.com>
Co-authored-by: Martin Matuška <martin@matuska.de>
This new test archive contains a C hello world executable built like so
on a ubuntu 24.04 machine:
```
#include <stdio.h>

int main(int argc, char *argv[]) {
	printf("hello, world\n");
	return 0;
}
```
`powerpc-linux-gnu-gcc hw.c -o hw-powerpc -Wall`
The test archive that contains this executable was created like so,
using 7-Zip 24.08: `7zz a -t7z -m0=lzma2 -mf=ppc
libarchive/test/test_read_format_7zip_lzma2_powerpc.7z hw-powerpc`
The new test archive is required because the powerpc filter for lzma is
implemented in liblzma rather than in libarchive.
This commit adds support for reading and writing XAR archives on Windows
using the built-in xmllite library. xmllite is present in all versions
of Windows starting with Windows XP.
With this change, no external XML library (libxml2, expat) is required
to read or produce XAR archives on Windows.
xmllite is a little bit annoying in that it's entirely a COM API--the
likes of which are annoying to use from C.
Signed-off-by: Dustin L. Howett <dustin@howett.net>
Depends on e619342dfa36b887ffa0ea33e98d04cb161cd7de
Closes #1811
A few small tweaks to improve reading/writing of the legacy GNU tar
format.
* Be more tolerant of redundant 'K' and 'L' headers
* Fill in missing error messages for redundant headers
* New test for reading archive with redundant 'L' headers
* Earlier identification of GNU tar format in some cases
These changes were inspired by Issue #2434. Although that was determined
to not technically be a bug in libarchive, it's relatively easy for
libarchive to tolerate duplicate 'K' and 'L' headers and we should be
issuing appropriate error messages in any case.
The refactoring of https://github.com/libarchive/libarchive/pull/2553
introduced three issues:
1. Introduction of a modifiable global static variable
This violates the goal of having no global variables as stated in [the
README.md](b6f6557abb/README.md (L195))
which in turn leads to concurrency issues. Without any form of mutex
protection, multiple threads are not guaranteed to see the correct
min/max values. Since these are not needed in regular use cases but only
in edge cases, handle them in functions with local variables only.
Also, the global variables are locale-dependent, and the locale can
change during runtime. In that case, future calls lead to incorrect
results.
2. Broken 32 bit support
The zip writer and others affected by the previously mentioned PR broke
the test suite on Debian 12 i686, because the maximum MS-DOS time cannot
be calculated with a 32-bit time_t. Treat these cases properly.
3. Edge case protection
Huge or tiny int64_t values can easily lead to unsigned integer
overflows. While these do not affect stability of libarchive, the
results are still wrong, i.e. are not capped at min/max as expected.
In total, the functions are much closer to their original versions again
(+ more range checks).
---------
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Make sure that size_t casts do not truncate the value of packed_size on
32 bit systems since it's 64 bit. Extensions to RAR format allow 64 bit
values to be specified in archives.
Also verify that 64 bit signed arithmetics do not overflow.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
It is possible to trigger a call stack overflow by repeatedly entering
the rar_read_ahead function. In normal circumstances, this recursion is
optimized away by common compilers, but default settings with MSVC keep
the recursion in place. Explicitly turn the recursion into a goto-loop
to avoid the overflow even with no compiler optimizations.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Reset avail_in and next_in if the next entry of a split archive is
parsed to always update its internal structure to access next bytes when
cache runs empty.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Support header sizes larger than 32 bit even on 32 bit systems, since
these normally have large file support. Otherwise an unsigned integer
overflow could occur, leading to erroneous parsing on these systems.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Most source files use tabs instead of spaces, but
archive_read_support_format_rar.c uses spaces most of the time. A few
lines contain a mixture of tabs and spaces, which leads to poorly
formatted output with many default settings.
Unify the style. No functional change and preparation for upcoming
changes to get rid of white space diffs.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
PR #2258 hardened AFIO parsing by converting all inode values >= 2^63 to
zero values. This turns out to be problematic for filesystems that use
very large inode values; it results in all such files being viewed as
hardlinks to each other.
PR #2587 partially addressed this by instead considering inode values >=
2^63 as invalid and just ignoring them. This prevented the accidental
hardlinking, but at the cost of losing all hardlinks that involved large
inode values.
This PR further improves things by stripping the high order bit from
64-bit inode values in the AFIO parser. This allows them to be mostly
preserved and should allow hardlinks to get properly processed in the
vast majority of cases. The only false hardlinks would be when there are
inode values that differ exactly in the high order bit, which should be
very rare.
A full solution will require expanding inode handling to use unsigned
64-bit values; we can't do that without a major version bump, but this
PR also sets the stage for migrating inode support in a future
libarchive 4.0.
The calculations for the suffix and prefix can increment the endpoint
for a trailing slash. Hence the limits used should be one lower than the
maximum number of bytes.
Without this patch, when this happens for both the prefix and the
suffix, we end up with 156 + 100 bytes, and the write of the null at the
end will overflow the 256 byte buffer. This can be reproduced by running
```
mkdir -p foo/bar
bsdtar cvf test.tar foo////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////bar
```
when bsdtar is compiled with Address Sanitiser, although I originally
noticed this by accident with a genuine filename on a CHERI capability
system, which faults immediately on the buffer overflow.
It seems that a valid inode can be 64-bit and negative (or rather,
outside of the 64-bit signed range):
a60615d5be/sys/sys/_types.h (L124)
7e74f756f5/include/linux/types.h (L22)
But signed type is used in libarchive and there were some fuzzing issues
with it, https://github.com/libarchive/libarchive/pull/2258 converts
negative `ino` to `0`, which is actually a reserved inode value, but
more importantly it was still setting `AE_SET_INO` flag and later on
hardlink detection will treat all `0` on same `dev` as hardlinks to each
other if they have some hardlinks.
This showed up in BuildBarn FUSE filesystem
https://github.com/buildbarn/bb-remote-execution/issues/162 which has
both
- setting number of links to a big value
- generating random inode values in full uint64 range
Which in turn triggers false-positive hardlink detection in `bsdtar`
with high probability.
Let's mitigate it:
- don't set `AE_SET_INO` on negative values (assuming the rest of the
code isn't ready to correctly handle the full uint64 range)
- check that both `ino` and `dev` are set in the link resolver
Make sure to not skip past end of file for better error messages. One
such example is now visible with rar testsuite. You can see the
difference already by an actually not useless use of cat:
```
$ cat .../test_read_format_rar_ppmd_use_after_free.rar | bsdtar -t
bsdtar: Archive entry has empty or unreadable filename ... skipping.
bsdtar: Archive entry has empty or unreadable filename ... skipping.
bsdtar: Truncated input file (needed 119 bytes, only 0 available)
bsdtar: Error exit delayed from previous errors.
```
compared to
```
$ bsdtar -tf .../test_read_format_rar_ppmd_use_after_free.rar
bsdtar: Archive entry has empty or unreadable filename ... skipping.
bsdtar: Archive entry has empty or unreadable filename ... skipping.
bsdtar: Error exit delayed from previous errors.
```
Since the former cannot lseek, the error is a different one
(ARCHIVE_FATAL vs ARCHIVE_EOF). The piped version states explicitly that
truncation occurred, while the latter states EOF because the skip past
the end of file was successful.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
The size_t data type is only 32 bits on 32-bit systems, while off_t is
generally 64 bits to support files larger than 2 GB.
If an entry is declared to be larger than 4 GB and the entry shall be
skipped, then 32 bit systems truncate the requested amount of bytes.
This leads to different interpretation of data in tar files compared to
64 bit systems.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Solves a Windows compile issue when OpenSSL/mbedTLS/Nettle is activated
and on the build system's paths, by making the Windows API backend
higher priority on Windows (meaning that only RIPEMD160 will use
OpenSSL/mbedTLS/Nettle anymore).
Fixes #2536 and starts on #2320.
If a warc archive claims to have more than INT64_MAX - 4 content bytes,
the inevitable failure to skip all these bytes could lead to parsing
data which should be ignored instead.
The test case contains a conversation entry with that many bytes and if
the entry is not properly skipped, the warc implementation would read
the conversation data as a new file entry.
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
The skip functions are limited to 1 GB for cases in which libarchive
runs on a system with an off_t or long with 32 bits. This has negative
impact on 64 bit systems.
Instead, make sure that _all_ subsequent functions truncate properly.
Some of them already did and some had regressions for over 10 years.
Tests pass on Debian 12 i686 configured with --disable-largefile, i.e.
running with an off_t with 32 bits.
Casts added where needed to still pass MSVC builds.
---------
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
My attempt to fix #2404 just made the confusion between the size of the
extracted file and the size of the contents in the tar archive worse
than it was before.
@ferivoz in #2557 showed that the confusion stemmed from a point where
we were setting the size in the entry (which is by definition the size
of the file on disk) when we read the `GNU.sparse.size` and
`GNU.sparse.realsize` attributes (which might represent the size on disk
or in the archive) and then using that to determine whether to read the
value in ustar header (which represents the size of the data in the
archive).
The confusion stems from three issues:
* The GNU.sparse.* fields mean different things depending on the version
of GNU tar used.
* The regular Pax `size` field overrides the value in the ustar header,
but the GNU sparse size fields don't always do so.
* The previous libarchive code tried to reconcile different size
information as we went along, which is problematic because the order in
which this information appears can vary.
This PR makes one big structural change: We now have separate storage
for every different size field we might encounter. We now just store
these values and record which one we saw. Then at the end, when we have
all the information available at once, we can use this data to determine
the size on disk and the size in the archive.
A few key facts about GNU sparse formats:
* GNU legacy sparse format: Stored all the relevant info in an extension
of the ustar header.
* GNU pax 0.0 format: Used `GNU.sparse.size` to store the size on disk
* GNU pax 0.1 format: Used `GNU.sparse.size` to store the size on disk
* GNU pax 1.0 format: Used `GNU.sparse.realsize` to store the size on
disk; repurposed `GNU.sparse.size` to store the size in the archive, but
omitted this in favor of the ustar size field when that could be used.
And of course, some key precedence information:
* Pax `size` field always overrides the ustar header size field.
* GNU sparse size fields override it ONLY when they represent the size
of the data in the archive.
Resolves #2548
Hello,
- The `CMAKE_COMPILER_IS_*` variables are deprecated and
`CMAKE_C_COMPILER_ID` can be used in this case instead.
- The legacy `endif()` command argument also simplified to avoid
repeating the condition.
Moving archive_entry_set_digest() to the public API simplifies porting
non-archive formats to archive formats while preserving supported
message digests specifically in cases where recomputing digests is not
viable.
The lafe_errc function adds a newline by itself already, so do not
insert one into the message.
You can reproduce with the following commands:
```
touch archive.tar
bsdtar -xf archive.tar -C /non-existing
```
```
bsdtar --exclude ""
```
Signed-off-by: Tobias Stoeckmann <tobias@stoeckmann.org>
Hi,
please find my approach to fix the CVE-2025-1632 and CVE-2025-25724
vulnerabilities in this pr.
As both error cases did trigger a NULL pointer deref (and triggered
hopefully everywhere a coredump), we can safely replace the actual
information by a predefined invalid string without breaking any
functionality.
---------
Signed-off-by: Peter Kaestle <peter@piie.net>
Adding missing libraries to `archive_version_details()`'s output. I put
"system" if the library doesn't give a way to query its version and
"bundled" if there's a choice between the system copy of a library and a
bundled one and we took the bundled copy (Only one library in that case,
libb2. Maybe also xxhash in the future?).
I would have a question for the Windows specialists though: is there a
way to query the interface version of a CNG cryptographic provider?
Because I know of a way for Crypto API providers but I haven't found any
for CNG ones, despite `<bcrypt.h>` having an interface version
structure.
Fixes #2300.
As remarked in #2521, this test has unreachable code on Windows, which
triggers a build failure in development due to warnings-as-errors.
(Release versions should not have warnings-as-errors.)
The outer if checks !S_ISDIR(a->st.st_mode), so we know that the file
being overwritten is not a directory, and thus we can rename(2) over it
if we want to, but whether we can use a temporary regular file is a
property of the file being extracted. Otherwise, when replacing a
regular file with a directory, we end up in this case and create a
temporary regular file for the new directory, but with the permissions
of the directory (which likely includes x), and rename it over the top
at the end. Depending on where the archive_entry came from, it may have
a non-zero size that also isn't overwritten with 0 (e.g. if it came from
stat(2)) and so the API user may then try to copy data (thus failing if
read(2) of directories isn't permitted, or writing the raw directory
contents if it is), but if the size is zero as is the case for this tar
test then it will end up not writing any data and "successfully"
overwrite the file with an empty file, not a directory.
The #if condition as-written fails for any major >= 3 if minor < 1, e.g.
GCC 15.0 (while in development).
Use the idiom described in the GCC docs [0] to avoid this.
[0] https://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html
Fixes: ab94a813b0f64cbc1bcb952bf55424a2d9c7f1d9
`Int32x32To64` macro internally truncates the arguments to int32, while
`time_t` is 64-bit on most/all modern platforms. Therefore, usage of
this macro creates a Year 2038 bug.
I detailed this issue a while ago in a writeup, and spotted the same
issue in this repository when updating the list of affected
repositories:
<https://cookieplmonster.github.io/2022/02/17/year-2038-problem/>
A few more notes:
1. I changed all uses of `Int32x32To64` en masse, even though at least
one of them was technically OK and used with int32 parameters only. IMO
better safe than sorry.
2. This is untested, but it's a small enough change that I hope the CI
success is a good enough indicator.
Endianness is easy to determine at runtime, but detecting this a single
time and then reusing the cached result might require API changes.
However we can use compile-time detection for some known compiler macros
without API changes fairly easily. Let's start by enabling this for
Clang and GCC.
This new test archive contains a C hello world executable built like so
on a ubuntu 24.04 machine:
```
#include <stdio.h>
int main(int argc, char *argv[]) {
	printf("hello, world\n");
	return 0;
}
```
`powerpc-linux-gnu-gcc hw.c -o hw-powerpc -Wall`
The test archive that contains this executable was created like so,
using 7-Zip 24.08:
`7zz a -t7z -m0=deflate -mf=ppc
libarchive/test/test_read_format_7zip_deflate_powerpc.7z hw-powerpc`
This test fails in the first commit in this PR, and passes in the second
commit.
When the -s/regexp/replacement/ option was used with the b flag more
than once, the result of the previous substitution was appended to the
previous subject instead of replacing it. Fixed it by making sure the
subject is made the empty string before the call to realloc_strcat().
That in effect makes it more like a realloc_strcpy(), but creating a new
realloc_strcpy() function for that one usage doesn't feel worth it.
Resolves Issue libarchive/libarchive#2414
Co-authored-by: Stephane Chazelas <stephane@chazelas.org>
We previously told make to run as many threads as it likes on these CI
jobs, but that might sometimes hit resource limits like RAM or the
allowed number of open files.
These numbers were found experimentally by using `sysctl -n hw.ncpu` on
macOS and `nproc` on Linux.
`i4le()` returns an unsigned int, so `'%d'` is incorrect.
Reported by `clang -Wformat`. (Many more such fixes to come, but this is
the simplest set of them.)
7-Zip 24.05 and liblzma 5.5.1alpha added a RISC-V BCJ filter. Let's
enable this combination if we can.
Note that this does not allow the use of the RISC-V filter with other
compressors.
Previously skipped tests were reported like this when running the *_test
binaries:
```
4: test_acl_platform_nfs4 ok (S)
```
Let's make this more obvious:
```
4: test_acl_platform_nfs4 skipped
```
This plumbing is required for cmake/ctest to recognise and report
skipped tests.
Now skipped tests in cmake ci jobs are reported like so:
```
Start 7: libarchive_test_acl_platform_posix1e_read
7/785 Test #7: libarchive_test_acl_platform_posix1e_read ................................***Skipped 0.02 sec
```
And there is a list of skipped tests shown at the end of the test run.
Prior to this change, the ci autoconf jobs weren't looking for homebrew
headers or libraries unless pkg-config was used, so for example the
"MacOS (autotools)" ci job wasn't testing lz4 or zstd code.
Relates to #2426.
A few of libarchive's CI jobs don't find all the local support libraries
that they could be using. This change makes it easier to see which of
them are used.
We currently use XZ Utils 5.6.3 on Windows CI jobs, but the Windows
(msvc) job, which uses cmake, seems to only be looking for the old
library name, liblzma.lib:
```
-- Looking for lzma_auto_decoder in C:/Program Files (x86)/xz/lib/liblzma.lib
-- Looking for lzma_auto_decoder in C:/Program Files (x86)/xz/lib/liblzma.lib - not found
-- Looking for lzma_easy_encoder in C:/Program Files (x86)/xz/lib/liblzma.lib
-- Looking for lzma_easy_encoder in C:/Program Files (x86)/xz/lib/liblzma.lib - not found
-- Looking for lzma_lzma_preset in C:/Program Files (x86)/xz/lib/liblzma.lib
-- Looking for lzma_lzma_preset in C:/Program Files (x86)/xz/lib/liblzma.lib - not found
-- Could NOT find LibLZMA (missing: LIBLZMA_HAS_AUTO_DECODER LIBLZMA_HAS_EASY_ENCODER LIBLZMA_HAS_LZMA_PRESET) (found version "5.6.3")
```
We need to update build/ci/github_actions/ci.cmd to look for lzma.lib
instead.
The fallback for when `getline` is not implemented in libc was not
compiling due to the fact that the definition for it was missing, so add
the definition.
I have been using this for years without realizing it decompresses rar.
+ add rar to supported decompression formats
+ use section references to link sections (this makes them clickable in
GUIs)
+ add paragraph breaks for consistent spacing
+ pdtar is not this program, so use Sy per mdoc style guide
+ do almost the same in reverse for bsdtar
+ remove parenthetical around a complete sentence
Thank you so much, this is wonderful software.
This change fixes the autotools build to work with xz-utils 5.6.3, which
changed library names on windows, and fixes a couple of tests that I
noticed had dependencies on liblzma.
Moving the tests' integer reading functions to test_utils so that they
all use the same as well as moving the few using the archive_endian
functions over to the test_utils helper.
Follow-up from libarchive#2390.
When the pax `size` field is present, we should ignore the size value in
the ustar header. In particular, this fixes reading pax archives created
by GNU tar with entries larger than 8GB.
Note: This doesn't impact reading pax archives created by libarchive
because libarchive uses tar extensions to store an accurate large size
field in the ustar header. GNU tar instead strictly follows ustar in
this case, which prevents it from storing accurate sizes in the ustar
header.
Resolves #2404
These two new test archives contain a C hello world executable built
like so on a ubuntu 24.04 machine:
```
#include <stdio.h>
int main(int argc, char *argv[]) {
	printf("hello, world\n");
	return 0;
}
```
`sparc64-linux-gnu-gcc hw.c -o hw-sparc64 -Wall`
The two test archives that contain this executable were created like so,
using the https://github.com/tehmul/p7zip-zstd fork of 7-Zip:
`7z a -t7z -m0=zstd -mf=SPARC
libarchive/test/test_read_format_7zip_zstd_sparc.7z hw-sparc64`
`7z a -t7z -m0=lzma2 -mf=SPARC
libarchive/test/test_read_format_7zip_lzma2_sparc.7z hw-sparc64`
Two test files are required, because the 7zip reader code has two
different paths, one for lzma and one for all other compressors.
The test_read_format_7zip_lzma2_sparc test is expected to pass, because
LZMA BCJ filters are implemented in liblzma.
The test_read_format_7zip_zstd_sparc test is expected to fail in the
first commit, because libarchive does not currently implement the SPARC
BCJ filter. The second commit will make test_read_format_7zip_zstd_sparc
pass.
Two problems are prompting this revert:
* In order to change the Create OS value to "Windows", we would need to
record other data (such as `external_attributes`) in Windows format as
well.
* Changing the Create OS value doesn't actually fix the
filename-encoding issue that originally motivated this PR
This reverts commit 755af84301adc4262722a4c88671a8d0a1c83fae.
The `-P` flag is uppercase, so the test file should be named
test_option_P_upper.c for consistency with the other test files in this
directory.
Sorry about the noise.
Submitting this test at the request of @kientzle on issue #2041.
Note that failure is currently expected, as the feature it tests is not
yet implemented!
This commit prepares the XAR writer for another XML writing backend.
Almost everything in this changeset leaves the code identical to how
it started, except for a new layer of indirection between the xar writer
and the XML writer.
The things that are not one-to-one renames include:
- The removal of `UTF8Toisolat1` for the purposes of validating UTF-8
- The writer code made a copy of every filename for the purposes of
checking whether it was Latin-1 stored as UTF-8. In xar, non-Latin-1
gets stored Base64-encoded.
- I've replaced this use because (1) it was inefficient and (2)
`UTF8Toisolat1` is a `libxml2` export.
- The new function has slightly different results than the one it is
replacing for invalid UTF-8. Namely, it treats illegal UTF-8 "overlong"
encodings of Latin-1 codepoints as _invalid_. It operates on the principle
that we can determine whether something is Latin-1 based entirely on how
long the sequence is expected to be.
- The move of `SetIndent` to before `StartDocument`, which the
abstraction layer immediately undoes. This is to accommodate XML writers
that require indent to be set _before_ the document starts.
PPMD may come later, but I'd rather first iron out style issues with the
ones needing only to wire up libraries already used in libarchive before
going at the ones possibly requiring implementing algorithms as well.
Closes #1046 and resolves #1179.
On Cygwin 3.5.4 (the same applies to 3.5.3), I get a compile error as
shown below after a simple ./configure and make. Adding <windef.h>
solves the problem.
Co-authored-by: vco <god@universe.sys>
Bumps the all-actions group with 4 updates:
actions/checkout from 4.1.6 to 4.2.1
actions/upload-artifact from 4.3.3 to 4.4.3
github/codeql-action from 3.25.6 to 3.26.12
ossf/scorecard-action from 2.3.3 to 2.4.0
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
This is the first part of converting the project to use SPDX license
identifiers instead of the verbose license text.
The patches are semi-automated and I've gone through them manually to
ensure no license changes were made. That said, I would welcome another
pair of eyes, since I am only human.
See https://github.com/libarchive/libarchive/issues/2298
---------
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
People really should never, ever, ever use libarchive internal headers. And they definitely should not expect libarchive internal headers to work in a C++ compiler. (C++ and C are really just not that compatible.)
However, people do a lot of things they shouldn't: Avoid the reserved C++ keyword `template`
Follow-up from #2346
Add libbsd to make/cmake configuration for linking readpassphrase on
Haiku.
Maybe there is a better way to do this for cmake, I'm not that familiar
with it.
Co-authored-by: vco <god@universe.sys>
I previously tried to find documentation on how symlinks are expected to
be stored in 7zip files, however the best reference I could find was
[here](https://py7zr.readthedocs.io/en/latest/archive_format.html). That
site suggests that symlink paths are stored as UTF-8 encoded strings:
Currently, the RAR5 code always reports
`ARCHIVE_READ_FORMAT_ENCRYPTION_UNSUPPORTED` for
`archive_read_has_encrypted_entries` and does not set any of the
entry-specific properties, even though it has enough information to
properly report this information. Accurate reporting of encryption is
super useful for applications because reporting an error message such as
"the archive is encrypted, but we don't currently support encryption" is
a lot better than a not generally useful `errno` value and a
non-localizable error string with a confusing and unpredictable error
message.
Fixes #1661
Change to read absolute symlinks as verbatim paths instead of NT paths:
as far as I can see, libarchive can deal with verbatim paths while it
can't with NT ones.
Fixes #2274.
Remove the incorrect 4th argument from `AC_CHECK_FUNCS` calls. The macro
uses only three arguments, so the fourth was ignored anyway. Furthermore,
in at least one instance it was wrong -- due to a typo in the
`attr/xatr.h` header name.
The tar header parsing overhaul in #2127 introduced a systematic
mishandling of truncated files: the code incorrectly checks for whether
a given read operation failed, and ends up dereferencing a NULL pointer
in this case. I've gone back and double-checked how
`__archive_read_ahead` actually works (it returns NULL precisely when it
was unable to satisfy the read request) and reworked the error handling
for each call to this function in archive_read_support_format_tar.c
Resolves #2353
Resolves https://issues.oss-fuzz.com/issues/42537231
OSS-Fuzz managed to construct a small gzip input that decompresses into
another gzip input with an extremely large filename field. This causes
libarchive to hang processing the inner gzip.
Address this by rejecting any gzip input where the filename or comment
fields exceed 1MiB.
Credit: OSS-Fuzz
No functional change, just a tiny style improvement.
Use `crc32_computed` to refer to the crc32 that the reader has computed
and `crc32_read` to refer to the value that we read from the archive.
That hopefully makes this code a tiny bit easier to follow. (It confused
me recently when I was double-checking something in this area, so I
thought an improvement here might help others.)
We always print the error message with or without -v, but for some
reason, we were omitting the path being processed. Simplify so that we
always print the full error including context.
This fixes various code quality issues I encountered while chasing a
memory leak reported by test automation. I failed to reproduce the
memory leak, but I hope you find this useful nonetheless.
PR #2127 failed to clean up the linkpath storage between entries. As a
result, after the first hard/symlink entry in a pax format archive, all
subsequent entries would get the same link information.
I'm really unsure how this bug failed to trip CI. I'll do some digging
in the test suite before I merge this.
Resolves #2331, #2337
P.S. Thanks to Brad King for noting that the linkpath wasn't being
managed correctly, which was a big hint for me.
Synchronize the last use of `attr/xattr.h` to support using
`sys/xattr.h` instead. The former header is deprecated on GNU/Linux, and
this replacement makes it possible to build libarchive without the
`attr` package.
Fix memory leaks introduced by #2127:
* `struct tar` member `entry_linkpath` was moved at the same time as
other members were removed, but its cleanup was accidentally removed
with the others.
* `header_pax_extension` local variable `attr_name` was not cleaned up.
Resolves #2336
Some ISO images don't have valid timestamps for the root directory
entry. Parsing such timestamps can generate nonsensical results, which
in one case showed up as an unexpected overflow on a 32-bit system.
Add some validation logic that can check whether a 7-byte or 17-byte
timestamp is reasonable-looking, and use this to ignore invalid
timestamps in various locations. This also requires us to be a little
more careful about tracking which timestamps are actually known.
Resolves issue #2329
RUN C:\tools\cygwin\cygwinsetup.exe -q -P make,autoconf,automake,cmake,gcc-core,binutils,libtool,pkg-config,bison,sharutils,zlib-devel,libbz2-devel,liblzma-devel,liblz4-devel,libiconv-devel,libxml2-devel,libzstd-devel,libssl-devel
RUN C:\tools\cygwin\cygwinsetup.exe -q -P make,autoconf,automake,cmake,gcc-core,binutils,libtool,pkg-config,bison,zlib-devel,libbz2-devel,liblzma-devel,liblz4-devel,libiconv-devel,libxml2-devel,libzstd-devel,libssl-devel