Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
Added a few sentences to the description for lzma_block_encoder() and
lzma_block_decoder() to highlight that the Block Header must be coded
before calling these functions.
|
|
|
|
|
|
|
|
|
|
Standardizing each function to always specify params and return values.
Output pointer parameters are also marked with doxygen style [out] to
make it clear. Any note sections were also moved above the parameter and
return sections for consistency.
|
|
The flag description for LZMA_STR_NO_VALIDATION was previously confusing
about the treatment for filters than cannot be used with .xz format
(lzma1) without using LZMA_STR_ALL_FILTERS. Now, it is clear that
LZMA_STR_NO_VALIDATION is not a super set of LZMA_STR_ALL_FILTERS.
|
|
The workflow action for our CI pipeline can only reference artifacts in
the source directory, so we should ignore these files if the ci_build.sh
is run locally.
|
|
|
|
|
|
mythread.h and thus liblzma already does it.
|
|
|
|
This way, if xz is stopped the elapsed time and estimated time
remaining won't get confused by the amount of time spent in
the stopped state.
This raises SIGSTOP. It's not clear to me if this is the correct way.
POSIX and glibc docs say that SIGTSTP shouldn't stop the process if
it is orphaned but this commit doesn't attempt to handle that.
Search for SIGTSTP in section 2.4.3:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html
|
|
Thanks to Rafael Fontenelle.
|
|
|
|
Clang can be configured to fake a too high GCC version so
this way it's more robust.
|
|
The previous documentation for lzma_str_to_filters() was technically
correct, but misleading. lzma_str_to_filters() returns NULL on success,
which is in practice always defined to 0. This is the same value as
LZMA_OK, but lzma_str_to_filters() does not return lzma_ret so we should
be more clear.
|
|
This reverts commit 82e3c968bfa10e3ff13333bd9cbbadb5988d6766.
Macros in the reserved namespace (_foo or __foo) shouldn't be #defined
without a very good reason. Here the alternative would have been
to #define tuklib_has_warning(str) to an approriate value.
Also the tuklib_* files should stay namespace clean if possible.
|
|
__has_warning and other __has_foo macros are meant to become
compiler-agnostic so it's not good to check for __clang__ with it.
This also relied on tuklib_common.h for #defining __has_warning
which was confusing as #defining reserved macros is generally
not a good idea.
|
|
Also edit style to match the existing coding style in the project.
|
|
|
|
This prevents the reserved fields from being part of the generated
Doxygen documentation.
|
|
A few Doxygen tags were obsolete from 1.4.7. Version 1.8.17 released
in 2019, so this should be compatible with resonable modern distros.
The purpose of Doxygen these days is for docs on the website, so it
doesn't necessarily have to work for everyone. Just when the maintainers
want to update the docs.
|
|
Doxygen is now configurable in autotools only with
--enable-doxygen=[api|all]. The default is "api", which will only
generate HTML output for liblzma API functions. The LaTex documentation
output was also disabled.
|
|
This improves the generated Doxygen HTML files to better highlight
how to properly use the liblzma API header files.
|
|
tuklib_physmem depends on GetProcAddress() for both MSVC and MinGW-w64
to retrieve a function address. The proper way to do this is to cast the
return value to the type of function pointer retrieved. Unfortunately,
this causes a cast-function-type warning, so the best solution is to
simply ignore the warning.
|
|
clang supports the __has_warning macro to determine if the version of
clang compiling the code supports a given warning. If we do not define
it for other compilers, it may cause a preprocessor error.
|
|
The 32-bit build needs to be first so the configure cache only needs to
be reset one time. The 32-bit build sets the CFLAGS env variable, so any
build using that flag after will fail unless the cache is reset.
|
|
If CFLAGS are set in a build, the cache must be cleared with
"make distclean", or by deleting the cache file.
|
|
|
|
Calling coder_set_compression_settings() in list mode with verbose mode
on caused the filter chain and memory requirements to print. This was
unnecessary since the command results in an error and not consistent
with other formats like lzma and alone.
|
|
|
|
Disabling shared library generation and linking should help speed up the
runners. The shared library is still being tested in the 32 bit build
and the full feature.
Disabling nls is to check for any unexpected warnings or errors.
|
|
Run the 32 bit job sooner since this is a more interesting test than
some of the later jobs.
|
|
|
|
|
|
|
|
|
|
It's not that important. It can be annoying in builds that
disable many features since in those cases the tests programs
will correctly trigger this warning with Clang.
|
|
It makes no difference here as the return value fits into an int
too and it then gets ignored but this looks better.
|
|
|
|
It doesn't warn on a 64-bit system because truncating
a ptrdiff_t (signed long) to uint32_t is diagnosed under
-Wconversion by GCC and -Wshorten-64-to-32 by Clang.
|
|
|
|
-Wstrict-aliasing was removed from the list since it is enabled
by -Wall already.
A normal build is clean with these on GNU/Linux x86-64 with
GCC 12.2.0 and Clang 14.0.6.
|
|
Explicitly casting the integer to lzma_check silences the warning.
Since such an invalid value is needed in multiple tests, a constant
INVALID_LZMA_CHECK_ID was added to tests.h.
The use of 0x1000 for lzma_block.check wasn't optimal as if
the underlying type is a char then 0x1000 will be truncated to 0.
However, in these test cases the value is ignored, thus even with
such truncation the test would have passed.
|
|
Note that assigning an unsigned int to lzma_check doesn't warn
on GNU/Linux x86-64 since the enum type is unsigned on that
platform. The enum can be signed on some other platform though
so it's best to use enumeration type lzma_check in these situations.
|
|
This is similar to 2ce4f36f179a81d0c6e182a409f363df759d1ad0.
The actual initialization of the variables is done inside
mythread_sync() macro. Clang doesn't seem to see that
the initialization code inside the macro is always executed.
|
|
|
|
|
|
|
|
|
|
clang and gcc differ in how they handle -Wformat-nonliteral. gcc will
allow a non-literal format string as long as the function takes its
format arguments as a va_list.
|
|
|
|
This only occurs in test_filter_flags when the BCJ filters are not
configured and built. In this case, ARRAY_SIZE() returns 0 and causes a
type-limits warning with the loop variable since an unsigned number will
always be >= 0.
|
|
This affects only 32-bit x86 builds. x86-64 is OK as is.
I still cannot easily test this myself. The reporter has tested
this and it passes the tests included in the CMake build and
performance is good: raw CRC64 is 2-3 times faster than the
C version of the slice-by-four method. (Note that liblzma doesn't
include a MSVC-compatible version of the 32-bit x86 assembly code
for the slice-by-four method.)
Thanks to Iouri Kharon for figuring out a fix, testing, and
benchmarking.
|
|
One of the global arrays of filters was only used in a test that
required both encoders and decoders to be configured in the build.
|
|
test_index_hash does not use fill_index_hash() unless both encoders
and decoders are configured in the build.
|
|
If all goes well, Mac autotools and Linux and Mac CMake will be added
later for 32-bit builds.
|
|
This will help us catch warnings and potential bugs in builds that are
not often tested by us.
|
|
For now, the suggested option is for -m32 only, but this can be updated
later if other flags are deemed useful.
|
|
This reverts commit 36edc65ab4cf10a131f239acbd423b4510ba52d5.
It was reported that it wasn't a good enough fix and MSVC
still produced (different kind of) bad code when building
for 32-bit x86 if optimizations are enabled.
Thanks to Iouri Kharon.
|
|
On some platforms src/xz/suffix.c may need <strings.h> for
strcasecmp() but suffix.c includes the header when it needs it.
Unless there is an old system that otherwise supports enough C99
to build XZ Utils but doesn't have C89/C90-compatible <string.h>,
there should be no need to include <strings.h> in sysdefs.h.
|
|
SUSv2 and POSIX.1‐2017 declare only a few functions in <strings.h>.
Of these, strcasecmp() is used on some platforms in suffix.c.
Nothing else in the project needs <strings.h> (at least if
building on a modern system).
sysdefs.h currently includes <strings.h> if HAVE_STRINGS_H is
defined and suffix.c relied on this.
Note that dos/config.h doesn't #define HAVE_STRINGS_H even though
DJGPP does have strings.h. It isn't needed with DJGPP as strcasecmp()
is also in <string.h> in DJGPP.
|
|
|
|
It quite probably was never needed, that is, any system where memory.h
was required likely couldn't compile XZ Utils for other reasons anyway.
XZ Utils 5.2.6 and later source packages were generated using
Autoconf 2.71 which no longer defines HAVE_MEMORY_H. So the code
being removed is no longer used anyway.
|
|
It's a string, not a list. It only worked when the variable was empty.
Thanks to Iouri Kharon.
|
|
|
|
At least on some systems, GNU windres needs --use-temp-file
in addition to the \x20 hack to avoid spaces in the command line
argument. Hovever, that \x20 syntax is broken with llvm-windres
version 15.0.0 (results in "XZx20Utils") but luckily it works
with a regular space. Thus it is best to limit the workarounds
to GNU toolchain on Windows.
|
|
Here are the list of the most significant issues addressed:
- Avoid using internal common.h header. It's not good to copy the
constants like this but common.h cannot be included for use outside
of liblzma. This is the quickest thing to do that could be fixed later.
- Omit the INIT_FILTER macro. Initialization should be done with just
regular designated initializers.
- Use start_offset = 257 for BCJ tests. It demonstrates that Filter
Flags encoder and decoder don't validate the options thoroughly.
257 is valid only for the x86 filter. This is a bit silly but
not a significant problem in practice because the encoder and
decoder initialization functions will catch bad alignment still.
Perhaps this should be fixed but it's not urgent and doesn't need
to be in 5.4.x.
- Various tweaks to comments such as filter id -> Filter ID
|
|
Converts the existing filter flags tests into tuktests.
|
|
I haven't tested with MSVC myself and there doesn't seem to be
information about the problem online, so I'm relying on the bug report.
Thanks to Iouri Kharon for the bug report and the patch.
|
|
It was my mistake. Thanks to Iouri Kharon for the bug report.
|
|
It's not needed in XZ Utils at least for now. It's good to support
it still because if such use is needed later, it wouldn't be
caught on GNU/Linux since malloc(0) from glibc returns non-NULL.
|
|
|
|
The changes listed on cmake-policies(7) for versions 3.17 to 3.25
shouldn't affect this project.
|
|
|
|
The command line tools cannot be built with MSVC for now but
they can be built with MinGW-w64.
Thanks to Iouri Kharon for the bug report and the original patch.
|
|
Thanks to Iouri Kharon for the bug report and the original patch.
|
|
VS2013 doesn't have _mm_set_epi64x() so this way CLMUL gets
disabled with VS2013.
Thanks to Iouri Kharon for the bug report.
|
|
The phase split was only done for Autotools before, so should also
apply to CMake.
|
|
The old version used too many runners that resulted in unnecessary
dependency downloads. Now, the runners are reused for the different
configurations for each OS and build system.
|
|
The new PHASE argument can be build, test, or all. all is the default.
This way, the CI/CD script can differentiate between the build and test
phases to make it easier to track down errors when they happen.
|
|
Tuktest index hash
|
|
|
|
|
|
|
|
|
|
It can trigger warnings from -Wshadow on some systems.
|
|
|
|
The words defined in the .xz file format specification
begin with capital letter to emphasize that they have
a specific meaning.
|
|
|
|
|
|
The line in the .vcxproj files for building with was missing in 5.4.0.
Thank to Hajin Jang for reporting the issue.
|
|
common/index.h is needed by liblzma internally and tests. common.h will
include and define many things that are not needed by the tests. Also,
this prevents include order problems because common.h will redefine
LZMA_API resulting in a warning.
|
|
|
|
|
|
The shell parameter expansion using # and ## is not supported in
Solaris 10 Bourne shell (/bin/sh). Even though this is POSIX, it is not fully
portable, so we should avoid it.
|
|
Thanks to Seong-ho Cho
|
|
|
|
5.5.0alpha won't be released, it's just to mark that
the branch is not for stable 5.4.x.
Once again there is no API/ABI stability for new features
in devel versions. The major soname won't be bumped even
if API/ABI of new features breaks between devel releases.
|
|
|
|
HAVE_DECL_PROGRAM_INVOCATION_NAME is renamed to
HAVE_PROGRAM_INVOCATION_NAME. Previously,
HAVE_DECL_PROGRAM_INVOCATION_NAME was always set when
building with autotools. CMake would only set this when it was 1, and the
dos/config.h did not define it. The new macro definition is consistent
across build systems.
|
|
|
|
Tests all API functions exported from index_hash.h. Does not have a
dedicated test for lzma_index_hash_end.
|
|
This is for consistency with lzma_index_append.
|
|
|
|
|
|
|
|
Previously, mytime.c depended on mythread.h for <time.h> to be included.
|
|
Previously, <sys/time.h> was always included, even if mythread only used
clock_gettime. <time.h> is still needed even if clock_gettime is not used
though because struct timespec is needed for mythread_condtime.
|
|
Previously, if threading was enabled HAVE_DECL_CLOCK_MONOTONIC would always
be set to 0 or 1. However, this macro was needed in xz so if xz was not
built with threading and HAVE_DECL_CLOCK_MONOTONIC was not defined but
HAVE_CLOCK_GETTIME was, it caused a warning during build. Now,
HAVE_DECL_CLOCK_MONOTONIC has been renamed to HAVE_CLOCK_MONOTONIC and
will only be set if it is 1.
|
|
Thanks to Yuri Chornoivan
|
|
The CI/CD workflow will only execute on Ubuntu and MacOS latest version.
The workflow will attempt to build with autotools and CMake and execute
the tests. The workflow will run for all pull requests and pushes done
to the master branch.
|
|
|
|
|
|
In source builds are not recommended, but we should still ignore
the generated artifacts.
|
|
Using return_if_error on lzma_lzma_lclppb_encode was improper because
return_if_error is expecting an lzma_ret value, but
lzma_lzma_lclppb_encode returns a boolean. This could result in
lzma_microlzma_encoder, which would be misleading for applications.
|
|
In source builds are not recommended, but we can make it easier
by ignoring the generated artifacts from CMake.
|
|
|
|
|
|
Using CMake to build liblzma should work on a few other OSes
but building the command line tools is still subtly broken.
It is known that shared library versioning may differ between
CMake and Libtool builds on some OSes, most notably Darwin.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
The TODO file outdated still.
|
|
|
|
|
|
|
|
|
|
The code that parses --memlimit options and --block-list modified
the argv[] when parsing the option string from optarg. This was
visible in "ps auxf" and such and could be confusing. I didn't
understand it back in the day when I wrote that code. Now a copy
is allocated when modifiable strings are needed.
|
|
|
|
|
|
|
|
|
|
Thanks to Remus-Gabriel Chelu.
|
|
|
|
The API docs gave an impression that such checks are done
but they actually weren't done. In practice it made little
difference since the calling code has a bug if these are NULL.
Thanks to Jia Tan for the original patch that checked for
block->filters == NULL.
|
|
This also sorts the symbol names alphabetically in liblzma_*.map.
|
|
|
|
|
|
If someone sets up Clang to define __GNUC__ to 10 or greater
then symvers broke. __has_attribute is supported by such GCC
and Clang versions that don't support __symver__ so this should
be much better and simpler way to detect if __symver__ is
actually supported.
Thanks to Tomasz Gajc for the bug report.
|
|
It has some complicated downsides and its usefulness is more limited
than I originally thought. So this change is bad for certain very
specific situations but a generic solution that works for other
filters (and is otherwise better too) is planned anyway. And this
way 7-Zip can use the same compatible filter for the .7z format.
This is still marked as experimental with a new temporary Filter ID.
|
|
|
|
|
|
Thanks to Jia Tan for the original patch.
|
|
|
|
|
|
This was forgotten from 7484744af6cbabe81e92af7d9e061dfd597fff7b.
|
|
It forwards to me and Jia Tan.
Also update the IRC reference in README as #tukaani was moved
to Libera Chat long ago.
|
|
|
|
|
|
|
|
Thanks to Jia Tan.
|
|
Two uses: Displaying encoder filter chain when compressing with -vv,
and displaying the decoder filter chain in --list -vv.
|
|
lzma_str_to_filters() uses static error messages which makes
them not very precise. It tells the position in the string
where an error occurred though which helps quite a bit if
applications take advantage of it. Dynamic error messages can
be added later with a new flag if it seems important enough.
|
|
|
|
|
|
Here too this avoids the slightly ugly method to set
the uncompressed size.
Also moved the setting of dict_size to the struct initializer.
|
|
This avoids the need to use the slightly ugly method to
set the uncompressed size.
|
|
Some file formats need support for LZMA1 streams that don't use
the end of payload marker (EOPM) alias end of stream (EOS) marker.
So far liblzma API has supported decompressing such streams via
lzma_alone_decoder() when .lzma header specifies a known
uncompressed size. Encoding support hasn't been available in the API.
Instead of adding a new LZMA1-only API for this purpose, this commit
adds a new filter ID for use with raw encoder and decoder. The main
benefit of this approach is that then also filter chains are possible,
for example, if someone wants to implement support for .7z files that
use the x86 BCJ filter with LZMA1 (not BCJ2 as that isn't supported
in liblzma).
|
|
|
|
This allows using two Filter IDs with the same
initialization function and data structures.
|
|
|
|
|
|
|
|
|
|
Now that liblzma accepts these, we avoid the extra check and
there's one message less for translators too.
|
|
That is, if the specified nice_len is smaller than the minimum
of the match finder, silently use the match finder's minimum value
instead of reporting an error. The old behavior is annoying to users
and it complicates xz options handling too.
|
|
A tiny downside of this is that now a 1-4 tiny allocations are made
for every Block because each worker thread needs its own copy of
the filter chain.
|
|
|
|
It not only makes no sense to put symbol versions into a static library
but it can also cause breakage.
By default Libtool #defines PIC if building a shared library and
doesn't define it for static libraries. This is documented in the
Libtool manual. It can be overriden using --with-pic or --without-pic.
configure.ac detects if --with-pic or --without-pic is used and then
gives an error if neither --disable-shared nor --disable-static was
used at the same time. Thus, in normal situations it works to build
both shared and static library at the same time on GNU/Linux,
only --with-pic or --without-pic requires that only one type of
library is built.
Thanks to John Paul Adrian Glaubitz from Debian for reporting
the problem that occurred on ia64:
https://www.mail-archive.com/xz-devel@tukaani.org/msg00610.html
|
|
lzma_filters_free() sets the options to NULL and ids to
LZMA_VLI_UNKNOWN so there is no need to do it by caller;
the filter arrays will always be left in a safe state.
Also use memcpy() instead of a loop to copy a filter chain
when it is known to be safe to copy LZMA_FILTERS_MAX + 1
(even if the elements past the terminator might be uninitialized).
|
|
This time it can happen when lzma_stream_encoder_mt() is used
to reinitialize an existing multi-threaded Stream encoder
and one of 1-4 tiny allocations in lzma_filters_copy() fail.
It's very similar to the previous bug
10430fbf3820dafd4eafd38ec8be161a6978ed2b, happening with
an array of lzma_filter structures whose old options are freed
but the replacement never arrives due to a memory allocation
failure in lzma_filters_copy().
|
|
The documentation mentions that lzma_block_encoder() supports
LZMA_SYNC_FLUSH but it was never added to supported_actions[]
in the internal structure. Because of this, LZMA_SYNC_FLUSH could
not be used with the Block encoder unless it was the next coder
after something like stream_encoder() or stream_encoder_mt().
|
|
This is small but convenient and should have been added
a long time ago.
|
|
|
|
|
|
The bug was in the single-threaded .xz Stream encoder
in the code that is used for both re-initialization and for
lzma_filters_update(). To trigger it, an application had
to either re-initialize an existing encoder instance with
lzma_stream_encoder() or use lzma_filters_update(), and
then one of the 1-4 tiny allocations in lzma_filters_copy()
(called from stream_encoder_update()) must fail. An error
was correctly reported but the encoder state was corrupted.
This is related to the recent fix in
f8ee61e74eb40600445fdb601c374d582e1e9c8a which is good but
it wasn't enough to fix the main problem in stream_encoder.c.
|
|
|
|
The encoder doesn't support dictionary sizes larger than 1536 MiB.
This is validated, for example, when calculating the memory usage
via lzma_raw_encoder_memusage(). It is also enforced by the LZ
part of the encoder initialization. However, LZMA encoder with
LZMA_MODE_NORMAL did an unsafe calculation with dict_size before
such validation and that results in an infinite loop if dict_size
was 2 << 30 or greater.
|
|
These were caught by clang -Wdocumentation.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This reverts commit 177bdc922cb17bd0fd831ab8139dfae912a5c2b8
and also does equivalent change to arm64.c.
Now that ARM64 filter will use lzma_options_bcj, this change
is not needed anymore.
|
|
This is incompatible with the previous version.
This has space/tab fixes in filter_*.c and bcj.h too.
|
|
It also works on E2K as it supports these intrinsics.
On x86-64 runtime detection is used so the code keeps working on
older processors too. A CLMUL-only build can be done by using
-msse4.1 -mpclmul in CFLAGS and this will reduce the library
size since the generic implementation and its 8 KiB lookup table
will be omitted.
On 32-bit x86 this isn't used by default for now because by default
on 32-bit x86 the separate assembly file crc64_x86.S is used.
If --disable-assembler is used then this new CLMUL code is used
the same way as on 64-bit x86. However, a CLMUL-only build
(-msse4.1 -mpclmul) won't omit the 8 KiB lookup table on
32-bit x86 due to a currently-missing check for disabled
assembler usage.
The configure.ac check should be such that the code won't be
built if something in the toolchain doesn't support it but
--disable-clmul-crc option can be used to unconditionally
disable this feature.
CLMUL speeds up decompression of files that have compressed very
well (assuming CRC64 is used as a check type). It is know that
the CLMUL code is significantly slower than the generic code for
tiny inputs (especially 1-8 bytes but up to 16 bytes). If that
is a real-world problem then there is already a commented-out
variant that uses the generic version for small inputs.
Thanks to Ilya Kurdyukov for the original patch which was
derived from a white paper from Intel [1] (published in 2009)
and public domain code from [2] (released in 2016).
[1] https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/fast-crc-computation-generic-polynomials-pclmulqdq-paper.pdf
[2] https://github.com/rawrunprotected/crc
|
|
|
|
|