Age | Commit message (Collapse) | Author | Files | Lines |
|
This improves the generated Doxygen HTML files to better highlight
how to properly use the liblzma API header files.
|
|
tuklib_physmem depends on GetProcAddress() for both MSVC and MinGW-w64
to retrieve a function address. The proper way to do this is to cast the
return value to the type of function pointer retrieved. Unfortunately,
this causes a cast-function-type warning, so the best solution is to
simply ignore the warning.
|
|
|
|
Calling coder_set_compression_settings() in list mode with verbose mode
on caused the filter chain and memory requirements to print. This was
unnecessary since the command results in an error and not consistent
with other formats like lzma and alone.
|
|
|
|
It's not that important. It can be annoying in builds that
disable many features since in those cases the tests programs
will correctly trigger this warning with Clang.
|
|
It makes no difference here as the return value fits into an int
too and it then gets ignored but this looks better.
|
|
|
|
It doesn't warn on a 64-bit system because truncating
a ptrdiff_t (signed long) to uint32_t is diagnosed under
-Wconversion by GCC and -Wshorten-64-to-32 by Clang.
|
|
|
|
-Wstrict-aliasing was removed from the list since it is enabled
by -Wall already.
A normal build is clean with these on GNU/Linux x86-64 with
GCC 12.2.0 and Clang 14.0.6.
|
|
Explicitly casting the integer to lzma_check silences the warning.
Since such an invalid value is needed in multiple tests, a constant
INVALID_LZMA_CHECK_ID was added to tests.h.
The use of 0x1000 for lzma_block.check wasn't optimal as if
the underlying type is a char then 0x1000 will be truncated to 0.
However, in these test cases the value is ignored, thus even with
such truncation the test would have passed.
|
|
Note that assigning an unsigned int to lzma_check doesn't warn
on GNU/Linux x86-64 since the enum type is unsigned on that
platform. The enum can be signed on some other platform though
so it's best to use enumeration type lzma_check in these situations.
|
|
This is similar to 2ce4f36f179a81d0c6e182a409f363df759d1ad0.
The actual initialization of the variables is done inside
mythread_sync() macro. Clang doesn't seem to see that
the initialization code inside the macro is always executed.
|
|
|
|
|
|
|
|
|
|
|
|
On some platforms src/xz/suffix.c may need <strings.h> for
strcasecmp() but suffix.c includes the header when it needs it.
Unless there is an old system that otherwise supports enough C99
to build XZ Utils but doesn't have C89/C90-compatible <string.h>,
there should be no need to include <strings.h> in sysdefs.h.
|
|
SUSv2 and POSIX.1‐2017 declare only a few functions in <strings.h>.
Of these, strcasecmp() is used on some platforms in suffix.c.
Nothing else in the project needs <strings.h> (at least if
building on a modern system).
sysdefs.h currently includes <strings.h> if HAVE_STRINGS_H is
defined and suffix.c relied on this.
Note that dos/config.h doesn't #define HAVE_STRINGS_H even though
DJGPP does have strings.h. It isn't needed with DJGPP as strcasecmp()
is also in <string.h> in DJGPP.
|
|
clang and gcc differ in how they handle -Wformat-nonliteral. gcc will
allow a non-literal format string as long as the function takes its
format arguments as a va_list.
|
|
|
|
This only occurs in test_filter_flags when the BCJ filters are not
configured and built. In this case, ARRAY_SIZE() returns 0 and causes a
type-limits warning with the loop variable since an unsigned number will
always be >= 0.
|
|
This affects only 32-bit x86 builds. x86-64 is OK as is.
I still cannot easily test this myself. The reporter has tested
this and it passes the tests included in the CMake build and
performance is good: raw CRC64 is 2-3 times faster than the
C version of the slice-by-four method. (Note that liblzma doesn't
include a MSVC-compatible version of the 32-bit x86 assembly code
for the slice-by-four method.)
Thanks to Iouri Kharon for figuring out a fix, testing, and
benchmarking.
|
|
One of the global arrays of filters was only used in a test that
required both encoders and decoders to be configured in the build.
|
|
test_index_hash does not use fill_index_hash() unless both encoders
and decoders are configured in the build.
|
|
|
|
This reverts commit 36edc65ab4cf10a131f239acbd423b4510ba52d5.
It was reported that it wasn't a good enough fix and MSVC
still produced (different kind of) bad code when building
for 32-bit x86 if optimizations are enabled.
Thanks to Iouri Kharon.
|
|
|
|
It quite probably was never needed, that is, any system where memory.h
was required likely couldn't compile XZ Utils for other reasons anyway.
XZ Utils 5.2.6 and later source packages were generated using
Autoconf 2.71 which no longer defines HAVE_MEMORY_H. So the code
being removed is no longer used anyway.
|
|
This is combined from the following commits in the master branch:
443dfebced041adc88f10d824188eeef5b5821a9
6b117d3b1fe91eb26d533ab16a2e552f84148d47
5e34774c31d1b7509b5cb77a3be9973adec59ea0
Thanks to Iouri Kharon for the bug report, the original patch,
and testing.
|
|
Here are the list of the most significant issues addressed:
- Avoid using internal common.h header. It's not good to copy the
constants like this but common.h cannot be included for use outside
of liblzma. This is the quickest thing to do that could be fixed later.
- Omit the INIT_FILTER macro. Initialization should be done with just
regular designated initializers.
- Use start_offset = 257 for BCJ tests. It demonstrates that Filter
Flags encoder and decoder don't validate the options thoroughly.
257 is valid only for the x86 filter. This is a bit silly but
not a significant problem in practice because the encoder and
decoder initialization functions will catch bad alignment still.
Perhaps this should be fixed but it's not urgent and doesn't need
to be in 5.4.x.
- Various tweaks to comments such as filter id -> Filter ID
|
|
Converts the existing filter flags tests into tuktests.
|
|
It's not needed in XZ Utils at least for now. It's good to support
it still because if such use is needed later, it wouldn't be
caught on GNU/Linux since malloc(0) from glibc returns non-NULL.
|
|
The changes listed on cmake-policies(7) for versions 3.17 to 3.25
shouldn't affect this project.
|
|
|
|
|
|
It was my mistake. Thanks to Iouri Kharon for the bug report.
|
|
The command line tools cannot be built with MSVC for now but
they can be built with MinGW-w64.
Thanks to Iouri Kharon for the bug report and the original patch.
|
|
I haven't tested with MSVC myself and there doesn't seem to be
information about the problem online, so I'm relying on the bug report.
Thanks to Iouri Kharon for the bug report and the patch.
|
|
VS2013 doesn't have _mm_set_epi64x() so this way CLMUL gets
disabled with VS2013.
Thanks to Iouri Kharon for the bug report.
|
|
Tests all API functions exported from index_hash.h. Does not have a
dedicated test for lzma_index_hash_end.
[Minor edits were made by Lasse Collin.]
|
|
common/index.h is needed by liblzma internally and tests. common.h will
include and define many things that are not needed by the tests.
Also, this prevents include order problems because both common.h and
lzma.h define LZMA_API. On most platforms it results only in a warning
but on Windows it would break the build as the definition in common.h
must be used only for building liblzma itself.
|
|
This is for consistency with lzma_index_append.
|
|
|
|
|
|
The line in the .vcxproj files for building with was missing in 5.4.0.
Thank to Hajin Jang for reporting the issue.
|
|
|
|
|
|
The shell parameter expansion using # and ## is not supported in
Solaris 10 Bourne shell (/bin/sh). Even though this is POSIX, it is not fully
portable, so we should avoid it.
|
|
Thanks to Seong-ho Cho
|
|
|
|
|
|
HAVE_DECL_PROGRAM_INVOCATION_NAME is renamed to
HAVE_PROGRAM_INVOCATION_NAME. Previously,
HAVE_DECL_PROGRAM_INVOCATION_NAME was always set when
building with autotools. CMake would only set this when it was 1, and the
dos/config.h did not define it. The new macro definition is consistent
across build systems.
|
|
|
|
|
|
Previously, mytime.c depended on mythread.h for <time.h> to be included.
|
|
Previously, <sys/time.h> was always included, even if mythread only used
clock_gettime. <time.h> is still needed even if clock_gettime is not used
though because struct timespec is needed for mythread_condtime.
|
|
Previously, if threading was enabled HAVE_DECL_CLOCK_MONOTONIC would always
be set to 0 or 1. However, this macro was needed in xz so if xz was not
built with threading and HAVE_DECL_CLOCK_MONOTONIC was not defined but
HAVE_CLOCK_GETTIME was, it caused a warning during build. Now,
HAVE_DECL_CLOCK_MONOTONIC has been renamed to HAVE_CLOCK_MONOTONIC and
will only be set if it is 1.
|
|
Thanks to Yuri Chornoivan
|
|
|
|
|
|
In source builds are not recommended, but we should still ignore
the generated artifacts.
|
|
Using return_if_error on lzma_lzma_lclppb_encode was improper because
return_if_error is expecting an lzma_ret value, but
lzma_lzma_lclppb_encode returns a boolean. This could result in
lzma_microlzma_encoder, which would be misleading for applications.
|
|
|
|
In source builds are not recommended, but we can make it easier
by ignoring the generated artifacts from CMake.
|
|
|
|
Using CMake to build liblzma should work on a few other OSes
but building the command line tools is still subtly broken.
It is known that shared library versioning may differ between
CMake and Libtool builds on some OSes, most notably Darwin.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
The TODO file outdated still.
|
|
|
|
|
|
|
|
|
|
The code that parses --memlimit options and --block-list modified
the argv[] when parsing the option string from optarg. This was
visible in "ps auxf" and such and could be confusing. I didn't
understand it back in the day when I wrote that code. Now a copy
is allocated when modifiable strings are needed.
|
|
|
|
|
|
|
|
|
|
Thanks to Remus-Gabriel Chelu.
|
|
|
|
The API docs gave an impression that such checks are done
but they actually weren't done. In practice it made little
difference since the calling code has a bug if these are NULL.
Thanks to Jia Tan for the original patch that checked for
block->filters == NULL.
|
|
This also sorts the symbol names alphabetically in liblzma_*.map.
|
|
|
|
|
|
If someone sets up Clang to define __GNUC__ to 10 or greater
then symvers broke. __has_attribute is supported by such GCC
and Clang versions that don't support __symver__ so this should
be much better and simpler way to detect if __symver__ is
actually supported.
Thanks to Tomasz Gajc for the bug report.
|
|
It has some complicated downsides and its usefulness is more limited
than I originally thought. So this change is bad for certain very
specific situations but a generic solution that works for other
filters (and is otherwise better too) is planned anyway. And this
way 7-Zip can use the same compatible filter for the .7z format.
This is still marked as experimental with a new temporary Filter ID.
|
|
|
|
|
|
Thanks to Jia Tan for the original patch.
|
|
|
|
|
|
This was forgotten from 7484744af6cbabe81e92af7d9e061dfd597fff7b.
|
|
It forwards to me and Jia Tan.
Also update the IRC reference in README as #tukaani was moved
to Libera Chat long ago.
|
|
|
|
|
|
|
|
Thanks to Jia Tan.
|
|
Two uses: Displaying encoder filter chain when compressing with -vv,
and displaying the decoder filter chain in --list -vv.
|
|
lzma_str_to_filters() uses static error messages which makes
them not very precise. It tells the position in the string
where an error occurred though which helps quite a bit if
applications take advantage of it. Dynamic error messages can
be added later with a new flag if it seems important enough.
|
|
|
|
|
|
Here too this avoids the slightly ugly method to set
the uncompressed size.
Also moved the setting of dict_size to the struct initializer.
|
|
This avoids the need to use the slightly ugly method to
set the uncompressed size.
|
|
Some file formats need support for LZMA1 streams that don't use
the end of payload marker (EOPM) alias end of stream (EOS) marker.
So far liblzma API has supported decompressing such streams via
lzma_alone_decoder() when .lzma header specifies a known
uncompressed size. Encoding support hasn't been available in the API.
Instead of adding a new LZMA1-only API for this purpose, this commit
adds a new filter ID for use with raw encoder and decoder. The main
benefit of this approach is that then also filter chains are possible,
for example, if someone wants to implement support for .7z files that
use the x86 BCJ filter with LZMA1 (not BCJ2 as that isn't supported
in liblzma).
|
|
|
|
This allows using two Filter IDs with the same
initialization function and data structures.
|
|
|
|
|
|
|
|
|
|
Now that liblzma accepts these, we avoid the extra check and
there's one message less for translators too.
|
|
That is, if the specified nice_len is smaller than the minimum
of the match finder, silently use the match finder's minimum value
instead of reporting an error. The old behavior is annoying to users
and it complicates xz options handling too.
|
|
A tiny downside of this is that now a 1-4 tiny allocations are made
for every Block because each worker thread needs its own copy of
the filter chain.
|
|
|
|
It not only makes no sense to put symbol versions into a static library
but it can also cause breakage.
By default Libtool #defines PIC if building a shared library and
doesn't define it for static libraries. This is documented in the
Libtool manual. It can be overriden using --with-pic or --without-pic.
configure.ac detects if --with-pic or --without-pic is used and then
gives an error if neither --disable-shared nor --disable-static was
used at the same time. Thus, in normal situations it works to build
both shared and static library at the same time on GNU/Linux,
only --with-pic or --without-pic requires that only one type of
library is built.
Thanks to John Paul Adrian Glaubitz from Debian for reporting
the problem that occurred on ia64:
https://www.mail-archive.com/xz-devel@tukaani.org/msg00610.html
|
|
lzma_filters_free() sets the options to NULL and ids to
LZMA_VLI_UNKNOWN so there is no need to do it by caller;
the filter arrays will always be left in a safe state.
Also use memcpy() instead of a loop to copy a filter chain
when it is known to be safe to copy LZMA_FILTERS_MAX + 1
(even if the elements past the terminator might be uninitialized).
|
|
This time it can happen when lzma_stream_encoder_mt() is used
to reinitialize an existing multi-threaded Stream encoder
and one of 1-4 tiny allocations in lzma_filters_copy() fail.
It's very similar to the previous bug
10430fbf3820dafd4eafd38ec8be161a6978ed2b, happening with
an array of lzma_filter structures whose old options are freed
but the replacement never arrives due to a memory allocation
failure in lzma_filters_copy().
|
|
The documentation mentions that lzma_block_encoder() supports
LZMA_SYNC_FLUSH but it was never added to supported_actions[]
in the internal structure. Because of this, LZMA_SYNC_FLUSH could
not be used with the Block encoder unless it was the next coder
after something like stream_encoder() or stream_encoder_mt().
|
|
This is small but convenient and should have been added
a long time ago.
|
|
|
|
|
|
The bug was in the single-threaded .xz Stream encoder
in the code that is used for both re-initialization and for
lzma_filters_update(). To trigger it, an application had
to either re-initialize an existing encoder instance with
lzma_stream_encoder() or use lzma_filters_update(), and
then one of the 1-4 tiny allocations in lzma_filters_copy()
(called from stream_encoder_update()) must fail. An error
was correctly reported but the encoder state was corrupted.
This is related to the recent fix in
f8ee61e74eb40600445fdb601c374d582e1e9c8a which is good but
it wasn't enough to fix the main problem in stream_encoder.c.
|
|
|
|
The encoder doesn't support dictionary sizes larger than 1536 MiB.
This is validated, for example, when calculating the memory usage
via lzma_raw_encoder_memusage(). It is also enforced by the LZ
part of the encoder initialization. However, LZMA encoder with
LZMA_MODE_NORMAL did an unsafe calculation with dict_size before
such validation and that results in an infinite loop if dict_size
was 2 << 30 or greater.
|
|
These were caught by clang -Wdocumentation.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This reverts commit 177bdc922cb17bd0fd831ab8139dfae912a5c2b8
and also does equivalent change to arm64.c.
Now that ARM64 filter will use lzma_options_bcj, this change
is not needed anymore.
|
|
This is incompatible with the previous version.
This has space/tab fixes in filter_*.c and bcj.h too.
|
|
It also works on E2K as it supports these intrinsics.
On x86-64 runtime detection is used so the code keeps working on
older processors too. A CLMUL-only build can be done by using
-msse4.1 -mpclmul in CFLAGS and this will reduce the library
size since the generic implementation and its 8 KiB lookup table
will be omitted.
On 32-bit x86 this isn't used by default for now because by default
on 32-bit x86 the separate assembly file crc64_x86.S is used.
If --disable-assembler is used then this new CLMUL code is used
the same way as on 64-bit x86. However, a CLMUL-only build
(-msse4.1 -mpclmul) won't omit the 8 KiB lookup table on
32-bit x86 due to a currently-missing check for disabled
assembler usage.
The configure.ac check should be such that the code won't be
built if something in the toolchain doesn't support it but
--disable-clmul-crc option can be used to unconditionally
disable this feature.
CLMUL speeds up decompression of files that have compressed very
well (assuming CRC64 is used as a check type). It is know that
the CLMUL code is significantly slower than the generic code for
tiny inputs (especially 1-8 bytes but up to 16 bytes). If that
is a real-world problem then there is already a commented-out
variant that uses the generic version for small inputs.
Thanks to Ilya Kurdyukov for the original patch which was
derived from a white paper from Intel [1] (published in 2009)
and public domain code from [2] (released in 2016).
[1] https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/fast-crc-computation-generic-polynomials-pclmulqdq-paper.pdf
[2] https://github.com/rawrunprotected/crc
|
|
|
|
|
|
|
|
|
|
It didn't do anything. There are only 32-bit x86 assembly files
and it feels likely that new files won't be added as intrinsics
in C are more portable across toolchains and OSes.
|
|
This uses it for CRC table initializations when using --disable-small.
It avoids mythread_once() overhead. It also means that then
--disable-small --disable-threads is thread-safe if this attribute
is supported.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
It claims __GNUC__ >= 10 but doesn't support __symver__ attribute.
Thanks to Stephen Sachs.
|
|
__SSE2__ is the correct macro for SSE2 support with GCC, Clang,
and ICC. __SSE2_MATH__ means doing floating point math with SSE2
instead of 387. Often the latter macro is defined if the first
one is but it was still a bug.
|
|
|
|
The other scripts don't need changes for .lz support because
in those scripts it is enough that xz supports .lz.
|
|
In practice this means making the scripts work when
the input files have an unsupported check type which
isn't a problem in practice unless support for
some check types has been disabled at build time.
|
|
That's how it is preferred at the Translation Project.
On my system /usr/share/man/fr_FR doesn't contain any
other man pages than XZ Utils while /usr/share/man/fr
has quite a few, so this will fix that too.
Thanks to Benno Schulenberg from the Translation Project.
|
|
The --arm64 isn't actually implemented yet in the form
described in this commit.
Thanks to Jia Tan.
|
|
Modern 32-bit ARM in big endian mode use little endian for
instruction encoding still, so the filters work on such
executables too. It's likely less confusing for users this way.
The --arm64 option hasn't been implemented yet (there is
--experimental-arm64 but it's different). The --arm64 option
is added now anyway because this is the likely result and the
strings need to be ready for translators.
Thanks to Jia Tan.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
If configured with --disable-lzip-decoder then --long-help will
still list `lzip' in --format but I left it like that since
due to translations it would be messy to have two help strings.
Features are disabled only in special situations so wrong help
in such a situation shouldn't matter much.
Thanks to Michał Górny for the original patch.
|
|
Thanks to Michał Górny for the original patch.
|
|
Support for format version 0 was removed from lzip 1.18 for some
reason. .lz format version 0 files are rare (and old) but some
source packages were released in this format, and some people might
have personal files in this format too. It's very little extra code
to support it along side format version 1 so this commits adds
support for both.
The Sync Flush marker extentension to the original .lz format
version 1 isn't supported. It would require changes to the
LZMA decoder itself. Such files are very rare anyway.
See the API doc for lzma_lzip_decoder() for more details about
the .lz format support.
Thanks to Michał Górny for the original patch.
|
|
This was forgotten from commit 59c4d6e1390f6f4176f43ac1dad1f7ac03c449b8.
|
|
"xz -v < regular_file > out.xz" doesn't display the percentage
and estimated remaining time because it doesn't even try to
check the input file size when input is read from stdin.
This could be improved but for now there's just a comment
to remind about it.
|
|
It worked for one input file since the counters are zero when
xz starts but they weren't reset when starting a new file in
passthru mode. For example, if files A, B, and C are one byte each,
then "xz -dcvf A B C" would show file sizes as 1, 2, and 3 bytes
instead of 1, 1, and 1 byte.
|
|
It is on the man page still.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
It feels better that the initializations are sandboxed too.
They don't do anything that the pledge() call wouldn't allow.
|
|
Now it includes everything that the human-readable --info-memory shows.
|
|
This affects lzma_memusage() and lzma_memlimit_set() when used
with the threaded decompressor. Now all allocations are reported
by lzma_memusage() (so it's not misleading) and lzma_memlimit_set()
cannot lower the limit below that value.
The alternative would have been to allow lowering the limit if
doing so is possible by freeing the cached memory but since
the primary use case of lzma_memlimit_set() is to increase
memlimit after LZMA_MEMLIMIT_ERROR this simple approach
was selected.
The cached memory was always included when enforcing
the memory usage limit while decoding.
Thanks to Jia Tan.
|
|
This should be smaller too since it avoids the string constants.
|
|
|
|
|
|
Don't call InitOnceComplete() if initialization was already done.
So far mythread_once() has been needed only when building
with --enable-small. windows/build.bash does this together
with --disable-threads so the Vista-specific mythread_once()
is never needed by those builds. VS project files or
CMake-builds don't support HAVE_SMALL builds at all.
|
|
|
|
This was forgotten from commit 2611c4d90535652d3eb7ef4a026a6691276fab43.
|
|
It now tries to test as many files as easily possible.
The exit status indicates skipping if any of the files were
skipped. This way it is easy to notice if something is being
skipped when it isn't expected.
|
|
xz (but not xzdec) will normally warn about unsupported check
but since we are testing specifically such a file, it's better
to silence that warning so that it doesn't look suspicious in
test_files.sh.log.
The use of -q and -Q in xzdec is just for consistency and
doesn't affect the result at least for now.
|
|
|
|
|