path: root/src/liblzma/lzma
Age  Commit message  Author  Files  Lines
2024-02-14liblzma: LZMA decoder improvements.Lasse Collin1-186/+78
This adds macros for bittree decoding which prepares the code for alternative C versions and inline assembly.
2024-02-14liblzma: Creates Non-resumable and Resumable modes for lzma_decoder.Jia Tan1-211/+509
The new decoder resumes the first decoder loop in the Resumable mode. Then, the code executes in Non-resumable mode until it detects that it cannot guarantee to have enough input/output to decode another symbol. The Resumable mode is how the decoder has always worked. Before decoding every input bit, it checks if there is enough space and will save its location to be resumed later. When the decoder has more input/output, it jumps back to the correct sequence in the Resumable mode code. When the input/output buffers are large, the Resumable mode is much slower than the Non-resumable because it has more branches and is harder for the compiler to optimize since it is in a large switch block. Early benchmarking shows significant time improvement (8-10% on gcc and clang x86) by using the Non-resumable code as much as possible.
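A toy sketch of the two modes, not the liblzma code: the "symbols" here are 32-bit varints instead of LZMA symbols, and toy_decoder, toy_decode() and VARINT_MAX_BYTES are hypothetical names. The Non-resumable path runs only when a whole symbol is guaranteed to fit in the remaining input, so it needs no per-byte checks. The Resumable path checks before every byte and keeps its position in the state structure.

```c
#include <stddef.h>
#include <stdint.h>

/* A 32-bit varint needs at most five input bytes. */
#define VARINT_MAX_BYTES 5

/* State kept between calls so that decoding can resume mid-symbol. */
typedef struct {
	uint32_t value; /* bits of the partially decoded symbol */
	unsigned shift; /* number of bits already stored in value */
} toy_decoder;

/* Decodes varints from in[] into out[]. Returns the number of symbols
 * written. *in_pos is advanced past the consumed input. */
static size_t
toy_decode(toy_decoder *d, const uint8_t *in, size_t *in_pos,
		size_t in_size, uint32_t *out, size_t out_size)
{
	size_t out_pos = 0;

	while (out_pos < out_size) {
		if (d->shift == 0 && in_size - *in_pos >= VARINT_MAX_BYTES) {
			/* Non-resumable mode: a complete symbol is known
			 * to fit, so no input checks inside the loop. */
			uint32_t value = 0;
			unsigned shift = 0;
			uint8_t b;
			do {
				b = in[(*in_pos)++];
				value |= (uint32_t)(b & 0x7F) << shift;
				shift += 7;
			} while ((b & 0x80) && shift < 35);

			out[out_pos++] = value;
			continue;
		}

		/* Resumable mode: check before every byte. If the input
		 * runs out, d remembers where to continue. */
		for (;;) {
			if (*in_pos == in_size)
				return out_pos;

			const uint8_t b = in[(*in_pos)++];
			d->value |= (uint32_t)(b & 0x7F) << d->shift;
			d->shift += 7;
			if (!(b & 0x80) || d->shift >= 35)
				break;
		}

		out[out_pos++] = d->value;
		d->value = 0;
		d->shift = 0;
	}

	return out_pos;
}
```

Calling toy_decode() again with more input continues a half-read symbol in the Resumable path first and then switches back to the Non-resumable path once enough input is available.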
2024-02-14liblzma: Creates separate "safe" range decoder mode.Jia Tan1-83/+25
The new "safe" range decoder mode is the same as old range decoder, but now the default behavior of the range decoder will not check if there is enough input or output to complete the operation. When the buffers are close to fully consumed, the "safe" operations must be used instead. This will improve speed because it will reduce the number of branches needed for most of the range decoder operations.
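The difference between the two kinds of operations can be sketched with range decoder normalization. This is an illustration only: toy_rc, toy_rc_normalize() and toy_rc_normalize_safe() are hypothetical names, and the real macros in liblzma are organized differently. The default version assumes the caller has already guaranteed that an input byte is available. The "safe" version checks first and reports when it cannot proceed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RC_TOP (UINT32_C(1) << 24)

typedef struct {
	uint32_t range;
	uint32_t code;
} toy_rc;

/* Default mode: no bounds check. The caller must know that in[*in_pos]
 * is readable. */
static inline void
toy_rc_normalize(toy_rc *rc, const uint8_t *in, size_t *in_pos)
{
	if (rc->range < TOY_RC_TOP) {
		rc->range <<= 8;
		rc->code = (rc->code << 8) | in[(*in_pos)++];
	}
}

/* "Safe" mode: returns false without changing anything if a byte is
 * needed but the input is exhausted. */
static inline bool
toy_rc_normalize_safe(toy_rc *rc, const uint8_t *in, size_t *in_pos,
		size_t in_size)
{
	if (rc->range < TOY_RC_TOP) {
		if (*in_pos == in_size)
			return false;

		rc->range <<= 8;
		rc->code = (rc->code << 8) | in[(*in_pos)++];
	}

	return true;
}
```

The unchecked version saves one branch per normalization, which is where the speed gain described above would come from.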
2024-02-14liblzma: Include the SPDX license identifier 0BSD to generated files.Lasse Collin2-6/+10
Perhaps the generated files aren't even copyrightable but using the same license for them as for the rest of liblzma keeps things more consistent for tools that look for license info.
2024-02-14Add SPDX license identifier into 0BSD source code files.Lasse Collin16-2/+31
2024-02-14Change most public domain parts to 0BSD.Lasse Collin16-48/+0
Translations and doc/xz-file-format.txt and doc/lzma-file-format.txt were not touched. COPYING.0BSD was added.
2023-12-16liblzma: Improve lzma encoder init function consistency.Jia Tan1-0/+3
lzma_encoder_init() did not check for NULL options, but lzma2_encoder_init() did. This is more of a code style improvement than anything else to help make lzma_encoder_init() and lzma2_encoder_init() more similar.
2023-10-31liblzma: Fix compilation of fastpos_tablegen.c.Lasse Collin1-0/+2
The macro lzma_attr_visibility_hidden has to be defined to make fastpos.h usable. The visibility attribute is irrelevant to fastpos_tablegen.c so simply #define the macro to an empty value. fastpos_tablegen.c is never built by the included build systems and so the problem wasn't noticed earlier. It's just a standalone program for generating fastpos_table.c. Fixes: https://github.com/tukaani-project/xz/pull/69 Thanks to GitHub user Jamaika1.
2023-10-30liblzma: Use lzma_attr_visibility_hidden on private extern declarations.Lasse Collin1-0/+1
These variables are internal to liblzma and not exposed in the API.
2023-07-31Docs: Fix typos found by codespellDimitri Papadopoulos Orfanos1-2/+2
2023-05-11liblzma: Exports lzma_mt_block_size() as an API function.Jia Tan1-0/+3
lzma_mt_block_size() was previously just an internal function for the multithreaded .xz encoder. It is used to provide a recommended Block size for a given filter chain. This function is helpful to determine the maximum Block size for the multithreaded .xz encoder when one wants to change the filters between blocks. Then, this determined Block size can be provided to lzma_stream_encoder_mt() in the lzma_mt options parameter when initializing the coder. This requires one to know all the filter chains they are using before starting to encode (or at least the filter chain that will need the largest Block size), but that isn't a bad limitation.
2023-03-23Build: Removes redundant check for LZMA1 filter support.Jia Tan1-4/+1
2022-11-27liblzma: Add LZMA_FILTER_LZMA1EXT to support LZMA1 without end marker.Lasse Collin5-8/+66
Some file formats need support for LZMA1 streams that don't use the end of payload marker (EOPM) alias end of stream (EOS) marker. So far liblzma API has supported decompressing such streams via lzma_alone_decoder() when .lzma header specifies a known uncompressed size. Encoding support hasn't been available in the API. Instead of adding a new LZMA1-only API for this purpose, this commit adds a new filter ID for use with raw encoder and decoder. The main benefit of this approach is that then also filter chains are possible, for example, if someone wants to implement support for .7z files that use the x86 BCJ filter with LZMA1 (not BCJ2 as that isn't supported in liblzma).
2022-11-27liblzma: Avoid unneeded use of void pointer in LZMA decoder.Lasse Collin2-3/+2
2022-11-27liblzma: Pass the Filter ID to LZ encoder and decoder.Lasse Collin4-4/+6
This allows using two Filter IDs with the same initialization function and data structures.
2022-11-24liblzma: Allow nice_len 2 and 3 even if match finder requires 3 or 4.Lasse Collin1-3/+8
That is, if the specified nice_len is smaller than the minimum of the match finder, silently use the match finder's minimum value instead of reporting an error. The old behavior is annoying to users and it complicates xz options handling too.
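The new behavior amounts to a clamp. A minimal sketch, assuming the match finder's minimum is available as a plain number; effective_nice_len() is a hypothetical helper, not a liblzma function:

```c
#include <stdint.h>

/* Returns the nice_len that is actually used. A value below the match
 * finder's minimum is raised silently instead of being rejected. */
static uint32_t
effective_nice_len(uint32_t nice_len, uint32_t mf_min_len)
{
	return nice_len < mf_min_len ? mf_min_len : nice_len;
}
```

With a match finder that requires at least 3, a requested nice_len of 2 becomes 3, while larger values pass through unchanged.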
2022-11-22liblzma: Fix infinite loop in LZMA encoder init with dict_size >= 2 GiB.Lasse Collin1-4/+15
The encoder doesn't support dictionary sizes larger than 1536 MiB. This is validated, for example, when calculating the memory usage via lzma_raw_encoder_memusage(). It is also enforced by the LZ part of the encoder initialization. However, LZMA encoder with LZMA_MODE_NORMAL did an unsafe calculation with dict_size before such validation and that results in an infinite loop if dict_size was 2 << 30 or greater.
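The exact calculation isn't shown in this log, so the following is only an assumed illustration of the pattern: a loop that searches for a 32-bit power of two not smaller than dict_size cannot finish once dict_size is too large, so the range check has to come first. toy_dict_log_size() and TOY_DICT_SIZE_MAX are hypothetical names; the 1536 MiB limit is the one stated above.

```c
#include <stdbool.h>
#include <stdint.h>

/* 1536 MiB = 1 GiB + 512 MiB */
#define TOY_DICT_SIZE_MAX ((UINT32_C(1) << 30) + (UINT32_C(1) << 29))

/* Stores ceil(log2(dict_size)) in *log_size. Returns false if dict_size
 * is larger than the supported maximum. */
static bool
toy_dict_log_size(uint32_t dict_size, uint32_t *log_size)
{
	/* Validate before the loop. Without this check the loop below
	 * would need a shift count of 32 for very large values, which
	 * is undefined for uint32_t and may never terminate. */
	if (dict_size > TOY_DICT_SIZE_MAX)
		return false;

	uint32_t n = 0;
	while ((UINT32_C(1) << n) < dict_size)
		++n;

	*log_size = n;
	return true;
}
```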
2022-07-14liblzma: Rename a variable and improve a comment.Lasse Collin1-4/+9
2022-07-13liblzma: Add optional autodetection of LZMA end marker.Lasse Collin2-30/+71
Turns out that this is needed for .lzma files as the spec in LZMA SDK says that end marker may be present even if the size is stored in the header. Such files are rare but exist in the real world. The code in liblzma is so old that the spec didn't exist in LZMA SDK back then and I had understood that such files weren't possible (the lzma tool in LZMA SDK didn't create such files). This modifies the internal API so that LZMA decoder can be told if EOPM is allowed even when the uncompressed size is known. It's allowed with .lzma and not with other uses. Thanks to Karl Beldan for reporting the problem.
2022-02-07liblzma: Add NULL checks to LZMA and LZMA2 properties encoders.jiat752-0/+6
Previously lzma_lzma_props_encode() and lzma_lzma2_props_encode() assumed that the options pointers must be non-NULL because with these filters the API says it must never be NULL. It is good to do these checks anyway.
2021-01-29liblzma: Fix uninitialized variable.Lasse Collin1-0/+1
This was introduced two weeks ago in the commit 625f4c7c99b2fcc4db9e7ab2deb4884790e2e17c. Thanks to Nathan Moinvaziri.
2021-01-14liblzma: Add rough support for output-size-limited encoding in LZMA1.Lasse Collin2-35/+104
With this it is possible to encode LZMA1 data without EOPM so that the encoder will encode as much input as it can without exceeding the specified output size limit. The resulting LZMA1 stream will be a normal LZMA1 stream without EOPM. The actual uncompressed size will be available to the caller via the uncomp_size pointer. One missing thing is that the LZMA layer doesn't inform the LZ layer when the encoding is finished and thus the LZ may read more input when it won't be used. However, this doesn't matter if encoding is done with a single call (which is the planned use case for now). For proper multi-call encoding this should be improved. This commit only adds the functionality for internal use. Nothing uses it yet.
2020-02-24liblzma: Remove unneeded <sys/types.h> from fastpos_tablegen.c.Lasse Collin1-1/+0
This file only generates fastpos_table.c. It isn't built as a part of liblzma.
2020-02-21liblzma: Add more uses of lzma_memcmplen() to the normal mode of LZMA.Lasse Collin1-6/+10
This gives a tiny encoder speed improvement. This could have been done in 2014 after the commit 544aaa3d13554e8640f9caf7db717a96360ec0f6 but it was forgotten.
2019-12-31Rename unaligned_read32ne to read32ne, and similarly for the others.Lasse Collin3-4/+3
2019-06-23liblzma: Fix warnings from -Wsign-conversion.Lasse Collin4-14/+15
Also, more parentheses were added to the literal_subcoder macro in lzma_common.h (better style but no functional change in the current usage).
2019-06-01liblzma: Use unaligned_readXXne functions instead of type punning.Lasse Collin1-1/+1
Now gcc -fsanitize=undefined should be clean. Thanks to Jeffrey Walton.
2017-08-14Fix or hide warnings from GCC 7's -Wimplicit-fallthrough.Lasse Collin1-0/+6
2016-11-21liblzma: Avoid multiple definitions of lzma_coder structures.Lasse Collin8-82/+98
Only one definition was visible in a translation unit. It avoided a few casts and temp variables but seems that this hack doesn't work with link-time optimizations in compilers as it's not C99/C11 compliant. Fixes: http://www.mail-archive.com/xz-devel@tukaani.org/msg00279.html
2015-11-03liblzma: Rename lzma_presets.c back to lzma_encoder_presets.c.Lasse Collin2-2/+2
It would be too annoying to update other build systems just because of this.
2015-11-03Build: Build LZMA1/2 presets also when only decoder is wanted.Lasse Collin2-2/+7
People shouldn't rely on the presets when decoding raw streams, but xz uses the presets as the starting point for raw decoder options anyway. lzma_encoder_presets.c was renamed to lzma_presets.c to make it clear it's not used solely by the encoder code.
2015-03-07liblzma: Silence more uint32_t vs. size_t warnings.Lasse Collin1-1/+1
2015-02-21liblzma: Fix a compression-ratio regression in LZMA1/2 in fast mode.Lasse Collin1-1/+1
The bug was added in the commit f48fce093b07aeda95c18850f5e086d9f2383380 and thus affected 5.1.4beta and 5.2.0. Luckily the bug cannot cause data corruption or other nasty things.
2014-07-25liblzma: Use lzma_memcmplen() in normal mode of LZMA.Lasse Collin1-15/+5
Two locations were not changed yet because the simplest change assumes that the initial "len" may be greater than "limit".
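For readers unfamiliar with the helper: it extends a match length by comparing two buffers until they differ or a limit is reached. A byte-by-byte sketch of that idea follows. toy_memcmplen() is a hypothetical stand-in, not the liblzma implementation (whose point is presumably to be faster than a plain byte loop). This toy version returns len unchanged when len is already at or past limit; whether the real function tolerates that case is not stated here.

```c
#include <stdint.h>

/* Starting at offset len, counts how far buf1 and buf2 keep matching.
 * Returns the first offset where they differ, or limit if they match
 * all the way. */
static uint32_t
toy_memcmplen(const uint8_t *buf1, const uint8_t *buf2,
		uint32_t len, uint32_t limit)
{
	while (len < limit && buf1[len] == buf2[len])
		++len;

	return len;
}
```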
2014-07-25liblzma: Simplify LZMA fast mode code by using memcmp().Lasse Collin1-10/+1
2014-07-25liblzma: Use lzma_memcmplen() in fast mode of LZMA.Lasse Collin1-3/+3
2014-01-12liblzma: Avoid C99 compound literal arrays.Lasse Collin1-3/+5
MSVC 2013 doesn't like them. Maybe they aren't so good for readability either since many aren't used to them.
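The kind of rewrite involved can be illustrated as follows. The changed code itself isn't shown in this log, so sum_values() and sum_example() are hypothetical. The compound literal form appears only in a comment; the named array is the portable replacement.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t
sum_values(const uint32_t *values, size_t count)
{
	uint32_t sum = 0;
	for (size_t i = 0; i < count; ++i)
		sum += values[i];

	return sum;
}

static uint32_t
sum_example(void)
{
	/* C99 compound literal array, avoided here:
	 *     return sum_values((const uint32_t[]){ 1, 2, 3 }, 3);
	 */

	/* Equivalent code with a named array. */
	const uint32_t values[] = { 1, 2, 3 };
	return sum_values(values, sizeof(values) / sizeof(values[0]));
}
```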
2012-07-17liblzma: Make the use of lzma_allocator const-correct.Lasse Collin8-22/+26
There is a tiny risk of causing breakage: If an application assigns lzma_stream.allocator to a non-const pointer, such code won't compile anymore. I don't know why anyone would do such a thing though, so in practice this shouldn't cause trouble. Thanks to Jan Kratochvil for the patch.
2012-06-28liblzma: Check that the first byte of range encoded data is 0x00.Lasse Collin1-2/+6
It is just to be more pedantic and thus perhaps catch broken files slightly earlier.
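A sketch of such a check, assuming the usual LZMA range decoder start-up of five bytes: one byte that is always 0x00 followed by four bytes that form the initial code value. toy_range_decoder and toy_rc_init() are hypothetical names and the real decoder reads these bytes incrementally rather than from one buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
	uint32_t range;
	uint32_t code;
} toy_range_decoder;

/* Initializes rc from the first five bytes of range encoded data.
 * Returns false if there is too little input or if the first byte
 * isn't 0x00, which indicates a broken stream. */
static bool
toy_rc_init(toy_range_decoder *rc, const uint8_t *in, size_t in_size)
{
	if (in_size < 5 || in[0] != 0x00)
		return false;

	rc->range = UINT32_MAX;
	rc->code = 0;

	/* The next four bytes are the initial code, big endian. */
	for (size_t i = 1; i < 5; ++i)
		rc->code = (rc->code << 8) | in[i];

	return true;
}
```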
2011-04-12Remove doubled words from documentation and comments.Lasse Collin1-1/+1
Spot candidates by running these commands:
  git ls-files |xargs perl -0777 -n \
    -e 'while (/\b(then?|[iao]n|i[fst]|but|f?or|at|and|[dt]o)\s+\1\b/gims)' \
    -e '{$n=($` =~ tr/\n/\n/ + 1); ($v=$&)=~s/\n/\\n/g; print "$ARGV:$n:$v\n"}'
Thanks to Jim Meyering for the original patch.
2011-04-11liblzma: Add the forgotten lzma_lzma2_block_size().Lasse Collin2-0/+12
This should have been in 5eefc0086d24a65e136352f8c1d19cefb0cbac7a.
2011-03-31liblzma: Fix decoding of LZMA2 streams having no uncompressed data.Lasse Collin1-4/+4
The decoder considered empty LZMA2 streams to be corrupt. This shouldn't matter much with .xz files, because no encoder creates empty LZMA2 streams in .xz. This bug is more likely to cause problems in applications that use raw LZMA2 streams.
2010-10-26liblzma: Rename a few variables and constants.Lasse Collin8-186/+183
This has no semantic changes. I find the new names slightly more logical and they match the names that are already used in XZ Embedded. The name fastpos wasn't changed (not worth the hassle).
2010-10-19Clean up a few FIXMEs and TODOs.Lasse Collin3-4/+3
lzma_chunk_size() was commented out because it is currently useless.
2010-09-26Fix the preset -3e.Lasse Collin1-0/+1
depth=0 was missing.
2010-09-04Don't set lc=4 with --extreme.Lasse Collin1-1/+0
This should reduce the cases where --extreme makes compression worse. On the other hand, some other files may now benefit slightly less from --extreme.
2010-09-03Tweak the compression presets -0 .. -5.Lasse Collin1-10/+19
"Extreme" mode might need some further tweaking still. Docs were not updated yet.
2010-05-26Rename MIN() and MAX() to my_min() and my_max().Lasse Collin3-9/+9
This should avoid some minor portability issues.
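The commit doesn't say which portability issues were meant. One plausible reason, stated here as an assumption, is that some system headers define their own MIN() and MAX() macros, so project-specific names avoid redefinition clashes. A sketch of what such macros typically look like:

```c
/* Project-specific names cannot collide with MIN()/MAX() that a system
 * header might define. Note that each argument may be evaluated twice,
 * so arguments with side effects must be avoided. */
#define my_min(x, y) ((x) < (y) ? (x) : (y))
#define my_max(x, y) ((x) > (y) ? (x) : (y))
```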
2010-02-12Collection of language fixes to comments and docs.Lasse Collin6-7/+7
Thanks to Jonathan Nieder.
2009-11-22Make fastpos.h use tuklib_integer.h instead of bsr.hLasse Collin1-4/+1
when --enable-small has been specified.
2009-11-15Fix wrong indentation caused by incorrect settingsLasse Collin2-4/+4
in the text editor.
2009-11-14Fix a design error in liblzma API.Lasse Collin2-26/+34
Originally the idea was that using LZMA_FULL_FLUSH with Stream encoder would read the filter chain from the same array that was used to initialize the Stream encoder. Since most apps wouldn't use LZMA_FULL_FLUSH, most apps wouldn't need to keep the filter chain available after initializing the Stream encoder. However, due to my mistake, it actually required keeping the array always available. Since setting the new filter chain via the array used at initialization time is not a nice way to do it for a couple of reasons, this commit ditches it and introduces lzma_filters_update(). This new function replaces also the "persistent" flag used by LZMA2 (and to-be-designed Subblock filter), which was also an ugly thing to do. Thanks to Alexey Tourbin for reminding me about the problem that Stream encoder used to require keeping the filter chain allocated.
2009-10-04Use a tuklib module for integer handling.Lasse Collin3-3/+3
This replaces bswap.h and integer.h. The tuklib module uses <byteswap.h> on GNU, <sys/endian.h> on *BSDs and <sys/byteorder.h> on Solaris, which may contain optimized code like inline assembly.
2009-09-11Fix a couple of warnings.Lasse Collin2-5/+5
2009-06-30Build system fixesLasse Collin2-51/+43
Don't use libtool convenience libraries to avoid recently discovered long-standing subtle but somewhat severe bugs in libtool (at least 1.5.22 and 2.2.6 are affected). It was found when porting XZ Utils to Windows <http://lists.gnu.org/archive/html/libtool/2009-06/msg00070.html> but the problem is significant also e.g. on GNU/Linux. Unless --disable-shared is passed to configure, static library built from a set of convenience libraries will contain PIC objects. That is, while libtool builds non-PIC objects too, only PIC objects will be used from the convenience libraries. On 32-bit x86 (tested on mobile XP2400+), using PIC instead of non-PIC makes the decompressor 10 % slower with the default CFLAGS. So while xz was linked against static liblzma by default, it got the slower PIC objects unless --disable-shared was used. I tend to develop and benchmark with --disable-shared due to faster build time, so I hadn't noticed the problem in benchmarks earlier. This commit also adds support for building Windows resources into liblzma and executables.
2009-06-26Fix @variables@ to $(variables) in Makefile.am files.Lasse Collin1-4/+4
Fix the ordering of libgnu.a and LTLIBINTL on the linker command line and added missing LTLIBINTL to tests/Makefile.am.
2009-04-13Put the interesting parts of XZ Utils into the public domain.Lasse Collin16-184/+72
Some minor documentation cleanups were made at the same time.
2009-02-02Modify LZMA_API macro so that it works on Windows withLasse Collin2-2/+2
other compilers than MinGW. This may hurt readability of the API headers slightly, but I don't know any better way to do this.
2009-01-27Added initial support for preset dictionary for raw LZMA1Lasse Collin5-14/+28
and LZMA2. It is not supported by the .xz format or the xz command line tool yet.
2009-01-19Move some LZMA2 constants to lzma2_encoder.h so that theyLasse Collin3-14/+16
can be used outside lzma2_encoder.c.
2009-01-19Remove dead code.Lasse Collin1-8/+0
2008-12-27Revert a change made in 3b34851de1eaf358cf9268922fa0eeed8278d680Lasse Collin1-15/+8
that was related to LZMA_MODE_FAST. The original code is slightly faster although it compresses slightly worse. But since it is fast mode, it is better to select the faster version.
2008-12-27Bunch of liblzma tweaks, including some API changes.Lasse Collin2-38/+17
The API and ABI should now be very close to stable, although the code behind it isn't yet.
2008-12-15Bunch of liblzma API cleanups and fixes.Lasse Collin1-3/+3
2008-12-15Fix data corruption in LZMA2 decoder.Lasse Collin1-4/+11
2008-12-09Make the memusage functions of LZMA1 and LZMA2 encodersLasse Collin3-16/+35
validate the filter options. Add missing validation to LZMA2 encoder when options are changed in the middle of encoding.
2008-12-01Make the memusage functions of LZMA1 and LZMA2 decodersLasse Collin3-9/+17
validate the filter options.
2008-12-01LZMA2 decoder cleanups. Make it require new LZMA propertiesLasse Collin1-54/+41
also in the first LZMA chunk after a dictionary reset in uncompressed chunk.
2008-10-07Made the preset numbering more logical in liblzma API.Lasse Collin1-1/+2
2008-09-27Some API changes, bug fixes, cleanups etc.Lasse Collin9-123/+112
2008-09-17Miscellaneous LZ and LZMA encoder cleanupsLasse Collin3-101/+23
2008-09-13Renamed constants:Lasse Collin3-14/+14
- LZMA_VLI_VALUE_MAX -> LZMA_VLI_MAX
- LZMA_VLI_VALUE_UNKNOWN -> LZMA_VLI_UNKNOWN
- LZMA_HEADER_ERROR -> LZMA_OPTIONS_ERROR
2008-08-31Fix wrong pointer calculation in LZMA encoder.Lasse Collin1-1/+3
2008-08-28Sort of garbage collection commit. :-| Many things are stillLasse Collin21-2334/+3425
broken. API has changed a lot and it will still change a little more here and there. The command line tool doesn't have all the required changes to reflect the API changes, so it's easy to get "internal error" or trigger assertions.
2008-06-20Remove some redundant code from LZMA encoder.Lasse Collin1-14/+1
2008-06-19Add limit of lc + lp <= 4. Now we can allocate theLasse Collin5-90/+34
literal coder as part of the main LZMA encoder or decoder structure. Make the LZMA decoder rely on the current internal API to free the allocated memory in case an error occurs.
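Why the limit allows embedding the literal coder: LZMA uses 1 << (lc + lp) literal coders of 0x300 probabilities each, so capping lc + lp at 4 gives a small fixed worst case that fits in a plain array member. A sketch under that assumption; all toy_ and TOY_ names are hypothetical, and the real structures contain much more than this.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_LCLP_MAX 4
#define TOY_LITERAL_CODER_SIZE 0x300
#define TOY_LITERAL_CODERS_MAX (1 << TOY_LCLP_MAX)

typedef uint16_t toy_probability;

typedef struct {
	/* Sized for the worst case lc + lp == 4, so no separate
	 * allocation (and no separate free on error) is needed. */
	toy_probability literal[TOY_LITERAL_CODERS_MAX]
			[TOY_LITERAL_CODER_SIZE];
} toy_lzma_coder;

/* Validates lc and lp against the combined limit. */
static bool
toy_is_lclp_valid(uint32_t lc, uint32_t lp)
{
	return lc <= TOY_LCLP_MAX && lp <= TOY_LCLP_MAX
			&& lc + lp <= TOY_LCLP_MAX;
}

/* Number of literal coders in use for valid lc and lp. */
static uint32_t
toy_literal_coders(uint32_t lc, uint32_t lp)
{
	return UINT32_C(1) << (lc + lp);
}
```

With 16-bit probabilities the embedded array takes 16 * 0x300 * 2 = 24576 bytes.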
2008-06-18CommentsLasse Collin1-5/+2
2008-06-18Update the code to mostly match the new simpler file formatLasse Collin2-5/+18
specification. Simplify things by removing most of the support for known uncompressed size in most places. There are some miscellaneous changes here and there too. The API of liblzma has got many changes and still some more will be done soon. While most of the code has been updated, some things are not fixed (the command line tool will choke with invalid filter chain, if nothing else). Subblock filter is somewhat broken for now. It will be updated once the encoded format of the Subblock filter has been decided.
2008-06-11Fix uninitialized variable in LZMA encoder. This wasLasse Collin1-0/+2
introduced in 369f72fd656f537a9a8e06f13e6d0d4c242be22f.
2008-06-01Fix a buffer overflow in the LZMA encoder. It was due to myLasse Collin5-318/+320
misunderstanding of the code. There's no tiny fix for this problem, so I also cleaned up the code in general. This reduces the speed of the encoder 2-5 % in the fastest compression mode ("lzma -1"). High compression modes should have no noticeable performance difference. This commit breaks things (especially LZMA_SYNC_FLUSH) but I will fix them once the new format and LZMA2 has been roughly implemented. Plain LZMA won't support LZMA_SYNC_FLUSH at all and won't be supported in the new .lzma format. This may change still but this is what it looks like now. Support for known uncompressed size (that is, LZMA or LZMA2 without EOPM) is likely to go away. This means there will be API changes.
2008-04-24Added two assert()s.Lasse Collin1-1/+3
2008-04-24Fix fastpos problem in Makefile.am when built with --enable-small.Lasse Collin1-1/+4
2008-03-22Update a comment to use the variable name rep_len_decoder.Lasse Collin1-1/+1
(And BTW, the previous commit actually did change the program logic slightly.)
2008-03-22Demystified the "state" variable in LZMA code. Use theLasse Collin6-70/+107
word literal instead of char for better consistency. There are still some names with _char instead of _literal in lzma_optimum, these may be changed later. Renamed length coder variables. This commit doesn't change the program logic.
2008-03-14Fix data corruption in LZMA encoder. Note that this bug wasLasse Collin1-0/+4
specific to liblzma and was *not* present in LZMA SDK.
2008-03-11Apply a minor speed optimization to LZMA decoder.Lasse Collin1-42/+43
2008-03-10Really fix the price count initialization.Lasse Collin1-2/+2
2008-03-10Initialize align_price_count and match_price_count inLasse Collin1-0/+2
lzma_encoder_init.c. While we don't call fill_distances_prices() and fill_align_prices() in lzma_lzma_encoder_init(), we still need to initialize these two variables so that the fill functions get called in lzma_encoder_getoptimum.c in the beginning of a stream.
2008-02-28Remove two redundant validity checks from the LZMA decoder.Lasse Collin1-19/+4
These are already checked elsewhere, so omitting these gives (very) tiny speed up.
2008-01-18Fix LZMA_SYNC_FLUSH handling in LZ and LZMA encoders.Lasse Collin1-25/+2
That code is now almost completely in LZ coder, where it can be shared with other LZ77-based algorithms in future.
2008-01-15Revised the fastpos code. It now uses the slightly fasterLasse Collin9-44/+746
table-based version from LZMA SDK 4.57. This should be fast on most systems. A simpler and smaller alternative version is also provided. On some CPUs this can be even a little faster than the default table-based version (see comments in fastpos.h), but on most systems the table-based code is faster.
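The smaller alternative isn't shown in this log. The sketch below assumes it derives the slot directly from the position of the highest set bit, which follows from how LZMA distance slots are defined: two slots per power of two, selected by the bit below the highest one. toy_bsr32() is a portable stand-in for a bit scan reverse instruction and toy_get_pos_slot() is a hypothetical name.

```c
#include <stdint.h>

/* Index of the highest set bit. n must be nonzero. */
static uint32_t
toy_bsr32(uint32_t n)
{
	uint32_t i = 0;
	while ((n >>= 1) != 0)
		++i;

	return i;
}

/* Maps a match distance (stored as distance - 1) to its slot without
 * using a lookup table. */
static uint32_t
toy_get_pos_slot(uint32_t pos)
{
	/* Values up to 4 are their own slots. */
	if (pos <= 4)
		return pos;

	const uint32_t i = toy_bsr32(pos);
	return (i + i) + ((pos >> (i - 1)) & 1);
}
```

A table-based version trades this computation for memory lookups, which matches the speed versus size tradeoff described above.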
2008-01-15Removed a few unused macros from lzma_common.h.Lasse Collin1-6/+2
2008-01-15Fix a typo in lzma_encoder.c.Lasse Collin1-1/+1
2008-01-15Convert bittree_get_price() and bittree_reverse_get_price()Lasse Collin2-22/+13
from macros to inline functions.
2008-01-14Remove RC_BUFFER_SIZE from lzma_encoder_private.hLasse Collin1-2/+4
and replace it with a sanity check.
2008-01-14Major changes to LZ encoder, LZMA encoder, and range encoder.Lasse Collin1-31/+43
These changes implement support for LZMA_SYNC_FLUSH in LZMA encoder, and move the temporary buffer needed by range encoder from lzma_range_encoder structure to lzma_lz_encoder.
2008-01-14In lzma_read_match_distances(), don't useLasse Collin1-3/+3
coder->lz.stream_end_was_reached. That variable will be removed, and the check isn't required anyway. Rearrange the check so that it doesn't make one think that there could be an integer overflow.
2008-01-14More fixes to LZMA decoder's flush marker handling.Lasse Collin1-22/+30
2008-01-05Another bug fix for flush marker detection.Lasse Collin1-1/+9
2008-01-04Fix stupid bugs in flush marker detection.Lasse Collin1-3/+4
2008-01-04Added support for flush marker, which will be in filesLasse Collin2-117/+104
that use LZMA_SYNC_FLUSH with encoder (not implemented yet). This is a new feature in the raw LZMA format, which isn't supported by old decoders. This shouldn't be a problem in practice, since lzma_alone_encoder() will not allow LZMA_SYNC_FLUSH, and thus not allow creating files undecodable with old decoders. Made lzma_decoder.c require tab width of 4 characters if one wants to fit the code in 80 columns. This makes the code easier to read.
2008-01-04Moved range decoder initialization (reading the firstLasse Collin1-36/+6
five input bytes) from LZMA decoder to range decoder header. Did the same for decoding of direct bits.
2007-12-09Imported to git.Lasse Collin14-0/+3309