aboutsummaryrefslogtreecommitdiff
path: root/src/xz/coder.c (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2022-11-09xz: Add .lz (lzip) decompression support.Lasse Collin1-3/+65
If configured with --disable-lzip-decoder then --long-help will still list `lzip' in --format but I left it like that since due to translations it would be messy to have two help strings. Features are disabled only in special situations so wrong help in such a situation shouldn't matter much. Thanks to Michał Górny for the original patch.
2022-11-09xz: Add comments about stdin and src_st.st_size.Lasse Collin1-0/+9
"xz -v < regular_file > out.xz" doesn't display the percentage and estimated remaining time because it doesn't even try to check the input file size when input is read from stdin. This could be improved but for now there's just a comment to remind about it.
2022-11-09xz: Fix displaying of file sizes in progress indicator in passthru mode.Lasse Collin1-1/+5
It worked for one input file since the counters are zero when xz starts but they weren't reset when starting a new file in passthru mode. For example, if files A, B, and C are one byte each, then "xz -dcvf A B C" would show file sizes as 1, 2, and 3 bytes instead of 1, 1, and 1 byte.
2022-10-25xz: Fix --single-stream with an empty .xz Stream.Lasse Collin1-0/+9
Example: $ xz -dc --single-stream good-0-empty.xz xz: good-0-empty.xz: Internal error (bug) The code, that is tries to catch some input file issues early, didn't anticipate LZMA_STREAM_END which is possible in that code only when --single-stream is used.
2022-10-25xz: Fix decompressor behavior if input uses an unsupported check type.Lasse Collin1-4/+15
Now files with unsupported check will make xz display a warning, set the exit status to 2 (unless --no-warn is used), and then decompress the file normally. This is how it was supposed to work since the beginning but this was broken by the commit 231c3c7098f1099a56abb8afece76fc9b8699f05, that is, a little before 5.0.0 was released. The buggy behavior displayed a message, set exit status 1 (error), and xz didn't attempt to to decompress the file. This doesn't matter today except for special builds that disable CRC64 or SHA-256 at build time (but such builds should be used in special situations only). The bug matters if new check type is added in the future and an old xz version is used to decompress such a file; however, it's likely that such files would use a new filter too and an old xz wouldn't be able to decompress the file anyway. The first hunk in the commit is the actual fix. The second hunk is a cleanup since LZMA_TELL_ANY_CHECK isn't used in xz. There is a test file for unsupported check type but it wasn't used by test_files.sh, perhaps due to different behavior between xz and the simpler xzdec.
2022-04-14xz: Add a default soft memory usage limit for --threads=0.Lasse Collin1-2/+26
This is a soft limit in sense that it only affects the number of threads. It never makes xz fail and it never makes xz change settings that would affect the compressed output. The idea is to make -T0 have more reasonable behavior when the system has very many cores or when a memory-hungry compression options are used. This also helps with 32-bit xz, preventing it from running out of address space. The downside of this commit is that now the number of threads might become too low compared to what the user expected. I hope this to be an acceptable compromise as the old behavior has been a source of well-argued complaints for a long time.
2022-04-14xz: Make -T0 use multithreaded mode on single-core systems.Lasse Collin1-9/+9
The main problem withi the old behavior is that the compressed output is different on single-core systems vs. multicore systems. This commit fixes it by making -T0 one thread in multithreaded mode on single-core systems. The downside of this is that it uses more memory. However, if --memlimit-compress is used, xz can (thanks to the previous commit) drop to the single-threaded mode still.
2022-04-14xz: Changes to --memlimit-compress and --no-adjust.Lasse Collin1-20/+43
In single-threaded mode, --memlimit-compress can make xz scale down the LZMA2 dictionary size to meet the memory usage limit. This obviously affects the compressed output. However, if xz was in threaded mode, --memlimit-compress could make xz reduce the number of threads but it wouldn't make xz switch from multithreaded mode to single-threaded mode or scale down the LZMA2 dictionary size. This seemed illogical and there was even a "FIXME?" about it. Now --memlimit-compress can make xz switch to single-threaded mode if one thread in multithreaded mode uses too much memory. If memory usage is still too high, then the LZMA2 dictionary size can be scaled down too. The option --no-adjust was also changed so that it no longer prevents xz from scaling down the number of threads as that doesn't affect compressed output (only performance). After this commit --no-adjust only prevents adjustments that affect compressed output, that is, with --no-adjust xz won't switch from multithreaded mode to single-threaded mode and won't scale down the LZMA2 dictionary size. The man page wasn't updated yet.
2022-04-12xz: Add --memlimit-mt-decompress along with a default limit value.Lasse Collin1-23/+11
--memlimit-mt-decompress allows specifying the limit for multithreaded decompression. This matches memlimit_threading in liblzma. This limit can only affect the number of threads being used; it will never prevent xz from decompressing a file. The old --memlimit-decompress option is still used at the same time. If the value of --memlimit-decompress (the default value or one specified by the user) is less than the value of --memlimit-mt-decompress , then --memlimit-mt-decompress is reduced to match --memlimit-decompress. Man page wasn't updated yet.
2022-03-07xz: Add initial support for threaded decompression.Lasse Collin1-1/+35
If threading support is enabled at build time, this will use lzma_stream_decoder_mt() even for single-threaded mode. With memlimit_threading=0 the behavior should be identical. This needs some work like adding --memlimit-threading=LIMIT. The original patch from Sebastian Andrzej Siewior included a method to get currently available RAM on Linux. It might be one way to go but as it is Linux-only, the available-RAM approach needs work for portability or using a fallback method on other OSes. The man page wasn't updated yet.
2020-01-26xz: Set the --flush-timeout deadline when the first input byte arrives.Lasse Collin1-5/+1
xz --flush-timeout=2000, old version: 1. xz is started. The next flush will happen after two seconds. 2. No input for one second. 3. A burst of a few kilobytes of input. 4. No input for one second. 5. Two seconds have passed and flushing starts. The first second counted towards the flush-timeout even though there was no pending data. This can cause flushing to occur more often than needed. xz --flush-timeout=2000, after this commit: 1. xz is started. 2. No input for one second. 3. A burst of a few kilobytes of input. The next flush will happen after two seconds counted from the time when the first bytes of the burst were read. 4. No input for one second. 5. No input for another second. 6. Two seconds have passed and flushing starts.
2020-01-26xz: Move flush_needed from mytime.h to file_pair struct in file_io.h.Lasse Collin1-1/+2
2020-01-26xz: coder.c: Make writing output a separate function.Lasse Collin1-13/+17
The same code sequence repeats so it's nicer as a separate function. Note that in one case there was no test for opt_mode != MODE_TEST, but that was only because that condition would always be true, so this commit doesn't change the behavior there.
2020-01-26xz: Fix semi-busy-waiting in xz --flush-timeout.Lasse Collin1-0/+4
When input blocked, xz --flush-timeout=1 would wake up every millisecond and initiate flushing which would have nothing to flush and thus would just waste CPU time. The fix disables the timeout when no input has been seen since the previous flush.
2019-06-23xz: Fix some of the warnings from -Wsign-conversion.Lasse Collin1-2/+2
2019-05-11spellingAntoine Cœur1-2/+2
2018-12-20xz: Fix a crash in progress indicator when in passthru mode.Lasse Collin1-4/+7
"xz -dcfv not_an_xz_file" crashed (all four options are required to trigger it). It caused xz to call lzma_get_progress(&strm, ...) when no coder was initialized in strm. In this situation strm.internal is NULL which leads to a crash in lzma_get_progress(). The bug was introduced when xz started using lzma_get_progress() to get progress info for multi-threaded compression, so the bug is present in versions 5.1.3alpha and higher. Thanks to Filip Palian <Filip.Palian@pjwstk.edu.pl> for the bug report.
2015-11-03xz: Make xz buildable even when encoders or decoders are disabled.Lasse Collin1-8/+25
The patch is quite long but it's mostly about adding new #ifdefs to omit code when encoders or decoders have been disabled. This adds two new #defines to config.h: HAVE_ENCODERS and HAVE_DECODERS.
2014-08-05xz: Add --ignore-check.Lasse Collin1-1/+9
2014-06-18xz: Check for filter chain compatibility for --flush-timeout.Lasse Collin1-9/+21
This avoids LZMA_PROG_ERROR from lzma_code() with filter chains that don't support LZMA_SYNC_FLUSH.
2014-06-09xz: Force single-threaded mode when --flush-timeout is used.Lasse Collin1-0/+11
2014-05-08xz: Fix uint64_t vs. size_t which broke 32-bit build.Lasse Collin1-1/+1
Thanks to Christian Hesse.
2014-01-12xz: Fix a comment.Lasse Collin1-2/+2
2013-11-12xz: Make --block-list and --block-size work together in single-threaded.Lasse Collin1-15/+75
Previously, --block-list and --block-size only worked together in threaded mode. Boundaries are specified by --block-list, but --block-size specifies the maximum size for a Block. Now this works in single-threaded mode too. Thanks to James M Leddy for the original patch.
2013-10-22xz: Take advantage of LZMA_FULL_BARRIER with --block-list.Lasse Collin1-17/+15
Now if --block-list is used in threaded mode, the encoder won't need to flush at each Block boundary specified via --block-list. This improves performance a lot, making threading helpful with --block-list. The flush timer was reset after LZMA_FULL_FLUSH but since LZMA_FULL_BARRIER doesn't flush, resetting the timer is no longer done.
2013-09-17Add native threading support on Windows.Lasse Collin1-4/+4
Now liblzma only uses "mythread" functions and types which are defined in mythread.h matching the desired threading method. Before Windows Vista, there is no direct equivalent to pthread condition variables. Since this package doesn't use pthread_cond_broadcast(), pre-Vista threading can still be kept quite simple. The pre-Vista code doesn't use anything that wasn't already available in Windows 95, so the binaries should run even on Windows 95 if someone happens to care.
2013-07-04xz: Add preliminary support for --flush-timeout=TIMEOUT.Lasse Collin1-11/+35
When --flush-timeout=TIMEOUT is used, xz will use LZMA_SYNC_FLUSH if read() would block and at least TIMEOUT milliseconds has elapsed since the previous flush. This can be useful in realtime-like use cases where the data is simultanously decompressed by another process (possibly on a different computer). If new uncompressed input data is produced slowly, without this option xz could buffer the data for a long time until it would become decompressible from the output. If TIMEOUT is 0, the feature is disabled. This is the default. This commit affects the compression side. Using xz for the decompression side for the above purpose doesn't work yet so well because there is quite a bit of input and output buffering when decompressing. The --long-help or man page were not updated yet. The details of this feature may change.
2013-07-04xz: Fix the test when to read more input.Lasse Collin1-3/+3
Testing for end of file was no longer correct after full flushing became possible with --block-size=SIZE and --block-list=SIZES. There was no bug in practice though because xz just made a few unneeded zero-byte reads.
2013-07-04xz: Move some of the timing code into mytime.[hc].Lasse Collin1-0/+5
This switches units from microseconds to milliseconds. New clock_gettime(CLOCK_MONOTONIC) will be used if available. There is still a fallback to gettimeofday().
2013-06-21xz: Fix interaction between preset and custom filter chains.Lasse Collin1-14/+21
There was somewhat illogical behavior when --extreme was specified and mixed with custom filter chains. Before this commit, "xz -9 --lzma2 -e" was equivalent to "xz --lzma2". After it is equivalent to "xz -6e" (all earlier preset options get forgotten when a custom filter chain is specified and the default preset is 6 to which -e is applied). I find this less illogical. This also affects the meaning of "xz -9e --lzma2 -7". Earlier it was equivalent to "xz -7e" (the -e specified before a custom filter chain wasn't forgotten). Now it is "xz -7". Note that "xz -7e" still is the same as "xz -e7". Hopefully very few cared about this in the first place, so pretty much no one should even notice this change. Thanks to Conley Moorhous.
2012-07-03xz: Add incomplete support for --block-list.Lasse Collin1-8/+40
It's broken with threads and when also --block-size is used.
2011-11-03xz: Fix xz on EBCDIC systems.Lasse Collin1-1/+4
Thanks to Chris Donawa.
2011-05-17Add underscores to attributes (__attribute((__foo__))).Lasse Collin1-1/+1
2011-05-01xz: Fix input file position when --single-stream is used.Lasse Collin1-0/+1
Now the following works as you would expect: echo foo | xz > foo.xz echo bar | xz >> foo.xz ( xz -dc --single-stream ; xz -dc --single-stream ) < foo.xz Note that it doesn't work if the input is not seekable or if there is Stream Padding between the concatenated .xz Streams.
2011-05-01xz: Print the maximum number of worker threads in xz -vv.Lasse Collin1-0/+4
2011-04-11xz: Add support for threaded compression.Lasse Collin1-79/+123
2011-04-08xz: Change size_t to uint32_t in a few places.Lasse Collin1-3/+3
2011-04-08xz: Fix a typo in a comment.Lasse Collin1-1/+1
2011-04-05xz: Call lzma_end(&strm) before exiting if debugging is enabled.Lasse Collin1-0/+10
2011-03-18xz: Add --block-size=SIZE.Lasse Collin1-10/+40
This uses LZMA_FULL_FLUSH every SIZE bytes of input. Man page wasn't updated yet.
2011-03-18xz: Add --single-stream.Lasse Collin1-2/+9
This can be useful when there is garbage after the compressed stream (.xz, .lzma, or raw stream). Man page wasn't updated yet.
2010-09-03xz: Make -vv show also decompressor memory usage.Lasse Collin1-0/+7
2010-09-02xz: Make setting a preset override a custom filter chain.Lasse Collin1-0/+9
This is more logical behavior than ignoring preset level options once a custom filter chain has been specified.
2010-09-02xz: Always warn if adjusting dictionary size due to memlimit.Lasse Collin1-19/+9
2010-08-07Disable the memory usage limiter by default.Lasse Collin1-3/+5
For several people, the limiter causes bigger problems that it solves, so it is better to have it disabled by default. Those who want to have a limiter by default need to enable it via the environment variable XZ_DEFAULTS. Support for environment variable XZ_DEFAULTS was added. It is parsed before XZ_OPT and technically identical with it. The intended uses differ quite a bit though; see the man page. The memory usage limit can now be set separately for compression and decompression using --memlimit-compress and --memlimit-decompress. To set both at once, -M or --memlimit can be used. --memory was retained as a legacy alias for --memlimit for backwards compatibility. The semantics of --info-memory were changed in backwards incompatible way. Compatibility wasn't meaningful due to changes in the memory usage limiter functionality. The memory usage limiter info is no longer shown at the bottom of xz --long -help. The memory usage limiter support for removed completely from xzdec. xz's man page was updated to match the above changes. Various unrelated fixes were also made to the man page.
2010-06-15Add --no-adjust.Lasse Collin1-6/+2
2010-05-16Split message_filters().Lasse Collin1-1/+1
message_filters_to_str() converts the filter chain to a string. message_filters_show() replaces the original message_filters(). uint32_to_optstr() was also added to show the dictionary size in nicer format when possible.
2010-02-12Collection of language fixes to comments and docs.Lasse Collin1-1/+1
Thanks to Jonathan Nieder.
2010-01-31Select the default integrity check type at runtime.Lasse Collin1-5/+14
Previously it was set statically to CRC64 or CRC32 depending on options passed to the configure script.
2010-01-31Improve displaying of the memory usage limit.Lasse Collin1-5/+3
2010-01-31Delay opening the destionation file and other fixes.Lasse Collin1-20/+44
The opening of the destination file is now delayed a little. The coder is initialized, and if decompressing, the memory usage of the first Block compared against the memory usage limit before the destination file is opened. This means that if --force was used, the old "target" file won't be deleted so easily when something goes wrong very early. Thanks to Mark K for the bug report. The above fix required some changes to progress message handling. Now there is a separate function for setting and printing the filename. It is used also in list.c. list_file() now handles stdin correctly (gives an error). A useless check for user_abort was removed from file_io.c.
2010-01-24Some improvements to printing sizes in xz.Lasse Collin1-35/+21
2009-11-25Add missing error check to coder.c.Lasse Collin1-9/+11
With bad luck this could cause a segfault due to reading (but not writing) past the end of the buffer.
2009-11-25Create sparse files by default when decompressing intoLasse Collin1-16/+17
a regular file. Sparse file creation can be disabled with --no-sparse. I don't promise yet that the name of this option won't change before 5.0.0. It's possible that the code, that checks when it is safe to use sparse output on stdout, is not good enough, and a more flexible command line option is needed to configure sparse file handling.
2009-07-20Avoid internal error with --format=xz --lzma1.Lasse Collin1-4/+12
2009-07-04Make "xz --decompress --stdout --force" copy unrecognizedLasse Collin1-35/+178
files as is to standard output. This feature is needed to be more compatible with gzip's behavior. This was more complicated to implement than it sounds, because the way liblzma is able to return errors with files of only a few bytes in size. xz now has its own file type detection code and no longer uses lzma_auto_decoder().
2009-06-26Updated comments to match renamed files.Lasse Collin1-1/+1
2009-06-26Rename process.[hc] to coder.[hc] and io.[hc] to file_io.[hc]Lasse Collin1-0/+0
to avoid problems on systems with system headers with those names.
2009-06-26Rename process_file() to coder_run().Lasse Collin1-3/+3
2009-06-26Ugly hack to make it possible to use the thousand separatorLasse Collin1-15/+15
format character with snprintf() on POSIX systems but not on non-POSIX systems and still keep xgettext working.
2009-05-22Make the default memory usage limit 40 % of RAM for bothLasse Collin1-11/+7
compressing and decompressing. This should be OK now that xz automatically scales down the compression settings if they would exceed the memory usage limit (earlier, the limit for compression was increased to 90 % because low limit broke scripts that used "xz -9" on systems with low RAM). Support spcifying the memory usage limit as a percentage of RAM (e.g. --memory=50%). Support --threads=0 to reset the thread limit to the default value (number of available CPU cores). Use UINT32_MAX instead of SIZE_MAX as the maximum in args.c. hardware.c was already expecting uint32_t value. Cleaned up the output of --help and --long-help.
2009-04-13Put the interesting parts of XZ Utils into the public domain.Lasse Collin1-10/+3
Some minor documentation cleanups were made at the same time.
2009-02-22Fixes to progress message handling in xz:Lasse Collin1-26/+27
- Don't use Windows-specific code on Windows. The old code required at least Windows 2000. Now it should work on Windows 98 and later, and maybe on Windows 95 too. - Use less precision when showing estimated remaining time. - Fix some small design issues.
2009-02-14Cleanups to the code that detects the amount of RAM andLasse Collin1-0/+2
the number of CPU cores. Added support for using sysinfo() on Linux systems whose libc lacks appropriate sysconf() support (at least dietlibc). The Autoconf macros were split into separate files, and CPU core count detection was moved from hardware.c to cpucores.h. The core count isn't used for anything real for now, so a problematic part in process.c was commented out.
2009-02-13Fix handling of integrity check type in the xz command line tool.Lasse Collin1-0/+4
2009-01-17Beta was supposed to be API stable but I had forgot to renameLasse Collin1-3/+3
lzma_memlimit_encoder and lzma_memlimit_decoder to lzma_raw_encoder_memlimit and lzma_raw_decoder_memlimit. :-( Now it is fixed. Hopefully it doesn't cause too much trouble to those who already thought API is stable.
2008-12-27Some xz command line tool improvements.Lasse Collin1-28/+105
2008-12-17xz message handling improvementsLasse Collin1-2/+26
2008-12-01In command line tool, take advantage of memusage calculation'sLasse Collin1-4/+4
ability to also validate the filter chain and options (not implemented yet for all filters).
2008-11-22Typo fixLasse Collin1-1/+1
2008-11-19Renamed lzma to xz and lzmadec to xzdec. We create symlinksLasse Collin1-0/+0
lzma, unlzma, and lzcat in "make install" for backwards compatibility with LZMA Utils 4.32.x; I'm not sure if this should be the default though.
2008-11-19Oh well, big messy commit again. Some highlights:Lasse Collin1-284/+241
- Updated to the latest, probably final file format version. - Command line tool reworked to not use threads anymore. Threading will probably go into liblzma anyway. - Memory usage limit is now about 30 % for uncompression and about 90 % for compression. - Progress indicator with --verbose - Simplified --help and full --long-help - Upgraded to the last LGPLv2.1+ getopt_long from gnulib. - Some bug fixes
2008-10-02Initial changes to change the suffix of the new format to .xz.Lasse Collin1-10/+14
This also fixes a bug related to --suffix option. Some issues with suffixes with --format=raw were not fixed.
2008-09-11Silence a compiler warning.Lasse Collin1-1/+1
2008-09-06Some API cleanupsLasse Collin1-1/+1
2008-09-04Added support for raw encoding and decoding to the commandLasse Collin1-8/+34
line tool, and made various cleanups. --lzma was renamed to --lzma1 to prevent people from accidentally using LZMA when they want LZMA2.
2008-09-02Command line tool fixesLasse Collin1-8/+13
2008-08-28Sort of garbage collection commit. :-| Many things are stillLasse Collin1-66/+22
broken. API has changed a lot and it will still change a little more here and there. The command line tool doesn't have all the required changes to reflect the API changes, so it's easy to get "internal error" or trigger assertions.
2008-06-18Update the code to mostly match the new simpler file formatLasse Collin1-21/+5
specification. Simplify things by removing most of the support for known uncompressed size in most places. There are some miscellaneous changes here and there too. The API of liblzma has got many changes and still some more will be done soon. While most of the code has been updated, some things are not fixed (the command line tool will choke with invalid filter chain, if nothing else). Subblock filter is somewhat broken for now. It will be updated once the encoded format of the Subblock filter has been decided.
2008-01-14Added one assert() to process.c of the command line tool.Lasse Collin1-0/+1
2008-01-08Disable CRC32 from Block Headers when --check=noneLasse Collin1-1/+1
has been specified.