path: root/src/xz
Age | Commit message | Author | Files | Lines
2024-02-14Add SPDX license identifier into 0BSD source code files.Lasse Collin27-2/+53
2024-02-14Change most public domain parts to 0BSD.Lasse Collin28-84/+1
Translations and doc/xz-file-format.txt and doc/lzma-file-format.txt were not touched. COPYING.0BSD was added.
2024-01-23xz: Use threaded mode by default (as if --threads=0 was used).Lasse Collin3-3/+16
This hopefully does more good than bad:
+ It's faster by default.
+ Only the threaded compressor creates files that can be decompressed in threaded mode.
- Compression ratio is worse, usually not too much though. When it matters, -T1 must be used.
- Memory usage increases.
- Scripts that assume single-threaded mode but don't use -T1 will possibly use too much resources, for example, if they run multiple xz processes in parallel to compress multiple files.
- Output from single-threaded and multi-threaded compressors differ but such changes could happen for other reasons too (they just haven't happened since 5.0.0).
2024-01-23xz: Man page: Add more examples of LZMA2 options with BCJ filters.Lasse Collin1-7/+31
2024-01-23xz: Update xz -lvv for RISC-V filter.Jia Tan1-0/+10
Version 5.6.0 will be shown, even though upcoming alphas and betas will be able to support this filter. 5.6.0 looks nicer in the output and people shouldn't be encouraged to use an unstable version in production in any way.
2024-01-23xz: Update message in --long-help for RISC-V Filter.Jia Tan1-0/+1
2024-01-23xz: Update the man page for the RISC-V Filter.Jia Tan1-1/+2
A special note was added to suggest using four-byte alignment when the compressed instruction extension is not present in a RISC-V binary.
2024-01-23liblzma: Add RISC-V BCJ filter.Jia Tan1-0/+7
The new Filter ID is 0x0B. Thanks to Chien Wong <m@xv97.com> for the initial version of the Filter, the xz CLI updates, and the Autotools build system modifications. Thanks to Igor Pavlov for his many contributions to the design of the filter.
2024-01-19xz: Update website URLs in the man pages.Jia Tan1-3/+3
2023-12-21xz: Add a comment to Capsicum sandbox setup.Jia Tan1-0/+1
This comment is repeated in xzdec.c to help remind us why all the capabilities are removed from stdin in certain situations.
2023-11-30xz: Fix typoKian-Meng Ang1-1/+1
2023-11-23xz: Tweak a comment.Lasse Collin1-2/+2
2023-11-23xz: Use is_tty() in message.c.Jia Tan1-6/+1
2023-11-23xz: Create separate is_tty() function.Jia Tan2-7/+37
The new is_tty() will report if a file descriptor is a terminal or not. On POSIX systems, it is a wrapper around isatty(). However, the native Windows implementation of isatty() will return true for all character devices, not just terminals. So is_tty() has a special case for Windows so it can use alternative Windows API functions to determine if a file descriptor is a terminal. This fixes a bug with MSVC and MinGW-w64 builds that refused to read from or write to non-terminal character devices because xz thought it was a terminal. For instance: xz foo -c > /dev/null would fail because /dev/null was assumed to be a terminal.
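The distinction this commit relies on (a character device is not necessarily a terminal) can be illustrated with a small sketch. This is not the xz implementation, which is C and uses Windows API calls on that platform; `classify_fd` is a hypothetical helper, and the sketch assumes a POSIX system where the null device is a character device.

```python
import os
import stat


def classify_fd(fd):
    """Return (is_char_device, is_terminal) for an open file descriptor.

    Hypothetical helper for illustration only. On POSIX, os.isatty() wraps
    isatty(), which is true only for real terminals.
    """
    mode = os.fstat(fd).st_mode
    return stat.S_ISCHR(mode), os.isatty(fd)


if __name__ == "__main__":
    # The null device is a character device but not a terminal, which is
    # exactly the case the Windows isatty() was reported to get wrong.
    fd = os.open(os.devnull, os.O_RDONLY)
    try:
        print(classify_fd(fd))
    finally:
        os.close(fd)
```

A check based only on "is this a character device" would treat the null device as a terminal, which matches the failure described for `xz foo -c > /dev/null`.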
2023-11-18xz: Move the check for --suffix with --format=raw a few lines earlier.Lasse Collin1-22/+22
Now it reads from argv[] instead of args->arg_names.
2023-11-17xz: Fix a bug with --files and --files0 in raw mode without a suffix.Jia Tan1-0/+5
The following command caused a segmentation fault: xz -Fraw --lzma1 --files=foo when foo was a valid file. The usage of --files or --files0 was not being checked when compressing or decompressing in raw mode without a suffix. The suffix checking code was meant to validate that all files to be processed are "-" (if not writing to standard out), meaning the data is only coming from standard in. In this case, there were no file names to check since --files and --files0 store their file name in a different place. Later code assumed the suffix was set and caused a segmentation fault. Now, the above command results in an error.
2023-11-15xz: Refactor suffix test with raw format.Jia Tan1-25/+13
The previous version set opt_stdout, but this caused an issue with copying an input file to standard out when decompressing an unknown file type. The following needs to result in an error: echo foo | xz -df since -c, --stdout is not used. This fixes the previous error by not setting opt_stdout.
2023-11-14xz: Move suffix check after stdout mode is detected.Jia Tan1-8/+8
This fixes a bug introduced in cc5aa9ab138beeecaee5a1e81197591893ee9ca0 when the suffix check was initially moved. This caused a situation that previously worked: echo foo | xz -Fraw --lzma1 | wc -c to fail because the old code knew that this would write to standard out so a suffix was not needed.
2023-11-14xz: Detect when all data will be written to standard out earlier.Jia Tan1-0/+21
If the -c, --stdout argument is not used, then we can still detect when the data will be written to standard out if all of the provided filenames are "-" (denoting standard in) or if no filenames are provided.
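The rule described above can be sketched as follows. The function name and its parameters are hypothetical; the real logic lives in xz's C argument parsing.

```python
def writes_only_to_stdout(opt_stdout, filenames):
    """Sketch of the detection rule from the commit message.

    opt_stdout: True if -c/--stdout was given.
    filenames:  the file name arguments from the command line.
    """
    if opt_stdout:
        return True
    # all() of an empty list is True, which covers "no filenames provided".
    return all(name == "-" for name in filenames)


if __name__ == "__main__":
    print(writes_only_to_stdout(False, []))
    print(writes_only_to_stdout(False, ["-", "-"]))
    print(writes_only_to_stdout(False, ["-", "foo"]))
```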
2023-10-22xz: Support basic sandboxing with Linux Landlock (ABI versions 1-3).Lasse Collin3-1/+79
It is enabled only when decompressing one file to stdout, similar to how Capsicum is used. Landlock was added in Linux 5.13.
2023-10-22Simplify detection of Capsicum support.Lasse Collin3-11/+7
This removes support for FreeBSD 10.0 and 10.1 which used <sys/capability.h> instead of <sys/capsicum.h>. Support for FreeBSD 10.1 ended on 2016-12-31. So now FreeBSD >= 10.2 is required to enable Capsicum support. This also removes support for Capsicum on Linux (libcaprights) which seems to have been unmaintained since 2017 and Linux 4.11: https://github.com/google/capsicum-linux
2023-10-22xz/Windows: Allow clock_gettime with POSIX threads.Lasse Collin1-3/+6
If winpthreads are used for threading, it's OK to use clock_gettime() from winpthreads too.
2023-10-22xz/Windows: Ensure that clock_gettime() isn't used with MinGW-w64.Lasse Collin1-2/+7
This commit alone doesn't change anything in the real-world:
- configure.ac currently checks for clock_gettime() only when using pthreads.
- CMakeLists.txt doesn't check for clock_gettime() on Windows.
So clock_gettime() wasn't used with MinGW-w64 before either. clock_gettime() provides monotonic time and it's better than gettimeofday() in this sense. But clock_gettime() is defined in winpthreads, and liblzma or xz needs nothing else from winpthreads. By avoiding clock_gettime(), we avoid the dependency on libwinpthread-1.dll or the need to link against the static version. As a bonus, GetTickCount64() and MinGW-w64's gettimeofday() can be faster than clock_gettime(CLOCK_MONOTONIC, &tv). The resolution is more than good enough for the progress indicator in xz.
2023-10-22xz/Windows: Use GetTickCount64() with MinGW-w64 if using Vista threads.Lasse Collin1-3/+11
2023-09-24xz: Change quoting style from `...' to '...'.Jia Tan7-18/+18
2023-09-22xz: Windows: Don't (de)compress to special files like "con" or "nul".Lasse Collin1-7/+28
Before this commit, the following writes "foo" to the console and deletes the input file:
    echo foo | xz > con_xz
    xz --suffix=_xz --decompress con_xz
It cannot happen without --suffix because names like con.xz are also special and so attempting to decompress con.xz (or compress con to con.xz) will already fail when opening the input file. A similar thing is possible when compressing. The following writes to "nul" and the input file "n" is deleted:
    echo foo | xz > n
    xz --suffix=ul n
Now xz checks if the destination is a special file before continuing. The DOS/DJGPP version had a check for this but Windows (and OS/2) didn't.
2023-09-22xz, xzdec, lzmainfo: Use tuklib_attr_noreturn.Lasse Collin5-20/+27
For compatibility with C23's [[noreturn]], tuklib_attr_noreturn must be at the beginning of declaration (before "extern" or "static", and even before any GNU C's __attribute__). This commit also moves all other function attributes to the beginning of function declarations. "extern" is kept at the beginning of a line so the attributes are listed on separate lines before "extern" or "static".
2023-09-22Remove incorrect uses of __attribute__((__malloc__)).Lasse Collin1-2/+2
xrealloc() is obviously incorrect, modern GCC docs even mention realloc() as an example where this attribute cannot be used. liblzma's lzma_alloc() and lzma_alloc_zero() would be correct uses most of the time but custom allocators may use a memory pool or otherwise hold the pointer so aliasing issues could happen in theory. The xstrdup() case likely was correct but I removed it anyway. Now there are no __malloc__ attributes left in the code. The allocations aren't in hot paths so this should make no practical difference.
2023-09-22MSVC: xz: Make file_io.c and file_io.h compatible with MSVC.Lasse Collin2-0/+36
Thanks to Kelvin Lee for the original patches and testing the modifications I made.
2023-09-22MSVC: xz: Use GetTickCount64() to implement mytime_now().Lasse Collin1-2/+9
It's available since Windows Vista.
2023-09-22MSVC: xz: Use _stricmp() instead of strcasecmp() in suffix.c.Kelvin Lee1-2/+8
2023-09-22MSVC: xz: Use _isatty() from <io.h> to implement isatty().Kelvin Lee2-0/+10
2023-09-22MSVC: xz: Use _fileno() instead of fileno().Kelvin Lee1-0/+4
2023-09-22MSVC: Don't #include <unistd.h>.Kelvin Lee1-1/+4
2023-08-31xz: Refactor thousand separator detection and disable it on MSVC.Lasse Collin1-44/+45
Now the two variations of the format strings are created with a macro, and the whole detection code can be easily disabled on platforms where thousand separator formatting is known to not work (MSVC has no support, and on DJGPP 2.05 it can have problems in some cases).
2023-08-31xz: Fix a too relaxed assertion and remove uses of SSIZE_MAX.Lasse Collin2-5/+4
SSIZE_MAX isn't readily available on MSVC. Removing it means that there is one thing less to worry when porting to MSVC.
2023-08-02xz: Omit an empty paragraph on the man page.Lasse Collin1-1/+0
2023-07-31Docs: Fix typos found by codespellDimitri Papadopoulos Orfanos1-2/+2
2023-07-18xz: Translate the second "%s: " in message.c since French needs "%s : ".Lasse Collin1-1/+1
This string is used to print a filename when using "xz -v" and stderr isn't a terminal.
2023-07-18xz: Make "%s: %s" translatable because French needs "%s : %s".Lasse Collin4-14/+18
2023-07-18xz: Update Authors list in a few files.Jia Tan5-5/+10
2023-07-17xz: Fix typo in man page.Jia Tan1-1/+1
The Memory limit information section described three output columns when it actually has six. This was reworded to "multiple" to make it more future proof.
2023-07-17xz: Minor clean up for coder.cJia Tan1-32/+21
* Moved max_block_list_size from a global to local variable.
* Reworded error message in validate_block_list_filter().
* Removed helper function filter_chain_error().
* Changed 1 << X to 1U << X in many places
2023-07-17xz: Update man page Authors and date.Jia Tan1-2/+3
2023-07-17xz: Add a section to man page for robot mode --filters-help.Jia Tan1-2/+30
2023-07-17xz: Slight reword in xz man page for consistency.Jia Tan1-1/+1
Changed will print => prints in xz --robot --version description to match --robot --info-memory description.
2023-07-17xz: Reorder robot mode subsections in the man page.Jia Tan1-96/+96
The order is now consistent with the order the command line arguments are documented earlier in the man page. The new order is:
1. --list
2. --info-memory
3. --version
Instead of the previous order:
1. --version
2. --info-memory
3. --list
2023-07-17xz: Update man page for new --filters-help option.Jia Tan1-0/+10
2023-07-17xz: Add a new --filters-help option.Jia Tan3-0/+43
The --filters-help can be used to help create filter chains with the --filters and --filtersX options. The message in --long-help is too short to fully explain the syntax to construct complex filter chains. In --robot mode, xz will only print the output from liblzma function lzma_str_list_filters.
2023-07-17xz: Update the man page for --block-list and --filtersXJia Tan1-26/+80
The --block-list option description needed updating since the new --filtersX option changes how it can be used. The new entry for --filters1=FILTERS ... --filters9=FILTERS was created right after the --filters option.
2023-07-17xz: Update --long-help for the new --filtersX option.Jia Tan1-2/+10
2023-07-17xz: Ignore filter chains that are set but never used in --block-list.Jia Tan1-18/+48
If a filter chain is set but not used in --block-list, it introduced unexpected behavior such as requiring an unneeded amount of memory to compress, reducing the number of threads in multi-threaded encoding, and printing an incorrect amount of memory needed to decompress. This also renames filters_init_mask => filters_used_mask. A filter is assumed to be used if it is specified in --filtersX until coder_set_compression_settings() determines which filters are referenced in --block-list.
2023-07-17xz: Set the Block size for mt encoding correctly.Jia Tan1-1/+67
When opt_block_size is not used, the Block size for mt encoder is derived from the minimum of the largest Block specified by --block-list and the recommended Block size on all filter chains calculated by lzma_mt_block_size(). This avoids using unnecessary memory and ensures that all Blocks are large enough for the most memory needy filter chain.
2023-07-17xz: Validate --flush-timeout for all specified filter chains.Jia Tan1-8/+16
2023-07-17xz: Allows --block-list filters to scale down memory usage.Jia Tan1-55/+214
Previously, only the default filter chain could have its memory usage adjusted. The filter chains specified with --filtersX were not checked for memory usage. Now, all used filter chains will be adjusted if necessary.
2023-07-17xz: Do not include block splitting if encoders are disabled.Jia Tan1-9/+20
The block splitting logic and split_block() function are not needed if encoders are disabled. This will help slightly reduce the binary size when built without encoders and allow split_block() to use functions that require encoders being enabled.
2023-07-17xz: Free filters[] in debug mode.Jia Tan1-0/+10
This will only free filter chains created with --filters1-9 since the default filter chain may be set from a static function variable. The complexity to free the default filter chain is not worth the burden on code maintenance.
2023-07-17xz: Add a message if --block-list is used outside of xz compression.Jia Tan1-0/+11
--block-list is only supported with compression in xz format. This avoids silently ignoring when --block-list is unused.
2023-07-17xz: Create command line options for filters[1-9].Jia Tan3-60/+230
The new command line options are meant to be combined with --block-list. They work as an optional extension to --block-list to specify a custom filter chain for each block listed. The new options allow the creation of up to 9 reusable filter chains. For instance:
    xz --block-list=1:10MiB,3:5MiB,,2:5MiB,1:0 --filters1=delta--lzma2 \
        --filters2=x86--lzma2 --filters3=arm64--lzma2
Will create the following blocks:
1. A block of size 10 MiB with filter chain delta, lzma2.
2. A block of size 5 MiB with filter chain arm64, lzma2.
3. A block of size 5 MiB with filter chain arm64, lzma2.
4. A block of size 5 MiB with filter chain x86, lzma2.
5. A block containing the rest of the file contents with filter chain delta, lzma2.
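To make the --block-list example above concrete, here is a minimal parsing sketch. It is not xz's parser: `parse_size` and `parse_block_list` are hypothetical names, only the KiB/MiB/GiB suffixes are handled, and using chain number 0 for "default filter chain" is a convention invented for this sketch. It assumes, as the example implies, that an empty entry repeats the previous block and that size 0 means "the rest of the file".

```python
UNITS = {"KiB": 1 << 10, "MiB": 1 << 20, "GiB": 1 << 30}


def parse_size(text):
    """Parse a size such as '10MiB' or '0' into bytes (sketch only)."""
    for suffix, factor in UNITS.items():
        if text.endswith(suffix):
            return int(text[:-len(suffix)]) * factor
    return int(text)


def parse_block_list(spec):
    """Return a list of (filter_chain_number, size_in_bytes) tuples.

    Chain number 0 stands for the default filter chain in this sketch.
    Size 0 stands for 'the rest of the file'.
    """
    blocks = []
    for entry in spec.split(","):
        if entry == "":
            # An empty entry reuses the size and chain of the previous block.
            if not blocks:
                raise ValueError("empty entry cannot come first")
            blocks.append(blocks[-1])
            continue
        chain, sep, size = entry.partition(":")
        if sep:
            blocks.append((int(chain), parse_size(size)))
        else:
            blocks.append((0, parse_size(entry)))
    return blocks


if __name__ == "__main__":
    for chain, size in parse_block_list("1:10MiB,3:5MiB,,2:5MiB,1:0"):
        print(chain, size)
```

With the specification from the commit message, the sketch yields five entries using chains 1, 3, 3, 2, 1, matching the five blocks listed there.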
2023-07-17xz: Use lzma_filters_free() in forget_filter_chain().Jia Tan1-8/+10
This is a little cleaner than the previous implementation of forget_filter_chain(). It is also more consistent since lzma_str_to_filters() will always terminate the filter chain so there is no need to terminate it later in coder_set_compression_settings().
2023-07-17xz: Separate string to filter conversion into a helper function.Jia Tan1-13/+20
Converting from string to filter will also need to be done for block specific filter chains.
2023-07-17xz: Update --long-help and man page for new --filters option.Jia Tan2-5/+42
2023-07-17xz: Add --filters option to CLI.Jia Tan3-4/+58
The --filters option uses the new lzma_str_to_filters() function to convert a string into a full filter chain. Using this option will reset all previous filters set by --preset, --[filter], or --filters.
2023-03-18Change a few HTTP URLs to HTTPS.Lasse Collin1-1/+1
The xz man page timestamp was intentionally left unchanged.
2023-03-11xz: Simplify the error-label in Capsicum sandbox code.Lasse Collin1-15/+12
Also remove unneeded "sandbox_allowed = false;" as this code will never be run more than once (making it work with multiple input files isn't trivial).
2023-03-08xz: Make Capsicum sandbox more strict with stdin and stdout.Lasse Collin1-0/+8
2023-03-08Revert: "Add warning if Capsicum sandbox system calls are unsupported."Jia Tan1-6/+4
The warning causes the exit status to be 2, so this will cause problems for many scripted use cases for xz. The sandbox usage is already very limited, so silently disabling this allows it to be more usable.
2023-03-07xz: Fix -Wunused-label in io_sandbox_enter().Jia Tan1-2/+2
Thanks to Xin Li for recommending the fix.
2023-03-06xz: Add warning if Capsicum sandbox system calls are unsupported.Jia Tan1-0/+2
The warning is only used when errno == ENOSYS. Otherwise, xz still issues a fatal error.
2023-03-06xz: Skip Capsicum sandbox system calls when they are unsupported.Jia Tan1-5/+17
If a system has the Capsicum header files but does not actually implement the system calls, then this would render xz unusable. Instead, we can check if errno == ENOSYS and not issue a fatal error.
2023-03-06xz: Reorder cap_enter() to beginning of capsicum sandbox code.Jia Tan1-3/+3
cap_enter() puts the process into the sandbox. If later calls to cap_rights_limit() fail, then the process can still have some extra protections.
2023-02-07xz: Improve the comment about start_time in mytime.c.Lasse Collin1-5/+10
start_time is relative to an arbitrary point in time, it's not time of day, so using it for anything else than time differences wouldn't make sense.
2023-02-04xz: Add a comment clarifying the use of start_time in mytime.c.Jia Tan1-0/+5
2023-01-27xz: Use clock_gettime() even if CLOCK_MONOTONIC isn't available.Lasse Collin2-5/+9
mythread.h and thus liblzma already does it.
2023-01-27xz: Add SIGTSTP handler for progress indicator time keeping.Lasse Collin4-2/+89
This way, if xz is stopped the elapsed time and estimated time remaining won't get confused by the amount of time spent in the stopped state. This raises SIGSTOP. It's not clear to me if this is the correct way. POSIX and glibc docs say that SIGTSTP shouldn't stop the process if it is orphaned but this commit doesn't attempt to handle that. Search for SIGTSTP in section 2.4.3: https://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html
2023-01-24xz: Flip the return value of suffix_is_set to match the documentation.Lasse Collin3-4/+5
Also edit style to match the existing coding style in the project.
2023-01-21xz: Refactor duplicated check for custom suffix when using --format=rawJia Tan3-18/+23
2023-01-16xz: Add missing comment for coder_set_compression_settings()Jia Tan1-1/+2
2023-01-16xz: Do not set compression settings with raw format in list mode.Jia Tan1-1/+2
Calling coder_set_compression_settings() in list mode with verbose mode on caused the filter chain and memory requirements to print. This was unnecessary since the command results in an error and not consistent with other formats like lzma and alone.
2023-01-12xz: Use ssize_t for the to-be-ignored return value from write(fd, ptr, 1).Lasse Collin1-1/+1
It makes no difference here as the return value fits into an int too and it then gets ignored but this looks better.
2023-01-12xz: Silence warnings from -Wsign-conversion in a 32-bit build.Lasse Collin1-2/+2
2023-01-12Fix warnings from clang -Wdocumentation.Lasse Collin1-2/+2
2023-01-11xz: Fix warning -Wformat-nonliteral on clang in message.c.Jia Tan1-0/+9
clang and gcc differ in how they handle -Wformat-nonliteral. gcc will allow a non-literal format string as long as the function takes its format arguments as a va_list.
2023-01-10xz: Include <strings.h> in suffix.c if needed for strcasecmp().Lasse Collin1-0/+3
SUSv2 and POSIX.1‐2017 declare only a few functions in <strings.h>. Of these, strcasecmp() is used on some platforms in suffix.c. Nothing else in the project needs <strings.h> (at least if building on a modern system). sysdefs.h currently includes <strings.h> if HAVE_STRINGS_H is defined and suffix.c relied on this. Note that dos/config.h doesn't #define HAVE_STRINGS_H even though DJGPP does have strings.h. It isn't needed with DJGPP as strcasecmp() is also in <string.h> in DJGPP.
2022-12-30xz: Includes <time.h> and <sys/time.h> conditionally in mytime.c.Jia Tan1-1/+3
Previously, mytime.c depended on mythread.h for <time.h> to be included.
2022-12-30Build: No longer require HAVE_DECL_CLOCK_MONOTONIC to always be set.Jia Tan1-3/+2
Previously, if threading was enabled HAVE_DECL_CLOCK_MONOTONIC would always be set to 0 or 1. However, this macro was needed in xz so if xz was not built with threading and HAVE_DECL_CLOCK_MONOTONIC was not defined but HAVE_CLOCK_GETTIME was, it caused a warning during build. Now, HAVE_DECL_CLOCK_MONOTONIC has been renamed to HAVE_CLOCK_MONOTONIC and will only be set if it is 1.
2022-12-11xz: Rename --experimental-arm64 to --arm64.Lasse Collin1-1/+1
2022-12-08xz: Make args_info.files_name a const pointer.Lasse Collin2-2/+2
2022-12-08xz: Don't modify argv[].Lasse Collin1-4/+19
The code that parses --memlimit options and --block-list modified the argv[] when parsing the option string from optarg. This was visible in "ps auxf" and such and could be confusing. I didn't understand it back in the day when I wrote that code. Now a copy is allocated when modifiable strings are needed.
2022-12-01xz: Omit the special notes about ARM64 filter on the man page.Lasse Collin1-3/+2
2022-11-30xz: Remove message_filters_to_str function prototype from message.h.Jia Tan1-16/+0
This was forgotten from 7484744af6cbabe81e92af7d9e061dfd597fff7b.
2022-11-28xz: Use lzma_str_from_filters().Lasse Collin2-175/+28
Two uses: Displaying encoder filter chain when compressing with -vv, and displaying the decoder filter chain in --list -vv.
2022-11-26xz: Use lzma_filters_free().Lasse Collin1-6/+2
2022-11-24xz: Allow nice_len 2 and 3 even if match finder requires 3 or 4.Lasse Collin1-5/+0
Now that liblzma accepts these, we avoid the extra check and there's one message less for translators too.
2022-11-19xz: Refactor duplicate code from hardware_memlimit_mtenc_get().Lasse Collin1-1/+1
2022-11-19xz: Add support --threads=+N so that -T+1 gives threaded mode.Lasse Collin4-6/+51
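A rough sketch of how a `+N` thread argument could be interpreted. The commit only states that `-T+1` gives threaded mode; the rest is an assumption of this sketch: `parse_threads` is a hypothetical name, a plain `1` is taken to mean single-threaded mode, and `0` is taken to mean "choose automatically, in threaded mode".

```python
def parse_threads(arg):
    """Return (thread_count, multithreaded) for a --threads argument.

    Sketch only: a leading '+' forces multithreaded mode even for one
    thread; a count of 0 means 'pick automatically'.
    """
    force_mt = arg.startswith("+")
    count = int(arg[1:] if force_mt else arg)
    if count < 0:
        raise ValueError("thread count must not be negative")
    # Assumption: only a plain '1' selects the single-threaded mode.
    multithreaded = force_mt or count != 1
    return count, multithreaded


if __name__ == "__main__":
    for text in ("1", "+1", "4", "0"):
        print(text, parse_threads(text))
```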
2022-11-14Replace the experimental ARM64 filter with a new experimental version.Lasse Collin4-50/+11
This is incompatible with the previous version. This has space/tab fixes in filter_*.c and bcj.h too.
2022-11-09xz: Update the man page about BCJ filters, including upcoming --arm64.Lasse Collin1-37/+29
The --arm64 isn't actually implemented yet in the form described in this commit. Thanks to Jia Tan.
2022-11-09xz: Add --arm64 to --long-help and omit endianness from ARM(-Thumb).Lasse Collin1-2/+3
Modern 32-bit ARM in big endian mode use little endian for instruction encoding still, so the filters work on such executables too. It's likely less confusing for users this way. The --arm64 option hasn't been implemented yet (there is --experimental-arm64 but it's different). The --arm64 option is added now anyway because this is the likely result and the strings need to be ready for translators. Thanks to Jia Tan.
2022-11-09xz: Remove the commented-out FORMAT_GZIP, gzip, .gz, and .tgz.Lasse Collin3-12/+0
2022-11-09xz: Add .lz (lzip) decompression support.Lasse Collin6-13/+141
If configured with --disable-lzip-decoder then --long-help will still list `lzip' in --format but I left it like that since due to translations it would be messy to have two help strings. Features are disabled only in special situations so wrong help in such a situation shouldn't matter much. Thanks to Michał Górny for the original patch.
2022-11-09xz: Add comments about stdin and src_st.st_size.Lasse Collin2-0/+13
"xz -v < regular_file > out.xz" doesn't display the percentage and estimated remaining time because it doesn't even try to check the input file size when input is read from stdin. This could be improved but for now there's just a comment to remind about it.
2022-11-09xz: Fix displaying of file sizes in progress indicator in passthru mode.Lasse Collin1-1/+5
It worked for one input file since the counters are zero when xz starts but they weren't reset when starting a new file in passthru mode. For example, if files A, B, and C are one byte each, then "xz -dcvf A B C" would show file sizes as 1, 2, and 3 bytes instead of 1, 1, and 1 byte.
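The A, B, C example above can be reproduced with a toy model of the counters. `shown_sizes` is a hypothetical function; it only models a running counter that either is or is not reset at the start of each file.

```python
def shown_sizes(file_sizes, reset_per_file):
    """Return the size that would be displayed after each file.

    reset_per_file=False models the old behavior (counter never reset),
    reset_per_file=True models the fixed behavior.
    """
    shown = []
    counter = 0
    for size in file_sizes:
        if reset_per_file:
            counter = 0
        counter += size
        shown.append(counter)
    return shown


if __name__ == "__main__":
    print(shown_sizes([1, 1, 1], reset_per_file=False))
    print(shown_sizes([1, 1, 1], reset_per_file=True))
```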
2022-11-09xz: Add a comment why --to-stdout is not in --help.Lasse Collin1-0/+3
It is on the man page still.
2022-11-08xz: Make xz -lvv show that the upcoming --arm64 needs 5.4.0 to decompress.Lasse Collin1-5/+15
2022-11-08xz: Initialize the pledge(2) sandbox at the very beginning of main().Lasse Collin1-13/+14
It feels better that the initializations are sandboxed too. They don't do anything that the pledge() call wouldn't allow.
2022-11-07xz: Extend --robot --info-memory output.Lasse Collin2-15/+56
Now it includes everything that the human-readable --info-memory shows.
2022-11-07xz: Avoid a compiler warning in progress_speed() in message.c.Jia Tan1-6/+3
This should be smaller too since it avoids the string constants.
2022-10-25xz: Fix --single-stream with an empty .xz Stream.Lasse Collin1-0/+9
Example:
    $ xz -dc --single-stream good-0-empty.xz
    xz: good-0-empty.xz: Internal error (bug)
The code that tries to catch some input file issues early didn't anticipate LZMA_STREAM_END, which is possible in that code only when --single-stream is used.
2022-10-25xz: Add support for OpenBSD's pledge() sandbox.Lasse Collin3-1/+25
2022-10-25xz: Fix decompressor behavior if input uses an unsupported check type.Lasse Collin1-4/+15
Now files with unsupported check will make xz display a warning, set the exit status to 2 (unless --no-warn is used), and then decompress the file normally. This is how it was supposed to work since the beginning but this was broken by the commit 231c3c7098f1099a56abb8afece76fc9b8699f05, that is, a little before 5.0.0 was released. The buggy behavior displayed a message, set exit status 1 (error), and xz didn't attempt to decompress the file. This doesn't matter today except for special builds that disable CRC64 or SHA-256 at build time (but such builds should be used in special situations only). The bug matters if new check type is added in the future and an old xz version is used to decompress such a file; however, it's likely that such files would use a new filter too and an old xz wouldn't be able to decompress the file anyway. The first hunk in the commit is the actual fix. The second hunk is a cleanup since LZMA_TELL_ANY_CHECK isn't used in xz. There is a test file for unsupported check type but it wasn't used by test_files.sh, perhaps due to different behavior between xz and the simpler xzdec.
2022-10-25xz: Clarify the man page: input file isn't removed if an error occurs.Lasse Collin1-2/+3
2022-10-25xz: Refactor to remove is_empty_filename().Lasse Collin3-17/+3
Long ago it was used in list.c too but nowadays it's needed only in io_open_src() so it's nicer to avoid a separate function.
2022-10-25xz: If input file cannot be removed, treat it as a warning, not error.Lasse Collin1-2/+2
Treating it as a warning (message + exit status 2) matches gzip and it seems more logical as at that point the output file has already been successfully closed. When it's a warning it is possible to suppress it with --no-warn.
2022-09-19xz: Add --experimental-arm64[=width=WIDTH].Lasse Collin4-0/+60
It will be renamed to --arm64 once it is stable. Man page or --long-help weren't updated yet.
2022-08-22xz: Try to clarify --memlimit-mt-decompress vs. --memlimit-compress.Lasse Collin1-12/+19
2022-08-19xz: Revise --info-memory output.Lasse Collin2-6/+27
The strings could be more descriptive but it's good to have some version of this committed now. --robot mode wasn't changed yet.
2022-08-19xz: Update the man page for threaded decompression and memlimits.Lasse Collin1-27/+121
This documents the changes made in commits 6c6da57ae2aa962aabde6892442227063d87e88c, cad299008cf73ec566f0662a9cf2b94f86a99659, and 898faa97287a756231c663a3ed5165672b417207. The --info-memory bit hasn't been finished yet even though it's already mentioned in this commit under --memlimit-mt-decompress and --threads.
2022-07-24xz: Update the man page that change to --keep will be in 5.2.6.Lasse Collin1-2/+2
2022-07-12xz: Document the special memlimit case of 2000 MiB on MIPS32.Lasse Collin1-2/+6
See commit fc3d3a7296ef58bb799a73943636b8bfd95339f7.
2022-04-14xz: Fix build with --disable-threads.Lasse Collin1-0/+4
2022-04-14xz: Change the cap of the default -T0 memlimit for 32-bit xz.Lasse Collin1-1/+3
The SIZE_MAX / 3 was 1365 MiB. 1400 MiB gives little more room and it looks like a round (artificial) number in --info-memory once --info-memory is made to display it. Also, using #if avoids useless code on 64-bit builds.
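The 1365 MiB figure follows from plain arithmetic for a 32-bit SIZE_MAX. The variable names below are made up for this worked example.

```python
MIB = 1 << 20
SIZE_MAX_32 = 2**32 - 1          # SIZE_MAX when size_t is 32 bits wide

old_cap = SIZE_MAX_32 // 3       # the previous cap, in bytes
new_cap = 1400 * MIB             # the new cap chosen in this commit

if __name__ == "__main__":
    print(old_cap // MIB)        # 1365
    print(new_cap > old_cap)
```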
2022-04-14xz: Add a default soft memory usage limit for --threads=0.Lasse Collin3-11/+82
This is a soft limit in the sense that it only affects the number of threads. It never makes xz fail and it never makes xz change settings that would affect the compressed output. The idea is to make -T0 have more reasonable behavior when the system has very many cores or when memory-hungry compression options are used. This also helps with 32-bit xz, preventing it from running out of address space. The downside of this commit is that now the number of threads might become too low compared to what the user expected. I hope this to be an acceptable compromise as the old behavior has been a source of well-argued complaints for a long time.
2022-04-14xz: Make -T0 use multithreaded mode on single-core systems.Lasse Collin3-9/+27
The main problem with the old behavior is that the compressed output is different on single-core systems vs. multicore systems. This commit fixes it by making -T0 use one thread in multithreaded mode on single-core systems. The downside of this is that it uses more memory. However, if --memlimit-compress is used, xz can (thanks to the previous commit) drop to the single-threaded mode still.
2022-04-14xz: Changes to --memlimit-compress and --no-adjust.Lasse Collin1-20/+43
In single-threaded mode, --memlimit-compress can make xz scale down the LZMA2 dictionary size to meet the memory usage limit. This obviously affects the compressed output. However, if xz was in threaded mode, --memlimit-compress could make xz reduce the number of threads but it wouldn't make xz switch from multithreaded mode to single-threaded mode or scale down the LZMA2 dictionary size. This seemed illogical and there was even a "FIXME?" about it. Now --memlimit-compress can make xz switch to single-threaded mode if one thread in multithreaded mode uses too much memory. If memory usage is still too high, then the LZMA2 dictionary size can be scaled down too. The option --no-adjust was also changed so that it no longer prevents xz from scaling down the number of threads as that doesn't affect compressed output (only performance). After this commit --no-adjust only prevents adjustments that affect compressed output, that is, with --no-adjust xz won't switch from multithreaded mode to single-threaded mode and won't scale down the LZMA2 dictionary size. The man page wasn't updated yet.
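The order of adjustments described above can be sketched as a three-step fallback. This is an illustration, not xz's code: `fit_to_memlimit` and `toy_usage` are hypothetical, the memory model is invented, and halving the dictionary is just one simple way to "scale down".

```python
MIB = 1 << 20


def toy_usage(multithreaded, threads, dict_size):
    """Invented memory model, for illustration only."""
    return threads * 5 * dict_size if multithreaded else 2 * dict_size


def fit_to_memlimit(threads, dict_size, memlimit, mem_usage,
                    no_adjust=False, min_dict=4096):
    """Return (multithreaded, threads, dict_size) that fits memlimit.

    Raises MemoryError if no allowed adjustment fits.
    """
    # Step 1: reduce the thread count. This does not affect the compressed
    # output, so it is allowed even with --no-adjust.
    while threads > 1 and mem_usage(True, threads, dict_size) > memlimit:
        threads -= 1
    if mem_usage(True, threads, dict_size) <= memlimit:
        return True, threads, dict_size
    if no_adjust:
        raise MemoryError("limit too low and --no-adjust was given")
    # Step 2: switch to single-threaded mode (changes the output).
    if mem_usage(False, 1, dict_size) <= memlimit:
        return False, 1, dict_size
    # Step 3: scale down the LZMA2 dictionary (changes the output).
    while dict_size > min_dict:
        dict_size //= 2
        if mem_usage(False, 1, dict_size) <= memlimit:
            return False, 1, dict_size
    raise MemoryError("cannot meet the memory usage limit")


if __name__ == "__main__":
    for limit in (1000 * MIB, 200 * MIB, 100 * MIB):
        print(limit // MIB, fit_to_memlimit(8, 64 * MIB, limit, toy_usage))
```

With the toy model, a generous limit only lowers the thread count, a tighter one falls back to single-threaded mode, and the tightest one also shrinks the dictionary.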
2022-04-12xz: Add --memlimit-mt-decompress along with a default limit value.Lasse Collin5-42/+97
--memlimit-mt-decompress allows specifying the limit for multithreaded decompression. This matches memlimit_threading in liblzma. This limit can only affect the number of threads being used; it will never prevent xz from decompressing a file. The old --memlimit-decompress option is still used at the same time. If the value of --memlimit-decompress (the default value or one specified by the user) is less than the value of --memlimit-mt-decompress , then --memlimit-mt-decompress is reduced to match --memlimit-decompress. Man page wasn't updated yet.
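The relationship between the two limits amounts to a clamp. A one-function sketch, with a hypothetical name and limits given in bytes:

```python
def effective_mt_decompress_limit(memlimit_decompress, memlimit_mt_decompress):
    """Clamp the multithreaded limit so it never exceeds the hard limit."""
    return min(memlimit_decompress, memlimit_mt_decompress)


if __name__ == "__main__":
    MIB = 1 << 20
    print(effective_mt_decompress_limit(512 * MIB, 1024 * MIB) // MIB)
```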
2022-03-07xz: Add initial support for threaded decompression.Lasse Collin1-1/+35
If threading support is enabled at build time, this will use lzma_stream_decoder_mt() even for single-threaded mode. With memlimit_threading=0 the behavior should be identical. This needs some work like adding --memlimit-threading=LIMIT. The original patch from Sebastian Andrzej Siewior included a method to get currently available RAM on Linux. It might be one way to go but as it is Linux-only, the available-RAM approach needs work for portability or using a fallback method on other OSes. The man page wasn't updated yet.
2021-10-27xz: Change the coding style of the previous commit.Lasse Collin1-5/+6
It isn't any better now but it's consistent with the rest of the code base.
2021-10-27xz: Avoid fchown(2) failure.Alexander Bluhm1-1/+7
OpenBSD does not allow changing the group of a file if the user does not belong to this group. In contrast to Linux, OpenBSD also fails if the new group is the same as the old one. Do not call fchown(2) in this case; it would change nothing anyway. This fixes an issue with the Perl Alien::Build module. https://github.com/PerlAlien/Alien-Build/issues/62
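A minimal sketch of the idea, assuming a hypothetical helper `copy_group_if_needed()` (the name and signature are made up for this example and are not from xz): compare the destination's current group with the wanted one and skip fchown(2) when they already match.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: give dest_fd the group src_gid.
// Returns 0 on success (including the no-op case) and -1 on error.
static int
copy_group_if_needed(int dest_fd, gid_t src_gid)
{
	// The caller must pass an open file descriptor.
	assert(dest_fd >= 0);

	struct stat dest_st;
	if (fstat(dest_fd, &dest_st) != 0)
		return -1;

	// If the group is already correct, don't call fchown() at all.
	// On some systems the call fails even though it would change nothing.
	if (dest_st.st_gid == src_gid)
		return 0;

	// (uid_t)-1 leaves the owner unchanged.
	return fchown(dest_fd, (uid_t)-1, src_gid);
}
```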
2021-04-11Reduce maximum possible memory limit on MIPS32Ivan A. Melnikov1-0/+6
Due to architectural limitations, address space available to a single userspace process on MIPS32 is limited to 2 GiB, not 4, even on systems that have more physical RAM -- e.g. 64-bit systems with 32-bit userspace, or systems that use XPA (an extension similar to x86's PAE). So, for MIPS32, we have to impose stronger memory limits. I've chosen 2000MiB to give the process some headroom.
2021-01-11xz: Make --keep accept symlinks, hardlinks, and setuid/setgid/sticky.Lasse Collin2-5/+20
Previously this required using --force but that has other effects too which might be undesirable. Changing the behavior of --keep has a small risk of breaking existing scripts but since this is a fairly special corner case I expect the likelihood of breakage to be low enough. I think the new behavior is more logical. The only reason for the old behavior was to be consistent with gzip and bzip2. Thanks to Vincent Lefevre and Sebastian Andrzej Siewior.
2020-11-01xz: Avoid unneeded \f escapes on the man page.Lasse Collin1-9/+22
I don't want to use \c in macro arguments but groff_man(7) suggests that \f has better portability. \f would be needed for the .TP strings for portability reasons anyway. Thanks to Bjarni Ingi Gislason.
2020-11-01xz: Use non-breaking spaces when intentionally using more than one space.Lasse Collin1-1/+1
This silences some style checker warnings. Seems that spaces in the beginning of a line don't need this treatment. Thanks to Bjarni Ingi Gislason.
2020-11-01xz: Protect the ellipsis (...) on the man page with \&.Lasse Collin1-2/+2
This does it only when ... appears outside macro calls. Thanks to Bjarni Ingi Gislason.
2020-11-01xz: Avoid the abbreviation "e.g." on the man page.Lasse Collin1-33/+33
A few are simply omitted, most are converted to "for example" and surrounded with commas. This seems to be better style; for example, man-pages(7) recommends avoiding such abbreviations except in parentheses. Thanks to Bjarni Ingi Gislason.
2020-07-12xz man page: Change \- (minus) to \(en (en-dash) for a numeric range.Lasse Collin1-8/+8
Docs of ancient troff/nroff mention \(em (em-dash) but not \(en and \- was used for both minus and en-dash. I don't know how portable \(en is nowadays but it can be changed back if someone complains. At least GNU groff and OpenBSD's mandoc support it. Thanks to Bjarni Ingi Gislason for the patch.
2020-04-06src/xz/xz.1: Correct misused two-fonts macrosBjarni Ingi Gislason1-5/+5
Output is from: test-groff -b -e -mandoc -T utf8 -rF0 -t -w w -z
[ "test-groff" is a developmental version of "groff" ]
Input file is ./src/xz/xz.1
<src/xz/xz.1>:408 (macro BR): only 1 argument, but more are expected
<src/xz/xz.1>:1009 (macro BR): only 1 argument, but more are expected
<src/xz/xz.1>:1743 (macro BR): only 1 argument, but more are expected
<src/xz/xz.1>:1920 (macro BR): only 1 argument, but more are expected
<src/xz/xz.1>:2213 (macro BR): only 1 argument, but more are expected
Output from nroff and troff is unchanged, except for a font change of a full stop (.).
Signed-off-by: Bjarni Ingi Gislason <bjarniig@rhi.hi.is>
2020-03-23Typo fixes from fossies.org.Lasse Collin1-2/+2
https://fossies.org/linux/misc/xz-5.2.5.tar.xz/codespell.html
2020-03-11xz: Never use thousand separators in DJGPP builds.Lasse Collin1-2/+12
DJGPP 2.05 added support for thousands separators but it's broken at least under WinXP with Finnish locale that uses a non-breaking space as the thousands separator. Workaround by disabling thousands separators for DJGPP builds.
2020-02-21xz: Silence a warning when sig_atomic_t is long int.Lasse Collin1-1/+1
It can be true at least on z/OS.
2020-02-21xz: Avoid unneeded access of a volatile variable.Lasse Collin1-1/+1
2020-02-07Build: Add support for translated man pages using po4a.Lasse Collin1-14/+36
The dependency on po4a is optional. It's never required to install the translated man pages when xz is built from a release tarball. If po4a is missing when building from xz.git, the translated man pages won't be generated but otherwise the build will work normally. The translations are only updated automatically by autogen.sh and by "make mydist". This makes it easy to keep po4a as an optional dependency and ensures that I won't forget to put updated translations to a release tarball. The translated man pages aren't installed if --disable-nls is used. The installation of translated man pages abuses Automake internals by calling "install-man" with redefined dist_man_MANS and man_MANS. This makes the hairy script code slightly less hairy. If it breaks some day, this code needs to be fixed; don't blame Automake developers. Also, this adds more quotes to the existing shell script code in the Makefile.am "-hook"s.
2020-02-05xz: Make it a fatal error if enabling the sandbox fails.Lasse Collin1-1/+1
Perhaps it's too drastic but on the other hand it will let me learn about possible problems if people report the errors. This won't be backported to the v5.2 branch.
2020-02-05xz: Comment out annoying sandboxing messages.Lasse Collin1-3/+7
2020-02-01xz: Limit --memlimit-compress to at most 4020 MiB for 32-bit xz.Lasse Collin2-2/+51
See the code comment for reasoning. It's far from perfect but hopefully good enough for certain cases while hopefully doing nothing bad in other situations. At presets -5 ... -9, 4020 MiB vs. 4096 MiB makes no difference on how xz scales down the number of threads. The limit has to be a few MiB below 4096 MiB because otherwise things like "xz --lzma2=dict=500MiB" won't scale down the dict size enough and xz cannot allocate enough memory. With "ulimit -v $((4096 * 1024))" on x86-64, the limit in xz had to be no more than 4085 MiB. Some safety margin is good though. This is a hack but it should be useful when running 32-bit xz on a 64-bit kernel that gives full 4 GiB address space to xz. Hopefully this is enough to solve this: https://bugzilla.redhat.com/show_bug.cgi?id=1196786 FreeBSD has a patch that limits the result in tuklib_physmem() to SIZE_MAX on 32-bit systems. While I think it's not the way to do it, the results on --memlimit-compress have been good. This commit should achieve practically identical results for compression while leaving decompression and tuklib_physmem() and thus lzma_physmem() unaffected.
2020-01-26xz: Set the --flush-timeout deadline when the first input byte arrives.Lasse Collin3-7/+6
xz --flush-timeout=2000, old version:
  1. xz is started. The next flush will happen after two seconds.
  2. No input for one second.
  3. A burst of a few kilobytes of input.
  4. No input for one second.
  5. Two seconds have passed and flushing starts.
The first second counted towards the flush-timeout even though there was no pending data. This can cause flushing to occur more often than needed.
xz --flush-timeout=2000, after this commit:
  1. xz is started.
  2. No input for one second.
  3. A burst of a few kilobytes of input. The next flush will happen after two seconds counted from the time when the first bytes of the burst were read.
  4. No input for one second.
  5. No input for another second.
  6. Two seconds have passed and flushing starts.
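The new timeline can be modeled with a small timer that is armed only when the first unflushed input arrives. This is a sketch under stated assumptions: `flush_timer` and its functions are hypothetical names, and time is passed in as plain millisecond counters instead of being read from a clock, so it is not the actual code in xz.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical model of the --flush-timeout deadline handling.
typedef struct {
	uint64_t timeout_ms;   // value given to --flush-timeout
	uint64_t deadline_ms;  // absolute time of the next flush (if armed)
	bool armed;            // true when there is unflushed input pending
} flush_timer;

static void
flush_timer_init(flush_timer *t, uint64_t timeout_ms)
{
	assert(timeout_ms > 0);
	t->timeout_ms = timeout_ms;
	t->deadline_ms = 0;
	t->armed = false;
}

// Call whenever input bytes were read at time now_ms. Only the first
// input after a flush sets the deadline; later input doesn't move it.
static void
flush_timer_on_input(flush_timer *t, uint64_t now_ms)
{
	if (!t->armed) {
		t->armed = true;
		t->deadline_ms = now_ms + t->timeout_ms;
	}
}

// Returns true if flushing should start at time now_ms.
// With no pending input the timer never expires.
static bool
flush_timer_expired(const flush_timer *t, uint64_t now_ms)
{
	return t->armed && now_ms >= t->deadline_ms;
}

// Call after a flush has been done.
static void
flush_timer_on_flush(flush_timer *t)
{
	t->armed = false;
}
```

Because an unarmed timer never expires, the same structure also avoids waking up to flush when no input has been seen since the previous flush.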
2020-01-26xz: Move flush_needed from mytime.h to file_pair struct in file_io.h.Lasse Collin5-9/+7
2020-01-26xz: coder.c: Make writing output a separate function.Lasse Collin1-13/+17
The same code sequence repeats so it's nicer as a separate function. Note that in one case there was no test for opt_mode != MODE_TEST, but that was only because that condition would always be true, so this commit doesn't change the behavior there.
2020-01-26xz: Fix semi-busy-waiting in xz --flush-timeout.Lasse Collin3-4/+19
When input blocked, xz --flush-timeout=1 would wake up every millisecond and initiate flushing which would have nothing to flush and thus would just waste CPU time. The fix disables the timeout when no input has been seen since the previous flush.
2020-01-26xz: Refactor io_read() a bit.Lasse Collin1-9/+8
2020-01-26xz: Update a comment in file_io.h.Lasse Collin1-1/+4
2020-01-26xz: Move the setting of flush_needed in file_io.c to a nicer location.Lasse Collin1-4/+2
2019-06-28xz: Automatically align the strings in --info-memory.Lasse Collin1-11/+34
This makes it easier to translate the strings. Also, the string for amount of RAM was shortened.
2019-06-24Add LZMA_RET_INTERNAL1..8 to lzma_ret and use one for LZMA_TIMED_OUT.Lasse Collin1-0/+8
LZMA_TIMED_OUT is *internally* used as a value for lzma_ret enumeration. Previously it was #defined to 32 and cast to lzma_ret. That way it wasn't visible in the public API, but this was hackish. Now the public API has eight LZMA_RET_INTERNALx members and LZMA_TIMED_OUT is #defined to LZMA_RET_INTERNAL1. This way the code is cleaner overall although the public API has a few extra mysterious enum members.
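The pattern can be illustrated with a stand-alone enum. Everything below uses made-up `DEMO_` names and values rather than the real `lzma_ret` definition, and the mapping of the timeout to an OK result in `demo_public_result()` is an assumption for the sake of the example.

```c
#include <assert.h>

// Hypothetical public return-code enum with reserved internal members.
typedef enum {
	DEMO_OK = 0,
	DEMO_STREAM_END = 1,
	DEMO_MEM_ERROR = 5,

	// Reserved members: visible in the public header but only
	// meaningful inside the library.
	DEMO_RET_INTERNAL1 = 101,
	DEMO_RET_INTERNAL2 = 102
} demo_ret;

// Old approach (hackish): an out-of-range integer cast to the enum.
//     #define DEMO_TIMED_OUT ((demo_ret)32)
// New approach: alias a real, reserved enum member.
#define DEMO_TIMED_OUT DEMO_RET_INTERNAL1

// Convert an internal result to what the application may see.
static demo_ret
demo_public_result(demo_ret r)
{
	// Assumption: a timeout just means "call again later".
	if (r == DEMO_TIMED_OUT)
		r = DEMO_OK;

	// Internal codes must never escape to the caller.
	assert(r < DEMO_RET_INTERNAL1);
	return r;
}
```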
2019-06-24xz: Silence a warning from clang -Wsign-conversion in main.c.Lasse Collin1-1/+1
2019-06-24xz: Make "headings" static in list.c.Lasse Collin1-1/+1
Caught by clang -Wmissing-variable-declarations.
2019-06-24xz: Fix an integer overflow with 32-bit off_t.Lasse Collin1-2/+9
This applies to any off_t that isn't as big as the signed 64-bit integer that most systems have. A small off_t could overflow if the file being decompressed had a long enough run of zero bytes, which would result in corrupt output.
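One way to guard such an accumulator is to check for overflow before adding and to flush the pending amount first when the sum wouldn't fit. The sketch below is not the actual fix in xz: `small_off_t` stands in for a 32-bit off_t, and `sparse_state`, `sparse_add()`, and `sparse_flush()` are hypothetical names, with a counter standing in for the real seek.

```c
#include <assert.h>
#include <stdint.h>

// Stand-in for a 32-bit off_t so the overflow is easy to reach.
typedef int32_t small_off_t;
#define SMALL_OFF_MAX INT32_MAX

// Hypothetical state for a run of zero bytes that will be skipped
// with a seek instead of being written out.
typedef struct {
	small_off_t pending_sparse;  // zeros seen but not yet seeked over
	uint64_t seeks_done;         // stand-in for the number of seek calls
	uint64_t total_skipped;      // total bytes skipped so far
} sparse_state;

// Pretend to seek forward over the pending zeros.
static void
sparse_flush(sparse_state *s)
{
	if (s->pending_sparse > 0) {
		s->total_skipped += (uint64_t)s->pending_sparse;
		s->pending_sparse = 0;
		++s->seeks_done;
	}
}

// Record zero_run more zero bytes without overflowing pending_sparse.
static void
sparse_add(sparse_state *s, uint32_t zero_run)
{
	assert(zero_run <= (uint32_t)SMALL_OFF_MAX);

	// Check before adding: signed overflow is undefined behavior.
	if (s->pending_sparse > SMALL_OFF_MAX - (small_off_t)zero_run)
		sparse_flush(s);

	s->pending_sparse += (small_off_t)zero_run;
}
```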
2019-06-24xz: Cleanup io_seek_src() a bit.Lasse Collin1-3/+1
lseek() returns -1 on error and checking for -1 is nicer.
2019-06-24xz: Change io_seek_src and io_pread arguments from off_t to uint64_t.Lasse Collin3-11/+18
This helps fixing warnings from -Wsign-conversion and makes the code look better too.
2019-06-24xz: list.c: Fix some warnings from -Wsign-conversion.Lasse Collin1-3/+4
2019-06-23xz: Fix some of the warnings from -Wsign-conversion.Lasse Collin7-13/+14
2019-05-11xz: Update xz man page date.Lasse Collin1-1/+1
2019-05-11spellingAntoine Cœur7-9/+9
2019-05-01xz: In xz -lvv look at the widths of the check names too.Lasse Collin1-6/+26
Now the widths of the check names are used to adjust the width of the Check column. This way there is no longer a need to restrict the widths of the check names to be at most ten terminal-columns.
2019-05-01xz: Fix xz -lvv column alignment to look at the translated strings.Lasse Collin1-2/+2
2019-03-04xz: Automatically align column headings in xz -lvv.Lasse Collin1-51/+212
2019-03-04xz: Automatically align strings ending in a colon in --list output.Lasse Collin1-12/+102
This should avoid alignment errors in translations with these strings.
2018-12-20xz: Fix a crash in progress indicator when in passthru mode.Lasse Collin3-7/+25
"xz -dcfv not_an_xz_file" crashed (all four options are required to trigger it). It caused xz to call lzma_get_progress(&strm, ...) when no coder was initialized in strm. In this situation strm.internal is NULL which leads to a crash in lzma_get_progress(). The bug was introduced when xz started using lzma_get_progress() to get progress info for multi-threaded compression, so the bug is present in versions 5.1.3alpha and higher. Thanks to Filip Palian <Filip.Palian@pjwstk.edu.pl> for the bug report.
2018-11-22xz: Update man page timestamp.Lasse Collin1-1/+1
2018-11-22'have have' typosPavel Raiskup2-2/+2
2017-08-14Fix or hide warnings from GCC 7's -Wimplicit-fallthrough.Lasse Collin1-0/+2
2017-05-23xz: Fix "xz --list --robot missing_or_bad_file.xz".Lasse Collin1-2/+6
It ended up printing an uninitialized char-array when trying to print the check names (column 7) on the "totals" line. This also changes the column 12 (minimum xz version) to 50000002 (xz 5.0.0) instead of 0 when there are no valid input files. Thanks to kidmin for the bug report.
2017-04-24xz: Use lzma_file_info_decoder() for --list.Lasse Collin1-210/+44
2017-04-21liblzma: Rename LZMA_SEEK to LZMA_SEEK_NEEDED and seek_in to seek_pos.Lasse Collin1-1/+1
2017-04-19Update the home page URLs to HTTPS.Lasse Collin1-3/+3
2017-04-05xz: Add io_seek_src().Lasse Collin2-3/+30
2017-03-30xz: Use POSIX_FADV_RANDOM in "xz --list" mode.Lasse Collin1-2/+8
xz --list is random access so POSIX_FADV_SEQUENTIAL was clearly wrong.
2017-03-30liblzma: Add generic support for input seeking (LZMA_SEEK).Lasse Collin1-0/+1
Also mention LZMA_SEEK in xz/message.c to silence a warning.
2016-06-30xz: Fix copying of timestamps on Windows.Lasse Collin1-0/+18
xz used to call utime() on Windows, but its result gets lost on close(). Using _futime() seems to work. Thanks to Martok for reporting the bug: http://www.mail-archive.com/xz-devel@tukaani.org/msg00261.html
2016-06-16xz: Silence warnings from -Wlogical-op.Lasse Collin1-2/+10
Thanks to Evan Nemerson.
2016-04-10Build: Fix = to += for xz_SOURCES in src/xz/Makefile.am.Lasse Collin1-1/+1
Thanks to Christian Kujau.
2015-11-03xz: Make xz buildable even when encoders or decoders are disabled.Lasse Collin5-13/+58
The patch is quite long but it's mostly about adding new #ifdefs to omit code when encoders or decoders have been disabled. This adds two new #defines to config.h: HAVE_ENCODERS and HAVE_DECODERS.
2015-11-02xz: Always close the file before trying to delete it.Lasse Collin1-13/+12
unlink() can return EBUSY in errno for open files on some operating systems and file systems.
2015-05-11xz: Document that threaded decompression hasn't been implemented yet.Lasse Collin1-1/+9
2015-04-20Revert "xz: Use pipe2() if available."Lasse Collin1-8/+1
This reverts commit 7a11c4a8e5e15f13d5fa59233b3172e65428efdd. It is a problem when libc has pipe2() but the kernel is too old to have pipe2() and thus pipe2() fails. In xz it's pointless to have a fallback for non-functioning pipe2(); it's better to avoid pipe2() completely. Thanks to Michael Fox for the bug report.
2015-04-01xz: Fix the Capsicum rights on user_abort_pipe.Lasse Collin1-1/+5
2015-03-31xz: Add support for sandboxing with Capsicum.Lasse Collin5-1/+110
The sandboxing is used conditionally as described in main.c. This isn't optimal but it was much easier to implement than a full sandboxing solution and it still covers the most common use cases where xz is writing to standard output. This should have practically no effect on performance even with small files as fork() isn't needed. C and locale libraries can open files as needed. This has been fine in the past, but it's a problem with things like Capsicum. io_sandbox_enter() tries to ensure that various locale-related files have been loaded before cap_enter() is called, but it's possible that there are other similar problems which haven't been seen yet. Currently Capsicum is available on FreeBSD 10 and later and there is a port to Linux too. Thanks to Loganaden Velvindron for help.
2015-03-07xz: size_t/uint32_t cleanup in options.c.Lasse Collin1-6/+6
2015-03-07xz: Fix a comment and silence a warning in message.c.Lasse Collin1-2/+3
2015-03-07xz: Make arg_count an unsigned int to silence a warning.Lasse Collin2-2/+2
Actually the value of arg_count cannot exceed INT_MAX but it's nicer as an unsigned int.
2015-02-22xz: Use pipe2() if available.Lasse Collin1-1/+8
2015-02-21xz: Fix the fcntl() usage when creating a pipe for the self-pipe trick.Lasse Collin1-5/+11
Now it reads the old flags instead of blindly setting O_NONBLOCK. The old code may have worked correctly, but this is better.
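Reading the old flags first and OR-ing in O_NONBLOCK looks roughly like this. `set_nonblocking()` is a hypothetical helper written for this example, not the function used in xz.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: enable O_NONBLOCK while preserving the other
// file status flags. Returns 0 on success and -1 on error.
static int
set_nonblocking(int fd)
{
	assert(fd >= 0);

	// Read the current file status flags instead of overwriting them.
	const int flags = fcntl(fd, F_GETFL);
	if (flags == -1)
		return -1;

	// Nothing to do if the flag is already set.
	if (flags & O_NONBLOCK)
		return 0;

	if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
		return -1;

	return 0;
}
```

Setting the flags blindly with `fcntl(fd, F_SETFL, O_NONBLOCK)` would clear any other status flags, such as O_APPEND, that happened to be set on the descriptor.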
2015-01-09xz: Fix comments.Lasse Collin1-4/+8
2015-01-09xz: Don't fail if stdout doesn't support O_NONBLOCK.Lasse Collin1-21/+15
This is similar to the case with stdin. Thanks to Brad Smith for the bug report and testing on OpenBSD.
2015-01-07xz: Fix a memory leak in DOS-specific code.Lasse Collin1-0/+2
2015-01-07xz: Don't fail if stdin doesn't support O_NONBLOCK.Lasse Collin1-11/+7
It's a problem at least on OpenBSD which doesn't support O_NONBLOCK on e.g. /dev/null. I'm not surprised if it's a problem on other OSes too since this behavior is allowed in POSIX.1-2008. The code relying on this behavior was committed in June 2013 and included in 5.1.3alpha released on 2013-10-26. Clearly the development releases only get limited testing.
2014-12-21xz: Fix a comment.Lasse Collin1-2/+2
2014-12-16xz: Update the man page about --threads.Lasse Collin1-5/+0
2014-12-16xz: Update the man page about --block-size.Lasse Collin1-8/+33
2014-11-26Remove LZMA_UNSTABLE macro.Lasse Collin1-1/+0