aboutsummaryrefslogtreecommitdiff
path: root/src/xz/xz.1
diff options
context:
space:
mode:
Diffstat (limited to '')
-rw-r--r--src/xz/xz.1341
1 files changed, 215 insertions, 126 deletions
diff --git a/src/xz/xz.1 b/src/xz/xz.1
index 644822ac..a2eabd72 100644
--- a/src/xz/xz.1
+++ b/src/xz/xz.1
@@ -5,7 +5,7 @@
.\" This file has been put into the public domain.
.\" You can do whatever you want with this file.
.\"
-.TH XZ 1 "2010-07-28" "Tukaani" "XZ Utils"
+.TH XZ 1 "2010-08-07" "Tukaani" "XZ Utils"
.SH NAME
xz, unxz, xzcat, lzma, unlzma, lzcat \- Compress or decompress .xz and .lzma files
.SH SYNOPSIS
@@ -188,52 +188,56 @@ The memory usage of
.B xz
varies from a few hundred kilobytes to several gigabytes depending on
the compression settings. The settings used when compressing a file
-affect also the memory usage of the decompressor. Typically the decompressor
-needs only 5\ % to 20\ % of the amount of RAM that the compressor needed when
-creating the file. Still, the worst-case memory usage of the decompressor
-is several gigabytes.
+determine the memory requirements of the decompressor. Typically the
+decompressor needs only 5\ % to 20\ % of the amount of memory that the
+compressor needed when creating the file. For example, decompressing a
+file created with
+.B xz \-9
+currently requires 65 MiB of memory. Still, it is possible to have
+.B .xz
+files that need several gigabytes of memory to decompress.
.PP
-To prevent uncomfortable surprises caused by huge memory usage,
+Especially users of older systems may find the possibility of very large
+memory usage annoying. To prevent uncomfortable surprises,
.B xz
-has a built-in memory usage limiter. While some operating systems provide
-ways to limit the memory usage of processes, relying on it wasn't deemed
-to be flexible enough. The default limit depends on the total amount of
-physical RAM:
-.IP \(bu 3
-If 40\ % of RAM is at least 80 MiB, 40\ % of RAM is used as the limit.
-.IP \(bu 3
-If 80\ % of RAM is less than 80 MiB, 80\ % of RAM is used as the limit.
-.IP \(bu 3
-Otherwise 80 MiB is used as the limit.
+has a built-in memory usage limiter, which is disabled by default.
+While some operating systems provide ways to limit the memory usage of
+processes, relying on it wasn't deemed to be flexible enough (e.g. using
+.BR ulimit (1)
+to limit virtual memory tends to cripple
+.BR mmap (2)).
.PP
-When compressing, if the selected compression settings exceed the memory
-usage limit, the settings are automatically adjusted downwards and a notice
-about this is displayed. As an exception, if the memory usage limit is
-exceeded when compressing with
-.B \-\-format=raw
-or
-.BR \-\-no\-adjust ,
-an error is displayed and
+The memory usage limiter can be enabled with the command line option
+\fB\-\-memlimit=\fIlimit\fR, but often it is more convenient to enable
+the limiter by default by setting the environment variable
+.BR XZ_DEFAULTS ,
+e.g.
+.BR XZ_DEFAULTS=\-\-memlimit=150MiB .
+It is possible to set the limits separately for compression and decompression
+by using \fB\-\-memlimit\-compress=\fIlimit\fR and
+\fB\-\-memlimit\-decompress=\fIlimit\fR, respectively.
+Using these two options outside
+.B XZ_DEFAULTS
+is rarely useful, because a single run of
.B xz
-will exit with exit status
-.BR 1 .
+cannot do both compression and decompression and
+.BI \-\-memlimit= limit
+(or \fB\-M\fR \fIlimit\fR)
+is shorter to type on the command line.
.PP
-If source
-.I file
-cannot be decompressed without exceeding the memory usage limit, an error
-message is displayed and the file is skipped. Note that compressed files
-may contain many blocks, which may have been compressed with different
-settings. Typically all blocks will have roughly the same memory requirements,
-but it is possible that a block later in the file will exceed the memory usage
-limit, and an error about too low memory usage limit gets displayed after some
-data has already been decompressed.
-.PP
-The absolute value of the active memory usage limit can be seen with
-.B \-\-info-memory
-or near the bottom of the output of
-.BR \-\-long\-help .
-The default limit can be overridden with
-\fB\-\-memory=\fIlimit\fR.
+If the specified memory usage limit is exceeded when decompressing,
+.B xz
+will display an error and decompressing the file will fail.
+If the limit is exceeded when compressing,
+.B xz
+will try to scale the settings down so that the limit is no longer exceeded
+(except when using \fB\-\-format=raw\fR or \fB\-\-no\-adjust\fR).
+This way the operation won't fail unless the limit is very small. The scaling
+of the settings is done in steps that don't match the compression level
+presets, e.g. if the limit is only slightly less than the amount required for
+.BR "xz \-9" ,
+the settings will be scaled down only a little, not all the way down to
+.BR "xz \-8" .
.SS Concatenation and padding with .xz files
It is possible to concatenate
.B .xz
@@ -363,7 +367,7 @@ doesn't recognize the type of the source file,
.B xz
will copy the source file as is to standard output. This allows using
.B xzcat
-.B \--force
+.B \-\-force
like
.BR cat (1)
for files that have not been compressed with
@@ -380,7 +384,7 @@ can be used to restrict
to decompress only a single file format.
.RE
.TP
-.BR \-c ", " \-\-stdout ", " \-\-to-stdout
+.BR \-c ", " \-\-stdout ", " \-\-to\-stdout
Write the compressed or decompressed data to standard output instead of
a file. This implies
.BR \-\-keep .
@@ -559,12 +563,8 @@ due to speed and memory usage.
The exact compression settings (filter chain) used by each preset may
vary between
.B xz
-versions. The settings may also vary between files being compressed, if
-.B xz
-determines that modified settings will probably give better compression
-ratio without significantly affecting compression time or memory usage.
-.IP
-Because the settings may vary, the memory usage may vary too. The following
+versions. Because the settings may vary, the memory usage may vary
+slightly too. FIXME The following
table lists the maximum memory usage of each preset level, which won't be
exceeded even in future versions of
.BR xz .
@@ -590,12 +590,6 @@ Preset;Compression;Decompression
.TE
.RE
.RE
-.IP
-When compressing,
-.B xz
-automatically adjusts the compression settings downwards if
-the memory usage limit would be exceeded, so it is safe to specify
-a high preset level even on systems that don't have lots of RAM.
.TP
.BR \-\-fast " and " \-\-best
These are somewhat misleading aliases for
@@ -619,16 +613,25 @@ of the compressor or decompressor (exception: compressor memory usage may
increase a little with presets \fB\-0\fR ... \fB\-2\fR). The downside is that
the compression time will increase dramatically (it can easily double).
.TP
+.BI \-\-memlimit\-compress= limit
+Set a memory usage limit for compression. If this option is specified
+multiple times, the last one takes effect.
+.IP
+If the compression settings exceed the
+.IR limit ,
+.B xz
+will adjust the settings downwards so that the limit is no longer exceeded
+and display a notice that automatic adjustment was done. Adjustment is never
+done when compressing with
+.B \-\-format=raw
+or if
.B \-\-no\-adjust
-Display an error and exit if the compression settings exceed the
-the memory usage limit. The default is to adjust the settings downwards so
-that the memory usage limit is not exceeded. Automatic adjusting is
-always disabled when creating raw streams
-.RB ( \-\-format=raw ).
-.TP
-\fB\-M\fR \fIlimit\fR, \fB\-\-memory=\fIlimit
-Set the memory usage limit. If this option is specified multiple times,
-the last one takes effect. The
+has been specified. In those cases, an error is displayed and
+.B xz
+will exit with exit status
+.BR 1 .
+.IP
+The
.I limit
can be specified in multiple ways:
.RS
@@ -638,52 +641,80 @@ The
can be an absolute value in bytes. Using an integer suffix like
.B MiB
can be useful. Example:
-.B "\-\-memory=80MiB"
+.B "\-\-memlimit\-compress=80MiB"
.IP \(bu 3
The
.I limit
-can be specified as a percentage of physical RAM. Example:
-.B "\-\-memory=70%"
+can be specified as a percentage of total physical memory (RAM).
+This can be useful especially when setting the
+.B XZ_DEFAULTS
+environment variable in a shell initialization script that is shared
+between different computers. That way the limit is automatically bigger
+on systems with more memory. Example:
+.B "\-\-memlimit\-compress=70%"
.IP \(bu 3
The
.I limit
can be reset back to its default value by setting it to
.BR 0 .
-See the section
-.B "Memory usage"
-for how the default limit is defined.
-.IP \(bu 3
-The memory usage limiting can be effectively disabled by setting
+This is currently equivalent to setting the
.I limit
to
-.BR max .
-This isn't recommended. It's usually better to use, for example,
-.BR \-\-memory=90% .
+.B max
+i.e. no memory usage limit. Once multithreading support has been implemented,
+there may be a difference between
+.B 0
+and
+.B max
+for the multithreaded case, so it is recommended to use
+.B 0
+instead of
+.B max
+at least until the details have been decided.
.RE
.IP
-The current
-.I limit
-can be seen near the bottom of the output of the
-.B \-\-long-help
-option.
+See also the section
+.BR "Memory usage" .
+.TP
+.BI \-\-memlimit\-decompress= limit
+Set a memory usage limit for decompression. This affects also the
+.B \-\-list
+mode. If the operation is not possible without exceeding the
+.IR limit ,
+.B xz
+will display an error and decompressing the file will fail. See
+.BI \-\-memlimit\-compress= limit
+for possible ways to specify the
+.IR limit .
+.TP
+\fB\-M\fR \fIlimit\fR, \fB\-\-memlimit=\fIlimit\fR, \fB\-\-memory=\fIlimit
+This is equivalent to specifying \fB\-\-memlimit\-compress=\fIlimit
+\fB\-\-memlimit\-decompress=\fIlimit\fR.
+.TP
+.B \-\-no\-adjust
+Display an error and exit if the compression settings exceed the
+the memory usage limit. The default is to adjust the settings downwards so
+that the memory usage limit is not exceeded. Automatic adjusting is
+always disabled when creating raw streams
+.RB ( \-\-format=raw ).
.TP
\fB\-T\fR \fIthreads\fR, \fB\-\-threads=\fIthreads
-Specify the maximum number of worker threads to use. The default is
-the number of available CPU cores. You can see the current value of
-.I threads
-near the end of the output of the
-.B \-\-long\-help
-option.
-.IP
-The actual number of worker threads can be less than
+Specify the number of worker threads to use. The actual number of threads
+can be less than
.I threads
if using more threads would exceed the memory usage limit.
-In addition to CPU-intensive worker threads,
-.B xz
-may use a few auxiliary threads, which don't use a lot of CPU time.
.IP
.B "Multithreaded compression and decompression are not implemented yet,"
.B "so this option has no effect for now."
+.IP
+.B "As of writing (2010-08-07), it hasn't been decided if threads will be"
+.B "used by default on multicore systems once support for threading has"
+.B "been implemented. Comments are welcome."
+The complicating factor is that using many threads will increase the memory
+usage dramatically. Note that if multithreading will be the default,
+it will be done so that single-threaded and multithreaded modes produce
+the same output, so compression ratio won't be significantly affected if
+threading will be enabled by default.
.SS Custom compressor filter chains
A custom filter chain allows specifying the compression settings in detail
instead of relying on the settings associated to the preset levels.
@@ -1037,7 +1068,8 @@ Currently only simple byte-wise delta calculation is supported. It can
be useful when compressing e.g. uncompressed bitmap images or uncompressed
PCM audio. However, special purpose algorithms may give significantly better
results than Delta + LZMA2. This is true especially with audio, which
-compresses faster and better e.g. with FLAC.
+compresses faster and better e.g. with
+.BR flac (1).
.IP
Supported
.IR options :
@@ -1087,18 +1119,17 @@ processed so far.
.IP \(bu 3
Compression or decompression speed. This is measured as the amount of
uncompressed data consumed (compression) or produced (decompression)
-per second. It is shown once a few seconds have passed since
+per second. It is shown after a few seconds have passed since
.B xz
started processing the file.
.IP \(bu 3
-Elapsed time or estimated time remaining.
-Elapsed time is displayed in the format M:SS or H:MM:SS.
-The estimated remaining time is displayed in a less precise format
-which never has colons, for example, 2 min 30 s. The estimate can
-be shown only when the size of the input file is known and a couple of
-seconds have already passed since
+Elapsed time in the format M:SS or H:MM:SS.
+.IP \(bu 3
+Estimated remaining time is shown only when the size of the input file is
+known and a couple of seconds have already passed since
.B xz
-started processing the file.
+started processing the file. The time is shown in a less precise format which
+never has any colons, e.g. 2 min 30 s.
.RE
.IP
When standard error is not a terminal,
@@ -1106,11 +1137,11 @@ When standard error is not a terminal,
will make
.B xz
print the filename, compressed size, uncompressed size, compression ratio,
-speed, and elapsed time on a single line to standard error after
-compressing or decompressing the file. If operating took at least a few
-seconds, also the speed and elapsed time are printed. If the operation
-didn't finish, for example due to user interruption, also the completion
-percentage is printed if the size of the input file is known.
+and possibly also the speed and elapsed time on a single line to standard
+error after compressing or decompressing the file. The speed and elapsed
+time are included only when the operation took at least a few seconds.
+If the operation didn't finish, for example due to user interruption, also
+the completion percentage is printed if the size of the input file is known.
.TP
.BR \-Q ", " \-\-no\-warn
Don't set the exit status to
@@ -1133,12 +1164,11 @@ releases. See the section
.B "ROBOT MODE"
for details.
.TP
-.BR \-\-info-memory
-Display the current memory usage limit in human-readable format on
-a single line, and exit successfully. To see how much RAM
+.BR \-\-info\-memory
+Display, in human-readable format, how much physical memory (RAM)
.B xz
-thinks your system has, use
-.BR "\-\-memory=100% \-\-info\-memory" .
+thinks the system has and the memory usage limits for compression
+and decompression, and exit successfully.
.TP
.BR \-h ", " \-\-help
Display a help message describing the most commonly used options,
@@ -1165,7 +1195,7 @@ easier to parse by other programs. Currently
.B \-\-robot
is supported only together with
.BR \-\-version ,
-.BR \-\-info-memory ,
+.BR \-\-info\-memory ,
and
.BR \-\-list .
It will be supported for normal compression and decompression in the future.
@@ -1216,10 +1246,24 @@ and
5.0.0 is
.BR 50000002 .
.SS Memory limit information
-.B "xz \-\-robot \-\-info-memory"
-prints the current memory usage limit as bytes on a single line.
-To get the total amount of installed RAM, use
-.BR "xz \-\-robot \-\-memory=100% \-\-info-memory" .
+.B "xz \-\-robot \-\-info\-memory"
+prints a single line with three tab-separated columns:
+.RS
+.IP 1. 4
+Total amount of physical memory (RAM) as bytes
+.IP 2. 4
+Memory usage limit for compression as bytes.
+A special value of zero indicates the default setting,
+which for single-threaded mode is the same as no limit.
+.IP 3. 4
+Memory usage limit for decompression as bytes.
+A special value of zero indicates the default setting,
+which for single-threaded mode is the same as no limit.
+.RE
+.PP
+In the future, the output of
+.B "xz \-\-robot \-\-info\-memory"
+may have more columns, but never more than a single line.
.SS List mode
.B "xz \-\-robot \-\-list"
uses tab-separated output. The first column of every line has a string
@@ -1455,16 +1499,52 @@ Something worth a warning occurred, but no actual errors occurred.
Notices (not warnings or errors) printed on standard error don't affect
the exit status.
.SH ENVIRONMENT
+.B xz
+parses space-separated lists of options from the environment variables
+.B XZ_DEFAULTS
+and
+.BR XZ_OPT ,
+in this order, before parsing the options from the command line. Note that
+only options are parsed from the environment variables; all non-options
+are silently ignored. Parsing is done with
+.BR getopt_long (3)
+which is used also for the command line arguments.
+.TP
+.B XZ_DEFAULTS
+User-specific or system-wide default options.
+Typically this is set in a shell initialization script to enable
+.BR xz 's
+memory usage limiter by default. Excluding shell initialization scripts
+and similar special cases, scripts must never set or unset
+.BR XZ_DEFAULTS .
.TP
.B XZ_OPT
-A space-separated list of options is parsed from
+This is for passing options to
+.B xz
+when it is not possible to set the options directly on the
+.B xz
+command line. This is the case e.g. when
+.B xz
+is run by a script or tool, e.g. GNU
+.BR tar (1):
+.RS
+.IP
+\fBXZ_OPT=\-2v tar caf foo.tar.xz foo
+.RE
+.IP
+Scripts may use
.B XZ_OPT
-before parsing the options given on the command line. Note that only
-options are parsed from
-.BR XZ_OPT ;
-all non-options are silently ignored. Parsing is done with
-.BR getopt_long (3)
-which is used also for the command line arguments.
+e.g. to set script-specific default compression options.
+It is still recommended to allow users to override
+.B XZ_OPT
+if that is reasonable, e.g. in
+.BR sh (1)
+scripts one may use something like this:
+.RS
+.IP
+\fBXZ_OPT=${XZ_OPT\-"\-7e"}; export XZ_OPT
+.RE
+.IP
.SH "LZMA UTILS COMPATIBILITY"
The command line syntax of
.B xz
@@ -1663,7 +1743,7 @@ XZ Embedded supports BCJ filters, but only with the default start offset.
A mix of compressed and uncompressed files can be decompressed
to standard output with a single command:
.IP
-.B "xz -dcf a.txt b.txt.xz c.txt d.txt.xz > abcd.txt"
+.B "xz \-dcf a.txt b.txt.xz c.txt d.txt.xz > abcd.txt"
.SS Parallel compression of many files
On GNU and *BSD,
.BR find (1)
@@ -1672,7 +1752,8 @@ and
can be used to parallelize compression of many files:
.PP
.IP
-.B "find . \-type f \e! \-name '*.xz' \-print0 | xargs \-0r \-P4 \-n16 xz"
+.B "find . \-type f \e! \-name '*.xz' \-print0 |"
+.B "xargs \-0r \-P4 \-n16 xz \-T1"
.PP
The
.B \-P
@@ -1690,11 +1771,19 @@ or even more may be appropriate to reduce the number of
processes that
.BR xargs (1)
will eventually create.
+.PP
+The option
+.B \-T1
+for
+.B xz
+is there to force it to single-threaded mode, because
+.BR xargs (1)
+is used to control the amount of parallelization.
.SS Robot mode examples
Calculating how many bytes have been saved in total after compressing
multiple files:
.IP
-.B "xz --robot --list *.xz | awk '/^totals/{print $5\-$4}'"
+.B "xz \-\-robot \-\-list *.xz | awk '/^totals/{print $5\-$4}'"
.SH "SEE ALSO"
.BR xzdec (1),
.BR gzip (1),