Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
In the v5.2 branch this feature is considered experimental
and thus disabled by default.
The sandboxing is used conditionally as described in main.c.
This isn't optimal but it was much easier to implement than
a full sandboxing solution and it still covers the most common
use cases where xz is writing to standard output. This should
have practically no effect on performance even with small files
as fork() isn't needed.
C and locale libraries can open files as needed. This has been
fine in the past, but it's a problem with things like Capsicum.
io_sandbox_enter() tries to ensure that various locale-related
files have been loaded before cap_enter() is called, but it's
possible that there are other similar problems which haven't
been seen yet.
Currently Capsicum is available on FreeBSD 10 and later
and there is a port to Linux too.
Thanks to Loganaden Velvindron for help.
|
|
xz used to call utime() on Windows, but its result gets lost
on close(). Using _futime() seems to work.
Thanks to Martok for reporting the bug:
http://www.mail-archive.com/xz-devel@tukaani.org/msg00261.html
|
|
Thanks to Evan Nemerson.
|
|
Thanks to Christian Kujau.
|
|
The patch is quite long but it's mostly about adding new #ifdefs
to omit code when encoders or decoders have been disabled.
This adds two new #defines to config.h: HAVE_ENCODERS and
HAVE_DECODERS.
|
|
unlink() can return EBUSY in errno for open files on some
operating systems and file systems.
|
|
|
|
This reverts commit 7a11c4a8e5e15f13d5fa59233b3172e65428efdd.
It is a problem when libc has pipe2() but the kernel is too
old to have pipe2() and thus pipe2() fails. In xz it's pointless
to have a fallback for non-functioning pipe2(); it's better to
avoid pipe2() completely.
Thanks to Michael Fox for the bug report.
|
|
|
|
|
|
Actually the value of arg_count cannot exceed INT_MAX
but it's nicer as an unsigned int.
|
|
|
|
Now it reads the old flags instead of blindly setting O_NONBLOCK.
The old code may have worked correctly, but this is better.
|
|
|
|
This is similar to the case with stdin.
Thanks to Brad Smith for the bug report and testing
on OpenBSD.
|
|
|
|
It's a problem at least on OpenBSD which doesn't support
O_NONBLOCK on e.g. /dev/null. I'm not surprised if it's
a problem on other OSes too since this behavior is allowed
in POSIX-1.2008.
The code relying on this behavior was committed in June 2013
and included in 5.1.3alpha released on 2013-10-26. Clearly
the development releases only get limited testing.
|
|
|
|
|
|
|
|
|
|
Due to a bug in Automake, subdir-objects won't be enabled
for now.
http://debbugs.gnu.org/cgi/bugreport.cgi?bug=17354
Thanks to Daniel Richard G. for the original patches.
|
|
|
|
Updated: --threads, --block-size, and --block-list
Added: --flush-timeout
|
|
|
|
This avoids LZMA_PROG_ERROR from lzma_code() with filter chains
that don't support LZMA_SYNC_FLUSH.
|
|
|
|
Thanks to Christian Hesse.
|
|
I don't know the details but I have an impression that there's
no problem in practice if using GCC since people have built xz
with GCC (without patching xz), but renaming the variable cannot
hurt either.
Thanks to Mark Ashley.
|
|
|
|
Since the only call to suffix_set() uses optarg
as the argument, fixing this bug doesn't change
the behavior of the program.
|
|
|
|
Previously, --block-list and --block-size only worked together
in threaded mode. Boundaries are specified by --block-list, but
--block-size specifies the maximum size for a Block. Now this
works in single-threaded mode too.
Thanks to James M Leddy for the original patch.
|
|
This needs to be updated before 5.2.0.
|
|
|
|
Now if --block-list is used in threaded mode, the encoder
won't need to flush at each Block boundary specified via
--block-list. This improves performance a lot, making
threading helpful with --block-list.
The flush timer was reset after LZMA_FULL_FLUSH but since
LZMA_FULL_BARRIER doesn't flush, resetting the timer is
no longer done.
|
|
|
|
Now liblzma only uses "mythread" functions and types
which are defined in mythread.h matching the desired
threading method.
Before Windows Vista, there is no direct equivalent to
pthread condition variables. Since this package doesn't
use pthread_cond_broadcast(), pre-Vista threading can
still be kept quite simple. The pre-Vista code doesn't
use anything that wasn't already available in Windows 95,
so the binaries should run even on Windows 95 if someone
happens to care.
|
|
When --flush-timeout=TIMEOUT is used, xz will use
LZMA_SYNC_FLUSH if read() would block and at least
TIMEOUT milliseconds has elapsed since the previous flush.
This can be useful in realtime-like use cases where the
data is simultanously decompressed by another process
(possibly on a different computer). If new uncompressed
input data is produced slowly, without this option xz could
buffer the data for a long time until it would become
decompressible from the output.
If TIMEOUT is 0, the feature is disabled. This is the default.
This commit affects the compression side. Using xz for
the decompression side for the above purpose doesn't work
yet so well because there is quite a bit of input and
output buffering when decompressing.
The --long-help or man page were not updated yet.
The details of this feature may change.
|
|
|
|
Testing for end of file was no longer correct after full flushing
became possible with --block-size=SIZE and --block-list=SIZES.
There was no bug in practice though because xz just made a few
unneeded zero-byte reads.
|
|
This switches units from microseconds to milliseconds.
New clock_gettime(CLOCK_MONOTONIC) will be used if available.
There is still a fallback to gettimeofday().
|
|
Thanks to Christian Hesse.
|
|
Now both reading and writing should be without
race conditions with signals.
They might still be signal handling issues left.
Signals are blocked during many operations to avoid
EINTR but it may cause problems e.g. if writing to
stderr blocks when trying to display an error message.
|
|
It didn't affect the behavior of the code since -1
becomes true anyway.
|
|
It is possible that a signal to set user_abort arrives right
before a blocking system call is made. In this case the call
may block until another signal arrives, while the wanted
behavior is to make xz clean up and exit as soon as possible.
After this commit, the race condition is avoided with the
input side which already uses non-blocking I/O. The output
side still uses blocking I/O and thus has the race condition.
|
|
|
|
Nowadays errno == EFTYPE is documented in open(2).
|
|
POSIX says that fcntl(fd, F_SETFL, flags) returns -1 on
error and "other than -1" on success. This is how it is
documented e.g. on OpenBSD too. On Linux, success with
F_SETFL is always 0 (at least accorinding to fcntl(2)
from man-pages 3.51).
|
|
Due to a wrong variable name, when writing a sparse file
to standard output, *all* file status flags were cleared
(to the extent the operating system allowed it) instead of
only clearing the O_APPEND flag. In practice this worked
fine in the common situations on GNU/Linux, but I didn't
check how it behaved elsewhere.
The original flags were still restored correctly. I still
changed the code to use a separate boolean variable to
indicate when the flags should be restored instead of
relying on a special value in stdout_flags.
|
|
Input file can be a FIFO or something else that doesn't
support posix_fadvise() so don't check the return value
even with an assertion. Nothing bad happens if the call
to posix_fadvise() fails.
|
|
It is a no-op for now, but if an old xz version is used
together with a newer liblzma that supports something new,
then this check becomes important and will stop the old xz
from trying to parse files that it won't understand.
|
|
This affects only "xz -lvv". Normal decompression with xz
already detected if Block Header and Index had mismatched
Uncompressed Size fields. So this just makes "xz -lvv"
show such files as corrupt instead of showing the
Uncompressed Size from Index.
|
|
Thanks to Eric S. Raymond.
|
|
Now the interaction of presets and custom filter chains
is described correctly. Earlier it contradicted itself.
Thanks to DevHC who reported these issues on IRC to me
on 2012-12-14.
|
|
There was somewhat illogical behavior when --extreme was
specified and mixed with custom filter chains.
Before this commit, "xz -9 --lzma2 -e" was equivalent
to "xz --lzma2". After it is equivalent to "xz -6e"
(all earlier preset options get forgotten when a custom
filter chain is specified and the default preset is 6
to which -e is applied). I find this less illogical.
This also affects the meaning of "xz -9e --lzma2 -7".
Earlier it was equivalent to "xz -7e" (the -e specified
before a custom filter chain wasn't forgotten). Now it
is "xz -7". Note that "xz -7e" still is the same as "xz -e7".
Hopefully very few cared about this in the first place,
so pretty much no one should even notice this change.
Thanks to Conley Moorhous.
|
|
This adds lzma_get_progress() to liblzma and takes advantage
of it in xz.
lzma_get_progress() collects progress information from
the thread-specific structures so that fairly accurate
progress information is available to applications. Adding
a new function seemed to be a better way than making the
information directly available in lzma_stream (like total_in
and total_out are) because collecting the information requires
locking mutexes. It's waste of time to do it more often than
the up to date information is actually needed by an application.
|
|
Thanks to Olivier Delhomme for pointing out that this
was still missing.
|
|
|
|
Thanks to Jim Meyering.
|
|
Thanks to Jim Meyering.
|
|
|
|
Thanks to Jonathan Nieder.
|
|
The decoder bug was fixed in 5.0.2 instead of 5.0.3.
|
|
It's broken with threads and when also --block-size is used.
|
|
|
|
|
|
|
|
This documents only the columns that are in v5.0.
The new columns added in the master branch aren't
necessarily stable yet.
|
|
It printed the filename in "filename (x/y)" format
which it obviously shouldn't do in robot mode.
|
|
Man page wasn't updated yet.
|
|
Thanks to Bela Lubkin.
|
|
Thanks to Chris Donawa.
|
|
It could do an invalid free() and read past the end
of the uninitialized filters array.
|
|
French needs a space before a colon, e.g. "xz : foo error".
|
|
|
|
Now the following works as you would expect:
echo foo | xz > foo.xz
echo bar | xz >> foo.xz
( xz -dc --single-stream ; xz -dc --single-stream ) < foo.xz
Note that it doesn't work if the input is not seekable
or if there is Stream Padding between the concatenated
.xz Streams.
|
|
|
|
This way people hopefully won't complain if these APIs
change and break code that used an older API.
|
|
Spot candidates by running these commands:
git ls-files |xargs perl -0777 -n \
-e 'while (/\b(then?|[iao]n|i[fst]|but|f?or|at|and|[dt]o)\s+\1\b/gims)' \
-e '{$n=($` =~ tr/\n/\n/ + 1); ($v=$&)=~s/\n/\\n/g; print "$ARGV:$n:$v\n"}'
Thanks to Jim Meyering for the original patch.
|
|
|
|
|
|
|
|
|
|
This is incompatible with the 8.3 support patch made by
Juan Manuel Guerrero. I think this one is nicer, but
I need to get feedback from DOS users before saying
that this is the final version of 8.3 filename support.
|
|
Try to avoid overwriting the source file if --force is
used and the generated destination filename refers to
the source file. This can happen with 8.3 filenames where
extra characters are ignored.
If the generated output file refers to a special file
like "con" or "prn", refuse to write to it even if --force
is used.
|
|
|
|
Now it always defaults to one thread. Maybe this
will change again if a threading method is added
that doesn't affect memory usage.
|
|
|
|
|
|
|
|
|
|
This uses LZMA_FULL_FLUSH every SIZE bytes of input.
Man page wasn't updated yet.
|
|
This can be useful when there is garbage after the
compressed stream (.xz, .lzma, or raw stream).
Man page wasn't updated yet.
|
|
struct suffix_pair isn't needed in compresed_name()
so get rid of it there.
|
|
Now "xz -S .test foo.test" refuses to compress the
file because it already has the suffix .test. The man
page had it documented this way already.
|
|
xz didn't compress setuid/setgid/sticky files and files
with multiple hard links even with --force. This bug was
introduced in 23ac2c44c3ac76994825adb7f9a8f719f78b5ee4.
Thanks to Charles Wilson.
|
|
Thanks to Cristian Rodríguez for the original patch.
|
|
Juan Manuel Guerrero had fixed this in his XZ Utils port
to DOS/DJGPP. The bug affects also Windows and OS/2.
|
|
|
|
lzma_chunk_size() was commented out because it is
currently useless.
|
|
This is similar to DOS/DJGPP that killing the program
with a signal will print a backtrace or a similar message.
|
|
SA_RESTART is not as portable as I had hoped. It's missing
at least from OpenVMS, QNX, and DJGPP). Luckily we can do
fine without SA_RESTART.
|
|
|
|
Calling raise() to kill xz when user has pressed C-c
is a bit verbose on OS/2 and DOS/DJGPP. Instead of
calling raise(), set only the exit status to 1.
|
|
|
|
Most distros want xz linked against shared liblzma, so
it doesn't help much to require --enable-dynamic for that.
Those who want to avoid PIC on x86-32 to get better
performance, can still do it e.g. by using --disable-shared
to compile xz and then another pass to compile shared liblzma.
Part of these static/dynamic tricks were needed for Windows
in the past. Nowadays we rely on GCC and binutils to do the
right thing with auto-import. If the Autotooled build system
needs to support some other toolchain on Windows in the future,
this may need some rethinking.
|
|
Thanks to Jonathan Nieder.
|
|
|
|
Lots of content was updated on the xz man page.
Technical improvements:
- Start a new sentence on a new line.
- Use fairly short lines.
- Use constant-width font for examples (where supported).
- Some minor cleanups.
Thanks to Jonathan Nieder for some language fixes.
|
|
|
|
|
|
The code assumed that printing numbers with thousand separators
and decimal points would always produce only US-ASCII characters.
This was used for buffer sizes (with snprintf(), no overflows)
and aligning columns of the progress indicator and --list. That
assumption was wrong (e.g. LC_ALL=fi_FI.UTF-8 with glibc), so
multibyte character support was added in this commit. The old
way is used if the operating system doesn't have enough multibyte
support (e.g. lacks wcwidth()).
The sizes of buffers were increased to accomodate multibyte
characters. I don't know how big they should be exactly, but
they aren't used for anything critical, so it's not too bad.
If they still aren't big enough, I hopefully get a bug report.
snprintf() takes care of avoiding buffer overflows.
Some static buffers were replaced with buffers allocated on
stack. double_to_str() was removed. uint64_to_str() and
uint64_to_nicestr() now share the static buffer and test
for thousand separator support.
Integrity check names "None" and "Unknown-N" (2 <= N <= 15)
were marked to be translated. I had forgot these, plus they
wouldn't have worked correctly anyway before this commit,
because printing tables with multibyte strings didn't work.
Thanks to Marek Černocký for reporting the bug about
misaligned table columns in --list output.
|
|
|
|
I had somehow thought that N_() is usually used
as shorthand for ngettext().
This also fixes a missing \n from a call to ngettext().
|
|
|
|
Thanks to Joerg Sonnenberger.
|
|
Thanks Joerg Sonnenberger.
|
|
|
|
|
|
|
|
|
|
|
|
At least for now, the --help option doesn't list any
options that take arguments, so "Mandatory arguments to..."
can be omitted.
|
|
This is more logical behavior than ignoring preset level
options once a custom filter chain has been specified.
|
|
|
|
For several people, the limiter causes bigger problems that
it solves, so it is better to have it disabled by default.
Those who want to have a limiter by default need to enable
it via the environment variable XZ_DEFAULTS.
Support for environment variable XZ_DEFAULTS was added. It is
parsed before XZ_OPT and technically identical with it. The
intended uses differ quite a bit though; see the man page.
The memory usage limit can now be set separately for
compression and decompression using --memlimit-compress and
--memlimit-decompress. To set both at once, -M or --memlimit
can be used. --memory was retained as a legacy alias for
--memlimit for backwards compatibility.
The semantics of --info-memory were changed in backwards
incompatible way. Compatibility wasn't meaningful due to
changes in the memory usage limiter functionality.
The memory usage limiter info is no longer shown at the
bottom of xz --long -help.
The memory usage limiter support for removed completely from xzdec.
xz's man page was updated to match the above changes. Various
unrelated fixes were also made to the man page.
|
|
|
|
Thanks to A. Costa and Jonathan Nieder.
|
|
|
|
Thanks to Denis Excoffier.
|
|
Thanks to Denis Excoffier for the bug report.
|
|
|
|
|
|
I want to get a bug report if something else than
DJGPP lacks SA_RESTART.
|
|
- Concatenating .xz files and padding
- List mode
- Robot mode
- A few examples (but many more are needed)
|
|
|
|
|
|
This change is no-op, but good to have just in case
for the future.
|
|
Thanks to Jonathan Nieder.
|
|
This should have been done in
920a69a8d8e4203c5edddd829d932130eac188ea.
|
|
This should avoid some minor portability issues.
|
|
The spec isn't finished and the code didn't compile anymore.
It won't be included in XZ Utils 5.0.0. It's easy to get it
back once the spec is done.
|
|
message_filters_to_str() converts the filter chain to
a string. message_filters_show() replaces the original
message_filters().
uint32_to_optstr() was also added to show the dictionary
size in nicer format when possible.
|
|
The extra space for showing both has been taken from the
sizes field. If the sizes grow big, bigger units than MiB
will be used. It makes it slightly difficult to see that
progress is still happening with huge files, but it should
be OK in practice.
Thanks to Trent W. Buck for <http://bugs.debian.org/574583>
and Jonathan Nieder for suggestions how to fix it.
|
|
Originally both base-2 and base-10 were supported, but since
there seems to be little need for base-10 in XZ Utils, treat
everything as base-2 and also be more relaxed about the case
of the first letter of the suffix. Now xz will accept e.g.
KiB, Ki, k, K, kB, and KB, and interpret them all as 1024. The
recommended spelling of the suffixes are still KiB, MiB, and GiB.
|
|
It still feels a bit wrong to round 1 byte to 1 MiB but
at least it is now done consistently so that the same
byte value is always rounded the same way to MiB.
|
|
Previously the default limit was always 40 % of RAM. The
new limit is a little bit more complex:
- If 40 % of RAM is at least 80 MiB, 40 % of RAM is used
as the limit.
- If 80 % of RAM is over 80 MiB, 80 MiB is used as the limit.
- Otherwise 80 % of RAM is used as the limit.
This should make it possible to decompress files created with
"xz -9" on more systems. Swapping is generally more expected
on systems with less RAM, so higher default limit on them
shouldn't cause too bad surprises in terms of heavy swapping.
Instead, the higher default limit should reduce the number of
bad surprises when it used to prevent decompression of files
created with "xz -9". The DoS prevention system shouldn't be
a DoS itself.
Note that even with the new default limit, a system with 64 MiB
RAM cannot decompress files created with "xz -9" without user
overriding the limit. This should be OK, because if xz is going
to need more memory than the system has RAM, it will run very
very slowly and thus it's good that user has to override the limit
in that case.
|
|
Thanks to Jonathan Nieder.
|
|
This was added in 455e68c030fde8a8c2f5e254c3b3ab9489bf3735.
|
|
|
|
|
|
xz --force accepted symlinks, but didn't remove
them after successful compression. Instead, an error
message was displayed.
|
|
Previously it was set statically to CRC64 or CRC32
depending on options passed to the configure script.
|
|
|
|
If signal handlers haven't been established, then it's
useless to try to block them, especially since the sigset_t
used for blocking hasn't been initialized yet.
|
|
The opening of the destination file is now delayed a little.
The coder is initialized, and if decompressing, the memory
usage of the first Block compared against the memory
usage limit before the destination file is opened. This
means that if --force was used, the old "target" file won't
be deleted so easily when something goes wrong very early.
Thanks to Mark K for the bug report.
The above fix required some changes to progress message
handling. Now there is a separate function for setting and
printing the filename. It is used also in list.c.
list_file() now handles stdin correctly (gives an error).
A useless check for user_abort was removed from file_io.c.
|
|
This should have been already in
0bc9eab243dee3be764b3530433a7fcdc3f7c6a1.
|
|
|
|
|
|
Thanks to Robert Readman.
|
|
Added a note to translators too.
Thanks to Robert Readman.
|
|
This was introduced in
0dd6d007669b946543ca939a44243833c79e08f4 two days ago.
|
|
|
|
This is a bit rough but should be useful for basic things.
Ideas (with detailed examples) about the output format are
welcome.
The output of --robot --list is not necessarily stable yet,
although I don't currently have any plans about changing it.
The man page hasn't been updated yet.
|
|
It will be used by --list.
|
|
It is to ensure that floating point numbers
will always have a dot as the decimal separator.
|
|
|
|
|
|
to a terminal even if --force is specified.
It just seems more reasonable this way.
The new behavior matches bzip2. The old one matched gzip.
|
|
to stdout even if --force is used.
--force will still enable compression of symlinks, but only
in case they point to a regular file.
The new way simply seems more reasonable. It matches gzip's
behavior while the old one matched bzip2's behavior.
|
|
The problem was introduced when adding sparse file
support in 465d1b0d6518c5d980f2db4c2d769f9905bdd902.
Thanks to Charles Wilson.
|
|
for translation bugs.
|
|
Thanks to Marek Černocký.
|
|
Fix a missing _() in the error message too.
|
|
With bad luck this could cause a segfault due to
reading (but not writing) past the end of the buffer.
|
|
a regular file.
Sparse file creation can be disabled with --no-sparse.
I don't promise yet that the name of this option won't
change before 5.0.0. It's possible that the code, that
checks when it is safe to use sparse output on stdout,
is not good enough, and a more flexible command line
option is needed to configure sparse file handling.
|
|
Currently --robot works only with --info-memory and
--version. --help and --long-help work too, but --robot
has no effect on them.
Thanks to Jonathan Nieder for the original patches.
|
|
I had hoped to keep liblzma as purely a compression
library as possible (e.g. file I/O will go into
a different library), but it seems that applications
linking agaisnt liblzma need some way to determine
the memory usage limit, and knowing the amount of RAM
is one reasonable way to help making such decisions.
Thanks to Jonathan Nieder for the original patch.
|
|
|
|
|
|
Originally the idea was that using LZMA_FULL_FLUSH
with Stream encoder would read the filter chain
from the same array that was used to intialize the
Stream encoder. Since most apps wouldn't use
LZMA_FULL_FLUSH, most apps wouldn't need to keep
the filter chain available after initializing the
Stream encoder. However, due to my mistake, it
actually required keeping the array always available.
Since setting the new filter chain via the array
used at initialization time is not a nice way to do
it for a couple of reasons, this commit ditches it
and introduces lzma_filters_update(). This new function
replaces also the "persistent" flag used by LZMA2
(and to-be-designed Subblock filter), which was also
an ugly thing to do.
Thanks to Alexey Tourbin for reminding me about the problem
that Stream encoder used to require keeping the filter
chain allocated.
|
|
the man page though.
Thanks to Jim Meyering for noticing this.
|
|
|
|
Thanks to Jouk Jansen.
|
|
Thanks to Jouk Jansen.
|
|
Separate a few reusable components from XZ Utils specific
code. The reusable code is now in "tuklib" modules. A few
more could be separated still, e.g. bswap.h.
Fix some bugs in lzmainfo.
Fix physmem and cpucores code on OS/2. Thanks to Elbert Pol
for help.
Add OpenVMS support into physmem. Add a few #ifdefs to ease
building XZ Utils on OpenVMS. Thanks to Jouk Jansen for the
original patch.
|
|
Thanks to Christian Weisgerber for pointing out some of these.
|
|
This fixes "make install" on operating systems using
a suffix for executables.
Cygwin is treated specially. The symlink names won't have
.exe suffix even though the executables themselves have.
Thanks to Charles Wilson.
|
|
|
|
xz used to reject "xz --lzma2=pb=2," while
"xz --lzma2=pb=2,," worked. Now both work.
|
|
Seems that in addition on Windows and DOS, also OpenBSD
lacks support for %'d style printf() format strings.
So far that is the only modern POSIX-like system I know
with this problem, but after this hack, the thousand
separator shouldn't be a problem on any system.
Maybe testing if a format string like %'d produces
reasonable output is invoking undefined behavior on some
systems, but so far all the problematic systems I've tried
just print the raw format string (e.g. %'d prints 'd).
Maybe Autoconf test would have been better, but this
hack works also for cross-compilation, and avoids
recompilation in case the system libc starts to support
the thousand separator.
|
|
|
|
|
|
|
|
install-exec-hook -> install-data-hook
|
|
Make xz error message translation usable outside
xz (at least in upcoming lzmainfo).
|
|
|
|
|