diff options
author | Lasse Collin <lasse.collin@tukaani.org> | 2020-02-01 19:56:18 +0200 |
---|---|---|
committer | Lasse Collin <lasse.collin@tukaani.org> | 2020-02-01 19:56:18 +0200 |
commit | 353970510895f6a80adfe60cf71b70a95adfa8bc (patch) | |
tree | 112c8b4601bb0a90243cc7c7d4a6ad3433751e30 | |
parent | xz: Set the --flush-timeout deadline when the first input byte arrives. (diff) | |
download | xz-353970510895f6a80adfe60cf71b70a95adfa8bc.tar.xz |
xz: Limit --memlimit-compress to at most 4020 MiB for 32-bit xz.
See the code comment for reasoning. It's far from perfect but
hopefully good enough for certain cases while hopefully doing
nothing bad in other situations.
At presets -5 ... -9, 4020 MiB vs. 4096 MiB makes no difference
on how xz scales down the number of threads.
The limit has to be a few MiB below 4096 MiB because otherwise
things like "xz --lzma2=dict=500MiB" won't scale down the dict
size enough and xz cannot allocate enough memory. With
"ulimit -v $((4096 * 1024))" on x86-64, the limit in xz had
to be no more than 4085 MiB. Some safety margin is good though.
This is hack but it should be useful when running 32-bit xz on
a 64-bit kernel that gives full 4 GiB address space to xz.
Hopefully this is enough to solve this:
https://bugzilla.redhat.com/show_bug.cgi?id=1196786
FreeBSD has a patch that limits the result in tuklib_physmem()
to SIZE_MAX on 32-bit systems. While I think it's not the way
to do it, the results on --memlimit-compress have been good. This
commit should achieve practically identical results for compression
while leaving decompression and tuklib_physmem() and thus
lzma_physmem() unaffected.
-rw-r--r-- | src/xz/hardware.c | 32 | ||||
-rw-r--r-- | src/xz/xz.1 | 21 |
2 files changed, 51 insertions, 2 deletions
diff --git a/src/xz/hardware.c b/src/xz/hardware.c index dca9f5dd..7cb33582 100644 --- a/src/xz/hardware.c +++ b/src/xz/hardware.c @@ -68,9 +68,39 @@ hardware_memlimit_set(uint64_t new_memlimit, new_memlimit = (uint32_t)new_memlimit * total_ram / 100; } - if (set_compress) + if (set_compress) { memlimit_compress = new_memlimit; +#if SIZE_MAX == UINT32_MAX + // FIXME? + // + // When running a 32-bit xz on a system with a lot of RAM and + // using a percentage-based memory limit, the result can be + // bigger than the 32-bit address space. Limiting the limit + // below SIZE_MAX for compression (not decompression) makes + // xz lower the compression settings (or number of threads) + // to a level that *might* work. In practice it has worked + // when using a 64-bit kernel that gives full 4 GiB address + // space to 32-bit programs. In other situations this might + // still be too high, like 32-bit kernels that may give much + // less than 4 GiB to a single application. + // + // So this is an ugly hack but I will keep it here while + // it does more good than bad. + // + // Use a value less than SIZE_MAX so that there's some room + // for the xz program and so on. Don't use 4000 MiB because + // it could look like someone mixed up base-2 and base-10. + const uint64_t limit_max = UINT64_C(4020) << 20; + + // UINT64_MAX is a special case for the string "max" so + // that has to be handled specially. + if (memlimit_compress != UINT64_MAX + && memlimit_compress > limit_max) + memlimit_compress = limit_max; +#endif + } + if (set_decompress) memlimit_decompress = new_memlimit; diff --git a/src/xz/xz.1 b/src/xz/xz.1 index 6b949640..540d1364 100644 --- a/src/xz/xz.1 +++ b/src/xz/xz.1 @@ -5,7 +5,7 @@ .\" This file has been put into the public domain. .\" You can do whatever you want with this file. .\" -.TH XZ 1 "2019-05-11" "Tukaani" "XZ Utils" +.TH XZ 1 "2020-02-01" "Tukaani" "XZ Utils" . .SH NAME xz, unxz, xzcat, lzma, unlzma, lzcat \- Compress or decompress .xz and .lzma files @@ -1005,6 +1005,25 @@ instead of until the details have been decided. .RE .IP "" +For 32-bit +.BR xz +there is a special case: if the +.I limit +would be over +.BR "4020\ MiB" , +the +.I limit +is set to +.BR "4020\ MiB" . +(The values +.B 0 +and +.B max +aren't affected by this. +A similar feature doesn't exist for decompression.) +This can be helpful when a 32-bit executable has access +to 4\ GiB address space while hopefully doing no harm in other situations. +.IP "" See also the section .BR "Memory usage" . .TP |