xz.git - XZ Utils

Age	Commit message (Collapse)	Author	Files	Lines
2023-03-11	liblzma: Avoid null pointer + 0 (undefined behavior in C).	Lasse Collin	1	-2/+4
	In the C99 and C17 standards, section 6.5.6 paragraph 8 means that adding 0 to a null pointer is undefined behavior. As of writing, "clang -fsanitize=undefined" (Clang 15) diagnoses this. However, I'm not aware of any compiler that would take advantage of this when optimizing (Clang 15 included). It's good to avoid this anyway since compilers might some day infer that pointer arithmetic implies that the pointer is not NULL. That is, the following foo() would then unconditionally return 0, even for foo(NULL, 0): void bar(char a, char b); int foo(char *a, size_t n) { bar(a, a + n); return a == NULL; } In contrast to C, C++ explicitly allows null pointer + 0. So if the above is compiled as C++ then there is no undefined behavior in the foo(NULL, 0) call. To me it seems that changing the C standard would be the sane thing to do (just add one sentence) as it would ensure that a huge amount of old code won't break in the future. Based on web searches it seems that a large number of codebases (where null pointer + 0 occurs) are being fixed instead to be future-proof in case compilers will some day optimize based on it (like making the above foo(NULL, 0) return 0) which in the worst case will cause security bugs. Some projects don't plan to change it. For example, gnulib and thus many GNU tools currently require that null pointer + 0 is defined: https://lists.gnu.org/archive/html/bug-gnulib/2021-11/msg00000.html https://www.gnu.org/software/gnulib/manual/html_node/Other-portability-assumptions.html In XZ Utils null pointer + 0 issue should be fixed after this commit. This adds a few if-statements and thus branches to avoid null pointer + 0. These check for size > 0 instead of ptr != NULL because this way bugs where size > 0 && ptr == NULL will likely get caught quickly. None of them are in hot spots so it shouldn't matter for performance. A little less readable version would be replacing ptr + offset with offset != 0 ? ptr + offset : ptr or creating a macro for it: #define my_ptr_add(ptr, offset) \ ((offset) != 0 ? ((ptr) + (offset)) : (ptr)) Checking for offset != 0 instead of ptr != NULL allows GCC >= 8.1, Clang >= 7, and Clang-based ICX to optimize it to the very same code as ptr + offset. That is, it won't create a branch. So for hot code this could be a good solution to avoid null pointer + 0. Unfortunately other compilers like ICC 2021 or MSVC 19.33 (VS2022) will create a branch from my_ptr_add(). Thanks to Marcin Kowalczyk for reporting the problem: https://github.com/tukaani-project/xz/issues/36
2019-07-13	liblzma: Avoid memcpy(NULL, foo, 0) because it is undefined behavior.	Lasse Collin	1	-1/+9
	I should have always known this but I didn't. Here is an example as a reminder to myself: int mycopy(void dest, void src, size_t n) { memcpy(dest, src, n); return dest == NULL; } In the example, a compiler may assume that dest != NULL because passing NULL to memcpy() would be undefined behavior. Testing with GCC 8.2.1, mycopy(NULL, NULL, 0) returns 1 with -O0 and -O1. With -O2 the return value is 0 because the compiler infers that dest cannot be NULL because it was already used with memcpy() and thus the test for NULL gets optimized out. In liblzma, if a null-pointer was passed to memcpy(), there were no checks for NULL after the memcpy() call, so I cautiously suspect that it shouldn't have caused bad behavior in practice, but it's hard to be sure, and the problematic cases had to be fixed anyway. Thanks to Jeffrey Walton.
2016-12-28	liblzma: Avoid multiple definitions of lzma_coder structures.	Lasse Collin	1	-28/+33
	Only one definition was visible in a translation unit. It avoided a few casts and temp variables but seems that this hack doesn't work with link-time optimizations in compilers as it's not C99/C11 compliant. Fixes: http://www.mail-archive.com/xz-devel@tukaani.org/msg00279.html
2012-07-17	liblzma: Make the use of lzma_allocator const-correct.	Lasse Collin	1	-5/+5
	There is a tiny risk of causing breakage: If an application assigns lzma_stream.allocator to a non-const pointer, such code won't compile anymore. I don't know why anyone would do such a thing though, so in practice this shouldn't cause trouble. Thanks to Jan Kratochvil for the patch.
2012-05-28	liblzma: Fix possibility of incorrect LZMA_BUF_ERROR.	Lasse Collin	1	-1/+1
	lzma_code() could incorrectly return LZMA_BUF_ERROR if all of the following was true: - The caller knows how many bytes of output to expect and only provides that much output space. - When the last output bytes are decoded, the caller-provided input buffer ends right before the LZMA2 end of payload marker. So LZMA2 won't provide more output anymore, but it won't know it yet and thus won't return LZMA_STREAM_END yet. - A BCJ filter is in use and it hasn't left any unfiltered bytes in the temp buffer. This can happen with any BCJ filter, but in practice it's more likely with filters other than the x86 BCJ. Another situation where the bug can be triggered happens if the uncompressed size is zero bytes and no output space is provided. In this case the decompression can fail even if the whole input file is given to lzma_code(). A similar bug was fixed in XZ Embedded on 2011-09-19.
2012-04-19	liblzma: Remove outdated comments.	Lasse Collin	1	-3/+0

2011-05-17	Add underscores to attributes (__attribute((__foo__))).	Lasse Collin	1	-1/+1

2010-02-12	Collection of language fixes to comments and docs.	Lasse Collin	1	-1/+1
	Thanks to Jonathan Nieder.
2009-11-14	Fix a design error in liblzma API.	Lasse Collin	1	-0/+12
	Originally the idea was that using LZMA_FULL_FLUSH with Stream encoder would read the filter chain from the same array that was used to intialize the Stream encoder. Since most apps wouldn't use LZMA_FULL_FLUSH, most apps wouldn't need to keep the filter chain available after initializing the Stream encoder. However, due to my mistake, it actually required keeping the array always available. Since setting the new filter chain via the array used at initialization time is not a nice way to do it for a couple of reasons, this commit ditches it and introduces lzma_filters_update(). This new function replaces also the "persistent" flag used by LZMA2 (and to-be-designed Subblock filter), which was also an ugly thing to do. Thanks to Alexey Tourbin for reminding me about the problem that Stream encoder used to require keeping the filter chain allocated.
2009-07-10	BCJ filters: Reject invalid start offsets with LZMA_OPTIONS_ERROR.	Lasse Collin	1	-1/+4
	This is a quick and slightly dirty fix to make the code conform to the latest file format specification. Without this patch, it's possible to make corrupt files by specifying start offset that is not a multiple of the filter's alignment. Custom start offset is almost never used, so this was only a minor bug. The xz command line tool doesn't validate the start offset, so one will get a bit unclear error message if trying to use an invalid start offset.
2009-04-13	Put the interesting parts of XZ Utils into the public domain.	Lasse Collin	1	-10/+3
	Some minor documentation cleanups were made at the same time.
2008-12-31	Renamed lzma_options_simple to lzma_options_bcj in the API.	Lasse Collin	1	-1/+1
	The internal implementation is still using the name "simple". It may need some cleanups, so I look at it later.
2008-09-13	Renamed constants:	Lasse Collin	1	-1/+1
	- LZMA_VLI_VALUE_MAX -> LZMA_VLI_MAX - LZMA_VLI_VALUE_UNKNOWN -> LZMA_VLI_UNKNOWN - LZMA_HEADER_ERRRO -> LZMA_OPTIONS_ERROR
2008-08-28	Sort of garbage collection commit. :-\| Many things are still	Lasse Collin	1	-4/+4
	broken. API has changed a lot and it will still change a little more here and there. The command line tool doesn't have all the required changes to reflect the API changes, so it's easy to get "internal error" or trigger assertions.
2008-06-18	Update the code to mostly match the new simpler file format	Lasse Collin	1	-25/+4
	specification. Simplify things by removing most of the support for known uncompressed size in most places. There are some miscellaneous changes here and there too. The API of liblzma has got many changes and still some more will be done soon. While most of the code has been updated, some things are not fixed (the command line tool will choke with invalid filter chain, if nothing else). Subblock filter is somewhat broken for now. It will be updated once the encoded format of the Subblock filter has been decided.
2008-01-26	Return LZMA_HEADER_ERROR if LZMA_SYNC_FLUSH is used with any	Lasse Collin	1	-0/+8
	of the so called simple filters. If there is demand, limited support for LZMA_SYNC_FLUSH may be added in future. After this commit, using LZMA_SYNC_FLUSH shouldn't cause undefined behavior in any situation.
2007-12-11	Remove uncompressed size tracking from the filter encoders.	Lasse Collin	1	-25/+4
	It's not strictly needed there, and just complicates the code. LZ encoder never even had this feature. The primary reason to have uncompressed size tracking in filter encoders was validating that the application doesn't give different amount of input that it had promised. A side effect was to validate internal workings of liblzma. Uncompressed size tracking is still present in the Block encoder. Maybe it should be added to LZMA_Alone and raw encoders too. It's simpler to have one coder just to validate the uncompressed size instead of having it in every filter.
2007-12-09	Imported to git.	Lasse Collin	1	-0/+306