author | Lasse Collin <lasse.collin@tukaani.org> | 2009-05-01 11:28:52 +0300 |
---|---|---|
committer | Lasse Collin <lasse.collin@tukaani.org> | 2009-05-01 11:28:52 +0300 |
commit | be06858d5cf8ba46557395035d821dc332f3f830 (patch) | |
tree | 603491cf2b789dd19afd7f3cc6185873f1a36cb8 /doc/liblzma-intro.txt | |
parent | Added documentation about the legacy .lzma file format. (diff) | |
Remove docs that are too outdated to be updated
(rewrite will be better).
Diffstat (limited to 'doc/liblzma-intro.txt')
-rw-r--r-- | doc/liblzma-intro.txt | 194 |
1 files changed, 0 insertions, 194 deletions
diff --git a/doc/liblzma-intro.txt b/doc/liblzma-intro.txt
deleted file mode 100644
index 52c4d920..00000000
--- a/doc/liblzma-intro.txt
+++ /dev/null
@@ -1,194 +0,0 @@
-
-Introduction to liblzma
------------------------
-
-Writing applications to work with liblzma
-
-    liblzma API is split in several subheaders to improve readability and
-    maintainance. The subheaders must not be #included directly. lzma.h
-    requires that certain integer types and macros are available when
-    the header is #included. On systems that have inttypes.h that conforms
-    to C99, the following will work:
-
-        #include <sys/types.h>
-        #include <inttypes.h>
-        #include <lzma.h>
-
-    Those who have used zlib should find liblzma's API easy to use.
-    To developers who haven't used zlib before, I recommend learning
-    zlib first, because zlib has excellent documentation.
-
-    While the API is similar to that of zlib, there are some major
-    differences, which are summarized below.
-
-    For basic stream encoding, zlib has three functions (deflateInit(),
-    deflate(), and deflateEnd()). Similarly, there are three functions
-    for stream decoding (inflateInit(), inflate(), and inflateEnd()).
-    liblzma has only single coding and ending function. Thus, to
-    encode one may use, for example, lzma_stream_encoder_single(),
-    lzma_code(), and lzma_end(). Simlarly for decoding, one may
-    use lzma_auto_decoder(), lzma_code(), and lzma_end().
-
-    zlib has deflateReset() and inflateReset() to reset the stream
-    structure without reallocating all the memory. In liblzma, all
-    coder initialization functions are like zlib's reset functions:
-    the first-time initializations are done with the same functions
-    as the reinitializations (resetting).
-
-    To make all this work, liblzma needs to know when lzma_stream
-    doesn't already point to an allocated and initialized coder.
-    This is achieved by initializing lzma_stream structure with
-    LZMA_STREAM_INIT (static initialization) or LZMA_STREAM_INIT_VAR
-    (for exampple when new lzma_stream has been allocated with malloc()).
-    This initialization should be done exactly once per lzma_stream
-    structure to avoid leaking memory. Calling lzma_end() will leave
-    lzma_stream into a state comparable to the state achieved with
-    LZMA_STREAM_INIT and LZMA_STREAM_INIT_VAR.
-
-    Example probably clarifies a lot. With zlib, compression goes
-    roughly like this:
-
-        z_stream strm;
-        deflateInit(&strm, level);
-        deflate(&strm, Z_RUN);
-        deflate(&strm, Z_RUN);
-        ...
-        deflate(&strm, Z_FINISH);
-        deflateEnd(&strm) or deflateReset(&strm)
-
-    With liblzma, it's slightly different:
-
-        lzma_stream strm = LZMA_STREAM_INIT;
-        lzma_stream_encoder_single(&strm, &options);
-        lzma_code(&strm, LZMA_RUN);
-        lzma_code(&strm, LZMA_RUN);
-        ...
-        lzma_code(&strm, LZMA_FINISH);
-        lzma_end(&strm) or reinitialize for new coding work
-
-    Reinitialization in the last step can be any function that can
-    initialize lzma_stream; it doesn't need to be the same function
-    that was used for the previous initialization. If it is the same
-    function, liblzma will usually be able to re-use most of the
-    existing memory allocations (depends on how much the initialization
-    options change). If you reinitialize with different function,
-    liblzma will automatically free the memory of the previous coder.
-
-
-File formats
-
-    liblzma supports multiple container formats for the compressed data.
-    Different initialization functions initialize the lzma_stream to
-    process different container formats. See the details from the public
-    header files.
-
-    The following functions are the most commonly used:
-
-      - lzma_stream_encoder_single(): Encodes Single-Block Stream; this
-        the recommended format for most purporses.
-
-      - lzma_alone_encoder(): Useful if you need to encode into the
-        legacy LZMA_Alone format.
-
-      - lzma_auto_decoder(): Decoder that automatically detects the
-        file format; recommended when you decode compressed files on
-        disk, because this way compatibility with the legacy LZMA_Alone
-        format is transparent.
-
-      - lzma_stream_decoder(): Decoder for Single- and Multi-Block
-        Streams; this is good if you want to accept only .lzma Streams.
-
-
-Filters
-
-    liblzma supports multiple filters (algorithm implementations). The new
-    .lzma format supports filter-chain having up to seven filters. In the
-    filter chain, the output of one filter is input of the next filter in
-    the chain. The legacy LZMA_Alone format supports only one filter, and
-    that must always be LZMA.
-
-    General-purporse compression:
-
-        LZMA        The main algorithm of liblzma (surprise!)
-
-    Branch/Call/Jump filters for executables:
-
-        x86         This filter is known as BCJ in 7-Zip
-        IA64        IA-64 (Itanium)
-        PowerPC     Big endian PowerPC
-        ARM
-        ARM-Thumb
-        SPARC
-
-    Other filters:
-
-        Copy        Dummy filter that simply copies all the data
-                    from input to output.
-
-        Subblock    Multi-purporse filter, that can
-                      - embed End of Payload Marker if the previous
-                        filter in the chain doesn't support it; and
-                      - apply Subfilters, which filter only part
-                        of the same compressed Block in the Stream.
-
-    Branch/Call/Jump filters never change the size of the data. They
-    should usually be used as a pre-filter for some compression filter
-    like LZMA.
-
-
-Integrity checks
-
-    The .lzma Stream format uses CRC32 as the integrity check for
-    different file format headers. It is possible to omit CRC32 from
-    the Block Headers, but not from Stream Header. This is the reason
-    why CRC32 code cannot be disabled when building liblzma (in addition,
-    the LZMA encoder uses CRC32 for hashing, so that's another reason).
-
-    The integrity check of the actual data is calculated from the
-    uncompressed data. This check can be CRC32, CRC64, or SHA256.
-    It can also be omitted completely, although that usually is not
-    a good thing to do. There are free IDs left, so support for new
-    checks algorithms can be added later.
-
-
-API and ABI stability
-
-    The API and ABI of liblzma isn't stable yet, although no huge
-    changes should happen. One potential place for change is the
-    lzma_options_subblock structure.
-
-    In the 4.42.0alpha phase, the shared library version number won't
-    be updated even if ABI breaks. I don't want to track the ABI changes
-    yet. Just rebuild everything when you upgrade liblzma until we get
-    to the beta stage.
-
-
-Size of the library
-
-    While liblzma isn't huge, it is quite far from the smallest possible
-    LZMA implementation: full liblzma binary (with support for all
-    filters and other features) is way over 100 KiB, but the plain raw
-    LZMA decoder is only 5-10 KiB.
-
-    To decrease the size of the library, you can omit parts of the library
-    by passing certain options to the `configure' script. Disabling
-    everything but the decoders of the require filters will usually give
-    you a small enough library, but if you need a decoder for example
-    embedded in the operating system kernel, the code from liblzma probably
-    isn't suitable as is.
-
-    If you need a minimal implementation supporting .lzma Streams, you
-    may need to do partial rewrite. liblzma uses stateful API like zlib.
-    That increases the size of the library. Using callback API or even
-    simpler buffer-to-buffer API would allow smaller implementation.
-
-    LZMA SDK contains smaller LZMA decoder written in ANSI-C than
-    liblzma, so you may want to take a look at that code. However,
-    it doesn't (at least not yet) support the new .lzma Stream format.
-
-
-Documentation
-
-    There's no other documentation than the public headers and this
-    text yet. Real docs will be written some day, I hope.
-
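
The "Integrity checks" section of the removed text names CRC32 as the check used for the format headers, but does not show how such a check is computed. As a rough illustration only, here is a minimal bitwise CRC32 in C. It assumes the common IEEE 802.3 polynomial (reflected form 0xEDB88320), which the removed text does not name. The function name crc32_update is hypothetical, and this sketch is not liblzma's implementation.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of liblzma): bitwise CRC32.
 * Assumption: IEEE 802.3 polynomial in reflected form, 0xEDB88320.
 * Pass crc = 0 for the first chunk and the previous return value
 * for every following chunk.
 */
uint32_t crc32_update(uint32_t crc, const uint8_t *buf, size_t size)
{
    /* Undo the final inversion of the previous call (or start at ~0). */
    crc = ~crc;

    for (size_t i = 0; i < size; ++i) {
        crc ^= buf[i];

        /* Process the byte one bit at a time, least significant first. */
        for (int bit = 0; bit < 8; ++bit) {
            /* mask is all ones if the low bit is set, otherwise zero. */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }

    /* The final inversion is part of the CRC32 definition. */
    return ~crc;
}
```

Because each call undoes and reapplies the final inversion, crc32_update(crc32_update(0, a, n), b, m) gives the same value as one call over the concatenated data. For the ASCII string "123456789" this polynomial yields the well-known check value 0xCBF43926. A table-driven variant would be faster; the bitwise form is used here only because it is short.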