path: root/doc/faq.txt
author Lasse Collin <lasse.collin@tukaani.org> 2007-12-09 00:42:33 +0200
committer Lasse Collin <lasse.collin@tukaani.org> 2007-12-09 00:42:33 +0200
commit 5d018dc03549c1ee4958364712fb0c94e1bf2741
tree 1b211911fb33fddb3f04b77f99e81df23623ffc4
Imported to git.
Diffstat (limited to 'doc/faq.txt')
-rw-r--r-- doc/faq.txt 247
1 files changed, 247 insertions, 0 deletions
diff --git a/doc/faq.txt b/doc/faq.txt
new file mode 100644
index 00000000..d01cf91b
--- /dev/null
+++ b/doc/faq.txt
@@ -0,0 +1,247 @@
+
+LZMA Utils FAQ
+--------------
+
+ Copyright (C) 2007 Lasse Collin
+
+ Copying and distribution of this file, with or without modification,
+ are permitted in any medium without royalty provided the copyright
+ notice and this notice are preserved.
+
+
+Q: What are LZMA, LZMA Utils, lzma, .lzma, liblzma, LZMA SDK, LZMA_Alone,
+ 7-Zip and p7zip?
+
+A: LZMA stands for Lempel-Ziv-Markov chain-Algorithm. LZMA is the name
+ of the compression algorithm designed by Igor Pavlov. He is the author
+ of 7-Zip, which is a great LGPL'd compression tool for Microsoft
+ Windows operating systems. In addition to 7-Zip itself, LZMA SDK is
+ also available on the 7-Zip website. LZMA SDK contains LZMA
+ implementations in C++, Java and C#. The C++ version is the original
+ implementation, which is also used in 7-Zip itself.
+
+ Excluding the unrar plugin, 7-Zip is free software (free as in
+ freedom). Thanks to this, it was possible to port it to POSIX
+ platforms. The port was done and is maintained by myspace (TODO:
+ myspace's real name?). p7zip is a port of 7-Zip's command line version;
+ p7zip doesn't include 7-Zip's GUI.
+
+ In the POSIX world, users are used to the gzip and bzip2 command line
+ tools, and developers know the APIs of zlib and libbzip2. LZMA Utils
+ try to ease adoption of LZMA on free operating systems by providing a
+ compression library and a set of command line tools. The library,
+ liblzma, provides a zlib-like API that makes it easy to adopt LZMA
+ compression in existing applications. The main command line tool is
+ lzma, whose command line syntax is very similar to that of gzip and bzip2.
+
+ The original command line tool from LZMA SDK (lzma.exe) was found in
+ a directory called LZMA_Alone in the LZMA SDK. It used a simple header
+ format in .lzma files. This format was also used by LZMA Utils up to
+ and including 4.32.x. In LZMA Utils documentation, LZMA_Alone refers
+ to both the file format and the command line tool from LZMA SDK.
+
+ Because of various limitations of the LZMA_Alone file format, a new
+ file format was developed. Extending some existing format such as .gz
+ used by gzip was considered, but these formats were found to be too
+ limited. The filename suffix for the new .lzma format is `.lzma'. The
+ same suffix is also used for files in the LZMA_Alone format. To make
+ the transition to the new format as smooth as possible, LZMA Utils
+ support both the new and old formats transparently.
+
+ 7-Zip and LZMA SDK: <http://7-zip.org/>
+ p7zip: <http://p7zip.sourceforge.net/>
+ LZMA Utils: <http://tukaani.org/lzma/>
+
+
+Q: What LZMA implementations are available?
+
+A: LZMA SDK contains implementations in C++, Java and C#. The C++ version
+ is the original implementation which is part of 7-Zip. LZMA SDK
+ also contains a small LZMA decoder in C.
+
+ A port of LZMA SDK to Pascal was made by Alan Birtles
+ <http://www.birtles.org.uk/programming/>. It should work with
+ multiple Pascal programming language implementations.
+
+ LZMA Utils includes liblzma, which is directly based on LZMA SDK.
+ liblzma is written in C (C99, not C89). In contrast to the C++ callback
+ API used by LZMA SDK, liblzma uses a zlib-like stateful C API. I do not
+ want to comment on whether both/former/latter/neither API(s) are good
+ or bad. The only reason to implement a zlib-like API was that many
+ developers are already familiar with zlib, and very many applications
+ already use zlib. Having a similar API makes it easier to include LZMA
+ support in existing applications.
+
+ See also <http://en.wikipedia.org/wiki/LZMA#External_links>.
+
+
+Q: Which file formats are supported by LZMA Utils?
+
+A: Although the raw LZMA stream is always the same, it can be wrapped
+ in different container formats. The preferred format is the new .lzma
+ format. It has magic bytes (the first six bytes: 0xFF 'L' 'Z' 'M'
+ 'A' 0x00). The format supports chaining up to seven filters,
+ splitting data into multiple blocks for easier multi-threading, and
+ rough random-access reading. The file integrity is verified using CRC32,
+ CRC64, or SHA256, and by verifying the uncompressed size of the file.
+
+ LZMA SDK includes a tool called LZMA_Alone. It uses a
+ primitive header which includes only the mandatory stream information
+ required by the LZMA decoder. This format can be both read and
+ written by liblzma and the command line tool (use --format=alone to
+ create such files).
+
+ .7z is the native archive format used by 7-Zip. This format is not
+ supported by liblzma, and probably will never be supported. You
+ should use e.g. p7zip to extract .7z files.
+
+ It is possible to implement custom file formats by using raw filter
+ mode in liblzma. In this mode the application needs to store the filter
+ properties and provide them to liblzma before starting to uncompress
+ the data.
+
+
+Q: How can I identify files containing LZMA compressed data?
+
+A: The preferred filename suffix for .lzma files is `.lzma'. `.tar.lzma'
+ may be abbreviated to `.tlz'. The same suffixes are used for files in
+ LZMA_Alone format. In practice this should be no problem since tools
+ included in LZMA Utils support both formats transparently.
+
+ Checking the magic bytes is an easy way to detect files in the new .lzma
+ format (the first six bytes: 0xFF 'L' 'Z' 'M' 'A' 0x00). The "file"
+ command version FIXME contains magic strings for this format.
+
+ The old LZMA_Alone format has no magic bytes. Its header cannot contain
+ arbitrary bytes, thus it is possible to make a guess. Unfortunately the
+ guessing is usually too hard to be reliable, so don't try it unless you
+ are desperate.
+
+
+Q: Does the lzma command line tool support sparse files?
+
+A: Sparse files can (of course) be compressed like normal files, but
+ uncompression will not restore sparseness of the file. Use an archiver
+ tool to take care of sparseness before compressing the data with lzma.
+
+ The reason for this is that archiver tools handle files, while
+ compression tools handle streams or buffers. Being a sparse file is
+ a property of the file on the disk, not a property of the stream or
+ buffer.
+
+
+Q: Can I recover parts of a broken LZMA file (e.g. corrupted CD-R)?
+
+A: With LZMA_Alone and single-block .lzma files, you can uncompress the
+ file until you hit the first broken byte. The data after the broken
+ position is lost. LZMA relies on the uncompression history, and if
+ bytes are missing in the middle of the file, it is impossible to
+ reliably continue after the broken section.
+
+ With multi-block .lzma files it may be possible to locate the next
+ block in the file and continue decoding there. A limited recovery
+ tool for this kind of situation is planned.
+
+
+Q: Is LZMA patented?
+
+A: No, the authors are not aware of any patents that could affect LZMA.
+ However, due to the nature of software patents, the authors cannot
+ guarantee that LZMA isn't affected by any third-party patent.
+
+
+Q: Where can I find documentation about how LZMA works as an algorithm?
+
+A: Read the source code, Luke. There is no documentation about LZMA
+ internals. It is possible that Igor Pavlov is the only person on
+ Earth who completely knows and understands the algorithm.
+
+ You could begin by downloading LZMA SDK, and start reading from
+ the LZMA decoder to get some idea about the bitstream format.
+ Before you begin, you should know the basics of LZ77 and
+ range coding algorithms. LZMA is based on LZ77, but LZMA is
+ *a lot* more complex. Range coding is used to compress the
+ final bitstream like Huffman coding is used in Deflate.
+
+
+Q: What are filters?
+
+A: In the context of .lzma files, a filter means an implementation of a
+ compression algorithm. The primary filter is LZMA, which is why
+ the names of the tools contain the letters LZMA.
+
+ liblzma and the new .lzma format also support filters other than LZMA.
+ There are different types of filters, which are suitable for different
+ types of data. Thus, to select the optimal filter and settings, the
+ type of the input data being compressed needs to be known.
+
+ Some filters are most useful when combined with another filter like
+ LZMA. These filters increase redundancy in the data, without changing
+ the size of the data, by taking advantage of properties specific to
+ the data being compressed.
+
+ So far, all the filters are always reversible. That is, no matter what
+ data you pass to a filter encoder, it can always be defiltered back to
+ the original form. Because of this, it is safe to compress for example
+ a software package that contains other file types than executables
+ using a filter specific to the architecture of the package being
+ compressed.
+
+ The old LZMA_Alone format supports only the LZMA filter.
+
+
+Q: I cannot find BCJ and BCJ2 filters. Don't they exist in liblzma?
+
+A: The BCJ filter is called "x86" in liblzma. BCJ2 is not included,
+ because it requires using more than one encoded output stream.
+
+
+Q: Can I use LZMA in proprietary, non-free applications?
+
+A: liblzma is under the GNU LGPL version 2.1 or (at your option) any
+ later version. To summarise (*NOTE* This summary is not legally
+ binding, that is, it doesn't give you any extra permissions compared
+ to the LGPL. Read the GNU LGPL carefully for the exact license
+ conditions.):
+ * All the changes made to the library itself must be published
+ under the same license.
+ * End users must be able to replace the used liblzma. The easiest way
+ to ensure this is to link dynamically against liblzma so users
+ can replace the shared library file if they want.
+ * You must make it clear to your users that your application uses
+ liblzma, and that liblzma is free software under the GNU LGPL.
+ A copy of GNU LGPL must be included.
+
+ LZMA SDK contains a special exception which allows linking *unmodified*
+ code statically with a non-free application. This exception does *not*
+ apply to liblzma.
+
+ As an alternative, you can support the development of LZMA and 7-Zip
+ by buying a proprietary license from Igor Pavlov. See the homepage of
+ LZMA SDK <http://7-zip.org/sdk.html> for more information. Note that
+ having a proprietary license from Igor Pavlov doesn't allow you to use
+ liblzma in a way that conflicts with the GNU LGPL, because liblzma
+ contains code that is not copyrighted by Igor Pavlov. Please contact
+ both Lasse Collin and Igor Pavlov if the license conditions of liblzma
+ are not suitable for you.
+
+
+Q: I would like to help. What can I do?
+
+A: See the TODO file. Please contact Lasse Collin before starting to do
+ anything, because it is possible that someone else is already working
+ on the same thing.
+
+
+Q: How can I contact the authors?
+
+A: Lasse Collin is the maintainer of LZMA Utils. You can contact him
+ via IRC (Larhzu on #tukaani at Freenode or IRCnet). Email
+ should work too, <lasse.collin@tukaani.org>.
+
+ Igor Pavlov is the father of LZMA. He is the author of 7-Zip
+ and LZMA SDK. <http://7-zip.org/>
+
+ NOTE: Please don't bother Igor Pavlov with questions specific
+ to LZMA Utils.
+