path: root/doc/liblzma-hacking.txt
author Lasse Collin <lasse.collin@tukaani.org> 2009-05-01 11:28:52 +0300
committer Lasse Collin <lasse.collin@tukaani.org> 2009-05-01 11:28:52 +0300
commit be06858d5cf8ba46557395035d821dc332f3f830 (patch)
tree 603491cf2b789dd19afd7f3cc6185873f1a36cb8 /doc/liblzma-hacking.txt
parent Added documentation about the legacy .lzma file format.
download xz-be06858d5cf8ba46557395035d821dc332f3f830.tar.xz
Remove docs that are too outdated to be updated
(rewrite will be better).
Diffstat (limited to 'doc/liblzma-hacking.txt')
-rw-r--r-- doc/liblzma-hacking.txt 112
1 files changed, 0 insertions, 112 deletions
diff --git a/doc/liblzma-hacking.txt b/doc/liblzma-hacking.txt
deleted file mode 100644
index 64390bcb..00000000
--- a/doc/liblzma-hacking.txt
+++ /dev/null
@@ -1,112 +0,0 @@
-
-Hacking liblzma
----------------
-
-0. Preface
-
- This document gives some overall information about the internals of
- liblzma, which should make it easier to start reading and modifying
- the code.
-
-
-1. Programming language
-
- liblzma was written in C99. If you use GCC, this means that you need
- at least GCC 3.x.x. GCC 2 isn't and won't be supported.
-
- Some GCC-specific extensions are used *conditionally*. They aren't
- required to build a full-featured library. Don't make the code rely
- on any non-standard compiler extensions or even C99 features that
- aren't portable between almost-C99 compatible compilers (for example
- non-static inlines).
-
- The public API headers are in C89. This is to avoid frustrating those
- who maintain programs that are strictly in C89 or C++.
-
- An assumption about sizeof(size_t) is made. If this assumption is
- wrong, some porting is probably needed:
-
- sizeof(uint32_t) <= sizeof(size_t) <= sizeof(uint64_t)
-
-
-2. Internal vs. external API
-
-
-
- Input Output
- v Application ^
- | liblzma public API |
- | Stream coder |
- | Block coder |
- | Filter coder |
- | ... |
- v Filter coder ^
-
-
- Application
- `-- liblzma public API
- `-- Stream coder
- |-- Stream info handler
- |-- Stream Header coder
- |-- Block Header coder
- | `-- Filter Flags coder
- |-- Metadata coder
- | `-- Block coder
- | `-- Filter 0
- | `-- Filter 1
- | ...
- |-- Data Block coder
- | `-- Filter 0
- | `-- Filter 1
- | ...
- `-- Stream tail coder
-
-
-
-x. Designing new filters
-
- All filters must be designed so that the decoder cannot consume an
- arbitrary amount of input without producing any decoded output. Failing
- to follow this rule makes liblzma vulnerable to DoS attacks if
- untrusted files are decoded (usually they are untrusted).
-
- An example should clarify the reason behind this requirement: There
- are two filters in the chain. The decoder of the first filter produces a
- huge amount of output (many gigabytes or more) with a few bytes of
- input, which gets passed to the decoder of the second filter. If the
- data passed to the second filter is interpreted as something that
- produces no output (e.g. padding), the filter chain as a whole
- produces no output and consumes no input for a long period of time.
-
- The above problem was present in the first versions of the Subblock
- filter. A tiny .lzma file could have taken several years to decode
- while it wouldn't produce any output at all. The problem was fixed
- by limiting the number of consecutive Padding bytes and by requiring
- that some decoded output be produced between Set Subfilter and
- Unset Subfilter.
-
-
-x. Implementing new filters
-
- If the filter supports embedding End of Payload Marker, make sure that
- when your filter detects End of Payload Marker, it checks that
- - the usage of End of Payload Marker is actually allowed (i.e. End
- of Input isn't used); and
- - there is no more input coming from the next filter in the
- chain.
-
- The second requirement is slightly tricky. It's possible that the next
- filter hasn't returned LZMA_STREAM_END yet. It may even need a few
- bytes more input before it will do so. You need to give it as much
- input as it needs, and verify that it doesn't produce any output.
-
- Don't call the next filter in the chain after it has returned
- LZMA_STREAM_END (except in encoder if action == LZMA_SYNC_FLUSH).
- Doing so results in undefined behavior.
-
- Be pedantic. If the input data isn't exactly valid, reject it.
-
- At the moment, liblzma isn't modular. You will need to edit several
- files in src/liblzma/common to include support for a new filter. grep
- for LZMA_FILTER_LZMA to locate the files needing changes.
-
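
The removed "Designing new filters" section says the Subblock filter was fixed by limiting the number of consecutive Padding bytes. As a rough illustration of that idea only, here is a toy C99 decoder that rejects input once too many padding bytes arrive without any decoded output in between. Everything in it is hypothetical: the names toy_decoder, toy_decode, TOY_PADDING_BYTE and TOY_MAX_PADDING, the padding value, and the limit of 4 are assumptions made for this sketch and are not part of liblzma or of the Subblock filter format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical return codes for this sketch (not liblzma's lzma_ret). */
enum toy_ret { TOY_OK, TOY_BUF_ERROR, TOY_DATA_ERROR };

/* Assumed padding value and limit; both are illustrative only. */
#define TOY_PADDING_BYTE 0x00
#define TOY_MAX_PADDING  4

/* Decoder state: how many padding bytes have been seen in a row. */
struct toy_decoder {
	size_t padding_run;
};

/*
 * Copy every non-padding byte from in[] to out[]. Padding bytes produce
 * no output, so the length of a padding run is bounded: once more than
 * TOY_MAX_PADDING arrive back to back, the input is rejected instead of
 * being consumed indefinitely without producing output.
 */
enum toy_ret
toy_decode(struct toy_decoder *d, const uint8_t *in, size_t in_size,
		uint8_t *out, size_t *out_pos, size_t out_size)
{
	assert(d != NULL && in != NULL && out != NULL && out_pos != NULL);

	for (size_t i = 0; i < in_size; ++i) {
		if (in[i] == TOY_PADDING_BYTE) {
			/* No output for this byte; enforce the limit. */
			if (++d->padding_run > TOY_MAX_PADDING)
				return TOY_DATA_ERROR;
			continue;
		}

		/* Real output follows, so the run starts over. */
		d->padding_run = 0;

		if (*out_pos == out_size)
			return TOY_BUF_ERROR;

		out[(*out_pos)++] = in[i];
	}

	return TOY_OK;
}
```

Because the counter lives in the decoder state rather than in a local variable, the limit also holds when a padding run is split across several calls, which is how a streaming decoder would normally be fed.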