Diffstat (limited to 'doc/liblzma-hacking.txt')
-rw-r--r-- | doc/liblzma-hacking.txt | 112 |
1 files changed, 0 insertions, 112 deletions
diff --git a/doc/liblzma-hacking.txt b/doc/liblzma-hacking.txt
deleted file mode 100644
index 64390bcb..00000000
--- a/doc/liblzma-hacking.txt
+++ /dev/null
@@ -1,112 +0,0 @@
-
-Hacking liblzma
----------------
-
-0. Preface
-
-    This document gives some overall information about the internals of
-    liblzma, which should make it easier to start reading and modifying
-    the code.
-
-
-1. Programming language
-
-    liblzma was written in C99. If you use GCC, this means that you need
-    at least GCC 3.x.x. GCC 2 isn't and won't be supported.
-
-    Some GCC-specific extensions are used *conditionally*. They aren't
-    required to build a full-featured library. Don't make the code rely
-    on any non-standard compiler extensions or even C99 features that
-    aren't portable between almost-C99 compatible compilers (for example
-    non-static inlines).
-
-    The public API headers are in C89. This is to avoid frustrating those
-    who maintain programs, which are strictly in C89 or C++.
-
-    An assumption about sizeof(size_t) is made. If this assumption is
-    wrong, some porting is probably needed:
-
-        sizeof(uint32_t) <= sizeof(size_t) <= sizeof(uint64_t)
-
-
-2. Internal vs. external API
-
-
-
-    Input                              Output
-      v           Application            ^
-      |       liblzma public API         |
-      |          Stream coder            |
-      |           Block coder            |
-      |          Filter coder            |
-      |              ...                 |
-      v          Filter coder            ^
-
-
-    Application
-      `-- liblzma public API
-            `-- Stream coder
-                  |-- Stream info handler
-                  |-- Stream Header coder
-                  |-- Block Header coder
-                  |     `-- Filter Flags coder
-                  |-- Metadata coder
-                  |     `-- Block coder
-                  |           `-- Filter 0
-                  |                 `-- Filter 1
-                  |                       ...
-                  |-- Data Block coder
-                  |     `-- Filter 0
-                  |           `-- Filter 1
-                  |                 ...
-                  `-- Stream tail coder
-
-
-
-x. Designing new filters
-
-    All filters must be designed so that the decoder cannot consume
-    arbitrary amount input without producing any decoded output. Failing
-    to follow this rule makes liblzma vulnerable to DoS attacks if
-    untrusted files are decoded (usually they are untrusted).
-
-    An example should clarify the reason behind this requirement: There
-    are two filters in the chain. The decoder of the first filter produces
-    huge amount of output (many gigabytes or more) with a few bytes of
-    input, which gets passed to the decoder of the second filter. If the
-    data passed to the second filter is interpreted as something that
-    produces no output (e.g. padding), the filter chain as a whole
-    produces no output and consumes no input for a long period of time.
-
-    The above problem was present in the first versions of the Subblock
-    filter. A tiny .lzma file could have taken several years to decode
-    while it wouldn't produce any output at all. The problem was fixed
-    by adding limits for number of consecutive Padding bytes, and requiring
-    that some decoded output must be produced between Set Subfilter and
-    Unset Subfilter.
-
-
-x. Implementing new filters
-
-    If the filter supports embedding End of Payload Marker, make sure that
-    when your filter detects End of Payload Marker,
-      - the usage of End of Payload Marker is actually allowed (i.e. End
-        of Input isn't used); and
-      - it also checks that there is no more input coming from the next
-        filter in the chain.
-
-    The second requirement is slightly tricky. It's possible that the next
-    filter hasn't returned LZMA_STREAM_END yet. It may even need a few
-    bytes more input before it will do so. You need to give it as much
-    input as it needs, and verify that it doesn't produce any output.
-
-    Don't call the next filter in the chain after it has returned
-    LZMA_STREAM_END (except in encoder if action == LZMA_SYNC_FLUSH).
-    It will result undefined behavior.
-
-    Be pedantic. If the input data isn't exactly valid, reject it.
-
-    At the moment, liblzma isn't modular. You will need to edit several
-    files in src/liblzma/common to include support for a new filter. grep
-    for LZMA_FILTER_LZMA to locate the files needing changes.
-
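
Note on the removed "Designing new filters" text: the fix it describes for the
Subblock filter (a limit on consecutive Padding bytes) can be illustrated with
a small toy decoder. This is only a sketch under stated assumptions, not code
from liblzma: every name here (toy_decode, toy_ret, TOY_PADDING_BYTE,
TOY_PADDING_MAX) is hypothetical, and the padding value and the limit of 4 are
picked arbitrarily for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; none of this is liblzma API. */
enum toy_ret { TOY_OK, TOY_DATA_ERROR };

#define TOY_PADDING_BYTE 0x00 /* assumed byte that decodes to no output */
#define TOY_PADDING_MAX  4    /* assumed limit on consecutive padding bytes */

/*
 * Copy non-padding bytes from in[] to out[]. The input is rejected as
 * soon as more than TOY_PADDING_MAX padding bytes appear in a row, so
 * the amount of input that can be consumed without producing any
 * output is bounded.
 */
static enum toy_ret
toy_decode(const uint8_t *in, size_t in_size,
		uint8_t *out, size_t *out_pos, size_t out_size)
{
	size_t padding_run = 0;
	size_t i;

	for (i = 0; i < in_size; ++i) {
		if (in[i] == TOY_PADDING_BYTE) {
			/* Padding produces no output; bound the run length. */
			if (++padding_run > TOY_PADDING_MAX)
				return TOY_DATA_ERROR;

			continue;
		}

		/* Toy simplification: a full output buffer is an error. */
		if (*out_pos == out_size)
			return TOY_DATA_ERROR;

		out[(*out_pos)++] = in[i];
		padding_run = 0; /* output was produced; reset the counter */
	}

	return TOY_OK;
}
```

With these assumed values, the input 01 00 00 02 decodes to two bytes, while an
input with five padding bytes in a row is rejected instead of being consumed
silently. A real filter would also have to carry the counter across calls,
which this sketch leaves out.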