aboutsummaryrefslogtreecommitdiff
path: root/src/scripts/xzgrep.in (follow)
AgeCommit message (Collapse)AuthorFilesLines
2022-11-11Scripts: Ignore warnings from xz.Lasse Collin1-1/+1
In practice this means making the scripts work when the input files have an unsupported check type which isn't a problem in practice unless support for some check types has been disabled at build time.
2022-09-17xzgrep: Fix compatibility with old shells.Lasse Collin1-3/+3
Running the current xzgrep on Slackware 10.1 with GNU bash 3.00.15: xzgrep: line 231: syntax error near unexpected token `;;' On SCO OpenServer 5.0.7 with Korn Shell 93r: syntax error at line 231 : `;;' unexpected Turns out that some old shells don't like apostrophes (') inside command substitutions. For example, the following fails: x=$(echo foo # asdf'zxcv echo bar) printf '%s\n' "$x" The problem was introduced by commits 69d1b3fc29677af8ade8dc15dba83f0589cb63d6 (2022-03-29), bd7b290f3fe4faeceb7d3497ed9bf2e6ed5e7dc5 (2022-07-18), and a648978b20495b7aa4a8b029c5a810b5ad9d08ff (2022-07-19). 5.2.6 is the only stable release that included this problem. Thanks to Kevin R. Bulgrien for reporting the problem on SCO OpenServer 5.0.7 and for providing the fix.
2022-07-24xzgrep: Improve error handling, especially signals.Lasse Collin1-19/+53
xzgrep wouldn't exit on SIGPIPE or SIGQUIT when it clearly should have. It's quite possible that it's not perfect still but at least it's much better. If multiple exit statuses compete, now it tries to pick the largest of value. Some comments were added. The exit status handling of signals is still broken if the shell uses values larger than 255 in $? to indicate that a process died due to a signal ***and*** their "exit" command doesn't take this into account. This seems to work well with the ksh and yash versions I tried. However, there is a report in gzip/zgrep that OpenSolaris 5.11 (not 5.10) has a problem with "exit" truncating the argument to 8 bits: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=22900#25 Such a bug would break xzgrep but I didn't add a workaround at least for now. 5.11 is old and I don't know if the problem exists in modern descendants, or if the problem exists in other ksh implementations in use.
2022-07-24xzgrep: Make the fix for ZDI-CAN-16587 more robust.Lasse Collin1-1/+4
I don't know if this can make a difference in the real world but it looked kind of suspicious (what happens with sed implementations that cannot process very long lines?). At least this commit shouldn't make it worse.
2022-07-24xzgrep: Use grep -H --label when available (GNU, *BSDs).Lasse Collin1-0/+21
It avoids the use of sed for prefixing filenames to output lines. Using sed for that is slower and prone to security bugs so now the sed method is only used as a fallback. This also fixes an actual bug: When grepping a binary file, GNU grep nowadays prints its diagnostics to stderr instead of stdout and thus the sed-method for prefixing the filename doesn't work. So with this commit grepping binary files gives reasonable output with GNU grep now. This was inspired by zgrep but the implementation is different.
2022-07-24xzgrep: Use -e to specify the pattern to grep.Lasse Collin1-8/+4
Now we don't need the separate test for adding the -q option as it can be added directly in the two places where it's needed.
2022-07-24Scripts: Use printf instead of echo in a few places.Lasse Collin1-2/+2
It's a good habbit as echo has some portability corner cases when the string contents can be anything.
2022-07-24xzgrep: Add more LC_ALL=C to avoid bugs with multibyte characters.Lasse Collin1-6/+8
Also replace one use of expr with printf. The rationale for LC_ALL=C was already mentioned in 69d1b3fc29677af8ade8dc15dba83f0589cb63d6 that fixed a security issue. However, unrelated uses weren't changed in that commit yet. POSIX says that with sed and such tools one should use LC_ALL=C to ensure predictable behavior when strings contain byte sequences that aren't valid multibyte characters in the current locale. See under "Application usage" in here: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/sed.html With GNU sed invalid multibyte strings would work without this; it's documented in its Texinfo manual. Some other implementations aren't so forgiving.
2022-07-24xzgrep: Fix parsing of certain options.Lasse Collin1-2/+17
Fix handling of "xzgrep -25 foo" (in GNU grep "grep -25 foo" is an alias for "grep -C25 foo"). xzgrep would treat "foo" as filename instead of as a pattern. This bug was fixed in zgrep in gzip in 2012. Add -E, -F, -G, and -P to the "no argument required" list. Add -X to "argument required" list. It is an intentionally-undocumented GNU grep option so this isn't an important option for xzgrep but it seems that other grep implementations (well, those that I checked) don't support -X so I hope this change is an improvement still. grep -d (grep --directories=ACTION) requires an argument. In contrast to zgrep, I kept -d in the "no argument required" list because it's not supported in xzgrep (or zgrep). This way "xzgrep -d" gives an error about option being unsupported instead of telling that it requires an argument. Both zgrep and xzgrep tell that it's unsupported if an argument is specified. Add comments.
2022-07-12xzgrep: Fix escaping of malicious filenames (ZDI-CAN-16587).Lasse Collin1-8/+12
Malicious filenames can make xzgrep to write to arbitrary files or (with a GNU sed extension) lead to arbitrary code execution. xzgrep from XZ Utils versions up to and including 5.2.5 are affected. 5.3.1alpha and 5.3.2alpha are affected as well. This patch works for all of them. This bug was inherited from gzip's zgrep. gzip 1.12 includes a fix for zgrep. The issue with the old sed script is that with multiple newlines, the N-command will read the second line of input, then the s-commands will be skipped because it's not the end of the file yet, then a new sed cycle starts and the pattern space is printed and emptied. So only the last line or two get escaped. One way to fix this would be to read all lines into the pattern space first. However, the included fix is even simpler: All lines except the last line get a backslash appended at the end. To ensure that shell command substitution doesn't eat a possible trailing newline, a colon is appended to the filename before escaping. The colon is later used to separate the filename from the grep output so it is fine to add it here instead of a few lines later. The old code also wasn't POSIX compliant as it used \n in the replacement section of the s-command. Using \<newline> is the POSIX compatible method. LC_ALL=C was added to the two critical sed commands. POSIX sed manual recommends it when using sed to manipulate pathnames because in other locales invalid multibyte sequences might cause issues with some sed implementations. In case of GNU sed, these particular sed scripts wouldn't have such problems but some other scripts could have, see: info '(sed)Locale Considerations' This vulnerability was discovered by: cleemy desu wayo working with Trend Micro Zero Day Initiative Thanks to Jim Meyering and Paul Eggert discussing the different ways to fix this and for coordinating the patch release schedule with gzip.
2022-07-12xzgrep: use `grep -E/-F` instead of `egrep` and `fgrep`Ville Skyttä1-2/+2
`egrep` and `fgrep` have been deprecated in GNU grep since 2007, and in current post 3.7 Git they have been made to emit obsolescence warnings: https://git.savannah.gnu.org/cgit/grep.git/commit/?id=a9515624709865d480e3142fd959bccd1c9372d1
2022-07-12Scripts: Fix exit status of xzgrep.Lasse Collin1-7/+13
Omit the -q option from xz, gzip, and bzip2. With xz this shouldn't matter. With gzip it's important because -q makes gzip replace SIGPIPE with exit status 2. With bzip2 it's important because with -q bzip2 is completely silent if input is corrupt while other decompressors still give an error message. Avoiding exit status 2 from gzip is important because bzip2 uses exit status 2 to indicate corrupt input. Before this commit xzgrep didn't recognize corrupt .bz2 files because xzgrep was treating exit status 2 as SIGPIPE for gzip compatibility. zstd still needs -q because otherwise it is noisy in normal operation. The code to detect real SIGPIPE didn't check if the exit status was due to a signal (>= 128) and so could ignore some other exit status too.
2022-07-12Scripts: Add zstd support to xzgrep.Adam Borowski1-0/+1
Thanks to Adam Borowski.
2019-12-31Scripts: Put /usr/xpg4/bin to the beginning of PATH on Solaris.Lasse Collin1-0/+1
This adds a configure option --enable-path-for-scripts=PREFIX which defaults to empty except on Solaris it is /usr/xpg4/bin to make POSIX grep and others available. The Solaris case had been documented in INSTALL with a manual fix but it's better to do this automatically since it is needed on most Solaris systems anyway. Thanks to Daniel Richard G.
2019-07-13spellingAntoine Cœur1-1/+1
2014-10-09xzgrep: Avoid passing both -q and -l to grep.Lasse Collin1-2/+4
The behavior of grep -ql varies: - GNU grep behaves like grep -q. - OpenBSD grep behaves like grep -l. POSIX doesn't make it 100 % clear what behavior is expected. Anyway, using both -q and -l at the same time makes no sense so both options simply should never be used at the same time. Thanks to Christian Weisgerber.
2014-06-11xzgrep: exit 0 when at least one file matches.Lasse Collin1-2/+13
Mimic the original grep behavior and return exit_success when at least one xz compressed file matches given pattern. Original bugreport: https://bugzilla.redhat.com/show_bug.cgi?id=1108085 Thanks to Pavel Raiskup for the patch.
2013-04-05xzgrep: make the '-h' option to be --no-filename equivalentJeff Bastian1-1/+1
* src/scripts/xzgrep.in: Accept the '-h' option in argument parsing.
2012-02-22Fix exit status of xzgrep when grepping binary files.Lasse Collin1-1/+2
When grepping binary files, grep may exit before it has read all the input. In this case, gzip -q returns 2 (eating SIGPIPE), but xz and bzip2 show SIGPIPE as the exit status (e.g. 141). This causes wrong exit status when grepping xz- or bzip2-compressed binary files. The fix checks for the special exit status that indicates SIGPIPE. It uses kill -l which should be supported everywhere since it is in both SUSv2 (1997) and POSIX.1-2008. Thanks to James Buren for the bug report.
2011-04-18xzgrep: fix typo in $0 parsingMartin Väth1-2/+2
Reported-by: Diego Elio Pettenò <flameeyes@gentoo.org> Signed-off-by: Martin Väth <vaeth@mathematik.uni-wuerzburg.de> Signed-off-by: Mike Frysinger <vapier@gentoo.org>
2011-03-24Scripts: Better fix for xzgrep.Lasse Collin1-2/+6
Now it uses "grep -q". Thanks to Gregory Margo.
2011-03-24Scripts: Fix xzgrep -l.Lasse Collin1-2/+2
It didn't work at all. It tried to use the -q option for grep, but it appended it after "--". This works around it by redirecting to /dev/null. The downside is that this can be slower with big files compared to proper use of "grep -q". Thanks to Gregory Margo.
2011-03-19Scripts: Add lzop (.lzo) support to xzdiff and xzgrep.Lasse Collin1-2/+3
2010-03-07Fix xzgrep to not break if filenames have spaces or quotes.Lasse Collin1-1/+1
Thanks to someone who reported the bug on IRC.
2009-07-05Major update to the xzgrep and other scripts based onLasse Collin1-0/+196
the latest versions found from gzip CVS repository. configure will try to find a POSIX shell to be used by the scripts. This should ease portability on systems which have pre-POSIX /bin/sh. xzgrep and xzdiff support .xz, .lzma, .gz, and .bz2 files. xzmore and xzless support only .xz and .lzma files. The name of the xz executable used in these scripts is now correct even if --program-transform-name has been used.