| '\" t |
| .\" |
| .\" Author: Lasse Collin |
| .\" |
| .\" This file has been put into the public domain. |
| .\" You can do whatever you want with this file. |
| .\" |
| .TH XZ 1 "2010-06-15" "Tukaani" "XZ Utils" |
| .SH NAME |
| xz, unxz, xzcat, lzma, unlzma, lzcat \- Compress or decompress .xz and .lzma files |
| .SH SYNOPSIS |
| .B xz |
| .RI [ option ]... |
| .RI [ file ]... |
| .PP |
| .B unxz |
| is equivalent to |
| .BR "xz \-\-decompress" . |
| .br |
| .B xzcat |
| is equivalent to |
| .BR "xz \-\-decompress \-\-stdout" . |
| .br |
| .B lzma |
| is equivalent to |
| .BR "xz \-\-format=lzma" . |
| .br |
| .B unlzma |
| is equivalent to |
| .BR "xz \-\-format=lzma \-\-decompress" . |
| .br |
| .B lzcat |
| is equivalent to |
| .BR "xz \-\-format=lzma \-\-decompress \-\-stdout" . |
| .PP |
| When writing scripts that need to decompress files, it is recommended to |
| always use the name |
| .B xz |
| with appropriate arguments |
| .RB ( "xz \-d" |
| or |
| .BR "xz \-dc" ) |
| instead of the names |
| .B unxz |
| and |
| .BR xzcat . |
| .SH DESCRIPTION |
| .B xz |
| is a general-purpose data compression tool with command line syntax similar to |
| .BR gzip (1) |
| and |
| .BR bzip2 (1). |
| The native file format is the |
| .B .xz |
| format, but also the legacy |
| .B .lzma |
| format and raw compressed streams with no container format headers |
| are supported. |
| .PP |
| .B xz |
| compresses or decompresses each |
| .I file |
| according to the selected operation mode. |
| If no |
| .I files |
| are given or |
| .I file |
| is |
| .BR \- , |
| .B xz |
| reads from standard input and writes the processed data to standard output. |
| .B xz |
| will refuse (display an error and skip the |
| .IR file ) |
| to write compressed data to standard output if it is a terminal. Similarly, |
| .B xz |
| will refuse to read compressed data from standard input if it is a terminal. |
| .PP |
| Unless |
| .B \-\-stdout |
| is specified, |
| .I files |
| other than |
| .B \- |
| are written to a new file whose name is derived from the source |
| .I file |
| name: |
| .IP \(bu 3 |
| When compressing, the suffix of the target file format |
| .RB ( .xz |
| or |
| .BR .lzma ) |
| is appended to the source filename to get the target filename. |
| .IP \(bu 3 |
| When decompressing, the |
| .B .xz |
| or |
| .B .lzma |
| suffix is removed from the filename to get the target filename. |
| .B xz |
| also recognizes the suffixes |
| .B .txz |
| and |
| .BR .tlz , |
| and replaces them with the |
| .B .tar |
| suffix. |
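| .PP |
| The naming rules above can be sketched in Python as follows. This is only an |
| illustration, not code taken from |
| .BR xz ; |
| the helper name |
| .B target_name |
| and its parameters are hypothetical, and the conditions that make |
| .B xz |
| skip a file are not modeled. |
| .nf |

```python
# Hypothetical helper; it mirrors only the naming rules described above.
SUFFIXES = {"xz": ".xz", "lzma": ".lzma"}
TAR_SHORTHANDS = {".txz": ".tar", ".tlz": ".tar"}


def target_name(source, decompress=False, fmt="xz"):
    """Return the output filename derived from `source`."""
    if not decompress:
        # Compressing: append the suffix of the target format.
        return source + SUFFIXES[fmt]
    # Decompressing: strip a plain suffix, or map .txz/.tlz to .tar.
    for suffix in SUFFIXES.values():
        if source.endswith(suffix):
            return source[: -len(suffix)]
    for suffix, replacement in TAR_SHORTHANDS.items():
        if source.endswith(suffix):
            return source[: -len(suffix)] + replacement
    raise ValueError("unknown suffix: " + source)


print(target_name("notes.txt"))
print(target_name("backup.txz", decompress=True))
```

| .fi |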
| .PP |
| If the target file already exists, an error is displayed and the |
| .I file |
| is skipped. |
| .PP |
| Unless writing to standard output, |
| .B xz |
| will display a warning and skip the |
| .I file |
| if any of the following applies: |
| .IP \(bu 3 |
| .I File |
| is not a regular file. Symbolic links are not followed, thus they |
| are not considered to be regular files. |
| .IP \(bu 3 |
| .I File |
| has more than one hard link. |
| .IP \(bu 3 |
| .I File |
| has setuid, setgid, or sticky bit set. |
| .IP \(bu 3 |
| The operation mode is set to compress, and the |
| .I file |
| already has a suffix of the target file format |
| .RB ( .xz |
| or |
| .B .txz |
| when compressing to the |
| .B .xz |
| format, and |
| .B .lzma |
| or |
| .B .tlz |
| when compressing to the |
| .B .lzma |
| format). |
| .IP \(bu 3 |
| The operation mode is set to decompress, and the |
| .I file |
| doesn't have a suffix of any of the supported file formats |
| .RB ( .xz , |
| .BR .txz , |
| .BR .lzma , |
| or |
| .BR .tlz ). |
| .PP |
| After successfully compressing or decompressing the |
| .IR file , |
| .B xz |
| copies the owner, group, permissions, access time, and modification time |
| from the source |
| .I file |
| to the target file. If copying the group fails, the permissions are modified |
| so that the target file doesn't become accessible to users who didn't have |
| permission to access the source |
| .IR file . |
| .B xz |
| doesn't support copying other metadata like access control lists |
| or extended attributes yet. |
| .PP |
| Once the target file has been successfully closed, the source |
| .I file |
| is removed unless |
| .B \-\-keep |
| was specified. The source |
| .I file |
| is never removed if the output is written to standard output. |
| .PP |
| Sending |
| .B SIGINFO |
| or |
| .B SIGUSR1 |
| to the |
| .B xz |
| process makes it print progress information to standard error. |
| This has only limited use since when standard error is a terminal, using |
| .B \-\-verbose |
| will display an automatically updating progress indicator. |
| .SS "Memory usage" |
| The memory usage of |
| .B xz |
| varies from a few hundred kilobytes to several gigabytes depending on |
| the compression settings. The settings used when compressing a file |
| affect also the memory usage of the decompressor. Typically the decompressor |
| needs only 5\ % to 20\ % of the amount of RAM that the compressor needed when |
| creating the file. Still, the worst-case memory usage of the decompressor |
| is several gigabytes. |
| .PP |
| To prevent uncomfortable surprises caused by huge memory usage, |
| .B xz |
| has a built-in memory usage limiter. While some operating systems provide |
| ways to limit the memory usage of processes, relying on them wasn't deemed |
| to be flexible enough. The default limit depends on the total amount of |
| physical RAM: |
| .IP \(bu 3 |
| If 40\ % of RAM is at least 80 MiB, 40\ % of RAM is used as the limit. |
| .IP \(bu 3 |
| If 80\ % of RAM is less than 80 MiB, 80\ % of RAM is used as the limit. |
| .IP \(bu 3 |
| Otherwise 80 MiB is used as the limit. |
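| .PP |
| The three rules above can be written as a short Python sketch. It only |
| restates the rules; the function name |
| .B default_memlimit |
| is hypothetical, and the integer rounding is an assumption. |
| .nf |

```python
MIB = 1024 * 1024


def default_memlimit(total_ram):
    """Default limit in bytes for a machine with `total_ram` bytes of RAM."""
    forty = total_ram * 40 // 100   # 40 % of RAM (rounding is an assumption)
    eighty = total_ram * 80 // 100  # 80 % of RAM
    if forty >= 80 * MIB:
        return forty
    if eighty < 80 * MIB:
        return eighty
    return 80 * MIB


for ram_mib in (50, 128, 1000):
    limit = default_memlimit(ram_mib * MIB)
    print(ram_mib, "MiB RAM ->", limit // MIB, "MiB limit")
```

| .fi |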
| .PP |
| When compressing, if the selected compression settings exceed the memory |
| usage limit, the settings are automatically adjusted downwards and a notice |
| about this is displayed. As an exception, if the memory usage limit is |
| exceeded when compressing with |
| .B \-\-format=raw |
| or |
| .BR \-\-no\-adjust , |
| an error is displayed and |
| .B xz |
| will exit with exit status |
| .BR 1 . |
| .PP |
| If source |
| .I file |
| cannot be decompressed without exceeding the memory usage limit, an error |
| message is displayed and the file is skipped. Note that compressed files |
| may contain many blocks, which may have been compressed with different |
| settings. Typically all blocks will have roughly the same memory requirements, |
| but it is possible that a block later in the file will exceed the memory usage |
| limit, so that the error about a too low memory usage limit is displayed only after some |
| data has already been decompressed. |
| .PP |
| The absolute value of the active memory usage limit can be seen with |
| .B \-\-info\-memory |
| or near the bottom of the output of |
| .BR \-\-long\-help . |
| The default limit can be overridden with |
| \fB\-\-memory=\fIlimit\fR. |
| .SS Concatenation and padding with .xz files |
| It is possible to concatenate |
| .B .xz |
| files as is. |
| .B xz |
| will decompress such files as if they were a single |
| .B .xz |
| file. |
| .PP |
| It is possible to insert padding between the concatenated parts |
| or after the last part. The padding must be null bytes and the size |
| of the padding must be a multiple of four bytes. This can be useful |
| if the |
| .B .xz |
| file is stored on a medium that stores file sizes |
| e.g. as 512-byte blocks. |
| .PP |
| Concatenation and padding are not allowed with |
| .B .lzma |
| files or raw streams. |
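| .PP |
| As an illustration of concatenation, the following sketch uses Python's |
| standard |
| .B lzma |
| module as a stand-in for the |
| .B xz |
| command (an assumption made only to keep the example self-contained). |
| Two independently created |
| .B .xz |
| streams are joined as is and decoded in one call. |
| .nf |

```python
import lzma

part1 = lzma.compress(b"hello, ")  # one complete .xz stream
part2 = lzma.compress(b"world")    # another complete .xz stream

# Concatenate the two streams as is, much like "cat a.xz b.xz".
joined = lzma.decompress(part1 + part2)
print(joined)
```

| .fi |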
| .SH OPTIONS |
| .SS "Integer suffixes and special values" |
| In most places where an integer argument is expected, an optional suffix |
| is supported to easily indicate large integers. There must be no space |
| between the integer and the suffix. |
| .TP |
| .B KiB |
| The integer is multiplied by 1,024 (2^10). Also |
| .BR Ki , |
| .BR k , |
| .BR kB , |
| .BR K , |
| and |
| .B KB |
| are accepted as synonyms for |
| .BR KiB . |
| .TP |
| .B MiB |
| The integer is multiplied by 1,048,576 (2^20). Also |
| .BR Mi , |
| .BR m , |
| .BR M , |
| and |
| .B MB |
| are accepted as synonyms for |
| .BR MiB . |
| .TP |
| .B GiB |
| The integer is multiplied by 1,073,741,824 (2^30). Also |
| .BR Gi , |
| .BR g , |
| .BR G , |
| and |
| .B GB |
| are accepted as synonyms for |
| .BR GiB . |
| .PP |
| A special value |
| .B max |
| can be used to indicate the maximum integer value supported by the option. |
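| .PP |
| A minimal Python sketch of the suffix rules described above. It is not the |
| parser used by |
| .BR xz ; |
| the names |
| .B parse_integer |
| and |
| .B maximum |
| are hypothetical, and the caller is assumed to supply the largest value the |
| option supports. |
| .nf |

```python
import string

# Build the suffix table from the synonyms listed above.
MULTIPLIERS = {}
for names, shift in (
    (("KiB", "Ki", "k", "kB", "K", "KB"), 10),
    (("MiB", "Mi", "m", "M", "MB"), 20),
    (("GiB", "Gi", "g", "G", "GB"), 30),
):
    for name in names:
        MULTIPLIERS[name] = 1 << shift


def parse_integer(text, maximum):
    """Parse an integer with an optional suffix; `maximum` backs "max"."""
    if text == "max":
        return maximum
    digits = text.rstrip(string.ascii_letters)
    suffix = text[len(digits):]
    # No space is allowed between the integer and the suffix.
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("not an integer: " + repr(text))
    if suffix and suffix not in MULTIPLIERS:
        raise ValueError("unknown suffix: " + repr(suffix))
    return int(digits) * MULTIPLIERS.get(suffix, 1)


print(parse_integer("80MiB", 2**64 - 1))
```

| .fi |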
| .SS "Operation mode" |
| If multiple operation mode options are given, the last one takes effect. |
| .TP |
| .BR \-z ", " \-\-compress |
| Compress. This is the default operation mode when no operation mode option |
| is specified, and no other operation mode is implied from the command name |
| (for example, |
| .B unxz |
| implies |
| .BR \-\-decompress ). |
| .TP |
| .BR \-d ", " \-\-decompress ", " \-\-uncompress |
| Decompress. |
| .TP |
| .BR \-t ", " \-\-test |
| Test the integrity of compressed |
| .IR files . |
| No files are created or removed. This option is equivalent to |
| .B "\-\-decompress \-\-stdout" |
| except that the decompressed data is discarded instead of being |
| written to standard output. |
| .TP |
| .BR \-l ", " \-\-list |
| List information about compressed |
| .IR files . |
| No uncompressed output is produced, and no files are created or removed. |
| In list mode, the program cannot read the compressed data from standard |
| input or from other unseekable sources. |
| .IP |
| The default listing shows basic information about |
| .IR files , |
| one file per line. To get more detailed information, use also the |
| .B \-\-verbose |
| option. For even more information, use |
| .B \-\-verbose |
| twice, but note that it may be slow, because getting all the extra |
| information requires many seeks. The width of verbose output exceeds |
| 80 characters, so piping the output to e.g. |
| .B "less\ \-S" |
| may be convenient if the terminal isn't wide enough. |
| .IP |
| The exact output may vary between |
| .B xz |
| versions and different locales. To get machine-readable output, |
| .B \-\-robot \-\-list |
| should be used. |
| .SS "Operation modifiers" |
| .TP |
| .BR \-k ", " \-\-keep |
| Keep (don't delete) the input files. |
| .TP |
| .BR \-f ", " \-\-force |
| This option has several effects: |
| .RS |
| .IP \(bu 3 |
| If the target file already exists, delete it before compressing or |
| decompressing. |
| .IP \(bu 3 |
| Compress or decompress even if the input is a symbolic link to a regular file, |
| has more than one hard link, or has setuid, setgid, or sticky bit set. |
| The setuid, setgid, and sticky bits are not copied to the target file. |
| .IP \(bu 3 |
| If combined with |
| .B \-\-decompress |
| .BR \-\-stdout |
| and |
| .B xz |
| doesn't recognize the type of the source file, |
| .B xz |
| will copy the source file as is to standard output. This allows using |
| .B xzcat |
| .B \-\-force |
| like |
| .BR cat (1) |
| for files that have not been compressed with |
| .BR xz . |
| Note that in future, |
| .B xz |
| might support new compressed file formats, which may make |
| .B xz |
| decompress more types of files instead of copying them as is to |
| standard output. |
| .BI \-\-format= format |
| can be used to restrict |
| .B xz |
| to decompress only a single file format. |
| .RE |
| .TP |
| .BR \-c ", " \-\-stdout ", " \-\-to\-stdout |
| Write the compressed or decompressed data to standard output instead of |
| a file. This implies |
| .BR \-\-keep . |
| .TP |
| .B \-\-no\-sparse |
| Disable creation of sparse files. By default, if decompressing into |
| a regular file, |
| .B xz |
| tries to make the file sparse if the decompressed data contains long |
| sequences of binary zeros. It works also when writing to standard output |
| as long as standard output is connected to a regular file, and certain |
| additional conditions are met to make it safe. Creating sparse files may |
| save disk space and speed up the decompression by reducing the amount of |
| disk I/O. |
| .TP |
| \fB\-S\fR \fI.suf\fR, \fB\-\-suffix=\fI.suf |
| When compressing, use |
| .I .suf |
| as the suffix for the target file instead of |
| .B .xz |
| or |
| .BR .lzma . |
| If not writing to standard output and the source file already has the suffix |
| .IR .suf , |
| a warning is displayed and the file is skipped. |
| .IP |
| When decompressing, recognize also files with the suffix |
| .I .suf |
| in addition to files with the |
| .BR .xz , |
| .BR .txz , |
| .BR .lzma , |
| or |
| .B .tlz |
| suffix. If the source file has the suffix |
| .IR .suf , |
| the suffix is removed to get the target filename. |
| .IP |
| When compressing or decompressing raw streams |
| .RB ( \-\-format=raw ), |
| the suffix must always be specified unless writing to standard output, |
| because there is no default suffix for raw streams. |
| .TP |
| \fB\-\-files\fR[\fB=\fIfile\fR] |
| Read the filenames to process from |
| .IR file ; |
| if |
| .I file |
| is omitted, filenames are read from standard input. Filenames must be |
| terminated with the newline character. A dash |
| .RB ( \- ) |
| is taken as a regular filename; it doesn't mean standard input. |
| If filenames are given also as command line arguments, they are |
| processed before the filenames read from |
| .IR file . |
| .TP |
| \fB\-\-files0\fR[\fB=\fIfile\fR] |
| This is identical to \fB\-\-files\fR[\fB=\fIfile\fR] except that the |
| filenames must be terminated with the null character. |
| .SS "Basic file format and compression options" |
| .TP |
| \fB\-F\fR \fIformat\fR, \fB\-\-format=\fIformat |
| Specify the file format to compress or decompress: |
| .RS |
| .IP \(bu 3 |
| .BR auto : |
| This is the default. When compressing, |
| .B auto |
| is equivalent to |
| .BR xz . |
| When decompressing, the format of the input file is automatically detected. |
| Note that raw streams (created with |
| .BR \-\-format=raw ) |
| cannot be auto-detected. |
| .IP \(bu 3 |
| .BR xz : |
| Compress to the |
| .B .xz |
| file format, or accept only |
| .B .xz |
| files when decompressing. |
| .IP \(bu 3 |
| .B lzma |
| or |
| .BR alone : |
| Compress to the legacy |
| .B .lzma |
| file format, or accept only |
| .B .lzma |
| files when decompressing. The alternative name |
| .B alone |
| is provided for backwards compatibility with LZMA Utils. |
| .IP \(bu 3 |
| .BR raw : |
| Compress or uncompress a raw stream (no headers). This is meant for advanced |
| users only. To decode raw streams, you need to set not only |
| .B \-\-format=raw |
| but also specify the filter chain, which would normally be stored in the |
| container format headers. |
| .RE |
| .TP |
| \fB\-C\fR \fIcheck\fR, \fB\-\-check=\fIcheck |
| Specify the type of the integrity check, which is calculated from the |
| uncompressed data. This option has an effect only when compressing into the |
| .B .xz |
| format; the |
| .B .lzma |
| format doesn't support integrity checks. |
| The integrity check (if any) is verified when the |
| .B .xz |
| file is decompressed. |
| .IP |
| Supported |
| .I check |
| types: |
| .RS |
| .IP \(bu 3 |
| .BR none : |
| Don't calculate an integrity check at all. This is usually a bad idea. This |
| can be useful when integrity of the data is verified by other means anyway. |
| .IP \(bu 3 |
| .BR crc32 : |
| Calculate CRC32 using the polynomial from IEEE-802.3 (Ethernet). |
| .IP \(bu 3 |
| .BR crc64 : |
| Calculate CRC64 using the polynomial from ECMA-182. This is the default, since |
| it is slightly better than CRC32 at detecting damaged files and the speed |
| difference is negligible. |
| .IP \(bu 3 |
| .BR sha256 : |
| Calculate SHA-256. This is somewhat slower than CRC32 and CRC64. |
| .RE |
| .IP |
| Integrity of the |
| .B .xz |
| headers is always verified with CRC32. It is not possible to change or |
| disable it. |
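| .IP |
| The effect of choosing a check type can be illustrated with Python's standard |
| .B lzma |
| module, used here only as a stand-in for the |
| .B xz |
| command. The sketch creates |
| .B .xz |
| data with CRC32 instead of the default and reads the check type back when |
| decoding. |
| .nf |

```python
import lzma

data = b"example payload " * 64

# Choose CRC32 instead of the default CRC64 when creating the .xz data.
packed = lzma.compress(data, format=lzma.FORMAT_XZ, check=lzma.CHECK_CRC32)

decoder = lzma.LZMADecompressor()
restored = decoder.decompress(packed)  # the check is verified while decoding
print(restored == data, decoder.check == lzma.CHECK_CRC32)
```

| .fi |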
| .TP |
| .BR \-0 " ... " \-9 |
| Select compression preset. If a preset level is specified multiple times, |
| the last one takes effect. |
| .IP |
| The compression preset levels can be divided roughly into three |
| categories: |
| .RS |
| .IP "\fB\-0\fR ... \fB\-2" |
| Fast presets with relatively low memory usage. |
| .B \-1 |
| and |
| .B \-2 |
| should give compression speed and ratios comparable to |
| .B "bzip2 \-1" |
| and |
| .BR "bzip2 \-9" , |
| respectively. |
| Currently |
| .B \-0 |
| is not very good (not much faster than |
| .B \-1 |
| but much worse compression). In future, |
| .B \-0 |
| may indicate some fast algorithm instead of LZMA2. |
| .IP "\fB\-3\fR ... \fB\-5" |
| Good compression ratio with low to medium memory usage. |
| These are significantly slower than levels 0\-2. |
| .IP "\fB\-6\fR ... \fB\-9" |
| Excellent compression with medium to high memory usage. These are also |
| slower than the lower preset levels. The default is |
| .BR \-6 . |
| Unless you want to maximize the compression ratio, you probably don't want |
| a higher preset level than |
| .B \-7 |
| due to speed and memory usage. |
| .RE |
| .IP |
| The exact compression settings (filter chain) used by each preset may |
| vary between |
| .B xz |
| versions. The settings may also vary between files being compressed, if |
| .B xz |
| determines that modified settings will probably give better compression |
| ratio without significantly affecting compression time or memory usage. |
| .IP |
| Because the settings may vary, the memory usage may vary too. The following |
| table lists the maximum memory usage of each preset level, which won't be |
| exceeded even in future versions of |
| .BR xz . |
| .IP |
| .B "FIXME: The table below is just a rough idea." |
| .RS |
| .RS |
| .TS |
| tab(;); |
| c c c |
| n n n. |
| Preset;Compression;Decompression |
| \-0;6 MiB;1 MiB |
| \-1;6 MiB;1 MiB |
| \-2;10 MiB;1 MiB |
| \-3;20 MiB;2 MiB |
| \-4;30 MiB;3 MiB |
| \-5;60 MiB;6 MiB |
| \-6;100 MiB;10 MiB |
| \-7;200 MiB;20 MiB |
| \-8;400 MiB;40 MiB |
| \-9;800 MiB;80 MiB |
| .TE |
| .RE |
| .RE |
| .IP |
| When compressing, |
| .B xz |
| automatically adjusts the compression settings downwards if |
| the memory usage limit would be exceeded, so it is safe to specify |
| a high preset level even on systems that don't have lots of RAM. |
| .TP |
| .BR \-\-fast " and " \-\-best |
| These are somewhat misleading aliases for |
| .B \-0 |
| and |
| .BR \-9 , |
| respectively. |
| These are provided only for backwards compatibility with LZMA Utils. |
| Avoid using these options. |
| .IP |
| The name of |
| .B \-\-best |
| is especially misleading, because the definition of best depends on the input data, |
| and because people usually don't want the very best compression ratio anyway, |
| since it would be very slow. |
| .TP |
| .BR \-e ", " \-\-extreme |
| Modify the compression preset (\fB\-0\fR ... \fB\-9\fR) so that a little bit |
| better compression ratio can be achieved without increasing memory usage |
| of the compressor or decompressor (exception: compressor memory usage may |
| increase a little with presets \fB\-0\fR ... \fB\-2\fR). The downside is that |
| the compression time will increase dramatically (it can easily double). |
| .TP |
| .B \-\-no\-adjust |
| Display an error and exit if the compression settings exceed |
| the memory usage limit. The default is to adjust the settings downwards so |
| that the memory usage limit is not exceeded. Automatic adjusting is |
| always disabled when creating raw streams |
| .RB ( \-\-format=raw ). |
| .TP |
| \fB\-M\fR \fIlimit\fR, \fB\-\-memory=\fIlimit |
| Set the memory usage limit. If this option is specified multiple times, |
| the last one takes effect. The |
| .I limit |
| can be specified in multiple ways: |
| .RS |
| .IP \(bu 3 |
| The |
| .I limit |
| can be an absolute value in bytes. Using an integer suffix like |
| .B MiB |
| can be useful. Example: |
| .B "\-\-memory=80MiB" |
| .IP \(bu 3 |
| The |
| .I limit |
| can be specified as a percentage of physical RAM. Example: |
| .B "\-\-memory=70%" |
| .IP \(bu 3 |
| The |
| .I limit |
| can be reset back to its default value by setting it to |
| .BR 0 . |
| See the section |
| .B "Memory usage" |
| for how the default limit is defined. |
| .IP \(bu 3 |
| The memory usage limiting can be effectively disabled by setting |
| .I limit |
| to |
| .BR max . |
| This isn't recommended. It's usually better to use, for example, |
| .BR \-\-memory=90% . |
| .RE |
| .IP |
| The current |
| .I limit |
| can be seen near the bottom of the output of the |
| .B \-\-long\-help |
| option. |
| .TP |
| \fB\-T\fR \fIthreads\fR, \fB\-\-threads=\fIthreads |
| Specify the maximum number of worker threads to use. The default is |
| the number of available CPU cores. You can see the current value of |
| .I threads |
| near the end of the output of the |
| .B \-\-long\-help |
| option. |
| .IP |
| The actual number of worker threads can be less than |
| .I threads |
| if using more threads would exceed the memory usage limit. |
| In addition to CPU-intensive worker threads, |
| .B xz |
| may use a few auxiliary threads, which don't use a lot of CPU time. |
| .IP |
| .B "Multithreaded compression and decompression are not implemented yet," |
| .B "so this option has no effect for now." |
| .SS Custom compressor filter chains |
| A custom filter chain allows specifying the compression settings in detail |
| instead of relying on the settings associated with the preset levels. |
| When a custom filter chain is specified, the compression preset level options |
| (\fB\-0\fR ... \fB\-9\fR and \fB\-\-extreme\fR) are silently ignored. |
| .PP |
| A filter chain is comparable to piping on the UN*X command line. |
| When compressing, the uncompressed input goes to the first filter, whose |
| output goes to the next filter (if any). The output of the last filter |
| gets written to the compressed file. The maximum number of filters in |
| the chain is four, but typically a filter chain has only one or two filters. |
| .PP |
| Many filters have limitations on where they can be placed in the filter chain: |
| some filters can work only as the last filter in the chain, some only |
| as a non-last filter, and some work in any position in the chain. Depending |
| on the filter, this limitation is either inherent to the filter design or |
| exists to prevent security issues. |
| .PP |
| A custom filter chain is specified by using one or more filter options in |
| the order they are wanted in the filter chain. That is, the order of filter |
| options is significant! When decoding raw streams |
| .RB ( \-\-format=raw ), |
| the filter chain is specified in the same order as it was specified when |
| compressing. |
| .PP |
| Filters take filter-specific |
| .I options |
| as a comma-separated list. Extra commas in |
| .I options |
| are ignored. Every option has a default value, so you need to |
| specify only those you want to change. |
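| .PP |
| As a sketch of a two-filter chain, the following Python example uses the |
| standard |
| .B lzma |
| module as a stand-in for the |
| .B xz |
| command. The chain corresponds roughly to |
| .BR "\-\-delta=dist=2 \-\-lzma2=preset=6" ; |
| the sample data is made up for the illustration. |
| .nf |

```python
import lzma

# Order matters: Delta first, LZMA2 last (LZMA2 must end the chain).
chain = [
    {"id": lzma.FILTER_DELTA, "dist": 2},
    {"id": lzma.FILTER_LZMA2, "preset": 6},
]

samples = bytes(range(0, 200, 2)) * 50  # made-up stand-in for sampled data
packed = lzma.compress(samples, format=lzma.FORMAT_XZ, filters=chain)

# The .xz headers record the chain, so decoding needs no filter options.
restored = lzma.decompress(packed)
print(restored == samples)
```

| .fi |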
| .TP |
| \fB\-\-lzma1\fR[\fB=\fIoptions\fR], \fB\-\-lzma2\fR[\fB=\fIoptions\fR] |
| Add LZMA1 or LZMA2 filter to the filter chain. These filters can be used |
| only as the last filter in the chain. |
| .IP |
| LZMA1 is a legacy filter, which is supported almost solely due to the legacy |
| .B .lzma |
| file format, which supports only LZMA1. LZMA2 is an updated |
| version of LZMA1 to fix some practical issues of LZMA1. The |
| .B .xz |
| format uses LZMA2, and doesn't support LZMA1 at all. Compression speed and |
| ratios of LZMA1 and LZMA2 are practically the same. |
| .IP |
| LZMA1 and LZMA2 share the same set of |
| .IR options : |
| .RS |
| .TP |
| .BI preset= preset |
| Reset all LZMA1 or LZMA2 |
| .I options |
| to |
| .IR preset . |
| .I Preset |
| consists of an integer, which may be followed by single-letter preset |
| modifiers. The integer can be from |
| .B 0 |
| to |
| .BR 9 , |
| matching the command line options \fB\-0\fR ... \fB\-9\fR. |
| The only supported modifier is currently |
| .BR e , |
| which matches |
| .BR \-\-extreme . |
| .IP |
| The default |
| .I preset |
| is |
| .BR 6 , |
| from which the default values for the rest of the LZMA1 or LZMA2 |
| .I options |
| are taken. |
| .TP |
| .BI dict= size |
| Dictionary (history buffer) size indicates how many bytes of the recently |
| processed uncompressed data are kept in memory. One method to reduce the size of |
| the uncompressed data is to store distance-length pairs, which |
| indicate what data to repeat from the dictionary buffer. The bigger |
| the dictionary, the better the compression ratio usually is, |
| but dictionaries bigger than the uncompressed data are a waste of RAM. |
| .IP |
| Typical dictionary size is from 64 KiB to 64 MiB. The minimum is 4 KiB. |
| The maximum for compression is currently 1.5 GiB. The decompressor already |
| supports dictionaries up to one byte less than 4 GiB, which is the |
| maximum for LZMA1 and LZMA2 stream formats. |
| .IP |
| Dictionary size has the biggest effect on compression ratio. |
| Dictionary size and match finder together determine the memory usage of |
| the LZMA1 or LZMA2 encoder. The same dictionary size is required |
| for decompressing that was used when compressing, thus the memory usage of |
| the decoder is determined by the dictionary size used when compressing. |
| .TP |
| .BI lc= lc |
| Specify the number of literal context bits. The minimum is |
| .B 0 |
| and the maximum is |
| .BR 4 ; |
| the default is |
| .BR 3 . |
| In addition, the sum of |
| .I lc |
| and |
| .I lp |
| must not exceed |
| .BR 4 . |
| .TP |
| .BI lp= lp |
| Specify the number of literal position bits. The minimum is |
| .B 0 |
| and the maximum is |
| .BR 4 ; |
| the default is |
| .BR 0 . |
| .TP |
| .BI pb= pb |
| Specify the number of position bits. The minimum is |
| .B 0 |
| and the maximum is |
| .BR 4 ; |
| the default is |
| .BR 2 . |
| .TP |
| .BI mode= mode |
| Compression |
| .I mode |
| specifies the function used to analyze the data produced by the match finder. |
| Supported |
| .I modes |
| are |
| .B fast |
| and |
| .BR normal . |
| The default is |
| .B fast |
| for |
| .I presets |
| .BR 0 \- 2 |
| and |
| .B normal |
| for |
| .I presets |
| .BR 3 \- 9 . |
| .TP |
| .BI mf= mf |
| Match finder has a major effect on encoder speed, memory usage, and |
| compression ratio. Usually Hash Chain match finders are faster than |
| Binary Tree match finders. Hash Chains are usually used together with |
| .B mode=fast |
| and Binary Trees with |
| .BR mode=normal . |
| The memory usage formulas are only rough estimates, |
| which are closest to reality when |
| .I dict |
| is a power of two. |
| .RS |
| .TP |
| .B hc3 |
| Hash Chain with 2- and 3-byte hashing |
| .br |
| Minimum value for |
| .IR nice : |
| 3 |
| .br |
| Memory usage: |
| .I dict |
| * 7.5 (if |
| .I dict |
| <= 16 MiB); |
| .br |
| .I dict |
| * 5.5 + 64 MiB (if |
| .I dict |
| > 16 MiB) |
| .TP |
| .B hc4 |
| Hash Chain with 2-, 3-, and 4-byte hashing |
| .br |
| Minimum value for |
| .IR nice : |
| 4 |
| .br |
| Memory usage: |
| .I dict |
| * 7.5 |
| .TP |
| .B bt2 |
| Binary Tree with 2-byte hashing |
| .br |
| Minimum value for |
| .IR nice : |
| 2 |
| .br |
| Memory usage: |
| .I dict |
| * 9.5 |
| .TP |
| .B bt3 |
| Binary Tree with 2- and 3-byte hashing |
| .br |
| Minimum value for |
| .IR nice : |
| 3 |
| .br |
| Memory usage: |
| .I dict |
| * 11.5 (if |
| .I dict |
| <= 16 MiB); |
| .br |
| .I dict |
| * 9.5 + 64 MiB (if |
| .I dict |
| > 16 MiB) |
| .TP |
| .B bt4 |
| Binary Tree with 2-, 3-, and 4-byte hashing |
| .br |
| Minimum value for |
| .IR nice : |
| 4 |
| .br |
| Memory usage: |
| .I dict |
| * 11.5 |
| .RE |
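| .IP |
| The rough memory usage formulas listed above can be collected into a small |
| Python sketch. The function name |
| .B mf_memory |
| is hypothetical, and the results are estimates just like the formulas |
| themselves. |
| .nf |

```python
MIB = 1024 * 1024


def mf_memory(mf, dict_size):
    """Rough encoder memory estimate in bytes, per the formulas above."""
    big = dict_size > 16 * MIB
    if mf == "hc3":
        return dict_size * 11 // 2 + 64 * MIB if big else dict_size * 15 // 2
    if mf == "hc4":
        return dict_size * 15 // 2
    if mf == "bt2":
        return dict_size * 19 // 2
    if mf == "bt3":
        return dict_size * 19 // 2 + 64 * MIB if big else dict_size * 23 // 2
    if mf == "bt4":
        return dict_size * 23 // 2
    raise ValueError("unknown match finder: " + mf)


print(mf_memory("bt4", 8 * MIB) // MIB, "MiB")
```

| .fi |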
| .TP |
| .BI nice= nice |
| Specify what is considered to be a nice length for a match. Once a match |
| of at least |
| .I nice |
| bytes is found, the algorithm stops looking for possibly better matches. |
| .IP |
| .I nice |
| can be 2\-273 bytes. Higher values tend to give better compression ratio |
| at the expense of speed. The default depends on the |
| .I preset |
| level. |
| .TP |
| .BI depth= depth |
| Specify the maximum search depth in the match finder. The default is the |
| special value |
| .BR 0 , |
| which makes the compressor determine a reasonable |
| .I depth |
| from |
| .I mf |
| and |
| .IR nice . |
| .IP |
| Using very high values for |
| .I depth |
| can make the encoder extremely slow with carefully crafted files. |
| Avoid setting the |
| .I depth |
| over 1000 unless you are prepared to interrupt the compression in case it |
| is taking too long. |
| .RE |
| .IP |
| When decoding raw streams |
| .RB ( \-\-format=raw ), |
| LZMA2 needs only the value of |
| .BR dict . |
| LZMA1 needs also |
| .BR lc , |
| .BR lp , |
| and |
| .BR pb . |
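| .IP |
| A raw LZMA2 round trip can be sketched with Python's standard |
| .B lzma |
| module, used here as a stand-in for |
| .BR "xz \-\-format=raw" . |
| Only the dictionary size is spelled out; the 1 MiB value and the sample data |
| are arbitrary choices for the illustration. |
| .nf |

```python
import lzma

data = b"raw stream example " * 100

# Only dict_size is spelled out; other LZMA2 options use library defaults.
chain = [{"id": lzma.FILTER_LZMA2, "dict_size": 1 << 20}]

raw = lzma.compress(data, format=lzma.FORMAT_RAW, filters=chain)
# A raw stream has no headers, so the decoder must be given the chain again.
restored = lzma.decompress(raw, format=lzma.FORMAT_RAW, filters=chain)
print(restored == data)
```

| .fi |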
| .TP |
| \fB\-\-x86\fR[\fB=\fIoptions\fR] |
| .TP |
| \fB\-\-powerpc\fR[\fB=\fIoptions\fR] |
| .TP |
| \fB\-\-ia64\fR[\fB=\fIoptions\fR] |
| .TP |
| \fB\-\-arm\fR[\fB=\fIoptions\fR] |
| .TP |
| \fB\-\-armthumb\fR[\fB=\fIoptions\fR] |
| .TP |
| \fB\-\-sparc\fR[\fB=\fIoptions\fR] |
| Add a branch/call/jump (BCJ) filter to the filter chain. These filters |
| can be used only as a non-last filter in the filter chain. |
| .IP |
| A BCJ filter converts relative addresses in the machine code to their |
| absolute counterparts. This doesn't change the size of the data, but |
| it increases redundancy, which allows e.g. LZMA2 to get better |
| compression ratio. |
| .IP |
| The BCJ filters are always reversible, so using a BCJ filter for the wrong |
| type of data doesn't cause any data loss. However, applying a BCJ filter |
| to the wrong type of data is a bad idea, because it tends to make the |
| compression ratio worse. |
| .IP |
| Different instruction sets have different alignment: |
| .RS |
| .RS |
| .TS |
| tab(;); |
| l n l |
| l n l. |
| Filter;Alignment;Notes |
| x86;1;32-bit and 64-bit x86 |
| PowerPC;4;Big endian only |
| ARM;4;Little endian only |
| ARM-Thumb;2;Little endian only |
| IA-64;16;Big or little endian |
| SPARC;4;Big or little endian |
| .TE |
| .RE |
| .RE |
| .IP |
| Since the BCJ-filtered data is usually compressed with LZMA2, the compression |
| ratio may be improved slightly if the LZMA2 options are set to match the |
| alignment of the selected BCJ filter. For example, with the IA-64 filter, |
| it's good to set |
| .B pb=4 |
| with LZMA2 (2^4=16). The x86 filter is an exception; it's usually good to |
| stick to LZMA2's default four-byte alignment when compressing x86 executables. |
| .IP |
| All BCJ filters support the same |
| .IR options : |
| .RS |
| .TP |
| .BI start= offset |
| Specify the start |
| .I offset |
| that is used when converting between relative and absolute addresses. |
| The |
| .I offset |
| must be a multiple of the alignment of the filter (see the table above). |
| The default is zero. In practice, the default is good; specifying |
| a custom |
| .I offset |
| is almost never useful. |
| .IP |
| Specifying a non-zero start |
| .I offset |
| is probably useful only if the executable has multiple sections, and there |
| are many cross-section jumps or calls. Applying a BCJ filter separately for |
| each section with proper start offset and then compressing the result as |
| a single chunk may give some improvement in compression ratio compared |
| to applying the BCJ filter with the default |
| .I offset |
| for the whole executable. |
| .RE |
| .TP |
| \fB\-\-delta\fR[\fB=\fIoptions\fR] |
| Add the Delta filter to the filter chain. The Delta filter |
| can be used only as a non-last filter in the filter chain. |
| .IP |
| Currently only simple byte-wise delta calculation is supported. It can |
| be useful when compressing e.g. uncompressed bitmap images or uncompressed |
| PCM audio. However, special purpose algorithms may give significantly better |
| results than Delta + LZMA2. This is true especially with audio, which |
| compresses faster and better e.g. with FLAC. |
| .IP |
| Supported |
| .IR options : |
| .RS |
| .TP |
| .BI dist= distance |
| Specify the |
| .I distance |
| of the delta calculation in bytes. |
| .I distance |
| must be 1\-256. The default is 1. |
| .IP |
| For example, with |
| .B dist=2 |
| and eight-byte input A1 B1 A2 B3 A3 B5 A4 B7, the output will be |
| A1 B1 01 02 01 02 01 02. |
| .RE |
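| .IP |
| The byte-wise calculation can be sketched in Python as follows. This is |
| only an illustration of the example above, not the liblzma implementation. |
| The function names are hypothetical, and the sketch assumes that the first |
| .I distance |
| bytes have no predecessor and are therefore copied unchanged. |
| .IP |
| .nf |
```python
def delta_encode(data, dist=1):
    """Subtract from each byte the byte located dist positions earlier."""
    if not 1 <= dist <= 256:
        raise ValueError("dist must be 1-256")
    out = bytearray()
    for i, byte in enumerate(data):
        # Assumption: positions before dist use zero as the previous byte.
        prev = data[i - dist] if i >= dist else 0
        out.append((byte - prev) & 0xFF)
    return bytes(out)


def delta_decode(data, dist=1):
    """Inverse of delta_encode: add back the already decoded earlier byte."""
    if not 1 <= dist <= 256:
        raise ValueError("dist must be 1-256")
    out = bytearray()
    for i, byte in enumerate(data):
        prev = out[i - dist] if i >= dist else 0
        out.append((byte + prev) & 0xFF)
    return bytes(out)


sample = bytes.fromhex("A1B1A2B3A3B5A4B7")
encoded = delta_encode(sample, dist=2)
print(encoded.hex(" ").upper())  # A1 B1 01 02 01 02 01 02
```
| .fi |
| .IP |
| The repeated 01 02 pattern in the output is what makes the data easier |
| for a following LZMA2 filter to compress. |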
| .SS "Other options" |
| .TP |
| .BR \-q ", " \-\-quiet |
| Suppress warnings and notices. Specify this twice to suppress errors too. |
| This option has no effect on the exit status. That is, even if a warning |
| was suppressed, the exit status still indicates that a warning occurred. |
| .TP |
| .BR \-v ", " \-\-verbose |
| Be verbose. If standard error is connected to a terminal, |
| .B xz |
| will display a progress indicator. |
| Specifying |
| .B \-\-verbose |
| twice will give even more verbose output (useful mostly for debugging). |
| .IP |
| The progress indicator shows the following information: |
| .RS |
| .IP \(bu 3 |
| Completion percentage is shown if the size of the input file is known. |
| That is, percentage cannot be shown in pipes. |
| .IP \(bu 3 |
| Amount of compressed data produced (compressing) or consumed (decompressing). |
| .IP \(bu 3 |
| Amount of uncompressed data consumed (compressing) or produced |
| (decompressing). |
| .IP \(bu 3 |
| Compression ratio, which is calculated by dividing the amount of |
| compressed data processed so far by the amount of uncompressed data |
| processed so far. |
| .IP \(bu 3 |
| Compression or decompression speed. This is measured as the amount of |
| uncompressed data consumed (compression) or produced (decompression) |
| per second. It is shown once a few seconds have passed since |
| .B xz |
| started processing the file. |
| .IP \(bu 3 |
| Elapsed time or estimated time remaining. |
| Elapsed time is displayed in the format M:SS or H:MM:SS. |
| The estimated remaining time is displayed in a less precise format |
| which never has colons, for example, 2 min 30 s. The estimate can |
| be shown only when the size of the input file is known and a couple of |
| seconds have already passed since |
| .B xz |
| started processing the file. |
| .RE |
| .IP |
| When standard error is not a terminal, |
| .B \-\-verbose |
| will make |
| .B xz |
| print the filename, compressed size, uncompressed size, and compression ratio |
| on a single line to standard error after compressing or decompressing the |
| file. If the operation took at least a few seconds, the speed and elapsed |
| time are printed too. If the operation didn't finish, for example due to |
| user interruption, the completion percentage is also printed if the size of |
| the input file is known. |
| .TP |
| .BR \-Q ", " \-\-no\-warn |
| Don't set the exit status to |
| .B 2 |
| even if a condition worth a warning was detected. This option doesn't affect |
| the verbosity level, thus both |
| .B \-\-quiet |
| and |
| .B \-\-no\-warn |
| have to be used to not display warnings and to not alter the exit status. |
| .TP |
| .B \-\-robot |
| Print messages in a machine-parsable format. This is intended to ease |
| writing frontends that want to use |
| .B xz |
| instead of liblzma, which may be the case with various scripts. The output |
| with this option enabled is meant to be stable across |
| .B xz |
| releases. See the section |
| .B "ROBOT MODE" |
| for details. |
| .TP |
| .B \-\-info\-memory |
| Display the current memory usage limit in human-readable format on |
| a single line, and exit successfully. To see how much RAM |
| .B xz |
| thinks your system has, use |
| .BR "\-\-memory=100% \-\-info\-memory" . |
| .TP |
| .BR \-h ", " \-\-help |
| Display a help message describing the most commonly used options, |
| and exit successfully. |
| .TP |
| .BR \-H ", " \-\-long\-help |
| Display a help message describing all features of |
| .BR xz , |
| and exit successfully. |
| .TP |
| .BR \-V ", " \-\-version |
| Display the version number of |
| .B xz |
| and liblzma in human-readable format. To get machine-parsable output, specify |
| .B \-\-robot |
| before |
| .BR \-\-version . |
| .SH ROBOT MODE |
| The robot mode is activated with the |
| .B \-\-robot |
| option. It makes the output of |
| .B xz |
| easier to parse by other programs. Currently |
| .B \-\-robot |
| is supported only together with |
| .BR \-\-version , |
| .BR \-\-info\-memory , |
| and |
| .BR \-\-list . |
| It will be supported for normal compression and decompression in the future. |
| .PP |
| .SS Version |
| .B "xz \-\-robot \-\-version" |
| will print the version number of |
| .B xz |
| and liblzma in the following format: |
| .PP |
| .BI XZ_VERSION= XYYYZZZS |
| .br |
| .BI LIBLZMA_VERSION= XYYYZZZS |
| .TP |
| .I X |
| Major version. |
| .TP |
| .I YYY |
| Minor version. Even numbers are stable. |
| Odd numbers are alpha or beta versions. |
| .TP |
| .I ZZZ |
| Patch level for stable releases or just a counter for development releases. |
| .TP |
| .I S |
| Stability. |
| .B 0 |
| is alpha, |
| .B 1 |
| is beta, and |
| .B 2 |
| is stable. |
| .I S |
| should always be |
| .B 2 |
| when |
| .I YYY |
| is even. |
| .PP |
| .I XYYYZZZS |
| are the same on both lines if |
| .B xz |
| and liblzma are from the same XZ Utils release. |
| .PP |
| Examples: 4.999.9beta is |
| .B 49990091 |
| and |
| 5.0.0 is |
| .BR 50000002 . |
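| .PP |
| A script could split such a line into its fields along these lines. |
| This is a minimal Python sketch; the function name and the returned tuple |
| layout are hypothetical, and it assumes exactly eight digits as in the |
| format shown above. |
| .PP |
| .nf |
```python
STABILITY = {0: "alpha", 1: "beta", 2: "stable"}


def parse_robot_version(line):
    """Split NAME=XYYYZZZS into (name, major, minor, patch, stability)."""
    name, _, digits = line.strip().partition("=")
    if len(digits) != 8 or not digits.isdigit():
        raise ValueError("expected eight digits in XYYYZZZS form")
    major = int(digits[0])       # X
    minor = int(digits[1:4])     # YYY
    patch = int(digits[4:7])     # ZZZ
    stability = STABILITY.get(int(digits[7]), "unknown")  # S
    return name, major, minor, patch, stability


print(parse_robot_version("XZ_VERSION=49990091"))
print(parse_robot_version("LIBLZMA_VERSION=50000002"))
```
| .fi |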
| .SS Memory limit information |
| .B "xz \-\-robot \-\-info\-memory" |
| prints the current memory usage limit as bytes on a single line. |
| To get the total amount of installed RAM, use |
| .BR "xz \-\-robot \-\-memory=100% \-\-info\-memory" . |
| .SS List mode |
| .B "xz \-\-robot \-\-list" |
| uses tab-separated output. The first column of every line has a string |
| that indicates the type of the information found on that line: |
| .TP |
| .B name |
| This is always the first line when starting to list a file. The second |
| column on the line is the filename. |
| .TP |
| .B file |
| This line contains overall information about the |
| .B .xz |
| file. This line is always printed after the |
| .B name |
| line. |
| .TP |
| .B stream |
| This line type is used only when |
| .B \-\-verbose |
| was specified. There are as many |
| .B stream |
| lines as there are streams in the |
| .B .xz |
| file. |
| .TP |
| .B block |
| This line type is used only when |
| .B \-\-verbose |
| was specified. There are as many |
| .B block |
| lines as there are blocks in the |
| .B .xz |
| file. The |
| .B block |
| lines are shown after all the |
| .B stream |
| lines; different line types are not interleaved. |
| .TP |
| .B summary |
| This line type is used only when |
| .B \-\-verbose |
| was specified twice. This line is printed after all |
| .B block |
| lines. Like the |
| .B file |
| line, the |
| .B summary |
| line contains overall information about the |
| .B .xz |
| file. |
| .TP |
| .B totals |
| This line is always the very last line of the list output. It shows |
| the total counts and sizes. |
| .PP |
| The columns of the |
| .B file |
| lines: |
| .RS |
| .IP 2. 4 |
| Number of streams in the file |
| .IP 3. 4 |
| Total number of blocks in the stream(s) |
| .IP 4. 4 |
| Compressed size of the file |
| .IP 5. 4 |
| Uncompressed size of the file |
| .IP 6. 4 |
| Compression ratio, for example |
| .BR 0.123 . |
| If the ratio is over 9.999, three dashes |
| .RB ( \-\-\- ) |
| are displayed instead of the ratio. |
| .IP 7. 4 |
| Comma-separated list of integrity check names. The following strings are |
| used for the known check types: |
| .BR None , |
| .BR CRC32 , |
| .BR CRC64 , |
| and |
| .BR SHA\-256 . |
| For unknown check types, |
| .BI Unknown\- N |
| is used, where |
| .I N |
| is the Check ID as a decimal number (one or two digits). |
| .IP 8. 4 |
| Total size of stream padding in the file |
| .RE |
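| .PP |
| The ratio column described above can be reproduced roughly as follows. |
| This Python sketch is an illustration only: the function name is |
| hypothetical, and showing dashes for an empty file (uncompressed size |
| of zero) is an assumption rather than documented behavior. |
| .PP |
| .nf |
```python
def format_ratio(compressed, uncompressed):
    """Return the ratio with three decimals, or dashes if it is over 9.999."""
    if uncompressed == 0:
        # Assumption: no meaningful ratio exists for an empty file.
        return "---"
    ratio = compressed / uncompressed
    if ratio > 9.999:
        return "---"
    return "{:.3f}".format(ratio)


print(format_ratio(123, 1000))
print(format_ratio(5000, 100))
```
| .fi |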
| .PP |
| The columns of the |
| .B stream |
| lines: |
| .RS |
| .IP 2. 4 |
| Stream number (the first stream is 1) |
| .IP 3. 4 |
| Number of blocks in the stream |
| .IP 4. 4 |
| Compressed start offset |
| .IP 5. 4 |
| Uncompressed start offset |
| .IP 6. 4 |
| Compressed size (does not include stream padding) |
| .IP 7. 4 |
| Uncompressed size |
| .IP 8. 4 |
| Compression ratio |
| .IP 9. 4 |
| Name of the integrity check |
| .IP 10. 4 |
| Size of stream padding |
| .RE |
| .PP |
| The columns of the |
| .B block |
| lines: |
| .RS |
| .IP 2. 4 |
| Number of the stream containing this block |
| .IP 3. 4 |
| Block number relative to the beginning of the stream (the first block is 1) |
| .IP 4. 4 |
| Block number relative to the beginning of the file |
| .IP 5. 4 |
| Compressed start offset relative to the beginning of the file |
| .IP 6. 4 |
| Uncompressed start offset relative to the beginning of the file |
| .IP 7. 4 |
| Total compressed size of the block (includes headers) |
| .IP 8. 4 |
| Uncompressed size |
| .IP 9. 4 |
| Compression ratio |
| .IP 10. 4 |
| Name of the integrity check |
| .RE |
| .PP |
| If |
| .B \-\-verbose |
| was specified twice, additional columns are included on the |
| .B block |
| lines. These are not displayed with a single |
| .BR \-\-verbose , |
| because getting this information requires many seeks and can thus be slow: |
| .RS |
| .IP 11. 4 |
| Value of the integrity check in hexadecimal |
| .IP 12. 4 |
| Block header size |
| .IP 13. 4 |
| Block flags: |
| .B c |
| indicates that compressed size is present, and |
| .B u |
| indicates that uncompressed size is present. |
| If the flag is not set, a dash |
| .RB ( \- ) |
| is shown instead to keep the string length fixed. New flags may be added |
| to the end of the string in the future. |
| .IP 14. 4 |
| Size of the actual compressed data in the block (this excludes |
| the block header, block padding, and check fields) |
| .IP 15. 4 |
| Amount of memory (as bytes) required to decompress this block with this |
| .B xz |
| version |
| .IP 16. 4 |
| Filter chain. Note that most of the options used at compression time cannot |
| be known, because only the options that are needed for decompression are |
| stored in the |
| .B .xz |
| headers. |
| .RE |
| .PP |
| The columns of the |
| .B totals |
| line: |
| .RS |
| .IP 2. 4 |
| Number of streams |
| .IP 3. 4 |
| Number of blocks |
| .IP 4. 4 |
| Compressed size |
| .IP 5. 4 |
| Uncompressed size |
| .IP 6. 4 |
| Average compression ratio |
| .IP 7. 4 |
| Comma-separated list of integrity check names that were present in the files |
| .IP 8. 4 |
| Stream padding size |
| .IP 9. 4 |
| Number of files. This is here to keep the order of the earlier columns |
| the same as on |
| .B file |
| lines. |
| .RE |
| .PP |
| If |
| .B \-\-verbose |
| was specified twice, additional columns are included on the |
| .B totals |
| line: |
| .RS |
| .IP 10. 4 |
| Maximum amount of memory (as bytes) required to decompress the files |
| with this |
| .B xz |
| version |
| .IP 11. 4 |
| .B yes |
| or |
| .B no |
| indicating if all block headers have both compressed size and |
| uncompressed size stored in them |
| .RE |
| .PP |
| Future versions may add new line types and may add new columns to |
| the existing line types, but the existing columns won't be changed. |
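| .PP |
| A parser that tolerates such additions could look like the following |
| Python sketch. The function names are hypothetical and the sample lines |
| are made up for illustration; they are not real |
| .B xz |
| output. |
| .PP |
| .nf |
```python
TAB = chr(9)       # column separator of the list output
NEWLINE = chr(10)


def parse_robot_list(lines):
    """Group tab-separated lines by their first column (the line type)."""
    records = {}
    for line in lines:
        columns = line.rstrip(NEWLINE).split(TAB)
        # Unknown line types and extra trailing columns are kept as-is,
        # because future versions may add both.
        records.setdefault(columns[0], []).append(columns[1:])
    return records


def bytes_saved(records):
    """Uncompressed size minus compressed size, taken from the totals line."""
    totals = records["totals"][0]
    compressed = int(totals[2])    # column 4 of the totals line
    uncompressed = int(totals[3])  # column 5 of the totals line
    return uncompressed - compressed


# Made-up sample for a single file; the numbers are illustrative only.
sample = [
    TAB.join(["name", "example.txt.xz"]),
    TAB.join(["file", "1", "1", "1024", "4096", "0.250", "CRC64", "0"]),
    TAB.join(["totals", "1", "1", "1024", "4096", "0.250", "CRC64", "0", "1"]),
]
records = parse_robot_list(sample)
print(bytes_saved(records))
```
| .fi |
| .PP |
| Indexing columns by position from the start of the line, and never from |
| the end, keeps such a parser working when new columns are appended. |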
| .SH "EXIT STATUS" |
| .TP |
| .B 0 |
| All is good. |
| .TP |
| .B 1 |
| An error occurred. |
| .TP |
| .B 2 |
| Something worth a warning occurred, but no actual errors occurred. |
| .PP |
| Notices (not warnings or errors) printed on standard error don't affect |
| the exit status. |
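| .PP |
| A calling program can distinguish the three cases as in this Python |
| sketch. The helper names are hypothetical, and |
| .B run_xz |
| assumes that an |
| .B xz |
| binary is available on the search path. |
| .PP |
| .nf |
```python
import subprocess

EXIT_MEANING = {0: "ok", 1: "error", 2: "warning"}


def describe_exit_status(code):
    """Map an xz exit status to a short label."""
    return EXIT_MEANING.get(code, "unknown")


def run_xz(args):
    """Run xz with the given arguments; return (label, stderr text)."""
    # Assumption: xz is installed and found on PATH.
    result = subprocess.run(["xz"] + list(args),
                            capture_output=True, text=True)
    return describe_exit_status(result.returncode), result.stderr
```
| .fi |
| .PP |
| A script might, for example, treat the warning case as success but still |
| log the text from standard error. |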
| .SH ENVIRONMENT |
| .TP |
| .B XZ_OPT |
| A space-separated list of options is parsed from |
| .B XZ_OPT |
| before parsing the options given on the command line. Note that only |
| options are parsed from |
| .BR XZ_OPT ; |
| all non-options are silently ignored. Parsing is done with |
| .BR getopt_long (3) |
| which is used also for the command line arguments. |
| .SH "LZMA UTILS COMPATIBILITY" |
| The command line syntax of |
| .B xz |
| is practically a superset of |
| .BR lzma , |
| .BR unlzma , |
| and |
| .BR lzcat |
| as found in LZMA Utils 4.32.x. In most cases, it is possible to replace |
| LZMA Utils with XZ Utils without breaking existing scripts. There are some |
| incompatibilities though, which may sometimes cause problems. |
| .SS "Compression preset levels" |
| The numbering of the compression level presets is not identical in |
| .B xz |
| and LZMA Utils. |
| The most important difference is how dictionary sizes are mapped to different |
| presets. Dictionary size is roughly equal to the decompressor memory usage. |
| .RS |
| .TS |
| tab(;); |
| c c c |
| c n n. |
| Level;xz;LZMA Utils |
| \-1;64 KiB;64 KiB |
| \-2;512 KiB;1 MiB |
| \-3;1 MiB;512 KiB |
| \-4;2 MiB;1 MiB |
| \-5;4 MiB;2 MiB |
| \-6;8 MiB;4 MiB |
| \-7;16 MiB;8 MiB |
| \-8;32 MiB;16 MiB |
| \-9;64 MiB;32 MiB |
| .TE |
| .RE |
| .PP |
| The dictionary size differences affect the compressor memory usage too, |
| but there are some other differences between LZMA Utils and XZ Utils, which |
| make the difference even bigger: |
| .RS |
| .TS |
| tab(;); |
| c c c |
| c n n. |
| Level;xz;LZMA Utils 4.32.x |
| \-1;2 MiB;2 MiB |
| \-2;5 MiB;12 MiB |
| \-3;13 MiB;12 MiB |
| \-4;25 MiB;16 MiB |
| \-5;48 MiB;26 MiB |
| \-6;94 MiB;45 MiB |
| \-7;186 MiB;83 MiB |
| \-8;370 MiB;159 MiB |
| \-9;674 MiB;311 MiB |
| .TE |
| .RE |
| .PP |
| The default preset level in LZMA Utils is |
| .B \-7 |
| while in XZ Utils it is |
| .BR \-6 , |
| so both use an 8 MiB dictionary by default. |
| .SS "Streamed vs. non-streamed .lzma files" |
| Uncompressed size of the file can be stored in the |
| .B .lzma |
| header. LZMA Utils does that when compressing regular files. |
| The alternative is to mark that uncompressed size is unknown and |
| use end of payload marker to indicate where the decompressor should stop. |
| LZMA Utils uses this method when uncompressed size isn't known, which is |
| the case for example in pipes. |
| .PP |
| .B xz |
| supports decompressing |
| .B .lzma |
| files with or without end of payload marker, but all |
| .B .lzma |
| files created by |
| .B xz |
| will use end of payload marker and have uncompressed size marked as unknown |
| in the |
| .B .lzma |
| header. This may be a problem in some (uncommon) situations. For example, a |
| .B .lzma |
| decompressor in an embedded device might work only with files that have known |
| uncompressed size. If you hit this problem, you need to use LZMA Utils or |
| LZMA SDK to create |
| .B .lzma |
| files with known uncompressed size. |
| .SS "Unsupported .lzma files" |
| The |
| .B .lzma |
| format allows |
| .I lc |
| values up to 8, and |
| .I lp |
| values up to 4. LZMA Utils can decompress files with any |
| .I lc |
| and |
| .IR lp , |
| but always creates files with |
| .B lc=3 |
| and |
| .BR lp=0 . |
| Creating files with other |
| .I lc |
| and |
| .I lp |
| is possible with |
| .B xz |
| and with LZMA SDK. |
| .PP |
| The implementation of the LZMA1 filter in liblzma requires |
| that the sum of |
| .I lc |
| and |
| .I lp |
| must not exceed 4. Thus, |
| .B .lzma |
| files that exceed this limit cannot be decompressed with |
| .BR xz . |
| .PP |
| LZMA Utils creates only |
| .B .lzma |
| files which have dictionary size of |
| .RI "2^" n |
| (a power of 2), but accepts files with any dictionary size. |
| liblzma accepts only |
| .B .lzma |
| files which have dictionary size of |
| .RI "2^" n |
| or |
| .RI "2^" n " + 2^(" n "\-1)." |
| This is to decrease false positives when detecting |
| .B .lzma |
| files. |
| .PP |
| These limitations shouldn't be a problem in practice, since practically all |
| .B .lzma |
| files have been compressed with settings that liblzma will accept. |
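| .PP |
| The two restrictions described in this section can be expressed as simple |
| checks. The following Python sketch restates the rules as given here; it is |
| not the liblzma code, and the function names are hypothetical. |
| .PP |
| .nf |
```python
def is_power_of_two(value):
    return value > 0 and (value & (value - 1)) == 0


def dict_size_accepted(size):
    """True if size is 2^n or 2^n + 2^(n-1)."""
    if is_power_of_two(size):
        return True
    # 2^n + 2^(n-1) equals 3 * 2^(n-1), so a third of it is a power of two.
    return size % 3 == 0 and is_power_of_two(size // 3)


def lc_lp_accepted(lc, lp):
    """True if the sum of lc and lp does not exceed 4."""
    return lc >= 0 and lp >= 0 and lc + lp <= 4


print(dict_size_accepted(1 << 23))               # 8 MiB
print(dict_size_accepted((1 << 23) + (1 << 22))) # 12 MiB
print(dict_size_accepted(10 ** 7))               # not of either form
print(lc_lp_accepted(3, 0))
print(lc_lp_accepted(8, 4))
```
| .fi |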
| .SS "Trailing garbage" |
| When decompressing, LZMA Utils silently ignores everything after the first |
| .B .lzma |
| stream. In most situations, this is a bug. This also means that LZMA Utils |
| doesn't support decompressing concatenated |
| .B .lzma |
| files. |
| .PP |
| If there is data left after the first |
| .B .lzma |
| stream, |
| .B xz |
| considers the file to be corrupt. This may break obscure scripts which have |
| assumed that trailing garbage is ignored. |
| .SH NOTES |
| .SS Compressed output may vary |
| The exact compressed output produced from the same uncompressed input file |
| may vary between XZ Utils versions even if compression options are identical. |
| This is because the encoder can be improved (faster or better compression) |
| without affecting the file format. The output can vary even between different |
| builds of the same XZ Utils version, if different build options are used. |
| .PP |
| The above means that implementing |
| .B \-\-rsyncable |
| to create rsyncable |
| .B .xz |
| files is not going to happen without freezing a part of the encoder |
| implementation, which can then be used with |
| .BR \-\-rsyncable . |
| .SS Embedded .xz decompressors |
| Embedded |
| .B .xz |
| decompressor implementations like XZ Embedded don't necessarily support files |
| created with |
| .I check |
| types other than |
| .B none |
| and |
| .BR crc32 . |
| Since the default is \fB\-\-check=\fIcrc64\fR, you must use |
| .B \-\-check=none |
| or |
| .B \-\-check=crc32 |
| when creating files for embedded systems. |
| .PP |
| Outside embedded systems, all |
| .B .xz |
| format decompressors support all the |
| .I check |
| types, or at least are able to decompress the file without verifying the |
| integrity check if the particular |
| .I check |
| is not supported. |
| .PP |
| XZ Embedded supports BCJ filters, but only with the default start offset. |
| .SH EXAMPLES |
| .SS Basics |
| A mix of compressed and uncompressed files can be decompressed |
| to standard output with a single command: |
| .IP |
| .B "xz \-dcf a.txt b.txt.xz c.txt d.txt.xz > abcd.txt" |
| .SS Parallel compression of many files |
| On GNU and *BSD, |
| .BR find (1) |
| and |
| .BR xargs (1) |
| can be used to parallelize compression of many files: |
| .PP |
| .IP |
| .B "find . \-type f \e! \-name '*.xz' \-print0 | xargs \-0r \-P4 \-n16 xz" |
| .PP |
| The |
| .B \-P |
| option sets the number of parallel |
| .B xz |
| processes. The best value for the |
| .B \-n |
| option depends on how many files there are to be compressed. |
| If there are only a couple of files, the value should probably be |
| .BR 1 ; |
| with tens of thousands of files, |
| .B 100 |
| or even more may be appropriate to reduce the number of |
| .B xz |
| processes that |
| .BR xargs (1) |
| will eventually create. |
| .SS Robot mode examples |
| Calculating how many bytes have been saved in total after compressing |
| multiple files: |
| .IP |
| .B "xz \-\-robot \-\-list *.xz | awk '/^totals/{print $5\-$4}'" |
| .SH "SEE ALSO" |
| .BR xzdec (1), |
| .BR gzip (1), |
| .BR bzip2 (1) |
| .PP |
| XZ Utils: <http://tukaani.org/xz/> |
| .br |
| XZ Embedded: <http://tukaani.org/xz/embedded.html> |
| .br |
| LZMA SDK: <http://7-zip.org/sdk.html> |