  1. '\" t
  2. .\"
  3. .\" Author: Lasse Collin
  4. .\"
  5. .\" This file has been put into the public domain.
  6. .\" You can do whatever you want with this file.
  7. .\"
  8. .TH XZ 1 "2022-12-01" "Tukaani" "XZ Utils"
  9. .
  10. .SH NAME
  11. xz, unxz, xzcat, lzma, unlzma, lzcat \- Compress or decompress .xz and .lzma files
  12. .
  13. .SH SYNOPSIS
  14. .B xz
  15. .RI [ option... ]
  16. .RI [ file... ]
  17. .
  18. .SH COMMAND ALIASES
  19. .B unxz
  20. is equivalent to
  21. .BR "xz \-\-decompress" .
  22. .br
  23. .B xzcat
  24. is equivalent to
  25. .BR "xz \-\-decompress \-\-stdout" .
  26. .br
  27. .B lzma
  28. is equivalent to
  29. .BR "xz \-\-format=lzma" .
  30. .br
  31. .B unlzma
  32. is equivalent to
  33. .BR "xz \-\-format=lzma \-\-decompress" .
  34. .br
  35. .B lzcat
  36. is equivalent to
  37. .BR "xz \-\-format=lzma \-\-decompress \-\-stdout" .
  38. .PP
  39. When writing scripts that need to decompress files,
  40. it is recommended to always use the name
  41. .B xz
  42. with appropriate arguments
  43. .RB ( "xz \-d"
  44. or
  45. .BR "xz \-dc" )
  46. instead of the names
  47. .B unxz
  48. and
  49. .BR xzcat .
  50. .
  51. .SH DESCRIPTION
  52. .B xz
  53. is a general-purpose data compression tool with
  54. command line syntax similar to
  55. .BR gzip (1)
  56. and
  57. .BR bzip2 (1).
  58. The native file format is the
  59. .B .xz
  60. format, but the legacy
  61. .B .lzma
  62. format used by LZMA Utils and
  63. raw compressed streams with no container format headers
  64. are also supported.
  65. In addition, decompression of the
  66. .B .lz
  67. format used by
  68. .B lzip
  69. is supported.
  70. .PP
  71. .B xz
  72. compresses or decompresses each
  73. .I file
  74. according to the selected operation mode.
  75. If no
  76. .I files
  77. are given or
  78. .I file
  79. is
  80. .BR \- ,
  81. .B xz
  82. reads from standard input and writes the processed data
  83. to standard output.
  84. .B xz
  85. will refuse (display an error and skip the
  86. .IR file )
  87. to write compressed data to standard output if it is a terminal.
  88. Similarly,
  89. .B xz
  90. will refuse to read compressed data
  91. from standard input if it is a terminal.
  92. .PP
  93. Unless
  94. .B \-\-stdout
  95. is specified,
  96. .I files
  97. other than
  98. .B \-
  99. are written to a new file whose name is derived from the source
  100. .I file
  101. name:
  102. .IP \(bu 3
  103. When compressing, the suffix of the target file format
  104. .RB ( .xz
  105. or
  106. .BR .lzma )
  107. is appended to the source filename to get the target filename.
  108. .IP \(bu 3
  109. When decompressing, the
  110. .BR .xz ,
  111. .BR .lzma ,
  112. or
  113. .B .lz
  114. suffix is removed from the filename to get the target filename.
  115. .B xz
  116. also recognizes the suffixes
  117. .B .txz
  118. and
  119. .BR .tlz ,
  120. and replaces them with the
  121. .B .tar
  122. suffix.
  123. .PP
  124. If the target file already exists, an error is displayed and the
  125. .I file
  126. is skipped.
  127. .PP
  128. Unless writing to standard output,
  129. .B xz
  130. will display a warning and skip the
  131. .I file
  132. if any of the following applies:
  133. .IP \(bu 3
  134. .I File
  135. is not a regular file.
  136. Symbolic links are not followed,
  137. and thus they are not considered to be regular files.
  138. .IP \(bu 3
  139. .I File
  140. has more than one hard link.
  141. .IP \(bu 3
  142. .I File
  143. has setuid, setgid, or sticky bit set.
  144. .IP \(bu 3
  145. The operation mode is set to compress and the
  146. .I file
  147. already has a suffix of the target file format
  148. .RB ( .xz
  149. or
  150. .B .txz
  151. when compressing to the
  152. .B .xz
  153. format, and
  154. .B .lzma
  155. or
  156. .B .tlz
  157. when compressing to the
  158. .B .lzma
  159. format).
  160. .IP \(bu 3
  161. The operation mode is set to decompress and the
  162. .I file
  163. doesn't have a suffix of any of the supported file formats
  164. .RB ( .xz ,
  165. .BR .txz ,
  166. .BR .lzma ,
  167. .BR .tlz ,
  168. or
  169. .BR .lz ).
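.PP
The naming and suffix rules above can be sketched as follows.
This is an illustrative model only, not the code used by xz:
the function name target_name and its parameters are hypothetical,
and the checks for file type, hard links, and mode bits are left out.
.PP
.nf
```python
def target_name(source, decompress=False, fmt="xz"):
    """Return the target filename, or None if the file would be skipped.

    Hypothetical helper; fmt is "xz" or "lzma" and only matters
    when compressing.
    """
    if not decompress:
        # A file that already has a suffix of the target format is skipped.
        skip = {"xz": (".xz", ".txz"), "lzma": (".lzma", ".tlz")}[fmt]
        if source.endswith(skip):
            return None
        # Otherwise the suffix of the target format is appended.
        return source + "." + fmt

    # Decompression: .txz and .tlz become .tar, other suffixes are removed.
    # The longer suffixes are listed first so that .tlz is not taken for .lz.
    for suffix, replacement in ((".txz", ".tar"), (".tlz", ".tar"),
                                (".xz", ""), (".lzma", ""), (".lz", "")):
        if source.endswith(suffix):
            return source[:-len(suffix)] + replacement
    return None  # no supported suffix: the file would be skipped
```
.fi
.PP
For instance, target_name("foo.txz", decompress=True) gives "foo.tar",
while target_name("foo.txt", decompress=True) gives None.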
  170. .PP
  171. After successfully compressing or decompressing the
  172. .IR file ,
  173. .B xz
  174. copies the owner, group, permissions, access time,
  175. and modification time from the source
  176. .I file
  177. to the target file.
  178. If copying the group fails, the permissions are modified
  179. so that the target file doesn't become accessible to users
  180. who didn't have permission to access the source
  181. .IR file .
  182. .B xz
  183. doesn't support copying other metadata like access control lists
  184. or extended attributes yet.
  185. .PP
  186. Once the target file has been successfully closed, the source
  187. .I file
  188. is removed unless
  189. .B \-\-keep
  190. was specified.
  191. The source
  192. .I file
  193. is never removed if the output is written to standard output
  194. or if an error occurs.
  195. .PP
  196. Sending
  197. .B SIGINFO
  198. or
  199. .B SIGUSR1
  200. to the
  201. .B xz
  202. process makes it print progress information to standard error.
  203. This has only limited use since when standard error
  204. is a terminal, using
  205. .B \-\-verbose
  206. will display an automatically updating progress indicator.
  207. .
  208. .SS "Memory usage"
  209. The memory usage of
  210. .B xz
  211. varies from a few hundred kilobytes to several gigabytes
  212. depending on the compression settings.
  213. The settings used when compressing a file determine
  214. the memory requirements of the decompressor.
  215. Typically the decompressor needs 5\ % to 20\ % of
  216. the amount of memory that the compressor needed when
  217. creating the file.
  218. For example, decompressing a file created with
  219. .B xz \-9
  220. currently requires 65\ MiB of memory.
  221. Still, it is possible to have
  222. .B .xz
  223. files that require several gigabytes of memory to decompress.
  224. .PP
  225. Users of older systems in particular may find
  226. the possibility of very large memory usage annoying.
  227. To prevent uncomfortable surprises,
  228. .B xz
  229. has a built-in memory usage limiter, which is disabled by default.
  230. While some operating systems provide ways to limit
  231. the memory usage of processes, relying on them
  232. wasn't deemed to be flexible enough (for example, using
  233. .BR ulimit (1)
  234. to limit virtual memory tends to cripple
  235. .BR mmap (2)).
  236. .PP
  237. The memory usage limiter can be enabled with
  238. the command line option \fB\-\-memlimit=\fIlimit\fR.
  239. Often it is more convenient to enable the limiter
  240. by default by setting the environment variable
  241. .BR XZ_DEFAULTS ,
  242. for example,
  243. .BR XZ_DEFAULTS=\-\-memlimit=150MiB .
  244. It is possible to set the limits separately
  245. for compression and decompression by using
  246. .BI \-\-memlimit\-compress= limit
  247. and \fB\-\-memlimit\-decompress=\fIlimit\fR.
  248. Using these two options outside
  249. .B XZ_DEFAULTS
  250. is rarely useful because a single run of
  251. .B xz
  252. cannot do both compression and decompression and
  253. .BI \-\-memlimit= limit
  254. (or
  255. .B \-M
  256. .IR limit )
  257. is shorter to type on the command line.
  258. .PP
  259. If the specified memory usage limit is exceeded when decompressing,
  260. .B xz
  261. will display an error and decompressing the file will fail.
  262. If the limit is exceeded when compressing,
  263. .B xz
  264. will try to scale the settings down so that the limit
  265. is no longer exceeded (except when using
  266. .B \-\-format=raw
  267. or
  268. .BR \-\-no\-adjust ).
  269. This way the operation won't fail unless the limit is very small.
  270. The scaling of the settings is done in steps that don't
  271. match the compression level presets, for example, if the limit is
  272. only slightly less than the amount required for
  273. .BR "xz \-9" ,
  274. the settings will be scaled down only a little,
  275. not all the way down to
  276. .BR "xz \-8" .
  277. .
  278. .SS "Concatenation and padding with .xz files"
  279. It is possible to concatenate
  280. .B .xz
  281. files as is.
  282. .B xz
  283. will decompress such files as if they were a single
  284. .B .xz
  285. file.
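.PP
The effect of concatenation can be tried without the command line tool.
The sketch below assumes Python's standard lzma module,
which is built on liblzma and is not the xz command itself;
the variable names are only for illustration.
.PP
.nf
```python
import lzma

# Two independent, complete .xz streams.
part1 = lzma.compress(b"hello, ")
part2 = lzma.compress(b"world")

# Joining the streams byte for byte gives valid multi-stream input.
combined = part1 + part2

# The decoder treats the result like a single .xz file.
result = lzma.decompress(combined)
```
.fi
.PP
Here result holds the two payloads joined in order.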
  286. .PP
  287. It is possible to insert padding between the concatenated parts
  288. or after the last part.
  289. The padding must consist of null bytes and the size
  290. of the padding must be a multiple of four bytes.
  291. This can be useful, for example, if the
  292. .B .xz
  293. file is stored on a medium that measures file sizes
  294. in 512-byte blocks.
  295. .PP
  296. Concatenation and padding are not allowed with
  297. .B .lzma
  298. files or raw streams.
  299. .
  300. .SH OPTIONS
  301. .
  302. .SS "Integer suffixes and special values"
  303. In most places where an integer argument is expected,
  304. an optional suffix is supported to easily indicate large integers.
  305. There must be no space between the integer and the suffix.
  306. .TP
  307. .B KiB
  308. Multiply the integer by 1,024 (2^10).
  309. .BR Ki ,
  310. .BR k ,
  311. .BR kB ,
  312. .BR K ,
  313. and
  314. .B KB
  315. are accepted as synonyms for
  316. .BR KiB .
  317. .TP
  318. .B MiB
  319. Multiply the integer by 1,048,576 (2^20).
  320. .BR Mi ,
  321. .BR m ,
  322. .BR M ,
  323. and
  324. .B MB
  325. are accepted as synonyms for
  326. .BR MiB .
  327. .TP
  328. .B GiB
  329. Multiply the integer by 1,073,741,824 (2^30).
  330. .BR Gi ,
  331. .BR g ,
  332. .BR G ,
  333. and
  334. .B GB
  335. are accepted as synonyms for
  336. .BR GiB .
  337. .PP
  338. The special value
  339. .B max
  340. can be used to indicate the maximum integer value
  341. supported by the option.
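.PP
The suffix rules above can be sketched as a small parser.
This is an illustration, not the parser used by xz:
parse_size is a hypothetical name, and the maximum argument
stands in for the option-specific upper bound, which is an assumption here.
.PP
.nf
```python
# Map every accepted suffix spelling to its multiplier.
_MULTIPLIERS = {}
for _names, _factor in (
    (("KiB", "Ki", "k", "kB", "K", "KB"), 2**10),
    (("MiB", "Mi", "m", "M", "MB"), 2**20),
    (("GiB", "Gi", "g", "G", "GB"), 2**30),
):
    for _name in _names:
        _MULTIPLIERS[_name] = _factor


def parse_size(text, maximum=2**64 - 1):
    """Parse an integer with an optional suffix, or the value "max"."""
    if text == "max":
        return maximum
    # Split the leading decimal digits from whatever follows them.
    i = 0
    while i < len(text) and text[i] in "0123456789":
        i += 1
    number, suffix = text[:i], text[i:]
    if not number:
        raise ValueError("no integer found in %r" % text)
    value = int(number)
    if suffix:
        # A space before the suffix ends up in suffix and is rejected here.
        if suffix not in _MULTIPLIERS:
            raise ValueError("unknown suffix in %r" % text)
        value *= _MULTIPLIERS[suffix]
    return value
```
.fi
.PP
With this sketch, parse_size("150MiB") and parse_size("150M")
both give 157286400, and parse_size("10 MiB") raises ValueError.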
  342. .
  343. .SS "Operation mode"
  344. If multiple operation mode options are given,
  345. the last one takes effect.
  346. .TP
  347. .BR \-z ", " \-\-compress
  348. Compress.
  349. This is the default operation mode when no operation mode option
  350. is specified and no other operation mode is implied from
  351. the command name (for example,
  352. .B unxz
  353. implies
  354. .BR \-\-decompress ).
  355. .TP
  356. .BR \-d ", " \-\-decompress ", " \-\-uncompress
  357. Decompress.
  358. .TP
  359. .BR \-t ", " \-\-test
  360. Test the integrity of compressed
  361. .IR files .
  362. This option is equivalent to
  363. .B "\-\-decompress \-\-stdout"
  364. except that the decompressed data is discarded instead of being
  365. written to standard output.
  366. No files are created or removed.
  367. .TP
  368. .BR \-l ", " \-\-list
  369. Print information about compressed
  370. .IR files .
  371. No uncompressed output is produced,
  372. and no files are created or removed.
  373. In list mode, the program cannot read
  374. the compressed data from standard
  375. input or from other unseekable sources.
  376. .IP ""
  377. The default listing shows basic information about
  378. .IR files ,
  379. one file per line.
  380. To get more detailed information, also use the
  381. .B \-\-verbose
  382. option.
  383. For even more information, use
  384. .B \-\-verbose
  385. twice, but note that this may be slow, because getting all the extra
  386. information requires many seeks.
  387. The width of verbose output exceeds
  388. 80 characters, so piping the output to, for example,
  389. .B "less\ \-S"
  390. may be convenient if the terminal isn't wide enough.
  391. .IP ""
  392. The exact output may vary between
  393. .B xz
  394. versions and different locales.
  395. For machine-readable output,
  396. .B \-\-robot \-\-list
  397. should be used.
  398. .
  399. .SS "Operation modifiers"
  400. .TP
  401. .BR \-k ", " \-\-keep
  402. Don't delete the input files.
  403. .IP ""
  404. Since
  405. .B xz
  406. 5.2.6,
  407. this option also makes
  408. .B xz
  409. compress or decompress even if the input is
  410. a symbolic link to a regular file,
  411. has more than one hard link,
  412. or has the setuid, setgid, or sticky bit set.
  413. The setuid, setgid, and sticky bits are not copied
  414. to the target file.
  415. In earlier versions this was only done with
  416. .BR \-\-force .
  417. .TP
  418. .BR \-f ", " \-\-force
  419. This option has several effects:
  420. .RS
  421. .IP \(bu 3
  422. If the target file already exists,
  423. delete it before compressing or decompressing.
  424. .IP \(bu 3
  425. Compress or decompress even if the input is
  426. a symbolic link to a regular file,
  427. has more than one hard link,
  428. or has the setuid, setgid, or sticky bit set.
  429. The setuid, setgid, and sticky bits are not copied
  430. to the target file.
  431. .IP \(bu 3
  432. When used with
  433. .B \-\-decompress
  434. .B \-\-stdout
  435. and
  436. .B xz
  437. cannot recognize the type of the source file,
  438. copy the source file as is to standard output.
  439. This allows
  440. .B xzcat
  441. .B \-\-force
  442. to be used like
  443. .BR cat (1)
  444. for files that have not been compressed with
  445. .BR xz .
  446. Note that in the future,
  447. .B xz
  448. might support new compressed file formats, which may make
  449. .B xz
  450. decompress more types of files instead of copying them as is to
  451. standard output.
  452. .BI \-\-format= format
  453. can be used to restrict
  454. .B xz
  455. to decompress only a single file format.
  456. .RE
  457. .TP
  458. .BR \-c ", " \-\-stdout ", " \-\-to\-stdout
  459. Write the compressed or decompressed data to
  460. standard output instead of a file.
  461. This implies
  462. .BR \-\-keep .
  463. .TP
  464. .B \-\-single\-stream
  465. Decompress only the first
  466. .B .xz
  467. stream, and
  468. silently ignore possible remaining input data following the stream.
  469. Normally such trailing garbage makes
  470. .B xz
  471. display an error.
  472. .IP ""
  473. .B xz
  474. never decompresses more than one stream from
  475. .B .lzma
  476. files or raw streams, but this option still makes
  477. .B xz
  478. ignore the possible trailing data after the
  479. .B .lzma
  480. file or raw stream.
  481. .IP ""
  482. This option has no effect if the operation mode is not
  483. .B \-\-decompress
  484. or
  485. .BR \-\-test .
  486. .TP
  487. .B \-\-no\-sparse
  488. Disable creation of sparse files.
  489. By default, if decompressing into a regular file,
  490. .B xz
  491. tries to make the file sparse if the decompressed data contains
  492. long sequences of binary zeros.
  493. It also works when writing to standard output
  494. as long as standard output is connected to a regular file
  495. and certain additional conditions are met to make it safe.
  496. Creating sparse files may save disk space and speed up
  497. the decompression by reducing the amount of disk I/O.
  498. .TP
  499. \fB\-S\fR \fI.suf\fR, \fB\-\-suffix=\fI.suf
  500. When compressing, use
  501. .I .suf
  502. as the suffix for the target file instead of
  503. .B .xz
  504. or
  505. .BR .lzma .
  506. If not writing to standard output and
  507. the source file already has the suffix
  508. .IR .suf ,
  509. a warning is displayed and the file is skipped.
  510. .IP ""
  511. When decompressing, recognize files with the suffix
  512. .I .suf
  513. in addition to files with the
  514. .BR .xz ,
  515. .BR .txz ,
  516. .BR .lzma ,
  517. .BR .tlz ,
  518. or
  519. .B .lz
  520. suffix.
  521. If the source file has the suffix
  522. .IR .suf ,
  523. the suffix is removed to get the target filename.
  524. .IP ""
  525. When compressing or decompressing raw streams
  526. .RB ( \-\-format=raw ),
  527. the suffix must always be specified unless
  528. writing to standard output,
  529. because there is no default suffix for raw streams.
  530. .TP
  531. \fB\-\-files\fR[\fB=\fIfile\fR]
  532. Read the filenames to process from
  533. .IR file ;
  534. if
  535. .I file
  536. is omitted, filenames are read from standard input.
  537. Filenames must be terminated with the newline character.
  538. A dash
  539. .RB ( \- )
  540. is taken as a regular filename; it doesn't mean standard input.
  541. If filenames are also given as command line arguments, they are
  542. processed before the filenames read from
  543. .IR file .
  544. .TP
  545. \fB\-\-files0\fR[\fB=\fIfile\fR]
  546. This is identical to \fB\-\-files\fR[\fB=\fIfile\fR] except
  547. that each filename must be terminated with the null character.
  548. .
  549. .SS "Basic file format and compression options"
  550. .TP
  551. \fB\-F\fR \fIformat\fR, \fB\-\-format=\fIformat
  552. Specify the file
  553. .I format
  554. to compress or decompress:
  555. .RS
  556. .TP
  557. .B auto
  558. This is the default.
  559. When compressing,
  560. .B auto
  561. is equivalent to
  562. .BR xz .
  563. When decompressing,
  564. the format of the input file is automatically detected.
  565. Note that raw streams (created with
  566. .BR \-\-format=raw )
  567. cannot be auto-detected.
  568. .TP
  569. .B xz
  570. Compress to the
  571. .B .xz
  572. file format, or accept only
  573. .B .xz
  574. files when decompressing.
  575. .TP
  576. .BR lzma ", " alone
  577. Compress to the legacy
  578. .B .lzma
  579. file format, or accept only
  580. .B .lzma
  581. files when decompressing.
  582. The alternative name
  583. .B alone
  584. is provided for backwards compatibility with LZMA Utils.
  585. .TP
  586. .B lzip
  587. Accept only
  588. .B .lz
  589. files when decompressing.
  590. Compression is not supported.
  591. .IP ""
  592. The
  593. .B .lz
  594. format version 0 and the unextended version 1 are supported.
  595. Version 0 files were produced by
  596. .B lzip
  597. 1.3 and older.
  598. Such files aren't common but may be found in file archives
  599. as a few source packages were released in this format.
  600. People might have old personal files in this format too.
  601. Decompression support for the format version 0 was removed in
  602. .B lzip
  603. 1.18.
  604. .IP ""
  605. .B lzip
  606. 1.4 and later create files in the format version 1.
  607. The sync flush marker extension to the format version 1 was added in
  608. .B lzip
  609. 1.6.
  610. This extension is rarely used and isn't supported by
  611. .B xz
  612. (diagnosed as corrupt input).
  613. .TP
  614. .B raw
  615. Compress or decompress a raw stream (no headers).
  616. This is meant for advanced users only.
  617. To decode raw streams, you need to use
  618. .B \-\-format=raw
  619. and explicitly specify the filter chain,
  620. which normally would have been stored in the container headers.
  621. .RE
  622. .TP
  623. \fB\-C\fR \fIcheck\fR, \fB\-\-check=\fIcheck
  624. Specify the type of the integrity check.
  625. The check is calculated from the uncompressed data and
  626. stored in the
  627. .B .xz
  628. file.
  629. This option has an effect only when compressing into the
  630. .B .xz
  631. format; the
  632. .B .lzma
  633. format doesn't support integrity checks.
  634. The integrity check (if any) is verified when the
  635. .B .xz
  636. file is decompressed.
  637. .IP ""
  638. Supported
  639. .I check
  640. types:
  641. .RS
  642. .TP
  643. .B none
  644. Don't calculate an integrity check at all.
  645. This is usually a bad idea.
  646. This can be useful when integrity of the data is verified
  647. by other means anyway.
  648. .TP
  649. .B crc32
  650. Calculate CRC32 using the polynomial from IEEE-802.3 (Ethernet).
  651. .TP
  652. .B crc64
  653. Calculate CRC64 using the polynomial from ECMA-182.
  654. This is the default, since it is slightly better than CRC32
  655. at detecting damaged files and the speed difference is negligible.
  656. .TP
  657. .B sha256
  658. Calculate SHA-256.
  659. This is somewhat slower than CRC32 and CRC64.
  660. .RE
  661. .IP ""
  662. Integrity of the
  663. .B .xz
  664. headers is always verified with CRC32.
  665. It is not possible to change or disable it.
  666. .TP
  667. .B \-\-ignore\-check
  668. Don't verify the integrity check of the compressed data when decompressing.
  669. The CRC32 values in the
  670. .B .xz
  671. headers will still be verified normally.
  672. .IP ""
  673. .B "Do not use this option unless you know what you are doing."
  674. Possible reasons to use this option:
  675. .RS
  676. .IP \(bu 3
  677. Trying to recover data from a corrupt .xz file.
  678. .IP \(bu 3
  679. Speeding up decompression.
  680. This matters mostly with SHA-256 or
  681. with files that have compressed extremely well.
  682. It's recommended to not use this option for this purpose
  683. unless the file integrity is verified externally in some other way.
  684. .RE
  685. .TP
  686. .BR \-0 " ... " \-9
  687. Select a compression preset level.
  688. The default is
  689. .BR \-6 .
  690. If multiple preset levels are specified,
  691. the last one takes effect.
  692. If a custom filter chain was already specified, setting
  693. a compression preset level clears the custom filter chain.
  694. .IP ""
  695. The differences between the presets are more significant than with
  696. .BR gzip (1)
  697. and
  698. .BR bzip2 (1).
  699. The selected compression settings determine
  700. the memory requirements of the decompressor,
  701. thus using a too high preset level might make it painful
  702. to decompress the file on an old system with little RAM.
  703. Specifically,
  704. .B "it's not a good idea to blindly use \-9 for everything"
  705. like it often is with
  706. .BR gzip (1)
  707. and
  708. .BR bzip2 (1).
  709. .RS
  710. .TP
  711. .BR "\-0" " ... " "\-3"
  712. These are somewhat fast presets.
  713. .B \-0
  714. is sometimes faster than
  715. .B "gzip \-9"
  716. while compressing much better.
  717. The higher ones often have speed comparable to
  718. .BR bzip2 (1)
  719. with comparable or better compression ratio,
  720. although the results
  721. depend a lot on the type of data being compressed.
  722. .TP
  723. .BR "\-4" " ... " "\-6"
  724. Good to very good compression while keeping
  725. decompressor memory usage reasonable even for old systems.
  726. .B \-6
  727. is the default, which is usually a good choice
  728. for distributing files that need to be decompressible
  729. even on systems with only 16\ MiB RAM.
  730. .RB ( \-5e
  731. or
  732. .B \-6e
  733. may be worth considering too.
  734. See
  735. .BR \-\-extreme .)
  736. .TP
  737. .B "\-7 ... \-9"
  738. These are like
  739. .B \-6
  740. but with higher compressor and decompressor memory requirements.
  741. These are useful only when compressing files bigger than
  742. 8\ MiB, 16\ MiB, and 32\ MiB, respectively.
  743. .RE
  744. .IP ""
  745. On the same hardware, the decompression speed is approximately
  746. a constant number of bytes of compressed data per second.
  747. In other words, the better the compression,
  748. the faster the decompression will usually be.
  749. This also means that the amount of uncompressed output
  750. produced per second can vary a lot.
  751. .IP ""
  752. The following table summarises the features of the presets:
  753. .RS
  754. .RS
  755. .PP
  756. .TS
  757. tab(;);
  758. c c c c c
  759. n n n n n.
  760. Preset;DictSize;CompCPU;CompMem;DecMem
  761. \-0;256 KiB;0;3 MiB;1 MiB
  762. \-1;1 MiB;1;9 MiB;2 MiB
  763. \-2;2 MiB;2;17 MiB;3 MiB
  764. \-3;4 MiB;3;32 MiB;5 MiB
  765. \-4;4 MiB;4;48 MiB;5 MiB
  766. \-5;8 MiB;5;94 MiB;9 MiB
  767. \-6;8 MiB;6;94 MiB;9 MiB
  768. \-7;16 MiB;6;186 MiB;17 MiB
  769. \-8;32 MiB;6;370 MiB;33 MiB
  770. \-9;64 MiB;6;674 MiB;65 MiB
  771. .TE
  772. .RE
  773. .RE
  774. .IP ""
  775. Column descriptions:
  776. .RS
  777. .IP \(bu 3
  778. DictSize is the LZMA2 dictionary size.
  779. It is a waste of memory to use a dictionary bigger than
  780. the size of the uncompressed file.
  781. This is why it is good to avoid using the presets
  782. .BR \-7 " ... " \-9
  783. when there's no real need for them.
  784. At
  785. .B \-6
  786. and lower, the amount of memory wasted is
  787. usually low enough to not matter.
  788. .IP \(bu 3
  789. CompCPU is a simplified representation of the LZMA2 settings
  790. that affect compression speed.
  791. The dictionary size affects speed too,
  792. so while CompCPU is the same for levels
  793. .BR \-6 " ... " \-9 ,
  794. higher levels still tend to be a little slower.
  795. To get even slower and thus possibly better compression, see
  796. .BR \-\-extreme .
  797. .IP \(bu 3
  798. CompMem contains the compressor memory requirements
  799. in the single-threaded mode.
  800. It may vary slightly between
  801. .B xz
  802. versions.
  803. Memory requirements of some of the future multithreaded modes may
  804. be dramatically higher than those of the single-threaded mode.
  805. .IP \(bu 3
  806. DecMem contains the decompressor memory requirements.
  807. That is, the compression settings determine
  808. the memory requirements of the decompressor.
  809. The exact decompressor memory usage is slightly more than
  810. the LZMA2 dictionary size, but the values in the table
  811. have been rounded up to the next full MiB.
  812. .RE
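.IP ""
The DecMem column can be used to pick a preset for a known target system.
The sketch below copies the values from the table above
(which may vary slightly between xz versions) into a lookup table;
the names PRESETS and highest_preset_within are hypothetical,
and xz itself does not perform this selection.
.IP ""
.nf
```python
# preset: (DictSize, CompMem, DecMem), all in MiB, copied from the table.
PRESETS = {
    0: (0.25, 3, 1),
    1: (1, 9, 2),
    2: (2, 17, 3),
    3: (4, 32, 5),
    4: (4, 48, 5),
    5: (8, 94, 9),
    6: (8, 94, 9),
    7: (16, 186, 17),
    8: (32, 370, 33),
    9: (64, 674, 65),
}


def highest_preset_within(dec_mem_mib):
    """Return the highest preset whose DecMem fits, or None if none fits."""
    fitting = [preset for preset, (_dict_size, _comp_mem, dec_mem)
               in PRESETS.items() if dec_mem <= dec_mem_mib]
    return max(fitting) if fitting else None
```
.fi
.IP ""
For a decompressor budget of 16 MiB this gives preset 6,
since preset 7 needs 17 MiB.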
  813. .TP
  814. .BR \-e ", " \-\-extreme
  815. Use a slower variant of the selected compression preset level
  816. .RB ( \-0 " ... " \-9 )
  817. to hopefully get a little bit better compression ratio,
  818. but with bad luck this can also make it worse.
  819. Decompressor memory usage is not affected,
  820. but compressor memory usage increases a little at preset levels
  821. .BR \-0 " ... " \-3 .
  822. .IP ""
  823. Since there are two presets with dictionary sizes
  824. 4\ MiB and 8\ MiB, the presets
  825. .B \-3e
  826. and
  827. .B \-5e
  828. use slightly faster settings (lower CompCPU) than
  829. .B \-4e
  830. and
  831. .BR \-6e ,
  832. respectively.
  833. That way no two presets are identical.
  834. .RS
  835. .RS
  836. .PP
  837. .TS
  838. tab(;);
  839. c c c c c
  840. n n n n n.
  841. Preset;DictSize;CompCPU;CompMem;DecMem
  842. \-0e;256 KiB;8;4 MiB;1 MiB
  843. \-1e;1 MiB;8;13 MiB;2 MiB
  844. \-2e;2 MiB;8;25 MiB;3 MiB
  845. \-3e;4 MiB;7;48 MiB;5 MiB
  846. \-4e;4 MiB;8;48 MiB;5 MiB
  847. \-5e;8 MiB;7;94 MiB;9 MiB
  848. \-6e;8 MiB;8;94 MiB;9 MiB
  849. \-7e;16 MiB;8;186 MiB;17 MiB
  850. \-8e;32 MiB;8;370 MiB;33 MiB
  851. \-9e;64 MiB;8;674 MiB;65 MiB
  852. .TE
  853. .RE
  854. .RE
  855. .IP ""
  856. For example, there are a total of four presets that use
  857. 8\ MiB dictionary, whose order from the fastest to the slowest is
  858. .BR \-5 ,
  859. .BR \-6 ,
  860. .BR \-5e ,
  861. and
  862. .BR \-6e .
  863. .TP
  864. .B \-\-fast
  865. .PD 0
  866. .TP
  867. .B \-\-best
  868. .PD
  869. These are somewhat misleading aliases for
  870. .B \-0
  871. and
  872. .BR \-9 ,
  873. respectively.
  874. These are provided only for backwards compatibility
  875. with LZMA Utils.
  876. Avoid using these options.
  877. .TP
  878. .BI \-\-block\-size= size
  879. When compressing to the
  880. .B .xz
  881. format, split the input data into blocks of
  882. .I size
  883. bytes.
  884. The blocks are compressed independently from each other,
  885. which helps with multi-threading and
  886. makes limited random-access decompression possible.
  887. This option is typically used to override the default
  888. block size in multi-threaded mode,
  889. but this option can be used in single-threaded mode too.
  890. .IP ""
  891. In multi-threaded mode about three times
  892. .I size
  893. bytes will be allocated in each thread for buffering input and output.
  894. The default
  895. .I size
  896. is three times the LZMA2 dictionary size or 1 MiB,
  897. whichever is more.
  898. Typically a good value is 2\(en4 times
  899. the size of the LZMA2 dictionary or at least 1 MiB.
  900. Using
  901. .I size
  902. less than the LZMA2 dictionary size is a waste of RAM
  903. because then the LZMA2 dictionary buffer will never get fully used.
  904. The sizes of the blocks are stored in the block headers,
  905. which a future version of
  906. .B xz
  907. will use for multi-threaded decompression.
  908. .IP ""
  909. In single-threaded mode no block splitting is done by default.
  910. Setting this option doesn't affect memory usage.
  911. No size information is stored in block headers,
  912. thus files created in single-threaded mode
  913. won't be identical to files created in multi-threaded mode.
  914. The lack of size information also means that a future version of
  915. .B xz
  916. won't be able to decompress the files in multi-threaded mode.
  917. .TP
  918. .BI \-\-block\-list= sizes
  919. When compressing to the
  920. .B .xz
  921. format, start a new block after
  922. the given intervals of uncompressed data.
  923. .IP ""
  924. The uncompressed
  925. .I sizes
  926. of the blocks are specified as a comma-separated list.
  927. Omitting a size (two or more consecutive commas) is a shorthand
  928. to use the size of the previous block.
  929. .IP ""
  930. If the input file is bigger than the sum of
  931. .IR sizes ,
  932. the last value in
  933. .I sizes
  934. is repeated until the end of the file.
  935. A special value of
  936. .B 0
  937. may be used as the last value to indicate that
  938. the rest of the file should be encoded as a single block.
  939. .IP ""
  940. If one specifies
  941. .I sizes
  942. that exceed the encoder's block size
  943. (either the default value in threaded mode or
  944. the value specified with \fB\-\-block\-size=\fIsize\fR),
  945. the encoder will create additional blocks while
  946. keeping the boundaries specified in
  947. .IR sizes .
  948. For example, if one specifies
  949. .B \-\-block\-size=10MiB
  950. .B \-\-block\-list=5MiB,10MiB,8MiB,12MiB,24MiB
  951. and the input file is 80 MiB,
  952. one will get 11 blocks:
  953. 5, 10, 8, 10, 2, 10, 10, 4, 10, 10, and 1 MiB.
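.IP ""
The splitting described above can be modelled in a few lines.
This is a sketch of the rules as stated here, not the encoder's code:
split_blocks is a hypothetical name, all sizes share one unit,
and neither the special value 0 nor the empty-entry shorthand is handled.
.IP ""
.nf
```python
def split_blocks(input_size, block_list, block_size):
    """Return the list of block sizes for an input of input_size."""
    if block_size <= 0 or not block_list or min(block_list) <= 0:
        raise ValueError("sizes must be positive in this sketch")
    blocks = []
    remaining = input_size
    index = 0
    while remaining > 0:
        # Past the end of the list, the last value is repeated.
        wanted = block_list[min(index, len(block_list) - 1)]
        index += 1
        span = min(wanted, remaining)
        remaining -= span
        # A span larger than the encoder's block size is split further,
        # but the boundary at the end of the span is kept.
        while span > 0:
            piece = min(span, block_size)
            blocks.append(piece)
            span -= piece
    return blocks
```
.fi
.IP ""
Calling split_blocks(80, [5, 10, 8, 12, 24], 10) reproduces
the 11 blocks listed in the example above.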
  954. .IP ""
  955. In multi-threaded mode the sizes of the blocks
  956. are stored in the block headers.
  957. This isn't done in single-threaded mode,
  958. so the encoded output won't be
  959. identical to that of the multi-threaded mode.
  960. .TP
  961. .BI \-\-flush\-timeout= timeout
  962. When compressing, if more than
  963. .I timeout
  964. milliseconds (a positive integer) have passed since the previous flush and
  965. reading more input would block,
  966. all the pending input data is flushed from the encoder and
  967. made available in the output stream.
  968. This can be useful if
  969. .B xz
  970. is used to compress data that is streamed over a network.
  971. Small
  972. .I timeout
  973. values make the data available at the receiving end
  974. with a small delay, but large
  975. .I timeout
  976. values give better compression ratio.
  977. .IP ""
  978. This feature is disabled by default.
  979. If this option is specified more than once, the last one takes effect.
  980. The special
  981. .I timeout
  982. value of
  983. .B 0
  984. can be used to explicitly disable this feature.
  985. .IP ""
  986. This feature is not available on non-POSIX systems.
  987. .IP ""
  988. .\" FIXME
  989. .B "This feature is still experimental."
  990. Currently
  991. .B xz
  992. is unsuitable for decompressing the stream in real time due to how
  993. .B xz
  994. does buffering.
  995. .TP
  996. .BI \-\-memlimit\-compress= limit
  997. Set a memory usage limit for compression.
  998. If this option is specified multiple times,
  999. the last one takes effect.
  1000. .IP ""
  1001. If the compression settings exceed the
  1002. .IR limit ,
  1003. .B xz
  1004. will attempt to adjust the settings downwards so that
  1005. the limit is no longer exceeded and display a notice that
  1006. automatic adjustment was done.
  1007. The adjustments are done in this order:
  1008. reducing the number of threads,
  1009. switching to single-threaded mode
  1010. if even one thread in multi-threaded mode exceeds the
  1011. .IR limit ,
  1012. and finally reducing the LZMA2 dictionary size.
  1013. .IP ""
  1014. When compressing with
  1015. .B \-\-format=raw
  1016. or if
  1017. .B \-\-no\-adjust
  1018. has been specified,
  1019. only the number of threads may be reduced
  1020. since it can be done without affecting the compressed output.
  1021. .IP ""
  1022. If the
  1023. .I limit
  1024. cannot be met even with the adjustments described above,
  1025. an error is displayed and
  1026. .B xz
  1027. will exit with exit status 1.
  1028. .IP ""
  1029. The
  1030. .I limit
  1031. can be specified in multiple ways:
  1032. .RS
  1033. .IP \(bu 3
  1034. The
  1035. .I limit
  1036. can be an absolute value in bytes.
  1037. Using an integer suffix like
  1038. .B MiB
  1039. can be useful.
  1040. Example:
  1041. .B "\-\-memlimit\-compress=80MiB"
  1042. .IP \(bu 3
  1043. The
  1044. .I limit
  1045. can be specified as a percentage of total physical memory (RAM).
  1046. This can be useful especially when setting the
  1047. .B XZ_DEFAULTS
  1048. environment variable in a shell initialization script
  1049. that is shared between different computers.
  1050. That way the limit is automatically bigger
  1051. on systems with more memory.
  1052. Example:
  1053. .B "\-\-memlimit\-compress=70%"
  1054. .IP \(bu 3
  1055. The
  1056. .I limit
  1057. can be reset back to its default value by setting it to
  1058. .BR 0 .
  1059. This is currently equivalent to setting the
  1060. .I limit
  1061. to
  1062. .B max
  1063. (no memory usage limit).
  1064. .RE
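.IP ""
As a rough sketch of how these forms relate to a byte count,
the hypothetical shell helper
.B limit_to_bytes
below converts a
.I limit
into bytes.
It handles only the
.BR KiB ,
.BR MiB ,
and
.B GiB
suffixes, takes the amount of RAM as an argument instead of detecting it,
and its rounding of percentages is an assumption that may differ from
what
.B xz
does internally.
.IP ""
.nf
```shell
# limit_to_bytes LIMIT TOTAL_RAM_BYTES
# Hypothetical helper, not part of xz.
limit_to_bytes() {
    spec=$1
    total_ram=$2
    case $spec in
        0|max) echo max ;;                                   # no limit
        *%)    echo $(( total_ram / 100 * ${spec%?} )) ;;    # share of RAM
        *KiB)  echo $(( ${spec%KiB} * 1024 )) ;;
        *MiB)  echo $(( ${spec%MiB} * 1024 * 1024 )) ;;
        *GiB)  echo $(( ${spec%GiB} * 1024 * 1024 * 1024 )) ;;
        *)     echo "$spec" ;;                               # plain bytes
    esac
}

limit_to_bytes 80MiB 0            # prints: 83886080
limit_to_bytes 70% 8589934592     # prints: 6012954150 (70 % of 8 GiB)
```
.fi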
  1065. .IP ""
  1066. For 32-bit
  1067. .B xz
  1068. there is a special case: if the
  1069. .I limit
  1070. would be over
  1071. .BR "4020\ MiB" ,
  1072. the
  1073. .I limit
  1074. is set to
  1075. .BR "4020\ MiB" .
  1076. On MIPS32
  1077. .B "2000\ MiB"
  1078. is used instead.
  1079. (The values
  1080. .B 0
  1081. and
  1082. .B max
  1083. aren't affected by this.
  1084. A similar feature doesn't exist for decompression.)
  1085. This can be helpful when a 32-bit executable has access
to 4\ GiB of address space (2\ GiB on MIPS32)
  1087. while hopefully doing no harm in other situations.
  1088. .IP ""
  1089. See also the section
  1090. .BR "Memory usage" .
  1091. .TP
  1092. .BI \-\-memlimit\-decompress= limit
  1093. Set a memory usage limit for decompression.
  1094. This also affects the
  1095. .B \-\-list
  1096. mode.
  1097. If the operation is not possible without exceeding the
  1098. .IR limit ,
  1099. .B xz
  1100. will display an error and decompressing the file will fail.
  1101. See
  1102. .BI \-\-memlimit\-compress= limit
  1103. for possible ways to specify the
  1104. .IR limit .
  1105. .TP
  1106. .BI \-\-memlimit\-mt\-decompress= limit
  1107. Set a memory usage limit for multi-threaded decompression.
  1108. This can only affect the number of threads;
  1109. this will never make
  1110. .B xz
  1111. refuse to decompress a file.
  1112. If
  1113. .I limit
  1114. is too low to allow any multi-threading, the
  1115. .I limit
  1116. is ignored and
  1117. .B xz
  1118. will continue in single-threaded mode.
Note that if
.B \-\-memlimit\-decompress
is also used,
  1122. it will always apply to both single-threaded and multi-threaded modes,
  1123. and so the effective
  1124. .I limit
  1125. for multi-threading will never be higher than the limit set with
  1126. .BR \-\-memlimit\-decompress .
  1127. .IP ""
  1128. In contrast to the other memory usage limit options,
  1129. .BI \-\-memlimit\-mt\-decompress= limit
  1130. has a system-specific default
  1131. .IR limit .
  1132. .B "xz \-\-info\-memory"
  1133. can be used to see the current value.
  1134. .IP ""
  1135. This option and its default value exist
  1136. because without any limit the threaded decompressor
  1137. could end up allocating an insane amount of memory with some input files.
  1138. If the default
  1139. .I limit
  1140. is too low on your system,
  1141. feel free to increase the
  1142. .I limit
  1143. but never set it to a value larger than the amount of usable RAM
  1144. as with appropriate input files
  1145. .B xz
  1146. will attempt to use that amount of memory
  1147. even with a low number of threads.
  1148. Running out of memory or swapping
  1149. will not improve decompression performance.
  1150. .IP ""
  1151. See
  1152. .BI \-\-memlimit\-compress= limit
  1153. for possible ways to specify the
  1154. .IR limit .
  1155. Setting
  1156. .I limit
  1157. to
  1158. .B 0
  1159. resets the
  1160. .I limit
  1161. to the default system-specific value.
  1163. .TP
  1164. \fB\-M\fR \fIlimit\fR, \fB\-\-memlimit=\fIlimit\fR, \fB\-\-memory=\fIlimit
  1165. This is equivalent to specifying
  1166. .BI \-\-memlimit\-compress= limit
.BI \-\-memlimit\-decompress= limit
  1168. \fB\-\-memlimit\-mt\-decompress=\fIlimit\fR.
  1169. .TP
  1170. .B \-\-no\-adjust
  1171. Display an error and exit if the memory usage limit cannot be
  1172. met without adjusting settings that affect the compressed output.
  1173. That is, this prevents
  1174. .B xz
  1175. from switching the encoder from multi-threaded mode to single-threaded mode
  1176. and from reducing the LZMA2 dictionary size.
  1177. Even when this option is used the number of threads may be reduced
  1178. to meet the memory usage limit as that won't affect the compressed output.
  1179. .IP ""
  1180. Automatic adjusting is always disabled when creating raw streams
  1181. .RB ( \-\-format=raw ).
  1182. .TP
  1183. \fB\-T\fR \fIthreads\fR, \fB\-\-threads=\fIthreads
  1184. Specify the number of worker threads to use.
  1185. Setting
  1186. .I threads
  1187. to a special value
  1188. .B 0
  1189. makes
  1190. .B xz
  1191. use up to as many threads as the processor(s) on the system support.
  1192. The actual number of threads can be fewer than
  1193. .I threads
  1194. if the input file is not big enough
  1195. for threading with the given settings or
  1196. if using more threads would exceed the memory usage limit.
  1197. .IP ""
  1198. The single-threaded and multi-threaded compressors produce different output.
The single-threaded compressor gives the smallest file size, but
  1200. only the output from the multi-threaded compressor can be decompressed
  1201. using multiple threads.
  1202. Setting
  1203. .I threads
  1204. to
  1205. .B 1
  1206. will use the single-threaded mode.
  1207. Setting
  1208. .I threads
  1209. to any other value, including
  1210. .BR 0 ,
  1211. will use the multi-threaded compressor
  1212. even if the system supports only one hardware thread.
  1213. .RB ( xz
  1214. 5.2.x
  1215. used single-threaded mode in this situation.)
  1216. .IP ""
  1217. To use multi-threaded mode with only one thread, set
  1218. .I threads
  1219. to
  1220. .BR +1 .
  1221. The
  1222. .B +
  1223. prefix has no effect with values other than
  1224. .BR 1 .
  1225. A memory usage limit can still make
  1226. .B xz
  1227. switch to single-threaded mode unless
  1228. .B \-\-no\-adjust
  1229. is used.
  1230. Support for the
  1231. .B +
  1232. prefix was added in
  1233. .B xz
  1234. 5.4.0.
  1235. .IP ""
  1236. If an automatic number of threads has been requested and
  1237. no memory usage limit has been specified,
  1238. then a system-specific default soft limit will be used to possibly
  1239. limit the number of threads.
It is a soft limit in the sense that it is ignored
  1241. if the number of threads becomes one,
  1242. thus a soft limit will never stop
  1243. .B xz
  1244. from compressing or decompressing.
  1245. This default soft limit will not make
  1246. .B xz
  1247. switch from multi-threaded mode to single-threaded mode.
  1248. The active limits can be seen with
  1249. .BR "xz \-\-info\-memory" .
  1250. .IP ""
  1251. Currently the only threading method is to split the input into
  1252. blocks and compress them independently from each other.
  1253. The default block size depends on the compression level and
  1254. can be overridden with the
  1255. .BI \-\-block\-size= size
  1256. option.
  1257. .IP ""
  1258. Threaded decompression only works on files that contain
  1259. multiple blocks with size information in block headers.
  1260. All large enough files compressed in multi-threaded mode
  1261. meet this condition,
  1262. but files compressed in single-threaded mode don't even if
  1263. .BI \-\-block\-size= size
  1264. has been used.
  1265. .
  1266. .SS "Custom compressor filter chains"
  1267. A custom filter chain allows specifying
  1268. the compression settings in detail instead of relying on
the settings associated with the presets.
  1270. When a custom filter chain is specified,
  1271. preset options
  1272. .RB ( \-0
  1273. \&...\&
  1274. .B \-9
  1275. and
  1276. .BR \-\-extreme )
  1277. earlier on the command line are forgotten.
  1278. If a preset option is specified
  1279. after one or more custom filter chain options,
  1280. the new preset takes effect and
  1281. the custom filter chain options specified earlier are forgotten.
  1282. .PP
  1283. A filter chain is comparable to piping on the command line.
  1284. When compressing, the uncompressed input goes to the first filter,
  1285. whose output goes to the next filter (if any).
  1286. The output of the last filter gets written to the compressed file.
  1287. The maximum number of filters in the chain is four,
  1288. but typically a filter chain has only one or two filters.
  1289. .PP
  1290. Many filters have limitations on where they can be
  1291. in the filter chain:
  1292. some filters can work only as the last filter in the chain,
  1293. some only as a non-last filter, and some work in any position
  1294. in the chain.
  1295. Depending on the filter, this limitation is either inherent to
  1296. the filter design or exists to prevent security issues.
  1297. .PP
  1298. A custom filter chain is specified by using one or more
  1299. filter options in the order they are wanted in the filter chain.
  1300. That is, the order of filter options is significant!
  1301. When decoding raw streams
  1302. .RB ( \-\-format=raw ),
  1303. the filter chain is specified in the same order as
  1304. it was specified when compressing.
  1305. .PP
  1306. Filters take filter-specific
  1307. .I options
  1308. as a comma-separated list.
  1309. Extra commas in
  1310. .I options
  1311. are ignored.
  1312. Every option has a default value, so you need to
  1313. specify only those you want to change.
  1314. .PP
  1315. To see the whole filter chain and
  1316. .IR options ,
  1317. use
  1318. .B "xz \-vv"
  1319. (that is, use
  1320. .B \-\-verbose
  1321. twice).
  1322. This works also for viewing the filter chain options used by presets.
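.PP
As a minimal illustration of a two-filter chain,
the sketch below sends its input through the x86 BCJ filter and then LZMA2,
and immediately decompresses the result again.
The function name
.B roundtrip
is hypothetical,
and the example does nothing if
.B xz
is not installed.
.PP
.nf
```shell
# Compress standard input with the chain "x86 BCJ, then LZMA2" and
# decompress it again. The filter options are given in chain order.
roundtrip() {
    xz --x86 --lzma2=preset=6 --stdout | xz --decompress --stdout
}

if command -v xz >/dev/null 2>&1; then
    echo "hello filter chain" | roundtrip
fi
```
.fi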
  1323. .TP
  1324. \fB\-\-lzma1\fR[\fB=\fIoptions\fR]
  1325. .PD 0
  1326. .TP
  1327. \fB\-\-lzma2\fR[\fB=\fIoptions\fR]
  1328. .PD
  1329. Add LZMA1 or LZMA2 filter to the filter chain.
  1330. These filters can be used only as the last filter in the chain.
  1331. .IP ""
  1332. LZMA1 is a legacy filter,
  1333. which is supported almost solely due to the legacy
  1334. .B .lzma
  1335. file format, which supports only LZMA1.
  1336. LZMA2 is an updated
  1337. version of LZMA1 to fix some practical issues of LZMA1.
  1338. The
  1339. .B .xz
  1340. format uses LZMA2 and doesn't support LZMA1 at all.
  1341. Compression speed and ratios of LZMA1 and LZMA2
  1342. are practically the same.
  1343. .IP ""
  1344. LZMA1 and LZMA2 share the same set of
  1345. .IR options :
  1346. .RS
  1347. .TP
  1348. .BI preset= preset
  1349. Reset all LZMA1 or LZMA2
  1350. .I options
  1351. to
  1352. .IR preset .
  1353. .I Preset
consists of an integer, which may be followed by single-letter
  1355. preset modifiers.
  1356. The integer can be from
  1357. .B 0
  1358. to
  1359. .BR 9 ,
  1360. matching the command line options
  1361. .B \-0
  1362. \&...\&
  1363. .BR \-9 .
  1364. The only supported modifier is currently
  1365. .BR e ,
  1366. which matches
  1367. .BR \-\-extreme .
  1368. If no
  1369. .B preset
  1370. is specified, the default values of LZMA1 or LZMA2
  1371. .I options
  1372. are taken from the preset
  1373. .BR 6 .
  1374. .TP
  1375. .BI dict= size
  1376. Dictionary (history buffer)
  1377. .I size
indicates how many bytes of the recently processed
uncompressed data are kept in memory.
  1380. The algorithm tries to find repeating byte sequences (matches) in
  1381. the uncompressed data, and replace them with references
  1382. to the data currently in the dictionary.
The bigger the dictionary, the higher the chance
of finding a match.
  1385. Thus, increasing dictionary
  1386. .I size
  1387. usually improves compression ratio, but
a dictionary bigger than the uncompressed file is a waste of memory.
  1389. .IP ""
  1390. Typical dictionary
  1391. .I size
  1392. is from 64\ KiB to 64\ MiB.
  1393. The minimum is 4\ KiB.
  1394. The maximum for compression is currently 1.5\ GiB (1536\ MiB).
  1395. The decompressor already supports dictionaries up to
  1396. one byte less than 4\ GiB, which is the maximum for
  1397. the LZMA1 and LZMA2 stream formats.
  1398. .IP ""
  1399. Dictionary
  1400. .I size
  1401. and match finder
  1402. .RI ( mf )
  1403. together determine the memory usage of the LZMA1 or LZMA2 encoder.
Decompressing requires the same (or bigger) dictionary
.I size
as was used when compressing,
  1407. thus the memory usage of the decoder is determined
  1408. by the dictionary size used when compressing.
  1409. The
  1410. .B .xz
  1411. headers store the dictionary
  1412. .I size
  1413. either as
  1414. .RI "2^" n
  1415. or
  1416. .RI "2^" n " + 2^(" n "\-1),"
  1417. so these
  1418. .I sizes
  1419. are somewhat preferred for compression.
  1420. Other
  1421. .I sizes
  1422. will get rounded up when stored in the
  1423. .B .xz
  1424. headers.
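.IP ""
The rounding can be sketched in shell as follows.
.B round_dict_size
is a hypothetical helper, not part of
.BR xz ;
it simply searches upwards from the 4\ KiB minimum for the first value
of the form 2^n or 2^n + 2^(n\-1) that is large enough,
which is assumed to match what ends up in the headers.
.IP ""
.nf
```shell
# round_dict_size BYTES
# Prints the smallest value of the form 2^n or 2^n + 2^(n-1)
# that is greater than or equal to BYTES.
round_dict_size() {
    want=$1
    size=4096                          # minimum dictionary size (4 KiB)
    while [ "$size" -lt "$want" ]; do
        mid=$(( size + size / 2 ))     # 2^n + 2^(n-1)
        if [ "$mid" -ge "$want" ]; then
            size=$mid
            break
        fi
        size=$(( size * 2 ))           # next power of two
    done
    echo "$size"
}

round_dict_size 5000000    # prints: 6291456 (6 MiB)
round_dict_size 8388608    # prints: 8388608 (8 MiB, already representable)
```
.fi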
  1425. .TP
  1426. .BI lc= lc
  1427. Specify the number of literal context bits.
  1428. The minimum is 0 and the maximum is 4; the default is 3.
  1429. In addition, the sum of
  1430. .I lc
  1431. and
  1432. .I lp
  1433. must not exceed 4.
  1434. .IP ""
  1435. All bytes that cannot be encoded as matches
  1436. are encoded as literals.
  1437. That is, literals are simply 8-bit bytes
  1438. that are encoded one at a time.
  1439. .IP ""
  1440. The literal coding makes an assumption that the highest
  1441. .I lc
  1442. bits of the previous uncompressed byte correlate
  1443. with the next byte.
  1444. For example, in typical English text, an upper-case letter is
  1445. often followed by a lower-case letter, and a lower-case
  1446. letter is usually followed by another lower-case letter.
  1447. In the US-ASCII character set, the highest three bits are 010
  1448. for upper-case letters and 011 for lower-case letters.
  1449. When
  1450. .I lc
  1451. is at least 3, the literal coding can take advantage of
  1452. this property in the uncompressed data.
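.IP ""
The bit patterns mentioned above can be checked with a short shell sketch.
.B top3_bits
is a hypothetical helper that prints
the highest three bits of a character's US-ASCII code;
it has nothing to do with how the encoder itself is implemented.
.IP ""
.nf
```shell
# top3_bits CHAR
# Prints the highest three bits of the 8-bit US-ASCII code of CHAR.
top3_bits() {
    code=$(printf '%d' "'$1")      # numeric code of the character
    top=$(( code >> 5 ))           # keep bits 7..5
    echo "$(( (top >> 2) & 1 ))$(( (top >> 1) & 1 ))$(( top & 1 ))"
}

top3_bits A    # prints: 010
top3_bits a    # prints: 011
```
.fi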
  1453. .IP ""
  1454. The default value (3) is usually good.
  1455. If you want maximum compression, test
  1456. .BR lc=4 .
  1457. Sometimes it helps a little, and
  1458. sometimes it makes compression worse.
  1459. If it makes it worse, test
  1460. .B lc=2
  1461. too.
  1462. .TP
  1463. .BI lp= lp
  1464. Specify the number of literal position bits.
  1465. The minimum is 0 and the maximum is 4; the default is 0.
  1466. .IP ""
  1467. .I Lp
  1468. affects what kind of alignment in the uncompressed data is
  1469. assumed when encoding literals.
  1470. See
  1471. .I pb
  1472. below for more information about alignment.
  1473. .TP
  1474. .BI pb= pb
  1475. Specify the number of position bits.
  1476. The minimum is 0 and the maximum is 4; the default is 2.
  1477. .IP ""
  1478. .I Pb
  1479. affects what kind of alignment in the uncompressed data is
  1480. assumed in general.
  1481. The default means four-byte alignment
  1482. .RI (2^ pb =2^2=4),
  1483. which is often a good choice when there's no better guess.
  1484. .IP ""
  1485. When the alignment is known, setting
  1486. .I pb
  1487. accordingly may reduce the file size a little.
  1488. For example, with text files having one-byte
  1489. alignment (US-ASCII, ISO-8859-*, UTF-8), setting
  1490. .B pb=0
  1491. can improve compression slightly.
  1492. For UTF-16 text,
  1493. .B pb=1
  1494. is a good choice.
  1495. If the alignment is an odd number like 3 bytes,
  1496. .B pb=0
  1497. might be the best choice.
  1498. .IP ""
  1499. Even though the assumed alignment can be adjusted with
  1500. .I pb
  1501. and
  1502. .IR lp ,
  1503. LZMA1 and LZMA2 still slightly favor 16-byte alignment.
  1504. It might be worth taking into account when designing file formats
  1505. that are likely to be often compressed with LZMA1 or LZMA2.
  1506. .TP
  1507. .BI mf= mf
  1508. Match finder has a major effect on encoder speed,
  1509. memory usage, and compression ratio.
  1510. Usually Hash Chain match finders are faster than Binary Tree
  1511. match finders.
  1512. The default depends on the
  1513. .IR preset :
  1514. 0 uses
  1515. .BR hc3 ,
  1516. 1\(en3
  1517. use
  1518. .BR hc4 ,
  1519. and the rest use
  1520. .BR bt4 .
  1521. .IP ""
  1522. The following match finders are supported.
  1523. The memory usage formulas below are rough approximations,
  1524. which are closest to the reality when
  1525. .I dict
  1526. is a power of two.
  1527. .RS
  1528. .TP
  1529. .B hc3
  1530. Hash Chain with 2- and 3-byte hashing
  1531. .br
  1532. Minimum value for
  1533. .IR nice :
  1534. 3
  1535. .br
  1536. Memory usage:
  1537. .br
  1538. .I dict
  1539. * 7.5 (if
  1540. .I dict
  1541. <= 16 MiB);
  1542. .br
  1543. .I dict
  1544. * 5.5 + 64 MiB (if
  1545. .I dict
  1546. > 16 MiB)
  1547. .TP
  1548. .B hc4
  1549. Hash Chain with 2-, 3-, and 4-byte hashing
  1550. .br
  1551. Minimum value for
  1552. .IR nice :
  1553. 4
  1554. .br
  1555. Memory usage:
  1556. .br
  1557. .I dict
  1558. * 7.5 (if
  1559. .I dict
  1560. <= 32 MiB);
  1561. .br
  1562. .I dict
  1563. * 6.5 (if
  1564. .I dict
  1565. > 32 MiB)
  1566. .TP
  1567. .B bt2
  1568. Binary Tree with 2-byte hashing
  1569. .br
  1570. Minimum value for
  1571. .IR nice :
  1572. 2
  1573. .br
  1574. Memory usage:
  1575. .I dict
  1576. * 9.5
  1577. .TP
  1578. .B bt3
  1579. Binary Tree with 2- and 3-byte hashing
  1580. .br
  1581. Minimum value for
  1582. .IR nice :
  1583. 3
  1584. .br
  1585. Memory usage:
  1586. .br
  1587. .I dict
  1588. * 11.5 (if
  1589. .I dict
  1590. <= 16 MiB);
  1591. .br
  1592. .I dict
  1593. * 9.5 + 64 MiB (if
  1594. .I dict
  1595. > 16 MiB)
  1596. .TP
  1597. .B bt4
  1598. Binary Tree with 2-, 3-, and 4-byte hashing
  1599. .br
  1600. Minimum value for
  1601. .IR nice :
  1602. 4
  1603. .br
  1604. Memory usage:
  1605. .br
  1606. .I dict
  1607. * 11.5 (if
  1608. .I dict
  1609. <= 32 MiB);
  1610. .br
  1611. .I dict
  1612. * 10.5 (if
  1613. .I dict
  1614. > 32 MiB)
  1615. .RE
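.IP ""
The formulas above can be evaluated with the following shell sketch.
.B mf_mem_mib
is a hypothetical helper that takes a match finder name and a dictionary
size in whole MiB;
like the formulas themselves, its results are only rough approximations,
and odd results are rounded down to a whole MiB.
.IP ""
.nf
```shell
# mf_mem_mib MF DICT_MIB
# Approximate encoder memory usage in MiB. The x.5 multipliers are
# handled by computing twice the result and halving at the end.
mf_mem_mib() {
    mf=$1
    dict=$2
    case $mf in
        hc3) if [ "$dict" -le 16 ]; then twice=$(( dict * 15 ))
             else twice=$(( dict * 11 + 128 )); fi ;;
        hc4) if [ "$dict" -le 32 ]; then twice=$(( dict * 15 ))
             else twice=$(( dict * 13 )); fi ;;
        bt2) twice=$(( dict * 19 )) ;;
        bt3) if [ "$dict" -le 16 ]; then twice=$(( dict * 23 ))
             else twice=$(( dict * 19 + 128 )); fi ;;
        bt4) if [ "$dict" -le 32 ]; then twice=$(( dict * 23 ))
             else twice=$(( dict * 21 )); fi ;;
        *)   echo "unknown match finder: $mf" >&2; return 1 ;;
    esac
    echo $(( twice / 2 ))
}

mf_mem_mib bt4 8     # prints: 92   (8 * 11.5)
mf_mem_mib hc3 64    # prints: 416  (64 * 5.5 + 64)
```
.fi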
  1616. .TP
  1617. .BI mode= mode
  1618. Compression
  1619. .I mode
  1620. specifies the method to analyze
  1621. the data produced by the match finder.
  1622. Supported
  1623. .I modes
  1624. are
  1625. .B fast
  1626. and
  1627. .BR normal .
  1628. The default is
  1629. .B fast
  1630. for
  1631. .I presets
  1632. 0\(en3 and
  1633. .B normal
  1634. for
  1635. .I presets
  1636. 4\(en9.
  1637. .IP ""
  1638. Usually
  1639. .B fast
  1640. is used with Hash Chain match finders and
  1641. .B normal
  1642. with Binary Tree match finders.
  1643. This is also what the
  1644. .I presets
  1645. do.
  1646. .TP
  1647. .BI nice= nice
  1648. Specify what is considered to be a nice length for a match.
  1649. Once a match of at least
  1650. .I nice
  1651. bytes is found, the algorithm stops
  1652. looking for possibly better matches.
  1653. .IP ""
  1654. .I Nice
  1655. can be 2\(en273 bytes.
  1656. Higher values tend to give better compression ratio
  1657. at the expense of speed.
  1658. The default depends on the
  1659. .IR preset .
  1660. .TP
  1661. .BI depth= depth
  1662. Specify the maximum search depth in the match finder.
  1663. The default is the special value of 0,
  1664. which makes the compressor determine a reasonable
  1665. .I depth
  1666. from
  1667. .I mf
  1668. and
  1669. .IR nice .
  1670. .IP ""
A reasonable
.I depth
is 4\(en100 for Hash Chains and 16\(en1000 for Binary Trees.
  1674. Using very high values for
  1675. .I depth
  1676. can make the encoder extremely slow with some files.
  1677. Avoid setting the
  1678. .I depth
  1679. over 1000 unless you are prepared to interrupt
  1680. the compression in case it is taking far too long.
  1681. .RE
  1682. .IP ""
  1683. When decoding raw streams
  1684. .RB ( \-\-format=raw ),
  1685. LZMA2 needs only the dictionary
  1686. .IR size .
LZMA1 also needs
  1688. .IR lc ,
  1689. .IR lp ,
  1690. and
  1691. .IR pb .
  1692. .TP
  1693. \fB\-\-x86\fR[\fB=\fIoptions\fR]
  1694. .PD 0
  1695. .TP
  1696. \fB\-\-arm\fR[\fB=\fIoptions\fR]
  1697. .TP
  1698. \fB\-\-armthumb\fR[\fB=\fIoptions\fR]
  1699. .TP
  1700. \fB\-\-arm64\fR[\fB=\fIoptions\fR]
  1701. .TP
  1702. \fB\-\-powerpc\fR[\fB=\fIoptions\fR]
  1703. .TP
  1704. \fB\-\-ia64\fR[\fB=\fIoptions\fR]
  1705. .TP
  1706. \fB\-\-sparc\fR[\fB=\fIoptions\fR]
  1707. .PD
  1708. Add a branch/call/jump (BCJ) filter to the filter chain.
  1709. These filters can be used only as a non-last filter
  1710. in the filter chain.
  1711. .IP ""
  1712. A BCJ filter converts relative addresses in
  1713. the machine code to their absolute counterparts.
  1714. This doesn't change the size of the data
  1715. but it increases redundancy,
which can help LZMA2 to produce a 0\(en15\ % smaller
  1717. .B .xz
  1718. file.
  1719. The BCJ filters are always reversible,
so using a BCJ filter for the wrong type of data
  1721. doesn't cause any data loss, although it may make
  1722. the compression ratio slightly worse.
  1723. The BCJ filters are very fast and
  1724. use an insignificant amount of memory.
  1725. .IP ""
  1726. These BCJ filters have known problems related to
  1727. the compression ratio:
  1728. .RS
  1729. .IP \(bu 3
  1730. Some types of files containing executable code
  1731. (for example, object files, static libraries, and Linux kernel modules)
  1732. have the addresses in the instructions filled with filler values.
  1733. These BCJ filters will still do the address conversion,
  1734. which will make the compression worse with these files.
  1735. .IP \(bu 3
  1736. If a BCJ filter is applied on an archive,
  1737. it is possible that it makes the compression ratio
  1738. worse than not using a BCJ filter.
  1739. For example, if there are similar or even identical executables
  1740. then filtering will likely make the files less similar
  1741. and thus compression is worse.
  1742. The contents of non-executable files in the same archive can matter too.
  1743. In practice one has to try with and without a BCJ filter to see
  1744. which is better in each situation.
  1745. .RE
  1746. .IP ""
  1747. Different instruction sets have different alignment:
  1748. the executable file must be aligned to a multiple of
  1749. this value in the input data to make the filter work.
  1750. .RS
  1751. .RS
  1752. .PP
  1753. .TS
  1754. tab(;);
  1755. l n l
  1756. l n l.
  1757. Filter;Alignment;Notes
  1758. x86;1;32-bit or 64-bit x86
  1759. ARM;4;
  1760. ARM-Thumb;2;
  1761. ARM64;4;4096-byte alignment is best
  1762. PowerPC;4;Big endian only
  1763. IA-64;16;Itanium
  1764. SPARC;4;
  1765. .TE
  1766. .RE
  1767. .RE
  1768. .IP ""
  1769. Since the BCJ-filtered data is usually compressed with LZMA2,
  1770. the compression ratio may be improved slightly if
  1771. the LZMA2 options are set to match the
  1772. alignment of the selected BCJ filter.
  1773. For example, with the IA-64 filter, it's good to set
  1774. .B pb=4
  1775. or even
  1776. .B pb=4,lp=4,lc=0
  1777. with LZMA2 (2^4=16).
  1778. The x86 filter is an exception;
  1779. it's usually good to stick to LZMA2's default
  1780. four-byte alignment when compressing x86 executables.
  1781. .IP ""
  1782. All BCJ filters support the same
  1783. .IR options :
  1784. .RS
  1785. .TP
  1786. .BI start= offset
  1787. Specify the start
  1788. .I offset
  1789. that is used when converting between relative
  1790. and absolute addresses.
  1791. The
  1792. .I offset
  1793. must be a multiple of the alignment of the filter
  1794. (see the table above).
  1795. The default is zero.
  1796. In practice, the default is good; specifying a custom
  1797. .I offset
  1798. is almost never useful.
  1799. .RE
  1800. .TP
  1801. \fB\-\-delta\fR[\fB=\fIoptions\fR]
  1802. Add the Delta filter to the filter chain.
The Delta filter can be used only as a non-last filter
  1804. in the filter chain.
  1805. .IP ""
  1806. Currently only simple byte-wise delta calculation is supported.
  1807. It can be useful when compressing, for example, uncompressed bitmap images
  1808. or uncompressed PCM audio.
  1809. However, special purpose algorithms may give significantly better
  1810. results than Delta + LZMA2.
  1811. This is true especially with audio,
  1812. which compresses faster and better, for example, with
  1813. .BR flac (1).
  1814. .IP ""
  1815. Supported
  1816. .IR options :
  1817. .RS
  1818. .TP
  1819. .BI dist= distance
  1820. Specify the
  1821. .I distance
  1822. of the delta calculation in bytes.
  1823. .I distance
  1824. must be 1\(en256.
  1825. The default is 1.
  1826. .IP ""
  1827. For example, with
  1828. .B dist=2
  1829. and eight-byte input A1 B1 A2 B3 A3 B5 A4 B7, the output will be
  1830. A1 B1 01 02 01 02 01 02.
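.IP ""
The calculation in this example can be sketched in shell.
.B delta_encode
is a hypothetical helper that works on hexadecimal byte values given as
arguments; it only illustrates the byte-wise subtraction and is not how
.B xz
implements the filter.
.IP ""
.nf
```shell
# delta_encode DIST BYTE...
# Each BYTE is two hex digits. Every output byte is the input byte minus
# the input byte DIST positions earlier, modulo 256; the first DIST bytes
# are copied unchanged.
delta_encode() {
    dist=$1
    shift
    out=""
    i=1
    for byte in "$@"; do
        cur=$(( 0x$byte ))
        if [ "$i" -gt "$dist" ]; then
            # Fetch positional parameter number (i - dist).
            eval 'prev=${'"$(( i - dist ))"'}'
            cur=$(( (cur - 0x$prev) & 0xFF ))
        fi
        out="$out $(printf '%02X' "$cur")"
        i=$(( i + 1 ))
    done
    echo $out
}

delta_encode 2 A1 B1 A2 B3 A3 B5 A4 B7   # prints: A1 B1 01 02 01 02 01 02
```
.fi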
  1831. .RE
  1832. .
  1833. .SS "Other options"
  1834. .TP
  1835. .BR \-q ", " \-\-quiet
  1836. Suppress warnings and notices.
  1837. Specify this twice to suppress errors too.
  1838. This option has no effect on the exit status.
  1839. That is, even if a warning was suppressed,
  1840. the exit status to indicate a warning is still used.
  1841. .TP
  1842. .BR \-v ", " \-\-verbose
  1843. Be verbose.
  1844. If standard error is connected to a terminal,
  1845. .B xz
  1846. will display a progress indicator.
  1847. Specifying
  1848. .B \-\-verbose
  1849. twice will give even more verbose output.
  1850. .IP ""
  1851. The progress indicator shows the following information:
  1852. .RS
  1853. .IP \(bu 3
  1854. Completion percentage is shown
  1855. if the size of the input file is known.
  1856. That is, the percentage cannot be shown in pipes.
  1857. .IP \(bu 3
  1858. Amount of compressed data produced (compressing)
  1859. or consumed (decompressing).
  1860. .IP \(bu 3
  1861. Amount of uncompressed data consumed (compressing)
  1862. or produced (decompressing).
  1863. .IP \(bu 3
  1864. Compression ratio, which is calculated by dividing
  1865. the amount of compressed data processed so far by
  1866. the amount of uncompressed data processed so far.
  1867. .IP \(bu 3
  1868. Compression or decompression speed.
  1869. This is measured as the amount of uncompressed data consumed
  1870. (compression) or produced (decompression) per second.
  1871. It is shown after a few seconds have passed since
  1872. .B xz
  1873. started processing the file.
  1874. .IP \(bu 3
  1875. Elapsed time in the format M:SS or H:MM:SS.
  1876. .IP \(bu 3
  1877. Estimated remaining time is shown
  1878. only when the size of the input file is
  1879. known and a couple of seconds have already passed since
  1880. .B xz
  1881. started processing the file.
  1882. The time is shown in a less precise format which
  1883. never has any colons, for example, 2 min 30 s.
  1884. .RE
  1885. .IP ""
  1886. When standard error is not a terminal,
  1887. .B \-\-verbose
  1888. will make
  1889. .B xz
  1890. print the filename, compressed size, uncompressed size,
  1891. compression ratio, and possibly also the speed and elapsed time
  1892. on a single line to standard error after compressing or
  1893. decompressing the file.
  1894. The speed and elapsed time are included only when
  1895. the operation took at least a few seconds.
  1896. If the operation didn't finish, for example, due to user interruption,
the completion percentage is also printed
  1898. if the size of the input file is known.
  1899. .TP
  1900. .BR \-Q ", " \-\-no\-warn
  1901. Don't set the exit status to 2
  1902. even if a condition worth a warning was detected.
  1903. This option doesn't affect the verbosity level, thus both
  1904. .B \-\-quiet
  1905. and
  1906. .B \-\-no\-warn
  1907. have to be used to not display warnings and
  1908. to not alter the exit status.
  1909. .TP
  1910. .B \-\-robot
  1911. Print messages in a machine-parsable format.
  1912. This is intended to ease writing frontends that want to use
  1913. .B xz
  1914. instead of liblzma, which may be the case with various scripts.
  1915. The output with this option enabled is meant to be stable across
  1916. .B xz
  1917. releases.
  1918. See the section
  1919. .B "ROBOT MODE"
  1920. for details.
  1921. .TP
  1922. .B \-\-info\-memory
  1923. Display, in human-readable format, how much physical memory (RAM)
  1924. and how many processor threads
  1925. .B xz
  1926. thinks the system has and the memory usage limits for compression
  1927. and decompression, and exit successfully.
  1928. .TP
  1929. .BR \-h ", " \-\-help
  1930. Display a help message describing the most commonly used options,
  1931. and exit successfully.
  1932. .TP
  1933. .BR \-H ", " \-\-long\-help
  1934. Display a help message describing all features of
  1935. .BR xz ,
and exit successfully.
  1937. .TP
  1938. .BR \-V ", " \-\-version
  1939. Display the version number of
  1940. .B xz
and liblzma in human-readable format.
  1942. To get machine-parsable output, specify
  1943. .B \-\-robot
  1944. before
  1945. .BR \-\-version .
  1946. .
  1947. .SH "ROBOT MODE"
  1948. The robot mode is activated with the
  1949. .B \-\-robot
  1950. option.
  1951. It makes the output of
  1952. .B xz
  1953. easier to parse by other programs.
  1954. Currently
  1955. .B \-\-robot
  1956. is supported only together with
  1957. .BR \-\-version ,
  1958. .BR \-\-info\-memory ,
  1959. and
  1960. .BR \-\-list .
  1961. It will be supported for compression and
  1962. decompression in the future.
  1963. .
  1964. .SS Version
  1965. .B "xz \-\-robot \-\-version"
  1966. will print the version number of
  1967. .B xz
  1968. and liblzma in the following format:
  1969. .PP
  1970. .BI XZ_VERSION= XYYYZZZS
  1971. .br
  1972. .BI LIBLZMA_VERSION= XYYYZZZS
  1973. .TP
  1974. .I X
  1975. Major version.
  1976. .TP
  1977. .I YYY
  1978. Minor version.
  1979. Even numbers are stable.
  1980. Odd numbers are alpha or beta versions.
  1981. .TP
  1982. .I ZZZ
  1983. Patch level for stable releases or
  1984. just a counter for development releases.
  1985. .TP
  1986. .I S
  1987. Stability.
  1988. 0 is alpha, 1 is beta, and 2 is stable.
  1989. .I S
should always be 2 when
  1991. .I YYY
  1992. is even.
  1993. .PP
  1994. .I XYYYZZZS
  1995. are the same on both lines if
  1996. .B xz
  1997. and liblzma are from the same XZ Utils release.
  1998. .PP
  1999. Examples: 4.999.9beta is
  2000. .B 49990091
  2001. and
  2002. 5.0.0 is
  2003. .BR 50000002 .
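.PP
As a minimal sketch, the fields described above can be split apart with
.BR sh (1)
arithmetic.
The function name
.B decode_xz_version
is hypothetical and not part of XZ Utils,
and the sketch assumes that its argument is a well-formed
.I XYYYZZZS
number.

```shell
# Hypothetical helper (not part of xz): split an XYYYZZZS version
# number into its documented fields and print "X.Y.Z stability".
# POSIX sh has no local variables, so the names below are global.
decode_xz_version() {
    v=$1
    s=$((v % 10))               # S: stability digit
    z=$((v / 10 % 1000))        # ZZZ: patch level or counter
    y=$((v / 10000 % 1000))     # YYY: minor version
    x=$((v / 10000000))         # X: major version
    case $s in
        0) stability=alpha ;;
        1) stability=beta ;;
        2) stability=stable ;;
        *) stability=unknown ;;
    esac
    echo "$x.$y.$z $stability"
}
```

With the example values above,
.B "decode_xz_version 49990091"
prints
.B "4.999.9 beta"
and
.B "decode_xz_version 50000002"
prints
.BR "5.0.0 stable" .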
  2004. .
  2005. .SS "Memory limit information"
  2006. .B "xz \-\-robot \-\-info\-memory"
  2007. prints a single line with three tab-separated columns:
  2008. .IP 1. 4
  2009. Total amount of physical memory (RAM) in bytes.
  2010. .IP 2. 4
  2011. Memory usage limit for compression in bytes
  2012. .RB ( \-\-memlimit\-compress ).
  2013. A special value of
  2014. .B 0
  2015. indicates the default setting
  2016. which for single-threaded mode is the same as no limit.
  2017. .IP 3. 4
  2018. Memory usage limit for decompression in bytes
  2019. .RB ( \-\-memlimit\-decompress ).
  2020. A special value of
  2021. .B 0
  2022. indicates the default setting
  2023. which for single-threaded mode is the same as no limit.
  2024. .IP 4. 4
  2025. Since
  2026. .B xz
  2027. 5.3.4alpha:
  2028. Memory usage for multi-threaded decompression in bytes
  2029. .RB ( \-\-memlimit\-mt\-decompress ).
  2030. This is never zero because a system-specific default value
  2031. shown in column 5
  2032. is used if no limit has been specified explicitly.
  2033. This is also never greater than the value in column 3
  2034. even if a larger value has been specified with
  2035. .BR \-\-memlimit\-mt\-decompress .
  2036. .IP 5. 4
  2037. Since
  2038. .B xz
  2039. 5.3.4alpha:
  2040. A system-specific default memory usage limit
  2041. that is used to limit the number of threads
  2042. when compressing with an automatic number of threads
  2043. .RB ( \-\-threads=0 )
  2044. and no memory usage limit has been specified
  2045. .RB ( \-\-memlimit\-compress ).
  2046. This is also used as the default value for
  2047. .BR \-\-memlimit\-mt\-decompress .
  2048. .IP 6. 4
  2049. Since
  2050. .B xz
  2051. 5.3.4alpha:
  2052. Number of available processor threads.
  2053. .PP
  2054. In the future, the output of
  2055. .B "xz \-\-robot \-\-info\-memory"
  2056. may have more columns, but never more than a single line.
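.PP
The following sketch shows one possible way to split that line into named values in
.BR sh (1).
The function name
.B parse_info_memory
and the printed variable names are hypothetical.
Columns 4 to 6 are left empty when the output comes from an
.B xz
version that prints only three columns, and any columns added
in the future are collected into a spare variable and ignored.

```shell
# Hypothetical helper (not part of xz): read one line of
# "xz --robot --info-memory" output from standard input and
# print the columns as name=value lines.
parse_info_memory() {
    tab=$(printf '\t')
    # "rest" absorbs any columns that future versions may append.
    IFS=$tab read -r total comp decomp mt_decomp mt_default threads rest
    echo "total_ram=$total"
    echo "memlimit_compress=$comp"
    echo "memlimit_decompress=$decomp"
    echo "memlimit_mt_decompress=$mt_decomp"   # empty before 5.3.4alpha
    echo "mt_default_limit=$mt_default"        # empty before 5.3.4alpha
    echo "cpu_threads=$threads"                # empty before 5.3.4alpha
}

# Usage sketch:
#   xz --robot --info-memory | parse_info_memory
```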
  2057. .
  2058. .SS "List mode"
  2059. .B "xz \-\-robot \-\-list"
  2060. uses tab-separated output.
  2061. The first column of every line has a string
  2062. that indicates the type of the information found on that line:
  2063. .TP
  2064. .B name
  2065. This is always the first line when starting to list a file.
  2066. The second column on the line is the filename.
  2067. .TP
  2068. .B file
  2069. This line contains overall information about the
  2070. .B .xz
  2071. file.
  2072. This line is always printed after the
  2073. .B name
  2074. line.
  2075. .TP
  2076. .B stream
  2077. This line type is used only when
  2078. .B \-\-verbose
  2079. was specified.
  2080. There are as many
  2081. .B stream
  2082. lines as there are streams in the
  2083. .B .xz
  2084. file.
  2085. .TP
  2086. .B block
  2087. This line type is used only when
  2088. .B \-\-verbose
  2089. was specified.
  2090. There are as many
  2091. .B block
  2092. lines as there are blocks in the
  2093. .B .xz
  2094. file.
  2095. The
  2096. .B block
  2097. lines are shown after all the
  2098. .B stream
  2099. lines; different line types are not interleaved.
  2100. .TP
  2101. .B summary
  2102. This line type is used only when
  2103. .B \-\-verbose
  2104. was specified twice.
  2105. This line is printed after all
  2106. .B block
  2107. lines.
  2108. Like the
  2109. .B file
  2110. line, the
  2111. .B summary
  2112. line contains overall information about the
  2113. .B .xz
  2114. file.
  2115. .TP
  2116. .B totals
  2117. This line is always the very last line of the list output.
  2118. It shows the total counts and sizes.
  2119. .PP
  2120. The columns of the
  2121. .B file
  2122. lines:
  2123. .PD 0
  2124. .RS
  2125. .IP 2. 4
  2126. Number of streams in the file
  2127. .IP 3. 4
  2128. Total number of blocks in the stream(s)
  2129. .IP 4. 4
  2130. Compressed size of the file
  2131. .IP 5. 4
  2132. Uncompressed size of the file
  2133. .IP 6. 4
  2134. Compression ratio, for example,
  2135. .BR 0.123 .
  2136. If the ratio is over 9.999, three dashes
  2137. .RB ( \-\-\- )
  2138. are displayed instead of the ratio.
  2139. .IP 7. 4
  2140. Comma-separated list of integrity check names.
  2141. The following strings are used for the known check types:
  2142. .BR None ,
  2143. .BR CRC32 ,
  2144. .BR CRC64 ,
  2145. and
  2146. .BR SHA\-256 .
  2147. For unknown check types,
  2148. .BI Unknown\- N
  2149. is used, where
  2150. .I N
  2151. is the Check ID as a decimal number (one or two digits).
  2152. .IP 8. 4
  2153. Total size of stream padding in the file
  2154. .RE
  2155. .PD
  2156. .PP
  2157. The columns of the
  2158. .B stream
  2159. lines:
  2160. .PD 0
  2161. .RS
  2162. .IP 2. 4
  2163. Stream number (the first stream is 1)
  2164. .IP 3. 4
  2165. Number of blocks in the stream
  2166. .IP 4. 4
  2167. Compressed start offset
  2168. .IP 5. 4
  2169. Uncompressed start offset
  2170. .IP 6. 4
  2171. Compressed size (does not include stream padding)
  2172. .IP 7. 4
  2173. Uncompressed size
  2174. .IP 8. 4
  2175. Compression ratio
  2176. .IP 9. 4
  2177. Name of the integrity check
  2178. .IP 10. 4
  2179. Size of stream padding
  2180. .RE
  2181. .PD
  2182. .PP
  2183. The columns of the
  2184. .B block
  2185. lines:
  2186. .PD 0
  2187. .RS
  2188. .IP 2. 4
  2189. Number of the stream containing this block
  2190. .IP 3. 4
  2191. Block number relative to the beginning of the stream
  2192. (the first block is 1)
  2193. .IP 4. 4
  2194. Block number relative to the beginning of the file
  2195. .IP 5. 4
  2196. Compressed start offset relative to the beginning of the file
  2197. .IP 6. 4
  2198. Uncompressed start offset relative to the beginning of the file
  2199. .IP 7. 4
  2200. Total compressed size of the block (includes headers)
  2201. .IP 8. 4
  2202. Uncompressed size
  2203. .IP 9. 4
  2204. Compression ratio
  2205. .IP 10. 4
  2206. Name of the integrity check
  2207. .RE
  2208. .PD
  2209. .PP
  2210. If
  2211. .B \-\-verbose
  2212. was specified twice, additional columns are included on the
  2213. .B block
  2214. lines.
  2215. These are not displayed with a single
  2216. .BR \-\-verbose ,
  2217. because getting this information requires many seeks
  2218. and can thus be slow:
  2219. .PD 0
  2220. .RS
  2221. .IP 11. 4
  2222. Value of the integrity check in hexadecimal
  2223. .IP 12. 4
  2224. Block header size
  2225. .IP 13. 4
  2226. Block flags:
  2227. .B c
  2228. indicates that compressed size is present, and
  2229. .B u
  2230. indicates that uncompressed size is present.
  2231. If the flag is not set, a dash
  2232. .RB ( \- )
  2233. is shown instead to keep the string length fixed.
  2234. New flags may be added to the end of the string in the future.
  2235. .IP 14. 4
  2236. Size of the actual compressed data in the block (this excludes
  2237. the block header, block padding, and check fields)
  2238. .IP 15. 4
  2239. Amount of memory (in bytes) required to decompress
  2240. this block with this
  2241. .B xz
  2242. version
  2243. .IP 16. 4
  2244. Filter chain.
  2245. Note that most of the options used at compression time
  2246. cannot be known, because only the options
  2247. that are needed for decompression are stored in the
  2248. .B .xz
  2249. headers.
  2250. .RE
  2251. .PD
  2252. .PP
  2253. The columns of the
  2254. .B summary
  2255. lines:
  2256. .PD 0
  2257. .RS
  2258. .IP 2. 4
  2259. Amount of memory (in bytes) required to decompress
  2260. this file with this
  2261. .B xz
  2262. version
  2263. .IP 3. 4
  2264. .B yes
  2265. or
  2266. .B no
  2267. indicating if all block headers have both compressed size and
  2268. uncompressed size stored in them
  2269. .PP
  2270. .I Since
  2271. .B xz
  2272. .I 5.1.2alpha:
  2273. .IP 4. 4
  2274. Minimum
  2275. .B xz
  2276. version required to decompress the file
  2277. .RE
  2278. .PD
  2279. .PP
  2280. The columns of the
  2281. .B totals
  2282. line:
  2283. .PD 0
  2284. .RS
  2285. .IP 2. 4
  2286. Number of streams
  2287. .IP 3. 4
  2288. Number of blocks
  2289. .IP 4. 4
  2290. Compressed size
  2291. .IP 5. 4
  2292. Uncompressed size
  2293. .IP 6. 4
  2294. Average compression ratio
  2295. .IP 7. 4
  2296. Comma-separated list of integrity check names
  2297. that were present in the files
  2298. .IP 8. 4
  2299. Stream padding size
  2300. .IP 9. 4
  2301. Number of files.
  2302. This is here to
  2303. keep the order of the earlier columns the same as on
  2304. .B file
  2305. lines.
  2306. .PD
  2307. .RE
  2308. .PP
  2309. If
  2310. .B \-\-verbose
  2311. was specified twice, additional columns are included on the
  2312. .B totals
  2313. line:
  2314. .PD 0
  2315. .RS
  2316. .IP 10. 4
  2317. Maximum amount of memory (in bytes) required to decompress
  2318. the files with this
  2319. .B xz
  2320. version
  2321. .IP 11. 4
  2322. .B yes
  2323. or
  2324. .B no
  2325. indicating if all block headers have both compressed size and
  2326. uncompressed size stored in them
  2327. .PP
  2328. .I Since
  2329. .B xz
  2330. .I 5.1.2alpha:
  2331. .IP 12. 4
  2332. Minimum
  2333. .B xz
  2334. version required to decompress the file
  2335. .RE
  2336. .PD
  2337. .PP
  2338. Future versions may add new line types and
  2339. new columns can be added to the existing line types,
  2340. but the existing columns won't be changed.
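.PP
As a rough sketch of how the line types above can be consumed,
the following
.BR sh (1)
function pairs each
.B name
line with the
.B file
line that follows it and prints the filename together with
columns 4 and 5 (compressed and uncompressed size).
The function name
.B list_file_sizes
is hypothetical.
Line types it does not know are skipped, which is one way to stay
compatible with line types added later.

```shell
# Hypothetical helper (not part of xz): read "xz --robot --list"
# output from standard input and print one tab-separated line per
# file: filename, compressed size, uncompressed size.
list_file_sizes() {
    awk -F '\t' '
        # "name" is always the first line for a file; remember it.
        $1 == "name" { name = $2 }
        # Columns 4 and 5 of "file" hold the two sizes.
        $1 == "file" { print name "\t" $4 "\t" $5 }
    '
}

# Usage sketch:
#   xz --robot --list *.xz | list_file_sizes
```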
  2341. .
  2342. .SH "EXIT STATUS"
  2343. .TP
  2344. .B 0
  2345. All is good.
  2346. .TP
  2347. .B 1
  2348. An error occurred.
  2349. .TP
  2350. .B 2
  2351. Something worth a warning occurred,
  2352. but no actual errors occurred.
  2353. .PP
  2354. Notices (not warnings or errors) printed on standard error
  2355. don't affect the exit status.
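.PP
A script that wants to treat warnings as non-fatal
can test for exit status 2 explicitly.
The following is only a sketch; the function name
.B run_tolerating_warnings
is hypothetical.
The
.B \-\-no\-warn
option of
.B xz
is an alternative that avoids exit status 2 in the first place.

```shell
# Hypothetical helper (not part of xz): run the given command and
# succeed if it exits with 0 (all good) or 2 (only warnings).
# Any other status, including 1 (error), is returned unchanged.
run_tolerating_warnings() {
    "$@"
    status=$?
    if [ "$status" -eq 2 ]; then
        echo "warning: '$1' exited with status 2" >&2
        return 0
    fi
    return "$status"
}

# Usage sketch:
#   run_tolerating_warnings xz -k somefile || echo "xz failed" >&2
```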
  2356. .
  2357. .SH ENVIRONMENT
  2358. .B xz
  2359. parses space-separated lists of options
  2360. from the environment variables
  2361. .B XZ_DEFAULTS
  2362. and
  2363. .BR XZ_OPT ,
  2364. in this order, before parsing the options from the command line.
  2365. Note that only options are parsed from the environment variables;
  2366. all non-options are silently ignored.
  2367. Parsing is done with
  2368. .BR getopt_long (3)
  2369. which is also used for the command line arguments.
  2370. .TP
  2371. .B XZ_DEFAULTS
  2372. User-specific or system-wide default options.
  2373. Typically this is set in a shell initialization script to enable
  2374. .BR xz 's
  2375. memory usage limiter by default.
  2376. Excluding shell initialization scripts
  2377. and similar special cases, scripts must never set or unset
  2378. .BR XZ_DEFAULTS .
  2379. .TP
  2380. .B XZ_OPT
  2381. This is for passing options to
  2382. .B xz
  2383. when it is not possible to set the options directly on the
  2384. .B xz
  2385. command line.
  2386. This is the case when
  2387. .B xz
  2388. is run by a script or tool, for example, GNU
  2389. .BR tar (1):
  2390. .RS
  2391. .RS
  2392. .PP
  2393. .nf
  2394. .ft CW
  2395. XZ_OPT=\-2v tar caf foo.tar.xz foo
  2396. .ft R
  2397. .fi
  2398. .RE
  2399. .RE
  2400. .IP ""
  2401. Scripts may use
  2402. .BR XZ_OPT ,
  2403. for example, to set script-specific default compression options.
  2404. It is still recommended to allow users to override
  2405. .B XZ_OPT
  2406. if that is reasonable.
  2407. For example, in
  2408. .BR sh (1)
  2409. scripts one may use something like this:
  2410. .RS
  2411. .RS
  2412. .PP
  2413. .nf
  2414. .ft CW
  2415. XZ_OPT=${XZ_OPT\-"\-7e"}
  2416. export XZ_OPT
  2417. .ft R
  2418. .fi
  2419. .RE
  2420. .RE
  2421. .
  2422. .SH "LZMA UTILS COMPATIBILITY"
  2423. The command line syntax of
  2424. .B xz
  2425. is practically a superset of
  2426. .BR lzma ,
  2427. .BR unlzma ,
  2428. and
  2429. .B lzcat
  2430. as found in LZMA Utils 4.32.x.
  2431. In most cases, it is possible to replace
  2432. LZMA Utils with XZ Utils without breaking existing scripts.
  2433. There are some incompatibilities though,
  2434. which may sometimes cause problems.
  2435. .
  2436. .SS "Compression preset levels"
  2437. The numbering of the compression level presets is not identical in
  2438. .B xz
  2439. and LZMA Utils.
  2440. The most important difference is how dictionary sizes
  2441. are mapped to different presets.
  2442. Dictionary size is roughly equal to the decompressor memory usage.
  2443. .RS
  2444. .PP
  2445. .TS
  2446. tab(;);
  2447. c c c
  2448. c n n.
  2449. Level;xz;LZMA Utils
  2450. \-0;256 KiB;N/A
  2451. \-1;1 MiB;64 KiB
  2452. \-2;2 MiB;1 MiB
  2453. \-3;4 MiB;512 KiB
  2454. \-4;4 MiB;1 MiB
  2455. \-5;8 MiB;2 MiB
  2456. \-6;8 MiB;4 MiB
  2457. \-7;16 MiB;8 MiB
  2458. \-8;32 MiB;16 MiB
  2459. \-9;64 MiB;32 MiB
  2460. .TE
  2461. .RE
  2462. .PP
  2463. The dictionary size differences affect
  2464. the compressor memory usage too,
  2465. but there are some other differences between
  2466. LZMA Utils and XZ Utils, which
  2467. make the difference even bigger:
  2468. .RS
  2469. .PP
  2470. .TS
  2471. tab(;);
  2472. c c c
  2473. c n n.
  2474. Level;xz;LZMA Utils 4.32.x
  2475. \-0;3 MiB;N/A
  2476. \-1;9 MiB;2 MiB
  2477. \-2;17 MiB;12 MiB
  2478. \-3;32 MiB;12 MiB
  2479. \-4;48 MiB;16 MiB
  2480. \-5;94 MiB;26 MiB
  2481. \-6;94 MiB;45 MiB
  2482. \-7;186 MiB;83 MiB
  2483. \-8;370 MiB;159 MiB
  2484. \-9;674 MiB;311 MiB
  2485. .TE
  2486. .RE
  2487. .PP
  2488. The default preset level in LZMA Utils is
  2489. .B \-7
  2490. while in XZ Utils it is
  2491. .BR \-6 ,
  2492. so both use an 8 MiB dictionary by default.
  2493. .
  2494. .SS "Streamed vs. non-streamed .lzma files"
  2495. The uncompressed size of the file can be stored in the
  2496. .B .lzma
  2497. header.
  2498. LZMA Utils does that when compressing regular files.
  2499. The alternative is to mark the uncompressed size as unknown
  2500. and use an end-of-payload marker to indicate
  2501. where the decompressor should stop.
  2502. LZMA Utils uses this method when uncompressed size isn't known,
  2503. which is the case, for example, in pipes.
  2504. .PP
  2505. .B xz
  2506. supports decompressing
  2507. .B .lzma
  2508. files with or without an end-of-payload marker, but all
  2509. .B .lzma
  2510. files created by
  2511. .B xz
  2512. will use an end-of-payload marker and have the uncompressed size
  2513. marked as unknown in the
  2514. .B .lzma
  2515. header.
  2516. This may be a problem in some uncommon situations.
  2517. For example, a
  2518. .B .lzma
  2519. decompressor in an embedded device might work
  2520. only with files that have known uncompressed size.
  2521. If you hit this problem, you need to use LZMA Utils
  2522. or LZMA SDK to create
  2523. .B .lzma
  2524. files with known uncompressed size.
  2525. .
  2526. .SS "Unsupported .lzma files"
  2527. The
  2528. .B .lzma
  2529. format allows
  2530. .I lc
  2531. values up to 8, and
  2532. .I lp
  2533. values up to 4.
  2534. LZMA Utils can decompress files with any
  2535. .I lc
  2536. and
  2537. .IR lp ,
  2538. but always creates files with
  2539. .B lc=3
  2540. and
  2541. .BR lp=0 .
  2542. Creating files with other
  2543. .I lc
  2544. and
  2545. .I lp
  2546. is possible with
  2547. .B xz
  2548. and with LZMA SDK.
  2549. .PP
  2550. The implementation of the LZMA1 filter in liblzma
  2551. requires that the sum of
  2552. .I lc
  2553. and
  2554. .I lp
  2555. must not exceed 4.
  2556. Thus,
  2557. .B .lzma
  2558. files that exceed this limitation cannot be decompressed with
  2559. .BR xz .
  2560. .PP
  2561. LZMA Utils creates only
  2562. .B .lzma
  2563. files which have a dictionary size of
  2564. .RI "2^" n
  2565. (a power of 2) but accepts files with any dictionary size.
  2566. liblzma accepts only
  2567. .B .lzma
  2568. files which have a dictionary size of
  2569. .RI "2^" n
  2570. or
  2571. .RI "2^" n " + 2^(" n "\-1)."
  2572. This is to decrease false positives when detecting
  2573. .B .lzma
  2574. files.
  2575. .PP
  2576. These limitations shouldn't be a problem in practice,
  2577. since practically all
  2578. .B .lzma
  2579. files have been compressed with settings that liblzma will accept.
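.PP
The two conditions described above can be illustrated with a short
.BR sh (1)
sketch.
The function names are hypothetical, and the code only mirrors
the rules as stated in this section;
it is not the actual check performed by liblzma,
which may enforce additional limits.

```shell
# Hypothetical helper (not part of xz or liblzma): succeed if a
# dictionary size in bytes has the form 2^n or 2^n + 2^(n-1).
is_accepted_dict_size() {
    d=$1
    [ "$d" -gt 0 ] || return 1
    # Strip factors of two until the value is odd.
    while [ $((d % 2)) -eq 0 ]; do
        d=$((d / 2))
    done
    # 2^n leaves 1; 2^n + 2^(n-1) = 3 * 2^(n-1) leaves 3.
    [ "$d" -eq 1 ] || [ "$d" -eq 3 ]
}

# Hypothetical helper: succeed if the sum of lc and lp is at most 4.
is_accepted_lc_lp() {
    [ $(($1 + $2)) -le 4 ]
}
```

For example, 8\ MiB (8388608) and 12\ MiB (12582912) pass the
dictionary size test while 10\ MiB (10485760) does not, and
.B lc=3
with
.B lp=0
passes the second test.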
  2580. .
  2581. .SS "Trailing garbage"
  2582. When decompressing,
  2583. LZMA Utils silently ignores everything after the first
  2584. .B .lzma
  2585. stream.
  2586. In most situations, this is a bug.
  2587. This also means that LZMA Utils
  2588. doesn't support decompressing concatenated
  2589. .B .lzma
  2590. files.
  2591. .PP
  2592. If there is data left after the first
  2593. .B .lzma
  2594. stream,
  2595. .B xz
  2596. considers the file to be corrupt unless
  2597. .B \-\-single\-stream
  2598. was used.
  2599. This may break obscure scripts which have
  2600. assumed that trailing garbage is ignored.
  2601. .
  2602. .SH NOTES
  2603. .
  2604. .SS "Compressed output may vary"
  2605. The exact compressed output produced from
  2606. the same uncompressed input file
  2607. may vary between XZ Utils versions even if
  2608. compression options are identical.
  2609. This is because the encoder can be improved
  2610. (faster or better compression)
  2611. without affecting the file format.
  2612. The output can vary even between different
  2613. builds of the same XZ Utils version,
  2614. if different build options are used.
  2615. .PP
  2616. The above means that once
  2617. .B \-\-rsyncable
  2618. has been implemented,
  2619. the resulting files won't necessarily be rsyncable
  2620. unless both old and new files have been compressed
  2621. with the same xz version.
  2622. This problem can be fixed if a part of the encoder
  2623. implementation is frozen to keep rsyncable output
  2624. stable across xz versions.
  2625. .
  2626. .SS "Embedded .xz decompressors"
  2627. Embedded
  2628. .B .xz
  2629. decompressor implementations like XZ Embedded don't necessarily
  2630. support files created with integrity
  2631. .I check
  2632. types other than
  2633. .B none
  2634. and
  2635. .BR crc32 .
  2636. Since the default is
  2637. .BR \-\-check=crc64 ,
  2638. you must use
  2639. .B \-\-check=none
  2640. or
  2641. .B \-\-check=crc32
  2642. when creating files for embedded systems.
  2643. .PP
  2644. Outside embedded systems, all
  2645. .B .xz
  2646. format decompressors support all the
  2647. .I check
  2648. types, or at least are able to decompress
  2649. the file without verifying the
  2650. integrity check if the particular
  2651. .I check
  2652. is not supported.
  2653. .PP
  2654. XZ Embedded supports BCJ filters,
  2655. but only with the default start offset.
  2656. .
  2657. .SH EXAMPLES
  2658. .
  2659. .SS Basics
  2660. Compress the file
  2661. .I foo
  2662. into
  2663. .I foo.xz
  2664. using the default compression level
  2665. .RB ( \-6 ),
  2666. and remove
  2667. .I foo
  2668. if compression is successful:
  2669. .RS
  2670. .PP
  2671. .nf
  2672. .ft CW
  2673. xz foo
  2674. .ft R
  2675. .fi
  2676. .RE
  2677. .PP
  2678. Decompress
  2679. .I bar.xz
  2680. into
  2681. .I bar
  2682. and don't remove
  2683. .I bar.xz
  2684. even if decompression is successful:
  2685. .RS
  2686. .PP
  2687. .nf
  2688. .ft CW
  2689. xz \-dk bar.xz
  2690. .ft R
  2691. .fi
  2692. .RE
  2693. .PP
  2694. Create
  2695. .I baz.tar.xz
  2696. with the preset
  2697. .B \-4e
  2698. .RB ( "\-4 \-\-extreme" ),
  2699. which is slower than the default
  2700. .BR \-6 ,
  2701. but needs less memory for compression and decompression (48\ MiB
  2702. and 5\ MiB, respectively):
  2703. .RS
  2704. .PP
  2705. .nf
  2706. .ft CW
  2707. tar cf \- baz | xz \-4e > baz.tar.xz
  2708. .ft R
  2709. .fi
  2710. .RE
  2711. .PP
  2712. A mix of compressed and uncompressed files can be decompressed
  2713. to standard output with a single command:
  2714. .RS
  2715. .PP
  2716. .nf
  2717. .ft CW
  2718. xz \-dcf a.txt b.txt.xz c.txt d.txt.lzma > abcd.txt
  2719. .ft R
  2720. .fi
  2721. .RE
  2722. .
  2723. .SS "Parallel compression of many files"
  2724. On GNU and *BSD,
  2725. .BR find (1)
  2726. and
  2727. .BR xargs (1)
  2728. can be used to parallelize compression of many files:
  2729. .RS
  2730. .PP
  2731. .nf
  2732. .ft CW
  2733. find . \-type f \e! \-name '*.xz' \-print0 \e
  2734.     | xargs \-0r \-P4 \-n16 xz \-T1
  2735. .ft R
  2736. .fi
  2737. .RE
  2738. .PP
  2739. The
  2740. .B \-P
  2741. option to
  2742. .BR xargs (1)
  2743. sets the number of parallel
  2744. .B xz
  2745. processes.
  2746. The best value for the
  2747. .B \-n
  2748. option depends on how many files there are to be compressed.
  2749. If there are only a couple of files,
  2750. the value should probably be 1;
  2751. with tens of thousands of files,
  2752. 100 or even more may be appropriate to reduce the number of
  2753. .B xz
  2754. processes that
  2755. .BR xargs (1)
  2756. will eventually create.
  2757. .PP
  2758. The option
  2759. .B \-T1
  2760. for
  2761. .B xz
  2762. is there to force it to single-threaded mode, because
  2763. .BR xargs (1)
  2764. is used to control the amount of parallelization.
  2765. .
  2766. .SS "Robot mode"
  2767. Calculate how many bytes have been saved in total
  2768. after compressing multiple files:
  2769. .RS
  2770. .PP
  2771. .nf
  2772. .ft CW
  2773. xz \-\-robot \-\-list *.xz | awk '/^totals/{print $5\-$4}'
  2774. .ft R
  2775. .fi
  2776. .RE
  2777. .PP
  2778. A script may want to know that it is using a new enough
  2779. .BR xz .
  2780. The following
  2781. .BR sh (1)
  2782. script checks that the version number of the
  2783. .B xz
  2784. tool is at least 5.0.0.
  2785. This method is compatible with old beta versions,
  2786. which didn't support the
  2787. .B \-\-robot
  2788. option:
  2789. .RS
  2790. .PP
  2791. .nf
  2792. .ft CW
  2793. if ! eval "$(xz \-\-robot \-\-version 2> /dev/null)" ||
  2794.         [ "$XZ_VERSION" \-lt 50000002 ]; then
  2795.     echo "Your xz is too old."
  2796. fi
  2797. unset XZ_VERSION LIBLZMA_VERSION
  2798. .ft R
  2799. .fi
  2800. .RE
  2801. .PP
  2802. Set a memory usage limit for decompression using
  2803. .BR XZ_OPT ,
  2804. but if a limit has already been set, don't increase it:
  2805. .RS
  2806. .PP
  2807. .nf
  2808. .ft CW
  2809. NEWLIM=$((123 << 20))\ \ # 123 MiB
  2810. OLDLIM=$(xz \-\-robot \-\-info\-memory | cut \-f3)
  2811. if [ "$OLDLIM" \-eq 0 ] || [ "$OLDLIM" \-gt "$NEWLIM" ]; then
  2812.     XZ_OPT="$XZ_OPT \-\-memlimit\-decompress=$NEWLIM"
  2813.     export XZ_OPT
  2814. fi
  2815. .ft R
  2816. .fi
  2817. .RE
  2818. .
  2819. .SS "Custom compressor filter chains"
  2820. The simplest use for custom filter chains is
  2821. customizing an LZMA2 preset.
  2822. This can be useful,
  2823. because the presets cover only a subset of the
  2824. potentially useful combinations of compression settings.
  2825. .PP
  2826. The CompCPU columns of the tables
  2827. from the descriptions of the options
  2828. .BR "\-0" " ... " "\-9"
  2829. and
  2830. .B \-\-extreme
  2831. are useful when customizing LZMA2 presets.
  2832. Here are the relevant parts collected from those two tables:
  2833. .RS
  2834. .PP
  2835. .TS
  2836. tab(;);
  2837. c c
  2838. n n.
  2839. Preset;CompCPU
  2840. \-0;0
  2841. \-1;1
  2842. \-2;2
  2843. \-3;3
  2844. \-4;4
  2845. \-5;5
  2846. \-6;6
  2847. \-5e;7
  2848. \-6e;8
  2849. .TE
  2850. .RE
  2851. .PP
  2852. If you know that a file requires
  2853. a somewhat big dictionary (for example, 32\ MiB) to compress well,
  2854. but you want to compress it quicker than
  2855. .B "xz \-8"
  2856. would do, a preset with a low CompCPU value (for example, 1)
  2857. can be modified to use a bigger dictionary:
  2858. .RS
  2859. .PP
  2860. .nf
  2861. .ft CW
  2862. xz \-\-lzma2=preset=1,dict=32MiB foo.tar
  2863. .ft R
  2864. .fi
  2865. .RE
  2866. .PP
  2867. With certain files, the above command may be faster than
  2868. .B "xz \-6"
  2869. while compressing significantly better.
  2870. However, it must be emphasized that only some files benefit from
  2871. a big dictionary while keeping the CompCPU value low.
  2872. The most obvious situation,
  2873. where a big dictionary can help a lot,
  2874. is an archive containing very similar files
  2875. of at least a few megabytes each.
  2876. The dictionary size has to be significantly bigger
  2877. than any individual file to allow LZMA2 to take
  2878. full advantage of the similarities between consecutive files.
  2879. .PP
  2880. If very high compressor and decompressor memory usage is fine,
  2881. and the file being compressed is
  2882. at least several hundred megabytes, it may be useful
  2883. to use an even bigger dictionary than the 64 MiB that
  2884. .B "xz \-9"
  2885. would use:
  2886. .RS
  2887. .PP
  2888. .nf
  2889. .ft CW
  2890. xz \-vv \-\-lzma2=dict=192MiB big_foo.tar
  2891. .ft R
  2892. .fi
  2893. .RE
  2894. .PP
  2895. Using
  2896. .B \-vv
  2897. .RB ( "\-\-verbose \-\-verbose" )
  2898. like in the above example can be useful
  2899. to see the memory requirements
  2900. of the compressor and decompressor.
  2901. Remember that using a dictionary bigger than
  2902. the size of the uncompressed file is a waste of memory,
  2903. so the above command isn't useful for small files.
  2904. .PP
  2905. Sometimes the compression time doesn't matter,
  2906. but the decompressor memory usage has to be kept low, for example,
  2907. to make it possible to decompress the file on an embedded system.
  2908. The following command uses
  2909. .B \-6e
  2910. .RB ( "\-6 \-\-extreme" )
  2911. as a base and sets the dictionary to only 64\ KiB.
  2912. The resulting file can be decompressed with XZ Embedded
  2913. (that's why there is
  2914. .BR \-\-check=crc32 )
  2915. using about 100\ KiB of memory.
  2916. .RS
  2917. .PP
  2918. .nf
  2919. .ft CW
  2920. xz \-\-check=crc32 \-\-lzma2=preset=6e,dict=64KiB foo
  2921. .ft R
  2922. .fi
  2923. .RE
  2924. .PP
  2925. If you want to squeeze out as many bytes as possible,
  2926. adjusting the number of literal context bits
  2927. .RI ( lc )
  2928. and number of position bits
  2929. .RI ( pb )
  2930. can sometimes help.
  2931. Adjusting the number of literal position bits
  2932. .RI ( lp )
  2933. might help too, but usually
  2934. .I lc
  2935. and
  2936. .I pb
  2937. are more important.
  2938. For example, a source code archive contains mostly US-ASCII text,
  2939. so something like the following might give
  2940. a slightly (like 0.1\ %) smaller file than
  2941. .B "xz \-6e"
  2942. (try also without
  2943. .BR lc=4 ):
  2944. .RS
  2945. .PP
  2946. .nf
  2947. .ft CW
  2948. xz \-\-lzma2=preset=6e,pb=0,lc=4 source_code.tar
  2949. .ft R
  2950. .fi
  2951. .RE
  2952. .PP
  2953. Using another filter together with LZMA2 can improve
  2954. compression with certain file types.
  2955. For example, to compress an x86-32 or x86-64 shared library
  2956. using the x86 BCJ filter:
  2957. .RS
  2958. .PP
  2959. .nf
  2960. .ft CW
  2961. xz \-\-x86 \-\-lzma2 libfoo.so
  2962. .ft R
  2963. .fi
  2964. .RE
  2965. .PP
  2966. Note that the order of the filter options is significant.
  2967. If
  2968. .B \-\-x86
  2969. is specified after
  2970. .BR \-\-lzma2 ,
  2971. .B xz
  2972. will give an error,
  2973. because there cannot be any filter after LZMA2,
  2974. and also because the x86 BCJ filter cannot be used
  2975. as the last filter in the chain.
  2976. .PP
  2977. The Delta filter together with LZMA2
  2978. can give good results with bitmap images.
  2979. It should usually beat PNG,
  2980. which has a few more advanced filters than simple
  2981. delta but uses Deflate for the actual compression.
  2982. .PP
  2983. The image has to be saved in uncompressed format,
  2984. for example, as uncompressed TIFF.
  2985. The distance parameter of the Delta filter is set
  2986. to match the number of bytes per pixel in the image.
  2987. For example, a 24-bit RGB bitmap needs
  2988. .BR dist=3 ,
  2989. and it is also good to pass
  2990. .B pb=0
  2991. to LZMA2 to accommodate the three-byte alignment:
  2992. .RS
  2993. .PP
  2994. .nf
  2995. .ft CW
  2996. xz \-\-delta=dist=3 \-\-lzma2=pb=0 foo.tiff
  2997. .ft R
  2998. .fi
  2999. .RE
  3000. .PP
  3001. If multiple images have been put into a single archive (for example,
  3002. .BR .tar ),
  3003. the Delta filter will work on that too as long as all images
  3004. have the same number of bytes per pixel.
  3005. .
  3006. .SH "SEE ALSO"
  3007. .BR xzdec (1),
  3008. .BR xzdiff (1),
  3009. .BR xzgrep (1),
  3010. .BR xzless (1),
  3011. .BR xzmore (1),
  3012. .BR gzip (1),
  3013. .BR bzip2 (1),
  3014. .BR 7z (1)
  3015. .PP
  3016. XZ Utils: <https://tukaani.org/xz/>
  3017. .br
  3018. XZ Embedded: <https://tukaani.org/xz/embedded.html>
  3019. .br
  3020. LZMA SDK: <https://7-zip.org/sdk.html>