Developer Guide for Intel® Data Analytics Acceleration Library 2016 Update 4

Data Compression

When large amounts of data are sent across devices or need to be stored in memory or in persistent storage, data compression enables you to reduce network traffic and the memory and persistent storage footprint. Intel DAAL implements several of the most popular general-purpose compression and decompression methods: ZLIB, LZO, RLE, and BZIP2.

General API for Data Compression and Decompression

The CompressionStream and DecompressionStream classes provide general methods for data compression and decompression. The following diagram illustrates the compression and decompression flow at a high level:
Figure: Intel® DAAL compression flow

To specify the compression or decompression method and its parameters, pass a Compressor or Decompressor object as an argument to the CompressionStream or DecompressionStream constructor, respectively. For more details on Compressor and Decompressor, refer to Compression and Decompression Interfaces.

Use operator << of CompressionStream or DecompressionStream to feed input data into the compression or decompression stream. By default, the streams allocate the memory required to store the results of compression and decompression. For details on controlling memory allocation, refer to Compression and Decompression Interfaces.
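The stream pattern described above can be sketched in self-contained C++ as follows. This is a minimal illustration only, not the Intel DAAL API: ToyRleCompressor, ToyCompressionStream, compressedSize(), and copyCompressed() are hypothetical names introduced for this sketch, and a trivial run-length encoding stands in for a real compression method. The sketch shows a compressor object passed to the stream constructor, input blocks fed through operator <<, and result memory allocated by the stream itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical stand-in for a compressor object; NOT the Intel DAAL class.
// Encodes each run of equal bytes as a (count, value) pair, count <= 255.
struct ToyRleCompressor {
    Bytes compress(const Bytes& in) const {
        Bytes out;
        for (std::size_t i = 0; i < in.size();) {
            std::size_t count = 1;
            while (i + count < in.size() && in[i + count] == in[i] && count < 255) ++count;
            out.push_back(static_cast<std::uint8_t>(count));
            out.push_back(in[i]);
            i += count;
        }
        return out;
    }
};

// Hypothetical stand-in for a compression stream; NOT the Intel DAAL class.
class ToyCompressionStream {
public:
    // The compression method is chosen by the object passed to the constructor.
    explicit ToyCompressionStream(const ToyRleCompressor* compressor) : compressor_(compressor) {
        assert(compressor != nullptr);
    }

    // Each input block is compressed on arrival; the stream allocates
    // and owns the memory that holds the compressed result.
    ToyCompressionStream& operator<<(const Bytes& block) {
        blocks_.push_back(compressor_->compress(block));
        return *this;
    }

    // Total size in bytes of all compressed blocks held by the stream.
    std::size_t compressedSize() const {
        std::size_t total = 0;
        for (const Bytes& b : blocks_) total += b.size();
        return total;
    }

    // Returns one contiguous copy of all compressed blocks, in input order.
    Bytes copyCompressed() const {
        Bytes all;
        all.reserve(compressedSize());
        for (const Bytes& b : blocks_) all.insert(all.end(), b.begin(), b.end());
        return all;
    }

private:
    const ToyRleCompressor* compressor_;
    std::vector<Bytes> blocks_;
};
```

In this sketch the caller never sizes an output buffer: it feeds blocks with <<, queries the total size, and copies the result out once all input has been supplied.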

The following methods are available to retrieve compressed data stored in CompressionStream:

The following methods are available to retrieve decompressed data stored in DecompressionStream:

Compression and Decompression Interfaces

The CompressionStream and DecompressionStream classes cover the most typical usage scenarios. Therefore, you need to work directly with Compressor and Decompressor objects only in the following cases:

The Compressor and Decompressor classes provide interfaces to supported compression and decompression methods (ZLIB, LZO, RLE, and BZIP2).

Compression and decompression objects are initialized with a set of default parameters. You can modify parameters of a specific compression method by accessing the parameter field of the Compressor or Decompressor object.

To perform compression or decompression using the Compressor or Decompressor classes, respectively, provide input data using the setInputDataBlock() method and call the run() method. This approach requires that you allocate and manage the memory that stores the results of compression or decompression. In general, the required size of the output data block cannot be estimated accurately, so the memory you provide may be insufficient to hold the results. To check whether you need to allocate additional memory and continue the run() operation, use the isOutputDataBlockFull() method. To obtain the size of the compressed or decompressed data actually written to the output data block, use the getUsedOutputDataBlockSize() method.
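The call sequence described above can be sketched as a loop over a caller-owned output block. The class below is a hypothetical stand-in, not the Intel DAAL Compressor: it reuses the method names from the text, but the signatures (in particular the arguments of run()) are assumptions made for this sketch, and a trivial run-length encoding replaces a real compression method. The helper compressWithFixedBlocks() is likewise hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a compressor object; NOT the Intel DAAL class.
// Each run of equal bytes is written as a (count, value) pair, count <= 255.
class ToyRleCompressor {
public:
    void setInputDataBlock(const std::uint8_t* data, std::size_t size) {
        in_ = data;
        inSize_ = size;
        pos_ = 0;
        used_ = 0;
        full_ = false;
    }

    // Writes as many (count, value) pairs as fit into out[0, outSize).
    // If input remains when the block fills up, the "full" flag is set and
    // the next call to run() resumes where this one stopped.
    void run(std::uint8_t* out, std::size_t outSize) {
        used_ = 0;
        full_ = false;
        while (pos_ < inSize_) {
            if (used_ + 2 > outSize) {
                full_ = true;
                return;
            }
            const std::uint8_t value = in_[pos_];
            std::size_t count = 1;
            while (pos_ + count < inSize_ && in_[pos_ + count] == value && count < 255) ++count;
            out[used_++] = static_cast<std::uint8_t>(count);
            out[used_++] = value;
            pos_ += count;
        }
    }

    bool isOutputDataBlockFull() const { return full_; }
    std::size_t getUsedOutputDataBlockSize() const { return used_; }

private:
    const std::uint8_t* in_ = nullptr;
    std::size_t inSize_ = 0;
    std::size_t pos_ = 0;
    std::size_t used_ = 0;
    bool full_ = false;
};

// Caller-managed memory: one fixed-size output block is reused, and its
// used portion is appended to the result after every run() call.
std::vector<std::uint8_t> compressWithFixedBlocks(const std::vector<std::uint8_t>& input,
                                                  std::size_t blockSize) {
    assert(blockSize >= 2);  // a block must hold at least one (count, value) pair
    ToyRleCompressor compressor;
    compressor.setInputDataBlock(input.data(), input.size());

    std::vector<std::uint8_t> block(blockSize);
    std::vector<std::uint8_t> result;
    do {
        compressor.run(block.data(), block.size());
        const auto used = static_cast<std::ptrdiff_t>(compressor.getUsedOutputDataBlockSize());
        result.insert(result.end(), block.begin(), block.begin() + used);
    } while (compressor.isOutputDataBlockFull());
    return result;
}
```

The design choice illustrated here is that the caller never needs to know the final output size in advance: a small block is reused until the compressor no longer reports it as full.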

You can use your own compression and decompression methods in CompressionStream and DecompressionStream. In this case, you need to provide your own classes that override the Compressor and Decompressor interfaces.
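The idea of plugging a user-defined method into the streams can be sketched with plain C++ virtual interfaces. Everything below is hypothetical and does not reproduce the Intel DAAL class hierarchy: ToyCompressor, ToyDecompressor, the two toy streams, and their run() and data() members are names assumed for this sketch. The user-defined method is a simple run-length encoding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical base interfaces; stand-ins, NOT the Intel DAAL classes.
class ToyCompressor {
public:
    virtual ~ToyCompressor() = default;
    virtual Bytes run(const Bytes& in) const = 0;
};

class ToyDecompressor {
public:
    virtual ~ToyDecompressor() = default;
    virtual Bytes run(const Bytes& in) const = 0;
};

// User-defined method: each run of equal bytes becomes a (count, value) pair.
class MyRleCompressor : public ToyCompressor {
public:
    Bytes run(const Bytes& in) const override {
        Bytes out;
        for (std::size_t i = 0; i < in.size();) {
            std::size_t count = 1;
            while (i + count < in.size() && in[i + count] == in[i] && count < 255) ++count;
            out.push_back(static_cast<std::uint8_t>(count));
            out.push_back(in[i]);
            i += count;
        }
        return out;
    }
};

// Matching user-defined decompressor: expands every (count, value) pair.
class MyRleDecompressor : public ToyDecompressor {
public:
    Bytes run(const Bytes& in) const override {
        Bytes out;
        for (std::size_t i = 0; i + 1 < in.size(); i += 2)
            out.insert(out.end(), static_cast<std::size_t>(in[i]), in[i + 1]);
        return out;
    }
};

// Hypothetical streams that depend only on the base interfaces, so any
// derived method can be supplied through the constructor.
class ToyCompressionStream {
public:
    explicit ToyCompressionStream(const ToyCompressor* c) : method_(c) { assert(c != nullptr); }
    ToyCompressionStream& operator<<(const Bytes& block) {
        const Bytes r = method_->run(block);
        data_.insert(data_.end(), r.begin(), r.end());
        return *this;
    }
    const Bytes& data() const { return data_; }

private:
    const ToyCompressor* method_;
    Bytes data_;
};

class ToyDecompressionStream {
public:
    explicit ToyDecompressionStream(const ToyDecompressor* d) : method_(d) { assert(d != nullptr); }
    // Assumes each block contains whole (count, value) pairs.
    ToyDecompressionStream& operator<<(const Bytes& block) {
        const Bytes r = method_->run(block);
        data_.insert(data_.end(), r.begin(), r.end());
        return *this;
    }
    const Bytes& data() const { return data_; }

private:
    const ToyDecompressor* method_;
    Bytes data_;
};
```

Because the toy streams call the method only through the base interface, replacing MyRleCompressor with another derived class requires no change to the stream code.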

Examples

C++:

Java*: