Adjust performance claims
Document the latest benchmarks on the Nexus 5X and change the "2-4x" overall claim to "2-6x". The peak performance on x86 platforms was already closer to 5x, and the addition of SIMD-accelerated Huffman encoding gave it that extra push over the cliff.
@@ -69,21 +69,28 @@ setting the JSIMD_NOHUFFENC environment variable to 1.
 [13] Added SIMD acceleration for Huffman encoding on NEON-capable ARM 32-bit
 platforms. This speeds up the compression of full-color JPEGs by about 30% on
-average. For the purposes of benchmarking or regression testing,
-SIMD-accelerated Huffman encoding can be disabled by setting the
-JSIMD_NOHUFFENC environment variable to 1.
+average on a Cortex-A9 core (iPhone 4S) and by about 6-7% on average on
+Cortex-A53 and Cortex-A57 cores. For the purposes of benchmarking or
+regression testing, SIMD-accelerated Huffman encoding can be disabled by
+setting the JSIMD_NOHUFFENC environment variable to 1.

 [14] Added ARM 64-bit (ARMv8) NEON SIMD implementations of the commonly-used
 compression algorithms (including the slow integer forward DCT and h2v2 & h2v1
 downsampling algorithms, which are not accelerated in the 32-bit NEON
-implementation.) This speeds up the overall 64-bit compression performance by
-about 2x on ARMv8 processors.
+implementation.) This speeds up the compression of full-color JPEGs by about
+75% on average on a Cavium ThunderX processor and by about 2-2.5x on average on
+Cortex-A53 and Cortex-A57 cores.

 [15] pkg-config (.pc) scripts are now included for both the libjpeg and
 TurboJPEG API libraries on Un*x systems. Note that if a project's build system
 relies on these scripts, then it will not be possible to build that project
 with libjpeg or with a prior version of libjpeg-turbo.

+[16] Optimized the ARM 64-bit (ARMv8) NEON SIMD decompression routines to
+improve performance on CPUs with in-order pipelines. This speeds up the
+decompression of full-color JPEGs by nearly 2x on average on a Cavium ThunderX
+processor and by about 15% on average on a Cortex-A53 core.
+

 1.4.2
 =====
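The benchmarking switch described in entry [13] lends itself to a simple A/B comparison: run the same workload once with SIMD Huffman encoding enabled and once with `JSIMD_NOHUFFENC` set to 1. The following is a minimal Python sketch of that idea. The helper names `benchmark_env` and `run_benchmark`, and whatever benchmark command the caller passes in, are hypothetical; only the `JSIMD_NOHUFFENC` variable comes from the ChangeLog.

```python
import os
import subprocess


def benchmark_env(disable_simd_huffman, base_env=None):
    """Return a copy of the environment for one benchmark run.

    Only JSIMD_NOHUFFENC comes from the ChangeLog entry; this helper is a
    hypothetical convenience wrapper.
    """
    env = dict(os.environ if base_env is None else base_env)
    if disable_simd_huffman:
        # Documented switch: a value of 1 disables SIMD Huffman encoding.
        env["JSIMD_NOHUFFENC"] = "1"
    else:
        # Make sure a value inherited from the parent shell does not leak in.
        env.pop("JSIMD_NOHUFFENC", None)
    return env


def run_benchmark(command, disable_simd_huffman):
    """Run a caller-supplied benchmark command (hypothetical) in that environment."""
    return subprocess.run(command, env=benchmark_env(disable_simd_huffman), check=True)
```

Building a fresh environment per run, rather than mutating `os.environ`, keeps the two measurements independent of whatever the parent shell had set.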
28
README.md
@@ -4,7 +4,7 @@ Background
 libjpeg-turbo is a JPEG image codec that uses SIMD instructions (MMX, SSE2,
 NEON, AltiVec) to accelerate baseline JPEG compression and decompression on
 x86, x86-64, ARM, and PowerPC systems. On such systems, libjpeg-turbo is
-generally 2-4x as fast as libjpeg, all else being equal. On other types of
+generally 2-6x as fast as libjpeg, all else being equal. On other types of
 systems, libjpeg-turbo can still outperform libjpeg by a significant amount, by
 virtue of its highly-optimized Huffman coding routines. In many cases, the
 performance of libjpeg-turbo rivals that of proprietary high-speed JPEG codecs.
@@ -36,7 +36,7 @@ Using libjpeg-turbo
 libjpeg-turbo includes two APIs that can be used to compress and decompress
 JPEG images:

-- **TurboJPEG API**
+- **TurboJPEG API**
 This API provides an easy-to-use interface for compressing and decompressing
 JPEG images in memory. It also provides some functionality that would not be
 straightforward to achieve using the underlying libjpeg API, such as
@@ -44,7 +44,7 @@ JPEG images:
 transforms on an image. The Java interface for libjpeg-turbo is written on
 top of the TurboJPEG API.

-- **libjpeg API**
+- **libjpeg API**
 This is the de facto industry-standard API for compressing and decompressing
 JPEG images. It is more difficult to use than the TurboJPEG API but also
 more powerful. The libjpeg API implementation in libjpeg-turbo is both
@@ -135,17 +135,17 @@ which aren't.
 #### Fully supported

-- **libjpeg: IDCT scaling extensions in decompressor**
+- **libjpeg: IDCT scaling extensions in decompressor**
 libjpeg-turbo supports IDCT scaling with scaling factors of 1/8, 1/4, 3/8,
 1/2, 5/8, 3/4, 7/8, 9/8, 5/4, 11/8, 3/2, 13/8, 7/4, 15/8, and 2/1 (only 1/4
 and 1/2 are SIMD-accelerated.)

 - **libjpeg: Arithmetic coding**

-- **libjpeg: In-memory source and destination managers**
+- **libjpeg: In-memory source and destination managers**
 See notes below.

-- **cjpeg: Separate quality settings for luminance and chrominance**
+- **cjpeg: Separate quality settings for luminance and chrominance**
 Note that the libjpeg v7+ API was extended to accommodate this feature only
 for convenience purposes. It has always been possible to implement this
 feature with libjpeg v6b (see rdswitch.c for an example.)
@@ -174,15 +174,15 @@ http://www.libjpeg-turbo.org/About/SmartScale and draw his/her own conclusions,
 but it is the general belief of our project that these features have not
 demonstrated sufficient usefulness to justify inclusion in libjpeg-turbo.

-- **libjpeg: DCT scaling in compressor**
-`cinfo.scale_num` and `cinfo.scale_denom` are silently ignored.
+- **libjpeg: DCT scaling in compressor**
+`cinfo.scale_num` and `cinfo.scale_denom` are silently ignored.
 There is no technical reason why DCT scaling could not be supported when
 emulating the libjpeg v7+ API/ABI, but without the SmartScale extension (see
 below), only scaling factors of 1/2, 8/15, 4/7, 8/13, 2/3, 8/11, 4/5, and
 8/9 would be available, which is of limited usefulness.

-- **libjpeg: SmartScale**
-`cinfo.block_size` is silently ignored.
+- **libjpeg: SmartScale**
+`cinfo.block_size` is silently ignored.
 SmartScale is an extension to the JPEG format that allows for DCT block
 sizes other than 8x8. Providing support for this new format would be
 feasible (particularly without full acceleration.) However, until/unless
@@ -194,15 +194,15 @@ demonstrated sufficient usefulness to justify inclusion in libjpeg-turbo.
 interest in providing this feature would be as a means of supporting
 additional DCT scaling factors.

-- **libjpeg: Fancy downsampling in compressor**
-`cinfo.do_fancy_downsampling` is silently ignored.
+- **libjpeg: Fancy downsampling in compressor**
+`cinfo.do_fancy_downsampling` is silently ignored.
 This requires the DCT scaling feature, which is not supported.

-- **jpegtran: Scaling**
+- **jpegtran: Scaling**
 This requires both the DCT scaling and SmartScale features, which are not
 supported.

-- **Lossless RGB JPEG files**
+- **Lossless RGB JPEG files**
 This requires the SmartScale feature, which is not supported.

 ### What About libjpeg v9?
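The two lists of scaling factors quoted in the README hunks above follow a simple pattern: the decompressor (IDCT) factors are N/8 for N = 1..16 with the unscaled 8/8 left out, and the compressor (DCT) factors that would be available without SmartScale are 8/N for N = 16 down to 9. The Python sketch below regenerates both lists and computes a scaled dimension. The names `IDCT_FACTORS`, `DCT_FACTORS`, and `scaled_size` are illustrative only, and rounding the scaled dimension up is an assumption about how the library sizes its output, not something stated in this diff.

```python
from fractions import Fraction

# Decompressor (IDCT) scaling factors listed in the README: N/8 for N = 1..16,
# leaving out 8/8 (no scaling).
IDCT_FACTORS = [Fraction(n, 8) for n in range(1, 17) if n != 8]

# Compressor (DCT) scaling factors that would exist without SmartScale:
# 8/N for N = 16 down to 9.
DCT_FACTORS = [Fraction(8, n) for n in range(16, 8, -1)]


def scaled_size(dimension, factor):
    """Scale an image dimension by `factor`, rounding up (assumed rule)."""
    # Ceiling division on integers: -(-a // b) == ceil(a / b).
    return -(-dimension * factor.numerator // factor.denominator)


if __name__ == "__main__":
    print(", ".join(str(f) for f in IDCT_FACTORS))
    print(", ".join(str(f) for f in DCT_FACTORS))
    print(scaled_size(1001, Fraction(3, 8)))
```

Using `Fraction` keeps the factors exact and reduces them automatically (8/14 prints as 4/7), which makes the generated lists easy to compare against the README text.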
@@ -1,4 +1,4 @@
-libjpeg-turbo is a JPEG image codec that uses SIMD instructions (MMX, SSE2, NEON, AltiVec) to accelerate baseline JPEG compression and decompression on x86, x86-64, ARM, and PowerPC systems. On such systems, libjpeg-turbo is generally 2-4x as fast as libjpeg, all else being equal. On other types of systems, libjpeg-turbo can still outperform libjpeg by a significant amount, by virtue of its highly-optimized Huffman coding routines. In many cases, the performance of libjpeg-turbo rivals that of proprietary high-speed JPEG codecs.
+libjpeg-turbo is a JPEG image codec that uses SIMD instructions (MMX, SSE2, NEON, AltiVec) to accelerate baseline JPEG compression and decompression on x86, x86-64, ARM, and PowerPC systems. On such systems, libjpeg-turbo is generally 2-6x as fast as libjpeg, all else being equal. On other types of systems, libjpeg-turbo can still outperform libjpeg by a significant amount, by virtue of its highly-optimized Huffman coding routines. In many cases, the performance of libjpeg-turbo rivals that of proprietary high-speed JPEG codecs.

 libjpeg-turbo implements both the traditional libjpeg API as well as the less powerful but more straightforward TurboJPEG API. libjpeg-turbo also features colorspace extensions that allow it to compress from/decompress to 32-bit and big-endian pixel buffers (RGBX, XBGR, etc.), as well as a full-featured Java interface.
@@ -11,7 +11,7 @@ Description: A SIMD-accelerated JPEG codec that provides both the libjpeg and Tu
 libjpeg-turbo is a JPEG image codec that uses SIMD instructions (MMX, SSE2,
 NEON, AltiVec) to accelerate baseline JPEG compression and decompression on
 x86, x86-64, ARM, and PowerPC systems. On such systems, libjpeg-turbo is
-generally 2-4x as fast as libjpeg, all else being equal. On other types of
+generally 2-6x as fast as libjpeg, all else being equal. On other types of
 systems, libjpeg-turbo can still outperform libjpeg by a significant amount,
 by virtue of its highly-optimized Huffman coding routines. In many cases, the
 performance of libjpeg-turbo rivals that of proprietary high-speed JPEG
@@ -46,7 +46,7 @@ Provides: %{name} = %{version}-%{release}, @PACKAGE_NAME@ = %{version}-%{release
 libjpeg-turbo is a JPEG image codec that uses SIMD instructions (MMX, SSE2,
 NEON, AltiVec) to accelerate baseline JPEG compression and decompression on
 x86, x86-64, ARM, and PowerPC systems. On such systems, libjpeg-turbo is
-generally 2-4x as fast as libjpeg, all else being equal. On other types of
+generally 2-6x as fast as libjpeg, all else being equal. On other types of
 systems, libjpeg-turbo can still outperform libjpeg by a significant amount, by
 virtue of its highly-optimized Huffman coding routines. In many cases, the
 performance of libjpeg-turbo rivals that of proprietary high-speed JPEG codecs.
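The text changed by this commit mixes two ways of stating a speedup: percentages ("by about 75%") and ratios ("2-6x as fast"). The short Python sketch below converts between them. It assumes that "speeds up by P%" means throughput rises by P%, which is an interpretation rather than something the commit states, and the function names are illustrative.

```python
def percent_to_ratio(percent):
    """Convert 'faster by P percent' into an 'N x as fast' throughput ratio."""
    return 1.0 + percent / 100.0


def time_saved_fraction(ratio):
    """Fraction of run time saved when throughput rises by a factor of `ratio`."""
    return 1.0 - 1.0 / ratio


if __name__ == "__main__":
    for percent in (15, 30, 75):
        ratio = percent_to_ratio(percent)
        print(f"{percent}% faster = {ratio:.2f}x, saves {time_saved_fraction(ratio):.0%} of the run time")
```

Under that reading, a 75% speedup corresponds to 1.75x, which is why it is listed separately from the "2-2.5x" figures for the Cortex-A53 and Cortex-A57 cores.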