Adjust performance claims

Document the latest benchmarks on the Nexus 5X and change the "2-4x"
overall claim to "2-6x".  The peak performance on x86 platforms was
already closer to 5x, and the addition of SIMD-accelerated Huffman
encoding gave it that extra push over the cliff.
This commit is contained in:
DRC
2016-02-03 14:02:13 -06:00
parent cf888486d1
commit fe11699d90
5 changed files with 29 additions and 22 deletions

View File

@@ -69,21 +69,28 @@ setting the JSIMD_NOHUFFENC environment variable to 1.
[13] Added SIMD acceleration for Huffman encoding on NEON-capable ARM 32-bit
platforms. This speeds up the compression of full-color JPEGs by about 30% on
average. For the purposes of benchmarking or regression testing,
SIMD-accelerated Huffman encoding can be disabled by setting the
JSIMD_NOHUFFENC environment variable to 1.
average on a Cortex-A9 core (iPhone 4S) and by about 6-7% on average on
Cortex-A53 and Cortex-A57 cores. For the purposes of benchmarking or
regression testing, SIMD-accelerated Huffman encoding can be disabled by
setting the JSIMD_NOHUFFENC environment variable to 1.
[14] Added ARM 64-bit (ARMv8) NEON SIMD implementations of the commonly-used
compression algorithms (including the slow integer forward DCT and h2v2 & h2v1
downsampling algorithms, which are not accelerated in the 32-bit NEON
implementation.) This speeds up the overall 64-bit compression performance by
about 2x on ARMv8 processors.
implementation.) This speeds up the compression of full-color JPEGs by about
75% on average on a Cavium ThunderX processor and by about 2-2.5x on average on
Cortex-A53 and Cortex-A57 cores.
[15] pkg-config (.pc) scripts are now included for both the libjpeg and
TurboJPEG API libraries on Un*x systems. Note that if a project's build system
relies on these scripts, then it will not be possible to build that project
with libjpeg or with a prior version of libjpeg-turbo.
[16] Optimized the ARM 64-bit (ARMv8) NEON SIMD decompression routines to
improve performance on CPUs with in-order pipelines. This speeds up the
decompression of full-color JPEGs by nearly 2x on average on a Cavium ThunderX
processor and by about 15% on average on a Cortex-A53 core.
1.4.2
=====

View File

@@ -4,7 +4,7 @@ Background
libjpeg-turbo is a JPEG image codec that uses SIMD instructions (MMX, SSE2,
NEON, AltiVec) to accelerate baseline JPEG compression and decompression on
x86, x86-64, ARM, and PowerPC systems. On such systems, libjpeg-turbo is
generally 2-4x as fast as libjpeg, all else being equal. On other types of
generally 2-6x as fast as libjpeg, all else being equal. On other types of
systems, libjpeg-turbo can still outperform libjpeg by a significant amount, by
virtue of its highly-optimized Huffman coding routines. In many cases, the
performance of libjpeg-turbo rivals that of proprietary high-speed JPEG codecs.
@@ -36,7 +36,7 @@ Using libjpeg-turbo
libjpeg-turbo includes two APIs that can be used to compress and decompress
JPEG images:
- **TurboJPEG API**
- **TurboJPEG API**
This API provides an easy-to-use interface for compressing and decompressing
JPEG images in memory. It also provides some functionality that would not be
straightforward to achieve using the underlying libjpeg API, such as
@@ -44,7 +44,7 @@ JPEG images:
transforms on an image. The Java interface for libjpeg-turbo is written on
top of the TurboJPEG API.
- **libjpeg API**
- **libjpeg API**
This is the de facto industry-standard API for compressing and decompressing
JPEG images. It is more difficult to use than the TurboJPEG API but also
more powerful. The libjpeg API implementation in libjpeg-turbo is both
@@ -135,17 +135,17 @@ which aren't.
#### Fully supported
- **libjpeg: IDCT scaling extensions in decompressor**
- **libjpeg: IDCT scaling extensions in decompressor**
libjpeg-turbo supports IDCT scaling with scaling factors of 1/8, 1/4, 3/8,
1/2, 5/8, 3/4, 7/8, 9/8, 5/4, 11/8, 3/2, 13/8, 7/4, 15/8, and 2/1 (only 1/4
and 1/2 are SIMD-accelerated.)
- **libjpeg: Arithmetic coding**
- **libjpeg: In-memory source and destination managers**
- **libjpeg: In-memory source and destination managers**
See notes below.
- **cjpeg: Separate quality settings for luminance and chrominance**
- **cjpeg: Separate quality settings for luminance and chrominance**
Note that the libpjeg v7+ API was extended to accommodate this feature only
for convenience purposes. It has always been possible to implement this
feature with libjpeg v6b (see rdswitch.c for an example.)
@@ -174,15 +174,15 @@ http://www.libjpeg-turbo.org/About/SmartScale and draw his/her own conclusions,
but it is the general belief of our project that these features have not
demonstrated sufficient usefulness to justify inclusion in libjpeg-turbo.
- **libjpeg: DCT scaling in compressor**
`cinfo.scale_num` and `cinfo.scale_denom` are silently ignored.
- **libjpeg: DCT scaling in compressor**
`cinfo.scale_num` and `cinfo.scale_denom` are silently ignored.
There is no technical reason why DCT scaling could not be supported when
emulating the libjpeg v7+ API/ABI, but without the SmartScale extension (see
below), only scaling factors of 1/2, 8/15, 4/7, 8/13, 2/3, 8/11, 4/5, and
8/9 would be available, which is of limited usefulness.
- **libjpeg: SmartScale**
`cinfo.block_size` is silently ignored.
- **libjpeg: SmartScale**
`cinfo.block_size` is silently ignored.
SmartScale is an extension to the JPEG format that allows for DCT block
sizes other than 8x8. Providing support for this new format would be
feasible (particularly without full acceleration.) However, until/unless
@@ -194,15 +194,15 @@ demonstrated sufficient usefulness to justify inclusion in libjpeg-turbo.
interest in providing this feature would be as a means of supporting
additional DCT scaling factors.
- **libjpeg: Fancy downsampling in compressor**
`cinfo.do_fancy_downsampling` is silently ignored.
- **libjpeg: Fancy downsampling in compressor**
`cinfo.do_fancy_downsampling` is silently ignored.
This requires the DCT scaling feature, which is not supported.
- **jpegtran: Scaling**
- **jpegtran: Scaling**
This requires both the DCT scaling and SmartScale features, which are not
supported.
- **Lossless RGB JPEG files**
- **Lossless RGB JPEG files**
This requires the SmartScale feature, which is not supported.
### What About libjpeg v9?

View File

@@ -1,4 +1,4 @@
libjpeg-turbo is a JPEG image codec that uses SIMD instructions (MMX, SSE2, NEON, AltiVec) to accelerate baseline JPEG compression and decompression on x86, x86-64, ARM, and PowerPC systems. On such systems, libjpeg-turbo is generally 2-4x as fast as libjpeg, all else being equal. On other types of systems, libjpeg-turbo can still outperform libjpeg by a significant amount, by virtue of its highly-optimized Huffman coding routines. In many cases, the performance of libjpeg-turbo rivals that of proprietary high-speed JPEG codecs.
libjpeg-turbo is a JPEG image codec that uses SIMD instructions (MMX, SSE2, NEON, AltiVec) to accelerate baseline JPEG compression and decompression on x86, x86-64, ARM, and PowerPC systems. On such systems, libjpeg-turbo is generally 2-6x as fast as libjpeg, all else being equal. On other types of systems, libjpeg-turbo can still outperform libjpeg by a significant amount, by virtue of its highly-optimized Huffman coding routines. In many cases, the performance of libjpeg-turbo rivals that of proprietary high-speed JPEG codecs.
libjpeg-turbo implements both the traditional libjpeg API as well as the less powerful but more straightforward TurboJPEG API. libjpeg-turbo also features colorspace extensions that allow it to compress from/decompress to 32-bit and big-endian pixel buffers (RGBX, XBGR, etc.), as well as a full-featured Java interface.

View File

@@ -11,7 +11,7 @@ Description: A SIMD-accelerated JPEG codec that provides both the libjpeg and Tu
libjpeg-turbo is a JPEG image codec that uses SIMD instructions (MMX, SSE2,
NEON, AltiVec) to accelerate baseline JPEG compression and decompression on
x86, x86-64, ARM, and PowerPC systems. On such systems, libjpeg-turbo is
generally 2-4x as fast as libjpeg, all else being equal. On other types of
generally 2-6x as fast as libjpeg, all else being equal. On other types of
systems, libjpeg-turbo can still outperform libjpeg by a significant amount,
by virtue of its highly-optimized Huffman coding routines. In many cases, the
performance of libjpeg-turbo rivals that of proprietary high-speed JPEG

View File

@@ -46,7 +46,7 @@ Provides: %{name} = %{version}-%{release}, @PACKAGE_NAME@ = %{version}-%{release
libjpeg-turbo is a JPEG image codec that uses SIMD instructions (MMX, SSE2,
NEON, AltiVec) to accelerate baseline JPEG compression and decompression on
x86, x86-64, ARM, and PowerPC systems. On such systems, libjpeg-turbo is
generally 2-4x as fast as libjpeg, all else being equal. On other types of
generally 2-6x as fast as libjpeg, all else being equal. On other types of
systems, libjpeg-turbo can still outperform libjpeg by a significant amount, by
virtue of its highly-optimized Huffman coding routines. In many cases, the
performance of libjpeg-turbo rivals that of proprietary high-speed JPEG codecs.