Merge branch 'master' into dev

DRC
2020-11-05 16:04:55 -06:00
28 changed files with 263 additions and 218 deletions

@@ -388,7 +388,7 @@ detect actual security issues, should they arise in the future.
1. Added AVX2 SIMD implementations of the colorspace conversion, chroma
downsampling and upsampling, integer quantization and sample conversion, and
slow integer DCT/IDCT algorithms. When using the slow integer DCT/IDCT
accurate integer DCT/IDCT algorithms. When using the accurate integer DCT/IDCT
algorithms on AVX2-equipped CPUs, the compression of RGB images is
approximately 13-36% (avg. 22%) faster (relative to libjpeg-turbo 1.5.x) with
64-bit code and 11-21% (avg. 17%) faster with 32-bit code, and the
@@ -498,10 +498,10 @@ libraries that link statically with libjpeg-turbo.
13. Added Loongson MMI SIMD implementations of the RGB-to-YCbCr and
YCbCr-to-RGB colorspace conversion, 4:2:0 chroma downsampling, 4:2:0 fancy
chroma upsampling, integer quantization, and slow integer DCT/IDCT algorithms.
When using the slow integer DCT/IDCT, this speeds up the compression of RGB
images by approximately 70-100% and the decompression of RGB images by
approximately 2-3.5x.
chroma upsampling, integer quantization, and accurate integer DCT/IDCT
algorithms. When using the accurate integer DCT/IDCT, this speeds up the
compression of RGB images by approximately 70-100% and the decompression of RGB
images by approximately 2-3.5x.
14. Fixed a build error when building with older MinGW releases (regression
caused by 1.5.1[7].)
@@ -833,8 +833,8 @@ benchmarking or regression testing, SIMD-accelerated Huffman encoding can be
disabled by setting the `JSIMD_NOHUFFENC` environment variable to `1`.
13. Added ARM 64-bit (ARMv8) NEON SIMD implementations of the commonly-used
compression algorithms (including the slow integer forward DCT and h2v2 & h2v1
downsampling algorithms, which are not accelerated in the 32-bit NEON
compression algorithms (including the accurate integer forward DCT and h2v2 &
h2v1 downsampling algorithms, which are not accelerated in the 32-bit NEON
implementation.) This speeds up the compression of full-color JPEGs by about
75% on average on a Cavium ThunderX processor and by about 2-2.5x on average on
Cortex-A53 and Cortex-A57 cores.
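
The `JSIMD_NOHUFFENC` switch mentioned at the top of this hunk can also be set from inside a benchmarking program rather than from the shell. The following is a minimal sketch, assuming a POSIX system (`setenv()` is not part of ISO C) and assuming the variable is set before the library is first used. The helper name `disable_simd_huffman_encoding` is hypothetical and is not part of libjpeg-turbo.

```c
#define _POSIX_C_SOURCE 200112L  /* expose setenv() under strict -std= modes */
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not a libjpeg-turbo API): set JSIMD_NOHUFFENC=1 in
 * the current process so that SIMD-accelerated Huffman encoding is disabled
 * for benchmarking or regression testing.  Assumption: this is called before
 * any compression work starts. */
void disable_simd_huffman_encoding(void)
{
  /* Third argument 1 = overwrite any existing value. */
  int rc = setenv("JSIMD_NOHUFFENC", "1", 1);

  assert(rc == 0);  /* setenv() fails only on ENOMEM or EINVAL */
  (void)rc;         /* keep NDEBUG builds free of unused-variable warnings */
}
```

Setting the variable in the shell before launching the program has the same effect and needs no code change.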
@@ -965,8 +965,8 @@ platforms other than Windows or Linux. Oops.
7. Fixed an extremely rare bug in the Huffman encoder that caused 64-bit
builds of libjpeg-turbo to incorrectly encode a few specific test images when
quality=98, an optimized Huffman table, and the slow integer forward DCT were
used.
quality=98, an optimized Huffman table, and the accurate integer forward DCT
were used.
8. The Windows (CMake) build system now supports building only static or only
shared libraries. This is accomplished by adding either `-DENABLE_STATIC=0` or
@@ -1125,8 +1125,8 @@ floating point inverse DCT (using code borrowed from libjpeg v8a and later.)
The accuracy of this implementation now matches the accuracy of the SSE/SSE2
implementation. Note, however, that the floating point DCT/IDCT algorithms are
mainly a legacy feature. They generally do not produce significantly better
accuracy than the slow integer DCT/IDCT algorithms, and they are quite a bit
slower.
accuracy than the accurate integer DCT/IDCT algorithms, and they are quite a
bit slower.
8. Added a new output colorspace (`JCS_RGB565`) to the libjpeg API that allows
for decompressing JPEG images into RGB565 (16-bit) pixels. If dithering is not
@@ -1536,8 +1536,8 @@ cases.
2. Despite the above, the fast integer forward DCT still degrades somewhat for
JPEG qualities greater than 95, so the TurboJPEG wrapper will now automatically
use the slow integer forward DCT when generating JPEG images of quality 96 or
greater. This reduces compression performance by as much as 15% for these
use the accurate integer forward DCT when generating JPEG images of quality 96
or greater. This reduces compression performance by as much as 15% for these
high-quality images but is necessary to ensure that the images are perceptually
lossless. It also ensures that the library can avoid the performance pitfall
created by [1].
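
The quality threshold described in this entry can be sketched as a small decision function. This is an illustration only, not the TurboJPEG wrapper's code: the enum `dct_choice` and the function `pick_forward_dct` are hypothetical stand-ins, so the block compiles without the libjpeg headers.

```c
#include <assert.h>

/* Stand-in for libjpeg's J_DCT_METHOD (hypothetical, for illustration). */
typedef enum { DCT_ACCURATE, DCT_FAST } dct_choice;

/* Return the forward DCT to use for a given JPEG quality (1-100).
 * prefer_fast is nonzero when the caller would like the fast integer DCT.
 * Following the changelog entry, quality 96 or greater always selects the
 * accurate integer DCT. */
dct_choice pick_forward_dct(int quality, int prefer_fast)
{
  assert(quality >= 1 && quality <= 100);

  if (prefer_fast && quality < 96)
    return DCT_FAST;
  return DCT_ACCURATE;
}
```

With this rule, a request for the fast DCT at quality 95 is honored, while the same request at quality 96 falls back to the accurate DCT.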

@@ -287,12 +287,13 @@ following reasons:
(and slightly faster) floating point IDCT algorithm introduced in libjpeg
v8a as opposed to the algorithm used in libjpeg v6b. It should be noted,
however, that this algorithm basically brings the accuracy of the floating
point IDCT in line with the accuracy of the slow integer IDCT. The floating
point DCT/IDCT algorithms are mainly a legacy feature, and they do not
produce significantly more accuracy than the slow integer algorithms (to put
numbers on this, the typical difference in PSNR between the two algorithms
is less than 0.10 dB, whereas changing the quality level by 1 in the upper
range of the quality scale is typically more like a 1.0 dB difference.)
point IDCT in line with the accuracy of the accurate integer IDCT. The
floating point DCT/IDCT algorithms are mainly a legacy feature, and they do
not produce significantly more accuracy than the accurate integer algorithms
(to put numbers on this, the typical difference in PSNR between the two
algorithms is less than 0.10 dB, whereas changing the quality level by 1 in
the upper range of the quality scale is typically more like a 1.0 dB
difference.)
- If the floating point algorithms in libjpeg-turbo are not implemented using
SIMD instructions on a particular platform, then the accuracy of the
@@ -340,7 +341,7 @@ The algorithm used by the SIMD-accelerated quantization function cannot produce
correct results whenever the fast integer forward DCT is used along with a JPEG
quality of 98-100. Thus, libjpeg-turbo must use the non-SIMD quantization
function in those cases. This causes performance to drop by as much as 40%.
It is therefore strongly advised that you use the slow integer forward DCT
It is therefore strongly advised that you use the accurate integer forward DCT
whenever encoding images with a JPEG quality of 98 or higher.
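
The condition stated in this paragraph can be written as a predicate that an application might evaluate before choosing its settings. This is a sketch under assumptions: `fdct_kind` and `simd_quantization_usable` are hypothetical names, and the check is expressed in terms of the quality level as the text describes it, which is not necessarily how the library detects the case internally.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the forward DCT selection (hypothetical, for illustration). */
typedef enum { FDCT_ACCURATE, FDCT_FAST } fdct_kind;

/* Return true when, per the text above, the SIMD-accelerated quantization
 * function can be used.  The only excluded combination is the fast integer
 * forward DCT together with a JPEG quality of 98-100. */
bool simd_quantization_usable(fdct_kind fdct, int quality)
{
  assert(quality >= 1 && quality <= 100);

  return !(fdct == FDCT_FAST && quality >= 98);
}
```

An application that sees `false` here can switch to the accurate integer forward DCT to avoid the slowdown of as much as 40% that the text describes.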

cjpeg.1

@@ -1,4 +1,4 @@
.TH CJPEG 1 "18 December 2019"
.TH CJPEG 1 "4 November 2020"
.SH NAME
cjpeg \- compress an image file to a JPEG file
.SH SYNOPSIS
@@ -160,31 +160,40 @@ arithmetic coded JPEG is not yet widely implemented, so many decoders will be
unable to view an arithmetic coded JPEG file at all.
.TP
.B \-dct int
Use integer DCT method (default).
Use accurate integer DCT method (default).
.TP
.B \-dct fast
Use fast integer DCT (less accurate).
In libjpeg-turbo, the fast method is generally about 5-15% faster than the int
method when using the x86/x86-64 SIMD extensions (results may vary with other
SIMD implementations, or when using libjpeg-turbo without SIMD extensions.)
Use less accurate integer DCT method [legacy feature].
When the Independent JPEG Group's software was first released in 1991, the
compression time for a 1-megapixel JPEG image on a mainstream PC was measured
in minutes. Thus, the \fBfast\fR integer DCT algorithm provided noticeable
performance benefits. On modern CPUs running libjpeg-turbo, however, the
compression time for a 1-megapixel JPEG image is measured in milliseconds, and
thus the performance benefits of the \fBfast\fR algorithm are much less
noticeable. On modern x86/x86-64 CPUs that support AVX2 instructions, the
\fBfast\fR and \fBint\fR methods have similar performance. On other types of
CPUs, the \fBfast\fR method is generally about 5-15% faster than the \fBint\fR
method.
For quality levels of 90 and below, there should be little or no perceptible
difference between the two algorithms. For quality levels above 90, however,
the difference between the fast and the int methods becomes more pronounced.
With quality=97, for instance, the fast method incurs generally about a 1-3 dB
loss (in PSNR) relative to the int method, but this can be larger for some
images. Do not use the fast method with quality levels above 97. The
algorithm often degenerates at quality=98 and above and can actually produce a
more lossy image than if lower quality levels had been used. Also, in
libjpeg-turbo, the fast method is not fully accelerated for quality levels
above 97, so it will be slower than the int method.
quality difference between the two algorithms. For quality levels above 90,
however, the difference between the \fBfast\fR and \fBint\fR methods becomes
more pronounced. With quality=97, for instance, the \fBfast\fR method incurs
generally about a 1-3 dB loss in PSNR relative to the \fBint\fR method, but
this can be larger for some images. Do not use the \fBfast\fR method with
quality levels above 97. The algorithm often degenerates at quality=98 and
above and can actually produce a more lossy image than if lower quality levels
had been used. Also, in libjpeg-turbo, the \fBfast\fR method is not fully
accelerated for quality levels above 97, so it will be slower than the
\fBint\fR method.
.TP
.B \-dct float
Use floating-point DCT method.
The float method is mainly a legacy feature. It does not produce significantly
more accurate results than the int method, and it is much slower. The float
method may also give different results on different machines due to varying
roundoff behavior, whereas the integer methods should give the same results on
all machines.
Use floating-point DCT method [legacy feature].
The \fBfloat\fR method does not produce significantly more accurate results
than the \fBint\fR method, and it is much slower. The \fBfloat\fR method may
also give different results on different machines due to varying roundoff
behavior, whereas the integer methods should give the same results on all
machines.
.TP
.BI \-icc " file"
Embed ICC color management profile contained in the specified file.
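
The three `-dct` arguments documented in this hunk map onto three DCT methods. A minimal sketch of that mapping follows. It is not taken from cjpeg.c: the enum `sketch_dct_method` and the function `parse_dct_switch` are hypothetical, and the sketch accepts exact names only.

```c
#include <assert.h>
#include <string.h>

/* Stand-in for libjpeg's J_DCT_METHOD, plus an "unknown" value
 * (hypothetical, so that the sketch compiles without jpeglib.h). */
typedef enum {
  SKETCH_DCT_ISLOW,   /* -dct int:   accurate integer method (default) */
  SKETCH_DCT_IFAST,   /* -dct fast:  less accurate integer method [legacy] */
  SKETCH_DCT_FLOAT,   /* -dct float: floating-point method [legacy] */
  SKETCH_DCT_UNKNOWN  /* anything else */
} sketch_dct_method;

/* Map the argument that follows -dct to a method. */
sketch_dct_method parse_dct_switch(const char *arg)
{
  assert(arg != NULL);

  if (strcmp(arg, "int") == 0)
    return SKETCH_DCT_ISLOW;
  if (strcmp(arg, "fast") == 0)
    return SKETCH_DCT_IFAST;
  if (strcmp(arg, "float") == 0)
    return SKETCH_DCT_FLOAT;
  return SKETCH_DCT_UNKNOWN;
}
```

A caller would typically print the usage text and exit when `SKETCH_DCT_UNKNOWN` is returned.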

@@ -5,7 +5,7 @@
* Copyright (C) 1991-1998, Thomas G. Lane.
* Modified 2003-2011 by Guido Vollbeding.
* libjpeg-turbo Modifications:
* Copyright (C) 2010, 2013-2014, 2017, 2019, D. R. Commander.
* Copyright (C) 2010, 2013-2014, 2017, 2019-2020, D. R. Commander.
* For conditions of distribution and use, see the accompanying README.ijg
* file.
*
@@ -176,15 +176,15 @@ usage(void)
fprintf(stderr, " -arithmetic Use arithmetic coding\n");
#endif
#ifdef DCT_ISLOW_SUPPORTED
fprintf(stderr, " -dct int Use integer DCT method%s\n",
fprintf(stderr, " -dct int Use accurate integer DCT method%s\n",
(JDCT_DEFAULT == JDCT_ISLOW ? " (default)" : ""));
#endif
#ifdef DCT_IFAST_SUPPORTED
fprintf(stderr, " -dct fast Use fast integer DCT (less accurate)%s\n",
fprintf(stderr, " -dct fast Use less accurate integer DCT method [legacy feature]%s\n",
(JDCT_DEFAULT == JDCT_IFAST ? " (default)" : ""));
#endif
#ifdef DCT_FLOAT_SUPPORTED
fprintf(stderr, " -dct float Use floating-point DCT method%s\n",
fprintf(stderr, " -dct float Use floating-point DCT method [legacy feature]%s\n",
(JDCT_DEFAULT == JDCT_FLOAT ? " (default)" : ""));
#endif
fprintf(stderr, " -icc FILE Embed ICC profile contained in FILE\n");

djpeg.1

@@ -1,4 +1,4 @@
.TH DJPEG 1 "18 December 2019"
.TH DJPEG 1 "4 November 2020"
.SH NAME
djpeg \- decompress a JPEG file to an image file
.SH SYNOPSIS
@@ -121,32 +121,40 @@ is specified; otherwise, 24-bit full-color format is emitted.
Switches for advanced users:
.TP
.B \-dct int
Use integer DCT method (default).
Use accurate integer DCT method (default).
.TP
.B \-dct fast
Use fast integer DCT (less accurate).
In libjpeg-turbo, the fast method is generally about 5-15% faster than the int
method when using the x86/x86-64 SIMD extensions (results may vary with other
SIMD implementations, or when using libjpeg-turbo without SIMD extensions.) If
the JPEG image was compressed using a quality level of 85 or below, then there
should be little or no perceptible difference between the two algorithms. When
decompressing images that were compressed using quality levels above 85,
however, the difference between the fast and int methods becomes more
pronounced. With images compressed using quality=97, for instance, the fast
method incurs generally about a 4-6 dB loss (in PSNR) relative to the int
method, but this can be larger for some images. If you can avoid it, do not
use the fast method when decompressing images that were compressed using
quality levels above 97. The algorithm often degenerates for such images and
can actually produce a more lossy output image than if the JPEG image had been
compressed using lower quality levels.
Use less accurate integer DCT method [legacy feature].
When the Independent JPEG Group's software was first released in 1991, the
decompression time for a 1-megapixel JPEG image on a mainstream PC was measured
in minutes. Thus, the \fBfast\fR integer DCT algorithm provided noticeable
performance benefits. On modern CPUs running libjpeg-turbo, however, the
decompression time for a 1-megapixel JPEG image is measured in milliseconds,
and thus the performance benefits of the \fBfast\fR algorithm are much less
noticeable. On modern x86/x86-64 CPUs that support AVX2 instructions, the
\fBfast\fR and \fBint\fR methods have similar performance. On other types of
CPUs, the \fBfast\fR method is generally about 5-15% faster than the \fBint\fR
method.
If the JPEG image was compressed using a quality level of 85 or below, then
there should be little or no perceptible quality difference between the two
algorithms. When decompressing images that were compressed using quality
levels above 85, however, the difference between the \fBfast\fR and \fBint\fR
methods becomes more pronounced. With images compressed using quality=97, for
instance, the \fBfast\fR method incurs generally about a 4-6 dB loss in PSNR
relative to the \fBint\fR method, but this can be larger for some images. If
you can avoid it, do not use the \fBfast\fR method when decompressing images
that were compressed using quality levels above 97. The algorithm often
degenerates for such images and can actually produce a more lossy output image
than if the JPEG image had been compressed using lower quality levels.
.TP
.B \-dct float
Use floating-point DCT method.
The float method is mainly a legacy feature. It does not produce significantly
more accurate results than the int method, and it is much slower. The float
method may also give different results on different machines due to varying
roundoff behavior, whereas the integer methods should give the same results on
all machines.
Use floating-point DCT method [legacy feature].
The \fBfloat\fR method does not produce significantly more accurate results
than the \fBint\fR method, and it is much slower. The \fBfloat\fR method may
also give different results on different machines due to varying roundoff
behavior, whereas the integer methods should give the same results on all
machines.
.TP
.B \-dither fs
Use Floyd-Steinberg dithering in color quantization.
@@ -282,12 +290,6 @@ is fast but much lower quality than the default behavior.
.B \-dither none
may give acceptable results in two-pass mode, but is seldom tolerable in
one-pass mode.
.PP
If you are fortunate enough to have very fast floating point hardware,
\fB\-dct float\fR may be even faster than \fB\-dct fast\fR. But on most
machines \fB\-dct float\fR is slower than \fB\-dct int\fR; in this case it is
not worth using, because its theoretical accuracy advantage is too small to be
significant in practice.
.SH ENVIRONMENT
.TP
.B JPEGMEM

@@ -149,15 +149,15 @@ usage(void)
#endif
fprintf(stderr, "Switches for advanced users:\n");
#ifdef DCT_ISLOW_SUPPORTED
fprintf(stderr, " -dct int Use integer DCT method%s\n",
fprintf(stderr, " -dct int Use accurate integer DCT method%s\n",
(JDCT_DEFAULT == JDCT_ISLOW ? " (default)" : ""));
#endif
#ifdef DCT_IFAST_SUPPORTED
fprintf(stderr, " -dct fast Use fast integer DCT (less accurate)%s\n",
fprintf(stderr, " -dct fast Use less accurate integer DCT method [legacy feature]%s\n",
(JDCT_DEFAULT == JDCT_IFAST ? " (default)" : ""));
#endif
#ifdef DCT_FLOAT_SUPPORTED
fprintf(stderr, " -dct float Use floating-point DCT method%s\n",
fprintf(stderr, " -dct float Use floating-point DCT method [legacy feature]%s\n",
(JDCT_DEFAULT == JDCT_FLOAT ? " (default)" : ""));
#endif
fprintf(stderr, " -dither fs Use F-S dithering (default)\n");

@@ -4,11 +4,11 @@
* This file was part of the Independent JPEG Group's software:
* Copyright (C) 1991-1996, Thomas G. Lane.
* libjpeg-turbo Modifications:
* Copyright (C) 2015, D. R. Commander.
* Copyright (C) 2015, 2020, D. R. Commander.
* For conditions of distribution and use, see the accompanying README.ijg
* file.
*
* This file contains a slow-but-accurate integer implementation of the
* This file contains a slower but more accurate integer implementation of the
* forward DCT (Discrete Cosine Transform).
*
* A 2-D DCT can be done by 1-D DCT on each row followed by 1-D DCT

@@ -5,11 +5,11 @@
* Copyright (C) 1991-1998, Thomas G. Lane.
* Modification developed 2002-2018 by Guido Vollbeding.
* libjpeg-turbo Modifications:
* Copyright (C) 2015, D. R. Commander.
* Copyright (C) 2015, 2020, D. R. Commander.
* For conditions of distribution and use, see the accompanying README.ijg
* file.
*
* This file contains a slow-but-accurate integer implementation of the
* This file contains a slower but more accurate integer implementation of the
* inverse DCT (Discrete Cosine Transform). In the IJG code, this routine
* must also perform dequantization of the input coefficients.
*

@@ -5,7 +5,7 @@
* Copyright (C) 1991-1997, Thomas G. Lane.
* Modified 1997-2009 by Guido Vollbeding.
* libjpeg-turbo Modifications:
* Copyright (C) 2009, 2011, 2014-2015, 2018, D. R. Commander.
* Copyright (C) 2009, 2011, 2014-2015, 2018, 2020, D. R. Commander.
* For conditions of distribution and use, see the accompanying README.ijg
* file.
*
@@ -238,9 +238,9 @@ typedef int boolean;
/* Capability options common to encoder and decoder: */
#define DCT_ISLOW_SUPPORTED /* slow but accurate integer algorithm */
#define DCT_IFAST_SUPPORTED /* faster, less accurate integer method */
#define DCT_FLOAT_SUPPORTED /* floating-point: accurate, fast on fast HW */
#define DCT_ISLOW_SUPPORTED /* accurate integer method */
#define DCT_IFAST_SUPPORTED /* less accurate int method [legacy feature] */
#define DCT_FLOAT_SUPPORTED /* floating-point method [legacy feature] */
/* Encoder capability options: */

@@ -5,7 +5,7 @@
* Copyright (C) 1991-1998, Thomas G. Lane.
* Modified 2002-2009 by Guido Vollbeding.
* libjpeg-turbo Modifications:
* Copyright (C) 2009-2011, 2013-2014, 2016-2017, D. R. Commander.
* Copyright (C) 2009-2011, 2013-2014, 2016-2017, 2020, D. R. Commander.
* Copyright (C) 2015, Google, Inc.
* For conditions of distribution and use, see the accompanying README.ijg
* file.
@@ -244,9 +244,9 @@ typedef enum {
/* DCT/IDCT algorithm options. */
typedef enum {
JDCT_ISLOW, /* slow but accurate integer algorithm */
JDCT_IFAST, /* faster, less accurate integer method */
JDCT_FLOAT /* floating-point: accurate, fast on fast HW */
JDCT_ISLOW, /* accurate integer method */
JDCT_IFAST, /* less accurate integer method [legacy feature] */
JDCT_FLOAT /* floating-point method [legacy feature] */
} J_DCT_METHOD;
#ifndef JDCT_DEFAULT /* may be overridden in jconfig.h */

@@ -969,30 +969,38 @@ boolean arith_code
J_DCT_METHOD dct_method
Selects the algorithm used for the DCT step. Choices are:
JDCT_ISLOW: slow but accurate integer algorithm
JDCT_IFAST: faster, less accurate integer method
JDCT_FLOAT: floating-point method
JDCT_ISLOW: accurate integer method
JDCT_IFAST: less accurate integer method [legacy feature]
JDCT_FLOAT: floating-point method [legacy feature]
JDCT_DEFAULT: default method (normally JDCT_ISLOW)
JDCT_FASTEST: fastest method (normally JDCT_IFAST)
In libjpeg-turbo, JDCT_IFAST is generally about 5-15% faster than
JDCT_ISLOW when using the x86/x86-64 SIMD extensions (results may vary
with other SIMD implementations, or when using libjpeg-turbo without
SIMD extensions.) For quality levels of 90 and below, there should be
little or no perceptible difference between the two algorithms. For
quality levels above 90, however, the difference between JDCT_IFAST and
When the Independent JPEG Group's software was first released in 1991,
the compression time for a 1-megapixel JPEG image on a mainstream PC
was measured in minutes. Thus, JDCT_IFAST provided noticeable
performance benefits. On modern CPUs running libjpeg-turbo, however,
the compression time for a 1-megapixel JPEG image is measured in
milliseconds, and thus the performance benefits of JDCT_IFAST are much
less noticeable. On modern x86/x86-64 CPUs that support AVX2
instructions, JDCT_IFAST and JDCT_ISLOW have similar performance. On
other types of CPUs, JDCT_IFAST is generally about 5-15% faster than
JDCT_ISLOW.
For quality levels of 90 and below, there should be little or no
perceptible quality difference between the two algorithms. For quality
levels above 90, however, the difference between JDCT_IFAST and
JDCT_ISLOW becomes more pronounced. With quality=97, for instance,
JDCT_IFAST incurs generally about a 1-3 dB loss (in PSNR) relative to
JDCT_IFAST incurs generally about a 1-3 dB loss in PSNR relative to
JDCT_ISLOW, but this can be larger for some images. Do not use
JDCT_IFAST with quality levels above 97. The algorithm often
degenerates at quality=98 and above and can actually produce a more
lossy image than if lower quality levels had been used. Also, in
libjpeg-turbo, JDCT_IFAST is not fully accelerated for quality levels
above 97, so it will be slower than JDCT_ISLOW. JDCT_FLOAT is mainly a
legacy feature. It does not produce significantly more accurate
results than the ISLOW method, and it is much slower. The FLOAT method
may also give different results on different machines due to varying
roundoff behavior, whereas the integer methods should give the same
results on all machines.
above 97, so it will be slower than JDCT_ISLOW.
JDCT_FLOAT does not produce significantly more accurate results than
JDCT_ISLOW, and it is much slower. JDCT_FLOAT may also give different
results on different machines due to varying roundoff behavior, whereas
the integer methods should give the same results on all machines.
J_COLOR_SPACE jpeg_color_space
int num_components
@@ -1270,31 +1278,39 @@ Additional decompression parameters that the application may set include:
J_DCT_METHOD dct_method
Selects the algorithm used for the DCT step. Choices are:
JDCT_ISLOW: slow but accurate integer algorithm
JDCT_IFAST: faster, less accurate integer method
JDCT_FLOAT: floating-point method
JDCT_ISLOW: accurate integer method
JDCT_IFAST: less accurate integer method [legacy feature]
JDCT_FLOAT: floating-point method [legacy feature]
JDCT_DEFAULT: default method (normally JDCT_ISLOW)
JDCT_FASTEST: fastest method (normally JDCT_IFAST)
In libjpeg-turbo, JDCT_IFAST is generally about 5-15% faster than
JDCT_ISLOW when using the x86/x86-64 SIMD extensions (results may vary
with other SIMD implementations, or when using libjpeg-turbo without
SIMD extensions.) If the JPEG image was compressed using a quality
level of 85 or below, then there should be little or no perceptible
difference between the two algorithms. When decompressing images that
were compressed using quality levels above 85, however, the difference
When the Independent JPEG Group's software was first released in 1991,
the decompression time for a 1-megapixel JPEG image on a mainstream PC
was measured in minutes. Thus, JDCT_IFAST provided noticeable
performance benefits. On modern CPUs running libjpeg-turbo, however,
the decompression time for a 1-megapixel JPEG image is measured in
milliseconds, and thus the performance benefits of JDCT_IFAST are much
less noticeable. On modern x86/x86-64 CPUs that support AVX2
instructions, JDCT_IFAST and JDCT_ISLOW have similar performance. On
other types of CPUs, JDCT_IFAST is generally about 5-15% faster than
JDCT_ISLOW.
If the JPEG image was compressed using a quality level of 85 or below,
then there should be little or no perceptible quality difference
between the two algorithms. When decompressing images that were
compressed using quality levels above 85, however, the difference
between JDCT_IFAST and JDCT_ISLOW becomes more pronounced. With images
compressed using quality=97, for instance, JDCT_IFAST incurs generally
about a 4-6 dB loss (in PSNR) relative to JDCT_ISLOW, but this can be
about a 4-6 dB loss in PSNR relative to JDCT_ISLOW, but this can be
larger for some images. If you can avoid it, do not use JDCT_IFAST
when decompressing images that were compressed using quality levels
above 97. The algorithm often degenerates for such images and can
actually produce a more lossy output image than if the JPEG image had
been compressed using lower quality levels. JDCT_FLOAT is mainly a
legacy feature. It does not produce significantly more accurate
results than the ISLOW method, and it is much slower. The FLOAT method
may also give different results on different machines due to varying
roundoff behavior, whereas the integer methods should give the same
results on all machines.
been compressed using lower quality levels.
JDCT_FLOAT does not produce significantly more accurate results than
JDCT_ISLOW, and it is much slower. JDCT_FLOAT may also give different
results on different machines due to varying roundoff behavior, whereas
the integer methods should give the same results on all machines.
boolean do_fancy_upsampling
If TRUE, do careful upsampling of chroma components. If FALSE,

@@ -6,7 +6,7 @@
* Author: Siarhei Siamashka <siarhei.siamashka@nokia.com>
* Copyright (C) 2013-2014, Linaro Limited. All Rights Reserved.
* Author: Ragesh Radhakrishnan <ragesh.r@linaro.org>
* Copyright (C) 2014-2016, D. R. Commander. All Rights Reserved.
* Copyright (C) 2014-2016, 2020, D. R. Commander. All Rights Reserved.
* Copyright (C) 2015-2016, 2018, Matthieu Darbois. All Rights Reserved.
* Copyright (C) 2016, Siarhei Siamashka. All Rights Reserved.
*
@@ -2373,7 +2373,7 @@ asm_function jsimd_convsamp_neon
/*
* jsimd_fdct_islow_neon
*
* This file contains a slow-but-accurate integer implementation of the
* This file contains a slower but more accurate integer implementation of the
* forward DCT (Discrete Cosine Transform). The following code is based
* directly on the IJG''s original jfdctint.c; see the jfdctint.c for
* more details.

@@ -2,7 +2,7 @@
; jfdctint.asm - accurate integer FDCT (AVX2)
;
; Copyright 2009 Pierre Ossman <ossman@cendio.se> for Cendio AB
; Copyright (C) 2009, 2016, 2018, D. R. Commander.
; Copyright (C) 2009, 2016, 2018, 2020, D. R. Commander.
;
; Based on the x86 SIMD extension for IJG JPEG library
; Copyright (C) 1999-2006, MIYASAKA Masaru.
@@ -14,7 +14,7 @@
; NASM is available from http://nasm.sourceforge.net/ or
; http://sourceforge.net/project/showfiles.php?group_id=6208
;
; This file contains a slow-but-accurate integer implementation of the
; This file contains a slower but more accurate integer implementation of the
; forward DCT (Discrete Cosine Transform). The following code is based
; directly on the IJG's original jfdctint.c; see the jfdctint.c for
; more details.
@@ -103,7 +103,7 @@ F_3_072 equ DESCALE(3299298341, 30 - CONST_BITS) ; FIX(3.072711026)
%endmacro
; --------------------------------------------------------------------------
; In-place 8x8x16-bit slow integer forward DCT using AVX2 instructions
; In-place 8x8x16-bit accurate integer forward DCT using AVX2 instructions
; %1-%4: Input/output registers
; %5-%8: Temp registers
; %9: Pass (1 or 2)

@@ -2,7 +2,7 @@
; jfdctint.asm - accurate integer FDCT (MMX)
;
; Copyright 2009 Pierre Ossman <ossman@cendio.se> for Cendio AB
; Copyright (C) 2016, D. R. Commander.
; Copyright (C) 2016, 2020, D. R. Commander.
;
; Based on the x86 SIMD extension for IJG JPEG library
; Copyright (C) 1999-2006, MIYASAKA Masaru.
@@ -14,7 +14,7 @@
; NASM is available from http://nasm.sourceforge.net/ or
; http://sourceforge.net/project/showfiles.php?group_id=6208
;
; This file contains a slow-but-accurate integer implementation of the
; This file contains a slower but more accurate integer implementation of the
; forward DCT (Discrete Cosine Transform). The following code is based
; directly on the IJG's original jfdctint.c; see the jfdctint.c for
; more details.

@@ -2,7 +2,7 @@
; jfdctint.asm - accurate integer FDCT (SSE2)
;
; Copyright 2009 Pierre Ossman <ossman@cendio.se> for Cendio AB
; Copyright (C) 2016, D. R. Commander.
; Copyright (C) 2016, 2020, D. R. Commander.
;
; Based on the x86 SIMD extension for IJG JPEG library
; Copyright (C) 1999-2006, MIYASAKA Masaru.
@@ -14,7 +14,7 @@
; NASM is available from http://nasm.sourceforge.net/ or
; http://sourceforge.net/project/showfiles.php?group_id=6208
;
; This file contains a slow-but-accurate integer implementation of the
; This file contains a slower but more accurate integer implementation of the
; forward DCT (Discrete Cosine Transform). The following code is based
; directly on the IJG's original jfdctint.c; see the jfdctint.c for
; more details.

@@ -2,7 +2,7 @@
; jidctint.asm - accurate integer IDCT (AVX2)
;
; Copyright 2009 Pierre Ossman <ossman@cendio.se> for Cendio AB
; Copyright (C) 2009, 2016, 2018, D. R. Commander.
; Copyright (C) 2009, 2016, 2018, 2020, D. R. Commander.
;
; Based on the x86 SIMD extension for IJG JPEG library
; Copyright (C) 1999-2006, MIYASAKA Masaru.
@@ -14,7 +14,7 @@
; NASM is available from http://nasm.sourceforge.net/ or
; http://sourceforge.net/project/showfiles.php?group_id=6208
;
; This file contains a slow-but-accurate integer implementation of the
; This file contains a slower but more accurate integer implementation of the
; inverse DCT (Discrete Cosine Transform). The following code is based
; directly on the IJG's original jidctint.c; see the jidctint.c for
; more details.
@@ -113,7 +113,7 @@ F_3_072 equ DESCALE(3299298341, 30 - CONST_BITS) ; FIX(3.072711026)
%endmacro
; --------------------------------------------------------------------------
; In-place 8x8x16-bit slow integer inverse DCT using AVX2 instructions
; In-place 8x8x16-bit accurate integer inverse DCT using AVX2 instructions
; %1-%4: Input/output registers
; %5-%12: Temp registers
; %9: Pass (1 or 2)

@@ -2,7 +2,7 @@
; jidctint.asm - accurate integer IDCT (MMX)
;
; Copyright 2009 Pierre Ossman <ossman@cendio.se> for Cendio AB
; Copyright (C) 2016, D. R. Commander.
; Copyright (C) 2016, 2020, D. R. Commander.
;
; Based on the x86 SIMD extension for IJG JPEG library
; Copyright (C) 1999-2006, MIYASAKA Masaru.
@@ -14,7 +14,7 @@
; NASM is available from http://nasm.sourceforge.net/ or
; http://sourceforge.net/project/showfiles.php?group_id=6208
;
; This file contains a slow-but-accurate integer implementation of the
; This file contains a slower but more accurate integer implementation of the
; inverse DCT (Discrete Cosine Transform). The following code is based
; directly on the IJG's original jidctint.c; see the jidctint.c for
; more details.

@@ -2,7 +2,7 @@
; jidctint.asm - accurate integer IDCT (SSE2)
;
; Copyright 2009 Pierre Ossman <ossman@cendio.se> for Cendio AB
; Copyright (C) 2016, D. R. Commander.
; Copyright (C) 2016, 2020, D. R. Commander.
;
; Based on the x86 SIMD extension for IJG JPEG library
; Copyright (C) 1999-2006, MIYASAKA Masaru.
@@ -14,7 +14,7 @@
; NASM is available from http://nasm.sourceforge.net/ or
; http://sourceforge.net/project/showfiles.php?group_id=6208
;
; This file contains a slow-but-accurate integer implementation of the
; This file contains a slower but more accurate integer implementation of the
; inverse DCT (Discrete Cosine Transform). The following code is based
; directly on the IJG's original jidctint.c; see the jidctint.c for
; more details.

@@ -2,7 +2,7 @@
* simd/jsimd.h
*
* Copyright 2009 Pierre Ossman <ossman@cendio.se> for Cendio AB
* Copyright (C) 2011, 2014-2016, 2018, D. R. Commander.
* Copyright (C) 2011, 2014-2016, 2018, 2020, D. R. Commander.
* Copyright (C) 2013-2014, MIPS Technologies, Inc., California.
* Copyright (C) 2014, Linaro Limited.
* Copyright (C) 2015-2016, 2018, Matthieu Darbois.
@@ -951,7 +951,7 @@ EXTERN(void) jsimd_convsamp_float_sse2
EXTERN(void) jsimd_convsamp_float_dspr2
(JSAMPARRAY sample_data, JDIMENSION start_col, FAST_FLOAT *workspace);
/* Slow Integer Forward DCT */
/* Accurate Integer Forward DCT */
EXTERN(void) jsimd_fdct_islow_mmx(DCTELEM *data);
extern const int jconst_fdct_islow_sse2[];
@@ -1060,7 +1060,7 @@ EXTERN(void) jsimd_idct_12x12_pass1_dspr2
EXTERN(void) jsimd_idct_12x12_pass2_dspr2
(int *workspace, int *output);
/* Slow Integer Inverse DCT */
/* Accurate Integer Inverse DCT */
EXTERN(void) jsimd_idct_islow_mmx
(void *dct_table, JCOEFPTR coef_block, JSAMPARRAY output_buf,
JDIMENSION output_col);


@@ -1,7 +1,7 @@
/*
* Loongson MMI optimizations for libjpeg-turbo
*
* Copyright (C) 2014, 2018, D. R. Commander. All Rights Reserved.
* Copyright (C) 2014, 2018, 2020, D. R. Commander. All Rights Reserved.
* Copyright (C) 2016-2017, Loongson Technology Corporation Limited, BeiJing.
* All Rights Reserved.
* Authors: ZhuChen <zhuchen@loongson.cn>
@@ -28,7 +28,7 @@
* 3. This notice may not be removed or altered from any source distribution.
*/
/* SLOW INTEGER FORWARD DCT */
/* ACCURATE INTEGER FORWARD DCT */
#include "jsimd_mmi.h"


@@ -1,7 +1,7 @@
/*
* Loongson MMI optimizations for libjpeg-turbo
*
* Copyright (C) 2014-2015, 2018, D. R. Commander. All Rights Reserved.
* Copyright (C) 2014-2015, 2018, 2020, D. R. Commander. All Rights Reserved.
* Copyright (C) 2016-2017, Loongson Technology Corporation Limited, BeiJing.
* All Rights Reserved.
* Authors: ZhuChen <zhuchen@loongson.cn>
@@ -28,7 +28,7 @@
* 3. This notice may not be removed or altered from any source distribution.
*/
/* SLOW INTEGER INVERSE DCT */
/* ACCURATE INTEGER INVERSE DCT */
#include "jsimd_mmi.h"


@@ -1,7 +1,7 @@
/*
* AltiVec optimizations for libjpeg-turbo
*
* Copyright (C) 2014, D. R. Commander. All Rights Reserved.
* Copyright (C) 2014, 2020, D. R. Commander. All Rights Reserved.
*
* This software is provided 'as-is', without any express or implied
* warranty. In no event will the authors be held liable for any damages
@@ -20,7 +20,7 @@
* 3. This notice may not be removed or altered from any source distribution.
*/
/* SLOW INTEGER FORWARD DCT */
/* ACCURATE INTEGER FORWARD DCT */
#include "jsimd_altivec.h"


@@ -1,7 +1,7 @@
/*
* AltiVec optimizations for libjpeg-turbo
*
* Copyright (C) 2014-2015, D. R. Commander. All Rights Reserved.
* Copyright (C) 2014-2015, 2020, D. R. Commander. All Rights Reserved.
*
* This software is provided 'as-is', without any express or implied
* warranty. In no event will the authors be held liable for any damages
@@ -20,7 +20,7 @@
* 3. This notice may not be removed or altered from any source distribution.
*/
/* SLOW INTEGER INVERSE DCT */
/* ACCURATE INTEGER INVERSE DCT */
#include "jsimd_altivec.h"


@@ -2,7 +2,7 @@
; jfdctint.asm - accurate integer FDCT (64-bit AVX2)
;
; Copyright 2009 Pierre Ossman <ossman@cendio.se> for Cendio AB
; Copyright (C) 2009, 2016, 2018, D. R. Commander.
; Copyright (C) 2009, 2016, 2018, 2020, D. R. Commander.
;
; Based on the x86 SIMD extension for IJG JPEG library
; Copyright (C) 1999-2006, MIYASAKA Masaru.
@@ -14,7 +14,7 @@
; NASM is available from http://nasm.sourceforge.net/ or
; http://sourceforge.net/project/showfiles.php?group_id=6208
;
; This file contains a slow-but-accurate integer implementation of the
; This file contains a slower but more accurate integer implementation of the
; forward DCT (Discrete Cosine Transform). The following code is based
; directly on the IJG's original jfdctint.c; see the jfdctint.c for
; more details.
@@ -103,7 +103,7 @@ F_3_072 equ DESCALE(3299298341, 30 - CONST_BITS) ; FIX(3.072711026)
%endmacro
; --------------------------------------------------------------------------
; In-place 8x8x16-bit slow integer forward DCT using AVX2 instructions
; In-place 8x8x16-bit accurate integer forward DCT using AVX2 instructions
; %1-%4: Input/output registers
; %5-%8: Temp registers
; %9: Pass (1 or 2)


@@ -2,7 +2,7 @@
; jfdctint.asm - accurate integer FDCT (64-bit SSE2)
;
; Copyright 2009 Pierre Ossman <ossman@cendio.se> for Cendio AB
; Copyright (C) 2009, 2016, D. R. Commander.
; Copyright (C) 2009, 2016, 2020, D. R. Commander.
;
; Based on the x86 SIMD extension for IJG JPEG library
; Copyright (C) 1999-2006, MIYASAKA Masaru.
@@ -14,7 +14,7 @@
; NASM is available from http://nasm.sourceforge.net/ or
; http://sourceforge.net/project/showfiles.php?group_id=6208
;
; This file contains a slow-but-accurate integer implementation of the
; This file contains a slower but more accurate integer implementation of the
; forward DCT (Discrete Cosine Transform). The following code is based
; directly on the IJG's original jfdctint.c; see the jfdctint.c for
; more details.


@@ -2,7 +2,7 @@
; jidctint.asm - accurate integer IDCT (64-bit AVX2)
;
; Copyright 2009 Pierre Ossman <ossman@cendio.se> for Cendio AB
; Copyright (C) 2009, 2016, 2018, D. R. Commander.
; Copyright (C) 2009, 2016, 2018, 2020, D. R. Commander.
; Copyright (C) 2018, Matthias Räncker.
;
; Based on the x86 SIMD extension for IJG JPEG library
@@ -15,7 +15,7 @@
; NASM is available from http://nasm.sourceforge.net/ or
; http://sourceforge.net/project/showfiles.php?group_id=6208
;
; This file contains a slow-but-accurate integer implementation of the
; This file contains a slower but more accurate integer implementation of the
; inverse DCT (Discrete Cosine Transform). The following code is based
; directly on the IJG's original jidctint.c; see the jidctint.c for
; more details.
@@ -114,7 +114,7 @@ F_3_072 equ DESCALE(3299298341, 30 - CONST_BITS) ; FIX(3.072711026)
%endmacro
; --------------------------------------------------------------------------
; In-place 8x8x16-bit slow integer inverse DCT using AVX2 instructions
; In-place 8x8x16-bit accurate integer inverse DCT using AVX2 instructions
; %1-%4: Input/output registers
; %5-%12: Temp registers
; %9: Pass (1 or 2)


@@ -2,7 +2,7 @@
; jidctint.asm - accurate integer IDCT (64-bit SSE2)
;
; Copyright 2009 Pierre Ossman <ossman@cendio.se> for Cendio AB
; Copyright (C) 2009, 2016, D. R. Commander.
; Copyright (C) 2009, 2016, 2020, D. R. Commander.
; Copyright (C) 2018, Matthias Räncker.
;
; Based on the x86 SIMD extension for IJG JPEG library
@@ -15,7 +15,7 @@
; NASM is available from http://nasm.sourceforge.net/ or
; http://sourceforge.net/project/showfiles.php?group_id=6208
;
; This file contains a slow-but-accurate integer implementation of the
; This file contains a slower but more accurate integer implementation of the
; inverse DCT (Discrete Cosine Transform). The following code is based
; directly on the IJG's original jidctint.c; see the jidctint.c for
; more details.

usage.txt

@@ -168,35 +168,43 @@ Switches for advanced users:
be unable to view an arithmetic coded JPEG file at
all.
-dct int Use integer DCT method (default).
-dct fast Use fast integer DCT (less accurate).
In libjpeg-turbo, the fast method is generally about
5-15% faster than the int method when using the
x86/x86-64 SIMD extensions (results may vary with other
SIMD implementations, or when using libjpeg-turbo
without SIMD extensions.) For quality levels of 90 and
below, there should be little or no perceptible
difference between the two algorithms. For quality
levels above 90, however, the difference between
the fast and the int methods becomes more pronounced.
With quality=97, for instance, the fast method incurs
generally about a 1-3 dB loss (in PSNR) relative to
the int method, but this can be larger for some images.
Do not use the fast method with quality levels above
97. The algorithm often degenerates at quality=98 and
above and can actually produce a more lossy image than
if lower quality levels had been used. Also, in
libjpeg-turbo, the fast method is not fully accerated
for quality levels above 97, so it will be slower than
the int method.
-dct float Use floating-point DCT method.
The float method is mainly a legacy feature. It does
not produce significantly more accurate results than
the int method, and it is much slower. The float
method may also give different results on different
machines due to varying roundoff behavior, whereas the
integer methods should give the same results on all
machines.
-dct int Use accurate integer DCT method (default).
-dct fast Use less accurate integer DCT method [legacy feature].
When the Independent JPEG Group's software was first
released in 1991, the compression time for a
1-megapixel JPEG image on a mainstream PC was measured
in minutes. Thus, the fast integer DCT algorithm
provided noticeable performance benefits. On modern
CPUs running libjpeg-turbo, however, the compression
time for a 1-megapixel JPEG image is measured in
milliseconds, and thus the performance benefits of the
fast algorithm are much less noticeable. On modern
x86/x86-64 CPUs that support AVX2 instructions, the
fast and int methods have similar performance. On
other types of CPUs, the fast method is generally about
5-15% faster than the int method.
For quality levels of 90 and below, there should be
little or no perceptible quality difference between the
two algorithms. For quality levels above 90, however,
the difference between the fast and int methods becomes
more pronounced. With quality=97, for instance, the
fast method incurs generally about a 1-3 dB loss in
PSNR relative to the int method, but this can be larger
for some images. Do not use the fast method with
quality levels above 97. The algorithm often
degenerates at quality=98 and above and can actually
produce a more lossy image than if lower quality levels
had been used. Also, in libjpeg-turbo, the fast method
is not fully accelerated for quality levels above 97,
so it will be slower than the int method.
-dct float Use floating-point DCT method [legacy feature].
The float method does not produce significantly more
accurate results than the int method, and it is much
slower. The float method may also give different
results on different machines due to varying roundoff
behavior, whereas the integer methods should give the
same results on all machines.
-restart N Emit a JPEG restart marker every N MCU rows, or every
N MCU blocks if "B" is attached to the number.
@@ -318,36 +326,45 @@ The basic command line switches for djpeg are:
Switches for advanced users:
-dct int Use integer DCT method (default).
-dct fast Use fast integer DCT (less accurate).
In libjpeg-turbo, the fast method is generally about
5-15% faster than the int method when using the
x86/x86-64 SIMD extensions (results may vary with other
SIMD implementations, or when using libjpeg-turbo
without SIMD extensions.) If the JPEG image was
compressed using a quality level of 85 or below, then
there should be little or no perceptible difference
between the two algorithms. When decompressing images
that were compressed using quality levels above 85,
however, the difference between the fast and int
methods becomes more pronounced. With images
compressed using quality=97, for instance, the fast
method incurs generally about a 4-6 dB loss (in PSNR)
relative to the int method, but this can be larger for
some images. If you can avoid it, do not use the fast
method when decompressing images that were compressed
using quality levels above 97. The algorithm often
degenerates for such images and can actually produce
a more lossy output image than if the JPEG image had
been compressed using lower quality levels.
-dct float Use floating-point DCT method.
The float method is mainly a legacy feature. It does
not produce significantly more accurate results than
the int method, and it is much slower. The float
method may also give different results on different
machines due to varying roundoff behavior, whereas the
integer methods should give the same results on all
machines.
-dct int Use accurate integer DCT method (default).
-dct fast Use less accurate integer DCT method [legacy feature].
When the Independent JPEG Group's software was first
released in 1991, the decompression time for a
1-megapixel JPEG image on a mainstream PC was measured
in minutes. Thus, the fast integer DCT algorithm
provided noticeable performance benefits. On modern
CPUs running libjpeg-turbo, however, the decompression
time for a 1-megapixel JPEG image is measured in
milliseconds, and thus the performance benefits of the
fast algorithm are much less noticeable. On modern
x86/x86-64 CPUs that support AVX2 instructions, the
fast and int methods have similar performance. On
other types of CPUs, the fast method is generally about
5-15% faster than the int method.
If the JPEG image was compressed using a quality level
of 85 or below, then there should be little or no
perceptible quality difference between the two
algorithms. When decompressing images that were
compressed using quality levels above 85, however, the
difference between the fast and int methods becomes
more pronounced. With images compressed using
quality=97, for instance, the fast method incurs
generally about a 4-6 dB loss in PSNR relative to the
int method, but this can be larger for some images. If
you can avoid it, do not use the fast method when
decompressing images that were compressed using quality
levels above 97. The algorithm often degenerates for
such images and can actually produce a more lossy
output image than if the JPEG image had been compressed
using lower quality levels.
-dct float Use floating-point DCT method [legacy feature].
The float method does not produce significantly more
accurate results than the int method, and it is much
slower. The float method may also give different
results on different machines due to varying roundoff
behavior, whereas the integer methods should give the
same results on all machines.
-dither fs Use Floyd-Steinberg dithering in color quantization.
-dither ordered Use ordered dithering in color quantization.
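
The quality thresholds in the two usage.txt hunks above (90 for cjpeg, 85 for djpeg) can be condensed into a small decision rule. The following is only a sketch of one conservative policy, assuming the caller knows the quality level used for compression; every name in it (pick_dct_switch(), dct_switch_arg(), and the two macros) is hypothetical and does not exist in libjpeg-turbo.

```c
/* Highest quality levels at which usage.txt describes the fast method as
   showing little or no perceptible quality difference from the int method.
   These macros are illustrative, not libjpeg-turbo constants. */
#define CJPEG_FAST_MAX_QUALITY  90  /* compression */
#define DJPEG_FAST_MAX_QUALITY  85  /* decompression */

typedef enum { DCT_SWITCH_INT, DCT_SWITCH_FAST } dct_switch;

/* Choose the fast method only if the caller asks for speed and the quality
   level does not exceed the given threshold; otherwise use the default
   (accurate) int method. */
dct_switch pick_dct_switch(int quality, int fast_max_quality,
                           int prefer_speed)
{
  if (prefer_speed && quality <= fast_max_quality)
    return DCT_SWITCH_FAST;
  return DCT_SWITCH_INT;
}

/* Map the choice to the argument that follows -dct on the command line. */
const char *dct_switch_arg(dct_switch s)
{
  return s == DCT_SWITCH_FAST ? "fast" : "int";
}
```

For example, pick_dct_switch(95, CJPEG_FAST_MAX_QUALITY, 1) selects the int method, because quality 95 is above the level at which the documentation says the difference becomes more pronounced. Given that the text describes the fast and int methods as having similar performance on AVX2-capable x86/x86-64 CPUs, always passing -dct int is a reasonable simplification on those systems.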