Provide a more thorough description of the trade-offs between the various DCT/IDCT algorithms, based on new research

git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.3.x@1286 632fc199-4ca6-4c93-a231-07263d6284db
DRC
2014-05-11 09:46:28 +00:00
parent 311f16e848
commit 6b48fbd229
5 changed files with 144 additions and 50 deletions


@@ -419,10 +419,16 @@ details.
 For the most part, libjpeg-turbo should produce identical output to libjpeg
 v6b. The one exception to this is when using the floating point DCT/IDCT, in
-which case the outputs of libjpeg v6b and libjpeg-turbo are not guaranteed to
-be identical (the accuracy of the floating point DCT/IDCT is constant when
-using libjpeg-turbo's SIMD extensions, but otherwise, it can depend heavily on
-the compiler and compiler settings.)
+which case the outputs of libjpeg v6b and libjpeg-turbo can differ for the
+following reasons:
+
+-- The SSE/SSE2 floating point DCT implementation in libjpeg-turbo is ever so
+slightly more accurate than the implementation in libjpeg v6b, but not by
+any amount perceptible to human vision (generally in the range of 0.01 to
+0.08 dB gain in PSNR.)
+-- When not using the SIMD extensions, then the accuracy of the floating point
+DCT/IDCT can depend on the compiler and compiler settings.
 While libjpeg-turbo does emulate the libjpeg v8 API/ABI, under the hood, it is
 still using the same algorithms as libjpeg v6b, so there are several specific
@@ -430,12 +436,14 @@ cases in which libjpeg-turbo cannot be expected to produce the same output as
 libjpeg v8:
 -- When decompressing using scaling factors of 1/2 and 1/4, because libjpeg v8
-implements those scaling algorithms a bit differently than libjpeg v6b does,
-and libjpeg-turbo's SIMD extensions are based on the libjpeg v6b behavior.
+implements those scaling algorithms differently than libjpeg v6b does, and
+libjpeg-turbo's SIMD extensions are based on the libjpeg v6b behavior.
 -- When using chrominance subsampling, because libjpeg v8 implements this
 with its DCT/IDCT scaling algorithms rather than with a separate
-downsampling/upsampling algorithm.
+downsampling/upsampling algorithm. In our testing, the subsampled/upsampled
+output of libjpeg v8 is less accurate than that of libjpeg v6b for this
+reason.
 -- When using the floating point IDCT, for the reasons stated above and also
 because the floating point IDCT algorithm was modified in libjpeg v8a to

23
cjpeg.1

@@ -1,4 +1,4 @@
-.TH CJPEG 1 "18 January 2013"
+.TH CJPEG 1 "11 May 2014"
 .SH NAME
 cjpeg \- compress an image file to a JPEG file
 .SH SYNOPSIS
@@ -166,14 +166,25 @@ Use integer DCT method (default).
 .TP
 .B \-dct fast
 Use fast integer DCT (less accurate).
+In libjpeg-turbo, the fast method is generally about 5-15% faster than the int
+method when using the x86/x86-64 SIMD extensions (results may vary with other
+SIMD implementations, or when using libjpeg-turbo without SIMD extensions.)
+For quality levels of 90 and below, there should be little or no perceptible
+difference between the two algorithms. For quality levels above 90, however,
+the difference between the fast and the int methods becomes more pronounced.
+With quality=97, for instance, the fast method incurs generally about a 1-3 dB
+loss (in PSNR) relative to the int method, but this can be larger for some
+images. Do not use the fast method with quality levels above 97. The
+algorithm often degenerates at quality=98 and above and can actually produce a
+more lossy image than if lower quality levels had been used.
 .TP
 .B \-dct float
 Use floating-point DCT method.
-The float method is very slightly more accurate than the int method, but is
-much slower unless your machine has very fast floating-point hardware. Also
-note that results of the floating-point method may vary slightly across
-machines, while the integer methods should give the same results everywhere.
-The fast integer method is much less accurate than the other two.
+The float method is mostly a legacy feature. It does not produce significantly
+more accurate results than the int method, and it is much slower. The float
+method may also give different results on different machines due to varying
+roundoff behavior, whereas the integer methods should give the same results on
+all machines.
 .TP
 .BI \-restart " N"
 Emit a JPEG restart marker every N MCU rows, or every N MCU blocks if "B" is

26
djpeg.1

@@ -1,4 +1,4 @@
-.TH DJPEG 1 "18 January 2013"
+.TH DJPEG 1 "11 May 2014"
 .SH NAME
 djpeg \- decompress a JPEG file to an image file
 .SH SYNOPSIS
@@ -115,14 +115,28 @@ Use integer DCT method (default).
 .TP
 .B \-dct fast
 Use fast integer DCT (less accurate).
+In libjpeg-turbo, the fast method is generally about 5-15% faster than the int
+method when using the x86/x86-64 SIMD extensions (results may vary with other
+SIMD implementations, or when using libjpeg-turbo without SIMD extensions.) If
+the JPEG image was compressed using a quality level of 85 or below, then there
+should be little or no perceptible difference between the two algorithms. When
+decompressing images that were compressed using quality levels above 85,
+however, the difference between the fast and int methods becomes more
+pronounced. With images compressed using quality=97, for instance, the fast
+method incurs generally about a 4-6 dB loss (in PSNR) relative to the int
+method, but this can be larger for some images. If you can avoid it, do not
+use the fast method when decompressing images that were compressed using
+quality levels above 97. The algorithm often degenerates for such images and
+can actually produce a more lossy output image than if the JPEG image had been
+compressed using lower quality levels.
 .TP
 .B \-dct float
 Use floating-point DCT method.
-The float method is very slightly more accurate than the int method, but is
-much slower unless your machine has very fast floating-point hardware. Also
-note that results of the floating-point method may vary slightly across
-machines, while the integer methods should give the same results everywhere.
-The fast integer method is much less accurate than the other two.
+The float method is mostly a legacy feature. It does not produce significantly
+more accurate results than the int method, and it is much slower. The float
+method may also give different results on different machines due to varying
+roundoff behavior, whereas the integer methods should give the same results on
+all machines.
 .TP
 .B \-dither fs
 Use Floyd-Steinberg dithering in color quantization.
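
The dB figures quoted in these man page hunks can be read as ratios of mean squared error. The relation below is a sketch that assumes the conventional PSNR definition for 8-bit samples (peak value 255), with both methods measured against the same reference image. The commit does not state how its PSNR values were measured, so the subscripts "int" and "fast" are labels chosen here for illustration.

```latex
\mathrm{PSNR} = 10 \log_{10}\!\left(\frac{255^{2}}{\mathrm{MSE}}\right)\ \text{dB}
\qquad\Longrightarrow\qquad
\Delta = \mathrm{PSNR}_{\text{int}} - \mathrm{PSNR}_{\text{fast}}
       = 10 \log_{10}\!\left(\frac{\mathrm{MSE}_{\text{fast}}}{\mathrm{MSE}_{\text{int}}}\right),
\qquad
\frac{\mathrm{MSE}_{\text{fast}}}{\mathrm{MSE}_{\text{int}}} = 10^{\Delta/10}

% Worked values:
%   \Delta = 1 dB  ->  ratio \approx 1.26
%   \Delta = 3 dB  ->  ratio \approx 2.00
%   \Delta = 4 dB  ->  ratio \approx 2.51
%   \Delta = 6 dB  ->  ratio \approx 3.98
```

Under that assumption, the 1-3 dB loss quoted for compression at quality=97 corresponds to roughly 1.3 to 2 times the mean squared error of the int method, and the 4-6 dB loss quoted for decompression corresponds to roughly 2.5 to 4 times.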


@@ -3,7 +3,7 @@ USING THE IJG JPEG LIBRARY
 This file was part of the Independent JPEG Group's software:
 Copyright (C) 1994-2011, Thomas G. Lane, Guido Vollbeding.
 Modifications:
-Copyright (C) 2010, D. R. Commander.
+Copyright (C) 2010, 2014, D. R. Commander.
 For conditions of distribution and use, see the accompanying README file.
@@ -886,14 +886,23 @@ J_DCT_METHOD dct_method
 JDCT_FLOAT: floating-point method
 JDCT_DEFAULT: default method (normally JDCT_ISLOW)
 JDCT_FASTEST: fastest method (normally JDCT_IFAST)
-The FLOAT method is very slightly more accurate than the ISLOW method,
-but may give different results on different machines due to varying
-roundoff behavior. The integer methods should give the same results
-on all machines. On machines with sufficiently fast FP hardware, the
-floating-point method may also be the fastest. The IFAST method is
-considerably less accurate than the other two; its use is not
-recommended if high quality is a concern. JDCT_DEFAULT and
-JDCT_FASTEST are macros configurable by each installation.
+In libjpeg-turbo, JDCT_IFAST is generally about 5-15% faster than
+JDCT_ISLOW when using the x86/x86-64 SIMD extensions (results may vary
+with other SIMD implementations, or when using libjpeg-turbo without
+SIMD extensions.) For quality levels of 90 and below, there should be
+little or no perceptible difference between the two algorithms. For
+quality levels above 90, however, the difference between JDCT_IFAST and
+JDCT_ISLOW becomes more pronounced. With quality=97, for instance,
+JDCT_IFAST incurs generally about a 1-3 dB loss (in PSNR) relative to
+JDCT_ISLOW, but this can be larger for some images. Do not use
+JDCT_IFAST with quality levels above 97. The algorithm often
+degenerates at quality=98 and above and can actually produce a more
+lossy image than if lower quality levels had been used. JDCT_FLOAT is
+mostly a legacy feature. It does not produce significantly more
+accurate results than the ISLOW method, and it is much slower. The
+FLOAT method may also give different results on different machines due
+to varying roundoff behavior, whereas the integer methods should give
+the same results on all machines.
 J_COLOR_SPACE jpeg_color_space
 int num_components
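
The compression guidance added in this hunk can be sketched as a small selection helper. This is an illustration only, not code from libjpeg-turbo: the function name `pick_compress_dct` and its `prefer_speed` flag are hypothetical, and the cutoff at quality 90 is a conservative reading of the new text (IFAST is only forbidden above 97, but the difference is described as more pronounced above 90). The stand-in enum exists only so the sketch compiles without libjpeg; it assumes that `jpeglib.h` defines the `JPEGLIB_H` include guard.

```c
/* Stand-in so this sketch builds without libjpeg installed. When jpeglib.h
 * has been included first (assumed to define JPEGLIB_H), the real
 * J_DCT_METHOD enum is used instead. */
#ifndef JPEGLIB_H
typedef enum { JDCT_ISLOW, JDCT_IFAST, JDCT_FLOAT } J_DCT_METHOD;
#endif

/* Hypothetical helper: choose a DCT method for compression at the given
 * quality level. prefer_speed is nonzero when the caller would trade a
 * little accuracy for speed. */
J_DCT_METHOD pick_compress_dct(int quality, int prefer_speed)
{
    /* Conservative assumption: use IFAST only where the text expects
     * little or no perceptible difference (quality 90 and below). */
    if (prefer_speed && quality <= 90)
        return JDCT_IFAST;

    /* Otherwise use the accurate integer method. This also covers
     * quality 98 and above, where IFAST is said to degenerate.
     * JDCT_FLOAT is never chosen, since the text calls it a legacy
     * feature that is slower and not significantly more accurate. */
    return JDCT_ISLOW;
}
```

In an application, the result would typically be assigned to the `dct_method` field of the compression object after `jpeg_set_defaults()` and before `jpeg_start_compress()`, for example `cinfo.dct_method = pick_compress_dct(quality, 1);`.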
@@ -1170,8 +1179,32 @@ int actual_number_of_colors
 Additional decompression parameters that the application may set include:
 J_DCT_METHOD dct_method
-Selects the algorithm used for the DCT step. Choices are the same
-as described above for compression.
+Selects the algorithm used for the DCT step. Choices are:
+JDCT_ISLOW: slow but accurate integer algorithm
+JDCT_IFAST: faster, less accurate integer method
+JDCT_FLOAT: floating-point method
+JDCT_DEFAULT: default method (normally JDCT_ISLOW)
+JDCT_FASTEST: fastest method (normally JDCT_IFAST)
+In libjpeg-turbo, JDCT_IFAST is generally about 5-15% faster than
+JDCT_ISLOW when using the x86/x86-64 SIMD extensions (results may vary
+with other SIMD implementations, or when using libjpeg-turbo without
+SIMD extensions.) If the JPEG image was compressed using a quality
+level of 85 or below, then there should be little or no perceptible
+difference between the two algorithms. When decompressing images that
+were compressed using quality levels above 85, however, the difference
+between JDCT_IFAST and JDCT_ISLOW becomes more pronounced. With images
+compressed using quality=97, for instance, JDCT_IFAST incurs generally
+about a 4-6 dB loss (in PSNR) relative to JDCT_ISLOW, but this can be
+larger for some images. If you can avoid it, do not use JDCT_IFAST
+when decompressing images that were compressed using quality levels
+above 97. The algorithm often degenerates for such images and can
+actually produce a more lossy output image than if the JPEG image had
+been compressed using lower quality levels. JDCT_FLOAT is mostly a
+legacy feature. It does not produce significantly more accurate
+results than the ISLOW method, and it is much slower. The FLOAT method
+may also give different results on different machines due to varying
+roundoff behavior, whereas the integer methods should give the same
+results on all machines.
 boolean do_fancy_upsampling
 If TRUE, do careful upsampling of chroma components. If FALSE,


@@ -172,13 +172,28 @@ Switches for advanced users:
 -dct int Use integer DCT method (default).
 -dct fast Use fast integer DCT (less accurate).
 -dct float Use floating-point DCT method.
-The float method is very slightly more accurate than
-the int method, but is much slower unless your machine
-has very fast floating-point hardware. Also note that
-results of the floating-point method may vary slightly
-across machines, while the integer methods should give
-the same results everywhere. The fast integer method
-is much less accurate than the other two.
+In libjpeg-turbo, the fast method is generally about
+5-15% faster than the int method when using the
+x86/x86-64 SIMD extensions (results may vary with other
+SIMD implementations, or when using libjpeg-turbo
+without SIMD extensions.) For quality levels of 90 and
+below, there should be little or no perceptible
+difference between the two algorithms. For quality
+levels above 90, however, the difference between
+the fast and the int methods becomes more pronounced.
+With quality=97, for instance, the fast method incurs
+generally about a 1-3 dB loss (in PSNR) relative to
+the int method, but this can be larger for some images.
+Do not use the fast method with quality levels above
+97. The algorithm often degenerates at quality=98 and
+above and can actually produce a more lossy image than
+if lower quality levels had been used. The float
+method is mostly a legacy feature. It does not produce
+significantly more accurate results than the int
+method, and it is much slower. The float method may
+also give different results on different machines due
+to varying roundoff behavior, whereas the integer
+methods should give the same results on all machines.
 -restart N Emit a JPEG restart marker every N MCU rows, or every
 N MCU blocks if "B" is attached to the number.
@@ -296,13 +311,32 @@ Switches for advanced users:
 -dct int Use integer DCT method (default).
 -dct fast Use fast integer DCT (less accurate).
 -dct float Use floating-point DCT method.
-The float method is very slightly more accurate than
-the int method, but is much slower unless your machine
-has very fast floating-point hardware. Also note that
-results of the floating-point method may vary slightly
-across machines, while the integer methods should give
-the same results everywhere. The fast integer method
-is much less accurate than the other two.
+In libjpeg-turbo, the fast method is generally about
+5-15% faster than the int method when using the
+x86/x86-64 SIMD extensions (results may vary with other
+SIMD implementations, or when using libjpeg-turbo
+without SIMD extensions.) If the JPEG image was
+compressed using a quality level of 85 or below, then
+there should be little or no perceptible difference
+between the two algorithms. When decompressing images
+that were compressed using quality levels above 85,
+however, the difference between the fast and int
+methods becomes more pronounced. With images
+compressed using quality=97, for instance, the fast
+method incurs generally about a 4-6 dB loss (in PSNR)
+relative to the int method, but this can be larger for
+some images. If you can avoid it, do not use the fast
+method when decompressing images that were compressed
+using quality levels above 97. The algorithm often
+degenerates for such images and can actually produce
+a more lossy output image than if the JPEG image had
+been compressed using lower quality levels. The float
+method is mostly a legacy feature. It does not produce
+significantly more accurate results than the int
+method, and it is much slower. The float method may
+also give different results on different machines due
+to varying roundoff behavior, whereas the integer
+methods should give the same results on all machines.
 -dither fs Use Floyd-Steinberg dithering in color quantization.
 -dither ordered Use ordered dithering in color quantization.
@@ -381,12 +415,6 @@ When producing a color-quantized image, "-onepass -dither ordered" is fast but
 much lower quality than the default behavior. "-dither none" may give
 acceptable results in two-pass mode, but is seldom tolerable in one-pass mode.
-If you are fortunate enough to have very fast floating point hardware,
-"-dct float" may be even faster than "-dct fast". But on most machines
-"-dct float" is slower than "-dct int"; in this case it is not worth using,
-because its theoretical accuracy advantage is too small to be significant
-in practice.
 Two-pass color quantization requires a good deal of memory; on MS-DOS machines
 it may run out of memory even with -maxmemory 0. In that case you can still
 decompress, with some loss of image quality, by specifying -onepass for