Provide a more thorough description of the trade-offs between the various DCT/IDCT algorithms, based on new research

DRC
2014-05-11 09:48:11 +00:00
parent 84f9fbfe3e
commit e7fe0653e5
5 changed files with 144 additions and 50 deletions


@@ -294,10 +294,16 @@ details.
For the most part, libjpeg-turbo should produce identical output to libjpeg
v6b. The one exception to this is when using the floating point DCT/IDCT, in
which case the outputs of libjpeg v6b and libjpeg-turbo are not guaranteed to
be identical (the accuracy of the floating point DCT/IDCT is constant when
using libjpeg-turbo's SIMD extensions, but otherwise, it can depend heavily on
the compiler and compiler settings.)
which case the outputs of libjpeg v6b and libjpeg-turbo can differ for the
following reasons:
-- The SSE/SSE2 floating point DCT implementation in libjpeg-turbo is ever so
slightly more accurate than the implementation in libjpeg v6b, but not by
any amount perceptible to human vision (generally in the range of 0.01 to
0.08 dB gain in PSNR.)
-- When not using the SIMD extensions, the accuracy of the floating point
DCT/IDCT can depend on the compiler and compiler settings.
While libjpeg-turbo does emulate the libjpeg v8 API/ABI, under the hood, it is
still using the same algorithms as libjpeg v6b, so there are several specific
@@ -305,12 +311,14 @@ cases in which libjpeg-turbo cannot be expected to produce the same output as
libjpeg v8:
-- When decompressing using scaling factors of 1/2 and 1/4, because libjpeg v8
implements those scaling algorithms a bit differently than libjpeg v6b does,
and libjpeg-turbo's SIMD extensions are based on the libjpeg v6b behavior.
implements those scaling algorithms differently than libjpeg v6b does, and
libjpeg-turbo's SIMD extensions are based on the libjpeg v6b behavior.
-- When using chrominance subsampling, because libjpeg v8 implements this
with its DCT/IDCT scaling algorithms rather than with a separate
downsampling/upsampling algorithm.
downsampling/upsampling algorithm. In our testing, the subsampled/upsampled
output of libjpeg v8 is less accurate than that of libjpeg v6b for this
reason.
-- When using the floating point IDCT, for the reasons stated above and also
because the floating point IDCT algorithm was modified in libjpeg v8a to

cjpeg.1

@@ -1,4 +1,4 @@
.TH CJPEG 1 "18 January 2013"
.TH CJPEG 1 "11 May 2014"
.SH NAME
cjpeg \- compress an image file to a JPEG file
.SH SYNOPSIS
@@ -166,14 +166,25 @@ Use integer DCT method (default).
.TP
.B \-dct fast
Use fast integer DCT (less accurate).
In libjpeg-turbo, the fast method is generally about 5-15% faster than the int
method when using the x86/x86-64 SIMD extensions (results may vary with other
SIMD implementations, or when using libjpeg-turbo without SIMD extensions.)
For quality levels of 90 and below, there should be little or no perceptible
difference between the two algorithms. For quality levels above 90, however,
the difference between the fast and the int methods becomes more pronounced.
With quality=97, for instance, the fast method incurs generally about a 1-3 dB
loss (in PSNR) relative to the int method, but this can be larger for some
images. Do not use the fast method with quality levels above 97. The
algorithm often degenerates at quality=98 and above and can actually produce a
more lossy image than if lower quality levels had been used.
.TP
.B \-dct float
Use floating-point DCT method.
The float method is very slightly more accurate than the int method, but is
much slower unless your machine has very fast floating-point hardware. Also
note that results of the floating-point method may vary slightly across
machines, while the integer methods should give the same results everywhere.
The fast integer method is much less accurate than the other two.
The float method is mostly a legacy feature. It does not produce significantly
more accurate results than the int method, and it is much slower. The float
method may also give different results on different machines due to varying
roundoff behavior, whereas the integer methods should give the same results on
all machines.
.TP
.BI \-restart " N"
Emit a JPEG restart marker every N MCU rows, or every N MCU blocks if "B" is

djpeg.1

@@ -1,4 +1,4 @@
.TH DJPEG 1 "18 January 2013"
.TH DJPEG 1 "11 May 2014"
.SH NAME
djpeg \- decompress a JPEG file to an image file
.SH SYNOPSIS
@@ -115,14 +115,28 @@ Use integer DCT method (default).
.TP
.B \-dct fast
Use fast integer DCT (less accurate).
In libjpeg-turbo, the fast method is generally about 5-15% faster than the int
method when using the x86/x86-64 SIMD extensions (results may vary with other
SIMD implementations, or when using libjpeg-turbo without SIMD extensions.) If
the JPEG image was compressed using a quality level of 85 or below, then there
should be little or no perceptible difference between the two algorithms. When
decompressing images that were compressed using quality levels above 85,
however, the difference between the fast and int methods becomes more
pronounced. With images compressed using quality=97, for instance, the fast
method incurs generally about a 4-6 dB loss (in PSNR) relative to the int
method, but this can be larger for some images. If you can avoid it, do not
use the fast method when decompressing images that were compressed using
quality levels above 97. The algorithm often degenerates for such images and
can actually produce a more lossy output image than if the JPEG image had been
compressed using lower quality levels.
.TP
.B \-dct float
Use floating-point DCT method.
The float method is very slightly more accurate than the int method, but is
much slower unless your machine has very fast floating-point hardware. Also
note that results of the floating-point method may vary slightly across
machines, while the integer methods should give the same results everywhere.
The fast integer method is much less accurate than the other two.
The float method is mostly a legacy feature. It does not produce significantly
more accurate results than the int method, and it is much slower. The float
method may also give different results on different machines due to varying
roundoff behavior, whereas the integer methods should give the same results on
all machines.
.TP
.B \-dither fs
Use Floyd-Steinberg dithering in color quantization.
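
The quality thresholds and PSNR figures quoted in the cjpeg.1 and djpeg.1 text above can be illustrated with a minimal sketch. Nothing here is part of libjpeg-turbo: the names `suggest_dct` and `psnr_db` are hypothetical, the thresholds (90 for compression, 85 for decompression) are copied from the man page text, and falling back to the int method for every quality level above those thresholds is a conservative assumption rather than a rule stated by the documentation.

```python
import math

# Highest quality level at which the man pages say the fast and int methods
# show "little or no perceptible difference". Values come from the text above;
# the table itself is a hypothetical helper, not a libjpeg-turbo API.
FAST_OK_MAX_QUALITY = {"compress": 90, "decompress": 85}


def suggest_dct(quality, mode="decompress", prefer_speed=True):
    """Suggest a value for the -dct switch ("int" or "fast").

    Assumption: above the documented threshold we always fall back to "int",
    which also covers the "do not use fast above 97" advice.
    """
    if mode not in FAST_OK_MAX_QUALITY:
        raise ValueError("mode must be 'compress' or 'decompress'")
    if not 1 <= quality <= 100:
        raise ValueError("quality must be in the range 1..100")
    if prefer_speed and quality <= FAST_OK_MAX_QUALITY[mode]:
        return "fast"
    return "int"


def psnr_db(reference, test, peak=255):
    """Peak signal-to-noise ratio, in dB, between two equal-length sample lists."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10 * math.log10(peak ** 2 / mse)
```

A "4-6 dB loss relative to the int method" then means that `psnr_db(original, fast_output)` comes out roughly 4 to 6 lower than `psnr_db(original, int_output)` for the same source image, where the sample lists would be the decoded pixel values.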


@@ -3,7 +3,7 @@ USING THE IJG JPEG LIBRARY
This file was part of the Independent JPEG Group's software:
Copyright (C) 1994-2011, Thomas G. Lane, Guido Vollbeding.
Modifications:
Copyright (C) 2010, D. R. Commander.
Copyright (C) 2010, 2014, D. R. Commander.
For conditions of distribution and use, see the accompanying README file.
@@ -886,14 +886,23 @@ J_DCT_METHOD dct_method
JDCT_FLOAT: floating-point method
JDCT_DEFAULT: default method (normally JDCT_ISLOW)
JDCT_FASTEST: fastest method (normally JDCT_IFAST)
The FLOAT method is very slightly more accurate than the ISLOW method,
but may give different results on different machines due to varying
roundoff behavior. The integer methods should give the same results
on all machines. On machines with sufficiently fast FP hardware, the
floating-point method may also be the fastest. The IFAST method is
considerably less accurate than the other two; its use is not
recommended if high quality is a concern. JDCT_DEFAULT and
JDCT_FASTEST are macros configurable by each installation.
In libjpeg-turbo, JDCT_IFAST is generally about 5-15% faster than
JDCT_ISLOW when using the x86/x86-64 SIMD extensions (results may vary
with other SIMD implementations, or when using libjpeg-turbo without
SIMD extensions.) For quality levels of 90 and below, there should be
little or no perceptible difference between the two algorithms. For
quality levels above 90, however, the difference between JDCT_IFAST and
JDCT_ISLOW becomes more pronounced. With quality=97, for instance,
JDCT_IFAST incurs generally about a 1-3 dB loss (in PSNR) relative to
JDCT_ISLOW, but this can be larger for some images. Do not use
JDCT_IFAST with quality levels above 97. The algorithm often
degenerates at quality=98 and above and can actually produce a more
lossy image than if lower quality levels had been used. JDCT_FLOAT is
mostly a legacy feature. It does not produce significantly more
accurate results than the ISLOW method, and it is much slower. The
FLOAT method may also give different results on different machines due
to varying roundoff behavior, whereas the integer methods should give
the same results on all machines.
J_COLOR_SPACE jpeg_color_space
int num_components
@@ -1170,8 +1179,32 @@ int actual_number_of_colors
Additional decompression parameters that the application may set include:
J_DCT_METHOD dct_method
Selects the algorithm used for the DCT step. Choices are the same
as described above for compression.
Selects the algorithm used for the DCT step. Choices are:
JDCT_ISLOW: slow but accurate integer algorithm
JDCT_IFAST: faster, less accurate integer method
JDCT_FLOAT: floating-point method
JDCT_DEFAULT: default method (normally JDCT_ISLOW)
JDCT_FASTEST: fastest method (normally JDCT_IFAST)
In libjpeg-turbo, JDCT_IFAST is generally about 5-15% faster than
JDCT_ISLOW when using the x86/x86-64 SIMD extensions (results may vary
with other SIMD implementations, or when using libjpeg-turbo without
SIMD extensions.) If the JPEG image was compressed using a quality
level of 85 or below, then there should be little or no perceptible
difference between the two algorithms. When decompressing images that
were compressed using quality levels above 85, however, the difference
between JDCT_IFAST and JDCT_ISLOW becomes more pronounced. With images
compressed using quality=97, for instance, JDCT_IFAST incurs generally
about a 4-6 dB loss (in PSNR) relative to JDCT_ISLOW, but this can be
larger for some images. If you can avoid it, do not use JDCT_IFAST
when decompressing images that were compressed using quality levels
above 97. The algorithm often degenerates for such images and can
actually produce a more lossy output image than if the JPEG image had
been compressed using lower quality levels. JDCT_FLOAT is mostly a
legacy feature. It does not produce significantly more accurate
results than the ISLOW method, and it is much slower. The FLOAT method
may also give different results on different machines due to varying
roundoff behavior, whereas the integer methods should give the same
results on all machines.
boolean do_fancy_upsampling
If TRUE, do careful upsampling of chroma components. If FALSE,


@@ -172,13 +172,28 @@ Switches for advanced users:
-dct int Use integer DCT method (default).
-dct fast Use fast integer DCT (less accurate).
-dct float Use floating-point DCT method.
The float method is very slightly more accurate than
the int method, but is much slower unless your machine
has very fast floating-point hardware. Also note that
results of the floating-point method may vary slightly
across machines, while the integer methods should give
the same results everywhere. The fast integer method
is much less accurate than the other two.
In libjpeg-turbo, the fast method is generally about
5-15% faster than the int method when using the
x86/x86-64 SIMD extensions (results may vary with other
SIMD implementations, or when using libjpeg-turbo
without SIMD extensions.) For quality levels of 90 and
below, there should be little or no perceptible
difference between the two algorithms. For quality
levels above 90, however, the difference between
the fast and the int methods becomes more pronounced.
With quality=97, for instance, the fast method incurs
generally about a 1-3 dB loss (in PSNR) relative to
the int method, but this can be larger for some images.
Do not use the fast method with quality levels above
97. The algorithm often degenerates at quality=98 and
above and can actually produce a more lossy image than
if lower quality levels had been used. The float
method is mostly a legacy feature. It does not produce
significantly more accurate results than the int
method, and it is much slower. The float method may
also give different results on different machines due
to varying roundoff behavior, whereas the integer
methods should give the same results on all machines.
-restart N Emit a JPEG restart marker every N MCU rows, or every
N MCU blocks if "B" is attached to the number.
@@ -296,13 +311,32 @@ Switches for advanced users:
-dct int Use integer DCT method (default).
-dct fast Use fast integer DCT (less accurate).
-dct float Use floating-point DCT method.
The float method is very slightly more accurate than
the int method, but is much slower unless your machine
has very fast floating-point hardware. Also note that
results of the floating-point method may vary slightly
across machines, while the integer methods should give
the same results everywhere. The fast integer method
is much less accurate than the other two.
In libjpeg-turbo, the fast method is generally about
5-15% faster than the int method when using the
x86/x86-64 SIMD extensions (results may vary with other
SIMD implementations, or when using libjpeg-turbo
without SIMD extensions.) If the JPEG image was
compressed using a quality level of 85 or below, then
there should be little or no perceptible difference
between the two algorithms. When decompressing images
that were compressed using quality levels above 85,
however, the difference between the fast and int
methods becomes more pronounced. With images
compressed using quality=97, for instance, the fast
method incurs generally about a 4-6 dB loss (in PSNR)
relative to the int method, but this can be larger for
some images. If you can avoid it, do not use the fast
method when decompressing images that were compressed
using quality levels above 97. The algorithm often
degenerates for such images and can actually produce
a more lossy output image than if the JPEG image had
been compressed using lower quality levels. The float
method is mostly a legacy feature. It does not produce
significantly more accurate results than the int
method, and it is much slower. The float method may
also give different results on different machines due
to varying roundoff behavior, whereas the integer
methods should give the same results on all machines.
-dither fs Use Floyd-Steinberg dithering in color quantization.
-dither ordered Use ordered dithering in color quantization.
@@ -381,12 +415,6 @@ When producing a color-quantized image, "-onepass -dither ordered" is fast but
much lower quality than the default behavior. "-dither none" may give
acceptable results in two-pass mode, but is seldom tolerable in one-pass mode.
If you are fortunate enough to have very fast floating point hardware,
"-dct float" may be even faster than "-dct fast". But on most machines
"-dct float" is slower than "-dct int"; in this case it is not worth using,
because its theoretical accuracy advantage is too small to be significant
in practice.
Two-pass color quantization requires a good deal of memory; on MS-DOS machines
it may run out of memory even with -maxmemory 0. In that case you can still
decompress, with some loss of image quality, by specifying -onepass for