Compare commits

18 Commits
v2.0 ... v2.1

Author SHA1 Message Date
Josh Aas
594b7258cc Bump version to 2.1. 2014-07-29 09:12:48 -05:00
fbossen
0533b31891 Merge pull request #79 from arjun024/devel
more logical flow of control if trellis_quant enabled
2014-07-25 09:02:45 -04:00
Arjun Sreedharan
cca53c920d more logical flow of control if trellis_quant enabled
if trellis_quant is enabled, increment total number of
passes by optimization beginning pass number.

Signed-off-by: Arjun Sreedharan <arjun024@gmail.com>
2014-07-25 17:33:35 +05:30
Josh Aas
5901802871 Update README.md with link to 2.0 announcement and mailing list. 2014-07-24 16:39:26 -05:00
Frank Bossen
fbef31f76d Add option to disable progressive coding in cjpeg
Redefine baseline option in cjpeg to actually create a baseline JPEG
file by disabling progressive coding
2014-07-24 17:09:27 -04:00
Frank Bossen
1aa50b71d9 Use precomputed table
From jpeglib-turbo r1221:
Integrate a slightly modified version of Mozilla's patch for
precomputing the bit-counting LUT.  This is useful if the table needs
to be shared among multiple processes, although the primary reason for
doing that is reduced footprint on mobile devices, which are probably
already covered by the clz intrinsic code.
2014-07-24 10:50:59 -04:00
Frank Bossen
3adc64a4cb Improve floating point DCT
From libjpeg-turbo r1288
Port the more accurate (and slightly faster) floating point IDCT
implementation from jpeg-8a and later.  New research revealed that the
SSE/SSE2 floating point IDCT implementation was actually more accurate
than the jpeg-6b implementation, not less, which is why its
mathematical results have always differed from those of the jpeg-6b
implementation.  This patch brings the accuracy of the C code in line
with that of the SSE/SSE2 code.
2014-07-23 10:26:46 -04:00
Frank Bossen
ccb1d12f53 Update doc re: various DCT implementations
From libjpeg-turbo r1287
2014-07-23 10:23:24 -04:00
Frank Bossen
c73a82c6aa Fix build for wrjpgcom
See r1325 in libjpeg-turbo
2014-07-23 09:22:55 -04:00
Frank Bossen
b9f25333f6 Disable scan optimization if no scan given
Addresses segmentation fault issue in #69
2014-07-21 20:11:46 +02:00
Josh Aas
0f6b96c68e Merge pull request #73 from pornel/master
Added few error messages in cjpeg
2014-07-21 11:23:20 -05:00
Josh Aas
eeea9ed397 Merge pull request #72 from jlongman/master
Bugfix: AM_PROG_AR is not recognized by older automake, so only use it w...
2014-07-21 11:21:30 -05:00
Josh Aas
b95528727f Merge pull request #75 from pmed/master
Fixed mozjpeg build with Visual C++ 2010
2014-07-21 11:18:33 -05:00
Pavel Medvedev
d9605b0560 Fixed mozjpeg build with Visual C++ 2010
Moved several variable declarations out of inner scopes to the function scope to compile C code with Visual C++ 2010
2014-07-21 15:32:54 +04:00
Kornel Lesiński
40e6e8b2a2 Added few error messages in cjpeg 2014-07-20 16:11:59 +01:00
Frank Bossen
514307e9e6 Fix trellis for nonprogressive mode (#69)
Correct number of trellis passes when in nonprogressive mode
2014-07-18 18:10:56 +02:00
jlongman
7388a54647 Bugfix: AM_PROG_AR is not recognized by older automake, so only use it when defined 2014-07-17 14:16:24 -04:00
Josh Aas
e7a135b930 Bump version number for 2.0, make this version 2.0.1. 2014-07-15 13:55:55 -05:00
16 changed files with 4352 additions and 151 deletions

View File

@@ -5,7 +5,7 @@
cmake_minimum_required(VERSION 2.6)
project(libmozjpeg C)
set(VERSION 2.0pre)
set(VERSION 2.1)
if(MINGW OR CYGWIN)
execute_process(COMMAND "date" "+%Y%m%d" OUTPUT_VARIABLE BUILD)
@@ -264,7 +264,7 @@ set_property(TARGET jpegtran-static PROPERTY COMPILE_FLAGS "-DUSE_SETMODE")
add_executable(rdjpgcom rdjpgcom.c)
add_executable(wrjpgcom rdjpgcom.c)
add_executable(wrjpgcom wrjpgcom.c)
#

View File

@@ -23,6 +23,14 @@ and is out of scope for a codec library.
[5] The TurboJPEG API can now be used to compress JPEG images from YUV planar
source images.
[7] Improved the accuracy and performance of the non-SIMD implementation of the
floating point inverse DCT (using code borrowed from libjpeg v8a and later.)
The accuracy of this implementation now matches the accuracy of the SSE/SSE2
implementation. Note, however, that the floating point DCT/IDCT algorithms are
mainly a legacy feature. They generally do not produce significantly better
accuracy than the slow integer DCT/IDCT algorithms, and they are quite a bit
slower.
1.3.1
=====

View File

@@ -419,10 +419,25 @@ details.
For the most part, libjpeg-turbo should produce identical output to libjpeg
v6b. The one exception to this is when using the floating point DCT/IDCT, in
which case the outputs of libjpeg v6b and libjpeg-turbo are not guaranteed to
be identical (the accuracy of the floating point DCT/IDCT is constant when
using libjpeg-turbo's SIMD extensions, but otherwise, it can depend heavily on
the compiler and compiler settings.)
which case the outputs of libjpeg v6b and libjpeg-turbo can differ for the
following reasons:
-- The SSE/SSE2 floating point DCT implementation in libjpeg-turbo is ever so
slightly more accurate than the implementation in libjpeg v6b, but not by
any amount perceptible to human vision (generally in the range of 0.01 to
0.08 dB gain in PSNR.)
-- When not using the SIMD extensions, libjpeg-turbo uses the more accurate
(and slightly faster) floating point IDCT algorithm introduced in libjpeg
v8a as opposed to the algorithm used in libjpeg v6b. It should be noted,
however, that this algorithm basically brings the accuracy of the floating
point IDCT in line with the accuracy of the slow integer IDCT. The floating
point DCT/IDCT algorithms are mainly a legacy feature, and they do not
produce significantly more accuracy than the slow integer algorithms (to put
numbers on this, the typical difference in PSNR between the two algorithms
is less than 0.10 dB, whereas changing the quality level by 1 in the upper
range of the quality scale is typically more like a 1.0 dB difference.)
-- When not using the SIMD extensions, then the accuracy of the floating point
DCT/IDCT can depend on the compiler and compiler settings.
While libjpeg-turbo does emulate the libjpeg v8 API/ABI, under the hood, it is
still using the same algorithms as libjpeg v6b, so there are several specific
@@ -430,16 +445,14 @@ cases in which libjpeg-turbo cannot be expected to produce the same output as
libjpeg v8:
-- When decompressing using scaling factors of 1/2 and 1/4, because libjpeg v8
implements those scaling algorithms a bit differently than libjpeg v6b does,
and libjpeg-turbo's SIMD extensions are based on the libjpeg v6b behavior.
implements those scaling algorithms differently than libjpeg v6b does, and
libjpeg-turbo's SIMD extensions are based on the libjpeg v6b behavior.
-- When using chrominance subsampling, because libjpeg v8 implements this
with its DCT/IDCT scaling algorithms rather than with a separate
downsampling/upsampling algorithm.
-- When using the floating point IDCT, for the reasons stated above and also
because the floating point IDCT algorithm was modified in libjpeg v8a to
improve accuracy.
downsampling/upsampling algorithm. In our testing, the subsampled/upsampled
output of libjpeg v8 is less accurate than that of libjpeg v6b for this
reason.
-- When decompressing using a scaling factor > 1 and merged (AKA "non-fancy" or
"non-smooth") chrominance upsampling, because libjpeg v8 does not support

View File

@@ -7,6 +7,8 @@ The idea is to reduce transfer times for JPEGs on the Web, thus reducing page lo
'mozjpeg' is not intended to be a general JPEG library replacement. It makes tradeoffs that are intended to benefit Web use cases and focuses solely on improving encoding. It is best used as part of a Web encoding workflow. For a general JPEG library (e.g. your system libjpeg), especially if you care about decoding, we recommend libjpeg-turbo.
For more information, see the project announcement:
More information:
https://blog.mozilla.org/research/2014/03/05/introducing-the-mozjpeg-project/
* [Version 1.0 Announcement](https://blog.mozilla.org/research/2014/03/05/introducing-the-mozjpeg-project/)
* [Version 2.0 Announcement](https://blog.mozilla.org/research/2014/07/15/mozilla-advances-jpeg-encoding-with-mozjpeg-2-0/)
* [Mailing List](https://lists.mozilla.org/listinfo/dev-mozjpeg)

23
cjpeg.1
View File

@@ -1,4 +1,4 @@
.TH CJPEG 1 "18 January 2013"
.TH CJPEG 1 "11 May 2014"
.SH NAME
cjpeg \- compress an image file to a JPEG file
.SH SYNOPSIS
@@ -166,14 +166,25 @@ Use integer DCT method (default).
.TP
.B \-dct fast
Use fast integer DCT (less accurate).
In libjpeg-turbo, the fast method is generally about 5-15% faster than the int
method when using the x86/x86-64 SIMD extensions (results may vary with other
SIMD implementations, or when using libjpeg-turbo without SIMD extensions.)
For quality levels of 90 and below, there should be little or no perceptible
difference between the two algorithms. For quality levels above 90, however,
the difference between the fast and the int methods becomes more pronounced.
With quality=97, for instance, the fast method incurs generally about a 1-3 dB
loss (in PSNR) relative to the int method, but this can be larger for some
images. Do not use the fast method with quality levels above 97. The
algorithm often degenerates at quality=98 and above and can actually produce a
more lossy image than if lower quality levels had been used.
.TP
.B \-dct float
Use floating-point DCT method.
The float method is very slightly more accurate than the int method, but is
much slower unless your machine has very fast floating-point hardware. Also
note that results of the floating-point method may vary slightly across
machines, while the integer methods should give the same results everywhere.
The fast integer method is much less accurate than the other two.
The float method is mostly a legacy feature. It does not produce significantly
more accurate results than the int method, and it is much slower. The float
method may also give different results on different machines due to varying
roundoff behavior, whereas the integer methods should give the same results on
all machines.
.TP
.BI \-restart " N"
Emit a JPEG restart marker every N MCU rows, or every N MCU blocks if "B" is

40
cjpeg.c
View File

@@ -168,6 +168,7 @@ usage (void)
#ifdef C_PROGRESSIVE_SUPPORTED
fprintf(stderr, " -progressive Create progressive JPEG file (enabled by default)\n");
#endif
fprintf(stderr, " -baseline Create baseline JPEG file (disable progressive coding)\n");
#ifdef TARGA_SUPPORTED
fprintf(stderr, " -targa Input file is Targa format (usually not needed)\n");
#endif
@@ -206,7 +207,6 @@ usage (void)
#endif
fprintf(stderr, " -verbose or -debug Emit debug output\n");
fprintf(stderr, "Switches for wizards:\n");
fprintf(stderr, " -baseline Force baseline quantization tables\n");
fprintf(stderr, " -qtables file Use quantization tables given in file\n");
fprintf(stderr, " -qslots N[,...] Set component quantization tables\n");
fprintf(stderr, " -sample HxV[,...] Set component sampling factors\n");
@@ -279,11 +279,17 @@ parse_switches (j_compress_ptr cinfo, int argc, char **argv,
} else if (keymatch(arg, "baseline", 1)) {
/* Force baseline-compatible output (8-bit quantizer values). */
force_baseline = TRUE;
/* Disable multiple scans */
simple_progressive = FALSE;
cinfo->num_scans = 0;
cinfo->scan_info = NULL;
} else if (keymatch(arg, "dct", 2)) {
/* Select DCT algorithm. */
if (++argn >= argc) /* advance to next argument */
if (++argn >= argc) { /* advance to next argument */
fprintf(stderr, "%s: missing argument for dct\n", progname);
usage();
}
if (keymatch(argv[argn], "int", 1)) {
cinfo->dct_method = JDCT_ISLOW;
} else if (keymatch(argv[argn], "fast", 2)) {
@@ -291,6 +297,7 @@ parse_switches (j_compress_ptr cinfo, int argc, char **argv,
} else if (keymatch(argv[argn], "float", 2)) {
cinfo->dct_method = JDCT_FLOAT;
} else {
fprintf(stderr, "%s: invalid argument for dct\n", progname);
usage();
}
} else if (keymatch(arg, "debug", 1) || keymatch(arg, "verbose", 1)) {
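The `-dct` handling in this hunk matches abbreviated keywords and prints a diagnostic for an unrecognized value. A minimal sketch of that pattern follows. The names `dct_choice`, `abbrev_match` and `parse_dct_arg` are hypothetical stand-ins, not the library's identifiers, and `abbrev_match` is a simplified re-implementation of keyword abbreviation matching written for this note. The final `else` is braced so that the message and the failure result stay in the same branch.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical stand-ins for the library's J_DCT_METHOD values. */
typedef enum { DCT_ISLOW, DCT_IFAST, DCT_FLOAT, DCT_INVALID } dct_choice;

/* Return 1 if arg is a case-insensitive abbreviation of keyword that is
 * at least minchars long; keyword is assumed to be lowercase already. */
static int abbrev_match(const char *arg, const char *keyword, int minchars)
{
  int nmatched = 0;

  while (*arg != '\0') {
    if (*keyword == '\0')
      return 0;                 /* arg is longer than keyword */
    if (tolower((unsigned char) *arg) != *keyword)
      return 0;                 /* character mismatch */
    arg++;
    keyword++;
    nmatched++;
  }
  return nmatched >= minchars;  /* reject abbreviations that are too short */
}

/* Map the value given to -dct onto a method, reporting unknown values. */
static dct_choice parse_dct_arg(const char *progname, const char *value)
{
  if (abbrev_match(value, "int", 1)) {
    return DCT_ISLOW;
  } else if (abbrev_match(value, "fast", 2)) {
    return DCT_IFAST;
  } else if (abbrev_match(value, "float", 2)) {
    return DCT_FLOAT;
  } else {
    /* Braces keep the diagnostic and the failure result in one branch. */
    fprintf(stderr, "%s: invalid argument for dct\n", progname);
    return DCT_INVALID;
  }
}
```

With minimum lengths of 1, 2 and 2, `i` selects the integer method, while a bare `f` is rejected because it cannot distinguish `fast` from `float`.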
@@ -314,7 +321,7 @@ parse_switches (j_compress_ptr cinfo, int argc, char **argv,
} else if (keymatch(arg, "flat", 4)) {
cinfo->use_flat_quant_tbl = TRUE;
jpeg_set_quality(cinfo, 75, TRUE);
} else if (keymatch(arg, "grayscale", 2) || keymatch(arg, "greyscale",2)) {
/* Force a monochrome JPEG file to be generated. */
jpeg_set_colorspace(cinfo, JCS_GRAYSCALE);
@@ -327,12 +334,12 @@ parse_switches (j_compress_ptr cinfo, int argc, char **argv,
if (++argn >= argc) /* advance to next argument */
usage();
cinfo->lambda_log_scale1 = atof(argv[argn]);
} else if (keymatch(arg, "lambda2", 7)) {
if (++argn >= argc) /* advance to next argument */
usage();
cinfo->lambda_log_scale2 = atof(argv[argn]);
} else if (keymatch(arg, "maxmemory", 3)) {
/* Maximum memory in Kb (or Mb with 'm'). */
long lval;
@@ -348,7 +355,7 @@ parse_switches (j_compress_ptr cinfo, int argc, char **argv,
} else if (keymatch(arg, "multidcscan", 3)) {
cinfo->one_dc_scan = FALSE;
} else if (keymatch(arg, "optimize", 1) || keymatch(arg, "optimise", 1)) {
/* Enable entropy parm optimization. */
#ifdef ENTROPY_OPT_SUPPORTED
@@ -361,8 +368,10 @@ parse_switches (j_compress_ptr cinfo, int argc, char **argv,
} else if (keymatch(arg, "outfile", 4)) {
/* Set output file name. */
if (++argn >= argc) /* advance to next argument */
if (++argn >= argc) { /* advance to next argument */
fprintf(stderr, "%s: missing argument for outfile\n", progname);
usage();
}
outfilename = argv[argn]; /* save it away for later use */
} else if (keymatch(arg, "progressive", 1)) {
@@ -388,8 +397,10 @@ parse_switches (j_compress_ptr cinfo, int argc, char **argv,
} else if (keymatch(arg, "quality", 1)) {
/* Quality ratings (quantization table scaling factors). */
if (++argn >= argc) /* advance to next argument */
if (++argn >= argc) { /* advance to next argument */
fprintf(stderr, "%s: missing argument for quality\n", progname);
usage();
}
qualityarg = argv[argn];
} else if (keymatch(arg, "qslots", 2)) {
@@ -505,6 +516,7 @@ parse_switches (j_compress_ptr cinfo, int argc, char **argv,
jpeg_set_quality(cinfo, 75, TRUE);
} else {
fprintf(stderr, "%s: unknown option '%s'\n", progname, arg);
usage(); /* bogus switch */
}
}
@@ -516,20 +528,26 @@ parse_switches (j_compress_ptr cinfo, int argc, char **argv,
/* Set quantization tables for selected quality. */
/* Some or all may be overridden if -qtables is present. */
if (qualityarg != NULL) /* process -quality if it was present */
if (! set_quality_ratings(cinfo, qualityarg, force_baseline))
if (! set_quality_ratings(cinfo, qualityarg, force_baseline)) {
fprintf(stderr, "%s: can't set quality ratings\n", progname);
usage();
}
if (qtablefile != NULL) /* process -qtables if it was present */
if (! read_quant_tables(cinfo, qtablefile, force_baseline))
if (! read_quant_tables(cinfo, qtablefile, force_baseline)) {
fprintf(stderr, "%s: can't read qtable file\n", progname);
usage();
}
if (qslotsarg != NULL) /* process -qslots if it was present */
if (! set_quant_slots(cinfo, qslotsarg))
usage();
if (samplearg != NULL) /* process -sample if it was present */
if (! set_sample_factors(cinfo, samplearg))
if (! set_sample_factors(cinfo, samplearg)) {
fprintf(stderr, "%s: can't set sample factors\n", progname);
usage();
}
#ifdef C_PROGRESSIVE_SUPPORTED
if (simple_progressive) /* process -progressive; -scans can override */
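Several hunks in this file repeat the same shape: advance to the next argument and, if there is none, print a message naming the switch before showing the usage text. One way to factor that out is sketched below. `next_switch_value` is a hypothetical helper that does not appear in the patch, and it returns NULL instead of calling `usage()` so that the sketch stays self-contained.

```c
#include <stdio.h>

/* Hypothetical helper: advance *argn to the value that follows switch
 * `name`; return that value, or NULL after printing a diagnostic. */
static const char *next_switch_value(int *argn, int argc, char **argv,
                                     const char *progname, const char *name)
{
  if (++(*argn) >= argc) {
    fprintf(stderr, "%s: missing argument for %s\n", progname, name);
    return NULL;
  }
  return argv[*argn];
}
```

A caller would treat a NULL result the way the patch treats the failing `++argn >= argc` test, by printing the usage text and exiting.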

View File

@@ -2,7 +2,7 @@
# Process this file with autoconf to produce a configure script.
AC_PREREQ([2.56])
AC_INIT([libmozjpeg], [2.0pre])
AC_INIT([libmozjpeg], [2.1])
BUILD=`date +%Y%m%d`
AM_INIT_AUTOMAKE([-Wall foreign dist-bzip2])
@@ -20,7 +20,7 @@ AC_PROG_CC
AM_PROG_CC_C_O
AM_PROG_AS
AC_PROG_INSTALL
AM_PROG_AR
m4_ifdef([AM_PROG_AR], [AM_PROG_AR])
AC_PROG_LIBTOOL
AC_PROG_LN_S

26
djpeg.1
View File

@@ -1,4 +1,4 @@
.TH DJPEG 1 "18 January 2013"
.TH DJPEG 1 "11 May 2014"
.SH NAME
djpeg \- decompress a JPEG file to an image file
.SH SYNOPSIS
@@ -115,14 +115,28 @@ Use integer DCT method (default).
.TP
.B \-dct fast
Use fast integer DCT (less accurate).
In libjpeg-turbo, the fast method is generally about 5-15% faster than the int
method when using the x86/x86-64 SIMD extensions (results may vary with other
SIMD implementations, or when using libjpeg-turbo without SIMD extensions.) If
the JPEG image was compressed using a quality level of 85 or below, then there
should be little or no perceptible difference between the two algorithms. When
decompressing images that were compressed using quality levels above 85,
however, the difference between the fast and int methods becomes more
pronounced. With images compressed using quality=97, for instance, the fast
method incurs generally about a 4-6 dB loss (in PSNR) relative to the int
method, but this can be larger for some images. If you can avoid it, do not
use the fast method when decompressing images that were compressed using
quality levels above 97. The algorithm often degenerates for such images and
can actually produce a more lossy output image than if the JPEG image had been
compressed using lower quality levels.
.TP
.B \-dct float
Use floating-point DCT method.
The float method is very slightly more accurate than the int method, but is
much slower unless your machine has very fast floating-point hardware. Also
note that results of the floating-point method may vary slightly across
machines, while the integer methods should give the same results everywhere.
The fast integer method is much less accurate than the other two.
The float method is mostly a legacy feature. It does not produce significantly
more accurate results than the int method, and it is much slower. The float
method may also give different results on different machines due to varying
roundoff behavior, whereas the integer methods should give the same results on
all machines.
.TP
.B \-dither fs
Use Floyd-Steinberg dithering in color quantization.

View File

@@ -46,7 +46,7 @@ jpeg_start_compress (j_compress_ptr cinfo, boolean write_all_tables)
jpeg_suppress_tables(cinfo, FALSE); /* mark all tables to be written */
/* setting up scan optimisation pattern failed, disable scan optimisation */
if (cinfo->num_scans_luma == 0)
if (cinfo->num_scans_luma == 0 || cinfo->scan_info == NULL || cinfo->num_scans == 0)
cinfo->optimize_scans = FALSE;
/* (Re)initialize error mgr and destination modules */
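The guard in this hunk pairs with the `-baseline` change in cjpeg.c, which clears the scan script: once no scan script is present, scan optimization has nothing to operate on and is switched off. The sketch below illustrates that interaction under stated assumptions. `scan_settings` and both function names are hypothetical, and only the fields that appear in the two hunks are modelled.

```c
#include <stddef.h>

/* Hypothetical subset of the compression settings touched by -baseline
 * and by the guard in jpeg_start_compress. */
typedef struct {
  int num_scans;
  const void *scan_info;   /* scan script, NULL when none is set */
  int num_scans_luma;
  int optimize_scans;
} scan_settings;

/* What -baseline does to the scan script: drop it entirely. */
static void clear_scan_script(scan_settings *s)
{
  s->num_scans = 0;
  s->scan_info = NULL;
}

/* Scan optimization needs a scan script; turn it off when none exists. */
static void disable_scan_opt_without_script(scan_settings *s)
{
  if (s->num_scans_luma == 0 || s->scan_info == NULL || s->num_scans == 0)
    s->optimize_scans = 0;
}
```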

View File

@@ -543,6 +543,8 @@ forward_DCT_float (j_compress_ptr cinfo, jpeg_component_info * compptr,
FAST_FLOAT * divisors = fdct->float_divisors[compptr->quant_tbl_no];
FAST_FLOAT * workspace;
JDIMENSION bi;
float v;
int x;
/* Make sure the compiler doesn't look up these every pass */
@@ -572,10 +574,10 @@ forward_DCT_float (j_compress_ptr cinfo, jpeg_component_info * compptr,
};
for (i = 0; i < DCTSIZE2; i++) {
float v = workspace[i];
v = workspace[i];
v /= aanscalefactor[i%8];
v /= aanscalefactor[i/8];
int x = (v >= 0.0) ? (int)(v + 0.5) : (int)(v - 0.5);
x = (v >= 0.0) ? (int)(v + 0.5) : (int)(v - 0.5);
dst[bi][i] = x;
}
}
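This hunk moves `v` and `x` to the top of the function because C89 compilers such as Visual C++ 2010 reject a declaration that follows a statement in the same block. The rounding expression itself is unchanged. A minimal sketch of the same computation in C89 style is shown below. `round_half_away` and `quantize_block` are hypothetical names, and the divisor array stands in for the scale factors applied in the real loop.

```c
/* Round to the nearest integer, with halves rounded away from zero.
 * The declaration precedes every statement, as C89 requires. */
static int round_half_away(float v)
{
  int x;

  x = (v >= 0.0) ? (int)(v + 0.5) : (int)(v - 0.5);
  return x;
}

/* Divide each input by its divisor and round: out[i] = round(in[i] / div[i]). */
static void quantize_block(const float *in, const float *div, int *out, int n)
{
  int i;
  float v;   /* declared before any statement, not in the middle of the loop */

  for (i = 0; i < n; i++) {
    v = in[i];
    v /= div[i];
    out[i] = round_half_away(v);
  }
}
```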
@@ -588,9 +590,7 @@ forward_DCT_float (j_compress_ptr cinfo, jpeg_component_info * compptr,
#endif /* DCT_FLOAT_SUPPORTED */
#include "jchuff.h"
static unsigned char jpeg_nbits_table[65536];
static int jpeg_nbits_table_init = 0;
#include "jpeg_nbits_table.h"
static const float jpeg_lambda_weights_flat[64] = {
1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f,
@@ -637,6 +637,10 @@ quantize_trellis(j_compress_ptr cinfo, c_derived_tbl *actbl, JBLOCKROW coef_bloc
int has_eob;
float cost_all_zeros;
float best_cost_skip;
float cost;
int zero_run;
int run_bits;
int rate;
Ss = cinfo->Ss;
Se = cinfo->Se;
@@ -654,15 +658,6 @@ quantize_trellis(j_compress_ptr cinfo, c_derived_tbl *actbl, JBLOCKROW coef_bloc
requires_eob[0] = 0;
}
if(!jpeg_nbits_table_init) {
for(i = 0; i < 65536; i++) {
int nbits = 0, temp = i;
while (temp) {temp >>= 1; nbits++;}
jpeg_nbits_table[i] = nbits;
}
jpeg_nbits_table_init = 1;
}
norm = 0.0;
for (i = 1; i < DCTSIZE2; i++) {
norm += qtbl->quantval[i] * qtbl->quantval[i];
@@ -725,11 +720,11 @@ quantize_trellis(j_compress_ptr cinfo, c_derived_tbl *actbl, JBLOCKROW coef_bloc
if (j != Ss-1 && coef_blocks[bi][zz] == 0)
continue;
int zero_run = i - 1 - j;
zero_run = i - 1 - j;
if ((zero_run >> 4) && actbl->ehufsi[0xf0] == 0)
continue;
int run_bits = (zero_run >> 4) * actbl->ehufsi[0xf0];
run_bits = (zero_run >> 4) * actbl->ehufsi[0xf0];
zero_run &= 15;
for (k = 0; k < num_candidates; k++) {
@@ -737,8 +732,8 @@ quantize_trellis(j_compress_ptr cinfo, c_derived_tbl *actbl, JBLOCKROW coef_bloc
if (coef_bits == 0)
continue;
int rate = coef_bits + candidate_bits[k] + run_bits;
float cost = rate + candidate_dist[k];
rate = coef_bits + candidate_bits[k] + run_bits;
cost = rate + candidate_dist[k];
cost += accumulated_zero_dist[i-1] - accumulated_zero_dist[j] + accumulated_cost[j];
if (cost < accumulated_cost[i]) {

View File

@@ -22,8 +22,7 @@
#include "jchuff.h" /* Declarations shared with jcphuff.c */
#include <limits.h>
static unsigned char jpeg_nbits_table[65536];
static int jpeg_nbits_table_init = 0;
#include "jpeg_nbits_table.h"
#ifndef min
#define min(a,b) ((a)<(b)?(a):(b))
@@ -271,15 +270,6 @@ jpeg_make_c_derived_tbl (j_compress_ptr cinfo, boolean isDC, int tblno,
dtbl->ehufco[i] = huffcode[p];
dtbl->ehufsi[i] = huffsize[p];
}
if(!jpeg_nbits_table_init) {
for(i = 0; i < 65536; i++) {
int nbits = 0, temp = i;
while (temp) {temp >>= 1; nbits++;}
jpeg_nbits_table[i] = nbits;
}
jpeg_nbits_table_init = 1;
}
}

View File

@@ -933,11 +933,10 @@ jinit_c_master_control (j_compress_ptr cinfo, boolean transcode_only)
else
master->total_passes = cinfo->num_scans;
master->pass_number_scan_opt_base = 0;
if (cinfo->trellis_quant) {
if (cinfo->progressive_mode)
master->total_passes += ((cinfo->use_scans_in_trellis) ? 4 : 2) * cinfo->num_components * cinfo->trellis_num_loops;
else
master->total_passes += 1;
master->pass_number_scan_opt_base = ((cinfo->use_scans_in_trellis) ? 4 : 2) * cinfo->num_components * cinfo->trellis_num_loops;
master->total_passes += master->pass_number_scan_opt_base;
}
if (cinfo->optimize_scans) {
@@ -947,9 +946,4 @@ jinit_c_master_control (j_compress_ptr cinfo, boolean transcode_only)
for (i = 0; i < cinfo->num_scans; i++)
master->scan_buffer[i] = NULL;
}
if (cinfo->trellis_quant)
master->pass_number_scan_opt_base = ((cinfo->use_scans_in_trellis) ? 4 : 2) * cinfo->num_components * cinfo->trellis_num_loops;
else
master->pass_number_scan_opt_base = 0;
}
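After this change the trellis pass count is computed once, stored as the base pass number for scan optimization, and added to the running total whether or not progressive mode is on. The arithmetic can be sketched as follows. `trellis_settings` and both function names are hypothetical, and only the fields used in the hunk are modelled.

```c
/* Hypothetical subset of the encoder settings used by the hunk above. */
typedef struct {
  int trellis_quant;         /* nonzero if trellis quantization is enabled */
  int use_scans_in_trellis;  /* nonzero selects a factor of 4 instead of 2 */
  int num_components;
  int trellis_num_loops;
} trellis_settings;

/* Number of trellis passes that run before scan optimization starts. */
static int trellis_pass_base(const trellis_settings *s)
{
  if (!s->trellis_quant)
    return 0;
  return (s->use_scans_in_trellis ? 4 : 2) *
         s->num_components * s->trellis_num_loops;
}

/* Total passes: the passes counted so far plus the trellis base. */
static int total_passes_with_trellis(int passes_so_far,
                                     const trellis_settings *s)
{
  return passes_so_far + trellis_pass_base(s);
}
```

For a three-component image with one trellis loop and `use_scans_in_trellis` off, the base is 2 * 3 * 1 = 6 passes.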

View File

@@ -1,9 +1,12 @@
/*
* jidctflt.c
*
* This file was part of the Independent JPEG Group's software:
* Copyright (C) 1994-1998, Thomas G. Lane.
* This file is part of the Independent JPEG Group's software.
* For conditions of distribution and use, see the accompanying README file.
* Modified 2010 by Guido Vollbeding.
* libjpeg-turbo Modifications:
* Copyright (C) 2014, D. R. Commander.
* For conditions of distribution and use, see the accompanying README file.
*
* This file contains a floating-point implementation of the
* inverse DCT (Discrete Cosine Transform). In the IJG code, this routine
@@ -76,10 +79,10 @@ jpeg_idct_float (j_decompress_ptr cinfo, jpeg_component_info * compptr,
FLOAT_MULT_TYPE * quantptr;
FAST_FLOAT * wsptr;
JSAMPROW outptr;
JSAMPLE *range_limit = IDCT_range_limit(cinfo);
JSAMPLE *range_limit = cinfo->sample_range_limit;
int ctr;
FAST_FLOAT workspace[DCTSIZE2]; /* buffers data between passes */
SHIFT_TEMPS
#define _0_125 ((FLOAT_MULT_TYPE)0.125)
/* Pass 1: process columns from input, store into work array. */
@@ -101,7 +104,8 @@ jpeg_idct_float (j_decompress_ptr cinfo, jpeg_component_info * compptr,
inptr[DCTSIZE*5] == 0 && inptr[DCTSIZE*6] == 0 &&
inptr[DCTSIZE*7] == 0) {
/* AC terms all zero */
FAST_FLOAT dcval = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
FAST_FLOAT dcval = DEQUANTIZE(inptr[DCTSIZE*0],
quantptr[DCTSIZE*0] * _0_125);
wsptr[DCTSIZE*0] = dcval;
wsptr[DCTSIZE*1] = dcval;
@@ -120,10 +124,10 @@ jpeg_idct_float (j_decompress_ptr cinfo, jpeg_component_info * compptr,
/* Even part */
tmp0 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0]);
tmp1 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2]);
tmp2 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4]);
tmp3 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6]);
tmp0 = DEQUANTIZE(inptr[DCTSIZE*0], quantptr[DCTSIZE*0] * _0_125);
tmp1 = DEQUANTIZE(inptr[DCTSIZE*2], quantptr[DCTSIZE*2] * _0_125);
tmp2 = DEQUANTIZE(inptr[DCTSIZE*4], quantptr[DCTSIZE*4] * _0_125);
tmp3 = DEQUANTIZE(inptr[DCTSIZE*6], quantptr[DCTSIZE*6] * _0_125);
tmp10 = tmp0 + tmp2; /* phase 3 */
tmp11 = tmp0 - tmp2;
@@ -138,10 +142,10 @@ jpeg_idct_float (j_decompress_ptr cinfo, jpeg_component_info * compptr,
/* Odd part */
tmp4 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1]);
tmp5 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3]);
tmp6 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5]);
tmp7 = DEQUANTIZE(inptr[DCTSIZE*7], quantptr[DCTSIZE*7]);
tmp4 = DEQUANTIZE(inptr[DCTSIZE*1], quantptr[DCTSIZE*1] * _0_125);
tmp5 = DEQUANTIZE(inptr[DCTSIZE*3], quantptr[DCTSIZE*3] * _0_125);
tmp6 = DEQUANTIZE(inptr[DCTSIZE*5], quantptr[DCTSIZE*5] * _0_125);
tmp7 = DEQUANTIZE(inptr[DCTSIZE*7], quantptr[DCTSIZE*7] * _0_125);
z13 = tmp6 + tmp5; /* phase 6 */
z10 = tmp6 - tmp5;
@@ -152,12 +156,12 @@ jpeg_idct_float (j_decompress_ptr cinfo, jpeg_component_info * compptr,
tmp11 = (z11 - z13) * ((FAST_FLOAT) 1.414213562); /* 2*c4 */
z5 = (z10 + z12) * ((FAST_FLOAT) 1.847759065); /* 2*c2 */
tmp10 = ((FAST_FLOAT) 1.082392200) * z12 - z5; /* 2*(c2-c6) */
tmp12 = ((FAST_FLOAT) -2.613125930) * z10 + z5; /* -2*(c2+c6) */
tmp10 = z5 - z12 * ((FAST_FLOAT) 1.082392200); /* 2*(c2-c6) */
tmp12 = z5 - z10 * ((FAST_FLOAT) 2.613125930); /* 2*(c2+c6) */
tmp6 = tmp12 - tmp7; /* phase 2 */
tmp5 = tmp11 - tmp6;
tmp4 = tmp10 + tmp5;
tmp4 = tmp10 - tmp5;
wsptr[DCTSIZE*0] = tmp0 + tmp7;
wsptr[DCTSIZE*7] = tmp0 - tmp7;
@@ -165,8 +169,8 @@ jpeg_idct_float (j_decompress_ptr cinfo, jpeg_component_info * compptr,
wsptr[DCTSIZE*6] = tmp1 - tmp6;
wsptr[DCTSIZE*2] = tmp2 + tmp5;
wsptr[DCTSIZE*5] = tmp2 - tmp5;
wsptr[DCTSIZE*4] = tmp3 + tmp4;
wsptr[DCTSIZE*3] = tmp3 - tmp4;
wsptr[DCTSIZE*3] = tmp3 + tmp4;
wsptr[DCTSIZE*4] = tmp3 - tmp4;
inptr++; /* advance pointers to next column */
quantptr++;
@@ -174,7 +178,6 @@ jpeg_idct_float (j_decompress_ptr cinfo, jpeg_component_info * compptr,
}
/* Pass 2: process rows from work array, store into output array. */
/* Note that we must descale the results by a factor of 8 == 2**3. */
wsptr = workspace;
for (ctr = 0; ctr < DCTSIZE; ctr++) {
@@ -187,8 +190,10 @@ jpeg_idct_float (j_decompress_ptr cinfo, jpeg_component_info * compptr,
/* Even part */
tmp10 = wsptr[0] + wsptr[4];
tmp11 = wsptr[0] - wsptr[4];
/* Apply signed->unsigned and prepare float->int conversion */
z5 = wsptr[0] + ((FAST_FLOAT) CENTERJSAMPLE + (FAST_FLOAT) 0.5);
tmp10 = z5 + wsptr[4];
tmp11 = z5 - wsptr[4];
tmp13 = wsptr[2] + wsptr[6];
tmp12 = (wsptr[2] - wsptr[6]) * ((FAST_FLOAT) 1.414213562) - tmp13;
@@ -209,31 +214,23 @@ jpeg_idct_float (j_decompress_ptr cinfo, jpeg_component_info * compptr,
tmp11 = (z11 - z13) * ((FAST_FLOAT) 1.414213562);
z5 = (z10 + z12) * ((FAST_FLOAT) 1.847759065); /* 2*c2 */
tmp10 = ((FAST_FLOAT) 1.082392200) * z12 - z5; /* 2*(c2-c6) */
tmp12 = ((FAST_FLOAT) -2.613125930) * z10 + z5; /* -2*(c2+c6) */
tmp10 = z5 - z12 * ((FAST_FLOAT) 1.082392200); /* 2*(c2-c6) */
tmp12 = z5 - z10 * ((FAST_FLOAT) 2.613125930); /* 2*(c2+c6) */
tmp6 = tmp12 - tmp7;
tmp5 = tmp11 - tmp6;
tmp4 = tmp10 + tmp5;
tmp4 = tmp10 - tmp5;
/* Final output stage: scale down by a factor of 8 and range-limit */
/* Final output stage: float->int conversion and range-limit */
outptr[0] = range_limit[(int) DESCALE((INT32) (tmp0 + tmp7), 3)
& RANGE_MASK];
outptr[7] = range_limit[(int) DESCALE((INT32) (tmp0 - tmp7), 3)
& RANGE_MASK];
outptr[1] = range_limit[(int) DESCALE((INT32) (tmp1 + tmp6), 3)
& RANGE_MASK];
outptr[6] = range_limit[(int) DESCALE((INT32) (tmp1 - tmp6), 3)
& RANGE_MASK];
outptr[2] = range_limit[(int) DESCALE((INT32) (tmp2 + tmp5), 3)
& RANGE_MASK];
outptr[5] = range_limit[(int) DESCALE((INT32) (tmp2 - tmp5), 3)
& RANGE_MASK];
outptr[4] = range_limit[(int) DESCALE((INT32) (tmp3 + tmp4), 3)
& RANGE_MASK];
outptr[3] = range_limit[(int) DESCALE((INT32) (tmp3 - tmp4), 3)
& RANGE_MASK];
outptr[0] = range_limit[((int) (tmp0 + tmp7)) & RANGE_MASK];
outptr[7] = range_limit[((int) (tmp0 - tmp7)) & RANGE_MASK];
outptr[1] = range_limit[((int) (tmp1 + tmp6)) & RANGE_MASK];
outptr[6] = range_limit[((int) (tmp1 - tmp6)) & RANGE_MASK];
outptr[2] = range_limit[((int) (tmp2 + tmp5)) & RANGE_MASK];
outptr[5] = range_limit[((int) (tmp2 - tmp5)) & RANGE_MASK];
outptr[3] = range_limit[((int) (tmp3 + tmp4)) & RANGE_MASK];
outptr[4] = range_limit[((int) (tmp3 - tmp4)) & RANGE_MASK];
wsptr += DCTSIZE; /* advance pointer to next row */
}
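Two ideas carry most of this jidctflt.c change. The factor of 1/8 is folded into the dequantization multipliers, so no descale by 8 is needed after the second pass, and `CENTERJSAMPLE + 0.5` is added before the float-to-int conversion so that truncation rounds to nearest. The sketch below illustrates both for 8-bit samples. The names are hypothetical, the explicit clamp is a stand-in for the `range_limit[]` table lookup, and the bias is applied per sample here for clarity, whereas the patch adds it once to the first term of each row.

```c
#define CENTER_SAMPLE 128   /* CENTERJSAMPLE for 8-bit samples */
#define MAX_SAMPLE    255   /* MAXJSAMPLE for 8-bit samples */

/* Dequantize with the 1/8 IDCT scale folded into the multiplier, so no
 * descale by 8 is needed after the second pass. */
static float dequantize_prescaled(int coef, float quant)
{
  return (float) coef * (quant * 0.125f);
}

/* Convert a centered IDCT result to an unsigned sample: add the level
 * shift plus 0.5 for rounding, truncate, then clamp. The clamp is a
 * stand-in for the range_limit[] table lookup used in the library. */
static int to_sample(float centered)
{
  int s = (int) (centered + ((float) CENTER_SAMPLE + 0.5f));

  if (s < 0)
    return 0;
  if (s > MAX_SAMPLE)
    return MAX_SAMPLE;
  return s;
}
```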

4098
jpeg_nbits_table.h Normal file

File diff suppressed because it is too large
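The contents of this header are not shown, but the run-time loops removed from the two `.c` files above filled a 65536-entry table with the number of bits needed to represent each index. A generator for an equivalent precomputed table might look like the sketch below. The emitted array declaration and its row layout are assumptions made for illustration and are not taken from the suppressed file.

```c
#include <stdio.h>

/* Bits needed to represent v: 0 for 0, otherwise floor(log2(v)) + 1. */
static int nbits_of(unsigned int v)
{
  int nbits = 0;

  while (v) {
    v >>= 1;
    nbits++;
  }
  return nbits;
}

/* Write a C array definition holding nbits_of(i) for i in [0, count),
 * sixteen values per row. Array name and layout are illustrative. */
static void emit_nbits_table(FILE *out, unsigned int count)
{
  unsigned int i;

  fprintf(out, "static const unsigned char jpeg_nbits_table[%u] = {\n", count);
  for (i = 0; i < count; i++)
    fprintf(out, "%s%d%s", (i % 16 == 0) ? "  " : " ", nbits_of(i),
            (i + 1 == count) ? "\n" : ((i % 16 == 15) ? ",\n" : ","));
  fprintf(out, "};\n");
}
```

Calling `emit_nbits_table(stdout, 65536)` and redirecting the output to a header would produce a table that can be compiled in as read-only data, which is what allows it to be shared among processes as the commit message describes.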

View File

@@ -3,7 +3,7 @@ USING THE IJG JPEG LIBRARY
This file was part of the Independent JPEG Group's software:
Copyright (C) 1994-2011, Thomas G. Lane, Guido Vollbeding.
Modifications:
Copyright (C) 2010, D. R. Commander.
Copyright (C) 2010, 2014, D. R. Commander.
For conditions of distribution and use, see the accompanying README file.
@@ -886,14 +886,23 @@ J_DCT_METHOD dct_method
JDCT_FLOAT: floating-point method
JDCT_DEFAULT: default method (normally JDCT_ISLOW)
JDCT_FASTEST: fastest method (normally JDCT_IFAST)
The FLOAT method is very slightly more accurate than the ISLOW method,
but may give different results on different machines due to varying
roundoff behavior. The integer methods should give the same results
on all machines. On machines with sufficiently fast FP hardware, the
floating-point method may also be the fastest. The IFAST method is
considerably less accurate than the other two; its use is not
recommended if high quality is a concern. JDCT_DEFAULT and
JDCT_FASTEST are macros configurable by each installation.
In libjpeg-turbo, JDCT_IFAST is generally about 5-15% faster than
JDCT_ISLOW when using the x86/x86-64 SIMD extensions (results may vary
with other SIMD implementations, or when using libjpeg-turbo without
SIMD extensions.) For quality levels of 90 and below, there should be
little or no perceptible difference between the two algorithms. For
quality levels above 90, however, the difference between JDCT_IFAST and
JDCT_ISLOW becomes more pronounced. With quality=97, for instance,
JDCT_IFAST incurs generally about a 1-3 dB loss (in PSNR) relative to
JDCT_ISLOW, but this can be larger for some images. Do not use
JDCT_IFAST with quality levels above 97. The algorithm often
degenerates at quality=98 and above and can actually produce a more
lossy image than if lower quality levels had been used. JDCT_FLOAT is
mostly a legacy feature. It does not produce significantly more
accurate results than the ISLOW method, and it is much slower. The
FLOAT method may also give different results on different machines due
to varying roundoff behavior, whereas the integer methods should give
the same results on all machines.
J_COLOR_SPACE jpeg_color_space
int num_components
@@ -1170,8 +1179,32 @@ int actual_number_of_colors
Additional decompression parameters that the application may set include:
J_DCT_METHOD dct_method
Selects the algorithm used for the DCT step. Choices are the same
as described above for compression.
Selects the algorithm used for the DCT step. Choices are:
JDCT_ISLOW: slow but accurate integer algorithm
JDCT_IFAST: faster, less accurate integer method
JDCT_FLOAT: floating-point method
JDCT_DEFAULT: default method (normally JDCT_ISLOW)
JDCT_FASTEST: fastest method (normally JDCT_IFAST)
In libjpeg-turbo, JDCT_IFAST is generally about 5-15% faster than
JDCT_ISLOW when using the x86/x86-64 SIMD extensions (results may vary
with other SIMD implementations, or when using libjpeg-turbo without
SIMD extensions.) If the JPEG image was compressed using a quality
level of 85 or below, then there should be little or no perceptible
difference between the two algorithms. When decompressing images that
were compressed using quality levels above 85, however, the difference
between JDCT_IFAST and JDCT_ISLOW becomes more pronounced. With images
compressed using quality=97, for instance, JDCT_IFAST generally incurs
a loss of about 4-6 dB (in PSNR) relative to JDCT_ISLOW, but this can be
larger for some images. If you can avoid it, do not use JDCT_IFAST
when decompressing images that were compressed using quality levels
above 97. The algorithm often degenerates for such images and can
actually produce a more lossy output image than if the JPEG image had
been compressed using lower quality levels. JDCT_FLOAT is mostly a
legacy feature. It does not produce significantly more accurate
results than the ISLOW method, and it is much slower. The FLOAT method
may also give different results on different machines due to varying
roundoff behavior, whereas the integer methods should give the same
results on all machines.
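To make the "dB loss (in PSNR)" figures concrete, the sketch below compares two decodes of the same image against a reference. It is an illustration under stated assumptions: the reference is taken to be the original uncompressed samples, all buffers hold the same number of 8-bit samples, and the names sk_sum_sq_err and sk_loss_at_least_3db are hypothetical. Because both decodes share the same sample count and peak value, the PSNR difference in dB equals 10*log10(sse_fast / sse_slow), so a loss of roughly 3 dB corresponds to the squared error doubling, and 4-6 dB to a ratio of about 2.5 to 4.

```c
#include <stddef.h>
#include <stdint.h>

/* Sum of squared differences between two buffers of n 8-bit samples. */
static uint64_t sk_sum_sq_err(const uint8_t *ref, const uint8_t *test,
                              size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        int d = (int)ref[i] - (int)test[i];
        sum += (uint64_t)(d * d);
    }
    return sum;
}

/* Return nonzero if the fast decode is at least about 3 dB worse in PSNR
 * than the slow decode. A doubling of the squared error is a drop of
 * 10*log10(2), roughly 3.01 dB. Both sums must come from buffers of the
 * same length, measured against the same reference. */
static int sk_loss_at_least_3db(uint64_t sse_slow, uint64_t sse_fast)
{
    return sse_fast > 0 && sse_fast >= 2 * sse_slow;
}
```

Working with the error ratio avoids a logarithm; a program that needs the value in dB can apply 10*log10() from <math.h> to the same two sums.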
boolean do_fancy_upsampling
If TRUE, do careful upsampling of chroma components. If FALSE,


@@ -172,13 +172,28 @@ Switches for advanced users:
-dct int Use integer DCT method (default).
-dct fast Use fast integer DCT (less accurate).
-dct float Use floating-point DCT method.
The float method is very slightly more accurate than
the int method, but is much slower unless your machine
has very fast floating-point hardware. Also note that
results of the floating-point method may vary slightly
across machines, while the integer methods should give
the same results everywhere. The fast integer method
is much less accurate than the other two.
In libjpeg-turbo, the fast method is generally about
5-15% faster than the int method when using the
x86/x86-64 SIMD extensions (results may vary with other
SIMD implementations, or when using libjpeg-turbo
without SIMD extensions). For quality levels of 90 and
below, there should be little or no perceptible
difference between the two algorithms. For quality
levels above 90, however, the difference between
the fast and the int methods becomes more pronounced.
With quality=97, for instance, the fast method generally
incurs a loss of about 1-3 dB (in PSNR) relative to
the int method, but this can be larger for some images.
Do not use the fast method with quality levels above
97. The algorithm often degenerates at quality=98 and
above and can actually produce a more lossy image than
if lower quality levels had been used. The float
method is mostly a legacy feature. It does not produce
significantly more accurate results than the int
method, and it is much slower. The float method may
also give different results on different machines due
to varying roundoff behavior, whereas the integer
methods should give the same results on all machines.
-restart N Emit a JPEG restart marker every N MCU rows, or every
N MCU blocks if "B" is attached to the number.
@@ -296,13 +311,32 @@ Switches for advanced users:
-dct int Use integer DCT method (default).
-dct fast Use fast integer DCT (less accurate).
-dct float Use floating-point DCT method.
The float method is very slightly more accurate than
the int method, but is much slower unless your machine
has very fast floating-point hardware. Also note that
results of the floating-point method may vary slightly
across machines, while the integer methods should give
the same results everywhere. The fast integer method
is much less accurate than the other two.
In libjpeg-turbo, the fast method is generally about
5-15% faster than the int method when using the
x86/x86-64 SIMD extensions (results may vary with other
SIMD implementations, or when using libjpeg-turbo
without SIMD extensions). If the JPEG image was
compressed using a quality level of 85 or below, then
there should be little or no perceptible difference
between the two algorithms. When decompressing images
that were compressed using quality levels above 85,
however, the difference between the fast and int
methods becomes more pronounced. With images
compressed using quality=97, for instance, the fast
method generally incurs a loss of about 4-6 dB (in PSNR)
relative to the int method, but this can be larger for
some images. If you can avoid it, do not use the fast
method when decompressing images that were compressed
using quality levels above 97. The algorithm often
degenerates for such images and can actually produce
a more lossy output image than if the JPEG image had
been compressed using lower quality levels. The float
method is mostly a legacy feature. It does not produce
significantly more accurate results than the int
method, and it is much slower. The float method may
also give different results on different machines due
to varying roundoff behavior, whereas the integer
methods should give the same results on all machines.
-dither fs Use Floyd-Steinberg dithering in color quantization.
-dither ordered Use ordered dithering in color quantization.
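The three values of the -dct switch documented above (int, fast, float) can be mapped to an internal choice as sketched below. This is a hedged illustration only: the enum sk_dct_arg and the function sk_parse_dct_arg are hypothetical, and the sketch accepts exact words only. How the actual cjpeg and djpeg programs parse the switch is not shown in this text and may differ, for example in accepting abbreviations.

```c
#include <string.h>

/* Hypothetical result type for the sketch; not a libjpeg type. */
typedef enum {
    SK_ARG_INT,     /* "-dct int":   integer DCT method (the default) */
    SK_ARG_FAST,    /* "-dct fast":  fast integer DCT, less accurate  */
    SK_ARG_FLOAT,   /* "-dct float": floating-point DCT method        */
    SK_ARG_UNKNOWN  /* anything else                                  */
} sk_dct_arg;

/* Map the word following "-dct" to a choice. Only exact, lower-case
 * matches are recognized in this sketch. */
static sk_dct_arg sk_parse_dct_arg(const char *arg)
{
    if (arg == NULL)
        return SK_ARG_UNKNOWN;
    if (strcmp(arg, "int") == 0)
        return SK_ARG_INT;
    if (strcmp(arg, "fast") == 0)
        return SK_ARG_FAST;
    if (strcmp(arg, "float") == 0)
        return SK_ARG_FLOAT;
    return SK_ARG_UNKNOWN;
}
```

Returning a distinct "unknown" value lets the caller report a usage error instead of silently falling back to the default method.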
@@ -381,12 +415,6 @@ When producing a color-quantized image, "-onepass -dither ordered" is fast but
much lower quality than the default behavior. "-dither none" may give
acceptable results in two-pass mode, but is seldom tolerable in one-pass mode.
If you are fortunate enough to have very fast floating point hardware,
"-dct float" may be even faster than "-dct fast". But on most machines
"-dct float" is slower than "-dct int"; in this case it is not worth using,
because its theoretical accuracy advantage is too small to be significant
in practice.
Two-pass color quantization requires a good deal of memory; on MS-DOS machines
it may run out of memory even with -maxmemory 0. In that case you can still
decompress, with some loss of image quality, by specifying -onepass for