The JPEG-1 spec never uses the term "MCU block". That term is rarely
used in other literature to describe the equivalent of an MCU in an
interleaved JPEG image, but the libjpeg documentation uses "iMCU" to
describe the same thing. "iMCU" is a better term, since the equivalent
of an interleaved MCU can contain multiple DCT blocks (or samples in
lossless mode) that are only grouped together if the image is
interleaved.
In the case of restart markers, "MCU block" was used in the libjpeg
documentation instead of "MCU", but "MCU" is more accurate and less
confusing. (The restart interval is literally in MCUs, where one MCU
is one data unit in a non-interleaved JPEG image and multiple data units
in a multi-component interleaved JPEG image.)
In the case of 9b704f96b2, the issue was
actually with progressive JPEG images exactly two DCT blocks wide, not
two MCU blocks wide.
This commit also defines "MCU" and "MCU row" in the description of the
various restart marker options/parameters. Although an MCU row is
technically always a row of samples in lossless mode, "sample row" was
confusing, since it is used in other places to describe a row of samples
for a single component (whereas an MCU row in a typical lossless JPEG
image consists of a row of interleaved samples for all components.)
(Regression introduced by 7bb958b732)
Because of 7bb958b732, the TurboJPEG
compression and encoding functions no longer transfer the value of
TJPARAM_OPTIMIZE into cinfo->data_precision unless the data precision
is 8. The intent of that was to prevent using_std_huff_tables() from
being called more than once when reusing the same compressor object to
generate multiple 12-bit-per-sample JPEG images. However, because
cinfo->optimize_coding is always set to TRUE by jpeg_set_defaults() if
the data precision is 12, calling applications that use 12-bit data
precision had to unset cinfo->optimize_coding if they set
cinfo->arith_code after calling jpeg_set_defaults(). Because of
7bb958b732, the TurboJPEG API stopped
doing that except with 8-bit data precision. Thus, attempting to
generate a 12-bit-per-sample arithmetic-coded lossy JPEG image using
the TurboJPEG API failed with "Requested features are incompatible."
Since the compressor will always fail if cinfo->arith_code and
cinfo->optimize_coding are both set, and since cinfo->optimize_coding
has no relevance for arithmetic coding, the most robust and user-proof
solution is for jinit_c_master_control() to set cinfo->optimize_coding
to FALSE if cinfo->arith_code is TRUE.
This commit also:
- modifies TJBench so that it no longer reports that it is using
optimized baseline entropy coding in modes where that setting
is irrelevant,
- amends the cjpeg documentation to clarify that -optimize is implied
when specifying -progressive or '-precision 12' without -arithmetic,
and
- prevents jpeg_set_defaults() from uselessly checking the value of
cinfo->arith_code immediately after it has been set to FALSE.
This gives the user a hint as to whether converting the input file into
a JPEG file with a particular data precision will result in a loss of
precision. Also modify usage.txt and the cjpeg man page to warn the
user about that possibility.
- "bits per component" = "bits per sample"
Describing the data precision of a JPEG image using "bits per
component" is technically correct, but "bits per sample" is the
terminology that the JPEG-1 spec uses. Also, "bits per component" is
more commonly used to describe the precision of packed-pixel formats
(as opposed to "bits per pixel") rather than planar formats, in which
all components are grouped together.
- Unmention legacy display technologies. Colormapped and monochrome
displays aren't a thing anymore, and even when they were still a
thing, it was possible to display full-color images to them. In 1991,
when JPEG decompression time was measured in minutes per megapixel, it
made sense to keep a decompressed copy of JPEG images on disk, in a
format that could be displayed without further color conversion (since
color conversion was slow and memory-intensive.) In 2024, JPEG
decompression time is measured in milliseconds per megapixel, and
color conversion is even faster. Thus, JPEG images can be
decompressed, displayed, and color-converted (if necessary) "on the
fly" at speeds too fast for human vision to perceive. (In fact, your
TV performs much more complicated decompression algorithms at least 60
times per second.)
- Document that color quantization (and associated features), GIF
input/output, Targa input/output, and OS/2 BMP input/output are legacy
features. Legacy status doesn't necessarily mean that the features
are deprecated. Rather, it is meant to discourage users from using
features that may be of little or no benefit on modern machines (such
as low-quality modes that had significant performance advantages in
the early 1990s but no longer do) and that are maintained on a
break/fix basis only.
- General wordsmithing, grammar/punctuation policing, and formatting
tweaks
- Clarify which data precisions each cjpeg input format and each djpeg
output format supports.
- cjpeg.1: Remove unnecessary and impolitic statement about the -targa
switch.
- Adjust or remove performance claims to reflect the fact that:
* On modern machines, the djpeg "-fast" switch has a negligible effect
on performance.
* There is a measurable difference between the performance of Floyd-
Steinberg dithering and no dithering, but it is not likely
perceptible to most users.
* There is a measurable difference between the performance of 1-pass
and 2-pass color quantization, but it is not likely perceptible to
most users.
* There is a measurable difference between the performance of
full-color and grayscale output when decompressing a full-color JPEG
image, but it is not likely perceptible to most users.
* IDCT scaling does not necessarily improve performance. (It
generally does if the scaling factor is <= 1/2 and generally doesn't
if the scaling factor is > 1/2, at least on my machine. The
performance claim made in jpeg-6b was probably invalidated when we
merged the additional scaling factors from jpeg-7.)
- Clarify which djpeg switches/output formats cannot be used when
decompressing lossless JPEG images.
- Remove djpeg hints, since those involve quality vs. speed tradeoffs
that are no longer relevant for modern machines.
- Remove documentation regarding using color quantization with 16-bit
data precision. (Color quantization requires lossy mode.)
- Java: Fix typos in TJDecompressor.decompress12() and
TJDecompressor.decompress16() documentation.
- jpegtran.1: Fix truncated paragraph
In a man page, a single quote at the start of a line is interpreted as
a macro.
Closes#775
- libjpeg.txt:
* Mention J16SAMPLE data type (oversight.)
* Remove statement about extending jdcolor.c. (libjpeg-turbo is not
quite as DIY as libjpeg once was.)
* Remove paragraph about tweaking the various typedefs in jmorecfg.h.
It is no longer relevant for modern machines.
* Remove caveat regarding systems with ints less than 16 bits wide.
(ANSI/ISO C requires an int to be at least 16 bits wide, and
libjpeg-turbo has never supported non-ANSI compilers.)
- usage.txt:
* Add copyright header.
* Document cjpeg -icc, -memdst, -report, -strict, and -version
switches.
* Document djpeg -icc, -maxscans, -memsrc, -report, -skip, -crop,
-strict, and -version switches.
* Document jpegtran -icc, -maxscans, -report, -strict, and -version
switches.
- Clarify that lossless JPEG is slower than and doesn't compress as well
as lossy JPEG. (That should be obvious, because "lossy" literally
means that data is thrown away.)
- Re-generate TurboJPEG C API documentation using Doxygen 1.9.8.
- Clarify that setting the data_precision field in jpeg_compress_struct
to 16 requires lossless mode.
- Explain what the predictor selection value actually does. (Refer to
Recommendation ITU-T T.81 (1992) | ISO/IEC 10918-1:1994, Section
H.1.2.1.)
Color quantization is a legacy feature that serves little or no purpose
with lossless JPEG images. 9f756bc67a
eliminated interaction issues between the lossless decompressor and the
color quantizers related to out-of-range 12-bit samples, but referring
to #701, other interaction issues apparently still exist. Such issues
are likely, given the fact that the color quantizers were not designed
with lossless decompression in mind.
This commit reverts 9f756bc67a, since the
issues it fixed are no longer relevant because of this commit and
2192560d74.
Fixed#672Fixes#673Fixes#674Fixes#676Fixes#677Fixes#678Fixes#679Fixes#681Fixes#683Fixes#701
Lossless: Accommodate LJT colorspace/SIMD exts
In libjpeg-turbo, grayscale_convert() and null_convert() aren't the only
lossless color conversion algorithms. We can also losslessly convert
RGB to and from any of the extended RGB colorspaces, and some platforms
have SIMD-accelerated null color conversion.
This commit also disallows RGB565 output in lossless mode, and it moves
the IsExtRGB() macro from cdjpeg.h to jpegint.h and repurposes it to
make jinit_color_converter() and jinit_color_deconverter() more
readable.
- Rename jpeg_simple_lossless() to jpeg_enable_lossless() and modify the
function so that it stores the lossless parameters directly in the Ss
and Al fields of jpeg_compress_struct rather than using a scan script.
- Move the cjpeg -lossless switch into "Switches for advanced users".
- Document the libjpeg API and run-time features that are unavailable in
lossless mode, and ensure that all parameters, functions, and switches
related to unavailable features are ignored or generate errors in
lossless mode.
- Defer any action that depends on whether lossless mode is enabled
until jpeg_start_compress()/jpeg_start_decompress() is called.
- Document the purpose of the point transform value.
- "Codec" stands for coder/decoder, so it is a bit awkward to say
"lossless compression codec" and "lossless decompression codec".
Use "lossless compressor" and "lossless decompressor" instead.
- Restore backward API/ABI compatibility with libjpeg v6b:
* Move the new 'lossless' field from the exposed jpeg_compress_struct
and jpeg_decompress_struct structures into the opaque
jpeg_comp_master and jpeg_decomp_master structures, and allocate the
master structures in the body of jpeg_create_compress() and
jpeg_create_decompress().
* Remove the new 'process' field from jpeg_compress_struct and
jpeg_decompress_struct and replace it with the old
'progressive_mode' field and the new 'lossless' field.
* Remove the new 'data_unit' field from jpeg_compress_struct and
jpeg_decompress_struct and replace it with a locally-computed
data unit variable.
* Restore the names of macros and fields that refer to DCT blocks, and
document that they have a different meaning in lossless mode. (Most
of them aren't very meaningful in lossless mode anyhow.)
* Remove the new alloc_darray() method from jpeg_memory_mgr and
replace it with an internal macro that wraps the alloc_sarray()
method.
* Move the JDIFF* data types from jpeglib.h and jmorecfg.h into
jpegint.h.
* Remove the new 'codec' field from jpeg_compress_struct and
jpeg_decompress_struct and instead reuse the existing internal
coefficient control, forward/inverse DCT, and entropy
encoding/decoding structures for lossless compression/decompression.
* Repurpose existing error codes rather than introducing new ones.
(The new JERR_BAD_RESTART and JWRN_MUST_DOWNSCALE codes remain,
although JWRN_MUST_DOWNSCALE will probably be removed in
libjpeg-turbo, since we have a different way of handling multiple
data precisions.)
- Automatically enable lossless mode when a scan script with parameters
that are only valid for lossless mode is detected, and document the
use of scan scripts to generate lossless JPEG images.
- Move the sequential and shared Huffman routines back into jchuff.c and
jdhuff.c, and document that those routines are shared with jclhuff.c
and jdlhuff.c as well as with jcphuff.c and jdphuff.c.
- Move MAX_DIFF_BITS from jchuff.h into jclhuff.c, the only place where
it is used.
- Move the predictor and scaler code into jclossls.c and jdlossls.c.
- Streamline register usage in the [un]differencers (inspired by similar
optimizations in the color [de]converters.)
- Restructure the logic in a few places to reduce duplicated code.
- Ensure that all lossless-specific code is guarded by
C_LOSSLESS_SUPPORTED or D_LOSSLESS_SUPPORTED and that the library can
be built successfully if either or both of those macros is undefined.
- Remove all short forms of external names introduced by the lossless
JPEG patch. (These will not be needed by libjpeg-turbo, so there is
no use cleaning them up.)
- Various wordsmithing, formatting, and punctuation tweaks
- Eliminate various compiler warnings.
- Rename jpeg_simple_lossless() to jpeg_enable_lossless() and modify the
function so that it stores the lossless parameters directly in the Ss
and Al fields of jpeg_compress_struct rather than using a scan script.
- Move the cjpeg -lossless switch into "Switches for advanced users".
- Document the libjpeg API and run-time features that are unavailable in
lossless mode, and ensure that all parameters, functions, and switches
related to unavailable features are ignored or generate errors in
lossless mode.
- Defer any action that depends on whether lossless mode is enabled
until jpeg_start_compress()/jpeg_start_decompress() is called.
- Document the purpose of the point transform value.
- "Codec" stands for coder/decoder, so it is a bit awkward to say
"lossless compression codec" and "lossless decompression codec".
Use "lossless compressor" and "lossless decompressor" instead.
- Restore backward API/ABI compatibility with libjpeg v6b:
* Move the new 'lossless' field from the exposed jpeg_compress_struct
and jpeg_decompress_struct structures into the opaque
jpeg_comp_master and jpeg_decomp_master structures, and allocate the
master structures in the body of jpeg_create_compress() and
jpeg_create_decompress().
* Remove the new 'process' field from jpeg_compress_struct and
jpeg_decompress_struct and replace it with the old
'progressive_mode' field and the new 'lossless' field.
* Remove the new 'data_unit' field from jpeg_compress_struct and
jpeg_decompress_struct and replace it with a locally-computed
data unit variable.
* Restore the names of macros and fields that refer to DCT blocks, and
document that they have a different meaning in lossless mode. (Most
of them aren't very meaningful in lossless mode anyhow.)
* Remove the new alloc_darray() method from jpeg_memory_mgr and
replace it with an internal macro that wraps the alloc_sarray()
method.
* Move the JDIFF* data types from jpeglib.h and jmorecfg.h into
jpegint.h.
* Remove the new 'codec' field from jpeg_compress_struct and
jpeg_decompress_struct and instead reuse the existing internal
coefficient control, forward/inverse DCT, and entropy
encoding/decoding structures for lossless compression/decompression.
* Repurpose existing error codes rather than introducing new ones.
(The new JERR_BAD_RESTART and JWRN_MUST_DOWNSCALE codes remain,
although JWRN_MUST_DOWNSCALE will probably be removed in
libjpeg-turbo, since we have a different way of handling multiple
data precisions.)
- Automatically enable lossless mode when a scan script with parameters
that are only valid for lossless mode is detected, and document the
use of scan scripts to generate lossless JPEG images.
- Move the sequential and shared Huffman routines back into jchuff.c and
jdhuff.c, and document that those routines are shared with jclhuff.c
and jdlhuff.c as well as with jcphuff.c and jdphuff.c.
- Move MAX_DIFF_BITS from jchuff.h into jclhuff.c, the only place where
it is used.
- Move the predictor and scaler code into jclossls.c and jdlossls.c.
- Streamline register usage in the [un]differencers (inspired by similar
optimizations in the color [de]converters.)
- Restructure the logic in a few places to reduce duplicated code.
- Ensure that all lossless-specific code is guarded by
C_LOSSLESS_SUPPORTED or D_LOSSLESS_SUPPORTED and that the library can
be built successfully if either or both of those macros is undefined.
- Remove all short forms of external names introduced by the lossless
JPEG patch. (These will not be needed by libjpeg-turbo, so there is
no use cleaning them up.)
- Various wordsmithing, formatting, and punctuation tweaks
- Eliminate various compiler warnings.
The Gordian knot that 7fec5074f9 attempted
to unravel was caused by the fact that there are several
data-precision-dependent (JSAMPLE-dependent) fields and methods in the
exposed libjpeg API structures, and if you change the exposed libjpeg
API structures, then you have to change the whole API. If you change
the whole API, then you have to provide a whole new library to support
the new API, and that makes it difficult to support multiple data
precisions in the same application. (It is not impossible, as example.c
demonstrated, but using data-precision-dependent libjpeg API structures
would have made the cjpeg, djpeg, and jpegtran source code hard to read,
so it made more sense to build, install, and package 12-bit-specific
versions of those applications.)
Unfortunately, the result of that initial integration effort was an
unreadable and unmaintainable mess, which is a problem for a library
that is an ISO/ITU-T reference implementation. Also, as I dug into the
problem of lossless JPEG support, I realized that 16-bit lossless JPEG
images are a thing, and supporting yet another version of the libjpeg
API just for those images is untenable.
In fact, however, the touch points for JSAMPLE in the exposed libjpeg
API structures are minimal:
- The colormap and sample_range_limit fields in jpeg_decompress_struct
- The alloc_sarray() and access_virt_sarray() methods in
jpeg_memory_mgr
- jpeg_write_scanlines() and jpeg_write_raw_data()
- jpeg_read_scanlines() and jpeg_read_raw_data()
- jpeg_skip_scanlines() and jpeg_crop_scanline()
(This is subtle, but both of those functions use JSAMPLE-dependent
opaque structures behind the scenes.)
It is much more readable and maintainable to provide 12-bit-specific
versions of those six top-level API functions and to document that the
aforementioned methods and fields must be type-cast when using 12-bit
samples. Since that eliminates the need to provide a 12-bit-specific
version of the exposed libjpeg API structures, we can:
- Compile only the precision-dependent libjpeg modules (the
coefficient buffer controllers, the colorspace converters, the
DCT/IDCT managers, the main buffer controllers, the preprocessing
and postprocessing controller, the downsampler and upsamplers, the
quantizers, the integer DCT methods, and the IDCT methods) for
multiple data precisions.
- Introduce 12-bit-specific methods into the various internal
structures defined in jpegint.h.
- Create precision-independent data type, macro, method, field, and
function names that are prefixed by an underscore, and use an
internal header to convert those into precision-dependent data
type, macro, method, field, and function names, based on the value
of BITS_IN_JSAMPLE, when compiling the precision-dependent libjpeg
modules.
- Expose precision-dependent jinit*() functions for each of the
precision-dependent libjpeg modules.
- Abstract the precision-dependent libjpeg modules by calling the
appropriate precision-dependent jinit*() function, based on the
value of cinfo->data_precision, from top-level libjpeg API
functions.
- Refer to the "slow" [I]DCT algorithms as "accurate" instead, since
they are not slow under libjpeg-turbo.
- Adjust documentation claims to reflect the fact that the "slow" and
"fast" algorithms produce about the same performance on AVX2-equipped
CPUs (because of the dual-lane nature of AVX2, it was not possible to
accelerate the "fast" algorithm beyond what was achievable with SSE2.)
Also adjust the claims to reflect the fact that the "fast" algorithm
tends to be ~5-15% faster than the "slow" algorithm on
non-AVX2-equipped CPUs, regardless of the use of the libjpeg-turbo
SIMD extensions.
- Indicate the legacy status of the "fast" and float algorithms in the
documentation and cjpeg/djpeg usage info.
- Remove obsolete paragraph in the djpeg man page that suggested that
the float algorithm could be faster than the "fast" algorithm on some
CPUs.
- Restore GIF read/compressed GIF write support from jpeg-6a and
jpeg-9d.
- Integrate jpegtran -wipe and -drop options from jpeg-9a and jpeg-9d.
- Integrate jpegtran -crop extension (for expanding the image size) from
jpeg-9a and jpeg-9d.
- Integrate other minor code tweaks from jpeg-9*
libjpeg-turbo never included that code, because it requires an external
library (the Utah Raster Toolkit.) The RLE image format was supplanted
by GIF in the late 1980s, so it is rarely seen these days. (It had a
lousy Weissman score, anyhow.)
- Enable progress reporting at run time using a new -report argument
(cjpeg now supports that argument as well)
- Limit the allowable number of scans using a new -maxscans argument
- Treat warnings as fatal using a new -strict argument
This mainly demonstrates how to work around the two issues with the
JPEG standard described here:
https://libjpeg-turbo.org/pmwiki/uploads/About/TwoIssueswiththeJPEGStandard.pdf
since those and similar issues continue to be erroneously reported as
libjpeg-turbo bugs.
With rare exceptions ...
- Always separate line continuation characters by one space from
preceding code.
- Always use two-space indentation. Never use tabs.
- Always use K&R-style conditional blocks.
- Always surround operators with spaces, except in raw assembly code.
- Always put a space after, but not before, a comma.
- Never put a space between type casts and variables/function calls.
- Never put a space between the function name and the argument list in
function declarations and prototypes.
- Always surround braces ('{' and '}') with spaces.
- Always surround statements (if, for, else, catch, while, do, switch)
with spaces.
- Always attach pointer symbols ('*' and '**') to the variable or
function name.
- Always precede pointer symbols ('*' and '**') by a space in type
casts.
- Use the MIN() macro from jpegint.h within the libjpeg and TurboJPEG
API libraries (using min() from tjutil.h is still necessary for
TJBench.)
- Where it makes sense (particularly in the TurboJPEG code), put a blank
line after variable declaration blocks.
- Always separate statements in one-liners by two spaces.
The purpose of this was to ease maintenance on my part and also to make
it easier for contributors to figure out how to format patch
submissions. This was admittedly confusing (even to me sometimes) when
we had 3 or 4 different style conventions in the same source tree. The
new convention is more consistent with the formatting of other OSS code
bases.
This commit corrects deviations from the chosen formatting style in the
libjpeg API code and reformats the TurboJPEG API code such that it
conforms to the same standard.
NOTES:
- Although it is no longer necessary for the function name in function
declarations to begin in Column 1 (this was historically necessary
because of the ansi2knr utility, which allowed libjpeg to be built
with non-ANSI compilers), we retain that formatting for the libjpeg
code because it improves readability when using libjpeg's function
attribute macros (GLOBAL(), etc.)
- This reformatting project was accomplished with the help of AStyle and
Uncrustify, although neither was completely up to the task, and thus
a great deal of manual tweaking was required. Note to developers of
code formatting utilities: the libjpeg-turbo code base is an
excellent test bed, because AFAICT, it breaks every single one of the
utilities that are currently available.
- The legacy (MMX, SSE, 3DNow!) assembly code for i386 has been
formatted to match the SSE2 code (refer to
ff5685d5344273df321eb63a005eaae19d2496e3.) I hadn't intended to
bother with this, but the Loongson MMI implementation demonstrated
that there is still academic value to the MMX implementation, as an
algorithmic model for other 64-bit vector implementations. Thus, it
is desirable to improve its readability in the same manner as that of
the SSE2 implementation.
This re-introduces a feature of the obsolete system-specific libjpeg
memory managers-- namely the ability to limit the amount of main memory
used by the library during decompression or multi-pass compression.
This is mainly beneficial for two reasons:
- Works around a 2 GB limit in libFuzzer
- Allows security-sensitive applications to set a memory limit for the
JPEG decoder so as to work around the progressive JPEG exploit
(LJT-01-004) described here:
http://www.libjpeg-turbo.org/pmwiki/uploads/About/TwoIssueswiththeJPEGStandard.pdf
This commit also removes obsolete documentation regarding the MS-DOS
memory manager (which itself was removed long ago) and changes the
documentation of the -maxmemory switch and JPEGMEM environment variable
to reflect the fact that backing stores are never used in libjpeg-turbo.
Inspired by:
066fee2e7dCloses#143
This commit does the following:
-- Merges the two glueware functions (read_icc_profile() and
write_icc_profile()) from iccjpeg.c, which is contained in downstream
projects such as LCMS, Ghostscript, Mozilla, etc. These functions were
originally intended for inclusion in libjpeg, but Tom Lane left the IJG
before that could be accomplished. Since then, programs and libraries
that needed to embed/extract ICC profiles in JPEG files had to include
their own local copy of iccjpeg.c, which is suboptimal.
-- The new functions were prefixed with jpeg_ and split into separate
files for the compressor and decompressor, per the existing libjpeg
coding standards.
-- jpeg_write_icc_profile() was made slightly more fault-tolerant.
It will now trigger a libjpeg error if it is called before
jpeg_start_compress() or if it is passed NULL arguments.
-- jpeg_read_icc_profile() was made slightly more fault-tolerant.
It will now trigger a libjpeg error if it is called before
jpeg_read_header() or if it is passed NULL arguments. It will also
now trigger libjpeg warnings if the ICC profile data is corrupt.
-- The code comments have been wordsmithed.
-- Note that the one-line setup_read_icc_profile() function was not
included. Instead, libjpeg.txt now documents the need to call
jpeg_save_markers(cinfo, JPEG_APP0 + 2, 0xFFFF) prior to calling
jpeg_read_header(), if jpeg_read_icc_profile() is to be used.
-- Adds documentation for the new functions to libjpeg.txt.
-- Adds an -icc switch to cjpeg and jpegtran that allows those programs
to embed an ICC profile in the JPEG files they generate.
-- Adds an -icc switch to djpeg that allows that program to extract an
ICC profile from a JPEG file while decompressing.
-- Adds appropriate unit tests for all of the above.
-- Bumps the SO_AGE of the libjpeg API library to indicate the presence
of new API functions.
Note that the licensing information was obtained from:
https://github.com/mm2/Little-CMS/issues/37#issuecomment-66450180
Quality values > 95 are not useless. They just may not provide as good
of a size vs. perceptual quality tradeoff as lower quality values. This
also displays the default quality value in the cjpeg usage.
Closes#39