Since libjpeg-turbo does not support Exif, the only way it can embed
density information in a JPEG image is by using the JFIF marker, which
is only written if the JPEG colorspace is YCbCr or grayscale.
(Referring to the conversation under #793, we may need to further
restrict that to 8-bit-per-sample JPEG images, because the JFIF spec
requires 8-bit data precision.)
With respect to tj3Transform(), this addresses an oversight from
bb1d540a80.
Note to self: A convenience function/method for computing the worst-case
transformed JPEG size for a particular transform would be nice.
- When transforming, the worst-case JPEG buffer size depends on the
subsampling level used in the destination image, since a grayscale
transform might have been applied.
- Parentheses Police
Lossless cropping is performed after other lossless transform
operations, so the cropping region must be specified relative to the
destination image dimensions and level of chrominance subsampling, not
the source image dimensions and level of chrominance subsampling.
More specifically, if the lossless transform operation swaps the X and Y
axes, or if the image is converted to grayscale, then that changes the
cropping region requirements.
The JPEG-1 spec never uses the term "MCU block". That term is rarely
used in other literature to describe the equivalent of an MCU in an
interleaved JPEG image, but the libjpeg documentation uses "iMCU" to
describe the same thing. "iMCU" is a better term, since the equivalent
of an interleaved MCU can contain multiple DCT blocks (or samples in
lossless mode) that are only grouped together if the image is
interleaved.
In the case of restart markers, "MCU block" was used in the libjpeg
documentation instead of "MCU", but "MCU" is more accurate and less
confusing. (The restart interval is literally in MCUs, where one MCU
is one data unit in a non-interleaved JPEG image and multiple data units
in a multi-component interleaved JPEG image.)
In the case of 9b704f96b2, the issue was
actually with progressive JPEG images exactly two DCT blocks wide, not
two MCU blocks wide.
This commit also defines "MCU" and "MCU row" in the description of the
various restart marker options/parameters. Although an MCU row is
technically always a row of samples in lossless mode, "sample row" was
confusing, since it is used in other places to describe a row of samples
for a single component (whereas an MCU row in a typical lossless JPEG
image consists of a row of interleaved samples for all components.)
TJ*PARAM_RESTARTBLOCKS technically works with lossless compression, but
it is not useful, since the value must be equal to the number of samples
in a row. (In other words, it is no different than
TJ*PARAM_RESTARTINROWS, except that it requires the user to do more
math.)
Due to an oversight in the multi-precision feature,
TJCompressor.srcBuf12 and TJCompressor.srcBuf16 were not set to null
in TJCompressor.setSourceImage(YUVImage) or
TJCompressor.setSourceImage(BufferedImage, ...). Thus, if an
application set a 12-bit or 16-bit packed-pixel buffer as the source
image then set a BufferedImage with integer pixels as the source image,
TJCompress.compress() would compress from the 12-bit or 16-bit
packed-pixel buffer instead of the BufferedImage. The odds of an
application actually doing that are very slim, however.
- "Optimized baseline entropy coding" = "Huffman table optimization"
"Optimized baseline entropy coding" was meant to emphasize that the
feature is only useful when generating baseline (single-scan lossy
8-bit-per-sample Huffman-coded) JPEG images, because it is
automatically enabled when generating Huffman-coded progressive
(multi-scan), 12-bit-per-sample, and lossless JPEG images. However,
Huffman table optimization isn't actually an integral part of those
non-baseline modes. You can forego Huffman table optimization with
12-bit data precision if you supply your own Huffman tables. The spec
doesn't require it with progressive or lossless mode, either, although
our implementation does. Furthermore, "baseline" describes more than
just the type of entropy coding used. It was incorrect to say that
optimized "baseline" entropy coding is automatically enabled for
Huffman-coded progressive, 12-bit-per-sample, and lossless JPEG
images, since those are clearly not baseline images.
- "Progressive entropy coding" = "Progressive JPEG"
"Progressive" describes more than just the type of entropy coding
used. (In fact, both Huffman-coded and arithmetic-coded images can be
progressive.)
- Mention that TJPARAM_OPTIMIZE/TJ.PARAM_OPTIMIZE can be used with
lossless transformation as well.
- General wordsmithing
- Formatting tweaks
(Regression introduced by 7bb958b732)
Because of 7bb958b732, the TurboJPEG
compression and encoding functions no longer transfer the value of
TJPARAM_OPTIMIZE into cinfo->data_precision unless the data precision
is 8. The intent of that was to prevent using_std_huff_tables() from
being called more than once when reusing the same compressor object to
generate multiple 12-bit-per-sample JPEG images. However, because
cinfo->optimize_coding is always set to TRUE by jpeg_set_defaults() if
the data precision is 12, calling applications that use 12-bit data
precision had to unset cinfo->optimize_coding if they set
cinfo->arith_code after calling jpeg_set_defaults(). Because of
7bb958b732, the TurboJPEG API stopped
doing that except with 8-bit data precision. Thus, attempting to
generate a 12-bit-per-sample arithmetic-coded lossy JPEG image using
the TurboJPEG API failed with "Requested features are incompatible."
Since the compressor will always fail if cinfo->arith_code and
cinfo->optimize_coding are both set, and since cinfo->optimize_coding
has no relevance for arithmetic coding, the most robust and user-proof
solution is for jinit_c_master_control() to set cinfo->optimize_coding
to FALSE if cinfo->arith_code is TRUE.
This commit also:
- modifies TJBench so that it no longer reports that it is using
optimized baseline entropy coding in modes where that setting
is irrelevant,
- amends the cjpeg documentation to clarify that -optimize is implied
when specifying -progressive or '-precision 12' without -arithmetic,
and
- prevents jpeg_set_defaults() from uselessly checking the value of
cinfo->arith_code immediately after it has been set to FALSE.
- "bits per component" = "bits per sample"
Describing the data precision of a JPEG image using "bits per
component" is technically correct, but "bits per sample" is the
terminology that the JPEG-1 spec uses. Also, "bits per component" is
more commonly used to describe the precision of packed-pixel formats
(as opposed to "bits per pixel") rather than planar formats, in which
all components are grouped together.
- Unmention legacy display technologies. Colormapped and monochrome
displays aren't a thing anymore, and even when they were still a
thing, it was possible to display full-color images to them. In 1991,
when JPEG decompression time was measured in minutes per megapixel, it
made sense to keep a decompressed copy of JPEG images on disk, in a
format that could be displayed without further color conversion (since
color conversion was slow and memory-intensive.) In 2024, JPEG
decompression time is measured in milliseconds per megapixel, and
color conversion is even faster. Thus, JPEG images can be
decompressed, displayed, and color-converted (if necessary) "on the
fly" at speeds too fast for human vision to perceive. (In fact, your
TV performs much more complicated decompression algorithms at least 60
times per second.)
- Document that color quantization (and associated features), GIF
input/output, Targa input/output, and OS/2 BMP input/output are legacy
features. Legacy status doesn't necessarily mean that the features
are deprecated. Rather, it is meant to discourage users from using
features that may be of little or no benefit on modern machines (such
as low-quality modes that had significant performance advantages in
the early 1990s but no longer do) and that are maintained on a
break/fix basis only.
- General wordsmithing, grammar/punctuation policing, and formatting
tweaks
- Clarify which data precisions each cjpeg input format and each djpeg
output format supports.
- cjpeg.1: Remove unnecessary and impolitic statement about the -targa
switch.
- Adjust or remove performance claims to reflect the fact that:
* On modern machines, the djpeg "-fast" switch has a negligible effect
on performance.
* There is a measurable difference between the performance of Floyd-
Steinberg dithering and no dithering, but it is not likely
perceptible to most users.
* There is a measurable difference between the performance of 1-pass
and 2-pass color quantization, but it is not likely perceptible to
most users.
* There is a measurable difference between the performance of
full-color and grayscale output when decompressing a full-color JPEG
image, but it is not likely perceptible to most users.
* IDCT scaling does not necessarily improve performance. (It
generally does if the scaling factor is <= 1/2 and generally doesn't
if the scaling factor is > 1/2, at least on my machine. The
performance claim made in jpeg-6b was probably invalidated when we
merged the additional scaling factors from jpeg-7.)
- Clarify which djpeg switches/output formats cannot be used when
decompressing lossless JPEG images.
- Remove djpeg hints, since those involve quality vs. speed tradeoffs
that are no longer relevant for modern machines.
- Remove documentation regarding using color quantization with 16-bit
data precision. (Color quantization requires lossy mode.)
- Java: Fix typos in TJDecompressor.decompress12() and
TJDecompressor.decompress16() documentation.
- jpegtran.1: Fix truncated paragraph
In a man page, a single quote at the start of a line is interpreted as
a macro.
Closes#775
- libjpeg.txt:
* Mention J16SAMPLE data type (oversight.)
* Remove statement about extending jdcolor.c. (libjpeg-turbo is not
quite as DIY as libjpeg once was.)
* Remove paragraph about tweaking the various typedefs in jmorecfg.h.
It is no longer relevant for modern machines.
* Remove caveat regarding systems with ints less than 16 bits wide.
(ANSI/ISO C requires an int to be at least 16 bits wide, and
libjpeg-turbo has never supported non-ANSI compilers.)
- usage.txt:
* Add copyright header.
* Document cjpeg -icc, -memdst, -report, -strict, and -version
switches.
* Document djpeg -icc, -maxscans, -memsrc, -report, -skip, -crop,
-strict, and -version switches.
* Document jpegtran -icc, -maxscans, -report, -strict, and -version
switches.
This makes it possible for downstream packagers and other integrators of
libjpeg-turbo to include only specific directories from the
libjpeg-turbo installation (or to install specific directories under a
different prefix, etc.) The names of the components correspond to the
directories into which they will be installed.
Refer to libvips/libvips#3931, #265, #338Closes#756
- Clarify that lossless JPEG is slower than and doesn't compress as well
as lossy JPEG. (That should be obvious, because "lossy" literally
means that data is thrown away.)
- Re-generate TurboJPEG C API documentation using Doxygen 1.9.8.
- Clarify that setting the data_precision field in jpeg_compress_struct
to 16 requires lossless mode.
- Explain what the predictor selection value actually does. (Refer to
Recommendation ITU-T T.81 (1992) | ISO/IEC 10918-1:1994, Section
H.1.2.1.)
TJPARAM_MAXPIXELS was previously hidden and used only for fuzz testing,
but it is potentially useful for calling applications as well,
particularly if they want to guard against excessive memory consumption
by the tj3LoadImage*() functions. The parameter has also been extended
to decompression and lossless transformation functions/methods, mainly
as a convenience. (It was already possible for calling applications to
impose their own JPEG image size limits by reading the JPEG header prior
to decompressing or transforming the image.)
This corresponds to max_memory_to_use in the jpeg_memory_mgr struct in
the libjpeg API, except that the TurboJPEG parameter is specified in
megabytes. Because this is 2023 and computers with less than 1 MB of
memory are not a thing (at least not within the scope of libjpeg-turbo
support), it isn't useful to allow a limit less than 1 MB to be
specified. Furthermore, because TurboJPEG parameters are signed
integers, if we allowed the memory limit to be specified in bytes, then
it would be impossible to specify a limit larger than 2 GB on 64-bit
machines. Because max_memory_to_use is a long signed integer,
effectively we can specify a limit of up to 2 petabytes on 64-bit
machines if the TurboJPEG parameter is specified in megabytes. (2 PB
should be enough for anybody, right?)
This commit also bumps the TurboJPEG API version to 3.0.1. Since the
TurboJPEG API version no longer tracks the libjpeg-turbo version, it
makes sense to increment the API revision number when adding constants,
to increment the minor version number when adding functions, and to
increment the major version number for a complete overhaul.
This commit also removes the vestigial TJ_NUMPARAM macro, which was
never defined because it proved unnecessary.
Partially implements #735
(oversight from 386ec0abc7)
Tiled decompression will ultimately fail if the subsampling type of the
JPEG input image is unknown, but the C version of TJBench needs to fail
earlier in order to avoid using -1 (TJSAMP_UNKNOWN) as an array index
for tjMCUWidth[]/tjMCUHeight[]. The Java version now fails earlier as
well, although there is no benefit to that other than making the error
message less cryptic.
If we have transformed a 4:1:1 or 4:4:1 JPEG input image in such a way
that the horizontal and vertical dimensions are transposed, then we need
to change the subsampling type that is passed to the decomp() function.
Otherwise, tj3YUVBufSize() may return an incorrect value.
(oversight from fc881ebb21)
When -tile is used with a JPEG input image, TJBench generates the tiles
using lossless cropping, which will fail if the cropping region doesn't
align with an MCU boundary. Furthermore, there is no reason to avoid
8x8 tiles when decompressing 4:4:4 or grayscale JPEG images.
This allows losslessly transposed or rotated 4:1:1 JPEG images to be
losslessly cropped, partially decompressed, or decompressed to planar
YUV images.
Because tj3Transform() allows multiple lossless transformations to be
chained together, all subsampling options need to have a corresponding
transposed subsampling option. (This is why 4:4:0 was originally
implemented as well.) Otherwise, the documentation would be technically
incorrect. It says that images with unknown subsampling types cannot be
losslessly cropped, partially decompressed, or decompressed to planar
YUV images, but it doesn't say anything about images with known
subsampling types whose subsampling type becomes unknown if the image is
rotated or transposed. This is one of those situations in which it is
easier to implement a feature that works around the problem than to
document the problem.
Closes#659
This actually works and apparently always has worked. It only failed
because the libjpeg code, which did not originally support arithmetic
coding, assumed that optimize_coding should always be TRUE for 12-bit
data precision.
(ChangeLog update forthcoming)
- Prefix all function names with "tj3" and remove version suffixes from
function names. (Future API overhauls will increment the prefix to
"tj4", etc., thus retaining backward API/ABI compatibility without
versioning each individual function.)
- Replace stateless boolean flags (including TJ*FLAG_ARITHMETIC and
TJ*FLAG_LOSSLESS, which were never released) with stateful integer
parameters, the value of which persists between function calls.
* Use parameters for the JPEG quality and subsampling as well, in
order to eliminate the awkwardness of specifying function arguments
that weren't relevant for lossless compression.
* tj3DecompressHeader() now stores all relevant information about the
JPEG image, including the width, height, subsampling type, entropy
coding type, etc. in parameters rather than returning that
information in its arguments.
* TJ*FLAG_LIMITSCANS has been reimplemented as an integer parameter
(TJ*PARAM_SCANLIMIT) that allows the number of scans to be
specified.
- Use the const keyword for all pointer arguments to unmodified
buffers, as well as for both dimensions of 2D pointers. Addresses
#395.
- Use size_t rather than unsigned long to represent buffer sizes, since
unsigned long is a 32-bit type on Windows. Addresses #24.
- Return 0 from all buffer size functions if an error occurs, rather
than awkwardly trying to return -1 in an unsigned data type.
- Implement 12-bit and 16-bit data precision using dedicated
compression, decompression, and image I/O functions/methods.
* Suffix the names of all data-precision-specific functions with 8,
12, or 16.
* Because the YUV functions are intended to be used for video, they
are currently only implemented with 8-bit data precision, but they
can be expanded to 12-bit data precision in the future, if
necessary.
* Extend TJUnitTest and TJBench to test 12-bit and 16-bit data
precision, using a new -precision option.
* Add appropriate regression tests for all of the above to the 'test'
target.
* Extend tjbenchtest to test 12-bit and 16-bit data precision, and
add separate 'tjtest12' and 'tjtest16' targets.
* BufferedImage I/O in the Java API is currently limited to 8-bit
data precision, since the BufferedImage class does not
straightforwardly support higher data precisions.
* Extend the PPM reader to convert 12-bit and 16-bit PBMPLUS files
to grayscale or CMYK pixels, as it already does for 8-bit files.
- Properly accommodate lossless JPEG using dedicated parameters
(TJ*PARAM_LOSSLESS, TJ*PARAM_LOSSLESSPSV, and TJ*PARAM_LOSSLESSPT),
rather than using a flag and awkwardly repurposing the JPEG quality.
Update TJBench to properly reflect whether a JPEG image is lossless.
- Re-organize the TJBench usage screen.
- Update the Java docs using Java 11, to improve the formatting and
eliminate HTML frames.
- Use the accurate integer DCT algorithm by default for both
compression and decompression, since the "fast" algorithm is a legacy
feature, it does not pass the ISO compliance tests, and it is not
actually faster on modern x86 CPUs.
* Remove the -accuratedct option from TJBench and TJExample.
- Re-implement the 'tjtest' target using a CMake script that enables
the appropriate tests, depending on the data precision and whether or
not the Java API is part of the build.
- Consolidate the C and Java versions of tjbenchtest into one script.
- Consolidate the C and Java versions of tjexampletest into one script.
- Combine all initialization functions into a single function
(tj3Init()) that accepts an integer parameter specifying the
subsystems to initialize.
- Enable decompression scaling explicitly, using a new function/method
(tj3SetScalingFactor()/TJDecompressor.setScalingFactor()), rather
than implicitly using awkward "desired width"/"desired height"
parameters.
- Introduce a new macro/constant (TJUNSCALED/TJ.UNSCALED) that maps to
a scaling factor of 1/1.
- Implement partial image decompression, using a new function/method
(tj3SetCroppingRegion()/TJDecompressor.setCroppingRegion()) and
TJBench option (-crop). Extend tjbenchtest to test the new feature.
Addresses #1.
- Allow the JPEG colorspace to be specified explicitly when
compressing, using a new parameter (TJ*PARAM_COLORSPACE). This
allows JPEG images with the RGB and CMYK colorspaces to be created.
- Remove the error/difference image feature from TJBench. Identical
images to the ones that TJBench created can be generated using
ImageMagick with
'magick composite <original_image> <output_image> -compose difference <diff_image>'
- Handle JPEG images with unknown subsampling types. TJ*PARAM_SUBSAMP
is set to TJ*SAMP_UNKNOWN (== -1) for such images, but they can still
be decompressed fully into packed-pixel images or losslessly
transformed (with the exception of lossless cropping.) They cannot
be partially decompressed or decompressed into planar YUV images.
Note also that TJBench, due to its lack of support for imperfect
transforms, requires that the subsampling type be known when
rotating, flipping, or transversely transposing an image. Addresses
#436
- The Java version of TJBench now has identical functionality to the C
version. This was accomplished by (somewhat hackishly) calling the
TurboJPEG C image I/O functions through JNI and copying the pixels
between the C heap and the Java heap.
- Add parameters (TJ*PARAM_RESTARTROWS and TJ*PARAM_RESTARTBLOCKS) and
a TJBench option (-restart) to allow the restart marker interval to
be specified when compressing. Eliminate the undocumented TJ_RESTART
environment variable.
- Add a parameter (TJ*PARAM_OPTIMIZE), a transform option
(TJ*OPT_OPTIMIZE), and a TJBench option (-optimize) to allow
optimized baseline Huffman coding to be specified when compressing.
Eliminate the undocumented TJ_OPTIMIZE environment variable.
- Add parameters (TJ*PARAM_XDENSITY, TJ*PARAM_DENSITY, and
TJ*DENSITYUNITS) to allow the pixel density to be specified when
compressing or saving a Windows BMP image and to be queried when
decompressing or loading a Windows BMP image. Addresses #77.
- Refactor the fuzz targets to use the new API.
* Extend decompression coverage to 12-bit and 16-bit data precision.
* Replace the awkward cjpeg12 and cjpeg16 targets with proper
TurboJPEG-based compress12, compress12-lossless, and
compress16-lossless targets
- Fix innocuous UBSan warnings uncovered by the new fuzzers.
- Implement previous versions of the TurboJPEG API by wrapping the new
functions (tested by running the 2.1.x versions of TJBench, via
tjbenchtest, and TJUnitTest against the new implementation.)
* Remove all JNI functions for deprecated Java methods and implement
the deprecated methods using pure Java wrappers. It should be
understood that backward API compatibility in Java applies only to
the Java classes and that one cannot mix and match a JAR file from
one version of libjpeg-turbo with a JNI library from another
version.
- tj3Destroy() now silently accepts a NULL handle.
- tj3Alloc() and tj3Free() now return/accept void pointers, as malloc()
and free() do.
- The image I/O functions now accept a TurboJPEG instance handle, which
is used to transmit/receive parameters and to receive error
information.
Closes#517
Because Java array sizes are ints, the various size methods in the TJ
class have int return values. Thus, we have to guard against signed
int overflow at the JNI level, because the C functions can return sizes
greater than INT_MAX.
This also adds a test for TJ.planeWidth() and TJ.planeHeight(), in order
to validate 8a1526a442 in Java.
The documented behavior of the -progressive option is to use progressive
entropy coding in JPEG images generated by compression and transform
operations. However, setting TJFLAG_PROGRESSIVE was insufficient to
accomplish that, because TJBench doesn't enable lossless transformation
if xformOpt == 0.
- TJBench/TJUnitTest: Wordsmith command-line output
- Java: "decompress operations"="decompression operations"
- tjLoadImage(): Error message tweak
- Don't mention compression performance in the description of
TJXOPT_PROGRESSIVE/TJTransform.OPT_PROGRESSIVE, because the image has
already been compressed at that point.
(Oversights from 9a146f0f23)
This is similar to the fix that 2a9e3bd743
applied to the C API. We have to apply it separately at the JNI level
because the Java API always stores buffer sizes in 32-bit integers, and
the C buffer size functions could overflow an int when using 64-bit
code. (NOTE: The Java API stores buffer sizes in 32-bit integers
because Java itself always uses 32-bit integers for array sizes.) Since
Java don't allow no buffer overruns 'round here, this commit doesn't
change the ultimate outcome. It just makes the inevitable exception
easier to diagnose.