The JPEG-1 spec never uses the term "MCU block". That term is rarely
used in other literature to describe the equivalent of an MCU in an
interleaved JPEG image, but the libjpeg documentation uses "iMCU" to
describe the same thing. "iMCU" is a better term, since the equivalent
of an interleaved MCU can contain multiple DCT blocks (or samples in
lossless mode) that are only grouped together if the image is
interleaved.
In the case of restart markers, "MCU block" was used in the libjpeg
documentation instead of "MCU", but "MCU" is more accurate and less
confusing. (The restart interval is literally in MCUs, where one MCU
is one data unit in a non-interleaved JPEG image and multiple data units
in a multi-component interleaved JPEG image.)
In the case of 9b704f96b2, the issue was
actually with progressive JPEG images exactly two DCT blocks wide, not
two MCU blocks wide.
This commit also defines "MCU" and "MCU row" in the description of the
various restart marker options/parameters. Although an MCU row is
technically always a row of samples in lossless mode, "sample row" was
confusing, since it is used in other places to describe a row of samples
for a single component (whereas an MCU row in a typical lossless JPEG
image consists of a row of interleaved samples for all components.)
- "Optimized baseline entropy coding" = "Huffman table optimization"
"Optimized baseline entropy coding" was meant to emphasize that the
feature is only useful when generating baseline (single-scan lossy
8-bit-per-sample Huffman-coded) JPEG images, because it is
automatically enabled when generating Huffman-coded progressive
(multi-scan), 12-bit-per-sample, and lossless JPEG images. However,
Huffman table optimization isn't actually an integral part of those
non-baseline modes. You can forego Huffman table optimization with
12-bit data precision if you supply your own Huffman tables. The spec
doesn't require it with progressive or lossless mode, either, although
our implementation does. Furthermore, "baseline" describes more than
just the type of entropy coding used. It was incorrect to say that
optimized "baseline" entropy coding is automatically enabled for
Huffman-coded progressive, 12-bit-per-sample, and lossless JPEG
images, since those are clearly not baseline images.
- "Progressive entropy coding" = "Progressive JPEG"
"Progressive" describes more than just the type of entropy coding
used. (In fact, both Huffman-coded and arithmetic-coded images can be
progressive.)
- Mention that TJPARAM_OPTIMIZE/TJ.PARAM_OPTIMIZE can be used with
lossless transformation as well.
- General wordsmithing
- Formatting tweaks
(Regression introduced by 7bb958b732)
Because of 7bb958b732, the TurboJPEG
compression and encoding functions no longer transfer the value of
TJPARAM_OPTIMIZE into cinfo->data_precision unless the data precision
is 8. The intent of that was to prevent using_std_huff_tables() from
being called more than once when reusing the same compressor object to
generate multiple 12-bit-per-sample JPEG images. However, because
cinfo->optimize_coding is always set to TRUE by jpeg_set_defaults() if
the data precision is 12, calling applications that use 12-bit data
precision had to unset cinfo->optimize_coding if they set
cinfo->arith_code after calling jpeg_set_defaults(). Because of
7bb958b732, the TurboJPEG API stopped
doing that except with 8-bit data precision. Thus, attempting to
generate a 12-bit-per-sample arithmetic-coded lossy JPEG image using
the TurboJPEG API failed with "Requested features are incompatible."
Since the compressor will always fail if cinfo->arith_code and
cinfo->optimize_coding are both set, and since cinfo->optimize_coding
has no relevance for arithmetic coding, the most robust and user-proof
solution is for jinit_c_master_control() to set cinfo->optimize_coding
to FALSE if cinfo->arith_code is TRUE.
This commit also:
- modifies TJBench so that it no longer reports that it is using
optimized baseline entropy coding in modes where that setting
is irrelevant,
- amends the cjpeg documentation to clarify that -optimize is implied
when specifying -progressive or '-precision 12' without -arithmetic,
and
- prevents jpeg_set_defaults() from uselessly checking the value of
cinfo->arith_code immediately after it has been set to FALSE.
TJPARAM_MAXPIXELS was previously hidden and used only for fuzz testing,
but it is potentially useful for calling applications as well,
particularly if they want to guard against excessive memory consumption
by the tj3LoadImage*() functions. The parameter has also been extended
to decompression and lossless transformation functions/methods, mainly
as a convenience. (It was already possible for calling applications to
impose their own JPEG image size limits by reading the JPEG header prior
to decompressing or transforming the image.)
This corresponds to max_memory_to_use in the jpeg_memory_mgr struct in
the libjpeg API, except that the TurboJPEG parameter is specified in
megabytes. Because this is 2023 and computers with less than 1 MB of
memory are not a thing (at least not within the scope of libjpeg-turbo
support), it isn't useful to allow a limit less than 1 MB to be
specified. Furthermore, because TurboJPEG parameters are signed
integers, if we allowed the memory limit to be specified in bytes, then
it would be impossible to specify a limit larger than 2 GB on 64-bit
machines. Because max_memory_to_use is a long signed integer,
effectively we can specify a limit of up to 2 petabytes on 64-bit
machines if the TurboJPEG parameter is specified in megabytes. (2 PB
should be enough for anybody, right?)
This commit also bumps the TurboJPEG API version to 3.0.1. Since the
TurboJPEG API version no longer tracks the libjpeg-turbo version, it
makes sense to increment the API revision number when adding constants,
to increment the minor version number when adding functions, and to
increment the major version number for a complete overhaul.
This commit also removes the vestigial TJ_NUMPARAM macro, which was
never defined because it proved unnecessary.
Partially implements #735
(oversight from 386ec0abc7)
Tiled decompression will ultimately fail if the subsampling type of the
JPEG input image is unknown, but the C version of TJBench needs to fail
earlier in order to avoid using -1 (TJSAMP_UNKNOWN) as an array index
for tjMCUWidth[]/tjMCUHeight[]. The Java version now fails earlier as
well, although there is no benefit to that other than making the error
message less cryptic.
If we have transformed a 4:1:1 or 4:4:1 JPEG input image in such a way
that the horizontal and vertical dimensions are transposed, then we need
to change the subsampling type that is passed to the decomp() function.
Otherwise, tj3YUVBufSize() may return an incorrect value.
(oversight from fc881ebb21)
When -tile is used with a JPEG input image, TJBench generates the tiles
using lossless cropping, which will fail if the cropping region doesn't
align with an MCU boundary. Furthermore, there is no reason to avoid
8x8 tiles when decompressing 4:4:4 or grayscale JPEG images.
This allows losslessly transposed or rotated 4:1:1 JPEG images to be
losslessly cropped, partially decompressed, or decompressed to planar
YUV images.
Because tj3Transform() allows multiple lossless transformations to be
chained together, all subsampling options need to have a corresponding
transposed subsampling option. (This is why 4:4:0 was originally
implemented as well.) Otherwise, the documentation would be technically
incorrect. It says that images with unknown subsampling types cannot be
losslessly cropped, partially decompressed, or decompressed to planar
YUV images, but it doesn't say anything about images with known
subsampling types whose subsampling type becomes unknown if the image is
rotated or transposed. This is one of those situations in which it is
easier to implement a feature that works around the problem than to
document the problem.
Closes#659
This actually works and apparently always has worked. It only failed
because the libjpeg code, which did not originally support arithmetic
coding, assumed that optimize_coding should always be TRUE for 12-bit
data precision.
(ChangeLog update forthcoming)
- Prefix all function names with "tj3" and remove version suffixes from
function names. (Future API overhauls will increment the prefix to
"tj4", etc., thus retaining backward API/ABI compatibility without
versioning each individual function.)
- Replace stateless boolean flags (including TJ*FLAG_ARITHMETIC and
TJ*FLAG_LOSSLESS, which were never released) with stateful integer
parameters, the value of which persists between function calls.
* Use parameters for the JPEG quality and subsampling as well, in
order to eliminate the awkwardness of specifying function arguments
that weren't relevant for lossless compression.
* tj3DecompressHeader() now stores all relevant information about the
JPEG image, including the width, height, subsampling type, entropy
coding type, etc. in parameters rather than returning that
information in its arguments.
* TJ*FLAG_LIMITSCANS has been reimplemented as an integer parameter
(TJ*PARAM_SCANLIMIT) that allows the number of scans to be
specified.
- Use the const keyword for all pointer arguments to unmodified
buffers, as well as for both dimensions of 2D pointers. Addresses
#395.
- Use size_t rather than unsigned long to represent buffer sizes, since
unsigned long is a 32-bit type on Windows. Addresses #24.
- Return 0 from all buffer size functions if an error occurs, rather
than awkwardly trying to return -1 in an unsigned data type.
- Implement 12-bit and 16-bit data precision using dedicated
compression, decompression, and image I/O functions/methods.
* Suffix the names of all data-precision-specific functions with 8,
12, or 16.
* Because the YUV functions are intended to be used for video, they
are currently only implemented with 8-bit data precision, but they
can be expanded to 12-bit data precision in the future, if
necessary.
* Extend TJUnitTest and TJBench to test 12-bit and 16-bit data
precision, using a new -precision option.
* Add appropriate regression tests for all of the above to the 'test'
target.
* Extend tjbenchtest to test 12-bit and 16-bit data precision, and
add separate 'tjtest12' and 'tjtest16' targets.
* BufferedImage I/O in the Java API is currently limited to 8-bit
data precision, since the BufferedImage class does not
straightforwardly support higher data precisions.
* Extend the PPM reader to convert 12-bit and 16-bit PBMPLUS files
to grayscale or CMYK pixels, as it already does for 8-bit files.
- Properly accommodate lossless JPEG using dedicated parameters
(TJ*PARAM_LOSSLESS, TJ*PARAM_LOSSLESSPSV, and TJ*PARAM_LOSSLESSPT),
rather than using a flag and awkwardly repurposing the JPEG quality.
Update TJBench to properly reflect whether a JPEG image is lossless.
- Re-organize the TJBench usage screen.
- Update the Java docs using Java 11, to improve the formatting and
eliminate HTML frames.
- Use the accurate integer DCT algorithm by default for both
compression and decompression, since the "fast" algorithm is a legacy
feature, it does not pass the ISO compliance tests, and it is not
actually faster on modern x86 CPUs.
* Remove the -accuratedct option from TJBench and TJExample.
- Re-implement the 'tjtest' target using a CMake script that enables
the appropriate tests, depending on the data precision and whether or
not the Java API is part of the build.
- Consolidate the C and Java versions of tjbenchtest into one script.
- Consolidate the C and Java versions of tjexampletest into one script.
- Combine all initialization functions into a single function
(tj3Init()) that accepts an integer parameter specifying the
subsystems to initialize.
- Enable decompression scaling explicitly, using a new function/method
(tj3SetScalingFactor()/TJDecompressor.setScalingFactor()), rather
than implicitly using awkward "desired width"/"desired height"
parameters.
- Introduce a new macro/constant (TJUNSCALED/TJ.UNSCALED) that maps to
a scaling factor of 1/1.
- Implement partial image decompression, using a new function/method
(tj3SetCroppingRegion()/TJDecompressor.setCroppingRegion()) and
TJBench option (-crop). Extend tjbenchtest to test the new feature.
Addresses #1.
- Allow the JPEG colorspace to be specified explicitly when
compressing, using a new parameter (TJ*PARAM_COLORSPACE). This
allows JPEG images with the RGB and CMYK colorspaces to be created.
- Remove the error/difference image feature from TJBench. Identical
images to the ones that TJBench created can be generated using
ImageMagick with
'magick composite <original_image> <output_image> -compose difference <diff_image>'
- Handle JPEG images with unknown subsampling types. TJ*PARAM_SUBSAMP
is set to TJ*SAMP_UNKNOWN (== -1) for such images, but they can still
be decompressed fully into packed-pixel images or losslessly
transformed (with the exception of lossless cropping.) They cannot
be partially decompressed or decompressed into planar YUV images.
Note also that TJBench, due to its lack of support for imperfect
transforms, requires that the subsampling type be known when
rotating, flipping, or transversely transposing an image. Addresses
#436
- The Java version of TJBench now has identical functionality to the C
version. This was accomplished by (somewhat hackishly) calling the
TurboJPEG C image I/O functions through JNI and copying the pixels
between the C heap and the Java heap.
- Add parameters (TJ*PARAM_RESTARTROWS and TJ*PARAM_RESTARTBLOCKS) and
a TJBench option (-restart) to allow the restart marker interval to
be specified when compressing. Eliminate the undocumented TJ_RESTART
environment variable.
- Add a parameter (TJ*PARAM_OPTIMIZE), a transform option
(TJ*OPT_OPTIMIZE), and a TJBench option (-optimize) to allow
optimized baseline Huffman coding to be specified when compressing.
Eliminate the undocumented TJ_OPTIMIZE environment variable.
- Add parameters (TJ*PARAM_XDENSITY, TJ*PARAM_DENSITY, and
TJ*DENSITYUNITS) to allow the pixel density to be specified when
compressing or saving a Windows BMP image and to be queried when
decompressing or loading a Windows BMP image. Addresses #77.
- Refactor the fuzz targets to use the new API.
* Extend decompression coverage to 12-bit and 16-bit data precision.
* Replace the awkward cjpeg12 and cjpeg16 targets with proper
TurboJPEG-based compress12, compress12-lossless, and
compress16-lossless targets
- Fix innocuous UBSan warnings uncovered by the new fuzzers.
- Implement previous versions of the TurboJPEG API by wrapping the new
functions (tested by running the 2.1.x versions of TJBench, via
tjbenchtest, and TJUnitTest against the new implementation.)
* Remove all JNI functions for deprecated Java methods and implement
the deprecated methods using pure Java wrappers. It should be
understood that backward API compatibility in Java applies only to
the Java classes and that one cannot mix and match a JAR file from
one version of libjpeg-turbo with a JNI library from another
version.
- tj3Destroy() now silently accepts a NULL handle.
- tj3Alloc() and tj3Free() now return/accept void pointers, as malloc()
and free() do.
- The image I/O functions now accept a TurboJPEG instance handle, which
is used to transmit/receive parameters and to receive error
information.
Closes#517
The documented behavior of the -progressive option is to use progressive
entropy coding in JPEG images generated by compression and transform
operations. However, setting TJFLAG_PROGRESSIVE was insufficient to
accomplish that, because TJBench doesn't enable lossless transformation
if xformOpt == 0.
- TJBench/TJUnitTest: Wordsmith command-line output
- Java: "decompress operations"="decompression operations"
- tjLoadImage(): Error message tweak
- Don't mention compression performance in the description of
TJXOPT_PROGRESSIVE/TJTransform.OPT_PROGRESSIVE, because the image has
already been compressed at that point.
(Oversights from 9a146f0f23)
- Wordsmithing, formatting, and grammar tweaks
- Various clarifications and corrections, including specifying whether
a particular buffer or image is used as a source or destination
- Accommodate/mention features that were introduced since the API
documentation was created.
- For clarity, use "packed-pixel" to describe uncompressed
source/destination images that are not planar YUV.
- Use "row" rather than "line" to refer to a single horizontal group of
pixels or component values, for consistency with the libjpeg API
documentation. (libjpeg also uses "scanline", which is a more archaic
term.)
- Use "alignment" rather than "padding" to refer to the number of bytes
by which a row's width is evenly divisible. This consistifies the
documention of the YUV functions and tjLoadImage(). ("Padding"
typically refers to the number of bytes added to each row, which is
not the same thing.)
- Remove all references to "the underlying codec." Although the
TurboJPEG API originated as a cross-platform wrapper for the Intel
Integrated Performance Primitives, Sun mediaLib, QuickTime, and
libjpeg, none of those TurboJPEG implementations has been maintained
since 2009. Nothing would prevent someone from implementing the
TurboJPEG API without libjpeg-turbo, but such an implementation would
not necessarily have an "underlying codec." (It could be fully
self-contained.)
- Use "destination image" rather than "output image", for consistency,
or describe the type of image that will be output.
- Avoid the term "image buffer" and instead use "byte buffer" to
refer to buffers that will hold JPEG images, or describe the type of
image that will be contained in the buffer. (The Java documentation
doesn't use "byte buffer", because the buffer arrays literally have
"byte" in front of them, and since Java doesn't have pointers, it is
not possible for mere mortals to store any other type of data in those
arrays.)
- C: Use "unified" to describe YUV images stored in a single buffer, for
consistency with the Java documentation.
- Use "planar YUV" rather than "YUV planar". Is is our convention to
describe images using {component layout} {colorspace/pixel format}
{image function}, e.g. "packed-pixel RGB source image" or "planar YUV
destination image."
- C: Document the TurboJPEG API version in which a particular function
or macro was introduced, and reorder the backward compatibility
function stubs in turbojpeg.h alphabetically by API version.
- C: Use Markdown rather than HTML tags, where possible, in the Doxygen
comments.
Add a new TurboJPEG C API function (tjDecompressHeader4()) and Java API
method (TJDecompressor.getFlags()) that return the bitwise OR of any
flags that are relevant to the JPEG image being decompressed (currently
TJFLAG_PROGRESSIVE, TJFLAG_ARITHMETIC, TJFLAG_LOSSLESS, and their Java
equivalents.) This allows a calling program to determine whether the
image being decompressed is a lossless JPEG image, which means that the
decompression scaling feature will not be available and that a
full-sized destination buffer should be allocated.
More specifically, this fixes a buffer overrun in TJBench, TJExample,
and the decompress* fuzz targets that occurred when attempting (in vain)
to decompress a lossless JPEG image with decompression scaling enabled.
Prevent several integer overflow issues and subsequent segfaults that
occurred when attempting to compress or decompress gigapixel images with
the TurboJPEG API:
- Modify tjBufSize(), tjBufSizeYUV2(), and tjPlaneSizeYUV() to avoid
integer overflow when computing the return values and to return an
error if such an overflow is unavoidable.
- Modify tjunittest to validate the above.
- Modify tjCompress2(), tjEncodeYUVPlanes(), tjDecompress2(), and
tjDecodeYUVPlanes() to avoid integer overflow when computing the row
pointers in the 64-bit TurboJPEG C API.
- Modify TJBench (both C and Java versions) to avoid overflowing the
size argument to malloc()/new and to fail gracefully if such an
overflow is unavoidable.
In general, this allows gigapixel images to be accommodated by the
64-bit TurboJPEG C API when using automatic JPEG buffer (re)allocation.
Such images cannot currently be accommodated without automatic JPEG
buffer (re)allocation, due to the fact that tjAlloc() accepts a 32-bit
integer argument (oops.) Such images cannot be accommodated in the
TurboJPEG Java API due to the fact that Java always uses a signed 32-bit
integer as an array index.
Fixes#361
... and modify tjbench.c to match the variable name changes made to
TJBench.java
("checkstyle" = http://checkstyle.sourceforge.net, not our regex-based
checkstyle script)
With rare exceptions ...
- Always separate line continuation characters by one space from
preceding code.
- Always use two-space indentation. Never use tabs.
- Always use K&R-style conditional blocks.
- Always surround operators with spaces, except in raw assembly code.
- Always put a space after, but not before, a comma.
- Never put a space between type casts and variables/function calls.
- Never put a space between the function name and the argument list in
function declarations and prototypes.
- Always surround braces ('{' and '}') with spaces.
- Always surround statements (if, for, else, catch, while, do, switch)
with spaces.
- Always attach pointer symbols ('*' and '**') to the variable or
function name.
- Always precede pointer symbols ('*' and '**') by a space in type
casts.
- Use the MIN() macro from jpegint.h within the libjpeg and TurboJPEG
API libraries (using min() from tjutil.h is still necessary for
TJBench.)
- Where it makes sense (particularly in the TurboJPEG code), put a blank
line after variable declaration blocks.
- Always separate statements in one-liners by two spaces.
The purpose of this was to ease maintenance on my part and also to make
it easier for contributors to figure out how to format patch
submissions. This was admittedly confusing (even to me sometimes) when
we had 3 or 4 different style conventions in the same source tree. The
new convention is more consistent with the formatting of other OSS code
bases.
This commit corrects deviations from the chosen formatting style in the
libjpeg API code and reformats the TurboJPEG API code such that it
conforms to the same standard.
NOTES:
- Although it is no longer necessary for the function name in function
declarations to begin in Column 1 (this was historically necessary
because of the ansi2knr utility, which allowed libjpeg to be built
with non-ANSI compilers), we retain that formatting for the libjpeg
code because it improves readability when using libjpeg's function
attribute macros (GLOBAL(), etc.)
- This reformatting project was accomplished with the help of AStyle and
Uncrustify, although neither was completely up to the task, and thus
a great deal of manual tweaking was required. Note to developers of
code formatting utilities: the libjpeg-turbo code base is an
excellent test bed, because AFAICT, it breaks every single one of the
utilities that are currently available.
- The legacy (MMX, SSE, 3DNow!) assembly code for i386 has been
formatted to match the SSE2 code (refer to
ff5685d5344273df321eb63a005eaae19d2496e3.) I hadn't intended to
bother with this, but the Loongson MMI implementation demonstrated
that there is still academic value to the MMX implementation, as an
algorithmic model for other 64-bit vector implementations. Thus, it
is desirable to improve its readability in the same manner as that of
the SSE2 implementation.
Previously, -stoponwarning only had an effect on the underlying
TurboJPEG C functions, but TJBench still aborted if a non-fatal error
occurred. This commit modifies the C version of TJBench such that it
always recovers from a non-fatal error unless -stoponwarning is
specified. Furthermore, the benchmark stores the details of the last
non-fatal error and does not print any subsequent non-fatal error
messages unless they differ from the last one.
Due to limitations in the Java API (specifically, the fact that it
cannot communicate errors, fatal or otherwise, to the calling program
without throwing a TJException), it was only possible to make
decompression operations fully recoverable within TJBench. With other
operations, -stoponwarning still has an effect on the underlying C
library but has no effect at the Java level.
The Java API documentation has been amended to reflect that only certain
methods are truly recoverable, regardless of the state of
TJ.FLAG_STOPONWARNING.
Allow progressive entropy coding to be enabled on a
transform-by-transform basis, and implement a new transform option for
disabling the copying of markers.
Closes#153
Given that libjpeg-turbo can often process hundreds of megapixels/second
on modern hardware, the default of one warmup iteration was essentially
meaningless. Furthermore, the -warmup option was a bit clunky, since
it required some foreknowledge of how fast the benchmarks were going to
execute.
This commit introduces a 1-second warmup interval for each benchmark by
default, and the -warmup option has been retasked to control the length
of that interval.
- Provide a new C API function and TJException method that allows
calling programs to query the severity of a compression/decompression/
transform error.
- Provide a new flag that instructs the library to immediately stop
compressing/decompressing/transforming if a warning is encountered.
Fixes#151
Embedded ICC profiles can cause the size of a JPEG file to exceed the
size returned by tjBufSize() (which is really meant to be used for
compression anyhow, not for decompression), and this was causing a
segfault (C) or an ArrayIndexOutOfBoundsException (Java) when
decompressing such files with TJBench. This commit modifies the
benchmark such that, when tiled decompression is disabled, it re-uses
the source buffer as the primary JPEG buffer.