* commit '15274b901acb75d6d2433e8578f3cfbc6f4f5fd9': (98 commits)
AppVeyor: Use SignPath release cert/only sign tags
xform fuzz: Use only xform opts to set entropy alg
jchuff.c: Test for out-of-range coefficients
turbojpeg.h: Make customFilter() proto match doc
ChangeLog.md: Fix typo
djpeg: Fix -map option with 12-bit data precision
Disallow color quantization with lossless decomp
tj3Transform: Calc dst buf size from xformed dims
README.md: Include link to project home page
AppVeyor: Only add installers to zip file
AppVeyor: Integrate with SignPath.io
Fix build warnings/errs w/ -DNO_GETENV/-DNO_PUTENV
GitHub: Fix x32 build
Bump version to 3.0.0
tjexample.c: Prevent integer overflow
Disallow merged upsampling with lossless decomp
SECURITY.md: Wordsmithing and clarifications
GitHub: Add security policy
ChangeLog.md: List CVE ID fixed by 9f756bc6
jpeg_crop_scanline: Fix calc w/sclg + 2x4,4x2 samp
...
# By DRC (96) and others
# Via DRC
* tag '3.0.0': (98 commits)
AppVeyor: Use SignPath release cert/only sign tags
xform fuzz: Use only xform opts to set entropy alg
jchuff.c: Test for out-of-range coefficients
turbojpeg.h: Make customFilter() proto match doc
ChangeLog.md: Fix typo
djpeg: Fix -map option with 12-bit data precision
Disallow color quantization with lossless decomp
tj3Transform: Calc dst buf size from xformed dims
README.md: Include link to project home page
AppVeyor: Only add installers to zip file
AppVeyor: Integrate with SignPath.io
Fix build warnings/errs w/ -DNO_GETENV/-DNO_PUTENV
GitHub: Fix x32 build
Bump version to 3.0.0
tjexample.c: Prevent integer overflow
Disallow merged upsampling with lossless decomp
SECURITY.md: Wordsmithing and clarifications
GitHub: Add security policy
ChangeLog.md: List CVE ID fixed by 9f756bc6
jpeg_crop_scanline: Fix calc w/sclg + 2x4,4x2 samp
...
# Conflicts:
# CMakeLists.txt
# ChangeLog.md
# fuzz/decompress.cc
# fuzz/decompress_yuv.cc
If a hypothetical calling application does something really stupid and
changes cinfo->data_precision after calling jpeg_start_*compress(), then
the precision-specific methods called by jpeg_write_scanlines(),
jpeg_write_raw_data(), jpeg_finish_compress(), jpeg_read_scanlines(),
jpeg_read_raw_data(), or jpeg_start_output() may not be initialized.
Ensure that the first precision-specific method (which will always be
cinfo->main->process_data*(), cinfo->coef->compress_data*(), or
cinfo->coef->decompress_data()) called by any global function that may
be called after jpeg_start_*compress() is initialized and non-NULL.
This increases the likelihood (but does not guarantee) that a
hypothetical stupid calling application will fail gracefully rather than
segfault if it changes cinfo->data_precision after calling
jpeg_start_*compress(). A hypothetical stupid calling application can
still bork itself by changing cinfo->data_precision after initializing
the source manager but before calling jpeg_start_compress(), or after
initializing the destination manager but before calling
jpeg_start_decompress().
As with x86-64, the Power ISA basically implements 64-bit instructions
as extensions of their 32-bit counterparts. Thus, 64-bit Power ISA CPUs
can natively execute legacy 32-bit PowerPC instructions when running in
big-endian mode. Most Power ISA support has shifted (pun intended) to
little-endian mode, so there are few remaining operating systems that
support big-endian mode. Debian is one of them, however (albeit
unofficially.)
Unlike other versions of Clang 14.0.0, AppleClang 14.0.0 in Xcode 14.2
retains the old -ffp-contract=off default. Apple didn't adopt
-ffp-contract=on as the default until Xcode 14.3 (AppleClang 14.0.3.)
This has been confirmed in the Xcode 14.3 release notes.
We use a standard set of strict compiler warnings with Clang and GCC to
continuously test and maintain C89 conformance in the libjpeg API code.
However, SIMD extensions need not comply with that. The AltiVec code
specifically uses some C99isms, so disable -Wc99-extensions and
-Wpedantic in the scope of that code. Also disable -Wshadow, because
I'm too lazy to fix the TRANSPOSE() macro. Also
use #ifdef __BIG_ENDIAN__ and #if defined(__BIG_ENDIAN__) instead
of #if __BIG_ENDIAN__
We use a standard set of strict compiler warnings with Clang and GCC to
continuously test and maintain C89 conformance in the libjpeg API code.
However, SIMD extensions need not comply with that. The Neon code
specifically uses some C99isms, so disable
-Wdeclaration-after-statement, -Wc99-extensions, and -Wpedantic in the
scope of that code. Also modify the Neon feature tests so that they
will succeed if any of the aforementioned compiler warnings are enabled.
Rename the ARMV8_BUILD CMake variable to SECONDARY_BUILD, and modify the
makemacpkg script so that it allows any architecture in a primary or
secondary build. The idea is that Apple Silicon users can package an
arm64 primary build and a secondary x86_64 build, and Intel users can
package an x86_64 primary build and a secondary arm64 build, using the
same procedure.
Also simplify the iOS build instructions, using the
CMAKE_OSX_ARCHITECTURES variable rather than a toolchain.
Since libjpeg-turbo does not support Exif, the only way it can embed
density information in a JPEG image is by using the JFIF marker, which
is only written if the JPEG colorspace is YCbCr or grayscale.
(Referring to the conversation under #793, we may need to further
restrict that to 8-bit-per-sample JPEG images, because the JFIF spec
requires 8-bit data precision.)
- Due to an oversight, a113506d17
(libjpeg-turbo 1.4 beta1) effectively made the call to
std_huff_tables() in jpeg_set_defaults() a no-op if the Huffman tables
were previously defined, which made it impossible to disable Huffman
table optimization or progressive mode if they were previously enabled
in the same API instance. std_huff_tables() retains its previous
behavior for decompression instances, but it now force-enables the
standard (baseline) Huffman tables for compression instances.
- Due to another oversight, there was no way to disable lossless mode
if it was previously enabled in a particular API instance.
jpeg_set_defaults() now accomplishes this, which makes
TJ*PARAM_LOSSLESS behave as intended/documented.
- Due to yet another oversight, setCompDefaults() in the TurboJPEG API
library permanently modified the value of TJ*PARAM_SUBSAMP when
generating a lossless JPEG image, which affected subsequent lossy
compression operations. This issue was hidden by the issue above and
thus does not need to be publicly documented.
Fixes#792
The target data precision isn't known at the time that the calling
program sets TJPARAM_LOSSLESSPT, so tj3Set() needs to allow all possible
values (from 0 to 15.) jpeg_enable_lossless(), which is called within
the body of tj3Compress*(), will throw an error if the point transform
value is greater than {data precision} - 1.
Since LLVM/Windows emulates Visual Studio, CMake sets
CMAKE_C_SIMULATE_ID="MSVC" but does not set MSVC. Thus, our build
system needs to enable most (but not all) of the Visual Studio features
when CMAKE_C_SIMLUATE_ID="MSVC". Support for LLVM/Windows is currently
undocumented because it isn't a standalone build environment. (It
requires the Visual Studio and Windows SDK headers and link libraries.)
Closes#786
This just improves code readability by emphasizing that we don't care
about the destination image's level of subsampling unless
TJPARAM_NOREALLOC is set or lossless cropping will be performed.
With respect to tj3Transform(), this addresses an oversight from
bb1d540a80.
Note to self: A convenience function/method for computing the worst-case
transformed JPEG size for a particular transform would be nice.
- When transforming, the worst-case JPEG buffer size depends on the
subsampling level used in the destination image, since a grayscale
transform might have been applied.
- Parentheses Police
tj*Transform() relied upon the underlying transupp API to check the
cropping region. However, transupp uses unsigned integers for the
cropping region, whereas the tjregion structure uses signed integers.
Thus, casting negative values from a tjregion structure produced very
large unsigned values. In the case of the left and upper boundary, this
was innocuous, because jtransform_request_workspace() rejected the
values as being out of bounds. However, jtransform_request_workspace()
did not always reject very large width and height values, because it
supports expanding the destination image by specifying a cropping region
larger than the source image. In certain cases, it allowed those
values, and the libjpeg memory manager subsequently ran out of memory.
NOTE: Prior to this commit, image expansion technically worked with
tj*Transform() as long as the cropping width and height were valid and
automatic JPEG buffer (re)allocation was used. However, that behavior
is not a documented feature of the TurboJPEG API, nor do we have any way
of testing it at the moment. Official support for image expansion can
be added later, if there is sufficient demand for it.
Similarly, this commit modifies tj3SetCroppingRegion() so that it
explicitly checks for left boundary values exactly equal to the scaled
image width and upper boundary values exactly equal to the scaled image
height. If the specified cropping width or height was 0 (which is
interpreted as {scaled image width} - {left boundary} or
{scaled image height} - {upper boundary}), then such values caused a
cropping width or height of 0 to be passed to the libjpeg API. In the
case of the width, this was innocuous, because jpeg_crop_scanline()
rejected the value. In the case of the height, however, this caused
unexpected and hard-to-diagnose errors farther down the pipeline.
jcopy_markers_execute() has historically ignored its option argument,
which is OK for jpegtran, but tj*Transform() needs to be able to save a
set of markers from the source image and write a subset of those markers
to each destination image. Without that ability, the function
effectively behaved as if TJ*OPT_COPYNONE was not specified unless all
transforms specified it.
These images were only used with tjbenchtest and tjexampletest, but I
was concerned about introducing additional licensing provisions into the
libjpeg-turbo source tree. Thus, this commit replaces them with images
that I own and can thus make available under the libjpeg-turbo licenses
with no additional provisions.
Put all general functions at the top of the list, and ensure that all
functions are defined before they are mentioned. Also consistify the
function ordering between turbojpeg.h and turbojpeg.c
Extending the Bayer matrix past Column 80 seems like a lesser
readability sin than not putting a space after each comma, especially
since we had to explicitly whitelist the file in checkstyle.
- Eliminate unnecessary "www."
- Use HTTPS.
- Update Java, MSYS, tdm-gcc, and NSIS URLs.
- Update URL and title of Agner Fog's assembly language optimization
manual.
- Remove extraneous information about MASM and Borland Turbo Assembler
and outdated NASM URLs from the x86 assembly headers, and mention
Yasm.
Lossless cropping is performed after other lossless transform
operations, so the cropping region must be specified relative to the
destination image dimensions and level of chrominance subsampling, not
the source image dimensions and level of chrominance subsampling.
More specifically, if the lossless transform operation swaps the X and Y
axes, or if the image is converted to grayscale, then that changes the
cropping region requirements.
The JPEG-1 spec never uses the term "MCU block". That term is rarely
used in other literature to describe the equivalent of an MCU in an
interleaved JPEG image, but the libjpeg documentation uses "iMCU" to
describe the same thing. "iMCU" is a better term, since the equivalent
of an interleaved MCU can contain multiple DCT blocks (or samples in
lossless mode) that are only grouped together if the image is
interleaved.
In the case of restart markers, "MCU block" was used in the libjpeg
documentation instead of "MCU", but "MCU" is more accurate and less
confusing. (The restart interval is literally in MCUs, where one MCU
is one data unit in a non-interleaved JPEG image and multiple data units
in a multi-component interleaved JPEG image.)
In the case of 9b704f96b2, the issue was
actually with progressive JPEG images exactly two DCT blocks wide, not
two MCU blocks wide.
This commit also defines "MCU" and "MCU row" in the description of the
various restart marker options/parameters. Although an MCU row is
technically always a row of samples in lossless mode, "sample row" was
confusing, since it is used in other places to describe a row of samples
for a single component (whereas an MCU row in a typical lossless JPEG
image consists of a row of interleaved samples for all components.)
TJ*PARAM_RESTARTBLOCKS technically works with lossless compression, but
it is not useful, since the value must be equal to the number of samples
in a row. (In other words, it is no different than
TJ*PARAM_RESTARTINROWS, except that it requires the user to do more
math.)
For reasons I can't recall, fc01f4673b
(the TurboJPEG 3 API overhaul) changed the C version of TJBench, but not
the Java version, so that it requires an exact scaling factor match (as
opposed to allowing unreduced scaling factors, as djpeg does and prior
versions of TJBench did.) That might have been temporary testing code
that was accidentally committed.
Because the crop spec was parsed using unsigned 32-bit integers,
negative numbers were interpreted as values ~= UINT_MAX (4,294,967,295).
This had the following ramifications:
- If the cropping region width was negative and the adjusted width + the
adjusted left boundary was greater than 0, then the 32-bit unsigned
integer bounds checks in djpeg and jpeg_crop_scanline() overflowed and
failed to detect the out-of-bounds width, jpeg_crop_scanline() set
cinfo->output_width to a value ~= UINT_MAX, and a buffer overrun and
subsequent segfault occurred in the upsampling or color conversion
routine. The segfault occurred in the body of
jpeg_skip_scanlines() --> read_and_discard_scanlines() if the cropping
region upper boundary was greater than 0 and the JPEG image used
chrominance subsampling and in the body of jpeg_read_scanlines()
otherwise.
- If the cropping region width was negative and the adjusted width + the
adjusted left boundary was 0, then a zero-width output image was
generated.
- If the cropping region left boundary was negative, then an output
image with bogus data was generated.
This commit modifies djpeg and jpeg_crop_scanline() so that the
aforementioned bounds checks use 64-bit unsigned integers, thus guarding
against overflow. It similarly modifies jpeg_skip_scanlines(). In the
case of jpeg_skip_scanlines(), the issue was not reproducible with
djpeg, but passing a negative number of lines to jpeg_skip_scanlines()
caused a similar overflow if the number of lines +
cinfo->output_scanline was greater than 0. That caused
jpeg_skip_scanlines() to read past the end of the JPEG image, throw a
warning ("Corrupt JPEG data: premature end of data segment"), and fail
to return unless warnings were treated as fatal. Also, djpeg now parses
the crop spec using signed integers and checks for negative values.
Due to an oversight in the multi-precision feature,
TJCompressor.srcBuf12 and TJCompressor.srcBuf16 were not set to null
in TJCompressor.setSourceImage(YUVImage) or
TJCompressor.setSourceImage(BufferedImage, ...). Thus, if an
application set a 12-bit or 16-bit packed-pixel buffer as the source
image then set a BufferedImage with integer pixels as the source image,
TJCompress.compress() would compress from the 12-bit or 16-bit
packed-pixel buffer instead of the BufferedImage. The odds of an
application actually doing that are very slim, however.
There are two approaches to handling abbreviated command-line options:
1. If a new option is introduced that begins with the same letters as an
existing option, require a longer abbreviation for the existing option
in order to ensure that abbreviations are always unique.
2. Require a unique abbreviation only for new options, and match all
non-unique abbreviations with existing options, thus maintaining
backward compatibility.
keymatch() supports either approach, and Tom Lane historically seemed to
prefer Approach 2, whereas both approaches have been applied
inconsistently in the years since. This commit consistently applies
Approach 2.
More specific notes:
We unnecessarily required 'cjpeg -progressive' to be abbreviated as
'cjpeg -pro' rather than 'cjpeg -p' when the -precision option was
introduced in libjpeg-turbo 3.0 beta.
The IJG unnecessarily required 'cjpeg -scans' to be abbreviated as
'cjpeg -scan' rather than 'cjpeg -sc' when the -scale option was
introduced in jpeg-7. We even more unnecessarily adopted that
requirement, even though we never adopted the -scale option.
We unnecessarily required 'djpeg -scale' to be abbreviated as
'djpeg -sc' rather than 'djpeg -s' when the -skip option was introduced
in libjpeg-turbo 1.5 beta.
The IJG unnecessarily required 'jpegtran -copy' to be abbreviated as
'jpegtran -co' rather than 'jpegtran -c' when the -crop option was
introduced in jpeg-7.
The IJG unnecessarily required 'jpegtran -progressive' to be abbreviated
as 'jpegtran -pr' rather than 'jpegtran -p' when the -perfect option was
introduced in jpeg-7.
We need to use tj3Alloc() (which, when ZERO_BUFFERS is defined, calls
calloc() instead of malloc()) to allocate all destination buffers.
Otherwise, if the compression/decompression/transform operation fails,
then the buffer checksum (which is computed to prevent the compiler from
optimizing out the whole test, since the destination buffer is never
used otherwise) will depend upon values in the destination buffer that
were never written, and MSan will complain.
- "Optimized baseline entropy coding" = "Huffman table optimization"
"Optimized baseline entropy coding" was meant to emphasize that the
feature is only useful when generating baseline (single-scan lossy
8-bit-per-sample Huffman-coded) JPEG images, because it is
automatically enabled when generating Huffman-coded progressive
(multi-scan), 12-bit-per-sample, and lossless JPEG images. However,
Huffman table optimization isn't actually an integral part of those
non-baseline modes. You can forego Huffman table optimization with
12-bit data precision if you supply your own Huffman tables. The spec
doesn't require it with progressive or lossless mode, either, although
our implementation does. Furthermore, "baseline" describes more than
just the type of entropy coding used. It was incorrect to say that
optimized "baseline" entropy coding is automatically enabled for
Huffman-coded progressive, 12-bit-per-sample, and lossless JPEG
images, since those are clearly not baseline images.
- "Progressive entropy coding" = "Progressive JPEG"
"Progressive" describes more than just the type of entropy coding
used. (In fact, both Huffman-coded and arithmetic-coded images can be
progressive.)
- Mention that TJPARAM_OPTIMIZE/TJ.PARAM_OPTIMIZE can be used with
lossless transformation as well.
- General wordsmithing
- Formatting tweaks
Referring to
https://sourceforge.net/p/libjpeg-turbo/bugs/48,
https://sourceforge.net/p/libjpeg-turbo/bugs/82,
#15, #238, #253, and #619,
valgrind and MSan have failed to properly detect data initialization by
libjpeg-turbo's x86 SIMD extensions for the entire 14 years that
libjpeg-turbo has been a project, resulting in false positives unless
libjpeg-turbo is built with WITH_SIMD=0 or run with JSIMD_FORCENONE=1.
This commit introduces a new C preprocessor macro (ZERO_BUFFERS) that,
if set, causes libjpeg-turbo to zero certain buffers in order to work
around the specific valgrind/MSan test failures caused by the
aforementioned false positives. This allows us to more closely
approximate the production configuration of libjpeg-turbo when testing
with valgrind or MSan.
Closes#781
The dimensions in the PPM header of the output file generated by
'example decompress' were always 640 x 480, regardless of the size of
the JPEG image being decompressed. Our regression tests (which this
commit also fixes) missed the bug because they decompressed the
640 x 480 image generated by 'example compress'.
Fixes#778
(Regression introduced by 7bb958b732)
Because of 7bb958b732, the TurboJPEG
compression and encoding functions no longer transfer the value of
TJPARAM_OPTIMIZE into cinfo->data_precision unless the data precision
is 8. The intent of that was to prevent using_std_huff_tables() from
being called more than once when reusing the same compressor object to
generate multiple 12-bit-per-sample JPEG images. However, because
cinfo->optimize_coding is always set to TRUE by jpeg_set_defaults() if
the data precision is 12, calling applications that use 12-bit data
precision had to unset cinfo->optimize_coding if they set
cinfo->arith_code after calling jpeg_set_defaults(). Because of
7bb958b732, the TurboJPEG API stopped
doing that except with 8-bit data precision. Thus, attempting to
generate a 12-bit-per-sample arithmetic-coded lossy JPEG image using
the TurboJPEG API failed with "Requested features are incompatible."
Since the compressor will always fail if cinfo->arith_code and
cinfo->optimize_coding are both set, and since cinfo->optimize_coding
has no relevance for arithmetic coding, the most robust and user-proof
solution is for jinit_c_master_control() to set cinfo->optimize_coding
to FALSE if cinfo->arith_code is TRUE.
This commit also:
- modifies TJBench so that it no longer reports that it is using
optimized baseline entropy coding in modes where that setting
is irrelevant,
- amends the cjpeg documentation to clarify that -optimize is implied
when specifying -progressive or '-precision 12' without -arithmetic,
and
- prevents jpeg_set_defaults() from uselessly checking the value of
cinfo->arith_code immediately after it has been set to FALSE.
jpeg_enable_lossless() checks the point transform value against the data
precision, so we need to defer calling jpeg_enable_lossless() until
after all command-line options have been parsed.
This gives the user a hint as to whether converting the input file into
a JPEG file with a particular data precision will result in a loss of
precision. Also modify usage.txt and the cjpeg man page to warn the
user about that possibility.
- "bits per component" = "bits per sample"
Describing the data precision of a JPEG image using "bits per
component" is technically correct, but "bits per sample" is the
terminology that the JPEG-1 spec uses. Also, "bits per component" is
more commonly used to describe the precision of packed-pixel formats
(as opposed to "bits per pixel") rather than planar formats, in which
all components are grouped together.
- Unmention legacy display technologies. Colormapped and monochrome
displays aren't a thing anymore, and even when they were still a
thing, it was possible to display full-color images to them. In 1991,
when JPEG decompression time was measured in minutes per megapixel, it
made sense to keep a decompressed copy of JPEG images on disk, in a
format that could be displayed without further color conversion (since
color conversion was slow and memory-intensive.) In 2024, JPEG
decompression time is measured in milliseconds per megapixel, and
color conversion is even faster. Thus, JPEG images can be
decompressed, displayed, and color-converted (if necessary) "on the
fly" at speeds too fast for human vision to perceive. (In fact, your
TV performs much more complicated decompression algorithms at least 60
times per second.)
- Document that color quantization (and associated features), GIF
input/output, Targa input/output, and OS/2 BMP input/output are legacy
features. Legacy status doesn't necessarily mean that the features
are deprecated. Rather, it is meant to discourage users from using
features that may be of little or no benefit on modern machines (such
as low-quality modes that had significant performance advantages in
the early 1990s but no longer do) and that are maintained on a
break/fix basis only.
- General wordsmithing, grammar/punctuation policing, and formatting
tweaks
- Clarify which data precisions each cjpeg input format and each djpeg
output format supports.
- cjpeg.1: Remove unnecessary and impolitic statement about the -targa
switch.
- Adjust or remove performance claims to reflect the fact that:
* On modern machines, the djpeg "-fast" switch has a negligible effect
on performance.
* There is a measurable difference between the performance of Floyd-
Steinberg dithering and no dithering, but it is not likely
perceptible to most users.
* There is a measurable difference between the performance of 1-pass
and 2-pass color quantization, but it is not likely perceptible to
most users.
* There is a measurable difference between the performance of
full-color and grayscale output when decompressing a full-color JPEG
image, but it is not likely perceptible to most users.
* IDCT scaling does not necessarily improve performance. (It
generally does if the scaling factor is <= 1/2 and generally doesn't
if the scaling factor is > 1/2, at least on my machine. The
performance claim made in jpeg-6b was probably invalidated when we
merged the additional scaling factors from jpeg-7.)
- Clarify which djpeg switches/output formats cannot be used when
decompressing lossless JPEG images.
- Remove djpeg hints, since those involve quality vs. speed tradeoffs
that are no longer relevant for modern machines.
- Remove documentation regarding using color quantization with 16-bit
data precision. (Color quantization requires lossy mode.)
- Java: Fix typos in TJDecompressor.decompress12() and
TJDecompressor.decompress16() documentation.
- jpegtran.1: Fix truncated paragraph
In a man page, a single quote at the start of a line is interpreted as
a macro.
Closes#775
- libjpeg.txt:
* Mention J16SAMPLE data type (oversight.)
* Remove statement about extending jdcolor.c. (libjpeg-turbo is not
quite as DIY as libjpeg once was.)
* Remove paragraph about tweaking the various typedefs in jmorecfg.h.
It is no longer relevant for modern machines.
* Remove caveat regarding systems with ints less than 16 bits wide.
(ANSI/ISO C requires an int to be at least 16 bits wide, and
libjpeg-turbo has never supported non-ANSI compilers.)
- usage.txt:
* Add copyright header.
* Document cjpeg -icc, -memdst, -report, -strict, and -version
switches.
* Document djpeg -icc, -maxscans, -memsrc, -report, -skip, -crop,
-strict, and -version switches.
* Document jpegtran -icc, -maxscans, -report, -strict, and -version
switches.
The 8-bit-per-sample image I/O modules have always been built with BMP,
GIF, PBMPLUS, and Targa support, and the 12-bit-per-sample image I/O
modules have always been built with only GIF and PBMPLUS support. In
libjpeg-turbo 2.1.x and prior, cjpeg and djpeg were built with the same
image format support as the image I/O modules. However, because of the
multi-precision feature introduced in libjpeg-turbo 3.0.x, cjpeg and
djpeg are now always built with support for all image formats. Thus,
the error message table compiled into cjpeg and djpeg was a superset of
the error message table compiled into the 12-bit-per-sample and
16-bit-per-sample image I/O modules. If, for example, the
12-bit-per-sample PPM writer threw JERR_PPM_COLORSPACE (== 15, because
it was built with only GIF and PBMPLUS support), then djpeg interpreted
that as JERR_GIF_COLORSPACE (== 15, because it was built with support
for all image formats.) There was no chance of a buffer overrun, since
the error message table lookup was performed against the table compiled
into cjpeg and djpeg, which contained all possible entries. However,
this issue caused an incorrect error message to be displayed by cjpeg or
djpeg if a 12-bit-per-sample or 16-bit-per-sample image I/O module threw
an error. This commit simply removes the *_SUPPORTED #ifdefs from
cderror.h so that all image I/O error messages are always included in
every instance of the image I/O error message table.
Note that a similar issue could have also occurred with the
12-bit-per-sample and 16-bit-per-sample TurboJPEG image I/O functions,
since the 12-bit-per-sample and 16-bit-per-sample image I/O modules
supporting those functions are built with only PBMPLUS support whereas
the library as a whole is built with BMP and PBMPLUS support.
Referring to https://bugzilla.mozilla.org/show_bug.cgi?id=1898606,
attempting to decompress a specially-crafted malformed JPEG image
(specifically an image with a complete 12-bit Start Of Frame segment
followed by an incomplete 8-bit Start Of Frame segment) using the
default marker processor, buffered-image mode, and input prefetching
triggered the following sequence of events:
- When the 12-bit SOF segment was encountered (in the body of
jpeg_read_header()), the marker processor's read_markers() method
called the get_sof() function, which processed the 12-bit SOF segment
and set cinfo->data_precision to 12.
- If the application subsequently called jpeg_consume_input() in a loop
to prefetch input data, and it didn't stop calling
jpeg_consume_input() when the function returned JPEG_REACHED_SOS, then
the 8-bit SOF segment was encountered in the body of
jpeg_consume_input(). As a result, the marker processor's
read_markers() method called get_sof(), which started to process the
8-bit SOF segment and set cinfo->data_precision to 8.
- Since the 8-bit SOF segment was incomplete, the end of the JPEG data
stream was encountered when get_sof() attempted to read the image
height, width, and number of components.
- If the fill_input_buffer() method in the application's custom source
manager incorrectly returned FALSE in response to a prematurely-
terminated JPEG data stream, then get_sof() returned FALSE while
attempting to read the image height, width, and number of components
(before the duplicate SOF check was reached.) That caused the default
marker processor's read_markers() method, and subsequently
jpeg_consume_input(), to return JPEG_SUSPENDED.
- If the application failed to respond to the JPEG_SUSPENDED return
value and subsequently attempted to call jpeg_read_scanlines(),
then the data precision check in jpeg_read_scanlines() succeeded
(because cinfo->data_precision was now 8.) However, because
cinfo->data_precision had been 12 during the previous call to
jpeg_start_decompress(), only the 12-bit version of the main
controller was initialized, and the cinfo->main->process_data() method
was undefined. Thus, a segfault occurred when jpeg_read_scanlines()
attempted to invoke that method.
Scenarios in which the issue was thwarted:
1. The default source managers handle a prematurely-terminated JPEG data
stream by inserting a fake EOI marker into the data stream. Thus, when
using one of those source managers, the INPUT_2BYTES() and INPUT_BYTE()
macros (which get_sof() invokes to read the image height, width, and
number of components) succeeded-- albeit with bogus data, since the fake
EOI marker was read into those fields. The duplicate SOF check in
get_sof() then failed, generating a fatal libjpeg error.
2. When using a custom source manager that correctly returns TRUE in
response to a prematurely-terminated JPEG data stream, the
aforementioned INPUT_2BYTES() and INPUT_BYTE() macros also succeeded
(albeit with bogus data read from the previous bytes of the data
stream), and the duplicate SOF check failed.
3. If the application did not prefetch input data, or if it stopped
invoking jpeg_consume_input() when the function returned
JPEG_REACHED_SOS, then the duplicate SOF segment was not read prior to
the first call to jpeg_read_scanlines(). Thus, the data precision check
in jpeg_read_scanlines() failed. If the application instead called
jpeg12_read_scanlines() (that is, if it properly supported multiple data
precisions), then the duplicate SOF segment was not read until the body
of jpeg_finish_decompress(). At that point, its only negative effect
was to cause jpeg_finish_decompress() to return FALSE before the
duplicate SOF check was reached.
In other words, this issue depended not only upon an incorrectly-written
source manager but also upon a very specific sequence of API calls. It
also depended upon the multi-precision feature introduced in
libjpeg-turbo 3.0.x. When using an 8-bit-per-sample build of
libjpeg-turbo 2.1.x, jpeg_read_header() failed with "Unsupported JPEG
data precision 12" after the 12-bit SOF segment was processed. When
using a 12-bit-per-sample build of libjpeg-turbo 2.1.x, the behavior
was the same as if the application called jpeg12_read_scanlines() in
Scenario 3 above.
This commit simply moves the duplicate SOF check to the top of
get_sof() so the check will fail before the marker processor attempts to
read the duplicate SOF. It should be noted that this issue isn't a
libjpeg-turbo bug per se, because it occurs only when the calling
application does something it shouldn't. It is, rather, an issue of API
hardening/caller-proofing.
jpeg_std_message_table[] was never a documented part of the libjpeg API,
nor was it exposed in jpegint.h or the Windows libjpeg API DLL. Thus,
it was never a good idea (nor was it even remotely necessary) for
applications to use it. However, referring to #763 and #767, one
application (RawTherapee) did use it.
34c055851e hid the symbol, which broke the
Un*x builds of RawTherapee. (RawTherapee already did the right thing on
Windows, because jpeg_std_message_table[] was not exposed in the
Windows libjpeg API DLL. Referring to
https://github.com/Beep6581/RawTherapee/issues/7074, RawTherapee now
does the right thing on Un*x platforms as well.)
The comment in jerror.c indicated that Tom Lane intended the symbol to
be external "just in case any applications want to refer to it
directly." However, with respect to libjpeg-turbo, the comment was
effectively already a lie on Windows. My choices at this point are:
1. Revert 34c055851e and start exposing
the symbol on Windows, thus making the comment true again.
2. Remove the comment.
Option 1 would have created whiplash, since 3.0.3 has already been
released with the symbol hidden, and that option would have further
encouraged ill-advised behavior on the part of applications. Since the
issue has already been fixed in RawTherapee, and since it is known to
affect only that application, I chose Option 2.
In my professional opinion, a code comment does not rise to the level
of a contract between a library and a calling application. In other
words, the comment did not make the symbol part of the API, even though
the symbol was exposed on some platforms. Applications that use
internal symbols (even the symbols defined in jpegint.h) do so at their
own risk. There is no guarantee that those symbols will remain
unchanged from release to release, as it is sometimes necessary to
modify the internal (opaque) structures and methods in order to
implement new features or even fix bugs. Some implementations of the
libjpeg API (such as the one in Jpegli, for instance), do not provide
those internal symbols at all. Best practices are for applications that
use the internal libjpeg-turbo symbols to provide their own copy of
libjpeg-turbo (for instance, via static linking), so they can manage any
unexpected changes in those symbols. (In fact, that is how libjpeg was
originally used. Application developers built it from source with
compile-time options to exclude unneeded functionality. Thus, no two
builds of libjpeg were guaranteed to be API/ABI-compatible. The idea of
a libjpeg shared library that exposes a pseudo-standard ABI came later,
and that has always been an awkward fit for an API that was intended to
be modified at compile time to suit specific application needs.
libjpeg-turbo's colorspace extensions are but one example, among many,
of our attempts to reconcile that awkwardness while preserving backward
compatibility with the aforementioned pseudo-standard ABI.)
libjpeg.txt documents the need to "include system headers that define at
least the typedefs FILE and size_t" before including jpeglib.h.
However, some software developers unfortunately assume that any
downstream build failure is due to an upstream oversight (as if such an
oversight would have gone unnoticed for 14 years in a library as
ubiquitous as libjpeg-turbo) and "come in hot" with a proposal and
arguments that have already been carefully considered and rejected
multiple times (as opposed to grepping the documentation or searching
existing issues and PRs to find out whether the upstream behavior is
intentional.)
Multiple issues and PRs (including #17, #18, #116, #737, and #766) have
been filed regarding this, so this commit adds a comment to jpeglib.h in
the hope of heading off future issues and PRs. I suspect that some
people will still choose to waste my time by arguing about it, even
though the decision was made by Tom Lane nearly 20 years before
libjpeg-turbo even existed, the decision was supportable based on the
needs of early 1990s computing systems, and reversing that decision
would break backward API compatibility (which is one of the reasons that
many downstream projects adopted libjpeg-turbo in the first place.) At
least now, if someone posts an issue or PR about this again, I can just
link to this commit and close the issue/PR without comment.
If the calling application invokes jpeg_save_markers() to save a
particular type of marker, then the save_marker() function will be
invoked for every marker of that type that is encountered. Previously,
only the head of the marker linked list was stored (in
jpeg_decompress_struct), so save_marker() had to traverse the entire
linked list before it could add a new marker to the tail of the list.
That caused CPU usage to grow exponentially with the number of markers.
Referring to #764, it is possible to create a JPEG image that contains
an excessive number of markers. The specific reproducer that uncovered
this issue is a specially-crafted 1-megabyte malformed JPEG image with
tens of thousands of APP1 markers, which required approximately 30
seconds of CPU time (on a modern Intel processor) to process. However,
it should also be possible to create a legitimate JPEG image that
reproduces the issue (such as an image with tens of thousands of
duplicate EXIF tags.)
This commit introduces a new pointer (in jpeg_decomp_master, in order to
preserve backward ABI compatibility) that is used to store the tail of
the marker linked list whenever a marker is added to it. Thus, it is no
longer necessary to traverse the list when adding a marker, and CPU
usage will grow linearly rather than exponentially with the number of
markers.
Fixes#764
If an error manager instance is passed to jpeg_std_error(), then its
format_message() method will point to the format_message() function in
jerror.c. The format_message() function passes all eight values from
the jpeg_error_mgr::msg_parm.i[] array as arguments to
snprintf()/_snprintf_s(), even if the format string doesn't use all of
those values. Subsequently invoking one of the ERREXIT[1-6]() macros
will leave the unused values uninitialized, and if the
-fsanitize-memory-param-retval option (introduced in Clang 14) is
enabled (which it is by default in Clang 16 and later), then MSan will
complain when the format_message() function tries to pass the
uninitialized-but-unused values as function arguments.
This commit modifies jpeg_std_error() so that it zeroes out the error
manager instance passed to it, thus working around the warning as well
as simplifying the code.
Closes#761
(Regression introduced by 24e09baaf0)
The install() INCLUDES option is not an artifact option. It specifies
a list of directories that will be added to the
INTERFACE_INCLUDE_DIRECTORIES target property when the target is
exported using the install() EXPORT option, which occurs when CMake
package config files are generated. Specifying 'COMPONENT include' with
the install() INCLUDES option caused the INTERFACE_INCLUDE_DIRECTORIES
properties in our CMake package config files to contain
'${_IMPORT_PREFIX}/COMPONENT', which caused errors of the form 'Imported
target "libjpeg-turbo::XXXX" includes non-existent path' when downstream
build systems attempted to include libjpeg-turbo using find_package().
Fixes#759Closes#760
This makes it possible for downstream packagers and other integrators of
libjpeg-turbo to include only specific directories from the
libjpeg-turbo installation (or to install specific directories under a
different prefix, etc.) The names of the components correspond to the
directories into which they will be installed.
Refer to libvips/libvips#3931, #265, #338Closes#756
It is possible to craft a malformed JPEG image in which all of the
scans contain fewer components than the number of components specified
in the Start Of Frame (SOF) segment. Attempting to use such an image as
either an input image or a drop image with 'jpegtran -drop' caused a
NULL dereference and subsequent segfault in transupp.c:adjust_quant(),
so this commit adds appropriate checks to guard against that.
Since the issue involved an interface that is only exposed on the
jpegtran command line, it did not represent a security risk.
'jpegtran -drop' could not ever be used successfully with images such as
the ones described above. This commit simply makes jpegtran fail
gracefully rather than crash.
Fixes#758
Referring to actions/runner-images#9491, the sanitizers in LLVM 14 that
ships with Ubuntu 22.04 are incompatible with high-entropy address space
layout randomization (ASLR), which is enabled in the GitHub runners via
their use of a newer kernel than ubuntu 22.04 uses.
Because of 1644bdb7d2, we are now
effectively using the NEW behavior for all CMake policies introduced in
all CMake versions up to and including CMake 3.28. The NEW behavior for
CMP0025, introduced in CMake 3.0, sets CMAKE_C_COMPILER_ID to
"AppleClang" instead of "Clang" when using Apple's variant of Clang (in
Xcode), so we need to match all values of CMAKE_C_COMPILER_ID that
contain "Clang".
This fixes three issues:
- -O2 was not replaced with -O3 in CMAKE_C_FLAGS_RELWITHDEBINFO. This
was a minor issue, since -O3 is now the default in
CMAKE_C_FLAGS_RELEASE, and we use CMAKE_BUILD_TYPE=Release in our
official builds.
- The build system erroneously set the default value of FLOATTEST8 and
FLOATTEST12 to no-fp-contract when compiling for PowerPC or Arm using
Apple Clang 14+ (effectively reverting
5b2beb4bc4f41dd9dd2a905cb931b8d5054d909b.) Because Clang 14+ now
enables -ffp-contract=on by default, this issue caused floating point
test failures unless FLOATTEST8 and FLOATTEST12 were overridden.
- The build system set MD5_PPM_3x2_FLOAT_FP_CONTRACT as appropriate for
GCC, not as appropriate for Clang (effectively reverting
47656a082091f9c9efda054674522513f4768c6c.) This also caused floating
point test failures.
Fixes#753Closes#755
Several TurboJPEG functions store their return value in an unsigned
long long intermediate and compare it against the maximum value of
unsigned long or size_t in order to avoid integer overflow. However,
such comparisons are tautological (always true, i.e. redundant) unless
the size of unsigned long or size_t is less than the size of unsigned
long long. Explicitly guarding the comparisons with #if avoids compiler
warnings with -Wtautological-constant-in-range-compare in Clang and also
makes it clear to the reader that the comparisons are only intended for
32-bit code.
Refer to #752
(regression introduced by e8b40f3c2b)
The documented behavior of the libjpeg API is to compute optimal Huffman
tables when generating 12-bit lossy Huffman-coded JPEG images, unless
the calling application supplies its own Huffman tables. However,
e8b40f3c2b and
96bc40c1b3 modified
jinit_c_master_control() so that it always set cinfo->optimize_coding to
TRUE when generarating 12-bit lossy Huffman-coded JPEG images, which
prevented calling applications from supplying custom Huffman tables for
such images.
This commit modifies jinit_c_master_control() so that it only overrides
cinfo->optimize_coding when generating 12-bit lossy Huffman-coded JPEG
images if all Huffman table slots are empty or all slots contain default
Huffman tables. Determining whether the latter is true requires using
memcmp() to compare the allocated Huffman tables with the default
Huffman tables, because:
- The documented behavior of jpeg_set_defaults() is to initialize any
empty Huffman table slot with the default Huffman table corresponding
to that slot, regardless of the data precision. There is also no
requirement that the data precision be specified prior to calling
jpeg_set_defaults(). Thus, there is no reliable way to prevent
jpeg_set_defaults() from initializing empty Huffman table slots with
default Huffman tables, which are useless for 12-bit data precision.
- There is no requirement that custom Huffman tables be defined prior to
calling jpeg_set_defaults(). A calling application could call
jpeg_set_defaults() and modify the values in the default Huffman
tables rather than allocating new tables. Thus, there is no reliable
way to detect whether the allocated Huffman tables contain default
values without comparing the tables with the default Huffman tables.
Fortunately, comparing the allocated Huffman tables with the default
Huffman tables is the last stop on the logic train, so it won't happen
unless cinfo->data_precision == 12, cinfo->arith_code == FALSE,
cinfo->optimize_coding == FALSE, and one or more Huffman tables are
allocated. (If the compressor object is reused, this ensures that the
full comparison will be performed at most once.) Custom Huffman tables
will be flagged as non-default when the first non-default value is
encountered, and the worst case (comparing 400 bytes) is very fast on
modern CPUs anyhow.
Fixes#751
- Detect at configure time, via the __CET__ C preprocessor macro,
whether the C compiler will include either indirect branch tracking
(IBT) or shadow stack support, and define a NASM macro (__CET__) if
so.
- Modify the x86-64 SIMD code so that it includes appropriate endbr64
instructions (to support IBT) and an appropriate .note.gnu.property
section (to support both IBT and shadow stack) when __CET__ is
defined.
Closes#350
While QEMU will run executables from the current working directory,
other emulators may not. It is more reliable to pass the full
executable path to the emulator. The add_test(NAME ... COMMAND ...)
syntax automatically invokes the emulator (e.g. the command specified
in CMAKE_CROSSCOMPILING_EMULATOR) and passes the full executable path to
it, as long as the first COMMAND argument is the name of a target. This
cleans up the CMake code somewhat as well, since it is no longer
necessary to manually invoke CMAKE_CROSSCOMPILING_EMULATOR.
Closes#747
Disclaimer: I am not a lawyer, nor do I play one on TV.
Referring to #744, mentioning the zlib License as a license that applies
to libjpeg-turbo is confusing, and it isn't actually necessary, since
the IJG License subsumes the terms of the zlib License in the context of
the libjpeg API library and associated programs. This was presumably
understood to be the case by Miyasaka-san when he chose the zlib License
for the first libjpeg SIMD extensions. The libjpeg/SIMD web site
(https://cetus.sakura.ne.jp/softlab/jpeg-x86simd/jpegsimd.html) states
(translated from Japanese): "The terms of use of this SIMD enhanced
version of IJG JPEG software are subject to the terms of use of the
original version of IJG JPEG software."
Detailed analysis of the zlib License terms:
This software is provided 'as-is', without any express or implied
warranty. In no event will the authors be held liable for any
damages arising from the use of this software.
This text is an almost literal subset of the warranty disclaimer text in
the IJG License. The IJG License states everything above with only
slight differences in wording, and it further clarifies that the user
assumes all risk as to the software's quality and accuracy and that
vendors of commercial products based on the software must assume all
warranty and liability claims.
Permission is granted to anyone to use this software for any
purpose, including commercial applications, and to alter it and
redistribute it freely, subject to the following restrictions:
This is semantically the same as the permission text in the IJG License,
since "use, copy, modify, and distribute this software (or portions
thereof) for any purpose, without fee" covers "use" for "any purpose,
including commercial applications" as well as alteration and
redistribution.
1. The origin of this software must not be misrepresented; you must
not claim that you wrote the original software. If you use this
software in a product, an acknowledgment in the product
documentation would be appreciated but is not required.
The IJG License requirement that "If any part of the source code for
this software is distributed, then this README file must be included,
with this copyright and no-warranty notice unaltered; and any additions,
deletions, or changes to the original files must be clearly indicated in
accompanying documentation" (Clause 1), as well as the requirement that
"If only executable code is distributed, then the accompanying
documentation must state that 'this software is based in part on the
work of the Independent JPEG Group'" (Clause 2), satisfies the
requirement of Clause 1 of the zlib License.
2. Altered source versions must be plainly marked as such, and must
not be misrepresented as being the original software.
Since Clause 1 of the IJG License applies only to the distribution of
source code, the copyright headers in the source code are effectively
"accompanying documentation" in that case. This is why we ensure that
the copyright headers of individual source files indicate the year(s) in
which modifications were made by each contributor. Doing so satisfies
the requirements of both Clause 2 of the zlib License and Clause 1 of
the IJG License.
3. This notice may not be removed or altered from any source
distribution.
Clauses 2 and 3 of the zlib License apply only to the source code that
bears that license. Thus, as applied to the software as a whole, those
requirements of the inbound zlib License are compatible with the
outbound IJG License as long as the IJG License does not contradict
them (which it doesn't.)
NOTE: To be clear, existing source code that bears the zlib License
cannot literally be re-licensed under the IJG License, since that would
violate Clause 3 of the zlib License. However, when considering the
terms under which the overall library is made available, the IJG License
effectively subsumes the terms of the zlib License.
https://www.gnu.org/licenses/license-compatibility.en.html is a
thorough, albeit somewhat GPL-biased, discussion of license
compatibility.
(regression introduced by 1644bdb7d2)
Setting a maximum version in cmake_minimum_required() effectively sets
the behavior to NEW for all policies introduced in all CMake versions up
to and including that maximum version. The NEW behavior for CMP0091,
introduced in CMake 3.15, uses CMake variables to specify the MSVC
runtime library against which to link, rather than placing the relevant
flags in CMAKE_C_FLAGS*. Thus, replacing /MD with /MT in CMAKE_C_FLAGS*
no longer has any effect when using CMake 3.15+.
- Integrate
c07bba2730 (diff-1e2deb5301e9481203102fcddd1b2d0d2bf0ddc1cbb445c7f4b6414a3b869ce8)
so that the default man directory is <CMAKE_INSTALL_PREFIX>/share/man
on FreeBSD systems.
- For good measure, integrate
f835f189ae
so that the default info directory is
<CMAKE_INSTALL_PREFIX>/share/info on FreeBSD systems, even though we
don't use that directory.
- Automatically set the CMake variable type to PATH for any
user-specified CMAKE_INSTALL_*DIR variables.
Addresses concerns raised in #326, #346, #648Closes#648
libjpeg-turbo always includes Adobe APP14 markers in the lossless JPEG
images that it generates, but some compressors (e.g. accusoft PICTools
Medical) do not.
Fixes#743
ef9a4e05ba (libjpeg-turbo 1.4.x), which
was based on
https://bug815473.bmoattachments.org/attachment.cgi?id=692126
(https://bugzilla.mozilla.org/show_bug.cgi?id=815473), modified the C
baseline Huffman encoder so that it precomputes jpeg_nbits_table, in
order to facilitate sharing the table among multiple processes.
However, libjpeg-turbo never shared the table, and because the table was
implemented as a static array, f3a8684cd1
(libjpeg-turbo 1.5.x) and 37bae1a0e9
(libjpeg-turbo 2.0.x) each introduced a duplicate copy of the table for
(respectively) the SSE2 baseline Huffman encoder and the C progressive
Huffman encoder.
This commit does the following:
- Move the duplicated code in jchuff.c and jcphuff.c, originally
introduced in 0cfc4c17b7 and
37bae1a0e9, into a header
(jpeg_nbits.h).
- Credit the co-author of 0cfc4c17b7.
(Refer to https://sourceforge.net/p/libjpeg-turbo/patches/57).
- Modify the SSE2 baseline Huffman encoder so that the C Huffman
encoders can share its definition of jpeg_nbits_table.
- Move the definition of jpeg_nbits_table into a C source file
(jpeg_nbits.c) rather than a header, and define the table only if
USE_CLZ_INTRINSIC is undefined and the SSE2 baseline Huffman encoder
will not be built.
- Apply hidden symbol visibility to the shared definition of
jpeg_nbits_table, if the compiler supports the necessary attribute.
(In practice, only Visual C++ doesn't.)
Closes#114
See also:
https://bugzilla.mozilla.org/show_bug.cgi?id=1501523
- Clarify that lossless JPEG is slower than and doesn't compress as well
as lossy JPEG. (That should be obvious, because "lossy" literally
means that data is thrown away.)
- Re-generate TurboJPEG C API documentation using Doxygen 1.9.8.
- Clarify that setting the data_precision field in jpeg_compress_struct
to 16 requires lossless mode.
- Explain what the predictor selection value actually does. (Refer to
Recommendation ITU-T T.81 (1992) | ISO/IEC 10918-1:1994, Section
H.1.2.1.)
TJPARAM_MAXPIXELS was previously hidden and used only for fuzz testing,
but it is potentially useful for calling applications as well,
particularly if they want to guard against excessive memory consumption
by the tj3LoadImage*() functions. The parameter has also been extended
to decompression and lossless transformation functions/methods, mainly
as a convenience. (It was already possible for calling applications to
impose their own JPEG image size limits by reading the JPEG header prior
to decompressing or transforming the image.)
Unfortunately, most of the GitHub security advisories filed against
libjpeg-turbo thus far have been the result of non-exploitable API
abuses triggered by randomly-generated test programs and accompanied by
wild claims of denials of service with no demonstrable or even probable
exploit that might cause such a DoS (assuming a service even existed
that used the API in question.) Security advisories remain private
unless accepted, and I cannot accept them if they do not describe an
actual security issue. Thus, it's best to steer most users toward
regular bug reports.
Because of the TurboJPEG 3 API overhaul, the legacy decompression and
lossless transformation functions now wrap the new TurboJPEG 3
functions. For performance reasons, we don't want to read the JPEG
header more than once during the same operation, so the wrapped
functions do not read the header if it has already been read by a
wrapper function. Initially the TurboJPEG 3 functions used a state
variable to track whether the header had already been read, but
b94041390c made this more robust by using
the libjpeg global decompression state instead. If a wrapper function
has already read the JPEG header successfully, then the global
decompression state will be DSTATE_READY, and the logic introduced in
b94041390c will prevent the header from
being read again.
A subtle issue arises because tj3DecompressHeader() does not call
jpeg_abort_decompress() if jpeg_read_header() fails. (That is arguably
a bug, but it has existed since the very first implementation of the
function.) Depending on the nature of the failure, this can cause
tj3DecompressHeader() to return an error code and leave the libjpeg
global decompression state set to DSTATE_INHEADER. If a misbehaved
application ignored the error and subsequently called a TurboJPEG
decompression or lossless transformation function, then the function
would fail to read the JPEG header because the global decompression
state was greater than DSTATE_START. In the case of the decompression
functions, this was innocuous, because jpeg_calc_output_dimensions()
and jpeg_start_decompress() both sanity check the global decompression
state. However, it was possible for a misbehaved application to call
tj3DecompressHeader() with junk data, ignore the return value, and pass
the same junk data into tj3Transform(). Because tj3DecompressHeader()
left the global decompression state set to DSTATE_INHEADER,
tj3Transform() failed to detect the junk data (because it didn't try to
read the JPEG header), and it called jtransform_request_workspace() with
dinfo->image_width and dinfo->image_height still initialized to 0.
Because jtransform_request_workspace() does not sanity check the
decompression state, a division-by-zero error occurred with certain
combinations of transform options in which TJXOPT_TRIM or TJXOPT_CROP
was specified. However, it should be noted that TJXOPT_TRIM and
TJXOPT_CROP cannot be expected to work properly without foreknowledge of
the JPEG source image dimensions, which cannot be gained except by
calling tj3DecompressHeader() successfully. Thus, a calling application
is inviting trouble if it does not check the return value of
tj3DecompressHeader() and sanity check the JPEG source image dimensions
before calling tj3Transform(). This commit softens the failure, but the
failure is still due to improper API usage.
(regression introduced by 300a344d65)
300a344d65 fixed the recompression code
path but also broke the pure decompression code path, because the fix
caused the TurboJPEG decompression instance to be destroyed before
tj3SaveImage() could use it. Furthermore, the fix in
300a344d65 prevented pixel density
information from being transferred from the input image to the output
image when recompressing. This commit does the following:
- Modify tjexample.c so that a single TurboJPEG instance is initialized
for lossless transformation and shared by all code paths. In addition
to fixing both of the aforementioned issues, this makes the code more
readable.
- Extend tjexampletest to test the recompression code path, thus
ensuring that the issues fixed by this commit and
300a344d65 are not reintroduced.
- Modify tjexample.c to remove redundant fclose(), tj3Destroy(), and
tj3Free() calls.
This corresponds to max_memory_to_use in the jpeg_memory_mgr struct in
the libjpeg API, except that the TurboJPEG parameter is specified in
megabytes. Because this is 2023 and computers with less than 1 MB of
memory are not a thing (at least not within the scope of libjpeg-turbo
support), it isn't useful to allow a limit less than 1 MB to be
specified. Furthermore, because TurboJPEG parameters are signed
integers, if we allowed the memory limit to be specified in bytes, then
it would be impossible to specify a limit larger than 2 GB on 64-bit
machines. Because max_memory_to_use is a long signed integer,
effectively we can specify a limit of up to 2 petabytes on 64-bit
machines if the TurboJPEG parameter is specified in megabytes. (2 PB
should be enough for anybody, right?)
This commit also bumps the TurboJPEG API version to 3.0.1. Since the
TurboJPEG API version no longer tracks the libjpeg-turbo version, it
makes sense to increment the API revision number when adding constants,
to increment the minor version number when adding functions, and to
increment the major version number for a complete overhaul.
This commit also removes the vestigial TJ_NUMPARAM macro, which was
never defined because it proved unnecessary.
Partially implements #735
This is very subtle, but if a user specifies a libjpeg virtual array
memory limit via the JPEGMEM environment variable and one of the
tj3Compress*() functions hits that limit, the libjpeg error handler
will be invoked in jpeg_start_compress() (more specifically in
realize_virt_arrays() in jinit_compress_master()) before the libjpeg
global compression state can be incremented. Thus,
jpeg_abort_compress() will not be called before the tj3Compress*()
function exits, the unrealized virtual arrays will not be freed, and if
the TurboJPEG compression instance is reused, those unrealized virtual
arrays will count against the specified memory limit. This could cause
subsequent compression operations that require smaller virtual arrays
(or even no virtual arrays at all) to fail when they would otherwise
succeed. In reality, the vast majority of calling programs would abort
and free the TurboJPEG compression instance if one of the tj3Compress*()
functions failed, but TJBench is a rare exception. This issue does not
bear documenting because of its subtlety and rarity and because JPEGMEM
is not a documented feature of the TurboJPEG API.
Note that the issue does not exist in the tj3Encode*() and tj3Decode*()
functions, because realize_virt_arrays() is never called in the body of
those functions. The issue also does not exist in the tj3Decompress*()
and tj3Transform() functions, because those functions ensure that the
JPEG header is read (and thus the libjpeg global decompression state is
incremented) prior to calling a function that calls
realize_virt_arrays() (i.e. jpeg_start_decompress() or
jpeg_read_coefficients().) If realize_virt_arrays() failed in the body
of jpeg_write_coefficients(), then tj3Transform() would abort without
calling jpeg_abort_compress(). However, since jpeg_start_compress() is
never called in the body of tj3Transform(), no virtual arrays are ever
requested from the compression object, so failing to call
jpeg_abort_compress() would be innocuous.
Those obsolete memory managers have never been included in
libjpeg-turbo, nor has libjpeg-turbo ever claimed to support MS-DOS
or Mac operating systems prior to OS X 10.4 "Tiger." Note that we
retain the ability to use temp files, even though libjpeg-turbo does
not use them. This allows downstream implementations to write their own
memory managers that use temp files, if necessary.
If the align parameter was set to an unreasonably large value, such as
0x2000000, strides[0] * ph0 and strides[1] * ph1 could have overflowed
the int datatype and wrapped around when computing (src|dst)Planes[1]
and (src|dst)Planes[2] (respectively.) This would have caused
(src|dst)Planes[1] and (src|dst)Planes[2] to point to lower addresses in
the YUV buffer than expected, so the worst case would have been a
visually incorrect output image, not a buffer overrun or other
exploitable issue.
Because of 47656a0820, we can now
reliably determine the correct default values for FLOATTEST8 and
FLOATTEST12 when using Clang or GCC to build for AArch64 or PowerPC
platforms. (Testing confirms that this is the case with GCC 5-13 and
Clang 5-14 on Ubuntu/AArch64, GCC 4 on CentOS 7/PPC, and GCC 8-10 and
Clang 6-12 on Ubuntu/PPCLE.) Other CPU architectures and compilers can
be added on a case-by-case basis as they are tested.
Because of bf01ed2fbc, the simd field in
huff_entropy_encoder (and, by extension, the simd field in
savable_state) is only initialized if WITH_SIMD is defined. Due to an
oversight, the simd field in savable_state was queried in flush_bits()
regardless of whether WITH_SIMD was defined. In most cases, both
branches of the query have identical code, and the optimizer removes the
branch. However, because the legacy Neon GAS Huffman encoder uses the
older bit buffer logic from libjpeg-turbo 2.0.x and prior (refer to
087c29e07f), the branches do not have
identical code when building for AArch64 with NEON_INTRINSICS undefined
(which will be the case if WITH_SIMD is undefined.) Thus, if
libjpeg-turbo was built for AArch64 with the SIMD extensions disabled
at build time, it was possible for the Neon GAS branch in flush_bits()
to be taken, which would have set put_bits to a value that is incorrect
for the C Huffman encoder. Referring to #728, a user reported that this
issue sometimes caused libjpeg-turbo to generate bogus JPEG images if it
was built for AArch64 without SIMD extensions and subsequently used
through the Qt framework. (It should be noted, however, that disabling
the SIMD extensions in AArch64 builds of libjpeg-turbo is inadvisable
for performance reasons.)
I was unable to reproduce the issue on Linux/AArch64 using libjpeg-turbo
alone, despite testing various versions of GCC and Clang and various
optimization levels. However, the issue is reproducible using MSan with
-O0, so this commit also modifies the GitHub Actions workflow so that
compiler optimization is disabled in the linux-msan job. That should
prevent the issue or similar issues from re-emerging.
Fixes#728
libjpeg-turbo implements 4:4:0 "fancy" (smooth) upsampling, which is
enabled by default when decompressing JPEG images that use 4:4:0
chrominance subsampling. libjpeg did not and does not implement fancy
4:4:0 upsampling.
The MD5 sums associated with FLOATTEST8=fp-contract and
FLOATTEST12=fp-contract are appropriate for GCC (tested v5 through v13)
with -ffp-contract=fast, which is the default when compiling for an
architecture that has fused multiply-add (FMA) instructions. However,
different MD5 sums are needed for Clang (tested v5 through v14) with
-ffp-contract=on, which is now the default in Clang 14 when compiling
for an architecture that has FMA instructions.
Refer to #705, #709, #710
* libjpeg-turbo/2.1.x:
ChangeLog.md: List CVE ID fixed by ccaba5d7
jpeglib.h: Document that JCS_RGB565 is decomp-only
Fix block smoothing w/vert.-subsampled prog. JPEGs
# By DRC
# Via DRC
* commit 'eadd243':
Fix interblock smoothing with narrow prog. JPEGs
jchuff.c/flush_bits(): Guard against free_bits < 0
jchuff.c/flush_bits(): Guard against put_bits < 0
Restore xform fuzzer behavior from before 19f9d8f0
xform fuzz: Use src subsamp to calc dst buf size
Doc: Mention that we are a JPEG ref implementation
jchuff.c: Test for out-of-range coefficients
turbojpeg.h: Make customFilter() proto match doc
ChangeLog.md: Fix typo
tjTransform(): Calc dst buf size from xformed dims
Fix build warnings/errs w/ -DNO_GETENV/-DNO_PUTENV
GitHub: Fix x32 build
tjexample.c: Prevent integer overflow
jpeg_crop_scanline: Fix calc w/sclg + 2x4,4x2 samp
Decomp: Don't enable 2-pass color quant w/ RGB565
TJBench: w/JPEG input imgs, set min tile= MCU size
Bump version to 2.1.6 to prepare for new commits
GitHub: Add pull request template
Build: Clarify CMAKE_OSX_ARCHITECTURES error
Build: Fail if included with add_subdirectory()
# Conflicts:
# .github/workflows/build.yml
# CMakeLists.txt
# README.md
# release/deb-control.in
# By DRC
# Via DRC
* tag '2.1.5.1':
ChangeLog.md: Document d743a2c1
OSS-Fuzz: Bail out immediately on decomp failure
SIMD/x86: Initialize simd_support before every use
Build: Define THREAD_LOCAL even if !WITH_TURBOJPEG
# Conflicts:
# CMakeLists.txt
The 5x5 interblock smoothing implementation, introduced in libjpeg-turbo
2.1, improperly extended the logic from the traditional 3x3 smoothing
implementation. Both implementations point prev_block_row and
next_block_row to the current block row when processing, respectively,
the first and the last block row in the image:
if (block_row > 0 || cinfo->output_iMCU_row > 0)
prev_block_row =
buffer[block_row - 1] + cinfo->master->first_MCU_col[ci];
else
prev_block_row = buffer_ptr;
if (block_row < block_rows - 1 ||
cinfo->output_iMCU_row < last_iMCU_row)
next_block_row =
buffer[block_row + 1] + cinfo->master->first_MCU_col[ci];
else
next_block_row = buffer_ptr;
6d91e950c8 naively extended that logic to
accommodate a 5x5 smoothing window:
if (block_row > 1 || cinfo->output_iMCU_row > 1)
prev_prev_block_row =
buffer[block_row - 2] + cinfo->master->first_MCU_col[ci];
else
prev_prev_block_row = prev_block_row;
if (block_row < block_rows - 2 ||
cinfo->output_iMCU_row + 1 < last_iMCU_row)
next_next_block_row =
buffer[block_row + 2] + cinfo->master->first_MCU_col[ci];
else
next_next_block_row = next_block_row;
However, this new logic was only correct if block_rows == 1, so the
values of prev_prev_block_row and next_next_block_row were incorrect
when processing, respectively, the second and second to last iMCU rows
in a vertically-subsampled progressive JPEG image.
The intent was to:
- point prev_block_row to the current block row when processing the
first block row in the image,
- point prev_prev_block_row to prev_block_row when processing the first
two block rows in the image,
- point next_block_row to the current block row when processing the
last block row in the image, and
- point next_next_block_row to next_block_row when processing the last
two block rows in the image.
This commit modifies decompress_smooth_data() so that it computes the
current block row's position relative to the whole image and sets
the block row pointers based on that value.
This commit also restores a line of code that was accidentally deleted
by 6d91e950c8:
access_rows += compptr->v_samp_factor; /* prior iMCU row too */
access_rows is merely a sanity check that tells the access_virt_barray()
method to generate an error if accessing the specified number of rows
would cause a buffer overrun. Essentially, it is a belt-and-suspenders
measure to ensure that j*init_d_coef_controller() allocated enough rows
for the full-image virtual array. Thus, excluding that line of code did
not cause an observable issue.
This commit also documents eadd2436ee in
the change log.
Fixes#721
The 5x5 interblock smoothing implementation, introduced in libjpeg-turbo
2.1, improperly extended the logic from the traditional 3x3 smoothing
implementation. Both implementations point prev_block_row and
next_block_row to the current block row when processing, respectively,
the first and the last block row in the image:
if (block_row > 0 || cinfo->output_iMCU_row > 0)
prev_block_row =
buffer[block_row - 1] + cinfo->master->first_MCU_col[ci];
else
prev_block_row = buffer_ptr;
if (block_row < block_rows - 1 ||
cinfo->output_iMCU_row < last_iMCU_row)
next_block_row =
buffer[block_row + 1] + cinfo->master->first_MCU_col[ci];
else
next_block_row = buffer_ptr;
6d91e950c8 naively extended that logic to
accommodate a 5x5 smoothing window:
if (block_row > 1 || cinfo->output_iMCU_row > 1)
prev_prev_block_row =
buffer[block_row - 2] + cinfo->master->first_MCU_col[ci];
else
prev_prev_block_row = prev_block_row;
if (block_row < block_rows - 2 ||
cinfo->output_iMCU_row + 1 < last_iMCU_row)
next_next_block_row =
buffer[block_row + 2] + cinfo->master->first_MCU_col[ci];
else
next_next_block_row = next_block_row;
However, this new logic was only correct if block_rows == 1, so the
values of prev_prev_block_row and next_next_block_row were incorrect
when processing, respectively, the second and second to last iMCU rows
in a vertically-subsampled progressive JPEG image.
The intent was to:
- point prev_block_row to the current block row when processing the
first block row in the image,
- point prev_prev_block_row to prev_block_row when processing the first
two block rows in the image,
- point next_block_row to the current block row when processing the
last block row in the image, and
- point next_next_block_row to next_block_row when processing the last
two block rows in the image.
This commit modifies decompress_smooth_data() so that it computes the
current block row's position relative to the whole image and sets
the block row pointers based on that value.
This commit also restores a line of code that was accidentally deleted
by 6d91e950c8:
access_rows += compptr->v_samp_factor; /* prior iMCU row too */
access_rows is merely a sanity check that tells the access_virt_barray()
method to generate an error if accessing the specified number of rows
would cause a buffer overrun. Essentially, it is a belt-and-suspenders
measure to ensure that j*init_d_coef_controller() allocated enough rows
for the full-image virtual array. Thus, excluding that line of code did
not cause an observable issue.
This commit also documents dbae59281f in
the change log.
Fixes#721
This allows debuggers and profilers to reliably capture backtraces from
within the x86-64 SIMD functions.
In places where rbp was previously used to access temporary variables
(after stack alignment), we now use r15 and save/restore it accordingly.
The total amount of work is approximately the same, because the previous
code pushed the pre-alignment stack pointer to the aligned stack. The
new prologue and epilogue actually have fewer instructions.
Also note that the {un}collect_args macros now use rbp instead of rax to
access arguments passed on the stack, so we save a few instructions
there as well.
Based on:
debcc7c3b4Closes#707Closes#708
Due to an oversight, the assignment of DC05, DC10, DC15, DC20, and DC25
(the right edge coefficients in the 5x5 interblock smoothing window) in
decompress_smooth_data() was incorrect for images exactly two MCU blocks
wide. For such images, DC04, DC09, DC14, DC19, and DC24 were assigned
values based on the last MCU column, but DC05, DC10, DC15, DC20, and
DC25 were assigned values based on the first MCU column (because
block_num + 1 was never less than last_block_column.) This commit
modifies jdcoefct.c so that, for images at least two MCU blocks wide,
DC05, DC10, DC15, DC20, and DC25 are assigned the same values as DC04,
DC09, DC14, DC19, and DC24 (respectively.) DC05, DC10, DC15, DC20, and
DC25 are then immediately overwritten for images more than two MCU
blocks wide.
Since this issue was minor and not likely obvious to an end user, the
fix is undocumented.
Fixes#700
Due to an oversight, the assignment of DC05, DC10, DC15, DC20, and DC25
(the right edge coefficients in the 5x5 interblock smoothing window) in
decompress_smooth_data() was incorrect for images exactly two MCU blocks
wide. For such images, DC04, DC09, DC14, DC19, and DC24 were assigned
values based on the last MCU column, but DC05, DC10, DC15, DC20, and
DC25 were assigned values based on the first MCU column (because
block_num + 1 was never less than last_block_column.) This commit
modifies jdcoefct.c so that, for images at least two MCU blocks wide,
DC05, DC10, DC15, DC20, and DC25 are assigned the same values as DC04,
DC09, DC14, DC19, and DC24 (respectively.) DC05, DC10, DC15, DC20, and
DC25 are then immediately overwritten for images more than two MCU
blocks wide.
Since this issue was minor and not likely obvious to an end user, the
fix is undocumented.
Fixes#700
(regression introduced by fc01f4673b)
Because of the TurboJPEG 3 API overhaul, the pure compression code path
in tjexample.c now creates a TurboJPEG compression instance prior to
calling tj3LoadImage(). However, due to an oversight, a compression
instance was no longer created in the recompression code path. Thus,
that code path attempted to reuse the TurboJPEG decompression instance,
which caused an error ("tj3Set(): TJPARAM_QUALITY is not applicable to
decompression instances.") This commit modifies tjexample.c so that the
recompression code path creates a new TurboJPEG compression instance if
one has not already been created.
Fixes#716
This fixes a buffer overrun, reported by OSS-Fuzz, that occurred when
attempting to transform a specially-crafted malformed arithmetic-coded
JPEG image into a baseline Huffman-coded JPEG destination image with
default Huffman tables. This issue probably had a similar root cause to
the issue fixed in 31a301389b, but in this
case, the issue only occurred with the SIMD baseline Huffman encoder in
libjpeg-turbo 2.1.x and 2.0.x. It was not reproducible in 3.0.x or when
using the C baseline Huffman encoder. (NOTE: In order to reproduce the
issue with 2.1.x, it was necessary to revert
58cee6d90cd616c86fae142891b987c289ea055a.)
Because libjpeg-turbo 3.0.x now supports multiple data precisions in the
same build, the regression test system can test the 8-bit and 12-bit
floating point DCT/IDCT algorithms separately. The expected MD5 sums
for those tests are communicated to the test system using the FLOATTEST8
and FLOATTEST12 CMake variables. Whereas it is possible to
intelligently set a default value for FLOATTEST8 when building for
x86[-64] and a default value for FLOATTEST12 when building for x86-64,
it is not possible with other architectures. (Refer to #705, #709,
and #710.) Clang 14, for example, now enables FMA (fused multiply-add)
instructions by default on architectures that support them, but with
AArch64 builds, the results are not the same as when using GCC/AArch64
with FMA instructions enabled. Thus, setting FLOATTEST12=fp-contract
doesn't make the tests pass. It was already impossible to intelligently
set a default for FLOATTEST8 with i386 builds, but referring to #710,
that appears to be the case with other non-x86-64 builds as well.
Back in 1991, when Tom Lane first released libjpeg, some CPUs had
floating point units and some didn't. It could take minutes to compress
or decompress a 1-megapixel JPEG image using the "slow" integer DCT/IDCT
algorithms, and the floating point algorithms were significantly faster
on systems that had an FPU. On systems without FPUs, floating point
math was emulated and really slow, so Tom also developed "fast" integer
DCT/IDCT algorithms to speed up JPEG performance, at the expense of
accuracy, on those systems. Because of libjpeg-turbo's SIMD extensions,
the floating point algorithms are now significantly slower than the
"slow" integer algorithms without being significantly more accurate, and
the "fast" integer algorithms fail the ISO/ITU-T conformance tests
without being any faster than the "slow" integer algorithms on x86
systems. Thus, the floating point and "fast" integer algorithms are
considered legacy features.
In order for the floating point regression tests to be useful, the
results of the tests must be validated against an independent metric.
(In other words, it wouldn't be useful to use the floating point
DCT/IDCT algorithms to determine the expected results of the floating
point DCT/IDCT algorithms.) In the past, I attempted without success to
develop a low-level floating point test that would run at configure time
and determine the appropriate default value of FLOATTEST*. Barring that
approach, the only other possibilities would be:
1. Develop a test framework that compares the floating point results
with a margin of error, as TJUnitTest does. However, that effort isn't
justified unless it could also benefit non-legacy features.
2. Compare the floating point results against an expected MD5 sum, as we
currently do. However, as previously described, it isn't possible in
most cases to determine an appropriate default value for the expected
MD5 sum.
For the moment, it makes the most sense to disable the 8-bit floating
point tests by default except with x86[-64] builds and to disable the
12-bit floating point tests by default except with x86-64 builds. That
means that the floating point algorithms will still be regression tested
when performing x86[-64] builds, but other types of builds will have to
opt in to the same regression tests. Since the floating point DCT/IDCT
algorithms are unlikely to change ever again (the only reason they still
exist at all is to maintain backward compatibility with libjpeg), this
seems like a reasonable tradeoff.
This fixes a UBSan negative shift warning, reported by OSS-Fuzz, that
occurred when attempting to transform a specially-crafted malformed
arithmetic-coded JPEG image into a baseline Huffman-coded JPEG
destination image with default Huffman tables. This issue probably
had a similar root cause to the issue fixed in
31a301389b, but in this case, the issue
only occurred with the SIMD baseline Huffman encoder in libjpeg-turbo
2.1.x. It was not reproducible in 2.0.x or 3.0.x or when using the
C baseline Huffman encoder.
- The example-*bit-*-decompress test must run after the
example-*bit-*-compress test, since the latter generates
testout*-example.jpg.
- Add -static to the filenames of all output files generated by the
"static" regression tests, to avoid conflicts with the "shared"
regression tests.
- Add the PID to the filenames of all files generated by the tjunittest
packed-pixel image I/O tests.
- Check the return value of MD5File() in tjunittest to avoid a segfault
if the file doesn't exist. (Prior to the fix described above, that
could occur if two instances of tjunittest ran concurrently from the
same directory with the same -bmp and -precision arguments.)
Fixes#705
The intent was for the final transform operation to be the same as the
first transform operation but without TJXOPT_COPYNONE or
TJPARAM_NOREALLOC. Unrolling the transform operations in
c8d52f1c4c accidentally changed that.
The intent was for the final transform operation to be the same as the
first transform operation but without TJXOPT_COPYNONE or
TJFLAG_NOREALLOC. Unrolling the transform operations in
19f9d8f0fd accidentally changed that.
Referring to
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=60379
there are some specially-crafted malformed JPEG images that, when
transformed to grayscale, will exceed the worst-case transformed
grayscale JPEG image size. This is similar in nature to the issue fixed
by 19f9d8f0fd, except that in this case,
the issue occurs regardless of the amount of metadata in the source
image. Also, the tjTransform() function, the
Java_org_libjpegturbo_turbojpeg_TJTransformer_transform() JNI function,
and TJBench were behaving correctly in this case, because the TurboJPEG
API documentation specifies that the source image's subsampling type
should be used when computing the worst-case transformed JPEG image
size. (However, only the Java API documentation specified that. Oops.
The C API documentation now does as well.) The documented usage
mitigates the issue, and only the transform fuzzer did not adhere to
that. Thus, this was an issue with the fuzzer itself rather than an
issue with the library.
Referring to
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=60379
there are some specially-crafted malformed JPEG images that, when
transformed to grayscale, will exceed the worst-case transformed
grayscale JPEG image size. This is similar in nature to the issue fixed
by c8d52f1c4c, except that in this case,
the issue occurs regardless of the amount of metadata in the source
image. Also, the tj3Transform() function, the
Java_org_libjpegturbo_turbojpeg_TJTransformer_transform() JNI function,
and TJBench were behaving correctly in this case, because the TurboJPEG
API documentation specifies that the source image's subsampling type
should be used when computing the worst-case transformed JPEG image
size. (However, only the Java API documentation specified that. Oops.
The C API documentation now does as well.) The documented usage
mitigates the issue, and only the transform fuzzer did not adhere to
that. Thus, this was an issue with the fuzzer itself rather than an
issue with the library.
Unlike its predecessors, tj3DecompressHeader() does not fail if passed
a JPEG image with an unknown subsampling type. This led me to believe
that it was OK for the fuzzers to abort if tj3DecompressHeader()
returned an error. However, there are apparently some malformed JPEG
images that can expose issues in libjpeg-turbo while also causing
tj3DecompressHeader() to complain about header errors. Thus, it is best
to ignore the return value of tj3DecompressHeader(), as the fuzzers in
libjpeg-turbo 2.1.x and prior did.
The SignPath release certificate for our project is not yet available as
of this writing, but this commit prepares the CI system to use the
release certificate whenever it becomes available. It also restricts
signing only to tags, which correspond to official releases. (That
mimics our traditional policy of not signing pre-release builds.)
Restore two coefficient range checks from libjpeg to the C baseline
Huffman encoder. This fixes an issue
(https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=60253) whereby
the encoder could read from uninitialized memory when attempting to
transform a specially-crafted malformed arithmetic-coded JPEG source
image into a baseline Huffman-coded JPEG destination image with default
Huffman tables. More specifically, the out-of-range coefficients caused
r to equal 256, which overflowed the actbl->ehufsi[] array. Because the
overflow was contained within the huff_entropy_encoder structure, this
issue was not exploitable (nor was it observable at all on x86 or Arm
CPUs unless JSIMD_NOHUFFENC=1 or JSIMD_FORCENONE=1 was set in the
environment or unless libjpeg-turbo was built with WITH_SIMD=0.)
The fix is performance-neutral (+/- 1-2%) for x86-64 code and causes a
0-4% (avg. 1-2%, +/- 1-2%) compression regression for i386 code on Intel
CPUs when the C baseline Huffman encoder is used (JSIMD_NOHUFFENC=1).
The fix is performance-neutral (+/- 1-2%) on Intel CPUs when all of the
libjpeg-turbo SIMD extensions are disabled (JSIMD_FORCENONE=1). The fix
causes a 0-2% (avg. <1%, +/- 1%) compression regression for PowerPC
code.
This is subtle, but the tj3DecompressHeader() function sets the values
of TJPARAM_ARITHMETIC, TJPARAM_OPTIMIZE, and TJPARAM_PROGRESSIVE.
Unless we unset those values, the entropy algorithm used in the
transformed JPEG image will be determined by the union of the parameter
values and the transform options, which isn't what we want.
Restore two coefficient range checks from libjpeg to the C baseline
Huffman encoder. This fixes an issue
(https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=60253) whereby
the encoder could read from uninitialized memory when attempting to
transform a specially-crafted malformed arithmetic-coded JPEG source
image into a baseline Huffman-coded JPEG destination image with default
Huffman tables. More specifically, the out-of-range coefficients caused
r to equal 256, which overflowed the actbl->ehufsi[] array. Because the
overflow was contained within the huff_entropy_encoder structure, this
issue was not exploitable (nor was it observable at all on x86 or Arm
CPUs unless JSIMD_NOHUFFENC=1 or JSIMD_FORCENONE=1 was set in the
environment or unless libjpeg-turbo was built with WITH_SIMD=0.)
The fix is performance-neutral (+/- 1-2%) for x86-64 code and causes a
0-4% (avg. 1-2%, +/- 1-2%) compression regression for i386 code on Intel
CPUs when the C baseline Huffman encoder is used (JSIMD_NOHUFFENC=1).
The fix is performance-neutral (+/- 1-2%) on Intel CPUs when all of the
libjpeg-turbo SIMD extensions are disabled (JSIMD_FORCENONE=1). The fix
causes a 0-2% (avg. <1%, +/- 1%) compression regression for PowerPC
code.
Color quantization is a legacy feature that serves little or no purpose
with lossless JPEG images. 9f756bc67a
eliminated interaction issues between the lossless decompressor and the
color quantizers related to out-of-range 12-bit samples, but referring
to #701, other interaction issues apparently still exist. Such issues
are likely, given the fact that the color quantizers were not designed
with lossless decompression in mind.
This commit reverts 9f756bc67a, since the
issues it fixed are no longer relevant because of this commit and
2192560d74.
Fixed#672Fixes#673Fixes#674Fixes#676Fixes#677Fixes#678Fixes#679Fixes#681Fixes#683Fixes#701
When used with TJFLAG_NOREALLOC and with TJXOP_TRANSPOSE,
TJXOP_TRANSVERSE, TJXOP_ROT90, or TJXOP_ROT270, tjTransform()
incorrectly based the destination buffer size for a transform on the
source image dimensions rather than the transformed image dimensions.
This was apparently a long-standing bug that had existed in the
tjTransform() function since its inception. As initially implemented in
the evolving libjpeg-turbo v1.2 code base, tjTransform() required
dstSizes[i] to be set regardless of whether TJFLAG_NOREALLOC was set.
ff78e37595, which was introduced later in
the evolving libjpeg-turbo v1.2 code base, removed that requirement and
planted the seed for the bug. However, the bug was not activated until
9b49f0e4c7 was introduced still later in
the evolving libjpeg-turbo v1.2 code base, adding a subsampling type
argument to the (new at the time) tjBufSize() function and thus making
the width and height arguments no longer commutative.
The bug opened up the possibility that a JPEG source image could cause
tjTransform() to overflow the destination buffer for a transform if all
of the following were true:
- The JPEG source image used 4:2:2, 4:4:0, or 4:1:1 subsampling. (These
are the only supported subsampling types for which the width and
height arguments to tjBufSize() are not commutative.)
- The width and height of the JPEG source image were such that
tjBufSize(height, width, subsamplingType) returned a smaller value
than tjBufSize(width, height, subsamplingType).
- The JPEG source image contained enough metadata that the size of the
transformed image was larger than
tjBufSize(height, width, subsamplingType).
- TJFLAG_NOREALLOC was set.
- TJXOP_TRANSPOSE, TJXOP_TRANSVERSE, TJXOP_ROT90, or TJXOP_ROT270 was
used.
- TJXOPT_COPYNONE was not set.
- TJXOPT_CROP was not set.
- The calling program allocated
tjBufSize(height, width, subsamplingType) bytes for the
destination buffer, as the API documentation instructs.
The API documentation cautions that JPEG source images containing a
large amount of extraneous metadata (EXIF, IPTC, ICC, etc.) cannot
reliably be transformed if TJFLAG_NOREALLOC is set and TJXOPT_COPYNONE
is not set. Irrespective of the bug, there are still cases in which a
JPEG source image with a large amount of metadata can, when transformed,
exceed the worst-case transformed JPEG image size. For instance, if you
try to losslessly crop a JPEG image with 3 kB of EXIF data to 16x16
pixels, then you are guaranteed to exceed the worst-case 16x16 JPEG
image size unless you discard the EXIF data.
Even without the bug, tjTransform() will still fail with "Buffer passed
to JPEG library is too small" when attempting to transform JPEG source
images that meet the aforementioned criteria. The bug is that the
function segfaults rather than failing gracefully, but the chances of
that occurring in a real-world application are very slim. Any
real-world application developers who attempted to transform arbitrary
JPEG source images with TJFLAG_NOREALLOC set would very quickly realize
that they cannot reliably do that without also setting TJXOPT_COPYNONE.
Thus, I posit that the actual risk posed by this bug is low.
Applications such as web browsers that are the most exposed to security
risks from arbitrary JPEG source images do not use the TurboJPEG
lossless transform feature. (None of those applications even use the
TurboJPEG API, to the best of my knowledge, and the public libjpeg API
has no equivalent transform function.) Our only command-line interface
to the tjTransform() function, TJBench, was not exposed to the bug
because it had a compatible bug whereby it allocated the JPEG
destination buffer to the same size that tjTransform() erroneously
expected. The TurboJPEG Java API was also not exposed to the bug
because of a similar compatible bug in the
Java_org_libjpegturbo_turbojpeg_TJTransformer_transform() JNI function.
(This commit fixes both compatible bugs.)
In short, best practices for tjTransform() are to use TJFLAG_NOREALLOC
only with JPEG source images that are known to be free of metadata (such
as images generated by tjCompress*()) or to use TJXOPT_COPYNONE along
with TJFLAG_NOREALLOC. Still, however, the function shouldn't segfault
as long as the calling program allocates the suggested amount of space
for the JPEG destination buffer.
Usability notes:
tjTransform() could hypothetically require dstSizes[i] to be set
regardless of whether TJFLAG_NOREALLOC is set, but there are usability
pitfalls either way. The main pitfall I sought to avoid with
ff78e37595 was a calling program failing
to set dstSizes[i] at all, thus leaving its value undefined. It could
be argued that requiring dstSizes[i] to be set in all cases is more
consistent, but it could also be argued that not requiring it to be set
when TJFLAG_NOREALLOC is set is more user-proof. tjTransform() could
also hypothetically set TJXOPT_COPYNONE automatically when
TJFLAG_NOREALLOC is set, but that could lead to user confusion.
When used with TJPARAM_NOREALLOC and with TJXOP_TRANSPOSE,
TJXOP_TRANSVERSE, TJXOP_ROT90, or TJXOP_ROT270, tj3Transform()
incorrectly based the destination buffer size for a transform on the
source image dimensions rather than the transformed image dimensions.
This was apparently a long-standing bug that had existed in the
tj*Transform() function since its inception. As initially implemented
in the evolving libjpeg-turbo v1.2 code base, tjTransform() required
dstSizes[i] to be set regardless of whether TJFLAG_NOREALLOC (the
predecessor to TJPARAM_NOREALLOC) was set.
ff78e37595, which was introduced later in
the evolving libjpeg-turbo v1.2 code base, removed that requirement and
planted the seed for the bug. However, the bug was not activated until
9b49f0e4c7 was introduced still later in
the evolving libjpeg-turbo v1.2 code base, adding a subsampling type
argument to the (new at the time) tjBufSize() function and thus making
the width and height arguments no longer commutative.
The bug opened up the possibility that a JPEG source image could cause
tj3Transform() to overflow the destination buffer for a transform if all
of the following were true:
- The JPEG source image used 4:2:2, 4:4:0, 4:1:1, or 4:4:1 subsampling.
(These are the only subsampling types for which the width and height
arguments to tj3JPEGBufSize() are not commutative.)
- The width and height of the JPEG source image were such that
tj3JPEGBufSize(height, width, subsamplingType) returned a smaller
value than tj3JPEGBufSize(width, height, subsamplingType).
- The JPEG source image contained enough metadata that the size of the
transformed image was larger than
tj3JPEGBufSize(height, width, subsamplingType).
- TJPARAM_NOREALLOC was set.
- TJXOP_TRANSPOSE, TJXOP_TRANSVERSE, TJXOP_ROT90, or TJXOP_ROT270 was
used.
- TJXOPT_COPYNONE was not set.
- TJXOPT_CROP was not set.
- The calling program allocated
tj3JPEGBufSize(height, width, subsamplingType) bytes for the
destination buffer, as the API documentation instructs.
The API documentation cautions that JPEG source images containing a
large amount of extraneous metadata (EXIF, IPTC, ICC, etc.) cannot
reliably be transformed if TJPARAM_NOREALLOC is set and TJXOPT_COPYNONE
is not set. Irrespective of the bug, there are still cases in which a
JPEG source image with a large amount of metadata can, when transformed,
exceed the worst-case transformed JPEG image size. For instance, if you
try to losslessly crop a JPEG image with 3 kB of EXIF data to 16x16
pixels, then you are guaranteed to exceed the worst-case 16x16 JPEG
image size unless you discard the EXIF data.
Even without the bug, tj3Transform() will still fail with "Buffer passed
to JPEG library is too small" when attempting to transform JPEG source
images that meet the aforementioned criteria. The bug is that the
function segfaults rather than failing gracefully, but the chances of
that occurring in a real-world application are very slim. Any
real-world application developers who attempted to transform arbitrary
JPEG source images with TJPARAM_NOREALLOC set would very quickly realize
that they cannot reliably do that without also setting TJXOPT_COPYNONE.
Thus, I posit that the actual risk posed by this bug is low.
Applications such as web browsers that are the most exposed to security
risks from arbitrary JPEG source images do not use the TurboJPEG
lossless transform feature. (None of those applications even use the
TurboJPEG API, to the best of my knowledge, and the public libjpeg API
has no equivalent transform function.) Our only command-line interface
to the tj3Transform() function, TJBench, was not exposed to the bug
because it had a compatible bug whereby it allocated the JPEG
destination buffer to the same size that tj3Transform() erroneously
expected. The TurboJPEG Java API was also not exposed to the bug
because of a similar compatible bug in the
Java_org_libjpegturbo_turbojpeg_TJTransformer_transform() JNI function.
(This commit fixes both compatible bugs.)
In short, best practices for tj3Transform() are to use TJPARAM_NOREALLOC
only with JPEG source images that are known to be free of metadata (such
as images generated by tj3Compress*()) or to use TJXOPT_COPYNONE along
with TJPARAM_NOREALLOC. Still, however, the function shouldn't segfault
as long as the calling program allocates the suggested amount of space
for the JPEG destination buffer.
Usability notes:
tj3Transform() could hypothetically require dstSizes[i] to be set
regardless of the value of TJPARAM_NOREALLOC, but there are usability
pitfalls either way. The main pitfall I sought to avoid with
ff78e37595 was a calling program failing
to set dstSizes[i] at all, thus leaving its value undefined. It could
be argued that requiring dstSizes[i] to be set in all cases is more
consistent, but it could also be argued that not requiring it to be set
when TJPARAM_NOREALLOC is set is more user-proof. tj3Transform() could
also hypothetically set TJXOPT_COPYNONE automatically when
TJPARAM_NOREALLOC is set, but that could lead to user confusion.
Ultimately, I would like to address these issues in TurboJPEG v4 by
using managed buffer objects, but that would be an extensive overhaul.
- strtest.c: Fix unused variable warnings if both -DNO_GETENV and
-DNO_PUTENV are specified or if only -DNO_GETENV is specified.
- jinclude.h: Fix build error if only -DNO_GETENV is specified.
Fixes#697
- strtest.c: Fix unused variable warnings if both -DNO_GETENV and
-DNO_PUTENV are specified or if only -DNO_GETENV is specified.
- jinclude.h: Fix build error if only -DNO_GETENV is specified.
Fixes#697
Because width, height, and tjPixelSize[] are signed integers, signed
integer overflow will occur if width * height *
tjPixelSize[pixelFormat] > INT_MAX or jpegSize > INT_MAX, which would
cause an incorrect value to be passed to tjAlloc(). This commit
modifies tjexample.c in the following ways:
- Use malloc() rather than tjAlloc() to allocate the uncompressed image
buffer. (tjAlloc() is only necessary for JPEG buffers that will
potentially be reallocated by the TurboJPEG API library.)
- Implicitly promote width, height, and tjPixelSize[pixelFormat] to
size_t before multiplying them.
- If size_t is 32-bit, throw an error if width * height *
tjPixelSize[pixelFormat] would overflow the data type.
- Throw an error if jpegSize would overflow a signed integer (which
could only happen on a 64-bit machine.)
Since tjexample is not installed or packaged, the worst case for this
issue was that a downstream application might interpret tjexample.c
literally and introduce a similar overflow issue into its own code.
However, it's worth noting that such issues could also be introduced
when using malloc().
Because width, height, and tjPixelSize[] are signed integers, signed
integer overflow will occur if width * height *
tjPixelSize[pixelFormat] > INT_MAX, which would cause an incorrect value
to be passed to tj3Alloc(). This commit modifies tjexample.c in the
following ways:
- Implicitly promote width, height, and tjPixelSize[pixelFormat] to
size_t before multiplying them.
- Use malloc() rather than tj3Alloc() to allocate the uncompressed image
buffer. (tj3Alloc() is only necessary for JPEG buffers that will
potentially be reallocated by the TurboJPEG API library.)
- If size_t is 32-bit, throw an error if width * height *
tjPixelSize[pixelFormat] would overflow the data type.
Since tjexample is not installed or packaged, the worst case for this
issue was that a downstream application might interpret tjexample.c
literally and introduce a similar overflow issue into its own code.
However, it's worth noting that such issues could also be introduced
when using malloc().
Colorspace conversion is explicitly not supported with lossless JPEG
images. Merged upsampling implies YCbCr-to-RGB colorspace conversion,
so allowing it with lossless decompression was an oversight.
9f756bc67a eliminated interaction issues
between the lossless decompressor and the merged upsampler related to
out-of-range 12-bit samples, but referring to #690, other interaction
issues apparently still exist. Such issues are likely, given the fact
that the merged upsampler was never designed with lossless decompression
in mind.
This commit also extends the decompress fuzzer so that it catches the
issue reported in #690.
Fixes#690
Redundantly fixes#670
Redundantly fixes#675
- Clarify that encrypted e-mail is optional.
- Mention the new GitHub security advisory system.
- Clarify that vulnerabilities against new features that are not yet in
a Stable release series need not be reported securely.
When computing the downsampled width for a particular component,
jpeg_crop_scanline() needs to take into account the fact that the
libjpeg code uses a combination of IDCT scaling and upsampling to
implement 4x2 and 2x4 upsampling with certain decompression scaling
factors. Failing to account for that led to incomplete upsampling of
4x2- or 2x4-subsampled components, which caused the color converter to
read from uninitialized memory. With 12-bit data precision, this caused
a buffer overrun or underrun and subsequent segfault if the
uninitialized memory contained a value that was outside of the valid
sample range (because the color converter uses the value as an array
index.)
Fixes#669
When computing the downsampled width for a particular component,
jpeg_crop_scanline() needs to take into account the fact that the
libjpeg code uses a combination of IDCT scaling and upsampling to
implement 4x2 and 2x4 upsampling with certain decompression scaling
factors. Failing to account for that led to incomplete upsampling of
4x2- or 2x4-subsampled components, which caused the color converter to
read from uninitialized memory. With 12-bit data precision, this caused
a buffer overrun or underrun and subsequent segfault if the
uninitialized memory contained a value that was outside of the valid
sample range (because the color converter uses the value as an array
index.)
Fixes#669
The 2-pass color quantization algorithm assumes 3-sample pixels. RGB565
is the only 3-component colorspace that doesn't have 3-sample pixels, so
we need to treat it as a special case when determining whether to enable
2-pass color quantization. Otherwise, attempting to initialize 2-pass
color quantization with an RGB565 output buffer could cause
prescan_quantize() to read from uninitialized memory and subsequently
underflow/overflow the histogram array.
djpeg is supposed to fail gracefully if both -rgb565 and -colors are
specified, because none of its destination managers (image writers)
support color quantization with RGB565. However, prescan_quantize() was
called before that could occur. It is possible but very unlikely that
these issues could have been reproduced in applications other than
djpeg. The issues involve the use of two features (12-bit precision and
RGB565) that are incompatible, and they also involve the use of two
rarely-used legacy features (RGB565 and color quantization) that don't
make much sense when combined.
Fixes#668Fixes#671Fixes#680
The 2-pass color quantization algorithm assumes 3-sample pixels. RGB565
is the only 3-component colorspace that doesn't have 3-sample pixels, so
we need to treat it as a special case when determining whether to enable
2-pass color quantization. Otherwise, attempting to initialize 2-pass
color quantization with an RGB565 output buffer could cause
prescan_quantize() to read from uninitialized memory and subsequently
underflow/overflow the histogram array.
djpeg is supposed to fail gracefully if both -rgb565 and -colors are
specified, because none of its destination managers (image writers)
support color quantization with RGB565. However, prescan_quantize() was
called before that could occur. It is possible but very unlikely that
these issues could have been reproduced in applications other than
djpeg. The issues involve the use of two features (12-bit precision and
RGB565) that are incompatible, and they also involve the use of two
rarely-used legacy features (RGB565 and color quantization) that don't
make much sense when combined.
Fixes#668Fixes#671Fixes#680
12-bit is the only data precision for which the range of the sample data
type exceeds the valid sample range, so it is possible to craft a 12-bit
lossless JPEG image that contains out-of-range 12-bit samples.
Attempting to decompress such an image using color quantization or merged
upsampling (NOTE: libjpeg-turbo cannot generate YCbCr or subsampled
lossless JPEG images, but it can decompress them) caused segfaults or
buffer overruns when those algorithms attempted to use the out-of-range
sample values as array indices. This commit modifies the lossless
decompressor so that it range-limits the output of the scaler when using
12-bit samples.
Fixes#670Fixes#672Fixes#673Fixes#674Fixes#675Fixes#676Fixes#677Fixes#678Fixes#679Fixes#681Fixes#683
If PIC isn't enabled for the entire build (using
CMAKE_POSITION_INDEPENDENT_CODE), then we need to enable it for any of
the objects in the libjpeg-turbo shared libraries. Ideally, however, we
don't want to enable PIC for any of the objects in the libjpeg-turbo
static libraries, unless CMAKE_POSITION_INDEPENDENT_CODE is set. Thus,
we need to build separate static and shared jpeg12 and jpeg16 object
libraries.
Fixes#684
(oversight from 386ec0abc7)
Tiled decompression will ultimately fail if the subsampling type of the
JPEG input image is unknown, but the C version of TJBench needs to fail
earlier in order to avoid using -1 (TJSAMP_UNKNOWN) as an array index
for tjMCUWidth[]/tjMCUHeight[]. The Java version now fails earlier as
well, although there is no benefit to that other than making the error
message less cryptic.
When -tile is used with a JPEG input image, TJBench generates the tiles
using lossless cropping, which will fail if the cropping region doesn't
align with an MCU boundary. Furthermore, there is no reason to avoid
8x8 tiles when decompressing 4:4:4 or grayscale JPEG images.
If we have transformed a 4:1:1 or 4:4:1 JPEG input image in such a way
that the horizontal and vertical dimensions are transposed, then we need
to change the subsampling type that is passed to the decomp() function.
Otherwise, tj3YUVBufSize() may return an incorrect value.
(oversight from fc881ebb21)
When -tile is used with a JPEG input image, TJBench generates the tiles
using lossless cropping, which will fail if the cropping region doesn't
align with an MCU boundary. Furthermore, there is no reason to avoid
8x8 tiles when decompressing 4:4:4 or grayscale JPEG images.
This allows losslessly transposed or rotated 4:1:1 JPEG images to be
losslessly cropped, partially decompressed, or decompressed to planar
YUV images.
Because tj3Transform() allows multiple lossless transformations to be
chained together, all subsampling options need to have a corresponding
transposed subsampling option. (This is why 4:4:0 was originally
implemented as well.) Otherwise, the documentation would be technically
incorrect. It says that images with unknown subsampling types cannot be
losslessly cropped, partially decompressed, or decompressed to planar
YUV images, but it doesn't say anything about images with known
subsampling types whose subsampling type becomes unknown if the image is
rotated or transposed. This is one of those situations in which it is
easier to implement a feature that works around the problem than to
document the problem.
Closes#659
It's not that the build system doesn't support multiple values in
CMAKE_OSX_ARCHITECTURES. It's that libjpeg-turbo, because of its SIMD
extensions, *cannot* support multiple values in CMAKE_OSX_ARCHITECTURES.
It's not that the build system doesn't support multiple values in
CMAKE_OSX_ARCHITECTURES. It's that libjpeg-turbo, because of its SIMD
extensions, *cannot* support multiple values in CMAKE_OSX_ARCHITECTURES.
Even though BUILDING.md and CONTRIBUTING.md explicitly state that the
libjpeg-turbo build system does not and will not support being
integrated into downstream build systems using add_subdirectory() (see
0565548191), people continue to file bug
reports, feature requests, and pull requests regarding that (see #265,
#637, and #653 in addition to the issues listed in
05655481917a2d2761cf2fe19b76f639b7f159ef.) Responding to those
issues wastes our project's limited resources. Hopefully people will
get the hint if the build system explicitly tells them that it can't be
included using add_subdirectory(), which will prompt them to read the
comments in CMakeLists.txt explaining why. To anyone stumbling upon
this commit message, please refer to the discussions under the issues
listed above, as well as the issues listed in
0565548191. Our project's position on
this has been stated, explained, and defended numerous times.
Even though BUILDING.md and CONTRIBUTING.md explicitly state that the
libjpeg-turbo build system does not and will not support being
integrated into downstream build systems using add_subdirectory() (see
0565548191), people continue to file bug
reports, feature requests, and pull requests regarding that (see #265,
#637, and #653 in addition to the issues listed in
05655481917a2d2761cf2fe19b76f639b7f159ef.) Responding to those
issues wastes our project's limited resources. Hopefully people will
get the hint if the build system explicitly tells them that it can't be
included using add_subdirectory(), which will prompt them to read the
comments in CMakeLists.txt explaining why. To anyone stumbling upon
this commit message, please refer to the discussions under the issues
listed above, as well as the issues listed in
0565548191. Our project's position on
this has been stated, explained, and defended numerous times.
Don't keep trying to decompress the same image if tj3Decompress*() has
already thrown an error. Otherwise, if the image has an excessive
number of scans, then each iteration of the loop will try to decompress
up to the scan limit, which may cause the overall test to time out even
if one iteration doesn't time out.
Don't keep trying to decompress the same image if tj3Decompress*() has
already thrown an error. Otherwise, if the image has an excessive
number of scans, then each iteration of the loop will try to decompress
up to the scan limit, which may cause the overall test to time out even
if one iteration doesn't time out.
* tag '2.1.5': (41 commits)
BUILDING.md: Specify install prefix for MinGW/Un*x
Java: Guard against int overflow in size methods
turbojpeg.c: Fix UBSan warning
tjPlane*(): Guard against int overflow
Java doc: TJ.pixelSize --> TJ.getPixelSize()
TJBench: Unset TJ*OPT_CROP when disabling tiling
TJExample: Remove "underlying codec" references
GitHub: Update to actions/checkout@v3
TJBench: Set TJ*OPT_PROGRESSIVE with -progressive
TJBench/Java: Fix parsing of quality ranges
TJBench: Strictly check all non-boolean arguments
TurboJPEG: More documentation improvements
TJDecompressor.java: Exception message tweak
12-bit: Set alpha channel to 4095 rather than 255
TJDecompressor.java: "YUV" = "planar YUV"
Java: Don't allow int overflow in buf size methods
tjDecompressToYUV2: Use scaled dims for plane calc
TurboJPEG: Numerous documentation improvements
TurboJPEG: Don't use backward compatibility macros
TurboJPEG: Ensure 'pad' arg is a power of 2
...
As long as a libjpeg instance is only used by one thread at a time, a
program is technically within its rights to call jpeg_start_*compress()
in one thread and jpeg_(read|write)_*(), with the same libjpeg instance,
in a second thread. However, because the various jsimd_can*() functions
are called within the body of jpeg_start_*compress() and simd_support is
now thread-local (due to f579cc11b3), that
led to a situation in which simd_support was initialized in the first
thread but not the second. The uninitialized value of simd_support is
0xFFFFFFFF, which the second thread interpreted to mean that it could
use any instruction set, and when it attempted to use AVX2 instructions
on a CPU that didn't support them, an illegal instruction error
occurred.
This issue was known to affect libvips.
This commit modifies the i386 and x86-64 SIMD dispatchers so that the
various jsimd_*() functions always call init_simd(), if simd_support is
uninitialized, prior to dispatching based on the value of simd_support.
Note that the other SIMD dispatchers don't need this, because only the
x86 SIMD extensions currently support multiple instruction sets.
This patch has been verified to be performance-neutral to within
+/- 0.4% with 32-bit and 64-bit code running on a 2.8 GHz Intel Xeon
W3530 and a 3.6 GHz Intel Xeon W2123.
Fixes#649
As long as a libjpeg instance is only used by one thread at a time, a
program is technically within its rights to call jpeg_start_*compress()
in one thread and jpeg_(read|write)_*(), with the same libjpeg instance,
in a second thread. However, because the various jsimd_can*() functions
are called within the body of jpeg_start_*compress() and simd_support is
now thread-local (due to f579cc11b3), that
led to a situation in which simd_support was initialized in the first
thread but not the second. The uninitialized value of simd_support is
0xFFFFFFFF, which the second thread interpreted to mean that it could
use any instruction set, and when it attempted to use AVX2 instructions
on a CPU that didn't support them, an illegal instruction error
occurred.
This issue was known to affect libvips.
This commit modifies the i386 and x86-64 SIMD dispatchers so that the
various jsimd_*() functions always call init_simd(), if simd_support is
uninitialized, prior to dispatching based on the value of simd_support.
Note that the other SIMD dispatchers don't need this, because only the
x86 SIMD extensions currently support multiple instruction sets.
This patch has been verified to be performance-neutral to within
+/- 0.4% with 32-bit and 64-bit code running on a 2.8 GHz Intel Xeon
W3530 and a 3.6 GHz Intel Xeon W2123.
Fixes#649
(regression introduced by fc01f4673b)
Oops. In the process of migrating the fuzzers to the TurboJPEG 3 API,
I accidentally left out the code in decompress.cc that updates the width
and height based on the scaling factor (but I apparently included that
code in decompress_yuv.cc.)
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=55573
Since tj3Alloc() now accepts a size_t argument rather than an int
argument, it is no longer necessary to check for signed integer overflow
in the C version of TJBench.
The default install prefix when building under MinGW is chosen based on
the needs of the official build system, which uses MSYS2 to generate
Windows installer packages that install under c:\libjpeg-turbo-gcc[64].
However, attempting to configure the build with that install prefix on
a Un*x machine causes a CMake error.
Fixes#641
Those changes worked around an innocuous UBSan warning that was
exposed by the new TurboJPEG 3 transform fuzz target, due to the fact
that tj3Transform() no longer rejects images with unknown subsampling
configurations. That UBSan warning was a false positive, and attempting
to fix it introduced a buffer overrun triggered by a malformed input
image that causes jpeg_write_marker() to be called with datalen == 0. I
suspect that the UBSan false positive was only reproducible on my local
machine, but I guess we'll see.
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=55413
- Don't report which function was being called when a TurboJPEG error
occurred, because the TurboJPEG error message already contains that
information.
- Use THROW_TJG() for functions that report errors globally instead of
through an instance handle.
- Use THROW_TJ() for the image I/O functions.
- Formatting tweaks
In decompression and transform functions, use the libjpeg API state
rather than a TurboJPEG instance variable to determine whether
jpeg_mem_src_tj() and jpeg_read_header() have already been called by a
wrapper function.
This actually works and apparently always has worked. It only failed
because the libjpeg code, which did not originally support arithmetic
coding, assumed that optimize_coding should always be TRUE for 12-bit
data precision.
(ChangeLog update forthcoming)
- Prefix all function names with "tj3" and remove version suffixes from
function names. (Future API overhauls will increment the prefix to
"tj4", etc., thus retaining backward API/ABI compatibility without
versioning each individual function.)
- Replace stateless boolean flags (including TJ*FLAG_ARITHMETIC and
TJ*FLAG_LOSSLESS, which were never released) with stateful integer
parameters, the value of which persists between function calls.
* Use parameters for the JPEG quality and subsampling as well, in
order to eliminate the awkwardness of specifying function arguments
that weren't relevant for lossless compression.
* tj3DecompressHeader() now stores all relevant information about the
JPEG image, including the width, height, subsampling type, entropy
coding type, etc. in parameters rather than returning that
information in its arguments.
* TJ*FLAG_LIMITSCANS has been reimplemented as an integer parameter
(TJ*PARAM_SCANLIMIT) that allows the number of scans to be
specified.
- Use the const keyword for all pointer arguments to unmodified
buffers, as well as for both dimensions of 2D pointers. Addresses
#395.
- Use size_t rather than unsigned long to represent buffer sizes, since
unsigned long is a 32-bit type on Windows. Addresses #24.
- Return 0 from all buffer size functions if an error occurs, rather
than awkwardly trying to return -1 in an unsigned data type.
- Implement 12-bit and 16-bit data precision using dedicated
compression, decompression, and image I/O functions/methods.
* Suffix the names of all data-precision-specific functions with 8,
12, or 16.
* Because the YUV functions are intended to be used for video, they
are currently only implemented with 8-bit data precision, but they
can be expanded to 12-bit data precision in the future, if
necessary.
* Extend TJUnitTest and TJBench to test 12-bit and 16-bit data
precision, using a new -precision option.
* Add appropriate regression tests for all of the above to the 'test'
target.
* Extend tjbenchtest to test 12-bit and 16-bit data precision, and
add separate 'tjtest12' and 'tjtest16' targets.
* BufferedImage I/O in the Java API is currently limited to 8-bit
data precision, since the BufferedImage class does not
straightforwardly support higher data precisions.
* Extend the PPM reader to convert 12-bit and 16-bit PBMPLUS files
to grayscale or CMYK pixels, as it already does for 8-bit files.
- Properly accommodate lossless JPEG using dedicated parameters
(TJ*PARAM_LOSSLESS, TJ*PARAM_LOSSLESSPSV, and TJ*PARAM_LOSSLESSPT),
rather than using a flag and awkwardly repurposing the JPEG quality.
Update TJBench to properly reflect whether a JPEG image is lossless.
- Re-organize the TJBench usage screen.
- Update the Java docs using Java 11, to improve the formatting and
eliminate HTML frames.
- Use the accurate integer DCT algorithm by default for both
compression and decompression, since the "fast" algorithm is a legacy
feature, it does not pass the ISO compliance tests, and it is not
actually faster on modern x86 CPUs.
* Remove the -accuratedct option from TJBench and TJExample.
- Re-implement the 'tjtest' target using a CMake script that enables
the appropriate tests, depending on the data precision and whether or
not the Java API is part of the build.
- Consolidate the C and Java versions of tjbenchtest into one script.
- Consolidate the C and Java versions of tjexampletest into one script.
- Combine all initialization functions into a single function
(tj3Init()) that accepts an integer parameter specifying the
subsystems to initialize.
- Enable decompression scaling explicitly, using a new function/method
(tj3SetScalingFactor()/TJDecompressor.setScalingFactor()), rather
than implicitly using awkward "desired width"/"desired height"
parameters.
- Introduce a new macro/constant (TJUNSCALED/TJ.UNSCALED) that maps to
a scaling factor of 1/1.
- Implement partial image decompression, using a new function/method
(tj3SetCroppingRegion()/TJDecompressor.setCroppingRegion()) and
TJBench option (-crop). Extend tjbenchtest to test the new feature.
Addresses #1.
- Allow the JPEG colorspace to be specified explicitly when
compressing, using a new parameter (TJ*PARAM_COLORSPACE). This
allows JPEG images with the RGB and CMYK colorspaces to be created.
- Remove the error/difference image feature from TJBench. Identical
images to the ones that TJBench created can be generated using
ImageMagick with
'magick composite <original_image> <output_image> -compose difference <diff_image>'
- Handle JPEG images with unknown subsampling types. TJ*PARAM_SUBSAMP
is set to TJ*SAMP_UNKNOWN (== -1) for such images, but they can still
be decompressed fully into packed-pixel images or losslessly
transformed (with the exception of lossless cropping.) They cannot
be partially decompressed or decompressed into planar YUV images.
Note also that TJBench, due to its lack of support for imperfect
transforms, requires that the subsampling type be known when
rotating, flipping, or transversely transposing an image. Addresses
#436
- The Java version of TJBench now has identical functionality to the C
version. This was accomplished by (somewhat hackishly) calling the
TurboJPEG C image I/O functions through JNI and copying the pixels
between the C heap and the Java heap.
- Add parameters (TJ*PARAM_RESTARTROWS and TJ*PARAM_RESTARTBLOCKS) and
a TJBench option (-restart) to allow the restart marker interval to
be specified when compressing. Eliminate the undocumented TJ_RESTART
environment variable.
- Add a parameter (TJ*PARAM_OPTIMIZE), a transform option
(TJ*OPT_OPTIMIZE), and a TJBench option (-optimize) to allow
optimized baseline Huffman coding to be specified when compressing.
Eliminate the undocumented TJ_OPTIMIZE environment variable.
- Add parameters (TJ*PARAM_XDENSITY, TJ*PARAM_DENSITY, and
TJ*DENSITYUNITS) to allow the pixel density to be specified when
compressing or saving a Windows BMP image and to be queried when
decompressing or loading a Windows BMP image. Addresses #77.
- Refactor the fuzz targets to use the new API.
* Extend decompression coverage to 12-bit and 16-bit data precision.
* Replace the awkward cjpeg12 and cjpeg16 targets with proper
TurboJPEG-based compress12, compress12-lossless, and
compress16-lossless targets
- Fix innocuous UBSan warnings uncovered by the new fuzzers.
- Implement previous versions of the TurboJPEG API by wrapping the new
functions (tested by running the 2.1.x versions of TJBench, via
tjbenchtest, and TJUnitTest against the new implementation.)
* Remove all JNI functions for deprecated Java methods and implement
the deprecated methods using pure Java wrappers. It should be
understood that backward API compatibility in Java applies only to
the Java classes and that one cannot mix and match a JAR file from
one version of libjpeg-turbo with a JNI library from another
version.
- tj3Destroy() now silently accepts a NULL handle.
- tj3Alloc() and tj3Free() now return/accept void pointers, as malloc()
and free() do.
- The image I/O functions now accept a TurboJPEG instance handle, which
is used to transmit/receive parameters and to receive error
information.
Closes#517
Because Java array sizes are ints, the various size methods in the TJ
class have int return values. Thus, we have to guard against signed
int overflow at the JNI level, because the C functions can return sizes
greater than INT_MAX.
This also adds a test for TJ.planeWidth() and TJ.planeHeight(), in order
to validate 8a1526a442 in Java.
tjPlaneWidth() and tjPlaneHeight() could overflow a signed int and
return a negative value if passed a width/height argument of INT_MAX and
a subsampling type for which the MCU block size is larger than 8x8.
The documented behavior of the -progressive option is to use progressive
entropy coding in JPEG images generated by compression and transform
operations. However, setting TJFLAG_PROGRESSIVE was insufficient to
accomplish that, because TJBench doesn't enable lossless transformation
if xformOpt == 0.
- TJBench/TJUnitTest: Wordsmith command-line output
- Java: "decompress operations"="decompression operations"
- tjLoadImage(): Error message tweak
- Don't mention compression performance in the description of
TJXOPT_PROGRESSIVE/TJTransform.OPT_PROGRESSIVE, because the image has
already been compressed at that point.
(Oversights from 9a146f0f23)
This is similar to the fix that 2a9e3bd743
applied to the C API. We have to apply it separately at the JNI level
because the Java API always stores buffer sizes in 32-bit integers, and
the C buffer size functions could overflow an int when using 64-bit
code. (NOTE: The Java API stores buffer sizes in 32-bit integers
because Java itself always uses 32-bit integers for array sizes.) Since
Java don't allow no buffer overruns 'round here, this commit doesn't
change the ultimate outcome. It just makes the inevitable exception
easier to diagnose.
The documented behavior of the function is to use decompression scaling
to generate the largest possible image that will fit within the desired
image dimensions. Thus, if the desired image dimensions are larger than
the scaled image dimensions, then tjDecompressToYUV2() should use the
scaled image dimensions when computing the plane pointers and strides to
pass to tjDecompressToYUVPlanes().
Note that this bug was not previously detected, because tjunittest and
tjbench always passed the scaled image dimensions to
tjDecompressToYUV2().
- Wordsmithing, formatting, and grammar tweaks
- Various clarifications and corrections, including specifying whether
a particular buffer or image is used as a source or destination
- Accommodate/mention features that were introduced since the API
documentation was created.
- For clarity, use "packed-pixel" to describe uncompressed
source/destination images that are not planar YUV.
- Use "row" rather than "line" to refer to a single horizontal group of
pixels or component values, for consistency with the libjpeg API
documentation. (libjpeg also uses "scanline", which is a more archaic
term.)
- Use "alignment" rather than "padding" to refer to the number of bytes
by which a row's width is evenly divisible. This consistifies the
documention of the YUV functions and tjLoadImage(). ("Padding"
typically refers to the number of bytes added to each row, which is
not the same thing.)
- Remove all references to "the underlying codec." Although the
TurboJPEG API originated as a cross-platform wrapper for the Intel
Integrated Performance Primitives, Sun mediaLib, QuickTime, and
libjpeg, none of those TurboJPEG implementations has been maintained
since 2009. Nothing would prevent someone from implementing the
TurboJPEG API without libjpeg-turbo, but such an implementation would
not necessarily have an "underlying codec." (It could be fully
self-contained.)
- Use "destination image" rather than "output image", for consistency,
or describe the type of image that will be output.
- Avoid the term "image buffer" and instead use "byte buffer" to
refer to buffers that will hold JPEG images, or describe the type of
image that will be contained in the buffer. (The Java documentation
doesn't use "byte buffer", because the buffer arrays literally have
"byte" in front of them, and since Java doesn't have pointers, it is
not possible for mere mortals to store any other type of data in those
arrays.)
- C: Use "unified" to describe YUV images stored in a single buffer, for
consistency with the Java documentation.
- Use "planar YUV" rather than "YUV planar". Is is our convention to
describe images using {component layout} {colorspace/pixel format}
{image function}, e.g. "packed-pixel RGB source image" or "planar YUV
destination image."
- C: Document the TurboJPEG API version in which a particular function
or macro was introduced, and reorder the backward compatibility
function stubs in turbojpeg.h alphabetically by API version.
- C: Use Markdown rather than HTML tags, where possible, in the Doxygen
comments.
Macros from older versions of the TurboJPEG API are supported but not
documented, so using the current version of those macros makes the code
more readable.
Because the PAD() macro can only handle powers of 2, this is a necessary
restriction (and a documented one, except in the case of
tjCompressFromYUV()-- oops.) Failing to check the 'pad' argument
caused tjBufSizeYUV2() to return bogus results if 'pad' was less than 1
or otherwise not a power of 2. tjEncodeYUV3() and tjDecodeYUV()
effectively treated a 'pad' value of 0 as unpadded, but that was subtle
and undocumented behavior. tjCompressFromYUV() did not check whether
'pad' was a power of 2, so the strides passed to
tjCompressFromYUVPlanes() would have been incorrect if 'pad' was not a
power of 2. That would not have caused tjCompressFromYUV() to overrun
the source buffer, as long as the calling application allocated the
buffer based on the return value of tjBufSizeYUV2() (which computes the
strides in the same manner as tjCompressFromYUV().) However, if the
calling application attempted to initialize the source buffer using
correctly-computed strides, then it could have overrun its own
buffer in certain cases or produced incorrect JPEG images in others.
Realistically, there is no reason why an application would want to pass
a non-power-of-2 'pad' value to a TurboJPEG API function, so this commit
is about user-proofing the API rather than fixing any known issue.
(oversight from 2241434eb9)
This fixes segfaults and other issues that occurred when attempting to
decompress a 16-bit lossless JPEG image using color quantization
(which is used when decompressing to a GIF image.)
requant_comp() in transupp.c, a function that supports the jpegtran
-drop option, borrows code from the C quantization function in order to
re-quantize the coefficients from the dropped image. However, the
function does not guard against the possibility that a corrupt source
image could inject quantization table values equal to 0, thus causing a
divide-by-zero error. Since this error affected only jpegtran and not
any of the libraries (the tjTransform() function in the TurboJPEG API
does not expose the image drop feature), it did not represent a security
risk. In fact, this commit does not change the output of jpegtran when
attempting to transform the aforementioned corrupt source image. It
merely eliminates the floating point exception. Like most issues of
this type, however, eliminating the error prevents it from hiding
legitimate security issues that may later be introduced.
Fixes#635Fixes#636
Unfortunately, iOS builds cannot be used with the iOS simulator on Macs
with Apple silicon CPUs. Even more unfortunately, universal binaries
can only have one slice for each CPU architecture, so it would not be
possible to add a dedicated Arm64 iOS simulator slice to the existing
libjpeg-turbo iOS binaries. (It would be necessary to release a
separate package solely for the iOS simulator.) Because the Arm Neon
SIMD extensions for libjpeg-turbo now use compiler intrinsics when
building with Xcode, it is easy to build libjpeg-turbo from source when
targeting Arm64-based Apple platforms. Thus, for the moment, I have
chosen to document how to avoid the pothole rather than to fill it in.
cjpeg relies on the various file I/O modules to range-limit the input
samples, but no range limiting is performed by the
jpeg_write_scanlines() function itself. With 8-bit samples, that isn't
a problem, because sample values > MAXJSAMPLE will overflow the data
type and wrap around to 0. With 12-bit samples, however, it is possible
to pass sample values < 0 or > 4095 to jpeg_write_scanlines(), which
would cause the RGB-to-YCbCr color converter to underflow or overflow
the RGB-to-YCbCr conversion tables. That issue has existed in libjpeg
all along. This commit mitigates the issue by masking off all but the
lowest 12 bits of each 12-bit input sample prior to using the input
sample value to index the RGB-to-YCbCr conversion tables.
Fixes#633
Because of e8b40f3c2b, we now support
multiple data precisions in the same libjpeg API library, so
BITS_IN_JSAMPLE is no longer used to specify the data precision at build
time. However, some downstream software expects the macro to be
defined. Since 12-bit data precision is an opt-in feature that requires
explicitly calling 12-bit-specific libjpeg API functions and using
12-bit-specific data types, the unmodified portion of the libjpeg API
still behaves as if it were built for 8-bit precision, and JSAMPLE is
still literally an 8-bit data type. Thus, it is correct to externally
define BITS_IN_JSAMPLE to 8 in jconfig.h. Since the build system also
uses BITS_IN_JSAMPLE internally to build both 8-bit and 12-bit versions
of relevant modules, the definition of BITS_IN_JSAMPLE in jconfig.h is
now guarded by #ifndef BITS_IN_JSAMPLE. (Hopefully that doesn't cause
any problems.)
Fixes#632
That macro has been defined in jconfig.h since libjpeg-turbo 1.3.x, so
it is possible that some downstream software conditions the use of the
jpeg_mem_*() functions on whether MEM_SRCDST_SUPPORTED is defined or
JPEG_LIB_VERSION >= 80 (as libjpeg-turbo 2.1.x and prior did
internally.)
TJFLAG_LOSSLESS is irrelevant to planar YUV encoding, and setting the
flag caused tjEncode*() to fail with "Invalid lossless parameters"
because tjEncodeYUVPlanes() passes a JPEG quality value of -1 to
setCompDefaults(). This commit modifies setCompDefaults() so that it
takes no action related to the jpegQual parameter unless jpegQual >= 0.
Add a new TurboJPEG C API function (tjDecompressHeader4()) and Java API
method (TJDecompressor.getFlags()) that return the bitwise OR of any
flags that are relevant to the JPEG image being decompressed (currently
TJFLAG_PROGRESSIVE, TJFLAG_ARITHMETIC, TJFLAG_LOSSLESS, and their Java
equivalents.) This allows a calling program to determine whether the
image being decompressed is a lossless JPEG image, which means that the
decompression scaling feature will not be available and that a
full-sized destination buffer should be allocated.
More specifically, this fixes a buffer overrun in TJBench, TJExample,
and the decompress* fuzz targets that occurred when attempting (in vain)
to decompress a lossless JPEG image with decompression scaling enabled.
Lossless: Accommodate LJT colorspace/SIMD exts
In libjpeg-turbo, grayscale_convert() and null_convert() aren't the only
lossless color conversion algorithms. We can also losslessly convert
RGB to and from any of the extended RGB colorspaces, and some platforms
have SIMD-accelerated null color conversion.
This commit also disallows RGB565 output in lossless mode, and it moves
the IsExtRGB() macro from cdjpeg.h to jpegint.h and repurposes it to
make jinit_color_converter() and jinit_color_deconverter() more
readable.
- Rename jpeg_simple_lossless() to jpeg_enable_lossless() and modify the
function so that it stores the lossless parameters directly in the Ss
and Al fields of jpeg_compress_struct rather than using a scan script.
- Move the cjpeg -lossless switch into "Switches for advanced users".
- Document the libjpeg API and run-time features that are unavailable in
lossless mode, and ensure that all parameters, functions, and switches
related to unavailable features are ignored or generate errors in
lossless mode.
- Defer any action that depends on whether lossless mode is enabled
until jpeg_start_compress()/jpeg_start_decompress() is called.
- Document the purpose of the point transform value.
- "Codec" stands for coder/decoder, so it is a bit awkward to say
"lossless compression codec" and "lossless decompression codec".
Use "lossless compressor" and "lossless decompressor" instead.
- Restore backward API/ABI compatibility with libjpeg v6b:
* Move the new 'lossless' field from the exposed jpeg_compress_struct
and jpeg_decompress_struct structures into the opaque
jpeg_comp_master and jpeg_decomp_master structures, and allocate the
master structures in the body of jpeg_create_compress() and
jpeg_create_decompress().
* Remove the new 'process' field from jpeg_compress_struct and
jpeg_decompress_struct and replace it with the old
'progressive_mode' field and the new 'lossless' field.
* Remove the new 'data_unit' field from jpeg_compress_struct and
jpeg_decompress_struct and replace it with a locally-computed
data unit variable.
* Restore the names of macros and fields that refer to DCT blocks, and
document that they have a different meaning in lossless mode. (Most
of them aren't very meaningful in lossless mode anyhow.)
* Remove the new alloc_darray() method from jpeg_memory_mgr and
replace it with an internal macro that wraps the alloc_sarray()
method.
* Move the JDIFF* data types from jpeglib.h and jmorecfg.h into
jpegint.h.
* Remove the new 'codec' field from jpeg_compress_struct and
jpeg_decompress_struct and instead reuse the existing internal
coefficient control, forward/inverse DCT, and entropy
encoding/decoding structures for lossless compression/decompression.
* Repurpose existing error codes rather than introducing new ones.
(The new JERR_BAD_RESTART and JWRN_MUST_DOWNSCALE codes remain,
although JWRN_MUST_DOWNSCALE will probably be removed in
libjpeg-turbo, since we have a different way of handling multiple
data precisions.)
- Automatically enable lossless mode when a scan script with parameters
that are only valid for lossless mode is detected, and document the
use of scan scripts to generate lossless JPEG images.
- Move the sequential and shared Huffman routines back into jchuff.c and
jdhuff.c, and document that those routines are shared with jclhuff.c
and jdlhuff.c as well as with jcphuff.c and jdphuff.c.
- Move MAX_DIFF_BITS from jchuff.h into jclhuff.c, the only place where
it is used.
- Move the predictor and scaler code into jclossls.c and jdlossls.c.
- Streamline register usage in the [un]differencers (inspired by similar
optimizations in the color [de]converters.)
- Restructure the logic in a few places to reduce duplicated code.
- Ensure that all lossless-specific code is guarded by
C_LOSSLESS_SUPPORTED or D_LOSSLESS_SUPPORTED and that the library can
be built successfully if either or both of those macros is undefined.
- Remove all short forms of external names introduced by the lossless
JPEG patch. (These will not be needed by libjpeg-turbo, so there is
no use cleaning them up.)
- Various wordsmithing, formatting, and punctuation tweaks
- Eliminate various compiler warnings.
Fix segfault when decomp lossless JPEG w/ restarts
The predict_process_restart() method in jpeg_lossless_decompressor was
unset, because we now use the start_pass() method in jpeg_inverse_dct
instead. Thus, a segfault occurred when attempting to decompress a
lossless JPEG that contained restart markers.
- Rename jpeg_simple_lossless() to jpeg_enable_lossless() and modify the
function so that it stores the lossless parameters directly in the Ss
and Al fields of jpeg_compress_struct rather than using a scan script.
- Move the cjpeg -lossless switch into "Switches for advanced users".
- Document the libjpeg API and run-time features that are unavailable in
lossless mode, and ensure that all parameters, functions, and switches
related to unavailable features are ignored or generate errors in
lossless mode.
- Defer any action that depends on whether lossless mode is enabled
until jpeg_start_compress()/jpeg_start_decompress() is called.
- Document the purpose of the point transform value.
- "Codec" stands for coder/decoder, so it is a bit awkward to say
"lossless compression codec" and "lossless decompression codec".
Use "lossless compressor" and "lossless decompressor" instead.
- Restore backward API/ABI compatibility with libjpeg v6b:
* Move the new 'lossless' field from the exposed jpeg_compress_struct
and jpeg_decompress_struct structures into the opaque
jpeg_comp_master and jpeg_decomp_master structures, and allocate the
master structures in the body of jpeg_create_compress() and
jpeg_create_decompress().
* Remove the new 'process' field from jpeg_compress_struct and
jpeg_decompress_struct and replace it with the old
'progressive_mode' field and the new 'lossless' field.
* Remove the new 'data_unit' field from jpeg_compress_struct and
jpeg_decompress_struct and replace it with a locally-computed
data unit variable.
* Restore the names of macros and fields that refer to DCT blocks, and
document that they have a different meaning in lossless mode. (Most
of them aren't very meaningful in lossless mode anyhow.)
* Remove the new alloc_darray() method from jpeg_memory_mgr and
replace it with an internal macro that wraps the alloc_sarray()
method.
* Move the JDIFF* data types from jpeglib.h and jmorecfg.h into
jpegint.h.
* Remove the new 'codec' field from jpeg_compress_struct and
jpeg_decompress_struct and instead reuse the existing internal
coefficient control, forward/inverse DCT, and entropy
encoding/decoding structures for lossless compression/decompression.
* Repurpose existing error codes rather than introducing new ones.
(The new JERR_BAD_RESTART and JWRN_MUST_DOWNSCALE codes remain,
although JWRN_MUST_DOWNSCALE will probably be removed in
libjpeg-turbo, since we have a different way of handling multiple
data precisions.)
- Automatically enable lossless mode when a scan script with parameters
that are only valid for lossless mode is detected, and document the
use of scan scripts to generate lossless JPEG images.
- Move the sequential and shared Huffman routines back into jchuff.c and
jdhuff.c, and document that those routines are shared with jclhuff.c
and jdlhuff.c as well as with jcphuff.c and jdphuff.c.
- Move MAX_DIFF_BITS from jchuff.h into jclhuff.c, the only place where
it is used.
- Move the predictor and scaler code into jclossls.c and jdlossls.c.
- Streamline register usage in the [un]differencers (inspired by similar
optimizations in the color [de]converters.)
- Restructure the logic in a few places to reduce duplicated code.
- Ensure that all lossless-specific code is guarded by
C_LOSSLESS_SUPPORTED or D_LOSSLESS_SUPPORTED and that the library can
be built successfully if either or both of those macros is undefined.
- Remove all short forms of external names introduced by the lossless
JPEG patch. (These will not be needed by libjpeg-turbo, so there is
no use cleaning them up.)
- Various wordsmithing, formatting, and punctuation tweaks
- Eliminate various compiler warnings.
llvm-rc doesn't like the copyright symbol in the LegalCopyright string.
Possible solutions:
1. Replace the copyright symbol with "(C)".
2. Prefix the string literal with L, thus making it a wide character
string.
3. Pass -c65001 to the resource compiler to enable the UTF-8 code page.
Option 2 is the least disruptive, since it doesn't change the behavior
of the official builds or require any build system modifications.
Closes#627
Regression introduced by 16bd984557 and
5b177b3cab
The pre-computed absolute values used in encode_mcu_AC_first() and
encode_mcu_AC_refine() were stored in a JCOEF (signed short) array.
When attempting to losslessly transform a specially-crafted malformed
12-bit JPEG image with a coefficient value of -32768 into a progressive
12-bit JPEG image, the progressive Huffman encoder attempted to store
the absolute value of -32768 in the JCOEF array, thus overflowing the
16-bit signed data type. Therefore, at this point in the code:
8c5e78ce29/jcphuff.c (L889)
the absolute value was read as -32768, which caused the test at
8c5e78ce29/jcphuff.c (L896)
to fail, falling through to
8c5e78ce29/jcphuff.c (L908)
with an overly large value of r (46) that, when shifted left four
places, incremented, and passed to emit_symbol(), exceeded the maximum
index (255) for the derived code tables. Fortunately, the buffer
overrun was fully contained within phuff_entropy_encoder, so the issue
did not generate a segfault or other user-visible errant behavior, but
it did cause a UBSan failure that was detected by OSS-Fuzz.
This commit introduces an unsigned JCOEF (UJCOEF) data type and uses it
to store the absolute values of DCT coefficients computed by the
AC_first_prepare() and AC_refine_prepare() methods.
Note that the changes to the Arm Neon progressive Huffman encoder
extensions cause signed 16-bit instructions to be replaced with
equivalent unsigned 16-bit instructions, so the changes should be
performance-neutral.
Based on:
bbf61c0382Closes#628
- Because of b5a9ef64ea, "by default" is
no longer applicable. (12-bit-per-component JPEG support is now part
of the core libjpeg-turbo functionality and cannot be disabled.)
- Change awkward "can be used to enable the creation of" to less awkward
"can be used to create".
- Rename jpeg_simple_lossless() to jpeg_enable_lossless() and modify the
function so that it stores the lossless parameters directly in the Ss
and Al fields of jpeg_compress_struct rather than using a scan script.
- Move the cjpeg -lossless switch into "Switches for advanced users".
- Document the libjpeg API and run-time features that are unavailable in
lossless mode, and ensure that all parameters, functions, and switches
related to unavailable features are ignored or generate errors in
lossless mode.
- Defer any action that depends on whether lossless mode is enabled
until jpeg_start_compress()/jpeg_start_decompress() is called.
- Document the purpose of the point transform value.
- "Codec" stands for coder/decoder, so it is a bit awkward to say
"lossless compression codec" and "lossless decompression codec".
Use "lossless compressor" and "lossless decompressor" instead.
- Restore backward API/ABI compatibility with libjpeg v6b:
* Move the new 'lossless' field from the exposed jpeg_compress_struct
and jpeg_decompress_struct structures into the opaque
jpeg_comp_master and jpeg_decomp_master structures, and allocate the
master structures in the body of jpeg_create_compress() and
jpeg_create_decompress().
* Remove the new 'process' field from jpeg_compress_struct and
jpeg_decompress_struct and replace it with the old
'progressive_mode' field and the new 'lossless' field.
* Remove the new 'data_unit' field from jpeg_compress_struct and
jpeg_decompress_struct and replace it with a locally-computed
data unit variable.
* Restore the names of macros and fields that refer to DCT blocks, and
document that they have a different meaning in lossless mode. (Most
of them aren't very meaningful in lossless mode anyhow.)
* Remove the new alloc_darray() method from jpeg_memory_mgr and
replace it with an internal macro that wraps the alloc_sarray()
method.
* Move the JDIFF* data types from jpeglib.h and jmorecfg.h into
jpegint.h.
* Remove the new 'codec' field from jpeg_compress_struct and
jpeg_decompress_struct and instead reuse the existing internal
coefficient control, forward/inverse DCT, and entropy
encoding/decoding structures for lossless compression/decompression.
* Repurpose existing error codes rather than introducing new ones.
(The new JERR_BAD_RESTART and JWRN_MUST_DOWNSCALE codes remain,
although JWRN_MUST_DOWNSCALE will probably be removed in
libjpeg-turbo, since we have a different way of handling multiple
data precisions.)
- Automatically enable lossless mode when a scan script with parameters
that are only valid for lossless mode is detected, and document the
use of scan scripts to generate lossless JPEG images.
- Move the sequential and shared Huffman routines back into jchuff.c and
jdhuff.c, and document that those routines are shared with jclhuff.c
and jdlhuff.c as well as with jcphuff.c and jdphuff.c.
- Move MAX_DIFF_BITS from jchuff.h into jclhuff.c, the only place where
it is used.
- Move the predictor and scaler code into jclossls.c and jdlossls.c.
- Streamline register usage in the [un]differencers (inspired by similar
optimizations in the color [de]converters.)
- Restructure the logic in a few places to reduce duplicated code.
- Ensure that all lossless-specific code is guarded by
C_LOSSLESS_SUPPORTED or D_LOSSLESS_SUPPORTED and that the library can
be built successfully if either or both of those macros is undefined.
- Remove all short forms of external names introduced by the lossless
JPEG patch. (These will not be needed by libjpeg-turbo, so there is
no use cleaning them up.)
- Various wordsmithing, formatting, and punctuation tweaks
- Eliminate various compiler warnings.
In libjpeg-turbo 2.1.x and prior, the WITH_12BIT CMake variable was used
to enable 12-bit JPEG support at compile time, because the libjpeg API
library could not handle multiple JPEG data precisions at run time. The
initial approach to handling multiple JPEG data precisions at run time
(7fec5074f9) created a whole new API,
library, and applications for 12-bit data precision, so it made sense to
repurpose WITH_12BIT to allow 12-bit data precision to be disabled.
e8b40f3c2b made it so that the libjpeg API
library can handle multiple JPEG data precisions at run time via a
handful of straightforward API extensions. Referring to
6c2bc901e2, it hasn't been possible to
build libjpeg-turbo with both forward and backward libjpeg API/ABI
compatibility since libjpeg-turbo 1.4.x. Thus, whereas we retain full
backward API/ABI compatibility with libjpeg v6b-v8, forward libjpeg
API/ABI compatibility ceased being realistic years ago, so it no longer
makes sense to provide compile-time options that give a false sense of
forward API/ABI compatibility by allowing some (but not all) of our
libjpeg API extensions to be disabled. Such options are difficult to
maintain and clutter the code with #ifdefs.
cjpeg doesn't accept image format arguments other than -targa (it
auto-detects the others), and since Targa images aren't supported with
12-bit precision, we don't need a second pass.
The Gordian knot that 7fec5074f9 attempted
to unravel was caused by the fact that there are several
data-precision-dependent (JSAMPLE-dependent) fields and methods in the
exposed libjpeg API structures, and if you change the exposed libjpeg
API structures, then you have to change the whole API. If you change
the whole API, then you have to provide a whole new library to support
the new API, and that makes it difficult to support multiple data
precisions in the same application. (It is not impossible, as example.c
demonstrated, but using data-precision-dependent libjpeg API structures
would have made the cjpeg, djpeg, and jpegtran source code hard to read,
so it made more sense to build, install, and package 12-bit-specific
versions of those applications.)
Unfortunately, the result of that initial integration effort was an
unreadable and unmaintainable mess, which is a problem for a library
that is an ISO/ITU-T reference implementation. Also, as I dug into the
problem of lossless JPEG support, I realized that 16-bit lossless JPEG
images are a thing, and supporting yet another version of the libjpeg
API just for those images is untenable.
In fact, however, the touch points for JSAMPLE in the exposed libjpeg
API structures are minimal:
- The colormap and sample_range_limit fields in jpeg_decompress_struct
- The alloc_sarray() and access_virt_sarray() methods in
jpeg_memory_mgr
- jpeg_write_scanlines() and jpeg_write_raw_data()
- jpeg_read_scanlines() and jpeg_read_raw_data()
- jpeg_skip_scanlines() and jpeg_crop_scanline()
(This is subtle, but both of those functions use JSAMPLE-dependent
opaque structures behind the scenes.)
It is much more readable and maintainable to provide 12-bit-specific
versions of those six top-level API functions and to document that the
aforementioned methods and fields must be type-cast when using 12-bit
samples. Since that eliminates the need to provide a 12-bit-specific
version of the exposed libjpeg API structures, we can:
- Compile only the precision-dependent libjpeg modules (the
coefficient buffer controllers, the colorspace converters, the
DCT/IDCT managers, the main buffer controllers, the preprocessing
and postprocessing controller, the downsampler and upsamplers, the
quantizers, the integer DCT methods, and the IDCT methods) for
multiple data precisions.
- Introduce 12-bit-specific methods into the various internal
structures defined in jpegint.h.
- Create precision-independent data type, macro, method, field, and
function names that are prefixed by an underscore, and use an
internal header to convert those into precision-dependent data
type, macro, method, field, and function names, based on the value
of BITS_IN_JSAMPLE, when compiling the precision-dependent libjpeg
modules.
- Expose precision-dependent jinit*() functions for each of the
precision-dependent libjpeg modules.
- Abstract the precision-dependent libjpeg modules by calling the
appropriate precision-dependent jinit*() function, based on the
value of cinfo->data_precision, from top-level libjpeg API
functions.
By default, libjpeg-turbo 1.3.x and later have enabled the in-memory
source/destination manager functions from libjpeg v8 when emulating the
libjpeg v6b or v7 API/ABI, which has allowed operating system
distributors to provide those functions without adopting the
backward-incompatible libjpeg v8 API/ABI.
Prior to libjpeg-turbo 1.5.x, it made sense to allow users to disable
the in-memory source/destination manager functions at build time and
thus retain both backward and forward API/ABI compatibility relative to
libjpeg v6b or v7. Since then, however, we have introduced several new
libjpeg API functions that break forward API/ABI compatibility, so it no
longer makes sense to allow the in-memory source/destination managers to
be disabled. libjpeg-turbo only claims to be
backward-API/ABI-compatible, i.e. to allow applications built against
libjpeg or an older version of libjpeg-turbo to work properly with the
current version of libjpeg-turbo.
- Fix an issue whereby a build with ENABLE_SHARED=0 could not be
installed when using the Ninja Multi-Config CMake generator.
- Fix an issue whereby a Windows installer could not be built when using
the Ninja Multi-Config CMake generator.
- Fix an issue whereby the Java regression tests failed when using the
Ninja Multi-Config CMake generator.
Based on:
4f169deeb0Closes#626
... on platforms that support TLS, which should include all
currently-supported platforms
(https://libjpeg-turbo.org/Documentation/OfficialBinaries)
Addresses a concern raised in #87
Although it is still my opinion that the data race in init_simd() was
innocuous, we can now fix it for free thanks to
ae87a95861, so why not?
This commit reverts 4dbc293125 and
9f8f683e74 (the previous two commits) and
fixes#613 the correct way. The crux of the issue wasn't the size of
the whole_image virtual array but rather that, since last_iMCU_row is
unsigned, (last_iMCU_row - 1) wrapped around to 0xFFFFFFFF when
last_iMCU_row was 0. This caused the interblock smoothing algorithm
introduced in 6d91e950c8 to erroneously
try to access the next two iMCU rows, neither of which existed. The
first attempt at a fix (4dbc293125)
exposed a NULL dereference, detected by OSS-Fuzz, that occurred when
attempting to decompress a specially-crafted malformed JPEG image to a
YUV buffer using tjDecompressToYUV*() with 1/4 IDCT scaling.
Fixes#613 (again)
Also fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=49898
Regression introduced by 6d91e950c8
Because we're now using a 5x5 smoothing window when decompressing
progressive JPEG images, we need to ensure that the whole_image virtual
array contains at least five rows. Previously that was not always the
case unless the progressive JPEG image being decompressed had at least
five iMCU rows. Since an iMCU has a height of (8 * the vertical
sampling factor), attempting to decompress 4:2:2 and 4:4:4 images <= 32
pixels in height or 4:2:0 images <= 64 pixels in height triggered a
JERR_BAD_VIRTUAL_ACCESS error in decompress_smooth_data(), because
access_rows exceeded the number of rows in the virtual array.
Fixes#613
libjpeg-turbo's AltiVec SIMD extensions previously assumed that AltiVec
instructions were available on all Power Macs that supported OS X 10.4
"Tiger" (the earliest version of OS X that libjpeg-turbo has ever
supported), but Tiger can actually run on PowerPC G3 processors, which
lack AltiVec instructions. This commit enables run-time detection of
AltiVec instructions on OS X/PowerPC systems if AltiVec instructions are
not force-enabled at compile time (using -maltivec). This allows the
same build of libjpeg-turbo to support G3, G4, and G5 Power Macs.
Closes#609
The macros in jerror.h refer to j_common_ptr, so it is unfortunately
necessary to introduce a 12-bit-specific version of that header file
(j12error.h) with 12-bit specific ERREXIT*(), WARNMS*(), and
TRACEMS*() macros. (The message table is still shared between 8-bit and
12-bit implementations.)
Fixes#607
Although it is uncommon, some downstream implementations undefine one or
more of the *_SUPPORTED macros in jmorecfg.h in order to reduce the size
of the library. In the interest of maintaining backward compatibility
with libjpeg, this is still a supported use case.
Regression introduced by 9120a24743
Based on:
74c4d032f0Closes#601
MinGW defines strcasecmp() and strncasecmp() as macros in string.h if
__CRT__NO_INLINE is defined, which will be the case when including any
of the Win32 API headers.
Closes#594
Referring to https://github.com/google/oss-fuzz/issues/7575, if the
fuzzer suffix contains periods, it can cause ClusterFuzz to misinterpret
the file extension of the fuzzer executables and thus misidentify them.
When building without the SIMD extensions, memory allocations are
currently aligned to sizeof(double). However, this may be insufficient
on architectures such as Arm Morello or 64-bit CHERI-RISC-V where
pointers require 16-byte rather than 8-byte alignment. This patch
causes memory allocations to be aligned to
MAX(sizeof(void *), sizeof(double)) when building without the SIMD
extensions.
(NOTE: With C11 we could instead use alignof(max_align_t), but C89
compatibility is still necessary in libjpeg-turbo.)
Closes#587
Referring to the conversation in
https://github.com/google/oss-fuzz/issues/7479 and #559, there was a
misunderstanding regarding how CIFuzz works. It cannot be used to fuzz
arbitrary PRs or code branches, and it has a 90-day delay in downloading
corpora from OSS-Fuzz. That makes it unsuitable for libjpeg-turbo.
With x86-64 builds, the default value of FLOATTEST works with both the
8-bit-per-sample and 12-bit-per-sample flavors of the libjpeg API
library. However, that is not the case with x86 builds. Thus, we need
separate 8-bit-per-sample and 12-bit-per-sample FLOATTEST variables.
Newer versions of the 32-bit x86 Visual Studio compiler produce results
compatible with FLOATTEST=no-fp-contract, so we can no longer
intelligently set a default FLOATTEST value for that platform.
When 12-bit-per-component JPEG support is enabled (WITH_12BIT=1) or the
TurboJPEG API library and associated test programs are disabled
(WITH_TURBOJPEG=0), the Windows installer target should not depend on
the turbojpeg, turbojpeg-static, and tjbench targets.
(broken by 607b668ff9)
- Visual Studio 2010 apparently doesn't have the snprintf() inline
function, so restore the macro that emulates that function using
_snprintf_s().
- Explicitly include errno.h in strtest.c, since jinclude.h doesn't
include it when building with Visual Studio.
- Remove the section in libjpeg.txt that advised against building
libjpeg as a shared library. We obviously do not follow that advice,
and libjpeg-turbo does guarantee backward ABI compatibility in our
libjpeg API library, even though libjpeg did not and does not.
(Future expansion of our libjpeg API library, if necessary, will be
accomplished using get/set functions that store the new parameters
in the opaque master structs. Refer to
db2986c96f.)
- Unmention install.txt, which was never relevant to libjpeg-turbo and
was removed in v1.3 (6f96153c67).
- Remove extraneous spaces.
- Document the fact that TWO_FILE_COMMANDLINE must be defined in order
to use the two-file interface with cjpeg, djpeg, jpegtran, and
wrjpgcom. libjpeg-turbo never enables that interface by default.
People keep trying to include libjpeg-turbo into downstream CMake-based
build systems by way of the add_subdirectory() function and requesting
upstream support when something inevitably breaks.
(Refer to: #122, #173, #176, #202, #241, #349, #353, #412, #504,
a3d4aadd0d (commitcomment-67575889)).
libjpeg-turbo has never supported that method of sub-project
integration, because doing so would require that we (minimally):
1. avoid using certain CMake variables, such as CMAKE_SOURCE_DIR,
CMAKE_BINARY_DIR, and CMAKE_PROJECT_NAME;
2. avoid using implicit include directories and relative paths;
3. provide a way to optionally skip the installation of libjpeg-turbo
components in response to 'make install';
4. provide a way to optionally postfix target names, to avoid namespace
conflicts;
5. restructure the top-level CMakeLists.txt so that it properly sets
the PROJECT_VERSION variable; and
6. design automated CI tests to ensure that new commits don't break
any of the above.
Even if we did all of that, issues would still arise, because it is
impossible for one upstream build system to anticipate the widely
varying needs of every downstream build system. That's why the CMake
ExternalProject_Add() function exists, and it is my sincere hope that
adding a blurb to BUILDING.md mentioning the need to use that function
will head off future GitHub issues on this topic. If not, then I can at
least post a link to this commit and the blurb and avoid doing the same
song and dance over and over again.
The loop in jsimd_quantize_neon() is only executed twice and should be
unrolled for AArch64 targets. GCC does that by default, but Clang 11
and later versions available at the time of this writing do not. This
patch adds an unroll pragma when targetting AArch64 with Clang. We do
not use the unroll pragma for AArch32 targets, because it causes the
Clang-generated assembly code to exhaust the available Neon registers
(32 x 64-bit) and spill to the stack. (DRC: Referring to the discussion
in #570, this is likely due to compiler confusion that results in poor
register allocation. It is possible to eliminate the spillage and
reduce the instruction count by loading the data on a just-in-time
basis, thus explicitly interleaving compute and I/O, but the performance
implications of that are currently unknown.)
The effects of unrolling the quantization loop are:
1) elimination of the loop control flow overhead and
2) enabling the use of LDP/STP instructions that work from a single
base pointer, instead of using double the number of LDR/STR
instructions, each requiring an address calculation.
Closes#570
- Suppress a UBSan warning regarding storing a 64-bit value to a
non-64-bit-aligned address. That behavior is technically undefined
per the C spec but is supported in the context of the AArch64
architecture and compilers.
- Explicitly promote block_diff[i] to unsigned int prior to left
shifting it, in order to avoid a UBSan warning. This warning also
described behavior that is technically undefined per the C spec but is
supported in the context of the AArch64 architecture and compilers.
Changing the type cast order eliminated the warning without changing
the generated assembly code.
Closes#582
- Make better use of 128-bit vector registers, thus reducing the number
of Neon instructions required to construct the AC coefficient bitmap.
- Refactor the Neon computations of 'nbits' and 'diff' to use shorter
and higher-throughput instruction sequences.
DRC's notes:
This commit partially integrates #570. Arm reported a 1-4% speedup on
Cortex-A55 and Neoverse-N1 cores when using recent compilers but little
or no speedup with Clang 10. I observed no speedup with Clang 10 on my
Cortex-A53 and Cortex-A72 cores. Thus, referring to #582, the primary
purpose of this commit is to fix UBSan warnings regarding the shift
operations previously located at Line 253:
d640a45730/simd/arm/aarch64/jchuff-neon.c (L253)
The primary purpose of this is to encourage adoption of libjpeg-turbo in
downstream Windows projects that forbid the use of "deprecated"
functions. libjpeg-turbo's usage of those functions was not actually
unsafe, because:
- libjpeg-turbo always checks the return value of fopen() and ensures
that a NULL filename can never be passed to it.
- libjpeg-turbo always checks the return value of getenv() and never
passes a NULL argument to it.
- The sprintf() calls in format_message() (jerror.c) could never
overflow the destination string buffer or leave it unterminated as
long as the buffer was at least JMSG_LENGTH_MAX bytes in length, as
instructed. (Regardless, this commit replaces those calls with
snprintf() calls.)
- libjpeg-turbo never uses sscanf() to read strings or multi-byte
character arrays.
- Because of b7d6e84d6a, wrjpgcom
explicitly checks the bounds of the source and destination strings
before calling strcat() and strcpy().
- libjpeg-turbo always ensures that the destination string is
terminated when using strncpy().
(548490fe5e made this explicit.)
Regarding thread safety:
Technically speaking, getenv() is not thread-safe, because the returned
pointer may be invalidated if another thread sets the same environment
variable between the time that the first thread calls getenv() and the
time that that thread uses the return value. In practice, however, this
could only occur with libjpeg-turbo if:
(1) A multithreaded calling application used the deprecated and
undocumented TJFLAG_FORCEMMX/TJFLAG_FORCESSE/TJFLAG_FORCESSE2 flags in
the TurboJPEG API or set one of the corresponding environment variables
(which are only intended for testing purposes.) Since the TurboJPEG API
library only ever passed string constants to putenv(), the only inherent
risk (i.e. the only risk introduced by the library and not the calling
application) was that the SIMD extensions may have read an incorrect
value from one of the aforementioned environment variables.
or
(2) A multithreaded calling application modified the value of the
JPEGMEM environment variable in one thread while another thread was
reading the value of that environment variable (in the body of
jpeg_create_compress() or jpeg_create_decompress().) Given that the
libjpeg API provides a thread-safe way for applications to modify the
default memory limit without using the JPEGMEM environment variable,
direct modification of that environment variable by calling applications
is not supported.
Microsoft's implementation of getenv_s() does not claim to be
thread-safe either, so this commit uses getenv_s() solely to mollify
Visual Studio. New inline functions and macros (GETENV_S() and
PUTENV_S) wrap getenv_s()/_putenv_s() when building for Visual Studio
and getenv()/setenv() otherwise, but GETENV_S()/PUTENV_S() provide no
advantages over getenv()/setenv() other than parameter validation. They
are implemented solely for convenience.
Technically speaking, strerror() is not thread-safe, because the
returned pointer may be invalidated if another thread changes the locale
and/or calls strerror() between the time that the first thread calls
strerror() and the time that that thread uses the return value. In
practice, however, this could only occur with libjpeg-turbo if a
multithreaded calling application encountered a file I/O error in
tjLoadImage() or tjSaveImage(). Since both of those functions
immediately copy the string returned from strerror() into a thread-local
buffer, the risk is minimal, and the worst case would involve an
incorrect error string being reported to the calling application.
Regardless, this commit uses strerror_s() in the TurboJPEG API library
when building for Visual Studio. Note that strerror_r() could have been
used on Un*x systems, but it would have been necessary to handle both
the POSIX and GNU implementations of that function and perform
widespread compatibility testing. Such is left as an exercise for
another day.
Fixes#568
- Since the ERREXITS() and TRACEMSS() macros are never used internally
(they are a relic of the legacy memory managers that libjpeg
provided), the only risk was that an external program might have
invoked one of those macros with a string longer than 79 characters
(JMSG_STR_PARM_MAX - 1).
- TJBench never invokes the THROW_TJ() macro with a string longer than
199 (JMSG_LENGTH_MAX - 1) characters, so there was no risk. However,
it's a good idea to explicitly terminate the destination strings so
that anyone looking at the code can immediately tell that it is safe.
The h2v2 (4:2:0) merged upsampler uses a spare row buffer so that it can
upsample two rows at a time but return only one row to the application,
if necessary. merged_2v_upsample() copies from this spare row buffer
into the application-supplied output buffer, using the out_row_width
field in the my_merged_upsampler struct to determine how many samples to
copy. out_row_width is set in jinit_merged_upsampler(), which is called
within the body of jpeg_start_decompress(). Since jpeg_crop_scanline()
must be called after jpeg_start_decompress(), jpeg_crop_scanline() must
modify the value of out_row_width if the h2v2 merged upsampler will be
used. Otherwise, merged_2v_upsample() can overflow the output buffer if
the number of bytes between the current output buffer position and the
end of the buffer is less than the number of bytes required to represent
an uncropped scanline of the output image. All of the destination
managers used by djpeg allocate either a whole image buffer or a
scanline buffer based on the uncropped output image width, so this issue
is not reproducible using djpeg.
Fixes#574
libjpeg-turbo has never supported non-ANSI C compilers. Per the spec,
ANSI C compilers must have locale.h, stddef.h, stdlib.h, memset(),
memcpy(), unsigned char, and unsigned short. They must also handle
undefined structures.
If NEON_INTRINSICS=0, then run the GAS sanity check from libjpeg-turbo
2.0.x and force-enable NEON_INTRINSICS if the test fails. This fixes
the AArch32 build when using Clang 6.0 on Linux or the Clang toolchain
in the Android NDK r15*-r16*. It also prevents users from manually
disabling NEON_INTRINSICS if doing so would break the build (such as
with Xcode 5.)
- Use check_c_source_compiles() rather than check_symbol_exists() to
detect the presence of vld1_s16_x3(), vld1_u16_x2(), and
vld1q_u8_x4(). check_symbol_exists() is unreliable for detecting
intrinsics, and in practice, it did not detect the presence of the
aforementioned intrinsics in versions of GCC that support them.
- Set DEFAULT_NEON_INTRINSICS=0 for GCC < 12, even if the aforementioned
intrinsics are available. The AArch64 back end in GCC 10 and 11
supports the necessary intrinsics, but the GAS implementation is still
faster when using those compilers.
Fixes#547
- Use JERR_NOTIMPL ("Not implemented yet") rather than JERR_NOT_COMPILED
("Requested feature was omitted at compile time") to indicate that
arithmetic coding is incompatible with Huffman table optimization.
This is more consistent with other parts of the libjpeg API code.
JERR_NOT_COMPILED is typically used to indicate that a major feature
was not compiled in, whereas JERR_NOTIMPL is typically used to
indicate that two features were compiled in but are incompatible with
each other (such as, for instance, two-pass color quantization and
partial image decompression.)
- Change the text of JERR_NOTIMPL to "Requested features are
incompatible". This is a more accurate description of the situation.
"Not implemented yet" implies that it may be possible to support the
requested combination of features in the future, but that is not true
in most of the cases where JERR_NOTIMPL is used.
Fixes#567
(regression introduced by aa7459050d)
cjpeg sets cinfo.in_color_space to JCS_RGB as an "arbitrary guess."
Since tjLoadImage() never uses JCS_RGB, the PGM reader should treat
JCS_RGB the same as JCS_UNKNOWN.
Fixes#566
This fixes an oversight from the integration of the arithmetic entropy
codec from libjpeg (66f97e6820). I chose
to integrate the latest implementation available at the time, which was
from jpeg-8b. However, I naively replaced cinfo->lim_Se with
DCTSIZE2 - 1, not realizing that-- because of SmartScale-- jpeg-8b
contains additional code
(https://github.com/libjpeg-turbo/libjpeg-turbo/blob/jpeg-8b/jdinput.c#L249-L334)
that guards against illegal values of cinfo->Se >= DCTSIZE2. Thus,
libjpeg-turbo's implementation of arithmetic decoding has never guarded
against such illegal values. This commit restores the relevant check
from the original jpeg-6b arithmetic entropy codec patch ("jpeg-ari",
1e247ac854).
Fixes#564
CIFuzz runs the project's fuzzers for a limited period of time any time
a commit is pushed or a PR is submitted. This is not intended to
replace OSS-Fuzz but rather to allow us to more quickly catch some
fuzzing failures, including fuzzer build regressions like the one
introduced in ecf021bc0d.
Closes#559
Recent FreeBSD/PowerPC compilers, such as Clang 11.0.x on FreeBSD 13, do
the equivalent of passing -maltivec to the compiler by default, so
run-time AltiVec detection is unnecessary. However, it becomes
necessary when using other compilers or when passing -mno-altivec to the
compiler.
Closes#552
- Don't check for exceptions immediately after invoking the
GetPrimitiveArrayCritical() method. That method does not throw
exceptions, and checking for them caused -Xcheck:jni to warn about
calling other JNI functions in the scope of
Get/ReleasePrimitiveArrayCritical().
- Check for exceptions immediately after invoking the
CallStaticObjectMethod() method in the PROP2ENV() macro.
- Don't use the Get/ReleasePrimitiveArrayCritical() methods for small
arrays. -Xcheck:jni didn't complain about that, but there is no
performance advantage to using those methods rather than the
Get*ArrayRegion() methods for small arrays, and using
Get*ArrayRegion() makes the code less error-prone.
- Don't release the source/destination planes arrays in the YUV methods
until after the corresponding C TurboJPEG functions have returned.
Referring to https://cmake.org/cmake/help/latest/policy/CMP0065.html,
CMake 3.3 and earlier automatically added compiler/linker flags such as
-rdynamic/-export-dynamic, which caused symbols to be exported from
executables. The primary purpose of this is to allow plugins loaded via
dlopen() to access symbols from the calling program. libjpeg-turbo
does not need this functionality, and enabling it needlessly increases
the size of the libjpeg-turbo executables.
Setting CMP0065 to NEW when using CMake 3.4 and later prevents CMake
from automatically adding the aforementioned compiler/linker flags
unless the ENABLE_EXPORTS property is set for a target (or the
CMAKE_ENABLE_EXPORTS variable is set, which causes ENABLE_EXPORTS to be
set for all targets.)
Closes#554
When building for 32-bit Arm platforms, test whether basic Neon
intrinsics will compile with the specified compiler and C flags. This
prevents the build system from enabling the Neon SIMD extensions when
targetting Armv6 and other legacy architectures that do not support Neon
instructions.
Regression introduced by bbd8089297.
(Checking whether gas-preprocessor.pl was needed for 32-bit Arm builds
had the effect of checking whether Neon instructions were supported.)
Fixes#553
We would have been perfectly happy for AppVeyor to delete all artifacts
other than those from the latest builds in each branch. Instead, they
chose to change the global artifact retention policy to 1 month. In
addition to being unworkable, that new policy uses more storage than the
policy we requested. Lose/lose, so we'll just deploy to S3 like we do
with other platforms.
This issue was introduced in 5557fd2217
due to an oversight, so it has existed in libjpeg-turbo since the
project's inception. However, the issue is effectively a non-issue.
Although #325 proposes allowing programs to override jpeg_get_*() and
jpeg_free_*() externally, there is currently no way to override those
functions without modifying the libjpeg-turbo source code.
libjpeg-turbo only includes the malloc()/free() memory manager from
libjpeg, and the implementation of jpeg_free_*() in that memory manager
ignores the size argument. libjpeg had several additional memory
managers for legacy systems (MS-DOS, System 7, etc.), but those memory
managers ignored the size argument to jpeg_free_*() as well. Thus, this
issue would have only potentially affected custom memory managers in
downstream libjpeg-turbo forks, and since no one has complained until
now, apparently those are rare.
Fixes#542
Attempting to losslessly transform certain malformed JPEG images can
cause the nbits table index in the Huffman encoder to exceed 32768, so
we need to pad the SSE2 implementation of that table to 65536 entries as
we do with the C implementation.
Regression introduced by 087c29e07fFixes#543
This is the same error that d147be83e9
suppressed in decode_mcu_slow(). The image that reproduces this error
in decode_mcu_fast() has been added to the libjpeg-turbo seed corpora.
Closes#537
Although sizeof(void *) == sizeof(size_t) for all architectures that are
currently supported by libjpeg-turbo, such is not guaranteed by the C
standard. Specifically, CHERI-enabled architectures (e.g. CHERI-RISC-V
or Arm's Morello) use capability pointers that are twice the size of
size_t (128 bits for Morello and RV64), so casting to size_t strips the
upper bits of the pointer (including the validity bit) and makes it
non-deferenceable, as indicated by the following compiler warning:
warning: cast from provenance-free integer type to pointer type will
give pointer that can not be dereferenced
[-Werror,-Wcheri-capability-misuse]
cvalue = values = (JCOEF *)PAD((size_t)values_unaligned, 16);
Ignoring this warning results in a run-time crash. Casting pointers to
uintptr_t, if it is available, avoids this problem, since uintptr_t is
defined as an unsigned integer type that can hold a pointer value.
Since C89 compatibility is still necessary in libjpeg-turbo, this commit
introduces a new typedef for pointer-to-integer casts that uses a
GNU-specific extension available in GCC 4.6+ and Clang 3.0+ and falls
back to using size_t if the extension is unavailable. The only other
options would require C99 or Clang-specific builtins.
Closes#538
'buffer' is both passed into the inline assembly code and modified by
it. See https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html, 6.47.2.3.
With GCC 4, this commit does not change the generated assembly code at
all.
With GCC 8, this commit fixes an assembly error:
/tmp/{foo}.s: Assembler messages:
/tmp/{foo}.s:775: Error: registers may not be the same --
`str r9,[r9],#4'
I'm not sure why that error went unnoticed, since I definitely
benchmarked the previous commit with GCC 8. Anyhow, this commit changes
the generated assembly code slightly but does not alter performance.
With Clang 10, this commit changes the generated assembly code slightly
but does not alter performance.
Refer to #529
Referring to the C standard
(http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf,
J.2 Undefined behavior), the behavior of the compiler is undefined if
"conversion between two pointer types produces a result that is
incorrectly aligned." Thus, the behavior of this code
*((uint32_t *)buffer) = BUILTIN_BSWAP32(put_buffer);
in the AArch32 version of the FLUSH() macro is undefined unless 'buffer'
is 32-bit-aligned. Referring to
https://bugs.llvm.org/show_bug.cgi?id=50785, certain versions of Clang,
when generating Thumb (T32) instructions, miscompile that code into an
assembly instruction (stm) that requires the destination to be
32-bit-aligned. Since such alignment cannot be guaranteed within the
Huffman encoder, this reportedly led to crashes (SIGBUS: illegal
alignment) with AArch32/Thumb builds of libjpeg-turbo running on Android
devices, although thus far I have been unable to reproduce those crashes
with a plain Linux/Arm system.
The miscompilation is visible with the Compiler Explorer:
https://godbolt.org/z/rv1ccx1Pb
However, it goes away when removing the return statement from the
function. Thus, it seems that Clang's behavior in this regard is
somewhat variable, which may explain why the crashes are only
reproducible on certain platforms.
The suggested workaround is to use memcpy(), but whereas Clang and
recent GCC releases are smart enough to compile a 4-byte memcpy() call
into a str instruction, GCC < 6 is not. Referring to
https://godbolt.org/z/ae7Wje3P6, the only way to consistently produce
the desired str instruction across all supported compilers is to use
inline assembly. Visual C++ presumably does not miscompile the code in
question, since no issues have been reported with it, but since the code
relies on undefined compiler behavior, prudence dictates that
e4ec23d7ae should be reverted for Visual
C++, which this commit does. The performance impact of
e4ec23d7ae for Visual C++/Arm builds is
unknown (I have no ability to test such builds), but regardless, this
commit reverts the Visual C++/Arm performance to that of libjpeg-turbo
2.1 beta1.
Closes#529
This resolves a conflict between the RPM generated by the libjpeg-turbo
build system and the Red Hat 'filesystem' RPM if
CMAKE_INSTALL_LIBDIR=/usr/lib[64]. This code was largely borrowed from
the VirtualGL RPM spec. (I can legally do that because I hold the
copyright on VirtualGL's implementation.)
Fixes#532
Arm compilers have three floating point ABI options:
'soft' compiles floating point operations as function calls into a
software floating point library, which emulates floating point
operations using integer operations. Floating point function arguments
are passed using integer registers.
'softfp' also compiles floating point operations as function calls into
a floating point library and passes floating point function arguments
using integer registers, but the floating point library functions can
use FPU instructions if the CPU supports them.
'hard' compiles floating point operations into inline FPU instructions,
similarly to x86 and other architectures, and passes floating point
function arguments using FPU registers.
Not all AArch32 CPUs have FPUs or support Neon instructions, so on Linux
and Android platforms, the AArch32 SIMD dispatcher in libjpeg-turbo only
enables the Neon SIMD extensions at run time if /proc/cpuinfo indicates
that the CPU supports Neon instructions or if Neon instructions are
explicitly enabled (e.g. by passing -mfpu=neon to the compiler.) In
order to support all AArch32 CPUs using the same code base, i.e. to
support run-time FPU and Neon auto-detection, it is necessary to compile
the scalar C source code using -mfloat-abi=soft. However, the 'soft'
floating point ABI cannot be used when compiling Neon intrinsics, so the
intrinsics implementation of the Neon SIMD extensions must be compiled
using -mfloat-abi=softfp if the scalar C source code is compiled using
-mfloat-abi=soft.
This commit modifies the build system so that it detects whether
-mfloat-abi=softfp must be explicitly added to the compiler flags when
building the intrinsics implementation of the Neon SIMD extensions.
This will be necessary if the build is using the 'soft' floating
point ABI along with run-time auto-detection of Neon instructions.
Fixes#523
The existing /*FALLTHROUGH*/ comments work with GCC but not Clang, so
this commit adds a FALLTHROUGH macro that uses the 'fallthrough'
attribute if the compiler supports it.
Refer to https://bugs.chromium.org/p/chromium/issues/detail?id=995993
NOTE: All versions of GCC that support -Wimplicit-fallthrough also
support the 'fallthrough' attribute, but certain other compilers (Oracle
Solaris Studio, for instance) support /*FALLTHROUGH*/ but not the
'fallthrough' attribute. Thus, this commit retains the /*FALLTHROUGH*/
comments, which have existed in the libjpeg code base in some form since
1994 (libjpeg v5.)
Closes#531
Referring to #527, the security community did not assign this CVE ID
until more than 8 months after the fix for the issue was released. By
the time they assigned the ID, libjpeg-turbo already had two production
releases containing the fix. This calls into question the usefulness of
assigning a CVE ID to the issue, particularly given that the buffer
overrun in question was fully contained in the stack, not detectable
with valgrind, and confined to lossless transformation (it did not
affect JPEG compression or decompression.)
https://vuldb.com/?id.176175
says that "the exploitability is told to be easy" but provides no
clarification, and given that the author of that page does not seem to
be aware that a fix for the issue has been available since early
December of 2019, it calls into question the accuracy of everything else
on the page.
It would really be nice if the security community approached me about
these things before wasting my time, but I guess it's my lot in life to
modify a change log entry from 2019 to include a CVE ID from 2020.
So it goes...
In theory, all objects that will be included in a Un*x shared library
must be built using PIC. In practice, most compilers don't require PIC
to be explicitly specified for jsimd_none.o, either because the compiler
automatically enables PIC in all cases (Ubuntu) or because the size of
the generated object is too small. But some rare compilers do require
PIC to be explicitly specified for jsimd_none.o.
Fixes#520
Our workflow script does not currently work with tags, and there is no
point to building tags anyhow, since we do not use the CI system to spin
official builds.
When using the in-memory destination manager, it is necessary to
explicitly call the destination manager's term_destination() method if
an error occurs. That method is called by jpeg_finish_compress() but
not by jpeg_abort_compress().
This fixes a potential double free() that could occur if tjCompress*()
or tjTransform() returned an error and the calling application tried to
clean up a JPEG buffer that was dynamically re-allocated by one of those
functions.
- UBSan complained that entropy->restarts_to_go was underflowing an
unsigned integer when it was decremented while
cinfo->restart_interval == 0. That was, of course, completely
innocuous behavior, since the result of the underflowing computation
was never used.
- d3a3a73f64 and
7bc9fca430 silenced a UBSan signed
integer overflow error, but unfortunately other malformed JPEG images
have been discovered that cause unsigned integer overflow in the same
computation. Since, to the best of our understanding, this behavior
is innocuous, this commit reverts the commits listed above, suppresses
the UBSan errors, and adds code comments to document the issue.
Referring to https://bugzilla.mozilla.org/show_bug.cgi?id=1050342,
there are certain very rare circumstances under which a malformed JPEG
image can cause different Huffman decoder output to be produced,
depending on the size of the source manager's I/O buffer. (More
specifically, the fast Huffman decoder didn't handle invalid codes in
the same manner as the slow decoder, and since the fast decoder requires
all data to be memory-resident, the buffering strategy determines
whether or not the fast decoder can be used on a particular MCU block.)
After extensive experimentation, the Mozilla and Chrome developers and I
determined that this truly was an innocuous issue. The patch that both
browsers adopted as a workaround caused a performance regression with
32-bit code, which is why it was not accepted into libjpeg-turbo. This
commit fixes the problem in a less disruptive way with no performance
regression.
After the completion of the start_input() method, it's too late to check
the image size, because the image readers may have already tried to
allocate memory for the image. If the width and height are excessively
large, then attempting to allocate memory for the image could slow
performance or lead to out-of-memory errors prior to the fuzz target
checking the image size.
NOTE: Specifically, the aforementioned OOM errors and slow units were
observed with the compression fuzz targets when using MSan.
Certain rare malformed input images can cause the Huffman encoder to
generate a value for nbits that corresponds to an uninitialized member
of the DC code table. The ramifications of this are minimal and would
basically amount to a different bogus JPEG image being generated from a
particular bogus input image.
- Referring to 3311fc0001, we need to use
unsigned intermediate math in order to make UBSan happy, even though
(JDIMENSION)(A * B) is effectively the same as
(JDIMENSION)A *(JDIMENSION)B, regardless of intermediate overflow.
- Because of the previous commit, it is now possible for bfOffBits to be
INT_MIN, which would cause the initial computation of bPad to
underflow a signed integer. Thus, we need to check for that
possibility as soon as we know the values of bfOffBits and headerSize.
The worst case from this regression is that bPad could wrap around to
a large positive value, which would cause a "Premature end of input
file" error in the subsequent read_byte() loop. Thus, this issue was
effectively innocuous as well, since it resulted in catching the same
error later and in a different way. Also, the issue was very
well-contained, since it was both introduced and fixed as part of the
ongoing OSS-Fuzz integration project.
- rdbmp.c: Because of 8fb37b8171,
bfOffBits, biClrUsed, and headerSize were made into unsigned ints.
Thus, if bPad would eventually be negative due to a malformed header,
UBSan complained about unsigned math being used in the intermediate
computations. It was unnecessary to make those variables unsigned,
since they are only meant to hold small values, so this commit makes
them signed again. The UBSan error was innocuous, because it is
effectively (if not officially) the case that
(int)((unsigned int)a - (unsigned int)b) == (int)a - (int)b.
- rdbmp.c: If (biWidth * source->bits_per_pixel / 8) would overflow an
unsigned int, then UBSan complained at the point at which row_width
was set in start_input_bmp(), even though the overflow would have been
detected later in the function. This commit adds overflow checks
prior to setting row_width.
- rdppm.c: read_pbm_integer() now bounds-checks the intermediate
value computations in order to catch integer overflow caused by a
malformed text PPM. It's possible, though extremely unlikely, that
the intermediate value computations could have wrapped around to a
value smaller than maxval, but the worst case is that this would have
generated a bogus pixel in the uncompressed image rather than throwing
an error.
A fuzzing test case that was effectively a 1-pixel PGM file with a
maximum value of 1 and an actual value of 8 caused an uninitialized
member of the rescale[] array to be accessed in get_gray_rgb_row() or
get_gray_cmyk_row(). Since, for performance reasons, those functions do
not perform bounds checking on the PPM values, we need to ensure that
unused members of the rescale[] array are initialized.
A fuzzing test case with an image width of 838860946 triggered a UBSan
error:
rdbmp.c:633:34: runtime error: signed integer overflow:
838860946 * 3 cannot be represented in type 'int'
Because the result is cast to an unsigned int (JDIMENSION), this error
is irrelevant, because
(unsigned int)((int)838860946 * (int)3) ==
(unsigned int)838860946 * (unsigned int)3
This limits the tjLoadImage() behavioral changes to the scope of the
compress_fuzzer target. Otherwise, TJBench in fuzzer builds would
refuse to load images larger than 1 Mpixel.
- The PPM reader now throws an error rather than segfaulting (due to a
buffer overrun) if an application attempts to load a 16-bit PPM file
into a grayscale uncompressed image buffer. No known applications
allowed that (not even the test applications in libjpeg-turbo),
because that mode of operation was never expected to work and did not
work under any circumstances. (In fact, it was necessary to modify
TJBench in order to reproduce the issue outside of a fuzzing
environment.) This was purely a matter of making the library bow out
gracefully rather than crash if an application tries to do something
really stupid.
- The PPM reader now throws an error rather than generating incorrect
pixels if an application attempts to load a 16-bit PGM file into an
RGB uncompressed image buffer.
- The PPM reader now correctly loads 16-bit PPM files into extended
RGB uncompressed image buffers. (Previously it generated incorrect
pixels unless the input colorspace was JCS_RGB or JCS_EXT_RGB.)
The only way that users could have potentially encountered these issues
was through the tjLoadImage() function. cjpeg and TJBench were
unaffected.
The non-default options were not being tested because of a pixel format
comparison buglet. This commit also changes the code in both
decompression fuzz targets such that non-default options are tested
based on the pixel format index rather than the pixel format value,
which is a bit more idiot-proof.
Otherwise, the targets will require libstdc++, the i386 version of which
is not available in the OSS-Fuzz runtime environment. The OSS-Fuzz
build environment passes -stdlib:libc++ in the CXXFLAGS environment
variable in order to mitigate this issue, since the runtime environment
has the i386 version of libc++, but using that compiler flag requires
using the C++ compiler.
This commit integrates OSS-Fuzz targets directly into the libjpeg-turbo
source tree, thus obsoleting and improving code coverage relative to
Google's OSS-Fuzz target for libjpeg-turbo (previously available here:
https://github.com/google/oss-fuzz).
I hope to eventually create fuzz targets for the BMP, GIF, and PPM
readers as well, which would allow for fuzz-testing compression, but
since those readers all require an input file, it is unclear how to
build an efficient fuzzer around them. It doesn't make sense to
fuzz-test compression in isolation, because compression can't accept
arbitrary input data.
Define compiler-independent byte-swap macros and use them instead of
executing 'rev' via inline assembly code with GCC-compatible compilers
or a slow shift-store sequence with Visual C++.
* This produces identical assembly code with:
- 64-bit GCC 8.4.0 (Linux)
- 64-bit GCC 9.3.0 (Linux)
- 64-bit Clang 10.0.0 (Linux)
- 64-bit Clang 10.0.0 (MinGW)
- 64-bit Clang 12.0.0 (Xcode 12.2, macOS)
- 64-bit Clang 12.0.0 (Xcode 12.2, iOS)
* This produces different assembly code with:
- 64-bit GCC 4.9.1 (Linux)
- 32-bit GCC 4.8.2 (Linux)
- 32-bit GCC 8.4.0 (Linux)
- 32-bit GCC 9.3.0 (Linux)
Since the intrinsics implementation of Huffman encoding is not used
by default with these compilers, this is not a concern.
- 32-bit Clang 10.0.0 (Linux)
Verified performance neutrality
Closes#507
(regression introduced by 16bd984557)
This implements the same fix for
jsimd_encode_mcu_AC_refine_prepare_sse2() that
a81a8c137b implemented for
jsimd_encode_mcu_AC_first_prepare_sse2().
Based on:
1a59587397eb176a91d8Fixes#509Closes#510
Referring to https://bugzilla.redhat.com/show_bug.cgi?id=1937385#c2,
it is my opinion that the severity of this bug was grossly overstated
and that a CVE never should have been assigned to it, but since one was
assigned, users need to know which version of libjpeg-turbo contains
the fix.
Dear security community, please learn what "DoS" actually means and stop
misusing that term for dramatic effect. Thanks.
* commit '8a2cad020171184a49fa8696df0b9e267f1cf2f6': (99 commits)
Build: Handle CMAKE_OSX_ARCHITECTURES=(i386|ppc)
Add Sponsor button for GitHub repository
Build: Support CMAKE_OSX_ARCHITECTURES
cjpeg: Fix FPE when compressing 0-width GIF
Fix build with Visual C++ and /std:c11 or /std:c17
Neon: Fix Huffman enc. error w/Visual Studio+Clang
Use CLZ compiler intrinsic for Windows/Arm builds
Build: Use correct SIMD exts w/VStudio IDE + Arm64
jcphuff.c: Fix compiler warning with clang-cl
Migrate from Travis CI to GitHub Actions
tjexample.c: Fix mem leak if tjTransform() fails
Build: Officially support Ninja
decompress_smooth_data(): Fix another uninit. read
LICENSE.md: Remove trailing whitespace
Build: Test for correct AArch32 RPM/DEBARCH value
LICENSE.md: Formatting tweak
Fix uninitialized read in decompress_smooth_data()
Fix buffer overrun with certain narrow prog JPEGs
Bump revision to 2.0.91 for post-beta fixes
Travis: Use Docker tag that matches Git branch
...
We don't officially support i386 or PowerPC Mac builds of libjpeg-turbo
anymore, but they still work (bearing in mind that PowerPC builds
require GCC v4.0 in Xcode 3.2.6, and i386 builds require Xcode 9.x or
earlier.) Referring to #495, apparently MacPorts needs this
functionality.
The GNU builtin function __builtin_clzl() accepts an unsigned long
argument, which is 8 bytes wide on LP64 systems (most Un*x systems,
including Mac) but 4 bytes wide on LLP64 systems (Windows.) This caused
the Neon intrinsics implementation of Huffman encoding to produce
mathematically incorrect results when compiled using Visual Studio with
Clang.
This commit changes all invocations of __builtin_clzl() in the Neon SIMD
extensions to __builtin_clzll(), which accepts an unsigned long long
argument that is guaranteed to be 8 bytes wide on all systems.
Fixes#480Closes#490
The __builtin_clz() compiler intrinsic was already used in the C Huffman
encoders when building libjpeg-turbo for Arm CPUs using a GCC-compatible
compiler. This commit modifies the C Huffman encoders so that they also
use__builtin_clz() when building for Arm CPUs using Visual Studio +
Clang, as well as the equivalent _CountLeadingZeros() compiler intrinsic
when building for Arm CPUs using Visual C++.
In addition to making the C Huffman encoders faster on Windows/Arm, this
also prevents jpeg_nbits_table from being included in Windows/Arm builds,
thus saving 128 KB of memory.
When configuring a Visual Studio IDE build and passing -A arm64 to
CMake, CMAKE_SYSTEM_PROCESSOR will be amd64, so we should set CPU_TYPE
based on the value of CMAKE_GENERATOR_PLATFORM rather than the value of
CMAKE_SYSTEM_PROCESSOR.
Note that this removes our ability to regression test the Armv8 and
PowerPC SIMD extensions, effectively reverting
a524b9b06b and
02227e48a9, but at the moment, there is no
other way.
Regression introduced by 42825b68d5
The test case
https://user-images.githubusercontent.com/3491627/101376530-fde56180-38b0-11eb-938d-734119a5b5ba.jpg
is a malformed progressive JPEG image containing an interleaved Y/Cb/Cr
DC scan followed by two non-interleaved Y DC scans. Thus, the
prev_coef_bits[] array was initialized for the Y component but not the
other components, the uninitialized values for Cb and Cr were
transferred to the prev_coef_bits_latch[] array in smoothing_ok(), and
because cinfo->master->last_good_iMCU_row was 0,
decompress_smooth_data() read those uninitialized values when attempting
to smooth the second iMCU row.
Possibly fixes#478
Regression introduced by 6d91e950c8
last_block_column in decompress_smooth_data() can be 0 if, for instance,
decompressing a 4:4:4 image of width 8 or less or a 4:2:2 or 4:2:0 image
of width 16 or less. Since last_block_column is an unsigned int,
subtracting 1 from it produced 0xFFFFFFFF, the test in line 590 passed,
and we attempted to access blocks from a second block column that didn't
actually exist.
Closes#476
- Use the _M_ARM and _M_ARM64 macros provided by Visual Studio for
compile-time detection of Arm builds, since __arm__ and __aarch64__
are only present in GNU-compatible compilers.
- Neon/intrinsics: Use the _CountLeadingZeros() and
_CountLeadingZeros64() intrinsics provided by Visual Studio, since
__builtin_clz() and __builtin_clzl() are only present in
GNU-compatible compilers.
- Neon/intrinsics: Since Visual Studio does not support static vector
initialization, replace static initialization of Neon vectors with the
appropriate intrinsics. Compared to the static initialization
approach, this produces identical assembly code with both GCC and
Clang.
- Neon/intrinsics: Since Visual Studio does not support inline assembly
code, provide alternative code paths for Visual Studio whenever inline
assembly is used.
- Build: Set FLOATTEST appropriately for AArch64 Visual Studio builds
(Visual Studio does not emit fused multiply-add [FMA] instructions by
default for such builds.)
- Neon/intrinsics: Move temporary buffer allocation outside of nested
loops. Since Visual Studio configures Arm builds with a relatively
small amount of stack memory, attempting to allocate those buffers
within the inner loops caused a stack overflow.
Closes#461Closes#475
Otherwise, because the file begins with an ASCII header, Git will
erroneously treat is as an ASCII file, and if Git for Windows is
configured with default options (specifically, "Checkout windows-style,
commit Unix-style line endings"), it will add carriage return characters
to all of the "linefeed" characters in the PPM file, thus corrupting it
and causing libjpeg-turbo's regression tests to fail.
There doesn't seem to be any performance or compatibility downside to
this, and it has the advantages of simplicity and consistency between
the PR and official builds.
- Rename IOS_ARMV8_BUILD to ARMV8_BUILD.
- Rename install_ios() to install_subbuild() in makemacpkg.
- Wordsmith the build instructions accordingly.
- Use xcode12.2 image in Travis CI.
This error occurs at the call to (*cinfo->cconvert->color_convert)() in
sep_upsample() whenever cinfo->upsample->need_context_rows == TRUE
(i.e. whenever h2v2 or h1v2 fancy upsampling is used.) The error is
innocuous, since (*cinfo->cconvert->color_convert)() points to a dummy
function (noop_convert()) in that case.
Fixes#470
The "32bit" vs. "64bit" floating point test results actually have
nothing to do with the FPU. That was a fallacious assumption based on
the observation that, with multiple CPU types, 32-bit and 64-bit builds
produce different floating point test results. It seems that this is,
in fact, due to differing compiler behavior-- more specifically, whether
fused multiply-add (FMA) instructions are used to combine multiple
floating point operations into a single instruction ("floating point
expression contraction".) GCC does this by default if the target
supports FMA instructions, which PowerPC and AArch64 targets both do.
Fixes#468
This allows the Neon intrinsics code to be built successfully (albeit
likely with reduced run-time performance) with Xcode 5.0-6.2
(iOS/AArch64) and Android NDK < r19 (AArch32). Note that Xcode 5.0-6.2
will not build the Armv8 GAS code without gas-preprocessor.pl, and no
version of Xcode will build the Armv7 GAS code without
gas-preprocessor.pl, so we always use the full Neon intrinsics
implementation by default with macOS and iOS builds.
Auto-detecting the completeness of the compiler's set of Neon intrinsics
also allows us to more intelligently set the default value of
NEON_INTRINSICS, based on the values of HAVE_VLD1*. This is a
reasonable, albeit imperfect, proxy for whether a compiler has a full
and optimal set of Neon intrinsics. Specific notes:
- 64-bit RGB-to-YCbCr color conversion
does not use any of the intrinsics in question, regresses with GCC
- 64-bit accurate integer forward DCT
uses vld1_s16_x3(), regresses with GCC
- 64-bit Huffman encoding
uses vld1q_u8_x4(), regresses with GCC
- 64-bit YCbCr-to-RGB color conversion
does not use any of the intrinsics in question, regresses with GCC
- 64-bit accurate integer inverse DCT
uses vld1_s16_x3(), regresses with GCC
- 64-bit 4x4 inverse DCT
uses vld1_s16_x3(). I did not test this algorithm in isolation, so
it may in fact regress with GCC, but the regression may be hidden by
the speedup from the new SIMD-accelerated upsampling algorithms.
- 32-bit RGB-to-YCbCr color conversion:
uses vld1_u16_x2(), regresses with GCC
- 32-bit accurate integer forward DCT
uses vld1_s16_x3(), regression irrelevant because there was no
previous implementation
- 32-bit accurate integer inverse DCT
uses vld1_s16_x3(), regresses with GCC
- 32-bit fast integer inverse DCT
does not use any of the intrinsics in question, regresses with GCC
- 32-bit 4x4 inverse DCT
uses vld1_s16_x3(). I did not test this algorithm in isolation, so
it may in fact regress with GCC, but the regression may be hidden by
the speedup from the new SIMD-accelerated upsampling algorithms.
Presumably when GCC includes a full and optimal set of Neon intrinsics,
the HAVE_VLD1* tests will pass, and the full Neon intrinsics
implementation will be enabled automatically.
- Remove gas-preprocessor.pl. None of the compilers that can build the
new intrinsics implementation require gas-preprocessor.pl (tested
with Xcode and with Clang 3.9+ for Linux.)
- Document that Xcode 6.3.x or later is now required for iOS builds
(older versions of Xcode do not have a full set of Neon intrinsics.)
- Add a change log entry.
- Do not enable the ASM CMake language unless NEON_INTRINSICS is false.
- Add a Clang/Arm64 test to .travis.yml in order to test the new
intrinsics implementation.
Closes#455
There was no previous GAS implementation.
NOTE: This doesn't produce much of a speedup when using -O3, because -O3
already enables Neon autovectorization, which works well for the scalar
C implementation of plain upsampling. However, the Neon SIMD
implementation will benefit other optimization levels.
There was no previous GAS implementation.
This commit also reverts 40557b2301 and
7723d7f7d0.
7723d7f7d0 was only necessary because
there was no Neon implementation of merged upsampling/color conversion,
and 40557b2301 was only necessary because
of 7723d7f7d0.
The previous AArch64 GAS implementation has been removed, since the
intrinsics implementation provides the same or better performance.
There was no previous AArch32 GAS implementation.
The previous AArch32 and AArch64 GAS implementations are retained by
default when using GCC, in order to avoid a performance regression. The
intrinsics implementation can be forced on or off using the new
NEON_INTRINSICS CMake variable.
The previous AArch32 GAS implementation is retained by default when
using GCC, in order to avoid a performance regression. The intrinsics
implementation can be forced on or off using the new NEON_INTRINSICS
CMake variable. The previous AArch64 GAS implementation has been
removed, since the intrinsics implementation provides the same or better
performance.
The previous AArch32 GAS implementation of h2v1 fancy upsampling has
been removed, since the intrinsics implementation provides the same or
better performance. There was no previous GAS implementation of h2v2
fancy upsampling, and there was no previous AArch64 GAS implementation
of h2v1 fancy upsampling.
The previous AArch64 GAS implementation is retained by default when
using GCC, in order to avoid a performance regression. The intrinsics
implementation can be forced on or off using the new NEON_INTRINSICS
CMake variable. The previous AArch32 GAS implementation has been
removed, since the intrinsics implementation provides the same or better
performance.
The previous AArch64 GAS implementation is retained by default when
using GCC, in order to avoid a performance regression. The intrinsics
implementation can be forced on or off using the new NEON_INTRINSICS
CMake variable. The previous AArch32 GAS implementation has been
removed, since the intrinsics implementation provides the same or better
performance.
The previous AArch64 GAS implementation is retained by default when
using GCC, in order to avoid a performance regression. The intrinsics
implementation can be forced on or off using the new NEON_INTRINSICS
CMake variable. The previous AArch32 GAS implementation has been
removed, since the intrinsics implementation provides the same or better
performance.
The previous AArch64 GAS implementation is retained by default when
using GCC, in order to avoid a performance regression. The intrinsics
implementation can be forced on or off using the new NEON_INTRINSICS
CMake variable. There was no previous AArch32 GAS implementation.
The previous AArch64 GAS implementation has been removed, since the
intrinsics implementation provides the same or better performance.
There was no previous AArch32 GAS implementation.
The previous AArch32 and AArch64 GAS implementations are retained by
default when using GCC, in order to avoid a performance regression. The
intrinsics implementation can be forced on or off using a new
NEON_INTRINSICS CMake variable.
Regression caused by
a46c111d9f
Because of 7723d7f7d0, which was
introduced in libjpeg-turbo 1.5.1 in response to #81, merged upsampling/
color conversion is disabled on platforms that have SIMD-accelerated
YCbCr -> RGB color conversion but not SIMD-accelerated merged
upsampling/color conversion. This was intended to improve performance
with the Neon SIMD extensions, since those are the only SIMD extensions
for which those circumstances apply. Under normal circumstances, the
separate "plain" (non-fancy) upsampling and color conversion routines
will produce bitwise-identical output to the merged upsampling/color
conversion routines, but that is not the case when skipping scanlines
starting at an odd-numbered scanline. The modified test introduced in
a46c111d9f does precisely that in order to
validate the fixes introduced in
9120a24743 and
a46c111d9f.
Because of 7723d7f7d0, the segfault fixed
in 9120a24743 and
a46c111d9f didn't affect the Neon SIMD
extensions, so this commit effectively reverts the test modifications in
a46c111d9f when using those SIMD
extensions. We can get rid of this hack, as well as
7723d7f7d0, once a Neon implementation of
merged upsampling/color conversion is available.
- Refer to the "slow" [I]DCT algorithms as "accurate" instead, since
they are not slow under libjpeg-turbo.
- Adjust documentation claims to reflect the fact that the "slow" and
"fast" algorithms produce about the same performance on AVX2-equipped
CPUs (because of the dual-lane nature of AVX2, it was not possible to
accelerate the "fast" algorithm beyond what was achievable with SSE2.)
Also adjust the claims to reflect the fact that the "fast" algorithm
tends to be ~5-15% faster than the "slow" algorithm on
non-AVX2-equipped CPUs, regardless of the use of the libjpeg-turbo
SIMD extensions.
- Indicate the legacy status of the "fast" and float algorithms in the
documentation and cjpeg/djpeg usage info.
- Remove obsolete paragraph in the djpeg man page that suggested that
the float algorithm could be faster than the "fast" algorithm on some
CPUs.
It is our convention to use the term "region" when referring to crop
specs, since this is more consistent with the terminology used by the
rest of the image processing community.
The checkstyle script was hastily developed prior to libjpeg-turbo 2.0
beta1, so it has a lot of exceptions and is thus prone to false
negatives. This commit eliminates some of those exceptions.
- Restore GIF read/compressed GIF write support from jpeg-6a and
jpeg-9d.
- Integrate jpegtran -wipe and -drop options from jpeg-9a and jpeg-9d.
- Integrate jpegtran -crop extension (for expanding the image size) from
jpeg-9a and jpeg-9d.
- Integrate other minor code tweaks from jpeg-9*
- Set CPU_TYPE=arm if performing a 32-bit build on an AArch64 system.
This eliminates the need to use a CMake toolchain file.
- Set RPMARCH=armv7hl if building on a 32-bit Arm system with an FPU.
- Set RPMARCH=armv7hl and DEBARCH=armhf if performing a 32-bit build
using a gnueabihf toolchain.
- If performing a 32-bit Arm build, generate a 32-bit supplementary DEB
package for AArch64 systems.
This commit modifies decompress_smooth_data(), adding missing MCU column
offsets to the prev_block_row and next_block_row indices that are used
for block rows other than the first and last. Effectively, this
eliminates unexpected visual artifacts when using jpeg_crop_scanline()
along with interblock smoothing while decompressing the DC scan of a
progressive JPEG image.
Based on:
0227d4fb48Fixes#456Closes#457
* tag '2.0.5':
TurboJPEG: Make global error handling thread-safe
ChangeLog.md: Add missing sub-header for 2.0.5
ChangeLog.md: List CVE ID fixed by previous commit
rdppm.c: Fix buf overrun caused by bad binary PPM
Build: Add missing jpegtran-icc test dependency
rdswitch.c: Eliminate spaces before semicolons
TJCompressor.compress(int): Fix YUV-to-JPEG error
Bump version to 2.0.5; Document previous commit
MIPS DSPr2: Work around various 'make test' errors
MIPS DSPr2: Fix compiler warning with -mdspr2
MIPS SIMD: Always honor JSIMD_FORCE* env vars
Test: Honor CMAKE_CROSSCOMPILING_EMULATOR variable
- Introduce a partial image decompression regression test script that
validates the correctness of jpeg_skip_scanlines() and
jpeg_crop_scanlines() for a variety of cropping regions and libjpeg
settings.
This regression test catches the following issues:
#182, fixed in 5bc43c7821#237, fixed in 6e95c08649794f5018608f37250026a45ead2db8
#244, fixed in 398c1e9acc#441, fully fixed in this commit
It does not catch the following issues:
#194, fixed in 773040f9d9#244 (additional segfault), fixed in
9120a24743
- Modify the libjpeg-turbo regression test suite (make test) so that it
checks for the issue reported in #441 (segfault in
jpeg_skip_scanlines() when used with 4:2:0 merged upsampling/color
conversion.)
- Fix issues in jpeg_skip_scanlines() that caused incorrect output with
h2v2 (4:2:0) merged upsampling/color conversion. The previous commit
fixed the segfault reported in #441, but that was a symptom of a
larger problem. Because merged 4:2:0 upsampling uses a "spare row"
buffer, it is necessary to allow the upsampler to run when skipping
rows (fancy 4:2:0 upsampling, which uses context rows, also requires
this.) Otherwise, if skipping starts at an odd-numbered row, the
output image will be incorrect.
- Throw an error if jpeg_skip_scanlines() is called with two-pass color
quantization enabled. With two-pass color quantization, the first
pass occurs within jpeg_start_decompress(), so subsequent calls to
jpeg_skip_scanlines() interfere with the multipass state and prevent
the second pass from occurring during subsequent calls to
jpeg_read_scanlines().
The additional segfault mentioned in #244 was due to the fact that
the merged upsamplers use a different private structure than the
non-merged upsamplers. jpeg_skip_scanlines() was assuming the latter, so
when merged upsampling was enabled, jpeg_skip_scanlines() clobbered one
of the IDCT method pointers in the merged upsampler's private structure.
For reasons unknown, the test image in #441 did not encounter this
segfault (too small?), but it encountered an issue similar to the one
fixed in 5bc43c7821, whereby it was
necessary to set up a dummy postprocessing function in
read_and_discard_scanlines() when merged upsampling was enabled.
Failing to do so caused either a segfault in merged_2v_upsample() (due
to a NULL pointer being passed to jcopy_sample_rows()) or an error
("Corrupt JPEG data: premature end of data segment"), depending on the
number of scanlines skipped and whether the first scanline skipped was
an odd- or even-numbered row.
Fixes#441Fixes#244 (for real this time)
IIRC, this was only necessary with the version of Java 1.5 that shipped
with OS X 10.4 "Tiger". Apple's implementation of Java 6 ("Java for
OS X Systems") supported both .jnilib and .dylib extensions for JNI
libraries, but Oracle's implementation of Java has only ever supported
the .dylib extension.
The scales have now tilted overwhelmingly in favor of eliminating
support for 32-bit Macs:
- 32-bit applications are only necessary in order to support OS X 10.5
"Leopard" and OS X 10.6 "Snow Leopard". OS X 10.7 "Lion" requires a
64-bit Mac and supports all 64-bit Macs.
- 32-bit applications are no longer allowed in the macOS App Store.
- 32-bit applications no longer run in macOS 10.15 "Catalina".
- 32-bit applications do not support thread-local storage, so the
TurboJPEG API library's global error handler is not thread-safe with
such applications.
- libjpeg-turbo 2.1.x no longer supports 32-bit iOS apps, so it makes
sense to also eliminate support for 32-bit macOS applications.
It's time.
We haven't provided official Cygwin builds since 1.4.x, since Cygwin
now supplies its own libjpeg-turbo packages (although they apparently
haven't been updated past 1.5.3.)
This extends the fix in 1e81b0c3ea to
include binary PPM files with maximum values < 255, thus preventing a
malformed binary PPM input file with those specifications from
triggering an overrun of the rescale array and potentially crashing
cjpeg, TJBench, or any program that uses the tjLoadImage() function.
Fixes#433
Referring to #408, this commit #ifdefs DSPr2 SIMD functions that only
work on little endian processors, and it completely excludes
jsimd_h2v1_downsample_dspr2() and jsimd_h2v2_downsample_dspr2(). The
latter two functions fail with the TJBench tiling regression tests, most
likely because the implementation of the functions predates those tests.
Previously, these environment variables were not honored unless a 74K
CPU was detected, but this detection doesn't work properly with QEMU's
user mode emulation. With all other CPU types, libjpeg-turbo honors
JSIMD_FORCE* regardless of CPU detection.
This CMake variable is intended to define a wrapper program for
executing cross-compiled executables. However, CTest doesn't use
CMAKE_CROSSCOMPILING_EMULATOR, because it isn't obvious which tests
should be executed with the wrapper and which tests are scripts that
don't need it. This commit manually prepends
${CMAKE_CROSSCOMPILING_EMULATOR} to all unit test command lines that
execute a program built by the libjpeg-turbo build system. Thus, one
can set CMAKE_CROSSCOMPILING_EMULATOR in a CMake toolchain file to (for
instance) "qemu-{architecture} {qemu_arguments}") in order to execute
all eligible unit tests using QEMU.
+ document that tjFree() accepts NULL pointers without complaint.
Effectively, it has had that behavior all along, but the API does not
guarantee that tjFree() will be implemented with free() behind the
scenes, so it's best to formalize the behavior.
This programming practice (which exists in other code bases as well)
is a by-product of having used early C compilers that did not properly
handle free(NULL). All modern compilers should properly handle that.
Fixes#398
- Don't enumerate the types of SIMD instructions that libjpeg-turbo
supports, as this can change without notice.
- Use more clear terminology when describing support for libjpeg v7/v8
features ("libjpeg" is, colloquially but not officially, the name for
the IJG's software, whereas the "libjpeg API" refers to our emulation
of said software.)
- "PhotoShop" = "Photoshop" (StudLy Caps Police)
- Adjust dynamic library versions to reflect the addition of
jpeg_read_icc_profile() and jpeg_write_icc_profile() in
libjpeg-turbo 2.0.x.
Move constants out of the .text section in simd/arm64/jsimd_neon.S and
into a .rodata section. This ensures that the ARMv8 NEON SIMD
extensions are compatible with memory layouts that are marked
execute-only (and thus unreadable.)
Based on:
88f3ca7664Closes#318
libjpeg-turbo never included that code, because it requires an external
library (the Utah Raster Toolkit.) The RLE image format was supplanted
by GIF in the late 1980s, so it is rarely seen these days. (It had a
lousy Weissman score, anyhow.)
- Enable progress reporting at run time using a new -report argument
(cjpeg now supports that argument as well)
- Limit the allowable number of scans using a new -maxscans argument
- Treat warnings as fatal using a new -strict argument
This mainly demonstrates how to work around the two issues with the
JPEG standard described here:
https://libjpeg-turbo.org/pmwiki/uploads/About/TwoIssueswiththeJPEGStandard.pdf
since those and similar issues continue to be erroneously reported as
libjpeg-turbo bugs.
Modern Loongson processors are MIPS64-compatible, and MMI instructions
are now supported in the mainline of GCC. Thus, this commit adds
compile-time and run-time auto-detection of MMI instructions and moves
the MMI SIMD extensions for libjpeg-turbo from simd/loongson/ to
simd/mips64/. That will allow MMI and MSA instructions to co-exist
in the same build once #377 has been integrated.
Based on:
82953ddd61Closes#383
Because of 01e3032354 (officially
eliminating support for compilers without unsigned char, since we never
effectively supported those compilers anyhow), GETJOCTET() is now a
no-op. Since that macro is in jmorecfg.h, it is part of the de facto
libjpeg API and must remain in the public headers. However, there is no
reason to continue using it internally, and eliminating its internal use
improves code readability.
This commit adds ARM64 NEON optimizations for the
encode_mcu_AC_first() and encode_mcu_AC_refine() functions used in
progressive Huffman encoding.
Compression speedups for the typical set of five libjpeg-turbo test
images (https://libjpeg-turbo.org/About/Performance):
Cortex-A53: 23.8-39.2% (avg. 32.2%)
Cortex-A72: 26.8-41.1% (avg. 33.5%)
Apple A7: 29.7-45.9% (avg. 39.6%)
Closes#229
Homebrew tends to drop support for a macOS release the second that Apple
stops releasing security updates for it, and that makes HB difficult to
use with some of the Travis macOS images. Furthermore, even on
supported macOS releases, HB sometimes tries to build GCC from source
even if a binary (bottle) is available. Long story short, MacPorts just
generally has better backward compatibility. MacPorts is also what I
personally use on the official libjpeg-turbo build machine.
... detected by ASan. This is a similar issue to the issue that was
fixed with 402a715f82. Apparently it is
possible to create a malformed JPEG image that exceeds the Huffman
encoder's 256-byte local buffer when attempting to losslessly tranform
the image. That makes sense, given that it was necessary to extend the
Huffman decoder's local buffer to 512 bytes in order to handle all
pathological cases (refer to 0463f7c9aad060fcd56e98d025ce16185279e2bc.)
Since this issue affected only lossless transformation, a workflow that
isn't generally exposed to arbitrary data exploits, and since the
overrun did not overflow the stack (i.e. it did not result in a segfault
or other user-visible issue, and valgrind didn't even detect it), it did
not likely pose a security risk.
Fixes#392
... that caused some JPEG images with unusual sampling factors to be
misidentified as 4:4:4. This led to a buffer overflow when attempting
to decompress some such images using tjDecompressToYUV*().
Regression introduced by 479501b07c
The correct behavior is for the TurboJPEG API to refuse to decompress
such images, which it did prior to the aforementioned commit.
Fixes#389
Referring to #289, I'm not sure where I arrived at the conclusion that
the SSE2 progressive Huffman encoder doesn't provide any speedup for
x32. Upon re-testing, I discovered it to be about 50% faster than the
C encoder.
This commit also re-purposes one of the CI tests (specifically, the
jpeg-7 API/ABI test) so that it tests x32 as well.
... introduced by 42825b68d5. In fact,
fault-tolerant multi-scan block smoothing cannot currently be used with
the arithmetic decoder, because that decoder doesn't have any way of
distinguishing a normal end of scan from an unexpected end of scan.
Thus, this commit also modifies the change log to reset the expectations
regarding the scope of the fault-tolerant multi-scan block smoothing
feature. If, at some point in the future, the arithmetic decoder can be
modified to detect an unexpected end of scan, then one would need only
set entropy->pub.insufficient_data = TRUE when the arithmetic decoder
encounters an unexpected end of scan in order to make fault-tolerant
block smoothing work properly with that decoder.
This commit modifies the behavior of the block smoothing algorithm in
the libjpeg API library so that, if a scan in a multi-scan JPEG image is
incomplete (due to premature termination of the image stream), the block
smoothing parameters from the previous (complete) scan are used to
smooth any iMCU rows that the incomplete scan does not contain.
Closes#343
Modifying a locally-defined non-volatile variable below the setjmp()
return point results in undefined behavior whereby the variable may not
have the expected value after setjmp() returns.
Fixes#379
The primary impetus for this is to eliminate build warnings, such as
(32-bit only)
section "__textcoal_nt" is deprecated
object file (XXXXXX.o) was built for newer OSX version (10.XX) than
being linked (10.5)
Upgrading to GCC 6 results in neutral performance for compression,
a measured average overall decompression speedup of 2.5% for 64-bit
code, and a measured average overall decompression speedup of -4.3% for
32-bit code on a 3 GHz Core i7. The 4.3% slow-down for 32-bit code is
deemed acceptable, given that 32-bit macOS apps are deprecated.
Splitting the pointer arithmetic in GET_SYM() into a separate add and
sub instruction was an attempt to work around an error ("invalid operand
type") that occurred when assembling the file with NASM. However, this
created a link error on macOS ("ld: illegal text-relocation to
'_jconst_huff_encode_one_block' in
simd/CMakeFiles/simd.dir/i386/jchuff-sse2.asm.o from
'_jsimd_huff_encode_one_block_sse2' in
simd/CMakeFiles/simd.dir/i386/jchuff-sse2.asm.o for architecture i386")
and also changed the alignment of the code in ways that might have
affected the previous benchmark results (which took a great deal of time
to obtain.) Ultimately, the path of least resistance is just to
require NASM 2.13 or later.
This commit improves the C and SSE2 Huffman encoding implementations in
the following ways:
- Avoid using xmm8-xmm15 in the x86-64 SSE2 implementation. There is no
actual need to use those registers, and avoiding them produces a
cleaner WIN64 function entry/exit-- as well as shorter code, since REX
prefixes can be avoided (this is helpful on certain CPUs, such as
Intel Atom, for which instruction fetch and decoding can be a
bottleneck.)
- Optimize register usage so that fewer REX prefixes and
register-register moves are needed.
- Use the bit counter to store the number of free bits in the bit buffer
rather than the number of bits in the bit buffer. This changes the
method for inserting a code into the bit buffer to:
(put_buffer |= code << (free_bits -= code_size));
As a result:
* Only one bit counter needs to stay in a register (we just keep it in
cl.)
* The bit buffer contents are already properly aligned to be written
out (after a byte swap.)
* Adjusting the free bits counter and checking if the bit buffer is
full can be combined into a single operation.
* We can wait to flush the bit buffer until the buffer is actually
full and not just in danger of becoming full. Thus, eight bytes can
be flushed at a time.
- Speed is quite sensitive to the alignment of branch target labels, so
insert some padding and remove branches from the flush code.
(Flushing this way isn't actually faster when compared to using
branches, but the branchless code doesn't need extra alignment and is
thus smaller.)
- Speculatively write out the bit buffer as a single 8-byte write,
falling back to a byte-by-byte write only if there are any 0xFF bytes
in the bit buffer that need to be encoded as 0xFF 0x00.
- Use MMX registers for the 32-bit implementation (so the bit buffer can
be 64 bits wide.)
- Slightly reduce overall function code size.
- Eliminate or combine a few SSE instructions.
- Make some minor improvements to instruction scheduling.
- Adjust flush_bits() in jchuff.c to handle cases in which the bit
buffer has less than 7 free bits (apparently that couldn't happen
before.)
Based on:
947a09defa262ebb6b816e9a091221
See change log for performance claims.
Closes#292
This macro is a relic of libjpeg's historic need to support a wide
variety of C compilers with varying degrees of compatibility. Such was
necessary during the open systems era, because C compilers were often
supplied by the system vendor. Prior to 1989, there was no C standard
per se, and even after ANSI C became a thing, there were still compilers
in use that didn't conform to it (libjpeg was first released in 1991.)
Realistically, only a handful of C compilers are in widespread use these
days, and all modern C compilers should support structure assignment.
... to avoid backward compatibility issues with GCC 4-6 MinGW
toolchains. Apparently GCC 7+ MinGW toolchains introduce a link-time
dependency with internal MinGW CRT functions that are meant to provide
compatibility with Microsoft's Universal CRT (ucrt) library, but those
internal functions are not available in GCC 4-6 MinGW toolchains. This
made it impossible to use the official builds of libjpeg.a and
libturbojpeg.a with GCC 4-6 MinGW toolchains (a fatal link error--
"undefined reference to '__imp___acrt_iob_func'"-- occurred.)
This problem was not immediately apparent after switching to the MSYS2
implementation of MinGW (d6d7b53968)
because, for a while, MSYS2 was still using GCC 5 and 6.
Refer to libjpeg-turbo/libjpeg-turbo#382
byte, word, dword, qword, oword, and yword are all assembler keywords,
so it makes sense to use lowercase for these so as not to mistake them
for macros or constants.
(AKA "Java for OS X systems.") This implementation of Java 1.6 is long
obsolete and not supported on any version of macOS past High Sierra.
Oracle no longer provides a 32-bit JVM on macOS, so it is no longer
necessary to provide a 32-bit version of the TurboJPEG Java wrapper on
macOS.
If the TurboJPEG instance passed to tjDecodeYUV[Planes]() was previously
used to decompress a progressive JPEG image, then we need to disable the
progressive decompression parameters in the underlying libjpeg instance
before calling jinit_master_decompress().
This commit also modifies the build system so that the "tjtest" target
will test for this issue, and it corrects a previous oversight in the
build system whereby tjbenchtest did not test progressive
compression/decompression unless WITH_JAVA was true.
(regression introduced by 5b177b3cab)
The SSE2 implementation of progressive Huffman encoding performed
extraneous iterations when the scan length was a multiple of 16.
Based on:
bb7f1ef983Fixes#335Closes#367
- Re-purpose the non-SIMD test to test with MSan as well.
- Re-purpose the ASan test to test with UBSan as well.
- Use the default Travis build environment rather than specifying Ubuntu
14.04. I think I added 'dist: trusty' back when 14.04 was newer than
the default, but now it's older than the default.
- Enable verbose output for any unit tests that fail (so we can see the
sanitizer output.)
Prevent several integer overflow issues and subsequent segfaults that
occurred when attempting to compress or decompress gigapixel images with
the TurboJPEG API:
- Modify tjBufSize(), tjBufSizeYUV2(), and tjPlaneSizeYUV() to avoid
integer overflow when computing the return values and to return an
error if such an overflow is unavoidable.
- Modify tjunittest to validate the above.
- Modify tjCompress2(), tjEncodeYUVPlanes(), tjDecompress2(), and
tjDecodeYUVPlanes() to avoid integer overflow when computing the row
pointers in the 64-bit TurboJPEG C API.
- Modify TJBench (both C and Java versions) to avoid overflowing the
size argument to malloc()/new and to fail gracefully if such an
overflow is unavoidable.
In general, this allows gigapixel images to be accommodated by the
64-bit TurboJPEG C API when using automatic JPEG buffer (re)allocation.
Such images cannot currently be accommodated without automatic JPEG
buffer (re)allocation, due to the fact that tjAlloc() accepts a 32-bit
integer argument (oops.) Such images cannot be accommodated in the
TurboJPEG Java API due to the fact that Java always uses a signed 32-bit
integer as an array index.
Fixes#361
This commit modifies h1v2_fancy_upsample() so that it uses an ordered
dither pattern, similar to that of h2v1_fancy_upsample(), rounding up or
down the result for alternate pixels rather than always rounding down.
This ensures that the decompression error pattern for a 4:4:0 JPEG image
will be similar to the rotated decompression error pattern for a 4:2:2
JPEG image. Thus, the final result will be similar regardless of
whether a 4:2:2 JPEG image is rotated or transposed before or after
decompression.
Closes#356
xcode7.3 is based on El Capitan, which is EOL, and Homebrew no longer
provides El Cap bottles (pre-compiled binaries.) Thus, Homebrew was
trying to build GCC 5, YASM, and the other packages we need from source,
which caused the Mac CI builds to time out. I tried goading Homebrew
into installing GCC 5.5.0_2 and YASM 1.3.0_1, which still have El Cap
bottles available, by using the URLs of those specific versions of the
formulae (from the Homebrew GitHub repository) as package names. This
failed, however, because 'brew bundle' converted the URLs to all
lowercase. I then tried explicitly installing the old formulae by using
'brew install' with the aforementioned URLs (bypassing the Travis
Homebrew addon), but Homebrew still tried to build all of the
dependencies from source.
Upgrading to Xcode 8.3.x necessitated regression testing the performance
on iOS, but that proved less of a pain than figuring out how to install
all of the old Homebrew bottles we needed. Also, this future-proofs the
CI builds against the inevitable discontinuation of the xcode7.3 image.
Note that Xcode 8.3.x improves iOS 64-bit decompression performance
significantly relative to Xcode 7.2.x or 7.3.x. iOS 32-bit performance
unfortunately regresses by as much as 5%, but it can't be helped (32-bit
iOS apps are no longer supported on iOS 11+ anyhow, and the next major
release of libjpeg-turbo will remove support for them as well.)
... including, but not limited to:
- unused macros
- private functions not marked as static
- unprototyped global functions
- variable shadowing
(detected by various non-default GCC 8 warning options)
According to Intel's manual [1], "If a value entered for CPUID.EAX is
higher than the maximum input value for basic or extended function for
that processor then the data for the highest basic information leaf is
returned."
Right now, libjpeg-turbo doesn't first check that leaf 07H is supported
before attempting to use it, so the ostensible AVX2 bit (Bit 05) of the
CPUID result might actually be Bit 05 from a lower leaf. That bit might
be set, even if the CPU doesn't support AVX2.
This commit modifies the x86 and x86-64 SIMD feature detection code so
that it first checks whether CPUID leaf 07H is supported before
attempting to use it to check for AVX2 instruction support.
DRC:
This commit should fix
https://bugzilla.mozilla.org/show_bug.cgi?id=1520760
However, I have not personally been able to reproduce that issue,
despite using a Nehalem (pre-AVX2) CPU on which the maximum CPUID leaf
has been limited via a BIOS setting.
Closes#348
[1]
"Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 2 (2A, 2B, 2C & 2D): Instruction Set Reference, A-Z", https://software.intel.com/sites/default/files/managed/a4/60/325383-sdm-vol-2abcd.pdf, page 3-192.
Same as d3a3a73f64 but in the fast decode
path. It was necessary to use a different-sized test image in order to
trigger the error in this location.
Refer to #347
5ea77d8b77 was insufficient to fix all of
these. In particular, we need to always release the primitive arrays
before throwing an exception, because throwing an exception qualifies as
"using JNI."
Refer to #300
CONTRIBUTING.md: Correct misuse of "as such" (Grammar Police)
bug-report.md: Clarify that the submitter should always test against the
latest stable code base.
Some pathological test images have been created that can cause s to
overflow or underflow the signed int data type during decompression.
This is technically undefined behavior according to the C spec, although
every modern implementation I'm aware of will treat the signed int as a
2's complement unsigned int, thus causing the value to wrap around to
INT_MIN if it exceeds INT_MAX. This commit simply makes that behavior
explicit in order to shut up UBSan. At least when building for x86-64
or i386 using Clang or GCC, this commit does not change the
compiler-generated assembly code at all.
The code that triggered this error has existed in the libjpeg code base
for at least 20 years (and probably much longer), so the fact that it
hasn't produced a user-visible problem in all of that time strongly
suggests that UBSan is being overly pedantic here. But if someone can
cough up a platform that doesn't wrap around to INT_MIN when 1 is added
to INT_MAX, then I'll happily change my opinion.
Fixes#347
... since www.nasm.us seems to be down frequently. This doesn't help us
at the moment, but hopefully once the site is back up this will prevent
future build failures.
When simd/arm/jsimd.c is compiled with __ARM_NEON__ defined (which will
be the case if -mfpu=neon is passed to the compiler), the
parse_proc_cpuinfo() and check_feature() functions and the bufsize
variable are unused and thus need to be #ifdef'ed out in order to avoid
compiler warnings. Note that the bufsize variable was already #ifdef'ed
out on Linux but not on Android due to lack of parentheses (&& takes
precedence over ||.)
Closes#331
%{_libdir}/pkgconfig is a directory and should thus be prefixed by
%{dir} (oops.) This issue caused the debuginfo build under RHEL 8
(which is apparently now enabled by default-- regardless of whether the
RPM actually contains debug info, but that's another matter) to fail
with:
RPM build errors:
File listed twice: /opt/libjpeg-turbo/lib64/pkgconfig/libjpeg.pc
File listed twice: /opt/libjpeg-turbo/lib64/pkgconfig/libturbojpeg.pc
On RHEL 7 and later (not sure exactly whether this is a product of the
newer RPM release or something distro-specific), macros are lazily
expanded, so we need to set _docdir using %global (which expands at
definition time) and prior to _prefix and _datarootdir (which affect
_defaultdocdir.) Otherwise, _docdir is set to a subdirectory of
/opt/libjpeg-turbo/share/doc or /opt/libjpeg-turbo/doc. The former
(which happens on RHEL 7) leads to incorrect documentation packaging
(the docs should be packaged under /usr/share/doc per Red Hat
standards), and the latter (which happens on RHEL 8) leads to an RPM
build error.
AFAICT, Requires and BuildRequires subsumed the functionality of Prereq
and BuildPrereq in RPM 4.0, and none of the platforms we support with
libjpeg-turbo 2.0.x has RPM < 4.4.
We have more than eight registers to work with, as well as three-operand
intrinsics, so there's no need for the implementation to be such a
literal port of the MMX code.
Unfortunately, this hack is necessary because:
- install(TARGETS, ...) doesn't support the RENAME option.
- We can't modify OUTPUT_NAME for the "-static" targets without breaking
the regression tests.
- ${CMAKE_CFG_INTDIR} doesn't seem to work properly in an install()
command.
Refer to #307
libjpeg-turbo has never really supported such compilers, since (AFAIK)
they are non-existent on any modern computing platform and thus
impossible for us to test. (Also, the TurboJPEG API would break without
unsigned chars.)
Furthermore, the unified CMake-based build system introduced in 2.0
always defines HAVE_UNSIGNED_CHAR, so retaining other code paths is
pointless. Eliminating support for compilers without unsigned char
eliminates the need for the GETJSAMPLE() macro, which improves the
readability of many parts of the code as well as improving the
performance of writing Targa and Windows BMP files.
Fixes#317
Including the license templates was confusing to some, since it made
it appear as if the copyright year and author were unspecified for the
libjpeg-turbo source. Thus, rather than include the zlib License
template, link to that template on opensource.org. For the Modified BSD
License, include a roll-up of copyright years and authors, since the
terms of that license require the text of it to be included in product
documentation for binary distributions without accompanying source code.
Use the android.toolchain.cmake toolchain file in the NDK (v13b or
later), since this toolchain file generally takes care of setting the
approprate compiler flags and dealing with the differences between
GCC and Clang. Our custom Android build procedure did not work with
Clang-based NDK toolchains, which meant that it could not be made to
work with NDK v18b or later.
Fixes#309
Normally, 4:4:4 JPEGs have horizontal x vertical luminance & chrominance
sampling factors of 1x1. However, it is technically legal to create
4:4:4 JPEGs with sampling factors of 2x1, 1x2, 3x1, or 1x3, since the
sums of the products of those sampling factors are still <= 10. The
libjpeg API correctly decodes such images, so the TurboJPEG API should
as well.
Fixes#323
If cinfo->quantize_colors == 1, then jpeg_calc_output_dimensions() will
set cinfo->output_components to 1, and if cinfo->out_color_space is not
RGB (or extended RGB), hilarity will ensue.
Fixes#305
Caused by 950580eb0c. Since the code that
sets CMAKE_INSTALL_RPATH now depends on ENABLE_SHARED, that code needed
to be moved to after the point at which ENABLE_SHARED is defined.
I give up on the public keyserver. It inexplicably just fails
sometimes. I was trying to use it out of an abundance of caution
(<cough> paranoia <cough>), but it seems like most open source projects
just serve up their public keys from their project web sites. The
private and public pre-release keys are still stored on separate sites,
the private key is still strongly encrypted by Travis, and we use a
separate key for pre-releases anyhow, so even if it's compromised, we
can quickly and easily deploy a new one.
... when downloading the RPM signing key. Apparently the key server
URL sometimes redirects to an https URL, which may explain why fetching
the RPM signing keys failed frequently when we used to run wget inside
of the CentOS 5 Docker container.
The build will consistently fail for days at a time with:
error: http://pool.sks-keyservers.net/pks/lookup?op=get&search=0x0575F26BD5B3FDB1: import read failed(-1).
I have a hunch that this is related to the CentOS 5 Docker container, so
this commit causes Travis to download the RPM signing key outside of
the container and share it with the container.
We shouldn't be making JNI calls between GetPrimitiveArrayCritical() and
ReleasePrimitiveArrayCritical(). Apparently Android is stricter about
this than desktop Java.
Issue was introduced in 0713c1bb54.
Fixes#300
Apparently Windows 7 without SP1 has O/S support for XSAVE but not for
YMM registers, and this exposed a bug in our usage of xgetbv. The test
instruction will set ZF only if none of the bits match between the two
operarands, so in effect, we were enabling AVX2 instructions if the O/S
supported XSAVE and the CPU supported AVX2 but the O/S only supported
XMM registers. This bug was not exposed on, for instance, Windows XP or
RHEL 5 because those O/S's do not support XSAVE.
Fixes#288
- CMake 3.10.x or later must be used with JDK 11, or an error
("regex not supported") will occur when CMake tries to parse the Java
version number.
- The JDK is no longer available at java.com.
The DSPr2 extensions have been verified to work with little endian MIPS.
Whether or not CMAKE_SYSTEM_PROCESSOR is set to "mips" or "mipsel" in a
little endian MIPS environment seems to be inconsistent, but our build
system needs to handle both cases.
(for instance, when passing -msoft-float to the compiler)
The instructions used by jsimd_quantize_float_dspr2() and
jsimd_convsamp_float_dspr2() don't work with the soft float ABI, so
disable those functions when soft float is enabled.
Based on:
129a739bfaCloses#272
When cli arguments request image-changing operation (like transform, scans or arith coding) to be applied, force output result file, even if it has bigger filesize than original
@rpath is only supported with 10.5 and later deployment targets.
libjpeg-turbo hasn't supported 10.4 "Tiger" since prior to 1.4, but I
still sometimes use the 10.4 SDK to test PowerPC code in a Snow Leopard
VM.
- When referring to specific clauses, annexes, tables, and figures, a
"timed reference" (a reference that includes the year) must be used in
order to avoid confusion.
- "CCITT" = "ITU-T"
- Replace ambiguous "JPEG spec" with the specific document number.
... in which one or more of the color indices is out of range for the
number of palette entries.
Fix partly borrowed from jpeg-9c. This commit also adopts Guido's
JERR_PPM_OUTOFRANGE enum value in lieu of our project-specific
JERR_PPM_TOOLARGE enum value.
Fixes#258
Normally the value of CMAKE_EXECUTABLE_SUFFIX is clobbered by project().
This allows for specifying an executable suffix of .html with Emscripten
builds, which causes Emscripten to build standalone HTML versions of the
libjpeg-turbo test programs.
The sentence:
"Indeed, one of the original reasons for developing this free software
was to help force convergence on common, interoperable format standards
for JPEG files."
might be seen to imply that JPEG 2000 and JPEG XR are not interoperable
with themselves, although it is certainly the case that those formats
are not interoperable with each other, nor with
ITU T.81 | ISO/IEC 10918. They are also certainly not as common as
ITU T.81 | ISO/IEC 10918, and (as an example) popular web browsers will
not display JPEG 2000 files.
The sentence in question was originally referring to proprietary,
non-standard formats and was meant to provide historical context.
libjpeg was originally released prior to the adoption of JFIF as an
official standard, so it encouraged adoption of JFIF as a de facto
standard by providing, under a business-friendly free software license,
a library for reading and writing images in that format.
This is basically the same test that was performed in acinclude.m4 in
the old autotools-based build system. It was not ported to the
CMake-based build system because I previously had no way of testing
a non-DSPr2 build environment.
Fixes#248
... caused by using certain specific combinations of
jpeg_skip_scanlines() and jpeg_read_scanlines() calls with progressive,
vertically-subsampled JPEG images.
Fixes#237
In rdbmp.c, it is necessary to guard against 32-bit overflow/wraparound
when allocating the row buffer, because since BMP files have 32-bit
width and height fields, the value of biWidth can be up to 4294967295.
Specifically, if biWidth is 1073741824 and cinfo->input_components = 4,
then the samplesperrow argument in alloc_sarray() would wrap around to
0, and a division by zero error would occur at line 458 in jmemmgr.c.
If biWidth is set to a higher value, then samplesperrow would wrap
around to a small number, which would likely cause a buffer overflow
(this has not been tested or verified.)
... in tjLoadImage()/tjSaveImage(). These error codes require an add-on
message table, and if it isn't initialized, then format_message()
produces "Bogus message code XXXX" instead.
This change is proposed to enable cases where this library and it's CMakeLists.txt are included in other projects/CMakeLists.txt files as a dependency via the add_subdirectory method, for example: add_subdirectory(mozjpeg). There are several reasons and workflows which "wrap" third party projects/builds using this method, including many enterprise build/devops pipelines.
This change will have no effect on users building mozjpeg by itself as usual, it very simply enables the wrapping use cases.
With this parem do not write JFIF APP0 marker segment. Reduce size in 18 bytes. This is a mandatory marker, but no error in know programs if are lost. Safe for web use.
(detected by enabling additional checkstyle modules)
This commit also removes unnecessary uses of the "private" modifier in
the Java tests/examples. The default access modifier disallows access
outside of the package, and none of these classes is in a package. The
only reason we use "private" with member variables in these classes is
to make checkstyle happy, because we want it to enforce that behavior in
the TurboJPEG API code.
... and modify tjbench.c to match the variable name changes made to
TJBench.java
("checkstyle" = http://checkstyle.sourceforge.net, not our regex-based
checkstyle script)
Arguably it doesn't make much sense for non-chroma components to be
subsampled (which is why this type of image was overlooked in
cd7c3e6672cce3779450c6dd10d0d70b0c2278b2-- I didn't realize it was a
thing), but certain Adobe applications apparently generate these images.
Fixes#236
The old Un*x (autotools-based) build system always auto-generated this
file, but that behavior was more or less a relic of the days before the
libjpeg-turbo colorspace extensions were implemented. The thinking was
that, if a particular developer wanted to change RGB_RED, RGB_GREEN,
RGB_BLUE, or RGB_PIXELSIZE in order to compress from/decompress to
different RGB pixel layouts, then the SIMD extensions should
automatically respond to those changes whenever they were made to
jmorecfg.h. The modern reality is that changing RGB_* is no longer
necessary because of the libjpeg-turbo colorspace extensions, and
changing any of the other constants in jsimdcfg.inc can't be done
without making deeper modifications to the SIMD extensions. In general,
we treat RGB_* as a de facto, immutable part of the legacy libpjeg API.
Realistically, since the values of those constants have been the same in
every Un*x distribution released in the past 20-30 years, any software
that uses a system-supplied build of libjpeg must assume that those
constants will have default values.
Furthermore, even if it made sense to auto-generate jsimdcfg.inc, it was
never possible to do so on Windows, so it was always going to be
necessary to manually generate the Windows version of the file whenever
any of the constants changed. This commit introduces a new custom CMake
target called "jsimdcfg" that can be used, on Un*x platforms, to
generate jsimdcfg.inc on demand, although this should only be necessary
when introducing new x86 SIMD instructions or making other deep
modifications, such as SIMD acceleration for 12-bit JPEGs.
For those who may be wondering why we don't do the same thing for
win/jconfig.h.in, it's because performing all of the necessary CMake
checks to populate that file is very slow on Windows.
These were necessary for the first iteration of the feature (see #46),
which provided a different C front end for the SIMD version of the
function. The final version of the feature uses a common C front end
for both SIMD and non-SIMD implementations, so these checks are no
longer necessary.
Closes#231
This commit merges the following paragraph from the latest libjpeg
release:
https://github.com/libjpeg-turbo/ijg/blob/jpeg-9c/README#L222-L229
which takes into account the fact that JFIF is now an official ISO/ITU-T
standard. I also included the ISO/IEC document number for the JFIF spec
(jpeg-9c included only the ITU-T rec number.)
This commit also heavily wordsmiths the "FILE FORMAT WARS" section.
In jpeg-7 and later, this section has become somewhat impolitic,
referring to JPEG 2000 and JPEG XR as "faulty technologies" and
"momentary mistakes." The original intent of this section, which was
introduced in jpeg-5 and refined in jpeg-6
(https://github.com/libjpeg-turbo/ijg/blob/jpeg-5/README#L317-L338,
https://github.com/libjpeg-turbo/ijg/blob/jpeg-6b/README#L335-L367)
was to highlight the problem of JPEG file format divergence that existed
in the 1990s prior to the adoption of JFIF as an official ISO/ITU-T
standard. That problem is fortunately no longer a problem, thanks in
part to the existence of libjpeg. I have attempted to preserve Tom's
intent of using this section to describe which file formats the code is
compatible with and why it isn't compatible with some file formats
bearing the name "JPEG." Such modifications always put our project in a
very awkward position, because we are not the IJG and do not claim to
be, but it is still necessary for us to modify the IJG README file from
time to time to eliminate obsolete information while attempting to
remain as neutral as possible.
Instructing the compiler to treat warnings as errors caused some of the
compiler tests to fail, because the test code was not 100% clean.
Note that we now use check_symbol_exists() to check for memset() and
memcpy(), since the test code for check_function_exists() produces a
compiler warning due to not including <string.h>.
... instead of the RSA code, the license for which contains an
advertising clause. It is strongly believed that the RSA advertising
clause is innocuous, because:
- A clarification from RSA
(http://www.ietf.org/ietf-ftp/IPR/RSA-MD-all), published in 2000,
stated:
"Implementations of these message-digest algorithms, including
implementations derived from the reference C code in RFC-1319,
RFC-1320, and RFC-1321, may be made, used, and sold without license
from RSA for any purpose."
Referring to the opinion from Fedora's legal team
(https://fedoraproject.org/wiki/Licensing:FAQ?rd=Licensing/FAQ#What_about_the_RSA_license_on_their_MD5_implementation.3F_Isn.27t_that_GPL-incompatible.3F),
this means that md5.c and md5.h, which were derived from the original
RFC 1321 reference code (http://www.faqs.org/rfcs/rfc1321.html), can
be used without the RSA license.
- In the context of libjpeg-turbo, RSA's MD5 code was used only in the
build/test system. It was not part of the libjpeg-turbo binary
distribution, and thus the only "material mentioning or referencing"
the MD5 code was the libjpeg-turbo source code, which-- by virtue of
including RSA's original copyright headers-- properly attributed the
code as required under the RSA license.
However, in light of the open source community's tendency to have
knee-jerk reactions to stuff like this, it would've been necessary to
include the above explanation in our source tree in order to head off
potential FUD, and a simple fix is always better than a complex
explanation.
This commit also assigns the 3-clause BSD license to my modifications of
the MD5 code. This license is the same one used by md5cmp and other
parts of the build system.
When attempting to configure an iOS/ARM build with Xcode 7.2 and CMake
2.8.12, I got the following errors:
CMake Error at CMakeLists.txt:560 (add_library):
Attempting to use MACOSX_RPATH without CMAKE_SHARED_LIBRARY_RUNTIME_C_FLAG
being set. This could be because you are using a Mac OS X version less
than 10.5 or because CMake's platform configuration is corrupt.
(x 3)
CMake Error at sharedlib/CMakeLists.txt:38 (add_library):
Attempting to use MACOSX_RPATH without CMAKE_SHARED_LIBRARY_RUNTIME_C_FLAG
being set. This could be because you are using a Mac OS X version less
than 10.5 or because CMake's platform configuration is corrupt.
(x 3)
Upgrading to CMake 3.x (tried 3.0 and 3.1) got rid of the errors, but
the resulting shared libs still did not use @rpath as expected. Note
also that CMake 3.x (at least the two versions I tested) does not
automatically set the MACOSX_RPATH property as claimed. I could find
nothing in the release notes for later CMake releases to indicate that
either problem has been fixed. What I did find was this little nugget
of code in the Darwin platform module:
f6b93fbf3a/Modules/Platform/Darwin.cmake (L33-L36)
This sets CMAKE_SHARED_LIBRARY_RUNTIME_C_FLAG="-Wl,-rpath," only if you
are running OS X 10.5 or later. It makes no such check for iOS, perhaps
because shared libraries aren't much of a thing with iOS apps. In any
event, this commit simply sets CMAKE_SHARED_LIBRARY_RUNTIME_C_FLAG if it
isn't set already, and that fixes all of the aforementioned problems.
- Travis doesn't set the $encrypted_* variables for PRs, so disable GPG
signing when building a PR (artifacts aren't deployed for PRs anyhow,
and even if they were, I wouldn't want them to be signed, as they may
contain unvetted code.)
- Take advantage of the new -d option in buildljt, which allows for
building from an existing Git clone directory. This eliminates the need
to rename and restore .git/shallow, allows the official build scripts to
work properly when building PRs, and prevents 'git clone' being invoked
twice in CI builds.
Refer to #217
These files are potentially useful to MinGW users, since MSYS2 MinGW
environments have a man command by default and provide an easy way to
install pkg-config.
Closes#223
This commit adds C and SSE2 optimizations for the encode_mcu_AC_first()
function used in progressive Huffman encoding.
The image used for testing can be retrieved from this page:
https://blog.cloudflare.com/doubling-the-speed-of-jpegtran
All timings done on `Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz`
clang version is `Apple LLVM version 9.0.0 (clang-900.0.39.2)`
gcc-5 version is `gcc-5 (Homebrew GCC 5.5.0) 5.5.0`
gcc-7 version is `gcc-7 (Homebrew GCC 7.2.0) 7.2.0`
Here are the results in comparison to libjpeg-turbo@293263c using
`time ./jpegtran -outfile /dev/null -progressive -optimise -copy none print_poster_0025.jpg`
C
clang x86_64: +19%
gcc-5 x86_64: +80%
gcc-7 x86_64: +57%
clang i386: +5%
gcc-5 i386: +59%
gcc-7 i386: +51%
SSE2
clang x86_64: +79%
gcc-5 x86_64: +158%
gcc-7 x86_64: +122%
clang i386: +71%
gcc-5 i386: +134%
gcc-7 i386: +135%
Discussion in libjpeg-turbo/libjpeg-turbo#46
This commit adds C and SSE2 optimizations for the encode_mcu_AC_refine()
function used in progressive Huffman encoding.
The image used for testing can be retrieved from this page:
https://blog.cloudflare.com/doubling-the-speed-of-jpegtran
All timings done on `Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz`
clang version is `Apple LLVM version 9.0.0 (clang-900.0.39.2)`
gcc-5 version is `gcc-5 (Homebrew GCC 5.5.0) 5.5.0`
gcc-7 version is `gcc-7 (Homebrew GCC 7.2.0) 7.2.0`
Here are the results in comparison to libjpeg-turbo@3c54642 using
`time ./jpegtran -outfile /dev/null -progressive -optimise -copy none print_poster_0025.jpg`
C
clang x86_64: +7%
gcc-5 x86_64: +30%
gcc-7 x86_64: +33%
clang i386: +0%
gcc-5 i386: +24%
gcc-7 i386: +23%
SSE2
clang x86_64: +42%
gcc-5 x86_64: +53%
gcc-7 x86_64: +64%
clang i386: +35%
gcc-5 i386: +46%
gcc-7 i386: +49%
Discussion in libjpeg-turbo/libjpeg-turbo#46
Within the libjpeg API code, it seems to be more the convention than not
to separate the macro name and value by two or more spaces, which
improves general readability. Making this consistent across all of
libjpeg-turbo is less about my individual preferences and more about
making it easy to automatically detect variations from our chosen
formatting convention. I intend to release the script I'm using to
validate this stuff, once it matures and stabilizes a bit.
* Modify the SIMD dispatchers so they guard their usage of getenv() with
the existing NO_GETENV preprocessor definition.
* Introduce a new NO_PUTENV preprocessor definition to guard the
usage of putenv() in the TurboJPEG API library.
This at least puts Windows Store compatibility within the realm of
possibility, although further steps are required.
Broken by previous commit. Although turbojpeg.c no longer needs
tjutil.h on Un*x, it still needs to include that file on Windows in
order to use snprintf() and strcasecmp() (which, on Windows, are macros
that wrap _snprintf_s() and stricmp().)
With rare exceptions ...
- Always separate line continuation characters by one space from
preceding code.
- Always use two-space indentation. Never use tabs.
- Always use K&R-style conditional blocks.
- Always surround operators with spaces, except in raw assembly code.
- Always put a space after, but not before, a comma.
- Never put a space between type casts and variables/function calls.
- Never put a space between the function name and the argument list in
function declarations and prototypes.
- Always surround braces ('{' and '}') with spaces.
- Always surround statements (if, for, else, catch, while, do, switch)
with spaces.
- Always attach pointer symbols ('*' and '**') to the variable or
function name.
- Always precede pointer symbols ('*' and '**') by a space in type
casts.
- Use the MIN() macro from jpegint.h within the libjpeg and TurboJPEG
API libraries (using min() from tjutil.h is still necessary for
TJBench.)
- Where it makes sense (particularly in the TurboJPEG code), put a blank
line after variable declaration blocks.
- Always separate statements in one-liners by two spaces.
The purpose of this was to ease maintenance on my part and also to make
it easier for contributors to figure out how to format patch
submissions. This was admittedly confusing (even to me sometimes) when
we had 3 or 4 different style conventions in the same source tree. The
new convention is more consistent with the formatting of other OSS code
bases.
This commit corrects deviations from the chosen formatting style in the
libjpeg API code and reformats the TurboJPEG API code such that it
conforms to the same standard.
NOTES:
- Although it is no longer necessary for the function name in function
declarations to begin in Column 1 (this was historically necessary
because of the ansi2knr utility, which allowed libjpeg to be built
with non-ANSI compilers), we retain that formatting for the libjpeg
code because it improves readability when using libjpeg's function
attribute macros (GLOBAL(), etc.)
- This reformatting project was accomplished with the help of AStyle and
Uncrustify, although neither was completely up to the task, and thus
a great deal of manual tweaking was required. Note to developers of
code formatting utilities: the libjpeg-turbo code base is an
excellent test bed, because AFAICT, it breaks every single one of the
utilities that are currently available.
- The legacy (MMX, SSE, 3DNow!) assembly code for i386 has been
formatted to match the SSE2 code (refer to
ff5685d5344273df321eb63a005eaae19d2496e3.) I hadn't intended to
bother with this, but the Loongson MMI implementation demonstrated
that there is still academic value to the MMX implementation, as an
algorithmic model for other 64-bit vector implementations. Thus, it
is desirable to improve its readability in the same manner as that of
the SSE2 implementation.
+ "JSIMD_ARM_NEON" = "JSIMD_NEON"
+ "JSIMD_MIPS_DSPR2" = "JSIMD_DSPR2"
+ "*_mips_dspr2" = "*_dspr2"
It's obvious that "NEON" refers to Arm and "DSPr2" refers to MIPS, and
this naming convention is consistent with the other SIMD extensions.
Newer versions of CMake (known to be the case with 3.7.x and 3.10.x)
fail to add a space between CMAKE_C_FLAGS and CMAKE_ASM_FLAGS, which
causes the build to fail when using the official build procedure.
Closes#216
Referring to https://docs.microsoft.com/en-US/cpp/build/stack-usage:
"All memory beyond the current address of RSP is considered volatile:
The OS, or a debugger, may overwrite this memory during a user debug
session, or an interrupt handler. Thus, RSP must always be set before
attempting to read or write values to a stack frame."
Basically, if-- under extremely rare circumstances-- a context swap were
to occur between saving the values of xmm8-xmm11 and setting the new
value of rsp, the O/S might not preserve that area of the stack. In
general, libjpeg-turbo should not be using xmm8-xmm11 before or after
the call to jsimd_huff_encode_one_block_sse2(), so this is probably a
non-issue, but it's still a good idea to fix it.
Based on
ff7d2030dd
NDK r16b moved some things around, so modify the Android build recipes
to take that into account while preserving compatibility with previous
NDK releases.
NOTE: the GCC 4.9 NDK toolchain is deprecated, so we will need to
develop new Android build recipes for libjpeg-turbo 1.6 that use the
Clang toolchain.
Closes#196
If jpeg_skip_scanlines() is used to skip to the end of a single-scan
image, then we need to change the library state such that subsequent
calls to jpeg_consume_input() will return JPEG_REACHED_EOI rather than
JPEG_SUSPENDED. (NOTE: not necessary for multi-scan images, since the
scans are processed prior to any call to jpeg_skip_scanlines().)
Unless I miss my guess, using jpeg_skip_scanlines() in this manner
will prevent any markers at the end of the JPEG image from being
read, but I don't think there is any way around that without actually
reading the data, which would defeat the purpose of
jpeg_skip_scanlines().
Fixes#194
Refer to travis-ci/travis-ci#8552. This was supposed to be fixed on
November 15, then on November 28. Travis blew through both deadlines,
so I have no confidence that the issue will be fixed as promised in a
timely manner. Adding 'brew update' to .travis.yml slows the OS X
build, but there is no choice at the moment.
Loading RGB image files into a grayscale buffer isn't a particularly
useful feature, given that libjpeg-turbo can perform this conversion
much more optimally (with SIMD acceleration on some platforms) during
the compression process. Also, the RGB2GRAY() macro was not producing
deterministic cross-platform results because of variations in the
round-off behavior of various floating point implementations, so
`tjunittest -bmp` was failing in i386 builds.
Also, set the red/green/blue offsets for TJPF_GRAY to -1 rather than 0.
It was undefined behavior for an application to use those arrays/methods
with TJPF_GRAY anyhow, and this makes it easier for applications to
programmatically detect whether a given pixel format has red, green, and
blue components.
The main justification for this is to provide new libjpeg-turbo users
with a quick & easy way of developing a complete JPEG
compression/decompression program without requiring them to build
libjpeg-turbo from source (which was necessary in order to use the
project-private bmp API) or to use external libraries. These new
functions build upon significant enhancements to rdbmp.c, wrbmp.c,
rdppm.c, and wrppm.c which allow those engines to convert directly
between the native pixel format of the file and a pixel format
("colorspace" in libjpeg parlance) specified by the calling program.
rdbmp.c and wrbmp.c have also been modified such that the calling
program can choose to read or write image rows in the native (bottom-up)
order of the file format, thus eliminating the need to use an inversion
array. tjLoadImage() and tjSaveImage() leverage these new underlying
features in order to significantly improve upon the performance of the
old bmp API.
Because these new functions cannot work without the libjpeg-turbo
colorspace extensions, the libjpeg-compatible code in turbojpeg.c has
been removed. That code was only there to serve as an example of how
to use the TurboJPEG API on top of libjpeg, but more specific, buildable
examples now exist in the https://github.com/libjpeg-turbo/ijg
repository.
tjbenchtest and its Java derivatives are useful for rooting out hidden
problems with the more esoteric TJBench and TurboJPEG features. For
instance, on Windows, running tjbenchtest uncovered
5fce2e9421.
This commit also causes tjbenchtest and tjbenchtest.java to append -yuv
and -alloc to their log file names, depending on the arguments passed,
and it causes the build system to clean up those log files when the
'testclean' target is built.
+ clean up log files when 'make testclean' is invoked
+ fix 'tjbenchtest -yuv -alloc'
+ fix tjexampletest so that it creates images under /tmp
+ clean up tjexampletest
The program crashed when a JPEG image was passed on the command line,
because we were mixing our metaphors vis-a-vis malloc()/free() and
tjAlloc()/tjFree() (malloc()/free() uses the tjbench.exe heap,
whereas tjAlloc()/tjFree() uses the turbojpeg.dll heap.)
- Referring to 073b0e88a1 and #185, the
reason why BMP and RLE didn't (and won't) work with partial image
decompression is that the output engines for both formats maintain a
whole-image buffer, which is used to reverse the order of scanlines.
However, it was straightforward to add -crop support for GIF and
Targa, which is useful for testing partial image decompression along
with color quantization.
- Such testing reproduced a bug reported by Mozilla (refer to PR #182)
whereby jpeg_skip_scanlines() would segfault if color quantization was
enabled. To fix this issue, read_and_discard_scanlines() now sets up
a dummy quantize function in the same manner that it sets up a dummy
color conversion function.
Closes#182
... and document that only PPM/PGM output images are supported with the
-crop option for the moment.
I investigated the possibility of supporting -crop with -bmp, but even
after resetting the buffer dimensions, I still kept getting virtual
array access errors. It seems that doing this the "right way" would
require creating a re-initialization function for each image format's
destination manager. I'm disinclined to do that right now, given that
this feature was Google's baby (developed as a prerequisite for
including libjpeg-turbo in Android), and the -crop option in djpeg is
intended only as an example of how to use the partial image
decompression API. Real-world applications would need to handle this
in their own destination managers.
It would probably be possible to make this work with Targa by employing
a similar hack to the one we used with PPM, but Targa isn't popular
enough to bother.
Fixes#185
Setting _libdir to CMAKE_INSTALL_FULL_LIBDIR only works when doing an
in-tree RPM build. SRPMs are architecture-agnostic, so the spec needs
to compute_libdir at the time the SRPM is rebuilt, not at the time it
is generated.
This is a regression introduced when implementing the new CMake-based
cross-platform build system.
The version of RPM on RHEL 5 and older platforms defines _libdir
as %{_exec_prefix}/%{_lib}, so defining _lib in the spec file redefined
_libdir. However, newer versions of RPM (probably >= 4.6, since that
was the version that introduced the ISA macros) define _libdir as either
%{_prefix}/lib or %{_prefix}/lib64. Thus, we need to explicitly
override _libdir in our spec file.
It is necessary for the C code to be aware of the machine's endianness,
which is why the TurboJPEG Java wrapper sets a different pixel format
for integer BufferedImages depending on ByteOrder.nativeOrder().
However, it isn't necessary to handle endianness in pure Java code such
as TJUnitTest (d'oh!) This was a product of porting the C version of
TJUnitTest too literally, and of insufficient testing (historically,
the big endian systems I had available for testing didn't have Java.)
planes == null is a valid argument to setBuf() if alloc == true, so we
need to make sure that planes is non-null before validating its length.
We also need to allocate one dimension of the planes array if it's null.
Fixes#168
Tag 1.5.2 release
* tag '1.5.2': (54 commits)
x86: Fix "short jump is out of range" w/ NASM<2.04
TurboJPEG: Document xform issue w/ big marker data
Java TJBench: Fix parsing of -warmup argument
Build: Disable warmup in TJBench regression tests
TJBench: Improve consistency of results
TurboJPEG: C API documentation buglet
TJBench: Code formatting tweaks
TJBench: Fix errors when decomp. files w/ ICC data
BUILDING.md: Include Android/x86 build recipes
Travis: Fix OS X build
Restore compatibility with older autoconf releases
Attribute ARM runtime detection code to Nokia
Honor max_memory_to_use/JPEGMEM/-maxmemory
AppVeyor: Fix CI build
TurboJPEG: Fix potential memory leaks
Always tweak EXIF w/h tags w/ lossless transforms
Fix error w/ lossless crop & libjpeg v7 emulation
Include jpeg_skip/crop_scanlines() in jpeg7.dll
libjpeg.txt: Include partial decomp. in TOC
Slightly de-confusify cjpeg, jpegtran usage info
...
Previously, -stoponwarning only had an effect on the underlying
TurboJPEG C functions, but TJBench still aborted if a non-fatal error
occurred. This commit modifies the C version of TJBench such that it
always recovers from a non-fatal error unless -stoponwarning is
specified. Furthermore, the benchmark stores the details of the last
non-fatal error and does not print any subsequent non-fatal error
messages unless they differ from the last one.
Due to limitations in the Java API (specifically, the fact that it
cannot communicate errors, fatal or otherwise, to the calling program
without throwing a TJException), it was only possible to make
decompression operations fully recoverable within TJBench. With other
operations, -stoponwarning still has an effect on the underlying C
library but has no effect at the Java level.
The Java API documentation has been amended to reflect that only certain
methods are truly recoverable, regardless of the state of
TJ.FLAG_STOPONWARNING.
Allow progressive entropy coding to be enabled on a
transform-by-transform basis, and implement a new transform option for
disabling the copying of markers.
Closes#153
If the source image for a transform operation has a lot of EXIF or ICC
data embedded in it, then it may cause the output image size to exceed
the worst-case size returned by tjBufSize() (because tjTransform()
transfers all markers to the output image.) This is only a problem if
TJFLAG_NOREALLOC is passed to the function. Since the TurboJPEG C API
doesn't require the destination image size to be set in this case, it
makes the documented assumption that the calling program has allocated
the destination buffer to exactly the size returned by tjBufSize().
Changing this assumption would change the API behavior and necessitate
a new function name (tjTransform2().) At the moment, it's easier to
just document this as a known issue, since it's easy to work around in
the C API.
The Java API is unfortunately a different story, since it must always
use TJFLAG_NOREALLOC (because, when using the TurboJPEG Java API, all
buffers are allocated on the Java heap, and thus they can't be
reallocated by the C code.) There is no easy way to work around this
without changing the C API as discussed above, because if the source
image contains large amounts of marker data, it's virtually impossible
to determine how big the output image will be.
Given that libjpeg-turbo can often process hundreds of megapixels/second
on modern hardware, the default of one warmup iteration was essentially
meaningless. Furthermore, the -warmup option was a bit clunky, since
it required some foreknowledge of how fast the benchmarks were going to
execute.
This commit introduces a 1-second warmup interval for each benchmark by
default, and the -warmup option has been retasked to control the length
of that interval.
- Provide a new C API function and TJException method that allows
calling programs to query the severity of a compression/decompression/
transform error.
- Provide a new flag that instructs the library to immediately stop
compressing/decompressing/transforming if a warning is encountered.
Fixes#151
Introduce a new C API function (tjGetErrorStr2()) that can be used to
retrieve compression/decompression/transform error messages in a
thread-safe (i.e. instance-specific) manner. Retrieving error messages
from global functions is still thread-unsafe.
Addresses a concern expressed in #151.
Embedded ICC profiles can cause the size of a JPEG file to exceed the
size returned by tjBufSize() (which is really meant to be used for
compression anyhow, not for decompression), and this was causing a
segfault (C) or an ArrayIndexOutOfBoundsException (Java) when
decompressing such files with TJBench. This commit modifies the
benchmark such that, when tiled decompression is disabled, it re-uses
the source buffer as the primary JPEG buffer.
The Travis xcode7.3 image now apparently includes GnuPG 1.4.x by
default, so use it instead of installing GnuPG 2. Using GnuPG 2.1.x,
the default version in Homebrew as of this writing, is problematic for
this reason:
https://wiki.archlinux.org/index.php/GnuPG#Unattended_passphrase
This re-introduces a feature of the obsolete system-specific libjpeg
memory managers-- namely the ability to limit the amount of main memory
used by the library during decompression or multi-pass compression.
This is mainly beneficial for two reasons:
- Works around a 2 GB limit in libFuzzer
- Allows security-sensitive applications to set a memory limit for the
JPEG decoder so as to work around the progressive JPEG exploit
(LJT-01-004) described here:
http://www.libjpeg-turbo.org/pmwiki/uploads/About/TwoIssueswiththeJPEGStandard.pdf
This commit also removes obsolete documentation regarding the MS-DOS
memory manager (which itself was removed long ago) and changes the
documentation of the -maxmemory switch and JPEGMEM environment variable
to reflect the fact that backing stores are never used in libjpeg-turbo.
Inspired by:
066fee2e7dCloses#143
Something changed in the CI build environment, and our previous trick of
setting the Git URL to file://c:/projects/libjpeg-turbo no longer works.
Using cygpath to translate the Windows path to a MinGW-friendly format
is a better solution anyhow.
Referring to https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=746,
it seems that the values of local buffer pointers in TurboJPEG API
functions aren't always preserved if longjmp() returns control to a
point prior to the allocation of the local buffers. This is known to
be an issue with GCC 4.x and clang with -O1 and higher optimization
levels but not with GCC 5.x and later. It is unknown why GCC 5.x and
6.x do not suffer from the issue, but possibly the local buffer pointers
are not allocated on the stack when using those more recent compilers.
In any case, this commit modifies the TurboJPEG API library code such
that the jump buffer is always updated after any local buffer pointers
are allocated but before any subsequent libjpeg API functions are
called.
Tag 1.5.1 release
* tag '1.5.1':
ARM64 NEON: Fix another ABI conformance issue
Build: Remove ARMv6 support from 'make iosdmg'
Fix out-of-bounds write in partial decomp. feature
Silence additional UBSan warnings
Fix unsigned int overflow in libjpeg memory mgr.
TurboJPEG: Decomp. 4:2:2/4:4:0 JPEGs w/unusual SFs
Silence pedantic GCC6 code formatting warnings
Use plain upsampling if merged isn't accelerated
Implement h1v2 fancy upsampling
Fix AArch64 ABI conformance issue in SIMD code
Don't install libturbojpeg.pc if TJPEG disabled
Linux/PPC: Only enable AltiVec if CPU supports it
ARM/MIPS: Change the behavior of JSIMD_FORCE*
Bump version to 1.5.1 to prepare for new commits
This commit does the following:
-- Merges the two glueware functions (read_icc_profile() and
write_icc_profile()) from iccjpeg.c, which is contained in downstream
projects such as LCMS, Ghostscript, Mozilla, etc. These functions were
originally intended for inclusion in libjpeg, but Tom Lane left the IJG
before that could be accomplished. Since then, programs and libraries
that needed to embed/extract ICC profiles in JPEG files had to include
their own local copy of iccjpeg.c, which is suboptimal.
-- The new functions were prefixed with jpeg_ and split into separate
files for the compressor and decompressor, per the existing libjpeg
coding standards.
-- jpeg_write_icc_profile() was made slightly more fault-tolerant.
It will now trigger a libjpeg error if it is called before
jpeg_start_compress() or if it is passed NULL arguments.
-- jpeg_read_icc_profile() was made slightly more fault-tolerant.
It will now trigger a libjpeg error if it is called before
jpeg_read_header() or if it is passed NULL arguments. It will also
now trigger libjpeg warnings if the ICC profile data is corrupt.
-- The code comments have been wordsmithed.
-- Note that the one-line setup_read_icc_profile() function was not
included. Instead, libjpeg.txt now documents the need to call
jpeg_save_markers(cinfo, JPEG_APP0 + 2, 0xFFFF) prior to calling
jpeg_read_header(), if jpeg_read_icc_profile() is to be used.
-- Adds documentation for the new functions to libjpeg.txt.
-- Adds an -icc switch to cjpeg and jpegtran that allows those programs
to embed an ICC profile in the JPEG files they generate.
-- Adds an -icc switch to djpeg that allows that program to extract an
ICC profile from a JPEG file while decompressing.
-- Adds appropriate unit tests for all of the above.
-- Bumps the SO_AGE of the libjpeg API library to indicate the presence
of new API functions.
Note that the licensing information was obtained from:
https://github.com/mm2/Little-CMS/issues/37#issuecomment-66450180
... even if using libjpeg v6b emulation. Previously
adjust_exif_parameters() was only called with libjpeg v7/v8 emulation,
but due to a bug (which this commit also fixes), it only worked properly
with libjpeg v8 emulation.
The JPEG_LIB_VERSION #ifdef in jtransform_adjust_parameters() was
incorrect, which caused a "Bogus virtual array access" error when
attempting to use the lossless crop feature.
Introduced in c04bd3cc97.
This also adds libjpeg v7 API/ABI emulation to the Travis CI tests.
LICENSE.md is included in the binary distributions as well, so it
doesn't make much sense to refer to license headers in source files that
aren't necessarily going to be there.
The whole point of `make tarball` is to make it easy for users to create
a binary distribution of libjpeg-turbo on platforms that aren't
supported by our official build system, so requiring root permissions
somewhat defeated that purpose. Intead, the script now attempts to
detect whether the system has GNU tar or a recent version of BSD tar
that supports setting the ownership of the files in the tarball.
Although there is little chance that we will ever have a package
conflict on OS X, the convention from our Linux packages is to use the
package name, not the project name, for the name of the documentation
directory.
These improvements enable build systems to use GNUInstallDirs to define
custom directory variables.
- The set_dir() macro was renamed to GNUInstallDirs_set_install_dir(),
in keeping with the module's established macro naming convention.
- Rather than detecting whether the prefix has changed, the new
GNUInstallDirs_set_install_dir() macro instead examines whether the
default for the variable in question has changed. This allows for
more flexibility, since build systems may decide to change the
defaults based on factors other than the prefix. It also enables the
macro to work properly outside of the module.
- The module now performs directory variable substitution within the
body of GNUInstallDirs_get_absolute_install_dir().
- The JAVADIR variable is no longer included in GNUInstallDirs. That
directory is not part of the GNU spec, and it turns out that various
operating systems use different conventions for the location of Java
classes. Instead, the variable is now implemented in our build
system as a demonstration of the aforementioned GNUInstallDirs
enhancements.
- GNUInstallDirs: any directory variable can now reference any other
directory variable by including its name in angle brackets (<>).
- Changed the documentation of the directory variables in BUILDING.md
accordingly. This commit also includes some formatting tweaks to
that section (using boldface for directory names, as is our
convention.)
- Changed the package scripts such that they use
CMAKE_INSTALL_DATAROOTDIR rather than CMAKE_INSTALL_DATADIR.
- We no longer override the install dir. defaults on Windows unless
performing an official build. It may be useful, for instance, to
use the GNU defaults when installing into an MSYS environment.
It isn't actually necessary to specify `CMAKE_INSTALL_DEFAULT_MANDIR`
for our official build. Because `CMAKE_INSTALL_DEFAULT_DATAROOTDIR` is
blank for the official build, the default of "<DATAROOTDIR>/man" will
resolve to "man".
For the same reason, this commit changes the specification of
`CMAKE_INSTALL_DEFAULT_DOCDIR` and `CMAKE_INSTALL_DEFAULT_JAVADIR` in
the official build to be dependent on the data root directory (mainly to
make it obvious what we're doing.)
This commit also tweaks the example CMake command line in the directory
variable documentation so that it shows the correct location of the
CMake argument.
YASM requires a debug format to be specified with -g. Currently the
only combination that I can make work at all is DWARF-2/ELF (YASM
doesn't support Mach-O debugging at all, and its support for CV8/MSVC
and MinGW/DWARF-2 appears to be broken), so debugging is only enabled
automatically for ELF at the moment. For other formats, we don't
specify -g at all, which is how the old build system behaved.
Fixes#125, Closes#126
This builds upon the existing GNUInstallDirs module in CMake but adds
the following features to that module:
- The ability to override the defaults for each install directory
through a new set of variables (`CMAKE_INSTALL_DEFAULT_*DIR`).
Before operating system vendors began shipping libjpeg-turbo, it was
meant to be a run-time drop-in replacement for the system's
distribution of libjpeg, so it has traditionally installed itself
under /opt/libjpeg-turbo on Un*x systems by default. On Windows, it
has traditionally installed itself under %SystemDrive%\libjpeg-turbo*,
which is not uncommon behavior for open source libraries (open source
SDKs tend to install outside of the Program Files directory so as to
avoid spaces in the directory name.) At least in the case of Un*x,
the install directory behavior is based somewhat on the Solaris
standard, which requires all non-O/S packages to install their files
under /opt/{package_name}. I adopted that standard for VirtualGL and
TurboVNC while working at Sun, because it allowed those packages to be
located under the same directory on all platforms. I adopted it for
libjpeg-turbo because it ensured that our files would never conflict
with the system's version of libjpeg. Even though many Un*x
distributions ship libjpeg-turbo these days, not all of them ship the
TurboJPEG API library or the Java classes or even the latest version
of the libjpeg API library, so there are still many cases in which it
is desirable to install a separate version of libjpeg-turbo than the
one installed by the system. Furthermore, installing the files under
/opt mimics the directory structure of our official binary packages,
and it makes it very easy to uninstall libjpeg-turbo.
For these reasons, our build system needs to be able to use
non-GNU-compliant defaults for each install directory if
`CMAKE_INSTALL_PREFIX` is set to the default value.
- For each directory variable, the module now detects changes to
`CMAKE_INSTALL_PREFIX` and changes the directory variable accordingly,
if the variable has not been changed by the user.
This makes it easy to switch between our "official" directory
structure and the GNU-compliant directory structure "on the fly"
simply by changing `CMAKE_INSTALL_PREFIX`. Also, this new mechanism
eliminated the need for the crufty mechanism that previously did the
same thing just for the library directory variable.
How it should work:
- If a dir variable is unset, then the module will set an internal
property indicating that the dir variable was initialized to its
default value.
- If the dir variable ever diverges from its default value, then the
internal property is cleared, and it cannot be set again without
unsetting the dir variable.
- If the install prefix changes, and if the internal property
indicates that the dir variable is still set to its default value,
and if the dir variable's value is not being manually changed at the
same time that the install prefix is being changed, then the dir
variable's value is automatically changed to the new default value
for that variable (as determined by the new install prefix.)
- The directory variables are now always cached, regardless of whether
they were set on the command line or not. This ensures that they can
easily be examined and modified after being set, regardless of how they
were set.
This was made possible by the introduction of the aforementioned
`CMAKE_INSTALL_DEFAULT_*DIR` variables.
- Improved directory variable documentation (based on descriptions at
https://www.gnu.org/prep/standards/html_node/Directory-Variables.html)
- The module now allows "<DATAROOTDIR>" to be used as a placeholder in
relative directory variables.
It is replaced "on the fly" with the actual path of
`CMAKE_INSTALL_DATAROOTDIR`.
This should more closely mimic the behavior of the old autotools build
system while retaining our customizations to it, and it should retain
the behavior of the old CMake build system.
Closes#124
Strict C89-conformant compilers don't support the "inline" keyword, but
most of them support "__inline__", and that keyword can be used with the
always_inline atribute as well. This commit also removes duplicate code
by using a foreach() loop to test the various keywords.
This is a subtle point, but AC_C_INLINE defines "inline" to be either
"inline", "__inline__", or "__inline". The subsequent test for
"inline __attribute__((always_inline))" uses this definition. The
attribute is irrespective of the inline keyword, so whereas
"__inline__ __attribute__((always_inline))" works under C89,
"inline __attribute__((always_inline))" doesn't, and defining INLINE to
the latter causes the build to fail. The easiest way around this is
simply to define "inline" ahead of "INLINE" in jconfigint.h,
which causes the inline keyword detected by AC_C_INLINE to modify the
INLINE macro if necessary.
+ advertise that full AltiVec SIMD acceleration is now available on
OpenBSD.
The relevant compilers probably all support C99 or GNU's variation of
C90 that allows variables to be declared anywhere, but our policy is to
conform to the C90 standard, if for no other reason than that it
improves code readability.
Fixes:
rdppm.c:257:14: warning: comparison of integers of different signs: 'int' and 'unsigned int' [-Wsign-compare]
if (temp > maxval)
~~~~ ^ ~~~~~~
rdppm.c:284:14: warning: comparison of integers of different signs: 'int' and 'unsigned int' [-Wsign-compare]
if (temp > maxval)
~~~~ ^ ~~~~~~
rdppm.c:289:14: warning: comparison of integers of different signs: 'int' and 'unsigned int' [-Wsign-compare]
if (temp > maxval)
~~~~ ^ ~~~~~~
rdppm.c:294:14: warning: comparison of integers of different signs: 'int' and 'unsigned int' [-Wsign-compare]
if (temp > maxval)
- Replace CMAKE_SOURCE_DIR with CMAKE_CURRENT_SOURCE_DIR
- Replace CMAKE_BINARY_DIR with CMAKE_CURRENT_BINARY_DIR
- Don't use "libjpeg-turbo" in any of the package system filenames
(because CMAKE_PROJECT_NAME will not be the same if building LJT as
a submodule.)
Closes#122
The xcode7.2 image is verfallen, verlumpt, verblunget, verkackt
This also ensures that the build scripts are checked out from a
branch matching the libjpeg-turbo repository branch (not strictly
necessary when building from master, but it keeps the code in sync with
dev.)
The previous hack (adding ${CMAKE_ASM_COMPILER} to CMAKE_ASM_FLAGS)
didn't work in all cases, because more recent versions of CMake place
the includes ahead of the flags (which meant that the real assembler
wasn't the first argument to gas-preprocessor.pl.)
CMAKE_INSTALL_RPATH has to be set before the targets are defined (oops.)
This also explicitly turns on MACOSX_RPATH for the shared libraries
(which is the default with newer versions of CMake but not with 2.8.x.)
The old autotools/libtool build system hard-coded the install name
directory of the OS X shared libraries to libdir, which meant that any
executable that linked against those libraries would also be hard-coded
to look for the libjpeg-turbo libraries in that directory. @rpath makes
the OS X version of libjpeg-turbo behave like the Linux version, in the
sense that the executables under /opt/libjpeg-turbo/bin will
automatically pick up the libraries under /opt/libjpeg-turbo/lib* by
default, but other executables won't unless they are linked with -rpath.
AppVeyor already has MinGW32 and MinGW64 flavors of GCC 5.3.0
installed under MSYS2, so there is no need to install our own builds of
MinGW. MinGW-builds is no longer an active project, and we were getting
occasional timeouts while wgetting those files from SourceForge.
Furthermore, GCC 5.3.0 should produce faster code than GCC 4.8.1.
Updated out-of-date information, wordsmithed and clarified many
sections, and generally cleaned up the build recipes (including a
complete overhaul of the iOS recipes.)
Regression caused by f9134384b7
This commit also makes the "testclean" target clean up the 4:1:1 test
images. This was implemented in the autotools build system in
1f3635c496 but was left out of the CMake
build system due to an oversight.
This has the following advantages:
-- It doesn't require checking a private SSH key into the repository.
(With SourceForge, an SSH key is the "keys to the kingdom".)
-- If the S3 key is compromised, it is very easy to revoke it and
generate a new one.
-- The S3 bucket is isolated, so even if it becomes compromised, then
the damage that one could do is limited.
-- It's much easier to manage files through S3's web interface than
through SourceForge.
-- The files are served via HTTPS.
-- Travis fully supports S3 as a deployment target, so this simplifies
.travis.yml somewhat.
Since we're still deploying our Linux/macOS CI artifacts to a web server
(specifically SourceForge Project Web Services) that doesn't support
HTTPS, it's a good idea to sign them. But since the private key has to
be checked into the repository, we use a different key for signing the
pre-releases (per project policy, the private signing keys for our
release binaries are never made available on any public server.)
Previously, simd/CMakeLists.txt was hard-coded to use NASM, and it was
necessary to override the NASM variable in order to use YASM. This
commit changes the behavior such that NASM is still preferred, but YASM
will be used if it is in the PATH and NASM isn't available. This brings
the actual behavior in line with the behavior described in BUILDING.md.
Based on
b0799a1598Closes#107
-- Use trusty for SIMD builds. Ubuntu 12.04 is still using NASM 2.09.x,
which isn't new enough to support AVX2.
-- Add a special test for the SSE2 code path, since it is no longer the
default.
Pass the actual repository and branch that Travis is using into the
builtljt script, so the official builds it generates will come from
the same code base as the other tested builds.
- Introduce a new FLOATTEST value ("387") on Un*x systems that will
compare the floating point DCT/IDCT algorithms against the expected
results from the C algorithms when built using 32-bit code and
-mfpmath=387.
- Extend the Windows regression tests so that they work properly when
building libjpeg-turbo with 32-bit code and without SIMD, using either
Visual C++ (tested with 2008, 2010, 2015) or MinGW.
Based on
98a5a9dc89
with wordsmithing by DRC.
In the AArch64 ABI, as in many others, it's forbidden to read/store data
below the stack pointer. Some SIMD functions were doing just that
(stack pointer misuse) when trying to preserve callee-saved registers,
and this resulted in those registers being restored with incorrect
contents under certain circumstances.
This patch fixes that behavior, and callee-saved registers are now
stored above the stack pointer throughout the function call. The patch
also removes register saving in places where it is unnecessary for this
ABI, or it makes use of unused scratch regiters instead of callee-saved
registers.
Fixes#97. Closes#101.
Refer also to https://bugzilla.redhat.com/show_bug.cgi?id=1368569
The last iDevice to require ARMv6 was the iPhone 3G, which required iOS
4.2.1 or older. Our binaries have always required iOS 4.3 or newer,
so I'm not sure if the ARMv6 fork of our binaries was ever useful to
begin with. In any case, if it ever was useful, it no longer is. Fat
binaries can still be generated with ARMv6 support by invoking
{build_directory}/pkgscripts/makemacpkg manually.
Reported by Clang UBSan (refer to
https://bugzilla.mozilla.org/show_bug.cgi?id=1301252 for test image.)
This appears to be a legitimate bug introduced by
3ab68cf563. Any component array, such
as first_MCU_col and last_MCU_col, should always be able to accommodate
MAX_COMPONENTS values. The aforementioned test image had 8 components,
which was not enough to make the out-of-bounds write bust out of the
jpeg_decomp_master struct (and fortunately the memory after last_MCU_col
is an integer used as a boolean, so stomping on it will do nothing other
than change the decoder state.) I crafted another special image that
has 10 components (the maximum allowable), but that was apparently not
enough to bust out of the allocated memory, either. Thus, it is
posited that the security threat posed by this bug is either extremely
minimal or non-existent.
When attempting to decode a malformed JPEG image (refer to
https://bugzilla.mozilla.org/show_bug.cgi?id=1295044) with dimensions
61472 x 32800, the maximum_space variable within the
realize_virt_arrays() function will exceed the maximum value of a 32-bit
integer and will wrap around. The memory manager subsequently fails
with an "Insufficient memory" error (case 4, in alloc_large()), so this
commit simply causes that error to be triggered earlier, before UBSan
has a chance to complain.
Note that this issue did not ever represent an exploitable security
threat, because the POSIX-based memory manager that we use doesn't ever
do anything meaningful with the value of maximum_space.
jpeg_mem_available() simply sets avail_mem = maximum_space, so the
subsequent behavior of the memory manager is the same regardless of
whether maximum_space is correct or not. This commit simply removes a
UBSan warning in order to make it easier to detect actual security
issues.
Normally, 4:2:2 JPEGs have horizontal x vertical luminance,chrominance
sampling factors of 2x1,1x1, and 4:4:0 JPEGs have horizontal x vertical
luminance,chrominance sampling factors of 1x2,1x1. However, it is
technically legal to create 4:2:2 JPEGs with sampling factors of
2x2,1x2 and 4:4:0 JPEGs with sampling factors of 2x2,2x1, since the
sums of the products of those sampling factors (2x2 + 1x2 + 1x2 and
2x2 + 2x1 + 2x1) are still <= 10. The libjpeg API correctly decodes
such images, so the TurboJPEG API should as well.
Fixes#92
Currently, this only affects ARM, since it is the only platform that
accelerates YCbCr-to-RGB conversion but not merged upsampling. Even if
"plain" upsampling isn't accelerated, the combination of accelerated
color conversion + unaccelerated plain upsampling is still faster than
the unaccelerated merged upsampling algorithms.
Closes#81
This allows fancy upsampling to be used when decompressing 4:2:2 images
that have been losslessly rotated or transposed.
(docs and comments added by DRC)
Based on f63aca945dCloses#89
cpuid tells us whether the O/S uses extended state management via
XSAVE/XRSTOR, but we have to call xgetbv to verify that it is using
XSAVE/XRSTOR to manage the state of XMM/YMM registers.
In the AArch64 ABI, the high (unused) DWORD of a 32-bit argument's
register is undefined, so it was incorrect to use 64-bit
instructions to transfer a JDIMENSION argument in the 64-bit NEON SIMD
functions. The code worked thus far only because the existing compiler
optimizers weren't smart enough to do anything else with the register in
question, so the upper 32 bits happened to be all zeroes.
The latest builds of Clang/LLVM have a smarter optimizer, and under
certain circumstances, it will attempt to load-combine adjacent 32-bit
integers from one of the libjpeg structures into a single 64-bit integer
and pass that 64-bit integer as a 32-bit argument to one of the SIMD
functions (which is allowed by the ABI, since the upper 32 bits of the
32-bit argument's register are undefined.) This caused the
libjpeg-turbo regression tests to crash.
This patch tries to use the Wn registers whenever possible. Otherwise,
it uses a zero-extend instruction to avoid using the upper 32 bits of
the 64-bit registers, which are not guaranteed to be valid for 32-bit
arguments.
Based on 1fbae13021Closes#91. Refer also to android-ndk/ndk#110 and
https://llvm.org/bugs/show_bug.cgi?id=28393
This fixes crashes that would occur when attempting to use
libjpeg-turbo's AVX2 extensions on older O/S's (such as Windows XP or
RHEL 5.) Even if the CPU supports AVX2, the O/S has to also support
saving/restoring YMM registers when switching contexts.
This eliminates "illegal instruction" errors when running libjpeg-turbo
under Linux on PowerPC chips that lack AltiVec support (e.g. the old
7XX/G3 models but also the newer e5500 series.)
The JSIMD_FORCE* environment variables previously meant "force the use
of this instruction set if it is available but others are available as
well", but that did nothing on ARM platforms, since there is only ever
one instruction set available. Since the ARM and MIPS CPU feature
detection code is less than bulletproof, and since there is only one
SIMD instruction set (currently) supported on those platforms, it makes
sense for the JSIMD_FORCE* environment variables on those platforms to
actually force the use of the SIMD instruction set, thus bypassing the
CPU feature detection code.
This addresses a concern raised in #88 whereby parsing /proc/cpuinfo
didn't work within a QEMU environment. This at least provides a
workaround, allowing users to force-enable or force-disable SIMD
instructions for ARM and MIPS builds of libjpeg-turbo.
This commit adds back instructive comments in the image-space
algorithms, similar to those in the SSE2 code. These comments make it
easier to follow the flow of data through the algorithms.
Expand collect_args/uncollect_args macros so that the number of
arguments can be specified. This prevents unnecessary push and mov
instructions.
NOTE: On Windows, the push/pop of xmm6 and xmm7 had to be moved to the
other end of the macro to ensure that rsp is aligned on a 16-byte
boundary.
28d1a1300c introduced the line
"nasm.exe should be in your PATH". This commit corrects an oversight in
8f1c0a681c /
e5091f2cf3 whereby this line should have
been extended to include yasm.exe.
The IJG convention is to format copyright notices as:
Copyright (C) YYYY, Owner.
We try to maintain this convention for any code that is part of the
libjpeg API library (with the exception of preserving the copyright
notices from Cendio's code verbatim, since those predate
libjpeg-turbo.)
Note that the phrase "All Rights Reserved" is no longer necessary, since
all Buenos Aires Convention signatories signed onto the Berne Convention
in 2000. However, our convention is to retain this phrase for any files
that have a self-contained copyright header but to leave it off of any
files that refer to another file for conditions of distribution and use.
For instance, all of the non-SIMD files in the libjpeg API library refer
to README.ijg, and the copyright message in that file contains "All
Rights Reserved", so it is unnecessary to add it to the individual
files.
The TurboJPEG code retains my preferred formatting convention for
copyright notices, which is based on that of VirtualGL (where the
TurboJPEG API originated.)
Calling jpeg_stdio_dest() followed by jpeg_mem_dest(), or jpeg_mem_src()
followed by jpeg_stdio_src(), is dangerous, because the existing opaque
structure would not be big enough to accommodate the new source/dest
manager. This issue was non-obvious to libjpeg-turbo consumers, since
it was only documented in code comments. Furthermore, the issue could
also occur if the source/dest manager was allocated by the calling
program, but it was not allocated with enough space to accommodate the
opaque stdio or memory source/dest manager structs. The safest thing to
do is to throw an error if one of these functions is called when there
is already a source/dest manager assigned to the object and it was
allocated elsewhere.
Closes#78, #79
The jpeg-7/jpeg-8 APIs/ABIs require arithmetic coding, and the jpeg-8
API/ABI requires the memory source/destination manager, so this commit
causes the build system to ignore --with-arith-enc/--without-arith-enc
and --with-arith-dec/--without-arith-dec (and the equivalent CMake
variables-- WITH_ARITH_ENC and WITH_ARITH_DEC) when v7/v8 API/ABI
emulation is enabled. Furthermore, the CMake build system now ignores
WITH_MEM_SRCDST whenever WITH_JPEG8 is specified (the autotools build
system already did that.)
GCC does support UAL syntax (strbeq) if the ".syntax unified" directive
is supplied. This directive is supported by all versions of GCC and
clang going back to 2003, so it should not create any backward
compatibility issues.
Based on 1264349e2fCloses#76
If wmic.exe wasn't available, then CMakeLists.txt would call
"cmd /C date /T" and parse the result in order to set the BUILD
variable. However, the parser assumed that the date was in MM/DD/YYYY
format, which is not generally the case unless the user's locale is U.S.
English with the default region/language settings for that locale.
This commit modifies CMakeLists.txt such that it uses the
string(TIMESTAMP) function available in CMake 2.8.11 and later to set
the BUILD variable, thus eliminating the need to use wmic.exe or any
other platform-specific hack.
This commit also modifies the build instructions to remove any reference
to CMake 2.6 (which hasn't been supported by our build system since
libjpeg-turbo 1.3.x.)
Closes#74
* libjpeg-turbo/master: (140 commits)
Increase severity of tjDecompressToYUV2() bug desc
Catch libjpeg errors in tjDecompressToYUV2()
BUILDING.md: Fix "... OR ..." indentation again
BUILDING.md: Fix confusing Windows build reqs
ChangeLog.md: Improve readability of plain text
change.log: Refer users to ChangeLog.md
Markdown version of ChangeLog.txt
Rename ChangeLog.txt
README.md: Link to BUILDING.md
BUILDING.md and README.md: Cosmetic tweaks
ChangeLog: "1.5 beta1" --> "1.4.90 (1.5 beta1)"
Java: Fix parallel make with autotools
Win/x64: Fix improper callee save of xmm8-xmm11
Bump TurboJPEG C API revision to 1.5
ChangeLog: Mention jpeg_crop_scanline() function
1.5 beta1
Fix v7/v8-compatible build
libjpeg API: Partial scanline decompression
Build: Make the NASM autoconf variable persistent
Use consistent/modern code formatting for dbl ptrs
...
* libjpeg-turbo/1.4.x: (94 commits)
CMakeLists.txt: Clarify that Un*x isn't supported
Catch libjpeg errors in tjDecompressToYUV2()
cjpeg: Fix buf overrun caused by bad bin PPM input
Add version/build info to global string table
Ensure that default Huffman tables are initialized
Fix memory leak when running tjunittest -yuv
Prevent overread when decoding malformed JPEG
Guard against wrap-around in alloc functions
Fix Visual C++ compiler warnings
rdppm.c: formatting tweaks
jmemmgr.c: formatting tweaks
TurboJPEG: Avoid dangling pointers
Update Android build instr. for ARMv8, PIE, etc.
Makefile.am: formatting tweak
Update build instructions for new autoconf, GitHub
1.4.3
Regression: Allow co-install of 32-bit/64-bit RPMs
Build: Use FILEPATH type for NASM CMake variable
Comment formatting tweaks
Fix 'make dist'
...
Tag 1.4.1 release
* tag '1.4.1': (427 commits)
Now that the TurboJPEG API is reporting libjpeg warnings as errors, an "Invalid SOS parameters for sequential JPEG" warning surfaced in tjDecodeYUV*(). This was caused by the Se member of jpeg_decompress_struct being set to 0 (it is normally set to a non-zero value when the start-of-scan markers are read, but there are no SOS markers in this case, because we're not actually decompressing a JPEG file.)
Fix a segfault that occured in the MIPS DSPr2 fancy upsampling routine when downsampled_width==3. Because the DSPr2 code unrolls the loop for the middle columns (refer to jdsample.c), it has the effect of performing two column iterations, and that only works properly if the number of columns (minus the first and last) is >= 2. For the specific case of downsampled_width==3, this patch skips to the second iteration of the unrolled column loop.
If a warning (such as "Premature end of JPEG file") is triggered in the underlying libjpeg API, make sure that the TurboJPEG API function returns -1. Unlike errors, however, libjpeg warnings do not make the TurboJPEG functions abort.
Back out r1555 and r1548. Using setenv() didn't fix the iOS simulator issue. It just replaced an undefined _putenv$UNIX2003 symbol with an undefined _setenv$UNIX2003 symbol. The correct solution seems to be to use -D_NONSTD_SOURCE when generating our official builds.
Fix the Windows build. I remember now why I used putenv() originally-- because Windows doesn't have setenv(). We could use _putenv_s(), but older versions of MinGW don't have that either. Fortunately, since all of the environment values we're setting in turbojpeg.c are static, we can just map setenv() to putenv() using a macro. NOTE: we still have to use _putenv_s() in turbojpeg-jni.c, but at least people who may need to build with an older version of MinGW can still do so by disabling the Java build.
Allow building only static or only shared libraries on Windows
__WORDSIZE doesn't seem to be available on platforms other than Mac or Linux, and best practices are for user-level code not to rely on it anyhow, since it's meant to be an internal macro. Fortunately, autoconf already has a way of determining the word size at configure time, so it can be passed into the compiler. This should work on any platform and has been tested on all of the Un*x platforms we support (Linux, Mac, FreeBSD, Solaris.)
Unless you define _ANSI_SOURCE, then putenv() on Mac is renamed to putenv$UNIX2003(), and this causes problems when trying to link an i386 iOS application (for the simulator) against the TurboJPEG static library. It's easiest to just use setenv() instead.
Fix a bug in the 64-bit Huffman encoder that Google discovered when encoding some very specific (and proprietary) aerial images using quality=98, an optimized Huffman table, and the ISLOW DCT. These images were causing the Huffman bit buffer to overflow, because the code for encoding the DC coefficient was using the equivalent of the 32-bit version of EMIT_BITS(). Thus, when 64-bit code was used, the DC coefficient code was not properly checking how many bits were in the buffer before attempting to add more bits to it. This issue appears to have existed in all versions of libjpeg-turbo.
Restore backward compatibility with MSVC < 2010 (broken by r1541)
Oops. OS X doesn't define __WORDSIZE unless you include stdint.h, so apparently the Huffman codec hasn't ever been fully accelerated on 64-bit OS X.
Allow the executables and libraries outside of the sharedlib/ directory to be linked against msvcr*.dll instead of libcmt*.lib. This is reported to be necessary when building libjpeg-turbo for use with C#.
Surround the usage of getenv() in the TurboJPEG API with #ifndef NO_GETENV so that developers can add -DNO_GETENV to the C flags when building for platforms that don't have getenv(). Currently this is known to be necessary when building for Windows Phone.
If libjpeg-turbo is configured with a non-default prefix, such as /usr, then use the docdir variable defined by autoconf 2.60 and later, if available. This will, for instance, install the documentation under /usr/share/doc/libjpeg-turbo by default if prefix=/usr, unless docdir is overridden. When using earlier versions of autoconf, docdir is set to ${datadir}/doc, as it always has been.
Enable silent build rules for the NASM objects, if the source is configured with automake 1.11 or later. NOTE: the build still spits out "error: ignoring unknown tag NASM" for each object, but unfortunately, if we remove "--tag NASM" from the command line, the build breaks under older versions of automake (it aborts with "unable to infer tagged configuration.")
Set the RPM and deb architecture properly on non-x86 platforms.
Come on, Cohaagen, you got what you want. Give these people air!
Oops. Need to set the alpha channel when using TYPE_4BYTE_ABGR*. This has no bearing on the actual tests, but it prevents the PNG pre-encode reference images for those tests from being blank.
Oops. The MIPS SIMD implementations of h2v1 and h2v2 upsampling were not checking for DSPr2 support, so running 'djpeg -nosmooth' on a non-DSPr2-enabled platform caused an "illegal instruction" error.
Introduce fast paths to speed up NULL color conversion somewhat, particularly when using 64-bit code; on the decompression side, the "slow path" also now use an approach similar to that of the compression side (with the component loop outside of the column loop rather than inside.) This is faster when using 32-bit code.
...
* origin/master: (108 commits)
Bump version number to 3.1.
jpegyuv: fix memory leak when path is invalid
jpegyuv: fix memory leak when @image_buffer allocation fails
yuvjpeg: fix memory leak when @image_buffer allocation fails
jpegtran: Do not leak the input and output buffers
Fix previous commit
Scan optimization: return error when unable to copy data buffer
cjpeg option for baseline quant tables
Fix#153
rdpng: convert 16-bit input to 8-bit
Larger number of DC trellis candidates
Fix overflow issue #157
Const on getters
Const on simple getters and copy source
Expanded .gitignore
Add pkg-config requirement
Re-order links.
Declare inbuffer const
Oops. Delete the duplicate copy of [lib]turbojpeg.dll in the binary directory when uninstalling the package.
Get rid of changelog file that we don't update.
...
* commit 'eca0637c8150d3d1c08a60c64d7ee16eaea4b198':
Remove trailing spaces
Another oops. tjBufSizeYUV2() should return -1 if width < 1.
Oops. tjPlaneSizeYUV() should return -1 if componentID > 0 and subsamp==TJSAMP_GRAY.
When building libjpeg-turbo on Un*x systems, INT32 is usually typedef'ed to long, not int, so we need to specify an int pointer when doing a 4-byte write to the RGB565 output buffer. On little endian systems, this doesn't matter, but when you write a 32-bit int to a 64-bit long pointer address on a big endian system, you are writing to the upper 4 bytes, not the lower 4 bytes. NOTE: this will probably break on big endian systems that use 16-bit ints (are there any of those still around?)
Fix Windows build
Fix issues with RGB565 color conversion on big endian machines. The RGB565 routines are now abstracted in a separate file, with separate little-endian and big-endian versions defined at compile time through the use of macros (this is similar to how the colorspace extension routines work.) This allows big-endian machines to take advantage of the same performance optimizations as little-endian machines, and it retains the performance on little-endian machines, since the conditional branch for endianness is at a very coarse-grained level.
Fix build on OS X PowerPC platforms
Oops. Forgot to alter the version header in the change log to indicate the release of 1.4 beta.
Create 1.4.x branch
At one time, it was possible to use CMake to build under Cygwin, but
that hasn't worked since 1.4.1 (due to the Huffman codec changes that
now require SIZEOF_SIZE_T to be defined for non-WIN32 platforms) and may
have even been broken before that. Originally, we used the "date"
command under MSYS in order to obtain the default build number, but that
was rendered unnecessary by 5e3bb3e9 (v1.3 beta.) 9fe22dac (1.4 beta)
further modified CMakeLists.txt so that the "date" command was only used
on Cygwin, but for unexplained reasons, that commit also applied the
(now vestigial) code to all non-WIN32 platforms. This prevented
CMakeLists.txt from displaying an error if someone attempted to use the
CMake build system on Un*x platforms, and that may have been behind the
flurry of pull requests and issues-- including #21, #29, #37, #58, #73--
complaining that the CMake build system didn't work on Un*x platforms
(although it was not until #73 that this bug came to light.)
This commit removes all vestiges of Un*x support from the CMake build
system and makes it clear that CMake cannot be used to build
libjpeg-turbo on non-WIN32 platforms. It is our position that CMake
will not be supported on non-WIN32 platforms until/unless the autotools
build system is removed, and this will not happen without broad support
from the community (including major O/S vendors.) If you are in favor
of migrating the entire build system to CMake, then please make your
voice heard by commenting on #56.
Even though tjDecompressToYUV2() is mostly just a wrapper for
tjDecompressToYUVPlanes(), tjDecompressToYUV2() still calls
jpeg_read_header(), so it needs to properly set up the libjpeg error
handler prior to making this call. Otherwise, under very esoteric (and
arguably incorrect) use cases, a program could call tjDecompressToYUV2()
without first checking the JPEG header using tjDecompressHeader3(), and
if the header was corrupt, then the libjpeg API would invoke
my_error_exit(). my_error_exit() would in turn call longjmp() on the
previous value of myerr->setjmp_buffer, which was probably set in a
previous TurboJPEG function, such as tjInitDecompress(). Thus, when a
libjpeg error was triggered within the body of tjDecompressToYUV2(), the
PC would jump to the error handler of the previous TurboJPEG function,
and this usually caused stack corruption in the calling program (because
the signature and return type of the previous TurboJPEG function
probably wasn't the same as that of tjDecompressToYUV2().)
Actually, what happened was that the longjmp() call within
my_error_exit() acted on the previous value of myerr->setjmp_buffer,
which was probably set in a previous TurboJPEG function, such as
tjInitDecompress(). Thus, when a libjpeg error was triggered within
the body of tjDecompressToYUV2(), the PC jumped to the error handler
of the previous TurboJPEG function, and this usually caused stack
corruption in the calling program (because the signature and return
type of the previous TurboJPEG function probably wasn't the same.)
Even though tjDecompressToYUV2() is mostly just a wrapper for
tjDecompressToYUVPlanes(), tjDecompressToYUV2() still calls
jpeg_read_header(), so it needs to properly set up the libjpeg error
handler prior to making this call. Otherwise, under very esoteric (and
arguably incorrect) use cases, a program can call tjDecompressToYUV2()
without first checking the JPEG header using tjDecompressHeader3(), and
if the header is corrupt, tjDecompressToYUV2() will abort without
triggering an error.
Fixes#72
<sigh> GitHub doesn't render indented text the same as my local MarkDown
viewer (MacDown), so it's necessary to indent "... OR ..." by 3 spaces
so both will display it on the same indentation level as "Visual C++
2005 or later" and "MinGW".
Indent "... OR ..." to make it clear that the choice is between Visual
C++ and MinGW, not Visual C++ and MinGW + NASM. Move NASM to the top of
the list to make that even more clear. Make it clear that nasm.exe
should be in the PATH.
Addresses concerns raised in #70
This extends the fix in 6709e4a0cf to
include binary PPM/PGM files, thus preventing a malformed binary
PPM/PGM input file from triggering an overrun of the rescale array and
potentially crashing cjpeg.
Note that this issue affected only cjpeg and not the underlying
libjpeg-turbo libraries, and thus it did not represent a security
threat.
Thanks to @hughdavenport for the discovery.
This is a common practice in other infrastructure libraries, such as
OpenSSL and libpng, because it makes it easy to examine an application
binary and determine which version of the library the application was
linked against.
Closes#66
This prevents a malformed motion-JPEG frame (MJPEG frames lack Huffman
tables) from causing the "fast path" of the Huffman decoder to read
uninitialized memory. Essentially, this is doing the same thing for
MJPEG frames as 43d8cf4d45 did for regular
images.
Running 'make -j{jobs}' on a build that was configured with Java
(--with-java) would previously cause an error:
make: *** No rule to make target `TJExample.class', needed by
`turbojpeg.jar'.
It seems that parallel make doesn't understand that the files in
$(JAVA_CLASSES) are all generated from the same invocation of javac, so
it tries to parallelize the building of those files (which of course
doesn't work.) This patch instead makes turbojpeg.jar depend on
classnoinst.stamp. This effectively creates a synchronization fence,
since that file is only created when all of the class files have been
built.
Fixes#62
The x86-64 SIMD accelerations for Huffman encoding used incorrect
stack math to save xmm8-xmm11 on Windows. This caused TJBench to
always report 1 Mpixel/sec for the compression performance, and it
likely would have caused other application issues as well.
The changes relative to 1.4.x are only cosmetic (using const pointers)
and should not affect API/ABI compatibility, but our practice is to
synchronize the API revision with the most recent release that provides
user-visible changes to the API.
This, in combination with the existing jpeg_skip_scanlines() function,
provides the ability to crop the image both horizontally and vertically
while decompressing (certain restrictions apply-- see libjpeg.txt.)
This also cleans up the documentation of the line skipping feature and
removes the "strip decompression" feature from djpeg, since the new
cropping feature is a superset of it.
Refer to #34 for discussion.
Closes#34
Previously, if a custom value of this variable was specified when
running configure, then that value would be lost if configure was
automatically re-run (as a result of changes to configure.ac, for
instance.)
As a bonus, the NASM variable is now also listed when running
'configure --help', so it is obvious how to override the default
NASM command.
The convention used by libjpeg:
type * variable;
is not very common anymore, because it looks too much like
multiplication. Some (particularly C++ programmers) prefer to tuck the
pointer symbol against the type:
type* variable;
to emphasize that a pointer to a type is effectively a new type.
However, this can also be confusing, since defining multiple variables
on the same line would not work properly:
type* variable1, variable2; /* Only variable1 is actually a
pointer. */
This commit reformats the entirety of the libjpeg-turbo code base so
that it uses the same code formatting convention for pointers that the
TurboJPEG API code uses:
type *variable1, *variable2;
This seems to be the most common convention among C programmers, and
it is the convention used by other codec libraries, such as libpng and
libtiff.
Place the authors in the following order:
* libjpeg-turbo authors (2009-) in descending order of the date of their
most recent contribution to the project, then in ascending order of
the date of their first contribution to the project
* Upstream authors in descending order of the date of the first
inclusion of their code (this indicates that their code serves as the
foundation of this code.)
This also adds Siarhei to the author list, since he contributed ARM SIMD
code both as a Nokia employee and more recently as an independent
developer.
We need to garbage collect between iterations of the outside loop in
bufSizeTest() in order to avoid exhausting the heap when running with
Java 6 (which is still used on Linux to test the 32-bit version of
libjpeg-turbo in automated builds.)
Broken by 46ecffa324.
gas-preprocessor.pl and/or the clang assembler apparently don't like
default values in macro arguments, and we need to use a separate const
section for each function (because of our use of adr, also necessitated
by the broken clang assembler.)
... and only if ThunderX is detected. This can be easily expanded later
on to include other CPUs that are known to suffer from slow LD3/ST3, but
it doesn't make sense to disable LD3/ST3 for all non-Android Linux
platforms just because ThunderX is slow.
Full-color compression speedups relative to previous commits:
Cortex-A53 (Nexus 5X), Android, 64-bit: 1.1-13% (avg. 6.0%)
Cortex-A57 (Nexus 5X), Android, 64-bit: 0.0-22% (avg. 6.3%)
Refer to #47 and #50 for discussion
Closes#50
Note that this commit introduces a similar /proc/cpuinfo parser to that
of the ARM32 implementation. It is used to specifically check whether
the code is running on Cavium ThunderX and, if so, disable the ARM64
SIMD Huffman routines (which slow performance by an average of 8% on
that CPU.)
Based on:
a8c282e5e5
Don't include the all: target as a dependency of the tests when
cross-compiling, and ensure that the files generated by the tests are
removed, even if they were created read-only (or if the tests are being
run on a different type of system that doesn't correctly interpret the
file permissions.) This allows one to easily build the code on one
machine and run 'make test' on another.
When cross-compiling, CMakeLists.txt now generates the CTest script
using relative paths, so that CTest can more easily be executed on a
different machine from the build machine. Furthermore, Windows builds
are now tested using md5cmp, just like on Linux, rather than a CMake
script. This prevents issues with differing CMake locations between
the build and test machines.
This also removes some trailing spaces from the md5cmp code and improves
the readability of the test code in CMakeLists.txt.
jinclude.h can't be safely included multiple times, so instead of
including it in the shared (broken-out) headers, it should instead be
included by the source files that include one or more of those headers.
The accelerated Huffman decoder was previously invoked if there were
> 128 bytes in the input buffer. However, it is possible to construct a
JPEG image with Huffman blocks > 430 bytes in length
(http://stackoverflow.com/questions/2734678/jpeg-calculating-max-size).
While such images are pathological and could never be created by a
JPEG compressor, it is conceivable that an attacker could use such an
artifially-constructed image to trigger an input buffer overrun in the
libjpeg-turbo decompressor and thus gain access to some of the data on
the calling program's heap.
This patch simply increases the minimum buffer size for the accelerated
Huffman decoder to 512 bytes, which should (hopefully) accommodate any
possible input.
This addresses a major issue (LJT-01-005) identified in a security audit
by Cure53.
Because of the exposed nature of the libjpeg API, alloc_small() and
alloc_large() can potentially be called by external code. If an
application were to call either of those functions with
sizeofobject > SIZE_MAX - ALIGN_SIZE - 1, then the math in
round_up_pow2() would wrap around to zero, causing that function to
return a small value. That value would likely not exceed
MAX_ALLOC_CHUNK, so the subsequent size checks in alloc_small() and
alloc_large() would not catch the error.
A similar problem could occur in 32-bit builds if alloc_sarray() were
called with
samplesperrow > SIZE_MAX - (2 * ALIGN_SIZE / sizeof(JSAMPLE)) - 1
This patch simply ensures that the size argument to the alloc_*()
functions will never exceed MAX_ALLOC_CHUNK (1 billion). If it did,
then subsequent size checks would eventually catch that error, so we
are instead catching the error before round_up_pow2() is called.
This addresses a minor concern (LJT-01-001) expressed in a security
audit by Cure53.
This addresses a minor concern (LJT-01-002) expressed in a security
audit by Cure53. _tjInitCompress() and _tjInitDecompress() call
(respectively) jpeg_mem_dest_tj() and jpeg_mem_src_tj() with a pointer
to a dummy buffer, in order to set up the destination/source manager.
The dummy buffer should never be used, but it's still better to make it
static so that the pointer in the destination/source manager always
points to a valid region of memory.
Document the latest benchmarks on the Nexus 5X and change the "2-4x"
overall claim to "2-6x". The peak performance on x86 platforms was
already closer to 5x, and the addition of SIMD-accelerated Huffman
encoding gave it that extra push over the cliff.
There aren't really any best practices to follow here. I tried as best
as I could to adopt a standard that would ease any future maintenance
burdens. The basic tenets of that standard are:
* Assembly instructions always start on Column 5, and operands always
start on Column 21, except:
- The instruction and operand can be indented (usually by 2 spaces)
to indicate a separate instruction stream.
- If the instruction is within an enclosing .if block in a macro,
it should always be indented relative to the .if block.
* Comments are placed with an eye toward readability. There are always
at least 2 spaces between the end of a line of code and the associated
in-line comment. Where it made sense, I tried to line up the comments
in blocks, and some were shifted right to avoid overlap with
neighboring instruction lines. Not an exact science.
* Assembler directives and macros use 2-space indenting rules. .if
blocks are indented relative to the macro, and code within the .if
blocks is indented relative to the .if directive.
* No extraneous spaces between operands. Lining up the operands
vertically did not really improve readability-- personally, I think it
made it worse, since my eye would tend to lose its place in the
uniform columns of characters. Also, code with a lot of vertical
alignment is really hard to maintain, since changing one line could
necessitate changing a bunch of other lines to avoid spoiling the
alignment.
* No extraneous spaces in #defines or other directives. In general, the
only extraneous spaces (other than indenting spaces) are between:
- Instructions and operands
- Operands and in-line comments
This standard should be more or less in keeping with other formatting
standards used within the project.
Decompression speedup relative to libjpeg-turbo 1.4.2 (ISLOW IDCT):
48-core ThunderX (RunAbove ARM Cloud), Linux, 64-bit: 60-113% (avg. 86%)
Cortex-A53 (Nexus 5X), Android, 64-bit: 6.8-27% (avg. 14%)
Cortex-A57 (Nexus 5X), Android, 64-bit: 2.0-14% (avg. 6.8%)
Decompression speedup relative to libjpeg-turbo 1.4.2 (IFAST IDCT):
48-core ThunderX (RunAbove ARM Cloud), Linux, 64-bit: 51-98% (avg. 75%)
Minimal speedup (1-5%) observed on iPhone 5S (Cortex-A7)
NOTE: This commit avoids the st3 instruction for non-Android and
non-Apple builds, which may cause a performance regression against
libjpeg-turbo 1.4.x on ARM64 systems that are running plain Linux.
Since ThunderX is the only platform known to suffer from slow ld3 and
st3 instructions, it is probably better to check for the CPU type
at run time and disable ld3/st3 only if ThunderX is detected.
This commit also enables the use of ld3 on Android platforms, which
should be a safe bet, at least for now. This speeds up compression on
the afore-mentioned Nexus Cortex-A53 by 5.5-19% (avg. 12%) and on the
Nexus Cortex-A57 by 1.2-14% (avg. 6.3%), relative to the previous
commits.
This commit also removes unnecessary macros.
Refer to #52 for discussion.
Closes#52.
Based on:
6bad905034488dd7bf174f4d057c1fd3198afc43
* Include information on how to do a 64-bit ARMv8 build with the latest
NDK
* Suggest -fPIE and -pie as default CFLAGS (required for android-16 and
later.
* Remove -fstrict-aliasing flag (-Wall already includes it)
* Include information on how to do a 64-bit ARMv8 build with the latest
NDK
* Suggest -fPIE and -pie as default CFLAGS (required for android-16 and
later.
* Remove -fstrict-aliasing flag (-Wall already includes it)
For whatever reason, the "write" global variable in tjbench.c was
overriding the linkage with the write() system function. This may have
affected other platforms as well but was not known to.
This allows a project to use PKG_CHECK_MODULES() in its configure.ac
file to easily check for the presence of libjpeg-turbo and modify the
compiler/linker flags accordingly. Note that if a project relies solely
on pkg-config to check for libjpeg-turbo, then it will not be possible
to build that project using libjpeg or an earlier version of
libjpeg-turbo.
Closes#53
Based on:
4967138719
Per @ssvb:
ThunderX is an ARM64 chip that dedicates most of its transistor real
estate to providing 48 cores, so each core is not as fast as a result.
Each core is dual-issue & in-order for scalar instructions and has only
a single-issue half-width NEON unit, so the peak throughput is one
128-bit instruction per 2 cycles. So careful instruction scheduling is
important. Furthermore, ThunderX has an extremely slow implementation
of ld2 and ld3, so this commit implements the equivalent of those
instructions using ld1.
Compression speedup relative to libjpeg-turbo 1.4.2:
48-core ThunderX (RunAbove ARM Cloud), Linux, 64-bit: 58-85% (avg. 74%)
relative to jpeg-6b: 1.75-2.14x (avg. 1.95x)
Refer to #49 and #51 for discussion.
Closes#51.
This commit also wordsmiths the ChangeLog entry (the ARMv8 SIMD
implementation is "complete" only for compression-- it still lacks some
decompression algorithms, as does the ARMv7 implementation.)
Based on:
9405b5fd03
which is based on:
f561944ff7962c8ab21f
This adds 64-bit NEON coverage for all of the algorithms that are
covered by the 32-bit NEON implementation, except for h2v1 (4:2:2) fancy
upsampling (used when decompressing 4:2:2 JPEG images.) It also adds
64-bit NEON SIMD coverage for:
* slow integer forward DCT (compressor)
* h2v2 (4:2:0) downsampling (compressor)
* h2v1 (4:2:2) downsampling (compressor)
which are not covered in the 32-bit implementation.
Compression speedups relative to libjpeg-turbo 1.4.2:
Apple A7 (iPhone 5S), iOS, 64-bit: 113-150% (reported)
48-core ThunderX (RunAbove ARM Cloud), Linux, 64-bit: 2.1-33% (avg. 15%)
Refer to #44 and #49 for discussion
This commit also removes the unnecessary
if (simd_support & JSIMD_ARM_NEON)
statements from the jsimd* algorithm functions. Since the jsimd_can*()
functions check for the existence of NEON, the corresponding algorithm
functions will never be called if NEON isn't available.
Based on:
dcd9d84f10b0d87b811f70cd5c8a493e58d9a064837b19542f73dc43ccc8a82b71a261c1b1188c21305c89284e7f443f99954c2b53b77d
Unified version with fixes:
1004a3cd05
Full-color compression speedups relative to libjpeg-turbo 1.4.2:
800 MHz ARM Cortex-A9, iOS, 32-bit: 26-44% (avg. 32%)
Refer to #42 and #47 for discussion.
This commit also removes the unnecessary
if (simd_support & JSIMD_ARM_NEON)
statements from the jsimd* algorithm functions. Since the jsimd_can*()
functions check for the existence of NEON, the corresponding algorithm
functions will never be called if NEON isn't available. Removing those
if statements improved performance across the board by a couple of
percent.
Based on:
fc023c880c
Partially reverts 54014d9c2a. When
building from a git sandbox, as opposed to from an official source
tarball, it is still necessary to run autoreconf.
Closes#48
The Linux build machine has been upgraded to autoconf 2.69, automake
1.15, m4 1.4.17, and libtool 2.4.6, so it is no longer necessary to
recommend running autoreconf prior to building the source, if one is
building from an official source tarball (as opposed to from a git
sandbox.) Also, there is no SVN repository anymore (oops.)
Full-color compression speedups relative to libjpeg-turbo 1.4.2:
2.8 GHz Intel Xeon W3530, Linux, 64-bit: 2.2-18% (avg. 9.5%)
2.8 GHz Intel Xeon W3530, Linux, 32-bit: 10-25% (avg. 17%)
2.3 GHz AMD A10-4600M APU, Linux, 64-bit: 4.9-17% (avg. 11%)
2.3 GHz AMD A10-4600M APU, Linux, 32-bit: 8.8-19% (avg. 15%)
3.0 GHz Intel Core i7, OS X, 64-bit: 3.5-16% (avg. 10%)
3.0 GHz Intel Core i7, OS X, 32-bit: 4.8-14% (avg. 11%)
2.6 GHz AMD Athlon 64 X2 5050e:
Performance-neutral (give or take a few percent)
Full-color compression speedups relative to IPP:
2.8 GHz Intel Xeon W3530, Linux, 64-bit: 4.8-34% (avg. 19%)
2.8 GHz Intel Xeon W3530, Linux, 32-bit: -19%-7.0% (avg. -7.0%)
Refer to #42 for discussion. Numerous other approaches were attempted,
but this one proved to be the most performant across all platforms.
This commit also fixes#3 (works around, really-- the clang-compiled version
of jchuff.c still performs 20% worse than its GCC-compiled counterpart, but
that code is now bypassed by the new SSE2 Huffman algorithm.)
Based on:
2cb4d4133036c94e050d
Traditionally, the x86-64 code did not call init_simd() because it had
no need to (only SSE2 was supported.) However, having the ability to
disable SIMD at run time is a useful testing tool, and all of the other
SIMD implementations have this ability.
Fix a regression introduced in 1.4.1 that prevented 32-bit and 64-bit
libjpeg-turbo RPMs from being installed simultaneously on recent Red
Hat/Fedora distributions. This was due to the addition of the
SIZEOF_SIZE_T macro in jconfig.h, which allows the Huffman codec to
determine the word size at compile time. Since that macro differs
between 32-bit and 64-bit builds, this caused a conflict between the
i386 and x86_64 RPMs (any differing files, other than executables, are
not allowed when 32-bit and 64-bit RPMs are installed simultaneously.)
Since the macro is used only internally, it has been moved into
jconfigint.h.
The unnecessary .arch directive was removed from the ARM64 SIMD code
in d70a5c12fc, thus allowing clang's
integrated assembler to assemble the code on Linux systems. However,
this broke the detection mechanism in acinclude.m4 that tells the build
system whether it needs to use gas-preprocessor.pl. Since one of the
primary motivators for using gas-preprocessor.pl with ARM64 builds is
the lack of .req/.unreq directives in Apple's implementation of clang,
acinclude.m4 now checks whether .req/.unreq can be properly assembled
and uses gas-preprocessor.pl if not.
Closes#33.
Quality values > 95 are not useless. They just may not provide as good
of a size vs. perceptual quality tradeoff as lower quality values. This
also displays the default quality value in the cjpeg usage.
Closes#39
Most of these involved overrunning the signed 32-bit JLONG type whenever
building libjpeg-turbo with a 32-bit compiler. These issues are not
believed to represent actual security threats, but eliminating them
makes it easier to detect such threats should they arise in the future.
These days, INT32 is a commonly-defined datatype in system headers. We
cannot eliminate the definition of that datatype from jmorecfg.h, since
the INT32 typedef has technically been part of the libjpeg API since
version 5 (1994.) However, using INT32 internally is risky, because the
inclusion of a particular header (Xmd.h, for instance) could change the
definition of INT32 from long to int on 64-bit platforms and thus change
the internal behavior of libjpeg-turbo in unexpected ways (for instance,
failing to correctly set __INT32_IS_ACTUALLY_LONG to match the INT32
typedef-- perhaps as a result of including the wrong version of
jpeglib.h-- could cause libjpeg-turbo to produce incorrect results.)
The library has always been built in environments in which INT32 is
effectively long (on Windows, long is always 32-bit, so effectively it's
the same as int), so it makes sense to turn INT32 into an explicitly
long datatype. This ensures that libjpeg-turbo will always behave
consistently, regardless of the headers included at compile time.
Addresses a concern expressed in #26.
The IJG README file has been renamed to README.ijg, in order to avoid
confusion (many people were assuming that that was our project's README
file and weren't reading README-turbo.txt) and to lay the groundwork for
markdown versions of the libjpeg-turbo README and build instructions.
Most of these involved left shifting a negative number, which is
technically undefined (although every modern compiler I'm aware of
will implement this by treating the signed integer as a 2's complement
unsigned integer-- the LEFT_SHIFT() macro just makes this behavior
explicit in order to shut up ubsan.) This also fixes a couple of
non-issues in the entropy codecs, whereby the sanitizer reported an
out-of-bounds index in the 4th argument of jpeg_make_d_derived_tbl().
In those cases, the index was actually out of bounds (caused by a
malformed JPEG image), but jpeg_make_d_derived_tbl() would have caught
the error and aborted prior to actually using the invalid address. Here
again, the fix was to make our intentions explicit so as to shut up
ubsan.
The DSPr2 code was errantly comparing the residual (t9, width & 0xF)
with the end pointer (t4, out + width) instead of the width directly
(a1). This would give the wrong results with any image whose output
width was less than 16. The other small changes (ulw to lw and removal
of the nop) are just some easy optimizations around this code.
This issue caused a buffer overrun and subsequent segfault on images
whose scaled output height was 1 pixel and whose scaled output width was
< 16 pixels. Note that the "plain" (non-fancy and non-merged) upsample
routine, which was affected by this bug, is normally not used except
when decompressing a non-YCbCr JPEG image, but it is also used when
decompressing a single-row image (because the other upsampling
algorithms require at least two rows.)
Closes#16.
(descriptions cribbed by DRC from discussion in #20)
In the x86-64 ABI, the high (unused) DWORD of a 32-bit argument's
register is undefined, so it was incorrect to use a 64-bit mov
instruction to transfer a JDIMENSION argument in the 64-bit SSE2 SIMD
functions. The code worked thus far only because the existing compiler
optimizers weren't smart enough to do anything else with the register in
question, so the upper 32 bits happened to be all zeroes-- for the past
6 years, on every x86-64 compiler previously known to mankind.
The bleeding-edge Clang/LLVM compiler has a smarter optimizer, and
under certain circumstances, it will attempt to load-combine adjacent
32-bit integers from one of the libjpeg structures into a single 64-bit
integer and pass that 64-bit integer as a 32-bit argument to one of the
SIMD functions (which is allowed by the ABI, since the upper 32 bits of
the 32-bit argument's register are undefined.) This caused the
libjpeg-turbo regression tests to crash.
Also enhance the documentation of JDIMENSION to explain that its size
is significant to the implementation of the SIMD code.
Closes#20. Refer also to http://crbug.com/532214.
Previously this information was found in a page on libjpeg-turbo.org,
but there was still some confusion, because README-turbo.txt wasn't
clear as to which license applied to what.
With certain images, compressing using quality=100 and the fast integer
forward DCT will cause the divisor passed to compute_reciprocal() to be
1. In those cases, the library already disables the SIMD quantization
algorithm to avoid 16-bit overflow. However, compute_reciprocal()
doesn't properly handle the divisor==1 case, so we need to use special
values in that case so that the C quantization algorithm will behave
like an identity function.
When compiled with -mfpxx (which is now the default on Debian), there are
some restrictions on the use of odd-numbered FP registers. More details
about FPXX can be found here:
https://dmz-portal.mips.com/wiki/MIPS_O32_ABI_-_FR0_and_FR1_Interlinking
This commit simply changes all uses of FP registers to an even-numbered
equivalent like this:
f0 -> f0
f1 -> f2
f2 -> f4
...
f8 -> f16
This commit should have no observable effect except that the MIPS assembly
will now compile with -mfpxx.
Closes#11
rdbmp.c used the ambiguous INT32 datatype, which is sometimes typedef'ed
to long. Windows bitmap headers use 32-bit signed integers for the
width and height, because height can sometimes be negative (this
indicates a top-down bitmap.) If biWidth or biHeight was negative and
INT32 was a 64-bit long, then biWidth and biHeight were read as a
positive integer > INT32_MAX, which failed the test in line 385:
if (biWidth <= 0 || biHeight <= 0)
ERREXIT(cinfo, JERR_BMP_EMPTY);
This commit refactors rdbmp.c so that it uses the datatypes specified by
Microsoft for the Windows BMP header.
This closes#9 and also provides a better solution for mozilla/mozjpeg#153.
This reassures the caller that the buffers will not be modified and also
allows read-only buffers to be passed to the functions.
Partially reverts 3947a19f25fc8186d3812dbcf8e70baea36ef652.
Declare inbuffer arg in jpeg_mem_src() to be const
This reassures the caller that the buffer will not be modified and also
allows read-only buffers to be passed to the function.
rdbmp.c used the ambiguous INT32 datatype, which is sometimes typedef'ed
to long. Windows bitmap headers use 32-bit signed integers for the
width and height, because height can sometimes be negative (this
indicates a top-down bitmap.) If biWidth or biHeight was negative and
INT32 was a 64-bit long, then biWidth and biHeight were read as a
positive integer > INT32_MAX, which failed the test in line 385:
if (biWidth <= 0 || biHeight <= 0)
ERREXIT(cinfo, JERR_BMP_EMPTY);
This commit refactors rdbmp.c so that it uses the datatypes specified by
Microsoft for the Windows BMP header.
This closes#9 and also provides a better solution for mozilla/mozjpeg#153.
NASM 2.11.08 has a bug that prevents it from properly assembling a
macho64 version of libjpeg-turbo (the resulting binary generates corrupt
images.) 2.11.09 works properly. YASM also works properly and has been
a supported alternative since libjpeg-turbo 1.2.
NASM 2.11.08 has a bug that prevents it from properly assembling a
macho64 version of libjpeg-turbo (the resulting binary generates corrupt
images.) 2.11.09 works properly. YASM also works properly and has been
a supported alternative since libjpeg-turbo 1.2.
NASM 2.11.08 has a bug that prevents it from properly assembling a
macho64 version of libjpeg-turbo (the resulting binary generates corrupt
images.) 2.11.09 works properly. YASM also works properly and has been
a supported alternative since libjpeg-turbo 1.2.
NASM 2.11.08 has a bug that prevents it from properly assembling a
macho64 version of libjpeg-turbo (the resulting binary generates corrupt
images.) 2.11.09 works properly. YASM also works properly and has been
a supported alternative since libjpeg-turbo 1.2.
Under very rare circumstances, decompressing specific corrupt JPEG
images would create a situation whereby GET_BITS(1) was invoked
from within HUFF_DECODE_FAST() when bits_left=0. This produced a right
shift by a negative number of bits, which is undefined in C.
Under very rare circumstances, decompressing specific corrupt JPEG
images would create a situation whereby GET_BITS(1) was invoked
from within HUFF_DECODE_FAST() when bits_left=0. This produced a right
shift by a negative number of bits, which is undefined in C.
Under very rare circumstances, decompressing specific corrupt JPEG
images would create a situation whereby GET_BITS(1) was invoked
from within HUFF_DECODE_FAST() when bits_left=0. This produced a right
shift by a negative number of bits, which is undefined in C.
When using context-based upsampling, use a dummy color conversion
routine instead of a dummy row buffer. This improves performance
(since the actual color conversion routine no longer has to be called),
and it also fixes valgrind errors when decompressing to RGB565.
Valgrind previously complained, because using the RGB565 color
converter with the dummy row buffer was causing a table lookup with
undefined indices.
Under very rare circumstances, decompressing specific corrupt JPEG
images would create a situation whereby GET_BITS(1) was invoked
from within HUFF_DECODE_FAST() when bits_left=0. This produced a right
shift by a negative number of bits, which is undefined in C.
Use a new checked exception type (TJException) when passing through
errors from the underlying C library. This gives the application a
choice of catching all exceptions or just those from TurboJPEG.
Throw IllegalArgumentException at the JNI level when arguments to the
JNI function are incorrect, and when one of the TurboJPEG "utility"
functions returns an error (because, per the C API specification, those
functions will only return an error if one of their arguments is out of
range.)
Remove "throws Exception" from the signature of any methods that no
longer pass through an error from the TurboJPEG C library.
Credit Viktor for the new code
Code formatting tweaks
Change the behavior of the bailif0() macro in the JNI wrapper so that it doesn't throw an exception for an unexpected NULL condition. In fact, in all cases, the underlying JNI API function (such as GetFieldID(), etc.) will throw an Error on its own whenever it returns NULL, so our custom exceptions were never being thrown in that case anyhow. All we need to do is just detect the error and bail out of the C code.
This also corrects a couple of formatting issues (semicolons aren't needed at the end of class definitions, and @Override should be specified for the methods we're overriding from super-classes, so the compiler can sanity-check that we're actually overriding a method and not declaring a new one.)
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1595 632fc199-4ca6-4c93-a231-07263d6284db
-- Use macros to represent the fast FDCT constants, to facilitate comparing the AltiVec implementation of the algorithm with the SSE2 implementation.
-- Rename slow FDCT constants for consistency.
-- Use vec_sra() in all cases in the slow FDCT code. The SSE2 implementation uses psraw, which is an arithmetic shift, so we need to do likewise with AltiVec. Using vec_sr() hasn't caused any problems yet, but it is conceivable that it might cause different behavior in certain corner cases.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1444 632fc199-4ca6-4c93-a231-07263d6284db
This patch also removes an unneeded macro from jdmerge.c.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.4.x@1403 632fc199-4ca6-4c93-a231-07263d6284db
This patch also removes an unneeded macro from jdmerge.c.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.4.x@1403 632fc199-4ca6-4c93-a231-07263d6284db
This patch also removes an unneeded macro from jdmerge.c.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1402 632fc199-4ca6-4c93-a231-07263d6284db
-----
aee36252be.patch
From aee36252be20054afce371a92406fc66ba6627b5 Mon Sep 17 00:00:00 2001
From: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Date: Wed, 13 Aug 2014 03:50:22 +0300
Subject: [PATCH] ARM: Faster NEON yuv->rgb conversion for Krait and Cortex-A15
The older code was developed and tested only on ARM Cortex-A8 and ARM Cortex-A9.
Tuning it for newer ARM processors can introduce some speed-up (up to 20%).
The performance of the inner loop (conversion of 8 pixels) improves from
~27 cycles down to ~22 cycles on Qualcomm Krait 300, and from ~20 cycles
down to ~18 cycles on ARM Cortex-A15.
The performance remains exactly the same on ARM Cortex-A7 (~58 cycles),
ARM Cortex-A8 (~25 cycles) and ARM Cortex-A9 (~30 cycles) processors.
Also use larger indentation in the source code for separating two independent
instruction streams.
-----
a5efdbf22c.patch
From a5efdbf22ce9c1acd4b14a353cec863c2c57557e Mon Sep 17 00:00:00 2001
From: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Date: Wed, 13 Aug 2014 07:23:09 +0300
Subject: [PATCH] ARM: NEON optimized yuv->rgb565 conversion
The performance of the inner loop (conversion of 8 pixels):
* ARM Cortex-A7: ~55 cycles
* ARM Cortex-A8: ~28 cycles
* ARM Cortex-A9: ~32 cycles
* ARM Cortex-A15: ~20 cycles
* Qualcomm Krait: ~24 cycles
Based on the Linaro rgb565 patch from
https://sourceforge.net/p/libjpeg-turbo/patches/24/
but implements better instructions scheduling.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1385 632fc199-4ca6-4c93-a231-07263d6284db
-----
aee36252be.patch
From aee36252be20054afce371a92406fc66ba6627b5 Mon Sep 17 00:00:00 2001
From: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Date: Wed, 13 Aug 2014 03:50:22 +0300
Subject: [PATCH] ARM: Faster NEON yuv->rgb conversion for Krait and Cortex-A15
The older code was developed and tested only on ARM Cortex-A8 and ARM Cortex-A9.
Tuning it for newer ARM processors can introduce some speed-up (up to 20%).
The performance of the inner loop (conversion of 8 pixels) improves from
~27 cycles down to ~22 cycles on Qualcomm Krait 300, and from ~20 cycles
down to ~18 cycles on ARM Cortex-A15.
The performance remains exactly the same on ARM Cortex-A7 (~58 cycles),
ARM Cortex-A8 (~25 cycles) and ARM Cortex-A9 (~30 cycles) processors.
Also use larger indentation in the source code for separating two independent
instruction streams.
-----
a5efdbf22c.patch
From a5efdbf22ce9c1acd4b14a353cec863c2c57557e Mon Sep 17 00:00:00 2001
From: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Date: Wed, 13 Aug 2014 07:23:09 +0300
Subject: [PATCH] ARM: NEON optimized yuv->rgb565 conversion
The performance of the inner loop (conversion of 8 pixels):
* ARM Cortex-A7: ~55 cycles
* ARM Cortex-A8: ~28 cycles
* ARM Cortex-A9: ~32 cycles
* ARM Cortex-A15: ~20 cycles
* Qualcomm Krait: ~24 cycles
Based on the Linaro rgb565 patch from
https://sourceforge.net/p/libjpeg-turbo/patches/24/
but implements better instructions scheduling.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1385 632fc199-4ca6-4c93-a231-07263d6284db
We can't simply increase JMSG_LENGTH_MAX, because it is part of the libjpeg API, and it is generally assumed that a buffer of this length will be passed to format_message(). Thus, the easiest solution is simply to use a shorter copyright string for JMSG_COPYRIGHT.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.2.x@1320 632fc199-4ca6-4c93-a231-07263d6284db
We can't simply increase JMSG_LENGTH_MAX, because it is part of the libjpeg API, and it is generally assumed that a buffer of this length will be passed to format_message(). Thus, the easiest solution is simply to use a shorter copyright string for JMSG_COPYRIGHT.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.3.x@1319 632fc199-4ca6-4c93-a231-07263d6284db
We can't simply increase JMSG_LENGTH_MAX, because it is part of the libjpeg API, and it is generally assumed that a buffer of this length will be passed to format_message(). Thus, the easiest solution is simply to use a shorter copyright string for JMSG_COPYRIGHT.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.3.x@1319 632fc199-4ca6-4c93-a231-07263d6284db
We can't simply increase JMSG_LENGTH_MAX, because it is part of the libjpeg API, and it is generally assumed that a buffer of this length will be passed to format_message(). Thus, the easiest solution is simply to use a shorter copyright string for JMSG_COPYRIGHT.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1318 632fc199-4ca6-4c93-a231-07263d6284db
We can't simply increase JMSG_LENGTH_MAX, because it is part of the libjpeg API, and it is generally assumed that a buffer of this length will be passed to format_message(). Thus, the easiest solution is simply to use a shorter copyright string for JMSG_COPYRIGHT.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1318 632fc199-4ca6-4c93-a231-07263d6284db
-- Auto-generates HAVE_LOCALE_H macro and adds it to jconfig.h (this is used by rdjpgcom.c.)
-- Reconciles the description and ordering of macros between config.h.in and jconfig.h.in, so the two files can be easily diffed.
-- Eliminates the use of the autoheader-generated config.h in the project and moves relevant internal-only macros into a new file, jconfigint.h. This is to avoid "already defined" warnings in files that were including both config.h (to get the internal autotools package information or the INLINE definition) and jconfig.h.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.3.x@1258 632fc199-4ca6-4c93-a231-07263d6284db
-- Auto-generates HAVE_LOCALE_H macro and adds it to jconfig.h (this is used by rdjpgcom.c.)
-- Reconciles the description and ordering of macros between config.h.in and jconfig.h.in, so the two files can be easily diffed.
-- Eliminates the use of the autoheader-generated config.h in the project and moves relevant internal-only macros into a new file, jconfigint.h. This is to avoid "already defined" warnings in files that were including both config.h (to get the internal autotools package information or the INLINE definition) and jconfig.h.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.3.x@1258 632fc199-4ca6-4c93-a231-07263d6284db
-- Auto-generates HAVE_LOCALE_H macro and adds it to jconfig.h (this is used by rdjpgcom.c.)
-- Reconciles the description and ordering of macros between config.h.in and jconfig.h.in, so the two files can be easily diffed.
-- Eliminates the use of the autoheader-generated config.h in the project and moves relevant internal-only macros into a new file, jconfigint.h. This is to avoid "already defined" warnings in files that were including both config.h (to get the internal autotools package information or the INLINE definition) and jconfig.h.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1257 632fc199-4ca6-4c93-a231-07263d6284db
-- Auto-generates HAVE_LOCALE_H macro and adds it to jconfig.h (this is used by rdjpgcom.c.)
-- Reconciles the description and ordering of macros between config.h.in and jconfig.h.in, so the two files can be easily diffed.
-- Eliminates the use of the autoheader-generated config.h in the project and moves relevant internal-only macros into a new file, jconfigint.h. This is to avoid "already defined" warnings in files that were including both config.h (to get the internal autotools package information or the INLINE definition) and jconfig.h.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1257 632fc199-4ca6-4c93-a231-07263d6284db
delta = cur0 * 2;
cur0 += delta; /* form error * 3 */
errorptr[0] = (FSERROR) (bpreverr0 + cur0);
cur0 += delta; /* form error * 5 */
bpreverr0 = belowerr0 + cur0;
cur0 += delta; /* form error * 7 */
Each time cur0 is incremented by delta, the compiled code doubles the value of delta (WTF?!) Thus, by the time the end of the block is reached, cur0 is equal to 15 times its former self, not 7 times its former self as it should be. At any rate, it was a lot simpler to just refactor the code so that it uses multiplication.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.2.x@1255 632fc199-4ca6-4c93-a231-07263d6284db
delta = cur0 * 2;
cur0 += delta; /* form error * 3 */
errorptr[0] = (FSERROR) (bpreverr0 + cur0);
cur0 += delta; /* form error * 5 */
bpreverr0 = belowerr0 + cur0;
cur0 += delta; /* form error * 7 */
Each time cur0 is incremented by delta, the compiled code doubles the value of delta (WTF?!) Thus, by the time the end of the block is reached, cur0 is equal to 15 times its former self, not 7 times its former self as it should be. At any rate, it was a lot simpler to just refactor the code so that it uses multiplication.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.3.x@1253 632fc199-4ca6-4c93-a231-07263d6284db
delta = cur0 * 2;
cur0 += delta; /* form error * 3 */
errorptr[0] = (FSERROR) (bpreverr0 + cur0);
cur0 += delta; /* form error * 5 */
bpreverr0 = belowerr0 + cur0;
cur0 += delta; /* form error * 7 */
Each time cur0 is incremented by delta, the compiled code doubles the value of delta (WTF?!) Thus, by the time the end of the block is reached, cur0 is equal to 15 times its former self, not 7 times its former self as it should be. At any rate, it was a lot simpler to just refactor the code so that it uses multiplication.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.3.x@1253 632fc199-4ca6-4c93-a231-07263d6284db
delta = cur0 * 2;
cur0 += delta; /* form error * 3 */
errorptr[0] = (FSERROR) (bpreverr0 + cur0);
cur0 += delta; /* form error * 5 */
bpreverr0 = belowerr0 + cur0;
cur0 += delta; /* form error * 7 */
Each time cur0 is incremented by delta, the compiled code doubles the value of delta (WTF?!) Thus, by the time the end of the block is reached, cur0 is equal to 15 times its former self, not 7 times its former self as it should be. At any rate, it was a lot simpler to just refactor the code so that it uses multiplication.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1251 632fc199-4ca6-4c93-a231-07263d6284db
delta = cur0 * 2;
cur0 += delta; /* form error * 3 */
errorptr[0] = (FSERROR) (bpreverr0 + cur0);
cur0 += delta; /* form error * 5 */
bpreverr0 = belowerr0 + cur0;
cur0 += delta; /* form error * 7 */
Each time cur0 is incremented by delta, the compiled code doubles the value of delta (WTF?!) Thus, by the time the end of the block is reached, cur0 is equal to 15 times its former self, not 7 times its former self as it should be. At any rate, it was a lot simpler to just refactor the code so that it uses multiplication.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1251 632fc199-4ca6-4c93-a231-07263d6284db
-- The Mac and Cygwin packages will now be created with the directory structure defined by the configure variables "prefix", "bindir", "libdir", etc., with the exception that the docs are always installed under /usr/share/doc/{package_name}-{version} on Cygwin and /Library/Documentation/{package_name} on Mac.
-- Fixed a duplicate filename warning when generating RPMs with the default prefix of /opt/libjpeg-turbo.
-- Moved the TurboJPEG libraries out of the system directory on Windows and Mac. It is no longer necessary to put them there, since we are not trying to be backward compatible with TurboJPEG/IPP anymore.
-- Fixed an issue whereby building the "installer" target on Windows would not build the Java JAR file, thus causing an error if the JAR had not been previously built.
-- Building the "install" target on Windows will now install libjpeg-turbo into c:\libjpeg-turbo[-gcc][64] (the same directories used by the installers.) This can be overridden by setting CMAKE_INSTALL_PREFIX.
-- The Java classes on all platforms will now look for the JNI library in the directory under which the build/packaging system installs it.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@946 632fc199-4ca6-4c93-a231-07263d6284db
-- The Mac and Cygwin packages will now be created with the directory structure defined by the configure variables "prefix", "bindir", "libdir", etc., with the exception that the docs are always installed under /usr/share/doc/{package_name}-{version} on Cygwin and /Library/Documentation/{package_name} on Mac.
-- Fixed a duplicate filename warning when generating RPMs with the default prefix of /opt/libjpeg-turbo.
-- Moved the TurboJPEG libraries out of the system directory on Windows and Mac. It is no longer necessary to put them there, since we are not trying to be backward compatible with TurboJPEG/IPP anymore.
-- Fixed an issue whereby building the "installer" target on Windows would not build the Java JAR file, thus causing an error if the JAR had not been previously built.
-- Building the "install" target on Windows will now install libjpeg-turbo into c:\libjpeg-turbo[-gcc][64] (the same directories used by the installers.) This can be overridden by setting CMAKE_INSTALL_PREFIX.
-- The Java classes on all platforms will now look for the JNI library in the directory under which the build/packaging system installs it.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@946 632fc199-4ca6-4c93-a231-07263d6284db
If the default prefix (/opt/libjpeg-turbo) is used, then we now always install 32-bit libraries in /opt/libjpeg-turbo/lib32 and 64-bit libraries in /opt/libjpeg-turbo/lib64 instead of trying to conform to the Debian or Red Hat conventions. The RPM and DEB packages will now be created with the directory structure defined by the configure variables "prefix", "bindir", "libdir", etc., with the exception that the docs are always installed under /usr/share/doc/{package_name}-{version}.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@944 632fc199-4ca6-4c93-a231-07263d6284db
If the default prefix (/opt/libjpeg-turbo) is used, then we now always install 32-bit libraries in /opt/libjpeg-turbo/lib32 and 64-bit libraries in /opt/libjpeg-turbo/lib64 instead of trying to conform to the Debian or Red Hat conventions. The RPM and DEB packages will now be created with the directory structure defined by the configure variables "prefix", "bindir", "libdir", etc., with the exception that the docs are always installed under /usr/share/doc/{package_name}-{version}.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@944 632fc199-4ca6-4c93-a231-07263d6284db
build to fail when using the Visual Studio IDE.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.1.x@554 632fc199-4ca6-4c93-a231-07263d6284db
be used instead of 4:2:2 when decompressing JPEG images using SSE2 code
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@223 632fc199-4ca6-4c93-a231-07263d6284db
* Use jsimd_i386.c instead of the attic jsimd.c
* Corrected include of jsimd.h in jsimd_i386.c.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@59 632fc199-4ca6-4c93-a231-07263d6284db
Use RIP relative addressing as that works in both PIC and non-PIC mode.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@51 632fc199-4ca6-4c93-a231-07263d6284db
The SIMD glue code has gotten a bit #ifdef heavy so clean it up by having
one file for each possible SIMD arch. This also allows a simplification of
the x86_64 code as SSE/SSE2 is always known to exist on that arch.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@49 632fc199-4ca6-4c93-a231-07263d6284db
Older versions of automake doesn't properly support no-recursive make.
Reimplement the build system by having a local Makefile.am in the
simd/ directory.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@31 632fc199-4ca6-4c93-a231-07263d6284db
We use the heap allocators to avoid having more than one implementation
of the alignment logic.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@19 632fc199-4ca6-4c93-a231-07263d6284db
This has no measurable difference right now but makes it possible to do
SIMD implementations of this stage.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@16 632fc199-4ca6-4c93-a231-07263d6284db
Add NASM support and stub routine for detecting SIMD extensions.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@15 632fc199-4ca6-4c93-a231-07263d6284db
Designed to impose minimal changes on the "normal" code.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@14 632fc199-4ca6-4c93-a231-07263d6284db
Divide it into sample conversion, DCT and quantization in order to
easily provide alternative implementations of each stage.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@13 632fc199-4ca6-4c93-a231-07263d6284db
Fix some broken assumptions and allow any alignment, not just those
associated with C types.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@12 632fc199-4ca6-4c93-a231-07263d6284db
Studio build. A static jconfig.h has been re-added, but in a separate
directory, to avoid clash with jconfig.h generated by configure
script. Also, jconfig.h now includes the inline macro. jpeg.dsp has
been modified to search in the "win" subdir, to find jconfig.h.
This patch is in spirit similar to r121.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@5 632fc199-4ca6-4c93-a231-07263d6284db
**Complete description of the bug fix or feature that this pull request implements**
**Checklist before submitting the pull request, to maximize the chances that the pull request will be accepted**
- [ ] Read CONTRIBUTING.md, a link to which appears under "Helpful resources" below. That document discusses general guidelines for contributing to libjpeg-turbo, as well as the types of contributions that will not be accepted or are unlikely to be accepted.
- [ ] Search the existing issues and pull requests (both open and closed) to ensure that a similar request has not already been submitted and rejected.
- [ ] Discuss the proposed bug fix or feature in a GitHub issue, through direct e-mail with the project maintainer, or on the libjpeg-turbo-devel mailing list.
* If building on macOS, NASM or Yasm can be obtained from
[MacPorts](https://macports.org) or [Homebrew](https://brew.sh).
- NOTE: Currently, if it is desirable to hide the SIMD function symbols in
Mac executables or shared libraries that statically link with
libjpeg-turbo, then NASM 2.14 or later or Yasm must be used when
building libjpeg-turbo.
* If NASM or Yasm is not in your `PATH`, then you can specify the full path
to the assembler by using either the `CMAKE_ASM_NASM_COMPILER` CMake
variable or the `ASM_NASM` environment variable. On Windows, use forward
slashes rather than backslashes in the path (for example,
**c:/nasm/nasm.exe**).
* NASM and Yasm are located in the CRB (Code Ready Builder) or PowerTools
repository on Red Hat Enterprise Linux 8+ and derivatives, which is not
enabled by default.
### Un*x Platforms (including Linux, Mac, FreeBSD, Solaris, and Cygwin)
- GCC v4.1 (or later) or Clang recommended for best performance
- If building the TurboJPEG Java wrapper, JDK or OpenJDK 1.5 or later is
required. Most modern Linux distributions, as well as Solaris 10 and later,
include JDK or OpenJDK. For other systems, you can obtain the Oracle Java
Development Kit from
<https://oracle.com/java/technologies/downloads>.
* If using JDK 11 or later, CMake 3.10.x or later must also be used.
### Windows
- Microsoft Visual C++ 2005 or later
If you don't already have Visual C++, then the easiest way to get it is by
installing
[Visual Studio Community Edition](https://visualstudio.microsoft.com),
which includes everything necessary to build libjpeg-turbo.
* You can also download and install the standalone Windows SDK (for Windows 7
or later), which includes command-line versions of the 32-bit and 64-bit
Visual C++ compilers.
* If you intend to build libjpeg-turbo from the command line, then add the
appropriate compiler and SDK directories to the `INCLUDE`, `LIB`, and
`PATH` environment variables. This is generally accomplished by
executing `vcvars32.bat` or `vcvars64.bat`, which are located in the same
directory as the compiler.
* If built with Visual C++ 2015 or later, the libjpeg-turbo static libraries
cannot be used with earlier versions of Visual C++, and vice versa.
* The libjpeg API DLL (**jpeg{version}.dll**) will depend on the C run-time
DLLs corresponding to the version of Visual C++ that was used to build it.
- Vcpkg
You need to download and install libpng using the [vcpkg](https://github.com/Microsoft/vcpkg) dependency manager:
git clone https://github.com/Microsoft/vcpkg.git
cd vcpkg
./bootstrap-vcpkg.bat
./vcpkg integrate install
./vcpkg install libpng:x64-windows
./vcpkg install libpng:x64-windows-static
Actually, you can just download and install MozJPEG using vcpkg dependency manager:
./vcpkg install mozjpeg
The mozjpeg port in vcpkg is kept up to date by Microsoft team members and community contributors. If the version is out of date, please [create an issue or pull request](https://github.com/Microsoft/vcpkg) on the vcpkg repository.
... OR ...
- MinGW
[MSYS2](https://msys2.org) or [tdm-gcc](https://jmeubank.github.io/tdm-gcc)
recommended if building on a Windows machine. Both distributions install a
Start Menu link that can be used to launch a command prompt with the
appropriate compiler paths automatically set.
- If building the TurboJPEG Java wrapper, JDK 1.5 or later is required. This
can be downloaded from
<https://oracle.com/java/technologies/downloads>.
* If using JDK 11 or later, CMake 3.10.x or later must also be used.
Sub-Project Builds
------------------
The libjpeg-turbo build system does not support being included as a sub-project
using the CMake `add_subdirectory()` function. Use the CMake
`ExternalProject_Add()` function instead.
Out-of-Tree Builds
------------------
Binary objects, libraries, and executables are generated in the directory from
which CMake is executed (the "binary directory"), and this directory need not
necessarily be the same as the libjpeg-turbo source directory. You can create
multiple independent binary directories, in which different versions of
libjpeg-turbo can be built from the same source tree using different compilers
or settings. In the sections below, *{build_directory}* refers to the binary
directory, whereas *{source_directory}* refers to the libjpeg-turbo source
directory. For in-tree builds, these directories are the same.
Ninja
-----
If using Ninja, then replace `make` or `nmake` with `ninja`, and replace the
CMake generator (specified with the `-G` option) with `Ninja`, in all of the
procedures and recipes below.
Build Procedure
---------------
NOTE: The build procedures below assume that CMake is invoked from the command
line, but all of these procedures can be adapted to the CMake GUI as
well.
### Un*x
The following procedure will build libjpeg-turbo on Unix and Unix-like systems.
(On Solaris, this generates a 32-bit build. See "Build Recipes" below for
This repository is governed by Mozilla's code of conduct and etiquette guidelines.
For more details, please read the
[Mozilla Community Participation Guidelines](https://www.mozilla.org/about/governance/policies/participation/).
## How to Report
For more information on how to report violations of the Community Participation Guidelines, please read our '[How to Report](https://www.mozilla.org/about/governance/policies/participation/reporting/)' page.
<!--
## Project Specific Etiquette
In some cases, there will be additional project etiquette i.e.: (https://bugzilla.mozilla.org/page.cgi?id=etiquette.html).
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
* Neither the name of the mozjpeg Project nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS", AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Mozilla JPEG Encoder Project [](https://ci.appveyor.com/project/kornel/mozjpeg-4ekrx)
============================
This project's goal is to reduce the size of JPEG files without reducing quality or compatibility with the vast majority of the world's deployed decoders.
MozJPEG improves JPEG compression efficiency achieving higher visual quality and smaller file sizes at the same time. It is compatible with the JPEG standard, and the vast majority of the world's deployed JPEG decoders.
The idea is to reduce transfer times for JPEGs on the Web, thus reducing page load times.
MozJPEG is a patch for [libjpeg-turbo](https://github.com/libjpeg-turbo/libjpeg-turbo). **Please send pull requests to libjpeg-turbo** if the changes aren't specific to newly-added MozJPEG-only compression code. This project aims to keep differences with libjpeg-turbo minimal, so whenever possible, improvements and bug fixes should go there first.
'mozjpeg' is not intended to be a general JPEG library replacement. It makes tradeoffs that are intended to benefit Web use cases and focuses solely on improving encoding. It is best used as part of a Web encoding workflow. For a general JPEG library (e.g. your system libjpeg), especially if you care about decoding, we recommend libjpeg-turbo.
MozJPEG is compatible with the libjpeg API and ABI. It is intended to be a drop-in replacement for libjpeg. MozJPEG is a strict superset of libjpeg-turbo's functionality. All MozJPEG's improvements can be disabled at run time, and in that case it behaves exactly like libjpeg-turbo.
More information:
MozJPEG is meant to be used as a library in graphics programs and image processing tools. We include a demo `cjpeg` command-line tool, but it's not intended for serious use. We encourage authors of graphics programs to use libjpeg's [C API](libjpeg.txt) and link with MozJPEG library instead.
* Progressive encoding with "jpegrescan" optimization. It can be applied to any JPEG file (with `jpegtran`) to losslessly reduce file size.
* Trellis quantization. When converting other formats to JPEG it maximizes quality/filesize ratio.
* Comes with new quantization table presets, e.g. tuned for high-resolution displays.
* Fully compatible with all web browsers.
* Can be seamlessly integrated into any program that uses the industry-standard libjpeg API. There's no need to write any MozJPEG-specific integration code.
See [BUILDING](BUILDING.md). MozJPEG is built exactly the same way as libjpeg-turbo, so if you need additional help please consult [libjpeg-turbo documentation](https://libjpeg-turbo.org/).
"Directory containing cross-compiled x86-64 or Armv8 (64-bit) iOS or macOS build to include in universal binaries")
set(MACOS_APP_CERT_NAME""CACHESTRING
"Name of the Developer ID Application certificate (in the macOS keychain) that should be used to sign the libjpeg-turbo DMG. Leave this blank to generate an unsigned DMG.")
set(MACOS_INST_CERT_NAME""CACHESTRING
"Name of the Developer ID Installer certificate (in the macOS keychain) that should be used to sign the libjpeg-turbo installer package. Leave this blank to generate an unsigned package.")
[C compiler flags needed to include jni.h (default: -I/System/Library/Frameworks/JavaVM.framework/Headers on OS X, '-I/usr/java/include -I/usr/java/include/solaris' on Solaris, and '-I/usr/java/default/include -I/usr/java/default/include/linux' on Linux)])
AC_MSG_CHECKING([whether to build TurboJPEG Java wrapper])
AC_ARG_WITH([java],
AC_HELP_STRING([--with-java], [Build Java wrapper for the TurboJPEG library]))
if test "x$with_12bit" = "xyes" -o "x$with_turbojpeg" = "xno"; then
<pathd="m 14,-1.1445312 c -2.824372,0 -5.1445313,2.320159 -5.1445312,5.1445312 v 72 c 0,2.824372 2.3201592,5.144531 5.1445312,5.144531 h 52 c 2.824372,0 5.144531,-2.320159 5.144531,-5.144531 V 23.699219 a 1.1447968,1.1447968 0 0 0 -0.01563,-0.1875 C 70.977847,22.605363 70.406495,21.99048 70.007812,21.591797 L 48.208984,-0.20898438 C 47.606104,-0.81186474 46.804652,-1.1445313 46,-1.1445312 Z m 1.144531,6.2890624 H 42.855469 V 24 c 0,1.724372 1.420159,3.144531 3.144531,3.144531 H 64.855469 V 74.855469 H 15.144531 Z m 34,4.4179688 L 60.4375,20.855469 H 49.144531 Z"/>
</g>
<gstyle="fill:#D8DFEE;stroke-width:0">
<pathd="M 3.0307167,13.993174 V 7.0307167 h 2.7576792 2.7576792 v 1.8826151 c 0,1.2578262 0.0099,1.9287572 0.029818,2.0216512 0.03884,0.181105 0.168631,0.348218 0.33827,0.43554 l 0.1355017,0.06975 1.9598092,0.0079 1.959809,0.0078 v 4.749829 4.749829 H 8 3.0307167 Z"transform="matrix(5,0,0,5,0,-30)"/>
<pathd="M 9.8293515,9.0581469 V 7.9456453 l 1.1058025,1.1055492 c 0.608191,0.6080521 1.105802,1.1086775 1.105802,1.1125015 0,0.0038 -0.497611,0.007 -1.105802,0.007 H 9.8293515 Z"transform="matrix(5,0,0,5,0,-30)"/>
<pathd="m 14,-1.1445312 c -2.824372,0 -5.1445313,2.320159 -5.1445312,5.1445312 v 72 c 0,2.824372 2.3201592,5.144531 5.1445312,5.144531 h 52 c 2.824372,0 5.144531,-2.320159 5.144531,-5.144531 V 23.699219 a 1.1447968,1.1447968 0 0 0 -0.01563,-0.1875 C 70.977847,22.605363 70.406495,21.99048 70.007812,21.591797 L 48.208984,-0.20898438 C 47.606104,-0.81186474 46.804652,-1.1445313 46,-1.1445312 Z m 1.144531,6.2890624 H 42.855469 V 24 c 0,1.724372 1.420159,3.144531 3.144531,3.144531 H 64.855469 V 74.855469 H 15.144531 Z m 34,4.4179688 L 60.4375,20.855469 H 49.144531 Z"/>
</g>
<gstyle="fill:#4665A2;stroke-width:0">
<pathd="M 3.0307167,13.993174 V 7.0307167 h 2.7576792 2.7576792 v 1.8826151 c 0,1.2578262 0.0099,1.9287572 0.029818,2.0216512 0.03884,0.181105 0.168631,0.348218 0.33827,0.43554 l 0.1355017,0.06975 1.9598092,0.0079 1.959809,0.0078 v 4.749829 4.749829 H 8 3.0307167 Z"transform="matrix(5,0,0,5,0,-30)"/>
<pathd="M 9.8293515,9.0581469 V 7.9456453 l 1.1058025,1.1055492 c 0.608191,0.6080521 1.105802,1.1086775 1.105802,1.1125015 0,0.0038 -0.497611,0.007 -1.105802,0.007 H 9.8293515 Z"transform="matrix(5,0,0,5,0,-30)"/>
<pathd="M 5.6063709,24.951908 C 4.3924646,24.775461 3.4197129,23.899792 3.1031586,22.698521 L 3.0216155,22.389078 V 13.997725 5.6063709 L 3.1037477,5.2982247 C 3.3956682,4.2029881 4.1802788,3.412126 5.2787258,3.105917 5.5646428,3.0262132 5.6154982,3.0244963 8.0611641,3.0119829 l 2.4911989,-0.012746 1.932009,1.9300342 c 1.344142,1.3427669 1.976319,1.9498819 2.07763,1.9952626 0.137456,0.061571 0.474218,0.066269 6.006826,0.083795 l 5.861206,0.018568 0.29124,0.081916 c 1.094895,0.3079569 1.890116,1.109428 2.175567,2.192667 l 0.08154,0.3094425 V 16 22.389078 l -0.08154,0.309443 c -0.28446,1.079482 -1.086411,1.888085 -2.175567,2.193614 l -0.29124,0.0817 -10.302616,0.0049 c -5.700217,0.0027 -10.4001945,-0.0093 -10.5210471,-0.02684 z"/>
<pathd="M 5.6063709,24.951908 C 4.3924646,24.775461 3.4197129,23.899792 3.1031586,22.698521 L 3.0216155,22.389078 V 13.997725 5.6063709 L 3.1037477,5.2982247 C 3.3956682,4.2029881 4.1802788,3.412126 5.2787258,3.105917 5.5646428,3.0262132 5.6154982,3.0244963 8.0611641,3.0119829 l 2.4911989,-0.012746 1.932009,1.9300342 c 1.344142,1.3427669 1.976319,1.9498819 2.07763,1.9952626 0.137456,0.061571 0.474218,0.066269 6.006826,0.083795 l 5.861206,0.018568 0.29124,0.081916 c 1.094895,0.3079569 1.890116,1.109428 2.175567,2.192667 l 0.08154,0.3094425 V 16 22.389078 l -0.08154,0.309443 c -0.28446,1.079482 -1.086411,1.888085 -2.175567,2.193614 l -0.29124,0.0817 -10.302616,0.0049 c -5.700217,0.0027 -10.4001945,-0.0093 -10.5210471,-0.02684 z"/>
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.