Commit Graph

2601 Commits

Author SHA1 Message Date
DRC
8d76e4e550 Doc: "EXIF" = "Exif" 2024-08-31 15:33:55 -04:00
DRC
9983840eb6 TJ/xform: Check crop region against dest. image
Lossless cropping is performed after other lossless transform
operations, so the cropping region must be specified relative to the
destination image dimensions and level of chrominance subsampling, not
the source image dimensions and level of chrominance subsampling.

More specifically, if the lossless transform operation swaps the X and Y
axes, or if the image is converted to grayscale, then that changes the
cropping region requirements.
2024-08-31 15:04:30 -04:00
DRC
8456d2b98c Doc: "MCU block" = "iMCU" or "MCU"
The JPEG-1 spec never uses the term "MCU block".  That term is rarely
used in other literature to describe the equivalent of an MCU in an
interleaved JPEG image, but the libjpeg documentation uses "iMCU" to
describe the same thing.  "iMCU" is a better term, since the equivalent
of an interleaved MCU can contain multiple DCT blocks (or samples in
lossless mode) that are only grouped together if the image is
interleaved.

In the case of restart markers, "MCU block" was used in the libjpeg
documentation instead of "MCU", but "MCU" is more accurate and less
confusing.  (The restart interval is literally in MCUs, where one MCU
is one data unit in a non-interleaved JPEG image and multiple data units
in a multi-component interleaved JPEG image.)

In the case of 9b704f96b2, the issue was
actually with progressive JPEG images exactly two DCT blocks wide, not
two MCU blocks wide.

This commit also defines "MCU" and "MCU row" in the description of the
various restart marker options/parameters.  Although an MCU row is
technically always a row of samples in lossless mode, "sample row" was
confusing, since it is used in other places to describe a row of samples
for a single component (whereas an MCU row in a typical lossless JPEG
image consists of a row of interleaved samples for all components.)
2024-08-30 14:16:09 -04:00
DRC
5cf7960678 Undocument TJ*PARAM_RESTARTBLOCKS for lossless
TJ*PARAM_RESTARTBLOCKS technically works with lossless compression, but
it is not useful, since the value must be equal to the number of samples
in a row.  (In other words, it is no different than
TJ*PARAM_RESTARTINROWS, except that it requires the user to do more
math.)
2024-08-28 18:36:37 -04:00
DRC
d62079717c TJBench: Don't override subsamp until args parsed
Otherwise, passing -subsamp after -lossless might cause the worst-case
JPEG buffer size to be too small.
2024-08-28 18:21:55 -04:00
DRC
c72bbd9c06 tjbench.c: (Re)allow unreduced scaling factors
For reasons I can't recall, fc01f4673b
(the TurboJPEG 3 API overhaul) changed the C version of TJBench, but not
the Java version, so that it requires an exact scaling factor match (as
opposed to allowing unreduced scaling factors, as djpeg does and prior
versions of TJBench did.)  That might have been temporary testing code
that was accidentally committed.
2024-08-26 18:14:12 -04:00
DRC
35199878f6 TurboJPEG doc: Fix incorrect/confusing parentheses 2024-08-26 16:35:30 -04:00
DRC
00a261c473 tjbench.c: Code formatting tweak 2024-08-26 16:31:02 -04:00
DRC
4851cbe406 djpeg/jpeg_crop_scanline(): Disallow crop vals < 0
Because the crop spec was parsed using unsigned 32-bit integers,
negative numbers were interpreted as values ~= UINT_MAX (4,294,967,295).
This had the following ramifications:

- If the cropping region width was negative and the adjusted width + the
  adjusted left boundary was greater than 0, then the 32-bit unsigned
  integer bounds checks in djpeg and jpeg_crop_scanline() overflowed and
  failed to detect the out-of-bounds width, jpeg_crop_scanline() set
  cinfo->output_width to a value ~= UINT_MAX, and a buffer overrun and
  subsequent segfault occurred in the upsampling or color conversion
  routine.  The segfault occurred in the body of
  jpeg_skip_scanlines() --> read_and_discard_scanlines() if the cropping
  region upper boundary was greater than 0 and the JPEG image used
  chrominance subsampling and in the body of jpeg_read_scanlines()
  otherwise.

- If the cropping region width was negative and the adjusted width + the
  adjusted left boundary was 0, then a zero-width output image was
  generated.

- If the cropping region left boundary was negative, then an output
  image with bogus data was generated.

This commit modifies djpeg and jpeg_crop_scanline() so that the
aforementioned bounds checks use 64-bit unsigned integers, thus guarding
against overflow.  It similarly modifies jpeg_skip_scanlines().  In the
case of jpeg_skip_scanlines(), the issue was not reproducible with
djpeg, but passing a negative number of lines to jpeg_skip_scanlines()
caused a similar overflow if the number of lines +
cinfo->output_scanline was greater than 0.  That caused
jpeg_skip_scanlines() to read past the end of the JPEG image, throw a
warning ("Corrupt JPEG data: premature end of data segment"), and fail
to return unless warnings were treated as fatal.  Also, djpeg now parses
the crop spec using signed integers and checks for negative values.
2024-08-26 16:24:33 -04:00
DRC
548f732432 TJBench: Usage screen tweak
Indicate that -maxmemory and -maxpixels take an integer argument.
2024-08-26 10:14:11 -04:00
DRC
de4bbac55e TJCompressor.compress(): Fix lossls buf size calc 2024-08-23 12:48:34 -04:00
DRC
0acb084464 JNI: Fix *Image() array size issues w/ align != 1
+ check for mismatch between C and Java APIs in *saveImage().
2024-08-22 16:58:50 -04:00
DRC
d44fc54f94 Java: Unset srcBuf12/16 with BufferedImage/YUV src
Due to an oversight in the multi-precision feature,
TJCompressor.srcBuf12 and TJCompressor.srcBuf16 were not set to null
in TJCompressor.setSourceImage(YUVImage) or
TJCompressor.setSourceImage(BufferedImage, ...).  Thus, if an
application set a 12-bit or 16-bit packed-pixel buffer as the source
image then set a BufferedImage with integer pixels as the source image,
TJCompress.compress() would compress from the 12-bit or 16-bit
packed-pixel buffer instead of the BufferedImage.  The odds of an
application actually doing that are very slim, however.
2024-08-21 16:44:46 -04:00
DRC
8c2e7306cf Java doc: Minor formatting tweak 2024-08-21 13:03:41 -04:00
DRC
49f1b5807d TJBench.java: Explicitly set restartIntervalBlocks
This is just a readability thing.  Java initializes integer fields to 0
by default.
2024-08-20 15:11:20 -04:00
DRC
a9723f8a98 cjpeg/djpeg/jpegtran: Restore jpeg-6b arg abbrevs
There are two approaches to handling abbreviated command-line options:

1. If a new option is introduced that begins with the same letters as an
existing option, require a longer abbreviation for the existing option
in order to ensure that abbreviations are always unique.

2. Require a unique abbreviation only for new options, and match all
non-unique abbreviations with existing options, thus maintaining
backward compatibility.

keymatch() supports either approach, and Tom Lane historically seemed to
prefer Approach 2, whereas both approaches have been applied
inconsistently in the years since.  This commit consistently applies
Approach 2.

More specific notes:

We unnecessarily required 'cjpeg -progressive' to be abbreviated as
'cjpeg -pro' rather than 'cjpeg -p' when the -precision option was
introduced in libjpeg-turbo 3.0 beta.

The IJG unnecessarily required 'cjpeg -scans' to be abbreviated as
'cjpeg -scan' rather than 'cjpeg -sc' when the -scale option was
introduced in jpeg-7.  We even more unnecessarily adopted that
requirement, even though we never adopted the -scale option.

We unnecessarily required 'djpeg -scale' to be abbreviated as
'djpeg -sc' rather than 'djpeg -s' when the -skip option was introduced
in libjpeg-turbo 1.5 beta.

The IJG unnecessarily required 'jpegtran -copy' to be abbreviated as
'jpegtran -co' rather than 'jpegtran -c' when the -crop option was
introduced in jpeg-7.

The IJG unnecessarily required 'jpegtran -progressive' to be abbreviated
as 'jpegtran -pr' rather than 'jpegtran -p' when the -perfect option was
introduced in jpeg-7.
2024-08-19 17:12:33 -04:00
DRC
562ad7612e OSS-Fuzz: More MSan fixes
We need to use tj3Alloc() (which, when ZERO_BUFFERS is defined, calls
calloc() instead of malloc()) to allocate all destination buffers.
Otherwise, if the compression/decompression/transform operation fails,
then the buffer checksum (which is computed to prevent the compiler from
optimizing out the whole test, since the destination buffer is never
used otherwise) will depend upon values in the destination buffer that
were never written, and MSan will complain.
2024-08-19 10:06:59 -04:00
DRC
488d42a8a5 OSS-Fuzz: Define ZERO_BUFFERS for MSan build
... and use tj3Alloc() to allocate compression/transformation
destination buffers.
2024-08-16 12:17:17 -04:00
DRC
0c23b0ad60 Various doc tweaks
- "Optimized baseline entropy coding" = "Huffman table optimization"

  "Optimized baseline entropy coding" was meant to emphasize that the
  feature is only useful when generating baseline (single-scan lossy
  8-bit-per-sample Huffman-coded) JPEG images, because it is
  automatically enabled when generating Huffman-coded progressive
  (multi-scan), 12-bit-per-sample, and lossless JPEG images.  However,
  Huffman table optimization isn't actually an integral part of those
  non-baseline modes.  You can forego Huffman table optimization with
  12-bit data precision if you supply your own Huffman tables.  The spec
  doesn't require it with progressive or lossless mode, either, although
  our implementation does.  Furthermore, "baseline" describes more than
  just the type of entropy coding used.  It was incorrect to say that
  optimized "baseline" entropy coding is automatically enabled for
  Huffman-coded progressive, 12-bit-per-sample, and lossless JPEG
  images, since those are clearly not baseline images.

- "Progressive entropy coding" = "Progressive JPEG"

  "Progressive" describes more than just the type of entropy coding
  used.  (In fact, both Huffman-coded and arithmetic-coded images can be
  progressive.)

- Mention that TJPARAM_OPTIMIZE/TJ.PARAM_OPTIMIZE can be used with
  lossless transformation as well.

- General wordsmithing

- Formatting tweaks
2024-08-16 11:49:00 -04:00
DRC
b4336c3afb Work around valgrind/MSan SIMD false positives
Referring to
https://sourceforge.net/p/libjpeg-turbo/bugs/48,
https://sourceforge.net/p/libjpeg-turbo/bugs/82,
 #15, #238, #253, and #619,
valgrind and MSan have failed to properly detect data initialization by
libjpeg-turbo's x86 SIMD extensions for the entire 14 years that
libjpeg-turbo has been a project, resulting in false positives unless
libjpeg-turbo is built with WITH_SIMD=0 or run with JSIMD_FORCENONE=1.
This commit introduces a new C preprocessor macro (ZERO_BUFFERS) that,
if set, causes libjpeg-turbo to zero certain buffers in order to work
around the specific valgrind/MSan test failures caused by the
aforementioned false positives.  This allows us to more closely
approximate the production configuration of libjpeg-turbo when testing
with valgrind or MSan.

Closes #781
2024-08-13 16:23:40 -04:00
DRC
8db0312668 example.c: Write correct dimensions to PPM header
The dimensions in the PPM header of the output file generated by
'example decompress' were always 640 x 480, regardless of the size of
the JPEG image being decompressed.  Our regression tests (which this
commit also fixes) missed the bug because they decompressed the
640 x 480 image generated by 'example compress'.

Fixes #778
2024-08-02 09:06:40 -04:00
DRC
0566d51e09 GitHub Actions: Specify Monterey for macOS build
The Big Sur hosted runner is no longer available.
2024-07-09 17:18:53 -04:00
DRC
e287a35762 Build: Put gastest.o in ${CMAKE_BINARY_DIR}/simd
Closes #776
2024-07-04 10:03:50 -04:00
DRC
51d021bf01 TurboJPEG: Fix 12-bit-per-sample arith-coded compr
(Regression introduced by 7bb958b732)

Because of 7bb958b732, the TurboJPEG
compression and encoding functions no longer transfer the value of
TJPARAM_OPTIMIZE into cinfo->data_precision unless the data precision
is 8.  The intent of that was to prevent using_std_huff_tables() from
being called more than once when reusing the same compressor object to
generate multiple 12-bit-per-sample JPEG images.  However, because
cinfo->optimize_coding is always set to TRUE by jpeg_set_defaults() if
the data precision is 12, calling applications that use 12-bit data
precision had to unset cinfo->optimize_coding if they set
cinfo->arith_code after calling jpeg_set_defaults().  Because of
7bb958b732, the TurboJPEG API stopped
doing that except with 8-bit data precision.  Thus, attempting to
generate a 12-bit-per-sample arithmetic-coded lossy JPEG image using
the TurboJPEG API failed with "Requested features are incompatible."

Since the compressor will always fail if cinfo->arith_code and
cinfo->optimize_coding are both set, and since cinfo->optimize_coding
has no relevance for arithmetic coding, the most robust and user-proof
solution is for jinit_c_master_control() to set cinfo->optimize_coding
to FALSE if cinfo->arith_code is TRUE.

This commit also:
- modifies TJBench so that it no longer reports that it is using
  optimized baseline entropy coding in modes where that setting
  is irrelevant,
- amends the cjpeg documentation to clarify that -optimize is implied
  when specifying -progressive or '-precision 12' without -arithmetic,
  and
- prevents jpeg_set_defaults() from uselessly checking the value of
  cinfo->arith_code immediately after it has been set to FALSE.
2024-06-24 22:15:55 -04:00
DRC
bb3ab53157 cjpeg: Don't enable lossless until precision known
jpeg_enable_lossless() checks the point transform value against the data
precision, so we need to defer calling jpeg_enable_lossless() until
after all command-line options have been parsed.
2024-06-24 22:15:55 -04:00
DRC
a8aaaf5d5b cjpeg -verbose: Print PBMPLUS input file precision
This gives the user a hint as to whether converting the input file into
a JPEG file with a particular data precision will result in a loss of
precision.  Also modify usage.txt and the cjpeg man page to warn the
user about that possibility.
2024-06-24 22:15:55 -04:00
DRC
94c64ead85 Various doc tweaks
- "bits per component" = "bits per sample"

  Describing the data precision of a JPEG image using "bits per
  component" is technically correct, but "bits per sample" is the
  terminology that the JPEG-1 spec uses.  Also, "bits per component" is
  more commonly used to describe the precision of packed-pixel formats
  (as opposed to "bits per pixel") rather than planar formats, in which
  all components are grouped together.

- Unmention legacy display technologies.  Colormapped and monochrome
  displays aren't a thing anymore, and even when they were still a
  thing, it was possible to display full-color images to them.  In 1991,
  when JPEG decompression time was measured in minutes per megapixel, it
  made sense to keep a decompressed copy of JPEG images on disk, in a
  format that could be displayed without further color conversion (since
  color conversion was slow and memory-intensive.)  In 2024, JPEG
  decompression time is measured in milliseconds per megapixel, and
  color conversion is even faster.  Thus, JPEG images can be
  decompressed, displayed, and color-converted (if necessary) "on the
  fly" at speeds too fast for human vision to perceive.  (In fact, your
  TV performs much more complicated decompression algorithms at least 60
  times per second.)

- Document that color quantization (and associated features), GIF
  input/output, Targa input/output, and OS/2 BMP input/output are legacy
  features.  Legacy status doesn't necessarily mean that the features
  are deprecated.  Rather, it is meant to discourage users from using
  features that may be of little or no benefit on modern machines (such
  as low-quality modes that had significant performance advantages in
  the early 1990s but no longer do) and that are maintained on a
  break/fix basis only.

- General wordsmithing, grammar/punctuation policing, and formatting
  tweaks

- Clarify which data precisions each cjpeg input format and each djpeg
  output format supports.

- cjpeg.1: Remove unnecessary and impolitic statement about the -targa
  switch.

- Adjust or remove performance claims to reflect the fact that:
  * On modern machines, the djpeg "-fast" switch has a negligible effect
    on performance.
  * There is a measurable difference between the performance of Floyd-
    Steinberg dithering and no dithering, but it is not likely
    perceptible to most users.
  * There is a measurable difference between the performance of 1-pass
    and 2-pass color quantization, but it is not likely perceptible to
    most users.
  * There is a measurable difference between the performance of
    full-color and grayscale output when decompressing a full-color JPEG
    image, but it is not likely perceptible to most users.
  * IDCT scaling does not necessarily improve performance.  (It
    generally does if the scaling factor is <= 1/2 and generally doesn't
    if the scaling factor is > 1/2, at least on my machine.  The
    performance claim made in jpeg-6b was probably invalidated when we
    merged the additional scaling factors from jpeg-7.)

- Clarify which djpeg switches/output formats cannot be used when
  decompressing lossless JPEG images.

- Remove djpeg hints, since those involve quality vs. speed tradeoffs
  that are no longer relevant for modern machines.

- Remove documentation regarding using color quantization with 16-bit
  data precision.  (Color quantization requires lossy mode.)

- Java: Fix typos in TJDecompressor.decompress12() and
  TJDecompressor.decompress16() documentation.

- jpegtran.1: Fix truncated paragraph

  In a man page, a single quote at the start of a line is interpreted as
  a macro.

  Closes #775

- libjpeg.txt:
  * Mention J16SAMPLE data type (oversight.)
  * Remove statement about extending jdcolor.c.  (libjpeg-turbo is not
    quite as DIY as libjpeg once was.)
  * Remove paragraph about tweaking the various typedefs in jmorecfg.h.
    It is no longer relevant for modern machines.
  * Remove caveat regarding systems with ints less than 16 bits wide.
    (ANSI/ISO C requires an int to be at least 16 bits wide, and
    libjpeg-turbo has never supported non-ANSI compilers.)

- usage.txt:
  * Add copyright header.
  * Document cjpeg -icc, -memdst, -report, -strict, and -version
    switches.
  * Document djpeg -icc, -maxscans, -memsrc, -report, -skip, -crop,
    -strict, and -version switches.
  * Document jpegtran -icc, -maxscans, -report, -strict, and -version
    switches.
2024-06-24 22:11:43 -04:00
DRC
2c5312fd12 cderror.h: Always include all img I/O err messages
The 8-bit-per-sample image I/O modules have always been built with BMP,
GIF, PBMPLUS, and Targa support, and the 12-bit-per-sample image I/O
modules have always been built with only GIF and PBMPLUS support.  In
libjpeg-turbo 2.1.x and prior, cjpeg and djpeg were built with the same
image format support as the image I/O modules.  However, because of the
multi-precision feature introduced in libjpeg-turbo 3.0.x, cjpeg and
djpeg are now always built with support for all image formats.  Thus,
the error message table compiled into cjpeg and djpeg was a superset of
the error message table compiled into the 12-bit-per-sample and
16-bit-per-sample image I/O modules.  If, for example, the
12-bit-per-sample PPM writer threw JERR_PPM_COLORSPACE (== 15, because
it was built with only GIF and PBMPLUS support), then djpeg interpreted
that as JERR_GIF_COLORSPACE (== 15, because it was built with support
for all image formats.)  There was no chance of a buffer overrun, since
the error message table lookup was performed against the table compiled
into cjpeg and djpeg, which contained all possible entries.  However,
this issue caused an incorrect error message to be displayed by cjpeg or
djpeg if a 12-bit-per-sample or 16-bit-per-sample image I/O module threw
an error.  This commit simply removes the *_SUPPORTED #ifdefs from
cderror.h so that all image I/O error messages are always included in
every instance of the image I/O error message table.

Note that a similar issue could have also occurred with the
12-bit-per-sample and 16-bit-per-sample TurboJPEG image I/O functions,
since the 12-bit-per-sample and 16-bit-per-sample image I/O modules
supporting those functions are built with only PBMPLUS support whereas
the library as a whole is built with BMP and PBMPLUS support.
2024-06-13 22:34:23 -04:00
DRC
3c17063ef1 Guard against dupe SOF w/ incorrect source manager
Referring to https://bugzilla.mozilla.org/show_bug.cgi?id=1898606,
attempting to decompress a specially-crafted malformed JPEG image
(specifically an image with a complete 12-bit Start Of Frame segment
followed by an incomplete 8-bit Start Of Frame segment) using the
default marker processor, buffered-image mode, and input prefetching
triggered the following sequence of events:

- When the 12-bit SOF segment was encountered (in the body of
  jpeg_read_header()), the marker processor's read_markers() method
  called the get_sof() function, which processed the 12-bit SOF segment
  and set cinfo->data_precision to 12.

- If the application subsequently called jpeg_consume_input() in a loop
  to prefetch input data, and it didn't stop calling
  jpeg_consume_input() when the function returned JPEG_REACHED_SOS, then
  the 8-bit SOF segment was encountered in the body of
  jpeg_consume_input().  As a result, the marker processor's
  read_markers() method called get_sof(), which started to process the
  8-bit SOF segment and set cinfo->data_precision to 8.

- Since the 8-bit SOF segment was incomplete, the end of the JPEG data
  stream was encountered when get_sof() attempted to read the image
  height, width, and number of components.

- If the fill_input_buffer() method in the application's custom source
  manager incorrectly returned FALSE in response to a prematurely-
  terminated JPEG data stream, then get_sof() returned FALSE while
  attempting to read the image height, width, and number of components
  (before the duplicate SOF check was reached.)  That caused the default
  marker processor's read_markers() method, and subsequently
  jpeg_consume_input(), to return JPEG_SUSPENDED.

- If the application failed to respond to the JPEG_SUSPENDED return
  value and subsequently attempted to call jpeg_read_scanlines(),
  then the data precision check in jpeg_read_scanlines() succeeded
  (because cinfo->data_precision was now 8.)  However, because
  cinfo->data_precision had been 12 during the previous call to
  jpeg_start_decompress(), only the 12-bit version of the main
  controller was initialized, and the cinfo->main->process_data() method
  was undefined.  Thus, a segfault occurred when jpeg_read_scanlines()
  attempted to invoke that method.

Scenarios in which the issue was thwarted:

1. The default source managers handle a prematurely-terminated JPEG data
stream by inserting a fake EOI marker into the data stream.  Thus, when
using one of those source managers, the INPUT_2BYTES() and INPUT_BYTE()
macros (which get_sof() invokes to read the image height, width, and
number of components) succeeded-- albeit with bogus data, since the fake
EOI marker was read into those fields.  The duplicate SOF check in
get_sof() then failed, generating a fatal libjpeg error.

2. When using a custom source manager that correctly returns TRUE in
response to a prematurely-terminated JPEG data stream, the
aforementioned INPUT_2BYTES() and INPUT_BYTE() macros also succeeded
(albeit with bogus data read from the previous bytes of the data
stream), and the duplicate SOF check failed.

3. If the application did not prefetch input data, or if it stopped
invoking jpeg_consume_input() when the function returned
JPEG_REACHED_SOS, then the duplicate SOF segment was not read prior to
the first call to jpeg_read_scanlines().  Thus, the data precision check
in jpeg_read_scanlines() failed.  If the application instead called
jpeg12_read_scanlines() (that is, if it properly supported multiple data
precisions), then the duplicate SOF segment was not read until the body
of jpeg_finish_decompress().  At that point, its only negative effect
was to cause jpeg_finish_decompress() to return FALSE before the
duplicate SOF check was reached.

In other words, this issue depended not only upon an incorrectly-written
source manager but also upon a very specific sequence of API calls.  It
also depended upon the multi-precision feature introduced in
libjpeg-turbo 3.0.x.  When using an 8-bit-per-sample build of
libjpeg-turbo 2.1.x, jpeg_read_header() failed with "Unsupported JPEG
data precision 12" after the 12-bit SOF segment was processed.  When
using a 12-bit-per-sample build of libjpeg-turbo 2.1.x, the behavior
was the same as if the application called jpeg12_read_scanlines() in
Scenario 3 above.

This commit simply moves the duplicate SOF check to the top of
get_sof() so the check will fail before the marker processor attempts to
read the duplicate SOF.  It should be noted that this issue isn't a
libjpeg-turbo bug per se, because it occurs only when the calling
application does something it shouldn't.  It is, rather, an issue of API
hardening/caller-proofing.
2024-05-29 10:08:24 -04:00
DRC
46173635be Remove jpeg_std_message_table[] extern sym comment
jpeg_std_message_table[] was never a documented part of the libjpeg API,
nor was it exposed in jpegint.h or the Windows libjpeg API DLL.  Thus,
it was never a good idea (nor was it even remotely necessary) for
applications to use it.  However, referring to #763 and #767, one
application (RawTherapee) did use it.
34c055851e hid the symbol, which broke the
Un*x builds of RawTherapee.  (RawTherapee already did the right thing on
Windows, because jpeg_std_message_table[] was not exposed in the
Windows libjpeg API DLL.  Referring to
https://github.com/Beep6581/RawTherapee/issues/7074, RawTherapee now
does the right thing on Un*x platforms as well.)

The comment in jerror.c indicated that Tom Lane intended the symbol to
be external "just in case any applications want to refer to it
directly."  However, with respect to libjpeg-turbo, the comment was
effectively already a lie on Windows.  My choices at this point are:

1. Revert 34c055851e and start exposing
   the symbol on Windows, thus making the comment true again.
2. Remove the comment.

Option 1 would have created whiplash, since 3.0.3 has already been
released with the symbol hidden, and that option would have further
encouraged ill-advised behavior on the part of applications.  Since the
issue has already been fixed in RawTherapee, and since it is known to
affect only that application, I chose Option 2.

In my professional opinion, a code comment does not rise to the level
of a contract between a library and a calling application.  In other
words, the comment did not make the symbol part of the API, even though
the symbol was exposed on some platforms.  Applications that use
internal symbols (even the symbols defined in jpegint.h) do so at their
own risk.  There is no guarantee that those symbols will remain
unchanged from release to release, as it is sometimes necessary to
modify the internal (opaque) structures and methods in order to
implement new features or even fix bugs.  Some implementations of the
libjpeg API (such as the one in Jpegli, for instance), do not provide
those internal symbols at all.  Best practices are for applications that
use the internal libjpeg-turbo symbols to provide their own copy of
libjpeg-turbo (for instance, via static linking), so they can manage any
unexpected changes in those symbols.  (In fact, that is how libjpeg was
originally used.  Application developers built it from source with
compile-time options to exclude unneeded functionality.  Thus, no two
builds of libjpeg were guaranteed to be API/ABI-compatible.  The idea of
a libjpeg shared library that exposes a pseudo-standard ABI came later,
and that has always been an awkward fit for an API that was intended to
be modified at compile time to suit specific application needs.
libjpeg-turbo's colorspace extensions are but one example, among many,
of our attempts to reconcile that awkwardness while preserving backward
compatibility with the aforementioned pseudo-standard ABI.)
2024-05-29 00:18:05 -04:00
DRC
9ddcae4a8f jpeglib.h: Document the need to include stdio.h
libjpeg.txt documents the need to "include system headers that define at
least the typedefs FILE and size_t" before including jpeglib.h.
However, some software developers unfortunately assume that any
downstream build failure is due to an upstream oversight (as if such an
oversight would have gone unnoticed for 14 years in a library as
ubiquitous as libjpeg-turbo) and "come in hot" with a proposal and
arguments that have already been carefully considered and rejected
multiple times (as opposed to grepping the documentation or searching
existing issues and PRs to find out whether the upstream behavior is
intentional.)

Multiple issues and PRs (including #17, #18, #116, #737, and #766) have
been filed regarding this, so this commit adds a comment to jpeglib.h in
the hope of heading off future issues and PRs.  I suspect that some
people will still choose to waste my time by arguing about it, even
though the decision was made by Tom Lane nearly 20 years before
libjpeg-turbo even existed, the decision was supportable based on the
needs of early 1990s computing systems, and reversing that decision
would break backward API compatibility (which is one of the reasons that
many downstream projects adopted libjpeg-turbo in the first place.)  At
least now, if someone posts an issue or PR about this again, I can just
link to this commit and close the issue/PR without comment.
2024-05-20 13:03:49 -04:00
DRC
bc491b16e2 ChangeLog.md: Document previous commit 2024-05-16 17:32:02 -04:00
DRC
0fc7313e54 Don't traverse linked list when saving a marker
If the calling application invokes jpeg_save_markers() to save a
particular type of marker, then the save_marker() function will be
invoked for every marker of that type that is encountered.  Previously,
only the head of the marker linked list was stored (in
jpeg_decompress_struct), so save_marker() had to traverse the entire
linked list before it could add a new marker to the tail of the list.
That caused CPU usage to grow exponentially with the number of markers.
Referring to #764, it is possible to create a JPEG image that contains
an excessive number of markers.  The specific reproducer that uncovered
this issue is a specially-crafted 1-megabyte malformed JPEG image with
tens of thousands of APP1 markers, which required approximately 30
seconds of CPU time (on a modern Intel processor) to process.  However,
it should also be possible to create a legitimate JPEG image that
reproduces the issue (such as an image with tens of thousands of
duplicate EXIF tags.)

This commit introduces a new pointer (in jpeg_decomp_master, in order to
preserve backward ABI compatibility) that is used to store the tail of
the marker linked list whenever a marker is added to it.  Thus, it is no
longer necessary to traverse the list when adding a marker, and CPU
usage will grow linearly rather than exponentially with the number of
markers.

Fixes #764
2024-05-14 14:46:33 -04:00
DRC
7fa4b5b762 jerror.c: Silence MSan uninitialized value warning
If an error manager instance is passed to jpeg_std_error(), then its
format_message() method will point to the format_message() function in
jerror.c.  The format_message() function passes all eight values from
the jpeg_error_mgr::msg_parm.i[] array as arguments to
snprintf()/_snprintf_s(), even if the format string doesn't use all of
those values.  Subsequently invoking one of the ERREXIT[1-6]() macros
will leave the unused values uninitialized, and if the
-fsanitize-memory-param-retval option (introduced in Clang 14) is
enabled (which it is by default in Clang 16 and later), then MSan will
complain when the format_message() function tries to pass the
uninitialized-but-unused values as function arguments.

This commit modifies jpeg_std_error() so that it zeroes out the error
manager instance passed to it, thus working around the warning as well
as simplifying the code.

Closes #761
2024-05-06 18:24:15 -04:00
DRC
3f43bb3320 Build: Don't use COMPONENT w/install(INCLUDES ...)
(Regression introduced by 24e09baaf0)

The install() INCLUDES option is not an artifact option.  It specifies
a list of directories that will be added to the
INTERFACE_INCLUDE_DIRECTORIES target property when the target is
exported using the install() EXPORT option, which occurs when CMake
package config files are generated.  Specifying 'COMPONENT include' with
the install() INCLUDES option caused the INTERFACE_INCLUDE_DIRECTORIES
properties in our CMake package config files to contain
'${_IMPORT_PREFIX}/COMPONENT', which caused errors of the form 'Imported
target "libjpeg-turbo::XXXX" includes non-existent path' when downstream
build systems attempted to include libjpeg-turbo using find_package().

Fixes #759
Closes #760
2024-05-06 13:38:59 -04:00
Kleis Auke Wolthuizen
24e09baaf0 Build: Add COMPONENT to all install() commands
This makes it possible for downstream packagers and other integrators of
libjpeg-turbo to include only specific directories from the
libjpeg-turbo installation (or to install specific directories under a
different prefix, etc.)  The names of the components correspond to the
directories into which they will be installed.

Refer to libvips/libvips#3931, #265, #338

Closes #756
2024-05-04 14:36:14 -04:00
DRC
6a522fcda4 jpegtran -drop: Ensure all quant tables defined
It is possible to craft a malformed JPEG image in which all of the
scans contain fewer components than the number of components specified
in the Start Of Frame (SOF) segment.  Attempting to use such an image as
either an input image or a drop image with 'jpegtran -drop' caused a
NULL dereference and subsequent segfault in transupp.c:adjust_quant(),
so this commit adds appropriate checks to guard against that.

Since the issue involved an interface that is only exposed on the
jpegtran command line, it did not represent a security risk.
'jpegtran -drop' could not ever be used successfully with images such as
the ones described above.  This commit simply makes jpegtran fail
gracefully rather than crash.

Fixes #758
2024-05-02 14:54:36 -04:00
DRC
2dfe6c0fe9 CI: Work around segfaults in ASan/MSan jobs
Referring to actions/runner-images#9491, the sanitizers in LLVM 14 that
ships with Ubuntu 22.04 are incompatible with high-entropy address space
layout randomization (ASLR), which is enabled in the GitHub runners via
their use of a newer kernel than ubuntu 22.04 uses.
2024-03-18 14:51:04 -04:00
DRC
710865cf0e Build: Don't explicitly set CMP0065 to NEW
This is no longer necessary, because of
1644bdb7d2.
2024-03-18 12:56:42 -04:00
DRC
fe218ca19c Build: Handle CMAKE_C_COMPILER_ID=AppleClang
Because of 1644bdb7d2, we are now
effectively using the NEW behavior for all CMake policies introduced in
all CMake versions up to and including CMake 3.28.  The NEW behavior for
CMP0025, introduced in CMake 3.0, sets CMAKE_C_COMPILER_ID to
"AppleClang" instead of "Clang" when using Apple's variant of Clang (in
Xcode), so we need to match all values of CMAKE_C_COMPILER_ID that
contain "Clang".

This fixes three issues:

- -O2 was not replaced with -O3 in CMAKE_C_FLAGS_RELWITHDEBINFO.  This
  was a minor issue, since -O3 is now the default in
  CMAKE_C_FLAGS_RELEASE, and we use CMAKE_BUILD_TYPE=Release in our
  official builds.

- The build system erroneously set the default value of FLOATTEST8 and
  FLOATTEST12 to no-fp-contract when compiling for PowerPC or Arm using
  Apple Clang 14+ (effectively reverting
  5b2beb4bc4f41dd9dd2a905cb931b8d5054d909b.)  Because Clang 14+ now
  enables -ffp-contract=on by default, this issue caused floating point
  test failures unless FLOATTEST8 and FLOATTEST12 were overridden.

- The build system set MD5_PPM_3x2_FLOAT_FP_CONTRACT as appropriate for
  GCC, not as appropriate for Clang (effectively reverting
  47656a082091f9c9efda054674522513f4768c6c.)  This also caused floating
  point test failures.

Fixes #753
Closes #755
2024-03-18 12:46:11 -04:00
DRC
dfde1f857d Fix (and test) more Clang 14 compiler warnings
-Woverlength-strings, -Wshift-negative-value, -Wsign-compare
2024-03-08 12:50:32 -05:00
DRC
905ec0fa01 Avoid tautological comparisons
Several TurboJPEG functions store their return value in an unsigned
long long intermediate and compare it against the maximum value of
unsigned long or size_t in order to avoid integer overflow.  However,
such comparisons are tautological (always true, i.e. redundant) unless
the size of unsigned long or size_t is less than the size of unsigned
long long.  Explicitly guarding the comparisons with #if avoids compiler
warnings with -Wtautological-constant-in-range-compare in Clang and also
makes it clear to the reader that the comparisons are only intended for
32-bit code.

Refer to #752
2024-03-08 11:52:38 -05:00
DRC
34c055851e Fix warnings with -Wmissing-variable-declarations 2024-03-06 15:17:08 -05:00
DRC
7bb958b732 12-bit: Don't gen opt Huff tbls if tbls supplied
(regression introduced by e8b40f3c2b)

The documented behavior of the libjpeg API is to compute optimal Huffman
tables when generating 12-bit lossy Huffman-coded JPEG images, unless
the calling application supplies its own Huffman tables.  However,
e8b40f3c2b and
96bc40c1b3 modified
jinit_c_master_control() so that it always set cinfo->optimize_coding to
TRUE when generarating 12-bit lossy Huffman-coded JPEG images, which
prevented calling applications from supplying custom Huffman tables for
such images.

This commit modifies jinit_c_master_control() so that it only overrides
cinfo->optimize_coding when generating 12-bit lossy Huffman-coded JPEG
images if all Huffman table slots are empty or all slots contain default
Huffman tables.  Determining whether the latter is true requires using
memcmp() to compare the allocated Huffman tables with the default
Huffman tables, because:

- The documented behavior of jpeg_set_defaults() is to initialize any
  empty Huffman table slot with the default Huffman table corresponding
  to that slot, regardless of the data precision.  There is also no
  requirement that the data precision be specified prior to calling
  jpeg_set_defaults().  Thus, there is no reliable way to prevent
  jpeg_set_defaults() from initializing empty Huffman table slots with
  default Huffman tables, which are useless for 12-bit data precision.

- There is no requirement that custom Huffman tables be defined prior to
  calling jpeg_set_defaults().  A calling application could call
  jpeg_set_defaults() and modify the values in the default Huffman
  tables rather than allocating new tables.  Thus, there is no reliable
  way to detect whether the allocated Huffman tables contain default
  values without comparing the tables with the default Huffman tables.

Fortunately, comparing the allocated Huffman tables with the default
Huffman tables is the last stop on the logic train, so it won't happen
unless cinfo->data_precision == 12, cinfo->arith_code == FALSE,
cinfo->optimize_coding == FALSE, and one or more Huffman tables are
allocated.  (If the compressor object is reused, this ensures that the
full comparison will be performed at most once.)  Custom Huffman tables
will be flagged as non-default when the first non-default value is
encountered, and the worst case (comparing 400 bytes) is very fast on
modern CPUs anyhow.

Fixes #751
2024-03-04 17:45:40 -05:00
DRC
3202feb08a x86-64 SIMD: Support CET if C compiler enables it
- Detect at configure time, via the __CET__ C preprocessor macro,
  whether the C compiler will include either indirect branch tracking
  (IBT) or shadow stack support, and define a NASM macro (__CET__) if
  so.

- Modify the x86-64 SIMD code so that it includes appropriate endbr64
  instructions (to support IBT) and an appropriate .note.gnu.property
  section (to support both IBT and shadow stack) when __CET__ is
  defined.

Closes #350
2024-02-29 16:37:30 -05:00
DRC
1335547558 x86 SIMD: Capitalize all instruction-like macros
(to improve code readability)
2024-02-29 12:18:49 -05:00
DRC
26fc07c8d1 Build: Set MSVC run-time lib based on IDE config 2024-02-08 12:03:37 -05:00
Alyssa Ross
b6ee1016ab Build: Fix tests w/ emulators that don't check CWD
While QEMU will run executables from the current working directory,
other emulators may not.  It is more reliable to pass the full
executable path to the emulator.  The add_test(NAME ... COMMAND ...)
syntax automatically invokes the emulator (e.g. the command specified
in CMAKE_CROSSCOMPILING_EMULATOR) and passes the full executable path to
it, as long as the first COMMAND argument is the name of a target.  This
cleans up the CMake code somewhat as well, since it is no longer
necessary to manually invoke CMAKE_CROSSCOMPILING_EMULATOR.

Closes #747
2024-01-30 18:26:21 -05:00
DRC
d59b1a3bce Build: Reformat lines longer than 80 columns ...
... to ensure that no function argument starts beyond the 80th column.
2024-01-30 15:40:51 -05:00
DRC
36c51dd3eb GitHub: Update checkout, AWS credentials actions
... to silence deprecation warning regarding Node.js 12 and 16 actions.
2024-01-26 15:55:19 -05:00