Commit Graph

290 Commits

Author SHA1 Message Date
DRC
eb75363004 Update URLs
- Eliminate unnecessary "www."
- Use HTTPS.
- Update Java, MSYS, tdm-gcc, and NSIS URLs.
- Update URL and title of Agner Fog's assembly language optimization
  manual.
- Remove extraneous information about MASM and Borland Turbo Assembler
  and outdated NASM URLs from the x86 assembly headers, and mention
  Yasm.
2024-08-31 16:50:08 -04:00
DRC
8d76e4e550 Doc: "EXIF" = "Exif" 2024-08-31 15:33:55 -04:00
DRC
9983840eb6 TJ/xform: Check crop region against dest. image
Lossless cropping is performed after other lossless transform
operations, so the cropping region must be specified relative to the
destination image dimensions and level of chrominance subsampling, not
the source image dimensions and level of chrominance subsampling.

More specifically, if the lossless transform operation swaps the X and Y
axes, or if the image is converted to grayscale, then that changes the
cropping region requirements.
2024-08-31 15:04:30 -04:00
DRC
8456d2b98c Doc: "MCU block" = "iMCU" or "MCU"
The JPEG-1 spec never uses the term "MCU block".  That term is rarely
used in other literature to describe the equivalent of an MCU in an
interleaved JPEG image, but the libjpeg documentation uses "iMCU" to
describe the same thing.  "iMCU" is a better term, since the equivalent
of an interleaved MCU can contain multiple DCT blocks (or samples in
lossless mode) that are only grouped together if the image is
interleaved.

In the case of restart markers, "MCU block" was used in the libjpeg
documentation instead of "MCU", but "MCU" is more accurate and less
confusing.  (The restart interval is literally in MCUs, where one MCU
is one data unit in a non-interleaved JPEG image and multiple data units
in a multi-component interleaved JPEG image.)

In the case of 9b704f96b2, the issue was
actually with progressive JPEG images exactly two DCT blocks wide, not
two MCU blocks wide.

This commit also defines "MCU" and "MCU row" in the description of the
various restart marker options/parameters.  Although an MCU row is
technically always a row of samples in lossless mode, "sample row" was
confusing, since it is used in other places to describe a row of samples
for a single component (whereas an MCU row in a typical lossless JPEG
image consists of a row of interleaved samples for all components.)
2024-08-30 14:16:09 -04:00
DRC
4851cbe406 djpeg/jpeg_crop_scanline(): Disallow crop vals < 0
Because the crop spec was parsed using unsigned 32-bit integers,
negative numbers were interpreted as values ~= UINT_MAX (4,294,967,295).
This had the following ramifications:

- If the cropping region width was negative and the adjusted width + the
  adjusted left boundary was greater than 0, then the 32-bit unsigned
  integer bounds checks in djpeg and jpeg_crop_scanline() overflowed and
  failed to detect the out-of-bounds width, jpeg_crop_scanline() set
  cinfo->output_width to a value ~= UINT_MAX, and a buffer overrun and
  subsequent segfault occurred in the upsampling or color conversion
  routine.  The segfault occurred in the body of
  jpeg_skip_scanlines() --> read_and_discard_scanlines() if the cropping
  region upper boundary was greater than 0 and the JPEG image used
  chrominance subsampling and in the body of jpeg_read_scanlines()
  otherwise.

- If the cropping region width was negative and the adjusted width + the
  adjusted left boundary was 0, then a zero-width output image was
  generated.

- If the cropping region left boundary was negative, then an output
  image with bogus data was generated.

This commit modifies djpeg and jpeg_crop_scanline() so that the
aforementioned bounds checks use 64-bit unsigned integers, thus guarding
against overflow.  It similarly modifies jpeg_skip_scanlines().  In the
case of jpeg_skip_scanlines(), the issue was not reproducible with
djpeg, but passing a negative number of lines to jpeg_skip_scanlines()
caused a similar overflow if the number of lines +
cinfo->output_scanline was greater than 0.  That caused
jpeg_skip_scanlines() to read past the end of the JPEG image, throw a
warning ("Corrupt JPEG data: premature end of data segment"), and fail
to return unless warnings were treated as fatal.  Also, djpeg now parses
the crop spec using signed integers and checks for negative values.
2024-08-26 16:24:33 -04:00
DRC
de4bbac55e TJCompressor.compress(): Fix lossls buf size calc 2024-08-23 12:48:34 -04:00
DRC
0c23b0ad60 Various doc tweaks
- "Optimized baseline entropy coding" = "Huffman table optimization"

  "Optimized baseline entropy coding" was meant to emphasize that the
  feature is only useful when generating baseline (single-scan lossy
  8-bit-per-sample Huffman-coded) JPEG images, because it is
  automatically enabled when generating Huffman-coded progressive
  (multi-scan), 12-bit-per-sample, and lossless JPEG images.  However,
  Huffman table optimization isn't actually an integral part of those
  non-baseline modes.  You can forego Huffman table optimization with
  12-bit data precision if you supply your own Huffman tables.  The spec
  doesn't require it with progressive or lossless mode, either, although
  our implementation does.  Furthermore, "baseline" describes more than
  just the type of entropy coding used.  It was incorrect to say that
  optimized "baseline" entropy coding is automatically enabled for
  Huffman-coded progressive, 12-bit-per-sample, and lossless JPEG
  images, since those are clearly not baseline images.

- "Progressive entropy coding" = "Progressive JPEG"

  "Progressive" describes more than just the type of entropy coding
  used.  (In fact, both Huffman-coded and arithmetic-coded images can be
  progressive.)

- Mention that TJPARAM_OPTIMIZE/TJ.PARAM_OPTIMIZE can be used with
  lossless transformation as well.

- General wordsmithing

- Formatting tweaks
2024-08-16 11:49:00 -04:00
DRC
51d021bf01 TurboJPEG: Fix 12-bit-per-sample arith-coded compr
(Regression introduced by 7bb958b732)

Because of 7bb958b732, the TurboJPEG
compression and encoding functions no longer transfer the value of
TJPARAM_OPTIMIZE into cinfo->data_precision unless the data precision
is 8.  The intent of that was to prevent using_std_huff_tables() from
being called more than once when reusing the same compressor object to
generate multiple 12-bit-per-sample JPEG images.  However, because
cinfo->optimize_coding is always set to TRUE by jpeg_set_defaults() if
the data precision is 12, calling applications that use 12-bit data
precision had to unset cinfo->optimize_coding if they set
cinfo->arith_code after calling jpeg_set_defaults().  Because of
7bb958b732, the TurboJPEG API stopped
doing that except with 8-bit data precision.  Thus, attempting to
generate a 12-bit-per-sample arithmetic-coded lossy JPEG image using
the TurboJPEG API failed with "Requested features are incompatible."

Since the compressor will always fail if cinfo->arith_code and
cinfo->optimize_coding are both set, and since cinfo->optimize_coding
has no relevance for arithmetic coding, the most robust and user-proof
solution is for jinit_c_master_control() to set cinfo->optimize_coding
to FALSE if cinfo->arith_code is TRUE.

This commit also:
- modifies TJBench so that it no longer reports that it is using
  optimized baseline entropy coding in modes where that setting
  is irrelevant,
- amends the cjpeg documentation to clarify that -optimize is implied
  when specifying -progressive or '-precision 12' without -arithmetic,
  and
- prevents jpeg_set_defaults() from uselessly checking the value of
  cinfo->arith_code immediately after it has been set to FALSE.
2024-06-24 22:15:55 -04:00
DRC
bb3ab53157 cjpeg: Don't enable lossless until precision known
jpeg_enable_lossless() checks the point transform value against the data
precision, so we need to defer calling jpeg_enable_lossless() until
after all command-line options have been parsed.
2024-06-24 22:15:55 -04:00
DRC
94c64ead85 Various doc tweaks
- "bits per component" = "bits per sample"

  Describing the data precision of a JPEG image using "bits per
  component" is technically correct, but "bits per sample" is the
  terminology that the JPEG-1 spec uses.  Also, "bits per component" is
  more commonly used to describe the precision of packed-pixel formats
  (as opposed to "bits per pixel") rather than planar formats, in which
  all components are grouped together.

- Unmention legacy display technologies.  Colormapped and monochrome
  displays aren't a thing anymore, and even when they were still a
  thing, it was possible to display full-color images to them.  In 1991,
  when JPEG decompression time was measured in minutes per megapixel, it
  made sense to keep a decompressed copy of JPEG images on disk, in a
  format that could be displayed without further color conversion (since
  color conversion was slow and memory-intensive.)  In 2024, JPEG
  decompression time is measured in milliseconds per megapixel, and
  color conversion is even faster.  Thus, JPEG images can be
  decompressed, displayed, and color-converted (if necessary) "on the
  fly" at speeds too fast for human vision to perceive.  (In fact, your
  TV performs much more complicated decompression algorithms at least 60
  times per second.)

- Document that color quantization (and associated features), GIF
  input/output, Targa input/output, and OS/2 BMP input/output are legacy
  features.  Legacy status doesn't necessarily mean that the features
  are deprecated.  Rather, it is meant to discourage users from using
  features that may be of little or no benefit on modern machines (such
  as low-quality modes that had significant performance advantages in
  the early 1990s but no longer do) and that are maintained on a
  break/fix basis only.

- General wordsmithing, grammar/punctuation policing, and formatting
  tweaks

- Clarify which data precisions each cjpeg input format and each djpeg
  output format supports.

- cjpeg.1: Remove unnecessary and impolitic statement about the -targa
  switch.

- Adjust or remove performance claims to reflect the fact that:
  * On modern machines, the djpeg "-fast" switch has a negligible effect
    on performance.
  * There is a measurable difference between the performance of Floyd-
    Steinberg dithering and no dithering, but it is not likely
    perceptible to most users.
  * There is a measurable difference between the performance of 1-pass
    and 2-pass color quantization, but it is not likely perceptible to
    most users.
  * There is a measurable difference between the performance of
    full-color and grayscale output when decompressing a full-color JPEG
    image, but it is not likely perceptible to most users.
  * IDCT scaling does not necessarily improve performance.  (It
    generally does if the scaling factor is <= 1/2 and generally doesn't
    if the scaling factor is > 1/2, at least on my machine.  The
    performance claim made in jpeg-6b was probably invalidated when we
    merged the additional scaling factors from jpeg-7.)

- Clarify which djpeg switches/output formats cannot be used when
  decompressing lossless JPEG images.

- Remove djpeg hints, since those involve quality vs. speed tradeoffs
  that are no longer relevant for modern machines.

- Remove documentation regarding using color quantization with 16-bit
  data precision.  (Color quantization requires lossy mode.)

- Java: Fix typos in TJDecompressor.decompress12() and
  TJDecompressor.decompress16() documentation.

- jpegtran.1: Fix truncated paragraph

  In a man page, a single quote at the start of a line is interpreted as
  a macro.

  Closes #775

- libjpeg.txt:
  * Mention J16SAMPLE data type (oversight.)
  * Remove statement about extending jdcolor.c.  (libjpeg-turbo is not
    quite as DIY as libjpeg once was.)
  * Remove paragraph about tweaking the various typedefs in jmorecfg.h.
    It is no longer relevant for modern machines.
  * Remove caveat regarding systems with ints less than 16 bits wide.
    (ANSI/ISO C requires an int to be at least 16 bits wide, and
    libjpeg-turbo has never supported non-ANSI compilers.)

- usage.txt:
  * Add copyright header.
  * Document cjpeg -icc, -memdst, -report, -strict, and -version
    switches.
  * Document djpeg -icc, -maxscans, -memsrc, -report, -skip, -crop,
    -strict, and -version switches.
  * Document jpegtran -icc, -maxscans, -report, -strict, and -version
    switches.
2024-06-24 22:11:43 -04:00
DRC
3c17063ef1 Guard against dupe SOF w/ incorrect source manager
Referring to https://bugzilla.mozilla.org/show_bug.cgi?id=1898606,
attempting to decompress a specially-crafted malformed JPEG image
(specifically an image with a complete 12-bit Start Of Frame segment
followed by an incomplete 8-bit Start Of Frame segment) using the
default marker processor, buffered-image mode, and input prefetching
triggered the following sequence of events:

- When the 12-bit SOF segment was encountered (in the body of
  jpeg_read_header()), the marker processor's read_markers() method
  called the get_sof() function, which processed the 12-bit SOF segment
  and set cinfo->data_precision to 12.

- If the application subsequently called jpeg_consume_input() in a loop
  to prefetch input data, and it didn't stop calling
  jpeg_consume_input() when the function returned JPEG_REACHED_SOS, then
  the 8-bit SOF segment was encountered in the body of
  jpeg_consume_input().  As a result, the marker processor's
  read_markers() method called get_sof(), which started to process the
  8-bit SOF segment and set cinfo->data_precision to 8.

- Since the 8-bit SOF segment was incomplete, the end of the JPEG data
  stream was encountered when get_sof() attempted to read the image
  height, width, and number of components.

- If the fill_input_buffer() method in the application's custom source
  manager incorrectly returned FALSE in response to a prematurely-
  terminated JPEG data stream, then get_sof() returned FALSE while
  attempting to read the image height, width, and number of components
  (before the duplicate SOF check was reached.)  That caused the default
  marker processor's read_markers() method, and subsequently
  jpeg_consume_input(), to return JPEG_SUSPENDED.

- If the application failed to respond to the JPEG_SUSPENDED return
  value and subsequently attempted to call jpeg_read_scanlines(),
  then the data precision check in jpeg_read_scanlines() succeeded
  (because cinfo->data_precision was now 8.)  However, because
  cinfo->data_precision had been 12 during the previous call to
  jpeg_start_decompress(), only the 12-bit version of the main
  controller was initialized, and the cinfo->main->process_data() method
  was undefined.  Thus, a segfault occurred when jpeg_read_scanlines()
  attempted to invoke that method.

Scenarios in which the issue was thwarted:

1. The default source managers handle a prematurely-terminated JPEG data
stream by inserting a fake EOI marker into the data stream.  Thus, when
using one of those source managers, the INPUT_2BYTES() and INPUT_BYTE()
macros (which get_sof() invokes to read the image height, width, and
number of components) succeeded-- albeit with bogus data, since the fake
EOI marker was read into those fields.  The duplicate SOF check in
get_sof() then failed, generating a fatal libjpeg error.

2. When using a custom source manager that correctly returns TRUE in
response to a prematurely-terminated JPEG data stream, the
aforementioned INPUT_2BYTES() and INPUT_BYTE() macros also succeeded
(albeit with bogus data read from the previous bytes of the data
stream), and the duplicate SOF check failed.

3. If the application did not prefetch input data, or if it stopped
invoking jpeg_consume_input() when the function returned
JPEG_REACHED_SOS, then the duplicate SOF segment was not read prior to
the first call to jpeg_read_scanlines().  Thus, the data precision check
in jpeg_read_scanlines() failed.  If the application instead called
jpeg12_read_scanlines() (that is, if it properly supported multiple data
precisions), then the duplicate SOF segment was not read until the body
of jpeg_finish_decompress().  At that point, its only negative effect
was to cause jpeg_finish_decompress() to return FALSE before the
duplicate SOF check was reached.

In other words, this issue depended not only upon an incorrectly-written
source manager but also upon a very specific sequence of API calls.  It
also depended upon the multi-precision feature introduced in
libjpeg-turbo 3.0.x.  When using an 8-bit-per-sample build of
libjpeg-turbo 2.1.x, jpeg_read_header() failed with "Unsupported JPEG
data precision 12" after the 12-bit SOF segment was processed.  When
using a 12-bit-per-sample build of libjpeg-turbo 2.1.x, the behavior
was the same as if the application called jpeg12_read_scanlines() in
Scenario 3 above.

This commit simply moves the duplicate SOF check to the top of
get_sof() so the check will fail before the marker processor attempts to
read the duplicate SOF.  It should be noted that this issue isn't a
libjpeg-turbo bug per se, because it occurs only when the calling
application does something it shouldn't.  It is, rather, an issue of API
hardening/caller-proofing.
2024-05-29 10:08:24 -04:00
DRC
bc491b16e2 ChangeLog.md: Document previous commit 2024-05-16 17:32:02 -04:00
DRC
6a522fcda4 jpegtran -drop: Ensure all quant tables defined
It is possible to craft a malformed JPEG image in which all of the
scans contain fewer components than the number of components specified
in the Start Of Frame (SOF) segment.  Attempting to use such an image as
either an input image or a drop image with 'jpegtran -drop' caused a
NULL dereference and subsequent segfault in transupp.c:adjust_quant(),
so this commit adds appropriate checks to guard against that.

Since the issue involved an interface that is only exposed on the
jpegtran command line, it did not represent a security risk.
'jpegtran -drop' could not ever be used successfully with images such as
the ones described above.  This commit simply makes jpegtran fail
gracefully rather than crash.

Fixes #758
2024-05-02 14:54:36 -04:00
DRC
7bb958b732 12-bit: Don't gen opt Huff tbls if tbls supplied
(regression introduced by e8b40f3c2b)

The documented behavior of the libjpeg API is to compute optimal Huffman
tables when generating 12-bit lossy Huffman-coded JPEG images, unless
the calling application supplies its own Huffman tables.  However,
e8b40f3c2b and
96bc40c1b3 modified
jinit_c_master_control() so that it always set cinfo->optimize_coding to
TRUE when generarating 12-bit lossy Huffman-coded JPEG images, which
prevented calling applications from supplying custom Huffman tables for
such images.

This commit modifies jinit_c_master_control() so that it only overrides
cinfo->optimize_coding when generating 12-bit lossy Huffman-coded JPEG
images if all Huffman table slots are empty or all slots contain default
Huffman tables.  Determining whether the latter is true requires using
memcmp() to compare the allocated Huffman tables with the default
Huffman tables, because:

- The documented behavior of jpeg_set_defaults() is to initialize any
  empty Huffman table slot with the default Huffman table corresponding
  to that slot, regardless of the data precision.  There is also no
  requirement that the data precision be specified prior to calling
  jpeg_set_defaults().  Thus, there is no reliable way to prevent
  jpeg_set_defaults() from initializing empty Huffman table slots with
  default Huffman tables, which are useless for 12-bit data precision.

- There is no requirement that custom Huffman tables be defined prior to
  calling jpeg_set_defaults().  A calling application could call
  jpeg_set_defaults() and modify the values in the default Huffman
  tables rather than allocating new tables.  Thus, there is no reliable
  way to detect whether the allocated Huffman tables contain default
  values without comparing the tables with the default Huffman tables.

Fortunately, comparing the allocated Huffman tables with the default
Huffman tables is the last stop on the logic train, so it won't happen
unless cinfo->data_precision == 12, cinfo->arith_code == FALSE,
cinfo->optimize_coding == FALSE, and one or more Huffman tables are
allocated.  (If the compressor object is reused, this ensures that the
full comparison will be performed at most once.)  Custom Huffman tables
will be flagged as non-default when the first non-default value is
encountered, and the worst case (comparing 400 bytes) is very fast on
modern CPUs anyhow.

Fixes #751
2024-03-04 17:45:40 -05:00
DRC
3202feb08a x86-64 SIMD: Support CET if C compiler enables it
- Detect at configure time, via the __CET__ C preprocessor macro,
  whether the C compiler will include either indirect branch tracking
  (IBT) or shadow stack support, and define a NASM macro (__CET__) if
  so.

- Modify the x86-64 SIMD code so that it includes appropriate endbr64
  instructions (to support IBT) and an appropriate .note.gnu.property
  section (to support both IBT and shadow stack) when __CET__ is
  defined.

Closes #350
2024-02-29 16:37:30 -05:00
DRC
17df25f92c Build/Win: Eliminate MSVC run-time DLL dependency
(regression introduced by 1644bdb7d2)

Setting a maximum version in cmake_minimum_required() effectively sets
the behavior to NEW for all policies introduced in all CMake versions up
to and including that maximum version.  The NEW behavior for CMP0091,
introduced in CMake 3.15, uses CMake variables to specify the MSVC
runtime library against which to link, rather than placing the relevant
flags in CMAKE_C_FLAGS*.  Thus, replacing /MD with /MT in CMAKE_C_FLAGS*
no longer has any effect when using CMake 3.15+.
2024-01-26 09:43:58 -05:00
DRC
335ed793f9 Assume 3-comp lossls JPEG w/o Adobe marker is RGB
libjpeg-turbo always includes Adobe APP14 markers in the lossless JPEG
images that it generates, but some compressors (e.g. accusoft PICTools
Medical) do not.

Fixes #743
2024-01-19 13:14:21 -05:00
DRC
3eee0dd747 ChangeLog.md: "since" = "relative to" 2023-11-29 10:03:49 -05:00
DRC
55d342c788 TurboJPEG: Expose/extend hidden "max pixels" param
TJPARAM_MAXPIXELS was previously hidden and used only for fuzz testing,
but it is potentially useful for calling applications as well,
particularly if they want to guard against excessive memory consumption
by the tj3LoadImage*() functions.  The parameter has also been extended
to decompression and lossless transformation functions/methods, mainly
as a convenience.  (It was already possible for calling applications to
impose their own JPEG image size limits by reading the JPEG header prior
to decompressing or transforming the image.)
2023-11-16 15:36:47 -05:00
DRC
df9dbff830 TurboJPEG: New param to limit virt array mem usage
This corresponds to max_memory_to_use in the jpeg_memory_mgr struct in
the libjpeg API, except that the TurboJPEG parameter is specified in
megabytes.  Because this is 2023 and computers with less than 1 MB of
memory are not a thing (at least not within the scope of libjpeg-turbo
support), it isn't useful to allow a limit less than 1 MB to be
specified.  Furthermore, because TurboJPEG parameters are signed
integers, if we allowed the memory limit to be specified in bytes, then
it would be impossible to specify a limit larger than 2 GB on 64-bit
machines.  Because max_memory_to_use is a long signed integer,
effectively we can specify a limit of up to 2 petabytes on 64-bit
machines if the TurboJPEG parameter is specified in megabytes.  (2 PB
should be enough for anybody, right?)

This commit also bumps the TurboJPEG API version to 3.0.1.  Since the
TurboJPEG API version no longer tracks the libjpeg-turbo version, it
makes sense to increment the API revision number when adding constants,
to increment the minor version number when adding functions, and to
increment the major version number for a complete overhaul.

This commit also removes the vestigial TJ_NUMPARAM macro, which was
never defined because it proved unnecessary.

Partially implements #735
2023-11-14 10:19:06 -05:00
DRC
78eaf0d46d tj3*YUV8(): Fix int overflow w/ huge row alignment
If the align parameter was set to an unreasonably large value, such as
0x2000000, strides[0] * ph0 and strides[1] * ph1 could have overflowed
the int datatype and wrapped around when computing (src|dst)Planes[1]
and (src|dst)Planes[2] (respectively.)  This would have caused
(src|dst)Planes[1] and (src|dst)Planes[2] to point to lower addresses in
the YUV buffer than expected, so the worst case would have been a
visually incorrect output image, not a buffer overrun or other
exploitable issue.
2023-11-07 15:39:16 -05:00
DRC
da48edfc49 jchuff.c: Fix uninit read w/ AArch64, WITH_SIMD=0
Because of bf01ed2fbc, the simd field in
huff_entropy_encoder (and, by extension, the simd field in
savable_state) is only initialized if WITH_SIMD is defined.  Due to an
oversight, the simd field in savable_state was queried in flush_bits()
regardless of whether WITH_SIMD was defined.  In most cases, both
branches of the query have identical code, and the optimizer removes the
branch.  However, because the legacy Neon GAS Huffman encoder uses the
older bit buffer logic from libjpeg-turbo 2.0.x and prior (refer to
087c29e07f), the branches do not have
identical code when building for AArch64 with NEON_INTRINSICS undefined
(which will be the case if WITH_SIMD is undefined.)  Thus, if
libjpeg-turbo was built for AArch64 with the SIMD extensions disabled
at build time, it was possible for the Neon GAS branch in flush_bits()
to be taken, which would have set put_bits to a value that is incorrect
for the C Huffman encoder.  Referring to #728, a user reported that this
issue sometimes caused libjpeg-turbo to generate bogus JPEG images if it
was built for AArch64 without SIMD extensions and subsequently used
through the Qt framework.  (It should be noted, however, that disabling
the SIMD extensions in AArch64 builds of libjpeg-turbo is inadvisable
for performance reasons.)

I was unable to reproduce the issue on Linux/AArch64 using libjpeg-turbo
alone, despite testing various versions of GCC and Clang and various
optimization levels.  However, the issue is reproducible using MSan with
-O0, so this commit also modifies the GitHub Actions workflow so that
compiler optimization is disabled in the linux-msan job.  That should
prevent the issue or similar issues from re-emerging.

Fixes #728
2023-10-10 14:58:34 -04:00
DRC
c0412b56d6 ChangeLog.md: List CVE ID fixed by ccaba5d7
Closes #724
2023-09-14 17:20:37 -04:00
DRC
9b704f96b2 Fix block smoothing w/vert.-subsampled prog. JPEGs
The 5x5 interblock smoothing implementation, introduced in libjpeg-turbo
2.1, improperly extended the logic from the traditional 3x3 smoothing
implementation.  Both implementations point prev_block_row and
next_block_row to the current block row when processing, respectively,
the first and the last block row in the image:

  if (block_row > 0 || cinfo->output_iMCU_row > 0)
    prev_block_row =
      buffer[block_row - 1] + cinfo->master->first_MCU_col[ci];
  else
    prev_block_row = buffer_ptr;

  if (block_row < block_rows - 1 ||
      cinfo->output_iMCU_row < last_iMCU_row)
    next_block_row =
      buffer[block_row + 1] + cinfo->master->first_MCU_col[ci];
  else
    next_block_row = buffer_ptr;

6d91e950c8 naively extended that logic to
accommodate a 5x5 smoothing window:

  if (block_row > 1 || cinfo->output_iMCU_row > 1)
    prev_prev_block_row =
      buffer[block_row - 2] + cinfo->master->first_MCU_col[ci];
  else
    prev_prev_block_row = prev_block_row;

  if (block_row < block_rows - 2 ||
      cinfo->output_iMCU_row + 1 < last_iMCU_row)
    next_next_block_row =
      buffer[block_row + 2] + cinfo->master->first_MCU_col[ci];
  else
    next_next_block_row = next_block_row;

However, this new logic was only correct if block_rows == 1, so the
values of prev_prev_block_row and next_next_block_row were incorrect
when processing, respectively, the second and second to last iMCU rows
in a vertically-subsampled progressive JPEG image.

The intent was to:
- point prev_block_row to the current block row when processing the
  first block row in the image,
- point prev_prev_block_row to prev_block_row when processing the first
  two block rows in the image,
- point next_block_row to the current block row when processing the
  last block row in the image, and
- point next_next_block_row to next_block_row when processing the last
  two block rows in the image.

This commit modifies decompress_smooth_data() so that it computes the
current block row's position relative to the whole image and sets
the block row pointers based on that value.

This commit also restores a line of code that was accidentally deleted
by 6d91e950c8:

  access_rows += compptr->v_samp_factor; /* prior iMCU row too */

access_rows is merely a sanity check that tells the access_virt_barray()
method to generate an error if accessing the specified number of rows
would cause a buffer overrun.  Essentially, it is a belt-and-suspenders
measure to ensure that j*init_d_coef_controller() allocated enough rows
for the full-image virtual array.  Thus, excluding that line of code did
not cause an observable issue.

This commit also documents dbae59281f in
the change log.

Fixes #721
2023-08-15 15:21:29 -04:00
DRC
7b844bfda6 x86-64 SIMD: Use std stack frame/prologue/epilogue
This allows debuggers and profilers to reliably capture backtraces from
within the x86-64 SIMD functions.

In places where rbp was previously used to access temporary variables
(after stack alignment), we now use r15 and save/restore it accordingly.
The total amount of work is approximately the same, because the previous
code pushed the pre-alignment stack pointer to the aligned stack.  The
new prologue and epilogue actually have fewer instructions.

Also note that the {un}collect_args macros now use rbp instead of rax to
access arguments passed on the stack, so we save a few instructions
there as well.

Based on:
debcc7c3b4

Closes #707
Closes #708
2023-08-08 12:33:06 -04:00
DRC
e0c53aa38f jchuff.c: Test for out-of-range coefficients
Restore two coefficient range checks from libjpeg to the C baseline
Huffman encoder.  This fixes an issue
(https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=60253) whereby
the encoder could read from uninitialized memory when attempting to
transform a specially-crafted malformed arithmetic-coded JPEG source
image into a baseline Huffman-coded JPEG destination image with default
Huffman tables.  More specifically, the out-of-range coefficients caused
r to equal 256, which overflowed the actbl->ehufsi[] array.  Because the
overflow was contained within the huff_entropy_encoder structure, this
issue was not exploitable (nor was it observable at all on x86 or Arm
CPUs unless JSIMD_NOHUFFENC=1 or JSIMD_FORCENONE=1 was set in the
environment or unless libjpeg-turbo was built with WITH_SIMD=0.)

The fix is performance-neutral (+/- 1-2%) for x86-64 code and causes a
0-4% (avg. 1-2%, +/- 1-2%) compression regression for i386 code on Intel
CPUs when the C baseline Huffman encoder is used (JSIMD_NOHUFFENC=1).
The fix is performance-neutral (+/- 1-2%) on Intel CPUs when all of the
libjpeg-turbo SIMD extensions are disabled (JSIMD_FORCENONE=1).  The fix
causes a 0-2% (avg. <1%, +/- 1%) compression regression for PowerPC
code.
2023-07-01 07:56:50 -04:00
DRC
240a5a5cbb ChangeLog.md: Fix typo 2023-06-29 17:45:07 -04:00
DRC
7173889223 djpeg: Fix -map option with 12-bit data precision 2023-06-29 16:36:43 -04:00
DRC
bf9f319cb4 Disallow color quantization with lossless decomp
Color quantization is a legacy feature that serves little or no purpose
with lossless JPEG images.  9f756bc67a
eliminated interaction issues between the lossless decompressor and the
color quantizers related to out-of-range 12-bit samples, but referring
to #701, other interaction issues apparently still exist.  Such issues
are likely, given the fact that the color quantizers were not designed
with lossless decompression in mind.

This commit reverts 9f756bc67a, since the
issues it fixed are no longer relevant because of this commit and
2192560d74.

Fixed #672
Fixes #673
Fixes #674
Fixes #676
Fixes #677
Fixes #678
Fixes #679
Fixes #681
Fixes #683
Fixes #701
2023-06-29 16:36:29 -04:00
DRC
c8d52f1c4c tj3Transform: Calc dst buf size from xformed dims
When used with TJPARAM_NOREALLOC and with TJXOP_TRANSPOSE,
TJXOP_TRANSVERSE, TJXOP_ROT90, or TJXOP_ROT270, tj3Transform()
incorrectly based the destination buffer size for a transform on the
source image dimensions rather than the transformed image dimensions.
This was apparently a long-standing bug that had existed in the
tj*Transform() function since its inception.  As initially implemented
in the evolving libjpeg-turbo v1.2 code base, tjTransform() required
dstSizes[i] to be set regardless of whether TJFLAG_NOREALLOC (the
predecessor to TJPARAM_NOREALLOC) was set.
ff78e37595, which was introduced later in
the evolving libjpeg-turbo v1.2 code base, removed that requirement and
planted the seed for the bug.  However, the bug was not activated until
9b49f0e4c7 was introduced still later in
the evolving libjpeg-turbo v1.2 code base, adding a subsampling type
argument to the (new at the time) tjBufSize() function and thus making
the width and height arguments no longer commutative.

The bug opened up the possibility that a JPEG source image could cause
tj3Transform() to overflow the destination buffer for a transform if all
of the following were true:
- The JPEG source image used 4:2:2, 4:4:0, 4:1:1, or 4:4:1 subsampling.
  (These are the only subsampling types for which the width and height
  arguments to tj3JPEGBufSize() are not commutative.)
- The width and height of the JPEG source image were such that
  tj3JPEGBufSize(height, width, subsamplingType) returned a smaller
  value than tj3JPEGBufSize(width, height, subsamplingType).
- The JPEG source image contained enough metadata that the size of the
  transformed image was larger than
  tj3JPEGBufSize(height, width, subsamplingType).
- TJPARAM_NOREALLOC was set.
- TJXOP_TRANSPOSE, TJXOP_TRANSVERSE, TJXOP_ROT90, or TJXOP_ROT270 was
  used.
- TJXOPT_COPYNONE was not set.
- TJXOPT_CROP was not set.
- The calling program allocated
  tj3JPEGBufSize(height, width, subsamplingType) bytes for the
  destination buffer, as the API documentation instructs.

The API documentation cautions that JPEG source images containing a
large amount of extraneous metadata (EXIF, IPTC, ICC, etc.) cannot
reliably be transformed if TJPARAM_NOREALLOC is set and TJXOPT_COPYNONE
is not set.  Irrespective of the bug, there are still cases in which a
JPEG source image with a large amount of metadata can, when transformed,
exceed the worst-case transformed JPEG image size.  For instance, if you
try to losslessly crop a JPEG image with 3 kB of EXIF data to 16x16
pixels, then you are guaranteed to exceed the worst-case 16x16 JPEG
image size unless you discard the EXIF data.

Even without the bug, tj3Transform() will still fail with "Buffer passed
to JPEG library is too small" when attempting to transform JPEG source
images that meet the aforementioned criteria.  The bug is that the
function segfaults rather than failing gracefully, but the chances of
that occurring in a real-world application are very slim.  Any
real-world application developers who attempted to transform arbitrary
JPEG source images with TJPARAM_NOREALLOC set would very quickly realize
that they cannot reliably do that without also setting TJXOPT_COPYNONE.
Thus, I posit that the actual risk posed by this bug is low.
Applications such as web browsers that are the most exposed to security
risks from arbitrary JPEG source images do not use the TurboJPEG
lossless transform feature.  (None of those applications even use the
TurboJPEG API, to the best of my knowledge, and the public libjpeg API
has no equivalent transform function.)  Our only command-line interface
to the tj3Transform() function, TJBench, was not exposed to the bug
because it had a compatible bug whereby it allocated the JPEG
destination buffer to the same size that tj3Transform() erroneously
expected.  The TurboJPEG Java API was also not exposed to the bug
because of a similar compatible bug in the
Java_org_libjpegturbo_turbojpeg_TJTransformer_transform() JNI function.
(This commit fixes both compatible bugs.)

In short, best practices for tj3Transform() are to use TJPARAM_NOREALLOC
only with JPEG source images that are known to be free of metadata (such
as images generated by tj3Compress*()) or to use TJXOPT_COPYNONE along
with TJPARAM_NOREALLOC.  Still, however, the function shouldn't segfault
as long as the calling program allocates the suggested amount of space
for the JPEG destination buffer.

Usability notes:
tj3Transform() could hypothetically require dstSizes[i] to be set
regardless of the value of TJPARAM_NOREALLOC, but there are usability
pitfalls either way.  The main pitfall I sought to avoid with
ff78e37595 was a calling program failing
to set dstSizes[i] at all, thus leaving its value undefined.  It could
be argued that requiring dstSizes[i] to be set in all cases is more
consistent, but it could also be argued that not requiring it to be set
when TJPARAM_NOREALLOC is set is more user-proof.  tj3Transform() could
also hypothetically set TJXOPT_COPYNONE automatically when
TJPARAM_NOREALLOC is set, but that could lead to user confusion.
Ultimately, I would like to address these issues in TurboJPEG v4 by
using managed buffer objects, but that would be an extensive overhaul.
2023-06-27 18:36:01 -04:00
DRC
36aaeebb55 ChangeLog.md: List CVE ID fixed by 9f756bc6 2023-05-30 17:46:58 -04:00
DRC
3a53627306 jpeg_crop_scanline: Fix calc w/sclg + 2x4,4x2 samp
When computing the downsampled width for a particular component,
jpeg_crop_scanline() needs to take into account the fact that the
libjpeg code uses a combination of IDCT scaling and upsampling to
implement 4x2 and 2x4 upsampling with certain decompression scaling
factors.  Failing to account for that led to incomplete upsampling of
4x2- or 2x4-subsampled components, which caused the color converter to
read from uninitialized memory.  With 12-bit data precision, this caused
a buffer overrun or underrun and subsequent segfault if the
uninitialized memory contained a value that was outside of the valid
sample range (because the color converter uses the value as an array
index.)

Fixes #669
2023-04-06 22:00:43 -05:00
DRC
62590d428b Decomp: Don't enable 2-pass color quant w/ RGB565
The 2-pass color quantization algorithm assumes 3-sample pixels.  RGB565
is the only 3-component colorspace that doesn't have 3-sample pixels, so
we need to treat it as a special case when determining whether to enable
2-pass color quantization.  Otherwise, attempting to initialize 2-pass
color quantization with an RGB565 output buffer could cause
prescan_quantize() to read from uninitialized memory and subsequently
underflow/overflow the histogram array.

djpeg is supposed to fail gracefully if both -rgb565 and -colors are
specified, because none of its destination managers (image writers)
support color quantization with RGB565.  However, prescan_quantize() was
called before that could occur.  It is possible but very unlikely that
these issues could have been reproduced in applications other than
djpeg.  The issues involve the use of two features (12-bit precision and
RGB565) that are incompatible, and they also involve the use of two
rarely-used legacy features (RGB565 and color quantization) that don't
make much sense when combined.

Fixes #668
Fixes #671
Fixes #680
2023-04-04 20:38:00 -05:00
DRC
9f756bc67a Lossless decomp: Range-limit 12-bit samples
12-bit is the only data precision for which the range of the sample data
type exceeds the valid sample range, so it is possible to craft a 12-bit
lossless JPEG image that contains out-of-range 12-bit samples.
Attempting to decompress such an image using color quantization or merged
upsampling (NOTE: libjpeg-turbo cannot generate YCbCr or subsampled
lossless JPEG images, but it can decompress them) caused segfaults or
buffer overruns when those algorithms attempted to use the out-of-range
sample values as array indices.  This commit modifies the lossless
decompressor so that it range-limits the output of the scaler when using
12-bit samples.

Fixes #670
Fixes #672
Fixes #673
Fixes #674
Fixes #675
Fixes #676
Fixes #677
Fixes #678
Fixes #679
Fixes #681
Fixes #683
2023-04-04 20:37:54 -05:00
DRC
fc881ebb21 TurboJPEG: Implement 4:4:1 chrominance subsampling
This allows losslessly transposed or rotated 4:1:1 JPEG images to be
losslessly cropped, partially decompressed, or decompressed to planar
YUV images.

Because tj3Transform() allows multiple lossless transformations to be
chained together, all subsampling options need to have a corresponding
transposed subsampling option.  (This is why 4:4:0 was originally
implemented as well.)  Otherwise, the documentation would be technically
incorrect.  It says that images with unknown subsampling types cannot be
losslessly cropped, partially decompressed, or decompressed to planar
YUV images, but it doesn't say anything about images with known
subsampling types whose subsampling type becomes unknown if the image is
rotated or transposed.  This is one of those situations in which it is
easier to implement a feature that works around the problem than to
document the problem.

Closes #659
2023-03-10 10:46:14 -06:00
DRC
0827eaff11 ChangeLog.md: Add literal vers # to 3.0 beta2 hdr
(per our convention)
2023-03-10 09:30:05 -06:00
DRC
6c61033349 ChangeLog.md: Document 4e028ecd
+ bump version to 3.0 beta2
2023-02-08 10:14:04 -06:00
DRC
fd8c4da0ac Bump revision to 2.1.90 to prepare for beta
+ acknowledge upcoming 2.1.5 release
2023-01-27 14:05:07 -06:00
DRC
db9f297f1c ChangeLog.md: Document TurboJPEG 3 API overhaul 2023-01-27 07:10:49 -06:00
DRC
7ab6222cff Merge branch 'main' into dev 2023-01-20 14:09:25 -06:00
DRC
98a6455875 TJBench: Set TJ*OPT_PROGRESSIVE with -progressive
The documented behavior of the -progressive option is to use progressive
entropy coding in JPEG images generated by compression and transform
operations.  However, setting TJFLAG_PROGRESSIVE was insufficient to
accomplish that, because TJBench doesn't enable lossless transformation
if xformOpt == 0.
2023-01-20 13:23:00 -06:00
DRC
b99e7590b0 TJBench/Java: Fix parsing of quality ranges 2023-01-20 13:02:38 -06:00
DRC
c7c02d9288 Merge branch 'main' into dev 2023-01-17 18:31:31 -06:00
DRC
08cbc23334 12-bit: Set alpha channel to 4095 rather than 255 2023-01-17 15:29:02 -06:00
DRC
d4589f4f1c Merge branch 'main' into dev 2023-01-14 18:07:53 -06:00
DRC
94a2b95342 tjDecompressToYUV2: Use scaled dims for plane calc
The documented behavior of the function is to use decompression scaling
to generate the largest possible image that will fit within the desired
image dimensions.  Thus, if the desired image dimensions are larger than
the scaled image dimensions, then tjDecompressToYUV2() should use the
scaled image dimensions when computing the plane pointers and strides to
pass to tjDecompressToYUVPlanes().

Note that this bug was not previously detected, because tjunittest and
tjbench always passed the scaled image dimensions to
tjDecompressToYUV2().
2023-01-14 17:26:17 -06:00
DRC
9a146f0f23 TurboJPEG: Numerous documentation improvements
- Wordsmithing, formatting, and grammar tweaks

- Various clarifications and corrections, including specifying whether
  a particular buffer or image is used as a source or destination

- Accommodate/mention features that were introduced since the API
  documentation was created.

- For clarity, use "packed-pixel" to describe uncompressed
  source/destination images that are not planar YUV.

- Use "row" rather than "line" to refer to a single horizontal group of
  pixels or component values, for consistency with the libjpeg API
  documentation.  (libjpeg also uses "scanline", which is a more archaic
  term.)

- Use "alignment" rather than "padding" to refer to the number of bytes
  by which a row's width is evenly divisible.  This consistifies the
  documention of the YUV functions and tjLoadImage().  ("Padding"
  typically refers to the number of bytes added to each row, which is
  not the same thing.)

- Remove all references to "the underlying codec."  Although the
  TurboJPEG API originated as a cross-platform wrapper for the Intel
  Integrated Performance Primitives, Sun mediaLib, QuickTime, and
  libjpeg, none of those TurboJPEG implementations has been maintained
  since 2009.  Nothing would prevent someone from implementing the
  TurboJPEG API without libjpeg-turbo, but such an implementation would
  not necessarily have an "underlying codec."  (It could be fully
  self-contained.)

- Use "destination image" rather than "output image", for consistency,
  or describe the type of image that will be output.

- Avoid the term "image buffer" and instead use "byte buffer" to
  refer to buffers that will hold JPEG images, or describe the type of
  image that will be contained in the buffer.  (The Java documentation
  doesn't use "byte buffer", because the buffer arrays literally have
  "byte" in front of them, and since Java doesn't have pointers, it is
  not possible for mere mortals to store any other type of data in those
  arrays.)

- C: Use "unified" to describe YUV images stored in a single buffer, for
  consistency with the Java documentation.

- Use "planar YUV" rather than "YUV planar".  Is is our convention to
  describe images using {component layout} {colorspace/pixel format}
  {image function}, e.g. "packed-pixel RGB source image" or "planar YUV
  destination image."

- C: Document the TurboJPEG API version in which a particular function
  or macro was introduced, and reorder the backward compatibility
  function stubs in turbojpeg.h alphabetically by API version.

- C: Use Markdown rather than HTML tags, where possible, in the Doxygen
  comments.
2023-01-14 17:10:31 -06:00
DRC
d260858395 TurboJPEG: Ensure 'pad' arg is a power of 2
Because the PAD() macro can only handle powers of 2, this is a necessary
restriction (and a documented one, except in the case of
tjCompressFromYUV()-- oops.)  Failing to check the 'pad' argument
caused tjBufSizeYUV2() to return bogus results if 'pad' was less than 1
or otherwise not a power of 2.  tjEncodeYUV3() and tjDecodeYUV()
effectively treated a 'pad' value of 0 as unpadded, but that was subtle
and undocumented behavior.  tjCompressFromYUV() did not check whether
'pad' was a power of 2, so the strides passed to
tjCompressFromYUVPlanes() would have been incorrect if 'pad' was not a
power of 2.  That would not have caused tjCompressFromYUV() to overrun
the source buffer, as long as the calling application allocated the
buffer based on the return value of tjBufSizeYUV2() (which computes the
strides in the same manner as tjCompressFromYUV().)  However, if the
calling application attempted to initialize the source buffer using
correctly-computed strides, then it could have overrun its own
buffer in certain cases or produced incorrect JPEG images in others.

Realistically, there is no reason why an application would want to pass
a non-power-of-2 'pad' value to a TurboJPEG API function, so this commit
is about user-proofing the API rather than fixing any known issue.
2023-01-05 14:22:17 -06:00
DRC
2241434eb9 16-bit lossless JPEG support 2022-12-16 13:57:03 -06:00
DRC
803523402f Merge branch 'main' into dev 2022-12-07 14:11:37 -06:00