- Integrate
c07bba2730
so that the default man directory is <CMAKE_INSTALL_PREFIX>/share/man
on FreeBSD systems.
- For good measure, integrate
f835f189ae
so that the default info directory is
<CMAKE_INSTALL_PREFIX>/share/info on FreeBSD systems, even though we
don't use that directory.
- Automatically set the CMake variable type to PATH for any
user-specified CMAKE_INSTALL_*DIR variables.
Addresses concerns raised in #326, #346, and #648
Closes #648
libjpeg-turbo always includes Adobe APP14 markers in the lossless JPEG
images that it generates, but some compressors (e.g. accusoft PICTools
Medical) do not.
Fixes #743
ef9a4e05ba (libjpeg-turbo 1.4.x), which
was based on
https://bug815473.bmoattachments.org/attachment.cgi?id=692126
(https://bugzilla.mozilla.org/show_bug.cgi?id=815473), modified the C
baseline Huffman encoder so that it precomputes jpeg_nbits_table, in
order to facilitate sharing the table among multiple processes.
However, libjpeg-turbo never shared the table, and because the table was
implemented as a static array, f3a8684cd1
(libjpeg-turbo 1.5.x) and 37bae1a0e9
(libjpeg-turbo 2.0.x) each introduced a duplicate copy of the table for
(respectively) the SSE2 baseline Huffman encoder and the C progressive
Huffman encoder.
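For context, jpeg_nbits_table maps a value to its bit length (the number of
bits needed to represent its magnitude), which the Huffman encoders need for
every nonzero coefficient or difference value. Below is a minimal sketch of
the two strategies referenced here and in the items below: a clz intrinsic
versus a precomputed lookup table. The macro and function names are
illustrative, not the actual libjpeg-turbo code:

  #include <stdint.h>

  #ifdef USE_CLZ_INTRINSIC
  /* GCC/Clang builtin: 32 minus the number of leading zeros is the bit
     length.  __builtin_clz(0) is undefined, so the caller must guarantee a
     nonzero argument. */
  #define JPEG_NBITS_NONZERO(x)  (32 - __builtin_clz(x))
  #else
  /* Table-driven fallback: jpeg_nbits_table[v] holds the bit length of v
     for v in [0, 65535].  The real table is a precomputed constant; this
     sketch fills it at startup. */
  static uint8_t jpeg_nbits_table[65536];

  static void init_nbits_table(void)
  {
    int nbits = 0;
    for (int v = 0; v < 65536; v++) {
      if (v >= (1 << nbits)) nbits++;   /* bit length grows at powers of 2 */
      jpeg_nbits_table[v] = (uint8_t)nbits;
    }
  }
  #define JPEG_NBITS_NONZERO(x)  (jpeg_nbits_table[x])
  #endif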
This commit does the following:
- Move the duplicated code in jchuff.c and jcphuff.c, originally
introduced in 0cfc4c17b7 and
37bae1a0e9, into a header
(jpeg_nbits.h).
- Credit the co-author of 0cfc4c17b7.
(Refer to https://sourceforge.net/p/libjpeg-turbo/patches/57).
- Modify the SSE2 baseline Huffman encoder so that the C Huffman
encoders can share its definition of jpeg_nbits_table.
- Move the definition of jpeg_nbits_table into a C source file
(jpeg_nbits.c) rather than a header, and define the table only if
USE_CLZ_INTRINSIC is undefined and the SSE2 baseline Huffman encoder
will not be built.
- Apply hidden symbol visibility to the shared definition of
jpeg_nbits_table, if the compiler supports the necessary attribute.
(In practice, only Visual C++ doesn't.)
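A minimal sketch of the visibility mechanism mentioned in the last item;
this is an assumed illustration of the attribute, not the actual
libjpeg-turbo macro:

  /* Hide a shared table from the shared library's exported symbol list.
     GCC and Clang support the attribute; Visual C++ does not, so the macro
     expands to nothing there. */
  #if defined(__GNUC__) || defined(__clang__)
  #define HIDDEN  __attribute__((visibility("hidden")))
  #else
  #define HIDDEN
  #endif

  extern HIDDEN const unsigned char jpeg_nbits_table[65536];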
Closes #114
See also:
https://bugzilla.mozilla.org/show_bug.cgi?id=1501523
- Clarify that lossless JPEG is slower than and doesn't compress as well
as lossy JPEG. (That should be obvious, because "lossy" literally
means that data is thrown away.)
- Re-generate TurboJPEG C API documentation using Doxygen 1.9.8.
- Clarify that setting the data_precision field in jpeg_compress_struct
to 16 requires lossless mode.
- Explain what the predictor selection value actually does. (Refer to
Recommendation ITU-T T.81 (1992) | ISO/IEC 10918-1:1994, Section
H.1.2.1.)
TJPARAM_MAXPIXELS was previously hidden and used only for fuzz testing,
but it is potentially useful for calling applications as well,
particularly if they want to guard against excessive memory consumption
by the tj3LoadImage*() functions. The parameter has also been extended
to decompression and lossless transformation functions/methods, mainly
as a convenience. (It was already possible for calling applications to
impose their own JPEG image size limits by reading the JPEG header prior
to decompressing or transforming the image.)
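A hedged usage sketch, assuming the TurboJPEG 3 entry points behave as
documented (tj3Init(), tj3Set(), tj3LoadImage8()); the pixel cap and error
handling are illustrative:

  #include <stdio.h>
  #include <turbojpeg.h>

  int load_capped(const char *filename)
  {
    tjhandle handle = tj3Init(TJINIT_COMPRESS);
    int width, height, pixelFormat = TJPF_UNKNOWN;
    unsigned char *buf;

    if (!handle) return -1;
    /* Refuse images larger than ~16 megapixels to bound memory use. */
    tj3Set(handle, TJPARAM_MAXPIXELS, 16 * 1048576);
    buf = tj3LoadImage8(handle, filename, &width, 1, &height, &pixelFormat);
    if (!buf) {
      fprintf(stderr, "%s\n", tj3GetErrorStr(handle));
      tj3Destroy(handle);
      return -1;
    }
    /* ... compress buf ... */
    tj3Free(buf);
    tj3Destroy(handle);
    return 0;
  }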
Unfortunately, most of the GitHub security advisories filed against
libjpeg-turbo thus far have been the result of non-exploitable API
abuses triggered by randomly-generated test programs. Those advisories
were accompanied by wild claims of denial of service, with no
demonstrable or even probable exploit that might cause such a DoS
(assuming a service even existed that used the API in question.)
Security advisories remain private
unless accepted, and I cannot accept them if they do not describe an
actual security issue. Thus, it's best to steer most users toward
regular bug reports.
Because of the TurboJPEG 3 API overhaul, the legacy decompression and
lossless transformation functions now wrap the new TurboJPEG 3
functions. For performance reasons, we don't want to read the JPEG
header more than once during the same operation, so the wrapped
functions do not read the header if it has already been read by a
wrapper function. Initially the TurboJPEG 3 functions used a state
variable to track whether the header had already been read, but
b94041390c made this more robust by using
the libjpeg global decompression state instead. If a wrapper function
has already read the JPEG header successfully, then the global
decompression state will be DSTATE_READY, and the logic introduced in
b94041390c will prevent the header from
being read again.
A subtle issue arises because tj3DecompressHeader() does not call
jpeg_abort_decompress() if jpeg_read_header() fails. (That is arguably
a bug, but it has existed since the very first implementation of the
function.) Depending on the nature of the failure, this can cause
tj3DecompressHeader() to return an error code and leave the libjpeg
global decompression state set to DSTATE_INHEADER. If a misbehaved
application ignored the error and subsequently called a TurboJPEG
decompression or lossless transformation function, then the function
would fail to read the JPEG header because the global decompression
state was greater than DSTATE_START. In the case of the decompression
functions, this was innocuous, because jpeg_calc_output_dimensions()
and jpeg_start_decompress() both sanity check the global decompression
state. However, it was possible for a misbehaved application to call
tj3DecompressHeader() with junk data, ignore the return value, and pass
the same junk data into tj3Transform(). Because tj3DecompressHeader()
left the global decompression state set to DSTATE_INHEADER,
tj3Transform() failed to detect the junk data (because it didn't try to
read the JPEG header), and it called jtransform_request_workspace() with
dinfo->image_width and dinfo->image_height still initialized to 0.
Because jtransform_request_workspace() does not sanity check the
decompression state, a division-by-zero error occurred with certain
combinations of transform options in which TJXOPT_TRIM or TJXOPT_CROP
was specified. However, it should be noted that TJXOPT_TRIM and
TJXOPT_CROP cannot be expected to work properly without foreknowledge of
the JPEG source image dimensions, which cannot be gained except by
calling tj3DecompressHeader() successfully. Thus, a calling application
is inviting trouble if it does not check the return value of
tj3DecompressHeader() and sanity check the JPEG source image dimensions
before calling tj3Transform(). This commit softens the failure, but the
failure is still due to improper API usage.
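For reference, a sketch of the usage pattern described above as proper,
assuming the TurboJPEG 3 signatures for tj3DecompressHeader(), tj3Get(), and
tj3Transform(); it is illustrative rather than a verbatim excerpt:

  #include <string.h>
  #include <turbojpeg.h>

  int transform_safely(tjhandle handle, const unsigned char *jpegBuf,
                       size_t jpegSize, unsigned char **dstBuf,
                       size_t *dstSize)
  {
    tjtransform xform;

    memset(&xform, 0, sizeof(xform));
    xform.op = TJXOP_NONE;
    xform.options = TJXOPT_TRIM;

    if (tj3DecompressHeader(handle, jpegBuf, jpegSize) < 0)
      return -1;                      /* junk data: stop here */
    if (tj3Get(handle, TJPARAM_JPEGWIDTH) < 1 ||
        tj3Get(handle, TJPARAM_JPEGHEIGHT) < 1)
      return -1;                      /* dimensions are not usable */
    return tj3Transform(handle, jpegBuf, jpegSize, 1, dstBuf, dstSize,
                        &xform);
  }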
(regression introduced by 300a344d65)
300a344d65 fixed the recompression code
path but also broke the pure decompression code path, because the fix
caused the TurboJPEG decompression instance to be destroyed before
tj3SaveImage() could use it. Furthermore, the fix in
300a344d65 prevented pixel density
information from being transferred from the input image to the output
image when recompressing. This commit does the following:
- Modify tjexample.c so that a single TurboJPEG instance is initialized
for lossless transformation and shared by all code paths. In addition
to fixing both of the aforementioned issues, this makes the code more
readable.
- Extend tjexampletest to test the recompression code path, thus
ensuring that the issues fixed by this commit and
300a344d65 are not reintroduced.
- Modify tjexample.c to remove redundant fclose(), tj3Destroy(), and
tj3Free() calls.
This corresponds to max_memory_to_use in the jpeg_memory_mgr struct in
the libjpeg API, except that the TurboJPEG parameter is specified in
megabytes. Because this is 2023 and computers with less than 1 MB of
memory are not a thing (at least not within the scope of libjpeg-turbo
support), it isn't useful to allow a limit less than 1 MB to be
specified. Furthermore, because TurboJPEG parameters are signed
integers, if we allowed the memory limit to be specified in bytes, then
it would be impossible to specify a limit larger than 2 GB on 64-bit
machines. Because max_memory_to_use is a long signed integer,
effectively we can specify a limit of up to 2 petabytes on 64-bit
machines if the TurboJPEG parameter is specified in megabytes. (2 PB
should be enough for anybody, right?)
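A brief usage sketch, assuming the new parameter is exposed as
TJPARAM_MAXMEMORY (the name is inferred from the description above, which
also defines the value as a number of megabytes):

  #include <turbojpeg.h>

  /* Cap libjpeg virtual-array memory usage at 256 MB for this instance. */
  static tjhandle init_capped_compressor(void)
  {
    tjhandle handle = tj3Init(TJINIT_COMPRESS);

    if (handle)
      tj3Set(handle, TJPARAM_MAXMEMORY, 256);
    return handle;
  }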
This commit also bumps the TurboJPEG API version to 3.0.1. Since the
TurboJPEG API version no longer tracks the libjpeg-turbo version, it
makes sense to increment the API revision number when adding constants,
to increment the minor version number when adding functions, and to
increment the major version number for a complete overhaul.
This commit also removes the vestigial TJ_NUMPARAM macro, which was
never defined because it proved unnecessary.
Partially implements #735
This is very subtle, but if a user specifies a libjpeg virtual array
memory limit via the JPEGMEM environment variable and one of the
tj3Compress*() functions hits that limit, the libjpeg error handler
will be invoked in jpeg_start_compress() (more specifically in
realize_virt_arrays() in jinit_compress_master()) before the libjpeg
global compression state can be incremented. Thus,
jpeg_abort_compress() will not be called before the tj3Compress*()
function exits, the unrealized virtual arrays will not be freed, and if
the TurboJPEG compression instance is reused, those unrealized virtual
arrays will count against the specified memory limit. This could cause
subsequent compression operations that require smaller virtual arrays
(or even no virtual arrays at all) to fail when they would otherwise
succeed. In reality, the vast majority of calling programs would abort
and free the TurboJPEG compression instance if one of the tj3Compress*()
functions failed, but TJBench is a rare exception. This issue does not
bear documenting because of its subtlety and rarity and because JPEGMEM
is not a documented feature of the TurboJPEG API.
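For illustration, the standard libjpeg error-recovery pattern that releases
unrealized virtual arrays before an object is reused looks roughly like the
sketch below; it mirrors the libjpeg.txt example code rather than the actual
TurboJPEG internals, and the names are illustrative:

  #include <setjmp.h>
  #include "jpeglib.h"

  struct my_error_mgr {
    struct jpeg_error_mgr pub;        /* "public" fields */
    jmp_buf setjmp_buffer;            /* return point on error */
  };

  static void my_error_exit(j_common_ptr cinfo)
  {
    struct my_error_mgr *myerr = (struct my_error_mgr *)cinfo->err;
    longjmp(myerr->setjmp_buffer, 1); /* jump back instead of exiting */
  }

  /* Assumes cinfo was created with an error manager whose error_exit()
     longjmp()s to jerr->setjmp_buffer.  On failure, abort the object so
     that any virtual-array requests made before the failure are freed
     before the object is reused. */
  static int compress_once(struct jpeg_compress_struct *cinfo,
                           struct my_error_mgr *jerr)
  {
    if (setjmp(jerr->setjmp_buffer)) {
      jpeg_abort_compress(cinfo);     /* frees unrealized virtual arrays */
      return -1;
    }
    jpeg_start_compress(cinfo, TRUE); /* realize_virt_arrays() can fail here */
    /* ... jpeg_write_scanlines() ... */
    jpeg_finish_compress(cinfo);
    return 0;
  }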
Note that the issue does not exist in the tj3Encode*() and tj3Decode*()
functions, because realize_virt_arrays() is never called in the body of
those functions. The issue also does not exist in the tj3Decompress*()
and tj3Transform() functions, because those functions ensure that the
JPEG header is read (and thus the libjpeg global decompression state is
incremented) prior to calling a function that calls
realize_virt_arrays() (i.e. jpeg_start_decompress() or
jpeg_read_coefficients().) If realize_virt_arrays() failed in the body
of jpeg_write_coefficients(), then tj3Transform() would abort without
calling jpeg_abort_compress(). However, since jpeg_start_compress() is
never called in the body of tj3Transform(), no virtual arrays are ever
requested from the compression object, so failing to call
jpeg_abort_compress() would be innocuous.
Those obsolete memory managers have never been included in
libjpeg-turbo, nor has libjpeg-turbo ever claimed to support MS-DOS
or Mac operating systems prior to OS X 10.4 "Tiger." Note that we
retain the ability to use temp files, even though libjpeg-turbo does
not use them. This allows downstream implementations to write their own
memory managers that use temp files, if necessary.
If the align parameter was set to an unreasonably large value, such as
0x2000000, strides[0] * ph0 and strides[1] * ph1 could have overflowed
the int datatype and wrapped around when computing (src|dst)Planes[1]
and (src|dst)Planes[2] (respectively.) This would have caused
(src|dst)Planes[1] and (src|dst)Planes[2] to point to lower addresses in
the YUV buffer than expected, so the worst case would have been a
visually incorrect output image, not a buffer overrun or other
exploitable issue.
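To make the arithmetic concrete, here is a hedged sketch of the wraparound
described above and of a widened, range-checked multiplication that avoids
it; the names are illustrative, not the actual turbojpeg.c code:

  #include <stddef.h>
  #include <stdint.h>

  /* With align = 0x2000000, stride * planeHeight can exceed INT_MAX and
     wrap around if computed in int.  Doing the multiplication in size_t
     (and range-checking it) keeps the plane offsets correct. */
  static int plane_offset(int stride, int planeHeight, size_t *offset)
  {
    if (stride < 0 || planeHeight < 0)
      return -1;
    if (planeHeight != 0 && (size_t)stride > SIZE_MAX / (size_t)planeHeight)
      return -1;                      /* would overflow even size_t */
    *offset = (size_t)stride * (size_t)planeHeight;
    return 0;
  }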
Because of 47656a0820, we can now
reliably determine the correct default values for FLOATTEST8 and
FLOATTEST12 when using Clang or GCC to build for AArch64 or PowerPC
platforms. (Testing confirms that this is the case with GCC 5-13 and
Clang 5-14 on Ubuntu/AArch64, GCC 4 on CentOS 7/PPC, and GCC 8-10 and
Clang 6-12 on Ubuntu/PPCLE.) Other CPU architectures and compilers can
be added on a case-by-case basis as they are tested.
Because of bf01ed2fbc, the simd field in
huff_entropy_encoder (and, by extension, the simd field in
savable_state) is only initialized if WITH_SIMD is defined. Due to an
oversight, the simd field in savable_state was queried in flush_bits()
regardless of whether WITH_SIMD was defined. In most cases, both
branches of the query have identical code, and the optimizer removes the
branch. However, because the legacy Neon GAS Huffman encoder uses the
older bit buffer logic from libjpeg-turbo 2.0.x and prior (refer to
087c29e07f), the branches do not have
identical code when building for AArch64 with NEON_INTRINSICS undefined
(which will be the case if WITH_SIMD is undefined.) Thus, if
libjpeg-turbo was built for AArch64 with the SIMD extensions disabled
at build time, it was possible for the Neon GAS branch in flush_bits()
to be taken, which would have set put_bits to a value that is incorrect
for the C Huffman encoder. Referring to #728, a user reported that this
issue sometimes caused libjpeg-turbo to generate bogus JPEG images if it
was built for AArch64 without SIMD extensions and subsequently used
through the Qt framework. (It should be noted, however, that disabling
the SIMD extensions in AArch64 builds of libjpeg-turbo is inadvisable
for performance reasons.)
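The following minimal sketch illustrates the hazard and the guard that
resolves it; the structure and values are simplified stand-ins for the real
huff_entropy_encoder/savable_state code:

  /* The simd field exists and is initialized only when WITH_SIMD is
     defined, so any code that reads it must be guarded the same way. */
  typedef struct {
  #ifdef WITH_SIMD
    int simd;
  #endif
    int put_bits;
  } savable_state_sketch;

  static void flush_bits_sketch(savable_state_sketch *state)
  {
  #ifdef WITH_SIMD
    if (state->simd) {
      state->put_bits = 0;    /* value appropriate for the Neon GAS encoder */
    } else
  #endif
    {
      state->put_bits = 7;    /* value appropriate for the C encoder */
    }
  }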
I was unable to reproduce the issue on Linux/AArch64 using libjpeg-turbo
alone, despite testing various versions of GCC and Clang and various
optimization levels. However, the issue is reproducible using MSan with
-O0, so this commit also modifies the GitHub Actions workflow so that
compiler optimization is disabled in the linux-msan job. That should
prevent the issue or similar issues from re-emerging.
Fixes #728
libjpeg-turbo implements 4:4:0 "fancy" (smooth) upsampling, which is
enabled by default when decompressing JPEG images that use 4:4:0
chrominance subsampling. libjpeg did not and does not implement fancy
4:4:0 upsampling.
The MD5 sums associated with FLOATTEST8=fp-contract and
FLOATTEST12=fp-contract are appropriate for GCC (tested v5 through v13)
with -ffp-contract=fast, which is the default when compiling for an
architecture that has fused multiply-add (FMA) instructions. However,
different MD5 sums are needed for Clang (tested v5 through v14) with
-ffp-contract=on, which is now the default in Clang 14 when compiling
for an architecture that has FMA instructions.
Refer to #705, #709, #710
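As a concrete illustration of why the expected MD5 sums depend on the
contraction mode: a fused multiply-add computes a * b + c with a single
rounding, so its result can differ in the last bit from the separately
rounded sequence. The sketch below is a generic demonstration (compile with
-lm), not libjpeg-turbo code:

  #include <math.h>
  #include <stdio.h>

  int main(void)
  {
    double a = 1.0 + 0x1p-27, b = 1.0 + 0x1p-27, c = -(1.0 + 0x1p-26);
    double separate = a * b + c;  /* may or may not be contracted to FMA,
                                     depending on -ffp-contract */
    double fused = fma(a, b, c);  /* always a single rounding */

    printf("separate = %.17g\nfused    = %.17g\n", separate, fused);
    return 0;
  }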
* libjpeg-turbo/2.1.x:
ChangeLog.md: List CVE ID fixed by ccaba5d7
jpeglib.h: Document that JCS_RGB565 is decomp-only
Fix block smoothing w/vert.-subsampled prog. JPEGs
# By DRC
# Via DRC
* commit 'eadd243':
Fix interblock smoothing with narrow prog. JPEGs
jchuff.c/flush_bits(): Guard against free_bits < 0
jchuff.c/flush_bits(): Guard against put_bits < 0
Restore xform fuzzer behavior from before 19f9d8f0
xform fuzz: Use src subsamp to calc dst buf size
Doc: Mention that we are a JPEG ref implementation
jchuff.c: Test for out-of-range coefficients
turbojpeg.h: Make customFilter() proto match doc
ChangeLog.md: Fix typo
tjTransform(): Calc dst buf size from xformed dims
Fix build warnings/errs w/ -DNO_GETENV/-DNO_PUTENV
GitHub: Fix x32 build
tjexample.c: Prevent integer overflow
jpeg_crop_scanline: Fix calc w/sclg + 2x4,4x2 samp
Decomp: Don't enable 2-pass color quant w/ RGB565
TJBench: w/JPEG input imgs, set min tile= MCU size
Bump version to 2.1.6 to prepare for new commits
GitHub: Add pull request template
Build: Clarify CMAKE_OSX_ARCHITECTURES error
Build: Fail if included with add_subdirectory()
# Conflicts:
# .github/workflows/build.yml
# CMakeLists.txt
# README.md
# release/deb-control.in
# By DRC
# Via DRC
* tag '2.1.5.1':
ChangeLog.md: Document d743a2c1
OSS-Fuzz: Bail out immediately on decomp failure
SIMD/x86: Initialize simd_support before every use
Build: Define THREAD_LOCAL even if !WITH_TURBOJPEG
# Conflicts:
# CMakeLists.txt
The 5x5 interblock smoothing implementation, introduced in libjpeg-turbo
2.1, improperly extended the logic from the traditional 3x3 smoothing
implementation. Both implementations point prev_block_row and
next_block_row to the current block row when processing, respectively,
the first and the last block row in the image:
  if (block_row > 0 || cinfo->output_iMCU_row > 0)
    prev_block_row =
      buffer[block_row - 1] + cinfo->master->first_MCU_col[ci];
  else
    prev_block_row = buffer_ptr;

  if (block_row < block_rows - 1 ||
      cinfo->output_iMCU_row < last_iMCU_row)
    next_block_row =
      buffer[block_row + 1] + cinfo->master->first_MCU_col[ci];
  else
    next_block_row = buffer_ptr;
6d91e950c8 naively extended that logic to
accommodate a 5x5 smoothing window:
  if (block_row > 1 || cinfo->output_iMCU_row > 1)
    prev_prev_block_row =
      buffer[block_row - 2] + cinfo->master->first_MCU_col[ci];
  else
    prev_prev_block_row = prev_block_row;

  if (block_row < block_rows - 2 ||
      cinfo->output_iMCU_row + 1 < last_iMCU_row)
    next_next_block_row =
      buffer[block_row + 2] + cinfo->master->first_MCU_col[ci];
  else
    next_next_block_row = next_block_row;
However, this new logic was only correct if block_rows == 1, so the
values of prev_prev_block_row and next_next_block_row were incorrect
when processing, respectively, the second and second to last iMCU rows
in a vertically-subsampled progressive JPEG image.
The intent was to:
- point prev_block_row to the current block row when processing the
first block row in the image,
- point prev_prev_block_row to prev_block_row when processing the first
two block rows in the image,
- point next_block_row to the current block row when processing the
last block row in the image, and
- point next_next_block_row to next_block_row when processing the last
two block rows in the image.
This commit modifies decompress_smooth_data() so that it computes the
current block row's position relative to the whole image and sets
the block row pointers based on that value.
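A hedged sketch of that approach (compute the block row's absolute position
in the image, then clamp the neighbor indices to the valid range); the names
are illustrative rather than the actual jdcoefct.c code:

  /* Clamp a block row index to the valid range for the whole image. */
  static int clamp_block_row(int row, int total_block_rows)
  {
    if (row < 0) return 0;
    if (row > total_block_rows - 1) return total_block_rows - 1;
    return row;
  }

  /* Derive the neighbor rows from the current block row's absolute
     position rather than from per-iMCU-row conditionals.  Clamping
     naturally yields the intended behavior listed above (e.g. for the
     first block row, both "previous" rows resolve to the current row.) */
  static void neighbor_rows(int image_block_row, int total_block_rows,
                            int rows[4])
  {
    rows[0] = clamp_block_row(image_block_row - 2, total_block_rows);
    rows[1] = clamp_block_row(image_block_row - 1, total_block_rows);
    rows[2] = clamp_block_row(image_block_row + 1, total_block_rows);
    rows[3] = clamp_block_row(image_block_row + 2, total_block_rows);
  }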
This commit also restores a line of code that was accidentally deleted
by 6d91e950c8:
  access_rows += compptr->v_samp_factor; /* prior iMCU row too */
access_rows is merely a sanity check that tells the access_virt_barray()
method to generate an error if accessing the specified number of rows
would cause a buffer overrun. Essentially, it is a belt-and-suspenders
measure to ensure that j*init_d_coef_controller() allocated enough rows
for the full-image virtual array. Thus, excluding that line of code did
not cause an observable issue.
This commit also documents eadd2436ee in
the change log.
Fixes #721
The 5x5 interblock smoothing implementation, introduced in libjpeg-turbo
2.1, improperly extended the logic from the traditional 3x3 smoothing
implementation. Both implementations point prev_block_row and
next_block_row to the current block row when processing, respectively,
the first and the last block row in the image:
  if (block_row > 0 || cinfo->output_iMCU_row > 0)
    prev_block_row =
      buffer[block_row - 1] + cinfo->master->first_MCU_col[ci];
  else
    prev_block_row = buffer_ptr;

  if (block_row < block_rows - 1 ||
      cinfo->output_iMCU_row < last_iMCU_row)
    next_block_row =
      buffer[block_row + 1] + cinfo->master->first_MCU_col[ci];
  else
    next_block_row = buffer_ptr;
6d91e950c8 naively extended that logic to
accommodate a 5x5 smoothing window:
  if (block_row > 1 || cinfo->output_iMCU_row > 1)
    prev_prev_block_row =
      buffer[block_row - 2] + cinfo->master->first_MCU_col[ci];
  else
    prev_prev_block_row = prev_block_row;

  if (block_row < block_rows - 2 ||
      cinfo->output_iMCU_row + 1 < last_iMCU_row)
    next_next_block_row =
      buffer[block_row + 2] + cinfo->master->first_MCU_col[ci];
  else
    next_next_block_row = next_block_row;
However, this new logic was only correct if block_rows == 1, so the
values of prev_prev_block_row and next_next_block_row were incorrect
when processing, respectively, the second and second to last iMCU rows
in a vertically-subsampled progressive JPEG image.
The intent was to:
- point prev_block_row to the current block row when processing the
first block row in the image,
- point prev_prev_block_row to prev_block_row when processing the first
two block rows in the image,
- point next_block_row to the current block row when processing the
last block row in the image, and
- point next_next_block_row to next_block_row when processing the last
two block rows in the image.
This commit modifies decompress_smooth_data() so that it computes the
current block row's position relative to the whole image and sets
the block row pointers based on that value.
This commit also restores a line of code that was accidentally deleted
by 6d91e950c8:
  access_rows += compptr->v_samp_factor; /* prior iMCU row too */
access_rows is merely a sanity check that tells the access_virt_barray()
method to generate an error if accessing the specified number of rows
would cause a buffer overrun. Essentially, it is a belt-and-suspenders
measure to ensure that j*init_d_coef_controller() allocated enough rows
for the full-image virtual array. Thus, excluding that line of code did
not cause an observable issue.
This commit also documents dbae59281f in
the change log.
Fixes #721
This allows debuggers and profilers to reliably capture backtraces from
within the x86-64 SIMD functions.
In places where rbp was previously used to access temporary variables
(after stack alignment), we now use r15 and save/restore it accordingly.
The total amount of work is approximately the same, because the previous
code pushed the pre-alignment stack pointer to the aligned stack. The
new prologue and epilogue actually have fewer instructions.
Also note that the {un}collect_args macros now use rbp instead of rax to
access arguments passed on the stack, so we save a few instructions
there as well.
Based on:
debcc7c3b4
Closes #707
Closes #708
Due to an oversight, the assignment of DC05, DC10, DC15, DC20, and DC25
(the right edge coefficients in the 5x5 interblock smoothing window) in
decompress_smooth_data() was incorrect for images exactly two MCU blocks
wide. For such images, DC04, DC09, DC14, DC19, and DC24 were assigned
values based on the last MCU column, but DC05, DC10, DC15, DC20, and
DC25 were assigned values based on the first MCU column (because
block_num + 1 was never less than last_block_column.) This commit
modifies jdcoefct.c so that, for images at least two MCU blocks wide,
DC05, DC10, DC15, DC20, and DC25 are assigned the same values as DC04,
DC09, DC14, DC19, and DC24 (respectively.) DC05, DC10, DC15, DC20, and
DC25 are then immediately overwritten for images more than two MCU
blocks wide.
Since this issue was minor and not likely obvious to an end user, the
fix is undocumented.
Fixes #700
(regression introduced by fc01f4673b)
Because of the TurboJPEG 3 API overhaul, the pure compression code path
in tjexample.c now creates a TurboJPEG compression instance prior to
calling tj3LoadImage(). However, due to an oversight, a compression
instance was no longer created in the recompression code path. Thus,
that code path attempted to reuse the TurboJPEG decompression instance,
which caused an error ("tj3Set(): TJPARAM_QUALITY is not applicable to
decompression instances.") This commit modifies tjexample.c so that the
recompression code path creates a new TurboJPEG compression instance if
one has not already been created.
Fixes #716
This fixes a buffer overrun, reported by OSS-Fuzz, that occurred when
attempting to transform a specially-crafted malformed arithmetic-coded
JPEG image into a baseline Huffman-coded JPEG destination image with
default Huffman tables. This issue probably had a similar root cause to
the issue fixed in 31a301389b, but in this
case, the issue only occurred with the SIMD baseline Huffman encoder in
libjpeg-turbo 2.1.x and 2.0.x. It was not reproducible in 3.0.x or when
using the C baseline Huffman encoder. (NOTE: In order to reproduce the
issue with 2.1.x, it was necessary to revert
58cee6d90cd616c86fae142891b987c289ea055a.)
Because libjpeg-turbo 3.0.x now supports multiple data precisions in the
same build, the regression test system can test the 8-bit and 12-bit
floating point DCT/IDCT algorithms separately. The expected MD5 sums
for those tests are communicated to the test system using the FLOATTEST8
and FLOATTEST12 CMake variables. Whereas it is possible to
intelligently set a default value for FLOATTEST8 when building for
x86[-64] and a default value for FLOATTEST12 when building for x86-64,
it is not possible with other architectures. (Refer to #705, #709,
and #710.) Clang 14, for example, now enables FMA (fused multiply-add)
instructions by default on architectures that support them, but with
AArch64 builds, the results are not the same as when using GCC/AArch64
with FMA instructions enabled. Thus, setting FLOATTEST12=fp-contract
doesn't make the tests pass. It was already impossible to intelligently
set a default for FLOATTEST8 with i386 builds, but referring to #710,
that appears to be the case with other non-x86-64 builds as well.
Back in 1991, when Tom Lane first released libjpeg, some CPUs had
floating point units and some didn't. It could take minutes to compress
or decompress a 1-megapixel JPEG image using the "slow" integer DCT/IDCT
algorithms, and the floating point algorithms were significantly faster
on systems that had an FPU. On systems without FPUs, floating point
math was emulated and really slow, so Tom also developed "fast" integer
DCT/IDCT algorithms to speed up JPEG performance, at the expense of
accuracy, on those systems. Because of libjpeg-turbo's SIMD extensions,
the floating point algorithms are now significantly slower than the
"slow" integer algorithms without being significantly more accurate, and
the "fast" integer algorithms fail the ISO/ITU-T conformance tests
without being any faster than the "slow" integer algorithms on x86
systems. Thus, the floating point and "fast" integer algorithms are
considered legacy features.
In order for the floating point regression tests to be useful, the
results of the tests must be validated against an independent metric.
(In other words, it wouldn't be useful to use the floating point
DCT/IDCT algorithms to determine the expected results of the floating
point DCT/IDCT algorithms.) In the past, I attempted without success to
develop a low-level floating point test that would run at configure time
and determine the appropriate default value of FLOATTEST*. Barring that
approach, the only other possibilities would be:
1. Develop a test framework that compares the floating point results
with a margin of error, as TJUnitTest does. However, that effort isn't
justified unless it could also benefit non-legacy features.
2. Compare the floating point results against an expected MD5 sum, as we
currently do. However, as previously described, it isn't possible in
most cases to determine an appropriate default value for the expected
MD5 sum.
For the moment, it makes the most sense to disable the 8-bit floating
point tests by default except with x86[-64] builds and to disable the
12-bit floating point tests by default except with x86-64 builds. That
means that the floating point algorithms will still be regression tested
when performing x86[-64] builds, but other types of builds will have to
opt in to the same regression tests. Since the floating point DCT/IDCT
algorithms are unlikely to change ever again (the only reason they still
exist at all is to maintain backward compatibility with libjpeg), this
seems like a reasonable tradeoff.
This fixes a UBSan negative shift warning, reported by OSS-Fuzz, that
occurred when attempting to transform a specially-crafted malformed
arithmetic-coded JPEG image into a baseline Huffman-coded JPEG
destination image with default Huffman tables. This issue probably
had a similar root cause to the issue fixed in
31a301389b, but in this case, the issue
only occurred with the SIMD baseline Huffman encoder in libjpeg-turbo
2.1.x. It was not reproducible in 2.0.x or 3.0.x or when using the
C baseline Huffman encoder.