mozjpeg

Author	SHA1	Message	Date
DRC	3c17063ef1	Guard against dupe SOF w/ incorrect source manager Referring to https://bugzilla.mozilla.org/show_bug.cgi?id=1898606, attempting to decompress a specially-crafted malformed JPEG image (specifically an image with a complete 12-bit Start Of Frame segment followed by an incomplete 8-bit Start Of Frame segment) using the default marker processor, buffered-image mode, and input prefetching triggered the following sequence of events: - When the 12-bit SOF segment was encountered (in the body of jpeg_read_header()), the marker processor's read_markers() method called the get_sof() function, which processed the 12-bit SOF segment and set cinfo->data_precision to 12. - If the application subsequently called jpeg_consume_input() in a loop to prefetch input data, and it didn't stop calling jpeg_consume_input() when the function returned JPEG_REACHED_SOS, then the 8-bit SOF segment was encountered in the body of jpeg_consume_input(). As a result, the marker processor's read_markers() method called get_sof(), which started to process the 8-bit SOF segment and set cinfo->data_precision to 8. - Since the 8-bit SOF segment was incomplete, the end of the JPEG data stream was encountered when get_sof() attempted to read the image height, width, and number of components. - If the fill_input_buffer() method in the application's custom source manager incorrectly returned FALSE in response to a prematurely- terminated JPEG data stream, then get_sof() returned FALSE while attempting to read the image height, width, and number of components (before the duplicate SOF check was reached.) That caused the default marker processor's read_markers() method, and subsequently jpeg_consume_input(), to return JPEG_SUSPENDED. - If the application failed to respond to the JPEG_SUSPENDED return value and subsequently attempted to call jpeg_read_scanlines(), then the data precision check in jpeg_read_scanlines() succeeded (because cinfo->data_precision was now 8.) However, because cinfo->data_precision had been 12 during the previous call to jpeg_start_decompress(), only the 12-bit version of the main controller was initialized, and the cinfo->main->process_data() method was undefined. Thus, a segfault occurred when jpeg_read_scanlines() attempted to invoke that method. Scenarios in which the issue was thwarted: 1. The default source managers handle a prematurely-terminated JPEG data stream by inserting a fake EOI marker into the data stream. Thus, when using one of those source managers, the INPUT_2BYTES() and INPUT_BYTE() macros (which get_sof() invokes to read the image height, width, and number of components) succeeded-- albeit with bogus data, since the fake EOI marker was read into those fields. The duplicate SOF check in get_sof() then failed, generating a fatal libjpeg error. 2. When using a custom source manager that correctly returns TRUE in response to a prematurely-terminated JPEG data stream, the aforementioned INPUT_2BYTES() and INPUT_BYTE() macros also succeeded (albeit with bogus data read from the previous bytes of the data stream), and the duplicate SOF check failed. 3. If the application did not prefetch input data, or if it stopped invoking jpeg_consume_input() when the function returned JPEG_REACHED_SOS, then the duplicate SOF segment was not read prior to the first call to jpeg_read_scanlines(). Thus, the data precision check in jpeg_read_scanlines() failed. If the application instead called jpeg12_read_scanlines() (that is, if it properly supported multiple data precisions), then the duplicate SOF segment was not read until the body of jpeg_finish_decompress(). At that point, its only negative effect was to cause jpeg_finish_decompress() to return FALSE before the duplicate SOF check was reached. In other words, this issue depended not only upon an incorrectly-written source manager but also upon a very specific sequence of API calls. It also depended upon the multi-precision feature introduced in libjpeg-turbo 3.0.x. When using an 8-bit-per-sample build of libjpeg-turbo 2.1.x, jpeg_read_header() failed with "Unsupported JPEG data precision 12" after the 12-bit SOF segment was processed. When using a 12-bit-per-sample build of libjpeg-turbo 2.1.x, the behavior was the same as if the application called jpeg12_read_scanlines() in Scenario 3 above. This commit simply moves the duplicate SOF check to the top of get_sof() so the check will fail before the marker processor attempts to read the duplicate SOF. It should be noted that this issue isn't a libjpeg-turbo bug per se, because it occurs only when the calling application does something it shouldn't. It is, rather, an issue of API hardening/caller-proofing.	2024-05-29 10:08:24 -04:00
DRC	0fc7313e54	Don't traverse linked list when saving a marker If the calling application invokes jpeg_save_markers() to save a particular type of marker, then the save_marker() function will be invoked for every marker of that type that is encountered. Previously, only the head of the marker linked list was stored (in jpeg_decompress_struct), so save_marker() had to traverse the entire linked list before it could add a new marker to the tail of the list. That caused CPU usage to grow exponentially with the number of markers. Referring to #764, it is possible to create a JPEG image that contains an excessive number of markers. The specific reproducer that uncovered this issue is a specially-crafted 1-megabyte malformed JPEG image with tens of thousands of APP1 markers, which required approximately 30 seconds of CPU time (on a modern Intel processor) to process. However, it should also be possible to create a legitimate JPEG image that reproduces the issue (such as an image with tens of thousands of duplicate EXIF tags.) This commit introduces a new pointer (in jpeg_decomp_master, in order to preserve backward ABI compatibility) that is used to store the tail of the marker linked list whenever a marker is added to it. Thus, it is no longer necessary to traverse the list when adding a marker, and CPU usage will grow linearly rather than exponentially with the number of markers. Fixes #764	2024-05-14 14:46:33 -04:00
DRC	97772cba65	Merge branch 'ijg.lossless' into dev Refer to #402	2022-11-14 15:36:25 -06:00
DRC	217d1a75f5	Clean up the lossless JPEG feature - Rename jpeg_simple_lossless() to jpeg_enable_lossless() and modify the function so that it stores the lossless parameters directly in the Ss and Al fields of jpeg_compress_struct rather than using a scan script. - Move the cjpeg -lossless switch into "Switches for advanced users". - Document the libjpeg API and run-time features that are unavailable in lossless mode, and ensure that all parameters, functions, and switches related to unavailable features are ignored or generate errors in lossless mode. - Defer any action that depends on whether lossless mode is enabled until jpeg_start_compress()/jpeg_start_decompress() is called. - Document the purpose of the point transform value. - "Codec" stands for coder/decoder, so it is a bit awkward to say "lossless compression codec" and "lossless decompression codec". Use "lossless compressor" and "lossless decompressor" instead. - Restore backward API/ABI compatibility with libjpeg v6b: * Move the new 'lossless' field from the exposed jpeg_compress_struct and jpeg_decompress_struct structures into the opaque jpeg_comp_master and jpeg_decomp_master structures, and allocate the master structures in the body of jpeg_create_compress() and jpeg_create_decompress(). * Remove the new 'process' field from jpeg_compress_struct and jpeg_decompress_struct and replace it with the old 'progressive_mode' field and the new 'lossless' field. * Remove the new 'data_unit' field from jpeg_compress_struct and jpeg_decompress_struct and replace it with a locally-computed data unit variable. * Restore the names of macros and fields that refer to DCT blocks, and document that they have a different meaning in lossless mode. (Most of them aren't very meaningful in lossless mode anyhow.) * Remove the new alloc_darray() method from jpeg_memory_mgr and replace it with an internal macro that wraps the alloc_sarray() method. * Move the JDIFF* data types from jpeglib.h and jmorecfg.h into jpegint.h. * Remove the new 'codec' field from jpeg_compress_struct and jpeg_decompress_struct and instead reuse the existing internal coefficient control, forward/inverse DCT, and entropy encoding/decoding structures for lossless compression/decompression. * Repurpose existing error codes rather than introducing new ones. (The new JERR_BAD_RESTART and JWRN_MUST_DOWNSCALE codes remain, although JWRN_MUST_DOWNSCALE will probably be removed in libjpeg-turbo, since we have a different way of handling multiple data precisions.) - Automatically enable lossless mode when a scan script with parameters that are only valid for lossless mode is detected, and document the use of scan scripts to generate lossless JPEG images. - Move the sequential and shared Huffman routines back into jchuff.c and jdhuff.c, and document that those routines are shared with jclhuff.c and jdlhuff.c as well as with jcphuff.c and jdphuff.c. - Move MAX_DIFF_BITS from jchuff.h into jclhuff.c, the only place where it is used. - Move the predictor and scaler code into jclossls.c and jdlossls.c. - Streamline register usage in the [un]differencers (inspired by similar optimizations in the color [de]converters.) - Restructure the logic in a few places to reduce duplicated code. - Ensure that all lossless-specific code is guarded by C_LOSSLESS_SUPPORTED or D_LOSSLESS_SUPPORTED and that the library can be built successfully if either or both of those macros is undefined. - Remove all short forms of external names introduced by the lossless JPEG patch. (These will not be needed by libjpeg-turbo, so there is no use cleaning them up.) - Various wordsmithing, formatting, and punctuation tweaks - Eliminate various compiler warnings.	2022-11-14 14:55:04 -06:00
DRC	e8b40f3c2b	Vastly improve 12-bit JPEG integration The Gordian knot that `7fec5074f9` attempted to unravel was caused by the fact that there are several data-precision-dependent (JSAMPLE-dependent) fields and methods in the exposed libjpeg API structures, and if you change the exposed libjpeg API structures, then you have to change the whole API. If you change the whole API, then you have to provide a whole new library to support the new API, and that makes it difficult to support multiple data precisions in the same application. (It is not impossible, as example.c demonstrated, but using data-precision-dependent libjpeg API structures would have made the cjpeg, djpeg, and jpegtran source code hard to read, so it made more sense to build, install, and package 12-bit-specific versions of those applications.) Unfortunately, the result of that initial integration effort was an unreadable and unmaintainable mess, which is a problem for a library that is an ISO/ITU-T reference implementation. Also, as I dug into the problem of lossless JPEG support, I realized that 16-bit lossless JPEG images are a thing, and supporting yet another version of the libjpeg API just for those images is untenable. In fact, however, the touch points for JSAMPLE in the exposed libjpeg API structures are minimal: - The colormap and sample_range_limit fields in jpeg_decompress_struct - The alloc_sarray() and access_virt_sarray() methods in jpeg_memory_mgr - jpeg_write_scanlines() and jpeg_write_raw_data() - jpeg_read_scanlines() and jpeg_read_raw_data() - jpeg_skip_scanlines() and jpeg_crop_scanline() (This is subtle, but both of those functions use JSAMPLE-dependent opaque structures behind the scenes.) It is much more readable and maintainable to provide 12-bit-specific versions of those six top-level API functions and to document that the aforementioned methods and fields must be type-cast when using 12-bit samples. Since that eliminates the need to provide a 12-bit-specific version of the exposed libjpeg API structures, we can: - Compile only the precision-dependent libjpeg modules (the coefficient buffer controllers, the colorspace converters, the DCT/IDCT managers, the main buffer controllers, the preprocessing and postprocessing controller, the downsampler and upsamplers, the quantizers, the integer DCT methods, and the IDCT methods) for multiple data precisions. - Introduce 12-bit-specific methods into the various internal structures defined in jpegint.h. - Create precision-independent data type, macro, method, field, and function names that are prefixed by an underscore, and use an internal header to convert those into precision-dependent data type, macro, method, field, and function names, based on the value of BITS_IN_JSAMPLE, when compiling the precision-dependent libjpeg modules. - Expose precision-dependent jinit() functions for each of the precision-dependent libjpeg modules. - Abstract the precision-dependent libjpeg modules by calling the appropriate precision-dependent jinit() function, based on the value of cinfo->data_precision, from top-level libjpeg API functions.	2022-11-04 12:30:33 -05:00
DRC	ec6e451d05	Lossless JPEG support: Add copyright attributions Referring to https://github.com/libjpeg-turbo/libjpeg-turbo/issues/402#issuecomment-768348440 and https://github.com/libjpeg-turbo/libjpeg-turbo/issues/402#issuecomment-770221584 Ken Murchison clarified that it was his intent to release the lossless JPEG patch under the IJG License and that adding his name to the copyright headers would be sufficient to acknowledge that any derivatives are based on his work.	2022-10-21 16:53:53 -05:00
Ken Murchison	2e8360e061	IJG's JPEG software v6b with lossless JPEG support Patch obtained from: https://sourceforge.net/projects/jpeg/files/ftp.oceana.com Author date taken from original announcement and timestamp of patch tarball: https://groups.google.com/g/comp.protocols.dicom/c/rrkP8BxoMRk/m/Ij4dfprggp8J	2022-10-21 13:42:59 -05:00
DRC	7fec5074f9	Support 8-bit & 12-bit JPEGs using the same build Partially implements #199 This commit also implements a request from #178 (the ability to compile the libjpeg example as a standalone program.)	2022-03-10 22:56:17 -06:00
DRC	172972394a	Eliminate non-ANSI C compatibility macros libjpeg-turbo has never supported non-ANSI C compilers. Per the spec, ANSI C compilers must have locale.h, stddef.h, stdlib.h, memset(), memcpy(), unsigned char, and unsigned short. They must also handle undefined structures.	2022-01-06 11:50:26 -06:00
DRC	52fef34928	Eliminate internal use of GETJOCTET() macro Because of `01e3032354` (officially eliminating support for compilers without unsigned char, since we never effectively supported those compilers anyhow), GETJOCTET() is now a no-op. Since that macro is in jmorecfg.h, it is part of the de facto libjpeg API and must remain in the public headers. However, there is no reason to continue using it internally, and eliminating its internal use improves code readability.	2019-12-10 19:10:55 -06:00
DRC	19c791cdac	Improve code formatting consistency With rare exceptions ... - Always separate line continuation characters by one space from preceding code. - Always use two-space indentation. Never use tabs. - Always use K&R-style conditional blocks. - Always surround operators with spaces, except in raw assembly code. - Always put a space after, but not before, a comma. - Never put a space between type casts and variables/function calls. - Never put a space between the function name and the argument list in function declarations and prototypes. - Always surround braces ('{' and '}') with spaces. - Always surround statements (if, for, else, catch, while, do, switch) with spaces. - Always attach pointer symbols ('' and '') to the variable or function name. - Always precede pointer symbols ('' and '**') by a space in type casts. - Use the MIN() macro from jpegint.h within the libjpeg and TurboJPEG API libraries (using min() from tjutil.h is still necessary for TJBench.) - Where it makes sense (particularly in the TurboJPEG code), put a blank line after variable declaration blocks. - Always separate statements in one-liners by two spaces. The purpose of this was to ease maintenance on my part and also to make it easier for contributors to figure out how to format patch submissions. This was admittedly confusing (even to me sometimes) when we had 3 or 4 different style conventions in the same source tree. The new convention is more consistent with the formatting of other OSS code bases. This commit corrects deviations from the chosen formatting style in the libjpeg API code and reformats the TurboJPEG API code such that it conforms to the same standard. NOTES: - Although it is no longer necessary for the function name in function declarations to begin in Column 1 (this was historically necessary because of the ansi2knr utility, which allowed libjpeg to be built with non-ANSI compilers), we retain that formatting for the libjpeg code because it improves readability when using libjpeg's function attribute macros (GLOBAL(), etc.) - This reformatting project was accomplished with the help of AStyle and Uncrustify, although neither was completely up to the task, and thus a great deal of manual tweaking was required. Note to developers of code formatting utilities: the libjpeg-turbo code base is an excellent test bed, because AFAICT, it breaks every single one of the utilities that are currently available. - The legacy (MMX, SSE, 3DNow!) assembly code for i386 has been formatted to match the SSE2 code (refer to ff5685d5344273df321eb63a005eaae19d2496e3.) I hadn't intended to bother with this, but the Loongson MMI implementation demonstrated that there is still academic value to the MMX implementation, as an algorithmic model for other 64-bit vector implementations. Thus, it is desirable to improve its readability in the same manner as that of the SSE2 implementation.	2018-03-16 02:14:34 -05:00
DRC	bd49803f92	Use consistent/modern code formatting for pointers The convention used by libjpeg: type * variable; is not very common anymore, because it looks too much like multiplication. Some (particularly C++ programmers) prefer to tuck the pointer symbol against the type: type* variable; to emphasize that a pointer to a type is effectively a new type. However, this can also be confusing, since defining multiple variables on the same line would not work properly: type* variable1, variable2; /* Only variable1 is actually a pointer. / This commit reformats the entirety of the libjpeg-turbo code base so that it uses the same code formatting convention for pointers that the TurboJPEG API code uses: type variable1, *variable2; This seems to be the most common convention among C programmers, and it is the convention used by other codec libraries, such as libpng and libtiff.	2016-02-19 09:10:07 -06:00
DRC	1e32fe3113	Replace INT32 with a new internal datatype (JLONG) These days, INT32 is a commonly-defined datatype in system headers. We cannot eliminate the definition of that datatype from jmorecfg.h, since the INT32 typedef has technically been part of the libjpeg API since version 5 (1994.) However, using INT32 internally is risky, because the inclusion of a particular header (Xmd.h, for instance) could change the definition of INT32 from long to int on 64-bit platforms and thus change the internal behavior of libjpeg-turbo in unexpected ways (for instance, failing to correctly set __INT32_IS_ACTUALLY_LONG to match the INT32 typedef-- perhaps as a result of including the wrong version of jpeglib.h-- could cause libjpeg-turbo to produce incorrect results.) The library has always been built in environments in which INT32 is effectively long (on Windows, long is always 32-bit, so effectively it's the same as int), so it makes sense to turn INT32 into an explicitly long datatype. This ensures that libjpeg-turbo will always behave consistently, regardless of the headers included at compile time. Addresses a concern expressed in #26.	2015-10-14 20:34:32 -05:00
DRC	7e3acc0e0a	Rename README, LICENSE, BUILDING text files The IJG README file has been renamed to README.ijg, in order to avoid confusion (many people were assuming that that was our project's README file and weren't reading README-turbo.txt) and to lay the groundwork for markdown versions of the libjpeg-turbo README and build instructions.	2015-10-10 10:31:33 -05:00
Thomas G. Lane	5ead57a34a	The Independent JPEG Group's JPEG software v6b	2015-07-27 13:43:00 -05:00
Thomas G. Lane	489583f516	The Independent JPEG Group's JPEG software v6a	2015-07-29 15:32:35 -05:00
Thomas G. Lane	bc79e0680a	The Independent JPEG Group's JPEG software v6	2015-07-29 15:31:30 -05:00
Thomas G. Lane	36a4ccccd3	The Independent JPEG Group's JPEG software v5	2015-07-29 15:28:00 -05:00
DRC	5de454b291	libjpeg-turbo has never supported non-ANSI compilers, so get rid of the crufty SIZEOF() macro. It was not being used consistently anyhow, so it would not have been possible to build prior releases of libjpeg-turbo using the broken compilers for which that macro was designed. git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1313 632fc199-4ca6-4c93-a231-07263d6284db	2014-05-18 19:04:03 +00:00
DRC	5033f3e19a	Remove MS-DOS code and information, and adjust copyright headers to reflect the removal of features in r1307 and r1308. libjpeg-turbo has never supported MS-DOS, nor is it even possible for us to do so. git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1312 632fc199-4ca6-4c93-a231-07263d6284db	2014-05-18 18:33:44 +00:00
DRC	e5eaf37440	Convert tabs to spaces in the libjpeg code and the SIMD code (TurboJPEG retains the use of tabs for historical reasons. They were annoying in the libjpeg code primarily because they were not consistently used and because they were used to format as well as indent the code. In the case of TurboJPEG, tabs are used just to indent the code, so even if the editor assumes a different tab width, the code will still be readable.) git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1278 632fc199-4ca6-4c93-a231-07263d6284db	2014-05-09 18:00:32 +00:00
DRC	d4ab63d191	Fix several potential overflow issues identified by the community. git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.3.x@1114 632fc199-4ca6-4c93-a231-07263d6284db	2014-02-06 19:31:50 +00:00
DRC	7ebf2941a9	Fix CVE-2013-6629 and CVE-2013-6630 git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.3.x@1089 632fc199-4ca6-4c93-a231-07263d6284db	2013-11-21 18:32:48 +00:00
DRC	43d8cf4d45	Fix CVE-2013-6629 and CVE-2013-6630 git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.2.x@1090 632fc199-4ca6-4c93-a231-07263d6284db	2013-11-21 18:34:39 +00:00
DRC	a6ef282a49	Some of the IJG headers say "Modified by", so clarify that our "Modifications" are not referring to these. git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.3.x@1053 632fc199-4ca6-4c93-a231-07263d6284db	2013-09-28 03:23:49 +00:00
DRC	a73e870ad0	Change the copyright notices to make it clear that our modified files are not part of the IJG's software. git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.2.x@873 632fc199-4ca6-4c93-a231-07263d6284db	2012-12-31 02:52:30 +00:00
DRC	dd2b651243	Guard against num_components being a ridiculous value due to a corrupt header git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.2.x@831 632fc199-4ca6-4c93-a231-07263d6284db	2012-05-30 20:36:42 +00:00
DRC	8c8124bf51	Oops. Need to handle cases in which num_components > n git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.2.x@751 632fc199-4ca6-4c93-a231-07263d6284db	2012-01-28 01:19:23 +00:00
DRC	12781cb555	Properly decompress erroneous CMYK/YCCK images whose K component has an ID of 1 instead of 4 (this is to support SumatraPDF) git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@740 632fc199-4ca6-4c93-a231-07263d6284db	2012-01-27 01:23:20 +00:00

29 Commits