DRC | 3c50582e77 | Oops. The Windows version of collect_args/uncollect_args uses rsp, so we still need the rsp prologue/epilogue, despite the fact that we aren't using the stack as a work area. This fixes a segfault on Windows caused by r1335. | 2014-08-09 22:58:18 +00:00 |
DRC | b7efc273a0 | Attempt to improve performance by refactoring the compression-side color conversion and DCT algorithms so that they take full advantage of the additional registers available with 64-bit SSE2. This produces a somewhat yawn-worthy speedup of 2-3%, but at least the code is a lot more readable now. | 2014-08-09 14:30:28 +00:00 |
DRC | 82b8751482 | Fix performance and other issues uncovered in testing with actual ARM64 hardware; formatting tweaks; remove NEON platform check (NEON is always available with ARMv8) | 2014-07-23 14:14:14 +00:00 |
DRC | e2f5c7cab3 | Add proper support for Borland compilers (Borland needs section names to be prefixed with an underscore, and it needs OMF object files.) | 2014-06-22 21:14:39 +00:00 |
DRC | 76ef3c5dda | Allow for building the MIPS DSPr2 extensions if the host is mips-* as well as mipsel-*. The DSPr2 extensions are little endian, so we still have to check that the compiler defines __MIPSEL__ before enabling them. This paves the way for supporting big-endian MIPS, and in the near term, it allows the SIMD extensions to be built with Sourcery CodeBench. | 2014-05-19 19:13:22 +00:00 |
DRC | 6263c1fc1b | SIMD-accelerated int upsample routine for MIPS DSPr2 | 2014-05-18 20:04:47 +00:00 |
DRC | ecfbabdbf3 | Fix MIPS build | 2014-05-18 19:36:05 +00:00 |
DRC | 144e7b79e4 | Remove MS-DOS code and information, and adjust copyright headers to reflect the removal of features in r1307 and r1308. libjpeg-turbo has never supported MS-DOS, nor is it even possible for us to do so. | 2014-05-18 18:33:44 +00:00 |
DRC | f8301c92dd | Get rid of the HAVE_PROTOTYPES configuration option, as well as the related JMETHOD and JPP macros. libjpeg-turbo has never supported compilers that don't handle prototypes. Doing so requires ansi2knr, which isn't even supported in the IJG code anymore. | 2014-05-16 10:43:44 +00:00 |
DRC | 2c0b793539 | Remove all of the NEED_SHORT_EXTERNAL_NAMES stuff. There is scant information available as to which linkers ever had a 15-character global symbol name limit. AFAICT, it might have been a VMS and/or a.out BSD thing, but none of those platforms have ever been supported by libjpeg-turbo (nor are such systems supported by other open source libraries of this nature.) | 2014-05-15 20:30:16 +00:00 |
DRC | 5d5b9a497b | Clean up code formatting in the SIMD interface functions | 2014-05-15 19:45:11 +00:00 |
DRC | 99de998e2c | SIMD-accelerated NULL convert routine for MIPS DSPr2 | 2014-05-15 18:26:01 +00:00 |
DRC | a37736dd43 | Fix error in MIPS DSPr2 accelerated smooth downsample routine | 2014-05-15 17:10:39 +00:00 |
DRC | c4c3ac6305 | SIMD-accelerated h2v2 smooth downsampling routine for MIPS DSPr2 | 2014-05-14 15:00:10 +00:00 |
DRC | 38bfd451d5 | SIMD-accelerated merged upsampling routines for MIPS DSPr2 | 2014-05-13 18:40:14 +00:00 |
DRC | 84f9fbfe3e | Modify Windows build system to take into account new assembly file names | 2014-05-10 10:10:03 +00:00 |
DRC | 1bd801a872 | Using subdirectories unfortunately opened up a can of worms. In order to prevent object name conflicts, it is necessary to use the subdir-objects automake directive, but it simply doesn't work right on some of the versions of automake we still have to support. Another option would be to add a separate Makefile.am file to each subdirectory, but that requires maintaining a completely different set of build rules for each one. Fortunately, however, we're in the 21st century now, so we can use filenames longer than 8.3. | 2014-05-10 09:53:34 +00:00 |
DRC | 6af3f00efa | Re-organize the x86/x86-64 SIMD routines into separate folders by instruction set so we can name each routine similarly to its corresponding C file. This also makes it easier to add support for new instruction sets. | 2014-05-09 20:14:26 +00:00 |
DRC | 0d25e86574 | Remove trailing spaces (+ one additional tab in TJUnitTest.java that was missed in the previous commit) | 2014-05-09 18:06:58 +00:00 |
DRC | e45363d7c2 | Convert tabs to spaces in the libjpeg code and the SIMD code (TurboJPEG retains the use of tabs for historical reasons. They were annoying in the libjpeg code primarily because they were not consistently used and because they were used to format as well as indent the code. In the case of TurboJPEG, tabs are used just to indent the code, so even if the editor assumes a different tab width, the code will still be readable.) | 2014-05-09 18:00:32 +00:00 |
DRC | ef1c66701a | Fix an error in the MIPS DSPr2 fancy upsampling routine | 2014-05-09 14:45:55 +00:00 |
DRC | 7824f70008 | SIMD-accelerated slow integer IDCT routine for MIPS DSPr2 | 2014-05-06 09:53:21 +00:00 |
DRC | bf417e56e0 | Remove trailing space | 2014-02-06 19:13:24 +00:00 |
DRC | 890f35098a | Create a separate stub file for 64-bit ARM, since it currently implements only the decompression-related functions. | 2014-02-05 19:03:41 +00:00 |
DRC | 1bb1e69186 | First pass at ARMv8 64-bit NEON SIMD support | 2014-02-05 08:15:44 +00:00 |
DRC | abb6a513fa | Formatting tweaks | 2014-02-05 07:39:38 +00:00 |
DRC | bd029eb0f7 | Make environment variable syntax consistent between ARM and x86 code, and add an option to disable SIMD on x86 (this option will be added to the x86-64 code as well, but it makes more sense to add it when we add AVX support.) | 2013-10-31 07:40:24 +00:00 |
DRC | c6c8c7911f | SIMD-accelerated integer convsamp routine for MIPS DSPr2 | 2013-10-12 21:39:20 +00:00 |
DRC | 3c6b1ba545 | SIMD-accelerated floating point quantize and convsamp routines for MIPS DSPr2 | 2013-10-09 18:39:44 +00:00 |
DRC | 10138c9d35 | SIMD-accelerated fast integer inverse DCT routine for MIPS DSPr2 | 2013-10-08 02:18:59 +00:00 |
DRC | 6addfed58b | SIMD-accelerated fast integer forward DCT routine for MIPS DSPr2 | 2013-10-08 02:11:21 +00:00 |
DRC | 01f46504ee | SIMD-accelerated slow integer forward DCT and quantize routines for MIPS DSPr2 | 2013-09-30 18:13:27 +00:00 |
DRC | 198cc7c161 | SIMD-accelerated 3/4 and 3/2 decompression scaling for MIPS DSPr2 | 2013-09-27 17:51:08 +00:00 |
DRC | f934fc621e | SIMD-accelerated 1/2 and 1/4 decompression scaling for MIPS DSPr2 | 2013-09-27 17:43:23 +00:00 |
DRC | 154c2dc749 | SIMD-optimized RGB-to-grayscale conversion for MIPS DSPr2 | 2013-09-27 17:39:57 +00:00 |
DRC | 8abacdbd4b | Fix segfault in MIPS DSPr2 upsample routines that occurred when doing 'make test' | 2013-09-25 17:33:37 +00:00 |
DRC | 0213d619bf | Fix 'make dist' | 2013-08-23 07:57:21 +00:00 |
DRC | aa5a1808fe | SIMD support for performing upsampling using MIPS DSPr2 instructions | 2013-07-27 21:50:02 +00:00 |
DRC | 3f2e3b11f0 | SIMD support for performing downsampling using MIPS DSPr2 instructions | 2013-07-27 21:48:18 +00:00 |
DRC | 41e3657631 | SIMD support for performing fancy upsampling using MIPS DSPr2 instructions | 2013-07-27 21:44:14 +00:00 |
DRC | 64da9d6ba8 | SIMD support for performing color conversion using MIPS DSPr2 instructions | 2013-07-24 21:50:20 +00:00 |
DRC | c4c137f5e9 | Fix the x86 build with NASM 0.98. Since NASM 0.98 is the default version on OS X, we want to at least allow people to build 32-bit code with it, even though it can't properly build 64-bit code. | 2013-01-13 12:12:53 +00:00 |
DRC | f506e2cbb5 | Fix build issues that occurred whenever the source directory contained the letters "col", "mer", or "gra". | 2012-08-07 21:59:28 +00:00 |
DRC | 7b8366223a | More recent versions of autoconf add -traditional-cpp to the CPP flags, which causes jsimdcfg.inc.h to not preprocess correctly unless we expand all of the instances of the #definev macro. | 2012-06-28 23:24:29 +00:00 |
DRC | 791e9b4316 | Fixed regression caused by a bug in the 32-bit strict memory access code in jdmrgss2.asm (contributed by Chromium to stop valgrind from whining whenever the output buffer size was not evenly divisible by 16 bytes.) On Linux/x86, this regression caused incorrect pixels on the right-hand side of images whose rows were not 16-byte aligned, whenever fancy upsampling was used. This patch also enables the strict memory access code on all platforms, not just Linux (it does no harm on other platforms) and removes a couple of pcmpeqb instructions that were rendered unnecessary by r836. | 2012-06-15 21:58:06 +00:00 |
DRC | 7702469e15 | Eliminate the use of the MASKMOVDQU instruction, to speed up decompression performance by 10x on AMD Bobcat embedded processors (and ~5% on AMD desktop processors.) | 2012-06-13 01:23:09 +00:00 |
DRC | ca423d39a3 | Preserve all 128 bits of xmm6 and xmm7 | 2012-04-26 19:48:33 +00:00 |
DRC | f3e8b02df5 | Visual Studio 2010 doesn't like the wildcard, so let CMake expand it instead. | 2012-03-17 14:32:05 +00:00 |
DRC | dbfa2648d8 | Accelerated 4:2:2 upsampling routine for ARM (improves performance ~20-30% when decompressing 4:2:2 JPEGs using fancy upsampling) | 2012-02-02 22:32:45 +00:00 |
DRC | a112f12efd | Compiler warnings | 2012-01-31 05:27:41 +00:00 |