DRC
1dbceb51c9
ARM64 NEON SIMD support for YCC-to-RGB565 conversion
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1386 632fc199-4ca6-4c93-a231-07263d6284db
2014-08-23 15:57:38 +00:00
DRC
d166012ae3
ARM NEON SIMD support for YCC-to-RGB565 conversion, and optimizations to the existing YCC-to-RGB conversion code:
...
-----
aee36252be .patch
From aee36252be20054afce371a92406fc66ba6627b5 Mon Sep 17 00:00:00 2001
From: Siarhei Siamashka <siarhei.siamashka@gmail.com >
Date: Wed, 13 Aug 2014 03:50:22 +0300
Subject: [PATCH] ARM: Faster NEON yuv->rgb conversion for Krait and Cortex-A15
The older code was developed and tested only on ARM Cortex-A8 and ARM Cortex-A9.
Tuning it for newer ARM processors can introduce some speed-up (up to 20%).
The performance of the inner loop (conversion of 8 pixels) improves from
~27 cycles down to ~22 cycles on Qualcomm Krait 300, and from ~20 cycles
down to ~18 cycles on ARM Cortex-A15.
The performance remains exactly the same on ARM Cortex-A7 (~58 cycles),
ARM Cortex-A8 (~25 cycles) and ARM Cortex-A9 (~30 cycles) processors.
Also use larger indentation in the source code for separating two independent
instruction streams.
-----
a5efdbf22c .patch
From a5efdbf22ce9c1acd4b14a353cec863c2c57557e Mon Sep 17 00:00:00 2001
From: Siarhei Siamashka <siarhei.siamashka@gmail.com >
Date: Wed, 13 Aug 2014 07:23:09 +0300
Subject: [PATCH] ARM: NEON optimized yuv->rgb565 conversion
The performance of the inner loop (conversion of 8 pixels):
* ARM Cortex-A7: ~55 cycles
* ARM Cortex-A8: ~28 cycles
* ARM Cortex-A9: ~32 cycles
* ARM Cortex-A15: ~20 cycles
* Qualcomm Krait: ~24 cycles
Based on the Linaro rgb565 patch from
https://sourceforge.net/p/libjpeg-turbo/patches/24/
but implements better instructions scheduling.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1385 632fc199-4ca6-4c93-a231-07263d6284db
2014-08-23 15:47:51 +00:00
DRC
19a5f802bc
Revert r1335 and r1336. It was a valiant effort, but on Windows, xmm8-xmm15 are non-volatile, and the overhead of pushing them onto the stack at the beginning of each function and popping them at the end was causing worse performance (in the neighborhood of 3-5%) than just using the work areas and limiting the register usage to xmm0-xmm7. Best to leave the SSE2 code alone. We can optimize the register usage for AVX2, once that port takes place.
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1382 632fc199-4ca6-4c93-a231-07263d6284db
2014-08-22 18:30:44 +00:00
DRC
c079546d32
.func/.endfunc are only necessary when generating STABS debug info, which basically went out of style with parachute pants and Rick Astley. At any rate, none of the platforms for which we're building the ARM code use it (DWARF is the common format these days), and the .func/.endfunc directives cause the clang integrated assembler to fail ( http://llvm.org/bugs/show_bug.cgi?id=20424 ).
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1375 632fc199-4ca6-4c93-a231-07263d6284db
2014-08-22 11:31:46 +00:00
DRC
6ca1aab2f7
Oops. The Windows version of collect_args/uncollect_args uses rsp, so we still need the rsp prologue/epilogue, despite the fact that we aren't using the stack as a work area. This fixes a segfault on Windows caused by r1335.
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1336 632fc199-4ca6-4c93-a231-07263d6284db
2014-08-09 22:58:18 +00:00
DRC
f493ae8ff3
Attempt to improve performance by refactoring the compression-side color conversion and DCT algorithms so that they take full advantage of the additional registers available with 64-bit SSE2. This produces a somewhat yawn-worthy speedup of 2-3%, but at least the code is a lot more readable now.
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1335 632fc199-4ca6-4c93-a231-07263d6284db
2014-08-09 14:30:28 +00:00
DRC
838e00fd54
Fix performance and other issues uncovered in testing with actual ARM64 hardware; formatting tweaks; remove NEON platform check (NEON is always available with ARMv8)
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1333 632fc199-4ca6-4c93-a231-07263d6284db
2014-07-23 14:14:14 +00:00
DRC
001189cb7a
Add proper support for Borland compilers (Borland needs section names to be prefixed with an underscore, and it needs OMF object files.)
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1328 632fc199-4ca6-4c93-a231-07263d6284db
2014-06-22 21:14:39 +00:00
DRC
0045d0072b
Allow for building the MIPS DSPr2 extensions if the host is mips-* as well as mipsel-*. The DSPr2 extensions are little endian, so we still have to check that the compiler defines __MIPSEL__ before enabling them. This paves the way for supporting big-endian MIPS, and in the near term, it allows the SIMD extensions to be built with Sourcery CodeBench.
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1316 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-19 19:13:22 +00:00
DRC
7bd7012c98
SIMD-accelerated int upsample routine for MIPS DSPr2
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1315 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-18 20:04:47 +00:00
DRC
3c9850814b
Fix MIPS build
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1314 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-18 19:36:05 +00:00
DRC
5ad13790c2
Remove MS-DOS code and information, and adjust copyright headers to reflect the removal of features in r1307 and r1308. libjpeg-turbo has never supported MS-DOS, nor is it even possible for us to do so.
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1312 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-18 18:33:44 +00:00
DRC
1aa56251a9
Get rid of the HAVE_PROTOTYPES configuration option, as well as the related JMETHOD and JPP macros. libjpeg-turbo has never supported compilers that don't handle prototypes. Doing so requires ansi2knr, which isn't even supported in the IJG code anymore.
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1308 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-16 10:43:44 +00:00
DRC
3854b5ef59
Remove all of the NEED_SHORT_EXTERNAL_NAMES stuff. There is scant information available as to which linkers ever had a 15-character global symbol name limit. AFAICT, it might have been a VMS and/or a.out BSD thing, but none of those platforms have ever been supported by libjpeg-turbo (nor are such systems supported by other open source libraries of this nature.)
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1307 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-15 20:30:16 +00:00
DRC
b284a473f7
Clean up code formatting in the SIMD interface functions
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1305 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-15 19:45:11 +00:00
DRC
51b63d0931
SIMD-accelerated NULL convert routine for MIPS DSPr2
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1304 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-15 18:26:01 +00:00
DRC
ccee85d6e2
Fix error in MIPS DSPr2 accelerated smooth downsample routine
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1302 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-15 17:10:39 +00:00
DRC
f069238be3
SIMD-accelerated h2v2 smooth downsampling routine for MIPS DSPr2
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1301 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-14 15:00:10 +00:00
DRC
b530bd1f33
SIMD-accelerated merged upsampling routines for MIPS DSPr2
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1297 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-13 18:40:14 +00:00
DRC
3b85b4e1bd
Modify Windows build system to take into account new assembly file names
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1283 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-10 10:10:03 +00:00
DRC
4528256c56
Using subdirectories unfortunately opened up a can of worms. In order to prevent object name conflicts, it is necessary to use the subdir-objects automake directive, but it simply doesn't work right on some of the versions of automake we still have to support. Another option would be to add a separate Makefile.am file to each subdirectory, but that requires maintaining a completely different set of build rules for each one. Fortunately, however, we're in the 21st century now, so we can use filenames longer than 8.3.
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1282 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-10 09:53:34 +00:00
DRC
eeace92a35
Re-organize the x86/x86-64 SIMD routines into separate folders by instruction set so we can name each routine similarly to its corresponding C file. This also makes it easier to add support for new instruction sets.
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1280 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-09 20:14:26 +00:00
DRC
b7a30f1186
Remove trailing spaces (+ one additional tab in TJUnitTest.java that was missed in the previous commit)
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1279 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-09 18:06:58 +00:00
DRC
16d7eeb944
Convert tabs to spaces in the libjpeg code and the SIMD code (TurboJPEG retains the use of tabs for historical reasons. They were annoying in the libjpeg code primarily because they were not consistently used and because they were used to format as well as indent the code. In the case of TurboJPEG, tabs are used just to indent the code, so even if the editor assumes a different tab width, the code will still be readable.)
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1278 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-09 18:00:32 +00:00
DRC
9a4231edfe
Fix an error in the MIPS DSPr2 fancy upsampling routine
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1277 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-09 14:45:55 +00:00
DRC
fbde10ef57
SIMD-accelerated slow integer IDCT routine for MIPS DSPr2
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1269 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-06 09:53:21 +00:00
DRC
f6903faab4
Remove trailing space
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1110 632fc199-4ca6-4c93-a231-07263d6284db
2014-02-06 19:13:24 +00:00
DRC
7544a924e3
Remove trailing space
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.3.x@1111 632fc199-4ca6-4c93-a231-07263d6284db
2014-02-06 19:15:03 +00:00
DRC
08b4ed7490
Create a separate stub file for 64-bit ARM, since it currently implements only the decompression-related functions.
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1109 632fc199-4ca6-4c93-a231-07263d6284db
2014-02-05 19:03:41 +00:00
DRC
786bb7f8c9
First pass at ARMv8 64-bit NEON SIMD support
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1108 632fc199-4ca6-4c93-a231-07263d6284db
2014-02-05 08:15:44 +00:00
DRC
2935005850
Formatting tweaks
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.3.x@1107 632fc199-4ca6-4c93-a231-07263d6284db
2014-02-05 07:40:00 +00:00
DRC
974aa1692a
Formatting tweaks
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1106 632fc199-4ca6-4c93-a231-07263d6284db
2014-02-05 07:39:38 +00:00
DRC
fabf58ed48
Make environment variable syntax consistent between ARM and x86 code, and add an option to disable SIMD on x86 (this option will be added to the x86-64 code as well, but it makes more sense to add it when we add AVX support.)
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1073 632fc199-4ca6-4c93-a231-07263d6284db
2013-10-31 07:40:24 +00:00
DRC
3ea68ac65e
SIMD-accelerated integer convsamp routine for MIPS DSPr2
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1059 632fc199-4ca6-4c93-a231-07263d6284db
2013-10-12 21:39:20 +00:00
DRC
7b0d18f239
SIMD-accelerated floating point quantize and convsamp routines for MIPS DSPr2
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1058 632fc199-4ca6-4c93-a231-07263d6284db
2013-10-09 18:39:44 +00:00
DRC
838af42eaf
SIMD-accelerated fast integer inverse DCT routine for MIPS DSPr2
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1056 632fc199-4ca6-4c93-a231-07263d6284db
2013-10-08 02:18:59 +00:00
DRC
d4d76b46c8
SIMD-accelerated fast integer forward DCT routine for MIPS DSPr2
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1055 632fc199-4ca6-4c93-a231-07263d6284db
2013-10-08 02:11:21 +00:00
DRC
d820c88ab2
SIMD-accelerated slow integer forward DCT and quantize routines for MIPS DSPr2
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1054 632fc199-4ca6-4c93-a231-07263d6284db
2013-09-30 18:13:27 +00:00
DRC
95257c488f
SIMD-accelerated 3/4 and 3/2 decompression scaling for MIPS DSPr2
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1047 632fc199-4ca6-4c93-a231-07263d6284db
2013-09-27 17:51:08 +00:00
DRC
e2da0467a6
SIMD-accelerated 1/2 and 1/4 decompression scaling for MIPS DSPr2
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1046 632fc199-4ca6-4c93-a231-07263d6284db
2013-09-27 17:43:23 +00:00
DRC
101197d2a2
SIMD-optimized RGB-to-grayscale conversion for MIPS DSPr2
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1045 632fc199-4ca6-4c93-a231-07263d6284db
2013-09-27 17:39:57 +00:00
DRC
159f02b3a9
Fix segfault in MIPS DSPr2 upsample routines that occurred when doing 'make test'
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1040 632fc199-4ca6-4c93-a231-07263d6284db
2013-09-25 17:33:37 +00:00
DRC
446d58f4de
Fix 'make dist'
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1024 632fc199-4ca6-4c93-a231-07263d6284db
2013-08-23 07:57:21 +00:00
DRC
c10837d8a8
SIMD support for performing upsampling using MIPS DSPr2 instructions
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@996 632fc199-4ca6-4c93-a231-07263d6284db
2013-07-27 21:50:02 +00:00
DRC
b481543bc7
SIMD support for performing downsampling using MIPS DSPr2 instructions
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@995 632fc199-4ca6-4c93-a231-07263d6284db
2013-07-27 21:48:18 +00:00
DRC
31087f88c7
SIMD support for performing fancy upsampling using MIPS DSPr2 instructions
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@994 632fc199-4ca6-4c93-a231-07263d6284db
2013-07-27 21:44:14 +00:00
DRC
5c4338be32
SIMD support for performing color conversion using MIPS DSPr2 instructions
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@993 632fc199-4ca6-4c93-a231-07263d6284db
2013-07-24 21:50:20 +00:00
DRC
b87a0b4687
Fix the x86 build with NASM 0.98. Since NASM 0.98 is the default version on OS X, we want to at least allow people to build 32-bit code with it, even though it can't properly build 64-bit code.
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.2.x@906 632fc199-4ca6-4c93-a231-07263d6284db
2013-01-13 12:15:58 +00:00
DRC
29d8f253f8
Fix build issues that occurred whenever the source directory contained the letters "col", "mer", or "gra".
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.2.x@860 632fc199-4ca6-4c93-a231-07263d6284db
2012-08-07 21:59:59 +00:00
DRC
112a0bb968
More recent versions of autoconf add -traditional-cpp to the CPP flags, which causes jsimdcfg.inc.h to not preprocess correctly unless we expand all of the instances of the #definev macro.
...
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.2.x@848 632fc199-4ca6-4c93-a231-07263d6284db
2012-06-28 23:25:34 +00:00