Commit Graph

11 Commits

Author SHA1 Message Date
DRC
b052d67eb1 ARM NEON SIMD support for YCC-to-RGB565 conversion, and optimizations to the existing YCC-to-RGB conversion code:
-----

aee36252be.patch

From aee36252be20054afce371a92406fc66ba6627b5 Mon Sep 17 00:00:00 2001
From: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Date: Wed, 13 Aug 2014 03:50:22 +0300
Subject: [PATCH] ARM: Faster NEON yuv->rgb conversion for Krait and Cortex-A15

The older code was developed and tested only on ARM Cortex-A8 and ARM Cortex-A9.
Tuning it for newer ARM processors can introduce some speed-up (up to 20%).

The performance of the inner loop (conversion of 8 pixels) improves from
~27 cycles down to ~22 cycles on Qualcomm Krait 300, and from ~20 cycles
down to ~18 cycles on ARM Cortex-A15.

The performance remains exactly the same on ARM Cortex-A7 (~58 cycles),
ARM Cortex-A8 (~25 cycles) and ARM Cortex-A9 (~30 cycles) processors.

Also use larger indentation in the source code for separating two independent
instruction streams.

-----

a5efdbf22c.patch

From a5efdbf22ce9c1acd4b14a353cec863c2c57557e Mon Sep 17 00:00:00 2001
From: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Date: Wed, 13 Aug 2014 07:23:09 +0300
Subject: [PATCH] ARM: NEON optimized yuv->rgb565 conversion

The performance of the inner loop (conversion of 8 pixels):
* ARM Cortex-A7:  ~55 cycles
* ARM Cortex-A8:  ~28 cycles
* ARM Cortex-A9:  ~32 cycles
* ARM Cortex-A15: ~20 cycles
* Qualcomm Krait: ~24 cycles

Based on the Linaro rgb565 patch from
    https://sourceforge.net/p/libjpeg-turbo/patches/24/
but implements better instructions scheduling.


git-svn-id: svn://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1385 632fc199-4ca6-4c93-a231-07263d6284db
2014-08-23 15:47:51 +00:00
DRC
6263c1fc1b SIMD-accelerated int upsample routine for MIPS DSPr2 2014-05-18 20:04:47 +00:00
DRC
f8301c92dd Get rid of the HAVE_PROTOTYPES configuration option, as well as the related JMETHOD and JPP macros. libjpeg-turbo has never supported compilers that don't handle prototypes. Doing so requires ansi2knr, which isn't even supported in the IJG code anymore. 2014-05-16 10:43:44 +00:00
DRC
2c0b793539 Remove all of the NEED_SHORT_EXTERNAL_NAMES stuff. There is scant information available as to which linkers ever had a 15-character global symbol name limit. AFAICT, it might have been a VMS and/or a.out BSD thing, but none of those platforms have ever been supported by libjpeg-turbo (nor are such systems supported by other open source libraries of this nature.) 2014-05-15 20:30:16 +00:00
DRC
99de998e2c SIMD-accelerated NULL convert routine for MIPS DSPr2 2014-05-15 18:26:01 +00:00
DRC
c4c3ac6305 SIMD-accelerated h2v2 smooth downsampling routine for MIPS DSPr2 2014-05-14 15:00:10 +00:00
DRC
0d25e86574 Remove trailing spaces (+ one additional tab in TJUnitTest.java that was missed in the previous commit) 2014-05-09 18:06:58 +00:00
DRC
25299d0d2f Updated (C) 2011-02-18 20:43:04 +00:00
DRC
439527e0b9 SIMD-accelerated RGB-to-Grayscale color conversion 2011-02-18 11:23:45 +00:00
DRC
d83e1f8900 Clarify that the C wrappers fall under the same license as the rest of the SIMD code 2011-02-02 05:38:34 +00:00
Pierre Ossman
c5cf41164a Framework for supporting SIMD acceleration
Designed to impose minimal changes on the "normal" code.
2009-03-09 13:15:56 +00:00