mozjpeg

Author	SHA1	Message	Date
DRC	33a4b3d400	Reformat code per Siarhei's original patch (to clearly indicate that the offset instructions are completely independent) and add Siarhei as an individual author (he no longer works for Nokia.) git-svn-id: svn://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1388 632fc199-4ca6-4c93-a231-07263d6284db	2014-08-25 15:26:09 +00:00
DRC	b052d67eb1	ARM NEON SIMD support for YCC-to-RGB565 conversion, and optimizations to the existing YCC-to-RGB conversion code: ----- aee36252be.patch: From aee36252be20054afce371a92406fc66ba6627b5 Mon Sep 17 00:00:00 2001; From: Siarhei Siamashka <siarhei.siamashka@gmail.com>; Date: Wed, 13 Aug 2014 03:50:22 +0300; Subject: [PATCH] ARM: Faster NEON yuv->rgb conversion for Krait and Cortex-A15. The older code was developed and tested only on ARM Cortex-A8 and ARM Cortex-A9. Tuning it for newer ARM processors can introduce some speed-up (up to 20%). The performance of the inner loop (conversion of 8 pixels) improves from ~27 cycles down to ~22 cycles on Qualcomm Krait 300, and from ~20 cycles down to ~18 cycles on ARM Cortex-A15. The performance remains exactly the same on ARM Cortex-A7 (~58 cycles), ARM Cortex-A8 (~25 cycles) and ARM Cortex-A9 (~30 cycles) processors. Also use larger indentation in the source code for separating two independent instruction streams. ----- a5efdbf22c.patch: From a5efdbf22ce9c1acd4b14a353cec863c2c57557e Mon Sep 17 00:00:00 2001; From: Siarhei Siamashka <siarhei.siamashka@gmail.com>; Date: Wed, 13 Aug 2014 07:23:09 +0300; Subject: [PATCH] ARM: NEON optimized yuv->rgb565 conversion. The performance of the inner loop (conversion of 8 pixels): ARM Cortex-A7: ~55 cycles; ARM Cortex-A8: ~28 cycles; ARM Cortex-A9: ~32 cycles; ARM Cortex-A15: ~20 cycles; Qualcomm Krait: ~24 cycles. Based on the Linaro rgb565 patch from https://sourceforge.net/p/libjpeg-turbo/patches/24/ but implements better instruction scheduling. git-svn-id: svn://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1385 632fc199-4ca6-4c93-a231-07263d6284db	2014-08-23 15:47:51 +00:00
DRC	83052612d0	.func/.endfunc are only necessary when generating STABS debug info, which basically went out of style with parachute pants and Rick Astley. At any rate, none of the platforms for which we're building the ARM code use it (DWARF is the common format these days), and the .func/.endfunc directives cause the clang integrated assembler to fail (http://llvm.org/bugs/show_bug.cgi?id=20424). git-svn-id: svn://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1375 632fc199-4ca6-4c93-a231-07263d6284db	2014-08-22 11:31:46 +00:00
DRC	abb6a513fa	Formatting tweaks	2014-02-05 07:39:38 +00:00
DRC	dbfa2648d8	Accelerated 4:2:2 upsampling routine for ARM (improves performance ~20-30% when decompressing 4:2:2 JPEGs using fancy upsampling)	2012-02-02 22:32:45 +00:00
DRC	e808882c95	Update Nokia contact info	2011-09-06 18:58:22 +00:00
DRC	a02a9af565	Improve performance of IFAST iDCT by changing the order of transpose and descale steps	2011-09-06 18:57:53 +00:00
DRC	061f96dc7d	Make ARM ISLOW iDCT faster on typical cases, and eliminate the possibility of 16-bit overflows when handling arbitrary coefficients.	2011-09-06 18:55:45 +00:00
DRC	00e258dedd	Improve the performance of YCbCr to RGB conversion on ARM	2011-08-24 23:27:44 +00:00
DRC	7672bd3ac5	NEON-accelerated slow integer inverse DCT	2011-08-22 13:48:01 +00:00
DRC	00a69f142a	NEON-accelerated quantization	2011-08-17 21:00:59 +00:00
DRC	dbb92f2eee	Improve performance of ARM NEON IFAST iDCT	2011-08-15 08:36:51 +00:00
DRC	22b4359e42	ARM NEON-accelerated RGB-to-YCbCr conversion	2011-08-12 19:27:20 +00:00
DRC	ce02d1d62a	Support for accelerated forward DCT using ARM NEON instructions	2011-08-10 23:31:13 +00:00
DRC	e3f7e75525	NEON-optimized 2x2 and 4x4 scaled iDCTs	2011-06-17 21:12:58 +00:00
DRC	d02c734a19	iOS ARM support	2011-06-14 22:16:50 +00:00
DRC	99799a6c29	ARM NEON support	2011-05-03 08:47:43 +00:00

17 Commits