Commit Graph

27 Commits

Author SHA1 Message Date
DRC
7bfb22af12 Fix broken MIPS build
Regression introduced by 9055fb408d

Fixes #104
2016-09-26 18:01:54 -05:00
DRC
9055fb408d ARM/MIPS: Change the behavior of JSIMD_FORCE*
The JSIMD_FORCE* environment variables previously meant "force the use
of this instruction set if it is available but others are available as
well", but that did nothing on ARM platforms, since there is only ever
one instruction set available.  Since the ARM and MIPS CPU feature
detection code is less than bulletproof, and since there is only one
SIMD instruction set (currently) supported on those platforms, it makes
sense for the JSIMD_FORCE* environment variables on those platforms to
actually force the use of the SIMD instruction set, thus bypassing the
CPU feature detection code.

This addresses a concern raised in #88 whereby parsing /proc/cpuinfo
didn't work within a QEMU environment.  This at least provides a
workaround, allowing users to force-enable or force-disable SIMD
instructions for ARM and MIPS builds of libjpeg-turbo.
2016-07-07 13:38:48 -05:00
DRC
123f7258a8 Format copyright headers more consistently
The IJG convention is to format copyright notices as:

Copyright (C) YYYY, Owner.

We try to maintain this convention for any code that is part of the
libjpeg API library (with the exception of preserving the copyright
notices from Cendio's code verbatim, since those predate
libjpeg-turbo.)

Note that the phrase "All Rights Reserved" is no longer necessary, since
all Buenos Aires Convention signatories signed onto the Berne Convention
in 2000.  However, our convention is to retain this phrase for any files
that have a self-contained copyright header but to leave it off of any
files that refer to another file for conditions of distribution and use.
For instance, all of the non-SIMD files in the libjpeg API library refer
to README.ijg, and the copyright message in that file contains "All
Rights Reserved", so it is unnecessary to add it to the individual
files.

The TurboJPEG code retains my preferred formatting convention for
copyright notices, which is based on that of VirtualGL (where the
TurboJPEG API originated.)
2016-05-28 19:16:58 -05:00
DRC
bd49803f92 Use consistent/modern code formatting for pointers
The convention used by libjpeg:

    type * variable;

is not very common anymore, because it looks too much like
multiplication.  Some (particularly C++ programmers) prefer to tuck the
pointer symbol against the type:

    type* variable;

to emphasize that a pointer to a type is effectively a new type.
However, this can also be confusing, since defining multiple variables
on the same line would not work properly:

    type* variable1, variable2;  /* Only variable1 is actually a
                                    pointer. */

This commit reformats the entirety of the libjpeg-turbo code base so
that it uses the same code formatting convention for pointers that the
TurboJPEG API code uses:

    type *variable1, *variable2;

This seems to be the most common convention among C programmers, and
it is the convention used by other codec libraries, such as libpng and
libtiff.
2016-02-19 09:10:07 -06:00
DRC
f3a8684cd1 SSE2 SIMD implementation of Huffman encoding
Full-color compression speedups relative to libjpeg-turbo 1.4.2:

2.8 GHz Intel Xeon W3530, Linux, 64-bit:  2.2-18% (avg. 9.5%)
2.8 GHz Intel Xeon W3530, Linux, 32-bit:  10-25% (avg. 17%)

2.3 GHz AMD A10-4600M APU, Linux, 64-bit:  4.9-17% (avg. 11%)
2.3 GHz AMD A10-4600M APU, Linux, 32-bit:  8.8-19% (avg. 15%)

3.0 GHz Intel Core i7, OS X, 64-bit:  3.5-16% (avg. 10%)
3.0 GHz Intel Core i7, OS X, 32-bit:  4.8-14% (avg. 11%)

2.6 GHz AMD Athlon 64 X2 5050e:
Performance-neutral (give or take a few percent)

Full-color compression speedups relative to IPP:

2.8 GHz Intel Xeon W3530, Linux, 64-bit:  4.8-34% (avg. 19%)
2.8 GHz Intel Xeon W3530, Linux, 32-bit:  -19%-7.0% (avg. -7.0%)

Refer to #42 for discussion.  Numerous other approaches were attempted,
but this one proved to be the most performant across all platforms.

This commit also fixes #3 (works around, really-- the clang-compiled version
of jchuff.c still performs 20% worse than its GCC-compiled counterpart, but
that code is now bypassed by the new SSE2 Huffman algorithm.)

Based on:
2cb4d41330
36c94e050d
2016-01-12 03:03:49 -06:00
DRC
8b5a0093ee Oops. The MIPS SIMD implementations of h2v1 and h2v2 upsampling were not checking for DSPr2 support, so running 'djpeg -nosmooth' on a non-DSPr2-enabled platform caused an "illegal instruction" error.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.4.x@1523 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-21 17:42:28 +00:00
DRC
d729f4da9c ARM NEON SIMD support for YCC-to-RGB565 conversion, and optimizations to the existing YCC-to-RGB conversion code:
-----

aee36252be.patch

From aee36252be20054afce371a92406fc66ba6627b5 Mon Sep 17 00:00:00 2001
From: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Date: Wed, 13 Aug 2014 03:50:22 +0300
Subject: [PATCH] ARM: Faster NEON yuv->rgb conversion for Krait and Cortex-A15

The older code was developed and tested only on ARM Cortex-A8 and ARM Cortex-A9.
Tuning it for newer ARM processors can introduce some speed-up (up to 20%).

The performance of the inner loop (conversion of 8 pixels) improves from
~27 cycles down to ~22 cycles on Qualcomm Krait 300, and from ~20 cycles
down to ~18 cycles on ARM Cortex-A15.

The performance remains exactly the same on ARM Cortex-A7 (~58 cycles),
ARM Cortex-A8 (~25 cycles) and ARM Cortex-A9 (~30 cycles) processors.

Also use larger indentation in the source code for separating two independent
instruction streams.

-----

a5efdbf22c.patch

From a5efdbf22ce9c1acd4b14a353cec863c2c57557e Mon Sep 17 00:00:00 2001
From: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Date: Wed, 13 Aug 2014 07:23:09 +0300
Subject: [PATCH] ARM: NEON optimized yuv->rgb565 conversion

The performance of the inner loop (conversion of 8 pixels):
* ARM Cortex-A7:  ~55 cycles
* ARM Cortex-A8:  ~28 cycles
* ARM Cortex-A9:  ~32 cycles
* ARM Cortex-A15: ~20 cycles
* Qualcomm Krait: ~24 cycles

Based on the Linaro rgb565 patch from
    https://sourceforge.net/p/libjpeg-turbo/patches/24/
but implements better instructions scheduling.


git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1385 632fc199-4ca6-4c93-a231-07263d6284db
2014-08-23 15:47:51 +00:00
DRC
495e43426e Allow for building the MIPS DSPr2 extensions if the host is mips-* as well as mipsel-*. The DSPr2 extensions are little endian, so we still have to check that the compiler defines __MIPSEL__ before enabling them. This paves the way for supporting big-endian MIPS, and in the near term, it allows the SIMD extensions to be built with Sourcery CodeBench.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1316 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-19 19:13:22 +00:00
DRC
5ef463056a SIMD-accelerated int upsample routine for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1315 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-18 20:04:47 +00:00
DRC
c728cfd8f2 Fix MIPS build
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1314 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-18 19:36:05 +00:00
DRC
1419852c42 Clean up code formatting in the SIMD interface functions
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1305 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-15 19:45:11 +00:00
DRC
1b3fd7eead SIMD-accelerated NULL convert routine for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1304 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-15 18:26:01 +00:00
DRC
6a61c1e6dc SIMD-accelerated h2v2 smooth downsampling routine for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1301 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-14 15:00:10 +00:00
DRC
b844eaa360 SIMD-accelerated merged upsampling routines for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1297 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-13 18:40:14 +00:00
DRC
343478622d SIMD-accelerated slow integer IDCT routine for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1269 632fc199-4ca6-4c93-a231-07263d6284db
2014-05-06 09:53:21 +00:00
DRC
fff6c23a65 SIMD-accelerated integer convsamp routine for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1059 632fc199-4ca6-4c93-a231-07263d6284db
2013-10-12 21:39:20 +00:00
DRC
3d72728169 SIMD-accelerated floating point quantize and convsamp routines for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1058 632fc199-4ca6-4c93-a231-07263d6284db
2013-10-09 18:39:44 +00:00
DRC
d3131c1b3d SIMD-accelerated fast integer inverse DCT routine for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1056 632fc199-4ca6-4c93-a231-07263d6284db
2013-10-08 02:18:59 +00:00
DRC
71e06a7d81 SIMD-accelerated fast integer forward DCT routine for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1055 632fc199-4ca6-4c93-a231-07263d6284db
2013-10-08 02:11:21 +00:00
DRC
a6b7fbd352 SIMD-accelerated slow integer forward DCT and quantize routines for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1054 632fc199-4ca6-4c93-a231-07263d6284db
2013-09-30 18:13:27 +00:00
DRC
e500591710 SIMD-accelerated 3/4 and 3/2 decompression scaling for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1047 632fc199-4ca6-4c93-a231-07263d6284db
2013-09-27 17:51:08 +00:00
DRC
2ccf4d1a70 SIMD-accelerated 1/2 and 1/4 decompression scaling for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1046 632fc199-4ca6-4c93-a231-07263d6284db
2013-09-27 17:43:23 +00:00
DRC
49eaa7572d SIMD-optimized RGB-to-grayscale conversion for MIPS DSPr2
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1045 632fc199-4ca6-4c93-a231-07263d6284db
2013-09-27 17:39:57 +00:00
DRC
16962c1132 SIMD support for performing upsampling using MIPS DSPr2 instructions
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@996 632fc199-4ca6-4c93-a231-07263d6284db
2013-07-27 21:50:02 +00:00
DRC
6f2d3c2c97 SIMD support for performing downsampling using MIPS DSPr2 instructions
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@995 632fc199-4ca6-4c93-a231-07263d6284db
2013-07-27 21:48:18 +00:00
DRC
86fbf35fb6 SIMD support for performing fancy upsampling using MIPS DSPr2 instructions
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@994 632fc199-4ca6-4c93-a231-07263d6284db
2013-07-27 21:44:14 +00:00
DRC
0be9fa5735 SIMD support for performing color conversion using MIPS DSPr2 instructions
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@993 632fc199-4ca6-4c93-a231-07263d6284db
2013-07-24 21:50:20 +00:00