Commit Graph

25 Commits

Author SHA1 Message Date
DRC
123f7258a8 Format copyright headers more consistently
The IJG convention is to format copyright notices as:

Copyright (C) YYYY, Owner.

We try to maintain this convention for any code that is part of the
libjpeg API library (with the exception of preserving the copyright
notices from Cendio's code verbatim, since those predate
libjpeg-turbo.)

Note that the phrase "All Rights Reserved" is no longer necessary, since
all Buenos Aires Convention signatories signed onto the Berne Convention
in 2000.  However, our convention is to retain this phrase for any files
that have a self-contained copyright header but to leave it off of any
files that refer to another file for conditions of distribution and use.
For instance, all of the non-SIMD files in the libjpeg API library refer
to README.ijg, and the copyright message in that file contains "All
Rights Reserved", so it is unnecessary to add it to the individual
files.

The TurboJPEG code retains my preferred formatting convention for
copyright notices, which is based on that of VirtualGL (where the
TurboJPEG API originated.)
2016-05-28 19:16:58 -05:00
mattsarett
2e480fa2a3 ARMv7 SIMD: Fix clang compatibility (Part 2)
GCC does support UAL syntax (strbeq) if the ".syntax unified" directive
is supplied.  This directive is supported by all versions of GCC and
clang going back to 2003, so it should not create any backward
compatibility issues.

Based on 1264349e2f

Closes #76
2016-05-03 13:08:58 -05:00
mattsarett
5e576386b5 ARMv7 SIMD: Fix clang compatibility
By design, clang only supports Unified Assembler Language (and not
pre-UAL syntax):
https://llvm.org/bugs/show_bug.cgi?id=23507
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0473c/BABJIHGJ.html

Thus, clang only supports the strbeq instruction and not streqb, but
unfortunately some versions of GCC only support streqb.  Go, go
Gadget #ifdef...

Based on https://github.com/mattsarett/libjpeg-turbo/commit/a82e63aac63f8fa3
95fa4caad4de6859623ee2e2

Closes #75
2016-05-02 12:20:10 -05:00
DRC
bd49803f92 Use consistent/modern code formatting for pointers
The convention used by libjpeg:

    type * variable;

is not very common anymore, because it looks too much like
multiplication.  Some (particularly C++ programmers) prefer to tuck the
pointer symbol against the type:

    type* variable;

to emphasize that a pointer to a type is effectively a new type.
However, this can also be confusing, since defining multiple variables
on the same line would not work properly:

    type* variable1, variable2;  /* Only variable1 is actually a
                                    pointer. */

This commit reformats the entirety of the libjpeg-turbo code base so
that it uses the same code formatting convention for pointers that the
TurboJPEG API code uses:

    type *variable1, *variable2;

This seems to be the most common convention among C programmers, and
it is the convention used by other codec libraries, such as libpng and
libtiff.
2016-02-19 09:10:07 -06:00
DRC
15aaa7f7e2 ARM SIMD: Comment tweaks 2016-02-07 17:39:33 -06:00
DRC
cf888486d1 Use consistent formatting in ARM NEON SIMD code
There aren't really any best practices to follow here.  I tried as best
as I could to adopt a standard that would ease any future maintenance
burdens.  The basic tenets of that standard are:

* Assembly instructions always start on Column 5, and operands always
  start on Column 21, except:
  - The instruction and operand can be indented (usually by 2 spaces)
    to indicate a separate instruction stream.
  - If the instruction is within an enclosing .if block in a macro,
    it should always be indented relative to the .if block.
* Comments are placed with an eye toward readability.  There are always
  at least 2 spaces between the end of a line of code and the associated
  in-line comment.  Where it made sense, I tried to line up the comments
  in blocks, and some were shifted right to avoid overlap with
  neighboring instruction lines.  Not an exact science.
* Assembler directives and macros use 2-space indenting rules.  .if
  blocks are indented relative to the macro, and code within the .if
  blocks is indented relative to the .if directive.
* No extraneous spaces between operands.  Lining up the operands
  vertically did not really improve readability-- personally, I think it
  made it worse, since my eye would tend to lose its place in the
  uniform columns of characters.  Also, code with a lot of vertical
  alignment is really hard to maintain, since changing one line could
  necessitate changing a bunch of other lines to avoid spoiling the
  alignment.
* No extraneous spaces in #defines or other directives.  In general, the
  only extraneous spaces (other than indenting spaces) are between:
  - Instructions and operands
  - Operands and in-line comments
This standard should be more or less in keeping with other formatting
standards used within the project.
2016-02-02 23:39:03 -06:00
DRC
499c470b63 ARM32 NEON SIMD implementation of Huffman encoding
Full-color compression speedups relative to libjpeg-turbo 1.4.2:

800 MHz ARM Cortex-A9, iOS, 32-bit:  26-44% (avg. 32%)

Refer to #42 and #47 for discussion.

This commit also removes the unnecessary

    if (simd_support & JSIMD_ARM_NEON)

statements from the jsimd* algorithm functions.  Since the jsimd_can*()
functions check for the existence of NEON, the corresponding algorithm
functions will never be called if NEON isn't available.  Removing those
if statements improved performance across the board by a couple of
percent.

Based on:
fc023c880c
2016-01-13 23:38:35 -06:00
DRC
1e32fe3113 Replace INT32 with a new internal datatype (JLONG)
These days, INT32 is a commonly-defined datatype in system headers.  We
cannot eliminate the definition of that datatype from jmorecfg.h, since
the INT32 typedef has technically been part of the libjpeg API since
version 5 (1994.)  However, using INT32 internally is risky, because the
inclusion of a particular header (Xmd.h, for instance) could change the
definition of INT32 from long to int on 64-bit platforms and thus change
the internal behavior of libjpeg-turbo in unexpected ways (for instance,
failing to correctly set __INT32_IS_ACTUALLY_LONG to match the INT32
typedef-- perhaps as a result of including the wrong version of
jpeglib.h-- could cause libjpeg-turbo to produce incorrect results.)

The library has always been built in environments in which INT32 is
effectively long (on Windows, long is always 32-bit, so effectively it's
the same as int), so it makes sense to turn INT32 into an explicitly
long datatype.  This ensures that libjpeg-turbo will always behave
consistently, regardless of the headers included at compile time.

Addresses a concern expressed in #26.
2015-10-14 20:34:32 -05:00
DRC
a6efae1488 Reformat code per Siarhei's original patch (to clearly indicate that the offset instructions are completely independent) and add Siarhei as an individual author (he no longer works for Nokia.)
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1388 632fc199-4ca6-4c93-a231-07263d6284db
2014-08-25 15:26:09 +00:00
DRC
d729f4da9c ARM NEON SIMD support for YCC-to-RGB565 conversion, and optimizations to the existing YCC-to-RGB conversion code:
-----

aee36252be.patch

From aee36252be20054afce371a92406fc66ba6627b5 Mon Sep 17 00:00:00 2001
From: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Date: Wed, 13 Aug 2014 03:50:22 +0300
Subject: [PATCH] ARM: Faster NEON yuv->rgb conversion for Krait and Cortex-A15

The older code was developed and tested only on ARM Cortex-A8 and ARM Cortex-A9.
Tuning it for newer ARM processors can introduce some speed-up (up to 20%).

The performance of the inner loop (conversion of 8 pixels) improves from
~27 cycles down to ~22 cycles on Qualcomm Krait 300, and from ~20 cycles
down to ~18 cycles on ARM Cortex-A15.

The performance remains exactly the same on ARM Cortex-A7 (~58 cycles),
ARM Cortex-A8 (~25 cycles) and ARM Cortex-A9 (~30 cycles) processors.

Also use larger indentation in the source code for separating two independent
instruction streams.

-----

a5efdbf22c.patch

From a5efdbf22ce9c1acd4b14a353cec863c2c57557e Mon Sep 17 00:00:00 2001
From: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Date: Wed, 13 Aug 2014 07:23:09 +0300
Subject: [PATCH] ARM: NEON optimized yuv->rgb565 conversion

The performance of the inner loop (conversion of 8 pixels):
* ARM Cortex-A7:  ~55 cycles
* ARM Cortex-A8:  ~28 cycles
* ARM Cortex-A9:  ~32 cycles
* ARM Cortex-A15: ~20 cycles
* Qualcomm Krait: ~24 cycles

Based on the Linaro rgb565 patch from
    https://sourceforge.net/p/libjpeg-turbo/patches/24/
but implements better instructions scheduling.


git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1385 632fc199-4ca6-4c93-a231-07263d6284db
2014-08-23 15:47:51 +00:00
DRC
2e2ce5a1da .func/.endfunc are only necessary when generating STABS debug info, which basically went out of style with parachute pants and Rick Astley. At any rate, none of the platforms for which we're building the ARM code use it (DWARF is the common format these days), and the .func/.endfunc directives cause the clang integrated assembler to fail (http://llvm.org/bugs/show_bug.cgi?id=20424).
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1375 632fc199-4ca6-4c93-a231-07263d6284db
2014-08-22 11:31:46 +00:00
DRC
3e00f03aea Formatting tweaks
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.3.x@1107 632fc199-4ca6-4c93-a231-07263d6284db
2014-02-05 07:40:00 +00:00
DRC
316617faf4 Accelerated 4:2:2 upsampling routine for ARM (improves performance ~20-30% when decompressing 4:2:2 JPEGs using fancy upsampling)
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.2.x@837 632fc199-4ca6-4c93-a231-07263d6284db
2012-06-13 05:17:03 +00:00
DRC
b071f01cbb Update Nokia contact info
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@694 632fc199-4ca6-4c93-a231-07263d6284db
2011-09-06 18:58:22 +00:00
DRC
ad6955d46a Improve performance of IFAST iDCT by changing the order of transpose and descale steps
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@693 632fc199-4ca6-4c93-a231-07263d6284db
2011-09-06 18:57:53 +00:00
DRC
5129e3960f Make ARM ISLOW iDCT faster on typical cases, and eliminate the possibility of 16-bit overflows when handling arbitrary coefficients.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@692 632fc199-4ca6-4c93-a231-07263d6284db
2011-09-06 18:55:45 +00:00
DRC
98a44fe07b Improve the performance of YCbCr to RGB conversion on ARM
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@691 632fc199-4ca6-4c93-a231-07263d6284db
2011-08-24 23:27:44 +00:00
DRC
ce4e3e8690 NEON-accelerated slow integer inverse DCT
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@690 632fc199-4ca6-4c93-a231-07263d6284db
2011-08-22 13:48:01 +00:00
DRC
82bd52196d NEON-accelerated quantization
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@689 632fc199-4ca6-4c93-a231-07263d6284db
2011-08-17 21:00:59 +00:00
DRC
4b024a62b2 Improve performance of ARM NEON IFAST iDCT
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@686 632fc199-4ca6-4c93-a231-07263d6284db
2011-08-15 08:36:51 +00:00
DRC
7a9376c114 ARM NEON-accelerated RGB-to-YCbCr conversion
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@682 632fc199-4ca6-4c93-a231-07263d6284db
2011-08-12 19:27:20 +00:00
DRC
b740054f3c Support for accelerated forward DCT using ARM NEON instructions
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@678 632fc199-4ca6-4c93-a231-07263d6284db
2011-08-10 23:31:13 +00:00
DRC
8c60d22ff5 NEON-optimized 2x2 and 4x4 scaled iDCTs
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@662 632fc199-4ca6-4c93-a231-07263d6284db
2011-06-17 21:12:58 +00:00
DRC
4346f91fcb iOS ARM support
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@659 632fc199-4ca6-4c93-a231-07263d6284db
2011-06-14 22:16:50 +00:00
DRC
321e068601 ARM NEON support
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@607 632fc199-4ca6-4c93-a231-07263d6284db
2011-05-03 08:47:43 +00:00