The convention used by libjpeg:
type * variable;
is not very common anymore, because it looks too much like
multiplication. Some (particularly C++ programmers) prefer to tuck the
pointer symbol against the type:
type* variable;
to emphasize that a pointer to a type is effectively a new type.
However, this can also be confusing, since defining multiple variables
on the same line would not work properly:
type* variable1, variable2; /* Only variable1 is actually a
pointer. */
This commit reformats the entirety of the libjpeg-turbo code base so
that it uses the same code formatting convention for pointers that the
TurboJPEG API code uses:
type *variable1, *variable2;
This seems to be the most common convention among C programmers, and
it is the convention used by other codec libraries, such as libpng and
libtiff.
There aren't really any best practices to follow here. I tried as best
as I could to adopt a standard that would ease any future maintenance
burdens. The basic tenets of that standard are:
* Assembly instructions always start on Column 5, and operands always
start on Column 21, except:
- The instruction and operand can be indented (usually by 2 spaces)
to indicate a separate instruction stream.
- If the instruction is within an enclosing .if block in a macro,
it should always be indented relative to the .if block.
* Comments are placed with an eye toward readability. There are always
at least 2 spaces between the end of a line of code and the associated
in-line comment. Where it made sense, I tried to line up the comments
in blocks, and some were shifted right to avoid overlap with
neighboring instruction lines. Not an exact science.
* Assembler directives and macros use 2-space indenting rules. .if
blocks are indented relative to the macro, and code within the .if
blocks is indented relative to the .if directive.
* No extraneous spaces between operands. Lining up the operands
vertically did not really improve readability-- personally, I think it
made it worse, since my eye would tend to lose its place in the
uniform columns of characters. Also, code with a lot of vertical
alignment is really hard to maintain, since changing one line could
necessitate changing a bunch of other lines to avoid spoiling the
alignment.
* No extraneous spaces in #defines or other directives. In general, the
only extraneous spaces (other than indenting spaces) are between:
- Instructions and operands
- Operands and in-line comments
This standard should be more or less in keeping with other formatting
standards used within the project.
Full-color compression speedups relative to libjpeg-turbo 1.4.2:
800 MHz ARM Cortex-A9, iOS, 32-bit: 26-44% (avg. 32%)
Refer to #42 and #47 for discussion.
This commit also removes the unnecessary
if (simd_support & JSIMD_ARM_NEON)
statements from the jsimd* algorithm functions. Since the jsimd_can*()
functions check for the existence of NEON, the corresponding algorithm
functions will never be called if NEON isn't available. Removing those
if statements improved performance across the board by a couple of
percent.
Based on:
fc023c880c
These days, INT32 is a commonly-defined datatype in system headers. We
cannot eliminate the definition of that datatype from jmorecfg.h, since
the INT32 typedef has technically been part of the libjpeg API since
version 5 (1994.) However, using INT32 internally is risky, because the
inclusion of a particular header (Xmd.h, for instance) could change the
definition of INT32 from long to int on 64-bit platforms and thus change
the internal behavior of libjpeg-turbo in unexpected ways (for instance,
failing to correctly set __INT32_IS_ACTUALLY_LONG to match the INT32
typedef-- perhaps as a result of including the wrong version of
jpeglib.h-- could cause libjpeg-turbo to produce incorrect results.)
The library has always been built in environments in which INT32 is
effectively long (on Windows, long is always 32-bit, so effectively it's
the same as int), so it makes sense to turn INT32 into an explicitly
long datatype. This ensures that libjpeg-turbo will always behave
consistently, regardless of the headers included at compile time.
Addresses a concern expressed in #26.
-----
aee36252be.patch
From aee36252be20054afce371a92406fc66ba6627b5 Mon Sep 17 00:00:00 2001
From: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Date: Wed, 13 Aug 2014 03:50:22 +0300
Subject: [PATCH] ARM: Faster NEON yuv->rgb conversion for Krait and Cortex-A15
The older code was developed and tested only on ARM Cortex-A8 and ARM Cortex-A9.
Tuning it for newer ARM processors can introduce some speed-up (up to 20%).
The performance of the inner loop (conversion of 8 pixels) improves from
~27 cycles down to ~22 cycles on Qualcomm Krait 300, and from ~20 cycles
down to ~18 cycles on ARM Cortex-A15.
The performance remains exactly the same on ARM Cortex-A7 (~58 cycles),
ARM Cortex-A8 (~25 cycles) and ARM Cortex-A9 (~30 cycles) processors.
Also use larger indentation in the source code for separating two independent
instruction streams.
-----
a5efdbf22c.patch
From a5efdbf22ce9c1acd4b14a353cec863c2c57557e Mon Sep 17 00:00:00 2001
From: Siarhei Siamashka <siarhei.siamashka@gmail.com>
Date: Wed, 13 Aug 2014 07:23:09 +0300
Subject: [PATCH] ARM: NEON optimized yuv->rgb565 conversion
The performance of the inner loop (conversion of 8 pixels):
* ARM Cortex-A7: ~55 cycles
* ARM Cortex-A8: ~28 cycles
* ARM Cortex-A9: ~32 cycles
* ARM Cortex-A15: ~20 cycles
* Qualcomm Krait: ~24 cycles
Based on the Linaro rgb565 patch from
https://sourceforge.net/p/libjpeg-turbo/patches/24/
but implements better instructions scheduling.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1385 632fc199-4ca6-4c93-a231-07263d6284db