Commit Graph

394 Commits

Author SHA1 Message Date
DRC
eea6424596 Typo
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.4.x@1570 632fc199-4ca6-4c93-a231-07263d6284db
2015-06-20 15:51:34 +00:00
DRC
f15ef33768 Fix a segfault that occurred in the MIPS DSPr2 fancy upsampling routine when downsampled_width==3. Because the DSPr2 code unrolls the loop for the middle columns (refer to jdsample.c), it has the effect of performing two column iterations, and that only works properly if the number of columns (minus the first and last) is >= 2. For the specific case of downsampled_width==3, this patch skips to the second iteration of the unrolled column loop.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.4.x@1562 632fc199-4ca6-4c93-a231-07263d6284db
2015-06-08 17:41:34 +00:00
DRC
3b7015d5d8 Enable silent build rules for the NASM objects, if the source is configured with automake 1.11 or later. NOTE: the build still spits out "error: ignoring unknown tag NASM" for each object, but unfortunately, if we remove "--tag NASM" from the command line, the build breaks under older versions of automake (it aborts with "unable to infer tagged configuration.")
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.4.x@1534 632fc199-4ca6-4c93-a231-07263d6284db
2015-02-23 19:03:29 +00:00
DRC
771ab19437 Extend the AltiVec VMX SIMD routines to support little endian PowerPC platforms.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1529 632fc199-4ca6-4c93-a231-07263d6284db
2015-02-20 19:57:21 +00:00
DRC
3f7608348d Come on, Cohaagen, you got what you want. Give these people air!
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1527 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-28 00:25:43 +00:00
DRC
8b5a0093ee Oops. The MIPS SIMD implementations of h2v1 and h2v2 upsampling were not checking for DSPr2 support, so running 'djpeg -nosmooth' on a non-DSPr2-enabled platform caused an "illegal instruction" error.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.4.x@1523 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-21 17:42:28 +00:00
DRC
246b01bb0a Revert r1506 (we actually are generating columns with the IDCT, so the naming makes sense in retrospect); further de-confusification in the forward DCT
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1507 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-16 03:13:16 +00:00
DRC
c4e3c361d7 De-confusify the variable names a bit -- "out" represents the output of the IDCT kernel, so use "final" to represent the packed data that will be stored to memory.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1506 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-15 08:51:31 +00:00
DRC
c641cddb80 AltiVec SIMD implementation of H2V1 and H2V2 plain upsampling (used only when decompressing YCCK images with fast upsampling enabled.)
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1504 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-14 15:41:11 +00:00
DRC
86af36aebc AltiVec SIMD implementation of H2V1 and H2V2 merged upsampling
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1503 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-14 13:27:32 +00:00
DRC
11c4010c3c Fix an overread detected by valgrind
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1502 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-14 13:07:06 +00:00
DRC
2517ef72ed Fix bugs in the AltiVec fancy upsampling routines uncovered during additional testing with small image sizes. Since the input width is half the output width, the upsampler should only write a second 16-byte chunk if there are more than 8 input columns left. Additionally, if the width is < 16, then we need to insert a dummy sample (the SSE2 code does this as well, but I neglected to port that portion of the code for some reason.)
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1501 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-14 10:45:31 +00:00
DRC
cbcb53617b Fix a bug in the AltiVec downsampling routines uncovered during additional testing with small image sizes. Since the output width is half the input width, the downsampler should only read a second 16-byte chunk if there are more than 8 output columns left.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1499 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-14 08:31:54 +00:00
DRC
ada430bb3d Make the formatting and naming of variables and constants more consistent
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1498 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-13 11:49:15 +00:00
DRC
406bb01ca2 Make the formatting and naming of variables and constants more consistent
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1497 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-13 11:47:20 +00:00
DRC
a6a24c270e Make the formatting and naming of variables and constants more consistent
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1496 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-13 10:00:12 +00:00
DRC
52a4ec6c8a AltiVec SIMD implementation of H2V1 and H2V2 fancy upsampling
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1495 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-13 09:02:29 +00:00
DRC
51eba06011 Minor code readability tweak
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1492 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-11 07:38:35 +00:00
DRC
d71a6e0c25 Use intrinsics for loading aligned data in the IDCT functions. This has no effect on performance, but it makes it more obvious what that code is doing.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1491 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-11 06:34:47 +00:00
DRC
ac4daa7716 AltiVec SIMD implementation of YCC-to-RGB color conversion
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1489 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-10 22:56:26 +00:00
DRC
af69295d3f Fix minor issue in code comments
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1488 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-10 12:25:42 +00:00
DRC
6aed11293d Fix minor issue in code comments
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1487 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-10 12:10:08 +00:00
DRC
a5005751a3 Simplify the code somewhat. It actually wasn't necessary to have a "fast path" and a "medium path"-- they perform the same.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1486 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-10 12:09:11 +00:00
DRC
25347882ed Overhaul the AltiVec vector loading code in the compression-side colorspace conversion routines. The existing code was sometimes overreading the source buffer (at least according to valgrind), and it was necessary to increase the complexity of the code in order to prevent this without significantly compromising performance.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1485 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-10 11:32:36 +00:00
DRC
8de75d0f57 Fix minor issue in code comments
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1484 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-08 10:42:54 +00:00
DRC
22048207ea AltiVec SIMD implementation of 2x1 and 2x2 downsampling
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1483 632fc199-4ca6-4c93-a231-07263d6284db
2015-01-08 06:18:33 +00:00
DRC
577ecd93f1 AltiVec SIMD implementation of sample conversion and integer quantization
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1474 632fc199-4ca6-4c93-a231-07263d6284db
2014-12-23 04:14:54 +00:00
DRC
ff30c63934 Document the fact that the AltiVec implementation uses the same modified algorithms as the SSE2 implementation
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1473 632fc199-4ca6-4c93-a231-07263d6284db
2014-12-23 02:42:59 +00:00
DRC
45453085e7 Use intrinsics for loading/storing data in the DCT/IDCT functions. This has no effect on the performance of the aligned loads/stores, but it makes it more obvious what that code is doing. Using intrinsics for the unaligned stores in the inverse DCT functions increases overall decompression performance by 1-2%.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1472 632fc199-4ca6-4c93-a231-07263d6284db
2014-12-22 16:04:17 +00:00
DRC
b1fec4ff98 AltiVec SIMD implementation of RGB-to-Grayscale color conversion
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1471 632fc199-4ca6-4c93-a231-07263d6284db
2014-12-22 14:10:33 +00:00
DRC
5976e42550 Remove unneeded code; Make sure jccolor-altivec.o will be rebuilt if jccolext-altivec.c changes.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1470 632fc199-4ca6-4c93-a231-07263d6284db
2014-12-22 13:57:30 +00:00
DRC
62bae204f4 AltiVec SIMD implementation of RGB-to-YCC color conversion
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1469 632fc199-4ca6-4c93-a231-07263d6284db
2014-12-22 13:42:26 +00:00
DRC
13af139624 Make comments more consistent
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1466 632fc199-4ca6-4c93-a231-07263d6284db
2014-12-22 01:38:01 +00:00
DRC
bf8a5feb79 Cosmetic tweaks to the PowerPC SIMD stubs
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1464 632fc199-4ca6-4c93-a231-07263d6284db
2014-12-22 01:10:11 +00:00
DRC
535674b1b8 Split AltiVec algorithms into separate files for ease of maintenance; Rename constants using lowercase so they are not confused with macros
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1463 632fc199-4ca6-4c93-a231-07263d6284db
2014-12-22 01:00:42 +00:00
DRC
c9da78525b Optimizations to the AltiVec DCT algorithms (pre-compute constants and combine multiply/add operations)
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1462 632fc199-4ca6-4c93-a231-07263d6284db
2014-12-20 03:32:59 +00:00
DRC
0691162a8b AltiVec SIMD implementation of slow integer inverse DCT
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1461 632fc199-4ca6-4c93-a231-07263d6284db
2014-12-20 01:17:39 +00:00
DRC
9cb418d221 Use macros to allocate constants statically, rather than reading them from a table using vec_splat*(). This improves code readability and probably improves performance a bit as well.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1460 632fc199-4ca6-4c93-a231-07263d6284db
2014-12-20 01:16:26 +00:00
DRC
935d1d6398 Swap the order of the IFAST and ISLOW FDCT functions so that it matches the order of the prototypes in jsimd.h and the stubs in jsimd_powerpc.c.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1459 632fc199-4ca6-4c93-a231-07263d6284db
2014-12-20 01:14:38 +00:00
DRC
86ae5913b1 Include ARMv8 binaries when generating a combined OS X/iOS package using 'make iosdmg'
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1458 632fc199-4ca6-4c93-a231-07263d6284db
2014-12-19 18:28:00 +00:00
DRC
62999d7708 Modify the ARM64 assembly file so that it uses only syntax that the clang assembler in Xcode 5.x can understand. These changes should all be cosmetic in nature-- they do not change the meaning or readability of the code nor the ability to build it for Linux. Actually, the code is now more in compliance with the ARM64 programming manual. In addition to these changes, there were a couple of instructions that clang simply doesn't support, so gas-preprocessor.pl was modified so that it now converts those into equivalent instructions that clang can handle.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/branches/1.4.x@1450 632fc199-4ca6-4c93-a231-07263d6284db
2014-12-19 15:36:39 +00:00
DRC
6cb7f40a18 AltiVec SIMD implementation of fast integer inverse DCT
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1445 632fc199-4ca6-4c93-a231-07263d6284db
2014-12-18 10:12:29 +00:00
DRC
040435afb9 Further cleanup of the AltiVec forward DCT code:
-- Use macros to represent the fast FDCT constants, to facilitate comparing the AltiVec implementation of the algorithm with the SSE2 implementation.
-- Rename slow FDCT constants for consistency.
-- Use vec_sra() in all cases in the slow FDCT code. The SSE2 implementation uses psraw, which is an arithmetic shift, so we need to do likewise with AltiVec. Using vec_sr() hasn't caused any problems yet, but it is conceivable that it might cause different behavior in certain corner cases.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1444 632fc199-4ca6-4c93-a231-07263d6284db
2014-12-18 09:49:39 +00:00
DRC
fb0c394037 AltiVec SIMD implementation of slow integer forward DCT; Clean up fast integer forward DCT code so that it is easier to see how it derives from the SSE2 code and to make it play more nicely with the slow FDCT code.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1443 632fc199-4ca6-4c93-a231-07263d6284db
2014-12-17 08:04:39 +00:00
DRC
e1ac40104c Fix cosmetic issues in AltiVec comments
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1442 632fc199-4ca6-4c93-a231-07263d6284db
2014-12-17 08:00:29 +00:00
DRC
7affbfc241 The AltiVec code actually works on 32-bit PowerPC platforms as well, so change the "powerpc64" token to "powerpc". Also clean up the shift code, which wasn't building properly on OS X.
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1406 632fc199-4ca6-4c93-a231-07263d6284db
2014-09-05 07:23:12 +00:00
DRC
cd2d8e1cc7 AltiVec SIMD implementation of fast forward DCT
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1405 632fc199-4ca6-4c93-a231-07263d6284db
2014-09-05 06:33:42 +00:00
DRC
0a9a252654 Rename the ARM64 assembly file to match the C file
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1390 632fc199-4ca6-4c93-a231-07263d6284db
2014-08-29 01:53:17 +00:00
DRC
de26249a99 Fix several mathematical issues discovered in the ARM64 NEON code while running the extended regression tests introduced in r1267. Specific comments can be found in the original patches:
https://sourceforge.net/p/libjpeg-turbo/patches/64/
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1389 632fc199-4ca6-4c93-a231-07263d6284db
2014-08-29 01:49:59 +00:00
DRC
a6efae1488 Reformat code per Siarhei's original patch (to clearly indicate that the offset instructions are completely independent) and add Siarhei as an individual author (he no longer works for Nokia.)
git-svn-id: svn+ssh://svn.code.sf.net/p/libjpeg-turbo/code/trunk@1388 632fc199-4ca6-4c93-a231-07263d6284db
2014-08-25 15:26:09 +00:00
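
Note on commit 040435afb9: its message prefers vec_sra() over vec_sr() because the SSE2 psraw instruction is an arithmetic shift. The following is a minimal scalar sketch in C, not code from libjpeg-turbo, of how the two kinds of right shift differ on a single signed 16-bit lane. The function names are hypothetical, and the helpers only model the per-element behavior; they do not call any AltiVec or SSE2 intrinsics.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of a logical right shift on one 16-bit lane
 * (the per-element behavior of vec_sr()): vacated high bits become 0.
 * The result is returned as an unsigned bit pattern. */
uint16_t shift_right_logical16(int16_t value, unsigned count)
{
  assert(count < 16);
  return (uint16_t)((uint16_t)value >> count);
}

/* Hypothetical scalar model of an arithmetic right shift on one 16-bit lane
 * (the per-element behavior of vec_sra() and psraw): vacated high bits are
 * copies of the sign bit.  Written without right-shifting a negative value,
 * since that is implementation-defined in C. */
int16_t shift_right_arithmetic16(int16_t value, unsigned count)
{
  assert(count < 16);
  if (value >= 0)
    return (int16_t)(value >> count);
  /* ~value is non-negative here; shifting it and complementing the result
   * fills the vacated high bits with 1s. */
  return (int16_t)~(~value >> count);
}
```

The two shifts agree for non-negative inputs and diverge only when the lane holds a negative value: shifting -8 right by 1 gives -4 arithmetically but the bit pattern 0x7FFC logically.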