Fix MIPS DSPr2 4:2:0 upsample bug w/ small images

The DSPr2 code was erroneously comparing the residual (t9, width & 0xF) with the end pointer (t4, out + width) instead of with the width itself (a1). This gave the wrong results with any image whose output width was less than 16 pixels. The other small changes (ulw to lw and removal of the nop) are just some easy optimizations around this code.

This issue caused a buffer overrun and subsequent segfault on images whose scaled output height was 1 pixel and whose scaled output width was < 16 pixels. Note that the "plain" (non-fancy and non-merged) upsample routine, which was affected by this bug, is normally not used except when decompressing a non-YCbCr JPEG image, but it is also used when decompressing a single-row image (because the other upsampling algorithms require at least two rows).

Closes #16.
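
As a rough illustration only, the branch condition described above can be modelled in C. This is a sketch, not code from the library: the function names and the flat `uintptr_t` pointer model are hypothetical, and the register mapping (t4, t8, t9, a1) is taken from the description in this message.

```c
#include <stdint.h>

/* Columns left over after the 16-column main loop (t9 in the assembly). */
static unsigned residual(unsigned width) { return width & 0xF; }

/* Old check (hypothetical model): skip the main loop only when
 * (end pointer - residual) is zero.  For any non-null output pointer this
 * is never zero, so the 16-column loop ran even when width < 16. */
static int skips_main_loop_old(uintptr_t out, unsigned width)
{
  uintptr_t end = out + width;            /* t4 = out + width */
  return end - residual(width) == 0;      /* t8 == 0 */
}

/* New check (hypothetical model): skip the main loop when the width equals
 * the residual, which is true exactly when width < 16. */
static int skips_main_loop_new(unsigned width)
{
  return width == residual(width);        /* beq a1, t9, 5f */
}
```

Under this model, a width of 8 skips the main loop only with the new check; the old check would enter the loop and write 16 columns past what the row can hold.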
2015-09-16 23:05:46 -05:00
parent 498d9bc92f
commit 54792ba340
2 changed files with 11 additions and 5 deletions
--- a/ChangeLog.txt
+++ b/ChangeLog.txt
@@ -21,6 +21,13 @@ Clang/LLVM optimizer uses load combining to transfer multiple adjacent 32-bit
 structure members into a single 64-bit register, and this exposed the ABI
 conformance issue.

+[4] Fixed a bug in the MIPS DSPr2 4:2:0 "plain" (non-fancy and non-merged)
+upsampling routine that caused a buffer overflow (and subsequent segfault) when
+decompressing a 4:2:0 JPEG image whose scaled output width was less than 16
+pixels.  The "plain" upsampling routines are normally only used when
+decompressing a non-YCbCr JPEG image, but they are also used when decompressing
+a JPEG image whose scaled output height is 1.
+

 1.4.1
 =====
--- a/simd/jsimd_mips_dspr2.S
+++ b/simd/jsimd_mips_dspr2.S
@@ -1811,12 +1811,11 @@ LEAF_MIPS_DSPR2(jsimd_h2v2_upsample_mips_dspr2)
    bgtz    t4, 2b
     addiu  t5, 2
 3:
-    ulw     t6, 0(t7)       // t6 = outptr
-    ulw     t5, 4(t7)       // t5 = outptr[1]
+    lw      t6, 0(t7)       // t6 = outptr[0]
+    lw      t5, 4(t7)       // t5 = outptr[1]
    addu    t4, t6, a1      // t4 = new end address
-    subu    t8, t4, t9
-    beqz    t8, 5f
-     nop
+    beq     a1, t9, 5f
+     subu   t8, t4, t9
 4:
    ulw     t0, 0(t6)
    ulw     t1, 4(t6)