There was no previous GAS implementation.
NOTE: This doesn't produce much of a speedup when using -O3, because -O3
already enables Neon autovectorization, which works well for the scalar
C implementation of plain upsampling. However, the Neon SIMD
implementation will benefit builds at optimization levels that do not
enable autovectorization.
The previous AArch32 GAS implementation of h2v1 fancy upsampling has
been removed, since the intrinsics implementation provides the same or
better performance. There was no previous GAS implementation of h2v2
fancy upsampling, and there was no previous AArch64 GAS implementation
of h2v1 fancy upsampling.