# AltiVec

## Paper: AltiVec vectorization of hashing algorithms, 2007

This paper is hosted here actually! The first version used DruTeX-rendered LaTeX excerpts of the original paper; today I'm using MathJax, which is much better. I wrote this paper while attempting to optimize MySQL with AltiVec as part of a Genesi project. Unfortunately, it didn't amount to much in terms of accelerating MySQL, but I did invent an algorithm to vectorize a whole family of hashing functions.

The result was this paper.

## Paper: Inverse of Matrix 4x4 using partitioning in Altivec, 2008

Actually that one is already on this site :)

In 2008, I tried to revive my original idea of vectorizing the world for AltiVec. I actually made good progress, but then I made the mistake of taking on a completely unrelated project (Java EE, ugh) that eventually made me shut down my company and lose 2 years of possible progress in AltiVec and vectorization.

Check here for the paper.

## Paper: vectorized Adler32 in Altivec, 2005

Back in 2005, I was convinced that I could vectorize most, if not all, of the vital but unoptimized core routines of the system to use AltiVec. Sadly, I was wrong: it was a huge task and it wasn't even my full-time job. I did, however, manage to optimize *some* routines, even if only as a proof of concept. The Adler32 hashing function was the first of those, and to prove my point I wrote a small paper about it. The paper wasn't entirely rigorous as a mathematical proof, but it was correct and the code was indeed much faster.
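For context, here is a minimal scalar sketch of Adler32 in C. It is not the code from the paper, just the textbook definition, and the names `adler32_ref`, `ADLER_MOD` and `ADLER_NMAX` are made up for this illustration. The point it shows is the deferred modulo: the two running sums only need to be reduced once per block, which is the kind of structure a vectorized version can exploit.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative names, not taken from the paper or from any library. */
#define ADLER_MOD  65521u  /* largest prime below 2^16 */
#define ADLER_NMAX 5552u   /* max bytes before the 32-bit sums could overflow */

/* Scalar reference Adler32: a is the sum of bytes (starting at 1),
 * b is the sum of all intermediate values of a, both modulo 65521. */
uint32_t adler32_ref(const unsigned char *buf, size_t len)
{
    uint32_t a = 1, b = 0;

    while (len > 0) {
        /* Process a block small enough that a and b cannot overflow. */
        size_t n = len < ADLER_NMAX ? len : ADLER_NMAX;
        len -= n;
        while (n--) {
            a += *buf++;
            b += a;
        }
        /* Reduce once per block instead of once per byte. */
        a %= ADLER_MOD;
        b %= ADLER_MOD;
    }
    return (b << 16) | a;
}
```

Calling `adler32_ref((const unsigned char *)"Wikipedia", 9)` gives `0x11E60398`, the commonly quoted check value for that string.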

## Yellow Dog Linux 6.2 includes libfreevec!

Here's the link to the announcement:

http://lists.fixstars.com/pipermail/yellowdog-announce/2009-June/000214.html

From the press release:

> YDL 6.2 now offers libfreevec, a (LGPL) library with replacement routines for GLIBC, such as memcpy(), strlen(), etc. These routines, which have been rewritten and optimized to use the AltiVec vector engine found in the G4/G4+ PowerPC CPUs, can provide for up to 25% increase in application performance.

## 32-bit *signed* integer multiplication with AltiVec

While completing Eigen2 AltiVec support (it should be almost complete now), I noticed that 32-bit integer multiplication didn't always work correctly. As AltiVec does not include any instruction for 32-bit integer multiplication, I used Apple's routine from the Apple Developer site. But this didn't work, and some results were totally off. With some debugging, I found out that this routine works for unsigned 32-bit integers, whereas Eigen2 uses signed integers! So I had to search further and, to my surprise, I found no reference to any similar work. That left me with 2 choices: a) ditch AltiVec integer vectorisation from Eigen2 (not acceptable!), or b) implement my own method! It is obvious which choice I followed :)

UPDATE: Thanks to Matt Sealey, who noticed I could have used vec_abs() instead of vec_sub() and vec_max(). Duh! :D
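To give a feel for the idea, here is a scalar C sketch, not the actual AltiVec code. It assumes, going only by the update note above, that the method multiplies the absolute values and then restores the sign. The unsigned product is built from 16-bit halves, since 16-bit multiplies are the widest integer multiplies classic AltiVec offers. The names `mul32_from_halves` and `mul32_signed` are hypothetical.

```c
#include <stdint.h>

/* Low 32 bits of an unsigned 32x32 product, built only from
 * 16x16 -> 32 partial products (hypothetical helper name). */
static uint32_t mul32_from_halves(uint32_t a, uint32_t b)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    uint32_t low   = a_lo * b_lo;               /* contributes at bit 0  */
    uint32_t cross = a_hi * b_lo + a_lo * b_hi; /* contributes at bit 16 */

    /* a_hi * b_hi would land at bit 32 and is dropped entirely. */
    return low + (cross << 16);
}

/* Assumed approach: multiply the magnitudes, then restore the sign. */
static int32_t mul32_signed(int32_t a, int32_t b)
{
    /* Unsigned negation avoids overflow for INT32_MIN. */
    uint32_t ua = a < 0 ? 0u - (uint32_t)a : (uint32_t)a;
    uint32_t ub = b < 0 ? 0u - (uint32_t)b : (uint32_t)b;

    uint32_t p = mul32_from_halves(ua, ub);

    /* The result is negative only when exactly one operand is negative. */
    if ((a < 0) != (b < 0))
        p = 0u - p;

    /* Conversion back to signed relies on two's complement behaviour. */
    return (int32_t)p;
}
```

In a vector version each of these steps would run on four 32-bit lanes at once, with the branches replaced by selects or masks; that mapping is left out here because it depends on the actual routine.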