SIMD book, first draft published!

Check activity here:

From the update:

Ok, I've been busy the past few days: I started writing the book (using LaTeX :), and I'd like to say that progress has been good. I have fixed the list of SIMD engines that I'm going to include, and it's a long one:

PowerBook G4 12" revamping, part 1

I decided to give my trusty PowerBook G4 a second chance, and I thought it would be a good idea to upgrade some parts of it along the way. As it stands, I can't upgrade the CPU or RAM (the G4 is fixed at 1 GHz and the RAM at 1.2 GB), but I could upgrade the disk and the screen. This time I upgraded the disk, and I also replaced the thermal paste with something much more efficient so that it wouldn't get as hot.

I won't go into the actual details of doing the upgrades; these are covered by the excellent articles: back online!

32-bit *signed* integer multiplication with AltiVec

While completing Eigen2 AltiVec support (it should be almost complete now), I noticed that 32-bit integer multiplication didn't always work correctly. As AltiVec does not include any instruction for 32-bit integer multiplication, I used Apple's routine from the Apple Developer site. But this didn't work, and some results were totally off. After some debugging, I found out that this routine works for unsigned 32-bit integers, whereas Eigen2 uses signed integers! So I had to search more and, to my surprise, I found no reference to any similar work. That left me with two choices: a) ditch AltiVec integer vectorisation from Eigen2 (not acceptable!), or b) implement my own method! It is obvious which choice I followed :)
UPDATE: Thanks to Matt Sealey, who noticed I could have used vec_abs() instead of vec_sub() and vec_max(). Duh! :D
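
To make the idea concrete, here is a minimal scalar C sketch of one way a signed 32-bit multiply can be built from 16-bit multiplies (the widest integer multiply AltiVec offers). It is an assumption about the general scheme, not the routine from the post: take absolute values (the step vec_abs() would cover), multiply the magnitudes as unsigned numbers from 16-bit halves, then put the sign back. The function names are made up for illustration.

```c
#include <stdint.h>

/* Low 32 bits of an unsigned 32x32 product, using only 16x16->32 multiplies.
   The a_hi * b_hi term is dropped because it only affects bits 32 and up. */
static uint32_t mul32_from_16(uint32_t a, uint32_t b)
{
    uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
    uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

    uint32_t low   = a_lo * b_lo;               /* bits 0..31 of lo*lo   */
    uint32_t cross = a_hi * b_lo + a_lo * b_hi; /* wraps modulo 2^32     */

    return low + (cross << 16);
}

/* Signed multiply (hypothetical helper): multiply magnitudes, restore sign.
   All arithmetic is done on unsigned values, so nothing overflows in the
   undefined-behaviour sense; the result wraps like a 32-bit register. */
static int32_t mul32_signed(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t negative = (ua ^ ub) >> 31;        /* 1 if the signs differ */

    uint32_t abs_a = (a < 0) ? 0u - ua : ua;    /* scalar stand-in for vec_abs */
    uint32_t abs_b = (b < 0) ? 0u - ub : ub;

    uint32_t product = mul32_from_16(abs_a, abs_b);
    if (negative)
        product = 0u - product;

    return (int32_t)product;
}
```

In a vector version each of these steps would be applied to four lanes at once; the scalar form is only meant to show the order of operations.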

Inverse of Matrix 4x4 using partitioning in Altivec

We tackle the 4x4 matrix inversion using the matrix partitioning method, as described in the "Numerical Recipes in C" book (2nd ed., though I guess it will be similar in the 3rd edition). Using the AltiVec SIMD unit, we achieve an almost 300% increase in performance, making this routine the fastest matrix inversion method, at least among those known to us!
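
For readers who have not seen the partitioning method, here is a rough scalar C sketch of it, assuming the 4x4 matrix is split into four 2x2 blocks P, Q, R, S. This is not the AltiVec routine from the article; the type and function names are invented for the example, and it simply gives up when S or the block P - Q*inv(S)*R is singular, even though the full matrix may still be invertible.

```c
typedef struct { float m[2][2]; } Mat2;

static Mat2 m2_mul(Mat2 a, Mat2 b)
{
    Mat2 r;
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            r.m[i][j] = a.m[i][0] * b.m[0][j] + a.m[i][1] * b.m[1][j];
    return r;
}

/* r = a + s*b, used for both addition (s = 1) and subtraction (s = -1). */
static Mat2 m2_axpy(Mat2 a, float s, Mat2 b)
{
    Mat2 r;
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            r.m[i][j] = a.m[i][j] + s * b.m[i][j];
    return r;
}

static Mat2 m2_neg(Mat2 a)
{
    Mat2 zero = {{{0.0f, 0.0f}, {0.0f, 0.0f}}};
    return m2_axpy(zero, -1.0f, a);
}

/* 2x2 inverse via the adjugate; returns 0 if the determinant is zero. */
static int m2_inv(Mat2 a, Mat2 *out)
{
    float det = a.m[0][0] * a.m[1][1] - a.m[0][1] * a.m[1][0];
    if (det == 0.0f)
        return 0;
    out->m[0][0] =  a.m[1][1] / det;
    out->m[0][1] = -a.m[0][1] / det;
    out->m[1][0] = -a.m[1][0] / det;
    out->m[1][1] =  a.m[0][0] / det;
    return 1;
}

static Mat2 get_block(const float a[4][4], int r, int c)
{
    Mat2 b = {{{a[r][c], a[r][c + 1]}, {a[r + 1][c], a[r + 1][c + 1]}}};
    return b;
}

static void put_block(float a[4][4], int r, int c, Mat2 b)
{
    a[r][c] = b.m[0][0];     a[r][c + 1] = b.m[0][1];
    a[r + 1][c] = b.m[1][0]; a[r + 1][c + 1] = b.m[1][1];
}

/* Returns 1 on success, 0 if one of the 2x2 blocks it needs is singular. */
int mat4_inverse_partitioned(const float in[4][4], float out[4][4])
{
    Mat2 P = get_block(in, 0, 0), Q = get_block(in, 0, 2);
    Mat2 R = get_block(in, 2, 0), S = get_block(in, 2, 2);
    Mat2 Sinv, Pt;

    if (!m2_inv(S, &Sinv))
        return 0;
    Mat2 QSinv = m2_mul(Q, Sinv);                    /* Q * inv(S)           */
    Mat2 SinvR = m2_mul(Sinv, R);                    /* inv(S) * R           */
    Mat2 schur = m2_axpy(P, -1.0f, m2_mul(QSinv, R)); /* P - Q * inv(S) * R  */
    if (!m2_inv(schur, &Pt))
        return 0;

    put_block(out, 0, 0, Pt);
    put_block(out, 0, 2, m2_neg(m2_mul(Pt, QSinv)));
    put_block(out, 2, 0, m2_neg(m2_mul(SinvR, Pt)));
    put_block(out, 2, 2, m2_axpy(Sinv, 1.0f, m2_mul(m2_mul(SinvR, Pt), QSinv)));
    return 1;
}
```

The appeal for SIMD is that everything reduces to a handful of 2x2 products and two 2x2 inverses, which map well onto four-element vectors.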

AltiVec runtime detection in Linux

After a short Google search on how to detect AltiVec at runtime in Linux (I used keywords such as "runtime altivec detection" and similar), I found that there is no single good article anywhere that describes something so simple. Thankfully, I got a few good answers from benh and dwmw2 in #mklinux/FreeNode, and I decided to write these down in a cleaned-up form.
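
As a rough illustration, one common way to do this on Linux is to look at the AT_HWCAP entry of the ELF auxiliary vector and test the AltiVec bit. The sketch below is not necessarily what the article describes: it assumes a glibc recent enough to provide getauxval() (2.16 or later; on older systems the same data can be read from /proc/self/auxv), and the function name has_altivec() is made up for the example.

```c
#if defined(__linux__)
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP */
#endif

/* Bit used by the Linux kernel on PowerPC to advertise AltiVec in AT_HWCAP.
   Defined here only if the system headers did not already provide it. */
#ifndef PPC_FEATURE_HAS_ALTIVEC
#define PPC_FEATURE_HAS_ALTIVEC 0x10000000UL
#endif

/* Returns 1 if the CPU reports AltiVec support, 0 otherwise.
   On anything that is not Linux on PowerPC it always returns 0,
   because the HWCAP bits mean something else on other architectures. */
int has_altivec(void)
{
#if defined(__linux__) && (defined(__powerpc__) || defined(__powerpc64__))
    return (getauxval(AT_HWCAP) & PPC_FEATURE_HAS_ALTIVEC) != 0;
#else
    return 0;
#endif
}
```

A program would typically call this once at startup and pick either the AltiVec or the scalar code path based on the result.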

Matrix 4x4 Identity matrix

The nice thing about the identity matrix is that we don't have to do any reading of the matrix. And since the form of the identity matrix is already known:
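
A minimal scalar C sketch of that point (not the AltiVec version, and with a made-up function name): the destination is only ever written, never read, because the values come from a constant.

```c
#include <string.h>

/* Overwrite m with the 4x4 identity. The old contents of m are never read. */
void mat4_set_identity(float m[4][4])
{
    static const float identity[4][4] = {
        {1.0f, 0.0f, 0.0f, 0.0f},
        {0.0f, 1.0f, 0.0f, 0.0f},
        {0.0f, 0.0f, 1.0f, 0.0f},
        {0.0f, 0.0f, 0.0f, 1.0f},
    };
    memcpy(m, identity, sizeof identity);
}
```

In a SIMD version the same idea would presumably become four vector stores of constant rows, one per matrix row.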

Matrix 4x4 Transpose (floats)

For the theory behind matrix transposition, please see here.

So, the 4x4 transpose would be:
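
As a plain C sketch of the operation itself (scalar, not the AltiVec listing, and with a made-up function name), transposing in place means swapping each element above the diagonal with its mirror below it:

```c
/* Transpose a 4x4 float matrix in place by swapping m[i][j] with m[j][i]
   for every pair above the diagonal. The diagonal itself stays put. */
void mat4_transpose(float m[4][4])
{
    for (int i = 0; i < 4; ++i) {
        for (int j = i + 1; j < 4; ++j) {
            float tmp = m[i][j];
            m[i][j] = m[j][i];
            m[j][i] = tmp;
        }
    }
}
```

A vector version usually avoids element-wise swaps and interleaves whole rows instead; on AltiVec that is typically done with two rounds of vec_mergeh() and vec_mergel().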

