We aim to provide an up-to-date library of information on SIMD architectures, such as algorithms, tutorials, comparisons and benchmarks.
libfreevec NG!!
Submitted by markos on Tue, 03/24/2009 - 23:24.
I'm in the process of rewriting libfreevec and porting it to other SIMD platforms besides AltiVec (which I consider dead or dying, unfortunately, thanks to the Big Powers that decided it's no longer important, along with PowerPC, but that should be another topic). Anyway, the main platforms chosen are AltiVec (of course :), SSE (SSE2, SSE3 and possibly SSE4), ARM NEON and Cell SPU.
32-bit *signed* integer multiplication with AltiVec
Submitted by markos on Sat, 08/23/2008 - 21:55.
While completing Eigen2 AltiVec support (should be almost complete now), I noticed that 32-bit integer multiplication didn't work correctly all of the time. As AltiVec does not include any instruction for 32-bit integer multiplication, I used Apple's routine from the Apple Developer site. But this didn't work, and some results were totally off. With some debugging, I found out that this routine works for unsigned 32-bit integers, whereas Eigen2 uses signed integers! So I had to search more, and to my surprise, I found no reference to any similar work. So I had two choices: a) ditch AltiVec integer vectorisation from Eigen2 (not acceptable!) or b) implement my own method! It is obvious which choice I followed :)
libfreevec 1.0.4 benchmarks updated!
Submitted by markos on Thu, 08/21/2008 - 11:23.
Hello again, I managed to find time to update all of the libfreevec benchmarks to the latest version, 1.0.4, with more complete tests. I also added a non-PPC architecture (an Athlon X2 5000 @ 2.6 GHz), where the same tests were run (as 32-bit apps on a 64-bit Linux) for comparison. This is important for two reasons:
All benchmarks were run on openSUSE 11.0, except for the G5, which uses Debian Lenny/testing. The compiler used was gcc 4.3.2. All functions have been tested to work correctly on each platform.
HOWTO: Using libfreevec using LD_PRELOAD
Submitted by markos on Tue, 08/19/2008 - 13:18.
OK, let's suppose you've downloaded libfreevec, built it successfully, and now you want to use it for the whole system, without recompiling everything against the library. Is it possible? Thanks to a glibc feature, it is! There are two ways to do that:
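As a side note to the HOWTO above, here is a minimal C sketch of one way a program could check whether a library such as libfreevec was requested through the LD_PRELOAD environment variable. The function names and the example path in the comment are assumptions made for illustration, not taken from the post. The check looks only at the variable itself, so it says nothing about whether the loader actually found and loaded the library.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical invocation (the path is an assumption for illustration):
 *   LD_PRELOAD=/usr/local/lib/libfreevec.so ./myapp
 */

/* Returns true if `name` occurs inside any entry of a preload list.
 * The glibc loader accepts entries separated by spaces or colons. */
bool list_mentions(const char *list, const char *name)
{
    if (list == NULL || name == NULL || *name == '\0')
        return false;

    size_t nlen = strlen(name);
    const char *p = list + strspn(list, " :");   /* skip leading separators */

    while (*p != '\0') {
        size_t len = strcspn(p, " :");           /* length of this entry */
        for (size_t i = 0; i + nlen <= len; i++)
            if (strncmp(p + i, name, nlen) == 0)
                return true;
        p += len;
        p += strspn(p, " :");                    /* move to the next entry */
    }
    return false;
}

/* Convenience wrapper that inspects the current process environment. */
bool preload_mentions(const char *name)
{
    return list_mentions(getenv("LD_PRELOAD"), name);
}
```

A program could call preload_mentions("libfreevec") at startup and log the result, which is handy when comparing benchmark runs with and without the preload.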
Inverse of Matrix 4x4 using partitioning
Submitted by markos on Fri, 04/18/2008 - 17:31.
We tackle the 4x4 matrix inversion using the matrix partitioning method, as described in the "Numerical Recipes in C" book (2nd ed., though I guess it will be similar in the 3rd edition). Using the AltiVec SIMD unit, we achieve an almost 300% increase in performance, making this routine the fastest matrix inversion method, at least among those known to us!
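To make the partitioning idea concrete, here is a plain scalar C sketch, not the AltiVec routine the post refers to. It splits the 4x4 matrix into four 2x2 blocks P, Q, R, S and applies the standard block-inverse identities, assuming both P and the Schur complement S - R P^-1 Q are invertible. All type and function names, the row-major layout and the singularity threshold are assumptions chosen for illustration.

```c
#include <stdbool.h>

#define SINGULAR_EPS 1e-12   /* arbitrary threshold, an assumption of this sketch */

typedef struct { double a, b, c, d; } Mat2;   /* the 2x2 block [a b; c d] */

static Mat2 m2_mul(Mat2 x, Mat2 y) {
    return (Mat2){ x.a*y.a + x.b*y.c, x.a*y.b + x.b*y.d,
                   x.c*y.a + x.d*y.c, x.c*y.b + x.d*y.d };
}
static Mat2 m2_sub(Mat2 x, Mat2 y) {
    return (Mat2){ x.a - y.a, x.b - y.b, x.c - y.c, x.d - y.d };
}
static Mat2 m2_neg(Mat2 x) {
    return (Mat2){ -x.a, -x.b, -x.c, -x.d };
}
static bool m2_inv(Mat2 x, Mat2 *out) {
    double det = x.a*x.d - x.b*x.c;
    if (det > -SINGULAR_EPS && det < SINGULAR_EPS)
        return false;                          /* block is (nearly) singular */
    double r = 1.0 / det;
    *out = (Mat2){ x.d*r, -x.b*r, -x.c*r, x.a*r };
    return true;
}

/* Inverts a row-major 4x4 matrix m into inv using 2x2 partitioning.
 * Returns false when P or the Schur complement is singular; no pivoting
 * is attempted, so some invertible matrices are rejected as well. */
bool invert4x4_partitioned(const double m[16], double inv[16])
{
    Mat2 P = { m[0],  m[1],  m[4],  m[5]  };
    Mat2 Q = { m[2],  m[3],  m[6],  m[7]  };
    Mat2 R = { m[8],  m[9],  m[12], m[13] };
    Mat2 S = { m[10], m[11], m[14], m[15] };
    Mat2 Pi, Si;

    if (!m2_inv(P, &Pi)) return false;
    Mat2 RPi = m2_mul(R, Pi);                  /* R P^-1 */
    Mat2 PiQ = m2_mul(Pi, Q);                  /* P^-1 Q */
    if (!m2_inv(m2_sub(S, m2_mul(RPi, Q)), &Si)) return false;

    Mat2 Qt = m2_neg(m2_mul(PiQ, Si));         /* upper right: -P^-1 Q S~ */
    Mat2 Rt = m2_neg(m2_mul(Si, RPi));         /* lower left:  -S~ R P^-1 */
    Mat2 Pt = m2_sub(Pi, m2_mul(PiQ, Rt));     /* upper left: P^-1 + P^-1 Q S~ R P^-1 */

    inv[0]  = Pt.a; inv[1]  = Pt.b; inv[2]  = Qt.a; inv[3]  = Qt.b;
    inv[4]  = Pt.c; inv[5]  = Pt.d; inv[6]  = Qt.c; inv[7]  = Qt.d;
    inv[8]  = Rt.a; inv[9]  = Rt.b; inv[10] = Si.a; inv[11] = Si.b;
    inv[12] = Rt.c; inv[13] = Rt.d; inv[14] = Si.c; inv[15] = Si.d;
    return true;
}

/* Largest absolute entry of (m * inv - I); close to zero for a good inverse. */
double inverse_residual(const double m[16], const double inv[16])
{
    double worst = 0.0;
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++) {
            double s = (i == j) ? -1.0 : 0.0;
            for (int k = 0; k < 4; k++)
                s += m[4*i + k] * inv[4*k + j];
            if (s < 0.0) s = -s;
            if (s > worst) worst = s;
        }
    return worst;
}
```

The appeal of this formulation for SIMD is that everything reduces to a handful of 2x2 products and two 2x2 inverses, and each 2x2 block fits in a single four-element vector register. A production routine would also need a fallback for inputs where the upper-left block is singular.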