EPFL
DOMAINE-IT
This page presents the exact Smith-Waterman algorithm running on commodity hardware. It is implemented [1] in the recently released CUDA 2.0 programming environment by NVIDIA. CUDA allows direct access to the hardware primitives of the latest generation of Graphics Processing Units (GPUs). Speeds of more than 3.0 GCUPS (Giga Cell Updates Per Second) are achieved on a machine running two GeForce GTX 280* cards.
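
To make the GCUPS unit concrete, here is a minimal sketch of how such a figure is usually derived: the number of dynamic-programming cells (query length times total database residues) divided by the search time. The query length and run time below are hypothetical values chosen for illustration, not measurements from this page; only the residue count is the Swiss-Prot figure quoted on this page.

```python
def gcups(query_length, database_residues, seconds):
    """Giga cell updates per second for one database search.

    One cell is updated for every (query residue, database residue) pair.
    """
    cells = query_length * database_residues
    return cells / seconds / 1e9


SWISSPROT_RESIDUES = 145_550_887  # residue count quoted on this page
QUERY_LENGTH = 500                # hypothetical query length (assumption)
SECONDS = 24.0                    # hypothetical search time (assumption)

rate = gcups(QUERY_LENGTH, SWISSPROT_RESIDUES, SECONDS)
print(f"{rate:.2f} GCUPS")
```

Under these assumed inputs the search would sit just above the 3.0 GCUPS mark; a longer query or a shorter run time raises the figure proportionally.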

Database: Swiss-Prot (number of sequences: 402216, number of residues: 145550887)
Scoring matrix:
Penalty for gap initiation (default: 10) - Penalty for gap extension (default: 2)
Query sequence (UniProtKB/Swiss-Prot O29181):
Device(s): CPU, 1x GPU, 2x GPU
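
The parameters above feed the Smith-Waterman recurrences. The following is a minimal CPU sketch of the local alignment score with affine gap penalties (the Gotoh formulation), not the CUDA implementation from [1]. It makes two assumptions: a simple match/mismatch score stands in for the scoring matrix, and the first residue of a gap costs the initiation penalty while each further residue costs the extension penalty. The function name and the match/mismatch values are hypothetical.

```python
def smith_waterman_score(query, target, match=5, mismatch=-4,
                         gap_open=10, gap_extend=2):
    """Best local alignment score with affine gaps (Gotoh recurrences).

    Assumption: a gap of length k costs gap_open + (k - 1) * gap_extend.
    Assumption: match/mismatch scores replace a real substitution matrix.
    """
    neg_inf = float("-inf")
    n = len(target)
    h_prev = [0] * (n + 1)        # H: best score of an alignment ending at a cell
    f_prev = [neg_inf] * (n + 1)  # F: best score ending in a gap that skips query residues
    best = 0
    for q in query:
        h_cur = [0] * (n + 1)
        f_cur = [neg_inf] * (n + 1)
        e = neg_inf               # E: best score ending in a gap that skips target residues
        for j in range(1, n + 1):
            # Open a new gap from H, or extend the gap already running.
            e = max(h_cur[j - 1] - gap_open, e - gap_extend)
            f_cur[j] = max(h_prev[j] - gap_open, f_prev[j] - gap_extend)
            sub = match if q == target[j - 1] else mismatch
            # Local alignment: a cell never drops below zero.
            h_cur[j] = max(0, h_prev[j - 1] + sub, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


print(smith_waterman_score("AAAAACCCCC", "AAAAAGGCCCCC"))
```

Each cell depends only on its left, upper, and upper-left neighbours, so the sketch keeps a single previous row in memory. A database search repeats this computation for every database sequence, which is the work counted in cell updates.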

*bandwidthTest running on: GeForce GTX 280
Transfer                              Transfer size (bytes)   Bandwidth (MB/s)
Host to Device (pageable memory)      33554432                2579.8
Device to Host (pageable memory)      33554432                2262.2
Device to Device                      33554432                119247.2
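
As a rough illustration of what these bandwidth figures imply, the sketch below estimates how long a host-to-device copy would take. It assumes that bandwidthTest reports MB as 2**20 bytes and that the database is stored at one byte per residue; both are assumptions, and the helper name is hypothetical.

```python
MB = 1 << 20  # assumption: bandwidthTest counts 1 MB as 2**20 bytes


def transfer_seconds(num_bytes, bandwidth_mb_per_s):
    """Estimated copy time for num_bytes at the given sustained bandwidth."""
    return num_bytes / (bandwidth_mb_per_s * MB)


HOST_TO_DEVICE_MB_S = 2579.8   # pageable host-to-device figure reported above
DATABASE_BYTES = 145_550_887   # assumption: one byte per Swiss-Prot residue

t = transfer_seconds(DATABASE_BYTES, HOST_TO_DEVICE_MB_S)
print(f"estimated host-to-device copy time: {t * 1000:.1f} ms")
```

Under these assumptions the one-off copy of the database takes on the order of tens of milliseconds.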
*deviceQuery running on: GeForce GTX 280
Major revision number: 1
Minor revision number: 3
Total amount of global memory: 1073479680 bytes
Number of multiprocessors: 30
Number of cores: 240
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 16384
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1.35 GHz
Concurrent copy and execution: Yes
References
  1. Manavski SA, Valle G (2008). "CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment". BMC Bioinformatics 9 (Suppl 2): S10. doi:10.1186/1471-2105-9-S2-S10, http://www.biomedcentral.com/1471-2105/9/S2/S10
  2. Rognes T, Seeberg E (2000). "Six-fold speed-up of Smith-Waterman sequence database searches using parallel processing on common microprocessors". Bioinformatics 16: 699-706, http://bioinformatics.oxfordjournals.org/cgi/reprint/16/8/699.pdf
  3. Farrar MS (2008). "Optimizing Smith-Waterman for the Cell Broadband Engine", http://farrar.michael.googlepages.com/smith-watermanfortheibmcellbe