accelerate: An embedded language for accelerated array processing

This library defines an embedded language for regular, multi-dimensional array computations with multiple backends to facilitate high-performance implementations. Currently, there are two backends: (1) an interpreter that serves as a reference implementation of the intended semantics of the language and (2) a CUDA backend generating code for CUDA-capable NVIDIA GPUs.
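As a rough illustration of what a computation in the embedded language looks like, here is a minimal dot-product sketch run through the reference interpreter. The module names (Data.Array.Accelerate, Data.Array.Accelerate.Interpreter) and the functions use, fold, zipWith, fromList and run are assumed to match the installed release of the library; the name dotp is only illustrative.

```haskell
import qualified Data.Array.Accelerate             as A
import qualified Data.Array.Accelerate.Interpreter as I

-- Dot product written in the embedded language: multiply the two
-- vectors element-wise, then reduce the products with (+).
dotp :: A.Acc (A.Vector Float)
     -> A.Acc (A.Vector Float)
     -> A.Acc (A.Scalar Float)
dotp xs ys = A.fold (+) 0 (A.zipWith (*) xs ys)

main :: IO ()
main = do
  -- Ordinary Haskell-side arrays of three elements each.
  let xs = A.fromList (A.Z A.:. 3) [1, 2, 3] :: A.Vector Float
      ys = A.fromList (A.Z A.:. 3) [4, 5, 6] :: A.Vector Float
  -- 'A.use' embeds the arrays into the computation; 'I.run' evaluates
  -- it with the interpreter backend (assumed entry point).
  print (I.run (dotp (A.use xs) (A.use ys)))
```

The computation itself is backend-independent: only the run function comes from a particular backend, so the same dotp could in principle be handed to another backend's equivalent entry point.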

To use the CUDA backend, you need to have CUDA version 3.x installed. The CUDA backend currently doesn't support Char and Bool arrays.

An experimental OpenCL backend is available at and an experimental multicore CPU backend building on the Repa array library is available at

Known bugs:

  • New in New functions zip3, zip4, unzip3, unzip4, fill, enumFromN, enumFromStepN, tail, init, drop, take, slit, gather, gatherIf, scatter, scatterIf, and shapeSize. New simplified AST (in package accelerate-backend-kit) for backend writers who want to avoid the complexities of the type-safe AST.
  • New in Complete sharing recovery for scalar expressions (but currently disabled by default). Also bug fixes in array sharing recovery and a few new convenience functions.
  • New in Streaming, precompilation, Repa-style indices, stencils, more scans, rank-polymorphic fold, generate, block I/O & many bug fixes
  • New in Bug fixes and some performance tweaks
  • New in replicate, slice and foldSeg supported in the CUDA backend; frontend and interpreter support for stencil; bug fixes
  • New in the CUDA backend and a number of scalar functions
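
A few of the array functions named in the release notes above can be combined as in the following sketch. It assumes enumFromN takes a shape and a start value, take keeps a prefix of a vector, and index1 builds a one-dimensional shape; these signatures should be checked against the installed version. The name firstThree is hypothetical.

```haskell
import qualified Data.Array.Accelerate             as A
import qualified Data.Array.Accelerate.Interpreter as I

-- Build the five-element vector 10, 11, 12, 13, 14 with 'enumFromN',
-- then keep only its first three elements with 'take'.
firstThree :: A.Acc (A.Vector Int)
firstThree = A.take 3 (A.enumFromN (A.index1 5) 10)

main :: IO ()
main = print (I.run firstThree)  -- evaluated with the reference interpreter
```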

For documentation, see the homepage and