The High Performance Scripting team at Intel Labs is pleased to announce the release of version 0.2 of ParallelAccelerator.jl, a package for high-performance parallel computing in Julia, primarily oriented around arrays and stencils. In this release, we provide support for Julia 0.5 and introduce experimental support for the Julia native threading backend. Julia 0.4 is still supported, but that support should be considered deprecated; we recommend that everyone move to Julia 0.5, as Julia 0.4 support may be removed in a future release.
The goal of ParallelAccelerator is to accelerate the computational kernel of an application: the programmer simply annotates the kernel function with the @acc (short for "accelerate") macro provided by the ParallelAccelerator package. In version 0.2, ParallelAccelerator still defaults to transforming the kernel into OpenMP C code, compiling it with a system C compiler (ICC or GCC), and transparently invoking the compiled code from Julia as if the program were running normally.
However, ParallelAccelerator v0.2 also introduces an experimental backend for Julia's native threading (which is itself experimental). To enable native threading mode, set the environment variable PROSPECT_MODE=threads. In this mode, ParallelAccelerator identifies pieces of code that can run in parallel and runs them as if they had been annotated with Julia's @threads; the code then goes through the standard Julia compiler pipeline with LLVM.
The ParallelAccelerator C backend has the limitation that kernel functions, and anything they call, cannot include code that is not type-stable to a single type. In particular, variables of type Any are not supported. In practice, this restriction was a significant limitation. The native threading backend needs no such restriction, so it should handle arbitrary Julia code.
Under the hood, ParallelAccelerator is essentially a domain-specific compiler written in Julia. It performs additional analysis and optimization on top of the Julia compiler. ParallelAccelerator discovers and exploits the implicit parallelism in source programs that use parallel programming patterns such as map, reduce, comprehension, and stencil. For example, Julia array operators such as .+, .-, .*, ./ are translated by ParallelAccelerator internally into data-parallel map operations over all elements of input arrays.
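To make the map translation and the type-stability restriction concrete, here is a minimal sketch in plain Julia. It does not load ParallelAccelerator and does not show the package's internals; it only illustrates the semantics described above. The names kernel, kernel_as_map, unstable_clip, and stable_clip are hypothetical and chosen for illustration.

```julia
# Plain Julia only: ParallelAccelerator is deliberately not loaded here.
a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]

# A small kernel built from element-wise array operators.
# Hypothetical name; under ParallelAccelerator the definition would be
# prefixed with @acc, as described above.
kernel(x, y) = x .+ y .* x

# The same computation written as an explicit data-parallel map:
# each output element depends only on the matching input elements.
kernel_as_map(x, y) = map((xi, yi) -> xi + yi * xi, x, y)

@assert kernel(a, b) == kernel_as_map(a, b)

# Type stability: this function returns an Int for positive input and a
# Float64 otherwise, so its result is not stable to a single type.
unstable_clip(x::Int) = x > 0 ? x : 0.0

# A type-stable variant always returns a Float64.
stable_clip(x::Int) = x > 0 ? Float64(x) : 0.0

@assert typeof(unstable_clip(1)) != typeof(unstable_clip(-1))
@assert typeof(stable_clip(1)) == typeof(stable_clip(-1))
```

Because each element of the result is computed independently, a kernel like this is the kind of code that can be split across threads. A function shaped like unstable_clip is the kind of code the C backend restriction rules out, while stable_clip shows one way to rewrite it.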
For the most part, these patterns are already present in standard Julia, so programmers can use ParallelAccelerator to run the same Julia program without (significantly) modifying the source code.
Version 0.2 should be considered an alpha release, suitable for early adopters and Julia enthusiasts. Please file bugs at https://travis-ci.org/IntelLabs/ParallelAccelerator.jl/issues . See our GitHub repository at https://github.com/IntelLabs/ParallelAccelerator.jl for a complete list of prerequisites, supported platforms, example programs, and documentation.
Thanks to our colleagues at Intel and Intel Labs, the Julia team, and the broader Julia community for their support of our efforts!
Best regards,
The High Performance Scripting team
(Parallel Computing Lab, Intel Labs)