Best way to do GPGPU/CUDA/OpenCL in Java?

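General-purpose computing on graphics processing units (GPGPU) is a very attractive concept: harnessing the power of the GPU for any kind of computation.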

I'd love to use GPGPU for image processing, particles, and fast geometric operations.

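Right now, it seems that the two contenders in this space are CUDA and OpenCL. I'd like to know:

  • Is OpenCL usable from Java on Windows/Mac?
  • What libraries are there for interfacing with OpenCL/CUDA?
  • Is using JNA directly an option?
  • Am I forgetting something?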

Any real-world experience/examples/war stories are appreciated.


CUDA is an extension of C: to write a CUDA kernel you have to code in C and then compile it to executable form with NVIDIA's CUDA compiler. The resulting native code can then be linked with Java using JNI. So technically you can't write kernel code from Java. There is JCUDA (http://www.jcuda.de/jcuda/JCuda.html), which provides CUDA's APIs for general memory/device management, plus some Java methods that are implemented in CUDA and wrapped with JNI (FFT, some linear algebra methods, etc.).
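
To make the JNI route above concrete, here is a minimal sketch of what the Java side of such a wrapper might look like. The library name `saxpy_cuda` and the method `saxpyNative` are hypothetical; the native half (a C/CUDA function compiled with NVIDIA's compiler and exported through JNI) is assumed to exist elsewhere and is not shown. When the library cannot be loaded, the sketch falls back to a plain Java loop so that it still runs.

```java
// Hypothetical Java-side wrapper around a CUDA routine exposed through JNI.
final class NativeSaxpy {
    // Assumed name of a native library built separately (not part of this sketch).
    private static final String LIBRARY_NAME = "saxpy_cuda";

    // True only if the native library was found and loaded.
    private static final boolean NATIVE_AVAILABLE = tryLoad();

    private static boolean tryLoad() {
        try {
            System.loadLibrary(LIBRARY_NAME);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            return false; // library missing or not loadable: use the Java fallback
        }
    }

    // Declared here, implemented on the native side (hypothetical).
    private static native void saxpyNative(float a, float[] x, float[] y);

    /** Computes y[i] = a * x[i] + y[i] in place. */
    static void saxpy(float a, float[] x, float[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        if (NATIVE_AVAILABLE) {
            saxpyNative(a, x, y);
            return;
        }
        // Pure-Java fallback with the same semantics.
        for (int i = 0; i < x.length; i++) {
            y[i] = a * x[i] + y[i];
        }
    }

    static boolean usesNative() {
        return NATIVE_AVAILABLE;
    }
}
```

Calling `NativeSaxpy.saxpy(2f, x, y)` updates `y` in place; `usesNative()` reports which path was taken.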

On the other hand, OpenCL is just an API. OpenCL kernels are plain strings passed to the API, so when using OpenCL from Java you should be able to specify your own kernels. An OpenCL binding for Java can be found here: http://www.jocl.org/.
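
As an illustration of "kernels are plain strings", the sketch below keeps a small OpenCL C kernel in an ordinary Java string, next to a CPU reference of what that kernel is meant to compute. The class and method names are made up for this example, and no binding calls are shown: with a real binding, the string would be handed to the runtime's program-creation call (`clCreateProgramWithSource` in the OpenCL C API) and compiled at run time.

```java
// Illustrative only: an OpenCL C kernel held as a Java string.
final class VectorAddKernel {
    // OpenCL C source. A Java binding would pass this text to the OpenCL
    // runtime, which compiles it for the selected device at run time.
    static final String SOURCE = String.join("\n",
        "__kernel void vector_add(__global const float* a,",
        "                         __global const float* b,",
        "                         __global float* out)",
        "{",
        "    int i = get_global_id(0);",
        "    out[i] = a[i] + b[i];",
        "}");

    /** CPU reference: the result the kernel above should produce. */
    static float[] reference(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("inputs must have the same length");
        }
        float[] out = new float[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i] + b[i];
        }
        return out;
    }
}
```

A CPU reference like this is handy for checking the values read back from the device when first wiring up a kernel.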

I've been using JOCL and I'm very happy with it.

The main disadvantage of OpenCL compared with CUDA (at least for me) is the lack of available libraries (Thrust, CUDPP, etc.). However, CUDA code can be easily ported to OpenCL, and looking at how those libraries work (algorithms, strategies, etc.) is actually very nice, as you learn a lot from it.

AFAIK, JavaCL / OpenCL4Java is the only OpenCL binding that is available on all platforms right now (including Mac OS X, FreeBSD, Linux, Windows, and Solaris, all in Intel 32-bit, 64-bit, and PPC variants, thanks to its use of JNA).

It has demos that actually run fine from Java Web Start, at least on Mac and Windows, such as this Particles Demo (to avoid random crashes on Linux, please see this wiki page).

It also comes with a few utilities (GPGPU random number generation, basic parallel reduction, linear algebra) and a Scala DSL.

Finally, it is the oldest set of bindings available (since June 2009), and it has an active user community.

(Disclaimer: I'm JavaCL's author :-))

You may also consider Aparapi. It allows you to write your code in Java and will attempt to convert bytecode to OpenCL at runtime.
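
To give a feel for the shape of code that lends itself to this kind of translation, here is a CPU-only stand-in written with `java.util.stream`. It does not use Aparapi's API; `forEachIndex` and `squares` are hypothetical names. The point is the structure: a body that runs once per index and touches only its own element, which is the pattern a bytecode-to-OpenCL converter would need to recognize.

```java
import java.util.function.IntConsumer;
import java.util.stream.IntStream;

// CPU stand-in for a data-parallel "kernel" written in plain Java.
final class DataParallelSketch {
    /** Runs body once for every index in [0, range), in parallel on CPU threads. */
    static void forEachIndex(int range, IntConsumer body) {
        IntStream.range(0, range).parallel().forEach(body);
    }

    /** Squares every element of the input. */
    static float[] squares(float[] in) {
        float[] out = new float[in.length];
        // Each invocation writes only out[gid], so no synchronization is needed.
        forEachIndex(in.length, gid -> out[gid] = in[gid] * in[gid]);
        return out;
    }
}
```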

Full disclosure: I am the Aparapi developer.

I know it's late but take a look at this: https://github.com/pcpratts/rootbeer1

I have not worked with it, but it seems much easier to use than other solutions.

From the project page:

Rootbeer is more advanced than CUDA or OpenCL Java Language Bindings. With bindings the developer must serialize complex graphs of objects into arrays of primitive types. With Rootbeer this is done automatically. Also with language bindings, the developer must write the GPU kernel in CUDA or OpenCL. With Rootbeer a static analysis of the Java Bytecode is done (using Soot) and CUDA code is automatically generated.
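
For context, the manual serialization step the quote refers to might look like the sketch below when using plain bindings: a list of objects is packed into one primitive array before being copied to the device. The `Particle` class and its four-float layout are hypothetical and chosen only for illustration.

```java
import java.util.List;

// Packs hypothetical Particle objects into a flat float[] for transfer to a GPU buffer.
final class ParticleFlattener {
    static final class Particle {
        final float x, y, z, mass;

        Particle(float x, float y, float z, float mass) {
            this.x = x;
            this.y = y;
            this.z = z;
            this.mass = mass;
        }
    }

    // Number of floats stored per particle: x, y, z, mass.
    static final int STRIDE = 4;

    /** Interleaved layout: [x0, y0, z0, m0, x1, y1, z1, m1, ...]. */
    static float[] flatten(List<Particle> particles) {
        float[] out = new float[particles.size() * STRIDE];
        int offset = 0;
        for (Particle p : particles) {
            out[offset] = p.x;
            out[offset + 1] = p.y;
            out[offset + 2] = p.z;
            out[offset + 3] = p.mass;
            offset += STRIDE;
        }
        return out;
    }

    /** Rebuilds the particle stored at the given index of a flattened array. */
    static Particle read(float[] data, int index) {
        int o = index * STRIDE;
        return new Particle(data[o], data[o + 1], data[o + 2], data[o + 3]);
    }
}
```

With nested or variable-sized objects this bookkeeping grows quickly, which is the burden the project page says Rootbeer takes over.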

I can also recommend JOCL by jogamp.org, which works on Linux, Mac, and Windows. CONRAD, for example, makes heavy use of OpenCL in combination with JOCL.

You can take a look at the CUDA4J API

http://sett.com/gpgpu/the-cuda4j-api

If you want to do some image processing or geometric operations, you may want a linear algebra library with GPU support (with CUDA, for instance). I would suggest ND4J, which is the linear algebra library with CUDA GPU support on which DeepLearning4J is built. With it you don't have to deal with CUDA directly or write low-level code in C. Plus, if you want to do more with images using DL4J, you will have access to specific image processing operations such as convolution.
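
To show what a convolution operation computes, here is a plain-Java CPU reference. It does not use ND4J's or DL4J's API; the class and method names are hypothetical. It implements the unpadded ("valid") sliding-window form without flipping the kernel, which is strictly a cross-correlation but is what many deep-learning libraries call convolution. A GPU-backed library would run the same arithmetic in parallel across output pixels.

```java
// CPU reference for an unpadded 2D sliding-window filter (cross-correlation form).
final class Convolution2D {
    /** Returns an output of size (H - kh + 1) x (W - kw + 1). */
    static float[][] correlateValid(float[][] image, float[][] kernel) {
        int kh = kernel.length;
        int kw = kernel[0].length;
        int oh = image.length - kh + 1;
        int ow = image[0].length - kw + 1;
        if (oh <= 0 || ow <= 0) {
            throw new IllegalArgumentException("kernel is larger than the image");
        }
        float[][] out = new float[oh][ow];
        for (int y = 0; y < oh; y++) {
            for (int x = 0; x < ow; x++) {
                float sum = 0f;
                // Weighted sum of the window whose top-left corner is (y, x).
                for (int ky = 0; ky < kh; ky++) {
                    for (int kx = 0; kx < kw; kx++) {
                        sum += image[y + ky][x + kx] * kernel[ky][kx];
                    }
                }
                out[y][x] = sum;
            }
        }
        return out;
    }
}
```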