Why is System.arraycopy native in Java?

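I was surprised to see in the Java source that System.arraycopy is a native method.

Of course, the reason is that it's faster. But what native tricks can the code use to make it faster?

Why not just loop over the original array and copy each pointer to the new array? Surely that isn't so slow and cumbersome?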


In native code, it can be done with a single memcpy / memmove, as opposed to n distinct copy operations. The difference in performance is substantial.

It can't be written in Java. Native code is able to ignore or elide the difference between arrays of Object and arrays of primitives. Java can't do that, at least not efficiently.

And it can't be written with a single memcpy(), because of the semantics required by overlapping arrays.
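To make the overlap point concrete, here is a minimal sketch (the class and method names are made up for illustration). It shifts the contents of an array one slot to the right, once with a naive forward loop and once with System.arraycopy, which is specified to behave as if the source range were first copied to a temporary array.

```java
import java.util.Arrays;

public class OverlapDemo {
    // Naive forward loop: overwrites elements before they have been read.
    static int[] shiftWithForwardLoop(int[] a) {
        int[] data = a.clone();
        for (int i = 0; i < data.length - 1; i++) {
            data[i + 1] = data[i];
        }
        return data;
    }

    // System.arraycopy handles the overlapping source and destination ranges.
    static int[] shiftWithArraycopy(int[] a) {
        int[] data = a.clone();
        System.arraycopy(data, 0, data, 1, data.length - 1);
        return data;
    }

    public static void main(String[] args) {
        int[] input = {1, 2, 3, 4, 5};
        System.out.println(Arrays.toString(shiftWithForwardLoop(input))); // [1, 1, 1, 1, 1]
        System.out.println(Arrays.toString(shiftWithArraycopy(input)));   // [1, 1, 2, 3, 4]
    }
}
```

A native implementation would need something with memmove-like behaviour (or a choice of copy direction) to get the second result.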

There are a few reasons:

  1. The JIT is unlikely to generate low-level code as efficient as hand-written C. Using low-level C enables a lot of optimizations that are close to impossible for a generic JIT compiler.

    See "Optimizing Memcpy improves speed" for some tricks and speed comparisons of hand-written C implementations (memcpy, but the principle is the same).

  2. The C version is pretty much independent of the type and size of the array members. It is not possible to do the same in Java, since there is no way to get the array contents as a raw block of memory (e.g. a pointer).
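As a rough illustration of the second point, this is what a type-independent copy could look like in pure Java. It is only a sketch: reflectiveCopy is a hypothetical helper, not how any JDK implements arraycopy. It goes through java.lang.reflect.Array, so every primitive element is boxed and unboxed on the way, and it does not handle overlapping ranges.

```java
import java.lang.reflect.Array;
import java.util.Arrays;

public class GenericCopyDemo {
    // Hypothetical pure-Java copy that accepts any array type, like arraycopy's
    // Object parameters do. Array.get boxes primitives; Array.set unboxes them.
    static void reflectiveCopy(Object src, int srcPos, Object dest, int destPos, int length) {
        for (int i = 0; i < length; i++) {
            Array.set(dest, destPos + i, Array.get(src, srcPos + i));
        }
    }

    public static void main(String[] args) {
        int[] ints = {1, 2, 3};
        int[] intsCopy = new int[3];
        reflectiveCopy(ints, 0, intsCopy, 0, 3);

        String[] strings = {"a", "b"};
        String[] stringsCopy = new String[2];
        reflectiveCopy(strings, 0, stringsCopy, 0, 2);

        System.out.println(Arrays.toString(intsCopy));    // [1, 2, 3]
        System.out.println(Arrays.toString(stringsCopy)); // [a, b]
    }
}
```

The alternative in plain Java is one hand-written loop per element type, which is exactly the duplication the native version avoids.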

It is, of course, implementation dependent.

HotSpot will treat it as an "intrinsic" and insert code at the call site. That is machine code, not slow old C code. This also means the problems with the signature of the method largely go away.

A simple copy loop is simple enough that obvious optimisations can be applied to it. For instance loop unrolling. Exactly what happens is again implementation dependent.
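For readers unfamiliar with the term, here is a hand-written sketch of what unrolling a copy loop by a factor of four means. This only illustrates the transformation; the names are invented, and what a given JIT actually emits (unroll factor, vector instructions, and so on) is an assumption that varies by implementation.

```java
public class UnrollDemo {
    // The straightforward loop: one element per iteration.
    static void copySimple(int[] src, int[] dest, int n) {
        for (int i = 0; i < n; i++) {
            dest[i] = src[i];
        }
    }

    // The same copy unrolled by four: fewer loop-condition checks per element.
    static void copyUnrolled(int[] src, int[] dest, int n) {
        int i = 0;
        int limit = n - (n % 4);
        for (; i < limit; i += 4) {
            dest[i] = src[i];
            dest[i + 1] = src[i + 1];
            dest[i + 2] = src[i + 2];
            dest[i + 3] = src[i + 3];
        }
        // Copy the remaining 0 to 3 elements.
        for (; i < n; i++) {
            dest[i] = src[i];
        }
    }

    public static void main(String[] args) {
        int[] src = {1, 2, 3, 4, 5, 6, 7};
        int[] a = new int[src.length];
        int[] b = new int[src.length];
        copySimple(src, a, src.length);
        copyUnrolled(src, b, src.length);
        System.out.println(java.util.Arrays.equals(a, b)); // true
    }
}
```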

In my own tests, System.arraycopy() is 10 to 20 times faster than nested for loops when copying multidimensional arrays:

float[][] foo = mLoadMillionsOfPoints(); // result is a float[1200000][9]
float[][] fooCpy = new float[foo.length][foo[0].length];

// Time System.arraycopy(). On a float[][] this copies only the row references.
long lTime = System.currentTimeMillis();
System.arraycopy(foo, 0, fooCpy, 0, foo.length);
System.out.println("System.arraycopy() duration: " + (System.currentTimeMillis() - lTime) + " ms");

// Time the nested loops, which copy element by element.
lTime = System.currentTimeMillis();
for (int i = 0; i < foo.length; i++)
{
    for (int j = 0; j < foo[0].length; j++)
    {
        fooCpy[i][j] = foo[i][j];
    }
}
System.out.println("loop duration: " + (System.currentTimeMillis() - lTime) + " ms");

// Verify that the copy matches the source.
for (int i = 0; i < foo.length; i++)
{
    for (int j = 0; j < foo[0].length; j++)
    {
        if (fooCpy[i][j] != foo[i][j])
        {
            System.err.println("ERROR at " + i + ", " + j);
        }
    }
}

This prints:

System.arraycopy() duration: 1 ms
loop duration: 16 ms
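When reading these numbers, keep in mind that System.arraycopy on a float[][] copies the references to the rows, not the floats inside them, so it does less work than an element-by-element loop. A like-for-like comparison would copy each row separately. The sketch below (class and method names are illustrative, and no timings are claimed) shows the difference between the two kinds of copy.

```java
public class DeepCopyDemo {
    // Copies only the row references: both arrays end up sharing the same rows.
    static float[][] shallowCopy(float[][] src) {
        float[][] dest = new float[src.length][];
        System.arraycopy(src, 0, dest, 0, src.length);
        return dest;
    }

    // Allocates fresh rows and copies the elements of each one with arraycopy.
    static float[][] deepCopy(float[][] src) {
        float[][] dest = new float[src.length][];
        for (int i = 0; i < src.length; i++) {
            dest[i] = new float[src[i].length];
            System.arraycopy(src[i], 0, dest[i], 0, src[i].length);
        }
        return dest;
    }

    public static void main(String[] args) {
        float[][] foo = {{1f, 2f}, {3f, 4f}};
        float[][] shallow = shallowCopy(foo);
        float[][] deep = deepCopy(foo);

        foo[0][0] = 99f; // modify the source after copying

        System.out.println(shallow[0][0]); // 99.0 (row is shared with foo)
        System.out.println(deep[0][0]);    // 1.0  (row is independent)
    }
}
```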