System.arraycopy for array copy

To compare a plain for-loop copy with System.arraycopy, I wrote a simple program that copies an int[100000] both ways and uses System.nanoTime to time each copy.

The program is as follows:

int[] a = new int[100000];
for (int i = 0; i < a.length; i++) {
    a[i] = i;
}

int[] b = new int[100000];

int[] c = new int[100000];
for (int i = 0; i < c.length; i++) {
    c[i] = i;
}

int[] d = new int[100000];

for (int k = 0; k < 10; k++) {
    // Time the element-by-element copy from a to b
    long start1 = System.nanoTime();
    for (int i = 0; i < a.length; i++) {
        b[i] = a[i];
    }
    long end1 = System.nanoTime();
    System.out.println("end1 - start1 = " + (end1 - start1));

    // Time the System.arraycopy call from c to d
    long start2 = System.nanoTime();
    System.arraycopy(c, 0, d, 0, 100000);
    long end2 = System.nanoTime();
    System.out.println("end2 - start2 = " + (end2 - start2));

    System.out.println();
}

To keep memory allocation and one-off fluctuations from distorting the measurement, I allocated all the arrays up front and only then ran the timed copies, repeating them 10 times. The results (in nanoseconds) were:

end1 - start1 = 366806
end2 - start2 = 109154

end1 - start1 = 380529
end2 - start2 = 79849

end1 - start1 = 421422
end2 - start2 = 68769

end1 - start1 = 344463
end2 - start2 = 72020

end1 - start1 = 333174
end2 - start2 = 77277

end1 - start1 = 377335
end2 - start2 = 82285

end1 - start1 = 370608
end2 - start2 = 66937

end1 - start1 = 349067
end2 - start2 = 86532

end1 - start1 = 389974
end2 - start2 = 83362

end1 - start1 = 347937
end2 - start2 = 63638
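
The same comparison can be packaged as a self-contained class. This is only a minimal sketch: the class name ArrayCopyTiming, the helper methods, and the warm-up rounds are my own assumptions and are not part of the program above. The warm-up is there because the first few iterations of a Java loop may run before the JIT has compiled it, which can skew a short measurement. Absolute numbers will differ from machine to machine.

```java
import java.util.Arrays;

// Hypothetical helper class; none of these names come from the original program.
class ArrayCopyTiming {
    static final int SIZE = 100_000;

    // Copies element by element, like the first timed loop in the program above.
    static void loopCopy(int[] src, int[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
    }

    // Returns the elapsed time of one run of the action, in nanoseconds.
    static long time(Runnable action) {
        long start = System.nanoTime();
        action.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int[] a = new int[SIZE];
        for (int i = 0; i < a.length; i++) {
            a[i] = i;
        }
        int[] b = new int[SIZE];
        int[] d = new int[SIZE];

        // Assumed warm-up rounds so both code paths have been exercised before timing.
        for (int k = 0; k < 1_000; k++) {
            loopCopy(a, b);
            System.arraycopy(a, 0, d, 0, SIZE);
        }

        for (int k = 0; k < 10; k++) {
            long loop = time(() -> loopCopy(a, b));
            long bulk = time(() -> System.arraycopy(a, 0, d, 0, SIZE));
            System.out.println("loop = " + loop + " ns, arraycopy = " + bulk + " ns");
        }

        // Sanity check that both approaches produced the same contents.
        System.out.println("copies equal: " + (Arrays.equals(a, b) && Arrays.equals(a, d)));
    }
}
```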

System.arraycopy was faster in every round, by roughly 3 to 6 times. To see how it is implemented underneath, I went through some of the OpenJDK source code.

System.arraycopy is a native method, so the real work happens in the native layer:

public static native void arraycopy(Object src, int srcPos,
                                        Object dest, int destPos,
                                        int length);
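
Before going into the native code, here is a small sketch of what the five parameters mean in practice. The class ArrayCopyParams and the helper copyInto are hypothetical names used only for illustration; the only real API involved is System.arraycopy itself.

```java
import java.util.Arrays;

// Hypothetical illustration class; only System.arraycopy is a real JDK API here.
class ArrayCopyParams {
    // Copies `length` elements starting at src[srcPos] into a new array of
    // size destLen, placing them starting at index destPos.
    static int[] copyInto(int[] src, int srcPos, int destLen, int destPos, int length) {
        int[] dest = new int[destLen];
        System.arraycopy(src, srcPos, dest, destPos, length);
        return dest;
    }

    public static void main(String[] args) {
        int[] src = {10, 20, 30, 40, 50};
        // Take 3 elements starting at src[1] and store them starting at dest[2].
        int[] dest = copyInto(src, 1, 6, 2, 3);
        System.out.println(Arrays.toString(dest)); // [0, 0, 20, 30, 40, 0]
    }
}
```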

The native side is in openjdk6-src/hotspot/src/share/vm/prims/jvm.cpp, where JVM_ArrayCopy is the entry point:

JVM_ENTRY(void, JVM_ArrayCopy(JNIEnv *env, jclass ignored, jobject src, jint src_pos,
                               jobject dst, jint dst_pos, jint length))
  JVMWrapper("JVM_ArrayCopy");
  // Check if we have null pointers
  if (src == NULL || dst == NULL) {
    THROW(vmSymbols::java_lang_NullPointerException());
  }
  arrayOop s = arrayOop(JNIHandles::resolve_non_null(src));
  arrayOop d = arrayOop(JNIHandles::resolve_non_null(dst));
  assert(s->is_oop(), "JVM_ArrayCopy: src not an oop");
  assert(d->is_oop(), "JVM_ArrayCopy: dst not an oop");
  // Do copy
  Klass::cast(s->klass())->copy_array(s, src_pos, d, dst_pos, length, thread);
JVM_END

The first statements are all checks. The actual copy is done by the final call, copy_array(s, src_pos, d, dst_pos, length, thread). For arrays of primitive types it is implemented in openjdk6-src/hotspot/src/share/vm/oops/typeArrayKlass.cpp:

void typeArrayKlass::copy_array(arrayOop s, int src_pos, arrayOop d, int dst_pos, int length, TRAPS) {
  assert(s->is_typeArray(), "must be type array");

  // Check destination
  if (!d->is_typeArray() || element_type() != typeArrayKlass::cast(d->klass())->element_type()) {
    THROW(vmSymbols::java_lang_ArrayStoreException());
  }

  // Check is all offsets and lengths are non negative
  if (src_pos < 0 || dst_pos < 0 || length < 0) {
    THROW(vmSymbols::java_lang_ArrayIndexOutOfBoundsException());
  }
  // Check if the ranges are valid
  if ( (((unsigned int) length + (unsigned int) src_pos) > (unsigned int) s->length())
     || (((unsigned int) length + (unsigned int) dst_pos) > (unsigned int) d->length()) ) {
    THROW(vmSymbols::java_lang_ArrayIndexOutOfBoundsException());
  }
  // Check zero copy
  if (length == 0)
    return;

  // This is an attempt to make the copy_array fast.
  int l2es = log2_element_size();
  int ihs = array_header_in_bytes() / wordSize;
  char* src = (char*) ((oop*)s + ihs) + ((size_t)src_pos << l2es);
  char* dst = (char*) ((oop*)d + ihs) + ((size_t)dst_pos << l2es);
  Copy::conjoint_memory_atomic(src, dst, (size_t)length << l2es); // the actual copy is delegated to this call
}

As before, almost all of this function is validation; only the last statement actually copies anything.
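
The checks in copy_array can be observed from the Java side. The sketch below is an illustration under my own naming (ArrayCopyChecks, attempt and byteOffset are hypothetical); it relies only on the exceptions documented for System.arraycopy, where the range failure is specified as an IndexOutOfBoundsException. The byteOffset helper just mirrors the `pos << l2es` arithmetic, where l2es is 2 for a 4-byte int.

```java
// Hypothetical illustration class; only System.arraycopy is a real JDK API here.
class ArrayCopyChecks {
    // Runs the copy and reports which of the up-front checks rejected it, if any.
    static String attempt(Object src, int srcPos, Object dest, int destPos, int length) {
        try {
            System.arraycopy(src, srcPos, dest, destPos, length);
            return "ok";
        } catch (NullPointerException e) {
            return "null array";
        } catch (ArrayStoreException e) {
            return "type mismatch";
        } catch (IndexOutOfBoundsException e) {
            return "bad range";
        }
    }

    // Byte offset of element `pos`, mirroring `(size_t)pos << l2es` in copy_array.
    static long byteOffset(int pos, int log2ElementSize) {
        return (long) pos << log2ElementSize;
    }

    public static void main(String[] args) {
        int[] ints = new int[4];
        System.out.println(attempt(null, 0, ints, 0, 1));        // null array
        System.out.println(attempt(ints, 0, new long[4], 0, 1)); // type mismatch
        System.out.println(attempt(ints, -1, new int[4], 0, 1)); // bad range
        System.out.println(attempt(ints, 2, new int[4], 0, 3));  // bad range
        System.out.println(attempt(ints, 0, new int[4], 0, 0));  // ok
        System.out.println(byteOffset(3, 2));                    // 12
    }
}
```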

Copy::conjoint_memory_atomic is defined in openjdk6-src/hotspot/src/share/vm/utilities/copy.cpp:

// Copy bytes; larger units are filled atomically if everything is aligned.
void Copy::conjoint_memory_atomic(void* from, void* to, size_t size) {
  address src = (address) from;
  address dst = (address) to;
  uintptr_t bits = (uintptr_t) src | (uintptr_t) dst | (uintptr_t) size;

  // (Note: We could improve performance by ignoring the low bits of size,
  // and putting a short cleanup loop after each bulk copy loop.
  // There are plenty of other ways to make this faster also,
  // and it's a slippery slope. For now, let's keep this code simple
  // since the simplicity helps clarify the atomicity semantics of
  // this operation. There are also CPU-specific assembly versions
  // which may or may not want to include such optimizations.)

  if (bits % sizeof(jlong) == 0) {
    Copy::conjoint_jlongs_atomic((jlong*) src, (jlong*) dst, size / sizeof(jlong));
  } else if (bits % sizeof(jint) == 0) {
    Copy::conjoint_jints_atomic((jint*) src, (jint*) dst, size / sizeof(jint));
  } else if (bits % sizeof(jshort) == 0) {
    Copy::conjoint_jshorts_atomic((jshort*) src, (jshort*) dst, size / sizeof(jshort));
  } else {
    // Not aligned, so no need to be atomic.
    Copy::conjoint_jbytes((void*) src, (void*) dst, size);
  }
}
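
The dispatch works by OR-ing the source address, the destination address and the byte count together: if the combined value is a multiple of 8, all three are, and likewise for 4 and 2. The following is a rough Java model of that selection, not HotSpot code. The class CopyUnitChooser, the method chooseUnitBytes and the sample addresses are made up for illustration.

```java
// Hypothetical model of the alignment dispatch; the addresses are made-up numbers.
class CopyUnitChooser {
    // Returns the width in bytes (8, 4, 2 or 1) of the unit the dispatch would pick.
    static int chooseUnitBytes(long srcAddr, long dstAddr, long sizeBytes) {
        // A low bit set in any of the three values is also set in the OR.
        long bits = srcAddr | dstAddr | sizeBytes;
        if (bits % Long.BYTES == 0) {
            return Long.BYTES;      // corresponds to the conjoint_jlongs_atomic branch
        } else if (bits % Integer.BYTES == 0) {
            return Integer.BYTES;   // corresponds to the conjoint_jints_atomic branch
        } else if (bits % Short.BYTES == 0) {
            return Short.BYTES;     // corresponds to the conjoint_jshorts_atomic branch
        }
        return Byte.BYTES;          // unaligned: plain byte copy
    }

    public static void main(String[] args) {
        System.out.println(chooseUnitBytes(0x1000, 0x2000, 400_000)); // 8
        System.out.println(chooseUnitBytes(0x1004, 0x2000, 400_000)); // 4
        System.out.println(chooseUnitBytes(0x1000, 0x2000, 6));       // 2
        System.out.println(chooseUnitBytes(0x1001, 0x2000, 8));       // 1
    }
}
```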

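This function only decides which copy routine to use, based on the alignment of the two addresses and of the byte count. Note that an int array does not always end up in the jint branch: if both addresses and the byte count are multiples of 8, the jlong branch is taken instead. Taking conjoint_jints_atomic as the example, its definition is in openjdk6-src/hotspot/src/share/vm/utilities/copy.hpp: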

// jints, conjoint, atomic on each jint
  static void conjoint_jints_atomic(jint* from, jint* to, size_t count) {
    assert_params_ok(from, to, LogBytesPerInt);
    pd_conjoint_jints_atomic(from, to, count);
  }

Following the call one level down, in openjdk6-src/hotspot/src/cpu/zero/vm/copy_zero.hpp:

static void pd_conjoint_jints_atomic(jint* from, jint* to, size_t count) {
  _Copy_conjoint_jints_atomic(from, to, count);
}

And one level further, in openjdk6-src/hotspot/src/os_cpu/linux_zero/vm/os_linux_zero.cpp:

void _Copy_conjoint_jints_atomic(jint* from, jint* to, size_t count) {
  if (from > to) {
    jint *end = from + count;
    while (from < end)
      *(to++) = *(from++);
  }
  else if (from < to) {
    jint *end = from;
    from += count - 1;
    to   += count - 1;
    while (from >= end)
      *(to--) = *(from--);
  }
}
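
The two branches exist so that overlapping source and destination ranges are copied correctly: when the destination lies above the source, copying from the end backwards keeps elements from being overwritten before they are read. Here is a minimal Java model of that logic using array indices in place of pointers. ConjointCopySketch and conjointCopy are hypothetical names, and the comparison with System.arraycopy relies on its documented behavior for a copy within the same array.

```java
import java.util.Arrays;

// Hypothetical model of the direction-aware copy; indices stand in for pointers.
class ConjointCopySketch {
    // Copies `count` ints inside one array from index `from` to index `to`.
    static void conjointCopy(int[] mem, int from, int to, int count) {
        if (from > to) {
            // Destination is below the source: copy forwards.
            for (int i = 0; i < count; i++) {
                mem[to + i] = mem[from + i];
            }
        } else if (from < to) {
            // Destination is above the source: copy backwards.
            for (int i = count - 1; i >= 0; i--) {
                mem[to + i] = mem[from + i];
            }
        }
        // from == to: nothing to copy.
    }

    public static void main(String[] args) {
        int[] mem = {1, 2, 3, 4, 5, 0, 0};
        conjointCopy(mem, 0, 2, 5);
        System.out.println(Arrays.toString(mem));    // [1, 2, 1, 2, 3, 4, 5]

        int[] viaApi = {1, 2, 3, 4, 5, 0, 0};
        System.arraycopy(viaApi, 0, viaApi, 2, 5);
        System.out.println(Arrays.toString(viaApi)); // [1, 2, 1, 2, 3, 4, 5]
    }
}
```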

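As you can see, in the end it is a plain loop over raw memory. The type and range checks are done once, before the copy starts, rather than on each element access, which goes a long way toward explaining why it comes out faster than the element-by-element loop written in Java.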
