To measure the difference between the two, I wrote a simple program that copies an int[100000] both ways and used System.nanoTime to time each copy.
The program is as follows:
    int[] a = new int[100000];
    for (int i = 0; i < a.length; i++) {
        a[i] = i;
    }
    int[] b = new int[100000];

    int[] c = new int[100000];
    for (int i = 0; i < c.length; i++) {
        c[i] = i;
    }
    int[] d = new int[100000];

    for (int k = 0; k < 10; k++) {
        long start1 = System.nanoTime();
        for (int i = 0; i < a.length; i++) {
            b[i] = a[i];
        }
        long end1 = System.nanoTime();
        System.out.println("end1 - start1 = " + (end1 - start1));

        long start2 = System.nanoTime();
        System.arraycopy(c, 0, d, 0, 100000);
        long end2 = System.nanoTime();
        System.out.println("end2 - start2 = " + (end2 - start2));

        System.out.println();
    }
To keep memory allocation from disturbing the timings, I allocated all the arrays up front. To guard against one-off flukes, I repeated the measurement 10 times in a loop. The results, in nanoseconds, were:
    end1 - start1 = 366806
    end2 - start2 = 109154

    end1 - start1 = 380529
    end2 - start2 = 79849

    end1 - start1 = 421422
    end2 - start2 = 68769

    end1 - start1 = 344463
    end2 - start2 = 72020

    end1 - start1 = 333174
    end2 - start2 = 77277

    end1 - start1 = 377335
    end2 - start2 = 82285

    end1 - start1 = 370608
    end2 - start2 = 66937

    end1 - start1 = 349067
    end2 - start2 = 86532

    end1 - start1 = 389974
    end2 - start2 = 83362

    end1 - start1 = 347937
    end2 - start2 = 63638
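For anyone who wants to repeat the measurement, here is a minimal sketch of the same comparison with a warm-up phase added, so that the JIT compiler has a chance to compile both paths before anything is timed. The class name CopyBench, the helper names, the warm-up count of 2,000 and the "best of 10" reporting are all my own assumptions, not part of the original program, and the numbers it prints may look different from the figures above.

```java
import java.util.Arrays;

// Hypothetical benchmark harness; class and method names are illustrative only.
final class CopyBench {
    static final int SIZE = 100_000;

    // Element-by-element copy, as in the original for loop.
    static void loopCopy(int[] src, int[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
    }

    // Bulk copy through System.arraycopy.
    static void bulkCopy(int[] src, int[] dst) {
        System.arraycopy(src, 0, dst, 0, src.length);
    }

    // Returns the fastest of `rounds` timed runs, in nanoseconds.
    static long bestOf(int rounds, Runnable task) {
        long best = Long.MAX_VALUE;
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            task.run();
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] src = new int[SIZE];
        for (int i = 0; i < src.length; i++) {
            src[i] = i;
        }
        int[] viaLoop = new int[SIZE];
        int[] viaBulk = new int[SIZE];

        // Warm-up (assumed count): run both paths repeatedly before timing.
        for (int w = 0; w < 2_000; w++) {
            loopCopy(src, viaLoop);
            bulkCopy(src, viaBulk);
        }

        long loopNanos = bestOf(10, () -> loopCopy(src, viaLoop));
        long bulkNanos = bestOf(10, () -> bulkCopy(src, viaBulk));

        System.out.println("loop      best = " + loopNanos + " ns");
        System.out.println("arraycopy best = " + bulkNanos + " ns");
        System.out.println("same contents  = " + Arrays.equals(viaLoop, viaBulk));
    }
}
```

Reporting the best of several runs rather than each run is a design choice to reduce noise from the scheduler and the garbage collector; it is still only a rough hand-rolled measurement.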
The measurements show that System.arraycopy performs very well: in every round it is roughly three to six times faster than the loop. To see how it is handled at the lower level, I found the OpenJDK source and traced through it.
System.arraycopy is a native method, so we need to look at the native-layer code:
public static native void arraycopy(Object src, int srcPos, Object dest, int destPos, int length);
The corresponding native code is in openjdk6-src/hotspot/src/share/vm/prims/jvm.cpp, which contains the entry point JVM_ArrayCopy:
    JVM_ENTRY(void, JVM_ArrayCopy(JNIEnv *env, jclass ignored, jobject src, jint src_pos,
                                  jobject dst, jint dst_pos, jint length))
      JVMWrapper("JVM_ArrayCopy");
      // Check if we have null pointers
      if (src == NULL || dst == NULL) {
        THROW(vmSymbols::java_lang_NullPointerException());
      }
      arrayOop s = arrayOop(JNIHandles::resolve_non_null(src));
      arrayOop d = arrayOop(JNIHandles::resolve_non_null(dst));
      assert(s->is_oop(), "JVM_ArrayCopy: src not an oop");
      assert(d->is_oop(), "JVM_ArrayCopy: dst not an oop");
      // Do copy
      Klass::cast(s->klass())->copy_array(s, src_pos, d, dst_pos, length, thread);
    JVM_END
The earlier statements are all checks; the final call, copy_array(s, src_pos, d, dst_pos, length, thread), does the real copying. For a primitive array it is implemented in openjdk6-src/hotspot/src/share/vm/oops/typeArrayKlass.cpp:
    void typeArrayKlass::copy_array(arrayOop s, int src_pos, arrayOop d, int dst_pos, int length, TRAPS) {
      assert(s->is_typeArray(), "must be type array");

      // Check destination
      if (!d->is_typeArray() || element_type() != typeArrayKlass::cast(d->klass())->element_type()) {
        THROW(vmSymbols::java_lang_ArrayStoreException());
      }

      // Check is all offsets and lengths are non negative
      if (src_pos < 0 || dst_pos < 0 || length < 0) {
        THROW(vmSymbols::java_lang_ArrayIndexOutOfBoundsException());
      }
      // Check if the ranges are valid
      if ( (((unsigned int) length + (unsigned int) src_pos) > (unsigned int) s->length())
        || (((unsigned int) length + (unsigned int) dst_pos) > (unsigned int) d->length()) ) {
        THROW(vmSymbols::java_lang_ArrayIndexOutOfBoundsException());
      }
      // Check zero copy
      if (length == 0)
        return;

      // This is an attempt to make the copy_array fast.
      int l2es = log2_element_size();
      int ihs = array_header_in_bytes() / wordSize;
      char* src = (char*) ((oop*)s + ihs) + ((size_t)src_pos << l2es);
      char* dst = (char*) ((oop*)d + ihs) + ((size_t)dst_pos << l2es);
      Copy::conjoint_memory_atomic(src, dst, (size_t)length << l2es);  // Still processing copy here
    }
Most of this function is again validation; only the last statement, the call to Copy::conjoint_memory_atomic, performs the copy.
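The checks in JVM_ArrayCopy and copy_array are visible from the Java side as exceptions. The sketch below triggers each one through the public System.arraycopy API. The class ArrayCopyChecks and its helper failure are hypothetical names of my own. The API documentation specifies NullPointerException, ArrayStoreException and IndexOutOfBoundsException for these cases; the HotSpot source above throws the ArrayIndexOutOfBoundsException subclass for the last of them.

```java
// Illustrative only: the class and helper names are hypothetical.
final class ArrayCopyChecks {
    // Runs the copy and returns the exception it throws, or null if it succeeds.
    static RuntimeException failure(Object src, int srcPos, Object dst, int dstPos, int length) {
        try {
            System.arraycopy(src, srcPos, dst, dstPos, length);
            return null;
        } catch (RuntimeException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        int[] ints = new int[4];
        long[] longs = new long[4];

        System.out.println(failure(null, 0, ints, 0, 1));         // null source
        System.out.println(failure(ints, 0, longs, 0, 1));        // element types differ
        System.out.println(failure(ints, -1, new int[4], 0, 1));  // negative position
        System.out.println(failure(ints, 2, new int[4], 0, 3));   // range runs past the end
        System.out.println(failure(ints, 0, new int[4], 0, 0));   // zero length: succeeds, prints null
    }
}
```

Note that all of this validation happens once per call, not once per element.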
Copy::conjoint_memory_atomic is defined in openjdk6-src/hotspot/src/share/vm/utilities/copy.cpp:
    // Copy bytes; larger units are filled atomically if everything is aligned.
    void Copy::conjoint_memory_atomic(void* from, void* to, size_t size) {
      address src = (address) from;
      address dst = (address) to;
      uintptr_t bits = (uintptr_t) src | (uintptr_t) dst | (uintptr_t) size;

      // (Note:  We could improve performance by ignoring the low bits of size,
      // and putting a short cleanup loop after each bulk copy loop.
      // There are plenty of other ways to make this faster also,
      // and it's a slippery slope.  For now, let's keep this code simple
      // since the simplicity helps clarify the atomicity semantics of
      // this operation.  There are also CPU-specific assembly versions
      // which may or may not want to include such optimizations.)

      if (bits % sizeof(jlong) == 0) {
        Copy::conjoint_jlongs_atomic((jlong*) src, (jlong*) dst, size / sizeof(jlong));
      } else if (bits % sizeof(jint) == 0) {
        Copy::conjoint_jints_atomic((jint*) src, (jint*) dst, size / sizeof(jint));
      } else if (bits % sizeof(jshort) == 0) {
        Copy::conjoint_jshorts_atomic((jshort*) src, (jshort*) dst, size / sizeof(jshort));
      } else {
        // Not aligned, so no need to be atomic.
        Copy::conjoint_jbytes((void*) src, (void*) dst, size);
      }
    }
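Copy::conjoint_memory_atomic only decides which copy function to use: it picks the widest unit (jlong, jint, jshort or byte) that evenly divides both addresses and the byte count. Let us follow conjoint_jints_atomic, the branch taken when everything is 4-byte aligned but not 8-byte aligned. It is declared in openjdk6-src/hotspot/src/share/vm/utilities/copy.hpp: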
    // jints, conjoint, atomic on each jint
    static void conjoint_jints_atomic(jint* from, jint* to, size_t count) {
      assert_params_ok(from, to, LogBytesPerInt);
      pd_conjoint_jints_atomic(from, to, count);
    }
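The alignment dispatch in Copy::conjoint_memory_atomic can be modelled in a few lines of Java. This is only an illustrative sketch, not HotSpot code: the class CopyUnit, the method chooseUnitBytes and the addresses in main are made up to exercise each branch.

```java
// Illustrative model of the dispatch; not HotSpot code. Names are hypothetical.
final class CopyUnit {
    // Returns the widest unit size in bytes (8, 4, 2 or 1) that evenly divides
    // the source address, the destination address and the byte count.
    static int chooseUnitBytes(long srcAddress, long dstAddress, long sizeInBytes) {
        // OR-ing the three values keeps every low bit that is set in any of them,
        // so a single modulo test checks all three at once.
        long bits = srcAddress | dstAddress | sizeInBytes;
        if (bits % Long.BYTES == 0) return Long.BYTES;
        if (bits % Integer.BYTES == 0) return Integer.BYTES;
        if (bits % Short.BYTES == 0) return Short.BYTES;
        return Byte.BYTES;
    }

    public static void main(String[] args) {
        // Made-up addresses, chosen only to exercise each branch.
        System.out.println(chooseUnitBytes(0x1000, 0x2000, 400_000)); // 8
        System.out.println(chooseUnitBytes(0x100C, 0x200C, 400_000)); // 4
        System.out.println(chooseUnitBytes(0x1002, 0x2002, 10));      // 2
        System.out.println(chooseUnitBytes(0x1001, 0x2000, 400_000)); // 1
    }
}
```

The 400,000-byte size corresponds to the int[100000] of the test program; whether the jlong or the jint branch is taken for it then depends only on how the two data addresses happen to be aligned.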
Going one level down, in openjdk6-src/hotspot/src/cpu/zero/vm/copy_zero.hpp:
    static void pd_conjoint_jints_atomic(jint* from, jint* to, size_t count) {
      _Copy_conjoint_jints_atomic(from, to, count);
    }
And one level further, in openjdk6-src/hotspot/src/os_cpu/linux_zero/vm/os_linux_zero.cpp:
    void _Copy_conjoint_jints_atomic(jint* from, jint* to, size_t count) {
      if (from > to) {
        jint *end = from + count;
        while (from < end)
          *(to++) = *(from++);
      }
      else if (from < to) {
        jint *end = from;
        from += count - 1;
        to   += count - 1;
        while (from >= end)
          *(to--) = *(from--);
      }
    }
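As you can see, _Copy_conjoint_jints_atomic is a plain memory-block copy over raw pointers, walking forward or backward depending on which address is lower. All type and range checks were done once, up front, so the copy loop itself carries no per-element overhead, which is why System.arraycopy comes out faster than the Java loop in the test above.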
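The two branches in _Copy_conjoint_jints_atomic exist so that overlapping source and destination ranges are copied correctly. The following sketch models the same direction choice in Java, using array indices instead of pointers. The class ConjointCopy and its methods are hypothetical names for illustration; naiveForwardCopy is included only to show what goes wrong without the direction check.

```java
import java.util.Arrays;

// Illustrative Java model of the direction choice; names are hypothetical.
final class ConjointCopy {
    // Copies `count` ints within one array from index `from` to index `to`,
    // choosing the direction so that overlapping ranges are not clobbered.
    static void conjointCopy(int[] a, int from, int to, int count) {
        if (from > to) {
            // Destination is lower: walk forward.
            for (int i = 0; i < count; i++) {
                a[to + i] = a[from + i];
            }
        } else if (from < to) {
            // Destination is higher: walk backward.
            for (int i = count - 1; i >= 0; i--) {
                a[to + i] = a[from + i];
            }
        }
        // from == to: nothing to move.
    }

    // Always-forward copy, shown only to demonstrate the overlap problem.
    static void naiveForwardCopy(int[] a, int from, int to, int count) {
        for (int i = 0; i < count; i++) {
            a[to + i] = a[from + i];
        }
    }

    public static void main(String[] args) {
        int[] safe = {1, 2, 3, 4, 5};
        conjointCopy(safe, 0, 1, 4);
        System.out.println(Arrays.toString(safe));   // [1, 1, 2, 3, 4]

        int[] naive = {1, 2, 3, 4, 5};
        naiveForwardCopy(naive, 0, 1, 4);
        System.out.println(Arrays.toString(naive));  // [1, 1, 1, 1, 1]

        int[] viaApi = {1, 2, 3, 4, 5};
        System.arraycopy(viaApi, 0, viaApi, 1, 4);
        System.out.println(Arrays.toString(viaApi)); // [1, 1, 2, 3, 4]
    }
}
```

System.arraycopy is documented to behave as if the source range were first copied to a temporary array, which is the behaviour the direction check provides without the extra buffer.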