[L array notation - where does it come from?

I often see messages that use [L followed by a type to denote an array, for example:

[Ljava.lang.Object; cannot be cast to [Ljava.lang.String;

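(The example above is just one I picked at random.) I know this denotes an array, but where does the syntax come from? Why does the opening [ have no closing bracket? Why L? Is it purely arbitrary, or is there a historical/technical reason behind it?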


[ stands for array, and Lsome.type.Here; represents the element type of the array. This is similar to the type descriptors used internally in the bytecode, described in §4.3 of the Java Virtual Machine Specification. The only difference is that the real descriptors use / rather than . to separate package names.

For instance, primitives use a single letter: [I is an array of ints, and a two-dimensional array of ints is [[I (strictly speaking, Java doesn't have real two-dimensional arrays, but you can make arrays that consist of arrays).

Since classes may have any name, a single letter cannot identify them, and it would be hard to tell where a class name ends. So class types are delimited: they start with L, followed by the class name, and finish with a ;
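A quick way to see these names is to ask a few array classes for their name. This is only a small sketch; the class name ArrayNames and the helper nameOf are made up for illustration, while Class.getName() is the standard API.

```java
public class ArrayNames {
    // Returns the name that Class.getName() reports for the given class.
    static String nameOf(Class<?> type) {
        return type.getName();
    }

    public static void main(String[] args) {
        System.out.println(nameOf(int[].class));    // [I
        System.out.println(nameOf(int[][].class));  // [[I
        System.out.println(nameOf(String[].class)); // [Ljava.lang.String;
        System.out.println(nameOf(String.class));   // java.lang.String (no L...; outside arrays)
    }
}
```

Note that the L and ; only show up when the class is the element type of an array; a plain class reports its ordinary dotted name.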

Descriptors are also used to represent the types of fields and methods.

For instance:

(IDLjava/lang/Thread;)Ljava/lang/Object;

... corresponds to a method whose parameters are int, double, and Thread and the return type is Object
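If you want to produce such a descriptor from Java code rather than read it out of a class file, java.lang.invoke.MethodType can do it. The sketch below assumes Java 7 or later; DescriptorDemo and descriptorOf are hypothetical names used only for this example.

```java
import java.lang.invoke.MethodType;

public class DescriptorDemo {
    // Builds the JVM method descriptor for a return type and parameter types.
    static String descriptorOf(Class<?> returnType, Class<?>... params) {
        return MethodType.methodType(returnType, params).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // int, double and Thread parameters, Object return type
        System.out.println(descriptorOf(Object.class, int.class, double.class, Thread.class));
        // prints (IDLjava/lang/Thread;)Ljava/lang/Object;
    }
}
```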

edit

You can also see this in .class files using the Java disassembler (javap):

C:>more > S.java
class S {
Object  hello(int i, double d, long j, Thread t ) {
return new Object();
}
}
^C
C:>javac S.java


C:>javap -verbose S
class S extends java.lang.Object
SourceFile: "S.java"
minor version: 0
major version: 50
Constant pool:
const #1 = Method       #2.#12; //  java/lang/Object."<init>":()V
const #2 = class        #13;    //  java/lang/Object
const #3 = class        #14;    //  S
const #4 = Asciz        <init>;
const #5 = Asciz        ()V;
const #6 = Asciz        Code;
const #7 = Asciz        LineNumberTable;
const #8 = Asciz        hello;
const #9 = Asciz        (IDJLjava/lang/Thread;)Ljava/lang/Object;;
const #10 = Asciz       SourceFile;
const #11 = Asciz       S.java;
const #12 = NameAndType #4:#5;//  "<init>":()V
const #13 = Asciz       java/lang/Object;
const #14 = Asciz       S;


{
S();
Code:
Stack=1, Locals=1, Args_size=1
0:   aload_0
1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
4:   return
LineNumberTable:
line 1: 0




java.lang.Object hello(int, double, long, java.lang.Thread);
Code:
Stack=2, Locals=7, Args_size=5
0:   new     #2; //class java/lang/Object
3:   dup
4:   invokespecial   #1; //Method java/lang/Object."<init>":()V
7:   areturn
LineNumberTable:
line 3: 0




}

And in the raw class file (look at line 5 of the image in the original post):

Reference: Field descriptors in the JVM specification

This is used in the JNI (and in the JVM internally in general) to indicate a type. Primitives are denoted with a single letter (Z for boolean, I for int, etc.), [ indicates an array, and L is used for a class (terminated by a ;).

See here: JNI Types

EDIT: To elaborate on why there is no terminating ]: this encoding lets the JNI/JVM quickly identify a method and its signature. It is intended to be as compact as possible (as few characters as possible) to make parsing fast, so a single [ is used for an array, which is pretty straightforward (what better symbol to use?). I for int is equally obvious.

Another source for this would be the documentation of Class.getName(). Of course, all these specifications are congruent, since they are made to fit each other.

[L array notation - where does it come from?

From the JVM spec. This is the representation of type names that is specified in the class file format and other places.

  • The '[' denotes an array. In fact, the array type name is [<typename> where <typename> is the name of the base type of the array.
  • 'L' is actually part of the base type name; e.g. as an array element type, String is written "Ljava.lang.String;". Note the trailing ';'!!

And yes, the notation is documented in other places as well.

Why?

There is no doubt that the internal type name representation was chosen because it is:

  • compact,
  • self-delimiting (this is important for representations of method signatures, and it's why the 'L' and the trailing ';' are there), and
  • uses printable characters (for legibility ... if not readability).
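To illustrate the "self-delimiting" point: because every type either is a single letter, starts with [, or runs from L to ;, a parser can split a parameter list with no separators at all. The following is a minimal sketch, not how any real JVM does it; DescriptorSplitter and parameterDescriptors are hypothetical names, and the input is assumed to be a well-formed method descriptor.

```java
import java.util.ArrayList;
import java.util.List;

public class DescriptorSplitter {
    // Splits the parameter part of a method descriptor into one descriptor per parameter.
    static List<String> parameterDescriptors(String methodDescriptor) {
        List<String> result = new ArrayList<>();
        int i = 1; // skip the opening '('
        while (methodDescriptor.charAt(i) != ')') {
            int start = i;
            while (methodDescriptor.charAt(i) == '[') {
                i++; // each '[' adds one array dimension
            }
            if (methodDescriptor.charAt(i) == 'L') {
                int end = methodDescriptor.indexOf(';', i); // class name runs up to ';'
                if (end < 0) {
                    throw new IllegalArgumentException("missing ';' in " + methodDescriptor);
                }
                i = end;
            }
            i++; // consume the primitive letter or the ';'
            result.add(methodDescriptor.substring(start, i));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parameterDescriptors(
                "(IDJLjava/lang/Thread;[[I[Ljava/lang/String;)Ljava/lang/Object;"));
        // prints [I, D, J, Ljava/lang/Thread;, [[I, [Ljava/lang/String;]
    }
}
```

Without the trailing ';' the parser could not tell where a class name stops and the next parameter begins, which is why class types need both delimiters while primitives need none.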

But it is unclear why they decided to expose the internal type names of array types via the Class.getName() method. I think they could have mapped the internal names to something more "human friendly". My best guess is that it was just one of those things that they didn't get around to fixing until it was too late. (Nobody is perfect ... not even the hypothetical "intelligent designer".)
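For what it's worth, Class does offer more readable alternatives alongside getName(). A small sketch comparing them (getTypeName() needs Java 8 or later; FriendlyNames and describe are names invented for this example):

```java
public class FriendlyNames {
    // Pairs the internal-style name with the more readable type name.
    static String describe(Class<?> type) {
        return type.getName() + " -> " + type.getTypeName();
    }

    public static void main(String[] args) {
        Class<?> type = String[][].class;
        System.out.println(type.getName());          // [[Ljava.lang.String;
        System.out.println(type.getSimpleName());    // String[][]
        System.out.println(type.getCanonicalName()); // java.lang.String[][]
        System.out.println(describe(type));          // [[Ljava.lang.String; -> java.lang.String[][]
    }
}
```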

JVM array descriptors.

[Z = boolean
[B = byte
[S = short
[I = int
[J = long
[F = float
[D = double
[C = char
[L<class name>; = any non-primitive type (e.g. [Ljava.lang.Object;)

To get the element type of an array object, call:

anyObject.getClass().getComponentType();

This returns null if the object is not an array. To check whether it is an array, call:

anyObject.getClass().isArray();

or, if you already have a Class object, call isArray() on it directly:

someClass.isArray();
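Put together, those calls might be used like this. It is only a sketch; ComponentTypeDemo and elementType are hypothetical names, and the loop is one possible way to reach the innermost element type of a nested array.

```java
public class ComponentTypeDemo {
    // Strips array dimensions until a non-array type is left.
    static Class<?> elementType(Object value) {
        Class<?> type = value.getClass();
        while (type.isArray()) {
            type = type.getComponentType();
        }
        return type;
    }

    public static void main(String[] args) {
        Object grid = new int[2][3];
        System.out.println(grid.getClass().isArray());                    // true
        System.out.println(grid.getClass().getComponentType().getName()); // [I
        System.out.println(elementType(grid).getName());                  // int
        System.out.println("text".getClass().getComponentType());         // null
    }
}
```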

I think it's because C was taken by char, so next letter in class is L.