In Python NumPy, what are dimensions and axes?


It is of rank one, as you need one index to index it. That one axis has the length 3, as the index indexing it can take three different values: v[i], i=0..2.


Best answer

In numpy arrays, dimensionality refers to the number of axes needed to index it, not the dimensionality of any geometrical space. For example, you can describe the locations of points in 3D space with a 2D array:

array([[0, 0, 0],
       [1, 2, 3],
       [2, 2, 2],
       [9, 9, 9]])

This array has shape (4, 3) and 2 dimensions. It can nevertheless describe 3D space, because the length of each row (axis 1) is three, so each row can hold the x, y, and z components of a point's location. The length of axis 0 is the number of points (here, 4). That interpretation, however, belongs to the math the code is modeling; it is not an attribute of the array itself. In mathematics, the dimension of a vector is its length (e.g., the x, y, and z components of a 3D vector), but in NumPy any "vector" is just a 1D array of some length. The array doesn't care what the dimension of the space being described (if any) is.
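As a rough sketch of how the two axes of that points array behave under indexing (the variable names here are made up for illustration, not taken from the answer):

```python
import numpy as np

# Four points in 3D space, one point per row (same values as the array above).
points = np.array([[0, 0, 0],
                   [1, 2, 3],
                   [2, 2, 2],
                   [9, 9, 9]])

n_points = points.shape[0]   # length of axis 0: number of points
n_coords = points.shape[1]   # length of axis 1: coordinates per point

second_point = points[1]     # index on axis 0 only: one row, a 1D array
x_values = points[:, 0]      # every row, column 0: the x component of each point

print(points.ndim, n_points, n_coords)
print(second_point, x_values)
```

Fixing an index on one axis drops that axis, so both `second_point` and `x_values` come back as 1D arrays even though `points` is 2D.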

You can play around with this, and see the number of dimensions and shape of an array like so:

In [262]: a = np.arange(9)

In [263]: a
Out[263]: array([0, 1, 2, 3, 4, 5, 6, 7, 8])

In [264]: a.ndim    # number of dimensions
Out[264]: 1

In [265]: a.shape
Out[265]: (9,)

In [266]: b = np.array([[0,0,0],[1,2,3],[2,2,2],[9,9,9]])

In [267]: b
Out[267]:
array([[0, 0, 0],
       [1, 2, 3],
       [2, 2, 2],
       [9, 9, 9]])

In [268]: b.ndim
Out[268]: 2

In [269]: b.shape
Out[269]: (4, 3)

Arrays can have many dimensions, but they become hard to visualize above two or three:

In [276]: c = np.random.rand(2,2,3,4)

In [277]: c
Out[277]:
array([[[[ 0.33018579,  0.98074944,  0.25744133,  0.62154557],
         [ 0.70959511,  0.01784769,  0.01955593,  0.30062579],
         [ 0.83634557,  0.94636324,  0.88823617,  0.8997527 ]],

        [[ 0.4020885 ,  0.94229555,  0.309992  ,  0.7237458 ],
         [ 0.45036185,  0.51943908,  0.23432001,  0.05226692],
         [ 0.03170345,  0.91317231,  0.11720796,  0.31895275]]],


       [[[ 0.47801989,  0.02922993,  0.12118226,  0.94488471],
         [ 0.65439109,  0.77199972,  0.67024853,  0.27761443],
         [ 0.31602327,  0.42678546,  0.98878701,  0.46164756]],

        [[ 0.31585844,  0.80167337,  0.17401188,  0.61161196],
         [ 0.74908902,  0.45300247,  0.68023488,  0.79672751],
         [ 0.23597218,  0.78416727,  0.56036792,  0.55973686]]]])

In [278]: c.ndim
Out[278]: 4

In [279]: c.shape
Out[279]: (2, 2, 3, 4)
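One way to get a feel for a 4D array without visualizing it is to fix one index at a time and watch the shape shrink. The sketch below uses a deterministic `np.arange` array of the same shape as a stand-in for the random one above; that substitution is only an assumption made so the values are reproducible.

```python
import numpy as np

# Deterministic stand-in for the random array above, same shape (2, 2, 3, 4).
c = np.arange(2 * 2 * 3 * 4).reshape(2, 2, 3, 4)

block = c[0]             # fix axis 0          -> shape (2, 3, 4)
matrix = c[0, 1]         # fix axes 0 and 1    -> shape (3, 4)
row = c[0, 1, 2]         # fix axes 0, 1 and 2 -> shape (4,)
element = c[0, 1, 2, 3]  # one index per axis  -> a single scalar

for name, part in [("block", block), ("matrix", matrix), ("row", row)]:
    print(name, part.ndim, part.shape)
print("element", element)
```

Each fixed index removes one axis, so four indices on a 4D array leave nothing but a single value.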


Pasting part of another answer here:

In NumPy, dimension, axis/axes, and shape are related and sometimes similar concepts:

In [1]: import numpy as np


In [2]: a = np.array([[1,2],[3,4]])

dimension

In mathematics and physics, dimension or dimensionality is informally defined as the minimum number of coordinates needed to specify any point within a space. In NumPy, however, according to the NumPy documentation, a dimension is the same thing as an axis:

In Numpy dimensions are called axes. The number of axes is rank.

In [3]: a.ndim  # number of dimensions/axes: the minimum number of indices needed to locate an element of `a`
Out[3]: 2

axis/axes

An axis is the nth coordinate used to index an array in NumPy. Multidimensional arrays can have one index per axis.

In [4]: a[1,0]  # to index `a`, we specify 1 on the first axis and 0 on the second axis
Out[4]: 3  # the result is 3 (located at row 1, column 0, with 0-based indexing)

shape

Shape describes how many elements lie along each available axis.

In [5]: a.shape
Out[5]: (2, 2)  # both the first and the second axis have length 2
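The relationship between the three terms can be checked directly. This is a small sketch reusing the array `a` from above; `a3` is a name introduced here only for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

n_axes = a.ndim                     # number of axes
shape_len = len(a.shape)            # one shape entry per axis
total = a.shape[0] * a.shape[1]     # product of the axis lengths

# Adding an axis of length 1 changes ndim and shape, but not the data.
a3 = a[np.newaxis, :, :]

print(n_axes, shape_len, total, a.size)
print(a3.ndim, a3.shape)
```

So `ndim` is always the length of `shape`, and `size` is the product of the entries in `shape`.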


You can also pass the axis parameter to aggregate operations. With axis=0, NumPy applies the operation along axis 0, combining the elements of each column; with axis=1, it applies the operation along axis 1, combining the elements of each row.

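test = np.arange(0,9).reshape(3,3)

In [3]: test
Out[3]:
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

In [5]: test.sum(axis=0)  # sum down each column (axis 0 is collapsed)
Out[5]: array([ 9, 12, 15])

In [6]: test.sum(axis=1)  # sum across each row (axis 1 is collapsed)
Out[6]: array([ 3, 12, 21])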
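The "columns versus rows" wording only works for 2D arrays. A more general way to read it, sketched below on a hypothetical 3D array `t`, is that the axis you name is the one that disappears from the shape (or is kept with length 1 when `keepdims=True`).

```python
import numpy as np

# Hypothetical 3D array, used only to show how the shape changes.
t = np.arange(24).reshape(2, 3, 4)

for axis in range(t.ndim):
    reduced = t.sum(axis=axis)               # the named axis is removed
    kept = t.sum(axis=axis, keepdims=True)   # the named axis is kept with length 1
    print(axis, reduced.shape, kept.shape)

# Negative axes count from the end, so axis=-1 is the last axis.
same = np.array_equal(t.sum(axis=-1), t.sum(axis=2))
```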


This is how I understand it. A point is a 1D object: you can only define its position, and it has no extent. A line or a surface is a 2D object: you can define it by its position together with its length or area, respectively (e.g., a rectangle, square, or circle). A volume is a 3D object: you can define it by its position, its surface area or lengths, and its volume (e.g., a sphere or cube).

From this, you will define a point in NumPy by a single axis (dimension), regardless of the number of mathematical axes you use. For x and y axes, a point is defined as [2,4], and for x, y and z axes, a point is defined as [2,4,6]. Both of these are points, thus 1D.

To define a line, two points will be needed. This will require some form of 'nesting' of the points to the second dimension (2D). As such, a line may be defined using x and y only as [[2,4],[6,9]] or using x, y and z as [[2,4,6],[6,9,12]]. For a surface, it will simply require more points to describe it, but still remains a 2D object. For example, a triangle will need 3 points while a rectangle/square will need 4.
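The point and line examples above can be checked quickly. This is only a sketch: the triangle's vertices are made up for illustration and do not come from the answer.

```python
import numpy as np

point_2d = np.array([2, 4])                   # a point given by x and y
point_3d = np.array([2, 4, 6])                # a point given by x, y and z
line = np.array([[2, 4, 6], [6, 9, 12]])      # two nested points
triangle = np.array([[0, 0, 0],               # hypothetical vertices
                     [1, 0, 0],
                     [0, 1, 0]])

for name, arr in [("point_2d", point_2d), ("point_3d", point_3d),
                  ("line", line), ("triangle", triangle)]:
    print(name, arr.ndim, arr.shape)
```

Both points report `ndim == 1` whatever the number of coordinates, and both the line and the triangle report `ndim == 2`; only the length of axis 0 changes with the number of points.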

A volume requires 4 or more points to define it (4 points give a tetrahedron), while still maintaining the 'nesting' of points, now to the third dimension (3D).



To understand dimensions and axes, it helps to understand tensors and their rank. A vector is a rank-1 tensor, a matrix is a rank-2 tensor, and so on. Consider the following:

x = np.array([0,3,4,5,8])

Here x is a vector, hence a rank-1 tensor, but the vector itself is 5-dimensional. In NumPy, rank = number of dimensions = number of axes. This deviates slightly from the conventional definition of dimension, which is 5 for the vector shown above. It is therefore better to stick to rank or axes when talking about arrays, and to use dimension in the traditional sense.
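The two meanings of "dimension" for `x` can be read off separately, as in this small sketch. The matrix `m` is a hypothetical addition, included only for comparison.

```python
import numpy as np

x = np.array([0, 3, 4, 5, 8])

rank = x.ndim         # number of axes: what NumPy calls the number of dimensions
length = x.shape[0]   # number of components: the dimension in the vector-space sense

m = np.array([[1, 0], [0, 1]])   # hypothetical 2x2 matrix, a rank-2 tensor

print(rank, length)
print(m.ndim, m.shape)
```

Here `rank` is 1 and `length` is 5, which is exactly the gap between the NumPy usage and the conventional one.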