Summary of all ways to get info on dimensions of DataFrame or Series
There are a number of ways to get information on the attributes of your DataFrame or Series.
Create Sample DataFrame and Series
df = pd.DataFrame({'a':[5, 2, np.nan], 'b':[ 9, 2, 4]})
df
a b
0 5.0 9
1 2.0 2
2 NaN 4
s = df['a']
s
0 5.0
1 2.0
2 NaN
Name: a, dtype: float64
shape Attribute
The shape attribute returns a two-item tuple of the number of rows and the number of columns in the DataFrame. For a Series, it returns a one-item tuple.
df.shape
(3, 2)
s.shape
(3,)
len function
To get the number of rows of a DataFrame or get the length of a Series, use the len function. An integer will be returned.
len(df)
3
len(s)
3
size attribute
To get the total number of elements in the DataFrame or Series, use the size attribute. For DataFrames, this is the product of the number of rows and the number of columns. For a Series, this will be equivalent to the len function:
df.size
6
s.size
3
ndim attribute
The ndim attribute returns the number of dimensions of your DataFrame or Series. It will always be 2 for DataFrames and 1 for Series:
df.ndim
2
s.ndim
1
The tricky count method
The count method can be used to return the number of non-missing values for each column/row of the DataFrame. This can be very confusing, because most people normally think of count as just the length of each row, which it is not. When called on a DataFrame, a Series is returned with the column names in the index and the number of non-missing values as the values.
df.count() # by default, get the count of each column
a 2
b 3
dtype: int64
df.count(axis='columns') # change direction to get count of each row
0 2
1 2
2 1
dtype: int64
For a Series, there is only one axis for computation and so it just returns a scalar:
s.count()
2
Use the info method for retrieving metadata
The info method returns the number of non-missing values and data types of each column
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 2 columns):
a 2 non-null float64
b 3 non-null int64
dtypes: float64(1), int64(1)
memory usage: 128.0 bytes