获取熊猫 DataFrame 的名称

如何获取 DataFrame 的名称并将其打印为字符串?

例如:

boston(分配给 csv 文件的 var 名称)

import pandas as pd
boston = pd.read_csv('boston.csv')


print('The winner is team A based on the %s table.) % boston
222155 次浏览

You can name the dataframe with the following, and then call the name wherever you like:

import pandas as pd
df = pd.DataFrame( data=np.ones([4,4]) )
df.name = 'Ones'


print df.name
>>>
Ones

Hope that helps.

From here what I understand DataFrames are:

DataFrame is a 2-dimensional labeled data structure with columns of potentially different types. You can think of it like a spreadsheet or SQL table, or a dict of Series objects.

And Series are:

Series is a one-dimensional labeled array capable of holding any data type (integers, strings, floating point numbers, Python objects, etc.).

Series have a name attribute which can be accessed like so:

 In [27]: s = pd.Series(np.random.randn(5), name='something')


In [28]: s
Out[28]:
0    0.541
1   -1.175
2    0.129
3    0.043
4   -0.429
Name: something, dtype: float64


In [29]: s.name
Out[29]: 'something'

EDIT: Based on OP's comments, I think OP was looking for something like:

 >>> df = pd.DataFrame(...)
>>> df.name = 'df' # making a custom attribute that DataFrame doesn't intrinsically have
>>> print(df.name)
'df'

Sometimes df.name doesn't work.

you might get an error message:

'DataFrame' object has no attribute 'name'

try the below function:

def get_df_name(df):
name =[x for x in globals() if globals()[x] is df][0]
return name

In many situations, a custom attribute attached to a pd.DataFrame object is not necessary. In addition, note that pandas-object attributes may not serialize. So pickling will lose this data.

Instead, consider creating a dictionary with appropriately named keys and access the dataframe via dfs['some_label'].

df = pd.DataFrame()


dfs = {'some_label': df}

Here is a sample function: 'df.name = file` : Sixth line in the code below

def df_list():
filename_list = current_stage_files(PATH)
df_list = []
for file in filename_list:
df = pd.read_csv(PATH+file)
df.name = file
df_list.append(df)
return df_list

I am working on a module for feature analysis and I had the same need as yours, as I would like to generate a report with the name of the pandas.Dataframe being analyzed. To solve this, I used the same solution presented by @scohe001 and @LeopardShark, originally in https://stackoverflow.com/a/18425523/8508275, implemented with the inspect library:

import inspect


def aux_retrieve_name(var):
callers_local_vars = inspect.currentframe().f_back.f_back.f_locals.items()
return [var_name for var_name, var_val in callers_local_vars if var_val is var]

Note the additional .f_back term since I intend to call it from another function:

def header_generator(df):
print('--------- Feature Analyzer ----------')
print('Dataframe name: "{}"'.format(aux_retrieve_name(df)))
print('Memory usage: {:03.2f} MB'.format(df.memory_usage(deep=True).sum() / 1024 ** 2))
return

Running this code with a given dataframe, I get the following output:

header_generator(trial_dataframe)

--------- Feature Analyzer ----------
Dataframe name: "trial_dataframe"
Memory usage: 63.08 MB

DataFrames don't have names, but you have an (experimental) attribute dictionary you can use. For example:

df.attrs['name'] = "My name"   # Can be retrieved later

attributes are retained through some operations.