Lambda including if... elif... else

I want to apply a lambda function to a DataFrame column, using if... elif... else inside the lambda function.

The df and the code look something like this:

import pandas as pd

df = pd.DataFrame({"one": [1, 2, 3, 4, 5], "two": [6, 7, 8, 9, 10]})


df["one"].apply(lambda x: x*10 if x<2 elif x<4 x**2 else x+10)

Obviously, this doesn't work. Is there a way to use if... elif... else in a lambda? How can I get the same result as with a list comprehension?


Nest the if ... else expressions:

lambda x: x*10 if x<2 else (x**2 if x<4 else x+10)
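
As a quick check, the nested expression can be applied to the column from the question. This is only a minimal sketch; the names `f`, `g` and `result` are made up for illustration. It also shows that the inner parentheses are optional, since conditional expressions chain from right to left.

```python
import pandas as pd

df = pd.DataFrame({"one": [1, 2, 3, 4, 5], "two": [6, 7, 8, 9, 10]})

# Nested conditional expression, with the inner one parenthesised.
f = lambda x: x * 10 if x < 2 else (x ** 2 if x < 4 else x + 10)

# Same logic without parentheses: `a if c else b if d else e`
# parses as `a if c else (b if d else e)`.
g = lambda x: x * 10 if x < 2 else x ** 2 if x < 4 else x + 10

result = df["one"].apply(f)
print(result.tolist())                 # [10, 4, 9, 14, 15]
print(df["one"].apply(g).tolist())     # [10, 4, 9, 14, 15]
```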

I do not recommend the use of apply here: it should be avoided if there are better alternatives.

For example, if you are performing the following operation on a Series:

if cond1:
    exp1
elif cond2:
    exp2
else:
    exp3

This is usually a good use case for np.where or np.select.


numpy.where

The if else chain above can be written using

np.where(cond1, exp1, np.where(cond2, exp2, ...))

np.where allows nesting. With one level of nesting, your problem can be solved with,

df['three'] = (
    np.where(
        df['one'] < 2,
        df['one'] * 10,
        np.where(df['one'] < 4, df['one'] ** 2, df['one'] + 10)))
df


   one  two  three
0    1    6     10
1    2    7      4
2    3    8      9
3    4    9     14
4    5   10     15

numpy.select

Allows for flexible syntax and is easily extensible. It follows the form,

np.select([cond1, cond2, ...], [exp1, exp2, ...])

Or, in this case,

np.select([cond1, cond2], [exp1, exp2], default=exp3)

df['three'] = (
    np.select(
        condlist=[df['one'] < 2, df['one'] < 4],
        choicelist=[df['one'] * 10, df['one'] ** 2],
        default=df['one'] + 10))
df


   one  two  three
0    1    6     10
1    2    7      4
2    3    8      9
3    4    9     14
4    5   10     15
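
One detail worth seeing in action: np.select checks the conditions in order and uses the first one that is True for each element, which is what makes it behave like if / elif. The sketch below uses a standalone Series `s` and the names `in_order` and `swapped`, all chosen just for this illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], name="one")

# Conditions are evaluated in order; the first True one wins per element.
# The value 1 satisfies both s < 2 and s < 4, and s < 2 comes first.
in_order = np.select([s < 2, s < 4], [s * 10, s ** 2], default=s + 10)

# With the order swapped, the broader condition shadows the narrower one,
# so the s * 10 branch is never chosen.
swapped = np.select([s < 4, s < 2], [s ** 2, s * 10], default=s + 10)

print(in_order.tolist())  # [10, 4, 9, 14, 15]
print(swapped.tolist())   # [1, 4, 9, 14, 15]
```

So the narrowest condition should come first, just as it would in a hand-written if / elif chain.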

and/or (similar to the if/else)

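Similar to if-else, this requires the lambda. Note that the idiom falls through to the next branch whenever the selected expression is falsy (for example, x = 0 gives 0 * 10 = 0), so it is only safe when every branch result is truthy:

df['three'] = df["one"].apply(
    lambda x: (x < 2 and x * 10) or (x < 4 and x ** 2) or x + 10)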


df
   one  two  three
0    1    6     10
1    2    7      4
2    3    8      9
3    4    9     14
4    5   10     15
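
A small plain-Python sketch of where the and/or idiom and the nested if-else diverge. The names `f_andor` and `f_ifelse` are invented for this comparison. The two agree on the question's data, but not once a branch evaluates to something falsy such as 0.

```python
f_andor = lambda x: (x < 2 and x * 10) or (x < 4 and x ** 2) or x + 10
f_ifelse = lambda x: x * 10 if x < 2 else (x ** 2 if x < 4 else x + 10)

values = [1, 2, 3, 4, 5]
print([f_andor(x) for x in values])   # [10, 4, 9, 14, 15]
print([f_ifelse(x) for x in values])  # [10, 4, 9, 14, 15]

# For x = 0 the first branch yields 0 * 10 == 0, which is falsy,
# so `or` moves on and the final x + 10 is returned instead.
print(f_andor(0))   # 10
print(f_ifelse(0))  # 0
```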

List Comprehension

A loop-based solution that is typically still faster than apply.

df['three'] = [x*10 if x<2 else (x**2 if x<4 else x+10) for x in df['one']]
# df['three'] = [
#    (x < 2 and x * 10) or (x < 4 and x ** 2) or x + 10 for x in df['one']
# ]
df
   one  two  three
0    1    6     10
1    2    7      4
2    3    8      9
3    4    9     14
4    5   10     15

For readability I prefer to write a function, especially if you are dealing with many conditions. For the original question:

def parse_values(x):
    if x < 2:
        return x * 10
    elif x < 4:
        return x ** 2
    else:
        return x + 10


df['one'].apply(parse_values)

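You can do it using multiple loc assignments. A later assignment overwrites an earlier one wherever the masks overlap, so each mask has to exclude the rows already handled. Here is a newly created column labelled 'new' with the conditional calculation applied:

df.loc[(df['one'] < 2), 'new'] = df['one'] * 10
df.loc[(df['one'] >= 2) & (df['one'] < 4), 'new'] = df['one'] ** 2
df.loc[(df['one'] >= 4), 'new'] = df['one'] + 10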
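
To see concretely how mask overlap affects this pattern, the sketch below runs the three assignments twice: once with masks that overlap (one < 2 and one < 4) and once with mutually exclusive masks. The frame names `overlap` and `exclusive` and the helper variable `m` are assumptions made for this example only.

```python
import pandas as pd

one = pd.Series([1, 2, 3, 4, 5])

# Overlapping masks: rows with one < 2 also satisfy one < 4,
# so the second assignment replaces the value written by the first.
overlap = pd.DataFrame({"one": one})
overlap.loc[overlap["one"] < 2, "new"] = overlap["one"] * 10
overlap.loc[overlap["one"] < 4, "new"] = overlap["one"] ** 2
overlap.loc[overlap["one"] >= 4, "new"] = overlap["one"] + 10

# Mutually exclusive masks reproduce the if / elif / else result.
exclusive = pd.DataFrame({"one": one})
m = exclusive["one"]
exclusive.loc[m < 2, "new"] = m * 10
exclusive.loc[(m >= 2) & (m < 4), "new"] = m ** 2
exclusive.loc[m >= 4, "new"] = m + 10

# The column is created with missing values first, so cast back to int
# once every row has been filled.
print(overlap["new"].astype(int).tolist())    # [1, 4, 9, 14, 15]
print(exclusive["new"].astype(int).tolist())  # [10, 4, 9, 14, 15]
```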