Best answer
np.where has the semantics of a vectorized if/else (similar to Apache Spark's when/otherwise DataFrame method). I know that I can use np.where on pandas.Series, but pandas often defines its own API to use instead of raw numpy functions, which is usually more convenient with pd.Series/pd.DataFrame.
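To make the "vectorized if/else" reading concrete, here is a minimal sketch of np.where applied to a Series. The sample values and the names s and labels are made up for illustration and are not part of the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not from the original post.
s = pd.Series([-2, 0, 3])

# Element-wise choice: "neg" where s < 0, "non-neg" everywhere else.
# The result is a plain NumPy array, not a Series.
labels = np.where(s < 0, "neg", "non-neg")
```

Note that the result loses the Series index, which is one reason a pandas-native equivalent is attractive.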
Sure enough, I found pandas.DataFrame.where. However, at first glance, it has completely different semantics. I could not find a way to rewrite the most basic example of np.where using pandas where:
# df is pd.DataFrame
# how to write this using df.where?
df['C'] = np.where((df['A']<0) | (df['B']>0), df['A']+df['B'], df['A']/df['B'])
Am I missing something obvious? Or is pandas' where intended for a completely different use case, despite having the same name as np.where?
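For comparison, here is a hedged sketch of how the two calls appear to relate. It relies on the documented behavior of Series.where(cond, other): values of the calling Series are kept where cond is True and replaced by other where it is False. The sample DataFrame is invented for illustration, with B chosen to avoid division by zero.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; B is never zero so A / B is well defined.
df = pd.DataFrame({"A": [-1.0, 2.0, 3.0], "B": [2.0, -4.0, 5.0]})

cond = (df["A"] < 0) | (df["B"] > 0)

# np.where(cond, x, y): take x where cond is True, y elsewhere.
expected = np.where(cond, df["A"] + df["B"], df["A"] / df["B"])

# Series.where(cond, other): keep the calling Series where cond is True,
# take `other` elsewhere. So x becomes the caller and y becomes `other`.
df["C"] = (df["A"] + df["B"]).where(cond, df["A"] / df["B"])
```

Under this reading, the pandas method covers the same use case but takes the "if true" branch as the object the method is called on, rather than as a second argument.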