How to merge two dataframes side-by-side?

小开

You can use the concat function for this (axis=1 is to concatenate as columns):

pd.concat([df1, df2], axis=1)

See the pandas docs on merging/concatenating: http://pandas.pydata.org/pandas-docs/stable/merging.html
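As a minimal sketch of what this does (the frames and column names below are made up for illustration), note that concat lines rows up by index label, so frames with different indexes will not sit neatly side by side:

```python
import pandas as pd

# Two illustrative frames sharing the default 0..2 index
df1 = pd.DataFrame({"a": [1, 2, 3]})
df2 = pd.DataFrame({"b": [4, 5, 6]})

# axis=1 places the frames next to each other as columns
side_by_side = pd.concat([df1, df2], axis=1)

# A frame with a different index is aligned by label, not by position,
# so the result contains the union of both indexes with NaN in the gaps
df3 = pd.DataFrame({"b": [4, 5, 6]}, index=[10, 11, 12])
misaligned = pd.concat([df1, df3], axis=1)

print(side_by_side)
print(misaligned)
```

If the second result is not what you want, resetting the index on both frames first gives a purely positional join.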

小开

I came across your question while I was trying to achieve something similar.

After slicing my dataframes, I first made sure that their indexes were the same. In your case, both dataframes need to be indexed from 0 to 29. I then merged the two dataframes on the index:

df1.reset_index(drop=True).merge(df2.reset_index(drop=True), left_index=True, right_index=True)
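Here is a small sketch of that idea, assuming two sliced frames whose indexes no longer match (the names left and right and the data are hypothetical):

```python
import pandas as pd

# Illustrative slices that kept their original, non-matching index labels
left = pd.DataFrame({"a": [1, 2, 3]}, index=[5, 6, 7])
right = pd.DataFrame({"b": [4, 5, 6]}, index=[20, 21, 22])

# drop=True discards the old labels, so both frames are renumbered from 0
merged = left.reset_index(drop=True).merge(
    right.reset_index(drop=True),
    left_index=True,
    right_index=True,
)

print(merged)
```

Without the reset_index calls, this merge would return an empty frame here, because the default inner join finds no index labels in common.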

小开

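There is a way: you can do it with a Pipeline.

** Use a pipeline to transform your numerical data, for example:

from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder

# DataFrameSelector is a custom transformer (not part of scikit-learn) that picks the given columns; you have to define it yourself
num_pipeline = Pipeline([
("select_numeric", DataFrameSelector([columns with numerical value])),
("imputer", SimpleImputer(strategy="median")),
])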

** And for categorical data:

cat_pipeline = Pipeline([
("select_cat", DataFrameSelector([columns with categorical data])),
# in scikit-learn 1.2 and later, this argument is named sparse_output instead of sparse
("cat_encoder", OneHotEncoder(sparse=False)),
])

** Then use a FeatureUnion to combine these transformations:

preprocess_pipeline = FeatureUnion(transformer_list=[
("num_pipeline", num_pipeline),
("cat_pipeline", cat_pipeline),
])

Read more here - https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.FeatureUnion.html

小开

I found that the other answers didn't work for me when I arrived here from Google.

What I did instead was to set the new columns in place in the original df.

# list(df2.columns) gives you the column names of df2
# you then use these as the column names for df

df[list(df2.columns)] = df2
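A quick sketch of this in-place approach with made-up data. It assumes a reasonably recent pandas, where assigning a DataFrame to a list of new column names is supported; the values are matched to rows by index label, so both frames should share the same index:

```python
import pandas as pd

# Illustrative frames with the same default index
df = pd.DataFrame({"a": [1, 2, 3]})
df2 = pd.DataFrame({"b": [4, 5, 6], "c": [7, 8, 9]})

# Adds columns "b" and "c" to df in place, aligned on the index
df[list(df2.columns)] = df2

print(df)
```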

小开

If you want to combine two dataframes that share a common column name, you can do the following:

df_concat = pd.merge(df1, df2, on='common_column_name', how='outer')
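For illustration, here is a sketch with a hypothetical shared column called key. With how='outer', rows that appear in only one frame are kept and the missing side is filled with NaN:

```python
import pandas as pd

# Illustrative frames that share the column "key" but not all of its values
df1 = pd.DataFrame({"key": [1, 2, 3], "a": ["x", "y", "z"]})
df2 = pd.DataFrame({"key": [2, 3, 4], "b": [20, 30, 40]})

# Outer join on the shared column keeps keys from both frames
df_concat = pd.merge(df1, df2, on="key", how="outer")

print(df_concat)
```

Use how='inner' instead if you only want the rows whose key exists in both frames.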

小开

This solution also works if df1 and df2 have different indices, as long as they have the same number of rows:

df1.loc[:, df2.columns] = df2.to_numpy()
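A small sketch of why this helps, using made-up frames with non-matching indexes. It assumes a recent pandas version in which .loc can create several new columns at once; to_numpy() strips the index, so the values are assigned purely by position:

```python
import pandas as pd

# Illustrative frames with the same number of rows but different index labels
df1 = pd.DataFrame({"a": [1, 2, 3]})
df2 = pd.DataFrame({"b": [4, 5, 6], "c": [7, 8, 9]}, index=[10, 11, 12])

# The raw array carries no index, so rows are matched by position
df1.loc[:, df2.columns] = df2.to_numpy()

print(df1)
```

Keep in mind that to_numpy() on a frame with mixed column types returns an object array, so the new columns may not keep their original dtypes.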