python - Pandas add column from one dataframe to another based on a join -
assume have 2 dataframes. want add column of dataframe 1 dataframe 2 based on column lookup. if join not possible, want in column constant (so can filter that).
graphically:
code:
import pandas pd import numpy np data = np.array([['','col1','col2'], ['row1','2','two'], ['row2','1','one']] ) data2 = np.array([['','col3','col4'], ['row1','1','t1'], ['row2','2','t2'], ['row3','3','t3']] ) df = pd.dataframe(data=data[1:,1:], index=data[1:,0], columns=data[0,1:]) df2 = pd.dataframe(data=data2[1:,1:], index=data2[1:,0], columns=data2[0,1:]) result_df = df2 + join col2 based on df2.col3 = df.col1. add string constant if join fails. print(df) print(df2) print(result_df)
df = df2.join(df.set_index('col1'), on='col3') print (df) col3 col4 col2 row1 1 t1 1 row2 2 t2 2 row3 3 t3 nan
df2['col2'] = df2['col3'].map(df.set_index('col1')['col2']) print (df2) col3 col4 col2 row1 1 t1 1 row2 2 t2 2 row3 3 t3 nan
Comments
Post a Comment