Joining two data.frames
data.table(df_del)
KEY place_Name
1: 200039 BUFFALO/ROCHESTER
2: 200171 MILWAUKEE
3: 200197 PEORIA/SPRINGFLD.
4: 200233 OKLAHOMA CITY
5: 200272 LOS ANGELES
data.table(df)
firm_id brand_id KEY UNITS DOLLARS DATE
1: 511 263 647840 1 7.29 2001-01-01
2: 511 265 647840 2 14.58 2001-01-01
3: 511 265 532733 1 6.39 2001-01-01
4: 23 417 263939 1 4.79 2001-01-01
5: 23 417 648768 5 24.45 2001-01-01
I am trying to join them by KEY but running into problems. The df file has approx 500,000 results and the df_del file has about 12,000.
The df_del file has the unique product key and it can be bought in the same city, so there may be 10 KEY values for one city(i.e. a product delivered 10 times in that city)
The dffile also has the KEY column - however it is not always found. (when I copy a random KEYfrom the df_del dataframe and paste it in the search of the df frame I get no results sometimes. (This is because I am only using a snap shot of the df data and all of the df_del data). The alternative way i.e. KEYnumber from the dfdataframe and pasted into the df_delgives me a result (can sometimes appear more than once in both data.frames)
My question;
When I try to run;
library(plyr)
df_test <- join(df, df_del,
type = "left")
I obtain 500,000 results with all the df results but in the merged place name I get just NA values, I have tried, right, left, inneretc. I have also tried merge(df, df_del, by = "KEY") and get a result of zero.
Any help would be appreciated.
df should look like:
data.table(df)
firm_id brand_id KEY UNITS DOLLARS DATE place_name
1: 511 263 647840 1 7.29 2001-01-01 NYC
2: 511 265 647840 2 14.58 2001-01-01 NYC
3: 511 265 532733 1 6.39 2001-01-01 Los Angeles
4: 23 417 263939 1 4.79 2001-01-01 Chicago
5: 23 417 648768 5 24.45 2001-01-01 Houston