I have two datasets like so:
training.csv
last_name ob1 ob2
Adam 2:01 2:02
Barry, S 3:30 2:50
Barry, D 2:45
Charlie 4:00
Don 2:00 1:50
Earl 2:50 2:30
Johnson, A 2:57 2:54
Johnson, T 3:15 3:10
and
racing.csv
last_name first_name 1mile-time 500m-time
Barry Sue 4:45 1:50
Don Regan 4:35 0:50
Earl Sage 4:50 1:30
Johnson Adam 4:37 1:54
Johnson Terry 4:50 2:10
I tried merge(training, racing, by = "last_name", all = TRUE), but some people share a last name. Where a last name is shared, training.csv records it as the last name and first initial separated by a comma (e.g. "Barry, S"), whereas racing.csv keeps the last name and first name in separate columns, so those rows never match.
Another important point: not everyone who goes to training makes the races, so some names in training.csv are not present in racing.csv.
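A minimal sketch that reproduces the problem. The inline data frames stand in for the two CSV files and contain only a few of the rows; the column names mile_time and m500_time are placeholders chosen to be valid R names, and putting the single "Barry, D" time in ob1 is an assumption.

```r
# Stand-ins for training.csv and racing.csv (subset of rows; NA marks a missing time)
training <- data.frame(
  last_name = c("Adam", "Barry, S", "Barry, D", "Don"),
  ob1 = c("2:01", "3:30", "2:45", "2:00"),  # assumption: the lone "Barry, D" time is ob1
  ob2 = c("2:02", "2:50", NA, "1:50"),
  stringsAsFactors = FALSE
)
racing <- data.frame(
  last_name = c("Barry", "Don"),
  first_name = c("Sue", "Regan"),
  mile_time = c("4:45", "4:35"),   # placeholder name for the 1mile-time column
  m500_time = c("1:50", "0:50"),   # placeholder name for the 500m-time column
  stringsAsFactors = FALSE
)

# Full outer join on last_name, as in the original attempt
merged <- merge(training, racing, by = "last_name", all = TRUE)
print(merged)
```

"Don" lines up, but "Barry" from racing does not equal "Barry, S" in training, so Sue's race times end up on a separate row with no training observations.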
Desired output
last_name first_name ob1 ob2 1mile-time 500m-time
Adam Bob 2:01 2:02
Barry, S Sue 3:30 2:50 4:45 1:50
Barry, D Derrick 2:45
Charlie Charles 4:00
Don Regan 2:00 1:50 4:35 0:50
Earl Sage 2:50 2:30 4:50 1:30
Johnson, A Adam 2:57 2:54 4:37 1:54
Johnson, T Terry 3:15 3:10 4:50 2:10
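One possible direction, sketched here only to make the desired matching rule concrete: build a "Last, I" key on the racing side and use it whenever that form actually occurs in training. The column names key and mile_time and the variable with_initial are hypothetical, and the data frames are small stand-ins for the CSV files, so this is an illustration under those assumptions rather than a tested solution for the full data.

```r
# Small stand-ins for the two files (subset of rows and columns)
training <- data.frame(
  last_name = c("Barry, S", "Barry, D", "Don"),
  ob1 = c("3:30", "2:45", "2:00"),
  stringsAsFactors = FALSE
)
racing <- data.frame(
  last_name = c("Barry", "Don"),
  first_name = c("Sue", "Regan"),
  mile_time = c("4:45", "4:35"),  # placeholder name for the 1mile-time column
  stringsAsFactors = FALSE
)

# Candidate key in the "Last, I" style used by training.csv
with_initial <- paste0(racing$last_name, ", ", substr(racing$first_name, 1, 1))

# Use the "Last, I" form only when training actually contains it;
# otherwise fall back to the plain last name
racing$key <- ifelse(with_initial %in% training$last_name,
                     with_initial, racing$last_name)

# Keep every training row; attach race data where the key matches
merged <- merge(training, racing[, c("key", "first_name", "mile_time")],
                by.x = "last_name", by.y = "key", all.x = TRUE)
print(merged)
```

With this stand-in data, "Barry, S" picks up Sue's race time, while "Barry, D" keeps NA in the racing columns. first_name also stays NA for anyone who is absent from racing, since that is the only place first names are recorded here.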