I found a few questions heading in this direction, but I could not apply the solutions to my specific problem: I have quite a messy data frame column containing addresses. Cells can be empty, contain only numbers, or combine numbers and text, and there can be one or more special characters in between.
As a first step, I want to split every value at its first special character. I tried various options that work partially. However, the problem seems to be that some cells don't contain any special characters, which causes an error in the function.
For example, the following code puts only the special character in the new column b, but does not really split the column:
df <- df %>%
  separate(address, into = c("a", "b"), sep = "[^[:punct:]]+", remove = FALSE)
Ideally, I want to achieve the following: if a cell contains a special character, split it at the first one, with everything to the left of that character in column a and everything to the right in column b. If there is no special character, put the whole value in column a and NA in column b.
Do I have to wrap my code in an ifelse statement? Or are there any other suggestions?
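For reference, a minimal sketch of the behavior described above, not a definitive solution. In the attempt, the negated class `[^[:punct:]]+` treats the runs of non-special characters as the separator, which would explain why only the special character ends up in b. The sketch instead matches a single special character and relies on the `extra` and `fill` arguments of `tidyr::separate()`. The data frame `df` here is a small made-up example, and it assumes that "special character" means anything in `[[:punct:]]`.

```r
library(tidyr)

# Small illustrative data frame (hypothetical values)
df <- data.frame(
  address = c("2", "97/7", "17/7-8", "7E", "17+18"),
  stringsAsFactors = FALSE
)

out <- separate(
  df, address,
  into   = c("a", "b"),
  sep    = "[[:punct:]]",  # a single special character is the separator
  extra  = "merge",        # split only at the first match, keep the rest in b
  fill   = "right",        # no match: whole value goes to a, b becomes NA
  remove = FALSE
)
out
```

With `extra = "merge"` and `fill = "right"`, no explicit ifelse should be needed, since values without a separator are padded with NA on the right instead of triggering a warning.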
Thanks!
Edit: as requested, some sample data:
library(dplyr)
library(tidyr)  # provides separate(), used in the attempt above
test <- as.data.frame(c("2", "97/7", "17/7-8", "7E", "800E/7", "17", "", "0", "2/15", "17+18", "17/7/8", "19", "2/2/4", "9-7/8")) %>%
  rename(address = 1)
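To make the expected result on this sample data concrete, here is a rough base R sketch of the explicit ifelse route. The helper name `split_first_special` is hypothetical, and it again assumes that `[[:punct:]]` covers the special characters in question. It locates the first special character with `regexpr()` and cuts the string around it.

```r
# Hypothetical helper: split each string at its first special character
split_first_special <- function(x) {
  # Position of the first punctuation character, -1 if there is none;
  # as.integer() drops the match.length attribute that regexpr() attaches
  pos <- as.integer(regexpr("[[:punct:]]", x))
  has_match <- !is.na(pos) & pos > 0

  a <- ifelse(has_match, substr(x, 1, pos - 1), x)                     # left part, or whole value
  b <- ifelse(has_match, substr(x, pos + 1, nchar(x)), NA_character_)  # right part, or NA

  data.frame(address = x, a = a, b = b, stringsAsFactors = FALSE)
}

address <- c("2", "97/7", "17/7-8", "7E", "800E/7", "17", "", "0",
             "2/15", "17+18", "17/7/8", "19", "2/2/4", "9-7/8")

res <- split_first_special(address)
res
```

In this sketch the empty string stays as "" in column a with NA in column b, and values such as "17/7/8" keep everything after the first separator ("7/8") together in column b.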