I have a string in the format of a URL query string:
string <- "key1=value1&key2=value2"
I would like to extract all the parameter names (key1, key2).
I thought about using strsplit with a split pattern matching everything between = and an optional &:
unlist(strsplit(string, "=.+&?"))
[1] "key1"
But I guess this pattern matches from the first = to the end of the string, with my optional & swallowed by the .+. I suspect this is because of the greediness of the regexp, so I tried to make it lazy, but I got a strange result:
> unlist(strsplit(string, "=.+?&?"))
[1] "key1" "alue1&key2" "alue2"
Now I don't really understand what is happening here, and I don't know how to make the match lazy when the last matching character is optional.
I know (and I think I also understand why) that it works if I exclude & from .+, but I would like to understand why the regexps above aren't working:
> unlist(strsplit(string, "=[^&]+&?"))
[1] "key1" "key2"
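To see which text each of the three patterns actually consumes, the matches themselves can be listed instead of the pieces left over after splitting. This is only a minimal sketch using base R's regmatches() and gregexpr() with the default regex engine (the same one strsplit uses above); show_matches is a hypothetical helper name introduced for illustration.

```r
string <- "key1=value1&key2=value2"

# Hypothetical helper (not part of base R): list every substring a pattern matches.
show_matches <- function(pattern, text) {
  regmatches(text, gregexpr(pattern, text))[[1]]
}

greedy   <- show_matches("=.+&?", string)     # .+ is greedy
lazy     <- show_matches("=.+?&?", string)    # .+? is lazy
excluded <- show_matches("=[^&]+&?", string)  # [^&]+ cannot cross an &

print(greedy)    # "=value1&key2=value2"
print(lazy)      # "=v" "=v"
print(excluded)  # "=value1&" "=value2"
```

The lazy result is consistent with the strsplit output above: .+? is satisfied by a single character, and because &? is allowed to match nothing, the whole pattern already succeeds after "=v".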
My current workaround is to do it in two steps:
unlist(sapply(unlist(strsplit(string, "&")), strsplit, split = "=.*", USE.NAMES = FALSE))
What am I doing wrong that prevents me from achieving this with a single regexp? Thanks for any help.
I'm painfully learning regexps, so any other options would also be appreciated!
