I have these two HTML strings:
a="<div> foo: <span>bar</span> </div>"
b="<div> foo: bar <br> </div>"
I want to extract "foo: bar" from each string.
The way I want to do it is to match from the word 'foo' until I come across a '<' character.
I can do this with the regular expression:
foo([^<]+)
This finds "foo: bar" in string b but not in string a, because the <span> tag is in the way. So I want the regex to match from foo until it finds a < character, ignoring the <span> tag.
These are just some of the strings this has to work on, so it has to work as stated, i.e. I cannot start by removing tags before or after the match.
Basically, all I need to know is how to match all characters in a string until I come across a certain character, unless that character is followed by a set of specified characters, i.e. match until < but if < is followed by span> then look for the next <.
Does anyone know how to do this?
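For illustration, here is a minimal sketch of the behaviour described above, assuming Python's re module (no language is stated in the question, so that choice is an assumption, as are the names FOO_PATTERN and find_foo). It uses a lookahead so that a < is only consumed when it is immediately followed by span>; any other < ends the match.

```python
import re

# A '<' is consumed only when it starts the literal text "<span>";
# any other '<' ends the match. "span" is just the tag from the example strings.
FOO_PATTERN = re.compile(r"foo(?:[^<]|<(?=span>))+")


def find_foo(html):
    """Return the text from 'foo' up to the first '<' that does not start '<span>'."""
    match = FOO_PATTERN.search(html)
    return match.group(0) if match else None


a = "<div> foo: <span>bar</span> </div>"
b = "<div> foo: bar <br> </div>"

print(repr(find_foo(a)))  # 'foo: <span>bar'
print(repr(find_foo(b)))  # 'foo: bar '
```

Note that the match for string a still contains the text <span> itself: the pattern only stops treating that tag as an end point, it does not remove it from the result.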