I just finished a Coursera data science course. In one of the assignments, we needed to clean up the weather event types, and I struggled with the grep function. For example, I wanted to find the event types that contain “strong wind” but exclude “marine strong wind”. I finally found an example online that uses a negative lookahead, written (?!...) in Perl-style regular expressions, to exclude matches. Here is an example:
events <- c("Strong Wind", "Wind", "Marine Strong Wind")
events[grep("^(?=.*strong.wind)", events, ignore.case = TRUE, perl = TRUE)]
## [1] "Strong Wind"        "Marine Strong Wind"
events[grep("^(?=.*strong.wind)(?!.*marine)", events, ignore.case = TRUE, perl = TRUE)]
## [1] "Strong Wind"
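The first grep call finds every string that contains “strong wind”, so it returns both “Strong Wind” and “Marine Strong Wind”. The positive lookahead (?=.*strong.wind) checks, starting from the beginning of the string, that “strong”, any single character, and “wind” appear somewhere later, without consuming any text. To exclude “Marine Strong Wind”, we add the negative lookahead (?!.*marine) to the pattern, which makes the match fail whenever “marine” appears anywhere in the string. Lookaheads are only available with perl = TRUE.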
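For comparison, here is a small sketch of the same filter written without lookaheads, by combining two plain grepl tests. The variable names (with_lookahead, has_strong_wind, has_marine, without_lookahead) are my own and purely illustrative. It also uses value = TRUE so that grep returns the matching strings instead of their indices.

```r
events <- c("Strong Wind", "Wind", "Marine Strong Wind")

# Lookahead version from above; value = TRUE returns the strings themselves
with_lookahead <- grep("^(?=.*strong.wind)(?!.*marine)", events,
                       ignore.case = TRUE, perl = TRUE, value = TRUE)

# Same filter without lookaheads: one test to include, one to exclude
has_strong_wind <- grepl("strong.wind", events, ignore.case = TRUE)
has_marine <- grepl("marine", events, ignore.case = TRUE)
without_lookahead <- events[has_strong_wind & !has_marine]

# Both approaches should select the same event types
identical(with_lookahead, without_lookahead)
```

The two-step version is longer but may be easier to read when there are several terms to exclude. The lookahead version keeps everything in a single pattern, which is handy when the pattern has to be passed around as one string.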