I just finished a Coursera data science course. One of the assignments required cleaning up weather event types, and I struggled with the grep function. For example, I wanted to find the event types that contain “strong wind” but exclude “marine strong wind”. I finally found an example online that uses (?!...), the negative lookahead from Perl-style regular expressions, to exclude matches. Here is an example:
events <- c("Strong Wind", "Wind", "Marine Strong Wind")
events[grep("^(?=.*strong.wind)", events, ignore.case = TRUE, perl = TRUE)]
## [1] "Strong Wind" "Marine Strong Wind"
events[grep("^(?=.*strong.wind)(?!.*marine)", events, ignore.case = TRUE, perl = TRUE)]
## [1] "Strong Wind"
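The first grep call finds every string containing “strong wind”, so it returns both “Strong Wind” and “Marine Strong Wind”. To exclude “Marine Strong Wind”, we add the negative lookahead (?!.*marine) to the pattern passed to grep. Anchored at the start of the string, it rejects any string that contains “marine” anywhere. Setting perl to TRUE is required, because R's default regex engine does not support lookaheads.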
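If the lookahead syntax feels hard to read, the same filter can probably be written as two plain grepl calls combined with logical operators. This is only a sketch of an alternative, not the approach from the example I found; the variable names has_strong_wind, has_marine, filtered, and lookahead are my own.

```r
events <- c("Strong Wind", "Wind", "Marine Strong Wind")

# TRUE for every element that contains "strong wind" (any single character between the words)
has_strong_wind <- grepl("strong.wind", events, ignore.case = TRUE)

# TRUE for every element that contains "marine"
has_marine <- grepl("marine", events, ignore.case = TRUE)

# Keep the elements that match the first condition but not the second
filtered <- events[has_strong_wind & !has_marine]
print(filtered)
## [1] "Strong Wind"

# The lookahead version should give the same result; value = TRUE returns the matching strings
lookahead <- grep("^(?=.*strong.wind)(?!.*marine)", events,
                  ignore.case = TRUE, perl = TRUE, value = TRUE)
identical(filtered, lookahead)
```

The two-step version needs no perl argument, at the cost of scanning the vector twice.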