Regular Expressions in Notepad++

I like Notepad++ and use it as my editor of choice for many projects. Today I was going through some HTML forms, the kind with a multitude of option elements for state, country, and so on. Imagine you had to get all of the state and country names out of those option elements. There are plenty of ways to do it, but a regex find and replace in Notepad++ makes it pretty easy. (Granted, you could do this in just about any editor worth its salt, but here I'd like to show you how to do it in Notepad++.)

OK, so you have a list of HTML options and you want to get the values without the HTML:

<option value="AFGHANISTAN">AFGHANISTAN</option><option value="ALBANIA">ALBANIA</option><option value="ALGERIA">ALGERIA</option>

etc.

You could go lift a list off of someone else's website, or you could easily get the values with Notepad++. Just paste the HTML into Notepad++, then hit Ctrl+H or go to Search -> Replace.

  1. Check the option box in the lower left for Regular expression.
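  2. Type this regex in the Find box: <option value="[0-9a-zA-Z_&. -]+"> (The hyphen goes last inside the brackets so it is matched literally instead of being read as a range.)
  3. Make sure the Replace box is empty. (You want to replace each match with nothing.)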
  4. Select Replace All.
  5. Check the option box for 'Extended (\n, \r, \t, \0, \x...)'.
  6. Type </option> in the Find box.
  7. Type \r\n in the Replace box.
  8. Select Replace All.
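
If you ever want to do the same thing outside the editor, the two passes above can be sketched in a few lines of Python. This is only a rough equivalent, not what Notepad++ does internally: the function name extract_option_labels is made up for illustration, the sample string is the three options from this post, and the pattern is the step 2 regex with the hyphen placed last in the brackets so Python treats it as a literal character.

```python
import re

# Sample input: the three option elements shown earlier in this post.
html = (
    '<option value="AFGHANISTAN">AFGHANISTAN</option>'
    '<option value="ALBANIA">ALBANIA</option>'
    '<option value="ALGERIA">ALGERIA</option>'
)


def extract_option_labels(markup):
    """Hypothetical helper that mirrors the two Replace All passes."""
    # Pass 1 (steps 1-4): delete every opening <option value="..."> tag.
    without_open = re.sub(r'<option value="[0-9a-zA-Z_&. -]+">', "", markup)
    # Pass 2 (steps 5-8): turn every closing tag into a line break.
    one_per_line = without_open.replace("</option>", "\n")
    # Drop the empty entry left behind by the final line break.
    return [line for line in one_per_line.splitlines() if line.strip()]


print("\n".join(extract_option_labels(html)))
```

Like the editor version, this only handles option tags whose value attribute uses the characters listed in the brackets; anything else is left in place for you to spot.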

Hooray, now you have a list of countries (or whatever you had options for), one per line. You can do much more with regular expressions in Notepad++, and I'd suggest taking a look at this blog and getting this book from Amazon.