Coming from Java, Ruby’s regex implementation seems a bit obtuse. Here’s a bunch of code examples which I’ve put together as a mini-reference for my own use. If I’ve missed anything major, leave a comment. And sorry about the code bleeding off to the right; I need a more code-friendly WordPress theme.
Each line contains an example of a Ruby regex-related call. The comment that follows it shows what the line outputs and gives a short explanation.
#Creating Regular Expressions
#============================
my_laborious_regex = Regexp.new('[a-z]{3}')
puts my_laborious_regex #=> "(?-mix:[a-z]{3})" because that's the regex and the default regex options
puts my_laborious_regex.source #=> "[a-z]{3}" because .source returns the regex pattern we specified
my_prettiest_regex = /[a-z]{3}/
puts my_prettiest_regex #=> "(?-mix:[a-z]{3})" because that's the regex and the default regex options
puts my_prettiest_regex.source #=> "[a-z]{3}" because .source returns the regex pattern we specified
puts my_laborious_regex == my_prettiest_regex #=> "true" because though they were created differently, they are the same pattern (and options)
#Playing with matching strings
#=============================
puts /^(\d{3})(\d{3})/.match("123456789") #=> "123456" because the regex will match the first 6 chars
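puts /[a-z][0-9].*[a-z]/.match("123a8---a123") #=> "a8---a" because the match starts at the first letter followed by a digit ("a8") and the greedy .* runs up to the last letter, leaving out the leading and trailing "123"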
#Playing with capture groups
#===========================
puts /^123(\d{3})(\d{3})$/.match("123456789")[0] #=> "123456789", because [0] returns the match
puts /^123(\d{3})(\d{3})$/.match("123456789")[1] #=> "456", because [1] returns the first capture group
puts /^123(\d{3})(\d{3})$/.match("123456789")[2] #=> "789", because [2] returns the second capture group
puts /^123(\d{3})(\d{3})$/.match("123456789")[3].inspect #=> "nil", because there are only 2 capture groups (.inspect is needed because a bare puts nil prints an empty line on Ruby 1.9 and later)
puts /^123(\d{3})(\d{3})$/.match("123456789").to_a.inspect #=> ["123456789", "456", "789"], the match and the capture group results
#Playing with pre/post match
#===========================
the_match = /[a-z]+/.match("321abcdefg987")
puts the_match #=> "abcdefg" because the regex matches the first run of lowercase alpha chars
puts the_match.pre_match #=> "321" because it's what comes before the string that was matched
puts the_match.post_match #=> "987" because it's what comes after the string that was matched
#Playing with .split and .scan
#=============================
puts target_string = "abc123def456ghi789"
puts target_string.split(/[0-9]+/).inspect #=> "["abc", "def", "ghi"]", because .split hunts for the supplied pattern, strips matches out of the string and returns substrings that were between the matches in an array
puts target_string.scan(/[0-9]+/).inspect #=> "["123", "456", "789"]", because .scan hunts for strings that match the pattern and returns all the matches in an array
#Playing with string substitution
#================================
puts target_string = "hello hello hello"
puts target_string.sub(/hello/, "goodbye") #=> "goodbye hello hello", because .sub only replaces the first match
puts target_string.gsub(/hello/, "goodbye") #=> "goodbye goodbye goodbye", because .gsub replaces all matches
puts target_string.gsub(/o/, "a").gsub(/e/, "o") #=> "holla holla holla", .gsub returns a string so you can chain .gsub's together
You can run the above and the output you’ll get is:
(?-mix:[a-z]{3})
[a-z]{3}
(?-mix:[a-z]{3})
[a-z]{3}
true
123456
a8---a
123456789
456
789
nil
["123456789", "456", "789"]
abcdefg
321
987
abc123def456ghi789
["abc", "def", "ghi"]
["123", "456", "789"]
hello hello hello
goodbye hello hello
goodbye goodbye goodbye
holla holla holla
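Two things the examples gloss over are what the "-mix" in the printed regex means and what happens when nothing matches. Here's a rough sketch of both. The first_capture helper is a name I made up for illustration; it isn't part of Ruby.

```ruby
# The letters in "(?-mix:...)" are the three regex options: m (multiline),
# i (ignore case) and x (extended). Anything listed after the "-" is switched off.
case_insensitive = Regexp.new('[a-z]{3}', Regexp::IGNORECASE)
puts case_insensitive                #=> "(?i-mx:[a-z]{3})" because i is now switched on
puts case_insensitive == /[a-z]{3}/i #=> "true" because the literal form with the i flag is the same pattern and options
puts case_insensitive == /[a-z]{3}/  #=> "false" because the pattern is the same but the options differ

# .match returns nil when nothing matches, so calling [1] straight on the
# result raises NoMethodError. first_capture (a hypothetical helper) checks first.
def first_capture(regex, string)
  the_match = regex.match(string)
  the_match ? the_match[1] : nil
end

puts first_capture(/^123(\d{3})/, "123456789")         #=> "456" because the pattern matched and [1] is the first group
puts first_capture(/^123(\d{3})/, "abcdefghi").inspect #=> "nil" because the pattern didn't match at all
```

Checking the result of .match before indexing into it is worth the extra line whenever the input isn't something you control.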
Finally, there is an awesome site called Rubular (http://www.rubular.com/), which I already link to. It’s a very useful tool for trying out regex patterns. Bookmark it.
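For quick checks without leaving the terminal, here's a rough sketch of a helper that pulls together the match, the capture groups and the pre/post match in one go. describe_match is a name I invented for this example (it has nothing to do with Rubular itself), and it relies on MatchData#captures, which returns just the groups without the full match.

```ruby
# describe_match is a hypothetical helper, not part of Ruby or Rubular.
# It returns a hash describing the first match of regex in string,
# or nil when the pattern doesn't match.
def describe_match(regex, string)
  the_match = regex.match(string)
  return nil if the_match.nil?

  {
    :match    => the_match[0],          # the whole matched text
    :captures => the_match.captures,    # capture groups only, as an array
    :before   => the_match.pre_match,   # text before the match
    :after    => the_match.post_match   # text after the match
  }
end

result = describe_match(/([a-z]+)(\d+)/, "321abc987!")
puts result[:match]            #=> "abc987"
puts result[:captures].inspect #=> ["abc", "987"]
puts result[:before]           #=> "321"
puts result[:after]            #=> "!"
puts describe_match(/[a-z]+/, "12345").inspect #=> "nil" because there are no lowercase chars to match
```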















