Matching Strings with Regular Expressions
Problem
You want to know whether or not a string matches a certain pattern.
Solution
You can usually describe the pattern as a regular expression. The =~ operator tests a string against a regular expression, returning the index of the first match or nil if there is none:
string = 'This is a 30-character string.'
if string =~ /([0-9]+)-character/ and $1.to_i == string.length
  "Yes, there are #$1 characters in that string."
end
# => "Yes, there are 30 characters in that string."
You can also use Regexp#match:
match = Regexp.compile('([0-9]+)-character').match(string)
if match && match[1].to_i == string.length
  "Yes, there are #{match[1]} characters in that string."
end
# => "Yes, there are 30 characters in that string."
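When only a yes-or-no answer is needed, a predicate method can stand in for =~, and a named group can make the capture easier to read. The following is a minimal sketch, assuming Ruby 2.4 or later (for match?); the group name count and the variable names are illustrative and not part of the recipe.

```ruby
string = 'This is a 30-character string.'

# match? returns true or false and does not set $~ or $1.
has_count = string.match?(/[0-9]+-character/)

# A named group (?<count>...) can be read by name from the MatchData object.
match = /(?<count>[0-9]+)-character/.match(string)
count = match ? match[:count].to_i : nil

# Compare the captured number with the actual string length.
length_matches = (count == string.length)
```

Here has_count and length_matches are both true, and count is the integer 30.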
You can check a string against a series of regular expressions with a case statement:
string = "123"
case string
when /^[a-zA-Z]+$/
  "Letters"
when /^[0-9]+$/
  "Numbers"
else
  "Mixed"
end
# => "Numbers"
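A case statement compares each when clause with ===, and Regexp#=== returns true when the pattern matches. One detail worth knowing: in Ruby, ^ and $ anchor to the start and end of a line, while \A and \z anchor to the whole string. The sketch below shows the difference on a multiline input; classify is a hypothetical helper introduced only for illustration.

```ruby
# Hypothetical helper, named here only for illustration.
def classify(string)
  case string
  when /\A[a-zA-Z]+\z/ then "Letters"
  when /\A[0-9]+\z/    then "Numbers"
  else                      "Mixed"
  end
end

# ^ and $ anchor to lines, so a single all-digit line is enough to match.
line_anchored = ("abc\n123" =~ /^[0-9]+$/)

# \A and \z anchor to the whole string, so the same input is "Mixed".
string_anchored = classify("abc\n123")

# case/when calls Regexp#=== behind the scenes.
uses_triple_equals = (/\A[0-9]+\z/ === "123")
```

If the strings being tested may contain line breaks, the \A and \z anchors are usually the safer choice for whole-string checks.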
Discussion
Regular expressions are a cryptic but powerful minilanguage for string matching and substring extraction. They've been around for a long time in Unix utilities like sed, but Perl was the first general-purpose programming language to include them. Now almost all modern languages support Perl-style regular expressions.
Ruby provides several ways of initializing regular expressions. The following expressions all create equivalent Regexp objects:
/something/
Regexp.new("something")
Regexp.compile("something")
%r{something}
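One way to check that these forms are interchangeable is to compare the pattern source and the option flags of each object. This is a small sketch with illustrative variable names; the path pattern at the end is a made-up example of where %r{} tends to be handy.

```ruby
literal  = /something/
from_new = Regexp.new("something")
compiled = Regexp.compile("something")
percent  = %r{something}

# Each form yields the same pattern source and the same option flags.
all_equivalent = [from_new, compiled, percent].all? do |re|
  re.source == literal.source && re.options == literal.options
end

# %r{} avoids escaping when the pattern itself contains slashes.
path_re    = %r{/usr/(local/)?bin}
path_match = (path_re =~ "/usr/local/bin")
```

Here all_equivalent is true, and path_match is 0 because the pattern matches at the start of the string.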
The following modifiers are also of note.
| Regexp constant | Modifier | Effect |
| Regexp::IGNORECASE | i | Makes matches case-insensitive. |
| Regexp::MULTILINE | m | Normally, the . metacharacter does not match a line break. With this modifier, . matches line breaks like any other character. |
| Regexp::EXTENDED | x | Lets you space out your regular expressions with whitespace and comments, making them more legible. |
Here's how to use these modifiers to create regular expressions:
/something/mxi
Regexp.new('something',
           Regexp::EXTENDED | Regexp::IGNORECASE | Regexp::MULTILINE)
%r{something}mxi
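The modifier constants are integer bit flags, so they can be combined with a bitwise OR and read back with Regexp#options. The sketch below, with illustrative variable names, checks that the literal and the constructor forms carry the same three flags.

```ruby
combined = Regexp::EXTENDED | Regexp::IGNORECASE | Regexp::MULTILINE
literal  = /something/mxi
built    = Regexp.new('something', combined)

# Regexp#options returns the flags as an integer bit set.
literal_has_all = (literal.options & combined) == combined
built_has_all   = (built.options & combined) == combined

# An individual flag can be tested with a bitwise AND.
ignores_case = (built.options & Regexp::IGNORECASE) != 0
plain_case   = (/something/.options & Regexp::IGNORECASE) != 0
```

Both literal_has_all and built_has_all come out true; ignores_case is true, while plain_case is false because the unmodified literal carries no flags.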
Here's how the modifiers work:
case_insensitive = /mangy/i
case_insensitive =~ "I'm mangy!"                        # => 4
case_insensitive =~ "Mangy Jones, at your service."     # => 0

multiline = /a.b/m
multiline =~ "banana\nbanana"                           # => 5
/a.b/ =~ "banana\nbanana"                               # => nil
# But note:
/a\sb/ =~ "banana\nbanana"                              # => 5

extended = %r{ \ was  # Match " was"
               \s     # Match one whitespace character
               a      # Match "a"
             }xi
extended =~ "What was Alfred doing here?"               # => 4
extended =~ "My, that was a yummy mango."               # => 8
extended =~ "It was\n\n\na fool's errand"               # => nil
See Also
- Mastering Regular Expressions by Jeffrey Friedl (O'Reilly) gives a concise introduction to regular expressions, with many real-world examples
- RegExLib.com provides a searchable database of regular expressions (http://regexlib.com/default.aspx)
- A Ruby-centric regular expression tutorial (http://www.regular-expressions.info/ruby.html)
- ri Regexp
- Recipe 1.19, "Validating an Email Address"