class: center, middle # Regexen .b[[@abelar_s](https://twitter.com/abelar_s) - [@ParisRB](https://twitter.com/parisrb) - [@RailsGirlsParis](https://twitter.com/railsgirlsparis)] [maitre-du-monde.fr](maitre-du-monde.fr) .contrib[[Contribute to this talk on GitHub!](https://github.com/abelards/abelards.github.com/edit/master/talks/regexen.html)] ??? --- # A long long time ago Old UNIX tools are great.
Tools for more elegant times.
I want to match things. `"Bonjour" =~ /jour/ # => 3` `"Bonsoir".matches("jour") # => nil` ```ruby r = /#{first_part}@#{domains}\.#{tld}/ emails.reject {|e| e !~ r } ``` --- class: center # The Phantom Menace Matching characters: - Unix Glob: `_` `*` - SQL LIKE: `?` `%` - Reg Exp: `.` `.*` --- class: center # Attack of the Clones We found the previous character N times: - 0-1 `/Reg(ular)?/` - 1-n `/expres+ion/` - 0-n `/regex.*magic/` --- class: center # Revenge of the Characters French postcodes around Paris: ```ruby /(91|92|93|94).../ # strings' OR # accepts 94xxx /9[1234]\d\d\d/ # chars' OR # accepts numbers /9[1-4]\d{3}/ # counting ``` Match over-enthusiastic expressions: ```ruby min = /yolooo*/, /yolooo+/, /yolo{3,}/ max = /great!{,5}/ boundaries = /lo{3,5}l/ ``` --- class: center # A new hope case `/ReGeXp/i` Multiline matching is error-prone: ``` # start/end of line/string /^part$/ =~ %Q[in part icular] /p.*ar/m =~ %Q[pec uliar] /\Apart\Z/ !~ "particle" ``` RUBY ONLY: `\z` means EOS, `\Z` EOS + chomp --- class: center # The Group Strikes Back - `"banana" =~ /b(.).\1.\1/` - `# "bonobo", "bikini"` ```ruby str = "You are strong" str.gsub(/(\w*) (\w*) (\w*)/, "\\3, \\1 \\2") # => "strong, You are" ``` --- # The Group Strikes Back Ruby allow you to get that in the MatchData object ```ruby title = "I am your father" title =~ /am (\w*) (\w*)/ result = Regexp.last_match # => #
``` --- class: center # Return of the $ ``` m = /s(\w{2}).*(c)/.match('haystack') m == Regexp.last_match #=> #
$& == m[0] # => "stac" $` == m.pre_match # => "hay" $' == m.post_match #=> "k" $1 == m[1] #=> "ta" $3 == m[3] #=> nil $+ == m[-1] #=> "c" ``` --- class: center # The RegExp awakens ``` /(?
.)(?
.)/.named_captures #=> {"foo"=>[1], "bar"=>[2]} /(?
.)(?
.)(?
.)/.names #=> ["foo", "bar", "baz"] ``` --- class: center # Rogue \\1 ```ruby s = "[Changing [brackets]] [into parentheses]" s.gsub(/\[(.*?)\]/, "(\\1)") # => "(Changing [brackets)] (into parentheses)" ``` `.*` `.*?` (Greed[oy]) --- class: center # The Last Jedi * basic : .* and \( \) \? \+ * `-e`xtended : ./()?+ .info[ Regular expressions are a Type-3 Grammar in Chomsky's classification.
They are, can be run as, and can model automata.] `/!\` DO NOT PARSE HTML WITH REGEXP! --- class: center # Classes of the Old Republic ``` /\w\W/ # word [a-zA-Z0-9_] & non-\w /\s\S/ # whitespace [ \t\r\n\f] & non-\S /[[:xdigit:]]/ # POSIX [0-9a-fA-F] /\p{Alnum}/ # Unicode classes /\p{Ll}/ # Unicode General Category # 'Letter: Lowercase' ``` --- class: center, middle, happy # Questions? .b[[@abelar_s](https://twitter.com/abelar_s) - [@ParisRB](https://twitter.com/parisrb) - [@RailsGirlsParis](https://twitter.com/railsgirlsparis)] [maitre-du-monde.fr](maitre-du-monde.fr) .contrib[[Contribute to this talk on GitHub!](https://github.com/abelards/abelards.github.com/edit/master/talks/regexen.html)] --- class: center, middle, happy # Thanks! .b[[@abelar_s](https://twitter.com/abelar_s) - [@ParisRB](https://twitter.com/parisrb) - [@RailsGirlsParis](https://twitter.com/railsgirlsparis)] [maitre-du-monde.fr](maitre-du-monde.fr) .contrib[[Contribute to this talk on GitHub!](https://github.com/abelards/abelards.github.com/edit/master/talks/regexen.html)]