FsRegEx


Tutorial: F# Verbal Expressions

The VerbalExpressions module includes the FsRegEx type, which wraps the familiar .NET Regex class in a type with useful functional members. Several of its constructors take a regular expression pattern as their first argument.

let fsRegEx = new FsRegEx(@"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}")

The module is also an experimental DSL that lets you compose regular expressions in natural language using the immutable FsRegEx type. The remainder of this tutorial covers that experimental DSL which is, frankly, not that practical.

For practical examples of using the core FsRegEx module for composability, see the following examples:

Verbal Expressions DSL

You can compose FsRegEx values with the |> operator, for example by creating a new regular expression as the logical or of two existing FsRegEx values.

#r "FsRegEx.dll"
open FsRegEx
open System.Text.RegularExpressions
open VerbalExpressions

let v =
    CommonFsRegEx.Email
    |> fsRegExOrFsRegEx RegexOptions.None CommonFsRegEx.Url

let foundEmail =
    v
    |> isMatch "test@github.com"

let foundUrl =
    v
    |> isMatch "http://www.google.com"

printfn "%b" foundEmail
printfn "%b" foundUrl

// true
// true
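
For comparison, the same "either pattern" idea can be sketched with the plain .NET Regex class. This is only an illustration of alternation, not how fsRegExOrFsRegEx is implemented; orPatterns and urlPattern are hypothetical names, and the URL pattern is deliberately simplified.

```fsharp
open System.Text.RegularExpressions

// Hypothetical helper, not part of FsRegEx: join two patterns with alternation.
// Each side is wrapped in a non-capturing group so that | applies to the whole pattern.
let orPatterns (left: string) (right: string) =
    Regex(sprintf "(?:%s)|(?:%s)" left right)

// The email pattern from the start of this tutorial, and a simplified URL stand-in.
let emailPattern = @"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}"
let urlPattern = @"https?://[A-Za-z0-9.-]+"

let either = orPatterns emailPattern urlPattern

printfn "%b" (either.IsMatch "test@github.com")
printfn "%b" (either.IsMatch "http://www.google.com")
printfn "%b" (either.IsMatch "no match here")

// true
// true
// false
```

Wrapping each side in (?:...) matters because | has the lowest precedence in a regular expression; without the groups, anchors or quantifiers added later could bind to only one branch.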

Natural language composition consists of building a new FsRegEx from an existing one with functions that append special characters, groups, modifiers, and other elements of the regular expression language.

function : 'T -> FsRegEx -> FsRegEx

See the API documentation for all the regular expression functions available.
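
To see why this signature composes well with |>, here is a minimal sketch built on a hypothetical Pattern record. It is not the real FsRegEx type, and appendRaw, thenLiteral, and matchesInput are made-up names; the point is only that each function takes its own argument first and the pattern last.

```fsharp
open System.Text.RegularExpressions

// Hypothetical stand-in for an immutable pattern wrapper; not the real FsRegEx type.
type Pattern = { Source: string }

let empty = { Source = "" }

// Argument first, pattern last: this is the 'T -> Pattern -> Pattern shape,
// so a partially applied function fits on the right-hand side of |>.
let appendRaw (fragment: string) (p: Pattern) =
    { Source = p.Source + fragment }

// Appends literal text, escaping any regex metacharacters it contains.
let thenLiteral (text: string) (p: Pattern) =
    appendRaw (Regex.Escape text) p

let matchesInput (input: string) (p: Pattern) =
    Regex.IsMatch(input, p.Source)

let endsWithGithub =
    empty
    |> appendRaw "^"
    |> appendRaw ".+"
    |> thenLiteral "github.com"
    |> appendRaw "$"

printfn "%s" endsWithGithub.Source
printfn "%b" (endsWithGithub |> matchesInput "test@github.com")

// ^.+github\.com$
// true
```

Because every step returns a new value instead of mutating the old one, intermediate patterns can be reused safely as starting points for other pipelines.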

open VerbalExpressions

let foundFromGithub =
    FsRegEx()
    |> startOfLine
    |> something
    |> then' "github.com"
    |> endOfLine
    |> isMatch "test@github.com"

printfn "%b" foundFromGithub

// true

You do not have to worry about escaping special characters in your regular expression.

let foundSomethingSpecial =
    FsRegEx()
    |> startOfLine
    |> something
    |> then' "*+?"
    |> anything
    |> isMatch "blah blah blah*+?yackety yack"

printfn "%b" foundSomethingSpecial

// true
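
With the plain .NET API you would have to do this escaping yourself. The sketch below uses Regex.Escape to show the equivalent manual step; whether then' uses Regex.Escape internally is an assumption, not something this tutorial states.

```fsharp
open System.Text.RegularExpressions

// The literal text we want to find; every character is a regex metacharacter.
let literal = "*+?"

// Regex.Escape prefixes each metacharacter with a backslash.
let escaped = Regex.Escape literal
printfn "%s" escaped

// Build roughly the same pattern by hand: start of line, something, the literal, anything.
let pattern = "^.+" + escaped + ".*"
printfn "%b" (Regex.IsMatch("blah blah blah*+?yackety yack", pattern))

// \*\+\?
// true
```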

Sometimes you may need more power than the natural language provides, or you just need to include a snippet of native regular expression. The add function lets you do that.

let foundSpecialInMultiline =
    FsRegEx()
    |> add @"phrase1\*\+\?"
    |> anything
    |> isMatch @"phrase1*+?RestOfLine\n"
    
printfn "%b" foundSpecialInMultiline

// true

FsRegEx values possess all the power of the .NET Regex class in a composable form.

let n =
    FsRegEx()
    |> word
    |> matches "three words here"

printfn "%i" n.Length

// 3

let betterFormat =
    FsRegEx()
    |> add @"\s+"
    |> or' "whitespace"
    |> replace "This     is   text with   far  too   much   whitespace" " "

printfn "%s" betterFormat

// This is text with far too much  

let groupName =  "GroupNumber"
 
FsRegEx()
|> add "COD"
|> beginCaptureNamed groupName
|> any "0-9"
|> repeatPrevious 3
|> endCapture
|> then' "END"
|> capture "COD123END" groupName
|> printfn "%s"

// 123
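
For readers more familiar with raw patterns, the named capture above corresponds roughly to the following plain .NET code. The exact pattern string the DSL generates is an assumption here; this is a hand-written equivalent.

```fsharp
open System.Text.RegularExpressions

// (?<GroupNumber>...) is the .NET syntax for a named capture group.
let m = Regex.Match("COD123END", @"COD(?<GroupNumber>[0-9]{3})END")

printfn "%b" m.Success
printfn "%s" m.Groups.["GroupNumber"].Value

// true
// 123
```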

FsRegEx comes with first-class support for Unicode, including Unicode general categories and .NET named blocks.

FsRegEx()
|> beginCaptureNamed "upper"
|> unicodeCategory Unicode.UnicodeGeneralCategory.LetterUppercase
|> add "+"
|> endCapture
|> capture "some mixed case WORDS" "upper"
|> printfn "%s"

// WORDS
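
Underneath, .NET exposes Unicode general categories and named blocks through the \p{...} escape. The following sketch uses the plain Regex class to show both forms; it assumes, without confirmation from this tutorial, that unicodeCategory emits something like \p{Lu}.

```fsharp
open System.Text.RegularExpressions

// \p{Lu} matches any character in the "Letter, Uppercase" general category.
let m = Regex.Match("some mixed case WORDS", @"(?<upper>\p{Lu}+)")
printfn "%s" m.Groups.["upper"].Value

// The category is not limited to ASCII: it also covers uppercase Greek letters.
let greek = Regex.Match("αβγ ΔΕΖ", @"\p{Lu}+")
printfn "%d" greek.Value.Length

// \p{IsGreek} is a .NET named block covering U+0370 to U+03FF.
printfn "%b" (Regex.IsMatch("Δ", @"\p{IsGreek}"))

// WORDS
// 3
// true
```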