- 28 Sep 2022
Guide to REGEX() functions
- Updated on 28 Sep 2022
Our formula field reference gives a basic overview of the REGEX() functions that Airtable supports. There are currently three: REGEX_MATCH(), REGEX_EXTRACT(), and REGEX_REPLACE(). Each has advantages and disadvantages depending on the output you expect from your formula field.
Note that this article does not cover regular expression patterns in depth. Many outside resources provide great deep dives into regex and pattern matching (like regex101 or the MDN Regex cheatsheet), and we recommend checking them out for more detail.
Introduction
You may be wondering what a regular expression is; you can think of a regular expression as an arrangement of symbols and characters conveying a string or pattern to be searched for within a larger body of text. The result is something similar to a search engine.
In Airtable, we offer the find bar (Ctrl F or Ctrl G) as well as the Search extension to find literal instances of strings or patterns. Additionally, we offer the FIND() and SEARCH() functions, as well as other text extraction functions, in the Text operators and functions section of the formula field reference.
However, the REGEX functions that follow provide a way to search the information in your tables as well as the ability to work with that information in powerful ways. Note that our REGEX functions only work with text strings. If you are using either a rollup or lookup field, you may need to first convert the data into a string before you can use a REGEX function on it (you could use ARRAYJOIN() for this). Also note that rollup, lookup, and formula fields do not currently support rich text.
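As a minimal sketch of that conversion, assume a hypothetical lookup field named {Project Tags} (the field name and the "urgent" pattern are illustrative only). The formula joins the looked-up values into one comma-separated string and then tests that string:

```
REGEX_MATCH(
  ARRAYJOIN({Project Tags}, ", "),
  "urgent"
)
```

The regex function only ever sees the joined string, so keep the separator in mind if your pattern should match a single value rather than text that spans two of them.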
REGEX_MATCH
The REGEX_MATCH() function returns whether the input text matches a regular expression, output as a Boolean value: 1 (true) or 0 (false). Two places where this is useful are phone number validation and email validation within your base.
If you need to validate a set of phone numbers against specific criteria, you can use REGEX_MATCH() like in the formula below.
REGEX_MATCH({Possible Phone Number}, '^([+]?[0-9]( |-)?)?(\(?[0-9]{3}\)?|[0-9]{3})( |-)?([0-9]{3}( |-)?[0-9]{4}|[a-zA-Z0-9]{7})$')
This regular expression handles a number of different phone number formats:
- Country codes
- Using dashes or spaces as delimiters
- Parentheses around the area code
- Using letters instead of numbers
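Because the result is 1 or 0, it can be passed straight to IF(). The sketch below assumes the validation formula above is saved in its own formula field named {Is Valid Phone Number?}. The "Valid" and "Needs review" labels are placeholders:

```
IF(
  {Is Valid Phone Number?},
  "Valid",
  "Needs review"
)
```

Keeping the check in its own field means other formulas can reuse the result without repeating the long pattern.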
Similarly, you can use REGEX_MATCH() to validate a list of email addresses.
REGEX_MATCH({Email address}, "(\W|^)[\w.\-]{0,25}@[A-Za-z0-9.-]+\.[A-Za-z]{2,}(\W|$)")
This regex handles basic validation for email addresses:
- A username of up to 25 characters, composed of letters, numbers, or characters like underscore, period, or dash
- An at-sign (@)
- A domain name, which is a string of letters, numbers, periods, or dashes, followed by a period, followed by a top-level domain of at least two letters
Email validation is a very complicated subject, but this regex is a good starting point for most use cases!
REGEX_EXTRACT
The REGEX_EXTRACT() function returns the first substring that matches a regular expression. If it finds no match for the provided regular expression, it returns an error.
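One way to avoid showing that error is to test the same pattern with REGEX_MATCH() first and only extract when it matches. This is a sketch rather than the only approach. The {Notes} field, the digit pattern, and the fallback text are hypothetical:

```
IF(
  REGEX_MATCH({Notes}, "[0-9]+"),
  REGEX_EXTRACT({Notes}, "[0-9]+"),
  "No number found"
)
```

Here the formula returns the first run of digits in {Notes} when one exists, and the fallback text otherwise.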
In the example below, this function is used to extract the domain name from a URL field.
REGEX_EXTRACT({URL}, '^(?:https?:\/\/)?(?:[^@\n]+@)?(?:www\.)?([^:\/\n?]+)')
A URL may start with http:// or https://, optionally followed by a username and an @ symbol, then the domain, and then the rest of the URL. Here, we strip off the subdomain if it’s ‘www’, but otherwise preserve it.
REGEX_REPLACE
The REGEX_REPLACE() function substitutes all matching substrings with a replacement string value. Building on the phone number example above, if we have a valid phone number, we can normalize it using REGEX_REPLACE() to make it easier to work with in the rest of our Airtable base.
IF({Is Valid Phone Number?}, UPPER(REGEX_REPLACE({Possible Phone Number}, '[^A-Za-z0-9]', '')), ERROR('Invalid phone number'))
If the phone number passed validation, we normalize it by using REGEX_REPLACE() to replace non-alphanumeric characters with an empty string, leaving just the dialable digits and letters. We also use the UPPER() function to make the casing consistent. Normalizing phone numbers into a prettier format is an exercise left to the reader!
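As one possible starting point for that exercise, the sketch below reformats a normalized 10-digit number. It makes two assumptions that are not stated in this article: the normalized value is stored in a hypothetical field named {Normalized Phone Number}, and the replacement string can refer to capture groups as $1, $2, and $3.

```
REGEX_REPLACE(
  {Normalized Phone Number},
  '^([0-9]{3})([0-9]{3})([0-9]{4})$',
  '($1) $2-$3'
)
```

Under those assumptions, a value such as 5551234567 would be rewritten as (555) 123-4567. Values that include a country code or letters do not match the pattern and would be left as they are.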