Table of contents
- Match Letters of the Alphabet
- Match Numbers and Letters of the Alphabet
- Match All Letters and Numbers
- Match Everything But Letters and Numbers
- Match All Numbers
- Match All Non-Numbers
- Match Whitespace
- Match Non-Whitespace Characters
- Check For Mixed Grouping of Characters
- Check for All or None
- Positive and Negative Lookahead
- Reuse Patterns Using Capture Groups
- Conclusion
In this final article of the "Understanding Regular Expressions: A Beginner's Guide" series, we will continue our exploration of patterns. The power of patterns lies in their ability to match a wide range of characters and sequences, allowing for flexible and dynamic searches. So, let us continue exploring the world of regular expressions.
Match Letters of the Alphabet
Inside a character set, you can define a range of characters to match using a hyphen character: -. To match lowercase letters a through e you would use [a-e].
let catStr = "cat";
let batStr = "bat";
let matStr = "mat";
let bgRegex = /[a-e]at/;
catStr.match(bgRegex);
// output:["cat"]
batStr.match(bgRegex);
// output:["bat"]
matStr.match(bgRegex);
// output: null
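To see the range at work on more than one word, here is a small sketch. The sample string and variable names are made up for illustration. With the g flag, .match() returns every substring where one letter from a through e is followed by at.

```javascript
// Hypothetical sample string used only for this illustration.
const rangeSample = "cat bat mat eat fat rat";
// [a-e] accepts exactly one of a, b, c, d, or e before "at".
const rangeRegex = /[a-e]at/g;
const rangeMatches = rangeSample.match(rangeRegex);
console.log(rangeMatches);
// output: ["cat", "bat", "eat"]
```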
Match Numbers and Letters of the Alphabet
Using the hyphen (-) to match a range of characters is not limited to letters. It also works to match a range of numbers.
For example, /[0-5]/ matches any single digit from 0 through 5, inclusive. It is also possible to combine a range of letters and a range of numbers in a single character set.
let nameStr = "Myname1234";
let myRegex = /[a-z0-9]/ig;
nameStr.match(myRegex);
//output: ["M", "y", "n", "a", "m", "e", "1", "2", "3", "4"]
let secondRegex = /[a-z0-9]/g;
nameStr.match(secondRegex);
//output: ["y", "n", "a", "m", "e", "1", "2", "3", "4"]
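One place a combined set can be handy is checking hexadecimal color codes, where each character is a digit or a letter from a to f. The following is only a sketch: isHexColor is a made-up helper name, and it also uses the ^ and $ anchors and the {6} quantifier to require exactly six such characters.

```javascript
// Hypothetical helper: true only for "#" followed by exactly six hex digits.
function isHexColor(value) {
  // [0-9a-f] combines a digit range and a letter range; the i flag ignores case.
  return /^#[0-9a-f]{6}$/i.test(value);
}

console.log(isHexColor("#1A2b3C")); // true
console.log(isHexColor("#12345G")); // false, G is outside a-f
console.log(isHexColor("#123"));    // false, too short
```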
Match All Letters and Numbers
The closest character class in JavaScript to match the alphabet is \w. This shortcut is equal to [A-Za-z0-9_]. This character class matches upper and lowercase letters plus numbers. Note, this character class also includes the underscore character (_). These shortcut character classes are also known as shorthand character classes.
let longHand = /[A-Za-z0-9_]+/;
let shortHand = /\w+/;
let numbers = "42";
let varNames = "important_var";
longHand.test(numbers);
//output: true
shortHand.test(numbers);
//output: true
longHand.test(varNames);
//output: true
shortHand.test(varNames);
//output: true
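It may also help to see what \w leaves out. In the sketch below (the sample string is invented), the hyphen, the space, and the accented é all fall outside [A-Za-z0-9_], so they split or end the matches.

```javascript
// \u00e9 is the accented letter é, written as an escape to avoid encoding issues.
const wordSample = "user_name-42 caf\u00e9";
// \w+ grabs runs of letters, digits, and underscores only.
const wordMatches = wordSample.match(/\w+/g);
console.log(wordMatches);
// output: ["user_name", "42", "caf"]
```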
Match Everything But Letters and Numbers
You can search for the opposite of \w with \W. Note, the opposite pattern uses a capital letter. This shortcut is the same as [^A-Za-z0-9_].
let longHand = /[^a-z0-9_]/gi;
let shortHand = /\W/;
let numbers = "42%";
let sentence = "Coding!";
numbers.match(shortHand);
//output: ['%']
sentence.match(shortHand);
//output: ['!']
numbers.match(longHand);
//output: ['%']
sentence.match(longHand);
//output: ['!']
Match All Numbers
The shortcut to look for digit characters is \d, with a lowercase d. This is equal to the character class [0-9], which looks for a single digit from zero to nine.
let numbers = "123456789#@$";
let longHand = /[0-9]/gi;
let shortHand = /\d/g;
numbers.match(shortHand);
//output: ['1', '2', '3', '4', '5', '6', '7', '8', '9']
numbers.match(longHand);
//output: ['1', '2', '3', '4', '5', '6', '7', '8', '9']
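A common use of \d is pulling numbers out of free text. This sketch pairs it with the + quantifier so that consecutive digits stay together. The order string and variable names are invented for the example.

```javascript
// Hypothetical input text.
const orderText = "Order 66 shipped 3 items in 12 days";
// \d+ matches each run of one or more digits.
const digitRuns = orderText.match(/\d+/g);
console.log(digitRuns); // ["66", "3", "12"]

// The matches are strings; Number converts them when arithmetic is needed.
const asNumbers = digitRuns.map(Number);
console.log(asNumbers); // [66, 3, 12]
```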
Match All Non-Numbers
The shortcut to look for non-digit characters is \D. This is equal to the character class [^0-9], which looks for a single character that is not a digit from zero to nine.
let numbers = "123456789#@$";
let longHand = /[^0-9]/gi;
let shortHand = /\D/g;
numbers.match(shortHand);
//output: ['#', '@', '$']
numbers.match(longHand);
//output: ['#', '@', '$']
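\D is convenient for stripping formatting from input that should contain only digits. The sketch below uses the standard String.prototype.replace method on a made-up phone number.

```javascript
// Hypothetical user input with punctuation and spaces.
const rawPhone = "(555) 123-4567";
// Replace every non-digit character with an empty string.
const digitsOnly = rawPhone.replace(/\D/g, "");
console.log(digitsOnly); // "5551234567"
```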
Match Whitespace
You can search for whitespace using \s, which is a lowercase s. This pattern matches not only the space character, but also carriage return, tab, form feed, vertical tab, and new line characters. You can think of it as similar to the character class [ \r\t\f\n\v].
let whiteSpace = "Whitespace. Whitespace everywhere!";
let spaceRegex = /\s/g;
whiteSpace.match(spaceRegex);
// output: [' ', ' ']
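The example above only contains plain spaces, so here is a short sketch with a tab and a new line as well. The sample string is invented. It also shows one typical use, splitting text into words with String.prototype.split.

```javascript
// Hypothetical string containing a tab, a new line, and a space.
const mixedSpace = "one\ttwo\nthree four";
const spaceHits = mixedSpace.match(/\s/g);
console.log(spaceHits); // ["\t", "\n", " "]

// \s+ treats any run of whitespace as a single separator.
const splitWords = mixedSpace.split(/\s+/);
console.log(splitWords); // ["one", "two", "three", "four"]
```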
Match Non-Whitespace Characters
Search for non-whitespace using \S, which is an uppercase S. This pattern will not match space, carriage return, tab, form feed, vertical tab, or new line characters. You can think of it as being similar to the character class [^ \r\t\f\n\v].
let whiteSpace = "Whitespace. Whitespace everywhere!";
let nonSpaceRegex = /\S/g;
whiteSpace.match(nonSpaceRegex).length;
//output: 32
whiteSpace.match(nonSpaceRegex);
/*output: ["W", "h", "i", "t", "e", "s", "p", "a", "c", "e", ".", "W", "h", "i", "t", "e", "s", "p", "a", "c", "e", "e", "v", "e", "r", "y", "w", "h", "e", "r", "e", "!"] */
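Matching one character at a time is rarely what you want in practice. As a sketch, adding the + quantifier to \S collects each run of non-whitespace characters, which yields the space-separated chunks of the same sentence. The variable names here are made up.

```javascript
const spacedText = "Whitespace. Whitespace everywhere!";
// \S+ matches one or more consecutive non-whitespace characters.
const chunks = spacedText.match(/\S+/g);
console.log(chunks);
// output: ["Whitespace.", "Whitespace", "everywhere!"]
```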
Check For Mixed Grouping of Characters
Sometimes we want to check for groups of characters using a regular expression. To achieve that, we wrap the alternatives in parentheses, (), and separate them with the alternation operator, |.
let testStr = "Pumpkin";
let testRegex = /P(engu|umpk)in/;
testRegex.test(testStr);
//output: true
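To make the behavior of the group a little more concrete, this sketch runs the same pattern against a few invented words and then inspects the result of .match(), where index 1 holds whichever alternative was matched inside the parentheses.

```javascript
const groupRegex = /P(engu|umpk)in/;
console.log(groupRegex.test("Penguin")); // true
console.log(groupRegex.test("Pumpkin")); // true
console.log(groupRegex.test("Puffin"));  // false, "uffi" is not one of the alternatives

const groupResult = "Penguin".match(groupRegex);
console.log(groupResult[0]); // "Penguin", the full match
console.log(groupResult[1]); // "engu", the text matched by the group
```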
Check for All or None
You can specify the possible existence of an element with a question mark, ?. This checks for zero or one of the preceding elements. You can think of this symbol as saying the previous element is optional.
let american = "color";
let british = "colour";
let rainbowRegex = /colou?r/;
rainbowRegex.test(american);
//output: true
rainbowRegex.test(british);
//output: true
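Another everyday case for an optional element is a URL scheme, where the s in https may or may not be present. This is a sketch with made-up URLs. The ^ anchor and the escaped slashes are additions that the example above does not use.

```javascript
// s? makes the "s" optional, so both http:// and https:// are accepted.
const schemeRegex = /^https?:\/\//;
console.log(schemeRegex.test("http://example.com"));  // true
console.log(schemeRegex.test("https://example.com")); // true
console.log(schemeRegex.test("ftp://example.com"));   // false
```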
Positive and Negative Lookahead
Lookaheads are patterns that tell JavaScript to look ahead in your string to check for patterns further along. This can be useful when searching for multiple patterns over the same string.
There are two kinds of lookaheads: positive lookahead and negative lookahead.
A positive lookahead will look to make sure the element in the search pattern is there, but won't match it. A positive lookahead is used as (?=...) where the ... is the required part that is not matched.
On the other hand, a negative lookahead will look to make sure the element in the search pattern is not there. A negative lookahead is used as (?!...) where the ... is the pattern that you do not want to be there. The rest of the pattern is returned if the negative lookahead part is not present.
let quit = "qu";
let noquit = "qt";
let quRegex = /q(?=u)/;
let qRegex = /q(?!u)/;
quit.match(quRegex);
//output: ['q']
quit.match(qRegex);
//output: null
noquit.match(quRegex);
//output: null
noquit.match(qRegex);
//output: ['q']
A more practical use of lookaheads is to check two or more patterns in one string. Here is a (naively) simple password checker that looks ahead for 3 to 6 word characters and for at least one digit:
let password = "abc123";
let checkPass = /(?=\w{3,6})(?=\D*\d)/;
checkPass.test(password);
//output: true
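Because the pattern above is not anchored, it also accepts strings longer than six characters. One possible tightening, offered as a sketch rather than a recommended password policy, anchors the pattern with ^ and puts $ inside the first lookahead so that the whole string must be 3 to 6 word characters. The names looseCheck and strictPass are made up here.

```javascript
const looseCheck = /(?=\w{3,6})(?=\D*\d)/;   // same pattern as checkPass above
const strictPass = /^(?=\w{3,6}$)(?=\D*\d)/; // sketch: anchored variant

console.log(looseCheck.test("abcdefgh1")); // true, nine characters slip through
console.log(strictPass.test("abcdefgh1")); // false, longer than six characters
console.log(strictPass.test("abc123"));    // true
console.log(strictPass.test("abcdef"));    // false, no digit
console.log(strictPass.test("a1"));        // false, shorter than three characters
```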
Reuse Patterns Using Capture Groups
You can search for repeated substrings using capture groups. Parentheses, ( and ), are used to find repeated substrings. You put the regex of the pattern that will repeat in between the parentheses.
To specify where that repeated string will appear, you use a backslash (\) and then a number. This number starts at 1 and increases with each additional capture group you use. An example would be \1 to match the first group.
Using the .match() method on a string will return an array with the string it matches, followed by the text captured by each capture group.
let repeatStr = "regex regex";
let repeatRegex = /(\w+)\s\1/;
repeatRegex.test(repeatStr);
//output: true
repeatStr.match(repeatRegex);
//output: ["regex regex", "regex"]
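A backreference can be used more than once in the same pattern. As a sketch with invented input strings, the pattern below accepts only a number repeated exactly three times, separated by single whitespace characters. The ^ and $ anchors are an addition that keeps a fourth repetition from matching.

```javascript
// (\d+) captures a number; each \1 requires that same number again.
const tripleRegex = /^(\d+)\s\1\s\1$/;
console.log(tripleRegex.test("42 42 42"));    // true
console.log(tripleRegex.test("42 42 7"));     // false, the third number differs
console.log(tripleRegex.test("42 42 42 42")); // false, four repetitions

const tripleMatch = "42 42 42".match(tripleRegex);
console.log(tripleMatch[0]); // "42 42 42", the full match
console.log(tripleMatch[1]); // "42", the captured group
```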
Conclusion
In Part 1, we explored what regular expressions are, how to create them, and which regex methods are available. In Part 2, we focused on flags and how they can be used to modify the search behavior of a given pattern. In Part 3, we explored general patterns, gained a better understanding of how they work, and looked at how to specify the number of matches to achieve the desired results. In this final part, we continued our exploration of patterns with more practical examples.
Thank you for your time. I hope you found it useful. ❤️
If you enjoyed this article and want to be the first to know when I post a new one, you can follow me on Twitter or here at Habiba Wael. I can't wait to hear your thoughts about the whole series and this article in particular and feel free to add anything that I may have missed.