Beginner's Guide to Regular Expressions
03 Dec, 2018
Regular Expressions (RegEx) are an extremely powerful tool that is available in most popular languages in use today. They can be used to find, replace and match text using a specialised instruction set. At first a pattern might look like a garbled mess of characters, but there is, in fact, meaning behind the madness. This tutorial goes over the basic syntax and operators used in RegEx, and then shows how to use them in JavaScript.
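Because RegEx syntax is built mostly from ordinary single characters on the keyboard, many of those characters carry a special meaning. To search for one of them literally, escape it with a backslash "\" so that the parser doesn't get confused. In a JavaScript regex literal, for example, the forward slashes in https:// must be escaped because "/" also marks the start and end of the pattern. The colon has no special meaning, so it can stay as it is.
"https://" --> "https:\/\/"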
Logical Operators and Basic Syntax: Useful instructions to tell the program how you want to search.
"a|b" // a [Or] b
"whats (down|up)" // Match "whats down" [Or] "whats up" and capture "down" or "up"
"a*" // 0 or more repetitions of a
"a+" // 1 or more repetitions of a
"a{2,10}" // 2 to 10 repetitions of a
"ca?t" // Match "cat" or "ct" (the "a" is optional)
"B(?:at|og)" // Match "Bat" or "Bog", but don't capture the group
"^www\." // Match string that starts with "www." (the dot is escaped so it is matched literally)
"\.png$" // Match string that ends with ".png"
"." // Represents any character (except a line break, by default)
Character Classes: A list of characters that can be matched.
"[AEIOU]" // Match one of the characters in the brackets
"[a-z]" // Match one character in the lower case range "a" to "z"
"[a-zA-Z0-9_]" // Match one letter, digit or underscore
"[^a-g]" // Match any one character that is not in the range "a" to "g"
// Used in conjunction with Logical Operators:
"[AEIOU]+" // Match one or more uppercase vowels
Short-Hand Character Classes: Some of the more common character classes have a shorthand version built in.
"\d" // One digit, same as "[0-9]"
"\D" // Not a digit, same as "[^0-9]"
"\w" // One word character (letter, digit or underscore), same as "[a-zA-Z0-9_]"
"\W" // Not a word character, same as "[^a-zA-Z0-9_]"
"\s" // White space
"\S" // Not white space
"Putting it all together" - Utilising the above syntax, you are able to accomplish just about any operation on a string you can think of. Heres a relatively basic example in javascript of how RegEx might be used to only filter out unwanted file types.
// Allow only png, jpg and jpeg
const file1 = 'dog.png';
const file2 = 'bird.jpg';
const file3 = 'cat.gif';
console.log(file1.match(/[\w-]+\.(jpe?g|png)/));
// returns an array: ['dog.png', 'png'] (the full match, then the captured extension)
console.log(file2.match(/[\w-]+\.(jpe?g|png)/));
// returns an array: ['bird.jpg', 'jpg']
console.log(file3.match(/[\w-]+\.(jpe?g|png)/));
// returns null
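As a variation on this filter, here is a minimal sketch that assumes the whole file name should be validated, not just a part of it. Anchoring the pattern with ^ and $ rejects names such as photo.png.exe, and the "i" flag accepts uppercase extensions. The names uploads, allowedImage, accepted and renamed are hypothetical.

```javascript
// Hypothetical list of uploaded file names.
const uploads = ["dog.png", "bird.JPG", "cat.gif", "photo.png.exe", "holiday-pic.jpeg"];

// ^ and $ force the whole name to match; the "i" flag ignores case.
const allowedImage = /^[\w-]+\.(jpe?g|png)$/i;

// test() returns true or false, which makes it a good fit for filter().
const accepted = uploads.filter((name) => allowedImage.test(name));
console.log(accepted); // ["dog.png", "bird.JPG", "holiday-pic.jpeg"]

// replace() can reuse a capture group in the replacement text via $1.
const renamed = accepted.map((name) => name.replace(/\.(jpe?g|png)$/i, ".backup.$1"));
console.log(renamed); // ["dog.backup.png", "bird.backup.JPG", "holiday-pic.backup.jpeg"]
```

Using test() instead of match() is a small design choice: it avoids building a result array when only a yes or no answer is needed.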
I definitely recommend playing around with these operators yourself for a while to really get a feel for how they work. A great site for that is regex101.com.