- Positive lookahead assertions are ‘zero-width’ assertions: they check that text matching the assertion pattern is found in the input string ahead of the current position, but they do not consume any input characters
- Once a positive lookahead assertion has matched from the current position in the input string, the regex engine returns to the current position and tries to match the remaining parts of the regular expression
- You can specify a positive lookahead assertion in your regular expression by placing the assertion pattern between `(?=` and `)`
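The zero-width behaviour described above can be seen in a minimal sketch. The sample string and variable names are hypothetical, chosen only for illustration.

```javascript
// Hypothetical sample input, used only to illustrate the idea.
const input = "foobar foobaz";

// (?=bar) requires "bar" to follow "foo" but is not part of the match.
const withLookahead = input.match(/foo(?=bar)/);
console.log(withLookahead[0]);    // foo
console.log(withLookahead.index); // 0

// The lookahead consumes nothing, so the next token (\w+) starts
// matching at the same position where the lookahead was checked.
const followed = input.match(/foo(?=bar)\w+/);
console.log(followed[0]); // foobar

// Only the first "foo" is followed by "bar"; the second is followed by "baz".
const all = [...input.matchAll(/foo(?=bar)/g)].map((m) => m.index);
console.log(all); // [ 0 ]
```

Note that the second pattern matches `bar` even though the lookahead has already inspected it, which is what "does not consume any input characters" means in practice.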
A Simple Lookahead Assertion
- The following JavaScript code extracts the Windows drive letter from a file name like
E:\Users\johndoe\Documents\accounts.pdf
// A single letter, only if it is immediately followed by ":\"
let wordPttn = /[a-zA-Z](?=\:\\)/;
// Backslashes are doubled because "\" is the escape character in string literals
let targetString = "E:\\Users\\johndoe\\Documents\\accounts.pdf";
let result = targetString.match(wordPttn);
console.log(result !== null
  ? `Pattern ${wordPttn} FOUND in '${targetString}'. Matched string is '${result[0]}'`
  : `Pattern ${wordPttn} NOT FOUND in '${targetString}'`);
- Let’s analyze the pattern `/[a-zA-Z](?=\:\\)/` bit by bit
- The first part, `[a-zA-Z]`, matches any single letter, which is how Windows drives are named
- In a file name, the drive is followed by `:\`, as in `c:\pictures\goldengate.png`
- We want to make sure that the drive name is followed by `:\` but not include it in the match returned by the regular expression. We only want the drive letter
- So we add a positive lookahead assertion between `(?=` and `)`, giving `/[a-zA-Z](?=\:\\)/`
- The pattern that should be matched inside the assertion is `\:\\`
- It tells the regex engine that from the current location in the input string, that is, after `[a-zA-Z]` has matched the drive `E` in `E:\Users\johndoe\Documents\accounts.pdf`, the pattern `\:\\` should be matched, but not included in the result
- So the pattern `\:\\` will match the `:\` in `E:\Users\johndoe\Documents\accounts.pdf`. Note that both the colon (`:`) and the backslash (`\`) are escaped with backslashes. The backslash must be escaped, while escaping the colon is optional here
- The backslashes in targetString in the code are also escaped with backslashes: `"E:\\Users\\johndoe\\Documents\\accounts.pdf"`
- Finally, the matched string returned by the regex is the drive letter `E`
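To make the difference concrete, the sketch below compares the lookahead pattern with a version that consumes `:\`. The variable names and the relative path are assumptions added for illustration. The colon is left unescaped here, which matches the same text.

```javascript
const path = "E:\\Users\\johndoe\\Documents\\accounts.pdf";

// Consuming version: ":\" becomes part of the returned match.
const consumed = path.match(/[a-zA-Z]:\\/);
console.log(consumed[0]); // E:\

// Lookahead version: ":\" must follow, but only the letter is returned.
const asserted = path.match(/[a-zA-Z](?=:\\)/);
console.log(asserted[0]); // E

// Hypothetical path without a drive prefix: no letter is followed by ":\",
// so the lookahead fails at every position and match() returns null.
const relative = "Users\\johndoe\\accounts.pdf";
console.log(relative.match(/[a-zA-Z](?=:\\)/)); // null
```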
Example 2
- You want to make sure that a password contains at least one digit and one uppercase letter, uses only the allowed characters `a-z`, `A-Z`, `0-9` and the special characters `!@#$%^&_`, and is between 8 and 20 characters long
- You can use the following JavaScript code to achieve it
let wordPttn = /(?=[\w!@#$%^&_]*[A-Z])(?=[\w!@#$%^&_]*\d)[\w!@#$%^&_]{8,20}/;

// Contains an uppercase letter and a digit: FOUND
let targetString = "pas1sWo!@rd";
let result = targetString.match(wordPttn);
console.log(result !== null
  ? `Pattern ${wordPttn} FOUND in '${targetString}'. Matched string is '${result[0]}'`
  : `Pattern ${wordPttn} NOT FOUND in '${targetString}'`);

// No uppercase letter: NOT FOUND
targetString = "pas1swo!@rd";
result = targetString.match(wordPttn);
console.log(result !== null
  ? `Pattern ${wordPttn} FOUND in '${targetString}'. Matched string is '${result[0]}'`
  : `Pattern ${wordPttn} NOT FOUND in '${targetString}'`);

// No digit: NOT FOUND
targetString = "passWo!@rd";
result = targetString.match(wordPttn);
console.log(result !== null
  ? `Pattern ${wordPttn} FOUND in '${targetString}'. Matched string is '${result[0]}'`
  : `Pattern ${wordPttn} NOT FOUND in '${targetString}'`);
- Let’s analyze the pattern we use: `/(?=[\w!@#$%^&_]*[A-Z])(?=[\w!@#$%^&_]*\d)[\w!@#$%^&_]{8,20}/`
- The first positive lookahead assertion `(?=[\w!@#$%^&_]*[A-Z])` states that the string ahead must consist of 0 or more valid characters followed by an uppercase letter, so at least one uppercase letter is present.
  Note that since the lookahead is at the very beginning of the regular expression, it looks for the lookahead pattern right from the first character.
  Also note that `a-z`, `A-Z`, `0-9` and `_` are together covered by the predefined character class `\w`
- The second positive lookahead assertion `(?=[\w!@#$%^&_]*\d)` is very similar to the first one, except that after 0 or more valid characters there must be a digit.
  Note that the second lookahead pattern also starts matching from index 0, because the first lookahead assertion does not consume any characters: all lookaround (lookahead and lookbehind) assertions are zero-width assertions.
  Also note that you can have multiple lookahead assertions in a single regex, and they can be at the beginning, at the end or in the middle of the regex
- Finally, the pattern `[\w!@#$%^&_]{8,20}` states that the match must contain only characters from the specified character class and must be 8 to 20 characters long. Because the pattern is not anchored with `^` and `$`, it matches a qualifying run of characters anywhere in the input, so add the anchors when the whole string must be a valid password
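The pattern above is not anchored, so it can match a valid-looking run inside a longer or partly invalid string. The sketch below contrasts it with an anchored variant. The `strict` pattern, the `isValidPassword` helper and the extra test strings are assumptions added for illustration, not part of the original example.

```javascript
// Pattern from the example: may match a substring of the input.
const loose = /(?=[\w!@#$%^&_]*[A-Z])(?=[\w!@#$%^&_]*\d)[\w!@#$%^&_]{8,20}/;

// Assumed variant: ^ and $ force the whole string to satisfy the rules.
const strict = /^(?=[\w!@#$%^&_]*[A-Z])(?=[\w!@#$%^&_]*\d)[\w!@#$%^&_]{8,20}$/;

// Hypothetical helper wrapping the anchored pattern.
function isValidPassword(candidate) {
  return strict.test(candidate);
}

const tooLong = "pas1sWo!@rdpas1sWo!@rd123"; // 25 allowed characters
const withSpace = "pas1sWo!@rd ok";          // contains a disallowed space

console.log(loose.test(tooLong));            // true  (first 20 characters match)
console.log(isValidPassword(tooLong));       // false (longer than 20)
console.log(loose.test(withSpace));          // true  (first 11 characters match)
console.log(isValidPassword(withSpace));     // false (space is not allowed)
console.log(isValidPassword("pas1sWo!@rd")); // true
console.log(isValidPassword("pas1swo!@rd")); // false (no uppercase letter)
console.log(isValidPassword("passWo!@rd"));  // false (no digit)
```

The lookaheads are unchanged. Only the anchors differ, which is usually what you want when validating a complete input field rather than searching within text.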
Example 3
- You can use backreferences in lookahead assertions too. For example, to match 3 lowercase letters followed by 3 digits followed by 7 lowercase letters that must contain the 3-letter string matched earlier
// \1 refers back to the 3 letters captured by ([a-z]{3})
let wordPttn = /([a-z]{3})\d{3}(?=[a-z]*\1[a-z]*)[a-z]{7}/;

// "erydhcp" contains the captured "ydh": FOUND
let targetString = "ydh733erydhcp";
let result = targetString.match(wordPttn);
console.log(result !== null
  ? `Pattern ${wordPttn} FOUND in '${targetString}'. Matched string is '${result[0]}'`
  : `Pattern ${wordPttn} NOT FOUND in '${targetString}'`);

// "eryghcp" does not contain "ydh": NOT FOUND
targetString = "ydh733eryghcp";
result = targetString.match(wordPttn);
console.log(result !== null
  ? `Pattern ${wordPttn} FOUND in '${targetString}'. Matched string is '${result[0]}'`
  : `Pattern ${wordPttn} NOT FOUND in '${targetString}'`);
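One subtlety is worth a sketch: the lookahead `[a-z]*\1` may find the captured letters beyond the 7 letters that are finally matched. The `bounded` pattern and the longer test string below are assumptions added for illustration. Limiting the prefix to `{0,4}` keeps the 3-letter backreference inside the 7-letter window.

```javascript
// Pattern from the example above.
const original = /([a-z]{3})\d{3}(?=[a-z]*\1[a-z]*)[a-z]{7}/;

// Assumed variant: the backreference must start within the first 5 of the
// 7 letters, so all 3 of its letters fall inside the matched window.
const bounded = /([a-z]{3})\d{3}(?=[a-z]{0,4}\1)[a-z]{7}/;

// Hypothetical input where "ydh" reappears only after the 7-letter window.
const longer = "ydh733erxxxxxydh";

const looseResult = longer.match(original);
console.log(looseResult[0]); // ydh733erxxxxx  (the 7 letters do not contain "ydh")
console.log(looseResult[1]); // ydh            (the captured group)

console.log(longer.match(bounded)); // null

// The bounded variant still accepts the input from the example.
console.log("ydh733erydhcp".match(bounded)[0]); // ydh733erydhcp
```

Whether the stricter form is needed depends on whether the input can continue with more letters after the 7 that are matched.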
- You can read about negative lookahead assertions here