Regular Expressions

Posted on 2018-07-04, updated on 2018-08-07, category: FreeCodeCamp

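The regex section does not have many exercises, but I still got stuck for quite a while and used several more pomodoros than planned. Mostly this was because a few things I had never really understood slowed me down, and because an exercise only passes if the solution uses exactly the approach the tests check for; otherwise the code works fine in the editor but the tests keep failing.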

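The old version of the regex section only covered the basics, such as \s, \d, and [a-z], with just four or five exercises. The new version adds a lot of harder material and covers much more ground. I have also folded in what I learned about regex elsewhere, as a review.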

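Ways to write a regular expression
- The exercises only use the literal notation, which wraps the pattern in slashes: **/pattern/flags**, e.g. const regex = /ab+c/;
- The other way is to call the RegExp constructor, **new RegExp(pattern[, flags])**, where flags is optional, e.g. const regex = new RegExp("ab+c");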
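
To compare the two notations side by side, here is a minimal sketch. The sample strings and the `word` variable are made up for illustration; they are not from the exercises.

```javascript
// Two ways to build the same pattern.
const literalRegex = /ab+c/i;                      // literal: /pattern/flags
const constructedRegex = new RegExp("ab+c", "i");  // constructor: pattern and flags as strings

console.log(literalRegex.test("xABBBc"));      // true
console.log(constructedRegex.test("xABBBc"));  // true

// The constructor is handy when the pattern is only known at runtime.
// Inside a string literal, every backslash has to be doubled.
const word = "cat"; // hypothetical runtime value
const wordRegex = new RegExp("\\b" + word + "\\b");
console.log(wordRegex.test("a cat sat"));    // true
console.log(wordRegex.test("concatenate"));  // false
```
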
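Testing a regex: the difference between **.test and .match**
- .test is called as **regex.test(string)** and returns a boolean that tells whether the string passes the regex's test
- .match is called as **string.match(regex)**; it returns null if the string does not match the regex, and an array if it does
  .test is a method of the regex, RegExp.prototype.test()
  
  .match is a method of the string, String.prototype.match()
  
  The two methods look alike and are easy to mix up, so I remember them by treating the object each one belongs to as the subject of an English sentence: who does what
  - .test: the regex tests something else (a string) to see whether it passes its test; the answer is yes (true) or no (false)
  - .match: the string checks which part of itself matches something else (a regex); the answer is the matching part (array[0]), or null if nothing matches
- Besides the two methods used in the exercises, there are other regex-related methods:
  - RegExp's exec method is called the same way as test, regex.exec(str), but it returns an array (or null); without the g flag, its result is the same as that of match
  - Besides using match to check a string against a regex, a string can also
    - use .replace to replace the parts that match the regex
    - use .split, with a regex specifying where to split the string
    - use .search to look for a match of the regex; it returns the index of the first match, or -1 if there is none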
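
The methods listed above can be tried out in one short sketch. The sample strings are made up for illustration.

```javascript
const str = "The rain in Spain";
const re = /ain/;

// RegExp.prototype.test: boolean result
console.log(re.test(str));              // true

// String.prototype.match: array on success, null on failure
const matched = str.match(re);
console.log(matched[0], matched.index); // ain 5
console.log("sunny".match(re));         // null

// Without the g flag, exec gives the same kind of result as match
const execResult = re.exec(str);
console.log(execResult[0], execResult.index); // ain 5

// With the g flag, match returns every matching substring instead
console.log(str.match(/ain/g));         // ["ain", "ain"]

// Other string methods that accept a regex
console.log(str.replace(/ain/g, "AIN"));  // The rAIN in SpAIN
console.log("a, b;c".split(/[,;]\s*/));   // ["a", "b", "c"]
console.log(str.search(/ain/));           // 5
console.log(str.search(/snow/));          // -1
```
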
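Regex flags, i.e. matching modes. The common ones are:
- **g**, global matching: the result includes every substring that matches; without it, only the first match is returned
- **i**, case-insensitive matching: differences in letter case are ignored
- **m**, multiline matching: ^ and $ match at the start and end of every line, rather than only at the start and end of the whole string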
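
A small sketch of how the three flags change the result. The strings are made up for illustration.

```javascript
const twinkle = "Twinkle, twinkle";

// No flags: only the first (case-sensitive) match is reported
console.log(twinkle.match(/twinkle/).index);     // 9
// g alone: all matches, still case-sensitive
console.log(twinkle.match(/twinkle/g));          // ["twinkle"]
// g and i together: all matches, ignoring case
console.log(twinkle.match(/twinkle/gi));         // ["Twinkle", "twinkle"]

const threeLines = "one\ntwo\nthree";
// Without m, ^ and $ only match at the ends of the whole string
console.log(threeLines.match(/^\w+$/g));         // null
// With m, ^ and $ match at the start and end of every line
console.log(threeLines.match(/^\w+$/gm));        // ["one", "two", "three"]
```
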
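Matching characters
- Literal strings, e.g. **abc**
- The wildcard character **.**, written as a dot, stands for any single character (except a newline), a bit like the wild tile in mahjong. I looked up the word wildcard: it can mean a wild card in card games as well as a wildcard character in computing
- Alternation **a|b|c, which for single characters can also be written as [abc]**, matches a or b or c
- Ranges **[a-z] and [0-9] stand for all the letters from a to z and all the digits from 0 to 9; they can also be combined, e.g. [A-Za-z0-9]** stands for all letters and digits
- Negated sets **[^abc]** match any character that is not a, b, or c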
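
Each of the character-matching forms above, in a short sketch with made-up sample strings:

```javascript
// The wildcard matches any single character except a newline
console.log("hug hog h\ng".match(/h.g/g));        // ["hug", "hog"]
console.log("a\nb".match(/a.b/));                 // null

// Alternation and a character set give the same result for single characters
console.log("bag big bug".match(/b(a|i|u)g/g));   // ["bag", "big", "bug"]
console.log("bag big bug".match(/b[aiu]g/g));     // ["bag", "big", "bug"]

// Combined ranges: letters and digits only
const lettersAndDigits = "A1-b2_c3".match(/[A-Za-z0-9]/g).join("");
console.log(lettersAndDigits);                    // A1b2c3

// Negated set: everything except a, b, and c
console.log("abcdef".match(/[^abc]/g));           // ["d", "e", "f"]
```
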
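Number of repetitions
- * the character occurs zero or more times
- + the character occurs one or more times
- ? the character occurs zero or one time
- {n,m} the character occurs at least n and at most m times. The upper bound can be left out, as in **{n,}**, meaning at least n times with no upper limit; a single number, as in **{n}**, means exactly n times. In JavaScript the lower bound cannot be left out, and there must be no space inside the braces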
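
A sketch that runs each quantifier against the same list of strings. The `samples` list and the `check` helper are hypothetical, introduced only for this illustration.

```javascript
const samples = ["ac", "abc", "abbc", "abbbbc"];
// Hypothetical helper: returns the samples that a regex accepts
const check = (regex) => samples.filter((s) => regex.test(s));

console.log(check(/^ab*c$/));      // ["ac", "abc", "abbc", "abbbbc"]
console.log(check(/^ab+c$/));      // ["abc", "abbc", "abbbbc"]
console.log(check(/^ab?c$/));      // ["ac", "abc"]
console.log(check(/^ab{2,4}c$/));  // ["abbc", "abbbbc"]
console.log(check(/^ab{2,}c$/));   // ["abbc", "abbbbc"]
console.log(check(/^ab{2}c$/));    // ["abbc"]

// Without a lower bound, the braces are not a quantifier in JavaScript
// (without the u flag); they are matched as literal text instead
console.log(/^ab{,2}c$/.test("abc"));      // false
console.log(/^ab{,2}c$/.test("ab{,2}c"));  // true
```
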
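Matching positions
- **/^ /**: inside square brackets [], ^ negates the character set; outside them, it means the string starts with what follows, e.g. **/^ab/** matches a string that starts with ab
- **/ $/** means the string ends with what precedes it, e.g. **/ab$/** matches a string that ends with ab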
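
A quick sketch of the two anchors, plus the other meaning of ^ inside brackets. The strings are made up for illustration.

```javascript
// ^ outside brackets: the string must start with ab
console.log(/^ab/.test("abc"));   // true
console.log(/^ab/.test("cab"));   // false

// $: the string must end with ab
console.log(/ab$/.test("cab"));   // true
console.log(/ab$/.test("abc"));   // false

// Both anchors together: the whole string must be exactly ab
console.log(/^ab$/.test("ab"));   // true
console.log(/^ab$/.test("abab")); // false

// ^ inside brackets: first character that is neither a nor b
const firstOther = "abc".match(/[^ab]/)[0];
console.log(firstOther);          // c
```
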
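Shorthand character classes
- **\w is equivalent to [A-Za-z0-9_]**: all letters, digits, and the underscore
- **\W is equivalent to [^A-Za-z0-9_]**: every character that is not a letter, digit, or underscore
- **\d is equivalent to [0-9]**: all digits
- **\D is equivalent to [^0-9]**: all non-digits
- **\s** matches whitespace (spaces, tabs, newlines, and so on)
- **\S** matches non-whitespace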
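
The shorthand classes can be checked against one sample string. The string is made up for illustration and deliberately contains an underscore and a tab.

```javascript
const shorthandSample = "snake_case 42!\t"; // 15 characters, ending in a tab

// \w includes the underscore as well as letters and digits
console.log(shorthandSample.match(/\w/g).join(""));  // snake_case42
// \W is everything else: the space, the exclamation mark, and the tab
console.log(shorthandSample.match(/\W/g).length);    // 3

console.log(shorthandSample.match(/\d/g));           // ["4", "2"]
console.log(shorthandSample.match(/\D/g).length);    // 13

// \s covers the tab as well as the space
console.log(shorthandSample.match(/\s/g).length);    // 2
console.log(shorthandSample.match(/\S/g).length);    // 13
```
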
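Assertions
- Zero-width positive lookahead, written **(?=pattern)**, requires that the characters right after the current position match pattern. For example, "a regular expression".match(/re(?=gular)/) matches the re in regular but not the re in expression. Zero-width means the pattern does not consume any characters: "a regular expression".match(/re(?=gular)./) matches reg, because the lookahead only checks a position and the . that follows it then matches the g
- Zero-width negative lookahead, written **(?!pattern)**, requires that the characters right after the current position do not match pattern. For example, "a regular expression".match(/re(?!gular)/) matches the re in expression, not the re in regular
- Zero-width positive lookbehind, written **(?<=pattern)**, requires that the characters right before the current position match pattern. For example, "a regular expression".match(/(?<=\w+)re/) matches the re in expression, because that re is preceded by other word characters; it cannot match an re at the start of a word
- Zero-width negative lookbehind, written **(?<!pattern)**, requires that the characters right before the current position do not match pattern. For example, "a regular expression".match(/(?<!\w+)re/) matches the re in regular, because that re is not preceded by other word characters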
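
The four assertions can be verified by looking at the index of each match. This sketch assumes an engine that supports lookbehind (it was added in ES2018, so older browsers may reject the last two patterns).

```javascript
const phrase = "a regular expression";
// "re" occurs twice: at index 2 (regular) and at index 13 (expression)

// Positive lookahead: re followed by gular
console.log(phrase.match(/re(?=gular)/).index);   // 2
// The lookahead consumes nothing, so the dot matches the g
console.log(phrase.match(/re(?=gular)./)[0]);     // reg

// Negative lookahead: re not followed by gular
console.log(phrase.match(/re(?!gular)/).index);   // 13

// Positive lookbehind: re preceded by word characters
console.log(phrase.match(/(?<=\w+)re/).index);    // 13

// Negative lookbehind: re not preceded by word characters
console.log(phrase.match(/(?<!\w+)re/).index);    // 2
```
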
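Match length
- By default, a quantifier matches as much as it can, so the result is the longest string that satisfies the regex. This is called greedy matching, e.g. 'titanic'.match(/t.*i/) matches 'titani'
- Lazy matching matches the shortest string that satisfies the regex, starting from the first position where a match is possible. To use it, *add ? after the quantifier that should be lazy (the part that could be long or short)*, e.g. 'titanic'.match(/t.*?i/) matches 'ti'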
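
Greedy and lazy matching side by side, using the titanic example from above and the string from the Find Characters with Lazy Matching exercise:

```javascript
// Greedy: .* grabs as much as possible, then backs off to the last i
console.log("titanic".match(/t.*i/)[0]);    // titani
// Lazy: .*? grabs as little as possible, stopping at the first i
console.log("titanic".match(/t.*?i/)[0]);   // ti

const heading = "<h1>Winter is coming</h1>";
// Greedy: runs to the last > in the string
console.log(heading.match(/<.*>/)[0]);      // <h1>Winter is coming</h1>
// Lazy: stops at the first >
console.log(heading.match(/<.*?>/)[0]);     // <h1>
```
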
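Capture groups
- A part of the regex that needs to be repeated can be wrapped in () to form a group
- Each group is automatically assigned a number, counting from 1 in the order of the opening parentheses from left to right. \number refers back to a group and matches the same text that the group captured
- Take the exercise Reuse Patterns Using Capture Groups: the regex must match a number repeated exactly three times, separated by spaces. The number can be written as \d+ in the first group, and the space \s can go in a second group. A number appearing three times in a row can be written as /(\d+)\1\1/, and with spaces in between it becomes /(\d+)(\s)\1\2\1/. The match must also be exactly three repetitions, with no extra numbers before or after, otherwise the tests fail, so /^(\d+)(\s)\1\2\1$/ anchors the start and the end. However, this exercise explicitly requires \s to be used twice (I think this is a small bug, or at least ambiguous, since repeating the space through a group also separates the repeated numbers), so the regex that finally passes the tests is /^(\d+)\s\1\s\1$/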
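
The steps above can be checked in a short sketch. Only "42 42 42" comes from the exercise; the other test strings are made up for illustration.

```javascript
const anchoredRegex = /^(\d+)\s\1\s\1$/;

console.log(anchoredRegex.test("42 42 42"));     // true
// A fourth repetition is rejected because of the ^ and $ anchors
console.log(anchoredRegex.test("42 42 42 42"));  // false
// \1 must repeat the captured text itself, not just any number
console.log(anchoredRegex.test("42 43 42"));     // false

// Without anchors, the pattern also accepts a longer run of numbers
console.log(/(\d+)(\s)\1\2\1/.test("42 42 42 42")); // true

// The captured text is available in the match result
const groupMatch = "42 42 42".match(anchoredRegex);
console.log(groupMatch[1]);                      // 42
```
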

To make them easier to remember, I summarized the rules above in the table below
[Image: regex rules]

Here are my solutions to the exercises in this section:

Introduction to the Regular Expression Challenges

Using the Test Method

let myString = "Hello, World!";
let myRegex = /Hello/;
let result = myRegex.test(myString); // Change this line

Match Literal Strings

let waldoIsHiding = "Somewhere Waldo is hiding in this text.";
let waldoRegex = /Waldo/; // Change this line
let result = waldoRegex.test(waldoIsHiding);

Match a Literal String with Different Possibilities

let petString = "James has a pet cat.";
let petRegex = /dog|cat|bird|fish/; // Change this line
let result = petRegex.test(petString);

Ignore Case While Matching

let myString = "freeCodeCamp";
let fccRegex = /freeCodeCamp/i; // Change this line
let result = fccRegex.test(myString);

Extract Matches

let extractStr = "Extract the word 'coding' from this string.";
let codingRegex = /coding/; // Change this line
let result = extractStr.match(codingRegex); // Change this line

Find More Than the First Match

let twinkleStar = "Twinkle, twinkle, little star";
let starRegex = /twinkle/gi; // Change this line
let result = twinkleStar.match(starRegex); // Change this line

Match Anything with Wildcard Period

let exampleStr = "Let's have fun with regular expressions!";
let unRegex = /.un/; // Change this line
let result = unRegex.test(exampleStr);

Match Single Character with Multiple Possibilities

let quoteSample = "Beware of bugs in the above code; I have only proved it correct, not tried it.";
let vowelRegex = /[aeiou]/gi; // Change this line
let result = quoteSample.match(vowelRegex); // Change this line

Match Letters of the Alphabet

let quoteSample = "The quick brown fox jumps over the lazy dog.";
let alphabetRegex = /[a-z]/gi; // Change this line
let result = quoteSample.match(alphabetRegex); // Change this line

Match Numbers and Letters of the Alphabet

let quoteSample = "Blueberry 3.141592653s are delicious.";
let myRegex = /[h-s2-6]/gi; // Change this line
let result = quoteSample.match(myRegex); // Change this line

Match Single Characters Not Specified

let quoteSample = "3 blind mice.";
let myRegex = /[^aeiou0-9]/gi; // Change this line
let result = quoteSample.match(myRegex); // Change this line

Match Characters that Occur One or More Times

let difficultSpelling = "Mississippi";
let myRegex = /s+/g; // Change this line
let result = difficultSpelling.match(myRegex);

Match Characters that Occur Zero or More Times

let chewieQuote = "Aaaaaaaaaaaaaaaarrrgh!";
let chewieRegex = /Aa*/; // Change this line
let result = chewieQuote.match(chewieRegex);

Find Characters with Lazy Matching

let text = "<h1>Winter is coming</h1>";
let myRegex = /<h.*?1>/; // Change this line
let result = text.match(myRegex);

Find One or More Criminals in a Hunt

let crowd = 'P1P2P3P4P5P6CCCP7P8P9';

let reCriminals = /C+/g; // Change this line

let matchedCriminals = crowd.match(reCriminals);
console.log(matchedCriminals);

Match Beginning String Patterns

let rickyAndCal = "Cal and Ricky both like racing.";
let calRegex = /^Cal/; // Change this line
let result = calRegex.test(rickyAndCal);

Match Ending String Patterns

let caboose = "The last car on a train is the caboose";
let lastRegex = /caboose$/; // Change this line
let result = lastRegex.test(caboose);

Match All Letters and Numbers

let quoteSample = "The five boxing wizards jump quickly.";
let alphabetRegexV2 = /\w/gi; // Change this line
let result = quoteSample.match(alphabetRegexV2).length;

Match Everything But Letters and Numbers

let quoteSample = "The five boxing wizards jump quickly.";
let nonAlphabetRegex = /\W/gi; // Change this line
let result = quoteSample.match(nonAlphabetRegex).length;

Match All Numbers

let numString = "Your sandwich will be $5.00";
let numRegex = /\d/g; // Change this line
let result = numString.match(numRegex).length;

Match All Non-Numbers

let numString = "Your sandwich will be $5.00";
let noNumRegex = /\D/gi; // Change this line
let result = numString.match(noNumRegex).length;

Restrict Possible Usernames

let username = "JackOfAllTrades";
let userCheck = /^[A-Za-z]{2,}\d*$/; // Change this line (^ anchors the start so digits cannot appear before the letters)
let result = userCheck.test(username);

Match Whitespace

let sample = "Whitespace is important in separating words";
let countWhiteSpace = /\s/g; // Change this line
let result = sample.match(countWhiteSpace);

Match Non-Whitespace Characters

let sample = "Whitespace is important in separating words";
let countNonWhiteSpace = /\S/g; // Change this line
let result = sample.match(countNonWhiteSpace);

Specify Upper and Lower Number of Matches

let ohStr = "Ohhh no";
let ohRegex = /[^h]h{3,6}[^h]/; // Change this line
let result = ohRegex.test(ohStr);

Specify Only the Lower Number of Matches

let haStr = "Hazzzzah";
let haRegex = /Haz{4,}ah/; // Change this line
let result = haRegex.test(haStr);

Specify Exact Number of Matches

let timStr = "Timmmmber";
let timRegex = /Tim{4}ber/; // Change this line
let result = timRegex.test(timStr);

Check for All or None

let favWord = "favorite";
let favRegex = /favou?rite/; // Change this line
let result = favRegex.test(favWord);

Positive and Negative Lookahead

let sampleWord = "astronaut";
let pwRegex = /(?=\w{5,})(?=\D*\d{2,})/; // Change this line
let result = pwRegex.test(sampleWord);

Reuse Patterns Using Capture Groups

let repeatNum = "42 42 42";
let reRegex = /^(\d+)\s\1\s\1$/; // Change this line
let result = reRegex.test(repeatNum);

Use Capture Groups to Search and Replace

let huhText = "This sandwich is good.";
let fixRegex = /good/; // Change this line
let replaceText = "okey-dokey"; // Change this line
let result = huhText.replace(fixRegex, replaceText);
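
The solution above passes without any capture group. As a minimal sketch of what the exercise title refers to, groups captured by the regex can be reused in the replacement string as $1, $2, and so on. The sample string is made up for illustration.

```javascript
// $1 and $2 in the replacement refer to the first and second capture groups
const swapped = "Code Camp".replace(/(\w+)\s(\w+)/, "$2 $1");
console.log(swapped); // Camp Code

// The same idea applied to every match with the g flag
const dashed = "2018 07 04".replace(/(\d+)\s/g, "$1-");
console.log(dashed);  // 2018-07-04
```
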

Remove Whitespace from Start and End

let hello = "   Hello, World!  ";
let wsRegex = /\S+\s?\S+/; // Change this line
let result = hello.match(wsRegex)[0]; // Change this line
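
The match-based solution above works for this particular string, but it stops after the second word. A more general sketch, which may not be what the exercise tests expect, removes the leading and trailing whitespace with replace instead. The `multiWord` string is made up for illustration.

```javascript
const paddedHello = "   Hello, World!  ";
// Match whitespace at the start or at the end, and replace it with nothing
const trimRegex = /^\s+|\s+$/g;
const trimmedHello = paddedHello.replace(trimRegex, "");
console.log(trimmedHello); // Hello, World!

// With more than two words, the two approaches differ
const multiWord = "  one two  three ";
console.log(multiWord.replace(/^\s+|\s+$/g, ""));  // one two  three
console.log(multiWord.match(/\S+\s?\S+/)[0]);      // one two
```
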