Regular Expression to Find Specific Sentence

First, let’s review how to read and write txt files.
To read data from a txt file that has many lines, open the file and loop over it line by line.
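A minimal sketch of the reading step. The helper name read_lines and the throwaway demo file are assumptions made for illustration, and the file is assumed to be UTF-8 encoded.

```python
import os
import tempfile


def read_lines(path):
    """Return the lines of a text file without their trailing newlines."""
    lines = []
    # "with" closes the file automatically, even if an error occurs
    with open(path, "r", encoding="utf-8") as f:
        for line in f:  # the file object yields one line at a time
            lines.append(line.rstrip("\n"))
    return lines


# Demo on a throwaway file so the sketch runs anywhere
demo_path = os.path.join(tempfile.mkdtemp(), "douban.txt")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("line one\nline two\n")

demo_lines = read_lines(demo_path)
print(demo_lines)
```

Looping over the file object avoids loading the whole file into memory at once, which matters for very large files.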

To write data to a txt file, open it in write mode (note that if “douban.txt” hasn’t been created yet, the code creates this txt file automatically).
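A minimal sketch of the writing step. The helper name write_text and the temporary directory are assumptions made for illustration; only the file name “douban.txt” comes from the text above.

```python
import os
import tempfile


def write_text(path, text):
    """Write text to path, creating the file if it does not exist."""
    # Mode "w" creates a missing file and overwrites an existing one
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


# Use a fresh temporary directory so "douban.txt" does not exist yet
target = os.path.join(tempfile.mkdtemp(), "douban.txt")
existed_before = os.path.exists(target)

write_text(target, "hello douban")
existed_after = os.path.exists(target)

print(existed_before, existed_after)
```

Mode "w" replaces any existing content; mode "a" appends to the end of the file.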


And if we want to write data to a txt file line by line, we need to add “\n” at the end of each line.
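A minimal sketch of writing one item per line. The helper name write_lines and the sample items are assumptions made for illustration.

```python
import os
import tempfile


def write_lines(path, items):
    """Write each item on its own line."""
    with open(path, "w", encoding="utf-8") as f:
        for item in items:
            # write() does not add a line break, so append "\n" ourselves
            f.write(item + "\n")


out_path = os.path.join(tempfile.mkdtemp(), "douban.txt")
write_lines(out_path, ["first", "second", "third"])

# Read the file back to check that every item ended up on its own line
with open(out_path, "r", encoding="utf-8") as f:
    content = f.read()

print(content)
```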

Python’s re module already provides the functions “finditer” and “findall”. We should “import re” at the beginning.

  1. Function “finditer”

    We can call “re.finditer” directly, or compile the pattern first with “re.compile” and call “finditer” on the compiled pattern.

    A coding example: find all email addresses in a very large txt file (we first need to define the pattern of an email: letters/digits + @ + letters/digits/“.”).

    If we print(m) directly, we don’t get just the email, because m is a match object rather than a plain string. So we take the start and end positions of the match and slice the string with these indexes to print or write only the email addresses.
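A minimal sketch of the finditer approach described above. The pattern and the sample text are assumptions made for illustration, not the exact ones used in this post.

```python
import re

# Assumed pattern: letters/digits, "@", letters/digits, ".", letters.
# The parentheses form a capturing group around the provider name.
pattern = re.compile(r"[0-9a-zA-Z]+@([0-9a-zA-Z]+)\.[a-zA-Z]+")

# Assumed sample text standing in for one line of a large file
text = "Contact alice01@qq.com or bob@163.com for details."

emails = []
for m in pattern.finditer(text):
    # m is a re.Match object; start() and end() give the span of the whole match
    start, end = m.start(), m.end()
    emails.append(text[start:end])

print(emails)
```

Calling m.group() returns the same string as text[m.start():m.end()]. For a very large file, the same loop can be run on each line while iterating over the file object, so the whole file never has to be held in memory.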

  2. Function “findall”


    We reuse the pattern defined in the coding example, but change “finditer” to “findall”, and then look at the output of “print(m_bundle)”.

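    We didn’t get the whole emails, only the names of the email providers. It turns out that this is caused by the parentheses used in “pattern”: when a pattern contains a capturing group, findall returns the text matched by the group instead of the whole match.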

    After deleting all parentheses in “pattern”, we get the whole email addresses.

    So my conclusion is that square brackets (character classes) don’t influence what findall returns. Only parentheses (capturing groups) change it.
    The final version is
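The behaviour described above can be sketched as follows. The patterns and the sample text are assumptions made for illustration, not the exact ones used in this post. The non-capturing group (?:...) in the last pattern is an addition for cases where grouping is still needed.

```python
import re

# Assumed sample text
text = "Contact alice01@qq.com or bob@163.com for details."

# With a capturing group, findall returns only what the group matched
with_group = re.findall(r"[0-9a-zA-Z]+@([0-9a-zA-Z]+)\.[a-zA-Z]+", text)

# Without parentheses, findall returns the whole match
without_group = re.findall(r"[0-9a-zA-Z]+@[0-9a-zA-Z]+\.[a-zA-Z]+", text)

# A non-capturing group (?:...) allows grouping and still returns the whole match
non_capturing = re.findall(r"[0-9a-zA-Z]+@(?:[0-9a-zA-Z]+\.)+[a-zA-Z]+", text)

print(with_group)
print(without_group)
print(non_capturing)
```

If a pattern contains several capturing groups, findall returns a list of tuples, one tuple per match.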

Supplement: commonly used regular expressions.
Refer to: https://blog.csdn.net/weixin_40583388/article/details/78458610

