Re: Correct syntax for pathological re.search()

Subject: Re: Correct syntax for pathological re.search()
From: list1 (at) *nospam* tompassin.net (Thomas Passin)
Newsgroups: comp.lang.python
Date: 12 Oct 2024, 14:06:54
Message-ID: <mailman.24.1728750786.4695.python-list@python.org>
User-Agent: Mozilla Thunderbird

On 10/11/2024 8:37 PM, MRAB via Python-list wrote:
> On 2024-10-11 22:13, AVI GROSS via Python-list wrote:
>> Is there some utility function out there that can be called to show what the
>> regular expression you typed in will look like by the time it is ready to be
>> used?
>>
>> Obviously, life is not that simple as it can go through multiple layers with
>> each dealing with a layer of backslashes.
>>
>> But for simple cases, ...
>>
> Yes. It's called 'print'. :-)

There is a section in the Python docs about this backslash subject. It's titled "The Backslash Plague" in
https://docs.python.org/3/howto/regex.html

You can also inspect the compiled expression to see what string it received after all the escaping:

>>> import re
>>>
>>> re_string = '\w+\\sub'
>>> re_pattern = re.compile(re_string)
>>>
>>> # Should look as if we had used r'\w+\sub'
>>> print(re_pattern.pattern)
\w+\sub
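
For the simple cases, the same idea can be wrapped up in a few lines. This is only a sketch: the helper name show_layers is made up for illustration and is not part of the re module. It returns the string itself, its repr(), and the .pattern attribute of the compiled object, so you can compare what you typed with what the regex engine receives.

```python
import re

def show_layers(regex_source):
    """Return three views of a regex string (hypothetical helper, not part of re)."""
    compiled = re.compile(regex_source)
    return {
        "str": regex_source,          # the characters the regex engine receives
        "repr": repr(regex_source),   # how Python would write that string as a literal
        "pattern": compiled.pattern,  # what re.compile() stored: the same string
    }

# The raw literal and the doubled-backslash literal produce the same string.
for literal in (r"\w+\sub", "\\w+\\sub"):
    views = show_layers(literal)
    print(views["str"], views["repr"], views["pattern"], sep="   ")
```

If the "str" column is not the regular expression you meant to write, the problem is in the Python string literal and not in the regex itself.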

>> -----Original Message-----
>> From: Python-list <python-list-bounces+avi.e.gross=gmail.com@python.org> On
>> Behalf Of Gilmeh Serda via Python-list
>> Sent: Friday, October 11, 2024 10:44 AM
>> To: python-list@python.org
>> Subject: Re: Correct syntax for pathological re.search()
>>
>> On Mon, 7 Oct 2024 08:35:32 -0500, Michael F. Stemper wrote:
>>
>>> I'm trying to discard lines that include the string "\sout{" (which is
>>> TeX, for those who are curious). I have tried:
>>>    if not re.search("\sout{", line):
>>>    if not re.search("\sout\{", line):
>>>    if not re.search("\\sout{", line):
>>>    if not re.search("\\sout\{", line):
>>>
>>> But the lines with that string keep coming through. What is the right
>>> syntax to properly escape the backslash and the left curly bracket?
>>
>> $ python
>> Python 3.12.6 (main, Sep  8 2024, 13:18:56) [GCC 14.2.1 20240805] on linux
>> Type "help", "copyright", "credits" or "license" for more information.
>> >>> import re
>> >>> s = r"testing \sout{WHADDEVVA}"
>> >>> re.search(r"\\sout{", s)
>> <re.Match object; span=(8, 14), match='\\sout{'>
>>
>> You want a literal backslash, hence, you need to escape everything.
>>
>> It is not enough to escape the "\s" as "\\s", because that only takes care
>> of Python's demands for escaping "\". You also need to escape the "\" for
>> the RegEx as well, or it will read it as "\s", which is the
>> RegEx for a whitespace character, and therefore your search doesn't match,
>> because it reads it as if you wanted to search for " out{".
>>
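
To make the two layers concrete, here is a short sketch using the same test string as above. The variable names are just for illustration, and each comment shows the string the regex engine actually sees.

```python
import re

s = r"testing \sout{WHADDEVVA}"

# One layer only: Python turns "\\s" into \s, which the regex engine reads as
# "any whitespace character", so the pattern means whitespace + "out{".
one_layer = "\\sout{"            # engine sees: \sout{

# Both layers: the engine sees \\sout{, i.e. a literal backslash + "sout{".
two_layers_raw = r"\\sout{"      # engine sees: \\sout{
two_layers_plain = "\\\\sout{"   # same string, written without the r prefix

print(re.search(one_layer, s))          # no match for the TeX string
print(re.search(one_layer, "x out{"))   # but whitespace + "out{" does match
print(re.search(two_layers_raw, s))     # finds the literal \sout{
print(two_layers_raw == two_layers_plain)
```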
>> Therefore, you need to escape it either as per my example, or by using
>> four "\" and no "r" in front of the first quote, which also works:
>>
>> >>> re.search("\\\\sout{", s)
>> <re.Match object; span=(8, 14), match='\\sout{'>
>>
>> You don't need to escape the curly braces. We call them "seagull wings"
>> where I live.
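
For the original task (dropping lines that contain the fixed text \sout{), there are a couple of ways to avoid counting backslashes by hand. This is only a sketch, not what anyone in the thread posted: re.escape() builds the escaped pattern for you, and a plain substring test with "in" skips regular expressions altogether. The sample lines and the names NEEDLE, keep_regex and keep_plain are invented for illustration.

```python
import re

NEEDLE = r"\sout{"                        # the literal text to look for
pattern = re.compile(re.escape(NEEDLE))   # re.escape() adds the regex-level escapes

lines = [
    r"kept: ordinary text",
    r"dropped: \sout{struck out} text",
    r"kept: shout out{loud} has a space, not a backslash",
]

# The regex version and the plain substring version should agree.
keep_regex = [ln for ln in lines if not pattern.search(ln)]
keep_plain = [ln for ln in lines if NEEDLE not in ln]

print(keep_regex == keep_plain)
for ln in keep_regex:
    print(ln)
```

When the thing you are searching for is fixed text and not a pattern, the substring test is probably the simpler choice.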
 
