Re: How to manage accented characters in mail header?

Subject: Re: How to manage accented characters in mail header?
From: ram (at) *nospam* zedat.fu-berlin.de (Stefan Ram)
Newsgroups: comp.lang.python
Date: 04. Jan 2025, 15:49:38
Organization: Stefan Ram
Message-ID: <decode_header-20250104154914@ram.dialup.fu-berlin.de>
References: 1
Chris Green <cl@isbd.net> wrote or quoted:
From: =?utf-8?B?U8OpYmFzdGllbiBDcmlnbm9u?= <sebastien.crignon@amvs.fr>

  In Python, when you roll with decode_header from the email.header
  module, it spits out a list of parts, where each part is a tuple
  of (decoded bytes or str, charset). To smash these parts into one
  string, you’ll want to loop through the list, decode each piece
  to str (if it’s still bytes), and then throw them together.
  Here’s a straightforward example of how to pull this off:

from email.header import decode_header

# Example header
header_example = \
'From: =?utf-8?B?U8OpYmFzdGllbiBDcmlnbm9u?= <sebastien.crignon@amvs.fr>'

# Decode the header
decoded_parts = decode_header(header_example)

# Kick off an empty list for the decoded strings
decoded_strings = []

for part, charset in decoded_parts:
    if isinstance(part, bytes):
        # Decode the bytes to a string using the charset
        decoded_string = part.decode(charset or 'utf-8')
    else:
        # If it’s already a string, just roll with it
        decoded_string = part
    decoded_strings.append(decoded_string)

# Join the parts into a single string
final_string = ''.join(decoded_strings)

print(final_string)  # From: Sébastien Crignon <sebastien.crignon@amvs.fr>

  Breakdown

  decode_header(header_example): This call takes your email header
  and breaks it down into a list of (part, charset) tuples, where
  each part is either bytes or str.
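
  To make the shape of that list concrete, here’s a small sketch that
  just prints what decode_header hands back for the header above and
  for a header without any encoded word. The name plain_header and its
  value are only there for comparison; the behavior noted in the
  comments is what current Python 3 versions do, as far as I can tell.

```python
from email.header import decode_header

encoded_header = (
    'From: =?utf-8?B?U8OpYmFzdGllbiBDcmlnbm9u?= '
    '<sebastien.crignon@amvs.fr>'
)
# Header without any encoded word, used only for comparison
plain_header = 'From: Chris Green <cl@isbd.net>'

encoded_parts = decode_header(encoded_header)
plain_parts = decode_header(plain_header)

# With an encoded word present, every part comes back as bytes;
# the charset is None for the parts that were not encoded.
for part, charset in encoded_parts:
    print(type(part).__name__, repr(part), charset)

# Without any encoded word, the header comes back as a single str part.
print(plain_parts)
```

  That mix of bytes and str is why the loop checks isinstance(part,
  bytes) before calling decode.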

  Looping through decoded_parts: You check if each part is in
  bytes. If it is, you decode it using whatever charset it’s
  got (defaulting to 'utf-8' if the charset is None).
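
  The decoding step can also be pulled into a small helper. This is
  only a sketch: decode_part is a made-up name, and falling back to
  'utf-8' with errors='replace' when the charset is missing or not
  known to Python is my assumption about what counts as safe, not
  something the email package prescribes.

```python
def decode_part(part, charset):
    """Turn one (part, charset) pair from decode_header into a str."""
    if isinstance(part, str):
        # Already text, nothing to decode
        return part
    try:
        return part.decode(charset or 'utf-8', errors='replace')
    except LookupError:
        # The charset name is not known to Python's codec registry
        return part.decode('utf-8', errors='replace')


print(decode_part(b'S\xc3\xa9bastien', 'utf-8'))   # Sébastien
print(decode_part(b'plain text', None))            # plain text
print(decode_part(b'plain text', 'x-no-such-cs'))  # plain text
```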

  Appending Decoded Strings: You toss each decoded part into a list.

  Joining Strings: Finally, you use ''.join(decoded_strings) to glue
  all the decoded strings into a single, coherent piece.
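
  If you’d rather not write the loop yourself, email.header also ships
  make_header, which takes the list from decode_header and builds a
  Header object; calling str() on that gives you the joined text.
  A minimal sketch, with the caveat that make_header applies its own
  spacing rules between parts, so on other headers the result may
  differ slightly from a plain ''.join:

```python
from email.header import decode_header, make_header

header_example = (
    'From: =?utf-8?B?U8OpYmFzdGllbiBDcmlnbm9u?= '
    '<sebastien.crignon@amvs.fr>'
)

# decode_header splits the header, make_header reassembles the parts
final_string = str(make_header(decode_header(header_example)))

print(final_string)
```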

  Just a Heads Up

  Keep an eye out for cases where the charset is None. That’s what
  you get for parts that weren’t encoded words in the first place,
  and those should be plain ASCII, so falling back to 'utf-8'
  (or 'ascii') is a safe bet.
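
  For what it’s worth, when the header arrives as part of a whole
  message, the parser in the email package can do the decoding for
  you if you hand it email.policy.default. A sketch under that
  assumption; the raw_message text is invented for the example:

```python
from email import message_from_string
from email.policy import default

# Made-up message, just enough to carry the header from above
raw_message = (
    'From: =?utf-8?B?U8OpYmFzdGllbiBDcmlnbm9u?= '
    '<sebastien.crignon@amvs.fr>\n'
    'Subject: test\n'
    '\n'
    'body\n'
)

# With policy=default, header values are decoded when you access them
msg = message_from_string(raw_message, policy=default)
sender = str(msg['From'])

print(sender)
```

  Here the header name is not part of the value, so you only get the
  decoded name and address back.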


