Re: [gawk] Handling variants of CSV input data formats

Subject: Re: [gawk] Handling variants of CSV input data formats
From: janis_papanagnou+ng (at) *nospam* hotmail.com (Janis Papanagnou)
Newsgroups: comp.lang.awk
Date: 27. Aug 2024, 02:39:07
Organization: A noiseless patient Spider
Message-ID: <vajant$2m8em$1@dont-email.me>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0

On 27.08.2024 02:49, Ed Morton wrote:
> On 8/26/2024 7:54 AM, Janis Papanagnou wrote:
> <snip>
>> I'd have liked to provide more concrete information here, but I'm at
>> the moment even unable to reproduce Awk's behavior as documented in
>> its manual; I've tried the following command with various locales
>>
>> $ echo 4,321 | LC_ALL=en_DK.utf-8 gawk '{ print $1 + 1 }'
>> -| 5,321
>>
>> but always got just  5  as result.
>
> You need to specifically TELL gawk to use your locale to read input
> numbers:
>
> $ echo 4,321 | LC_ALL=en_DK.utf-8 gawk '{ print $1 + 1 }'
> 5
>
> $ echo 4,321 | POSIXLY_CORRECT=1 LC_ALL=en_DK.utf-8 gawk '{ print $1 + 1 }'
> 5,321
>
> $ echo 4,321 | LC_ALL=en_DK.utf-8 gawk -N '{ print $1 + 1 }'
> 5,321
>
> See
> https://www.gnu.org/software/gawk/manual/gawk.html#Locale-influences-conversions
> for more info on that.

Thanks. That's actually where I got above example from.

I missed that there was an explicit
$ export POSIXLY_CORRECT=1
set at the very top of these examples. Gee!

Anyway, it feels strange that an explicit LC_* setting is ineffective
without the additional POSIXLY_CORRECT variable. And the page also
says: "The POSIX standard says that awk always uses the period as
the decimal point when reading the awk program source code".
So despite POSIX saying that, you have to use a variable named
POSIXLY_CORRECT. - Do I need some more coffee to understand that?

And I see there's an additional GNU Awk option '--use-lc-numeric'.
What a mess!

(I suppose the current status can only be explained by the mentioned
back-and-forth over the history of the various GNU Awk versions.)

What are the LC_* variables worth if they are ignored (or maybe not)?
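
For what it's worth, one way to sidestep the locale question entirely
is to normalize the separator in the script itself and pin the locale
to C. A minimal sketch, assuming the field uses a comma as decimal
separator and contains no thousands separators; the function name
add_one_decimal_comma is made up for illustration and appears nowhere
in the manual:

```shell
# Hypothetical helper (not from the gawk manual): add 1 to a number that
# uses a decimal comma, without depending on the installed locales.
add_one_decimal_comma() {
    # LC_ALL=C pins the decimal point to "." for both parsing and output.
    printf '%s\n' "$1" | LC_ALL=C awk '{
        x = $1
        sub(/,/, ".", x)   # turn the decimal comma into a period
        print x + 1        # ordinary arithmetic on the normalized value
    }'
}

add_one_decimal_comma 4,321   # prints 5.321
```

The result comes out with a period, so it would need converting back
if the output is supposed to carry a decimal comma as well.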

Janis

 
> Regards,
>
>     Ed

