Subject : Re: [gawk] Handling variants of CSV input data formats
From : mortonspam (at) *nospam* gmail.com (Ed Morton)
Newsgroups : comp.lang.awk
Date : 26 Aug 2024, 12:26:26
Organization : A noiseless patient Spider
Message-ID : <vahop1$2eavu$1@dont-email.me>
References : 1
User-Agent : Mozilla Thunderbird
On 8/25/2024 1:00 AM, Janis Papanagnou wrote:
I don't usually use CSV format(s) myself, but I recently recommended
GNU Awk (given that newer versions support CSV data processing) to a
friend seeking CSV solutions.
I was quite astonished when I stumbled across a StackOverflow article
about CSV processing with contemporary versions of GNU Awk and read
that you are restricted to comma as separator and double quotes to
enclose strings. The workarounds provided at SO were extremely clumsy.
Given that using ',', ';', '|' (or other delimiters) and various
types of quotes is merely a lexical (not a functional) difference, I wonder
whether it would be sensible to make them definable, say, by
setting a PROCINFO element?
Janis
https://stackoverflow.com/questions/45420535/whats-the-most-robust-way-to-efficiently-parse-csv-using-awk
FYI, gawk just inherited those behaviors (plus mandatory stripping of the quotes from quoted fields; see
https://lists.gnu.org/archive/html/bug-gawk/2023-11/msg00018.html) from Kernighan's awk.
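For input that uses a different separator or quote character, one possible workaround is to split each record by hand. What follows is only a minimal sketch in portable awk wrapped in a shell function, not gawk's built-in CSV mode and not one of the StackOverflow solutions. The names parse_dsv and split_dsv are hypothetical, and the sketch assumes a single-character separator, a single-character quote, a doubled quote as the escape for a literal quote, and no quoted field spanning more than one line.

```shell
#!/bin/sh
# parse_dsv SEP QUOTE  (hypothetical helper name, for illustration only)
# Reads delimiter-separated records on stdin and prints every field as
# "record.field: value", with the enclosing quotes stripped.
parse_dsv() {
    awk -v sep="$1" -v quote="$2" '
    # Split rec into fields[] and return the field count.
    # A doubled quote inside a quoted field yields one literal quote.
    # Assumption: quoted fields never contain a newline.
    function split_dsv(rec, fields,    n, i, c, inq, cur) {
        n = 0; cur = ""; inq = 0
        for (i = 1; i <= length(rec); i++) {
            c = substr(rec, i, 1)
            if (inq) {
                if (c == quote) {
                    if (substr(rec, i + 1, 1) == quote) { cur = cur quote; i++ }
                    else inq = 0
                } else cur = cur c
            } else if (c == quote) inq = 1
            else if (c == sep) { fields[++n] = cur; cur = "" }
            else cur = cur c
        }
        fields[++n] = cur
        return n
    }
    {
        split("", f)                 # clear the array before each record
        n = split_dsv($0, f)
        for (i = 1; i <= n; i++) print NR "." i ": " f[i]
    }'
}
```

For example, printf '%s\n' "a;'b;c';'it''s'" | parse_dsv ';' "'" prints the three fields a, b;c and it's, numbered 1.1 to 1.3. Passing the separator and quote as awk variables keeps the splitting logic identical for ',', ';' or '|', which is the "lexical only" difference discussed above.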
Ed.