Subject: [gawk] Handling variants of CSV input data formats
From: janis_papanagnou+ng (at) *nospam* hotmail.com (Janis Papanagnou)
Newsgroups: comp.lang.awk
Date: 25. Aug 2024, 07:00:20
Organization: A noiseless patient Spider
Message-ID: <vaeh9m$1pfge$1@dont-email.me>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0

I don't usually use CSV format(s) myself, but I recently recommended
GNU Awk (given that newer versions support CSV data processing) to a
friend looking for CSV solutions.

I was quite astonished when I stumbled across a StackOverflow article
about CSV processing with contemporary versions of GNU Awk and read
that you are restricted to the comma as separator and double quotes to
enclose strings. The workarounds provided at SO were extremely clumsy.

Given that using ',', ';', '|' (or other delimiters) and various types
of quotes is just a lexical (not a functional) difference, I wonder
whether it would be sensible to make them definable, say, by setting
a PROCINFO element?
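
To illustrate why the difference is only lexical, here is a minimal
sketch in plain awk, independent of gawk's built-in CSV mode. The
names parse_dsv and split_dsv are hypothetical (they are not gawk
features), and the sketch assumes a single-character separator, a
single-character quote, a doubled quote character as the escape for a
literal quote, and no quoted fields spanning several lines.

```shell
# parse_dsv is a hypothetical wrapper, not part of gawk.
# Usage: parse_dsv SEP QUOTE < input
parse_dsv() {
    awk -v sep="$1" -v quote="$2" '
    # split_dsv (hypothetical): split line into out[1..n] using the
    # separator s and the quote character q; returns n.
    function split_dsv(line, out, s, q,    i, n, c, inq, cur, len) {
        split("", out)                      # portable way to clear the array
        n = 0; cur = ""; inq = 0; len = length(line)
        for (i = 1; i <= len; i++) {
            c = substr(line, i, 1)
            if (inq) {
                if (c != q)
                    cur = cur c             # ordinary character inside quotes
                else if (substr(line, i + 1, 1) == q) {
                    cur = cur q; i++        # doubled quote char = one literal quote
                } else
                    inq = 0                 # closing quote
            } else if (c == q)
                inq = 1                     # opening quote
            else if (c == s) {
                out[++n] = cur; cur = ""    # separator ends the current field
            } else
                cur = cur c
        }
        out[++n] = cur                      # last field of the line
        return n
    }
    {
        n = split_dsv($0, f, sep, quote)
        for (i = 1; i <= n; i++)
            printf "%d: [%s]\n", i, f[i]
    }'
}

# Semicolon-separated input with single quotes around strings:
printf '%s\n' "a;'b;c';'it''s';;e" | parse_dsv ';' "'"
```

The same function handles '|' with double quotes by passing different
arguments, which is the kind of parameterization asked about here; the
parsing logic itself does not change.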
Janis
https://stackoverflow.com/questions/45420535/whats-the-most-robust-way-to-efficiently-parse-csv-using-awk