Ben Collver <bencollver@tilde.pink> wrote or quoted:
>AWK As A Major Systems Programming Language
A systems programming language, in my book, is one you can
crank out device drivers in and tap into the platform ABI.
>In retrospect, it seems clear (at least to us!) that there are two
>major reasons that all of the previously mentioned languages have
>enjoyed significant popularity. The first is their extensibility. The
>second is namespace management.
That totally makes me think of the "Zen of Python":
|The Zen of Python, by Tim Peters
|
|Beautiful is better than ugly.
|Explicit is better than implicit.
|Simple is better than complex.
|Complex is better than complicated.
|Flat is better than nested.
|Sparse is better than dense.
|Readability counts.
|Special cases aren't special enough to break the rules.
|Although practicality beats purity.
|Errors should never pass silently.
|Unless explicitly silenced.
|In the face of ambiguity, refuse the temptation to guess.
|There should be one-- and preferably only one --obvious way to do it.
|Although that way may not be obvious at first unless you're Dutch.
|Now is better than never.
|Although never is often better than *right* now.
|If the implementation is hard to explain, it's a bad idea.
|If the implementation is easy to explain, it may be a good idea.
|Namespaces are one honking great idea -- let's do more of those!
.
>I have worked for several years in Python. For string manipulation
>and processing records, you still have to write all the manual stuff:
>open the file, read lines in a loop, split them, etc. Awk does all
>this stuff for me.
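To make "the manual stuff" concrete, here is a minimal Python sketch of the steps awk performs implicitly: iterate over lines, split each one on a separator, and skip the header. The helper names records and sum_field and the sample rows are made up for illustration, and the naive split on "," assumes there are no quoted fields.

```python
from typing import Iterable, Iterator, List


def records(lines: Iterable[str], sep: str = ",") -> Iterator[List[str]]:
    """Yield each line split into fields, roughly what awk does with FS."""
    for line in lines:
        yield line.rstrip("\n").split(sep)


def sum_field(lines: Iterable[str], index: int, sep: str = ",") -> float:
    """Sum one numeric column, skipping the header row (like NR > 1 in awk)."""
    total = 0.0
    for nr, fields in enumerate(records(lines, sep), start=1):
        if nr > 1:
            total += float(fields[index])
    return total


# Made-up sample rows; an open file object would work the same way,
# because iterating over a file also yields one line at a time.
sample = [
    "Title,Author,Year,Price\n",
    "Book A,Author X,2000,10.5\n",
    "Book B,Author Y,2001,4.5\n",
]
print(sum_field(sample, 3))
```

In awk, the loop, the split and the record counter are built in, so the same sum is roughly: awk -F, 'NR > 1 { total += $4 } END { print total }' books.csv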
On the flip side, you can peep it like this: Python's got a solid
set of statement types you can use for everything, making the code
hella readable. Meanwhile, awk's got its bag of tricks for special
cases like file and string processing. Just compare [1] with [2].
[1]
#!/usr/bin/awk -f
# This AWK script analyzes a simple CSV file containing book information:
# Title,Author,Year,Price

BEGIN {
    FS = ","
    print "Book Analysis Report"
    print "===================="
}

{
    if (NR > 1) {  # Skip header row
        total_price += $4
        if ($3 < min_year || min_year == 0) min_year = $3
        if ($3 > max_year) max_year = $3
        author_count[$2]++
        year_count[$3]++
    }
}

END {
    print "\nTotal number of books:", NR - 1
    print "Average book price: $" sprintf("%.2f", total_price / (NR - 1))
    print "Year range:", min_year, "to", max_year
    print "\nBooks per author:"
    for (author in author_count)
        print author ":", author_count[author]
    print "\nBooks per year:"
    for (year in year_count)
        print year ":", year_count[year]
}
[2]
#!/usr/bin/env python3
import csv
from dataclasses import dataclass
from collections import Counter
from typing import List, Dict, Tuple


@dataclass
class Book:
    title: str
    author: str
    year: int
    price: float


class BookAnalyzer:
    def __init__(self, books: List[Book]):
        self.books = books

    def total_books(self) -> int:
        return len(self.books)

    def average_price(self) -> float:
        return sum(book.price for book in self.books) / len(self.books)

    def year_range(self) -> Tuple[int, int]:
        years = [book.year for book in self.books]
        return min(years), max(years)

    def books_per_author(self) -> Dict[str, int]:
        return Counter(book.author for book in self.books)

    def books_per_year(self) -> Dict[int, int]:
        return Counter(book.year for book in self.books)


def read_csv(filename: str) -> List[Book]:
    with open(filename, 'r') as f:
        reader = csv.reader(f)
        next(reader)  # Skip header row
        return [Book(title, author, int(year), float(price))
                for title, author, year, price in reader]


def print_report(analyzer: BookAnalyzer) -> None:
    print("Book Analysis Report")
    print("====================")
    print(f"\nTotal number of books: {analyzer.total_books()}")
    print(f"Average book price: ${analyzer.average_price():.2f}")
    min_year, max_year = analyzer.year_range()
    print(f"Year range: {min_year} to {max_year}")
    print("\nBooks per author:")
    for author, count in analyzer.books_per_author().items():
        print(f"{author}: {count}")
    print("\nBooks per year:")
    for year, count in analyzer.books_per_year().items():
        print(f"{year}: {count}")


def main() -> None:
    books = read_csv("books.csv")
    analyzer = BookAnalyzer(books)
    print_report(analyzer)


if __name__ == "__main__":
    main()