awk
about
Awk: named after "Aho, Weinberger and Kernighan", the authors.
I started using awk recently (late 2005). It fills a niche, performing tasks many would use Perl for, and it is a language that grows on one. It took me some time to become comfortable enough to seek deeper answers, the trickiest area being functions.
Awk uses call-by-value for scalars and call-by-reference for arrays, and a function's local variables are declared in that function's parameter list. Confusing? When things went wrong, this was the area I found needed adjustment. Another gotcha: once a variable name has been used for an array, it may no longer be used as a scalar.
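A minimal sketch of those function rules, runnable from a shell. The names `awk_scope_demo`, `bump_scalar` and `fill` are made up for illustration and do not come from any of the scripts listed on this page.

```shell
awk_scope_demo() {
    awk '
    # Scalars are passed by value: the change to n stays inside the function.
    function bump_scalar(n) {
        n = n + 1
        return n
    }

    # Arrays are passed by reference: the caller sees the new elements.
    # "i" is an extra parameter used as a local variable; callers omit it.
    # The extra spaces before it are only a convention to mark locals.
    function fill(arr, count,    i) {
        for (i = 1; i <= count; i++)
            arr[i] = i * i
    }

    BEGIN {
        x = 5
        y = bump_scalar(x)
        print "x=" x, "y=" y            # prints: x=5 y=6

        i = "global"
        fill(squares, 3)
        print "squares[3]=" squares[3]  # prints: squares[3]=9
        print "i=" i                    # prints: i=global

        # The gotcha: squares is now an array, so a later scalar
        # assignment such as  squares = 1  would be an error.
    }
    '
}

awk_scope_demo
```

Without the extra `i` parameter, the loop inside `fill` would overwrite the global `i`, which is the kind of bug that is hard to spot in a longer script.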
It is strange searching for info on awk: the language has been around for decades and is very useful for small data analysis tasks. In the junkview application, awk has no problem loading an 80k+ record database and searching it with a binary search. I'm impressed ;) And that is on an old Celeron 500MHz box.
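For anyone curious what a binary search looks like in awk, here is a rough sketch. It is not the junkview code: the function names `bsearch_demo` and `bsearch` are hypothetical, and it assumes the input has one integer key per line, already sorted in ascending order.

```shell
bsearch_demo() {
    # $1 = key to look for; sorted integer keys arrive on stdin
    awk -v want="$1" '
    # Return the index of key in the sorted array keys[1..n], or 0 if absent.
    # lo, hi and mid are locals declared as extra parameters.
    function bsearch(keys, n, key,    lo, hi, mid) {
        lo = 1
        hi = n
        while (lo <= hi) {
            mid = int((lo + hi) / 2)
            if (keys[mid] == key)
                return mid
            if (keys[mid] < key)
                lo = mid + 1
            else
                hi = mid - 1
        }
        return 0
    }

    # Load every record into an indexed array; "+ 0" forces numeric keys.
    { keys[++n] = $1 + 0 }

    END { print bsearch(keys, n, want + 0) }
    '
}

printf '%s\n' 10 20 30 40 50 | bsearch_demo 40    # prints 4
```

The array is passed by reference, so handing a large table to `bsearch` does not copy it on each call.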
projects (older stuff)
A mix of awk and bash scripts for log file manipulation.
- junkview
- iptables firewall log analysis project home page
- junkshow
- example showing traffic encountered on this firewall / web server
- pre-filter
- this bash script can be used to feed a stream of records from logfiles in datestamp sequence to datestamp-filter, reading only those logfiles required to satisfy the start date condition, which may be specified as an absolute date or as a number of most recent days. Handles 'logrotate'd and compressed log files.
- datestamp-filter updated 2008-10-11
- logfile datestamp filter with an optional record offset to the start of the datestamp. Documentation is in the source at this time. It has a debug feature that indicates datestamp cursor and status. Worth a look if you build custom log analysis scripts and need to select records based on datestamp, particularly an embedded datestamp: it finds datestamps that do not start in column one too.
- last-filter
- simple awk filter / formatter for the `last -adx` command.
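To give a feel for the idea behind selecting records on an embedded datestamp, here is a much simplified sketch. It is not the datestamp-filter script itself: the name `stamp_filter_demo` and the sample log lines are made up, and it assumes an ISO-style `YYYY-MM-DD` datestamp, so that plain string comparison matches date order. Real syslog datestamps would need converting first.

```shell
stamp_filter_demo() {
    # $1 = start datestamp, $2 = 1-based column where the datestamp begins
    awk -v start="$1" -v col="$2" '
    {
        # Cut out as many characters as the start stamp has,
        # beginning at the given offset into the record.
        stamp = substr($0, col, length(start))

        # String comparison: works because ISO dates sort chronologically.
        if (stamp >= start)
            print
    }
    '
}

# The datestamp starts in column 5, after the "fw: " prefix.
printf '%s\n' \
    'fw: 2008-10-09 drop' \
    'fw: 2008-10-11 drop' \
    'fw: 2008-10-12 accept' | stamp_filter_demo 2008-10-11 5
```

With the sample input above, only the records dated 2008-10-11 and 2008-10-12 are printed.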
awk scripts (newer stuff)
- pak-web-site
- A script to scan dynamic SSI web pages, insert semi-static information from the SSI files, then compress the now static page. This technique saves about 70% space; put another way, the web site now uses about 70% less bandwidth for the same amount of information sent to users.
- sf4sf
- A firewall log pretty printer tool with IP to country name lookup
- ccfind, ip2cn page
- A gawk script to query the ip2cn-server, converts IP to country code and name. The ip2cn page has more gawk clients for the server.
- netdraw
- Awk and bash scripts to log and display Internet connection activity