Friday, 2 May 2014

awk


awk is an interpreted programming language designed for text processing, typically used as a data extraction and reporting tool.
The name awk comes from the initials of its designers: Alfred V. Aho, Peter J. Weinberger and Brian W. Kernighan. The original version of awk was written in 1977 at AT&T Bell Laboratories.
The basic function of awk is to search files for lines (or other units of text) that contain certain patterns. When a line matches one of the patterns, awk performs the specified actions on that line. awk keeps processing input lines in this way until it reaches the end of the input file.

AWK is a language for processing text files. A file is treated as a sequence of records, and by default each line is a record. Each line is broken up into a sequence of fields, so we can think of the first word in a line as the first field, the second word as the second field, and so on. An AWK program is a sequence of pattern-action statements. AWK reads the input a line at a time. A line is scanned for each pattern in the program, and for each pattern that matches, the associated action is executed. - Alfred V. Aho

Some examples
 
  awk '/imocha/ { print $0 }' filename.txt

The command above prints every line of filename.txt that contains the string 'imocha'.
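A pattern does not have to be a regular expression; it can also be a comparison on a single field. The sketch below is only an illustration: the sample records (name, department, salary) and the variable names `input` and `sales` are made up for this example and do not come from any real file.

```shell
# Hypothetical sample input: name, department, salary (one record per line).
input='asha sales 4200
ravi dev 5100
meena sales 3900'

# Pattern: the second field equals "sales".
# Action: print the first and third fields of the matching record.
sales=$(printf '%s\n' "$input" | awk '$2 == "sales" { print $1, $3 }')

printf '%s\n' "$sales"
```

Only the two records whose second field is "sales" reach the action, so the "dev" record is skipped.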

  awk '{ if (length($0) > max) max = length($0) }  END { print max }' stud1.txt

The command above prints the length of the longest line in stud1.txt.
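If the longest line itself is wanted as well as its length, one possible variation keeps a copy of the line whenever a new maximum is seen. This is a minimal sketch; the three input lines and the variable name `longest` are invented for the illustration.

```shell
# Hypothetical three-line input, piped in instead of read from a file.
# Whenever a longer line is seen, remember both its length and its text.
longest=$(printf 'one\nthree\nfour\n' |
  awk 'length($0) > max { max = length($0); line = $0 }
       END { print max ": " line }')

printf '%s\n' "$longest"
```

The condition is written as a pattern in front of the action rather than as an `if` inside it; both forms behave the same way here.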
 
Built-in variables of awk

Variable    Description
NR          Keeps a current count of the number of input records.
NF          Keeps a count of the number of fields in an input record. The last field in the input record can be designated by $NF.
FILENAME    Contains the name of the current input file.
FS          Contains the "field separator" character used to divide fields on the input record. The default, "white space", includes any space and tab characters. FS can be reassigned to another character to change the field separator.
RS          Stores the current "record separator" character. Since, by default, an input line is the input record, the default record separator character is a "newline".
OFS         Stores the "output field separator", which separates the fields when Awk prints them. The default is a "space" character.
ORS         Stores the "output record separator", which separates the output records when Awk prints them. The default is a "newline" character.
OFMT        Stores the format for numeric output. The default format is "%.6g".
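
A few of these variables can be seen working together in a short sketch. The colon-separated sample records and the shell variable names (`input`, `report`, `rounded`) are assumptions made up for this example.

```shell
# Hypothetical colon-separated input: name, age, city.
input='alice:30:london
bob:25:paris'

# FS splits each record on ":", OFS joins the printed fields with " | ".
# NR is the record number, NF the field count, $NF the last field.
report=$(printf '%s\n' "$input" |
  awk 'BEGIN { FS = ":"; OFS = " | " } { print NR, NF, $1, $NF }')
printf '%s\n' "$report"

# OFMT controls how print formats non-integer numbers.
rounded=$(awk 'BEGIN { OFMT = "%.2f"; print 3.14159 }')
printf '%s\n' "$rounded"
```

FS and OFS are set in a BEGIN block so that they are already in effect when the first record is read.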
