(gawk.info.gz) Escape Sequences

Info Catalog (gawk.info.gz) Regexp Usage (gawk.info.gz) Regexp (gawk.info.gz) Regexp Operators
 
 Escape Sequences
 ================
 
    Some characters cannot be included literally in string constants
 (`"foo"') or regexp constants (`/foo/').  Instead, they should be
 represented with "escape sequences", which are character sequences
 beginning with a backslash (`\').  One use of an escape sequence is to
 include a double-quote character in a string constant.  Because a plain
 double quote ends the string, you must use `\"' to represent an actual
 double-quote character as a part of the string.  For example:
 
      $ awk 'BEGIN { print "He said \"hi!\" to her." }'
      -| He said "hi!" to her.
 
    The backslash character itself is another character that cannot be
 included normally; you must write `\\' to put one backslash in the
 string or regexp.  Thus, the string whose contents are the two
 characters `"' and `\' must be written `"\"\\"'.
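
    As a quick check of the quoting rule above, the following sketch
 builds that two-character string and prints its length and contents.
 It assumes a POSIX shell with some `awk' on the search path; the shell
 variable name is only illustrative:

```shell
# The awk string constant "\"\\" holds exactly two characters: " and \.
quote_backslash=$(awk 'BEGIN { s = "\"\\"; print length(s) ": " s }')
printf '%s\n' "$quote_backslash"    # prints: 2: "\
```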
 
    Backslash also represents unprintable characters such as TAB or
 newline.  While there is nothing to stop you from entering most
 unprintable characters directly in a string constant or regexp constant,
 they may look ugly.
 
    The following table lists all the escape sequences used in `awk' and
 what they represent. Unless noted otherwise, all these escape sequences
 apply to both string constants and regexp constants:
 
 `\\'
      A literal backslash, `\'.
 
 `\a'
      The "alert" character, `Ctrl-g', ASCII code 7 (BEL).  (This
      usually makes some sort of audible noise.)
 
 `\b'
      Backspace, `Ctrl-h', ASCII code 8 (BS).
 
 `\f'
      Formfeed, `Ctrl-l', ASCII code 12 (FF).
 
 `\n'
      Newline, `Ctrl-j', ASCII code 10 (LF).
 
 `\r'
      Carriage return, `Ctrl-m', ASCII code 13 (CR).
 
 `\t'
      Horizontal TAB, `Ctrl-i', ASCII code 9 (HT).
 
 `\v'
      Vertical tab, `Ctrl-k', ASCII code 11 (VT).
 
 `\NNN'
      The octal value NNN, where NNN stands for 1 to 3 digits between
      `0' and `7'.  For example, the code for the ASCII ESC (escape)
      character is `\033'.
 
 `\xHH...'
      The hexadecimal value HH, where HH stands for a sequence of
      hexadecimal digits (`0'-`9', and either `A'-`F' or `a'-`f').  Like
      the same construct in ISO C, the escape sequence continues until
      the first nonhexadecimal digit is seen.  However, using more than
      two hexadecimal digits produces undefined results. (The `\x'
      escape sequence is not allowed in POSIX `awk'.)
 
 `\/'
      A literal slash (necessary for regexp constants only).  This
      expression is used when you want to write a regexp constant that
      contains a slash. Because the regexp is delimited by slashes, you
      need to escape the slash that is part of the pattern, in order to
      tell `awk' to keep processing the rest of the regexp.
 
 `\"'
      A literal double quote (necessary for string constants only).
      This expression is used when you want to write a string constant
      that contains a double quote. Because the string is delimited by
      double quotes, you need to escape the quote that is part of the
      string, in order to tell `awk' to keep processing the rest of the
      string.
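
    A few of the sequences in the table can be tried directly.  This is
 a minimal sketch, assuming a POSIX shell and an ASCII-based `awk'; the
 shell variable names are made up for the illustration.  It avoids `\x'
 because that sequence is not available in every `awk':

```shell
# Octal escapes: \101, \102, and \103 are the ASCII codes for A, B, and C.
octal=$(awk 'BEGIN { print "\101\102\103" }')

# A TAB written as \t is a single character inside the string.
tab_len=$(awk 'BEGIN { print length("a\tb") }')

# \/ puts a slash inside a regexp constant without ending the regexp.
slash=$(echo 'usr/bin' | awk '/usr\/bin/ { print "match" }')

printf '%s\n' "$octal" "$tab_len" "$slash"    # prints ABC, 3, match
```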
 
    In `gawk', a number of additional two-character sequences that begin
 with a backslash have special meaning in regexps.  See
 `gawk'-Specific Regexp Operators: GNU Regexp Operators.
 
    In a regexp, a backslash before any character that is not in the
 previous list and not listed in `gawk'-Specific Regexp Operators: GNU
 Regexp Operators means that the next character should be taken
 literally, even if it would normally be a regexp operator.  For
 example, `/a\+b/' matches the three characters `a+b'.
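
    The `/a\+b/' case can be checked with a short sketch.  It assumes a
 POSIX shell and an `awk' that behaves as described here; the variable
 name is illustrative.  Only the input line containing a real plus sign
 should be reported:

```shell
# Line 1 holds the literal text a+b; line 2 would match only if the
# plus sign were still acting as the "one or more" operator.
plus_literal=$(printf 'a+b\naab\n' | awk '/a\+b/ { print NR ": " $0 }')
printf '%s\n' "$plus_literal"    # prints: 1: a+b
```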
 
    For complete portability, do not use a backslash before any
 character not shown in the previous list.
 
    To summarize:
 
    * The escape sequences in the table above are always processed first,
      for both string constants and regexp constants. This happens very
      early, as soon as `awk' reads your program.
 
    * `gawk' processes both regexp constants and dynamic regexps (see
      Using Dynamic Regexps: Computed Regexps) for the special
      operators listed in `gawk'-Specific Regexp Operators: GNU
      Regexp Operators.
 
    * A backslash before any other character means to treat that
      character literally.
 
 Advanced Notes: Backslash Before Regular Characters
 ---------------------------------------------------
 
    If you place a backslash in a string constant before something that
 is not one of the characters previously listed, POSIX `awk' purposely
 leaves what happens as undefined.  There are two choices:
 
 Strip the backslash out
      This is what Unix `awk' and `gawk' both do.  For example, `"a\qc"'
      is the same as `"aqc"'.  (Because this is such an easy bug both to
      introduce and to miss, `gawk' warns you about it.)  Consider `FS =
      "[ \t]+\|[ \t]+"', intended to use vertical bars surrounded by
      whitespace as the field separator.  The single backslash is
      stripped, so the regexp no longer contains an escaped vertical
      bar.  There should be two backslashes in the string: `FS =
      "[ \t]+\\|[ \t]+"'.
 
 Leave the backslash alone
      Some other `awk' implementations do this.  In such
      implementations, typing `"a\qc"' is the same as typing `"a\\qc"'.
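
    The doubled-backslash form of the field separator can be exercised
 as follows.  This is a sketch under the assumption of a POSIX shell and
 any reasonably standard `awk'; the sample input and the variable name
 are invented for the illustration.  The single-backslash form is left
 out on purpose, because its result differs between implementations:

```shell
# "\\|" in the string constant becomes \| in the regexp, which is a
# literal vertical bar, so the line splits into three fields.
fields=$(echo 'a  |  b  |  c' |
    awk 'BEGIN { FS = "[ \t]+\\|[ \t]+" } { print NF ": " $2 }')
printf '%s\n' "$fields"    # prints: 3: b
```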
 
 Advanced Notes: Escape Sequences for Metacharacters
 ---------------------------------------------------
 
    Suppose you use an octal or hexadecimal escape to represent a regexp
 metacharacter.  (See Regular Expression Operators: Regexp Operators.)
 Does `awk' treat the character as a literal character or as a regexp
 operator?
 
    Historically, such characters were taken literally.  (d.c.)
 However, the POSIX standard indicates that they should be treated as
 real metacharacters, which is what `gawk' does.  In compatibility mode
 (see Command-Line Options: Options), `gawk' treats the characters
 represented by octal and hexadecimal escape sequences literally when
 used in regexp constants. Thus, `/a\52b/' is equivalent to `/a\*b/'.
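
    Which behavior a particular `awk' exhibits can be probed with a
 sketch like the one below.  It assumes a POSIX shell; the variable name
 and the three result words are invented here, and the answer depends on
 the implementation and on the mode it runs in, so no particular output
 is claimed:

```shell
# \52 is the octal code for `*'.  If it is taken literally, the regexp
# matches the text a*b; if it acts as an operator, it matches aab.
verdict=$(awk 'BEGIN {
    if ("a*b" ~ /^a\52b$/)      print "literal"
    else if ("aab" ~ /^a\52b$/) print "operator"
    else                        print "unclear"
}')
printf '%s\n' "$verdict"
```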
 