Strange behaviour with awk

My colleague and I recently gave an introductory course on Awk and the basics of shell scripting. The main aim was to present some tricks for writing scripts.

Here is an interesting question that came up after the course, along with its answer (or should I say one answer among several)!

Awk seems to behave weirdly with a rather simple command.

To illustrate the problem, we had a file called “file.txt” like this one:

1 2 3 4

And we ran this command:

gawk '{print $NF ; print $NF " something " $0}' file.txt

For the second print, we get something like this:

something 1 2 3 4

when we were expecting something like this:

4 something 1 2 3 4

So what’s wrong here?

Found it?

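In fact, it is an awfully classic problem: the file was saved in DOS format! A hidden carriage return character (ASCII 015 in octal) sits at the end of each line, so it ends up in the last field. When the terminal prints it, the cursor jumps back to the start of the line and the rest of the output overwrites the 4.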
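To see the hidden character for yourself, here is a minimal sketch. It assumes a POSIX shell with od and awk available (gawk behaves the same way here); the temporary directory and the name dos_sample.txt are only there for illustration.

```shell
# Recreate a DOS-style sample in a scratch directory (hypothetical file name).
tmpdir=$(mktemp -d)
sample="$tmpdir/dos_sample.txt"
printf '1 2 3 4\r\n' > "$sample"

# od -c prints every byte, so the carriage return shows up as \r before \n.
od -c "$sample"

# The line is 8 characters long instead of 7: the extra one is the carriage return.
line_len=$(awk '{ print length($0) }' "$sample")
echo "line length: $line_len"
```

With a Unix-style file the same length check would report 7.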

What are our options here?

We could use the dos2unix command, which converts our file to a Unix-style file, but we can also use the following trick:

gawk '{gsub("\015$","") ; print $NF ; print $NF " something " $0}' file.txt >file.unix

Explanation:

We use the global substitution function “gsub” here. It replaces the ASCII character “\015” (the carriage return) with nothing (“”), but only when it sits at the end of the line: that is what the “$” anchor is for. As you have understood, this ASCII character 015 is the one that turned our simple command into a devil-driven nightmare (anyone who has ever wasted several hours on such a problem will understand what I am talking about!)
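A small sketch of what the substitution does, assuming any POSIX awk (we use plain awk here, gawk gives the same result). The input strings are made up for the demonstration.

```shell
# With the trailing carriage return stripped, the last field is a clean "4".
fixed=$(printf '1 2 3 4\r\n' | awk '{ gsub("\015$", ""); print $NF " something " $0 }')
echo "$fixed"    # prints: 4 something 1 2 3 4

# Because of the "$" anchor, only the carriage return at the end of the line
# is removed; one in the middle of the line is left alone.
# "a\rb\r" has 4 characters, and 3 remain after the substitution.
remaining=$(printf 'a\rb\r\n' | awk '{ gsub("\015$", ""); print length($0) }')
echo "characters left: $remaining"
```

Since at most one carriage return can match at the end of a line, sub would do the same job as gsub here.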

This trick is a good thing to add to your command whenever you must process files that come from another platform: at best it does nothing, at worst it saves you from the problem!

How to avoid this kind of problem?

  1. Use the “file” command to test your file
  2. Open the file with emacs and check that the encoding is not DOS (you can see this in the bottom left part of your buffer)
  3. Retrieve files over FTP with ASCII mode enabled by default
  4. Use dos2unix every time you have a doubt about a file you receive
  5. Change the world: work only with people using Linux!
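If dos2unix is not at hand, the check and the conversion can be sketched with awk alone. This is only an illustration under a few assumptions: a POSIX shell, a made-up helper called count_crlf, and scratch files named dos.txt and unix.txt.

```shell
# Build a small DOS-style file to play with (hypothetical names).
tmpdir=$(mktemp -d)
printf 'a b\r\nc d\r\n' > "$tmpdir/dos.txt"

# count_crlf: hypothetical helper that counts the lines ending in a carriage return.
count_crlf() {
    awk '$0 ~ "\015$" { n++ } END { print n + 0 }' "$1"
}

before=$(count_crlf "$tmpdir/dos.txt")

# Strip the trailing carriage return from every line, like dos2unix would.
awk '{ sub("\015$", ""); print }' "$tmpdir/dos.txt" > "$tmpdir/unix.txt"

after=$(count_crlf "$tmpdir/unix.txt")
echo "DOS line endings before: $before, after: $after"
```

A count greater than zero is a good hint that the file came from a DOS or Windows machine.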
  1. April 12, 2011 at 7:08 am

    Well done, our François!

    • April 20, 2011 at 8:16 am

      If it can be useful, that is all I need to keep going!


