Two other classic problems… and their solutions

After the recent course I gave on Unix and awk basics, several of my colleagues came to my office with some strange cases. These were problems I had run into so long ago that I completely forgot to mention them.

First problem: wc is not counting lines properly

You open a file, read it, and see that it has, let's say, 10 lines, but wc -l tells you that it counts 9, or 11, or more.

What's wrong?

The problem is in the last line of your file. wc -l does not really count lines: it counts newline characters. Some software does not terminate the last line with a newline, so the file ends right after the last piece of text instead of after a final newline. That last line is then not counted, which explains why wc -l reports 9 lines when you actually have 10.
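To see the effect in isolation, here is a small sketch that builds two ten-line files, one whose last line lacks the trailing newline and one that is properly terminated. The file names and the temporary directory are only illustrative assumptions.

```shell
# Work in a throwaway directory (hypothetical file names).
tmpdir=$(mktemp -d)

# Ten lines of text, but the last one has no trailing newline.
printf 'line %d\n' 1 2 3 4 5 6 7 8 9 > "$tmpdir/unterminated.txt"
printf 'line 10' >> "$tmpdir/unterminated.txt"

# The same ten lines, each one properly terminated.
printf 'line %d\n' 1 2 3 4 5 6 7 8 9 10 > "$tmpdir/terminated.txt"

# Reading from stdin keeps the file name out of wc's output;
# tr removes the padding that some wc implementations add.
unterminated_count=$(wc -l < "$tmpdir/unterminated.txt" | tr -d ' ')
terminated_count=$(wc -l < "$tmpdir/terminated.txt" | tr -d ' ')

echo "unterminated: $unterminated_count"   # unterminated: 9
echo "terminated:   $terminated_count"     # terminated:   10

rm -rf "$tmpdir"
```

Both files show ten lines in an editor, yet only the second one contains ten newline characters.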

An easy way to fix this is to open the problematic file with any editor, go to the end of the last line, and press Enter so that the cursor moves to a new empty line. This way your last line is properly terminated and wc -l counts it.

If wc -l tells you that you have 11 or more lines, you probably have extra empty lines at the end of your file. Just open the problematic file and delete them.
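If you would rather not open an editor, both repairs can be scripted. The two helper functions below are hypothetical names introduced for this sketch, not standard tools. The first appends a newline only when the last byte of the file is not one. The second rewrites the file without its trailing empty lines, and assumes it is acceptable to create a temporary file next to the original.

```shell
# Append a newline only if the file is non-empty and does not end with one.
# Command substitution strips a trailing newline, so the result of
# tail -c 1 is empty exactly when the last byte is a newline.
fix_missing_newline() {
    if [ -s "$1" ] && [ -n "$(tail -c 1 "$1")" ]; then
        printf '\n' >> "$1"
    fi
}

# Drop empty lines at the end of the file. Empty lines are held back as a
# counter and only printed once a non-empty line follows them, so the ones
# at the very end are never written out.
strip_trailing_blank_lines() {
    awk '
        /^$/ { blanks++; next }
        {
            while (blanks > 0) { print ""; blanks-- }
            print
        }
    ' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Usage (data.txt is a placeholder file name):
#   fix_missing_newline data.txt
#   strip_trailing_blank_lines data.txt
```

Because awk terminates every record it prints with a newline, the second function also repairs a missing final newline as a side effect.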

Second problem: floating-point numbers are written with a comma by gawk

This bug is in fact related to the fact that I am French (nobody's perfect). On one server among all those available at the office, the LANG variable is set to "fr_FR", so numbers are formatted the French way, with a decimal comma. The fix that worked for me was to change the value of LANG to either "en_US" or, better, "fr_FR.UTF-8".

This can conveniently be done by adding the following line to any script before calling awk (or by adding it to your .profile or .bashrc).

export LANG="fr_FR.UTF-8"
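An alternative that does not depend on which locales are installed is to force the C locale for the awk call alone. This is only a sketch of that approach, not the fix described above: LC_ALL takes precedence over LANG, and the C locale always uses a period as the decimal separator. The number used is an arbitrary example.

```shell
# Set LC_ALL for this single command; the rest of the session keeps its locale.
value=$(LC_ALL=C awk 'BEGIN { printf "%.2f\n", 3.14159 }')
echo "$value"   # 3.14
```

To apply the same setting to a whole script, one option is to put export LC_NUMERIC=C near the top, which changes number formatting while leaving the language of messages untouched.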

Should all our scripts be protected against such problems?

Honestly, these problems are pretty rare, and with well-designed scripts on a properly configured platform you shouldn't run into them. In fact, the problem with wc generally appears after a file has been edited by hand. And the LANG problem only appeared for me with a non-UTF-8 locale (which is not so common nowadays). So just remember that this kind of problem can show up from time to time, but don't focus too much on it.

Categories: Awk, Linux, Shell