CSV parsing methods in Linux

22 Oct 2009

by hemanth

# Get email addresses from a CSV file exported from your mailbox

grep -o -i '[^,]*@[^,]*' contacts.csv

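To see what the email pattern picks up, here is a small sketch run against made-up data. The file name contacts_demo.csv and its contents are assumptions for illustration only. The -i flag is left out because the pattern contains no letters, so it makes no difference to the match. If your export wraps fields in double quotes, expect the quotes to be part of each match.

```shell
# Hypothetical sample data; contacts_demo.csv is a made-up file name.
tmpdir=$(mktemp -d)
cat > "$tmpdir/contacts_demo.csv" <<'EOF'
Alice,alice@example.com,555-0100
Bob,,555-0101
Carol,Carol@Example.org,555-0102
EOF

# -o prints only the matched text; [^,]* keeps the match inside one field.
emails=$(grep -o '[^,]*@[^,]*' "$tmpdir/contacts_demo.csv")
printf '%s\n' "$emails"
rm -r "$tmpdir"
```

With this sample the row for Bob has no address, so only two lines come out.
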
# Cut the required columns, assuming the delimiter is ","

cut -d, -f2,3 < input.csv

# For rows containing ${VALUE}, print the field or field range given in ${INDEX} (e.g. 2 or 2-4)

grep "${VALUE}" input.csv | cut -d, -f"${INDEX}"

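One thing to keep in mind with grep piped into cut: grep matches the value anywhere on the line, not in a particular column. The sketch below uses made-up data (people_demo.csv and the values of VALUE and INDEX are assumptions) to show the difference, with an awk variant that compares against one column only.

```shell
# Hypothetical sample data; people_demo.csv is a made-up file name.
tmpdir=$(mktemp -d)
cat > "$tmpdir/people_demo.csv" <<'EOF'
id,name,city
1,Paris,London
2,Anna,Paris
EOF

VALUE=Paris   # value to look for (assumed for the demo)
INDEX=2       # field to print (assumed for the demo)

# grep matches anywhere on the line, so both data rows are selected.
anywhere=$(grep "$VALUE" "$tmpdir/people_demo.csv" | cut -d, -f"$INDEX")

# awk can restrict the comparison to a single column (column 3 here).
in_city=$(awk -F, -v v="$VALUE" -v i="$INDEX" '$3 == v { print $i }' "$tmpdir/people_demo.csv")

printf 'grep+cut:\n%s\n' "$anywhere"
printf 'awk on column 3:\n%s\n' "$in_city"
rm -r "$tmpdir"
```

Which one you want depends on whether the value can show up in more than one column of your data.
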
# Get the first two columns of the CSV file (-F, makes awk split on commas instead of whitespace)

awk -F, '{print $1"\t"$2}' input.csv

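# CSV whose lines match the regex /^(.*),,/; $f is the file here.
# A non-empty first field is remembered as the prefix; lines with an empty first field are printed with that prefix prepended.
awk -F, '{ if ($1 != "") prefix = $1; else printf "%s%s\n", prefix, $0 }' "$f"
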
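Here is a rough sketch of what the prefix one-liner does. The layout of groups_demo.csv is my assumption: group names sit on their own line and the rows under them start with an empty first field. Note the space between -F, and the program, without it the shell glues the two together into one argument.

```shell
# Hypothetical sample data; groups_demo.csv is a made-up file name.
tmpdir=$(mktemp -d)
f="$tmpdir/groups_demo.csv"
cat > "$f" <<'EOF'
fruits
,apple,3
,pear,5
tools
,hammer,1
EOF

# Lines with a first field only set the prefix; the others are printed with it.
filled=$(awk -F, '{ if ($1 != "") prefix = $1; else printf "%s%s\n", prefix, $0 }' "$f")
printf '%s\n' "$filled"
rm -r "$tmpdir"
```

For this input the result is fruits,apple,3 then fruits,pear,5 then tools,hammer,1. The group lines themselves are not printed.
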
# CSV of the format field1 , field2 , field3 , field4
# we can set the FS to a regexp as FS='^ *| *, *|

awk -F, '{
  for (i = 1; i <= NF; i++) {
    gsub(/^ *| *$/, "", $i)   # trim the spaces around each field
    print "Field " i " is " $i
  }
}' input.csv

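The idea of letting FS be a regexp can be tried on a single line. This is only a sketch: the separator ' *, *' is a simplified pattern I picked for the demo, and it does not strip padding at the very start or end of a line.

```shell
# A separator longer than one character is treated by awk as a regular expression.
line='field1 , field2 , field3 , field4'
fields=$(printf '%s\n' "$line" | awk -F ' *, *' '{
  for (i = 1; i <= NF; i++) print "Field " i " is [" $i "]"
}')
printf '%s\n' "$fields"
```

The brackets in the output are just there to make any leftover spaces visible.
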
# For CSV of the format "field1","field2","field3","field4"
# we can set the FS to FS='^"|","|"$' or FS='","|"'

awk -F, '{
  for (i = 1; i <= NF; i++) {
    gsub(/^ *"|" *$/, "", $i)   # strip the surrounding quotes and padding
    print "Field " i " is " $i
  }
}' input.csv

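When every field is quoted, splitting on the three characters "," instead of a bare comma has a nice side effect: commas inside a field survive. A minimal sketch, assuming all fields are double-quoted and contain no double quotes themselves (the sample line is made up):

```shell
line='"field1","field2,with,commas","field3"'
quoted=$(printf '%s\n' "$line" | awk -F '","' '{
  gsub(/^"|"$/, "")   # strip the outer quotes; awk re-splits $0 afterwards
  for (i = 1; i <= NF; i++) print "Field " i " is " $i
}')
printf '%s\n' "$quoted"
```

Field 2 comes out as field2,with,commas in one piece.
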
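# For CSV => field1,"field2,with,commas",field3 ,"field4,foo"

awk '{
  $0 = $0 ","                            # add a trailing comma so every field ends with one
  c = 0                                  # restart the field counter on each line
  while ($0 != "") {
    match($0, /[^,]*,| *"[^"]*" *,/)     # next field, bare or double-quoted
    f = substr($0, RSTART, RLENGTH)      # save the matched field
    $0 = substr($0, RSTART + RLENGTH)    # drop what was matched (sub(f, "") would treat f as a regex)
    gsub(/^ *"?|"? *,$/, "", f)          # remove quotes, padding and the trailing comma
    print "Field " (++c) " is " f
  }
}' input.csv
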
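To try the mixed quoted/unquoted case on the sample line from the comment, the loop can be wrapped in a small shell function. The name split_mixed_csv is made up for this sketch. It relies on awk picking the longest match at the leftmost position, and it does not handle escaped quotes ("") or fields that span several lines.

```shell
# split_mixed_csv is a hypothetical helper name; it reads CSV lines on stdin.
split_mixed_csv() {
  awk '{
    line = $0 ","                          # trailing comma so every field ends with one
    n = 0                                  # field counter, restarted per line
    while (line != "") {
      match(line, /[^,]*,| *"[^"]*" *,/)   # bare field or double-quoted field
      f = substr(line, RSTART, RLENGTH)    # the matched field, comma included
      line = substr(line, RSTART + RLENGTH)
      gsub(/^ *"?|"? *,$/, "", f)          # drop quotes, padding, trailing comma
      print "Field " (++n) " is " f
    }
  }'
}

parsed=$(printf '%s\n' 'field1,"field2,with,commas",field3 ,"field4,foo"' | split_mixed_csv)
printf '%s\n' "$parsed"
```

The remainder of the line is cut with substr rather than sub, so field contents are never interpreted as a regular expression.
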

This is my personal collection. Most of these were suggested by the uber cool g33ks of the terminal, with a few bits patched by me. There are many other ways out there; please do comment them here.