awk to include name in field between location and average
In the awk below, the location ($1) and the calculated average of $4 are printed. I cannot seem to get the syntax correct to include $2 in the output between $1 and the average. Thank you :).
awk '{ if(len==0){ last=$1;total=$4;len=1;getline } if($1!=last){ printf("%s\t%f\n", last, total/len); last=$1;total=$4;len=1 } else{ total+=$4;len+=1 } } END{ printf("%s\t%f\n", last, total/len) }' input.bed > output.txt
input.bed
chr1:955542-955763 agrn:exon.1 1 0
chr1:955542-955763 agrn:exon.1 2 0
chr1:955542-955763 agrn:exon.2 3 0
chr1:955542-955763 agrn:exon.2 4 1
current output.txt
chr1:955542-955763 21.289593
chr1:957570-957852 304.861702
desired output.txt
chr1:955542-955763 agrn:exon.1 21.289593
chr1:957570-957852 agrn:exon.2 304.861702
Maybe something like this (my additions are marked with **; this is the version whose syntax I cannot get right):
awk '{if(len==0){last=$1;**name=$2**,total=$4;len=1;getline}if($1!=last){printf("%s\t%f\n", last, ,**name**, total/len);last=$1;name=$2;total=$4;len=1}else{total+=$4;len+=1}}END{printf("%s\t%f\n", last,**name**, total/len)}' input.bed > output.txt
The input and outputs posted are not the real numbers, so the values shown are not meant to match each other :)
edit:
awk '{for (i=1; i<=NF; i++) print i, $i}' ionxpress_008_150902_4column.bed | head -4
1 chr1:955542-955763
2 agrn:exon.1
3 1
4 0
I think the key should be the combination of the first two fields. With the sample input provided,
$ awk '{k=$1 OFS $2; s[k]+=$4; c[k]++} END{for(i in s) print i, s[i]/c[i]}' file
will produce this
chr1:955542-955763 agrn:exon.1 0
chr1:955542-955763 agrn:exon.2 0.5
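One thing to keep in mind: awk's `for (i in s)` visits keys in an unspecified order, so with more groups the output lines may not follow the order of the input. A minimal sketch that remembers the order in which keys first appear, assuming that order matters to you (the function name avg_by_key_ordered and the order/n variables are only illustrative):

```shell
# Hypothetical wrapper; takes the input file as its first argument.
avg_by_key_ordered() {
    awk '
    {
        k = $1 OFS $2
        if (!(k in c)) order[++n] = k   # remember each key the first time it appears
        s[k] += $4                      # running sum of column 4 per key
        c[k]++                          # number of rows per key
    }
    END {
        # walk the keys in first-seen order instead of using for-in
        for (j = 1; j <= n; j++) print order[j], s[order[j]] / c[order[j]]
    }
    ' "$1"
}
```

Called as `avg_by_key_ordered file`, this should print the same two lines as above for the sample input, with exon.1 before exon.2.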
If field 2 is not part of the key and you want the value from the last row for each field 1,
$ awk '{k=$1; s[k]+=$4; f2[k]=$2; c[k]++} END{for(i in s) print i, f2[i], s[i]/c[i]}' file
will produce
chr1:955542-955763 agrn:exon.2 0.25
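For what it's worth, the single-pass approach from the question can also be made to work. In the attempt, the comma after name=$2 needs to be a semicolon, the stray extra comma in the printf argument list has to go, and the printf format needs a second %s for the extra argument. A minimal sketch, assuming the input is already grouped by $1 and that the name from the first row of each group is the one wanted (avg_by_location is just a hypothetical wrapper name, and the getline trick is replaced by plain pattern rules):

```shell
# Hypothetical wrapper; takes the input file as its first argument.
avg_by_location() {
    awk '
    NR > 1 && $1 != last {              # location changed: print the finished group
        printf("%s\t%s\t%f\n", last, name, total / len)
        total = 0; len = 0
    }
    len == 0 { last = $1; name = $2 }   # first row of a group: remember location and name
    { total += $4; len++ }              # accumulate column 4 for the current group
    END { if (len) printf("%s\t%s\t%f\n", last, name, total / len) }
    ' "$1"
}
```

Usage would be `avg_by_location input.bed > output.txt`. Unlike the array versions, this keeps the input order and the tab-separated %f format of the original script, but it only gives correct averages when all rows for a location are adjacent.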