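How to rename FASTQ headers (e.g. to remove duplicate read names)

The awk one-liners below replace the header line of each record (every fourth line, starting with line 1) with a sequential name of the form @1_N, so that every read name is unique. This assumes exactly four lines per record. The original read names, including the /1 suffix, are discarded.
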
[bobbieshaban@spartan-login1 analysis]$ awk '{if(NR%4==1) $0=sprintf("@1_%d",(1+i++)); print;}' hiseq_reads_R1.fastq | less
[bobbieshaban@spartan-login1 analysis]$ awk '{if(NR%4==1) $0=sprintf("@1_%d",(1+i++)); print;}' hiseq_reads_R1.fastq > hiseq_R1.fastq
[bobbieshaban@spartan-login1 analysis]$ awk '{if(NR%4==1) $0=sprintf("@1_%d",(1+i++)); print;}' hiseq_reads_R2.fastq > hiseq_R2.fastq
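
The same renaming can be wrapped in a small shell function so that the prefix is not hard-coded. This is only a sketch: the function name renumber_fastq and its two arguments are made up for illustration and are not part of the original session, and it assumes a FASTQ file with exactly four lines per record (no wrapped sequence lines).

```shell
# Hypothetical helper (not from the original session): replace each FASTQ
# header with "@<prefix>_<n>", numbering records from 1.
# Assumes exactly four lines per record.
renumber_fastq() {
    prefix=$1
    infile=$2
    # NR % 4 == 1 selects the header line of every record; all lines are printed.
    awk -v p="$prefix" 'NR % 4 == 1 { $0 = sprintf("@%s_%d", p, ++n) } { print }' "$infile"
}
```

For example, renumber_fastq 1 hiseq_reads_R1.fastq > hiseq_R1.fastq would produce the same kind of output as the command above. Using the same prefix for the R1 and R2 files keeps the two reads of a pair under the same name, provided both files list their reads in the same order.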

 

Output (first 10 lines before and after renaming):

 

[bobbieshaban@spartan-login1 analysis]$ head hiseq_reads_R1.fastq
@CP009257.1_0/1
GTCAATAATAGCTGTAGCAGAAGAACAGCAGATAGTGAGAAGCATTCTAAAGGAATTGTGACTTTTAATAAATTGCTTAACGACTAAAGTGTAAAAAAAATCCAAACTAAATAAAGAGTATTTTTT
+
BB/BBFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFFF/FBFFFFFFFFF/FFFFFFFFBFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFB/F<<FF
@CP009257.1_1/1
TGCACCAAAAGGGTTAAGTTTCCCTGCAAGTAGTGCTTCAGAAGTCTTACCGTAGTTTAAGTAGCCGCTGTCTAATGCATCTGAGGCTTTACTTTTCGCATAGGTAATGCCTGTATTGATATCCCA
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFF/FFFFFFFFFFFFFFFFFFFFFFB/FFFB//FB
@CP009257.1_2/1
TACGTTAATTCTAGAAAATGTACAAGATCCGGGTAACGTAGGTACATTGTTACGTTCAGCGGCAGCAGCAAATATAAAACAGATTATTTGCACACAAGGCTCTGCCTCACTTTGGTCTCCACGAGT
[bobbieshaban@spartan-login1 analysis]$ head hiseq_R1.fastq
@1_1
GTCAATAATAGCTGTAGCAGAAGAACAGCAGATAGTGAGAAGCATTCTAAAGGAATTGTGACTTTTAATAAATTGCTTAACGACTAAAGTGTAAAAAAAATCCAAACTAAATAAAGAGTATTTTTT
+
BB/BBFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFF<FFFFFFF/FBFFFFFFFFF/FFFFFFFFBFFFFFFFFFFFFFBFFFFFFFFFFFFFFFFFFFB/F<<FF
@1_2
TGCACCAAAAGGGTTAAGTTTCCCTGCAAGTAGTGCTTCAGAAGTCTTACCGTAGTTTAAGTAGCCGCTGTCTAATGCATCTGAGGCTTTACTTTTCGCATAGGTAATGCCTGTATTGATATCCCA
+
BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFF/FFFFFFFFFFFFFFFFFFFFFFB/FFFB//FB
@1_3
TACGTTAATTCTAGAAAATGTACAAGATCCGGGTAACGTAGGTACATTGTTACGTTCAGCGGCAGCAGCAAATATAAAACAGATTATTTGCACACAAGGCTCTGCCTCACTTTGGTCTCCACGAGT
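
To confirm that a file contains no duplicate read names, the header lines can be counted with awk. This is a minimal sketch under the same four-lines-per-record assumption; the function name count_dup_headers is hypothetical and not part of the original session.

```shell
# Hypothetical helper: print how many distinct header lines occur more than
# once in a FASTQ file. Assumes exactly four lines per record.
count_dup_headers() {
    # seen[$0]++ == 1 is true only the second time a given header is read,
    # so each duplicated name is counted once.
    awk 'NR % 4 == 1 && seen[$0]++ == 1 { d++ } END { print d + 0 }' "$1"
}
```

Running count_dup_headers hiseq_R1.fastq should print 0 for a file renamed as above, since every header then carries its own record number.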