I need help with the following.
data1
1 59851 59880 CATTCTAGTGTAAAGTTTTAGATCTTATAT
1 59881 59910 AACTGTGAGATTAATCTCAGATAATGACAC
1 59911 59940 AAAATATAGTGAAGTTGGTAAGTTATTTAG
1 59941 59970 TAAAGCTCATGAAAATTGTGCCCTCCATTC
1 59971 60000 CCATATAATTTAGTAATTGTCTAGGAACTT
1 60001 60030 CCACATACATTGCCTCAATTTATCTTTCAA
1 60031 60060 CAACTTGTGTGTTATATTTTGGAATACAGA
1 60061 60090 TACAAAGTTATTATGCTTTCAAAATATTCT
1 60091 60120 TTTGCTAATTCTTAGAACAAAGAAAGGCAT
1 60121 60150 AAATATATTAGTATTTGTGTACACCTGTTC
1 60151 60180 CTTCCTGTGTGACCCTAAGTTTAGTAGAAG
1 60181 60210 AAAGGAGAGAAAATATAGCCTAGCTTATAA
1 60211 60240 ATTTAAAAAAAAATTTATTTGGTCCATTTT
data2
1 59871 58954 ENSP00000317482 OR4F5
1 358460 357522 ENSP00000318226 OR4F29
1 611897 610959 ENSP00000329982 OR4F16
1 712376 711183 ENSP00000351335 AL669831.13
1 745077 742614 ENSP00000317958 FAM87B
1 869824 850393 ENSP00000349216 SAMD11
For each line in data2, I want the lines from data1 whose $2-$3 range overlaps the $2-$3 range of that data2 line. That is, if (data1[$2] >= data2[$2] && data1[$3] <= data2[$3]), I want to print the result as follows:
Result (the first three columns are from data1 and the last three are from data2):
1 59881 59910 59871 58954 OR4F5
1 59911 59940 59871 58954 OR4F5
1 59941 59970 59871 58954 OR4F5
I tried the following, but awk complains about a syntax error:
awk 'NR==FNR{a[$1","$2","$3]=$0;next}$2, $3 in a {if(a[1]<= $2 && a[2]<=$3]) print $2","$3\t$0
Thanks
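
One possible way to express the overlap test in awk is sketched below. It rests on assumptions that the post does not confirm: the function name overlap is hypothetical, data2 coordinates are treated as an unordered pair (they are listed high-to-low in the sample, so they are normalised to lo/hi), and "overlap" is taken to mean that the two ranges share at least one position on the same $1. It is a sketch of plain interval overlap, not necessarily the rule behind the Result listed above.

```shell
# Hypothetical helper: overlap DATA2_FILE DATA1_FILE
overlap() {
    awk '
    # First file (data2): remember every interval.
    NR == FNR {
        n++
        chr[n]  = $1
        s[n]    = $2                      # coordinates as written in data2
        e[n]    = $3
        # Assumption: normalise so that lo <= hi, whatever the order in the file.
        lo[n]   = ($2 + 0 < $3 + 0) ? $2 + 0 : $3 + 0
        hi[n]   = ($2 + 0 < $3 + 0) ? $3 + 0 : $2 + 0
        name[n] = $5
        next
    }
    # Second file (data1): print one line per data2 interval it overlaps.
    {
        for (i = 1; i <= n; i++)
            if ($1 == chr[i] && $2 + 0 <= hi[i] && $3 + 0 >= lo[i])
                print $1, $2, $3, s[i], e[i], name[i]
    }
    ' "$1" "$2"
}
```

It would be called as overlap data2 data1. Under these assumptions, the sample data gives only the 59851-59880 row for OR4F5, because that is the only data1 range that intersects 58954-59871; if the intended rule is different, the condition inside the if is the part to change. The loop checks every data2 interval for every data1 line, which is simple but may be slow for large files.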