Question

Building and using DNA Motifs

Background

Sequence motifs are short, recurring (that is, conserved) patterns in DNA that are presumed to have a biological function. Often they indicate sequence-specific binding sites for proteins and other important markers. However, motifs are not always exactly conserved: mutations (DNA substitutions, deletions, or insertions) can occur in a motif in a particular organism. Therefore, sequences are usually aligned, and a consensus pattern of the motif is calculated over all examples from organisms.

The following are examples of a transcription factor binding (TFB) site for the lexA repressor in E. coli, located in a file called lexA.fasta:

    >dinD 32->52
    aactgtatataaatacagtt
    >dinG 15->35
    tattggctgtttatacagta
    >dinH 77->97
    tcctgttaatccatacagca
    >dinI 19->39
    acctgtataaataaccagta
    >lexA-1 28->48
    tgctgtatatactcacagca
    >lexA-2 7->27
    aactgtatatacacccaggg
    >polB(dinA) 53->73
    gactgtataaaaccacagcc
    >recA 59->79
    tactgtatgagcatacagta
    >recN-1 49->69
    tactgtatataaaaccagtt
    >recN-2 27->47
    tactgtacacaataacagta
    >recN-3 9-29
    TCCTGTATGAAAAACCATTA
    >ruvAB 49->69
    cgctggatatctatccagca
    >sosC 18->38
    tactgatgatatatacaggt
    >sosD 14->34
    cactggatagataaccagca
    >sulA 22->42
    tactgtacatccatacagta
    >umuDC 20->40
    tactgtatataaaaacagta
    >uvrA 83->103
    tactgtatattcattcaggt
    >uvrB 75->95
    aactgtttttttatccagta
    >uvrD 57->77
    atctgtatatatacccagct

Each line that starts with “>” is a header that states which gene the sequence was upstream of and where it is located relative to that gene. (For your purposes, these lines can be ignored, and your code should skip them when parsing the DNA sequences.) Each line in between is the nucleotide sequence of one TFB site. Each nucleotide has a position in the sequence. You can assume that all sequences will be the same length.

You only need very minimal input error checking; I won’t be testing input error handling extensively. However, do make sure that if a function relies on another function being run first, it runs that function itself.

Creating DNAMOTIF class

You will create a DNAMOTIF class that has the following attributes and functions:

__init__(self): Initialize the class.

    self.instances=[] # A list of DNA sequence strings (no header)
    self.consensus=[] # A DNA sequence string
    self.counts= {'A': [], 'C': [], 'G':[],'T':[]} # A dictionary of nucleotide counts

__str__: Return a string with the sequence instances of the motif, one per line.

__len__: Return the length of the motif, which is the length of one of the sequences in the collection.

Example Input:

    lexA=DNAMOTIF()
    lexA.parse("lexA.fasta")
    print(len(lexA))

Output:

    20

parse(self,filename): Read in DNA instances from a FASTA file.

Example Usage:

    lexA.parse("lexA.fasta")
    print(lexA)

    aactgtatataaatacagtt
    tattggctgtttatacagta
    tcctgttaatccatacagca
    acctgtataaataaccagta
    tgctgtatatactcacagca
    aactgtatatacacccaggg
    gactgtataaaaccacagcc
    tactgtatgagcatacagta
    tactgtatataaaaccagtt
    tactgtacacaataacagta
    TCCTGTATGAAAAACCATTA
    cgctggatatctatccagca
    tactgatgatatatacaggt
    cactggatagataaccagca
    tactgtacatccatacagta
    tactgtatataaaaacagta
    tactgtatattcattcaggt
    aactgtttttttatccagta
    atctgtatatatacccagct

count(self): Count the occurrences of A’s, C’s, G’s, and T’s at each position and store them in a dictionary. Convert all sequences to upper case for consistency.

Example Input:

    lexA.count()

To Access Result:

    lexA.counts={'A': [5, 13, 0, 0, 0, 1, 15, 1, 15, 4, 12, 6, 16, 6, 10, 0, 19, 0, 0, 12],
                 'C': [2, 3, 18, 0, 0, 0, 1, 2, 0, 1, 3, 6, 1, 4, 8, 19, 0, 0, 6, 1],
                 'G': [1, 2, 0, 0, 19, 3, 0, 1, 3, 1, 1, 0, 0, 0, 0, 0, 0, 18, 3, 1],
                 'T': [11, 1, 1, 19, 0, 15, 3, 15, 1, 13, 3, 7, 2, 9, 1, 0, 0, 1, 10, 5]}

compute_consensus(self): Return an UPPERCASE sequence of the most frequent nucleotide at each position of the motif. If more than one nucleotide is tied, return the one that comes first lexicographically.

Example Input:

    lexA.compute_consensus()

To Access Result:

    print(lexA.consensus)
    TACTGTATATATATACAGTA

Starter template (main.py):

    class DNAMOTIF:
        def __init__(self):
            self.instances=[]
            self.consensus=[]
            self.counts= {'A': [], 'C': [], 'G':[],'T':[]}

        def __str__(self):
            pass # todo: insert your code here, e.g. return

        def __len__(self):
            pass # todo: insert your code here, e.g. return

        def count(self):
            pass # todo

        def compute_consensus(self):
            pass # todo

        def parse(self, filename):
            pass # todo: insert your code here, e.g. self.instances=

Accepted Answer

The solution to the question above is:

CODE 1:

class DNAMOTIF:…
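
The listing above is cut off in the source. As a rough guide only, and not the original answer's code, here is a minimal sketch of one way the class could meet the specification in the question. It rests on a few assumptions that the question does not settle: each record's sequence sits on a single line, sequences keep their original case in self.instances and are upper-cased only while counting, characters other than A, C, G and T are skipped, and an empty motif has length 0. The tie-break relies on Python's max() returning the first maximal item when the candidates are scanned in the order "ACGT".

```python
class DNAMOTIF:
    def __init__(self):
        self.instances = []  # DNA sequence strings, headers excluded
        self.consensus = []  # replaced by a string in compute_consensus()
        self.counts = {'A': [], 'C': [], 'G': [], 'T': []}

    def __str__(self):
        # One sequence instance per line, in file order.
        return "\n".join(self.instances)

    def __len__(self):
        # Motif length = length of any one instance (assumed all equal).
        return len(self.instances[0]) if self.instances else 0

    def parse(self, filename):
        # Start fresh so that calling parse() twice does not duplicate instances.
        self.instances = []
        with open(filename) as handle:
            for line in handle:
                line = line.strip()
                # Skip blank lines and FASTA headers (lines starting with ">").
                if line and not line.startswith(">"):
                    self.instances.append(line)

    def count(self):
        # Rebuild the table from scratch: one zero-filled list per nucleotide.
        self.counts = {base: [0] * len(self) for base in "ACGT"}
        for seq in self.instances:
            for pos, base in enumerate(seq.upper()):
                if base in self.counts:  # assumption: ignore non-ACGT symbols
                    self.counts[base][pos] += 1

    def compute_consensus(self):
        # The consensus depends on the counts, so compute them here first.
        self.count()
        # "ACGT" is in alphabetical order and max() keeps the first maximum,
        # so ties go to the lexicographically first nucleotide.
        self.consensus = "".join(
            max("ACGT", key=lambda base: self.counts[base][pos])
            for pos in range(len(self))
        )
        return self.consensus
```

Usage follows the examples in the question: create the object with DNAMOTIF(), call parse("lexA.fasta"), then call compute_consensus() and read the consensus attribute. Having compute_consensus() call count() itself is one way to satisfy the requirement that a function run whatever it depends on.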