Panini Linguistics Olympiad: Genes from Space
Alien Protein Codes
Dr. Muzabique figured out how to manipulate six alien proteins to do three functions: construct, cut, and pack. The alien genetic code used to manipulate the proteins was found to have 6 "alphabet letters": A, T, G, C, D, N. Only he knew the top-secret algorithm, and two weeks ago, he was mysteriously found dead in his apartment.
The top-secret project is now facing a funding crunch, but the government has decided to give it one more chance if the algorithm can be decoded. They send you the only page that was recovered from Dr. Muzabique's notebook, with one set of input instructions and output genetic codes, and ask you to decode the algorithm:
"Today was a great breakthrough! I found that large and small proteins inherently behave in different ways. The way they operate under different cellular functions..."
Input | Output |
---|---|
Celebi, construct Articuno | CNDATACACCNDTGTATTCGGGATGCNDTGCCCACCCCNNNCNDCTTTTCCACTNANCND |
Articuno, construct and cut Terrakion | CNDTGTATTCGGGATGCNDACTGCAGGGACTCCNDGCATGACATCNNNNNNCNDTGCCCACCC |
Terrakion, construct and pack Shaymin | CNDACTGCAGGGACTCCNDCAGGCTCNDGCATGACATCNNNNANCNDCTTTTCCACTNNNCND |
Cressela, construct and cut and pack Azelf | CNDCCATATGATGCGACNDGTTGATCNDGCATGACATCNNNNANCNDTGCCCACCCCNANCNDCTTTTCCACTNNNCND |
Celebi, cut Cressela | CNDATACACCNDCCATATGATGCGACNDGCATGACATCNANNNNCNDCTTTTCCACTNANCND |
Shaymin, pack and construct Azelf | CNDCAGGCTCNDGTTGATCNDTGCCCACCCCNANCND |
Celebi, construct and cut and pack Cressela | CNDATACACCNDCCATATGATGCGACND |
Terrakion, pack Articuno | CNDACTGCAGGGACTCCNDTGTATTCGGGATGCNDCTTTTCCACTNNNCND |
Assignment: Preserve the legacy of Dr. Muzabique! Explain how the algorithm converts input commands into output genetic codes.
The Request
Could someone please check to see if my solution below is correct? Thank you!
My Solution
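Sections of the gene sequence are separated by the codon CND. For example, the first output splits as CND ATACAC CND TGTATTCGGGATG CND TGCCCACCCCNNN CND CTTTTCCACTNAN CND (spaces added for readability).
The first two sections are the proteins involved. The first section is the protein doing the action (the first protein mentioned), while the second section is the protein being acted upon (the second protein mentioned). Reading them off the table gives Celebi = ATACAC, Articuno = TGTATTCGGGATG, Terrakion = ACTGCAGGGACTC, Shaymin = CAGGCT, Cressela = CCATATGATGCGA, and Azelf = GTTGAT, so three proteins are short (6 letters) and three are long (13 letters).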
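The CND split described above can be sketched in a few lines of Python. The function name `split_sections` is my own, and the example string is the first output from the table; this only illustrates the splitting step, not the whole algorithm.

```python
def split_sections(code: str) -> list[str]:
    """Split an output code on the CND separator, dropping the empty edge pieces."""
    return [part for part in code.split("CND") if part]


# First output in the table: "Celebi, construct Articuno"
first_output = "CNDATACACCNDTGTATTCGGGATGCNDTGCCCACCCCNNNCNDCTTTTCCACTNANCND"

# The first two sections are the proteins; whatever follows is an action sequence.
acting, target, *actions = split_sections(first_output)

print(acting)   # ATACAC
print(target)   # TGTATTCGGGATG
print(actions)  # ['TGCCCACCCCNNN', 'CTTTTCCACTNAN']
```

Splitting on CND is unambiguous here because the letter D never appears anywhere except inside the separator.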
The remaining sections encode actions. As hinted in Dr. Muzabique's notebook, the length of the protein sequences affects the action sequences. If the first protein (the acting protein) is long, the action sequences represent the actions named in the command. If the acting protein is short, the action sequences represent the actions not named in the command.
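Here is a minimal sketch of that selection rule. The names `ACTION_ORDER`, `SHORT_LENGTH`, and `encoded_actions` are my own. The 6-letter cutoff is an assumption based on the table (short proteins have 6 letters, long ones 13), and the fixed construct, cut, pack order is simply what the outputs in the table show.

```python
ACTION_ORDER = ("construct", "cut", "pack")  # order observed in the table's outputs
SHORT_LENGTH = 6  # assumption: short proteins have 6 letters, long ones 13


def encoded_actions(acting_protein: str, commanded: set[str]) -> list[str]:
    """Return the actions that receive an action sequence in the output."""
    is_long = len(acting_protein) > SHORT_LENGTH
    if is_long:
        # Long acting protein: encode exactly the actions named in the command.
        return [a for a in ACTION_ORDER if a in commanded]
    # Short acting protein: encode the actions that were NOT named.
    return [a for a in ACTION_ORDER if a not in commanded]


# Terrakion (long), "pack" -> only pack is encoded
print(encoded_actions("ACTGCAGGGACTC", {"pack"}))  # ['pack']
# Celebi (short), "construct" -> cut and pack are encoded instead
print(encoded_actions("ATACAC", {"construct"}))  # ['cut', 'pack']
# Celebi (short), all three actions -> nothing is encoded
print(encoded_actions("ATACAC", {"construct", "cut", "pack"}))  # []
```

The last call corresponds to the seventh row of the table, where the output contains only the two protein sequences.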
Each action sequence ends in NNN or NAN, or in two of these in a row. I'll call this ending the "penguin" (because why not) and the rest of the action sequence the "polar bear". The polar bear identifies the action, while the penguin depends on the sizes of the proteins. The third output, read alongside the other outputs whose acting protein is long, gives the key between the actions and the polar bears.
For constructing (polar bear: GCATGACATC), the penguin is N_NN_N, where the blanks stand for the first and second proteins, in order: A if the protein is short and N if it is long. For cutting (polar bear: TGCCCACCCC), the penguin is N_N and depends on the second protein: the blank is A if it is short and N if it is long. For packing (polar bear: CTTTTCCACT), the penguin works like the cutting one, except that it depends on the first protein.
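To check the whole proposal against the table, here is a rough Python sketch of the encoder as I understand it. All names (`PROTEINS`, `POLAR_BEARS`, `penguin`, `encode`, `ROWS`) are my own. The protein codes are read off the table, the 6-letter cutoff for "short" is an assumption, and so is the fixed construct, cut, pack output order (the table never shows a reordered command that encodes more than one action). The second row is left out of `ROWS` because its output appears to be cut off in the table.

```python
PROTEINS = {  # read off the table by splitting the outputs on CND
    "Celebi": "ATACAC", "Articuno": "TGTATTCGGGATG", "Terrakion": "ACTGCAGGGACTC",
    "Shaymin": "CAGGCT", "Cressela": "CCATATGATGCGA", "Azelf": "GTTGAT",
}
POLAR_BEARS = {"construct": "GCATGACATC", "cut": "TGCCCACCCC", "pack": "CTTTTCCACT"}
ORDER = ("construct", "cut", "pack")  # assumed fixed output order
SEP = "CND"


def size_letter(code: str) -> str:
    """N for a long protein, A for a short one (assumed cutoff: 6 letters)."""
    return "N" if len(code) > 6 else "A"


def penguin(action: str, first: str, second: str) -> str:
    a, b = size_letter(first), size_letter(second)
    if action == "construct":
        return f"N{a}NN{b}N"  # depends on both proteins, in order
    if action == "cut":
        return f"N{b}N"       # depends on the second protein
    return f"N{a}N"           # pack: depends on the first protein


def encode(command: str) -> str:
    subject, rest = command.split(", ", 1)
    *named, target = rest.replace(" and ", " ").split()
    first, second = PROTEINS[subject], PROTEINS[target]
    if len(first) > 6:  # long acting protein: encode the named actions
        chosen = [a for a in ORDER if a in named]
    else:               # short acting protein: encode the unnamed actions
        chosen = [a for a in ORDER if a not in named]
    parts = [first, second] + [POLAR_BEARS[a] + penguin(a, first, second) for a in chosen]
    return SEP + SEP.join(parts) + SEP


ROWS = {  # the seven complete rows of the table
    "Celebi, construct Articuno": "CNDATACACCNDTGTATTCGGGATGCNDTGCCCACCCCNNNCNDCTTTTCCACTNANCND",
    "Terrakion, construct and pack Shaymin": "CNDACTGCAGGGACTCCNDCAGGCTCNDGCATGACATCNNNNANCNDCTTTTCCACTNNNCND",
    "Cressela, construct and cut and pack Azelf": "CNDCCATATGATGCGACNDGTTGATCNDGCATGACATCNNNNANCNDTGCCCACCCCNANCNDCTTTTCCACTNNNCND",
    "Celebi, cut Cressela": "CNDATACACCNDCCATATGATGCGACNDGCATGACATCNANNNNCNDCTTTTCCACTNANCND",
    "Shaymin, pack and construct Azelf": "CNDCAGGCTCNDGTTGATCNDTGCCCACCCCNANCND",
    "Celebi, construct and cut and pack Cressela": "CNDATACACCNDCCATATGATGCGACND",
    "Terrakion, pack Articuno": "CNDACTGCAGGGACTCCNDTGTATTCGGGATGCNDCTTTTCCACTNNNCND",
}

# Collect every command whose predicted code differs from the table.
mismatches = [cmd for cmd, expected in ROWS.items() if encode(cmd) != expected]
print(mismatches)
```

An empty list means the rules reproduce every complete row. For the second row, the code recovered in the table can be compared as a prefix of `encode("Articuno, construct and cut Terrakion")`.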