A ribosome starts building a protein when it encounters the sequence ATG, known as a "start codon". It knows the protein is complete once it encounters one of three "stop codons": TAA, TAG, or TGA. Everything in between is part of the protein code. (It’s more complicated, of course, but this is the basic gist.)

Write a VHDL module that streams the genome through and asserts the is_protein signal whenever the nucleotide it emits is part of a protein, including the start and stop codons.

It should be able to handle all three of the stop codons, and deal with multiple protein sequences back-to-back.



library IEEE;
use IEEE.std_logic_1164.all;

entity gene_protein is
  port(
    clk        : in  std_logic;
    nuc_in     : in  std_logic_vector(1 downto 0);  -- Input nucleotide
    nuc_out    : out std_logic_vector(1 downto 0);  -- Output nucleotide
    is_protein : out std_logic                      -- Whether the output nucleotide is part of a protein
  );
end;

architecture synth of gene_protein is
begin
end;
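One way the empty architecture might be filled in is sketched here. It is not the reference solution, and it rests on several assumptions the exercise text does not state: the 2-bit encoding (A = "00", C = "01", G = "10", T = "11"), a fixed output latency of three register stages, and stop codons being recognised only in the reading frame set by the start codon. The names `nuc_t`, `NUC_A` to `NUC_T`, `is_stop`, `s0` to `s2`, `p0` to `p2`, `active`, and `phase` are hypothetical. The delay exists because the A of an ATG can only be flagged once the following T and G have been seen.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

entity gene_protein is
  port(
    clk        : in  std_logic;
    nuc_in     : in  std_logic_vector(1 downto 0);
    nuc_out    : out std_logic_vector(1 downto 0);
    is_protein : out std_logic
  );
end;

architecture synth of gene_protein is
  subtype nuc_t is std_logic_vector(1 downto 0);

  -- ASSUMED encoding; the exercise text does not specify one.
  constant NUC_A : nuc_t := "00";
  constant NUC_C : nuc_t := "01";
  constant NUC_G : nuc_t := "10";
  constant NUC_T : nuc_t := "11";

  -- Three-stage delay line (s0 newest, s2 oldest) with one flag per stage.
  -- Stages start as C so power-up contents can never complete an ATG.
  signal s0, s1, s2 : nuc_t     := NUC_C;
  signal p0, p1, p2 : std_logic := '0';

  signal active : boolean              := false;  -- inside a protein
  signal phase  : integer range 0 to 2 := 0;      -- position within codon

  -- True for TAA, TAG, or TGA.
  function is_stop(x, y, z : nuc_t) return boolean is
  begin
    return x = NUC_T and ((y = NUC_A and (z = NUC_A or z = NUC_G))
                          or (y = NUC_G and z = NUC_A));
  end function;
begin
  nuc_out    <= s2;
  is_protein <= p2;

  process(clk)
  begin
    if rising_edge(clk) then
      -- Default: shift nucleotides and flags one stage along.
      s2 <= s1;  s1 <= s0;  s0 <= nuc_in;
      p2 <= p1;  p1 <= p0;  p0 <= '0';

      if not active then
        -- Start codon: the two stored nucleotides plus the incoming one.
        if s1 = NUC_A and s0 = NUC_T and nuc_in = NUC_G then
          p2 <= '1';  p1 <= '1';  p0 <= '1';  -- flag A, T, and G
          active <= true;
          phase  <= 0;
        end if;
      else
        p0 <= '1';  -- every nucleotide inside a protein is flagged
        if phase = 2 then
          phase <= 0;
          -- A full in-frame codon has arrived; a stop codon ends the protein.
          if is_stop(s1, s0, nuc_in) then
            active <= false;
          end if;
        else
          phase <= phase + 1;
        end if;
      end if;
    end if;
  end process;
end;
```

Because the start-codon check runs again on the very next cycle after a stop codon, back-to-back proteins are handled without extra state. If the intended behaviour is to match stop codons at any position rather than in frame, the `phase` counter would need to be replaced by a short counter that skips the two triples overlapping the start codon.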
