This is a tutorial by Dr. Derek Cashman in the Department of Chemistry at Tennessee Technological University that teaches students how to construct a model of cytochrome C using AlphaFold2 in Chimera X with an interface to Google Colab. It also shows how to download the 3D structure of cytochrome C from the Protein Data Bank and compare the two structures.
Overview
Cytochrome C is an important electron transfer protein in the electron transport chain, and is present in numerous organisms. The purpose of this exercise is to analyze the structure of human cytochrome C and identify the key parts of the protein.
For this exercise, we will be using the Chimera X molecular modeling software package and the AlphaFold2 artificial intelligence software, implemented in Google Colab. Chimera X is available as a free download from the University of California – San Francisco. Follow this link to download the software: Chimera X.
To use Google Colab, you will need a Google account, also available free of charge.
Procedure
- Begin by going to the UniProt database and searching for the primary structure of “human cytochrome C”. This protein should be approximately 105 amino acids in length. Be sure to include the UniProt ID of the sequence you download in your report. Viewing the page for human cytochrome C, scan down the page until you see the SEQUENCE, and copy the sequence (one-letter amino acid codes) to your clipboard.
- Now that you have copied the sequence, open Chimera X. You may wish to run through some of the Tutorials available on the website to familiarize yourself with the menu commands and user interface. Click on TOOLS > STRUCTURE PREDICTION > ALPHAFOLD. In the dialog box to the right, paste the one-letter codes of the sequence you obtained from UniProt (FASTA sequence). Click the PREDICT button below to send this sequence to Google Colab and follow the on-screen instructions provided. You will need to log in to Google Colab using your Google account here. AlphaFold2 should be able to predict the structure of cytochrome C in less than 30 minutes, and the predicted protein will appear in Chimera X when it is finished.
- Once AlphaFold2 produces a predicted structure, view all of the atoms of the protein in Chimera X. Are hydrogens present? If not, you can add them using TOOLS > STRUCTURE EDITING > ADD HYDROGENS. What is the pLDDT confidence of the residues in the AlphaFold2 predicted structure? Are there any parts of cytochrome C whose structure AlphaFold2 predicts with low confidence?
- Now, retrieve the experimental structure of cytochrome C from the Protein Data Bank by typing “open 3ZOO” in the command line at the bottom of Chimera X. Using Chimera X, hide all of the atoms and ribbons of this experimental structure except chain A. What key structure present in the experimental structure you just downloaded is missing from the AlphaFold2 predicted structure? Why do you think this is?
- Using TOOLS > STRUCTURE ANALYSIS > MATCHMAKER, superimpose the experimental protein structure with the AlphaFold2 predicted structure. The root-mean-square deviation (RMSD) is a quantitative measure of the similarity between two structures. An RMSD of 0 Å indicates that the atoms of the two structures align perfectly and the structures are identical. The higher the RMSD, the more the corresponding atoms deviate in position from each other. What is the RMSD of the aligned structures you just produced? SAVE a figure illustrating the alignment between the two proteins and include that in your report.
- Using the TOOLS > SEQUENCE > SHOW SEQUENCE VIEWER tool, obtain the primary structure of cytochrome C. Copy this sequence to your report. Which amino acid residues are at the N-terminus and the C-terminus? Highlight all of the Cys and His residues in your sequence. How might these be important?
- Looking at the heme group present in the experimental structure, is this heme group connected to the protein covalently, or does it interact with the protein entirely through non-covalent interactions? What are the residue names and numbers of the amino acid residues to which the heme group is connected?
- Label the atom type of the central atom in the heme. What element is it? Are there any amino acid residues interacting with the central metal atom of the heme? Measure the distance between the nearest atom of these amino acid residues and the metal atom and report those values. Also, determine an appropriate view of the heme group in cytochrome C and include that figure in your report.
- Using TOOLS > STRUCTURE ANALYSIS > CONTACTS, you can view the individual contacts between the heme group and the protein. Highlight these. What amino acid residues are in contact with the heme group? Note: A “protein contact” is typically defined as a non-covalent interaction of 6.5 Å or less.
- When you downloaded the experimental structure of cytochrome C, did you notice a bunch of small red dots not connected to anything by covalent bonds in your protein? What are these small dots? Do these represent an individual atom or molecule (it may help to add hydrogens to your model to answer this question)? Why are these molecules present in your structure?
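The RMSD described in the Matchmaker step above can also be worked out by hand, which may help when interpreting the number Chimera X reports. The following is a minimal Python sketch, not the Matchmaker algorithm itself: it assumes the two structures are already superimposed and that the atoms are already paired one-to-one (for example, matching alpha carbons). The function name `rmsd` and the coordinates are hypothetical values chosen only for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in the units of the input, e.g. angstroms)
    between two equal-length lists of paired (x, y, z) coordinates.

    Assumes the structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures must contribute the same number of atoms")
    if not coords_a:
        raise ValueError("at least one atom pair is required")

    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the two atoms of this pair.
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2

    # Mean of the squared distances, then the square root.
    return math.sqrt(squared_sum / len(coords_a))


# Made-up coordinates for three paired atoms (not taken from any real structure).
predicted = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
experimental = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]

print(rmsd(predicted, predicted))      # identical structures give 0.0
print(rmsd(predicted, experimental))   # every atom displaced by 1 angstrom gives 1.0
```

Because every atom pair contributes a squared distance, a few badly placed atoms (for example, in a flexible loop) can raise the RMSD noticeably even when the rest of the fold matches well.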
Questions
- Based on your analysis of only the sequence of cytochrome C, how many amino acid residues are in this protein, and which residues are at the N-terminus and the C-terminus?
- Describe the biological function of cytochrome C and explain how the 3D structure of cytochrome C facilitates this function in the electron transport chain.
- What is the pLDDT confidence of the residues in the AlphaFold2 predicted structure? Are any residues predicted with low confidence scores? Why do you think this is? Include a figure in your report indicating the various parts of the protein with the pLDDT scores per residue.
- What key structure present in the experimental structure you downloaded is missing from the AlphaFold-predicted structure? Why is this?
- What is the RMSD of the aligned structures you just produced? Save a figure illustrating the alignment between the two proteins and include that in your report. Please show all of the atoms of the N-terminus and the C-terminus and label these residues by amino acid type and residue number (example: Met-1). Please show all of the CYSTEINE residues in your figure and label them as you did for the N- and C-termini. What is the function of the cysteine residues? Why are they important?
- In a new FIGURE of just the experimental structure, label the central metal atom in the heme group by atom type. Measure the distance from this metal atom to the two amino acid residues on either side of the heme (the closest atoms in each residue). Label these amino acids the same way you did above. What additional amino acid residues form non-covalent contacts with the heme group? What type of interactions are these (van der Waals, dipole-dipole, hydrogen bond, etc.)? Label at least six interactions by measuring the distance between the nearest atoms in each species.
- What do the small red dots in the experimental structure that you downloaded represent? Why are they represented as a single atom and not the whole molecule? Are these molecules important to the function and/or folding of this protein? Why or why not?
- What is the role of the heme group of cytochrome C, and how is its position within the protein crucial for function?
- Cytochrome C interacts with other proteins in the electron transport chain. Which ones? Based on its structure, how do you think these interactions are facilitated? Hint: You may want to explore structural features such as surface charge, binding domains, or hydrophilic/hydrophobic regions that promote protein-protein contact interactions.
- What might be the consequences of mutations in the active site or heme-binding region of cytochrome C? How would this affect its function in the electron transport chain?
- To explore the effect of a mutation in the heme-binding region, run a second AlphaFold2 prediction of the sequence of cytochrome C, but replace CYS-14 with an ALA residue. Does the mutant protein still fold? Do you think that the heme group would still interact in the active site region? Are there any other residues that would impact the function of cytochrome C if mutated?
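Preparing the mutant sequence for the second prediction, and locating the Cys and His residues asked about earlier, can be done by hand or with a few lines of Python. The sketch below is one possible approach under stated assumptions: the helper names `find_residues` and `substitute` are hypothetical, and `TOY_SEQUENCE` is a made-up ten-residue string, not the cytochrome C sequence. Paste your own UniProt sequence in its place. Residue numbering in a database sequence and in a deposited structure does not always agree (for example, depending on whether an initial Met is counted), so `substitute` refuses to make the change unless the expected residue is actually found at the requested position.

```python
def find_residues(sequence, targets="CH"):
    """Return (one-letter code, 1-based position) for each residue in `targets`."""
    return [(aa, pos) for pos, aa in enumerate(sequence, start=1) if aa in targets]


def substitute(sequence, position, expected, replacement):
    """Replace the residue at a 1-based position, checking the original first."""
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    found = sequence[position - 1]
    if found != expected:
        # Guards against numbering offsets between sequence and structure.
        raise ValueError(f"expected {expected} at position {position}, found {found}")
    return sequence[:position - 1] + replacement + sequence[position:]


# Made-up toy sequence for illustration only (NOT cytochrome C).
TOY_SEQUENCE = "GKACAQCHTM"

print(find_residues(TOY_SEQUENCE))            # Cys at 4 and 7, His at 8
print(substitute(TOY_SEQUENCE, 4, "C", "A"))  # Cys-4 replaced by Ala
```

If `substitute` raises an error for your real sequence, check the positions returned by `find_residues` to see whether the cysteine you want is numbered one higher or lower than expected before pasting the mutant sequence into the AlphaFold dialog.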