Is there a way to fetch protein sequences from UniProt using Python?

Posted by laawzig2 on 2023-04-28 in Python


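I am looking for a way to retrieve protein sequences from UniProt by specifying their UniProt IDs, and to write them to a FASTA file. My goal is to create a Google Colab notebook that builds the FASTA file, where I can specify the FASTA name, the directory (in Google Drive) where I want to save it, and the UniProt IDs in the format 1xUniProt1,3xUniProt2, where 3x is the number of times I want that sequence to appear in the FASTA file, with the copies separated by ':'.
I was thinking of something like this:
Input: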

Name = protein_sequences
Proteins = 2xUniprot1, 3xUniprot2, 1xUniprot3
Directory = FASTA_directory
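
To make the input format concrete, here is a minimal sketch of how the `Proteins` line could be parsed into (count, ID) pairs. The function name `parse_protein_spec` is hypothetical, and the sketch assumes every entry has the form `<count>x<ID>` with entries separated by commas.

```python
import re


def parse_protein_spec(spec):
    """Parse '2xUniprot1, 3xUniprot2' into [(2, 'Uniprot1'), (3, 'Uniprot2')]."""
    entries = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma or empty entry
        # Assumed entry format: an integer count, a literal 'x', then the ID.
        match = re.fullmatch(r"(\d+)x(\S+)", item)
        if match is None:
            raise ValueError(f"Expected '<count>x<ID>', got {item!r}")
        entries.append((int(match.group(1)), match.group(2)))
    return entries
```

Calling `parse_protein_spec("2xUniprot1, 3xUniprot2, 1xUniprot3")` keeps the entries in the order they were written, which matters because the output concatenates the sequences in that same order.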

Output:

Name of file = protein_sequences.fasta

FASTA file:

> protein_sequences   sequenceUniprot1:sequenceUniprot1:sequenceUniprot2:sequenceUniprot2:sequenceUniprot2:sequenceUniprot3
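
The output described above could be assembled roughly as follows. This is only a sketch: `build_fasta_record` and `write_fasta` are hypothetical names, the sequences are assumed to be already available in a dict keyed by UniProt ID, and the header and sequence are written on separate lines as in conventional FASTA, which is an assumption about the intended layout.

```python
from pathlib import Path


def build_fasta_record(name, entries, sequences):
    """Repeat each sequence `count` times and join all copies with ':'.

    entries   -- list of (count, uniprot_id) pairs, in output order
    sequences -- dict mapping uniprot_id to its amino-acid sequence
    """
    parts = []
    for count, uniprot_id in entries:
        parts.extend([sequences[uniprot_id]] * count)
    joined = ":".join(parts)
    return f">{name}\n{joined}\n"


def write_fasta(directory, name, record):
    """Write the record to <directory>/<name>.fasta and return the path."""
    out_dir = Path(directory)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{name}.fasta"
    path.write_text(record)
    return path
```

In Google Colab, `directory` would typically be a path under the mounted Drive folder, but that mount step is outside this sketch.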

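The main problem I have is that I am not sure how to fetch the sequences themselves from UniProt using Python. I do not know what the most up-to-date and efficient way to do this is.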
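
For illustration, one possible approach is to request the FASTA text for an accession from the UniProt REST service and strip the header line. The sketch below uses only the standard library. The helper names are hypothetical, and the URL pattern `https://rest.uniprot.org/uniprotkb/<accession>.fasta` is assumed to be the current UniProtKB endpoint, so it should be checked against the UniProt documentation before relying on it.

```python
from urllib.request import urlopen

# Assumed URL pattern for a single UniProtKB entry in FASTA format.
UNIPROT_FASTA_URL = "https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def fasta_to_sequence(fasta_text):
    """Drop the '>' header line(s) and join the remaining lines into one string."""
    lines = fasta_text.strip().splitlines()
    return "".join(line.strip() for line in lines if not line.startswith(">"))


def fetch_uniprot_sequence(accession, timeout=30):
    """Download one entry as FASTA and return only its sequence.

    Requires network access; urlopen raises urllib.error.HTTPError
    if the server rejects the accession.
    """
    url = UNIPROT_FASTA_URL.format(accession=accession)
    with urlopen(url, timeout=timeout) as response:
        return fasta_to_sequence(response.read().decode("utf-8"))
```

Fetching each distinct ID once and reusing the result for its repeated copies avoids redundant requests when the same protein appears several times in the input.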

python

Source: https://stackoverflow.com/questions/76120543/is-there-a-way-to-fetch-protein-sequences-from-uniprot-using-python