File I/O error when using nglview.show_biopython(structure)

ygya80vv  posted on 2023-01-16 in Python

I have been trying to visualize proteins in Python. I went on YouTube and eventually found a tutorial that shows how to visualize proteins of the COVID-19 virus. I set up Anaconda, got a Jupyter notebook running in VS Code, downloaded the necessary files from the PDB database, and made sure they were in the same directory as my notebook. But when I run the nglview.show_biopython(structure) function I get a ValueError: I/O operation on closed file. I'm stumped. This is my first time using Jupyter Notebook, so maybe there is something I'm missing.
This is what the code looks like:

from Bio.PDB import * 
import nglview as nv

parser = PDBParser()
structure = parser.get_structure("6YYT", "6YYT.pdb")
view = nv.show_biopython(structure)

This is the error:

Output exceeds the size limit. Open the full output data in a text editor
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_1728\2743687014.py in <module>
----> 1 view = nv.show_biopython(structure)

c:\Users\jerem\anaconda3\lib\site-packages\nglview\show.py in show_biopython(entity, **kwargs)
    450     '''
    451     entity = BiopythonStructure(entity)
--> 452     return NGLWidget(entity, **kwargs)
    453 
    454 

c:\Users\jerem\anaconda3\lib\site-packages\nglview\widget.py in __init__(self, structure, representations, parameters, **kwargs)
    243         else:
    244             if structure is not None:
--> 245                 self.add_structure(structure, **kwargs)
    246 
    247         if representations:

c:\Users\jerem\anaconda3\lib\site-packages\nglview\widget.py in add_structure(self, structure, **kwargs)
   1111         if not isinstance(structure, Structure):
   1112             raise ValueError(f'{structure} is not an instance of Structure')
-> 1113         self._load_data(structure, **kwargs)
   1114         self._ngl_component_ids.append(structure.id)
   1115         if self.n_components > 1:
...
--> 200         return io_str.getvalue()
    201 
    202 

ValueError: I/O operation on closed file

I only get this error when using nglview.show_biopython; when I run the get_structure() function, it reads the file just fine. I can visualize other molecules without problems, though maybe that's because I was using the ASE library instead of a file. I don't know, which is why I'm here.
Update: I recently found out that I can visualize the protein using nglview.show_file() instead of nglview.show_biopython(). Even though I can visualize proteins now and technically my problem has been solved, I would still like to know why the show_biopython() function isn't working properly.

s4chpxco  #1

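I found another way to solve this. After going back to the tutorial I mentioned, I saw that it was made in 2021. That made me wonder whether we were using the same version of each package, and it turned out we weren't. I'm not sure which version of nglview they used, but they were using Biopython 1.79, which was the latest version in 2021, while I was using Biopython 1.80. With Biopython 1.80 I got the error shown above. Now that I'm using Biopython 1.79, I get this: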

from Bio.PDB import PDBParser
import nglview as nv

file = "6YYT.pdb"
parser = PDBParser()
structure = parser.get_structure("6YYT", file)
structure

view = nv.show_biopython(structure)
view
Output:
c:\Users\jerem\anaconda3\lib\site-packages\Bio\PDB\StructureBuilder.py:89:
PDBConstructionWarning: WARNING: Chain A is discontinuous at line 12059.
  warnings.warn(

I guess something is wrong with Biopython 1.80, so I will stick with 1.79.
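
Because the behaviour described here depends on which releases are installed, it can help to print them before comparing results with a tutorial. The following is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, and `biopython` and `nglview` are the distribution names as published on PyPI.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the two packages discussed in this thread.
for name in ("biopython", "nglview"):
    print(name, installed_version(name))
```

Running this in the same notebook kernel matters, since the kernel may use a different environment from the terminal.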

xu3bshqb  #2

I have a similar problem:

from Bio.PDB import * 
import nglview as nv

parser = PDBParser(QUIET = True)
structure = parser.get_structure("2ms2", "2ms2.pdb")

save_pdb = PDBIO()
save_pdb.set_structure(structure)
save_pdb.save('pdb_out.pdb')

view = nv.show_biopython(structure)
view

It fails with the same error as in the question:

.................site-packages/nglview/adaptor.py:201, in BiopythonStructure.get_structure_string(self)
    199 io_str = StringIO()
    200 io_pdb.save(io_str)
--> 201 return io_str.getvalue()

ValueError: I/O operation on closed file

I modified BiopythonStructure.get_structure_string(self) in site-packages/nglview/adaptor.py (line 201), replacing:

def get_structure_string(self):
        from Bio.PDB import PDBIO
        from io import StringIO
        io_pdb = PDBIO()
        io_pdb.set_structure(self._entity)
        io_str = StringIO()
        io_pdb.save(io_str)
        return io_str.getvalue()

with:

def get_structure_string(self):
        from Bio.PDB import PDBIO
        
        import mmap
        
        io_pdb = PDBIO()
        
        io_pdb.set_structure(self._entity)
        
        mo = mmap_str()
        
        io_pdb.save(mo)
        
        return mo.read()

and added this new class, mmap_str(), to the same file:

import mmap
import copy

class mmap_str():

    import mmap #added import at top
    
    instance = None

    def __init__(self):
    
        self.mm = mmap.mmap(-1, 2)
        
        self.a = ''
        
        b = '\n'
        
        self.mm.write(b.encode(encoding = 'utf-8'))
        
        self.mm.seek(0)
        
        #print('self.mm.read().decode() ',self.mm.read().decode(encoding = 'utf-8'))
        
        self.mm.seek(0)
        
    def __new__(cls, *args, **kwargs):
        if not isinstance(cls.instance, cls):
            cls.instance = object.__new__(cls)
        return cls.instance
        
    def write(self, string):
        
        self.a = str(copy.deepcopy(self.mm.read().decode(encoding = 'utf-8'))).lstrip('\n')
        
        self.mm.seek(0)
        
        #print('a -> ', self.a)
        
        len_a = len(self.a)
        
        self.mm = mmap.mmap(-1, len(self.a)+len(string))
        
        #print('a :', self.a)
        
        #print('len self.mm ', len(self.mm))
        
        #print('lenght string : ', len(string))
        
        #print(bytes((self.a+string).encode()))
        
        self.mm.write(bytes((self.a+string).encode()))
        
        self.mm.seek(0)
        
        #print('written once ')
        
        #self.mm.seek(0)
        
    def read(self):
    
        self.mm.seek(0)
        
        a = self.mm.read().decode().lstrip('\n')
        
        self.mm.seek(0)
        
        return a
        
    def __enter__(self):
        
        return self
 
    def __exit__(self, *args):
        
        pass

If I uncomment the print statements, I get:

IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.

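as an error, but with them commented out I get:

whereas when using nglview.show_file(filename) I get:

This is because, as can be seen by looking at the pdb_out.pdb file written by my code, Bio.PDB.PDBParser.get_structure(name, filename) does not retrieve the PDB header records responsible for generating the full CRYSTALLOGRAPHIC SYMMETRY, or Biopython cannot handle them (I'm not sure about this; please help if you know better), and retrieves only the coordinates.
I still don't understand what is going on with: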

--> 201 return io_str.getvalue()

ValueError: I/O operation on closed file

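Could it be something related to the Jupyter ipykernel? I hope someone can shed more light on this. I don't know how that framework runs, but it is definitely different from the normal Python interpreter.

The same code in one of my Python virtual environments runs forever, so maybe ipykernel doesn't like StringIO() objects or does something strange to them?
OK, thanks to the hint in the answer below, I checked PDBIO.py for Biopython 1.80 in the GitHub repo and compared PDBIO's save method, def save(self, file, select=_select, write_end=True, preserve_atom_numbering=False):, with the one in Biopython 1.79.
See the first part:

and the last part:

So clearly the big difference is the with fhandle: block in version 1.80.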
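
The effect of such a `with` block can be reproduced without Biopython or nglview. In the sketch below, `save_without_with` and `save_with_block` are hypothetical stand-ins for the two styles of save method, not Biopython's actual code; they only show what happens to a caller-supplied StringIO.

```python
from io import StringIO


def save_without_with(handle, text):
    """Write to the caller's handle and leave it open."""
    handle.write(text)


def save_with_block(handle, text):
    """Write inside a `with` block, which closes the handle on exit."""
    with handle:
        handle.write(text)


kept = StringIO()
save_without_with(kept, "ATOM")
print(kept.getvalue())  # the buffer is still readable

closed = StringIO()
save_with_block(closed, "ATOM")
try:
    closed.getvalue()
    error_message = None
except ValueError as exc:
    # getvalue() on a closed StringIO raises ValueError
    error_message = str(exc)
print(error_message)
```

The second case is the same pattern as in the traceback: the buffer is closed by the time getvalue() is called on it.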
So I realized I could modify adaptor.py by adding a subclass of StringIO, like this:

from io import StringIO
class StringIO(StringIO):

    def __exit__(self, *args, **kwargs):
        
        print('exiting from subclassed StringIO !!!!!')
        
        pass

and then modify def get_structure_string(self): like this:

def get_structure_string(self):
        from Bio.PDB import PDBIO
        #from io import StringIO
        io_pdb = PDBIO()
        io_pdb.set_structure(self._entity)
        io_str = StringIO()
        io_pdb.save(io_str)
        return io_str.getvalue()

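That was enough to get Biopython 1.80 working in Jupyter with nglview.
I'm not sure what the pitfalls are of not closing the StringIO object used for the visualization, but apparently that is what Biopython 1.79 did, as did my first answer using the mmap object (which does not close mmap_str).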
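
The subclass trick can be checked in isolation. This is a minimal sketch, assuming only the standard library; `NonClosingStringIO` is a hypothetical name chosen here so that it does not shadow io.StringIO.

```python
from io import StringIO


class NonClosingStringIO(StringIO):
    """StringIO whose context-manager exit does not close the buffer."""

    def __exit__(self, *args):
        # Deliberately skip the default close() so getvalue() still works.
        return None


buffer = NonClosingStringIO()
with buffer:
    buffer.write("ATOM")

text = buffer.getvalue()  # works, because the with block did not close it
buffer.close()  # release the buffer explicitly once the text is extracted
```

Since a StringIO lives purely in memory, leaving it open mainly means its contents stay allocated until the object is closed or garbage collected, so closing it after getvalue() is a reasonable habit.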

50pmv0ei  #3

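Another way to solve the problem:
I tried to get to grips with git and ended up with this, which looks more consistent with the Biopython project's existing conventions, but I can't push it.

It uses as_handle from Bio.File: https://github.com/biopython/biopython/blob/e1902d1cdd3aa9325b4622b25d82fbf54633e251/Bio/File.py#L28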

@contextlib.contextmanager
def as_handle(handleish, mode="r", **kwargs):
    r"""Context manager to ensure we are using a handle.
    Context manager for arguments that can be passed to SeqIO and AlignIO read, write,
    and parse methods: either file objects or path-like objects (strings, pathlib.Path
    instances, or more generally, anything that can be handled by the builtin 'open'
    function).
    When given a path-like object, returns an open file handle to that path, with provided
    mode, which will be closed when the manager exits.
    All other inputs are returned, and are *not* closed.
    Arguments:
     - handleish  - Either a file handle or path-like object (anything which can be
                    passed to the builtin 'open' function, such as str, bytes,
                    pathlib.Path, and os.DirEntry objects)
     - mode       - Mode to open handleish (used only if handleish is a string)
     - kwargs     - Further arguments to pass to open(...)
    Examples
    --------
    >>> from Bio import File
    >>> import os
    >>> with File.as_handle('seqs.fasta', 'w') as fp:
    ...     fp.write('>test\nACGT')
    ...
    10
    >>> fp.closed
    True
    >>> handle = open('seqs.fasta', 'w')
    >>> with File.as_handle(handle) as fp:
    ...     fp.write('>test\nACGT')
    ...
    10
    >>> fp.closed
    False
    >>> fp.close()
    >>> os.remove("seqs.fasta")  # tidy up
    """
    try:
        with open(handleish, mode, **kwargs) as fp:
            yield fp
    except TypeError:
        yield handleish

Could anyone pass it along? [It needs checking, of course; my tests pass, but I'm a newbie.]
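
To illustrate how a save method built on as_handle would treat the two kinds of argument, here is a rough sketch. The as_handle body is a local copy of the helper quoted above so the example runs without Biopython, and `save_text` is a hypothetical stand-in, not PDBIO.save itself.

```python
import contextlib
import os
import tempfile
from io import StringIO


@contextlib.contextmanager
def as_handle(handleish, mode="r", **kwargs):
    """Local copy of the quoted helper: open paths, pass handles through."""
    try:
        with open(handleish, mode, **kwargs) as fp:
            yield fp
    except TypeError:
        # open() rejects non-path objects such as StringIO; use them as-is.
        yield handleish


def save_text(file, text):
    """Hypothetical save routine: accepts a path or an open handle."""
    with as_handle(file, "w") as fhandle:
        fhandle.write(text)


# A caller-supplied handle is written to and left open.
buffer = StringIO()
save_text(buffer, "ATOM")
print(buffer.closed)

# A path is opened, written, and closed by the context manager.
path = os.path.join(tempfile.mkdtemp(), "out.pdb")
save_text(path, "ATOM")
with open(path) as fh:
    print(fh.read())
```

With this pattern the caller keeps ownership of any handle it passes in, which is what nglview's get_structure_string relies on when it calls getvalue() afterwards.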
