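Getting reciprocal best hits from BLAST output files (Python)

I'm a beginner, and I'm trying to write a function that takes two XML files and finds the reciprocal best hits based on their total score. (Two proteins are reciprocal best hits if a protein from species A has a protein from species B as its best hit, and vice versa.) I hope someone can help me, because I don't know where to start.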

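from Bio.Blast import NCBIXML

# filename1 and filename2 are the paths of the two BLAST XML output files
record1 = NCBIXML.parse(open(filename1))
record2 = NCBIXML.parse(open(filename2))

for record in record1:
    query_id1 = record.query_id

    for alignment in record.alignments:
        # Total score of this subject: sum of the bit scores of all its HSPs
        total_score1 = 0

        for hsp in alignment.hsps:
            total_score1 += hsp.bits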

This is how I do it to find orthologous genes:

BLAST A against B.

Parse the output and save the best hits, for example:

# "A1_prot" comes from the query and "B1_prot" from the subject
matches = {"A1_prot": "B1_prot", "A2_prot": "B2_prot"}

BLAST B against A.

Parse this output while looking up each result in the dictionary from the previous step:

# Now "A1_prot" comes from the subject and "B1_prot" is the query
if matches["A1_prot"] == "B1_prot":
    orthologous.append(("A1_prot", "B1_prot"))
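
The steps above can be put together roughly as follows. This is only a sketch under some assumptions: each BLAST result has already been flattened into (query_id, subject_id, bit_score) rows, one per HSP (for example from record.query_id, alignment.hit_id and hsp.bits while parsing), and the "best" hit is the subject with the highest summed bit score. The function names best_hits and reciprocal_best_hits are made up for this illustration and are not part of Biopython.

```python
from collections import defaultdict


def best_hits(rows):
    """Map each query to the subject with the highest total bit score.

    rows: iterable of (query_id, subject_id, bit_score), one row per HSP.
    """
    totals = defaultdict(float)
    for query, subject, bits in rows:
        # Sum the HSP bit scores for every query/subject pair
        totals[(query, subject)] += bits

    best = {}  # query -> (subject, total score)
    for (query, subject), score in totals.items():
        # Strict ">" means the first pair seen wins when scores are tied
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (A, B) pairs that are each other's best hit."""
    matches = best_hits(a_vs_b)   # A protein -> best B protein
    reverse = best_hits(b_vs_a)   # B protein -> best A protein
    return sorted((a, b) for a, b in matches.items() if reverse.get(b) == a)


# Made-up example rows: A against B, then B against A
a_vs_b = [
    ("A1_prot", "B1_prot", 120.0),
    ("A1_prot", "B1_prot", 30.0),
    ("A1_prot", "B2_prot", 140.0),
    ("A2_prot", "B2_prot", 200.0),
    ("A3_prot", "B2_prot", 90.0),
]
b_vs_a = [
    ("B1_prot", "A1_prot", 150.0),
    ("B2_prot", "A2_prot", 200.0),
    ("B2_prot", "A3_prot", 90.0),
]
orthologous = reciprocal_best_hits(a_vs_b, b_vs_a)
```

In this example A3_prot is left out: its best hit is B2_prot, but the best hit of B2_prot is A2_prot. Using reverse.get(b) instead of reverse[b] avoids a KeyError for proteins that have no hit in the second search.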
