查看原文
其他

s09 - Perfect Matchings and RNA Secondary Structures

Y叔叔 YuLabSMU 2022-09-20

Problem

A matching in a graph G is a collection of edges of GG for which no node belongs to more than one edge in the collection. See Figure 2 for examples of matchings. If G contains an even number of nodes (say 2n), then a matching on G is perfect if it contains n edges, which is clearly the maximum possible. An example of a graph containing a perfect matching is shown in Figure 3.

First, let Kn denote the complete graph on 2n labeled nodes, in which every node is connected to every other node with an edge, and let pn denote the total number of perfect matchings in Kn. For a given node x, there are 2n−1 ways to join x to the other nodes in the graph, after which point we must form a perfect matching on the remaining 2n−2 nodes. This reasoning provides us with the recurrence relation pn=(2n−1)⋅pn-1; using the fact that p1 is 1, this recurrence relation implies the closed equation pn=(2n−1)(2n−3)(2n−5)⋯(3)(1).

Given an RNA string s=s1…sn, a bonding graph for ss is formed as follows. First, assign each symbol of ss to a node, and arrange these nodes in order around a circle, connecting them with edges called adjacency edges. Second, form all possible edges {A, U} and {C, G}, called basepair edges; we will represent basepair edges with dashed edges, as illustrated by the bonding graph in Figure 4.

Note that a matching contained in the basepair edges will represent one possibility for base pairing interactions in ss, as shown in Figure 5. For such a matching to exist, ss must have the same number of occurrences of ‘A’ as ‘U’ and the same number of occurrences of ‘C’ as ‘G’.

Given: An RNA string ss of length at most 80 bp having the same number of occurrences of ‘A’ as ‘U’ and the same number of occurrences of ‘C’ as ‘G’.

Return: The total possible number of perfect matchings of basepair edges in the bonding graph of s.

Sample Dataset

>Rosalind_23
AGCUAGUCAU

Sample Output

12

Solution

这道题写这么长,说白了可以用上面这个图来解释,一个A和一个U形成配对,就不能与另一个U形成配对,这道题要求我们计算各种组合的总数。如果AU数目是n,那么AU配对的可能组合数就是n!,AU组合数乘以GC组合数就是总的组合数。

#!/usr/bin/env python3
from Bio import SeqIO
from math import factorial

fasta_sequences = SeqIO.read("DATA/rosalind_pmch.txt",'fasta')
seq = str(fasta_sequences.seq)
nAU = seq.count('A')
nGC = seq.count('G')

res = factorial(nAU) * factorial(nGC)
print(res)

Python内置支持大整数就是方便,如果用java来写,你就得用额外的库来支持:

import java.math.BigInteger;

// 省略部分代码

   int m = 0;
   int n= 0;
   for (int i=0; i< seq.length(); i++) {    
     if (seq.charAt(i) == 'A')
       m++;        
     if (seq.charAt(i) == 'G')
       n++;
   }

   BigInteger res = BigInteger.valueOf(1);
   for (int i=1; i<=m; i++) {
       res = res.multiply(BigInteger.valueOf(i));
   }    
   for (int i = 1; i<=n; i++) {
       res = res.multiply(BigInteger.valueOf(i));
   }

   System.out.println(res);

往期精彩

您可能也对以下帖子感兴趣

文章有问题?点此查看未经处理的缓存