s01 - Counting DNA Nucleotides

2017-10-16 Y叔 (Uncle Y, who doesn't know Python) biobabble

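This is a problem from ROSALIND, where every problem is on a biological topic. Uncle Y, who doesn't know Python, is planning a series called "Learn Bioinformatics with Uncle Y" or "Learn Python by Solving Problems with Uncle Y". I don't know whether it will be popular, and how often it gets updated depends on how well it is received, so let me hear your applause!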

Problem

A string is simply an ordered collection of symbols selected from some alphabet and formed into a word; the length of a string is the number of symbols that it contains.


An example of a length 21 DNA string (whose alphabet contains the symbols ‘A’, ‘C’, ‘G’, and ‘T’) is “ATGCTTCAGAAAGGTCTTACG.”

Given: A DNA string s of length at most 1000 nt.

Return: Four integers (separated by spaces) counting the respective number of times that the symbols ‘A’, ‘C’, ‘G’, and ‘T’ occur in s.

Sample Dataset

AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC

Sample Output

20 12 17 21

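Solution

This problem gives a DNA sequence and asks how many times each of A, C, G and T occurs in it. That is easy: read the file and count.

Python strings have a count method that does the counting for us.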

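# Read the whole DNA string from the dataset file
with open("DATA/rosalind_dna.txt", "r") as f:
    dna = f.read()

# str.count returns the number of occurrences of each symbol
print(dna.count("A"), dna.count("C"), dna.count("G"), dna.count("T"))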
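As a quick sanity check, the same count calls can be applied directly to the sample dataset from the problem statement instead of a file. This is only a minimal sketch: the helper name count_nucleotides is hypothetical and introduced here for illustration.

```python
# Sample dataset copied from the problem statement
sample = "AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGC"


def count_nucleotides(dna):
    """Return the counts of A, C, G and T, in that order (hypothetical helper)."""
    return tuple(dna.count(base) for base in "ACGT")


# Matches the Sample Output given in the problem
print(*count_nucleotides(sample))  # prints: 20 12 17 21
```

Each call to count scans the whole string, so this makes four passes over the input; for a string of at most 1000 nt that cost is negligible.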

Since this problem is so simple, we might as well write it in C too.

#include <stdio.h>

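int main(void) {
    FILE *INFILE = fopen("DATA/rosalind_dna.txt", "r");
    if (INFILE == NULL) {
        perror("DATA/rosalind_dna.txt");
        return 1;
    }

    int nt;  /* int rather than char, so that EOF can be told apart from data */
    int a_cnt = 0, c_cnt = 0, g_cnt = 0, t_cnt = 0;

    /* Read one character at a time and tally the four nucleotides */
    while ((nt = fgetc(INFILE)) != EOF) {
        switch (nt) {
            case 'A': a_cnt++; break;
            case 'C': c_cnt++; break;
            case 'G': g_cnt++; break;
            case 'T': t_cnt++; break;
        }
    }
    fclose(INFILE);

    printf("%d %d %d %d\n", a_cnt, c_cnt, g_cnt, t_cnt);
    return 0;
}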
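The C version reads the input once and tallies each character as it goes. A rough Python counterpart of that single-pass idea, not part of the original solution, could use collections.Counter from the standard library. The function name count_single_pass is hypothetical.

```python
from collections import Counter


def count_single_pass(dna):
    """Tally every character in one pass, then report A, C, G and T (hypothetical helper)."""
    counts = Counter(dna)  # also tallies other characters, such as a trailing newline
    # A Counter returns 0 for symbols that never appeared
    return tuple(counts[base] for base in "ACGT")


# The trailing newline is counted by Counter but ignored in the result
print(*count_single_pass("AGCT\n"))  # prints: 1 1 1 1
```

Like the switch statement in the C code, this ignores any character other than the four nucleotides when reporting the result.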
