LeetCode 187: Repeated DNA Sequences

2015-07-20 17:19:48
Total Accepted: 1161 Total Submissions: 6887

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",
return: ["AAAAACCCCC", "CCCCCAAAAA"].

[Analysis]

The straightforward HashMap approach, keyed on the substrings themselves, exceeds the memory limit.
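For reference, the substring-keyed approach that this note rules out can be sketched roughly as follows. The class name NaiveRepeatedDna is hypothetical, and whether it actually hits the memory limit depends on the judge's settings; the point is only that every distinct 10-letter window is stored as its own String.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of the memory-hungry baseline, not the post's solution.
class NaiveRepeatedDna {
    static List<String> findRepeatedDnaSequences(String s) {
        Set<String> seen = new HashSet<>();      // every window seen so far
        Set<String> repeated = new HashSet<>();  // windows seen at least twice
        for (int i = 0; s != null && i + 10 <= s.length(); i++) {
            String window = s.substring(i, i + 10);
            // add() returns false when the window was already present
            if (!seen.add(window)) {
                repeated.add(window);
            }
        }
        return new ArrayList<>(repeated);
    }
}
```

Each stored window costs a String object plus its character data, whereas a 2-bit-per-letter key fits in a single int.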

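Because there are only four letters, we can build our own hash key in which each incoming character takes two bits. Once the key grows past 20 bits, i.e. 10 characters, keep only the low 20 bits.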
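The two-bit encoding and the 20-bit truncation can be sketched in isolation like this. The names DnaRollingHash, code, push and windowKeys are made up for illustration; the mapping A=0, C=1, G=2, T=3 follows the listing in this post.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper showing only the key computation, not the full solution.
class DnaRollingHash {
    static final int WINDOW = 10;
    // Twenty 1-bits: 2 bits per letter times 10 letters (0xFFFFF).
    static final int MASK = (1 << (2 * WINDOW)) - 1;

    // Map a nucleotide to its 2-bit code.
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("not a nucleotide: " + c);
        }
    }

    // Shift the key left by one letter, append the new letter, drop the oldest.
    static int push(int hash, char c) {
        return ((hash << 2) | code(c)) & MASK;
    }

    // Return the 20-bit key of every 10-letter window, in order of appearance.
    static List<Integer> windowKeys(String s) {
        List<Integer> keys = new ArrayList<>();
        int hash = 0;
        for (int i = 0; i < s.length(); i++) {
            hash = push(hash, s.charAt(i));
            if (i >= WINDOW - 1) {
                keys.add(hash);
            }
        }
        return keys;
    }
}
```

Two windows get the same key exactly when they contain the same 10 letters, so the key can stand in for the substring in a set of integers.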

[Notes]

1. (hash<<2) + map.get(c): mind operator precedence. In Java, + binds more tightly than <<, so the shift must be parenthesized.
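A small sketch of what goes wrong without the parentheses; the class and method names are hypothetical.

```java
// Hypothetical demo of Java's precedence rule: additive operators bind tighter than shifts.
class ShiftPrecedence {
    // Intended: make room for two bits, then add the letter code.
    static int withParentheses(int hash, int code) {
        return (hash << 2) + code;
    }

    // Without parentheses this is parsed as hash << (2 + code).
    static int withoutParentheses(int hash, int code) {
        return hash << 2 + code;
    }
}
```

With hash = 1 and code = 3, the first form gives 7 while the second gives 32, because the shift amount becomes 5.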


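import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        List<String> res = new ArrayList<String>();
        // Fewer than 11 characters cannot contain two 10-letter windows.
        if (s == null || s.length() < 11) return res;
        int hash = 0;
        // Two-bit code for each nucleotide.
        Map<Character, Integer> map = new HashMap<Character, Integer>();
        map.put('A', 0);
        map.put('C', 1);
        map.put('G', 2);
        map.put('T', 3);
        Set<Integer> set = new HashSet<Integer>();     // keys seen so far
        Set<Integer> unique = new HashSet<Integer>();  // keys already reported
        // The source listing breaks off at this loop header. The element types of
        // the two sets and the loop body are reconstructed from the notes above
        // (an assumption).
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // Append two bits for the new character and keep only the low 20 bits.
            hash = ((hash << 2) + map.get(c)) & 0xFFFFF;
            if (i >= 9 && !set.add(hash) && unique.add(hash)) {
                res.add(s.substring(i - 9, i + 1));
            }
        }
        return res;
    }
}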