Package dna
Class DNA
- java.lang.Object
-
- dna.DNA
-
public class DNA extends Object
-
-
Constructor Summary
Constructor: Description
DNA(): Constructor for FASTA file processing.
DNA(int n): Constructor for random generation of a DNA sequence.
DNA(File file): Constructor that takes a file as an argument and saves it in the class.
DNA(String dnaString): Constructor that takes a DNA string, converts its characters into Base values, and inserts them into a list.
-
Method Summary
Modifier and Type, Method: Description
Map<Long,List<Integer>> buildIndex(int k): Stores DNA hash values and their corresponding indices in a hash table.
void buildIndex(int start, int end, int k, Map<Long,List<Integer>> map): Builds a hash table from DNA hash values and their indices.
Map<Long,List<Integer>> buildIndexFast(int k): Uses multi-threading to build the hash table.
void buildIndexFile(int k): Calculates the k-mer at every index of the DNA sequence and inserts each one into the database.
void clearTable(int k): Clears the table.
List<Integer> findIndex(Map<Long,List<Integer>> map, DNA kmer): Searches for the start indices at which the k-mer occurs.
List<Integer> findIndexFast(Map<Long,List<Integer>> map, DNA kmer): Uses a stream to compare k-mers from the DNA sequence with the target k-mer and finds the indices that match.
List<Integer> getIndex(DNA kmer): Brute-force search.
List<Integer> getIndexBit(DNA kmer): Calculates hash values of sub-sequences using the Rabin-Karp algorithm and bit operations.
List<Long> getIndexDB(DNA kmer): Finds the indices at which the DNA hash value matches that of the k-mer.
List<Integer> getIndexFast(DNA kmer): Uses multi-threading to find the indices at which the DNA sequence matches the target k-mer.
List<Long> getIndexFile(DNA kmer)
List<Integer> getIndexHash(DNA kmer): Calculates hash values of k-mers from the sequence using the Rabin-Karp algorithm.
List<Integer> getIndexRange(DNA kmer, int start, int end): Uses brute force to find the indices at which k-mers from the sequence match the target k-mer.
int getSize()
boolean isSame(DNA kmer, int index): Compares the k-mer, character by character, with the sequence at the given index.
String toString()
void viewDB(int k): Views the database.
-
-
-
Constructor Detail
-
DNA
public DNA()
Constructor for FASTA file processing.
-
DNA
public DNA(String dnaString)
Constructor that takes a DNA string, converts its characters into Base values, and inserts them into a list.
- Parameters:
dnaString - DNA sequence
-
DNA
public DNA(int n)
Constructor for random generation of a DNA sequence.
- Parameters:
n - length of the random sequence to be generated
-
DNA
public DNA(File file)
Constructor that takes a file as an argument and saves it in the class.
- Parameters:
file - FASTA file
-
-
Method Detail
-
buildIndexFile
public void buildIndexFile(int k) throws Exception
Calculates the k-mer at every index of the DNA sequence and inserts each one into the database.
- Parameters:
k - length of the k-mer
- Throws:
Exception - if an error occurs on the database side
-
clearTable
public void clearTable(int k) throws Exception
Clears the table.
- Parameters:
k - length of the k-mer
- Throws:
Exception - if the table cannot be cleared
-
viewDB
public void viewDB(int k) throws Exception
Views the database.
- Parameters:
k - length of the k-mer
- Throws:
Exception - if the database does not exist
-
getIndexDB
public List<Long> getIndexDB(DNA kmer) throws Exception
Finds the indices at which the DNA hash value matches that of the k-mer.
- Parameters:
kmer - sequence of the k-mer
- Returns:
- the list with target indices
- Throws:
Exception - if an error occurs while reading the input file or the database
-
getIndex
public List<Integer> getIndex(DNA kmer)
Brute-force search.
- Parameters:
kmer - sequence of the k-mer
- Returns:
- list of start indices in the sequence at which the k-mer matches
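A brute-force search of this kind can be sketched as follows. This is only an illustration: it works on plain `String` values instead of the `DNA` class, whose internals are not shown here, and the names `BruteForceKmerSearch` and `findAll` are hypothetical. The helper `isSame` mirrors the role of the method of the same name on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; not the DNA class's actual implementation.
final class BruteForceKmerSearch {

    /** Returns true if kmer occurs in sequence starting exactly at index. */
    static boolean isSame(String sequence, String kmer, int index) {
        if (index < 0 || index + kmer.length() > sequence.length()) {
            return false; // the k-mer would run past the end of the sequence
        }
        for (int j = 0; j < kmer.length(); j++) {
            if (sequence.charAt(index + j) != kmer.charAt(j)) {
                return false;
            }
        }
        return true;
    }

    /** Tries every start position and collects those where the k-mer matches. */
    static List<Integer> findAll(String sequence, String kmer) {
        List<Integer> hits = new ArrayList<>();
        if (kmer.isEmpty()) {
            return hits;
        }
        for (int i = 0; i + kmer.length() <= sequence.length(); i++) {
            if (isSame(sequence, kmer, i)) {
                hits.add(i);
            }
        }
        return hits;
    }
}
```

With these assumptions, `BruteForceKmerSearch.findAll("ACGTACGT", "ACG")` returns `[0, 4]`. The running time is proportional to the sequence length times k, which is the cost the hash-based methods on this page aim to reduce.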
-
getIndexFile
public List<Long> getIndexFile(DNA kmer)
- Parameters:
kmer - sequence of the k-mer
- Returns:
- list of indices at which k-mers from the sequence match the target k-mer
-
getIndexRange
public List<Integer> getIndexRange(DNA kmer, int start, int end)
Uses brute force to find the indices at which k-mers from the sequence match the target k-mer.
- Parameters:
kmer - sequence of the k-mer
start - start index of the sequence
end - end index of the sequence
- Returns:
- the list with target indices
-
isSame
public boolean isSame(DNA kmer, int index)
Compares the k-mer, character by character, with the sequence at the given index.
- Parameters:
kmer - sequence of the target k-mer
index - index in the DNA sequence
- Returns:
- true if the k-mer matches the sequence at the given index
-
getIndexHash
public List<Integer> getIndexHash(DNA kmer)
Calculates hash values of k-mers from the sequence using the Rabin-Karp algorithm.
- Parameters:
kmer - sequence of the target k-mer
- Returns:
- list of indices at which the hash value of a k-mer from the sequence matches that of the target k-mer
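The Rabin-Karp idea named above can be sketched as a rolling hash over the sequence. This is a minimal illustration under stated assumptions, not the class's actual code: it uses plain `String` input, a base of 4 (one digit per nucleotide), and a modulus of 1,000,000,007, all of which are choices made here. The names `RabinKarpKmerSearch`, `code`, and `findAll` are hypothetical. Because different k-mers can share a hash value under a modulus, the sketch confirms each hash hit with a direct comparison.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of Rabin-Karp matching on a DNA string.
final class RabinKarpKmerSearch {
    private static final long BASE = 4;              // assumed: one base-4 digit per nucleotide
    private static final long MOD = 1_000_000_007L;  // assumed prime modulus

    /** Maps a nucleotide to a digit 0..3. */
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("Not a DNA base: " + c);
        }
    }

    static List<Integer> findAll(String sequence, String kmer) {
        List<Integer> hits = new ArrayList<>();
        int n = sequence.length();
        int k = kmer.length();
        if (k == 0 || k > n) {
            return hits;
        }
        // Weight of the leading digit: BASE^(k-1) mod MOD.
        long high = 1;
        for (int i = 1; i < k; i++) {
            high = high * BASE % MOD;
        }
        // Hash of the target k-mer and of the first window.
        long target = 0;
        long window = 0;
        for (int i = 0; i < k; i++) {
            target = (target * BASE + code(kmer.charAt(i))) % MOD;
            window = (window * BASE + code(sequence.charAt(i))) % MOD;
        }
        for (int i = 0; ; i++) {
            // Equal hashes are only a candidate; verify to rule out collisions.
            if (window == target && sequence.regionMatches(i, kmer, 0, k)) {
                hits.add(i);
            }
            if (i + k >= n) {
                break;
            }
            // Slide the window: drop the leading digit, append the next one.
            long out = code(sequence.charAt(i)) * high % MOD;
            window = ((window - out + MOD) % MOD * BASE + code(sequence.charAt(i + k))) % MOD;
        }
        return hits;
    }
}
```

Each window hash is derived from the previous one in constant time, so the scan costs roughly one pass over the sequence plus the verification of candidate positions. For example, `RabinKarpKmerSearch.findAll("ACGTACGT", "CGT")` returns `[1, 5]`.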
-
getIndexBit
public List<Integer> getIndexBit(DNA kmer)
Calculates hash values of sub-sequences using the Rabin-Karp algorithm and bit operations.
- Parameters:
kmer - sequence of the target k-mer
- Returns:
- list of indices of matching sub-sequences in the DNA sequence
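One common way to combine a rolling hash with bit operations is to pack each base into 2 bits, so that sliding the window becomes a shift, an OR, and a mask. The sketch below assumes that encoding (A=0, C=1, G=2, T=3) and a k of at most 31 so the window fits in a `long`; whether the `DNA` class uses this exact scheme is not stated on this page. The names `BitKmerSearch` and `findAll` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: rolling 2-bit encoding of DNA windows.
final class BitKmerSearch {

    /** Maps a nucleotide to a 2-bit value. */
    static long code(char c) {
        switch (c) {
            case 'A': return 0L;
            case 'C': return 1L;
            case 'G': return 2L;
            case 'T': return 3L;
            default: throw new IllegalArgumentException("Not a DNA base: " + c);
        }
    }

    static List<Integer> findAll(String sequence, String kmer) {
        List<Integer> hits = new ArrayList<>();
        int n = sequence.length();
        int k = kmer.length();
        if (k == 0 || k > n) {
            return hits;
        }
        if (k > 31) {
            throw new IllegalArgumentException("This sketch supports k <= 31");
        }
        long mask = (1L << (2 * k)) - 1; // keeps only the lowest 2k bits
        long target = 0;
        for (int i = 0; i < k; i++) {
            target = (target << 2) | code(kmer.charAt(i));
        }
        long window = 0;
        for (int i = 0; i < n; i++) {
            // Shift in the new base; the mask drops the base that left the window.
            window = ((window << 2) | code(sequence.charAt(i))) & mask;
            if (i >= k - 1 && window == target) {
                hits.add(i - k + 1);
            }
        }
        return hits;
    }
}
```

Because the 2-bit packing is exact for a fixed k, equal values imply equal k-mers and no verification step is needed in this variant. For example, `BitKmerSearch.findAll("ACGTACGT", "GTA")` returns `[2]`.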
-
buildIndex
public Map<Long,List<Integer>> buildIndex(int k)
Stores DNA hash values and their corresponding indices in a hash table.
- Parameters:
k - length of the k-mer
- Returns:
- hash table mapping each DNA hash value to its list of indices
-
buildIndex
public void buildIndex(int start, int end, int k, Map<Long,List<Integer>> map)
Builds a hash table from DNA hash values and their indices. If a DNA hash value is already present in the map, this method adds the current index to the existing list. Otherwise, it inserts the DNA hash value into the map together with a new list containing the current index.
- Parameters:
start - start index
end - end index
k - length of the k-mer
map - hash table with DNA hash values and their corresponding indices
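The insert-or-append behaviour described above maps naturally onto `Map.computeIfAbsent`. The following sketch assumes a 2-bit packing of each k-mer as the hash value (with k of at most 31) and plain `String` input; the names `KmerIndex`, `encode`, `build`, and `find` are hypothetical and only illustrate the pattern.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a k-mer index: hash value -> list of start indices.
final class KmerIndex {

    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("Not a DNA base: " + c);
        }
    }

    /** Packs the k bases starting at start into a long, 2 bits per base (assumes k <= 31). */
    static long encode(String s, int start, int k) {
        long value = 0;
        for (int i = start; i < start + k; i++) {
            value = (value << 2) | code(s.charAt(i));
        }
        return value;
    }

    /** Adds every k-mer starting in [start, end) to the given map. */
    static void build(String sequence, int start, int end, int k, Map<Long, List<Integer>> map) {
        for (int i = start; i < end && i + k <= sequence.length(); i++) {
            // Existing key: append to its list. New key: create the list first.
            map.computeIfAbsent(encode(sequence, i, k), key -> new ArrayList<>()).add(i);
        }
    }

    /** Looks up the start indices recorded for the given k-mer. */
    static List<Integer> find(Map<Long, List<Integer>> map, String kmer) {
        return map.getOrDefault(encode(kmer, 0, kmer.length()), Collections.emptyList());
    }
}
```

As a usage example, filling a new `HashMap` with `KmerIndex.build("ACGTACGT", 0, 6, 3, map)` yields four keys, and `KmerIndex.find(map, "ACG")` returns `[0, 4]`. Building the index costs one pass over the sequence, after which each lookup is a single hash-table access.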
-
buildIndexFast
public Map<Long,List<Integer>> buildIndexFast(int k)
Uses multi-threading to build the hash table.
- Parameters:
k - length of the k-mer
- Returns:
- map from each hash value to the list of its corresponding indices
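One possible way to parallelize index construction is to split the start positions into contiguous chunks, let each worker build a private map for its chunk, and merge the partial maps afterwards. The sketch below follows that approach with an `ExecutorService`; it is an assumption about how such a method could work, not a description of `buildIndexFast` itself. The names `ParallelKmerIndex`, `buildRange`, and `build`, the `threads` parameter, and the 2-bit packing (k of at most 31) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: chunked, multi-threaded k-mer index construction.
final class ParallelKmerIndex {

    /** Packs k bases starting at start into a long, 2 bits per base (assumes k <= 31). */
    static long encode(String s, int start, int k) {
        long value = 0;
        for (int i = start; i < start + k; i++) {
            int code = "ACGT".indexOf(s.charAt(i));
            if (code < 0) {
                throw new IllegalArgumentException("Not a DNA base: " + s.charAt(i));
            }
            value = (value << 2) | code;
        }
        return value;
    }

    /** Builds a private map for start positions in [start, end); no shared state. */
    static Map<Long, List<Integer>> buildRange(String seq, int start, int end, int k) {
        Map<Long, List<Integer>> local = new HashMap<>();
        for (int i = start; i < end; i++) {
            local.computeIfAbsent(encode(seq, i, k), key -> new ArrayList<>()).add(i);
        }
        return local;
    }

    static Map<Long, List<Integer>> build(String seq, int k, int threads) {
        Map<Long, List<Integer>> index = new HashMap<>();
        int positions = seq.length() - k + 1; // number of valid start positions
        if (k <= 0 || positions <= 0) {
            return index;
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = (positions + threads - 1) / threads; // ceiling division
            List<Future<Map<Long, List<Integer>>>> parts = new ArrayList<>();
            for (int start = 0; start < positions; start += chunk) {
                final int from = start;
                final int to = Math.min(start + chunk, positions);
                parts.add(pool.submit(() -> buildRange(seq, from, to, k)));
            }
            // Merge in chunk order so each index list stays sorted.
            for (Future<Map<Long, List<Integer>>> part : parts) {
                for (Map.Entry<Long, List<Integer>> e : part.get().entrySet()) {
                    index.computeIfAbsent(e.getKey(), key -> new ArrayList<>()).addAll(e.getValue());
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while building the index", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Index task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
        return index;
    }
}
```

Keeping each worker's map private avoids locking during the scan; the single-threaded merge at the end is the only step that touches the shared result. Other designs, such as a `ConcurrentHashMap` with thread-safe lists, would also be possible.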
-
findIndexFast
public List<Integer> findIndexFast(Map<Long,List<Integer>> map, DNA kmer)
Uses a stream to compare k-mers from the DNA sequence with the target k-mer and finds the indices that match.
- Parameters:
map - hash table with DNA hash values and their corresponding lists of indices
kmer - sequence of the target k-mer
- Returns:
- list of indices at which a k-mer from the DNA sequence matches the target k-mer
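A stream-based comparison of this kind might look like the sketch below: the candidate indices found under the k-mer's hash value are filtered down to those where the sequence really contains the k-mer, which guards against hash collisions. It uses plain `String` input, and the names `StreamKmerLookup` and `verify` are hypothetical; how `findIndexFast` obtains and checks its candidates is not shown on this page.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: verifying candidate indices with a stream.
final class StreamKmerLookup {

    /** Keeps only the candidate indices at which kmer actually occurs in sequence. */
    static List<Integer> verify(String sequence, String kmer, List<Integer> candidates) {
        return candidates.stream()
                // regionMatches returns false for out-of-range offsets instead of throwing
                .filter(i -> sequence.regionMatches(i, kmer, 0, kmer.length()))
                .collect(Collectors.toList());
    }
}
```

For example, `StreamKmerLookup.verify("ACGTACGT", "ACG", Arrays.asList(0, 1, 4, 7))` returns `[0, 4]`.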
-
getSize
public int getSize()
- Returns:
- size of the base
-
findIndex
public List<Integer> findIndex(Map<Long,List<Integer>> map, DNA kmer)
Searches for the start indices at which the k-mer occurs.
- Parameters:
map - map from each hash value to the list of its corresponding indices
kmer - target k-mer
- Returns:
- list of indices
-
-