Java - Linked List Optimization
I'm working on a program that emulates restriction enzymes and DNA splicing. I'm using DNASequenceNode[s] as the linked list nodes.
I have a problem with one of the functions in my code: cutSplice() is supposed to create a new DNAStrand that is a clone of the current DNAStrand, but with every instance of enzyme replaced by splicee.
For example, if the LinkedDNAStrand is instantiated with "ttgatcc", and cutSplice("gat", "ttaagg") is called, then the linked list should become something like this (previous pointers not shown):
first -> "tt" -> "ttaagg" -> "cc" -> null
My function works. However, my method cutSplice() takes more than 80 seconds to splice 200 DNAs. I'm supposed to bring those 80 seconds down to 2 seconds.
This is the code for the class: LinkedDNAStrand.java
And here's the code for the method cutSplice():
    public DNAStrand cutSplice(String enzyme, String splicee) {
        DNAStrand newStrand = null;
        String original_dna = this.toString();
        // Note: replaceAll and split both treat their first argument as a regex.
        String new_dna = original_dna.replaceAll(enzyme, splicee);
        String[] splicee_split = new_dna.split(splicee); // splits the new DNA string
        int i = 0;
        if (original_dna.startsWith(enzyme)) {
            newStrand = new LinkedDNAStrand(splicee);
        } else {
            newStrand = new LinkedDNAStrand(splicee_split[0]);
            newStrand.append(splicee);
        }
        for (i = 1; i < splicee_split.length - 1; i++) {
            String node = splicee_split[i];
            newStrand.append(node);
            newStrand.append(splicee);
        }
        newStrand.append(splicee_split[splicee_split.length - 1]);
        if (original_dna.endsWith(enzyme)) {
            newStrand.append(splicee);
        }
        return newStrand;
    }
Does anyone see anything that could make a critical difference to the time this function takes to process the sample of 200 DNAs?
Well, it is comfortable to use the String methods, but you are losing time in converting to a string, then back to a sequence, and (as pointed out in the previous comments) in the regex-based string functions.
It would consume less time to operate on the linked list directly, although this requires you to implement the replacement algorithm yourself:
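To illustrate the regex point: replaceAll and split interpret their first argument as a regular expression, whereas String.replace works on literal text. The following is only a minimal sketch; the class and method names (LiteralReplaceDemo, spliceLiteral, spliceRegexQuoted) are hypothetical and not part of the question's code, and no timing is claimed for either variant.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative, not from the question.
class LiteralReplaceDemo {

    // String.replace treats both arguments as literal text.
    static String spliceLiteral(String dna, String enzyme, String splicee) {
        return dna.replace(enzyme, splicee);
    }

    // replaceAll treats its first argument as a regex and its second as a
    // replacement template, so both must be quoted to behave literally.
    static String spliceRegexQuoted(String dna, String enzyme, String splicee) {
        return dna.replaceAll(Pattern.quote(enzyme), Matcher.quoteReplacement(splicee));
    }

    public static void main(String[] args) {
        System.out.println(spliceLiteral("ttgatcc", "gat", "ttaagg")); // ttttaaggcc
        System.out.println("a.b".replaceAll(".", "-"));                // --- ("." matches any character)
        System.out.println(spliceLiteral("a.b", ".", "-"));            // a-b
    }
}
```

For plain DNA letters both variants give the same result; the difference only shows up once the arguments contain regex metacharacters.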
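    @Override
    public LinkedDNAStrand cutSplice(String enzyme, String splicee) {
        LinkedDNAStrand strand = new LinkedDNAStrand();
        DNASequenceNode end = null;   // tail of the new strand built so far
        DNASequenceNode begin = top;  // first source node not yet copied
        int pos = 0;                  // number of enzyme characters matched so far
        DNASequenceNode tmpStart, tmpEnd;
        for (DNASequenceNode current = top; current != null; current = current.next) {
            if (current.value != enzyme.charAt(pos)) {
                // Mismatch: copy only the first pending node and rescan from the
                // node after it, so that a match starting inside a failed partial
                // match (e.g. "gat" in "ggat") is not skipped.
                tmpStart = tmpEnd = new DNASequenceNode(begin.value);
                current = begin;
            } else if (++pos == enzyme.length()) {
                // Full match: emit a node-by-node copy of splicee instead.
                tmpStart = tmpEnd = new DNASequenceNode(splicee.charAt(0));
                for (int i = 1; i < splicee.length(); ++i) {
                    DNASequenceNode c = new DNASequenceNode(splicee.charAt(i));
                    tmpEnd.next = c;
                    c.previous = tmpEnd;
                    tmpEnd = c;
                }
            } else {
                continue; // partial match so far, keep scanning
            }
            // Link the freshly built segment to the end of the new strand.
            if (end == null) {
                strand.top = tmpStart;
            } else {
                end.next = tmpStart;
                tmpStart.previous = end;
            }
            end = tmpEnd;
            begin = current.next;
            pos = 0;
        }
        // Copy a trailing partial match (e.g. "ga" at the very end) unchanged.
        for (DNASequenceNode n = begin; n != null; n = n.next) {
            DNASequenceNode c = new DNASequenceNode(n.value);
            if (end == null) {
                strand.top = c;
            } else {
                end.next = c;
                c.previous = end;
            }
            end = c;
        }
        return strand;
    }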
I do not claim that there is no opportunity to optimize this further, but it should be a lot faster than the original version. I tested it with the example you gave; if you still find a bug, feel free to fix it yourself...
Note 1: I explicitly create the new sequence from the string (instead of using the constructor) in order to get the end of the sequence without having to iterate over it again.
Note 2: I assumed the existence of a constructor DNASequenceNode(char value) and that DNASequenceNode has a member public char value. You might have to adjust the code appropriately if either of these assumptions fails.
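For comparison, if the nodes hold whole string segments, as in the question's example (first -> "tt" -> "ttaagg" -> "cc" -> null), a single pass with String.indexOf avoids the regex functions and keeps one node per segment. The following is only a sketch under stated assumptions: SegmentNode and SegmentStrand are hypothetical stand-ins, not the asker's DNASequenceNode and LinkedDNAStrand, and the tail pointer that makes append O(1) is an assumption about what the real class could offer. Whether this is fast enough for the 200-DNA sample is untested.

```java
// Hypothetical stand-in for a node that stores a whole DNA segment.
class SegmentNode {
    final String value;
    SegmentNode next, previous;

    SegmentNode(String value) { this.value = value; }
}

// Hypothetical stand-in for a doubly linked strand with a tail pointer.
class SegmentStrand {
    SegmentNode first, last;

    SegmentStrand() {}

    SegmentStrand(String dna) { append(dna); }

    // Appends in O(1) via the tail pointer; empty segments are skipped.
    void append(String segment) {
        if (segment.isEmpty()) return;
        SegmentNode node = new SegmentNode(segment);
        if (first == null) {
            first = last = node;
        } else {
            last.next = node;
            node.previous = last;
            last = node;
        }
    }

    // Builds a new strand in which every literal occurrence of enzyme
    // is replaced by a single splicee node.
    SegmentStrand cutSplice(String enzyme, String splicee) {
        String dna = toString(); // flattens the strand once
        SegmentStrand result = new SegmentStrand();
        if (enzyme.isEmpty()) {  // nothing to cut: return a plain copy
            result.append(dna);
            return result;
        }
        int from = 0;
        int hit;
        while ((hit = dna.indexOf(enzyme, from)) >= 0) {
            result.append(dna.substring(from, hit)); // text before the enzyme
            result.append(splicee);
            from = hit + enzyme.length();
        }
        result.append(dna.substring(from)); // remainder after the last enzyme
        return result;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        for (SegmentNode n = first; n != null; n = n.next) sb.append(n.value);
        return sb.toString();
    }

    // Renders the list in the notation used in the question.
    String describe() {
        StringBuilder sb = new StringBuilder("first -> ");
        for (SegmentNode n = first; n != null; n = n.next) {
            sb.append('"').append(n.value).append("\" -> ");
        }
        return sb.append("null").toString();
    }

    public static void main(String[] args) {
        SegmentStrand spliced = new SegmentStrand("ttgatcc").cutSplice("gat", "ttaagg");
        System.out.println(spliced.describe());
        // first -> "tt" -> "ttaagg" -> "cc" -> null
    }
}
```

Unlike the character-per-node version above, this sketch still converts the strand to a string once, so it trades that conversion for far fewer node allocations.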