de.unibi.techfak.jpredictor.operator
Interface IOperator

All Known Implementing Classes:
Operator

public interface IOperator

This interface defines the methods that an operator acting on motifs and sequences should implement.

The operator's methods get their options and parameters from a class implementing the interface Communicator. Note that the operator's actions should run in threads, and that results should be delivered sequentially to the calling process through the communicator's methods.
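The calling convention described above (start the work in a thread, hand results over one at a time, return null when the work was started) can be sketched as follows. This is only an illustration under stated assumptions: ResultSink is a hypothetical stand-in for the result-delivery side of a Communicator, whose real methods are not listed on this page, and the error number, error text, and the -1 end marker in this sketch are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class AsyncOperatorSketch {
    /** Hypothetical stand-in for the result-delivery side of a Communicator. */
    interface ResultSink {
        void deliver(int value);
    }

    /**
     * Starts the work in a background thread.
     * Returns null if the work was started, otherwise an error string
     * (error number and wording are invented for illustration).
     */
    static String start(int[] values, ResultSink sink) {
        if (values == null || sink == null) {
            return "1: missing input or missing result receiver";
        }
        Thread worker = new Thread(() -> {
            for (int v : values) {
                sink.deliver(v);   // results are handed over one at a time, in order
            }
            sink.deliver(-1);      // end marker, assumed here to signal completion
        });
        worker.start();
        return null;               // null means: successfully initiated
    }

    /** Caller side: starts the work and blocks until the end marker arrives. */
    static List<Integer> runAndCollect(int[] values) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        String error = start(values, queue::add);
        if (error != null) {
            throw new IllegalStateException(error);
        }
        List<Integer> received = new ArrayList<>();
        try {
            int v;
            do {
                v = queue.take();  // waits until the worker delivers the next value
                received.add(v);
            } while (v != -1);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for results", e);
        }
        return received;
    }

    public static void main(String[] args) {
        // The input values are assumed to be non-negative, so -1 is unambiguous.
        System.out.println(runAndCollect(new int[] {3, 1, 2}));
    }
}
```

The caller never receives results through the return value; it only learns from it whether the work could be started, and everything else arrives through the receiver.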


Method Summary
 long getSequenceLength(int seqNr)
          While searching or scoring, the length of every searched sequence can be obtained.
 java.lang.String getSequenceName(int seqNr)
          While searching or scoring, the name of every searched sequence can be obtained.
 int getSequenceNumber()
          After searching or scoring, the number of sequences that have been processed can be obtained with this method.
 java.lang.String scoreSequence()
          A sequence is scored window by window by summing up the motifs' weights.
 java.lang.String searchMotifs(boolean count)
          Search for motifs on a sequence.
 java.lang.String weightMotifs()
          Weight motifs by searching them on two training sets.
 

Method Detail

getSequenceNumber

int getSequenceNumber()
After searching or scoring, the number of sequences that have been processed can be obtained with this method.
In the case of scoring, the total number of sequences processed in both the positive and the negative training set is returned. For instance, if the positive training set is a FASTA file with 6 sequences and the negative training set is a String, then 7 is returned.

Returns:
The number of sequences processed.

getSequenceLength

long getSequenceLength(int seqNr)
While searching or scoring, the length of every searched sequence can be obtained. If seqNr is negative, or greater than or equal to getSequenceNumber(), the total length over all sequences is calculated and returned.

Parameters:
seqNr - The number of the sequence whose length is of interest.
Returns:
The length of one sequence processed, or the total length of all sequences.

getSequenceName

java.lang.String getSequenceName(int seqNr)
While searching or scoring, the name of every searched sequence can be obtained. If seqNr is negative, or greater than or equal to getSequenceNumber(), null is returned.

Parameters:
seqNr - The number of the sequence whose name is of interest.
Returns:
The name of a sequence processed, or null, if seqNr is out of range.
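The out-of-range behaviour of the three accessor methods differs: getSequenceLength falls back to the total length, while getSequenceName returns null. A minimal sketch of that convention follows. SequenceInfoSketch and addSequence are hypothetical names used only for this illustration; this is not the library's Operator class.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for the three accessors; not the library's Operator. */
class SequenceInfoSketch {
    private final List<String> names = new ArrayList<>();
    private final List<Long> lengths = new ArrayList<>();

    /** Hypothetical helper that records one processed sequence. */
    void addSequence(String name, long length) {
        names.add(name);
        lengths.add(length);
    }

    int getSequenceNumber() {
        return names.size();
    }

    long getSequenceLength(int seqNr) {
        if (seqNr < 0 || seqNr >= getSequenceNumber()) {
            long total = 0;
            for (long len : lengths) {
                total += len;      // out of range: sum over all sequences
            }
            return total;
        }
        return lengths.get(seqNr);
    }

    String getSequenceName(int seqNr) {
        if (seqNr < 0 || seqNr >= getSequenceNumber()) {
            return null;           // out of range: no name available
        }
        return names.get(seqNr);
    }

    public static void main(String[] args) {
        SequenceInfoSketch info = new SequenceInfoSketch();
        info.addSequence("seqA", 120L);
        info.addSequence("seqB", 80L);
        for (int i = 0; i < info.getSequenceNumber(); i++) {
            System.out.println(info.getSequenceName(i) + ": " + info.getSequenceLength(i));
        }
        System.out.println("total: " + info.getSequenceLength(-1));
    }
}
```

Because valid sequence numbers run from 0 to getSequenceNumber() - 1, passing -1 is a convenient way to ask for the total length.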

searchMotifs

java.lang.String searchMotifs(boolean count)
Search for motifs on a sequence. Both the MotifList and the information about the sequence are taken from the Communicator this operator was initialized with. The search is done using a sequence window that is shifted along the sequence.
The output depends on the parameter. If count=false, one FoundMotifStruct is output for every occurrence; its id gives the motif's number in the list. Note that the findings are ordered first by position and then by id. If count=true, the occurrences are counted per motif. After every searched sequence, a FoundMotifStruct with id=motifNumber and start[0]=occurrences is output. Note that the occurrences are output in the order in which the motifs are arranged in the motif list.
Once a sequence has been completely searched, a FoundMotifStruct with id=-1 and start[0]=0 is output. Once the whole search is done, a FoundMotifStruct with id=-1 and start[0]=-1 is output.

Parameters:
count - false if all occurrences should be sent to the communicator, true if they are counted for every motif.
Returns:
null iff the search was initiated; otherwise a String with the error number followed by a short explanation of why the search could not be started.
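A receiver of the search output has to tell ordinary findings apart from the two marker objects (id=-1 with start[0]=0 after each sequence, id=-1 with start[0]=-1 at the very end). The following sketch shows one way to split the output for count=false into one list per sequence. The class Found is a hypothetical stand-in for FoundMotifStruct that models only the id and start[0] values described above, and delivering the output as a List is an assumption made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SearchOutputSketch {
    /** Hypothetical stand-in for FoundMotifStruct; only id and start[0] are modelled. */
    static final class Found {
        final int id;
        final int[] start;

        Found(int id, int start0) {
            this.id = id;
            this.start = new int[] {start0};
        }
    }

    /** Splits the output for count=false into one list of findings per sequence. */
    static List<List<Found>> groupBySequence(List<Found> output) {
        List<List<Found>> perSequence = new ArrayList<>();
        List<Found> current = new ArrayList<>();
        for (Found f : output) {
            if (f.id == -1 && f.start[0] == -1) {
                break;                          // whole search is done
            } else if (f.id == -1 && f.start[0] == 0) {
                perSequence.add(current);       // one sequence completely searched
                current = new ArrayList<>();
            } else {
                current.add(f);                 // ordinary finding
            }
        }
        return perSequence;
    }

    public static void main(String[] args) {
        List<Found> output = Arrays.asList(
                new Found(0, 5), new Found(2, 5), new Found(1, 9),
                new Found(-1, 0),               // end of first sequence
                new Found(0, 3),
                new Found(-1, 0),               // end of second sequence
                new Found(-1, -1));             // end of search
        List<List<Found>> grouped = groupBySequence(output);
        for (int i = 0; i < grouped.size(); i++) {
            System.out.println("sequence " + i + ": " + grouped.get(i).size() + " findings");
        }
    }
}
```

The same markers delimit the output for count=true; in that mode start[0] of an ordinary object holds the number of occurrences instead of a position.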

weightMotifs

java.lang.String weightMotifs()
Weight motifs by searching them on two training sets. Both the MotifList and the sequence information are taken from the Communicator this operator was initialized with. The motifs are searched first on the sequences of the positive training set and afterwards on the sequences of the negative training set. Both training sets are taken from the Communicator. The sequences are searched as a whole, so windowWidth and windowShift are not needed.
For every motif, this method calculates the weight from the motif's occurrences in both training sets. The weight is a log-odds score. The method MotifList.setScores(MotifList, Vector[], Vector[]) is used for the calculation, after the required sequence normalization has been done.
The output through the previously set Communicator is as follows: first, for every motif, the occurrences (as Integer) on the sequences of the positive training set are sent, followed by a -1. Second, the same is done for the negative training set. The last number sent is the return value of the method Motif.setScores(MotifList, Vector[], Vector[]), which shows whether the calculation was successful. The motifs' scores are stored in the motifs and can be read using getScore().

Returns:
null iff the calculation was initiated; otherwise a String with the error number followed by a short explanation of why the calculation could not be started.
See Also:
MotifList.setScores(Vector[], Vector[]), Motif.getWeight()
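To illustrate what a log-odds weight computed from two training sets can look like, here is a minimal sketch. It is not the formula used by setScores, which is not given on this page. The normalization by total sequence length, the pseudocount, and the natural logarithm are all assumptions made for this example, and the names LogOddsSketch and logOdds are hypothetical.

```java
class LogOddsSketch {
    /**
     * Illustrative log-odds weight for one motif.
     * Assumptions (not taken from the library): occurrences are normalized by the
     * total sequence length of each training set, a pseudocount avoids log(0),
     * and the natural logarithm is used.
     */
    static double logOdds(int posCount, long posLength,
                          int negCount, long negLength,
                          double pseudocount) {
        if (posLength <= 0 || negLength <= 0 || pseudocount <= 0.0) {
            throw new IllegalArgumentException("lengths and pseudocount must be positive");
        }
        double posRate = (posCount + pseudocount) / posLength;  // occurrences per position, positive set
        double negRate = (negCount + pseudocount) / negLength;  // occurrences per position, negative set
        return Math.log(posRate / negRate);
    }

    public static void main(String[] args) {
        // Same length for both sets; the motif is more frequent in the positive set.
        System.out.println(logOdds(20, 1000L, 5, 1000L, 1.0));
        // Equal rates in both sets.
        System.out.println(logOdds(10, 1000L, 10, 1000L, 1.0));
    }
}
```

Under these assumptions, a motif that is relatively more frequent in the positive training set gets a positive weight, one that is more frequent in the negative set gets a negative weight, and equal rates give a weight of zero.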

scoreSequence

java.lang.String scoreSequence()
A sequence is scored window by window by summing up the motifs' weights. Both the MotifList and the sequence information are taken from the Communicator this operator was initialized with. The sequence given can be either a file (e.g. in FASTA format) or a CharSequence. Searching is done by shifting a window of defined width along the sequence. Each sequence window is scored by summing up the weights of the motifs found within the window.
For every window, a ScoredSequenceStruct object is sent to the communicator; it holds the sequence number, the start and end of the window on the sequence, and the corresponding score.
When scoring has ended, the communicator receives a ScoredSequenceStruct object with -1 as the sequence number.

Returns:
null iff scoring was initiated; otherwise a String with the error number followed by a short explanation of why the scoring could not be started.
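The window-by-window scoring described above can be sketched as follows. The class Scored is a hypothetical stand-in for ScoredSequenceStruct, and several details are assumptions of this sketch rather than documented behaviour: positions are 0-based, a motif counts for a window when its start position lies inside the half-open range from start to end, the last window is cut off at the end of the sequence, and the results are returned as a List instead of being sent to a communicator.

```java
import java.util.ArrayList;
import java.util.List;

class WindowScoringSketch {
    /** Hypothetical stand-in for ScoredSequenceStruct. */
    static final class Scored {
        final int seqNr;
        final int start;
        final int end;
        final double score;

        Scored(int seqNr, int start, int end, double score) {
            this.seqNr = seqNr;
            this.start = start;
            this.end = end;
            this.score = score;
        }
    }

    /** Scores one sequence window by window and appends an end marker with seqNr = -1. */
    static List<Scored> scoreWindows(int seqNr, int seqLength, int width, int shift,
                                     int[] hitPositions, double[] hitWeights) {
        if (width <= 0 || shift <= 0) {
            throw new IllegalArgumentException("width and shift must be positive");
        }
        if (hitPositions.length != hitWeights.length) {
            throw new IllegalArgumentException("one weight per hit position is required");
        }
        List<Scored> out = new ArrayList<>();
        for (int start = 0; start < seqLength; start += shift) {
            int end = Math.min(start + width, seqLength);   // last window may be shorter
            double score = 0.0;
            for (int i = 0; i < hitPositions.length; i++) {
                if (hitPositions[i] >= start && hitPositions[i] < end) {
                    score += hitWeights[i];                 // sum weights of motifs in the window
                }
            }
            out.add(new Scored(seqNr, start, end, score));
            if (end == seqLength) {
                break;                                      // the sequence end has been reached
            }
        }
        out.add(new Scored(-1, 0, 0, 0.0));                 // end marker: scoring has ended
        return out;
    }

    public static void main(String[] args) {
        List<Scored> result = scoreWindows(0, 10, 4, 3,
                new int[] {1, 3, 7}, new double[] {1.5, 2.0, -0.5});
        for (Scored s : result) {
            System.out.println(s.seqNr + " [" + s.start + "," + s.end + ") " + s.score);
        }
    }
}
```

With a shift smaller than the width, neighbouring windows overlap, so a single motif can contribute to the score of more than one window.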