de.unibi.techfak.jpredictor.motifs
Class MotifEvolution

java.lang.Object
  extended by de.unibi.techfak.jpredictor.motifs.MotifEvolution
Direct Known Subclasses:
MotifEvolutionES, MotifEvolutionFull

public abstract class MotifEvolution
extends java.lang.Object

Abstract class defining the framework for a motif evolution. Such a framework is simply a convention for the order in which several methods are called. See the start(int) method for further insights.

See Also:
start(int)

Field Summary
(package private)  ICommunicator comm
          A local instance of the communicator the class was constructed with.
(package private)  char[] generatingCharacters
          The characters the motifs are generated from and evolved into.
(package private)  double[] generatingDistribution
          The nucleotide probability distribution.
(package private)  MotifList[] gML
          The global array of motif lists.
(package private)  long motifCounter
          Motif counter to name the motifs in a unique way.
(package private)  IOperator op
          A local instance of the operator.
(package private)  int temperature
          The actual temperature, set in the start(int) method.
 
Constructor Summary
MotifEvolution(ICommunicator comm)
          Creates a MotifEvolution object but does not start the evolution.
 
Method Summary
(package private) abstract  void evolveMotifLists()
          The lists of motifs are evolved into the next generation.
(package private)  Motif evolveMultiMotif(MultiMotif mm, int steps)
           Evolves a multi motif comprised of sequence motifs.
(package private)  void fillNewWeights(MotifList ml, double[] w)
          This method fills the array w with the new weights for the motifs.
(package private) abstract  void initMotifLists()
          Generates the initial populations.
(package private)  void outputResults()
          Outputs the result of the evolution.
(package private)  MotifList recombineParentSet(MotifList mlMale, MotifList mlFemale, MotifList mlChildren, int count)
           Recombines one or two parental sets of double motifs into a new set of double motifs (offspring).
(package private)  void restrainMotifList(MotifList ml, int count)
           Throws away motifs from the list until the given number is reached.
(package private) abstract  void selectMotifLists()
          Goes through all motif lists and selects the motifs to be kept.
 void start(int temp)
          Starts the evolutionary process.
(package private)  boolean weightMotifLists()
          Weights the motifs in all motif lists.
(package private)  boolean weightMotifs(java.util.Vector[] posOcc, java.util.Vector[] negOcc)
          Weights motifs from the global communicator using the global operator.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

comm

ICommunicator comm
A local instance of the communicator the class was constructed with.


op

IOperator op
A local instance of the operator. Set in the constructor.


motifCounter

long motifCounter
Motif counter to name the motifs in a unique way. After every use the counter should be increased.


generatingCharacters

char[] generatingCharacters
The characters the motifs are generated from and evolved into.


generatingDistribution

double[] generatingDistribution
The nucleotide probability distribution. Set in the constructor to the background of the communicator.


gML

MotifList[] gML
The global array of motif lists. Normally, only one motif list is necessary, but for certain evolutionary applications several populations might be useful.


temperature

int temperature
The actual temperature, set in the start(int) method.

Constructor Detail

MotifEvolution

public MotifEvolution(ICommunicator comm)
               throws java.lang.NullPointerException,
                      java.lang.IllegalArgumentException
Creates a MotifEvolution object but does not start the evolution. All settings the motif evolution has to use must be set in the given communicator, e.g. initial motif set, training sets, window width, ...

Parameters:
comm - The communicator in which at least the training sets must be defined.
Throws:
java.lang.NullPointerException
java.lang.IllegalArgumentException
Method Detail

start

public void start(int temp)
Starts the evolutionary process. The methods called within are
  1. Initialization (of the parental generation)
  2. Loop
    1. Evolutionary step (generate children)
    2. Evaluation step (weight motifs)
    3. Selection step (choose best motifs)
  3. Output

Parameters:
temp - The starting temperature for the simulated annealing process, also the number of generations to perform. If it is negative, the method returns without doing anything.
See Also:
initMotifLists(), evolveMotifLists(), weightMotifLists(), selectMotifLists(), outputResults()
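
The call order listed above can be sketched as a template method. This is a minimal illustration, not the actual implementation: the class names EvolutionLoopSketch and TracingEvolution and the trace list are hypothetical, and it is an assumption that the loop counts the temperature down by one per generation and ignores the return value of weightMotifLists().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for MotifEvolution; only the call order comes from the documentation.
abstract class EvolutionLoopSketch {
    final List<String> trace = new ArrayList<>(); // records which hook ran, for illustration only
    int temperature;                              // mirrors the documented field, set in start(int)

    abstract void initMotifLists();
    abstract void evolveMotifLists();
    abstract void selectMotifLists();

    boolean weightMotifLists() { trace.add("weight"); return true; }
    void outputResults() { trace.add("output"); }

    void start(int temp) {
        if (temp < 0) return;                     // documented: a negative value does nothing
        initMotifLists();                         // 1. initialization
        // Assumption: one generation per temperature unit, counting down to zero.
        for (temperature = temp; temperature > 0; temperature--) {
            evolveMotifLists();                   // 2.1 evolutionary step
            weightMotifLists();                   // 2.2 evaluation step
            selectMotifLists();                   // 2.3 selection step
        }
        outputResults();                          // 3. output
    }
}

// Hypothetical subclass that only records the order in which the hooks are called.
class TracingEvolution extends EvolutionLoopSketch {
    void initMotifLists()   { trace.add("init"); }
    void evolveMotifLists() { trace.add("evolve"); }
    void selectMotifLists() { trace.add("select"); }
}
```

Calling start(2) on a TracingEvolution records init, then evolve, weight, select twice, then output.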

initMotifLists

abstract void initMotifLists()
Generates the initial populations. This method is the first in start(int) to be called.

See Also:
start(int)

weightMotifLists

boolean weightMotifLists()
Weights the motifs in all motif lists. Within the evolutionary loop this method is the second to be called. This method simply sets one motif list after the other in the communicator and then calls weightMotifs( null, null ). If you want another way of weighting, or if some motif lists should be left out of weighting, override this method.

See Also:
start(int), weightMotifs(Vector[], Vector[])

selectMotifLists

abstract void selectMotifLists()
Goes through all motif lists and selects the motifs to be kept. Within the evolutionary loop this method is the third (last) to be called. A good start for the selection would be to call restrainMotifList(MotifList, int).

See Also:
start(int), restrainMotifList(MotifList, int)

evolveMotifLists

abstract void evolveMotifLists()
The lists of motifs are evolved into the next generation. Evolution in this sense means any combination of cloning, recombination or mutation. Within the evolutionary loop this method is the first to be called.

See Also:
start(int)

outputResults

void outputResults()
Outputs the result of the evolution. This method goes through all motif lists in this class and prints them via a call to OptionFile#writeMotifList( comm.out(), gML[i]). If you want another way of handling the results, override this method.

See Also:
OptionFile.writeMotifList(PrintStream, MotifList), start(int)

weightMotifs

boolean weightMotifs(java.util.Vector[] posOcc,
                     java.util.Vector[] negOcc)
Weights motifs from the global communicator using the global operator. After all the vectors are cleared, they are filled with the information on how often the motifs were found on the positive and on the negative training set. There is one vector for each motif, containing as many Integer values as there are sequences.

Parameters:
posOcc - An array of vectors filled with the occurrences of a motif on the sequences of the positive training set. One vector for every motif.
negOcc - An array of vectors filled with the occurrences of a motif on the sequences of the negative training set. One vector for every motif.
Returns:
true if the motifs were weighted, false otherwise.
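
The layout of posOcc and negOcc (one vector per motif, one Integer per sequence) can be sketched as follows. The class OccurrenceSketch and its methods are hypothetical, and plain substring counting is only a stand-in for the real motif matching performed through the operator.

```java
import java.util.Vector;

// Hypothetical helper illustrating the occurrence-vector layout; not part of the documented API.
class OccurrenceSketch {
    // Counts possibly overlapping occurrences of a plain substring.
    // Assumption: substring search stands in for the real motif matching.
    static int count(String motif, String sequence) {
        if (motif.isEmpty()) return 0;
        int n = 0;
        for (int i = sequence.indexOf(motif); i >= 0; i = sequence.indexOf(motif, i + 1)) {
            n++;
        }
        return n;
    }

    // Clears each vector, then stores one count per sequence for the corresponding motif.
    static void fill(String[] motifs, String[] sequences, Vector<Integer>[] occ) {
        for (int m = 0; m < motifs.length; m++) {
            occ[m].clear();
            for (String s : sequences) {
                occ[m].add(count(motifs[m], s));
            }
        }
    }
}
```

After fill returns, occ[m].get(s) holds the number of times motif m was found on sequence s.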

evolveMultiMotif

Motif evolveMultiMotif(MultiMotif mm,
                       int steps)

Evolves a multi motif comprised of sequence motifs. The properties of such a multi motif that are subject to evolution are the single motif length, the nucleotide composition, the number of errors allowed for a match and the distance between single motifs.

The length of a single motif will not drop below five nor exceed 10. The chance of a motif length alteration is fixed at five percent. Distance mutations also have a probability of five percent and can affect the minimum as well as the maximum distance. Both are changed in steps of 10. The minimum will not drop below 0, nor will it exceed 420 or the maximum. The maximum distance will not drop below 20 or the minimum, nor will it exceed 440. Another chance of 2% is reserved for mutating the number of errors allowed for a match. This number is simply switched between zero and one. The remaining chance of 88% is reserved for nucleotide alterations, which occur uniformly distributed. 'N' nucleotides are never among the possible nucleotides.

Neutral evolution may occur. If steps equals one, however, the method guarantees that the evolutionary step performed has changed the motif. If steps is greater than one, some mutations might neutralize previous ones, e.g. a motif is elongated (a random nucleotide is attached), the attached nucleotide is changed (a neutral mutation in the sense that it is not seen in the result), and the previously elongated motif is shortened (a neutral mutation in the sense that the original motif is not changed).

Note that this method is limited to evolving multi motifs consisting of RegularExpressionMotifs. If the motifs comprised in the given MultiMotif are PSPMotifs or PSSMotif or even MultiMotifs, the method does nothing to them if they are chosen for mutation.

Further note that the evolved single motifs are named after the motif counter number, so no name occurs twice. The multi motif's name is changed accordingly to a string of the form "motifName1-(minimalDistance,maximalDistance)-motifName2".

Parameters:
mm - The multi motif to be evolved.
steps - Number of evolutionary steps performed on the motif.
Returns:
The new MultiMotif or null if steps is less than one.
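
The mutation chances and distance bounds described above can be sketched as follows. The class MutationChoiceSketch, its enum and its method names are hypothetical. It is also an assumption that a distance mutation that would leave the allowed range is clamped to the nearest bound; the documentation only states the bounds, not how violations are handled.

```java
// Hypothetical helper; only the probabilities and bounds are taken from the documentation.
class MutationChoiceSketch {
    enum Mutation { LENGTH, DISTANCE, ERROR_COUNT, NUCLEOTIDE }

    // Maps a uniform draw r in [0, 1) to a mutation type using the documented chances.
    static Mutation pick(double r) {
        if (r < 0.05) return Mutation.LENGTH;       // 5%: single motif length
        if (r < 0.10) return Mutation.DISTANCE;     // 5%: minimum or maximum distance
        if (r < 0.12) return Mutation.ERROR_COUNT;  // 2%: errors allowed for a match
        return Mutation.NUCLEOTIDE;                 // remaining 88%: nucleotide alteration
    }

    // Moves the minimum distance one step of 10 up or down.
    // Assumption: the result is clamped to [0, min(420, max)].
    static int mutateMinDistance(int min, int max, boolean up) {
        int candidate = min + (up ? 10 : -10);
        return Math.max(0, Math.min(candidate, Math.min(420, max)));
    }

    // Moves the maximum distance one step of 10 up or down.
    // Assumption: the result is clamped to [max(20, min), 440].
    static int mutateMaxDistance(int min, int max, boolean up) {
        int candidate = max + (up ? 10 : -10);
        return Math.min(440, Math.max(candidate, Math.max(20, min)));
    }

    // Switches the allowed error number between zero and one, as documented.
    static int toggleErrors(int errors) {
        return errors == 0 ? 1 : 0;
    }
}
```

Under the clamping assumption a mutation at a bound can leave the value unchanged, which would be one source of the neutral evolution mentioned above.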

restrainMotifList

void restrainMotifList(MotifList ml,
                       int count)

Throws away motifs from the list until the given number is reached. When the given motif list is null or already contains fewer motifs than count, nothing is done to the list. Otherwise motifs from the list are chosen randomly and discarded.

The choosing process relies completely on the weights. Before the best-weighted motifs are chosen to be kept, the method fillNewWeights(MotifList, double[]) is called to change the weights. Within this procedure every motif can be assigned an artificial weight, which is not visible to the motif itself. These artificial weights are used for picking the best-weighted motifs.

Parameters:
ml - The motif list to be restrained.
count - The number of motifs allowed to stay in the list.
See Also:
fillNewWeights(MotifList, double[])
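
One way to read the weight-driven random choice described above is a roulette-wheel draw without replacement over the artificial weights. The sketch below illustrates that reading only; it is not the documented implementation. The class RestrainSketch and its method are hypothetical, weights are assumed to be non-negative, and the array w is assumed to have been filled beforehand in the manner of fillNewWeights(MotifList, double[]).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper; weight-proportional selection is an assumed reading of the description.
class RestrainSketch {
    // Returns the indices of up to count motifs to keep, drawn without replacement
    // with probability proportional to their (non-negative) weights.
    static List<Integer> selectSurvivors(double[] w, int count, Random rnd) {
        List<Integer> pool = new ArrayList<>();
        for (int i = 0; i < w.length; i++) pool.add(i);

        List<Integer> kept = new ArrayList<>();
        while (kept.size() < count && !pool.isEmpty()) {
            double total = 0.0;
            for (int i : pool) total += w[i];

            double r = rnd.nextDouble() * total;   // position on the roulette wheel
            int chosen = pool.size() - 1;          // fallback guards against rounding
            double acc = 0.0;
            for (int k = 0; k < pool.size(); k++) {
                acc += w[pool.get(k)];
                if (r < acc) { chosen = k; break; }
            }
            kept.add(pool.remove(chosen));         // remove by position, keep the motif index
        }
        return kept;
    }
}
```

All indices not returned would be the motifs discarded from the list.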

fillNewWeights

void fillNewWeights(MotifList ml,
                    double[] w)
This method fills the array w with the new weights for the motifs. The given motif list remains unchanged.

Parameters:
ml - The list of weighted motifs.
w - An array at least the size of the motif list to be filled with the new weights for the motifs.

recombineParentSet

MotifList recombineParentSet(MotifList mlMale,
                             MotifList mlFemale,
                             MotifList mlChildren,
                             int count)

Recombines one or two parental sets of double motifs into a new set of double motifs (offspring). The process is as follows. Two multi motifs are chosen at random, one from the male list and one from the female list. If one list is null, the multi motif is chosen from the other parental set instead. Each chosen multi motif yields a single motif and a distance value, which are then combined into one double motif with the corresponding distance.

No change is made to the motif parts themselves. One parental motif list may be null, but not both. If given, a parental motif list must contain at least one multi motif. Otherwise nothing is done and null is returned. Double motifs are added to the children's list through the addCheck(int, Motif) method.

Parameters:
mlMale - First list of multi motifs to be recombined.
mlFemale - Second list of parental multi motifs to be recombined.
mlChildren - Motif list to be filled with the motifs generated by joining motif parts from male and female. Might be null.
count - Maximum number of motifs in the children motif list.
Returns:
The list of child motifs, which is the same as mlChildren unless that was given as null.
See Also:
motifCounter, MotifList.addCheck(int, Motif)
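
The recombination described above can be sketched with simplified types. Everything here is hypothetical: the classes RecombineSketch and DoubleMotif, the use of java.util.List in place of MotifList, and plain add in place of MotifList.addCheck(int, Motif). It is also an assumption that the child takes the left single motif and minimum distance from the male parent, the right single motif and maximum distance from the female parent, and that the children list is filled up to count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical, simplified model of the recombination step.
class RecombineSketch {
    static final class DoubleMotif {
        final String left, right;
        final int minDist, maxDist;
        DoubleMotif(String left, String right, int minDist, int maxDist) {
            this.left = left; this.right = right;
            this.minDist = minDist; this.maxDist = maxDist;
        }
    }

    // Assumption: one motif part and one distance value from each parent, ordered so min <= max.
    static DoubleMotif recombine(DoubleMotif male, DoubleMotif female) {
        int lo = Math.min(male.minDist, female.maxDist);
        int hi = Math.max(male.minDist, female.maxDist);
        return new DoubleMotif(male.left, female.right, lo, hi);
    }

    static List<DoubleMotif> recombineParentSet(List<DoubleMotif> males, List<DoubleMotif> females,
                                                List<DoubleMotif> children, int count, Random rnd) {
        if (males == null) males = females;        // a null list is replaced by the other set
        if (females == null) females = males;
        if (males == null || males.isEmpty() || females.isEmpty()) return null;
        if (children == null) children = new ArrayList<>();
        while (children.size() < count) {          // assumption: fill the list up to count
            DoubleMotif m = males.get(rnd.nextInt(males.size()));
            DoubleMotif f = females.get(rnd.nextInt(females.size()));
            children.add(recombine(m, f));         // stand-in for addCheck(int, Motif)
        }
        return children;
    }
}
```

The motif parts are copied unchanged, matching the statement that recombination does not alter the parts themselves.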