de.unibi.techfak.jpredictor.clustering
Class Clustering

java.lang.Object
  extended by de.unibi.techfak.jpredictor.clustering.Clustering
Direct Known Subclasses:
ForwardClustering, GreedyClustering, RelationalCalculation

public abstract class Clustering
extends java.lang.Object

Wrapper class for a clustering of many motifs. The clustering steps to perform are:

  1. Initialization
    1. Convert every motif to a PSPM
    2. Calculate all distances
  2. Loop
    1. Choose the two motifs to be clustered, e.g. those with minimal distance
    2. Build the cluster's consensus motif
    3. Stop test
  3. Output
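
The "choose the two motifs with minimal distance" step can be sketched as a scan over a symmetric distance matrix. This is only an illustration: the class ClosestPair and the double[][] representation are assumptions, not part of this package, and the concrete subclasses may select pairs differently.

```java
// Hypothetical helper, not part of jpredictor: finds the closest pair
// in a symmetric distance matrix.
final class ClosestPair {
    private ClosestPair() {}

    // Returns {i, j} with i < j and minimal d[i][j], or {-1, -1} if
    // the matrix holds fewer than two objects.
    static int[] find(double[][] d) {
        int[] best = {-1, -1};
        double min = Double.POSITIVE_INFINITY;
        for (int i = 0; i < d.length; i++) {
            // Only the upper triangle is scanned because d is symmetric.
            for (int j = i + 1; j < d.length; j++) {
                if (d[i][j] < min) {
                    min = d[i][j];
                    best[0] = i;
                    best[1] = j;
                }
            }
        }
        return best;
    }
}
```

For a matrix of three objects in which objects 1 and 2 are closest, find returns the index pair of those two objects.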


Field Summary
(package private)  java.util.Vector clusters
          The vector containing motif lists, where each list represents a cluster.
(package private)  ICommunicator comm
          A local instance of the communicator the class was constructed with.
(package private)  MotifList consensus
          The list of consensus motifs for the clusters stored in the Vector clusters.
(package private)  MultiMotifAlignment mma
          The alignment of multi motifs used in this clustering.
(package private)  int n
          Initial number of objects to cluster.
(package private) static java.util.Hashtable nucleotideVector
          Holds for every nucleotide letter (IUPAC code) a double array of probabilities for the different primary nucleotides.
(package private)  SingleMotifAlignment sma
          The alignment of single motifs used in this clustering.
 
Constructor Summary
Clustering(ICommunicator comm, SingleMotifAlignment sma, MultiMotifAlignment mma)
          Inits the clustering with a communicator and two alignments.
 
Method Summary
(package private)  MotifAlignment alignMotifs(Motif m1, Motif m2)
          Creates a new alignment and adds the two motifs.
(package private) abstract  void clusterStep()
          The second step in the clustering loop.
(package private) abstract  void initClustering()
          Inits the clustering.
(package private)  void outputResults()
          Last step in clustering.
 void setMotifList(MotifList ml)
          Sets the list of motifs to be clustered.
 void start(double threshold, int clusternumber)
          Starts the clustering process.
(package private) abstract  boolean stopTest(double threshold, int clusternumber)
          The first step in the clustering loop.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

nucleotideVector

static java.util.Hashtable nucleotideVector
Holds for every nucleotide letter (IUPAC code) a double array of probabilities for the different primary nucleotides. E.g., the letter 'A' is mapped to { 1, 0, 0, 0 } and the letter 'B' is mapped to { 0, 1.0/3, 1.0/3, 1.0/3 }.
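
A minimal sketch of how such a table could be built. The class IupacTable is hypothetical, the column order A, C, G, T is inferred from the two examples above, and the uniform split over the allowed nucleotides is an assumption for every letter other than 'A' and 'B'.

```java
import java.util.Hashtable;
import java.util.Map;

// Hypothetical helper, not part of jpredictor.
final class IupacTable {
    // Assumed column order, consistent with the 'A' and 'B' examples.
    private static final String PRIMARY = "ACGT";

    private IupacTable() {}

    static Map<Character, double[]> build() {
        // IUPAC letter -> the primary nucleotides it stands for.
        String[][] codes = {
            {"A", "A"}, {"C", "C"}, {"G", "G"}, {"T", "T"},
            {"R", "AG"}, {"Y", "CT"}, {"S", "CG"}, {"W", "AT"},
            {"K", "GT"}, {"M", "AC"},
            {"B", "CGT"}, {"D", "AGT"}, {"H", "ACT"}, {"V", "ACG"},
            {"N", "ACGT"}
        };
        Map<Character, double[]> table = new Hashtable<>();
        for (String[] code : codes) {
            String members = code[1];
            double[] p = new double[PRIMARY.length()];
            // Assumption: each allowed nucleotide is equally likely.
            for (int i = 0; i < members.length(); i++) {
                p[PRIMARY.indexOf(members.charAt(i))] = 1.0 / members.length();
            }
            table.put(code[0].charAt(0), p);
        }
        return table;
    }
}
```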


comm

ICommunicator comm
A local instance of the communicator the class was constructed with.


clusters

java.util.Vector clusters
The vector containing motif lists, where each list represents a cluster. The consensus motif for each cluster, if one is necessary, is stored in the motif list consensus.

See Also:
consensus

consensus

MotifList consensus
The list of consensus motifs for the clusters stored in the Vector clusters.

See Also:
clusters

sma

SingleMotifAlignment sma
The alignment of single motifs used in this clustering. Note that it may be null if no single motif is to be clustered.


mma

MultiMotifAlignment mma
The alignment of multi motifs used in this clustering. Note that it may be null if no multi motif is to be clustered.


n

int n
Initial number of objects to cluster. Set in the method setMotifList(MotifList).

See Also:
setMotifList(MotifList)
Constructor Detail

Clustering

public Clustering(ICommunicator comm,
                  SingleMotifAlignment sma,
                  MultiMotifAlignment mma)
           throws java.lang.NullPointerException
Inits the clustering with a communicator and two alignments. The list of motifs to be clustered should be given via setMotifList(MotifList).

Parameters:
comm - The communicator to get the motif list from.
sma - The single motif alignment, may not be null.
mma - The multi motif alignment, may be null.
Throws:
java.lang.NullPointerException - If the given communicator is null, or if sma is null.
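
The null contract stated above could be enforced as in the following sketch. All type names here (CommunicatorStub, SingleAlignmentStub, MultiAlignmentStub, ConstructorSketch) are hypothetical placeholders for the real classes, and the use of Objects.requireNonNull is an assumption about how the check might be written, not the actual implementation.

```java
import java.util.Objects;

// Hypothetical placeholders for ICommunicator, SingleMotifAlignment
// and MultiMotifAlignment.
interface CommunicatorStub {}
class SingleAlignmentStub {}
class MultiAlignmentStub {}

class ConstructorSketch {
    final CommunicatorStub comm;
    final SingleAlignmentStub sma;
    final MultiAlignmentStub mma;

    ConstructorSketch(CommunicatorStub comm,
                      SingleAlignmentStub sma,
                      MultiAlignmentStub mma) {
        // Both throw NullPointerException on null, as documented.
        this.comm = Objects.requireNonNull(comm, "comm");
        this.sma = Objects.requireNonNull(sma, "sma");
        // The multi motif alignment is allowed to be null.
        this.mma = mma;
    }
}
```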
Method Detail

setMotifList

public void setMotifList(MotifList ml)
Sets the list of motifs to be clustered. Checks the type of each motif and whether the appropriate alignment is available. If ml is null or contains no motifs, the clustering is cleared.

Parameters:
ml - The list of motifs to be clustered.

start

public void start(double threshold,
                  int clusternumber)
Starts the clustering process. The methods called within are, in order:
  1. Initialization
  2. Loop
    1. Stop test (continue clustering?)
    2. Combine the two clusters (clusterStep)
    3. Stop test
  3. Output

Parameters:
threshold - The threshold for the stop test. The meaning of this value differs between strategies. May be Double.NaN if no threshold is desired and the clustering is to proceed until only one cluster is left.
clusternumber - The desired number of clusters. Pass zero or -1 for a full clustering.
See Also:
initClustering(), stopTest(double, int), clusterStep(), outputResults()
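
The control flow listed for start(double, int) resembles a template method. The sketch below shows one way it could look. ClusteringSketch and CountingClustering are hypothetical stand-ins, outputResults() is made abstract here only to keep the example short, and the exact loop shape is an assumption.

```java
// Hypothetical stand-in for Clustering; only the control flow is shown.
abstract class ClusteringSketch {
    abstract void initClustering();
    abstract boolean stopTest(double threshold, int clusternumber);
    abstract void clusterStep();
    abstract void outputResults();

    final void start(double threshold, int clusternumber) {
        initClustering();
        // The stop test runs before every cluster step.
        while (!stopTest(threshold, clusternumber)) {
            clusterStep();
        }
        outputResults();
    }
}

// Toy strategy: tracks only the number of clusters and ignores the threshold.
final class CountingClustering extends ClusteringSketch {
    int clusters;
    final StringBuilder log = new StringBuilder();

    CountingClustering(int n) {
        clusters = n;
    }

    @Override
    void initClustering() {
        log.append("init;");
    }

    @Override
    boolean stopTest(double threshold, int clusternumber) {
        // Zero or a negative value requests a full clustering.
        int target = clusternumber <= 0 ? 1 : clusternumber;
        return clusters <= target;
    }

    @Override
    void clusterStep() {
        clusters--; // merging two clusters leaves one fewer
        log.append("step;");
    }

    @Override
    void outputResults() {
        log.append("output;");
    }
}
```

Calling new CountingClustering(4).start(Double.NaN, 2) performs two merge steps and leaves two clusters.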

initClustering

abstract void initClustering()
Inits the clustering. This may mean, for example, that the distances between every pair of motifs are calculated. This method is the first one called in start(double, int).

See Also:
start(double, int)

clusterStep

abstract void clusterStep()
                   throws java.lang.IllegalStateException
The second step in the clustering loop. This may mean selecting the two clusters to be combined, combining them, and calculating the relation measure to all other clusters.

Throws:
java.lang.IllegalStateException - If a clustering step could not be performed for some reason.

stopTest

abstract boolean stopTest(double threshold,
                          int clusternumber)
The first step in the clustering loop. This may mean checking whether the distances between clusters exceed a certain threshold or whether log-likelihoods get worse.

Parameters:
threshold - The threshold to end the clustering. May be Double.NaN to indicate that the clustering should continue fully.
clusternumber - The desired number of clusters. Pass zero or -1 for a full clustering.
Returns:
true if the clustering has to stop, false otherwise.
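
A sketch of a distance-based stop rule that honors the conventions described above (Double.NaN disables the threshold, zero or a negative cluster number requests a full clustering). The class StopRule and its extra parameters minDistance and currentClusters are assumptions for illustration. Subclasses may use other criteria, such as log-likelihoods.

```java
// Hypothetical helper, not part of jpredictor.
final class StopRule {
    private StopRule() {}

    static boolean shouldStop(double minDistance, int currentClusters,
                              double threshold, int clusternumber) {
        // Nothing left to merge.
        if (currentClusters <= 1) {
            return true;
        }
        // The desired number of clusters has been reached.
        if (clusternumber > 0 && currentClusters <= clusternumber) {
            return true;
        }
        // NaN means "no threshold"; Double.isNaN is required because
        // every comparison with NaN evaluates to false.
        return !Double.isNaN(threshold) && minDistance > threshold;
    }
}
```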

outputResults

void outputResults()
Last step in clustering. Prints the results: for every cluster, a line with some results followed by the consensus motif.


alignMotifs

MotifAlignment alignMotifs(Motif m1,
                           Motif m2)
Creates a new alignment and adds the two motifs. Does not call compute().

Parameters:
m1 - The first motif, according to which the alignment is chosen.
m2 - The second motif in the alignment.
Returns:
The MotifAlignment ready to compute.