de.unibi.techfak.jpredictor.motifs
Class MotifList

java.lang.Object
  extended by java.util.AbstractCollection<E>
      extended by java.util.AbstractList<E>
          extended by java.util.Vector
              extended by de.unibi.techfak.jpredictor.motifs.MotifList
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, java.lang.Iterable, java.util.Collection, java.util.List, java.util.RandomAccess

public class MotifList
extends java.util.Vector

Stores motifs in a Vector.

See Also:
Serialized Form

Field Summary
static Motif motifDSP1
          An instance of the DSP1 binding site motif.
static Motif motifEN1
          An instance of the engrailed 1 motif.
static Motif motifG10
          An instance of the long GAF binding site motif.
static Motif motifGA
          An instance of the GAF binding site motif.
static Motif motifGT
          An instance of an enriched site in the predicted PRE/TREs.
static Motif motifPF
          An instance of the right hand sided PHO binding site.
static DoubleMotif motifPHO_DSP1
          The conjunction of the core PHO binding site and the DSP1 motif as it was reported by Dejardin in 2005.
static Motif motifPHOcore
          The core of the PHO (Pleiohomeotic) binding site.
static Motif motifPM
          An instance of the long PHO binding site motif.
static Motif motifPS
          An instance of the core PHO binding site motif.
static Motif motifT
          An instance of an enriched site in the predicted PRE/TREs.
static Motif motifTGC
          An instance of an enriched site in the predicted PRE/TREs.
static Motif motifZ
          An instance of the Zeste binding site motif.
 
Fields inherited from class java.util.Vector
capacityIncrement, elementCount, elementData
 
Fields inherited from class java.util.AbstractList
modCount
 
Constructor Summary
MotifList()
          Simple constructor that initializes the underlying vector with a capacity of 10 objects and a capacity increment of 10.
 
Method Summary
 boolean addCheck(int index, Motif m)
          Iff the motif is not present in this motif list, it is added.
 int checkWeights()
          Checks the weights of the motifs for either infinity or not-a-number.
 int clearMark(int mark, int depth)
          Clears the mark in all motifs and child motifs in the list.
 int containsMotif(Motif m)
          Checks whether the given motif is already present in the motif list by comparing motif references.
 int countMotifByName(java.lang.String name, int depth)
           Searches this MotifList and the MultiMotif's motifs and counts the motifs matching the given name.
static MotifList defaultDoubleMotifs()
          Creates the DoubleMotifs from the default single motifs and assigns them a weight (obtained by running the jPREdictor with 'new_pre.fasta' as model and 'non_pre.fasta' as background).
static MotifList defaultMotifs()
          Creates and returns the default motifs, which are "1" ("GSNMACGCCCC", one error allowed), "G10" ("GAGAGAGAGA", one error allowed), "GA" ("GAGAG"), "PF" ("GCCATHWY"), "PM" ("CNGCCATNDNND"), "PS" ("GCCAT"), "Z" ("YGAGYG").
static MotifList extendedDoubleMotifs()
          Creates the DoubleMotifs from the extended single motifs and assigns them a weight (obtained by running the jPREdictor with 'new_pre.fasta' as model and 'non_pre.fasta' as background).
static MotifList extendedMotifs()
          Creates and returns the list of extended (additional) motifs, which are "DSP1" ("GAAAA"), "GT" ("GTGTGYGWGTG"), "T" ("WTDWWTWTYHTT"), "TGC" ("YGYTGCYGYDS").
 Motif findMotifByName(java.lang.String name, int fromIndex, int searchDepth, int returnDepth)
          Searches this MotifList and the MultiMotifs motifs for the given name.
 MotifList getSingleMotifs()
          Extracts from all motifs in this list the single motifs and stores them in a new MotifList.
 MotifList getUniqueMotifs()
          Extracts every unique motif and stores them in a new list.
 int initSearch(java.lang.CharSequence sequence)
          Calls initSearch for all motifs stored in this Vector.
 MotifList join(MotifList coll)
          Adds all elements from coll to this MotifList.
static PSSMotif motifPssmPHO()
          Creates the PHO motif as a PSSM.
static MotifList narDoubleMotifs()
           Returns the DoubleMotifs for the prediction made in 2006/04 by Fiedler, 2006, NAR web server issue.
 boolean replaceMarkedMotifs(Motif motif, int depth, int mark)
          Replaces all motifs that bear the mark with the given one.
 boolean replaceMotifs(java.lang.String name, Motif motif, int depth)
          Replaces one motif with another one in all instances and occurrences.
 int searchMotifByName(java.lang.String name, int fromIndex)
          Searches this MotifList until a Motif is found that matches the given name.
 int searchMotifByNameRecursive(java.lang.String name, int fromIndex)
          Searches this MotifList and the MultiMotifs motifs for the given name.
 int searchMotifByReference(Motif m, int fromIndex)
          Searches this MotifList for the given Motif by comparing references.
 void setMark(int mark, int depth)
          Sets the mark in all motifs and child motifs in the list.
 int setScores(java.util.Vector[] posOcc, java.util.Vector[] negOcc)
          Sets the log-odds scores (weights) for the motifs in this motif list.
 void sortForWeight(int keep)
          Sorts the motifs in this list according to their weight.
 
Methods inherited from class java.util.Vector
add, add, addAll, addAll, addElement, capacity, clear, clone, contains, containsAll, copyInto, elementAt, elements, ensureCapacity, equals, firstElement, get, hashCode, indexOf, indexOf, insertElementAt, isEmpty, lastElement, lastIndexOf, lastIndexOf, remove, remove, removeAll, removeAllElements, removeElement, removeElementAt, removeRange, retainAll, set, setElementAt, setSize, size, subList, toArray, toArray, toString, trimToSize
 
Methods inherited from class java.util.AbstractList
iterator, listIterator, listIterator
 
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface java.util.List
iterator, listIterator, listIterator
 

Field Detail

motifEN1

public static final Motif motifEN1
An instance of the engrailed 1 motif. Its weight is set to 1.8139 and it is marked usable.

See Also:
defaultMotifs()

motifG10

public static final Motif motifG10
An instance of the long GAF binding site motif. The score (weight) is set to 0.1123 and it is marked usable.

See Also:
defaultMotifs()

motifGA

public static final Motif motifGA
An instance of the GAF binding site motif. The score (weight) is set to -0.09625 and it is marked usable.

See Also:
defaultMotifs()

motifPF

public static final Motif motifPF
An instance of the right hand sided PHO binding site. The score (weight) is set to 1.0522 and it is marked usable.

See Also:
defaultMotifs()

motifPM

public static final Motif motifPM
An instance of the long PHO binding site motif. It is defined as 'CNGCCATNDNND', which is incorrect; it should be 'CNGCCATNDNNB'. The error was first made by Mihali (1998) in his letter. The score (weight) for this motif is set to 0.9770 and it is marked usable.

See Also:
defaultMotifs()

motifPS

public static final Motif motifPS
An instance of the core PHO binding site motif. The score (weight) is set to 0.6220 and it is marked usable.

See Also:
defaultMotifs()

motifZ

public static final Motif motifZ
An instance of the Zeste binding site motif. The score (weight) for this motif is set to 0.4697 and it is marked usable.

See Also:
defaultMotifs()

motifDSP1

public static final Motif motifDSP1
An instance of the DSP1 binding site motif. The score (weight) is set to -0.0878 and it is marked usable. Dejardin (2005) reported that DSP1 occurs near the core PHO motif, so a DoubleMotif is built from it as well.

See Also:
extendedMotifs()

motifGT

public static final Motif motifGT
An instance of an enriched site in the predicted PRE/TREs. Its sequence is 'GTGTGYGWGTG'. The score (weight) for this motif is set to 0.2653 and it is marked usable.

See Also:
extendedMotifs()

motifT

public static final Motif motifT
An instance of an enriched site in the predicted PRE/TREs. Its sequence is 'WTDWWTWTYHTT'. The score (weight) for this motif is set to 0.3531 and it is marked usable.

See Also:
extendedMotifs()

motifTGC

public static final Motif motifTGC
An instance of an enriched site in the predicted PRE/TREs. Its sequence is 'YGYTGCYGYDS'. The score (weight) for this motif is first set by calling extendedMotifs().


motifPHOcore

public static final Motif motifPHOcore
The core of the PHO (Pleiohomeotic) binding site. This motif occurs only together with DSP1 (Dejardin, 2005), so it is combined into a double motif when newMotifs() is called.


motifPHO_DSP1

public static final DoubleMotif motifPHO_DSP1
The conjunction of the core PHO binding site and the DSP1 motif as it was reported by Dejardin in 2005. The minimal distance between the motifs is zero and the maximal distance is 40.

Constructor Detail

MotifList

public MotifList()
Simple constructor that initializes the underlying vector with a capacity of 10 objects and a capacity increment of 10.
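
The capacity settings described here correspond to the two-argument java.util.Vector constructor. The following is a minimal sketch using a plain Vector rather than MotifList itself; the class name CapacitySketch and its method are hypothetical and only illustrate how the capacity grows in steps of 10.

```java
import java.util.Vector;

class CapacitySketch {
    /** Returns the capacity of a Vector(10, 10) after n elements were added. */
    static int capacityAfter(int n) {
        // Initial capacity 10, capacity increment 10, as stated for MotifList().
        Vector<String> v = new Vector<>(10, 10);
        for (int i = 0; i < n; i++) {
            v.add("motif" + i);
        }
        return v.capacity();
    }

    public static void main(String[] args) {
        System.out.println(capacityAfter(10)); // still fits the initial capacity
        System.out.println(capacityAfter(11)); // one growth step was needed
    }
}
```

Adding an eleventh element forces one growth step, so the capacity moves from 10 to 20 instead of doubling by default.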

Method Detail

motifPssmPHO

public static final PSSMotif motifPssmPHO()
Creates the PHO motif as a PSSM. The motif is generated from many short sequences of size 14, which can be found in the literature. The background used is (0.2877, 0.2124, 0.2123, 0.2876) and the threshold assigned to the PSSM motif is 7.0. Every time this method is called, the motif is newly created.

Returns:
The PHO motif represented as a PSSM.

defaultMotifs

public static MotifList defaultMotifs()
Creates and returns the default motifs, which are "1" ("GSNMACGCCCC", one error allowed), "G10" ("GAGAGAGAGA", one error allowed), "GA" ("GAGAG"), "PF" ("GCCATHWY"), "PM" ("CNGCCATNDNND"), "PS" ("GCCAT"), "Z" ("YGAGYG").
This method uses the predefined static motifs and combines them into a motif list. Calling this method will always return a newly generated motif list.

Returns:
Returns a MotifList containing the default single motifs as RegularExpressionMotifs.
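
The motif strings above use the standard IUPAC nucleotide codes (for example N for any base, Y for C or T, W for A or T). As a rough illustration of how such a string can be matched, the sketch below translates an IUPAC pattern into a java.util.regex expression and counts overlapping hits. The class IupacSketch and its methods are hypothetical; this is not the implementation of RegularExpressionMotif, and it handles neither the "one error allowed" option nor the reverse strand.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IupacSketch {
    /** Translates an IUPAC nucleotide pattern into a regular expression. */
    static String toRegex(String iupac) {
        StringBuilder sb = new StringBuilder();
        for (char c : iupac.toUpperCase(Locale.ROOT).toCharArray()) {
            switch (c) {
                case 'A': case 'C': case 'G': case 'T': sb.append(c); break;
                case 'R': sb.append("[AG]"); break;
                case 'Y': sb.append("[CT]"); break;
                case 'S': sb.append("[CG]"); break;
                case 'W': sb.append("[AT]"); break;
                case 'K': sb.append("[GT]"); break;
                case 'M': sb.append("[AC]"); break;
                case 'B': sb.append("[CGT]"); break;
                case 'D': sb.append("[AGT]"); break;
                case 'H': sb.append("[ACT]"); break;
                case 'V': sb.append("[ACG]"); break;
                case 'N': sb.append("[ACGT]"); break;
                default:
                    throw new IllegalArgumentException("Not an IUPAC code: " + c);
            }
        }
        return sb.toString();
    }

    /** Counts exact, possibly overlapping matches on the given strand only. */
    static int countMatches(String iupac, String sequence) {
        Matcher m = Pattern.compile(toRegex(iupac))
                           .matcher(sequence.toUpperCase(Locale.ROOT));
        int count = 0;
        int from = 0;
        while (m.find(from)) {
            count++;
            from = m.start() + 1; // advance by one so overlapping hits are found
        }
        return count;
    }
}
```

For instance, IupacSketch.toRegex("YGAGYG") yields "[CT]GAG[CT]G" for the "Z" motif.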

extendedMotifs

public static MotifList extendedMotifs()
Creates and returns the list of extended (additional) motifs, which are "DSP1" ("GAAAA"), "GT" ("GTGTGYGWGTG"), "T" ("WTDWWTWTYHTT"), "TGC" ("YGYTGCYGYDS").
This method uses the predefined static motifs and combines them into a motif list. Calling this method will always return a newly generated motif list.

Returns:
Returns a MotifList containing additional single motifs as RegularExpressionMotifs.

defaultDoubleMotifs

public static MotifList defaultDoubleMotifs()
Creates the DoubleMotifs from the default single motifs and assigns them a weight (obtained by running the jPREdictor with 'new_pre.fasta' as model and 'non_pre.fasta' as background). Not all pairs are built, because permutations are forbidden.
This method uses the predefined static motifs defined in this class. Calling this method will result in a new motif list as well as newly created motifs.

Returns:
Returns a MotifList containing the default double motifs as DoubleMotifs.

extendedDoubleMotifs

public static MotifList extendedDoubleMotifs()
Creates the DoubleMotifs from the extended single motifs and assigns them a weight (obtained by running the jPREdictor with 'new_pre.fasta' as model and 'non_pre.fasta' as background). Not all pairs are built, because permutations are forbidden.
This method uses the predefined static motifs defined in this class.

Returns:
Returns a MotifList containing the extended double motifs as DoubleMotifs.

narDoubleMotifs

public static MotifList narDoubleMotifs()

Returns the DoubleMotifs for the prediction made in 2006/04 by Fiedler, 2006, NAR web server issue. The weights for the motifs come from analyzing the original training sets, 'new_pre.fasta' as model and the 'non_pre.fasta' as background.

The motif list consists of DoubleMotifs with distance (0, 219) built from the single motifs En1, GAF, G10, PHO-DSP1, pssmPHO and Z.

Returns:
Returns a MotifList containing the double motifs used in the PRE/TRE prediction by Fiedler, 2006.

addCheck

public boolean addCheck(int index,
                        Motif m)
Iff the motif is not present in this motif list, it is added. The check is made by calling equals(Object) for every motif in the list.

Parameters:
index - The index in the list, where the motif should be added. Give -1 to append.
m - The new motif to be added.
Returns:
true iff the motif was added, false otherwise.
See Also:
Motif.equals(Object)
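
The add-if-absent behaviour can be sketched on a plain Vector. This is only an illustration under the assumptions stated in the description (an equals-based check, -1 meaning append); the class AddCheckSketch is hypothetical and is not the MotifList source.

```java
import java.util.Vector;

class AddCheckSketch {
    /**
     * Adds item at the given index (or appends it when index is -1) unless an
     * equal element is already present. Returns true only if the list changed.
     */
    static <E> boolean addCheck(Vector<E> list, int index, E item) {
        if (list.contains(item)) { // Vector.contains relies on equals(Object)
            return false;
        }
        if (index == -1) {
            list.add(item);
        } else {
            list.add(index, item);
        }
        return true;
    }

    public static void main(String[] args) {
        Vector<String> names = new Vector<>();
        System.out.println(addCheck(names, -1, "GA")); // appended
        System.out.println(addCheck(names, -1, "GA")); // rejected as duplicate
    }
}
```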

join

public MotifList join(MotifList coll)
Adds all elements from coll to this MotifList. If coll and this refer to the same MotifList, every element reference is duplicated. This behaviour differs from that of addAll(Collection), where the result is undefined in that specific case.
If coll is null, this MotifList remains unchanged.

Parameters:
coll - The list of motifs to be added to this list.
Returns:
this.
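
One way to obtain a well-defined self-join is to copy the argument before appending. The sketch below shows this on a plain Vector; JoinSketch is a hypothetical helper, and whether MotifList uses the same approach internally is an assumption.

```java
import java.util.ArrayList;
import java.util.Vector;

class JoinSketch {
    /** Appends all elements of coll to self and returns self. */
    static <E> Vector<E> join(Vector<E> self, Vector<E> coll) {
        if (coll == null) {
            return self; // a null argument leaves the list unchanged
        }
        // Take a snapshot first so that join(v, v) duplicates every reference once.
        self.addAll(new ArrayList<>(coll));
        return self;
    }
}
```

Because the method returns the receiving list, calls can be chained, for example join(join(a, b), c).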

initSearch

public int initSearch(java.lang.CharSequence sequence)
Calls initSearch for all motifs stored in this Vector.

Returns:
Zero if no error occurred; otherwise the number of the motif that gave the last error (counting starts at 1).

setMark

public void setMark(int mark,
                    int depth)
Sets the mark in all motifs and child motifs in the list.

If depth is zero, only the motifs in this list get the mark set; otherwise the MultiMotifs in the list are processed too. Set depth to -1 if the depth level should not be restricted.

Parameters:
mark - The mark to be set.
depth - Specifies the level down to which MultiMotifs are to be expanded.

clearMark

public int clearMark(int mark,
                     int depth)
Clears the mark in all motifs and child motifs in the list.

If depth is zero, only the motifs in the list have their marks cleared; otherwise the MultiMotifs in the list are processed too. Set depth to -1 if the depth level should not be restricted.

Parameters:
mark - The mark to be cleared. Set it to -1 to clear all marks.
depth - Specifies the level down to which MultiMotifs are to be expanded.
Returns:
The number of marks removed.

searchMotifByName

public int searchMotifByName(java.lang.String name,
                             int fromIndex)
Searches this MotifList until a Motif is found that matches the given name. The match method used is String.compareToIgnoreCase(String).

Parameters:
name - The name of the motif to find.
fromIndex - The first index to start the search. If it is negative it is set to zero.
Returns:
The index of the first motif whose name matches the given one; -1 if there is no match or if name == null.
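
The search contract (case-insensitive comparison, negative fromIndex treated as zero, -1 for no match or a null name) can be sketched over a list of names. NameSearchSketch is a hypothetical stand-in that works on Strings instead of Motif objects.

```java
import java.util.List;

class NameSearchSketch {
    /** Returns the index of the first name matching case-insensitively, or -1. */
    static int searchByName(List<String> names, String name, int fromIndex) {
        if (name == null) {
            return -1;
        }
        // A negative start index is clamped to zero.
        for (int i = Math.max(fromIndex, 0); i < names.size(); i++) {
            if (names.get(i).compareToIgnoreCase(name) == 0) {
                return i;
            }
        }
        return -1;
    }
}
```

Passing the previous result plus one as fromIndex continues the search behind the last hit.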

searchMotifByReference

public int searchMotifByReference(Motif m,
                                  int fromIndex)
Searches this MotifList for the given Motif by comparing references.

Parameters:
m - The motif to find.
fromIndex - The first index to start the search. If it is negative it is set to zero.
Returns:
The index of the first motif matching the given one; -1 if there is no match or if m == null.

searchMotifByNameRecursive

public int searchMotifByNameRecursive(java.lang.String name,
                                      int fromIndex)
Searches this MotifList and the motifs of its MultiMotifs for the given name. The match method used is String.compareToIgnoreCase(String).

Parameters:
name - The name of the motif to find.
fromIndex - The first index to start the search. If it is negative it is set to zero.
Returns:
The index of the first motif in the list whose name, or whose descendant's name, matches the given one; -1 if there is no match or if name == null.

findMotifByName

public Motif findMotifByName(java.lang.String name,
                             int fromIndex,
                             int searchDepth,
                             int returnDepth)
Searches this MotifList and the motifs of its MultiMotifs for the given name. The match method used is String.compareToIgnoreCase(String).

If searchDepth is zero, only the motifs in this list are matched to the name given, otherwise the MultiMotifs in the list are processed down to the specified depth level. If searchDepth is -1, no restrictions to the depth level are made.

The parameter returnDepth specifies the absolute depth of the returned motif. The depth of the returned motif is either equal to that of the motif matching the name, in which case the matching motif itself is returned, or lower, in which case a parent motif is returned. Give -1 to ensure that the motif matching the name is returned.

Parameters:
name - The name of the motif to find.
fromIndex - The first index to start the search. If it is negative it is set to zero.
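searchDepth - Specifies the level down to which MultiMotifs are to be expanded.
returnDepth - Specifies the absolute depth of the returned motif. Give -1 to return the matching motif itself.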
Returns:
The first motif that matches the name or null, if the motif was not found

countMotifByName

public int countMotifByName(java.lang.String name,
                            int depth)

Searches this MotifList and the motifs of its MultiMotifs and counts the motifs matching the given name. The match method used is String.compareToIgnoreCase(String). Note that a MultiMotif with the given name may itself contain another motif with the same name.

If depth is zero, only the motifs in the list are checked against the given name; otherwise the MultiMotifs in the list are processed too. Set depth to -1 if the depth level should not be restricted.

Parameters:
name - The name of the motif to find.
depth - The depth level within the motifs down to which MultiMotifs are processed.
Returns:
The number of motifs with this name in the list and all lists below.
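
The depth convention used by this and several other methods (0 for the top level only, a positive value as a limit, -1 for no limit) can be sketched as a depth-limited recursion. The Node class below is a hypothetical stand-in for Motif and MultiMotif, and counting both a parent and a same-named child is an assumption based on the note above.

```java
import java.util.ArrayList;
import java.util.List;

class DepthCountSketch {
    /** Hypothetical stand-in for a motif; children is empty for single motifs. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name, Node... kids) {
            this.name = name;
            for (Node k : kids) {
                children.add(k);
            }
        }
    }

    /** Counts nodes whose name matches case-insensitively, down to depth. */
    static int countByName(List<Node> list, String name, int depth) {
        if (name == null) {
            return 0;
        }
        int count = 0;
        for (Node n : list) {
            if (n.name.compareToIgnoreCase(name) == 0) {
                count++;
            }
            if (depth != 0) {
                // A negative depth is passed on unchanged, so it never reaches zero.
                count += countByName(n.children, name, depth < 0 ? depth : depth - 1);
            }
        }
        return count;
    }
}
```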

containsMotif

public int containsMotif(Motif m)
Checks whether the given motif is already present in the motif list by comparing motif references. This method differs from contains(Object) in two ways: it compares references instead of relying on the equals method, and it examines MultiMotifs deeply. This can be useful when a single motif is searched for in a motif list containing double motifs or triple motifs. If the given motif is part of a motif in this list, the number of that motif is returned.

Parameters:
m - The motif to search in all motifs in this list.
Returns:
The number of the motif in the list that is identical to, or contains, the given motif; -1 otherwise.
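
The difference from contains(Object) can be sketched with a reference comparison (==) combined with a recursive descent into composite motifs. The Node class is a hypothetical stand-in for Motif and MultiMotif, and returning a zero-based top-level index is an assumption, since the description does not state the numbering base.

```java
import java.util.ArrayList;
import java.util.List;

class ContainsSketch {
    /** Hypothetical stand-in for a motif; parts is empty for single motifs. */
    static final class Node {
        final List<Node> parts = new ArrayList<>();

        Node(Node... p) {
            for (Node n : p) {
                parts.add(n);
            }
        }
    }

    /** True if candidate is the very same object as target or contains it. */
    static boolean isOrContains(Node candidate, Node target) {
        if (candidate == target) { // reference comparison, not equals()
            return true;
        }
        for (Node p : candidate.parts) {
            if (isOrContains(p, target)) {
                return true;
            }
        }
        return false;
    }

    /** Returns the top-level index of the first entry that is or contains target. */
    static int containsMotif(List<Node> list, Node target) {
        if (target == null) {
            return -1;
        }
        for (int i = 0; i < list.size(); i++) {
            if (isOrContains(list.get(i), target)) {
                return i;
            }
        }
        return -1;
    }
}
```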

getUniqueMotifs

public MotifList getUniqueMotifs()
Extracts every unique motif and stores them in a new list. The new list contains every single motif as well as every MultiMotif and all motifs the multi motifs are comprised of. No motif occurs twice in the list.

Returns:
The newly generated list of all unique motifs.

getSingleMotifs

public MotifList getSingleMotifs()
Extracts from all motifs in this list the single motifs and stores them in a new MotifList. No single motif occurs twice in the new list.

Returns:
The newly generated list of single motifs.

checkWeights

public int checkWeights()
Checks the weights of the motifs for either infinity or not-a-number.

Returns:
The index of the first motif whose weight is either infinite or Double.NaN; -1 otherwise.
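
Such a check can be written with Double.isInfinite and Double.isNaN. The sketch below operates on a plain array of weights; WeightCheckSketch is a hypothetical helper, not the MotifList code.

```java
class WeightCheckSketch {
    /** Returns the index of the first infinite or NaN weight, or -1 if none. */
    static int firstInvalid(double[] weights) {
        for (int i = 0; i < weights.length; i++) {
            if (Double.isNaN(weights[i]) || Double.isInfinite(weights[i])) {
                return i;
            }
        }
        return -1;
    }
}
```

A weight computed as the logarithm of zero, for example, is negative infinity and would be reported by this check.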

replaceMotifs

public boolean replaceMotifs(java.lang.String name,
                             Motif motif,
                             int depth)
Replaces one motif with another one in all instances and occurrences. If depth is zero, only the motifs in the list are checked as candidates for replacement; otherwise the MultiMotifs in the list are processed too. Set depth to -1 if the depth level should not be restricted.

Parameters:
name - The name of the motif to be replaced.
motif - The motif which is the replacement.
depth - The depth level within the motifs down to which MultiMotifs are processed.
Returns:
true if the list was changed, false otherwise.

replaceMarkedMotifs

public boolean replaceMarkedMotifs(Motif motif,
                                   int depth,
                                   int mark)
Replaces all motifs that bear the mark with the given one. If depth is zero, only the motifs in the list are checked as candidates for replacement; otherwise the MultiMotifs in the list are processed too. Set depth to -1 if the depth level should not be restricted.

Parameters:
motif - The motif which is the replacement.
depth - The depth level within the motifs down to which MultiMotifs are processed.
mark - The mark all motifs to be replaced must have.
Returns:
true if the list was changed, false otherwise.

setScores

public int setScores(java.util.Vector[] posOcc,
                     java.util.Vector[] negOcc)
Sets the log-odds scores (weights) for the motifs in this motif list. The weight is calculated from the occurrences of each motif in the positive and negative training sets (model and background, respectively). The underlying equation is:

score = ln( posOcc / nrOfPosSequ ) - ln( negOcc / nrOfNegSequ )

Thus, a normalization by the number of sequences is always made. Any normalization by sequence length must be done beforehand.
Both vector arrays must have the same length as this MotifList (one vector for each motif). The vectors must contain only Numbers, which represent the occurrences of the motif on the sequences, and within one array all vectors must be of the same size. Furthermore, the arrays are assumed to be in the same order as the motifs in this MotifList.
The motifs' scores are set using Motif.setWeight(double). Make sure all motifs are completely defined.

Parameters:
posOcc - The array of vectors containing the occurrences of a motif on the sequences of the positive training set. One vector for every motif.
negOcc - The array of vectors containing the occurrences of a motif on the sequences of the negative training set. One vector for every motif.
Returns:
Zero if all constraints are fulfilled and the scores were set; -1 if a parameter is null; -2 if the lengths of the arrays and the size of the MotifList are not the same; -3 if some vectors are null; and -4 if the vectors are not of the same size.
See Also:
Motif.setWeight(double)
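
The score equation can be illustrated for a single motif as follows. This is a minimal sketch that assumes posOcc and negOcc in the equation denote the occurrences summed over all sequences of the respective set; LogOddsSketch and its method names are hypothetical.

```java
import java.util.List;

class LogOddsSketch {
    /** Sums the per-sequence occurrence counts of one motif. */
    static double sum(List<? extends Number> occurrences) {
        double total = 0.0;
        for (Number n : occurrences) {
            total += n.doubleValue();
        }
        return total;
    }

    /**
     * score = ln(posOcc / nrOfPosSequ) - ln(negOcc / nrOfNegSequ),
     * where each list holds one count per training sequence.
     */
    static double logOdds(List<? extends Number> posOcc, List<? extends Number> negOcc) {
        double posRate = sum(posOcc) / posOcc.size();
        double negRate = sum(negOcc) / negOcc.size();
        return Math.log(posRate) - Math.log(negRate);
    }
}
```

In this sketch a motif that never occurs in one of the two sets yields an infinite or NaN score, which is the kind of value checkWeights() looks for.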

sortForWeight

public void sortForWeight(int keep)
Sorts the motifs in this list according to their weight and keeps only the 'keep' highest-weighted motifs. Calling this method changes this motif list: the highest-weighted motifs come first. The sorting is done by brute force.

Parameters:
keep - The number of highest-weighted motifs to keep. Give a negative number to keep all motifs.
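
The sort-and-truncate behaviour can be sketched on a list of weights. Unlike the method described above, this hypothetical SortKeepSketch returns a new list instead of modifying its argument and uses the library sort rather than a brute-force one; it is meant only to show the ordering and the meaning of keep.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SortKeepSketch {
    /** Returns the weights in descending order, truncated to the top keep entries. */
    static List<Double> sortForWeight(List<Double> weights, int keep) {
        List<Double> sorted = new ArrayList<>(weights);
        sorted.sort(Comparator.reverseOrder()); // highest weight first
        if (keep >= 0 && keep < sorted.size()) {
            return new ArrayList<>(sorted.subList(0, keep));
        }
        return sorted; // negative keep (or keep >= size) retains everything
    }
}
```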