de.unibi.techfak.jpredictor.motifs
Class MotifFilter

java.lang.Object
  extended by de.unibi.techfak.jpredictor.motifs.MotifFilter

public class MotifFilter
extends java.lang.Object

This class holds filter strings used to reduce arbitrary strings to a sequence or regular-expression motif. The companion filter class is SequenceFilter, which differs only in how it handles the 'N' character.

See Also:
SequenceFilter

Field Summary
static java.lang.String DNA_COMPLEMENT
          The complement characters of the degenerate code of DNA.
static java.lang.String DNA_FILTER
          Lets through only ACGT.
static java.lang.String DNA_FILTER_DEGENERATED
          Lets through the degenerate one-letter DNA code, that is ACGT, B (not A), D (not C), H (not G), V (not T), N (any), and KMRSWY (each a combination of two bases).
static java.lang.String DNA_FILTER_RESTRICTED
          Lets through only ACGT.
static java.lang.String DNA_RNA_FILTER
          Lets through only ACGTU.
static java.lang.String DNA_RNA_FILTER_DEGENERATED
          Lets through the degenerate one-letter DNA or RNA code, that is ACGTU, B (not A), D (not C), H (not G), V (not T/U), N (any), and KMRSWY (each a combination of two bases).
static java.lang.String RNA_COMPLEMENT
          The complement characters of the degenerate code of RNA.
static java.lang.String RNA_FILTER
          Lets through only ACGU.
static java.lang.String RNA_FILTER_DEGENERATED
          Lets through the degenerate one-letter RNA code, that is ACGU, B (not A), D (not C), H (not G), V (not U), N (any), and KMRSWY (each a combination of two bases).
static java.lang.String RNA_FILTER_RESTRICTED
          Lets through only ACGU.
 
Constructor Summary
MotifFilter()
           
 
Method Summary
static java.lang.String filterString(java.lang.String sequ, java.lang.String filter)
           The given filter is used to remove all disallowed characters from the given sequ.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

DNA_FILTER

public static final java.lang.String DNA_FILTER
Lets through only ACGT. Changes any U to a T.

See Also:
Constant Field Values

DNA_FILTER_RESTRICTED

public static final java.lang.String DNA_FILTER_RESTRICTED
Lets through only ACGT.

See Also:
Constant Field Values

DNA_FILTER_DEGENERATED

public static final java.lang.String DNA_FILTER_DEGENERATED
Lets through the degenerate one-letter DNA code, that is ACGT, B (not A), D (not C), H (not G), V (not T), N (any), and KMRSWY (each a combination of two bases). Changes any U to a T.

See Also:
Constant Field Values

DNA_RNA_FILTER

public static final java.lang.String DNA_RNA_FILTER
Lets through only ACGTU.

See Also:
Constant Field Values

DNA_RNA_FILTER_DEGENERATED

public static final java.lang.String DNA_RNA_FILTER_DEGENERATED
Lets through the degenerate one-letter DNA or RNA code, that is ACGTU, B (not A), D (not C), H (not G), V (not T/U), N (any), and KMRSWY (each a combination of two bases).

See Also:
Constant Field Values

RNA_FILTER

public static final java.lang.String RNA_FILTER
Lets through only ACGU. Changes any T to a U.

See Also:
Constant Field Values

RNA_FILTER_RESTRICTED

public static final java.lang.String RNA_FILTER_RESTRICTED
Lets through only ACGU.

See Also:
Constant Field Values

RNA_FILTER_DEGENERATED

public static final java.lang.String RNA_FILTER_DEGENERATED
Lets through the degenerate one-letter RNA code, that is ACGU, B (not A), D (not C), H (not G), V (not U), N (any), and KMRSWY (each a combination of two bases). Changes any T to a U.

See Also:
Constant Field Values

DNA_COMPLEMENT

public static final java.lang.String DNA_COMPLEMENT
The complement characters of the degenerate code of DNA. Use this by accessing single chars with charAt(i).

See Also:
Constant Field Values

RNA_COMPLEMENT

public static final java.lang.String RNA_COMPLEMENT
The complement characters of the degenerate code of RNA. Use this by accessing single chars with charAt(i).

See Also:
Constant Field Values
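
The page does not state how the index i passed to charAt(i) is chosen. The sketch below assumes the complement strings follow the same layout as the filter strings, so that position 0 holds the complement of 'A', position 1 that of 'B', and so on. The class ComplementSketch, the helper buildTable and the constant HYPOTHETICAL_DNA_COMPLEMENT are illustrative names and values, not part of MotifFilter; the real constant values are listed under Constant Field Values.

```java
import java.util.Arrays;

class ComplementSketch {

    // Hypothetical helper: builds a 26-character table in which the character
    // at index (letter - 'A') is the mapped letter, or a space if unmapped.
    static String buildTable(String from, String to) {
        char[] table = new char[26];
        Arrays.fill(table, ' ');
        for (int i = 0; i < from.length(); i++) {
            table[from.charAt(i) - 'A'] = to.charAt(i);
        }
        return new String(table);
    }

    // Assumed stand-in for MotifFilter.DNA_COMPLEMENT (IUPAC complements).
    static final String HYPOTHETICAL_DNA_COMPLEMENT =
            buildTable("ABCDGHKMNRSTVWY", "TVGHCDMKNYSABWR");

    // Complements an uppercase sequence letter by letter via charAt(letter - 'A').
    // Letters without a complement in the table are dropped.
    static String complement(String seq, String table) {
        StringBuilder out = new StringBuilder(seq.length());
        for (int i = 0; i < seq.length(); i++) {
            int idx = seq.charAt(i) - 'A';
            if (idx >= 0 && idx < table.length() && table.charAt(idx) != ' ') {
                out.append(table.charAt(idx));
            }
        }
        return out.toString();
    }

    // Reverse complement: complement each letter, then reverse the result.
    static String reverseComplement(String seq, String table) {
        return new StringBuilder(complement(seq, table)).reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(complement("ACGT", HYPOTHETICAL_DNA_COMPLEMENT));        // TGCA
        System.out.println(reverseComplement("AACG", HYPOTHETICAL_DNA_COMPLEMENT)); // CGTT
    }
}
```

If the real constants use a different index convention, only the expression seq.charAt(i) - 'A' needs to change.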

Constructor Detail

MotifFilter

public MotifFilter()

Method Detail

filterString

public static java.lang.String filterString(java.lang.String sequ,
                                            java.lang.String filter)

The given filter is used to remove all disallowed characters from the given sequ. The method works as follows: first, the sequence is converted to upper case. Each letter is then replaced by the character at the corresponding position of the filter string, which must begin with the replacement character for 'A', followed by the one for 'B', and so on. Spaces in the filter string mark invalid characters, which are discarded. If the filter string is the empty string, the empty string is returned.

For example, take the filter string " BD". A's are discarded from every sequence, B's are left unchanged and C's are replaced by D's. All other characters are also discarded, because the filter only defines replacements for the first 3 letters. Thus sequ "ABCF" leads to "BD", "CACA" becomes "DD", "bbDD" becomes "BB" and so on.

There are some predefined filter strings, for instance DNA_FILTER or DNA_FILTER_DEGENERATED. For example, with RNA_FILTER_RESTRICTED the String "aAB[d(TU" is filtered to "AAU", whereas with RNA_FILTER the result is "AAUU" (the uppercase 'T' is translated to 'U'), and with RNA_FILTER_DEGENERATED the result is "AABDUU".

Parameters:
sequ - The sequence to get filtered.
filter - The filter string to alter the sequence.
Returns:
The newly generated sequence. If sequ is null, null is returned. If filter is null, the unchanged sequence is returned.
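
The behaviour described above can be illustrated with a minimal sketch. It is a reimplementation written from this description, not the library's source. The class MotifFilterSketch, the helper buildFilter and the three filter strings built in main are assumptions that stand in for the real constants, whose actual values are listed under Constant Field Values.

```java
import java.util.Arrays;
import java.util.Locale;

class MotifFilterSketch {

    // Sketch of the documented contract: index 0 of the filter maps 'A',
    // index 1 maps 'B', and so on; a space marks a disallowed letter.
    static String filterString(String sequ, String filter) {
        if (sequ == null) return null;   // null sequence: null is returned
        if (filter == null) return sequ; // null filter: sequence unchanged
        if (filter.isEmpty()) return ""; // empty filter: empty result
        String upper = sequ.toUpperCase(Locale.ROOT);
        StringBuilder out = new StringBuilder(upper.length());
        for (int i = 0; i < upper.length(); i++) {
            int idx = upper.charAt(i) - 'A';
            // Characters outside the range covered by the filter are discarded.
            if (idx >= 0 && idx < filter.length()) {
                char mapped = filter.charAt(idx);
                if (mapped != ' ') out.append(mapped);
            }
        }
        return out.toString();
    }

    // Hypothetical helper: builds a 26-character filter in which each letter
    // of 'from' maps to the letter at the same position of 'to'.
    static String buildFilter(String from, String to) {
        char[] table = new char[26];
        Arrays.fill(table, ' ');
        for (int i = 0; i < from.length(); i++) {
            table[from.charAt(i) - 'A'] = to.charAt(i);
        }
        return new String(table);
    }

    public static void main(String[] args) {
        // Assumed stand-ins for RNA_FILTER_RESTRICTED, RNA_FILTER and
        // RNA_FILTER_DEGENERATED, derived from their descriptions.
        String rnaRestricted = buildFilter("ACGU", "ACGU");
        String rna = buildFilter("ACGTU", "ACGUU");
        String rnaDegenerated = buildFilter("ABCDGHKMNRSTUVWY", "ABCDGHKMNRSUUVWY");

        System.out.println(filterString("ABCF", " BD"));               // BD
        System.out.println(filterString("bbDD", " BD"));               // BB
        System.out.println(filterString("aAB[d(TU", rnaRestricted));   // AAU
        System.out.println(filterString("aAB[d(TU", rna));             // AAUU
        System.out.println(filterString("aAB[d(TU", rnaDegenerated));  // AABDUU
    }
}
```

Building the filter strings with a helper instead of typing 26-character literals avoids miscounting spaces; in real code the predefined constants of MotifFilter would be passed directly.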